clang-p2996

Author	SHA1	Message	Date
Joseph Huber	7155c1ef65	[NVPTX] Allow compiling LLVM-IR without `-march` set (#79873) Summary: The NVPTX tools require an architecture to be used, however if we are creating generic LLVM-IR we should be able to leave it unspecified. This will result in the `target-cpu` attributes not being set on the functions so it can be changed when linked into code. This allows the standalone `--target=nvptx64-nvidia-cuda` toolchain to create LLVM-IR similar to how CUDA's deviceRTL looks from C/C++	2024-01-30 21:44:43 -06:00
Joseph Huber	626fe71fa5	[Clang] Fix test failing on systems without ROCm installed Summary: Forgot to specify `-nogpulib` which makes this test look for ROCm.	2024-01-30 13:17:02 -06:00
Joseph Huber	f2a78e68ee	[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#80035) Summary: Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means to create a sort of "generic" IR. The resulting IR will not contain any target dependent attributes and can then be inserted into another program via `-mlink-builtin-bitcode` to inherit its attributes. However, there are a handful of macros that can leak incorrect information when compiling for an unspecified architecture. Currently, things like the wavefront size will default to 64, which is actually variable. We should not expose these macros unless it is known.	2024-01-30 13:05:29 -06:00
Joseph Huber	72d4fc1b4d	Revert "[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#79660)" This reverts commit `c9a6e993f7`. This breaks HIP code that incorrectly depended on GPU-specific macros to be set. The code is totally wrong as using `__WAVEFRTONSIZE__` on the host is absolutely meaningless, but it seems this entire corner of the toolchain is fundamentally broken. Reverting for now to avoid breakages.	2024-01-29 11:11:25 -06:00
Joseph Huber	c9a6e993f7	[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#79660) Summary: Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means to create a sort of "generic" IR. The resulting IR will not contain any target dependent attributes and can then be inserted into another program via `-mlink-builtin-bitcode` to inherit its attributes. However, there are a handful of macros that can leak incorrect information when compiling for an unspecified architecture. Currently, things like the wavefront size will default to 64, which is actually variable. We should not expose these macros unless it is known.	2024-01-29 08:46:14 -06:00
Freddy Ye	19e784604c	[X86] Remove RAO-INT from Grandridge (#76420) According to latest spec: https://cdrdv2.intel.com/v1/dl/getContent/671368	2023-12-28 10:06:54 +08:00
Phoebe Wang	c78aeabaec	[X86] Add a EVEX256 macro to match with GCC and MSVC (#71317)	2023-11-07 14:39:24 +08:00
Freddy Ye	278e533ee9	[X86] Support -march=pantherlake,clearwaterforest (#69277)	2023-10-19 15:11:15 +08:00
XinWang10	057ec767ad	[X86][NFC]Update test cases after D159250 (#68517)	2023-10-10 09:32:32 +08:00
Fangrui Song	8cfe9d8f2a	[Driver] Remove remnant myriad pieces after Myriad.cpp removal after D104279 and D158706.	2023-08-25 13:29:10 -07:00
Freddy Ye	6acff5390d	[X86] Support -march=gracemont gracemont has some different tuning features from alderlake. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D158046	2023-08-21 08:49:01 +08:00
Freddy Ye	c9d92e6638	[X86] Support -march=arrowlake,arrowlake-s,lunarlake Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D156239	2023-07-28 15:05:54 +08:00
Freddy Ye	6d23a3faa4	[X86] Support -march=graniterapids-d and update -march=graniterapids Reviewed By: pengfei, RKSimon, skan Differential Revision: https://reviews.llvm.org/D155798	2023-07-25 13:48:31 +08:00
Freddy Ye	5cc4b1059b	[X86] Update features for sierraforest, grandridge Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D155784	2023-07-25 11:00:41 +08:00
Freddy Ye	548e08c3f6	[NFC] Add missing cpu tests in predefined-arch-macros.c Added tests for penryn, nehalem, westmere, sandybridge, ivybridge, haswell, bonnell, silvermont. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D153714	2023-06-29 13:30:13 +08:00
Freddy Ye	847abddedc	[X86] Add AMX_COMPLEX to Graniterapids This patch also renames __AMXCOMPLEX__ to __AMX_COMPLEX__ Reviewed By: skan, xiangzhangllvm Differential Revision: https://reviews.llvm.org/D147525	2023-04-06 13:19:44 +08:00
Joe Loser	8998fa6c14	[clang] Change AMX macros to match names from GCC The current behavior for AMX macros is: ``` gcc -march=native -dM -E - < /dev/null | grep TILE clang -march=native -dM -E - < /dev/null | grep TILE ``` which is not ideal. Change `__AMXTILE__` and friends to `__AMX_TILE__` (i.e. have an underscore in them). This makes GCC and Clang agree on the naming of these AMX macros to simplify downstream user code. Fix this for `__AMXTILE__`, `__AMX_INT8__`, `__AMX_BF16__`, and `__AMX_FP16__`. Differential Revision: https://reviews.llvm.org/D143094	2023-02-03 07:00:16 -07:00
Ben Shi	16f9451b07	[clang] Redefine some AVR specific macros Fixes https://github.com/llvm/llvm-project/issues/58855 Reviewed By: aykevl, Miss_Grape Differential Revision: https://reviews.llvm.org/D141598	2023-01-13 17:22:15 +08:00
Ben Shi	485ba407a6	[clang][test] Remove unnecessary 'REQUIRES' The test 'Preprocessor/predefined-arch-macros.c' contains many target tests other than 'amdgpu'. If clang is built without 'amdgpu', then failures in other target tests will not be reported. Reviewed By: aaron.ballman, MaskRay Differential Revision: https://reviews.llvm.org/D141647	2023-01-13 10:04:22 +08:00
Brad Smith	f70d17fc2c	[LoongArch] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros Reviewed By: SixWeining, MaskRay Differential Revision: https://reviews.llvm.org/D141070	2023-01-05 20:21:22 -05:00
Freddy Ye	27b8f54f51	[X86] Support -march=emeraldrapids Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D140950	2023-01-05 20:27:32 +08:00
Brad Smith	d227c3b68c	[Hexagon][VE][WebAssembly] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros Reviewed By: kparzysz, aheejin, MaskRay Differential Revision: https://reviews.llvm.org/D140757	2023-01-05 04:45:07 -05:00
Brad Smith	2784b243e3	[M68k] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros Fixes #58974 Reviewed By: myhsu, glaubitz, 0x59616e Differential Revision: https://reviews.llvm.org/D140695	2022-12-29 05:07:35 -05:00
Ganesh Gopalasubramanian	1f057e365f	[X86] AMD Zen 4 Initial enablement Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D139073	2022-12-17 16:15:22 +05:30
Freddy Ye	84a18a260e	[X86] Support -march=sierraforest, grandridge, graniterapids. Reviewed By: skan, pengfei, MaskRay Differential Revision: https://reviews.llvm.org/D137153	2022-11-09 16:56:03 +08:00
Freddy Ye	a806fc2767	[X86] Support -march=raptorlake, meteorlake Reviewed By: pengfei, skan, MaskRay Differential Revision: https://reviews.llvm.org/D135937	2022-11-04 09:32:17 +08:00
Simon Pilgrim	6e19e6ce36	[clang][X86] Add RDPRU predefined macro tests for znver2/znver3 targets These were missed in D128934	2022-08-11 15:48:39 +01:00
Ulrich Weigand	1283ccb610	Support z16 processor name The recently announced IBM z16 processor implements the architecture already supported as "arch14" in LLVM. This patch adds support for "z16" as an alternate architecture name for arch14.	2022-04-21 19:58:22 +02:00
John Paul Adrian Glaubitz	5061eb6b01	[Sparc] Don't define __sparcv9 and __sparcv9__ when targeting V8+ Currently, clang defines the three macros __sparcv9, __sparcv9__ and __sparc_v9__ when targeting the V8+ baseline, i.e. using the V9 instruction set on a 32-bit target. Since neither gcc nor SolarisStudio define __sparcv9 and __sparcv9__ when targeting V8+, some existing code such as the glibc breaks when defining either of these two macros on a 32-bit target as they are used to detect a 64-bit target. Update the tests accordingly. Fixes PR49562. Reviewed By: jrtc27, MaskRay, hvdijk Differential Revision: https://reviews.llvm.org/D98574	2022-01-21 09:57:17 -08:00
Wang, Pengfei	6f7f5b54c8	[X86] AVX512FP16 instructions enabling 1/6 1. Enable FP16 type support and basic declarations used by following patches. 2. Enable new instructions VMOVW and VMOVSH. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105263	2021-08-10 12:46:01 +08:00
Ulrich Weigand	8cd8120a7b	[SystemZ] Add support for new cpu architecture - arch14 This patch adds support for the next-generation arch14 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Detection of arch14 as host processor. - Assembler/disassembler support for new instructions. - New LLVM intrinsics for certain new instructions. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10304. Note: No currently available Z system supports the arch14 architecture. Once new systems become available, the official system name will be added as supported -march name.	2021-07-26 16:57:28 +02:00
Freddy Ye	3fc1fe8db8	[X86] Support -march=rocketlake Reviewed By: skan, craig.topper, MaskRay Differential Revision: https://reviews.llvm.org/D100085	2021-04-13 09:48:13 +08:00
Freddy Ye	5cb47be410	[X86] Remove FeatureCLWB from FeaturesICLClient Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100279	2021-04-12 12:08:59 +08:00
Freddy Ye	5f9489b754	[X86] Refine "Support -march=alderlake" Compared with tremont, it includes 25 more new features. They are adx, aes, avx, avx2, avxvnni, bmi, bmi2, cldemote, f16c, fma, hreset, invpcid, kl, lzcnt, movdir64b, movdiri, pclmulqdq, pconfig, pku, serialize, shstk, vaes, vpclmulqdq, waitpkg, widekl. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D97832	2021-03-08 13:17:18 +08:00
Yaxun (Sam) Liu	efc063b621	Fix lit test failure due to 0b81d9 These lit tests now require amdgpu-registered-target since they use clang driver and clang driver passes an LLVM option which is available only if amdgpu target is registered. Change-Id: I2df31967409f1627fc6d342d1ab5cc8aa17c9c0c	2020-12-07 19:50:21 -05:00
Liu, Chen3	756f597841	[X86] Support Intel avxvnni This patch mainly made the following changes: 1. Support AVX-VNNI instructions; 2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpwssd/vpdpwssds instructions only use vex-encoding when the user explicitly adds the {vex} prefix. Differential Revision: https://reviews.llvm.org/D89105	2020-10-31 12:39:51 +08:00
Benjamin Kramer	39a0d6889d	[X86] Add a stub for Intel's alderlake. No scheduling, no autodetection.	2020-10-24 19:01:22 +02:00
Benjamin Kramer	bd2cf96c09	[X86] Add a stub for znver3 based on the little public information there is in AMD's manuals No scheduling, no autodetection. Just enough so -march=znver3 works.	2020-10-24 19:01:22 +02:00
Tianqing Wang	be39a6fe6f	[X86] Add User Interrupts(UINTR) instructions For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89301	2020-10-22 17:33:07 +08:00
Fangrui Song	012dd42e02	[X86] Support -march=x86-64-v[234] PR47686. These micro-architecture levels are defined in the x86-64 psABI: https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9 GCC 11 will support these levels. Note, -mtune=x86-64-v[234] are invalid and __builtin_cpu_is cannot be used on them. Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D89197	2020-10-12 10:29:46 -07:00
Fangrui Song	cbe4d973ed	[X86] Define __LAHF_SAHF__ if feature 'sahf' is set or 32-bit mode GCC 11 will define this macro. In LLVM, the feature flag only applies to 64-bit mode and we always define the macro in 32-bit mode. This is different from GCC -m32 in which -mno-sahf can suppress the macro. The discrepancy can unlikely cause trouble. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89198	2020-10-11 09:46:00 -07:00
Rainer Orth	76e85ae268	[clang][Sparc] Default to -mcpu=v9 for Sparc V8 on Solaris As reported in Bug 42535, `clang` doesn't inline atomic ops on 32-bit Sparc, unlike `gcc` on Solaris. In a 1-stage build with `gcc`, only two testcases are affected (currently `XFAIL`ed), while in a 2-stage build more than 100 tests `FAIL` due to this issue. The reason for this `gcc`/`clang` difference is that `gcc` on 32-bit Solaris/SPARC defaults to `-mcpu=v9` where atomic ops are supported, unlike with `clang`'s default of `-mcpu=v8`. This patch changes `clang` to use `-mcpu=v9` on 32-bit Solaris/SPARC, too. Doing so uncovered two bugs: `clang -m32 -mcpu=v9` chokes with any Solaris system headers included: /usr/include/sys/isa_defs.h:461:2: error: "Both _ILP32 and _LP64 are defined" #error "Both _ILP32 and _LP64 are defined" While `clang` currently defines `__sparcv9` in a 32-bit `-mcpu=v9` compilation, neither `gcc` nor Studio `cc` do. In fact, the Studio 12.6 `cc(1)` man page clearly states: These predefinitions are valid in all modes: [...] __sparcv8 (SPARC) __sparcv9 (SPARC -m64) At the same time, the patch defines `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]` for a 32-bit Sparc compilation with any V9 cpu. I've also changed `MaxAtomicInlineWidth` for V9, matching what `gcc` does and the Oracle Developer Studio 12.6: C User's Guide documents (Ch. 3, Support for Atomic Types, 3.1 Size and Alignment of Atomic C Types). The two testcases that had been `XFAIL`ed for Bug 42535 are un-`XFAIL`ed again. Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`. Differential Revision: https://reviews.llvm.org/D86621	2020-09-11 09:53:19 +02:00
Craig Topper	e6bb4c8e7b	[X86] SSE4_A should only imply SSE3 not SSSE3 in the frontend. SSE4_1 and SSE4_2 do imply SSSE3. So I guess I got confused when switching the code to being table based in D83273. Fixes PR47464	2020-09-08 10:50:59 -07:00
Freddy Ye	e02d081f2b	[X86] Support -march=sapphirerapids Support -march=sapphirerapids for x86. Compared with Icelake Server, it includes 14 more new features. They are amxtile, amxint8, amxbf16, avx512bf16, avx512vp2intersect, cldemote, enqcmd, movdir64b, movdiri, ptwrite, serialize, shstk, tsxldtrk, waitpkg. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D86503	2020-08-25 14:21:21 +08:00
Brad Smith	5fe171321c	[Sparc] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros on SPARCv9	2020-08-11 00:04:24 -04:00
Craig Topper	f886f07248	[X86] Some CHECK-NOTs for FMA4/TBM/XOP for znver1/znver2 in predefined-arch-macros.c These features exist in earlier CPUs, but were deprecated on znver1/znver2. While working on D82731 I accidentally copied them from the earlier CPU. And nothing caught my mistake. Having these additional checks would have helped.	2020-06-30 12:04:26 -07:00
Craig Topper	9e8b5a20e9	[X86] Add MOVBE and RDRND features to BDVER4. Only 6 years behind gcc. https://gcc.gnu.org/legacy-ml/gcc-patches/2014-08/msg00231.html Found while working on improving how we define CPU features for clang and auditing for correctness.	2020-06-26 23:32:17 -07:00
Craig Topper	a7db230d75	[X86] Add CMPXCHG16B feature to amdfam10 in the frontend. We already have this feature on it in the backend.	2020-06-25 22:55:36 -07:00
Craig Topper	6673d69226	[X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature. The PREFETCHW instruction was originally part of the 3DNow. But it was given its own CPUID bit on later CPUs just before 3DNow was deprecated. We were setting the -mprfchw flag if -m3dnow was passed or the CPU supported 3dnow unless -mno-prfchw was passed. But -march=native on a CPU without the PRFCHW CPUID bit set will pass -mno-prfchw. So -march=k8 will behave differently than -march=native on a K8 for example. So remove this implicit setting from the frontend and instead enable the backend to use PREFETCHW if 3dnow OR prfchw is enabled. Also enable PRFCHW flag on amdfam10/barcelona which seems to be where this CPUID bit was introduced. That CPU also supported 3dnow.	2020-06-25 12:46:52 -07:00
Craig Topper	01c18f9199	Revert "[X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature." This is failing on the bots. This reverts commit `636d31a5c3`.	2020-06-25 11:43:02 -07:00

170 Commits