Commit Graph

245 Commits

Author SHA1 Message Date
Craig Topper
7bd8a165f9 [X86] Don't allow '+f' as an inline asm constraint. (#113871)
f cannot be used as an output constraint. We already errored for '=f'
but not '+f'.

Fixes #113692.
2024-10-28 13:20:46 -07:00
Freddy Ye
c4248fa3ed [X86] Support MOVRS and AVX10.2 instructions. (#113274)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-10-25 09:00:19 +08:00
Ganesh
02e4186d0b [X86] AMD Zen 5 Initial enablement (#107964)
This patch enables the basic skeleton enablement of AMD next gen zen5 CPUs.
2024-09-13 17:45:33 +01:00
Freddy Ye
83ad644afa [X86][AVX10.2] Support AVX10.2-BF16 new instructions. (#101603)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-09-04 08:13:24 +08:00
Phoebe Wang
3f25f23a2b [X86][AVX10] Fix unexpected error and warning when using intrinsic (#104781)
E.g.: https://godbolt.org/z/G8zK5svjK

Based on Evgenii's work.
2024-08-20 19:56:19 +08:00
Phoebe Wang
259ca9ee9c Reland "[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions (#101452)" (#101616)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-03 09:26:07 +08:00
Phoebe Wang
2e0588d5e1 Revert "[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions" (#101612)
Reverts llvm/llvm-project#101452

There are several buildbot failed. Revert first.
2024-08-02 13:04:10 +08:00
Phoebe Wang
10bad2c8d7 [X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions (#101452)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-02 12:10:50 +08:00
Shengchen Kan
88e9bd822f [X86][Driver] Enable feature zu for -mapxf
This is follow-up for #78901 after validation.
Drop the comments for stability since zu is the last feature for cpuid APX_F.
2024-07-19 12:34:41 +08:00
James Y Knight
f0eb5587ce Remove support for 3DNow!, both intrinsics and builtins. (#96246)
This set of instructions was only supported by AMD chips starting in
the K6-2 (introduced 1998), and before the "Bulldozer" family
(2011). They were never much used, as they were effectively superseded
by the more-widely-implemented SSE (first implemented on the AMD side
in Athlon XP in 2001).

This is being done as a predecessor towards general removal of MMX
register usage. Since there is almost no usage of the 3DNow!
intrinsics, and no modern hardware even implements them, simple
removal seems like the best option.

(Clang half originally uploaded in https://reviews.llvm.org/D94213)

Works towards issue #41665 and issue #98272.
2024-07-16 12:08:48 -04:00
Feng Zou
e603451f3c [X86] Support branch hint (#97721)
For more details about this feature, please refer to latest Intel 64 and
IA-32 Architectures Optimization Reference Manual Volume 1:
https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html
2024-07-08 13:12:50 +08:00
Shengchen Kan
8ad32ce738 [X86] Add sub-feature zu (zero upper) for APX
This is a follow-up patch for #74199
2024-06-25 09:25:32 +08:00
Shengchen Kan
45a7af7c99 [X86][Driver] Enable feature cf for -mapxf
This is follow-up for #78901 after validation.
2024-06-24 15:11:07 +08:00
Freddy Ye
6f2794afeb Fix build warning for '[X86] Support EGPR for inline assembly. (#92338)' (#93777) 2024-05-30 15:25:08 +08:00
Freddy Ye
73f4c2547d [X86] Support EGPR for inline assembly. (#92338)
"jR": explicitly enables EGPR
"r", "l", "q": enables/disables EGPR w/wo -mapx-inline-asm-use-gpr32
"jr": explicitly enables GPR with -mapx-inline-asm-use-gpr32
-mapx-inline-asm-use-gpr32 will also define a new macro:
`__APX_INLINE_ASM_USE_GPR32__`

GCC patches:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631183.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631186.html
[[PATCH v2] x86: Define _APX_INLINE_ASM_USE_GPR32_
(gnu.org)](https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649003.html)

Reference: https://gcc.godbolt.org/z/nPPvbY6r4
2024-05-30 14:47:47 +08:00
Shengchen Kan
0f7b4b04a5 [X86][Driver] Enable feature ccmp,nf for -mapxf
This is follow-up for #78901 after validation.
2024-05-29 17:34:26 +08:00
Freddy Ye
4def1ce101 Reland "[X86] Remove knl/knm specific ISAs supports (#92883)" (#93136)
This reverts commit aa4069ea96.
2024-05-24 13:46:34 +08:00
Freddy Ye
aa4069ea96 Revert "[X86] Remove knl/knm specific ISAs supports (#92883)" (#93123)
This reverts commit 282d2ab58f.
2024-05-23 10:25:23 +08:00
Freddy Ye
282d2ab58f [X86] Remove knl/knm specific ISAs supports (#92883)
Cont. patch after https://github.com/llvm/llvm-project/pull/75580
2024-05-23 09:46:44 +08:00
Shengchen Kan
575177f610 [X86] Add sub-feature nf (no flags update) for APX
This is a follow-up patch for #74199
2024-05-11 15:55:59 +08:00
Freddy Ye
e44600f3ab [X86][CFE] Support EGPR in GCCRegNames. (#91323) 2024-05-08 15:07:18 +08:00
Freddy Ye
db2fb3d96b [X86] Define __APX_F__ when APX is enabled. (#88343)
Relate gcc patch:
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648789.html
2024-04-11 16:57:32 +08:00
Kazu Hirata
3c93c037c9 [Basic] Use StringRef::ends_with (NFC) 2024-02-03 21:43:05 -08:00
Kazu Hirata
b67ce7e349 [clang] Use StringRef::starts_with (NFC) 2024-01-31 23:54:09 -08:00
Fangrui Song
d4cb5d9f2b [X86] Add "Ws" constraint and "p" modifier for symbolic address/label reference (#77886)
Printing the raw symbol is useful in inline asm (e.g. getting the C++
mangled name, referencing a symbol in a custom way while ensuring it is
not optimized out even if internal). Similar constraints are available
in other targets (e.g. "S" for aarch64/riscv, "Cs" for m68k).

```
namespace ns { extern int var, a[4]; }
void foo() {
  asm(".pushsection .xxx,\"aw\"; .dc.a %p0; .popsection" :: "Ws"(&ns::var));
  asm(".reloc ., BFD_RELOC_NONE, %p0" :: "Ws"(&ns::a[3]));
}
```

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105576
2024-01-16 23:57:42 -08:00
Freddy Ye
19870ed9c3 [X86] Emit Warnings for frontend options to enable knl/knm specific ISAs. (#75580)
Since Knight Landing and Knight Mill microarchitectures are EOL, we
would like to remove intrinsic supports for its specific ISA in LLVM 19.
In LLVM 18, we will first emit a warning for the usage.
2024-01-09 19:43:14 +08:00
Kazu Hirata
a70dcc2cda [clang] Use StringRef::ltrim (NFC) 2023-12-27 09:10:39 -08:00
Shengchen Kan
6d6baef5c9 [X86] Support CFE flags for APX features (#74199)
Positive options: -mapx-features=<comma-separated-features>
Negative options: -mno-apx-features=<comma-separated-features>

-m[no-]apx-features is designed to be able to control separate APX
features.

Besides, we also support the flag -m[no-]apxf, which can be used like an
alias of -m[no-]apx-features=< all APX features covered by CPUID APX_F>

Behaviour when positive and negative options are used together:

For boolean flags, the last one wins

-mapxf   -mno-apxf   -> -mno-apxf
-mno-apxf   -mapxf   -> -mapxf

For flags that take a set as arguments, it sets the mask by order of the
flags

-mapx-features=egpr,ndd  -mno-apx-features=egpr  ->   -egpr,+ndd
-mapx-features=egpr  -mno-apx-features=egpr,ndd  ->   -egpr,-ndd
-mno-apx-features=egpr  -mapx-features=egpr,ndd  ->   +egpr,+ndd
-mno-apx-features=egpr,ndd  -mapx-features=egpr  ->   -ndd,+egpr

The design is aligned with gcc
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628905.html
2023-12-04 19:22:56 +08:00
Phoebe Wang
e96eddec5e Reland "[X86][AVX10] Fix a bug when using -march with no-evex512 attribute (#72126)"
Fixes #72106
2023-11-14 15:39:30 +08:00
Phoebe Wang
17dd0c70c8 Revert "[X86][AVX10] Fix a bug when using -march with no-evex512 attribute (#72126)"
This reverts commit 451c594bcb.

Revert due to buildbot fails.
2023-11-14 15:34:38 +08:00
Phoebe Wang
451c594bcb [X86][AVX10] Fix a bug when using -march with no-evex512 attribute (#72126)
#71318 failed to clear EVEX512 feature for intended intrinsics.

Fixes #72106
2023-11-14 15:15:34 +08:00
Phoebe Wang
f229ba4e8d [X86][AVX10] Permit AVX512 options/features used together with AVX10 (#71318)
This patch relaxes the driver logic to permit combinations between
AVX512 and AVX10 options and makes sure we have a unified behavior
between options and features combination.

Here are rules we are following when handle these combinations:
1. evex512 can only be used for avx512xxx options/features. It will be
ignored if used without them;
2. avx512xxx and avx10.xxx are options in two worlds. Avoid to use them
together in any case. It will enable a common super set when they are
used together. E.g., "-mavx512f -mavx10.1-256" euqals "-mavx10.1-512".

Compiler emits warnings when user using combinations like "-mavx512f
-mavx10.1-256" in case they won't get unexpected result silently.

Function target feature attribute follows the same rule now. We have to
add "no-evex512" feature for intrinsics shared between AVX512 and AVX10.
We also add "no-evex512" for early ISAs like AVX etc., because some of
them are called by AVX512 intrinsics.
2023-11-10 15:21:05 +08:00
Phoebe Wang
c78aeabaec [X86] Add a EVEX256 macro to match with GCC and MSVC (#71317) 2023-11-07 14:39:24 +08:00
Freddy Ye
278e533ee9 [X86] Support -march=pantherlake,clearwaterforest (#69277) 2023-10-19 15:11:15 +08:00
Phoebe Wang
cfbf0a500f [X86][RFC] Support AVX10 options (#67278)
AVX10 Architecture Specification:
https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC:
https://discourse.llvm.org/t/rfc-design-for-avx10-options-support/73672
2023-10-19 07:52:50 +08:00
Freddy Ye
819ac45d1c [X86] Add USER_MSR instructions. (#68944)
For more details about this instruction, please refer to the latest ISE
document:
https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
2023-10-16 10:12:53 +08:00
Phoebe Wang
24194090e1 [X86][RFC] Add new option -m[no-]evex512 to disable ZMM and 64-bit mask instructions for AVX512 features
This is an alternative of D157485 and a pre-feature to support AVX10.

AVX10 Architecture Specification: https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC: https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661

Based on the feedbacks from LLVM and GCC community, we have agreed to
start from supporting `-m[no-]evex512` on existing AVX512 features.
The option `-mno-evex512` can be used with `-mavx512xxx` to build
binaries that can run on both legacy AVX512 targets and AVX10-256.

There're still arguments about what's the expected behavior when this
option as well as `-mavx512xxx` used together with `-mavx10.1-256`. We
decided to defer the support of `-mavx10.1` after we made consensus.
Or furthermore, we start from supporting AVX10.2 and not providing any
AVX10.1 options.

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D159250
2023-09-08 22:47:22 +08:00
Phoebe Wang
0856efbf88 Revert "[X86][RFC] Add new option -m[no-]evex512 to disable ZMM and 64-bit mask instructions for AVX512 features"
This reverts commit 7dd48cc24d.

Causing buildbot failure.
2023-09-07 21:59:01 +08:00
Phoebe Wang
7dd48cc24d [X86][RFC] Add new option -m[no-]evex512 to disable ZMM and 64-bit mask instructions for AVX512 features
This is an alternative of D157485 and a pre-feature to support AVX10.

AVX10 Architecture Specification: https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC: https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661

Based on the feedbacks from LLVM and GCC community, we have agreed to
start from supporting `-m[no-]evex512` on existing AVX512 features.
The option `-mno-evex512` can be used with `-mavx512xxx` to build
binaries that can run on both legacy AVX512 targets and AVX10-256.

There're still arguments about what's the expected behavior when this
option as well as `-mavx512xxx` used together with `-mavx10.1-256`. We
decided to defer the support of `-mavx10.1` after we made consensus.
Or furthermore, we start from supporting AVX10.2 and not providing any
AVX10.1 options.

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D159250
2023-09-07 21:38:35 +08:00
Fangrui Song
27da15381c [X86] __builtin_cpu_supports: support x86-64{,-v2,-v3,-v4}
GCC 12 (https://gcc.gnu.org/PR101696) allows
__builtin_cpu_supports("x86-64") (and -v2 -v3 -v4).
This patch ports the feature.

* Add `FEATURE_X86_64_{BASELINE,V2,V3,V4}` to enum ProcessorFeatures,
  but keep CPU_FEATURE_MAX unchanged to make
  FeatureInfos/FeatureInfos_WithPLUS happy.
* Change validateCpuSupports to allow `x86-64{,-v2,-v3,-v4}`
* Change getCpuSupportsMask to return `std::array<uint32_t, 4>` where
  `x86-64{,-v2,-v3,-v4}` set bits `FEATURE_X86_64_{BASELINE,V2,V3,V4}`.
* `target("x86-64")` and `cpu_dispatch(x86_64)` are invalid. Tested by commit 9de3b35ac9

Close https://github.com/llvm/llvm-project/issues/59961

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D158811
2023-08-25 20:56:25 -07:00
Freddy Ye
6acff5390d [X86] Support -march=gracemont
gracemont has some different tuning features from alderlake.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D158046
2023-08-21 08:49:01 +08:00
Freddy Ye
c9d92e6638 [X86] Support -march=arrowlake,arrowlake-s,lunarlake
Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D156239
2023-07-28 15:05:54 +08:00
Freddy Ye
6d23a3faa4 [X86] Support -march=graniterapids-d and update -march=graniterapids
Reviewed By: pengfei, RKSimon, skan

Differential Revision: https://reviews.llvm.org/D155798
2023-07-25 13:48:31 +08:00
Fangrui Song
14b466b940 [X86] Fix a typo of Broadwell after D74918. NFC
Close #64053
2023-07-23 15:15:05 -07:00
Freddy Ye
1c154bd755 [X86] Add AVX-VNNI-INT16 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D155145
2023-07-20 14:31:16 +08:00
Freddy Ye
049d6a3f42 [X86] Add SM4 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D155148
2023-07-20 13:35:15 +08:00
Freddy Ye
c6f66de21a [X86] Add SM3 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D155147
2023-07-20 10:24:16 +08:00
Freddy Ye
fc3b7874b6 [X86] Add SHA512 instructions.
For more details about this instruction, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D155146
2023-07-20 09:44:44 +08:00
Freddy Ye
7717c0071d [X86] Remove CPU_SPECIFIC* MACROs and add getCPUDispatchMangling
This refactor patch means to remove CPU_SPECIFIC* MACROs in X86TargetParser.def
and move those information into ProcInfo of X86TargetParser.cpp. Since these
two files both maintain a table with redundant info such as cpuname and its
features supported. CPU_SPECIFIC* MACROs define some different information. This
patch dealt with them in these ways when moving:
1.mangling
This is now moved to Mangling in ProcInfo and directly initialized at array of
Processors. CPUs don't support cpu_dispatch/specific are assigned '\0' as
mangling.
2.CPU alias
The alias cpu will also be initialized in array of Processors, its attributes
will be same as its alias target cpu. Same feature list, same mangling.
3.TUNE_NAME
Before my change, some cpu names support cpu_dispatch/specific are not
supported in X86.td, which means optimizer/backend doesn't recognize them. So
they use a different TUNE_NAME to generate in IR. In this patch, I added these
missing cpu support at X86.td by utilizing existing Features and XXXTunings, so
that each cpu name can directly use its own name as TUNE_NAME to be supported
by optimizer/backend.
4.Feature list
The feature list of one CPU maintained in X86TargetParser.def is not same as
the one in X86TargetParser.cpp. It only maintains part of features of one CPU
(features defined by X86_FEATURE_COMPAT). While X86TargetParser.cpp maintains
a complete one. This patch abandons the feature list maintained by CPU_SPECIFIC*
MACROs because assigning a CPU with a complete one doesn't affect the
functionality of cpu_dispatch/specific.
Except these four info, since some of CPUs supported by cpu_dispatch/specific
doesn's support clang options like -march, -mtune before, this patch also kept
this behavior still by adding another member OnlyForCPUDispatchSpecific in
ProcInfo.

Reviewed By: pengfei, RKSimon

Differential Revision: https://reviews.llvm.org/D151696
2023-07-05 17:32:00 +08:00
M. Zeeshan Siddiqui
e621757365 [Clang][BFloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support
Pursuant to discussions at
https://discourse.llvm.org/t/rfc-c-23-p1467r9-extended-floating-point-types-and-standard-names/70033/22,
this commit enhances the handling of the __bf16 type in Clang.
- Firstly, it upgrades __bf16 from a storage-only type to an arithmetic
  type.
- Secondly, it changes the mangling of __bf16 to DF16b on all
  architectures except ARM. This change has been made in
  accordance with the finalization of the mangling for the
  std::bfloat16_t type, as discussed at
  https://github.com/itanium-cxx-abi/cxx-abi/pull/147.
- Finally, this commit extends the existing excess precision support to
  the __bf16 type. This applies to hardware architectures that do not
  natively support bfloat16 arithmetic.
Appropriate tests have been added to verify the effects of these
changes and ensure no regressions in other areas of the compiler.

Reviewed By: rjmccall, pengfei, zahiraam

Differential Revision: https://reviews.llvm.org/D150913
2023-05-27 13:33:50 +08:00