Removes sve-bf16, sve-ebf16, and sve-i8mm since they are obsolete. For
instance, one can write target_version("sve+bf16") instead of sve-bf16.
Approved in ACLE as https://github.com/ARM-software/acle/pull/353
According to the Arm Architecture Reference Manual for A-profile
architecture you can't have one feature without having the other:
ID_AA64ZFR0_EL1.AES, bits [7:4]
> FEAT_SVE_AES implements the functionality identified by the value
0b0001.
> FEAT_SVE_PMULL128 implements the functionality identified by the value
0b0010.
> The permitted values are 0b0000 and 0b0010.
(The following was removed from the latest release of the specification,
but it appears to be a mistake that was not intended to relax the
architecture constraints. The discrepancy has been reported)
ID_AA64ISAR0_EL1.AES, bits [7:4]
> FEAT_AES implements the functionality identified by the value 0b0001.
> FEAT_PMULL implements the functionality identified by the value
0b0010.
> From Armv8, the permitted values are 0b0000 and 0b0010.
Approved in ACLE as https://github.com/ARM-software/acle/pull/352
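The constraint quoted above can be sketched as a decode of the 4-bit AES
field at bits [7:4]. This is only an illustration: the helper names are
hypothetical, and the register value is passed in as a plain integer
rather than read with MRS.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the 4-bit ID field starting at bit `lsb` (hypothetical helper). */
static unsigned id_field(uint64_t reg, unsigned lsb) {
  assert(lsb <= 60);
  return (unsigned)((reg >> lsb) & 0xf);
}

/* The AES field sits at bits [7:4] in both ID registers quoted above. */
static int field_has_aes(uint64_t reg) { return id_field(reg, 4) >= 1; }
static int field_has_pmull(uint64_t reg) { return id_field(reg, 4) >= 2; }

/* Only 0b0000 and 0b0010 are permitted, so AES never appears alone. */
static int aes_field_is_permitted(uint64_t reg) {
  unsigned v = id_field(reg, 4);
  return v == 0 || v == 2;
}
```

For every permitted value the two predicates agree, which is why a
single unified feature loses nothing.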
If we split these features in the compiler (see relevant pull request
https://github.com/llvm/llvm-project/pull/109299), we would only be able
to hand-write a 'memtag2' version using inline assembly since the
compiler cannot generate the instructions that become available with
FEAT_MTE2. However, these instructions only work at Exception Level 1, so
they would be unusable since FMV is a user space facility. I am
therefore unifying them.
Approved in ACLE as https://github.com/ARM-software/acle/pull/351
According to https://developer.arm.com/documentation/102105/latest Arm
Architecture Reference Manual for A-profile architecture: Known issues
2.206 D22789
In section C5.2.25 "SSBS, Speculative Store Bypass Safe", under the
heading 'Configurations', the text that reads:
"This register is present only when FEAT_SSBS is implemented. Otherwise,
direct accesses to SSBS are UNDEFINED."
is changed to read:
"This register is present only when FEAT_SSBS2 is implemented.
Otherwise, direct accesses to SSBS are UNDEFINED."
This suggests that it's not worth splitting FEAT_SSBS2 from FEAT_SSBS in
the compiler, since FEAT_SSBS cannot be used for predicating the MRS/MSR
instructions. Those can access PSTATE.SSBS only when FEAT_SSBS2 is
available. Moreover, there are no hardware implementations which
implement FEAT_SSBS without FEAT_SSBS2, therefore unifying these
features in the specification should not be a regression for feature
detection.
Approved in ACLE as https://github.com/ARM-software/acle/pull/350
Originally I tried splitting these features in the compiler with
https://github.com/llvm/llvm-project/pull/101712, but we decided to lump
those features in the ACLE specification (see
https://github.com/ARM-software/acle/pull/346). Since there are no
hardware implementations out there which implement ls64 without ls64_v
or ls64_accdata, this shouldn't be a regression for feature detection.
Perform the same macro expansion in the header to improve handling of
the various ARM64 environments, which use different CPU architecture
identification macro spellings.
When clang is used as `clang-cl`, we use MSVC style macros. The spelling
of `__aarch64__` is converted to `_M_ARM64`. Account for this
alternative spelling in the conditional check. While in the area, add a
tertiary spelling of `__arm64__` to ensure that we catch more of the
variants.
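A minimal sketch of such a check, folding the three spellings into one
macro. `SKETCH_TARGET_ARM64` and `built_for_arm64` are hypothetical
names used only for illustration, not the ones in the header.

```c
#include <assert.h>

/* Accept the GNU-style, Apple-style and MSVC-style spellings alike. */
#if defined(__aarch64__) || defined(__arm64__) || defined(_M_ARM64)
#define SKETCH_TARGET_ARM64 1
#else
#define SKETCH_TARGET_ARM64 0
#endif

/* Runtime-visible view of the compile-time decision. */
static int built_for_arm64(void) { return SKETCH_TARGET_ARM64; }
```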
…ImplID
This patch:
1. Removes the vendorId from `__riscv_vendor_feature_bits`.
2. Defines a new structure for vendorID, ArchID and ImplID.
3. Updates the related init code.
This patch adds a `void* PlatformArgs` parameter to
`__init_riscv_feature_bits`. `PlatformArgs` allows the platform to
provide pre-computed data and access it without extra effort. For
example, Linux could pass the vDSO object to avoid an extra system call.
```
__init_riscv_feature_bits()
->
__init_riscv_feature_bits(void *PlatformArgs)
```
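To illustrate how such a parameter might be consumed, here is a rough
sketch. Everything prefixed `sketch_` is hypothetical, including the
payload layout; the real contents of `PlatformArgs` are up to the
platform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload a platform might hand over; not the real layout. */
struct sketch_platform_args {
  unsigned long long precomputed_features;
};

static unsigned long long sketch_features;
static int sketch_slow_probes; /* counts how often the fallback ran */

/* Stand-in for an expensive query such as a system call. */
static unsigned long long sketch_slow_probe(void) {
  sketch_slow_probes++;
  return 0x1ULL;
}

/* Mirrors the new shape: the caller may pass NULL or platform data. */
static void sketch_init_feature_bits(void *PlatformArgs) {
  if (PlatformArgs != NULL) {
    const struct sketch_platform_args *args = PlatformArgs;
    sketch_features = args->precomputed_features;
    return;
  }
  sketch_features = sketch_slow_probe();
}
```

Passing NULL keeps the old behaviour, so existing callers only need the
extra argument.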
The spec can be found at
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/74.
1. Add the new extension GroupID/Bitmask with the latest hwprobe key.
2. Update `initRISCVFeature`.
3. Update `EmitRISCVCpuSupports`, since features are no longer limited to group 0.
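A small sketch of what a (GroupID, Bitmask) lookup could look like once
more than one group exists. The struct and function names are
hypothetical and the group count is an assumption; the real layout is
defined by the spec linked above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_GROUPS 2 /* assumption: more than one group exists */

struct sketch_feature_bits {
  unsigned length;                  /* number of valid groups */
  uint64_t features[SKETCH_GROUPS]; /* one bitmask per group */
};

/* An extension is identified by (group, bit), not by a bit in group 0 only. */
static int sketch_has_ext(const struct sketch_feature_bits *fb,
                          unsigned group, unsigned bit) {
  assert(fb != 0 && bit < 64);
  if (group >= fb->length)
    return 0; /* unknown group: treat the extension as absent */
  return (int)((fb->features[group] >> bit) & 1);
}
```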
GCC 14.1.1 emits errors about possibly uninitialized variables:
x86.c:303:17: error: ‘EAX’ may be used uninitialized
[-Werror=maybe-uninitialized]
x86.c:970:35: error: ‘MaxLevel’ may be used uninitialized
[-Werror=maybe-uninitialized]
x86.c:987:48: error: ‘MaxExtLevel’ may be used uninitialized
[-Werror=maybe-uninitialized]
GCC does not properly handle the fact that these variables are
initialized indirectly, in functions that take pointers to them.
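The pattern that trips the warning, and one common way to quiet it, can
be sketched as follows. The helper below is a hypothetical stand-in for
the CPUID wrappers in x86.c, and the explicit initializer is assumed to
be the remedy rather than taken from the patch.

```c
#include <assert.h>

/* Hypothetical helper that writes through its pointer only on success. */
static int sketch_get_max_level(unsigned leaf, unsigned *max_level) {
  assert(max_level != 0);
  if (leaf != 0)
    return 1; /* failure: *max_level is left untouched */
  *max_level = 7;
  return 0;
}

static unsigned sketch_query(unsigned leaf) {
  unsigned MaxLevel = 0; /* explicit initializer avoids -Wmaybe-uninitialized */
  if (sketch_get_max_level(leaf, &MaxLevel) != 0)
    return 0;
  return MaxLevel;
}
```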
This reverts commit f1905f0644.
This relands commit 19cf8deabe.
The preprocessor guards that should have excluded MSVC were still
pulling in the clang functions when building on Windows with intrin.h.
This relanding fixes that by additionally wrapping the uses of
__get_cpuid and __get_cpuid_count in _MSC_VER checks so that clang in
MSVC mode, which includes intrin.h, does not have any conflicts.
Changes included:
- Adding CONSTRUCTOR_ATTRIBUTE so that the static data is set up early on
in process lifetime. This is required by gcc docs for
__builtin_cpu_supports which we hope to implement in terms of this.
- Move the length initialization outside of the #if defined(linux) block
so that the length field always reflects the size of the structures even
if none of the feature bits are non-zero.
- Change the __riscv_vendor_feature_bits.length field to match the
length of the actual structure.
Note: Copy from https://github.com/llvm/llvm-project/pull/99958
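The first two changes can be sketched like this, assuming a GCC or Clang
style `constructor` attribute. The `sketch_`/`SKETCH_` names and the
array size are hypothetical.

```c
#include <assert.h>

#define SKETCH_FEATURE_WORDS 2 /* hypothetical size */

static struct {
  unsigned length;
  unsigned long long features[SKETCH_FEATURE_WORDS];
} sketch_feature_bits;

#if defined(__GNUC__)
#define SKETCH_CONSTRUCTOR __attribute__((constructor))
#else
#define SKETCH_CONSTRUCTOR /* no early init available in this sketch */
#endif

/* Runs before main() when the constructor attribute is supported. */
static void SKETCH_CONSTRUCTOR sketch_init(void) {
  /* Set the length unconditionally so it is valid on every platform. */
  sketch_feature_bits.length = SKETCH_FEATURE_WORDS;
#if defined(__linux__)
  /* Platform-specific probing would fill in features[] here. */
#endif
}
```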
---------
Co-authored-by: Philip Reames <preames@rivosinc.com>
This addresses the spurious inclusion of (now unsupported) target
features '-3dnow' and '-3dnowa' when disabling mmx (which then caused log
output from `clang -mno-mmx`).
It should've been part of PR #96246, but was missed.
Also tweaks the warning in prfchwintrin.h to not recommend the
deprecated mm3dnow.h header.
This reverts commit f6616e99c7.
Was causing buildbot failures on Windows. I also remember seeing an
AMDGPU buildbot failing somewhere on a warning as they have -Werror
enabled.
This reverts commit 2039e13064.
This relands commit 19cf8deabe.
Added some additional preprocessor directives to ensure that Host.cpp
only includes cpuid.h when being built on x86.
This patch unifies the implementation of getAMDProcessorTypeAndSubtype
between compiler-rt and LLVM.
This patch is intended to be a step towards pulling these functions out
into identical .inc files to better facilitate code sharing between LLVM
and compiler-rt.
This reverts commit 19cf8deabe.
This was causing quite a few buildbot failures (see the PR description).
Reverting for now while I have time to sort it out. Seems like it should
just be conditional preprocessor macros for X86 however.
This patch makes the host/feature detection in compiler-rt and LLVM use
the functions provided in cpuid.h (__get_cpuid, __get_cpuid_count)
instead of inline assembly. This simplifies the implementation and moves
any inline assembly away to a more common place.
A while ago, some similar cleanup was attempted, but this ended up
resulting in some compilation errors due to toolchain minimum version
issues (https://bugs.llvm.org/show_bug.cgi?id=30384). After the
reversion landed, there have been no attempts since then to clean up the
code, even though the minimum supported compilers now support the
relevant functions (https://godbolt.org/z/o1Mjz8ndv).
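As a rough sketch of the idea, a wrapper over `__get_cpuid_count` from
cpuid.h might look like the following. The wrapper name and its return
convention are hypothetical, and the guards (x86 only, not MSVC) are an
assumption about how the include should be fenced off.

```c
#include <assert.h>

#if defined(__x86_64__) || defined(__i386__)
#if !defined(_MSC_VER)
#include <cpuid.h>
#define SKETCH_HAVE_CPUID_H 1
#endif
#endif

/* Returns 0 and fills regs with EAX, EBX, ECX, EDX; nonzero on failure. */
static int sketch_cpuid(unsigned leaf, unsigned subleaf, unsigned regs[4]) {
  assert(regs != 0);
#if defined(SKETCH_HAVE_CPUID_H)
  /* __get_cpuid_count returns 1 on success and 0 for unsupported leaves. */
  return __get_cpuid_count(leaf, subleaf, &regs[0], &regs[1], &regs[2],
                           &regs[3]) ? 0 : 1;
#else
  (void)leaf;
  (void)subleaf;
  return 1; /* not x86, or MSVC mode: no cpuid.h in this sketch */
#endif
}
```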
This patch refactors the AArch64 CPUFeatures enum into a separate
include file that is identical between LLVM and compiler-rt. It also
adds a test in compiler-rt to ensure that the two stay in sync.
macOS 15.0 and iOS 18.0 added a new sysctl to fetch a bitvector of all
the hw.optional.arm.FEAT_*'s in one go. Using this has a perf advantage
over doing multiple round-trips to the kernel and back, but since it's
not present in older OSes, we still need the slow fallback.
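The fast-path-with-fallback shape can be sketched independently of the
actual sysctl names. The callbacks below are hypothetical stand-ins for
the bulk query and the per-feature queries, so the example runs on any
host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks standing in for the two kinds of kernel query. */
struct sketch_probe_ops {
  int (*bulk)(unsigned long long *mask); /* returns 0 on success */
  int (*single)(unsigned index);         /* returns 1 if present */
};

static unsigned long long sketch_detect(const struct sketch_probe_ops *ops,
                                        unsigned nfeatures) {
  unsigned long long mask = 0;
  assert(ops != NULL && nfeatures <= 64);
  /* Fast path: one round-trip when the newer query exists. */
  if (ops->bulk != NULL && ops->bulk(&mask) == 0)
    return mask;
  /* Slow fallback for older systems: one query per feature. */
  mask = 0;
  for (unsigned i = 0; i < nfeatures; i++)
    if (ops->single(i))
      mask |= 1ULL << i;
  return mask;
}

/* Example stubs: a bulk query that fails, one that works, a per-bit probe. */
static int sketch_bulk_missing(unsigned long long *m) { (void)m; return -1; }
static int sketch_bulk_ok(unsigned long long *m) { *m = 0x5ULL; return 0; }
static int sketch_single(unsigned index) { return index == 0 || index == 2; }
```

Both paths should yield the same mask; only the number of kernel
round-trips differs.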
To detect features we either use HWCAPs or directly extract system
register bitfields and compare them with a value. In many cases equality
comparisons give wrong results: for example, FEAT_SVE is not set if SVE2
is available (see issue #93651). I am also making the access to
__aarch64_cpu_features atomic.
The corresponding PR for the ACLE specification is
https://github.com/ARM-software/acle/pull/322.
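A minimal sketch of both points, using C11 atomics. The names are
hypothetical, the field position in the usage is arbitrary, and relaxed
ordering is an assumption made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint64_t sketch_cpu_features; /* stand-in for the real global */

/* A field value of `level` or higher implies the feature added at `level`. */
static int sketch_field_at_least(uint64_t reg, unsigned lsb, unsigned level) {
  assert(lsb <= 60);
  return ((reg >> lsb) & 0xf) >= level;
}

/* Writers publish the finished mask in one atomic store. */
static void sketch_publish(uint64_t features) {
  atomic_store_explicit(&sketch_cpu_features, features, memory_order_relaxed);
}

/* Readers never observe a half-written mask. */
static uint64_t sketch_read(void) {
  return atomic_load_explicit(&sketch_cpu_features, memory_order_relaxed);
}
```

With a field value of 2, an `== 1` test reports the level-1 feature as
missing, while `sketch_field_at_least(reg, lsb, 1)` reports it as
present.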
Reverts llvm/llvm-project#88965
This caused a test suite failure:
https://lab.llvm.org/buildbot/#/builders/185/builds/6583
NOEXE: test-suite::aarch64-acle-fmv-features.test
```
/home/tcwg-buildbot/worker/clang-aarch64-lld-2stage/test/test-suite/SingleSource/UnitTests/AArch64/acle-fmv-features.c:98:1: error: redefinition of 'check_sha1'
98 | CHECK(sha1, {
| ^
/home/tcwg-buildbot/worker/clang-aarch64-lld-2stage/test/test-suite/SingleSource/UnitTests/AArch64/acle-fmv-features.c:36:17: note: expanded from macro 'CHECK'
36 | static void check_##X(void) { \
| ^
<scratch space>:150:1: note: expanded from here
150 | check_sha1
| ^
```
I presume that the useless features need to be removed from the fmv test
as well.
As explained in https://github.com/ARM-software/acle/pull/315 we
are deprecating features which aren't adding any value. These are:
sha1, pmull, dit, dgh, ebf16, sve-bf16, sve-ebf16, sve-i8mm,
sve2-pmull128, memtag2, memtag3, ssbs2, bti, ls64_v, ls64_accdata
The patch adds support for FEAT_MOPS (Memory Copy and Memory Set
instructions) in Function Multi Versioning. The bits [19:16] of the
system register ID_AA64ISAR2_EL1 indicate whether FEAT_MOPS is
implemented in AArch64 state. This information is accessible via ELF
hwcaps.
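A hedged sketch of the hwcap route on Linux. `getauxval` and `AT_HWCAP2`
come from `<sys/auxv.h>`; the fallback value of `HWCAP2_MOPS` is an
assumption based on the Linux arm64 uapi header, and the `sketch_` names
are hypothetical.

```c
#include <assert.h>
#if defined(__linux__)
#include <sys/auxv.h>
#endif

#ifndef HWCAP2_MOPS
#define HWCAP2_MOPS (1ULL << 43) /* assumed value if the header lacks it */
#endif

/* Pure bit test, usable on any host. */
static int sketch_hwcap2_has_mops(unsigned long long hwcap2) {
  return (hwcap2 & HWCAP2_MOPS) != 0;
}

/* Queries the running system; reports 0 where the hwcap is unavailable. */
static int sketch_has_mops(void) {
#if defined(__linux__) && defined(__aarch64__)
  return sketch_hwcap2_has_mops(getauxval(AT_HWCAP2));
#else
  return 0;
#endif
}
```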
a15532d764 landed a patch that added
support for detecting more AMD znver2 CPUs and cleaned up some of the
surrounding code, including the znver3 detection. Since one model group
is 00h-0fh, I adjusted the check to include checking if the value is
greater than zero. Since the value is unsigned, this is always true and
gcc warns on it. This patch removes the comparison with zero to get rid
of the compiler warning.
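The shape of the fix can be sketched as below; the function name is
hypothetical and only the 00h-0fh group from the description is shown.

```c
#include <assert.h>

/* With an unsigned model number a lower bound of zero is redundant:
   `Model >= 0x00` is always true, which is what gcc warns about. */
static int sketch_in_first_model_group(unsigned Model) {
  return Model <= 0x0f; /* covers 00h-0fh without the >= 0 test */
}
```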
[builtins] Fix CPU feature detection for FreeBSD on AArch64
This is a follow-up to #75635 which broke the build for FreeBSD on
AArch64:
```
compiler-rt/lib/builtins/cpu_model/aarch64/lse_atomics/freebsd.inc:3:16: error: call to undeclared function 'elf_aux_info'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
3 | int result = elf_aux_info(AT_HWCAP, &hwcap, sizeof hwcap);
| ^
```
Using `elf_aux_info()` requires including `<sys/auxv.h>` first. To
prevent redeclaration issues with `hwcap.inc` attempting to define
`HWCAP_xxx` macros before `<sys/auxv.h>` does so, include `<sys/auxv.h>`
before any of the `.inc` files on FreeBSD.
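The include ordering can be sketched as follows. The fallback value for
`HWCAP_ATOMICS` is an assumption made so the example compiles
everywhere, and `sketch_has_lse_atomics` is a hypothetical name; off
FreeBSD/AArch64 it simply reports 0.

```c
#include <assert.h>

#if defined(__FreeBSD__)
#include <sys/auxv.h> /* first: declares elf_aux_info() and the HWCAP_* macros */
#endif

/* Stand-in for what hwcap.inc does: define only what is still missing. */
#ifndef HWCAP_ATOMICS
#define HWCAP_ATOMICS (1UL << 8) /* assumed fallback value */
#endif

static int sketch_has_lse_atomics(void) {
#if defined(__FreeBSD__) && defined(__aarch64__)
  unsigned long hwcap = 0;
  if (elf_aux_info(AT_HWCAP, &hwcap, sizeof hwcap) != 0)
    return 0; /* query failed: assume the feature is absent */
  return (hwcap & HWCAP_ATOMICS) != 0;
#else
  return 0;
#endif
}
```

Because the system header comes first, the `#ifndef` guard never
redefines a macro that `<sys/auxv.h>` already provides.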