These tests check that specifying individual Zvk* extensions sets the
preprocessor macros for the Zvk* shorthand extensions.
None of the shorthands refer to Zvbb, so we should use Zvkb (which
is implied by Zvbb).
This follows the same implementation logic as with C++ and is
compatible with the GCC behavior in C.
Trigraphs are enabled by default in -std=c* conformance modes before
C23, but are disabled in GNU and Microsoft modes as well as in C23 or
later.
Drop -menable-experimental-extensions where it isn't needed.
This file has sections for non-experimental and experimental extensions,
but we keep forgetting to move things when we change the extension
status.
This adds minimal support for 7 new unprivileged extensions that were
defined as a part of the RISC-V Profiles specification here:
https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#7-new-isa-extensions
* Ziccif: Main memory supports instruction fetch with atomicity
requirement
* Ziccrse: Main memory supports forward progress on LR/SC sequences
* Ziccamoa: Main memory supports all atomics in A
* Zicclsm: Main memory supports misaligned loads/stores
* Za64rs: Reservation set size of 64 bytes
* Za128rs: Reservation set size of 128 bytes
* Zic64b: Cache block size is 64 bytes
As stated in the specification, these extensions don't add any new
features but describe existing features. So this patch only adds
parsing and subtarget features.
Currently there are several bits of code in the AArch64 driver which
attempt to enforce dependencies between optional features in the -march=
and -mcpu= options. However, these are based on the list of feature
names being enabled/disabled, so they have a lot of logic to consider
the order in which features were turned on and off, which doesn't scale
well as dependency chains get longer.
This patch moves the code handling these dependencies to TargetParser,
and changes them to use a Bitset of enabled features. This makes it easy
to check which features are enabled, and is converted back to a list of
LLVM feature names once all of the command-line options are parsed.
The motivating example for this was the -mcpu=cortex-r82+nofp option.
Previously, the code handling the dependency between the fp16 and
fp16fml extensions did not consider the nofp modifier, so it added
+fullfp16 to the feature list. This should have been disabled by the
+nofp modifier, and also the backend did follow the dependency between
fullfp16 and fp, resulting in fp being turned back on in the backend.
Most of the dependencies added to AArch64TargetParser.h weren't known
about by clang before; I built that list by checking what the backend
thinks the dependencies between SubtargetFeatures are.
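As a rough illustration of why a set of bits is easier to work with than
an ordered list of names, the sketch below models three features and
their direct dependencies as bitmasks. The enum names, the `DEPENDS_ON`
table, and both helper functions are hypothetical and are not the
TargetParser code; only the fp / fp16 / fp16fml chain is taken from the
description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical three-feature model; the real feature list is much longer. */
enum { FEAT_FP, FEAT_FP16, FEAT_FP16FML, FEAT_COUNT };

#define BIT(f) (UINT32_C(1) << (f))

/* DEPENDS_ON[f] holds the features that f requires directly. */
static const uint32_t DEPENDS_ON[FEAT_COUNT] = {
    [FEAT_FP]      = 0,
    [FEAT_FP16]    = BIT(FEAT_FP),
    [FEAT_FP16FML] = BIT(FEAT_FP16),
};

/* Enabling a feature also enables everything it depends on. */
static uint32_t enable_feature(uint32_t enabled, int f) {
    assert(f >= 0 && f < FEAT_COUNT);
    if (enabled & BIT(f))
        return enabled;
    enabled |= BIT(f);
    for (int d = 0; d < FEAT_COUNT; ++d)
        if (DEPENDS_ON[f] & BIT(d))
            enabled = enable_feature(enabled, d);
    return enabled;
}

/* Disabling a feature also disables every enabled feature that depends on it. */
static uint32_t disable_feature(uint32_t enabled, int f) {
    assert(f >= 0 && f < FEAT_COUNT);
    enabled &= ~BIT(f);
    for (int d = 0; d < FEAT_COUNT; ++d)
        if ((DEPENDS_ON[d] & BIT(f)) && (enabled & BIT(d)))
            enabled = disable_feature(enabled, d);
    return enabled;
}
```

In this model, disabling `FEAT_FP` on a set that contains all three
features leaves an empty set, which is the outcome the
-mcpu=cortex-r82+nofp example expects for these features.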
This commit includes the necessary changes to clang and LLVM to support
codegen of `RVE` and the `ilp32e`/`lp64e` ABIs.
The differences between `RVE` and `RVI` are:
* `RVE` reduces the integer register count to 16 (x0-x15).
* The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits.
`RVE` can be combined with all current standard extensions.
The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:
* Only 6 integer argument registers (rather than 8).
* Only 2 callee-saved registers (rather than 12).
* A stack alignment of 32 bits (rather than 128 bits).
* ilp32e isn't compatible with the D ISA extension.
If `ilp32e` or `lp64e` is used with an ISA that has any of the registers
x16-x31 and f0-f31, then these registers are considered temporaries.
To be compatible with the implementation of ilp32e in GCC, we don't use
aligned registers to pass variadic arguments and set stack alignment
to 4 bytes for types with length of 2*XLEN.
FastCC is also supported on RVE, while GHC isn't since there is only one
available register.
Differential Revision: https://reviews.llvm.org/D70401
-mbranch-protection=gcs (enabled by -mbranch-protection=standard) causes
generated objects to be marked with the gcs feature. This is done via
the guarded-control-stack module flag, in a similar way to
branch-target-enforcement and sign-return-address.
Enabling GCS causes the GNU_PROPERTY_AARCH64_FEATURE_1_GCS bit to be set
on generated objects. No code generation changes are required, as GCS
just requires that functions are called using BL and returned from using
RET (or other similar variant instructions), which is already the case.
This removes a long standing piece of technical debt. Most other
platforms have moved all their header search path logic to the driver,
but Darwin still had some logic for setting framework search paths
present in the frontend. This patch moves that logic to the driver
alongside existing logic that already handles part of these search
paths.
This is intended to be a pure refactor without any functional change
visible to users, since the search paths before and after should be the
same, and in the same order. The change in the tests is necessary
because we would previously add the DriverKit framework search path in
the frontend regardless of whether we actually need to, which we now
handle correctly because the driver checks for ld64-605.1+.
Fixes #75638
This patch adds the instructions in the Zicfiss extension, which
supports shadow stacks for control-flow integrity. The patch is
based on version [0.3.1].
[0.3.1]: https://github.com/riscv/riscv-cfi/releases/tag/v0.3.1
This reverts 0d3eee33f2 and
4c37d30e22.
XSfcie is not an official SiFive extension name. It stands for
SiFive Custom Instruction Extension, which is mentioned in the S76
manual, but elsewhere the manual says it is not supported
for S76.
LLVM had various instructions and CSRs listed as part of this
extension, but as far as SiFive is concerned, none of them are part
of it. There are no documented extension names for these instructions
and CSRs either externally or internally.
If these are important to LLVM users, I can facilitate creating
extension names for them and have them documented. For now I'm
removing everything.
Unfortunately, these instructions and CSRs are in LLVM 17 so this
is an incompatible change.
This patch deprecates `module.map` in favor of `module.modulemap`, which
has been the preferred form since 2014. The eventual goal is to remove
support for `module.map` to reduce the number of stat calls Clang needs
to make while searching for module map files.
This patch touches a lot of files, but the majority of them are just
renaming tests or references to the file in comments or documentation.
The relevant files are:
* lib/Lex/HeaderSearch.cpp
* include/clang/Basic/DiagnosticGroups.td
* include/clang/Basic/DiagnosticLexKinds.td
GCC sets `#define HAVE_atomic_compare_and_swapti 1` and therefore
defines `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16`.
Clang compiles both the legacy 16-byte `__sync_bool_compare_and_swap` and
the new `__atomic_compare_exchange` to LDXP/STXP or (with LSE)
CASP{,A,L,AL}.
Link: https://github.com/llvm/llvm-project/issues/71883
Summary:
The standard GNU atomic operations are a very common way to target
hardware atomics on the device. With more heterogeneous devices being
introduced, the concept of memory scopes has been in the LLVM language
for a while via the `syncscope` modifier. For targets such as the GPU,
this can change code generation depending on whether the memory
ordering needs to be consistent with the entire system, the single GPU
device, or something lower.
Previously these scopes were only exported via the `opencl` and `hip`
variants of these functions. However, this made it difficult to use
outside of those languages and the semantics were different from the
standard GNU versions. This patch introduces a `__scoped_atomic` variant
for the common functions. There was some discussion over whether or not
these should be overloads of the existing ones, or simply new variants.
I leant towards new variants to be less disruptive.
The scope here can be one of the following
```
__MEMORY_SCOPE_SYSTEM // All devices and systems
__MEMORY_SCOPE_DEVICE // Just this device
__MEMORY_SCOPE_WRKGRP // A 'work-group' AKA CUDA block
__MEMORY_SCOPE_WVFRNT // A 'wavefront' AKA CUDA warp
__MEMORY_SCOPE_SINGLE // A single thread.
```
Naming consistency was attempted, but it is difficult to capture the full
spectrum with so many names. Suggestions appreciated.
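A minimal sketch of how one of the new variants might be used, assuming
the scoped builtin takes the same arguments as its GNU counterpart plus
a trailing scope argument. The `counter_add` helper and the
`HAVE_SCOPED_ATOMICS` macro are made up for this example; compilers
without the scoped builtins fall back to the unscoped GNU builtin.

```c
#include <assert.h>

/* Detect the scoped builtin; older compilers lack __has_builtin entirely. */
#if defined(__has_builtin)
#  if __has_builtin(__scoped_atomic_fetch_add)
#    define HAVE_SCOPED_ATOMICS 1
#  endif
#endif
#ifndef HAVE_SCOPED_ATOMICS
#  define HAVE_SCOPED_ATOMICS 0
#endif

/* Atomically adds delta to *counter and returns the previous value. */
static int counter_add(int *counter, int delta) {
    assert(counter != 0);
#if HAVE_SCOPED_ATOMICS
    /* Only needs to be consistent within this device. */
    return __scoped_atomic_fetch_add(counter, delta, __ATOMIC_RELAXED,
                                     __MEMORY_SCOPE_DEVICE);
#else
    /* Unscoped fallback: consistent with the whole system. */
    return __atomic_fetch_add(counter, delta, __ATOMIC_RELAXED);
#endif
}
```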
Positive options: -mapx-features=<comma-separated-features>
Negative options: -mno-apx-features=<comma-separated-features>
-m[no-]apx-features is designed to be able to control separate APX
features.
We also support the flag -m[no-]apxf, which can be used as an
alias of -m[no-]apx-features=< all APX features covered by CPUID APX_F>
Behaviour when positive and negative options are used together:
For boolean flags, the last one wins
-mapxf -mno-apxf -> -mno-apxf
-mno-apxf -mapxf -> -mapxf
For flags that take a set as arguments, it sets the mask by order of the
flags
-mapx-features=egpr,ndd -mno-apx-features=egpr -> -egpr,+ndd
-mapx-features=egpr -mno-apx-features=egpr,ndd -> -egpr,-ndd
-mno-apx-features=egpr -mapx-features=egpr,ndd -> +egpr,+ndd
-mno-apx-features=egpr,ndd -mapx-features=egpr -> -ndd,+egpr
The design is aligned with gcc
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628905.html
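The ordering rule for the set-valued flags can be modelled as two masks
that are updated left to right. This is only a sketch of the rule as
described above, not the driver code; the bit assignments, `struct
apx_state`, and `apx_apply` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical bit assignments for two of the APX sub-features. */
enum { APX_EGPR = 1, APX_NDD = 2, APX_ALL = APX_EGPR | APX_NDD };

/* Features explicitly turned on (+) or off (-) so far. */
struct apx_state { unsigned on, off; };

/* Apply one -mapx-features= (positive != 0) or -mno-apx-features= flag.
   A later flag overrides an earlier one for the features it names. */
static struct apx_state apx_apply(struct apx_state s, int positive,
                                  unsigned feats) {
    assert((feats & ~(unsigned)APX_ALL) == 0);
    if (positive) {
        s.on |= feats;
        s.off &= ~feats;
    } else {
        s.off |= feats;
        s.on &= ~feats;
    }
    return s;
}
```

Applying `egpr,ndd` positively and then `egpr` negatively leaves `ndd`
in `on` and `egpr` in `off`, matching the first row of the list above.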
While working on #embed, I noticed that the PR accidentally broke the
warning group but no tests failed as a result. This is adding the
missing test coverage.
Initial commits to support OpenACC. This patchset:
adds a clang command-line argument '-fopenacc', and starts
to define _OPENACC, albeit to '1' instead of the standardized
value (since we don't properly implement OpenACC yet).
The OpenACC spec defines `_OPENACC` to be equal to the latest standard
implemented. However, since we're not done implementing any standard,
we've defined this by default to be `1`. As it is useful to run our
compiler against existing OpenACC workloads, we're providing a
temporary override flag to change the `_OPENACC` value to be any
value consisting entirely of digits, permitting testing against any
existing OpenACC project.
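To show why the default value matters, here is a small sketch of the
kind of version check an existing OpenACC workload might contain. Both
helper names are invented for the example, and the threshold 201111 is
assumed to be the value published for OpenACC 1.0; a default of `1`
would fail such a check, which is what the override is meant to work
around.

```c
#include <assert.h>

/* Returns the value of _OPENACC, or 0 when OpenACC is not enabled. */
static long openacc_version(void) {
#ifdef _OPENACC
    long v = _OPENACC;
    assert(v > 0); /* the macro is expected to be a positive integer */
    return v;
#else
    return 0L;
#endif
}

/* Hypothetical check: does the value look like a standardized version? */
static int openacc_is_standard_version(void) {
    return openacc_version() >= 201111L;
}
```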
Exactly like the OpenMP parser, the OpenACC pragma parser needs to
consume and reprocess the tokens. This patch sets up the infrastructure
to do so by refactoring the OpenMP version of this into a more general
version that works for OpenACC as well.
Additionally, this adds a few diagnostics and token kinds to get us
started.
Sometimes a bpf developer might want to write different code
based on the particular cpu version. For example, the cpu v1/v2/v3
branch target is 16 bits while the cpu v4 branch target is 32 bits,
so cpu v4 allows more aggressive loop unrolling than cpu v1/v2/v3
(see [1] for a kernel selftest failure due to this).
We would like to maintain aggressive loop unrolling for cpu v4
while limiting loop unrolling for earlier cpu versions.
As another example, signed divide is also only available with cpu v4.
Adding cpu-specific macros is fairly common
in llvm. For example, x86 has macros like 'i486', '__pentium_mmx__', etc.
AArch64 has '__ARM_NEON', '__ARM_FEATURE_SVE', etc.
This patch adds the __BPF_CPU_VERSION__ macro. Current possible values
are 0/1/2/3/4. The following is the -mcpu=... to __BPF_CPU_VERSION__
mapping:
```
cpu                 __BPF_CPU_VERSION__
no -mcpu=<...>      1
-mcpu=v1            1
-mcpu=v2            2
-mcpu=v3            3
-mcpu=v4            4
-mcpu=generic       1
-mcpu=probe         0
```
This patch also adds macros that let developers identify cpu
insn features:
```
feature macro               enabled in which cpu
__BPF_FEATURE_JMP_EXT       >= v2
__BPF_FEATURE_JMP32         >= v3
__BPF_FEATURE_ALU32         >= v3
__BPF_FEATURE_LDSX          >= v4
__BPF_FEATURE_MOVSX         >= v4
__BPF_FEATURE_BSWAP         >= v4
__BPF_FEATURE_SDIV_SMOD     >= v4
__BPF_FEATURE_GOTOL         >= v4
__BPF_FEATURE_ST            >= v4
```
[1]
https://lore.kernel.org/bpf/3e3a8a30-dde0-43a1-981e-2274962780ef@linux.dev/
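A sketch of how a program might use these macros. The unroll budgets
(16 and 4) are placeholder numbers, and `MAX_UNROLL`, `emulated_sdiv`,
and `sdiv` are names invented for this example. The `assert` is there
only so the snippet can be tried on a host; a real bpf program would
not use it.

```c
#include <assert.h>

/* Pick a loop unroll budget from the cpu version (placeholder numbers). */
#if defined(__BPF_CPU_VERSION__) && __BPF_CPU_VERSION__ >= 4
#define MAX_UNROLL 16
#else
#define MAX_UNROLL 4
#endif

/* Signed division built from unsigned division, for cpus without sdiv.
   The result truncates toward zero, like the C '/' operator. */
static int emulated_sdiv(int a, int b) {
    assert(b != 0);
    unsigned ua = a < 0 ? 0u - (unsigned)a : (unsigned)a;
    unsigned ub = b < 0 ? 0u - (unsigned)b : (unsigned)b;
    unsigned q = ua / ub;
    /* Negate in unsigned arithmetic, then convert back to int. */
    return ((a < 0) != (b < 0)) ? (int)(0u - q) : (int)q;
}

/* Use the native signed divide when the target provides it. */
static int sdiv(int a, int b) {
#if defined(__bpf__) && !defined(__BPF_FEATURE_SDIV_SMOD)
    return emulated_sdiv(a, b);
#else
    return a / b;
#endif
}
```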
This reverts commit 578a4716f5.
This causes multiple issues. Compile time slowdown due to more path
canonicalization, and weird behavior on Windows.
Will reland under a separate flag `-f[no-]canonical-system-headers` to
match gcc in the future and further limit when it's passed by default.
Fixes #70011.
`ModuleDeclState` is incorrectly changed to `NamedModuleImplementation`
for `struct module {}; void foo(module a);`. This is mostly benign but
leads to a spurious warning after #69555.
A real world example is:
```
// pybind11.h
class module_ { ... };
using module = module_;
// tensorflow
void DefineMetricsModule(pybind11::module main_module);
// `module main_module);` incorrectly changes `ModuleDeclState` to `NamedModuleImplementation`
#include <algorithm> // spurious warning
```
This patch adds the compiler options -mlsx/-mlasx, which enable the
LSX and LASX instruction sets, and sets the related predefined macros
according to the options.
-mcmodel= is supported for a few architectures. Reject the option for
other architectures.
* -mcmodel= is unsupported on x86-32.
* -mcmodel=large is unsupported for PIC on AArch64.
* -mcmodel= is unsupported for aarch64_32 triples.
* https://reviews.llvm.org/D67066 (for RISC-V) made
-mcmodel=medany/-mcmodel=medlow aliases for all architectures. Restrict
this to RISC-V.
* llvm/lib/Target/Sparc has some small/medium/large support, but the
values listed on https://gcc.gnu.org/onlinedocs/gcc/SPARC-Options.html
had been supported before https://reviews.llvm.org/D67066. Consider
-mcmodel= unsupported for Sparc.
* https://reviews.llvm.org/D106371 translated -mcmodel=medium to
-mcmodel=large on AIX, even for 32-bit systems. Retain this behavior but
reject -mcmodel= for other PPC32 systems.
In general the accept/reject behavior is more similar to GCC.
err_drv_invalid_argument_to_option is less clear than
err_drv_unsupported_option_argument. As the supported values are
different for different architectures, add an
err_drv_unsupported_option_argument_for_target diagnostic for better
clarity.
The RISC-V C API introduced predefined macros that give hints about
unaligned accesses ([pr]). This patch defines __riscv_misaligned_fast
when using -mno-strict-align; otherwise, it defines
__riscv_misaligned_avoid.
Note: This ignores __riscv_misaligned_slow, which is also defined by
the spec.
[pr]: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/40
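A sketch of how code might pick a load strategy from these macros. The
`load_le32` helper is hypothetical, and the fast path assumes a
little-endian target, as is usual for RISC-V.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads a little-endian 32-bit value from a possibly unaligned pointer. */
static uint32_t load_le32(const unsigned char *p) {
    assert(p != NULL);
#if defined(__riscv_misaligned_fast)
    /* Misaligned access is cheap: a fixed-size memcpy is typically
       lowered to a single word load. Assumes little-endian. */
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
#else
    /* Avoid misaligned word loads: assemble the value byte by byte. */
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
#endif
}
```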
GCC defines this macro for how many single-precision floating point registers
can be used.
If the -mno-odd-spreg option is given, it will be 16; if neither -mno-odd-spreg
nor -modd-spreg is given, we set it to 16 for FPXX.
Reviewed By: theraven
Differential Revision: https://reviews.llvm.org/D157896