Enables the support of `-fcf-protection=return` on RISC-V, which
requires Zicfiss. It also adds a string attribute "hw-shadow-stack"
to every function if the option is set on RISC-V
This patch avoids eagerly populating the submodule index on `Module`
construction. The `StringMap` allocation shows up in my profiles of
`clang-scan-deps`, while the index is not necessary most of the time. We
still construct it on-demand.
Moreover, this patch avoids performing qualified submodule lookup in
`ASTReader` whenever we're serializing a module graph whose top-level
module is unknown. This is pointless, since that's guaranteed to never
find any existing submodules anyway.
This speeds up `clang-scan-deps` by ~0.5% on my workload.
This patch shrinks the size of the `Module` class from 2112B to 1624B. I
wasn't able to get a good data on the actual impact on memory usage, but
given my `clang-scan-deps` workload at hand (with tens of thousands of
instances), I think there should be some win here. This also speeds up
my benchmark by under 0.1%.
This commit implements the [wide-arithmetic] proposal which has recently
reached phase 2 in the WebAssembly proposals process. The goal here is
to implement support in LLVM for emitting these instructions which are
gated behind a new feature flag by default. A new `wide-arithmetic`
feature flag is introduced which gates these four new instructions from
being emitted.
Emission of each instruction itself is relatively simple given LLVM's
preexisting lowering rules and infrastructure. The main gotcha is that
due to the multi-result nature of all of these instructions it needed
the lowerings to be implemented in C++ rather than in TableGen.
[wide-arithmetic]: https://github.com/WebAssembly/wide-arithmetic
Two options for clang: -mlam-bh & -mno-lam-bh.
Enable or disable amswap[__db].{b/h} and amadd[__db].{b/h} instructions.
The default is -mno-lam-bh.
Only works on LoongArch64.
In `clang-scan-deps`, we're creating lots of `Module` instances.
Allocating them all in a bump-pointer allocator reduces the number of
retired instructions by 1-1.5% on my workload.
Currently, for AMDGPU, when compiling for OpenCL, we unconditionally use
`private` as the default address space. This is wrong for cases where
the `generic` address space is available, and is corrected via this
patch. In general, this AS map abuse is a bad hack and we should re-work
it altogether, but at least after this patch we will stop being
incorrect for e.g. OpenCL 2.0.
If we split these features in the compiler (see relevant pull request
https://github.com/llvm/llvm-project/pull/109299), we would only be able
to hand-write a 'memtag2' version using inline assembly since the
compiler cannot generate the instructions that become available with
FEAT_MTE2. However these instructions only work at Exception Level 1, so
they would be unusable since FMV is a user space facility. I am
therefore unifying them.
Approved in ACLE as https://github.com/ARM-software/acle/pull/351
This patch adds an IsText parameter to the following getBufferForFile,
getBufferForFileImpl. We introduce a new virtual function
openFileForReadBinary which defaults to openFileForRead except in
RealFileSystem which uses the OF_None flag instead of OF_Text.
The default is set to OF_Text instead of OF_None, this change in value
does not affect any other platforms other than z/OS. Setting this
parameter correctly is required to open files on z/OS in the correct
encoding. The IsText parameter is based on the context of where we open
files, for example, in the ASTReader, HeaderMap requires that files
always be opened in binary even though they might be tagged as text.
This change implements support for the `cr` and `cf` register
constraints (which allocate a RVC GPR or RVC FPR respectively), and the
`N` modifier (which prints the raw encoding of a register rather than
the name).
The intention behind these additions is to make it easier to use inline
assembly when assembling raw instructions that are not supported by the
compiler, for instance when experimenting with new instructions or when
supporting proprietary extensions outside the toolchain.
These implement part of my proposal in riscv-non-isa/riscv-c-api-doc#92
As part of the implementation, I felt there was not enough coverage of
inline assembly and the "in X" floating-point extensions, so I have
added more regression tests around these configurations.
MSVC has a set of qualifiers to allow using 32-bit signed/unsigned
pointers when building 64-bit targets. This is useful for WoW code
(i.e., the part of Windows that handles running 32-bit application on a
64-bit OS). Currently this is supported on x64 using the 270, 271 and
272 address spaces, but does not work for AArch64 at all.
This change adds the same 270, 271 and 272 address spaces to AArch64 and
adjusts the data layout string accordingly. Clang will generate the
correct address space casts, but these will currently be ignored until
the AArch64 backend is updated to handle them.
Partially fixes#62536
This is a resurrected version of <https://reviews.llvm.org/D158857>
(originally created by @a_vorobev) - I've cleaned it up a little, fixed
the rest of the tests and added to auto-upgrade for the data layout.
Gentoo is planning to introduce a `*t64` suffix for triples that will be
used by 32-bit platforms that use 64-bit `time_t`. Add support for
parsing and accepting these triples, and while at it make clang
automatically enable the necessary glibc feature macros when this suffix
is used.
An open question is whether we can backport this to LLVM 19.x. After
all, adding new triplets to Triple sounds like an ABI change — though I
suppose we can minimize the risk of breaking something if we move new
enum values to the very end.
It would be nice to see what our users think about this change, as this
is something that WG21/EWG quite wants to fix a handful of questionable
issues with UB. Depending on the outcome of this after being committed,
we might instead suggest EWG undeprecate this, and require a bit of
'magic' from the lexer.
Additionally, this patch makes it so we emit this diagnostic ALSO in
cases where the literal name is reserved. It doesn't make sense to limit
that.
---------
Co-authored-by: Vlad Serebrennikov <serebrennikov.vladislav@gmail.com>
To consolidate behavior of function mangling and limit the number of
places that ABI changes will need to be made, this switches the DirectX
target used for HLSL to use the Itanium ABI from the Microsoft ABI. The
Itanium ABI has greater flexibility in decisions regarding mangling of
new types of which we have more than a few yet to add.
One effect of this will be that linking library shaders compiled with
DXC will not be possible with shaders compiled with clang. That isn't
considered a terribly interesting use case and one that would likely
have been onerous to maintain anyway.
This involved adding a function to call all global destructors as the
Microsoft ABI had done.
This requires a few changes to tests. Most notably the mangling style
has changed which accounts for most of the changes. In making those
changes, I took the opportunity to harmonize some very similar tests for
greater consistency. I also shaved off some unneeded run flags that had
probably been copied over from one test to another.
Other changes effected by using the new ABI include using different
types when manipulating smaller bitfields, eliminating an unnecessary
alloca in one instance in this-assignment.hlsl, changing the way static
local initialization is guarded, and changing the order of inout
parameters getting copied in and out. That last is a subtle change in
functionality, but one where there was sufficient inconsistency in the
past that standardizing is important, but the particular direction of
the standardization is less important for the sake of existing shaders.
fixes#110736
Add the permutation clause for the interchange directive which will be
introduced in the upcoming OpenMP 6.0 specification. A preview has been
published in
[Technical Report12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf).
This is primarily meant to address the issue identified in #109182,
around incorrect usage of `-fsycl-is-device`; we now have AMDGCN
flavoured SPIR-V which retains the desired behaviour around the default
AS and does not depend on the SYCL language being enabled to do so.
Overall, there are three changes:
1. We unconditionally use the `SPIRDefIsGen` AS map for AMDGCNSPIRV
target, as there is no case where the hack of setting default to private
would be desirable, and it can be used for languages other than OCL/HIP;
2. We implement `SPIRVTargetCodeGenInfo::getGlobalVarAddressSpace` for
SPIR-V in general, because otherwise using it from languages other than
HIP or OpenCL would yield 0, incorrectly;
3. We remove the incorrect usage of `-fsycl-is-device`.
This patch enables the following command line flags for RISC-V targets:
+ `-fcf-protection=branch` turns on forward-edge control-flow integrity conditioning
+ `-mcf-branch-label-scheme=unlabeled|func-sig` selects the label scheme used in the forward-edge CFI conditioning
- Remove dependence on `STLExtras.h`.
- Remove unused header inclusions.
- Make `count` use `contains` for deduplication.
- Replace hand-written linear scans on Vector by `std::find`.
This change adds support for correctly lowering the `__scoped` Clang
builtins, and corresponding scoped LLVM instructions. These were
previously unconditionally lowered to Device scope, which is possibly incorrect.
Furthermore, the default / implicit scope is changed from Device (an
OpenCL assumption) to AllSvmDevices (aka System), since the SPIR-V BE is not
OpenCL specific / can ingest IR coming from other language front-ends. OpenCL
defaulting to Device scope is now reflected in the front-end handling of atomic
ops, which seems preferable.
For some reason `__BPF_FEATURE_MAY_GOTO` is available for CPUs v{2,3,4}
but is not available for CPU v1. This limitation is arbitrary:
- the instruction is never produced by LLVM backend;
- on Linux Kernel side this instruction is available in kernels that
also support CPUv4.
Hence, it is more consistent to either always allow
`__BPF_FEATURE_MAY_GOTO` or only allow it for CPUv4.
Only allow GPR registers and verify the size is the same as XLen.
This fixes the crash seen in #109588 by making it a frontend error.
gcc does accept the code so we may need to consider if we can fix the
backend. Some other targets I tried appear to have similar issues so it
might not be straightforward to fix.
This adds support for:
* `muslabin32` (MIPS N32)
* `muslabi64` (MIPS N64)
* `muslf32` (LoongArch ILP32F/LP64F)
* `muslsf` (LoongArch ILP32S/LP64S)
As we start adding glibc/musl cross-compilation support for these
targets in Zig, it would make our life easier if LLVM recognized these
triples. I'm hoping this'll be uncontroversial since the same has
already been done for `musleabi`, `musleabihf`, and `muslx32`.
I intentionally left out a musl equivalent of `gnuf64` (LoongArch
ILP32D/LP64D); my understanding is that Loongson ultimately settled on
simply `gnu` for this much more common case, so there doesn't *seem* to
be a particularly compelling reason to add a `muslf64` that's basically
deprecated on arrival.
Note: I don't have commit access.
Introduce changes necessary for UEFI X86_64 target Clang driver.
Addressed the review comments originally suggested in Phabricator.
Differential Revision: https://reviews.llvm.org/D159541
This patch adds an IsText parameter to the following functions
openFileForRead, getBufferForFile, getBufferForFileImpl and determines
whether a file is text by querying the file tag on z/OS. The default is
set to OF_Text instead of OF_None, this change in value does not affect
any other platforms other than z/OS.
Resolves: #70930 (and probably latest comments from clangd/clangd#251)
by fixing racing for the shared DiagStorage value which caused messing with args inside the storage and then formatting the following message with getArgSInt(1) == 2:
def err_module_odr_violation_function : Error<
"%q0 has different definitions in different modules; "
"%select{definition in module '%2'|defined here}1 "
"first difference is "
which causes HandleSelectModifier to go beyond the ArgumentLen so the recursive call to FormatDiagnostic was made with DiagStr > DiagEnd that leads to infinite while (DiagStr != DiagEnd).
The Main Idea:
Reuse the existing DiagStorageAllocator logic to make all DiagnosticBuilders having independent states.
Also, encapsulating the rest of state (e.g. ID and Loc) into DiagnosticBuilder.
The last attempt failed -
https://github.com/llvm/llvm-project/pull/108187#issuecomment-2353122096
so was reverted - #108838
As captured in issue #108044, HLSL 202x is the target language mode for
conformance for Clang. Earlier language modes will be a best effort and
prioritized after 2020x. To make this easier and reduce our testing
complexity we want to make 202x the default language mode now, and align
all earlier modes to match 202x (except where we explicitly deviate).
This change has the following concrete changes:
* All older language modes gain `CPlusPlus11` as a base
* The default language mode for HLSL sources is changed to 202x
* A few test cases are updated to resolve differences in generated
diagnostics.
Second to last change for #108044