This patch adds clang options `-mxcoff-roptr` and `-mno-xcoff-roptr` to specify storage locations for constant pointers on AIX.
When the `-mxcoff-roptr` option is in effect, constant pointers, virtual function tables, and virtual type tables are placed in read-only storage. When the `-mno-xcoff-roptr` option is in effect, pointers, virtual function tables, and virtual type tables are placed are placed in read/write storage.
This patch depends on https://reviews.llvm.org/D144189.
Reviewed By: hubert.reinterpretcast, stephenpeckham
Differential Revision: https://reviews.llvm.org/D144190
Currenlty there is a mismatch between LoongArch gcc and clang about
handling register name in inlineasm, i.e. gcc allows both `$`-prefixed
and non-prefiexed names for GPRs while clang only allows `$`-prefixed
one. This patch fixes this mismatch by adding non-prefixed GPR names
in clang.
Take `$r4` for example. With this patch, clang accepts `$r4`, `r4`,
`$a0` and `a0` like what gcc does.
Reviewed By: xen0n
Differential Revision: https://reviews.llvm.org/D136436
The support added by D149215 to remove memprof metadata and attributes
if we don't link with an allocator supporting hot/cold operator new
interfaces did not update imported code. Move the update handling later
in the ThinLTO backend to just after importing, and update the test to
check this case.
Differential Revision: https://reviews.llvm.org/D150295
Without opaque pointers, this code determined !heapallocsite based
on the innermost cast of the allocation call. With opaque pointers,
the casts no longer generate an instruction, so the outermost cast
is used. Add an explicit check for nested casts to prevent this.
Differential Revision: https://reviews.llvm.org/D145788
Clang on Windows targets often requires indirect calls through the
import address table (IAT), and also .refptr stubs for MinGW target.
On 32-bit this generates assembly in the form of
`call dword ptr [__imp__func]`, which MC had failed to handle correctly.
64-bit targets are not affected because rip-relative addressing is used.
Reported on: https://github.com/llvm/llvm-project/issues/62010
Depends on D149695, D149920
Differential Revision: https://reviews.llvm.org/D149579
Adds an LTO option to indicate that whether we are linking with an
allocator that supports hot/cold operator new interfaces. If not,
at the start of the LTO backends any existing memprof hot/cold
attributes are removed from the IR, and we also remove memprof metadata
so that post-LTO inlining doesn't add any new attributes.
This is done via setting a new flag in the module summary index. It is
important to communicate via the index to the LTO backends so that
distributed ThinLTO handles this correctly, as they are invoked by
separate clang processes and the combined index is how we communicate
information from the LTO link. Specifically, for distributed ThinLTO the
LTO related processes look like:
```
# Thin link:
$ lld --thinlto-index-only obj1.o ... objN.o -llib ...
# ThinLTO backends:
$ clang -x ir obj1.o -fthinlto-index=obj1.o.thinlto.bc -c -O2
...
$ clang -x ir objN.o -fthinlto-index=objN.o.thinlto.bc -c -O2
```
It is during the thin link (lld --thinlto-index-only) that we have
visibility into linker dependences and want to be able to pass the new
option via -Wl,-supports-hot-cold-new. This will be recorded in the
summary indexes created for the distributed backend processes
(*.thinlto.bc) and queried from there, so that we don't need to know
during those individual clang backends what allocation library was
linked. Since in-process ThinLTO and regular LTO also use a combined
index, for consistency we query the flag out of the index in all LTO
backends.
Additionally, when the LTO option is disabled, exit early from the
MemProfContextDisambiguation handling performed during LTO, as this is
unnecessary.
Depends on D149117 and D149192.
Differential Revision: https://reviews.llvm.org/D149215
Clang has mechanism to specify required target features of a built-in
function. This patch adds such definitions to Altivec, VSX, HTM,
PairedVec and MMA builtins.
This will help frontend to detect incompatible target features of
bulitin when using target attribute syntax.
Reviewed By: nemanjai, kamaub
Differential Revision: https://reviews.llvm.org/D143467
In a `__asm` block, a symbol reference is usually a memory constraint
(indirect TargetLowering::C_Memory) [LOOP]. CALL and JUMP instructions are special
that `__asm call k` can be an address constraint, if `k` is a function.
Clang always gives us indirect TargetLowering::C_Memory and need to convert it
to direct TargetLowering::C_Address. D133914 implements this conversion, but
does not consider JMP or case-insensitive CALL. This patch implements the missing
cases, so that `__asm jmp k` (`jmp ${0:P}`) will correctly lower to `jmp _k`
instead of `jmp dword ptr [_k]`.
(`__asm call k` lowered to `call dword ptr ${0:P}` and is fixed by D149695 to
lower to `call ${0:P}` instead.)
[LOOP]: Some instructions like LOOP{,E,NE} and Jcc always use an address
constraint (`loop _k` instead of `loop dword ptr [_k]`).
After this patch and D149579, all the following cases will be correct.
```
int k(int);
int (*kptr)(int);
...
__asm call k; // correct without this patch
__asm CALL k; // correct, but needs this patch to be compatible with D149579
__asm jmp k; // correct, but needs this patch to be compatible with D149579
__asm call kptr; // will be fixed by D149579. "Broken case" in clang/test/CodeGen/ms-inline-asm-functions.c
__asm jmp kptr; // will be fixed by this patch and D149579
```
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D149920
The desired semantics for HasLegalHalfType are slightly unclear in that
the comment for HasLegalHalfType says "True if the backend supports
operations on the half LLVM IR type." Which operations? We get very
limited scalar operations with zfhmin, more with zfh, and vector support
with zvfh. While the comment for hasLegalHalfType() says "Determine
whether _Float16 is supported on this target."
This patch sets HasLegalHalfType to true for zfh.
Differential Revision: https://reviews.llvm.org/D145071
The AOK_SizeDirective part from 5b37c18129
(2014-08) seems unneeded nowadays (the root cause has likely been fixed
elsewhere). The part abuses that `call dword ptr foo` assembles the same way as `call
foo` in Intel syntax, which is going to be fixed (changed) by D149579.
The generated object files for
CodeGen/ms-inline-asm{,-functions,-variables,-static-variable}.c and
CodeGenCXX/ms-inline-asm-fields.cpp are unchanged
(-mno-incremental-linker-compatible) with just this patch. When D149579 is
subsequently applied, the FIXME part of `kptr` in
CodeGen/ms-inline-asm-functions.c will be fixed.
Differential Revision: https://reviews.llvm.org/D149695
We filed some CD ballot comments which WG14 considered during the
ballot comment resolution meetings in Jan and Feb 2023, and this
updates our implementation based on the decisions reached. Those
decisions were (paraphrased for brevity):
US 9-034 (REJECTED)
allow (void *)nullptr to be a null pointer constant
US 10-035 (ACCEPTED)
accept the following code, as in C++:
void func(nullptr_t); func(0);
US 22-058 (REJECTED)
accept the following code, as in C++:
nullptr_t val; (void)(1 ? val : 0); (void)(1 ? nullptr : 0);
US 23-062 (REJECTED)
reject the following code, as in C++:
nullptr_t val; bool b1 = val; bool b2 = nullptr;
US 24-061 (ACCEPTED)
accept the following code, as in C++:
nullptr_t val; val = 0;
US 21-068 (ACCEPTED)
accept the following code, as in C++:
(nullptr_t)nullptr;
GB-071 (ACCEPTED)
accept the following code, as in C++:
nullptr_t val; (void)(val == nullptr);
This patch updates the implementation as appropriate, but is primarily
focused around US 10-035, US 24-061, and US 23-062 in terms of
functional changes.
Differential Revision: https://reviews.llvm.org/D148800
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.
The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.
The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take them instead of <4 x i32> as resource arguments. ptr
addrspace(8). These pointers are 128-bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, [addr_max]] value from applying to them.
Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.
This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.
Depends on D143437
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D145441
Since we don't always need the vendor extension to be in riscv_vector.td,
so it's better to make it be in separated header.
Depends on D148223 and D148680
Differential Revision: https://reviews.llvm.org/D148308
Before this patch, initialized class members would have the LifetimeKind
LK_MemInitializer, which does not allow for binding a temporary to a
reference. Binding to a temporary however is allowed in parenthesized
aggregate initialization, even if it leads to a dangling reference. To
fix this, we create a new EntityKind, EK_ParenAggInitMember, which has
LifetimeKind LK_FullExpression.
This patch does *not* attempt to diagnose dangling references as a
result of using this feature.
This patch also refactors TryOrBuildParenListInitialization(...) to
accomodate creating different InitializedEntity objects.
Fixes#61567
[0]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0960r3.html
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D148274
Previously, when checking whether an in-class initializer exists when
performing parenthesized aggregate initialization, Clang checks that the
output of FieldDecl::getInClassInitializer() is non-null. This is
incorrect; if the field is part of a templated type, then
getInClassInitializer() will return nullptr if we haven't called
Sem::BuildCXXDefaultInitExpr(...) before, even if
FieldDecl::hasInClassInitializer() returns true. The end result is that
Clang incorrectly ignores the in class initializer and
value-initializes the field. The fix therefore is to instead call
FieldDecl::hasInClassInitializer(), which is what we do for braced init
lists [0].
Before this patch, Clang does correctly recognize the in-class field
initializer in certain cases. This is Sema::BuildCXXDefaultInitExpr(...)
populates the in class initializer of the corresponding FieldDecl
object. Therefore, if that method was previously called with the same
FieldDecl object, as can happen with a decltype(...) or a braced list
initialization, FieldDecl::getInClassInitializer() will return a
non-null expression, and the field becomes properly initialized.
Fixes 62266
[0]: be5f35e24f/clang/lib/Sema/SemaInit.cpp (L685)
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D149389
This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.
There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flushing zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.
Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.
Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.
I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.
If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.
This could also be of use for strictfp handling, where code may be
changing the denormal mode.
Alternative name could be "unknown".
Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
This allows the user to set the size of the scalable vector so they
can be used in structs and as the type of global variables. This works
by representing the type as a fixed vector instead of a scalable vector
in IR. Conversions to and from scalable vectors are made where necessary
like function arguments/returns and intrinsics.
This features has been requested here
https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/176
I know arm_sve_vector_bits is used by the Eigen library so this
could be used to port Eigen to RVV.
This patch adds a new preprocessor define `__riscv_v_fixed_vlen` that
is set when -mrvv_vector_bits is passed on the command line.
The code is largely based on the AArch64 code. A lot of code was
copy/pasted and then modiied to RVV. There may be some opportunities
for sharing.
This first patch only supports the LMUL=1 types. Additional changes
will be needed to support other LMULs. I have also not supported
mask vectors.
Differential Revision: https://reviews.llvm.org/D145088
See https://discourse.llvm.org/t/rfc-enable-assignment-tracking/69399
This sets the -Xclang -fexperimental-assignment-tracking flag to the value
enabled which means it will be enabled so long as none of the following are
true: it's an LTO build, LLDB debugger tuning has been specified, or it's an O0
build (no work is done in any case if -g is not specified or -gmlt is used).
This reverts commit 0ba922f600 which reverts
https://reviews.llvm.org/D146987
With this patch an undefined mask in a shufflevector will be printed as poison.
This change is done to support the new shufflevector semantics
for undefined mask elements.
Differential Revision: https://reviews.llvm.org/D149210
For
`clang -c -g -fdebug-prefix-map=a/b=y -fdebug-prefix-map=a=x a/b/c.c`,
we apply the longest prefix substitution, but
GCC has always been picking the last applicable option (`a=x`, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109591).
I feel that GCC's behavior is reasonable given the convention that the last
value wins for the same option.
Before D49466, Clang appeared to apply the shortest prefix substitution,
which likely made the least sense.
Reviewed By: #debug-info, scott.linder
Differential Revision: https://reviews.llvm.org/D148975
This test was added in D72538, along with multiple LLVM pipeline tests,
to ensure that distributed ThinLTO backends invoked via clang set up the
expected ThinLTO optimization pipeline. However, this introduces churn
to clang tests from LLVM pipeline changes (see recent comment in that
patch).
Since the full pipeline setup is tested by LLVM, I have changed this
test to simply look for a single pass that is only invoked during LTO
backends, to make sure that clang is provoking the an LTO backend
pipeline setup.
This amends 2e275e2435. That commit added
a null to pointer cast kind when determining whether the expression can
be a valid constant initializer, but failed to update the constant
expression evaluator to perform the evaluation. This commit updates the
constant expression evaluator to handle that cast kind.
This commit implements the two NTLH intrinsic functions.
```
type __riscv_ntl_load (type *ptr, int domain);
void __riscv_ntl_store (type *ptr, type val, int domain);
```
```
enum {
__RISCV_NTLH_INNERMOST_PRIVATE = 2,
__RISCV_NTLH_ALL_PRIVATE,
__RISCV_NTLH_INNERMOST_SHARED,
__RISCV_NTLH_ALL
};
```
We encode the non-temporal domain into MachineMemOperand flags.
1. Create the RISC-V built-in function with custom semantic checking.
2. Assume the domain argument is a compile time constant,
and make it as LLVM IR metadata (nontemp_node).
3. Encode domain value as two bits MachineMemOperand TargetMMOflag.
4. According to MachineMemOperand TargetMMOflag, select corrsponding ntlh instruction.
Currently, it supports scalar type and fixed-length vector type.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D143364
This is effectively a debugging pass to adjust function attributes.
I don't think it makes sense to run it in the post-link pipeline.
Differential Revision: https://reviews.llvm.org/D148904
LICM does not use ORE from the pass manager, it constructs its
own instance. As such, explicitly requiring the analysis in the
pipeline is unnecessary.
See https://discourse.llvm.org/t/rfc-enable-assignment-tracking/69399
This sets the -Xclang -fexperimental-assignment-tracking flag to the value
enabled which means it will be enabled so long as none of the following are
true: it's an LTO build, LLDB debugger tuning has been specified, or it's an O0
build (no work is done in any case if -g is not specified or -gmlt is used).
This reverts commit a65ca4546b which reverts
https://reviews.llvm.org/D146987
This patch prefixes omp outlined helpers and reduction funcs
with the original function's name.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D140722