`CGBuilderTy::CreateElementBitCast()` no longer does what its name suggests.
Remove remaining in-tree uses by one of the following methods.
* drop the call entirely
* fold it to an `Address` construction
* replace it with `Address::withElementType()`
This is a NFC cleanup effort.
Reviewed By: barannikov88, nikic, jrtc27
Differential Revision: https://reviews.llvm.org/D154285
Move `AttributeMask` out of `llvm/IR/Attributes.h` to a new file
`llvm/IR/AttributeMask.h`. After doing this we can remove the
`#include <bitset>` and `#include <set>` directives from `Attributes.h`.
Since there are many headers including `Attributes.h`, but not needing
the definition of `AttributeMask`, this causes unnecessary bloating of
the translation units and slows down compilation.
This commit adds in the include directive for `llvm/IR/AttributeMask.h`
to the handful of source files that need to see the definition.
This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,917,509,187 to 1,902,982,273 - a
reduction of ~0.76%. This should result in a small improvement in
compilation time.
Differential Revision: https://reviews.llvm.org/D153728
Allow specifying 'nomerge' attribute for function pointers,
e.g. like in the following C code:
extern void (*foo)(void) __attribute__((nomerge));
void bar(long i) {
if (i)
foo();
else
foo();
}
With the goal to attach 'nomerge' to both calls done through 'foo':
@foo = external local_unnamed_addr global ptr, align 8
define dso_local void @bar(i64 noundef %i) local_unnamed_addr #0 {
; ...
%0 = load ptr, ptr @foo, align 8, !tbaa !5
; ...
if.then:
tail call void %0() #1
br label %if.end
if.else:
tail call void %0() #1
br label %if.end
if.end:
ret void
}
; ...
attributes #1 = { nomerge ... }
Report a warning in case if 'nomerge' is specified for a variable that
is not a function pointer, e.g.:
t.c:2:22: warning: 'nomerge' attribute is ignored because 'j' is not a function pointer [-Wignored-attributes]
2 | int j __attribute__((nomerge));
| ^
The intended use-case is for BPF backend.
BPF provides a sort of "standard library" functions that are called
helpers. BPF also verifies usage of these helpers before program
execution. Because of limitations of verification / runtime model it
is important to keep calls to some of such helpers from merging.
An example could be found by the link [1], there input C code:
if (data_end - data > 1024) {
bpf_for_each_map_elem(&map1, cb, &cb_data, 0);
} else {
bpf_for_each_map_elem(&map2, cb, &cb_data, 0);
}
Is converted to bytecode equivalent to:
if (data_end - data > 1024)
tmp = &map1;
else
tmp = &map2;
bpf_for_each_map_elem(tmp, cb, &cb_data, 0);
However, BPF verification/runtime requires to use the same map address
for each particular `bpf_for_each_map_elem()` call.
The 'nomerge' attribute is a perfect match for this situation, but
unfortunately BPF helpers are declared as pointers to functions:
static long (*bpf_for_each_map_elem)(void *map, ...) = (void *) 164;
Hence, this commit, allowing to use 'nomerge' for function pointers.
[1] https://lore.kernel.org/bpf/03bdf90f-f374-1e67-69d6-76dd9c8318a4@meta.com/
Differential Revision: https://reviews.llvm.org/D152986
Clang provides the `-mlink-bitcode-file` and `-mlink-builtin-bitcode`
options to insert LLVM-IR into the current TU. These are usefuly
primarily for including LLVM-IR files that require special handling to
be correct and cannot be linked normally, such as GPU vendor libraries
like `libdevice.10.bc`. Currently these options can only be used if the
source input goes through the AST consumer path. This patch makes the
changes necessary to also support this when the input is LLVM-IR. This
will allow the following operation:
```
clang in.bc -Xclang -mlink-builtin-bitcode -Xclang libdevice.10.bc
```
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D152391
Device libs make use of patterns like this:
```
__attribute__((target("gfx11-insts")))
static unsigned do_intrin_stuff(void)
{
return __builtin_amdgcn_s_sendmsg_rtnl(0x0);
}
```
For functions that are assumed to be eliminated if the currennt GPU target doesn't support them.
At O0 such functions aren't eliminated by common optimizations but often by AMDGPURemoveIncompatibleFunctions instead, which sees the "+gfx11-insts" attribute on, say, GFX9 and knows it's not valid, so it removes the function.
D142907 accidentally made it so such attributes were dropped during bitcode linking, making it impossible for RemoveIncompatibleFunctions to catch the functions and causing ISel to catch fire eventually.
This fixes the issue and adds a new test to ensure we don't accidentally fall into this trap again.
Fixes SWDEV-403642
Reviewed By: arsenm, yaxunl
Differential Revision: https://reviews.llvm.org/D152251
This patch uses castAs instead of getAs which will assert if the type doesn't match in clang::CodeGen::CodeGenTypes::GetFunctionTypeForVTable(clang::GlobalDecl).
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D151957
If there is an infinite cycle in the IR, the loop will never exit. Keep
track of visited basic blocks in a set and return nullptr if a block is
visited again.
Fixes#62830.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D151076
For the cover letter of this patch-set, please checkout D146872.
Depends on D146872.
This is the 2nd patch of the patch-set. This patch originates from
D97264. This patch further allows local variable declaration and
function parameter passing by adjustment in clang lowering.
Test cases are provided to demonstrate the LLVM IR generated.
Note: This patch is currently only a proof-of-concept with only a
single RVV tuple type declared here, the rest will be added when
the concept of this patch-set is accepted.
Authored-by: eop Chen <eop.chen@sifive.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D146873
This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.
There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flushing zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.
Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.
Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.
I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.
If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.
This could also be of use for strictfp handling, where code may be
changing the denormal mode.
Alternative name could be "unknown".
Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
Like CUDA and OpenCL, the SYCL specification says that throwing and
catching exceptions in device functions is not supported, so this change
extends the logic for adding the NoUnwind attribute to SYCL.
The existing convergent.cpp test, which tests that the convergent
attribute is added to functions by default, is renamed and reused to
test that the nounwind attribute is added by default. This test now has
-fexceptions added to it, which the driver adds by default as well.
The obvious question here is why not simply change the driver to remove
-fexceptions. This change follows the direction given by the TODO
comment because removing -fexceptions would also disable the
__EXCEPTIONS macro, which should reflect whether exceptions are enabled
on the host, rather than on the device, to avoid conflicts in types
shared between host and device.
Reviewed By: bader
Differential Revision: https://reviews.llvm.org/D147097
Currently clang emits error when both always_inline and target
attributes are on callee, but caller doesn't have some feature.
This patch makes clang emit error when caller cannot meet target feature
requirements from an always-inlined callee.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D143479
Set this on any source level floating-point type argument,
return value, call return or outgoing parameter which is lowered
to a valid IR type for the attribute. Currently this isn't
applied to emitted intrinsics since those don't go through
ABI code.
With the Microsoft ABI, some destructors need to offset a parameter to
get the derived this pointer, in which case the type of that parameter
should not be a pointer to the derived type.
Fixes#60465
Neither OpenCL nor C++ for OpenCL support exceptions, so add the
`nounwind` attribute unconditionally for those languages.
Differential Revision: https://reviews.llvm.org/D142033
This is the default behavior and cuts down on attribute spam.
Probably should also do something to consolidate the option spellings;
printing and parsing it is repeated in at least 3 different places.
In the OpenMP tests, I had to manually delete some metadata check
lines update_cc_test_checks was inserting that included the local
build revision.
This change will allow users to call getNullability() without providing an ASTContext.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D140104
Msan needs noundef consistency between interface and implementation. If
we call C++ from C we can have noundef on C++ side, and no noundef on
caller C side, noundef implementation will not set TLS for return value,
no noundef caller will expect it. Then we have false reports in msan.
The workaround could be set TLS to zero even for noundef return values.
However if we do that always it will increase binary size by about 10%.
If we do that selectively we need to handle "address is taken"
functions, any non local functions, and probably all function which have
musttail callers. Which is still a lot.
The existing implementation of HasStrictReturn refers to C standard as
the reason not enforcing noundef. I believe it applies only to the case
when return statement is omitted. Testing on Google codebase I never see
such cases, however I've see tens of cases where C code returns actual
uninitialized variables, but we ignore that it because of "omitted
return" case.
So this patch will:
1. fix false-positives with TLS missmatch.
2. detect bugs returning uninitialized variables for C as well.
3. report "omitted return" cases stricter than C, which is already a
warning and very likely a bug in a code anyway.
Reviewed By: kda
Differential Revision: https://reviews.llvm.org/D139296
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This value was added to clang/Basic in D111566, but is only used during
codegen, where we can use the LLVM IR DataLayout instead. I noticed this
because the downstream CHERI targets would have to also set this value
for AArch64/RISC-V/MIPS. Instead of duplicating more information between
LLVM IR and Clang, this patch moves getTargetAddressSpace(QualType T) to
CodeGenTypes, where we can consult the DataLayout.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D138296
Mixing LLVM and Clang address spaces can result in subtle bugs, and there
is no need for this hook to use the LLVM IR level address spaces.
Most of this change is just replacing zero with LangAS::Default,
but it also allows us to remove a few calls to getTargetAddressSpace().
This also removes a stale comment+workaround in
CGDebugInfo::CreatePointerLikeType(): ASTContext::getTypeSize() does
return the expected size for ReferenceType (and handles address spaces).
Differential Revision: https://reviews.llvm.org/D138295
The conditions for which Clang emits the `unsafe-fp-math` function
attribute has been modified as part of
`84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`.
In the backend code generators `"unsafe-fp-math"="true"` enable floating
point contraction for the whole function.
The intent of the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`
was to prevent backend code generators performing contractions when that
is not expected.
However the change is inaccurate and incomplete because it allows
`unsafe-fp-math` to be set also when only in-statement contraction is
allowed.
Consider the following example
```
float foo(float a, float b, float c) {
float tmp = a * b;
return tmp + c;
}
```
and compile it with the command line
```
clang -fno-math-errno -funsafe-math-optimizations -ffp-contract=on \
-O2 -mavx512f -S -o -
```
The resulting assembly has a `vfmadd213ss` instruction which corresponds
to a fused multiply-add. From the user perspective there shouldn't be
any contraction because the multiplication and the addition are not in
the same statement.
The optimized IR is:
```
define float @test(float noundef %a, float noundef %b, float noundef %c) #0 {
%mul = fmul reassoc nsz arcp afn float %b, %a
%add = fadd reassoc nsz arcp afn float %mul, %c
ret float %add
}
attributes #0 = {
[...]
"no-signed-zeros-fp-math"="true"
"no-trapping-math"="true"
[...]
"unsafe-fp-math"="true"
}
```
The `"unsafe-fp-math"="true"` function attribute allows the backend code
generator to perform `(fadd (fmul a, b), c) -> (fmadd a, b, c)`.
In the current IR representation there is no way to determine the
statement boundaries from the original source code.
Because of this for in-statement only contraction the generated IR
doesn't have instructions with the `contract` fast-math flag and
`llvm.fmuladd` is being used to represent contractions opportunities
that occur within a single statement.
Therefore `"unsafe-fp-math"="true"` can only be emitted when contraction
across statements is allowed.
Moreover the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7` doesn't
take into account that the floating point math function attributes can
be refined during IR code generation of a function to handle the cases
where the floating point math options are modified within a compound
statement via pragmas (see `CGFPOptionsRAII`).
For consistency `unsafe-fp-math` needs to be disabled if the contraction
mode for any scope/operation is not `fast`.
Similarly for consistency reason the initialization of `UnsafeFPMath` of
in `TargetOptions` for the backend code generation should take into
account the contraction mode as well.
Reviewed By: zahiraam
Differential Revision: https://reviews.llvm.org/D136786
This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.
The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.
High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.
Differential Revision: https://reviews.llvm.org/D135780
Calling `getFunctionLinkage(CalleeInfo.getCalleeDecl())` will crash when the declaration does not have a body, e.g., `extern void foo();`. Instead, we can use `isExternallyVisible()` to see if the delcaration has internal linkage.
I believe using `!isExternallyVisible()` is correct because the clang linkage must be `InternalLinkage` or `UniqueExternalLinkage`, both of which are "internal linkage" in llvm.
9c26f51f5e/clang/include/clang/Basic/Linkage.h (L28-L40)
Fixes https://github.com/llvm/llvm-project/issues/54139
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D135926
There are currently two options that are used to tell the compiler to perform
unsafe floating-point optimizations:
'-ffast-math' and '-funsafe-math-optimizations'.
'-ffast-math' is enabled by default. It automatically enables the driver option
'-menable-unsafe-fp-math'.
Below is a table illustrating the special operations enabled automatically by
'-ffast-math', '-funsafe-math-optimizations' and '-menable-unsafe-fp-math'
respectively.
Special Operations -ffast-math -funsafe-math-optimizations -menable-unsafe-fp-math
MathErrno 0 1 1
FiniteMathOnly 1 0 0
AllowFPReassoc 1 1 1
NoSignedZero 1 1 1
AllowRecip 1 1 1
ApproxFunc 1 1 1
RoundingMath 0 0 0
UnsafeFPMath 1 0 1
FPContract fast on on
'-ffast-math' enables '-fno-math-errno', '-ffinite-math-only',
'-funsafe-math-optimzations' and sets 'FpContract' to 'fast'. The driver option
'-menable-unsafe-fp-math' enables the same special options than
'-funsafe-math-optimizations'. This is redundant.
We propose to remove the driver option '-menable-unsafe-fp-math' and use
instead, the setting of the special operations to set the function attribute
'unsafe-fp-math'. This attribute will be enabled only if those special
operations are enabled and if 'FPContract' is either 'fast' or set to the
default value.
Differential Revision: https://reviews.llvm.org/D135097
When `objc_direct` methods were implemented, the implicit `_cmd` parameter was left as an argument to the method implementation function, but was unset by callers; if the method body referenced the `_cmd` variable, a selector load would be emitted inside the body. However, this leaves an unused argument in the ABI, and is unnecessary.
This change removes the empty/unset argument, and if `_cmd` is referenced inside an `objc_direct` method it will emit local storage for the implicit variable. From the ABI perspective, `objc_direct` methods will have the implicit `self` parameter, immediately followed by whatever explicit arguments are defined on the method, rather than having one unset/undefined register in the middle.
Differential Revision: https://reviews.llvm.org/D131424
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.
Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.
KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.
A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:
```
.c:
int f(void);
int (*p)(void) = f;
p();
.s:
.4byte __kcfi_typeid_f
.global f
f:
...
```
Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.
As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.
Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.
Relands 67504c9549 with a fix for
32-bit builds.
Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay
Differential Revision: https://reviews.llvm.org/D119296
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.
Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.
KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.
A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:
```
.c:
int f(void);
int (*p)(void) = f;
p();
.s:
.4byte __kcfi_typeid_f
.global f
f:
...
```
Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.
As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.
Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.
Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay
Differential Revision: https://reviews.llvm.org/D119296
I went over the output of the following mess of a command:
(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Differential Revision: https://reviews.llvm.org/D130827