Commit Graph

1225 Commits

Author SHA1 Message Date
Alan Zhao
c5b3fe2094 [clang] Automatically add the returns_twice attribute to certain functions even if -fno-builtin is set (#133511)
Certain functions require the `returns_twice` attribute in order to
produce correct codegen. However, `-fno-builtin` removes all knowledge
of functions that require this attribute, so this PR modifies Clang to
add the `returns_twice` attribute even if `-fno-builtin` is set. This
behavior is also consistent with what GCC does.

It's not (easily) possible to get the builtin information from
`Builtins.td` because `-fno-builtin` causes Clang to never initialize
any builtins, so functions are never recognized as functions/builtins
that require `returns_twice`. Therefore, the most straightforward
solution is to explicitly hard-code the function names that require
`returns_twice`.

Fixes #122840
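
A hedged sketch of the kind of code affected (illustrative only; `setjmp` is one of the functions that needs `returns_twice`):
```
#include <csetjmp>

static std::jmp_buf Buf;

// Without returns_twice on the setjmp call site, the optimizer may assume
// the call returns once and cache values across it, even at -fno-builtin.
int observe() {
  volatile int Phase = 0; // volatile so the store survives the longjmp
  if (setjmp(Buf) == 0) {
    Phase = 1;
    std::longjmp(Buf, 1); // control returns to setjmp a second time
  }
  return Phase; // 1 only if the second "return" of setjmp sees the store
}
```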
2025-03-31 09:42:34 -07:00
Kazu Hirata
d3c10a3897 [CodeGen] Use llvm::reverse (NFC) (#133550) 2025-03-28 19:55:32 -07:00
Liberty
c4ed0ad1f5 [Clang] Fix typo 'dereferencable' to 'dereferenceable' (#116761)
This patch corrects the typo 'dereferencable' to 'dereferenceable' in
CGCall.cpp.
The typo is located within a comment inside the
`void CodeGenModule::ConstructAttributeList` function.
2025-03-08 19:35:20 +00:00
Brandon Wu
c804e86f55 [RISCV][VLS] Support RISCV VLS calling convention (#100346)
This patch adds a function attribute `riscv_vls_cc` for the RISCV VLS
calling convention. It takes zero or one argument; the argument is the
`ABI_VLEN`, which is the `VLEN` used for passing fixed-vector arguments.
Each such argument is wrapped as a scalable vector (VLA) using the
`ABI_VLEN`, and the corresponding mechanism handles it. The range of
`ABI_VLEN` is [32, 65536]; if not specified, the default value is 128.

Here is an example of VLS argument passing:
Non-VLS call:
```
  void original_call(__attribute__((vector_size(16))) int arg) {}
=>
  define void @original_call(i128 noundef %arg) {
  entry:
    ...
    ret void
  }
```
VLS call:
```
  void __attribute__((riscv_vls_cc(256))) vls_call(__attribute__((vector_size(16))) int arg) {}
=>
  define riscv_vls_cc void @vls_call(<vscale x 1 x i32> %arg) {
  entry:
    ...
    ret void
  }
```

The first, non-VLS call passes the generic 16-byte vector argument as a
flattened integer.
In contrast, the VLS call uses `ABI_VLEN=256`, which wraps the vector to
<vscale x 1 x i32>, where the number of scalable vector elements is
calculated by `ORIG_ELTS * RVV_BITS_PER_BLOCK / ABI_VLEN`.
Note: ORIG_ELTS = Vector Size / Type Size = 128 / 32 = 4.

PsABI PR: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/418
C-API PR: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/68
2025-03-03 12:39:35 +08:00
Alois Klink
79a28aa0a4 [clang] Ignore GCC 11 [[malloc(x)]] attribute
Ignore the `[[malloc(x)]]` or `[[malloc(x, 1)]]` function attribute
syntax added in [GCC 11][1] and print a warning instead of an error.

Unlike `[[malloc]]` with no arguments (which is supported by Clang),
GCC uses the one or two argument form to specify a deallocator for
GCC's static analyzer.

Code currently compiled with `[[malloc(x)]]` or
`__attribute((malloc(x)))` fails with the following error:
`'malloc' attribute takes no arguments`.

[1]: https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;f=gcc/doc/extend.texi;h=dce6c58db87ebf7f4477bd3126228e73e4eeee97#patch6

Fixes: https://github.com/llvm/llvm-project/issues/51607
Partial-Bug: https://github.com/llvm/llvm-project/issues/53152
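
A hedged sketch of the attribute forms involved (`my_free` is a hypothetical deallocator, not from the patch):
```
// With this patch, the one- and two-argument GCC 11 forms below are
// warned about and ignored, instead of being rejected outright.
void my_free(void *p);

__attribute__((malloc)) void *alloc_plain();          // no args: supported
__attribute__((malloc(my_free))) void *alloc_a();     // GCC 11 deallocator form
__attribute__((malloc(my_free, 1))) void *alloc_b();  // two-argument form
```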
2025-02-27 08:06:58 -08:00
Alex Voicu
c8f70d7286 [clang][CodeGen] Additional fixes for #114062 (#128166)
This addresses two issues introduced by moving indirect args into an
explicit AS (please see
<https://github.com/llvm/llvm-project/pull/114062#issuecomment-2659829790>
and
<https://github.com/llvm/llvm-project/pull/114062#issuecomment-2661158477>):

1. Unconditionally stripping casts from a pre-allocated return slot was
incorrect / insufficient (this is illustrated by the
`amdgcn_sret_ctor.cpp` test);
2. Putting compiler-manufactured sret args in a non-default AS can lead
to a C-cast (surprisingly) requiring an AS cast (this is illustrated by
the `sret_cast_with_nonzero_alloca_as.cpp` test).

The way we handle (2), by subverting CK_BitCast emission iff a sret arg
is involved, is quite naff, but I couldn't think of any other way to use
a non-default indirect AS and make this case work (hopefully this is a
failure of imagination on my part).
2025-02-27 09:03:17 +07:00
Alex Voicu
a7a356833d [NFC][Clang][CodeGen] Remove vestigial assertion (#127528)
This removes a vestigial assertion, which would erroneously trigger even
though we now correctly handle valid arg mismatches
(<2dda529838/clang/lib/CodeGen/CGCall.cpp (L5397)>),
after #114062 went in.
2025-02-17 22:05:22 +02:00
Alex Voicu
39ec9de7c2 [clang][CodeGen] sret args should always point to the alloca AS, so use that (#114062)
`sret` arguments are always going to reside in the stack/`alloca`
address space, which makes the current formulation where their AS is
derived from the pointee somewhat quaint. This patch ensures that `sret`
ends up pointing to the `alloca` AS in IR function signatures, and also
guards against trying to pass a casted `alloca`d pointer to a `sret` arg,
which can happen for most languages, when compiled for targets that have
a non-zero `alloca` AS (e.g. AMDGCN) / map `LangAS::default` to a
non-zero value (SPIR-V). A target could still choose to do something
different here, by e.g. overriding `classifyReturnType` behaviour.

In a broader sense, this patch extends non-aliased indirect args to also
carry an AS, which leads to changing the `getIndirect()` interface. At
the moment we're only using this for (indirect) returns, but it allows
for future handling of indirect args themselves. We default to using the
AllocaAS as that matches what Clang is currently doing, however if, in
the future, a target would opt for e.g. placing indirect returns in some
other storage, with another AS, this will require revisiting.
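
A hedged illustration of code that takes the indirect-return path (hypothetical C++; assumes a target with a non-zero `alloca` AS such as AMDGCN):
```
// A struct too large for registers is returned through an sret pointer;
// after this patch that pointer is typed in the alloca address space
// (e.g. addrspace(5) on AMDGCN) in the IR signature.
struct Big { int v[32]; };

Big makeBig() {
  Big b{};
  return b; // lowered as: callee writes through the sret pointer
}
```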

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
2025-02-14 11:20:45 +00:00
Nikita Popov
29441e4f5f [IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
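
A hedged sketch of the property itself (hypothetical function, not from the PR):
```
// P is only loaded from and never escapes, so optimized IR previously
// annotated it `nocapture` and now annotates it `captures(none)`.
int readThrough(const int *P) {
  return *P;
}
```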
2025-01-29 16:56:47 +01:00
Wolfgang Pieb
4424c44c8c [Clang] Add fake use emission to Clang with -fextend-lifetimes (#110102)
Following the previous patch, which added the "extend lifetimes" flag
without (almost) any functionality, this patch adds the real feature by
allowing Clang to emit fake uses. These are emitted as a new form of cleanup,
set for variable addresses, which simply emits a fake use intrinsic when the
variable goes out of scope. The code for achieving this is simple, with most
of the logic centered on determining whether to emit a fake use for a given
address, and on ensuring that fake uses are ignored in a few cases.
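
A hedged illustration of where a fake use would fire (hypothetical function, not from the patch):
```
// Under -fextend-lifetimes, a fake-use cleanup for `doubled` is emitted
// when it goes out of scope, keeping it visible to a debugger even after
// its last real use.
int compute(int x) {
  int doubled = x * 2; // last real use is on the next line
  return doubled + 1;  // fake use of `doubled` fires at scope exit
}
```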

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2025-01-28 12:30:31 +00:00
Kazu Hirata
35f9d2ac49 [CodeGen] Migrate away from PointerUnion::dyn_cast (NFC) (#122778)
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses dyn_cast
because we expect Prototype.P to be nonnull.
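
A hedged sketch of the migration pattern (`Decl`/`Expr` are stand-in types, not the code touched by this patch):
```
#include "llvm/ADT/PointerUnion.h"

struct alignas(8) Decl {};
struct alignas(8) Expr {};
using NodePtr = llvm::PointerUnion<Decl *, Expr *>;

Decl *getDecl(NodePtr P) {
  // Before: P.dyn_cast<Decl *>()
  // After: the free-function form; plain dyn_cast rather than
  // dyn_cast_if_present because P is known to be nonnull here.
  return llvm::dyn_cast<Decl *>(P);
}
```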
2025-01-13 20:53:13 -08:00
Sander de Smalen
b4ce29ab31 [AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (#121788)
This adds support for parsing the attribute and codegen to map it to
the "aarch64_za_state_agnostic" LLVM IR attribute.

This attribute is described in the Arm C Language Extensions (ACLE)
document:

  https://github.com/ARM-software/acle/blob/main/main/acle.md#__arm_agnostic
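
A hedged sketch of the source-level syntax (assuming an SME-enabled AArch64 target):
```
// Declares a function whose behavior is agnostic to ZA state; Clang maps
// this to the "aarch64_za_state_agnostic" LLVM IR attribute.
void za_agnostic_callee() __arm_agnostic("sme_za_state");
```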
2025-01-12 21:35:44 +00:00
Timm Baeder
cfe26358e3 Reapply "[clang] Avoid re-evaluating field bitwidth" (#122289) 2025-01-11 07:12:37 +01:00
Thurston Dang
55b587506e [ubsan][NFCI] Use SanitizerOrdinal instead of SanitizerMask for EmitCheck (exactly one sanitizer is required) (#122511)
The `Checked` parameter of `CodeGenFunction::EmitCheck` is of type
`ArrayRef<std::pair<llvm::Value *, SanitizerMask>>`, which is overly
generalized: SanitizerMask can denote that zero or more sanitizers are
enabled, but `EmitCheck` requires that exactly one sanitizer is
specified in the SanitizerMask (e.g.,
`SanitizeTrap.has(Checked[i].second)` enforces that).

This patch replaces SanitizerMask with SanitizerOrdinal in the `Checked`
parameter of `EmitCheck` and code that transitively relies on it. This
should not affect the behavior of UBSan, but it has the advantages that:
- the code is clearer: it avoids ambiguity in EmitCheck about what to do
if multiple bits are set
- specifying the wrong number of sanitizers in `Checked[i].second` will
be detected as a compile-time error, rather than a runtime assertion
failure

Suggested by Vitaly in https://github.com/llvm/llvm-project/pull/122392
as an alternative to adding an explicit runtime assertion that the
SanitizerMask contains exactly one sanitizer.
2025-01-10 12:40:57 -08:00
Timm Bäder
59bdea24b0 Revert "[clang] Avoid re-evaluating field bitwidth (#117732)"
This reverts commit 81fc3add1e.

This breaks some LLDB tests, e.g.
SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp:

lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.
2025-01-08 15:09:52 +01:00
Timm Baeder
81fc3add1e [clang] Avoid re-evaluating field bitwidth (#117732)
Save the bitwidth value as a `ConstantExpr` with the value set. Remove
the `ASTContext` parameter from `getBitWidthValue()`, so the latter
simply returns the value from the `ConstantExpr` instead of
constant-evaluating the bitwidth expression every time it is called.
2025-01-08 14:45:19 +01:00
天音あめ
ca5fd06366 [clang] Fix crashes when passing VLA to va_arg (#119563)
Closes #119360.

This bug occurs when passing a VLA to `va_arg`. Since the return value
is inferred to be an array, it triggers
`ScalarExprEmitter::VisitCastExpr`, which converts it to a pointer and
subsequently calls `CodeGenFunction::EmitAggExpr`. At this point,
because the inferred type is an `AggExpr` instead of a `ScalarExpr`,
`ScalarExprEmitter::VisitVAArgExpr` is not invoked, and as a result,
`CodeGenFunction::EmitVariablyModifiedType` is also not called, leading
to the size of the VLA not being retrieved.
The solution is to move the call to
`CodeGenFunction::EmitVariablyModifiedType` into
`CodeGenFunction::EmitVAArg`, ensuring that the size of the VLA is
correctly obtained regardless of whether the expression is an `AggExpr`
or a `ScalarExpr`.
2025-01-07 07:49:43 -05:00
Sameer Sahasrabuddhe
df67e37e37 [clang][NFC] clean up the handling of convergence control tokens (#121738) 2025-01-06 21:34:11 +05:30
Brandon Wu
8e7f1bee84 [clang][RISCV] Remove unneeded RISCV tuple code (#121024)
This code is no longer needed because we've modeled the tuple type using
target extension types rather than structures of scalable vectors.
2024-12-25 22:48:54 +08:00
Pedro Lobo
f28e52274c [Clang] Change two placeholders from undef to poison [NFC] (#119141)
- Use `poison` instead of `undef` as a phi operand for an unreachable path (the predecessor
will not branch to the BB that uses the value of the phi).
- Call `@llvm.vector.insert` with a `poison` subvec when performing a
`bitcast` from a fixed vector to a scalable vector.
2024-12-10 15:57:55 +00:00
Kazu Hirata
91d6e10cca [CodeGen] Migrate away from PointerUnion::{is,get} (NFC) (#118600)
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2024-12-06 01:45:56 -08:00
Sarah Spall
46de3a7064 [HLSL] get inout/out ABI for array parameters working (#111047)
Get inout/out parameters working for HLSL Arrays.
Utilizes the fix from #109323, and corrects the assignment behavior
slightly to allow for Non-LValues on the RHS.
Closes #106917

---------

Co-authored-by: Chris B <beanz@abolishcrlf.org>
2024-12-03 17:43:36 -08:00
Pedro Lobo
98e747ba56 [clang] Use poison instead of undef as the placeholder when creating a new vector [NFC] (#117064)
Call `@llvm.vector.insert` with a `poison` vector when coercing a fixed
vector to a scalable vector with the same element type.
2024-12-02 09:00:39 +00:00
Benjamin Maxwell
db6f627f3f [clang][SME] Ignore flatten/clang::always_inline statements for callees with mismatched streaming attributes (#116391)
If `__attribute__((flatten))` is used on a function, or
`[[clang::always_inline]]` on a statement, don't inline any callees with
incompatible streaming attributes. Without this check, clang may produce
incorrect code when these attributes are used in code with streaming
functions.

Note: The docs for flatten say it can be ignored when inlining is
impossible: "causes calls within the attributed function to be inlined
unless it is impossible to do so".

Similarly, the (clang-only) `[[clang::always_inline]]` statement
attribute is more relaxed than the GNU `__attribute__((always_inline))`
(which says it should error if it can't inline), saying only "If a
statement is marked [[clang::always_inline]] and contains calls, the
compiler attempts to inline those calls." The docs also go on to show
an example of where `[[clang::always_inline]]` has no effect.
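
A hedged sketch of the mismatch case (hypothetical functions; assumes an SME-enabled AArch64 target):
```
// streaming_leaf runs in streaming mode; nonstreaming_caller does not.
// With this patch, `flatten` no longer forces the unsafe inline.
void streaming_leaf() __arm_streaming {}

__attribute__((flatten))
void nonstreaming_caller() {
  streaming_leaf(); // not inlined: incompatible streaming attributes
}
```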
2024-11-26 14:26:34 +00:00
Kazu Hirata
e8a6624325 [CodeGen] Remove unused includes (NFC) (#116459)
Identified with misc-include-cleaner.
2024-11-16 07:37:13 -08:00
joaosaffran
481bce018e Adding splitdouble HLSL function (#109331)
- Adding HLSL `splitdouble` intrinsics
- Adding DXIL lowering
- Adding SPIRV lowering
- Adding test

Fixes: #108901

---------

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
2024-10-28 13:26:59 -07:00
Momchil Velikov
53f7f8ecca [Clang][AArch64] Fix Pure Scalables Types argument passing and return (#112747)
Pure Scalable Types are defined in AAPCS64 here:

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#pure-scalable-types-psts

And should be passed according to Rule C.7 here:

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#682parameter-passing-rules

This part of the ABI is completely unimplemented in Clang; instead it
treats PSTs sometimes as HFAs/HVAs, sometimes as general composite types.

This patch implements the rules for passing PSTs by employing the
`CoerceAndExpand` method and extending it to:
* allow array types in the `coerceToType`; now only `[N x i8]` are
considered padding.
* allow mismatch between the elements of the `coerceToType` and the
elements of the `unpaddedCoerceToType`; AArch64 uses this to map
fixed-length vector types to SVE vector types.

Correctly passing a PST argument needs a decision in Clang about whether
to pass it in memory or registers or, equivalently, whether to use the
`Indirect` or `Expand/CoerceAndExpand` method. It was considered
relatively hard (or not practically possible) to make that decision in
the AArch64 backend.
Hence this patch implements the register counting from AAPCS64 (cf.
`NSRN`, `NPRN`) to guide Clang's decision.
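
A hedged sketch of a PST (hypothetical type; assumes an SVE-enabled AArch64 target):
```
#include <arm_sve.h>

// An aggregate built only from scalable vectors and predicates is a PST;
// Rule C.7 passes it in SVE registers when enough remain (tracked via the
// NSRN/NPRN counters), otherwise indirectly in memory.
struct PST {
  svbool_t Pred;   // predicate register candidate
  svfloat32_t V0;  // vector register candidates
  svfloat32_t V1;
};

void takesPST(PST P); // p0/z0/z1 if registers suffice
```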
2024-10-28 15:43:14 +00:00
Kiran
a96c14eeb8 [Clang] Always forward sret parameters to musttail calls
If a call using the musttail attribute returns its value through an
sret argument pointer, we must forward an incoming sret pointer to it,
instead of creating a new alloca. This is always possible because the
musttail attribute requires the caller and callee to have the same
argument and return types.
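
A hedged sketch of the shape involved (hypothetical functions, not from the patch):
```
struct Big { int v[16]; };

Big tailCallee(int x);

Big tailCaller(int x) {
  // The sret slot tailCaller received is forwarded to tailCallee rather
  // than creating a new alloca; musttail guarantees matching signatures.
  __attribute__((musttail)) return tailCallee(x);
}
```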
2024-10-25 09:34:08 +01:00
Jay Foad
4dd55c567a [clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399)
Follow up to #109133.
2024-10-24 10:23:40 +01:00
Jonas Paulsson
14120227a3 Target ABI: improve call parameters extensions handling (#100757)
For the purpose of verifying proper argument extensions per the target's ABI,
introduce the NoExt attribute that may be used by a target when neither sign-
nor zero-extension is required (e.g. with a struct in register). The purpose of
doing so is to be able to verify that there is always one of these attributes
present, thereby detecting cases where sign/zero extension is actually
missing.

As a first step, this patch performs the verification for the SystemZ
backend only, but leaves it off by default until all known issues have
been addressed.

Other targets/front-ends can now also add the NoExt attribute where needed
and do this check in the backend.
2024-09-19 16:59:31 +02:00
Chris B
89fb8490a9 [HLSL] Implement output parameter (#101083)
HLSL output parameters are denoted with the `inout` and `out` keywords
in the function declaration. When an argument to an output parameter is
constructed a temporary value is constructed for the argument.

For `inout` parameters the argument is initialized via copy-initialization
from the argument lvalue expression to the parameter type. For `out`
parameters the argument is not initialized before the call.

In both cases on return of the function the temporary value is written
back to the argument lvalue expression through an implicit assignment
binary operator with casting as required.

This change introduces a new HLSLOutArgExpr AST node which represents
the output argument behavior. The HLSLOutArgExpr has three defined children:
- An OpaqueValueExpr of the argument lvalue expression.
- An OpaqueValueExpr of the copy-initialized parameter.
- A BinaryOpExpr assigning the first with the value of the second.

Fixes #87526

---------

Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: John McCall <rjmccall@gmail.com>
2024-08-31 10:59:08 -05:00
Kiran
c50d11e6d9 Revert "[ARM] musttail fixes"
committed by accident, see #104795

This reverts commit a2088a24da.
2024-08-27 11:17:17 +01:00
Kiran
ad468da038 Revert "Separate frontend changes, add debug directives, remove redundant stuff from tests"
This reverts commit 1a908c6be3.
2024-08-27 10:46:18 +01:00
Kiran
1a908c6be3 Separate frontend changes, add debug directives, remove redundant stuff from tests 2024-08-27 10:44:06 +01:00
Kiran
a2088a24da [ARM] musttail fixes
Backend:
- Caller and callee arguments no longer have to match, just to take up the same space, as they can be changed before the call
- Allowed tail calls if caller and callee both (or neither) use sret, whereas before it would be disallowed if either used sret
- Allowed tail calls if byval args are used
- Added debug trace for IsEligibleForTailCallOptimisation

Frontend (clang):
- Do not generate extra alloca if sret is used with musttail, as the space for the sret is allocated already

Change-Id: Ic7f246a7eca43c06874922d642d7dc44bdfc98ec
2024-08-27 10:44:06 +01:00
eddyz87
64e464349b [BPF] introduce __attribute__((bpf_fastcall)) (#105417)
This commit introduces attribute bpf_fastcall to declare BPF functions
that do not clobber some of the caller saved registers (R0-R5).

The idea is to generate code complying with the generic BPF ABI,
but allow a compatible Linux kernel to remove unnecessary spills and
fills of non-scratched registers (given some compiler assistance).

For such functions, register allocation is done as if caller-saved
registers are not clobbered, but the calls are later wrapped with spill
and fill patterns that are simple to recognize in the kernel.

For example for the following C code:

    #define __bpf_fastcall __attribute__((bpf_fastcall))

    void bar(void) __bpf_fastcall;
    void buz(long i, long j, long k);

    void foo(long i, long j, long k) {
      bar();
      buz(i, j, k);
    }

First allocate registers as if:

    foo:
      call bar    # note: no spills for i,j,k (r1,r2,r3)
      call buz
      exit

And later insert spills/fills in the peephole phase:

    foo:
      *(u64 *)(r10 - 8) = r1;  # Such call pattern is
      *(u64 *)(r10 - 16) = r2; # correct when used with
      *(u64 *)(r10 - 24) = r3; # old kernels.
      call bar
      r3 = *(u64 *)(r10 - 24); # But also allows new
      r2 = *(u64 *)(r10 - 16); # kernels to recognize the
      r1 = *(u64 *)(r10 - 8);  # pattern and remove spills/fills.
      call buz
      exit

The offsets for generated spills/fills are picked as minimal stack
offsets for the function. Allocated stack slots are not used for any
other purposes, in order to simplify in-kernel analysis.
2024-08-22 03:40:56 +03:00
Vassil Vassilev
6c62ad446b [clang-repl] [codegen] Reduce the state in TBAA. NFC for static compilation. (#98138)
In incremental compilation, clang works with multiple `llvm::Module`s.
Our current approach is to create a CodeGenModule entity for every new
module request (via StartModule). However, some of the state, such as the
mangle context, needs to be preserved to keep the original semantics in
the ever-growing TU.

Fixes: llvm/llvm-project#95581.

cc: @jeaye
2024-08-21 07:22:31 +02:00
Eduard Zingerman
5e8f4618ce Revert "[BPF] introduce __attribute__((bpf_fastcall)) (#101228)"
This reverts commit e9b2e16dc9.
Reverting because of the test failure:
https://lab.llvm.org/buildbot/#/builders/187/builds/509
2024-08-19 11:30:27 -07:00
eddyz87
e9b2e16dc9 [BPF] introduce __attribute__((bpf_fastcall)) (#101228)
This commit introduces attribute bpf_fastcall to declare BPF functions
that do not clobber some of the caller saved registers (R0-R5).

The idea is to generate code complying with the generic BPF ABI, but
allow a compatible Linux kernel to remove unnecessary spills and fills of
non-scratched registers (given some compiler assistance).

For such functions, register allocation is done as if caller-saved
registers are not clobbered, but the calls are later wrapped with spill
and fill patterns that are simple to recognize in the kernel.

For example for the following C code:

     #define __bpf_fastcall __attribute__((bpf_fastcall))

     void bar(void) __bpf_fastcall;
     void buz(long i, long j, long k);

     void foo(long i, long j, long k) {
       bar();
       buz(i, j, k);
     }

First allocate registers as if:

    foo:
      call bar    # note: no spills for i,j,k (r1,r2,r3)
      call buz
      exit

And later insert spills/fills in the peephole phase:

    foo:
      *(u64 *)(r10 - 8) = r1;  # Such call pattern is
      *(u64 *)(r10 - 16) = r2; # correct when used with
      *(u64 *)(r10 - 24) = r3; # old kernels.
      call bar
      r3 = *(u64 *)(r10 - 24); # But also allows new
      r2 = *(u64 *)(r10 - 16); # kernels to recognize the
      r1 = *(u64 *)(r10 - 8);  # pattern and remove spills/fills.
      call buz
      exit

The offsets for generated spills/fills are picked as minimal stack
offsets for the function. Allocated stack slots are not used for any
other purposes, in order to simplify in-kernel analysis.

Corresponding functionality has been merged into the Linux kernel as
[this](https://lore.kernel.org/bpf/172179364482.1919.9590705031832457529.git-patchwork-notify@kernel.org/)
patch set (the patch assumed that the `no_caller_saved_registers`
attribute would be used by LLVM; the naming does not matter to the kernel).
2024-08-19 19:49:11 +03:00
Daniel Kiss
9e9fa00dcb [Arm][AArch64][Clang] Respect function's branch protection attributes. (#101978)
Default attributes are assigned to all functions according to the
command-line parameters. Some functions might have their own attributes,
and we need to set or remove attributes accordingly.
Tests are updated to cover these scenarios too.
2024-08-09 17:51:38 +02:00
Jeremy Morse
92aec5192c [DebugInfo][RemoveDIs] Use iterator-inserters in clang (#102006)
As part of the LLVM effort to eliminate debug-info intrinsics, we're
moving to a world where only iterators should be used to insert
instructions. This isn't a problem in clang when instructions get
generated before any debug-info is inserted; however, we're planning on
deprecating and removing the instruction-pointer insertion routines.

Scatter some calls to getIterator in a few places, remove a
deref-then-addrof on another iterator, and add an overload for the
createLoadInstBefore utility. Some callers pass a null insertion
point, which we now need to handle explicitly.
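
A hedged sketch of the iterator-based style (illustrative helper, not code from the patch):
```
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instruction.h"

using namespace llvm;

// Position the builder with an iterator rather than a raw instruction
// pointer before inserting a new instruction.
static Value *insertAddBefore(Instruction *Existing, Value *A, Value *B) {
  IRBuilder<> Builder(Existing->getParent(), Existing->getIterator());
  return Builder.CreateAdd(A, B, "sum");
}
```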
2024-08-09 10:17:48 +01:00
Eli Friedman
1762e01cca Fix codegen of consteval functions returning an empty class, and related issues (#93115)
If a class is empty, don't store it to memory: the store might overwrite
useful data. Similarly, if a class has tail padding that might overlap
other fields, don't store the tail padding to memory.

The problem here turned out to be a bit more general than I initially thought:
basically all uses of EmitAggregateStore were broken. Call lowering had
a method that did mostly the right thing, though: CreateCoercedStore.
Adapt CreateCoercedStore so it always does the conservatively right
thing, and use it for both calls and ConstantExpr.

Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was
set incorrectly for empty classes in some cases.

Fixes #93040.
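
A hedged sketch of the problematic shape (illustrative only; cf. #93040):
```
// An empty class carries no data, so emitting a store for its "value"
// could overwrite whatever overlaps its (zero-size) storage.
struct Empty {};

consteval Empty makeEmpty() { return {}; } // C++20 consteval

int main() {
  Empty E = makeEmpty(); // previously could emit a clobbering store
  return 0;
}
```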
2024-08-01 16:18:20 -07:00
darkbuck
fa84297002 [clang][CUDA] Add 'noconvergent' function and statement attribute
- For languages following the SPMD/SIMT programming model, functions and
  call sites are marked 'convergent' by default. 'noconvergent' is added
  in this patch to allow developers to remove that 'convergent'
  attribute when it's safe.
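
A hedged sketch of intended usage (assumes HIP/CUDA-style device code, where calls are convergent by default):
```
// Opt this function out of the implicit 'convergent' attribute; this is
// safe only when it performs no cross-lane communication.
__attribute__((noconvergent)) float scale(float x) {
  return x * 2.0f;
}
```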

Reviewers:
nhaehnle, Sirraide, yxsamliu, Artem-B, ilovepi, jayfoad, ssahasra, arsenm

Reviewed By: arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/100637
2024-07-31 11:30:48 -04:00
Qiu Chaofan
20957d2091 [AIX] Add -msave-reg-params to save arguments to stack (#97524)
In the PowerPC ABI, a few initial arguments are passed through registers,
but their places in the parameter save area are reserved; arguments passed
in memory go after the reserved locations.

For debugging purposes, we may want to save copies of the pass-by-register
arguments into their correct places on the stack. The new option achieves
this by adding a new function-level attribute and making the argument
lowering part aware of it.
2024-07-24 20:58:37 +08:00
Oliver Hunt
4dcd91aea3 [PAC] Implement authentication for C++ member function pointers (#99576)
Introduces type-based signing of member function pointers. To support
this discrimination schema, we no longer emit member function pointers to
virtual methods as indices into a vtable but migrate to using thunks.
This does mean member function pointers are no longer necessarily
directly comparable, however as such comparisons are UB this is
acceptable.

We derive the discriminator from the C++ mangling of the type of the
pointer being authenticated.

Co-Authored-By: Akira Hatanaka ahatanaka@apple.com
Co-Authored-By: John McCall rjmccall@apple.com
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-07-22 18:29:06 -07:00
Mariya Podchishchaeva
9ad72df55c [clang] Use different memory layout type for _BitInt(N) in LLVM IR (#91364)
There are two problems with _BitInt prior to this patch:
1. For at least some values of N, we cannot use LLVM's iN for the type
of struct elements, array elements, allocas, global variables, and so
on, because the LLVM layout for that type does not match the high-level
layout of _BitInt(N).
Example: currently, for i128:128 targets, a correct implementation is
possible either for __int128 or for _BitInt(129+) with lowering to iN,
but not both, since we now have a correct implementation of __int128 in
place after a21abc7.
When this happens, opaque [M x i8] types are used, where M =
sizeof(_BitInt(N)).
2. LLVM doesn't guarantee any particular extension behavior for integer
types whose width isn't a multiple of 8. For this reason, all _BitInt types
now have an in-memory representation that is a whole number of bytes.
For example, _BitInt(17) will now have memory layout type i32.

This patch also introduces the concept of a load/store type and adds an API to
CodeGenTypes that returns the IR type that should be used for load and
store operations. This is particularly useful for the case when a
_BitInt ends up having an array of bytes as its memory layout type. For
_BitInt(N), let M = sizeof(_BitInt(N)), and let BITS = M * 8. Loads and
stores of iBITS would both (1) produce far better code from the backends
and (2) be far more optimizable by IR passes than loads and stores of [M
x i8].
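
A hedged sketch of the whole-byte rule (assuming a typical 64-bit target):
```
// _BitInt(17) holds 17 value bits but occupies a whole number of bytes
// in memory; its memory layout type is i32 on such targets.
_BitInt(17) Counter;
static_assert(sizeof(Counter) == 4, "whole-byte in-memory size");
```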

Fixes https://github.com/llvm/llvm-project/issues/85139
Fixes https://github.com/llvm/llvm-project/issues/83419

---------

Co-authored-by: John McCall <rjmccall@gmail.com>
2024-07-15 09:40:39 +02:00
Daniel Kiss
7d1b6b2c32 [Clang][ARM][AArch64] Add branch protection attributes to the defaults. (#83277)
These attributes are no longer inherited from the module flags, and
therefore need to be added for synthetic functions.
2024-07-12 20:52:56 +02:00
Chen Zheng
afd0e6d06b [PowerPC] Diagnose musttail instead of crash inside backend (#93267)
musttail often cannot be generated on PPC targets because, when calling
a function defined in another module, PPC needs to restore the TOC
pointer. To restore the TOC pointer, the compiler needs to emit a nop
after the call so the linker can generate code to restore the TOC
pointer. A tail call cannot produce the expected call sequence in this
case.

To avoid a crash inside the compiler backend, a diagnostic is added in
the frontend.

Fixes #63214
2024-07-08 09:30:01 +08:00
Ahmed Bougacha
e23250ecb7 [clang] Implement function pointer signing and authenticated function calls (#93906)
The functions are currently always signed/authenticated with a zero
discriminator.

Co-Authored-By: John McCall <rjmccall@apple.com>
2024-06-21 10:20:15 -07:00
Mariya Podchishchaeva
6d973b4548 [clang][CodeGen] Return RValue from EmitVAArg (#94635)
This should simplify handling of the resulting value by the callers.
2024-06-17 13:29:20 +02:00