Commit Graph

17560 Commits

Author SHA1 Message Date
CHANDRA GHALE
6f558e0e12 [OpenMP] codegen support for masked combined construct masked taskloop (#121914)
Added codegen support for combined masked constructs `masked taskloop.`
Added implementation for `EmitOMPMaskedTaskLoopDirective`.

---------

Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>
2025-01-13 11:42:13 +05:30
Sander de Smalen
b4ce29ab31 [AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (#121788)
This adds support for parsing the attribute and codegen to map it to
"aarch64_za_state_agnostic" LLVM IR attribute.

This attribute is described in the Arm C Language Extensions (ACLE)
document:

  https://github.com/ARM-software/acle/blob/main/main/acle.md#__arm_agnostic
2025-01-12 21:35:44 +00:00
CHANDRA GHALE
1d2eea962a [OpenMP] codegen support for masked combined construct masked taskloop simd (#121916)
Added codegen support for combined masked constructs `masked taskloop
simd`.
Added implementation for `EmitOMPMaskedTaskLoopSimdDirective`.

Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>
2025-01-12 23:38:00 +05:30
Timm Baeder
cfe26358e3 Reapply "[clang] Avoid re-evaluating field bitwidth" (#122289) 2025-01-11 07:12:37 +01:00
Fangrui Song
0de18e72c6 -ftime-report: reorganize timers
The code generation time is unclear in the -ftime-report output:

* The two clang timers "Code Generation Time" and "LLVM IR Generation
  Time" are in the default group "Miscellaneous Ungrouped Timers".
* There is also a "Clang front-end time" group, which actually includes
  code generation time.

```
===-------------------------------------------------------------------------===
                         Miscellaneous Ungrouped Timers
===-------------------------------------------------------------------------===

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0611 (  1.7%)   0.0099 (  4.4%)   0.0710 (  1.9%)   0.0713 (  1.9%)  LLVM IR Generation Time
   3.5140 ( 98.3%)   0.2165 ( 95.6%)   3.7306 ( 98.1%)   3.7342 ( 98.1%)  Code Generation Time
   3.5751 (100.0%)   0.2265 (100.0%)   3.8016 (100.0%)   3.8055 (100.0%)  Total
...
===-------------------------------------------------------------------------===
                          Clang front-end time report
===-------------------------------------------------------------------------===
  Total Execution Time: 3.9108 seconds (3.9146 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   3.6802 (100.0%)   0.2306 (100.0%)   3.9108 (100.0%)   3.9146 (100.0%)  Clang front-end timer
   3.6802 (100.0%)   0.2306 (100.0%)   3.9108 (100.0%)   3.9146 (100.0%)  Total
```

This patch

* renames "Clang front-end time report" (FrontendAction time) to "Clang
  time report",
* renames "Clang front-end" to "Front end",
* moves "LLVM IR Generation" into the group,
* replaces "Code Generation time" with "Optimizer" (middle end) and
  "Machine code generation" (back end).

```
% clang -c sqlite3.i -w -ftime-report -mllvm -sort-timers=0
...
===-------------------------------------------------------------------------===
                               Clang time report
===-------------------------------------------------------------------------===
  Total Execution Time: 1.5922 seconds (1.5972 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.5107 ( 35.9%)   0.0105 (  6.2%)   0.5211 ( 32.7%)   0.5222 ( 32.7%)  Front end
   0.2464 ( 17.3%)   0.0340 ( 20.0%)   0.2804 ( 17.6%)   0.2814 ( 17.6%)  LLVM IR generation
   0.6240 ( 43.9%)   0.1235 ( 72.7%)   0.7475 ( 47.0%)   0.7503 ( 47.0%)  Machine code generation
   0.0413 (  2.9%)   0.0018 (  1.0%)   0.0431 (  2.7%)   0.0433 (  2.7%)  Optimizer
   1.4224 (100.0%)   0.1698 (100.0%)   1.5922 (100.0%)   1.5972 (100.0%)  Total
```

Pull Request: https://github.com/llvm/llvm-project/pull/122225
2025-01-10 19:25:18 -08:00
Thurston Dang
55b587506e [ubsan][NFCI] Use SanitizerOrdinal instead of SanitizerMask for EmitCheck (exactly one sanitizer is required) (#122511)
The `Checked` parameter of `CodeGenFunction::EmitCheck` is of type
`ArrayRef<std::pair<llvm::Value *, SanitizerMask>>`, which is overly
generalized: SanitizerMask can denote that zero or more sanitizers are
enabled, but `EmitCheck` requires that exactly one sanitizer is
specified in the SanitizerMask (e.g.,
`SanitizeTrap.has(Checked[i].second)` enforces that).

This patch replaces SanitizerMask with SanitizerOrdinal in the `Checked`
parameter of `EmitCheck` and code that transitively relies on it. This
should not affect the behavior of UBSan, but it has the advantages that:
- the code is clearer: it avoids ambiguity in EmitCheck about what to do
if multiple bits are set
- specifying the wrong number of sanitizers in `Checked[i].second` will
be detected as a compile-time error, rather than a runtime assertion
failure

Suggested by Vitaly in https://github.com/llvm/llvm-project/pull/122392
as an alternative to adding an explicit runtime assertion that the
SanitizerMask contains exactly one sanitizer.
2025-01-10 12:40:57 -08:00
Farzon Lotfi
b900379e26 [HLSL] Reapply Move length support out of the DirectX Backend (#121611) (#122337)
## Changes
- Delete DirectX length intrinsic
- Delete HLSL length lang builtin
- Implement length algorithm entirely in the header.

## History
- In the past if an HLSL intrinsic lowered to either a spirv op code or
a DXIL opcode we represented it with intrinsics

## Why we are moving away?
- To make HLSL apis more portable the team decided that it makes sense
for some intrinsics to be defined only in the header.
- Since there tends to be more SPIRV opcodes than DXIL opcodes the plan
is to support SPIRV opcodes either with target specific builtins or via
pattern matching.
2025-01-10 14:16:27 -05:00
Alexandros Lamprineas
b93ffa8e4a [FMV][AArch64] Changes in fmv-features metadata. (#122192)
* We want the default version to have this attribute too otherwise it
becomes indistinguishable from non-versioned functions.

* We don't need the '+' unlike target-features which can negate. This
will allow using the parsing API of target_version/clones for the
metadata too.
2025-01-10 17:50:35 +00:00
Fangrui Song
186bd8e4cd [CodeGen] Restore CodeGenerationTime
Fixes 48d0eb5181
2025-01-09 21:47:20 -08:00
Fangrui Song
48d0eb5181 [CodeGen] Simplify EmitAssemblyHelper and emitBackendOutput
Prepare for -ftime-report change (#122225).
2025-01-09 21:23:52 -08:00
Vitaly Buka
4c8fdc2954 [nfc][BoundsChecking] Rename BoundsCheckingOptions into Options (#122359) 2025-01-09 20:38:13 -08:00
Vitaly Buka
9c2de994a1 [nfc][BoundsChecking] Refactor BoundsCheckingOptions (#122346)
Remove ReportingMode and ReportingOpts.
2025-01-09 20:19:01 -08:00
Reid Kleckner
26aa20a3dd Use range-based for to iterate over fields in record layout, NFCI (#122029)
Move the common case of FieldDecl::getFieldIndex() inline to mitigate
the cost of removing the extra `FieldNo` induction variable.

Also rename isNoUniqueAddress parameter to isNonVirtualBaseType, which
appears to be more accurate. I think the current name is just a
consequence of autocomplete gone wrong.
2025-01-09 11:21:03 -08:00
Nico Weber
9ec92873ec Revert "[HLSL] Move length support out of the DirectX Backend (#121611)"
This reverts commit a6b7181733.
Breaks Clang :: CodeGenHLSL/builtins/length.hlsl, see
https://github.com/llvm/llvm-project/pull/121611#issuecomment-2581004278
2025-01-09 14:19:03 -05:00
Farzon Lotfi
a6b7181733 [HLSL] Move length support out of the DirectX Backend (#121611)
## Changes
- Delete DirectX length intrinsic
- Delete HLSL length lang builtin
- Implement length algorithm entirely in the header.

## History
- In the past if an HLSL intrinsic lowered to either a spirv op code or
a DXIL opcode we represented it with intrinsics

## Why we are moving away?
- To make HLSL apis more portable the team decided that it makes sense
for some intrinsics to be defined only in the header.
- Since there tends to be more SPIRV opcodes than DXIL opcodes the plan
is to support SPIRV opcodes either with target specific builtins or via
pattern matching.
2025-01-09 13:11:52 -05:00
Stanislav Mekhanoshin
50ca2eff5f Drop emitQuaternaryBuiltin from clang (#122169)
It was superceeded by the emitBuiltinWithOneOverloadedType() some
time ago.
2025-01-09 09:56:44 -08:00
CHANDRA GHALE
aedb30fdc7 [OpenMP] codegen support for masked combined construct parallel masked taskloop (#121741)
Added codegen support for combined masked constructs Parallel masked
taskloop.
Added implementation for EmitOMPParallelMaskedTaskLoopDirective.

---------

Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>
2025-01-09 16:38:36 +05:30
Sergio Afonso
b79ed8729b [OpenMP][OMPIRBuilder] Handle non-failing calls properly (#115863)
The preprocessor definition used to enable asserts and the one that
`llvm::Error` and `llvm::Expected` use to ensure all created instances are
checked are not the same. By making these checks inside of an `assert` in cases
where errors are not expected, certain build configurations would trigger
runtime failures (e.g. `-DLLVM_ENABLE_ASSERTIONS=OFF
-DLLVM_UNREACHABLE_OPTIMIZE=ON`).

The `llvm::cantFail()` function, which was intended for this use case, is used
by this patch in place of `assert` to prevent these runtime failures. In tests,
new preprocessor definitions based on `ASSERT_THAT_EXPECTED` and
`EXPECT_THAT_EXPECTED` are used instead, to avoid silent failures in release
builds.
2025-01-09 10:28:16 +00:00
Nikita Popov
4847395c54 [Clang] Adjust pointer-overflow sanitizer for N3322 (#120719)
N3322 makes NULL + 0 well-defined in C, matching the C++ semantics.
Adjust the pointer-overflow sanitizer to no longer report NULL + 0 as a
pointer overflow in any language mode. NULL + nonzero will of course
continue to be reported.

As N3322 is part of
https://www.open-std.org/jtc1/sc22/wg14/www/previous.html, and we never
performed any optimizations based on NULL + 0 being undefined in the
first place, I'm applying this change to all C versions.
2025-01-09 09:23:23 +01:00
NAKAMURA Takumi
397ac44f62 [Coverage] Introduce the type CounterPair for RegionCounterMap. NFC. (#112724)
`CounterPair` can hold `<uint32_t, uint32_t>` instead of current
`unsigned`, to hold also the counter number of SkipPath. For now, this
change provides the skeleton and only `CounterPair::Executed` is used.

Each counter number can have `None` to suppress emitting counter
increment. 2nd element `Skipped` is initialized as `None` by default,
since most `Stmt*` don't have a pair of counters.

This change also provides stubs for the verifier. I'll provide the impl
of verifier for `+Asserts` later.

`markStmtAsUsed(bool, Stmt*)` may be used to inform that other side
counter may not emitted.

`markStmtMaybeUsed(S)` may be used for the `Stmt` and its inner will be
excluded for emission in the case of skipping by constant folding. I put
it into places where I found.

`verifyCounterMap()` will check the coverage map and the counter map,
and can be used to report inconsistency.

These verifier methods shall be eliminated in `-Asserts`.


https://discourse.llvm.org/t/rfc-integrating-singlebytecoverage-with-branch-coverage/82492
2025-01-09 17:11:07 +09:00
NAKAMURA Takumi
f5cd181ffb [Coverage] Introduce getBranchCounterPair(). NFC. (#112702)
This aggregates the generation of branch counter pair as `ExecCnt` and
`SkipCnt`, to aggregate `CounterExpr::subtract`. At the moment:

- This change preserves the behavior of
`llvm::EnableSingleByteCoverage`. Almost of SingleByteCoverage will be
cleaned up by coming commits.

- `IsCounterEqual(Out, Par)` is introduced instead of
`Counter::operator==`. Tweaks would be required for the comparison for
additional counters.


https://discourse.llvm.org/t/rfc-integrating-singlebytecoverage-with-branch-coverage/82492
2025-01-09 16:47:01 +09:00
Fangrui Song
6dd1315cf0 [clang] Simplify BackendConsumer after #69371 2025-01-08 22:05:55 -08:00
Fangrui Song
5fdcea2d25 [clang] Simplify BackendConsumer. NFC 2025-01-08 20:48:59 -08:00
Alexandros Lamprineas
8e65940161 [FMV][AArch64] Simplify version selection according to ACLE. (#121921)
Currently, the more features a version has, the higher its priority is.
We are changing ACLE https://github.com/ARM-software/acle/pull/370 as
follows:

"Among any two versions, the higher priority version is determined by
 identifying the highest priority feature that is specified in exactly
 one of the versions, and selecting that version."
2025-01-08 18:59:07 +00:00
Steven Perron
92e575d7e4 [HLSL] Add SPIR-V version of getPointer. (#121963)
Use the spv version of the resource.getpointer intrinsic when targeting
SPIR-V.
2025-01-08 10:51:17 -05:00
Chris B
b66f6b25cb Revert #116331 & #121852 (#122105) 2025-01-08 08:55:02 -06:00
Timm Bäder
59bdea24b0 Revert "[clang] Avoid re-evaluating field bitwidth (#117732)"
This reverts commit 81fc3add1e.

This breaks some LLDB tests, e.g.
SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp:

lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.
2025-01-08 15:09:52 +01:00
Timm Baeder
81fc3add1e [clang] Avoid re-evaluating field bitwidth (#117732)
Save the bitwidth value as a `ConstantExpr` with the value set. Remove
the `ASTContext` parameter from `getBitWidthValue()`, so the latter
simply returns the value from the `ConstantExpr` instead of
constant-evaluating the bitwidth expression every time it is called.
2025-01-08 14:45:19 +01:00
Mingjie Xu
2a1632824d [tysan] Convert TySan from function+module pass to just module pass (#120667)
As mentioned in https://github.com/llvm/llvm-project/pull/118989, all
sanitizers but tsan are converted to just module pass for easier
maintenance.

This patch removes the TySan function pass, convert TySan from
function+module pass to just module pass.
2025-01-08 11:25:32 +08:00
Alex MacLean
4583f6d344 [NVPTX] Switch front-ends and tests to ptx_kernel cc (#120806)
the `ptx_kernel` calling convention is a more idiomatic and standard way
of specifying a NVPTX kernel than using the metadata which is not
supposed to change the meaning of the program. Further, checking the
calling convention is significantly faster than traversing the metadata,
improving compile time.

This change updates the clang and mlir frontends as well as the
NVPTXCtorDtorLowering pass to emit kernels using the calling convention.
In addition, this updates all NVPTX unit tests to use the calling
convention as well.
2025-01-07 18:24:50 -08:00
erichkeane
db81e8c42e [OpenACC] Initial sema implementation of 'update' construct
This executable construct has a larger list of clauses than some of the
others, plus has some additional restrictions.  This patch implements
the AST node, plus the 'cannot be the body of a if, while, do, switch,
    or label' statement restriction.  Future patches will handle the
    rest of the restrictions, which are based on clauses.
2025-01-07 08:20:20 -08:00
Florian Hahn
473cdb93e5 [TySan] Don't report globals with incomplete types. (#121922)
Type metadata for incomplete types should also get handled at the place
they are defined.

Fixes https://github.com/llvm/llvm-project/issues/121014.


PR: https://github.com/llvm/llvm-project/pull/121922
2025-01-07 15:06:26 +00:00
天音あめ
ca5fd06366 [clang] Fix crashes when passing VLA to va_arg (#119563)
Closes #119360.

This bug occurs when passing a VLA to `va_arg`. Since the return value
is inferred to be an array, it triggers
`ScalarExprEmitter::VisitCastExpr`, which converts it to a pointer and
subsequently calls `CodeGenFunction::EmitAggExpr`. At this point,
because the inferred type is an `AggExpr` instead of a `ScalarExpr`,
`ScalarExprEmitter::VisitVAArgExpr` is not invoked, and as a result,
`CodeGenFunction::EmitVariablyModifiedType` is also not called, leading
to the size of the VLA not being retrieved.
The solution is to move the call to
`CodeGenFunction::EmitVariablyModifiedType` into
`CodeGenFunction::EmitVAArg`, ensuring that the size of the VLA is
correctly obtained regardless of whether the expression is an `AggExpr`
or a `ScalarExpr`.
2025-01-07 07:49:43 -05:00
Alex Voicu
66acb26946 [clang][CodeGen][SPIRV] Translate amdgpu_flat_work_group_size into max_work_group_size. (#116820)
HIPAMD relies on the `amdgpu_flat_work_group_size` attribute to
implement key functionality such as the `__launch_bounds__` `__global__`
function annotation. This attribute is not available / directly
translatable to SPIR-V, hence as it is AMDGCN flavoured SPIR-V suffers
from information loss.

This patch addresses that limitation by converting the unsupported
attribute into the `max_work_group_size` attribute which maps to
[`MaxWorkgroupSizeINTEL`](https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/INTEL/SPV_INTEL_kernel_attributes.asciidoc),
which is available in / handled by SPIR-V. When reverse translating from
SPIR-V to AMDGCN LLVMIR we invert the map and add the original AMDGPU
attribute.
2025-01-07 12:01:31 +02:00
Nicholas Guy
21b531ead1 [clang][llvm][aarch64] Add aarch64_sme_in_streaming_mode intrinsic (#120265)
Replacing the extant streaming mode function call with an intrinsic
allows us to make further optimisations around it. For example, if it's
called within a function that has a known streaming mode, we can remove
the dead code, and avoid the redundant conditional branch.
2025-01-07 09:02:26 +00:00
Alexandros Lamprineas
93011fe2a5 [FMV][AArch64][clang] Emit fmv-features metadata in LLVM IR. (#118544)
We need to be able to propagate information about FMV attribute strings
from C/C++ source to LLVM IR. This is necessary so that we can
distinguish which target-features are coming from the cmdline, which are
coming from the target attribute, and which are coming from feature
dependency expansion. We need this for static resolution of calls in
LLVM. Here's a motivating example:

Suppose you have target_version("i8mm+dotprod") and
target_version("fcma"). The first version clearly has higher priority.
Now suppose you specify -march=armv8-a+i8mm on the command line. Then
the versions would have target-features "+i8mm,+dotprod" and
"+i8mm,+fcma" respectively. If you are using those to deduce version
priority, then you would incorrectly deduce that the second version was
higher priority than the first.
2025-01-07 08:51:23 +00:00
Vitaly Buka
1a435feffc [HLSL] Fix build warning after #116331 (#121852)
After #116331 is always SpellingNotCalculated,
so I assume doing nothing is expected.
2025-01-06 14:50:57 -08:00
erichkeane
21c785d7bd [OpenACC] Implement 'set' construct sema
The 'set' construct is another fairly simple one, it doesn't have an
associated statement and only a handful of allowed clauses. This patch
implements it and all the rules for it, allowing 3 of its for clauses.
The only exception is default_async, which will be implemented in a
future patch, because it isn't just being enabled, it needs a complete
new implementation.
2025-01-06 11:03:18 -08:00
joaosaffran
3f936251d2 [HLSL] Fix debug info generation for RWBuffer types (#119041)
This PR fix the debug infor generation for RWBuffer types.
- This implements the [same fix as
DXC](https://github.com/microsoft/DirectXShaderCompiler/pull/6296).
- Adds the HLSLAttributedResource debug info generation

Closes #118523

---------

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
2025-01-06 11:01:49 -08:00
joaosaffran
0d5c07285f [HLSL] Adding Flatten and Branch if attributes (#116331)
- adding Flatten and Branch to if stmt.
- adding dxil control flow hint metadata generation
- modifing spirv OpSelectMerge to account for the specific attributes.

Closes #70112

---------

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
2025-01-06 10:27:02 -08:00
Farzon Lotfi
21edac25f0 [SPIRV] Add Target Builtins using Distance ext as an example (#121598)
- Update pr labeler so new SPIRV files get properly labeled.
- Add distance target builtin to BuiltinsSPIRV.td.
- Update TargetBuiltins.h to account for spirv builtins.
- Update clang basic CMakeLists.txt to build spirv builtin tablegen.
- Hook up sema for SPIRV in Sema.h|cpp, SemaSPIRV.h|cpp, and
SemaChecking.cpp.
- Hookup sprv target builtins to SPIR.h|SPIR.cpp target.
- Update GBuiltin.cpp to emit spirv intrinsics when we get the expected
spirv target builtin.

Consensus was reach in this RFC to add both target builtins and pattern
matching:
https://discourse.llvm.org/t/rfc-add-targetbuiltins-for-spirv-to-support-hlsl/83329.

pattern matching will come in a separate pr this one just sets up the
groundwork to do target builtins for spirv.

partially resolves
[#99107](https://github.com/llvm/llvm-project/issues/99107)
2025-01-06 11:37:20 -05:00
Sameer Sahasrabuddhe
df67e37e37 [clang][NFC] clean up the handling of convergence control tokens (#121738) 2025-01-06 21:34:11 +05:30
Joseph Huber
81fae0d5e3 [Clang][AMDGPU] Stop defaulting to one-as for all atomic scopes (#120095)
Summary:
The documentation at
https://llvm.org/docs/AMDGPUUsage.html#memory-scopes states that these
'one-as' modifiers are more specific versions of the scopes that only
apply to a specific address space. This doesn't make sense for fences
which have no associated address space to use, and it's a more
restrictive version the normal scope. This should not tbe the default
behavior, but it is currently emitted in all cases except for
sequentially consistent.
2025-01-06 08:11:08 -06:00
Kerry McLaughlin
d8d4c18761 [AArch64][SME] Disable inlining of callees with new ZT0 state (#121338)
Inlining must be disabled for new-ZT0 callees as the callee is required
to save ZT0 and toggle PSTATE.ZA on entry.
2025-01-06 12:02:28 +00:00
Benjamin Maxwell
e4e2f53693 [clang] Add sincos builtin using llvm.sincos intrinsic (#114086)
This registers `sincos[f|l]` as a clang builtin and updates GCBuiltin to
emit the `llvm.sincos.*` intrinsic when `-fno-math-errno` is set. Note:
`llvm.sincos.*` is only emitted by `__builtin_sincos[f|l]` functions in
this initial patch.
2025-01-06 11:07:25 +00:00
Zhengxing li
7a76110096 [HLSL][SPIR-V] implement SV_GroupID semantic lowering (#121521)
The HLSL SV_GroupID semantic attribute is lowered into
@llvm.spv.group.id intrinsic in LLVM IR for SPIR-V target.

In the SPIR-V backend, this is now translated to a `WorkgroupId` builtin
variable.

Fixes #118700 which's a follow-up work to #70120
2025-01-04 14:02:39 -08:00
Simon Pilgrim
6279d2e0f3 AArch64ABIInfo::passAsAggregateType - don't directly dereference getAs<> result. NFC.
Reported by coverity static analyzer - we know the type is a BuiltinType so use castAs<>
2024-12-31 14:34:14 +00:00
Brandon Wu
8e7f1bee84 [clang][RISCV] Remove unneeded RISCV tuple code (#121024)
These code are no longer needed because we've modeled tuple type using
target extension type rather than structure of scalable vectors.
2024-12-25 22:48:54 +08:00
Momchil Velikov
f70ab7d909 [AArch64] Fix argument passing for SVE tuples (#118961)
The fix for passing Pure Scalable Types
(https://github.com/llvm/llvm-project/pull/112747) was incomplete,
it didn't handle correctly tuples of SVE vectors (e.g. `sveboolx2_t`,
`svfloat32x4_t`, etc).

These types are Pure Scalable Types and should be passed either entirely
in vector registers
or indirectly in memory, not split.
2024-12-23 09:26:24 +00:00
Thurston Dang
5bb650345d Remove -bounds-checking-unique-traps (replace with -fno-sanitize-merge=local-bounds) (#120682)
#120613 removed -ubsan-unique-traps and replaced it with
-fno-sanitize-merge (introduced in #120511), which allows fine-grained
control of which UBSan checks to prevent merging. This analogous patch
removes -bound-checking-unique-traps, and allows it to be controlled via
-fno-sanitize-merge=local-bounds.

Most of this patch is simply plumbing through the compiler flags into
the bounds checking pass.

Note: this patch subtly changes -fsanitize-merge (the default) to also
include -fsanitize-merge=local-bounds. This is different from the
previous behavior, where -fsanitize-merge (or the old
-ubsan-unique-traps) did not affect local-bounds (requiring the separate
-bounds-checking-unique-traps). However, we argue that the new behavior
is more intuitive.

Removing -bounds-checking-unique-traps and merging its functionality
into -fsanitize-merge breaks backwards compatibility; we hope that this
is acceptable since '-mllvm -bounds-checking-unique-traps' was an
experimental flag.
2024-12-20 10:07:44 -08:00