- Adds a new +pc option to -mbranch-protection that will enable
the use of PC as a diversifier in PAC branch protection code.
- When +pauth-lr is enabled (-march=armv9.5a+pauth-lr) in combination
with -mbranch-protection=pac-ret+pc, the new 9.5-a instructions
(pacibsppc, retaasppc, etc) are used.
Documentation for the relevant instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/
Co-authored-by: Lucas Prates <lucas.prates@arm.com>
When constructing vectors from elements, use poison instead of
undef as the base value. These literals always initialize all
elements (padding the remainder with zero), so that the choice
of base value does not affect semantics.
The RISC-V vector crypto extensions have been ratified. This patch
updates the Clang and LLVM support for these extensions to be
non-experimental, while leaving the C intrinsics as experimental since
the C intrinsics are not yet standardized.
Co-authored-by: Brandon Wu <brandon.wu@sifive.com>
There are many issues that popped up with the counted_by feature. The
patch #73730 has grown too large and approval is blocking Linux testing.
Includes reverts of:
commit 769bc11f68 ("[Clang] Implement the 'counted_by' attribute
(#68750)")
commit bc09ec6962 ("[CodeGen] Revamp counted_by calculations
(#70606)")
commit 1a09cfb2f3 ("[Clang] counted_by attr can apply only to C99
flexible array members (#72347)")
commit a76adfb992 ("[NFC][Clang] Refactor code to calculate flexible
array member size (#72790)")
commit d8447c78ab ("[Clang] Correct handling of negative and
out-of-bounds indices (#71877)")
Partial commit b31cd07de5 ("[Clang] Regenerate test checks (NFC)")
Closes#73168Closes#75173
Since FatLTO now uses the UnifiedLTO pipeline, we should not set the
ThinLTO module flag to true, since it may cause an assertion failure.
See https://github.com/llvm/llvm-project/issues/70703 for context.
Non-LTO compiles set the buffer name to "<inline asm>"
(`AsmPrinter::addInlineAsmDiagBuffer`) and pass diagnostics to
`ClangDiagnosticHandler` (through the `MCContext` handler in
`MachineModuleInfoWrapperPass::doInitialization`) to ensure that
the exit code is 1 in the presence of errors. In contrast, LTO compiles
spuriously succeed even if error messages are printed.
```
% cat a.c
void _start() {}
asm("unknown instruction");
% clang -c a.c
<inline asm>:1:1: error: invalid instruction mnemonic 'unknown'
1 | unknown instruction
| ^
1 error generated.
% clang -c -flto a.c; echo $? # -flto=thin is the same
error: invalid instruction mnemonic 'unknown'
unknown instruction
^~~~~~~
error: invalid instruction mnemonic 'unknown'
unknown instruction
^~~~~~~
0
```
`CollectAsmSymbols` parses inline assembly and is transitively called by
both `ModuleSummaryIndexAnalysis::run` and `WriteBitcodeToFile`, leading
to duplicate diagnostics.
This patch updates `CollectAsmSymbols` to be similar to non-LTO
compiles.
```
% clang -c -flto=thin a.c; echo $?
<inline asm>:1:1: error: invalid instruction mnemonic 'unknown'
1 | unknown instruction
| ^
1 errors generated.
1
```
The `HasErrors` check does not prevent duplicate warnings but assembler
warnings are very uncommon.
This patch enables the following builtins for SME2
svbfmlslb_f32
svbfmlslb_lane_f32
svbfmlslt_f32
svbfmlslt_lane_f32
Patch by: Kerry McLaughlin <kerry.mclaughlin@arm.com>
---------
Co-authored-by: Matthew Devereau <matthew.devereau@arm.com>
The `_s64`/`_u64` part can be omitted now and the name variants do not
include unsigned comparison mnemonics. Both are inferred from
the argument types.
This patch adds a warning that's emitted when a builtin call uses ZA
state but the calling function doesn't provide any.
Patch by David Sherwood <david.sherwood@arm.com>.
Add intrinsics of the form:
svboolx2_t svwhile<cond>_b{8,16,32,64}_[{s,u}64]_x2([u]int64_t, [u]int64_t);
and their overloaded variants as specified in
https://github.com/ARM-software/acle/pull/257
This PR adds a warning that's emitted when a non-streaming or
non-streaming-compatible builtin is called in an unsuitable function.
Uses work by Kerry McLaughlin.
This is a re-upload of #74064 and fixes a compile time increase.
This is an experimental address space for strided buffers. These buffers
can have structs as elements and
a stride > 1.
These pointers allow the indexed access in units of stride, i.e., they
point at `buffer[index * stride]`.
Thus, we can use the `idxen` modifier for buffer loads.
We assign address space 9 to 192-bit buffer pointers which contain a
128-bit descriptor, a 32-bit offset and a 32-bit index. Essentially,
they are fat buffer pointers with an additional 32-bit index.
On processors supporting vector registers and SIMD instructions, enable
i128 as legal type in VRs. This allows many operations to be implemented
via native instructions directly in VRs (including add, subtract,
logical operations and shifts). For a few other operations (e.g.
multiply and divide, as well as atomic operations), we need to move the
i128 value back to a GPR pair to use the corresponding instruction
there. Overall, this is still beneficial.
The patch includes the following LLVM changes:
- Enable i128 as legal type
- Set up legal operations (in SystemZInstrVector.td)
- Custom expansion for i128 add/subtract with carry
- Custom expansion for i128 comparisons and selects
- Support for moving i128 to/from GPR pairs when required
- Handle 128-bit integer constant values everywhere
- Use i128 as intrinsic operand type where appropriate
- Updated and new test cases
In addition, clang builtins are updated to reflect the intrinsic operand
type changes (which also improves compatibility with GCC).
``` c
// All the intrinsics below are [SVE2.1 or SME2]
// Variants are also available for _u16[_s32]_x2 and _u16[_u32]_x2
svint16_t svqcvtn_s16[_s32_x2](svint32x2_t zn);
```
According to PR#257[1]
[1]https://github.com/ARM-software/acle/pull/257
Unlike ELF targets, MachO does not support the same kind of dynamic symbol
resolution at load time. Instead, the corresponding MachO feature resolves
symbols lazily on first call.
Reviewers:
JDevlieghere, dmpolukhin, ahmedbougacha, tahonermann, echristo, MaskRay, erichkeane
Reviewed By: MaskRay, echristo, ahmedbougacha
Pull Request: https://github.com/llvm/llvm-project/pull/73687
This PR adds a warning that's emitted when a non-streaming or
non-streaming-compatible builtin is called in an unsuitable function.
Uses work by Kerry McLaughlin.
This patch implements the builtins in Clang
and the LLVM-IR intrinsic for the following:
// Variants are also available for:
// _s8, _s16, _u16, _s32, _u32, _s64, _u64,
// _f16, _f32, _f64uint8x16_t svaddqv[_u8](svbool_t pg, svuint8_t zn);
// Variants are also available for:
// _s8, _u16, _s16, _u32, _s32, _u64, _s64
uint8x16_t svandqv[_u8](svbool_t pg, svuint8_t zn); uint8x16_t
sveorqv[_u8](svbool_t pg, svuint8_t zn); uint8x16_t svorqv[_u8](svbool_t
pg, svuint8_t zn);
// Variants are also available for:
// _s8, _u16, _s16, _u32, _s32, _u64, _s64;
uint8x16_t svmaxqv[_u8](svbool_t pg, svuint8_t zn); uint8x16_t
svminqv[_u8](svbool_t pg, svuint8_t zn);
// Variants are also available for _f32, _f64
float16x8_t svmaxnmqv[_f16](svbool_t pg, svfloat16_t zn); float16x8_t
svminnmqv[_f16](svbool_t pg, svfloat16_t zn);
According to the PR#257[1]
The reduction instruction uses scalable vectors as input and fixed
vectors as output, therefore we changed SVEEmitter to emit fixed vector
types in case the neon header(arm_neon.h) is not present.
[1]https://github.com/ARM-software/acle/pull/257
Co-author: Dinar Temirbulatov <dinar.temirbulatov@arm.com>
This patch is needed for the reduction instructions in sve2.1
It add a new header to sve with all the fixed vector types.
The new types are only added if neon is not declared.
When the main file is preprocessed and we change `MainFileName` to the
original source file name (e.g. `a.i => a.c`), the source manager does
not contain `a.c`, but we incorrectly associate the DIFile(a.c) with
md5(a.i). This causes CGDebugInfo::emitFunctionStart to create a
duplicate DIFile and leads to a spurious "inconsistent use of MD5
checksums" warning.
```
% cat a.c
void f() {}
% clang -c -g a.c # no warning
% clang -E a.c -o a.i && clang -g -S a.i && clang -g -c a.s
a.s:9:2: warning: inconsistent use of MD5 checksums
.file 1 "a.c"
^
% grep DIFile a.ll
!1 = !DIFile(filename: "a.c", directory: "/tmp/c", checksumkind: CSK_MD5, checksum: "c5b2e246df7d5f53e176b097a0641c3d")
!11 = !DIFile(filename: "a.c", directory: "/tmp/c")
% grep 'file.*a.c' a.s
.file "a.c"
.file 0 "/tmp/c" "a.c" md5 0x2d14ea70fee15102033eb8d899914cce
.file 1 "a.c"
```
Fix#56378 by disassociating md5(a.i) with a.c.
d80e46d added support for the target function attribute. However, it
turns out that commit has a nasty bug/oversight. As the tests in that
revision show, everything works if clang -cc1 is directly invoked. I was
suprised to learn this morning that compiling with clang (i.e. the
typical user workflow) did not work.
The bug is that if a set of explicit negative extensions is passed to
cc1 at the command line (as the clang driver always does), we were
copying these negative extensions to the end of the rewritten extension
list. When this was later parsed, this had the effect of turning back
off any extension that the target attribute had enabled.
This patch updates the logic to only propagate the features from the
input which don't appear in the rewritten form in either positive or
negative form.
Note that this code structure is still highly suspect. In particular I'm
fairly sure that mixing extension versions with this code will result in
odd results. However, I figure its better to have something which mostly
works than something which doesn't work at all.
Akin other passes - refactored the name to `InstrProfilingLoweringPass` to better communicate what it does, and split the pass part and the transformation part to avoid needing to initialize object state during `::run`.
A subsequent PR will move `InstrLowering` to the .cpp file and rename it to `InstrLowerer`.