In 6a884a9aef, I synchronized the export list of libc++abi to the
export list of libc++. From the linker's perspective, this caused these
symbols to be taken from libc++.dylib instead of libc++abi.dylib.
However, that can be problematic when back-deploying. Indeed, this means
that the linker will encode an undefined reference to be fullfilled by
libc++.dylib, but when backdeploying against an older system, that
symbol might only be available in libc++abi.dylib.
Most of the symbols that started being re-exported after 6a884a9aef
turn out to be implementation details of libc++abi, so nobody really
depends on them and this back-deployment issue is inconsequential.
However, we ran into issues with a few of these symbols while testing
LLVM 19, which led to this patch. This slipped between the cracks and
that is why the patch is coming so long after the original patch landed.
In the future, a follow-up cleanup would be to stop exporting most of
the _cxxabiv1_foo_type_infoE symbols from both libc++abi and libc++
since they are implementation details that nobody should be relying on.
rdar://131984512
Fixes#109538.
In this patch, we introduce diagnostic for required expression
parameters in the same way as function parameters, fix the issue of
handling void type parameters, and align the behavior with GCC and other
compilers.
Fix the calling convention used for the call in the entry point
wrapper. No calling convention is currently set. It can easily use the
calling convention of the function that is being called.
Without this, there is a mismatch in the calling convention between the
call site and the callee. This is undefined behaviour.
Update VOP3 only instructions with true16 and fake16 formats.
This patch includes instructions:
V_MUL_LO_U16
V_MAX_U16
V_MAX_I16
V_MIN_U16
V_MIN_I16
V_LSHLREV_B16
V_LSHRREV_B16
V_ASHRREV_I16
IEEE_RINT rounds a real value to an integer-valued real.
IEEE_INT rounds a real value to an integer value.
The primary IEEE_INT result is generated with a call to IEEE_RINT.
This is instead of pushing directly. Creating a pull request is slightly
more work for the release manager, but it is more secure as we no longer
need a secret with write access to the www-releases repo.
The actions/checkout step will clear the current directory, so we need
to checkout the sources first so that the downloaded artifacts won't be
deleted.
Some of these check lines are insufficient to determine correctness.
Generate full check lines instead.
To reduce noise, add nounwind and use static relocation model.
Issue: #105181
extendhfxf2 calls extendhfXfy to convert _Float16 to double, then type
casts this converted value to long double.
__uint128_t may not be available on all architectures. Thus I din't use
extendhfXfy to widen precision to 128 bits.
HLSL doesn't distinguish `main` from any other function. It does treat
entry points special, but they're not required to be called `main` so we
have a different attribute annotation to mark them.
At the moment this change really just changes the mangling of functions
named `main` in the Itanium mangling.
Fixes#110517
---------
Co-authored-by: Farzon Lotfi <1802579+farzonl@users.noreply.github.com>
__arm64x_native_entrypoint and __guard_check_icall_a64n_fptr are
relevant only for hybrid ARM64X images, we need support for separate
namespaces before we can support them.
__hybrid_image_info_bitfield is 0 in MSVC linker in all tests I tried.
As pointed out in the review of #102133, SCEVExpander currently
incorrectly reuses GEP instructions that have poison-generating flags
set. Fix this by clearing the flags on the reused instruction.
Update VPWidenCallRecipe to manage fast-math flags directly via
VPRecipeWithIRFlags. This addresses a TODO and allows adjusting the FMFs
directly on the recipe. Also fixes printing for flags for
VPWidenCallRecipe.
When Branch Target Identification BTI is enabled all indirect branches
must target a BTI instruction. A long branch thunk is a source of
indirect branches. To date LLD has been assuming that the object
producer is responsible for putting a BTI instruction at all places the
linker might generate an indirect branch to. This is true for clang, but
not for GCC. GCC will elide the BTI instruction when it can prove that
there are no indirect branches from outside the translation unit(s). GNU
ld was fixed to generate a landing pad stub (gnu ld speak for thunk) for
the destination when a long range stub was needed [1].
This means that using GCC compiled objects with LLD may lead to LLD
generating an indirect branch to a location without a BTI. The ABI [2]
has also been clarified to say that it is a static linker's
responsibility to generate a landing pad when the target does not have a
BTI.
This patch implements the same mechansim as GNU ld. When the output ELF
file is setting the
GNU_PROPERTY_AARCH64_FEATURE_1_BTI property, then we check the
destination to see if it has a BTI instruction. If it does not we
generate a landing pad consisting of:
BTI c
B <destination>
The B <destination> can be elided if the thunk can be placed so that
control flow drops through. For example:
BTI c
<destination>:
This will be common when -ffunction-sections is used.
The landing pad thunks are effectively alternative entry points for the
function. Direct branches are unaffected but any linker generated
indirect branch needs to use the alternative. We place these as close as
possible to the destination section.
There is some further optimization possible. Consider the case:
.text
fn1
...
fn2
...
If we need landing pad thunks for both fn1 and fn2 we could order them
so that the thunk for fn1 immediately precedes fn1. This could save a
single branch. However I didn't think that would be worth the additional
complexity.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671
[2] https://github.com/ARM-software/abi-aa/issues/196
The minimum and maximum operations were introduced in
https://reviews.llvm.org/D52764 alongside the intrinsics. The question
of NaN propagation was discussed at the time, but the resulting
semantics don't seem to match what was ultimately agreed in IEEE754-2019
or the description we now have in the LangRef at
<https://llvm.org/docs/LangRef.html#llvm-min-intrinsics-comparation>.
Essentially, the APFloat implementation doesn't quiet a signaling NaN
input when it should in order to match the LangRef and IEEE spec.
This patch introduces a new check to find mismatches between the number
of data members in a union and the number enum values present in
variant-like structures.
Variant-like types can look something like this:
```c++
struct variant {
enum {
tag1,
tag2,
} kind;
union {
int i;
char c;
} data;
};
```
The kind data member of the variant is supposed to tell which data
member of the union is valid, however if there are fewer enum values
than union members, then it is likely a mistake.
The opposite is not that obvious, because it might be fine to have more
enum values than union data members, but for the time being I am curious
how many real bugs can be caught if we give a warning regardless.
This patch also contains a heuristic where we try to guess whether the
last enum constant is actually supposed to be a tag value for the
variant or whether it is just holding how many enum constants have been
created.
Patch by Gábor Tóthvári!
The "narrowing top" convert instructions leave the bottom half of active
elements untouched and thus the first paramater of their associated
intrinsic remains live even when there are no inactive lanes.
As this comment around target initialization implies:
```
// This can be NULL if we don't know anything about the architecture or if
// the target for an architecture isn't enabled in the llvm/clang that we
// built
```
There are cases where we might fail to call `InitBuiltinTypes` when
creating the backing `ASTContext` for a `TypeSystemClang`. If that
happens, the builtins `QualType`s, e.g., `VoidPtrTy`/`IntTy`/etc., are
not initialized and dereferencing them as we do in
`GetBuiltinTypeForEncodingAndBitSize` (and other places) will lead to
nullptr-dereferences. Example backtrace:
```
(lldb) run
Assertion failed: (!isNull() && "Cannot retrieve a NULL type pointer"), function getCommonPtr, file Type.h, line 958.
Process 2680 stopped
* thread #15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert
frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) +
liblldb.20.0.0git.dylib`DWARFASTParserClang::ParseObjCMethod(lldb_private::ObjCLanguage::MethodName const&, lldb_private::plugin::dwarf::DWARFDIE const&, lldb_private::CompilerType, ParsedDWARFTypeAttributes
, bool) (.cold.1):
-> 0x10cdf3cdc <+0>: stp x29, x30, [sp, #-0x10]!
0x10cdf3ce0 <+4>: mov x29, sp
0x10cdf3ce4 <+8>: adrp x0, 545
0x10cdf3ce8 <+12>: add x0, x0, #0xa25 ; "ParseObjCMethod"
Target 0: (lldb) stopped.
(lldb) bt
* thread #15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert
frame #0: 0x0000000180d08600 libsystem_kernel.dylib`__pthread_kill + 8
frame #1: 0x0000000180d40f50 libsystem_pthread.dylib`pthread_kill + 288
frame #2: 0x0000000180c4d908 libsystem_c.dylib`abort + 128
frame #3: 0x0000000180c4cc1c libsystem_c.dylib`__assert_rtn + 284
* frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) +
frame #5: 0x0000000109d30acc liblldb.20.0.0git.dylib`lldb_private::TypeSystemClang::GetBuiltinTypeForEncodingAndBitSize(lldb::Encoding, unsigned long) + 1188
frame #6: 0x0000000109aaaed4 liblldb.20.0.0git.dylib`DynamicLoaderMacOS::NotifyBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long long, unsigned long long) + 384
```
This patch adds a one-time user-visible warning for when we fail to
initialize the AST to indicate that initialization went wrong for the
given target. Additionally, we add checks for whether one of the
`ASTContext` `QualType`s is invalid before dereferencing any builtin
types.
The warning would look as follows:
```
(lldb) target create "a.out"
Current executable set to 'a.out' (arm64).
(lldb) b main
warning: Failed to initialize builtin ASTContext types for target 'some-unknown-triple'. Printing variables may behave unexpectedly.
Breakpoint 1: where = a.out`main + 8 at stepping.cpp:5:14, address = 0x0000000100003f90
```
rdar://134869779
There was a discrepancy between how SimplifyDemandedBits and
computeKnownBits handled the Argument case. computeKnownBits()
would use information from range attributes even once the
recursion limit has been reached.
Fixes https://github.com/llvm/llvm-project/issues/110631.
The comments in isObjectSmallerThan are outdated, as it is only ever
called with the underlying object as the first argument. Update the
comments to reflect this.
Tainted division operation is separated out from the core.DivideZero
checker into the optional optin.taint.TaintedDiv checker. The checker
warns when the denominator in a division operation is an attacker
controlled value.
Adds a new Transform Dialect Op that collects patters for dropping unit
dims from various Ops:
* `transform.apply_patterns.vector.drop_unit_dims_with_shape_cast`.
It excludes patterns for vector.transfer Ops - these are collected
under:
* `apply_patterns.vector.rank_reducing_subview_patterns`,
and use ShapeCastOp _and_ SubviewOp to reduce the rank (and to eliminate
unit dims).
This new TD Ops allows us to test the "ShapeCast folder" pattern in
isolation. I've extracted the only test that I could find for that
folder from "vector-transforms.mlir" and moved it to a dedicated file:
"shape-cast-folder.mlir". I also added a test case with scalable
vectors.
Changes in VectorTransforms.cpp are not needed (added a comment with
a TODO + ordered the patterns alphabetically). I am Including them here
to avoid a separate PR.
In preparation for refactoring the instruction size checks being made by
PAuth-related code, switch all instruction emission in AArch64AsmPrinter
to using EmitToStreamer function.
Introduce a single-operand overload of `EmitToStreamer(MCInst)`, as the
only MCStreamer passed as the first argument is actually `*OutStreamer`.
To decrease the number of code lines changed due to clang-format, do not
touch the existing calls to two-argument EmitToStreamer function so far.
There are llvm intrinsics which we do not expect to actually represent
code after lowering or which are not implemented yet but can be found in
a customer's LLVM IR input. We do not want translation to crash when
these llvm intrinsics are found, and this PR fixes the issue with
translation crash for some known cases, aligned with Khronos Translator.
This PR reworks implementation of OpSpecConstantOp with ptr-cast
operation (PtrCastToGeneric, GenericCastToPtr). Previous implementation
didn't take into account a lot of use cases, including multiple
inclusion of pointers, reference to a pointer from OpName, etc. A
reproducer is attached as a new test case.
This PR also fixes wrong type inference for IR patterns which generate
new virtual registers without SPIRV type. Previous implementation
assumed always that result has the same address space as a source that
is not the fact, and, for example, led to impossibility to emit a
ptr-cast operation in the reproducer, because wrong type inference
rendered source and destination with the same address space, eliminating
translation of G_ADDRSPACE_CAST.
This PR improves type inference and fixes inconsistency between
previously deduced element type of a pointer and function's return type.
It fixes https://github.com/llvm/llvm-project/issues/109401 by ensuring
that OpPhi is consistent with respect to operand types.
For register constraints that require specific register ranges, the
width of the range should match the type of the associated
parameter/return value. With this PR, we error out when that is not the
case. Previously, these cases would hit assertions or llvm_unreachables.
The handling of register constraints that require only a single register
remains more lenient to allow narrower non-vector types for the
associated IR values. For example, constraining an i16 or i8 value to a
32-bit register is still allowed.
Fixes#101190.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>