Creation of pointers and jump stubs always succeeds for all existing JITLink
backends, and I haven't been able to think of a scenario where it would fail.
(Pointer / stub *fixup* may fail due to range errors, but that will happen
later and the APIs already account for it).
Add support for '`llvm.nvvm.flo.[su].*`' intrinsics which correspond to
a PTX `bfind` instruction.
See [PTX ISA 9.7.1.16. Integer Arithmetic Instructions: bfind](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#integer-arithmetic-instructions-bfind)
The '`llvm.nvvm.flo.u`' family of intrinsics identifies the bit position
of the leading one, returning its offset from either the most or the least
significant bit.
The '`llvm.nvvm.flo.s`' family of intrinsics identifies the bit position
of the leading non-sign bit, returning its offset from either the most
or the least significant bit.
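For readers unfamiliar with `bfind`, the behaviour described above can be modelled in plain C++. This is only a sketch of the 32-bit semantics as described in the PTX ISA documentation, not the intrinsic lowering itself; the function names `bfindU32` and `bfindS32` and the `shiftamt` flag are hypothetical names chosen for illustration.

```cpp
#include <cstdint>

// Model of the unsigned form: position of the most significant 1 bit.
// With shiftamt == false the result is the offset from the least significant
// bit; with shiftamt == true it is the offset from the most significant bit.
// Returns 0xFFFFFFFF when no such bit exists (input is zero).
inline uint32_t bfindU32(uint32_t a, bool shiftamt) {
  for (int i = 31; i >= 0; --i) {
    if ((a >> i) & 1u)
      return shiftamt ? static_cast<uint32_t>(31 - i)
                      : static_cast<uint32_t>(i);
  }
  return 0xFFFFFFFFu;
}

// Model of the signed form: position of the most significant non-sign bit.
// For negative inputs that is the most significant 0 bit, so the value is
// inverted first and the unsigned search is reused.
inline uint32_t bfindS32(int32_t a, bool shiftamt) {
  uint32_t bits = static_cast<uint32_t>(a);
  if (a < 0)
    bits = ~bits;
  return bfindU32(bits, shiftamt);
}
```

In this model an input with no leading one (or no non-sign bit, such as 0 or -1 in the signed case) yields `0xFFFFFFFF` in both modes.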
This re-applies 244ea40625, which was reverted in 0019d06185 while I
investigated a bot failure. The fix for the failure will be committed as a
follow-up.
This enables the assertion introduced in
https://github.com/llvm/llvm-project/pull/106524, which checks that the
value passed to the constructor is indeed a valid N-bit signed or
unsigned integer.
Places that previously violated the assertion were updated in advance,
e.g. in https://github.com/llvm/llvm-project/pull/80309.
It is possible to opt out of the check and restore the previous behavior
by setting implicitTrunc=true.
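To make "a valid N-bit signed or unsigned integer" concrete, here is a small standalone sketch of such a range check. The helper names are hypothetical and this is not the LLVM implementation; `truncateToN` additionally assumes that the opt-out behaviour amounts to keeping only the low N bits.

```cpp
#include <cstdint>

// Hypothetical helper: true if `value` is representable as an unsigned
// integer of width `bits`.
inline bool fitsUnsignedN(unsigned bits, uint64_t value) {
  if (bits == 0)
    return value == 0;
  if (bits >= 64)
    return true;
  return (value >> bits) == 0;
}

// Hypothetical helper: true if `value` is representable as a two's-complement
// signed integer of width `bits`.
inline bool fitsSignedN(unsigned bits, int64_t value) {
  if (bits == 0)
    return value == 0;
  if (bits >= 64)
    return true;
  const int64_t hi = (int64_t{1} << (bits - 1)) - 1;
  const int64_t lo = -hi - 1;
  return value >= lo && value <= hi;
}

// Assumed model of the opt-out: silently keep only the low `bits` bits.
inline uint64_t truncateToN(unsigned bits, uint64_t value) {
  if (bits >= 64)
    return value;
  return value & ((uint64_t{1} << bits) - 1);
}
```

Under this model, 256 is not a valid 8-bit unsigned value, and truncating it instead of rejecting it would silently produce 0.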
-----
The buildbot failures from the previous attempt should be fixed by
a18dd29077 and
e2074c60bb.
This pull request corrects multiple occurrences of the typo "avaliable"
to "available" across the LLVM and Clang codebase. These changes improve
the clarity and accuracy of comments and documentation. Specific
modifications are in the following files:
1. clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp:
Updated comments in readability checks for cognitive complexity.
2. llvm/include/llvm/ExecutionEngine/Orc/ExecutionUtils.h: Corrected
documentation for JITDylib responsibilities.
3. llvm/include/llvm/Target/TargetMacroFusion.td: Fixed descriptions for
FusionPredicate variables.
4. llvm/lib/CodeGen/SafeStack.cpp: Improved comments on DominatorTree
availability.
5. llvm/lib/Target/RISCV/RISCVSchedSiFive7.td: Enhanced resource usage
descriptions for vector units.
6. llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp: Updated invariant
description in shift-detect idiom logic.
7. llvm/test/MC/ARM/mve-fp-registers.s: Amended ARM MVE register
availability notes.
8. mlir/lib/Bytecode/Reader/BytecodeReader.cpp: Adjusted forward
reference descriptions for bytecode reader operations.
These changes have no impact on code functionality, focusing solely on
documentation clarity.
Co-authored-by: wangqiang <wangqiang1@kylinos.cn>
This is a fixed version of #106185, which was reverted in #113978 due to
a buildbot failure.
Motivation example:
```
> cat test.cpp
extern "C" [[gnu::weak]] void f() {}
void alias() __attribute__((alias("f")));
int main() { auto p = alias; p(); }
> clang test.cpp -fsanitize=cfi-icall -flto=thin -fuse-ld=lld
> ./a.out
[1] 1868 illegal hardware instruction ./a.out
```
If the address of a function was only taken through its alias, the
function was not considered exported and therefore was not included in
the CFI jumptable. This resulted in `@llvm.type.test()` being lowered to
`false`, and consequently the indirect call to the function was
eventually optimized to `ubsantrap()`.
When expanding an atomicrmw with a cmpxchg, preserve any metadata
attached to it. This will avoid unwanted double expansions
in a future commit.
The initial load should also probably receive the same metadata
(which for some reason is not emitted as an atomic).
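As background, the shape of a cmpxchg-based expansion can be illustrated at the source level with `std::atomic`. This is a sketch of the general load-then-compare-exchange loop, not the pass's actual output; `atomicRMWViaCmpXchg` is a hypothetical name, and IR metadata has no equivalent here.

```cpp
#include <atomic>

// Performs "old = *target; *target = op(old, operand)" atomically using only
// a compare-exchange loop, and returns the old value.
template <typename T, typename Op>
T atomicRMWViaCmpXchg(std::atomic<T> &target, T operand, Op op) {
  // Initial load of the current value.
  T observed = target.load(std::memory_order_relaxed);
  // Retry until no other thread changed the value in between. On failure,
  // compare_exchange_weak stores the freshly observed value into `observed`.
  while (!target.compare_exchange_weak(observed, op(observed, operand),
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed)) {
  }
  return observed;
}
```

The initial `load` in this sketch corresponds to the initial load mentioned above; the loop body is the part that replaces the single read-modify-write operation.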
This reverts commit f8f4235612 and replaces the
MapVector with a sorted vector in the debug dump: We only need to sort the
sections for debug dumping, and don't want LinkGraph API clients assuming
anything about the section iteration order.
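The idea of sorting only for the dump can be sketched as follows. The `Section` type and `sortedForDump` are hypothetical stand-ins, and sorting by name is an assumption; the point is that the sort happens on a temporary vector of pointers, so the container's own iteration order is left untouched.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph section.
struct Section {
  std::string Name;
};

// Builds a sorted view of the sections for dumping without reordering the
// underlying container.
inline std::vector<const Section *>
sortedForDump(const std::vector<Section> &Sections) {
  std::vector<const Section *> Sorted;
  Sorted.reserve(Sections.size());
  for (const Section &S : Sections)
    Sorted.push_back(&S);
  std::sort(Sorted.begin(), Sorted.end(),
            [](const Section *L, const Section *R) {
              return L->Name < R->Name;
            });
  return Sorted;
}
```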
The change improves the code in general and, as a side effect, avoids
crashing on impossible address space casts guarded
by `__isGlobal/__isShared`, which partially fixes
https://github.com/llvm/llvm-project/issues/112760
It's still possible to trigger the issue by using explicit AS casts w/o
AS checks, but LLVM should no longer crash on valid code.
This is #112964 plus a small fix for the crash on unintended argument
access, which was the reason the earlier version of the patch was reverted.
This patch adds basic support for `hypot`. Constant folding support will
be submitted in a subsequent patch.
Related issue: https://github.com/llvm/llvm-project/issues/113711
Note: This is my first time contributing to LLVM, with encouragement
from one of my friends, @fawdlstty. I learned a lot from
https://github.com/llvm/llvm-project/pull/99611, and thanks for that.
Note: I previously created and merged the same PR
(https://github.com/llvm/llvm-project/pull/113724), but it was reverted
because of a merge issue. (The CI issue happened at 3 A.M. in my timezone,
so I went back to sleep after replying about why the issue happened.)
I have rebased onto the latest main branch and recreated the PR, and I
hope I won't have to create the same PR a third time.
I hope @arsenm can help me review the code again. I’m sorry for the trouble.
Kenji Mouri
This shares most of its code with the scalar sincos expansion. It allows
expanding vector FSINCOS nodes to a library call from the specified
`-vector-library`. The upside is that the vectorizer only needs to
handle the sincos intrinsic, which has no memory effects, while this
expansion handles lowering the intrinsic to a call that takes output
pointers.
The int_amdgcn_mov_dpp8 is overloaded, but we can only select i32.
To allow a corresponding builtin to be overloaded the same way as
int_amdgcn_mov_dpp we need it to be able to split unsupported values.
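The general idea of splitting a wider value into 32-bit pieces can be sketched in plain C++. This is only an illustration of the technique for a 64-bit input, assuming the operation acts independently on each 32-bit piece; `applyPer32BitHalf` is a hypothetical name and the source does not describe how the actual lowering is done.

```cpp
#include <cstdint>

// Applies a 32-bit-only operation to a 64-bit value by splitting it into two
// 32-bit halves, applying the operation to each, and recombining the results.
template <typename Op32>
uint64_t applyPer32BitHalf(uint64_t value, Op32 op) {
  const uint32_t lo = static_cast<uint32_t>(value);
  const uint32_t hi = static_cast<uint32_t>(value >> 32);
  const uint32_t newLo = op(lo);
  const uint32_t newHi = op(hi);
  return (static_cast<uint64_t>(newHi) << 32) | newLo;
}
```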
Currently, the `DropTypeTests` parameter only fully works with phi nodes
and llvm.assume instructions. However, we'd like CFI to work in
conjunction with FatLTO, in so far as the bitcode section should be able
to contain the CFI instrumentation, while any incompatible bits are
dropped when compiling the object code.
To do that, we need to drop the llvm.type.test instructions everywhere,
and not just their uses in phi nodes. This patch updates the
LowerTypeTest pass so that uses are removed, and replaced with `true` in
all cases, and not just in phi nodes.
Addressing this will allow us to fix #112053 by modifying the FatLTO
pipeline.
Reviewers: pcc, nikic
Reviewed By: pcc
Pull Request: https://github.com/llvm/llvm-project/pull/112787
The change improves the code in general and, as a side effect, avoids crashing
on impossible address space casts guarded by `__isGlobal/__isShared`, which
partially fixes https://github.com/llvm/llvm-project/issues/112760
It's still possible to trigger the issue by using explicit AS casts w/o
AS checks, but LLVM should no longer crash on valid code.
This patch is a part of step-by-step refactoring of CloneFunctionInto.
The goal is to extract reusable pieces out of it that will be later used
to optimize function cloning e.g. in coroutine processing.
Extracted from #109032 (commit 2)
Allow construction of DstOp with VRegAttrs.
Also allow construction with register class or bank and LLT.
Intended to be used in lowering code for reg-bank-select where
new registers need to have both register bank and LLT.
Add support for new type of DstOp in CSEMIRBuilder.
Removes sve-bf16, sve-ebf16, and sve-i8mm since they are obsolete. One
could write target_version("sve+bf16") instead of sve-bf16 for instance.
Approved in ACLE as https://github.com/ARM-software/acle/pull/353
This patch converts `SDNodeFlags` into an enumeration as we did for
`FastMathFlags`. It simplifies the implementation and improves
compile time. This patch is NFC since it doesn't break the SDNodeFlags API.
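The pattern of representing flags as an enumeration of bit masks while keeping named accessors can be sketched as follows. `NodeFlags` and its members are hypothetical and much smaller than the real class; the sketch only shows why operations such as intersection reduce to a single bitwise operation.

```cpp
// Hypothetical flag set stored as one integer with enum-defined bit masks.
struct NodeFlags {
  enum : unsigned {
    None = 0,
    NoUnsignedWrap = 1u << 0,
    NoSignedWrap = 1u << 1,
    Exact = 1u << 2,
  };

  unsigned Flags = None;

  // Sets or clears the given mask.
  void setFlag(unsigned Mask, bool Value) {
    Flags = Value ? (Flags | Mask) : (Flags & ~Mask);
  }

  bool hasFlag(unsigned Mask) const { return (Flags & Mask) == Mask; }

  // Named accessors can be kept so that callers need not change.
  bool hasNoUnsignedWrap() const { return hasFlag(NoUnsignedWrap); }
  bool hasExact() const { return hasFlag(Exact); }

  // Keeps only the flags present in both sets: one AND instead of a
  // field-by-field update.
  void intersectWith(NodeFlags Other) { Flags &= Other.Flags; }
};
```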
Add support for generating random hotness in the memprof profile writer,
to be used for testing. The random seed is printed to stderr, and an
additional option enables providing a specific seed in order to
reproduce a particular random profile.
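The reproducibility scheme described above (print the seed, accept a seed to replay) can be sketched like this. The `Hotness` enum, the function name, and the message format are hypothetical; the actual option names and hotness categories are not given in this message.

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

// Hypothetical hotness categories.
enum class Hotness { Cold, NotCold };

// Generates `count` pseudo-random hotness values from an explicit seed and
// reports the seed on stderr so that a run can be reproduced later.
inline std::vector<Hotness> randomHotness(std::size_t count, uint32_t seed) {
  std::cerr << "random hotness seed: " << seed << "\n";
  std::mt19937 gen(seed);
  std::vector<Hotness> result;
  result.reserve(count);
  for (std::size_t i = 0; i < count; ++i)
    result.push_back((gen() & 1u) ? Hotness::NotCold : Hotness::Cold);
  return result;
}

// Without an explicit seed, a fresh one is drawn and then reported by the
// overload above.
inline std::vector<Hotness> randomHotness(std::size_t count) {
  return randomHotness(count, static_cast<uint32_t>(std::random_device{}()));
}
```

Passing the seed printed by an earlier run back into the two-argument overload reproduces the same sequence.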