Commit Graph

520379 Commits

Author SHA1 Message Date
Callum Fare
fd3907ccb5 Reland #118503: [Offload] Introduce offload-tblgen and initial new API implementation (#118614)
Reland #118503. Added a fix for builds with `-DBUILD_SHARED_LIBS=ON`
(see last commit). Otherwise the changes are identical.

---


### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-12-05 09:34:04 +01:00
Feng Zou
636beb6a28 [X86][LLD] Handle R_X86_64_CODE_6_GOTTPOFF relocation type (#117675)
For

    add %reg1, name@GOTTPOFF(%rip), %reg2
    add name@GOTTPOFF(%rip), %reg1, %reg2
    {nf} add %reg1, name@GOTTPOFF(%rip), %reg2
    {nf} add name@GOTTPOFF(%rip), %reg1, %reg2
    {nf} add name@GOTTPOFF(%rip), %reg

add

    R_X86_64_CODE_6_GOTTPOFF = 50

in #117277.

Linker can treat R_X86_64_CODE_6_GOTTPOFF as R_X86_64_GOTTPOFF or
convert the instructions above to

    add $name@tpoff, %reg1, %reg2
    add $name@tpoff, %reg1, %reg2
    {nf} add $name@tpoff, %reg1, %reg2
    {nf} add $name@tpoff, %reg1, %reg2
    {nf} add $name@tpoff, %reg

if the first byte of the instruction at the relocation offset - 6 is
0x62 (namely, encoded w/EVEX prefix) when possible.

Binutils patch: bminor/binutils-gdb@5bc71c2
Binutils mailthread:
https://sourceware.org/pipermail/binutils/2024-February/132351.html
ABI discussion:
https://groups.google.com/g/x86-64-abi/c/FhEZjCtDLFw/m/VHDjN4orAgAJ
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation
2024-12-05 16:26:26 +08:00
Aiden Grossman
a9a4a83b61 [clang-format] Add test to ensure formatting options docs are updated (#118154)
This patch adds a lit test to clang format to ensure that the
ClangFormatStyleOptions doc page has been updated appropriately. The
test just runs the automatic update script and diffs the outputs to
ensure they are the same.
2024-12-04 23:41:12 -08:00
Owen Pan
6bec1806c9 [clang-format] Add plurals.txt to DEPENDS of style_options_depends 2024-12-04 23:02:02 -08:00
Iuri Chaer
f7560ee97b [clang-format] Add cmake target clang-format-style-options for updating ClangFormatStyleOptions.rst (#111513)
* Create a new `clang-format-style-options` build target which
re-generates ClangFormatStyleOptions.rst from its source header files.

As discussed in
https://github.com/llvm/llvm-project/pull/96804#discussion_r1718407404

---------

Co-authored-by: Owen Pan <owenpiano@gmail.com>
2024-12-04 22:50:01 -08:00
Thorsten Schütt
71ac1eb509 Revert "[GlobalISel] Combine [s,z]ext of undef into 0" (#118746)
Reverts llvm/llvm-project#117439
2024-12-05 07:48:20 +01:00
Renat Idrisov
0629e9e352 [MLIR] Removing dead values for branches (#117501)
Fixing RemoveDeadValues to properly remove arguments from
BranchOpInterface operations.
This is a follow-up for:
https://github.com/llvm/llvm-project/pull/117405
cc: @joker-eph @codemzs

---------

Co-authored-by: Renat Idrisov <parsifal-47@users.noreply.github.com>
2024-12-05 14:05:48 +08:00
Timm Baeder
abc27039be [clang][bytecode] Pass __builtin_memcpy size along (#118649)
To DoBitCastPtr, so we know how many bytes we want to read.
2024-12-05 06:55:18 +01:00
Craig Topper
3e0e1c13ce [RISCV][GISel] Support fp128 arithmetic and conversion for RV64. (#118707)
We can support these via libcalls in libgcc/compiler-rt or integer
operations for fneg/fabs/fcopysign. fp128 values will be passed in two
64-bit GPRs according to the psABI.

Supporting RV32 requires sret which is not supported by libcall handling
in LegalizerHelper.cpp yet. It doesn't call canLowerReturn.
2024-12-04 21:43:29 -08:00
Ben Shi
dba0861cd7 [AVR] Simplify eocoding of load/store instructions (#118279)
Fixes https://github.com/llvm/llvm-project/issues/113774
2024-12-05 13:05:25 +08:00
Vladimir Vereschaka
a996a15b4c [CMake] Allow parametrizing of the static libraries in Cross ARM CMake cache. NFC. (#118737)
In order to support the cross-arm remote tests for LLDB project

(see 'lldb-remote-linux-*' public builders for details).
2024-12-04 21:04:29 -08:00
Timm Baeder
44be794658 [clang][bytecode] Not all null pointers are 0 (#118601)
Get the Value from the ASTContext instead.
2024-12-05 06:03:50 +01:00
Kareem Ergawy
0993335134 [OpenMP][OMPIRBuilder] Add delayed privatization support for wsloop (#118463)
Extend MLIR to LLVM lowering by adding support for `omp.wsloop` for
delayed privatization. This also refactors a few bit of code to isolate
the logic needed for `firstprivate` initialization in a shared util that
can be used across constructs that need it. The same is done for
`dealloc`
regions.

Parent PR: https://github.com/llvm/llvm-project/pull/118447. Only latest
commit is relevant for this PR.
2024-12-05 05:59:52 +01:00
Kazu Hirata
50f8580e2c [memprof] Add IndexedMemProfData::addFrame (#118724)
This patch adds a helper function to replace an idiom like:

  FrameId Id = F.hash();
  MemProfData.Frames.try_emplace(Id, F);
  // Do something with Id.
2024-12-04 20:33:35 -08:00
Kareem Ergawy
7f72d71de7 [OpenMP][OMPIRBuilder] Refactor reduction initialization logic into one util (#118447)
This refactors the logic needed to emit init logic for reductions by
moving some duplicated code into a shared util. The logic for doing is
quite involved and is needed for any construct that has reductions.
Moreover, when a construct has both private and reduction clauses, both
sets of clauses need to cooperate with each other when emitting the
logic needed for allocation and initialization. Therefore, this PR
clearly sets the boundaries for the logic needed to initialize
reductions.
2024-12-05 05:23:49 +01:00
Kazu Hirata
7b8cf147ad [memprof] Update YAML traits for writer purposes (#118720)
For Frames, we prefer the inline notation for the brevity.

For PortableMemInfoBlock, we go through all member fields and print
out those that are populated.
2024-12-04 19:23:27 -08:00
Congcong Cai
f98c9a9b36 [mutation analyzer][NFC] combine ConditionalOperator BinaryConditionalOperator (#118602) 2024-12-05 11:17:45 +08:00
Igor Kudrin
740ac4f0ff Reland "[ObjectYAML][ELF] Take alignment into account when generating notes" (#118434)
This relands #118157 with a fix for the use of an uninitialized
variable and additional tests.

The System V ABI
(https://www.sco.com/developers/gabi/latest/ch5.pheader.html#note_section)
states that the note entries and their descriptor fields must be aligned
to 4 or 8 bytes for 32-bit or 64-bit objects respectively. In practice,
64-bit systems can use both alignments, with the actual format being
determined by the alignment of the segment. For example, the Linux
gABI extension (https://github.com/hjl-tools/linux-abi/wiki/linux-abi-draft.pdf)
contains a special note on this, see 2.1.7 "Alignment of Note Sections".

This patch adjusts the format of the generated notes to the specified
section alignment. Since `llvm-readobj` was fixed in a similar way in
https://reviews.llvm.org/D150022, "[Object] Fix handling of Elf_Nhdr
with sh_addralign=8", the generated notes can now be parsed
successfully by the tool.
2024-12-04 18:55:59 -08:00
hev
00d8ea3a4c [LoongArch] Supports FP_TO_SINT operation for fp16 (#118303)
Fixes #118301
2024-12-05 10:46:23 +08:00
Valentin Clement (バレンタイン クレメン)
7d1c661381 [flang] Allow to pass an async id to allocate the descriptor (#118713)
This is a patch in preparation for the support stream ordered memory
allocator in CUDA Fortran.

This patch adds an asynchronous id to the AllocatableAllocate runtime
function and to Descriptor::Allocate so it can be passed down to the
registered allocator. It is up to the allocator to use this value or
not.

A follow up patch will implement that asynchronous allocator for CUDA
Fortran.
2024-12-04 18:24:40 -08:00
pcc
970d6d2096 ELF: Have __rela_iplt_{start,end} surround .rela.iplt with --pack-dyn-relocs=android.
In #86751 we moved the IRELATIVE relocations to .rela.plt when
--pack-dyn-relocs=android was enabled but we neglected to also move
the __rela_iplt_{start,end} symbols. As a result, static binaries
linked with this flag were unable to find their IRELATIVE relocations.
Fix it by having the symbols surround the correct section.

Reviewers: MaskRay, smithp35

Reviewed By: MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/118585
2024-12-04 17:35:05 -08:00
Chris Apple
af4ae12780 [rtsan] Add fork/execve interceptors (#117198) 2024-12-04 16:38:37 -08:00
vdonaldson
df43af40ec Vkd1 (#118721) 2024-12-04 19:16:49 -05:00
Kazu Hirata
32b821cab3 [AST] Fix a warning
This patch fixes:

  clang/lib/AST/MicrosoftMangle.cpp:1006:11: error: enumeration value
  'S_PPCDoubleDoubleLegacy' not handled in switch [-Werror,-Wswitch]
2024-12-04 16:03:14 -08:00
vdonaldson
17f99accf2 [flang] build test fix/suppression (#118716) 2024-12-04 18:47:45 -05:00
Nick Desaulniers
659834df0e docgen refresh (#118709)
- **[libc][docgen] Use Macro for macro table name**
- **fix setjmp json, otherwise can't regen**
- **regen all docs**
2024-12-04 15:43:52 -08:00
Philip Reames
1ef9410a96 Revert "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)"
This reverts commit e6aec2c120.  Commit breaks "ninja check-llvm" on x86 host.
2024-12-04 15:37:25 -08:00
Roland McGrath
09f4c26262 [Driver][Fuchsia] Avoid "argument unused" warnings (#118416)
There should not be an error or warning reported for using
redundant options to control what goes into the link.  For
example, -nolibc -nostdlib.
2024-12-04 15:34:54 -08:00
Philip Reames
758107f70a [RISCV] Improve spread(N) shuffle testing
Rework them now that spread(2) is special cased to ensure we still have
non-zero shift coverage.
2024-12-04 15:21:08 -08:00
Vitaly Buka
fc201d6133 Revert "[InstCombine] Support gep nuw in icmp folds" (#118698)
Reverts llvm/llvm-project#118472

Breaks profile tests on i386
https://lab.llvm.org/buildbot/#/builders/66/builds/7009
2024-12-04 15:07:27 -08:00
Matthias Springer
f50ce316ec [llvm][NFC] APFloat: Add missing semantics to enum (#117291)
* Add missing semantics to the `Semantics` enum.
* Move all documentation of the semantics to the header file.
* Also rename some functions for consistency.
2024-12-04 14:58:59 -08:00
Craig Topper
2fea1ccb62 [RISCV][GISel] Correct the widening predicate for G_SITOFP/G_UITOFP.
This happened to coincidentally work due to D and Zfh both depending
on the F extension.

It breaks when I tried to add fp128 libcall support.
2024-12-04 14:35:44 -08:00
Joseph Huber
a2fc276ed2 [libc] Remove complicated header guards on HSA include
Summary:
This is much more standard now, we already require new HSA with what we
use, so no point checking for this.
2024-12-04 16:28:13 -06:00
Brox Chen
1b4cdc401a [AMDGPU][True16][MC]update vop3 dasm test with latest script (#118686)
This is a NFC. Update dasm test for VOP3 using latest update script
2024-12-04 17:28:10 -05:00
Jun Wang
e6aec2c120 [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)
The AMDGPUAnnotateKernelFeatures pass infers the "amdgpu-calls" and
"amdgpu-stack-objects" attributes, which are used to infer whether we
need to initialize flat scratch. This is, however, not precise. Instead,
we should use AMDGPUAttributor and infer amdgpu-no-flat-scratch-init on
kernels. Refer to https://github.com/llvm/llvm-project/issues/63586 .
2024-12-04 14:10:15 -08:00
Nick Desaulniers
b86a5993bc [libc] remove references to LIBC_HDRGEN_EXE (#118670)
Further cleanups from old hdrgen removal. I didn't realize there were
cmake
variables related to old hdrgen spread out throughout more of the source
tree.

Link: #117220
Link: #117208
2024-12-04 14:04:18 -08:00
Petr Hosek
8cffab821c [Fuchsia] Remove libc from LLVM_ENABLE_PROJECTS (#118704)
This was only needed for old hdrgen which is no longer being used.
2024-12-04 13:58:41 -08:00
AdityaK
004e75ef17 Pack relocations for Android API >= 28 (#117624)
Patch copied from:
https://github.com/android/ndk/issues/909#issuecomment-649872696
Fixes: https://github.com/android/ndk/issues/909
2024-12-04 13:53:47 -08:00
Luke Quinn
261d4bbb3b [RISCV] f32 roundeven pattern missed for Zfa (#118672)
f32 roundeven pattern was missing from RISCVInstrInfoZfa.td. Tests for
roundeven.f32/f16/f64 were missing.
2024-12-04 13:52:20 -08:00
Daniel Paoliello
35c7df1a21 [aarch64][arm] Add support for the _Interlocked[Compare]ExchangePointer_{acq|nf|rel} MS intrinsics (#117645)
Adds support for the following MSVC intrinsics:
* `_InterlockedCompareExchangePointer_acq`
* `_InterlockedCompareExchangePointer_rel`
* `_InterlockedExchangePointer_acq`
* `_InterlockedExchangePointer_nf`
* `_InterlockedExchangePointer_rel`

These are documented at:
<https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170#interlocked-intrinsics>

NOTE: `_InterlockedCompareExchangePointer_nf` is not being added since
it already exists, although it was incorrectly added for all
architectures instead of being Arm & AArch64 specific.

This change also unifies how the pointer and non-pointer interlocked
compare-exchange intrinsics are being handled.
2024-12-04 13:41:26 -08:00
lntue
a7da702377 [libc][math] Add small code size options for atan2f. (#118532) 2024-12-04 16:37:51 -05:00
Valentin Clement (バレンタイン クレメン)
7efd6139f2 [flang][cuda] Get device address in fir.declare (#118591)
Add pattern that update fir.declare memref when it comes from a device
global and is not a descriptor. In that case, we recover the device
address that needs to be used in ops like `fir.array_coor` and so on.
2024-12-04 13:36:58 -08:00
Marina Taylor
e6bd00c0f7 [Inliner] Add a helper around SimplifiedValues.lookup. NFCI (#118646) 2024-12-04 21:27:02 +00:00
vdonaldson
6003be7ef1 [flang] IEEE_GET_UNDERFLOW_MODE, IEEE_SET_UNDERFLOW_MODE (#118551)
Implement IEEE_GET_UNDERFLOW_MODE and IEEE_SET_UNDERFLOW_MODE. Update
IEEE_SUPPORT_UNDERFLOW_CONTROL to enable support for indvidual REAL
kinds.
2024-12-04 16:21:11 -05:00
George Stagg
ac5dd455ca [WebAssembly] Support multiple .init_array fragments when writing Wasm objects (#111008) 2024-12-04 13:12:15 -08:00
Zequan Wu
2e425bf629 Reapply "[lldb][dwarf] Compute fully qualified names on simplified template names with DWARFTypePrinter (#117071)"
9de73b2040 lands a fix to DWARFTypePrinter that is used by lldb in this change.
2024-12-04 13:05:36 -08:00
Augie Fackler
ce0f11325e Revert "[clangd] Re-land "support outgoing calls in call hierarchy" (#117673)"
This reverts commit 7be3326200.

Per https://protobuf.dev/programming-guides/dos-donts/#add-required
this will re-land tomorrow without the required fields.
2024-12-04 15:58:56 -05:00
Florian Hahn
7954a0514b [Clang] Enable -fpointer-tbaa by default. (#117244)
Support for more precise TBAA metadata has been added a while ago
(behind the -fpointer-tbaa flag). The more precise TBAA metadata allows
treating accesses of different pointer types as no-alias.

This helps to remove more redundant loads and stores in a number of
workloads.

Some highlights on the impact across llvm-test-suite's MultiSource,
SPEC2006 & SPEC2017 include:
 * +2% more NoAlias results for memory accesses
 * +3% more stores removed by DSE,
 * +4% more loops vectorized.

This closes a relatively big gap to GCC, which has been supporting
disambiguating based on pointer types for a long time.
(https://clang.godbolt.org/z/K7Wbhrz4q)

Pointer-TBAA support for pointers to builtin types has been added in
https://github.com/llvm/llvm-project/pull/76612.

Support for user-defined types has been added in
https://github.com/llvm/llvm-project/pull/110569.

There are 2 recent PRs with bug fixes for special cases uncovered during
testing:
 * https://github.com/llvm/llvm-project/pull/116991
 * https://github.com/llvm/llvm-project/pull/116596

PR: https://github.com/llvm/llvm-project/pull/117244
2024-12-04 20:55:18 +00:00
Matt Arsenault
431581b22a AMDGPU: Simplify definition of bitop3 operand. NFC. (#118648)
Co-authored-by: Jay Foad <jay.foad@amd.com>
2024-12-04 15:47:20 -05:00
Matt Arsenault
e0f52538c9 AMDGPU: Change bitop3 intrinsic operand to i32 (#118647) 2024-12-04 15:44:04 -05:00