2566 Commits

Author SHA1 Message Date
Joseph Huber
2d588461bc [Libomptarget] Add more moves to expected conversion
Summary:
Fixes other instances of the same problem in the previous patch.
2023-01-06 09:09:45 -06:00
Joseph Huber
75c03596b8 [Libomptarget] Add move to expected conversion
Summary:
These implicit conversions from move-only types to expected seem to only
work with newer compilers. This should hopefully fix it.
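
A minimal sketch of the pattern this fix addresses. `ExpectedLike` is a hypothetical stand-in for `llvm::Expected<T>`, and `makeBuffer` and `roundTrip` are illustrative names, not code from the patch. The point is only that returning a local move-only object through a converting constructor is made explicit with `std::move`:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for llvm::Expected<T>; only the converting
// constructor from a value matters for this illustration.
template <typename T> class ExpectedLike {
public:
  // Converting constructor: accepts any value convertible to T.
  template <typename U,
            typename = std::enable_if_t<std::is_convertible<U, T>::value>>
  ExpectedLike(U &&Val) : Value(std::forward<U>(Val)) {}

  T &get() { return Value; }

private:
  T Value;
};

// Move-only payload: std::unique_ptr cannot be copied.
using Buffer = std::unique_ptr<int>;

ExpectedLike<Buffer> makeBuffer(int V) {
  Buffer B = std::make_unique<int>(V);
  // A plain `return B;` depends on the compiler treating B as an rvalue even
  // though the return type differs from B's type. The explicit std::move
  // removes that dependency.
  return std::move(B);
}

// Usage: the payload survives the conversion into the wrapper.
int roundTrip(int V) { return *makeBuffer(V).get(); }
```

The explicit move costs nothing when the implicit conversion would have worked anyway, which is presumably why it is a safe portability fix.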
2023-01-06 09:09:45 -06:00
Johannes Doerfert
ccc1324120 Introduce environment variables to deal with JIT IR
We can now dump the IR before and after JIT optimizations into the
files passed via `LIBOMPTARGET_JIT_PRE_OPT_IR_MODULE` and
`LIBOMPTARGET_JIT_POST_OPT_IR_MODULE`, respectively.

Similarly, users can set `LIBOMPTARGET_JIT_REPLACEMENT_MODULE` to
replace the IR in the image with a custom IR module in a file.
All options take file paths; documentation was added.
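
As a rough illustration of how a runtime could pick up these three variables, here is a sketch. The variable names come from the message above; `envPath`, `JITIRPaths`, and `readJITIRPaths` are hypothetical helpers and do not claim to mirror the actual implementation:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the value of the environment variable Name,
// or an empty string when it is unset.
std::string envPath(const char *Name) {
  const char *Val = std::getenv(Name);
  return Val ? std::string(Val) : std::string();
}

// Hypothetical bundle of the three file paths; an empty string means
// "feature not requested".
struct JITIRPaths {
  std::string PreOpt;      // where to dump IR before JIT optimizations
  std::string PostOpt;     // where to dump IR after JIT optimizations
  std::string Replacement; // custom IR module replacing the embedded one
};

JITIRPaths readJITIRPaths() {
  return {envPath("LIBOMPTARGET_JIT_PRE_OPT_IR_MODULE"),
          envPath("LIBOMPTARGET_JIT_POST_OPT_IR_MODULE"),
          envPath("LIBOMPTARGET_JIT_REPLACEMENT_MODULE")};
}
```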

Reviewed by: tianshilei1992

Differential revision: https://reviews.llvm.org/D140945
2023-01-05 00:17:46 -08:00
Johannes Doerfert
c63dced93b [OpenMP][JIT] Introduce support for AMDGPU
To JIT kernels for AMDGPUs we need to provide the architecture, the
triple, and a post-link callback. The first two are simple, the last one
is a little more complicated since we need to invoke `lld`. There is
some library interface but for that we need the lld library, which is
not generally available, thus we go with the executable for now. Either
way, we need to manifest the (amdgcn) object file and read the
output from another file. We should try to avoid that in the future.
The options for `lld` are copied from the way clang invokes it.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D140720
2023-01-04 10:14:27 -08:00
Johannes Doerfert
93e75714cd [OpenMP][AMDGPU][NFC] Improve error message for errors 2023-01-03 17:09:32 -08:00
Johannes Doerfert
5524952c14 [OpenMP][JIT][FIX] Create the default O0 pipeline for -O0 2023-01-03 17:07:52 -08:00
Johannes Doerfert
428bc510bf [OpenMP] Unify "exec_mode" query code and default to SPMD
Defaulting to Generic mode doesn't make much sense as the kernel needs
to be prepared for it. SPMD mode is the "native" execution, e.g., for
"bare" kernels. It also is the execution method for constructors and
destructors (as we might otherwise throw an extra warp onto them).
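
The unified query with an SPMD fallback might look roughly like the sketch below. `ExecMode`, `SymbolTable`, `queryExecMode`, and the `_exec_mode` symbol suffix are assumptions made for illustration, and the image is modeled as a plain map instead of a real device image:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical execution modes; the real runtime may define more.
enum class ExecMode { Generic, SPMD };

// Stand-in for the symbols found in a device image.
using SymbolTable = std::map<std::string, ExecMode>;

// Single query used by every caller: look up the per-kernel mode symbol and
// fall back to SPMD when the image does not provide one.
ExecMode queryExecMode(const SymbolTable &Image, const std::string &Kernel) {
  auto It = Image.find(Kernel + "_exec_mode");
  if (It == Image.end())
    return ExecMode::SPMD; // "native" execution, e.g. for bare kernels
  return It->second;
}
```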

Differential Revision: https://reviews.llvm.org/D140718
2023-01-03 16:58:13 -08:00
JP Lehr
263962545d [OpenMP] Solve potential VERSION script error w/ OMPT symbols
The patch adds the symbols if OMPT_SUPPORT is not defined.
Github issue: https://github.com/llvm/llvm-project/issues/59660

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D140591
2023-01-03 16:47:12 -05:00
Doru Bercea
86dc7de8ff Fix initializer name. 2023-01-03 12:45:28 -06:00
Ron Lieberman
750e1c8dbd Revert "[libomptarget][plugin-nextgen] fix for [TypePromotion] NewPM support."
This reverts commit 135f6a1ee8.
2023-01-03 12:26:39 -06:00
Ron Lieberman
135f6a1ee8 [libomptarget][plugin-nextgen] fix for [TypePromotion] NewPM support. 2023-01-03 11:04:13 -06:00
Kevin Sala
339d810a0f [OpenMP][libomptarget] Add TargetParser as dependency in NextGen's JIT
This patch fixes an undefined reference to llvm::Triple::Triple(llvm::Twine const&).

Differential Revision: https://reviews.llvm.org/D140810
2023-01-01 13:29:30 +01:00
Shilei Tian
75019f18bd [OpenMP][JIT] Fixed a couple of issues in the initial implementation of JIT
This patch fixes a couple of issues:
1. Instead of using `llvm_unreachable` in the base virtual functions, an unknown
   value is now returned. The previous approach could cause a runtime error on
   targets where the image is not compatible but JIT is not implemented.
2. Fixed the typo in CMake that left the `Target` CMake variable undefined.
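
The first item can be pictured with the following sketch. All names (`ArchKind`, `GenericDevice`, `getJITArch`, `canJIT`) are hypothetical and only illustrate a base-class default that returns an "unknown" value where it previously aborted:

```cpp
#include <cassert>

// Hypothetical architecture tag; the names are illustrative only.
enum class ArchKind { Unknown, NVPTX64, AMDGCN };

struct GenericDevice {
  virtual ~GenericDevice() = default;
  // Base default: report "unknown" rather than aborting, so a plugin without
  // JIT support can still be queried and the caller can reject the image.
  virtual ArchKind getJITArch() const { return ArchKind::Unknown; }
};

struct JITCapableDevice : GenericDevice {
  ArchKind getJITArch() const override { return ArchKind::NVPTX64; }
};

// Caller side: only attempt JIT when the device names a real architecture.
bool canJIT(const GenericDevice &Dev) {
  return Dev.getJITArch() != ArchKind::Unknown;
}
```

With this shape, an incompatible image on a target without JIT leads to an ordinary "cannot JIT" result, not a crash inside the base class.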

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D140732
2022-12-28 14:40:59 -05:00
Shilei Tian
5a3a527f8a [OpenMP] Introduce basic JIT support to OpenMP target offloading
This patch adds the basic JIT support for OpenMP. Currently it only works on Nvidia GPUs.

The support for AMDGPU can be extended easily by just implementing three interface functions. However, the infrastructure requires a small extra extension (add a pre process hook) to support portability for AMDGPU because the AMDGPU backend reads target features of functions. 02bc7effcc (diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432) shows how it roughly works.

As for the test, even though I added the corresponding code in CMake files, the test still cannot be triggered because some code is missing in the new plugin CMake file, which has nothing to do with this patch. It will be fixed later.

In order to enable JIT mode, `-foffload-lto` is needed when compiling, and `-foffload-lto -Wl,--embed-bitcode` is needed when linking. This implies that LTO is required to enable JIT mode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D139287
2022-12-27 22:19:05 -05:00
Shilei Tian
95956bd896 Revert "[OpenMP] Introduce basic JIT support to OpenMP target offloading"
This reverts commit 58906e4901 because it breaks AMD's buildbot.
2022-12-27 21:52:07 -05:00
Shilei Tian
58906e4901 [OpenMP] Introduce basic JIT support to OpenMP target offloading
This patch adds the basic JIT support for OpenMP. Currently it only works on Nvidia GPUs.

The support for AMDGPU can be extended easily by just implementing three interface functions. However, the infrastructure requires a small extra extension (add a pre process hook) to support portability for AMDGPU because the AMDGPU backend reads target features of functions. 02bc7effcc (diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432) shows how it roughly works.

As for the test, even though I added the corresponding code in CMake files, the test still cannot be triggered because some code is missing in the new plugin CMake file, which has nothing to do with this patch. It will be fixed later.

In order to enable JIT mode, `-foffload-lto` is needed when compiling, and `-foffload-lto -Wl,--embed-bitcode` is needed when linking. This implies that LTO is required to enable JIT mode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D139287
2022-12-27 19:07:32 -05:00
Shilei Tian
4007963756 [RFC][OpenMP] Update to Python3 for lit test
I think it's reasonable to upgrade to Python 3 for LIT test requirement because `lit` itself (`llvm/utils/lit/lit.py`) already switched to Python 3. In addition, LLVM already requires Python 3.6 to be the minimum version (https://llvm.org/docs/GettingStarted.html#software).

Reviewed By: jdoerfert, jhuber6

Differential Revision: https://reviews.llvm.org/D139855
2022-12-26 21:39:51 -05:00
Shilei Tian
a82e5825e0 [NFC][OpenMP] Fix compile warning caused by using std::move on a local object on a return statement 2022-12-23 10:42:29 -05:00
Vignesh Balasubramanian
ae1507d3ea [OpenMP] [OMPD] Enable OMPD Tests
It was disabled due to different failures in different LLVM bots.

Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D138411
2022-12-23 11:47:21 +05:30
Kevin Sala
e5354a2bfa [OpenMP][libomptarget] Centralize host pinned buffers map to NextGen's PluginInterface
This patch moves the management/tracking of host pinned buffers to the common PluginInterface
in NextGen plugins. For the moment, the management consists of tracking the host pinned
allocations into a map in each device.

Differential Revision: https://reviews.llvm.org/D140502
2022-12-22 02:11:05 +01:00
Kevin Sala
a487e0ffde [NFC][OpenMP][libomptarget] Return null if error detected during allocation in NextGen AMDGPU 2022-12-22 01:46:33 +01:00
Guilherme Valarini
4e32d5cedf [OpenMP] Disable libomptarget integration on unsupported platforms
Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D140419
2022-12-20 16:41:43 -03:00
Joseph Huber
3d2aa5473c [OpenMP][NFC] Fix message to recommend C++17 instead of C++14
Summary:
This was changed in LLVM 16.0.
2022-12-20 13:39:44 -06:00
Sunil Kuravinakop
e9babe7571 [OpenMP] Clang Support for taskwait nowait clause
Support for taskwait nowait clause with placeholder for runtime changes.

Reviewed By: cchen, ABataev

Differential Revision: https://reviews.llvm.org/D131830
2022-12-20 12:13:56 -06:00
Johannes Doerfert
e3d9a448c5 [OpenMP] Account for dynamic shared memory in the AMDGPU nextgen plugin 2022-12-19 19:09:44 -08:00
Johannes Doerfert
fb2c42df41 [OpenMP] Improve AMDGPU Plugin
With this patch we:
- pick more sensible defaults for the number of teams, inspired by the
  old plugin, and configured via LIBOMPTARGET_AMDGPU_TEAMS_PER_CU.
- check the input signal of a kernel launch late, after the queue lock
  was taken, to avoid a barrier packet more often.
- copy the kernel arguments in one swoop into the appropriate memory.
- manually specialize the callbacks to avoid potential indirect calls.
2022-12-19 19:09:43 -08:00
Ye Luo
ee3d9ee49c [OpenMP] Change the nextgen plugin kernel thread count scheme as old plugins'
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D140352
2022-12-19 18:27:02 -06:00
Johannes Doerfert
77197b5651 [OpenMP] Export ompx:: symbols from the device runtime
Differential Revision: https://reviews.llvm.org/D140335
2022-12-19 14:46:54 -08:00
Johannes Doerfert
2b5a99b3d9 [OpenMP] Rename the _OMP namespace in the device runtime to ompx
Differential Revision: https://reviews.llvm.org/D140334
2022-12-19 14:43:59 -08:00
Kevin Sala
6bbf9c0cca [OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior
This commit adds the AMDGPU NextGen plugin inheriting from PluginInterface's classes.
It also implements the asynchronous behavior in the plugin operations: kernel launches
and memory transfers. To this end, it implements the concept of streams of asynchronous
operations. The streams are implemented using the HSA signals to define input and output
dependencies between asynchronous operations.

Missing features:
  - Retrieve the maximum number of threads per group that a kernel can run. This requires
    reading the image.
  - Implement __tgt_rtl_sync_event, not used on the libomptarget side.

Differential Revision: https://reviews.llvm.org/D138389
2022-12-17 00:01:24 +01:00
Kevin Sala
7b97941721 [OpenMP][libomptarget] Add missing symbols in dynamic_hsa
This patch prepares for the new AMDGPU NextGen plugin.

Differential Revision: https://reviews.llvm.org/D140213
2022-12-17 00:01:24 +01:00
Joseph Huber
d8b0f007cb [libomptarget] Add HSA definitions for memory faults to dynamic_hsa
Summary:
We use the dynamic HSA file to forward declare needed definitions from
the HSA runtime if not present at build time. These definitions were not
included, so using them caused problems on systems without the HSA runtime.
Just add them.
2022-12-16 07:06:44 -06:00
Kevin Sala
a66826a233 Revert "[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior"
This reverts commit 87e6b96b00.
2022-12-16 11:53:45 +01:00
Carlo Bertolli
ac52c8f589 [OpenMP] Add missing test for pinned memory API
I accidentally left out the test for the pinned API introduced by D138933. Adding it back.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D140077
2022-12-15 21:29:15 -06:00
Kevin Sala
87e6b96b00 [OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior
This commit adds the AMDGPU NextGen plugin inheriting from PluginInterface's classes.
It also implements the asynchronous behavior in the plugin operations: kernel launches
and memory transfers. To this end, it implements the concept of streams of asynchronous
operations. The streams are implemented using the HSA signals to define input and output
dependencies between asynchronous operations.

Missing features:
  - Retrieve the maximum number of threads per group that a kernel can run. This requires
    reading the image.
  - Implement __tgt_rtl_sync_event, not used on the libomptarget side.

Differential Revision: https://reviews.llvm.org/D138389
2022-12-16 00:30:43 +01:00
Kevin Sala
39fe657b66 [OpenMP][libomptarget] Add utility header for AMDGPU plugins
This patch prepares the PluginInterface for the new AMDGPU NextGen plugin. The original and the
NextGen plugin will share some structures and functionalities. We use this header for defining
them and avoiding code duplication.

Differential Revision: https://reviews.llvm.org/D139792
2022-12-15 21:06:04 +01:00
Guilherme Valarini
89c82c8394 [OpenMP] Add non-blocking support for target nowait regions
This patch better integrates the target nowait functions with the tasking runtime. It splits the nowait execution into two stages: a dispatch stage, which triggers all the necessary asynchronous device operations and stores a set of post-processing procedures that must be executed after said ops; and a synchronization stage, responsible for synchronizing the previous operations in a non-blocking manner and running the appropriate post-processing functions. Suppose during the synchronization stage the operations are not completed. In that case, the attached hidden helper task is re-enqueued to any hidden helper thread to be later synchronized, allowing other target nowait regions to be concurrently dispatched.
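
The two stages described above can be modeled with a small single-threaded sketch. `NowaitTask`, `TaskQueue`, `dispatch`, and `synchronizeOne` are hypothetical names, and a plain deque stands in for the hidden helper task queue. The real runtime is concurrent and considerably more involved:

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model of one target nowait region.
struct NowaitTask {
  std::function<bool()> IsDone;                 // non-blocking completion query
  std::vector<std::function<void()>> PostProcs; // run after device ops finish
};

// Stand-in for the hidden helper task queue.
using TaskQueue = std::deque<NowaitTask>;

// Dispatch stage: the asynchronous device operations are assumed to have been
// triggered by the caller. This records the completion query and the
// post-processing procedures, then enqueues the task.
void dispatch(TaskQueue &Q, std::function<bool()> IsDone,
              std::vector<std::function<void()>> PostProcs) {
  Q.push_back({std::move(IsDone), std::move(PostProcs)});
}

// Synchronization stage: take one task and query it without blocking. If it
// is not complete, re-enqueue it so other regions can make progress;
// otherwise run its post-processing. Returns true if a task completed.
bool synchronizeOne(TaskQueue &Q) {
  if (Q.empty())
    return false;
  NowaitTask T = std::move(Q.front());
  Q.pop_front();
  if (!T.IsDone()) {
    Q.push_back(std::move(T));
    return false;
  }
  for (auto &Proc : T.PostProcs)
    Proc();
  return true;
}
```

Because an unfinished task goes to the back of the queue, a slow region does not prevent a later, already finished region from being post-processed.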

Reviewed By: jdoerfert, tianshilei1992

Differential Revision: https://reviews.llvm.org/D132005
2022-12-14 14:03:32 -03:00
Guilherme Valarini
63efc58c5a [NFC][OpenMP] Add missing LLVM headers on utility file
Differential Revision: https://reviews.llvm.org/D137566
2022-12-14 12:46:00 -03:00
Carlo Bertolli
d6281caa34 [OpenMP] Add API for pinned memory
This patch adds API support for the atk_pinned trait for omp_alloc.
It does not implement kmp_target_lock_mem and kmp_target_unlock_mem in libomptarget,
but prepares libomp for it. Patches to libomptarget that implement
lock/unlock will follow this one.

Reviewed by: jlpeyton, jdoerfert

Differential Revision: https://reviews.llvm.org/D138933
2022-12-14 08:50:10 -06:00
Martin Storsjö
15151315f7 [OpenMP] Add a missing dllexport for the new function __kmpc_fork_call_if
This new function was added in b72f1ec9fb,
but wasn't exported from the DLL on Windows.

This fixes the parallel/omp_parallel_if.c OpenMP testcase on
Windows.
2022-12-14 14:19:10 +02:00
Martin Storsjö
93c011eebb [OpenMP] Fix detecting warning options for GCC
If testing for a warning option like -Wno-<foo> with GCC, GCC won't
print any diagnostic at all, leading to the options being accepted
incorrectly. However later, if compiling a file that actually prints
another warning, GCC will also print warnings about these -Wno-<foo>
options being unrecognized.

This avoids warning spam like this, for every OpenMP source file that
produces build warnings with GCC:

    cc1plus: warning: unrecognized command line option ‘-Wno-int-to-void-pointer-cast’
    cc1plus: warning: unrecognized command line option ‘-Wno-return-type-c-linkage’
    cc1plus: warning: unrecognized command line option ‘-Wno-covered-switch-default’
    cc1plus: warning: unrecognized command line option ‘-Wno-enum-constexpr-conversion’

This matches how such warning options are detected and added in
llvm/cmake/modules/HandleLLVMOptions.cmake, e.g. like this:

    check_cxx_compiler_flag("-Wclass-memaccess" CXX_SUPPORTS_CLASS_MEMACCESS_FLAG)
    append_if(CXX_SUPPORTS_CLASS_MEMACCESS_FLAG "-Wno-class-memaccess" CMAKE_CXX_FLAGS)

This also matches how LLDB warning options were restructured for
GCC compatibility in e546bbfda0.

Differential Revision: https://reviews.llvm.org/D139922
2022-12-14 14:19:03 +02:00
Johannes Doerfert
90609fb68f [OpenMP][NFCI] Remove effectively dead code in clang and the runtime
Differential Revision: https://reviews.llvm.org/D136903
2022-12-13 18:44:19 -08:00
gonglingqin
b49d3e50e3 [OpenMP][Test] Make the output error message consistent with the comment
Modify the error message output of affinity/kmp-affinity.c and
affinity/omp-places.c.

Differential Revision: https://reviews.llvm.org/D139803
2022-12-14 10:07:15 +08:00
Jon Chesterfield
56ec7ce80d [openmp][amdgpu] Let fine grain and kernarg pools differ 2022-12-14 02:04:21 +00:00
gonglingqin
9a0831afa0 [OpenMP] Skip extra blank line when parsing /proc/cpuinfo on LoongArch64
This fixes the following test cases:

* affinity/kmp-affinity.c
* affinity/kmp-hw-subset.c
* affinity/omp-places.c

Differential Revision: https://reviews.llvm.org/D139802
2022-12-13 20:13:10 +08:00
Johannes Doerfert
f9c29878b0 Revert "[OpenMP][NFCI] Remove effectively dead code in clang and the runtime"
This reverts commit c1c8cbbf5f. One of the
tests seems to be flaky/non-deterministic.
2022-12-12 22:08:28 -08:00
Johannes Doerfert
c1c8cbbf5f [OpenMP][NFCI] Remove effectively dead code in clang and the runtime 2022-12-12 20:55:36 -08:00
Terry Wilmarth
bc6cc63ab8 [OpenMP] Refactoring: Move teams forking and serial region forking to separate functions.
Code for serial parallel regions and the teams construct has been moved
out of __kmp_fork_call and into separate functions.  This is to reduce
the size of the __kmp_fork_call function, and aid in debugging.

Differential Revision: https://reviews.llvm.org/D139116
2022-12-12 17:19:53 -06:00
Ye Luo
d3ebce9362 [OpenMP] add offload tests with reduction on complex data types
Differential Revision: https://reviews.llvm.org/D139856
2022-12-12 11:48:35 -06:00
Shilei Tian
3eef428948 Revert "[OpenMP] Add abort to FATAL_MESSAGE"
This reverts commit ac65b3c7a2.
2022-12-11 22:46:56 -05:00