clang-p2996

Author	SHA1	Message	Date
Joseph Huber	2d588461bc	[Libomptarget] Add more moves to expected conversion Summary: Fixes other instances of the same problem in the previous patch.	2023-01-06 09:09:45 -06:00
Joseph Huber	75c03596b8	[Libomptarget] Add move to expected conversion Summary: These implicit conversions from move-only types to expected seem to only work with newer compilers. This should hopefully fix it.	2023-01-06 09:09:45 -06:00
Johannes Doerfert	ccc1324120	Introduce environment variables to deal with JIT IR We can now dump the IR before and after JIT optimizations into the files passed via `LIBOMPTARGET_JIT_PRE_OPT_IR_MODULE` and `LIBOMPTARGET_JIT_POST_OPT_IR_MODULE`, respectively. Similarly, users can set `LIBOMPTARGET_JIT_REPLACEMENT_MODULE` to replace the IR in the image with a custom IR module in a file. All options take file paths, documentation was added. Reviewed by: tianshilei1992 Differential revision: https://reviews.llvm.org/D140945	2023-01-05 00:17:46 -08:00
Johannes Doerfert	c63dced93b	[OpenMP][JIT] Introduce support for AMDGPU To JIT kernels for AMDGPUs we need to provide the architecture, the triple, and a post-link callback. The first two are simple, the last one is a little more complicated since we need to invoke `lld`. There is some library interface but for that we need the lld library, which is not generally available, thus we go with the executable for now. In either way we need to manifest the (amdgcn) object file and read the output from another file. We should try to avoid that in the future. The options for `lld` are copied from the way clang invokes it. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D140720	2023-01-04 10:14:27 -08:00
Johannes Doerfert	93e75714cd	[OpenMP][AMDGPU][NFC] Improve error message for errors	2023-01-03 17:09:32 -08:00
Johannes Doerfert	5524952c14	[OpenMP][JIT][FIX] Create the default O0 pipeline for -O0	2023-01-03 17:07:52 -08:00
Johannes Doerfert	428bc510bf	[OpenMP] Unify "exec_mode" query code and default to SPMD Defaulting to Generic mode doesn't make much sense as the kernel needs to be prepared for it. SPMD mode is the "native" execution, e.g., for "bare" kernels. It also is the execution method for constructors and destructors (as we might otherwise throw an extra warp onto them). Differential Revision: https://reviews.llvm.org/D140718	2023-01-03 16:58:13 -08:00
JP Lehr	263962545d	[OpenMP] Solve potential VERSION script error w/ OMPT symbols The patch adds the symbols if OMPT_SUPPORT is not defined. Github issue: https://github.com/llvm/llvm-project/issues/59660 Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D140591	2023-01-03 16:47:12 -05:00
Doru Bercea	86dc7de8ff	Fix initializer name.	2023-01-03 12:45:28 -06:00
Ron Lieberman	750e1c8dbd	Revert "[libomptarget][plugin-nextgen] fix for [TypePromotion] NewPM support." This reverts commit `135f6a1ee8`.	2023-01-03 12:26:39 -06:00
Ron Lieberman	135f6a1ee8	[libomptarget][plugin-nextgen] fix for [TypePromotion] NewPM support.	2023-01-03 11:04:13 -06:00
Kevin Sala	339d810a0f	[OpenMP][libomptarget] Add TargetParser as dependency in NextGen's JIT This patch fixes an undefined reference to llvm::Triple::Triple(llvm::Twine const&). Differential Revision: https://reviews.llvm.org/D140810	2023-01-01 13:29:30 +01:00
Shilei Tian	75019f18bd	[OpenMP][JIT] Fixed a couple of issues in the initial implementation of JIT This patch fixes a couple of issues: 1. Instead of using `llvm_unreachable` in the base virtual functions, an unknown value is now returned. The previous approach could cause a runtime error on targets where the image is not compatible but JIT is not implemented. 2. Fixed the typo in CMake that left the `Target` CMake variable undefined. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D140732	2022-12-28 14:40:59 -05:00
Shilei Tian	5a3a527f8a	[OpenMP] Introduce basic JIT support to OpenMP target offloading This patch adds the basic JIT support for OpenMP. Currently it only works on Nvidia GPUs. The support for AMDGPU can be extended easily by just implementing three interface functions. However, the infrastructure requires a small extra extension (adding a pre-process hook) to support portability for AMDGPU because the AMDGPU backend reads target features of functions. `02bc7effcc (diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432)` shows how it roughly works. As for the test, even though I added the corresponding code in CMake files, the test still cannot be triggered because some code is missing in the new plugin CMake file, which has nothing to do with this patch. It will be fixed later. To enable JIT mode, `-foffload-lto` is needed when compiling, and `-foffload-lto -Wl,--embed-bitcode` is needed when linking. This means that LTO is required to enable JIT mode. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D139287	2022-12-27 22:19:05 -05:00
Shilei Tian	95956bd896	Revert "[OpenMP] Introduce basic JIT support to OpenMP target offloading" This reverts commit `58906e4901` because it breaks AMD's buildbot.	2022-12-27 21:52:07 -05:00
Shilei Tian	58906e4901	[OpenMP] Introduce basic JIT support to OpenMP target offloading This patch adds the basic JIT support for OpenMP. Currently it only works on Nvidia GPUs. The support for AMDGPU can be extended easily by just implementing three interface functions. However, the infrastructure requires a small extra extension (adding a pre-process hook) to support portability for AMDGPU because the AMDGPU backend reads target features of functions. `02bc7effcc (diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432)` shows how it roughly works. As for the test, even though I added the corresponding code in CMake files, the test still cannot be triggered because some code is missing in the new plugin CMake file, which has nothing to do with this patch. It will be fixed later. To enable JIT mode, `-foffload-lto` is needed when compiling, and `-foffload-lto -Wl,--embed-bitcode` is needed when linking. This means that LTO is required to enable JIT mode. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D139287	2022-12-27 19:07:32 -05:00
Shilei Tian	4007963756	[RFC][OpenMP] Update to Python3 for lit test I think it's reasonable to upgrade to Python 3 for LIT test requirement because `lit` itself (`llvm/utils/lit/lit.py`) already switched to Python 3. In addition, LLVM already requires Python 3.6 to be the minimum version (https://llvm.org/docs/GettingStarted.html#software). Reviewed By: jdoerfert, jhuber6 Differential Revision: https://reviews.llvm.org/D139855	2022-12-26 21:39:51 -05:00
Shilei Tian	a82e5825e0	[NFC][OpenMP] Fix compile warning caused by using `std::move` on a local object on a `return` statement	2022-12-23 10:42:29 -05:00
Vignesh Balasubramanian	ae1507d3ea	[OpenMP] [OMPD] Enable OMPD Tests These were disabled due to different failures on different LLVM bots. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D138411	2022-12-23 11:47:21 +05:30
Kevin Sala	e5354a2bfa	[OpenMP][libomptarget] Centralize host pinned buffers map to NextGen's PluginInterface This patch moves the management/tracking of host pinned buffers to the common PluginInterface in NextGen plugins. For the moment, the management consists of tracking the host pinned allocations into a map in each device. Differential Revision: https://reviews.llvm.org/D140502	2022-12-22 02:11:05 +01:00
Kevin Sala	a487e0ffde	[NFC][OpenMP][libomptarget] Return null if error detected during allocation in NextGen AMDGPU	2022-12-22 01:46:33 +01:00
Guilherme Valarini	4e32d5cedf	[OpenMP] Disable libomptarget integration on unsupported platforms Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D140419	2022-12-20 16:41:43 -03:00
Joseph Huber	3d2aa5473c	[OpenMP][NFC] Fix message to recommend C++17 instead of C++14 Summary: This was changed in LLVM 16.0.	2022-12-20 13:39:44 -06:00
Sunil Kuravinakop	e9babe7571	[OpenMP] Clang Support for taskwait nowait clause Support for taskwait nowait clause with placeholder for runtime changes. Reviewed By: cchen, ABataev Differential Revision: https://reviews.llvm.org/D131830	2022-12-20 12:13:56 -06:00
Johannes Doerfert	e3d9a448c5	[OpenMP] Account for dynamic shared memory in the AMDGPU nextgen plugin	2022-12-19 19:09:44 -08:00
Johannes Doerfert	fb2c42df41	[OpenMP] Improve AMDGPU Plugin With this patch we: - pick more sensible defaults for the number of teams, inspired by the old plugin, and configured via LIBOMPTARGET_AMDGPU_TEAMS_PER_CU. - check the input signal of a kernel launch late, after the queue lock was taken, to avoid a barrier packet more often. - copy the kernel arguments in one swoop into the appropriate memory. - manually specialize the callbacks to avoid potential indirect calls.	2022-12-19 19:09:43 -08:00
Ye Luo	ee3d9ee49c	[OpenMP] Change the nextgen plugin kernel thread count scheme as old plugins' Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D140352	2022-12-19 18:27:02 -06:00
Johannes Doerfert	77197b5651	[OpenMP] Export `ompx::` symbols from the device runtime Differential Revision: https://reviews.llvm.org/D140335	2022-12-19 14:46:54 -08:00
Johannes Doerfert	2b5a99b3d9	[OpenMP] Rename the `_OMP` namespace in the device runtime to `ompx` Differential Revision: https://reviews.llvm.org/D140334	2022-12-19 14:43:59 -08:00
Kevin Sala	6bbf9c0cca	[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior This commit adds the AMDGPU NextGen plugin inheriting from PluginInterface's classes. It also implements the asynchronous behavior in the plugin operations: kernel launches and memory transfers. To this end, it implements the concept of streams of asynchronous operations. The streams are implemented using the HSA signals to define input and output dependencies between asynchronous operations. Missing features: - Retrieve the maximum number of threads per group that a kernel can run. This requires reading the image. - Implement __tgt_rtl_sync_event, not used on the libomptarget side. Differential Revision: https://reviews.llvm.org/D138389	2022-12-17 00:01:24 +01:00
Kevin Sala	7b97941721	[OpenMP][libomptarget] Add missing symbols in dynamic_hsa This patch prepares for the new AMDGPU NextGen plugin. Differential Revision: https://reviews.llvm.org/D140213	2022-12-17 00:01:24 +01:00
Joseph Huber	d8b0f007cb	[libomptarget] Add HSA definitions for memory faults to dynamic_hsa Summary: We use the dynamic HSA file to forward declare needed definitions from the HSA runtime if not present at build time. These definitions were not included so using them caused problems on systems without it if used. Just add them.	2022-12-16 07:06:44 -06:00
Kevin Sala	a66826a233	Revert "[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior" This reverts commit `87e6b96b00`.	2022-12-16 11:53:45 +01:00
Carlo Bertolli	ac52c8f589	[OpenMP] Add missing test for pinned memory API I accidentally left out the test for the pinned API introduced by D138933. Adding it back. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D140077	2022-12-15 21:29:15 -06:00
Kevin Sala	87e6b96b00	[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior This commit adds the AMDGPU NextGen plugin inheriting from PluginInterface's classes. It also implements the asynchronous behavior in the plugin operations: kernel launches and memory transfers. To this end, it implements the concept of streams of asynchronous operations. The streams are implemented using the HSA signals to define input and output dependencies between asynchronous operations. Missing features: - Retrieve the maximum number of threads per group that a kernel can run. This requires reading the image. - Implement __tgt_rtl_sync_event, not used on the libomptarget side. Differential Revision: https://reviews.llvm.org/D138389	2022-12-16 00:30:43 +01:00
Kevin Sala	39fe657b66	[OpenMP][libomptarget] Add utility header for AMDGPU plugins This patch prepares the PluginInterface for the new AMDGPU NextGen plugin. The original and the NextGen plugin will share some structures and functionalities. We use this header for defining them and avoiding code duplication. Differential Revision: https://reviews.llvm.org/D139792	2022-12-15 21:06:04 +01:00
Guilherme Valarini	89c82c8394	[OpenMP] Add non-blocking support for target nowait regions This patch better integrates the target nowait functions with the tasking runtime. It splits the nowait execution into two stages: a dispatch stage, which triggers all the necessary asynchronous device operations and stores a set of post-processing procedures that must be executed after said ops; and a synchronization stage, responsible for synchronizing the previous operations in a non-blocking manner and running the appropriate post-processing functions. Suppose during the synchronization stage the operations are not completed. In that case, the attached hidden helper task is re-enqueued to any hidden helper thread to be later synchronized, allowing other target nowait regions to be concurrently dispatched. Reviewed By: jdoerfert, tianshilei1992 Differential Revision: https://reviews.llvm.org/D132005	2022-12-14 14:03:32 -03:00
Guilherme Valarini	63efc58c5a	[NFC][OpenMP] Add missing LLVM headers on utility file Differential Revision: https://reviews.llvm.org/D137566	2022-12-14 12:46:00 -03:00
Carlo Bertolli	d6281caa34	[OpenMP] Add API for pinned memory This patch adds API support for the atk_pinned trait for omp_alloc. It does not implement kmp_target_lock_mem and kmp_target_unlock_mem in libomptarget, but prepares libomp for it. Patches to libomptarget to implement lock/unlock coming after this one. Reviewed by: jlpeyton, jdoerfert Differential Revision: https://reviews.llvm.org/D138933	2022-12-14 08:50:10 -06:00
Martin Storsjö	15151315f7	[OpenMP] Add a missing dllexport for the new function __kmpc_fork_call_if This new function was added in `b72f1ec9fb`, but wasn't exported from the DLL on Windows. This fixes the parallel/omp_parallel_if.c OpenMP testcase on Windows.	2022-12-14 14:19:10 +02:00
Martin Storsjö	93c011eebb	[OpenMP] Fix detecting warning options for GCC If testing for a warning option like -Wno-<foo> with GCC, GCC won't print any diagnostic at all, leading to the options being accepted incorrectly. However later, if compiling a file that actually prints another warning, GCC will also print warnings about these -Wno-<foo> options being unrecognized. This avoids warning spam like this, for every OpenMP source file that produces build warnings with GCC: cc1plus: warning: unrecognized command line option ‘-Wno-int-to-void-pointer-cast’ cc1plus: warning: unrecognized command line option ‘-Wno-return-type-c-linkage’ cc1plus: warning: unrecognized command line option ‘-Wno-covered-switch-default’ cc1plus: warning: unrecognized command line option ‘-Wno-enum-constexpr-conversion’ This matches how such warning options are detected and added in llvm/cmake/modules/HandleLLVMOptions.cmake, e.g. like this: check_cxx_compiler_flag("-Wclass-memaccess" CXX_SUPPORTS_CLASS_MEMACCESS_FLAG) append_if(CXX_SUPPORTS_CLASS_MEMACCESS_FLAG "-Wno-class-memaccess" CMAKE_CXX_FLAGS) This also matches how LLDB warning options were restructured for GCC compatibility in `e546bbfda0`. Differential Revision: https://reviews.llvm.org/D139922	2022-12-14 14:19:03 +02:00
Johannes Doerfert	90609fb68f	[OpenMP][NFCI] Remove effectively dead code in clang and the runtime Differential Revision: https://reviews.llvm.org/D136903	2022-12-13 18:44:19 -08:00
gonglingqin	b49d3e50e3	[OpenMP][Test] Make the output error message consistent with the comment Modify the error message output of affinity/kmp-affinity.c and affinity/omp-places.c. Differential Revision: https://reviews.llvm.org/D139803	2022-12-14 10:07:15 +08:00
Jon Chesterfield	56ec7ce80d	[openmp][amdgpu] Let fine grain and kernarg pools differ	2022-12-14 02:04:21 +00:00
gonglingqin	9a0831afa0	[OpenMP] Skip extra blank line when parsing /proc/cpuinfo on LoongArch64 This fixes the following test cases: * affinity/kmp-affinity.c * affinity/kmp-hw-subset.c * affinity/omp-places.c Differential Revision: https://reviews.llvm.org/D139802	2022-12-13 20:13:10 +08:00
Johannes Doerfert	f9c29878b0	Revert "[OpenMP][NFCI] Remove effectively dead code in clang and the runtime" This reverts commit `c1c8cbbf5f`. One of the tests seems to be flaky/non-deterministic.	2022-12-12 22:08:28 -08:00
Johannes Doerfert	c1c8cbbf5f	[OpenMP][NFCI] Remove effectively dead code in clang and the runtime	2022-12-12 20:55:36 -08:00
Terry Wilmarth	bc6cc63ab8	[OpenMP] Refactoring: Move teams forking and serial region forking to separate functions. Code for serial parallel regions and teams construct have been moved out of __kmp_fork_call and into separate functions. This is to reduce the size of the __kmp_fork_call function, and aid in debugging. Differential Revision: https://reviews.llvm.org/D139116	2022-12-12 17:19:53 -06:00
Ye Luo	d3ebce9362	[OpenMP] add offload tests with reduction on complex data types Differential Revision: https://reviews.llvm.org/D139856	2022-12-12 11:48:35 -06:00
Shilei Tian	3eef428948	Revert "[OpenMP] Add `abort` to `FATAL_MESSAGE`" This reverts commit `ac65b3c7a2`.	2022-12-11 22:46:56 -05:00

2566 Commits