Summary:
The previous code could potentially shrink the state if a device with a
lower ID touched it later. We should also minimize changes to shared
state for multithreading reasons. This patch just sets up an owned slot
for each device at initialization time.
Similar to H2D and D2H, use synchronous mode for D2D data transfers
beyond a certain size as well. As with H2D and D2H, this size is
controlled by an env-var.
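As a rough sketch of the mechanism (the env-var name and default below
are illustrative, not necessarily the real ones):

```cpp
#include <cstdint>
#include <cstdlib>

// Read the size threshold once; copies at or above it take the
// synchronous path instead of being enqueued asynchronously.
static int64_t getSyncCopyThreshold() {
  if (const char *Env = std::getenv("LIBOMPTARGET_D2D_SYNC_COPY_SIZE"))
    return std::atoll(Env);
  return 1 << 20; // Assume a 1 MiB default.
}

static bool useSynchronousD2DCopy(int64_t Size) {
  return Size >= getSyncCopyThreshold();
}
```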
Summary:
The current implementation of RPC tied everything to device IDs and
forced us to do init / shutdown to manage some global state. This turned
out to be a bad idea in situations where we want to track multiple
heterogeneous devices that may report the same device ID in the same
process.
This patch changes the interface to instead create an opaque handle to
the internal device and simply allocates it via `new`. The user will
then take this device and store it to interface with the attached
device. This interface puts the burden of mapping device identifiers to
devices onto the user, but in return it heavily simplifies the
implementation.
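A minimal sketch of the handle-based shape described here, with
hypothetical names (the real entry points may differ):

```cpp
// The device is opaque to the user; the library allocates it via `new`
// and the user stores the handle and maps it to whatever device
// identifier they track on their side.
struct RPCDevice;

RPCDevice *rpc_create_device(/*transport-specific arguments*/);
void rpc_destroy_device(RPCDevice *Device); // No global init/shutdown.
```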
Summary:
The plan is to remove the entire plugin interface and simply use the
`GenericPluginTy` inside of `libomptarget` by statically linking against
it. This means that inside of `libomptarget` we will simply do
`Plugin.data_alloc` without the dynamically loaded interface. To reduce
the amount of code required, this patch simply moves all of the RTL
implementation functions inside of the Generic device. Now the
`__tgt_rtl_` interface is simply a shallow wrapper that will soon go
away. There is some redundancy here; it will be improved later. For
now, what is important is minimizing changes to the API.
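For illustration, the intended end state would look roughly like the
following (the `data_alloc` signature and types are paraphrased
stand-ins, not the verbatim declarations):

```cpp
#include <cstdint>

// Schematic stand-ins for the real types.
enum TargetAllocTy { TARGET_ALLOC_DEFAULT = 0 };
struct GenericPluginTy {
  void *data_alloc(int32_t DeviceId, int64_t Size, void *HostPtr,
                   int32_t Kind);
};

void *allocateOnDevice(GenericPluginTy &Plugin, int32_t DeviceId,
                       int64_t Size) {
  // Direct member call; no dynamically loaded __tgt_rtl_ indirection.
  return Plugin.data_alloc(DeviceId, Size, /*HostPtr=*/nullptr,
                           TARGET_ALLOC_DEFAULT);
}
```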
Summary:
We have a plugin singleton that implements the `Plugin` interface. This
then spawns separate devices and kernels. Previously, when these needed
to reach into the global singleton they would use the `PluginTy::get`
routine to get access to it. In the future we will move away from this,
as the lifetime of the plugin will be handled by `libomptarget`
directly. This patch removes such uses inside the plugin
implementations themselves by simply keeping a reference to the plugin
inside the device.
The external `__tgt_rtl` functions still use the global method, but will
be removed later.
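A hedged sketch of the pattern (member and constructor shapes
abbreviated from the real classes):

```cpp
#include <cstdint>

struct GenericPluginTy;

struct GenericDeviceTy {
  GenericDeviceTy(GenericPluginTy &Plugin, int32_t DeviceId)
      : Plugin(Plugin), DeviceId(DeviceId) {}

  // Each use site now reads `Plugin` instead of calling `PluginTy::get()`.
  GenericPluginTy &Plugin;
  int32_t DeviceId;
};
```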
Summary:
This patch factors common functions out of the `Plugin` interface prior
to its removal in a future patch. It temporarily renames the interface
to `PluginTy` so that we can re-use `Plugin::check` internally, as this
now needs to be defined statically per plugin. We can refactor this
later.
The future patch will delete `PluginTy` and `PluginTy::get` entirely;
this one keeps the changes minimal to make it easier to land.
Summary:
It's possible for the OpenMP offloading plugins to be built before
tablegen is run, despite the fact that we rely on it. For now, simply
make them depend on it.
Use the `LINK_COMPONENTS` parameter of `add_llvm_library` rather than
passing LLVM components directly to `target_link_libraries`, in order to
ensure that the LLVM dylib is linked correctly when used. Otherwise,
CMake insists on linking to static libraries that aren't present on
distributions doing pure dylib installs, such as Gentoo.
This fixes a regression introduced
in dcbddc2525.
Summary:
This patch reworks the CMake handling for building plugins. All this
does is pull a lot of shared and common logic into a single helper
function.
This also simplifies the handling of the OMPT libraries, which are now
simply added instead of being built separately.
Commit a7d5f73a03 introduced an error in a `target_compile_definitions`
call on SystemZ, causing the build to break. Fixed by adding the missing
`PRIVATE`.
Summary:
All of these CPU targets use the same underlying implementation. We
should consolidate them into a single target to make it easier to update
this to a static library based approach. I have decided to call this the
'host' target so it can be given a single name. We still only build
these if the system processor matches and we are on Linux.
This patch updates the construction of packet headers to replace the
usage of ACQUIRE/RELEASE with SCACQUIRE/SCRELEASE, which is now
recommended.
The patch also ensures consistency across kernel dispatches.
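For reference, a minimal sketch of building a dispatch packet header
with the renamed scopes (assuming the standard `hsa.h` enums; the fence
scopes chosen here are just an example):

```cpp
#include <hsa/hsa.h>
#include <cstdint>

uint16_t buildDispatchHeader() {
  uint16_t Header = HSA_PACKET_TYPE_KERNEL_DISPATCH << HSA_PACKET_HEADER_TYPE;
  // SCACQUIRE/SCRELEASE are the now-recommended names for the old
  // ACQUIRE/RELEASE fence-scope fields.
  Header |= HSA_FENCE_SCOPE_SYSTEM << HSA_PACKET_HEADER_SCACQUIRE_FENCE_SCOPE;
  Header |= HSA_FENCE_SCOPE_SYSTEM << HSA_PACKET_HEADER_SCRELEASE_FENCE_SCOPE;
  return Header;
}
```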
A kernel implicit parameter (dyn_ptr) was introduced some time back.
This patch increments the kernel args version for a compiler supporting
dyn_ptr. The version will be used by the runtime to determine whether
the implicit parameter is generated by the compiler. The versioning is
required to support use cases where code generated by an older compiler
is linked with a newer runtime.
If approved, this patch should be backported to release 18.
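A hedged sketch of how the runtime-side version check could look (the
struct shape and the exact version number are illustrative, not taken
from the patch):

```cpp
#include <cstdint>

// Illustrative stand-in for the real kernel-arguments struct.
struct KernelArgsTy {
  uint32_t Version;
  // ...
};

// Hypothetical threshold: compilers at or above this version are
// assumed to emit the implicit dyn_ptr parameter.
constexpr uint32_t DynPtrVersion = 3;

bool hasImplicitDynPtr(const KernelArgsTy &Args) {
  return Args.Version >= DynPtrVersion;
}
```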
The plugin was not getting built as the build_generic_elf64 macro
assumes the LLVM triple processor name matches the CMake processor name,
which is unfortunately not the case for SystemZ.
Fix this by providing two separate arguments instead.
Actually building the plugin exposed a number of other issues causing
various test failures. Specifically, I've had to add the SystemZ target
to
- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)
- LIBOMPTARGET_ALL_TARGETS in libomptarget/CMakeLists.txt
- a check_plugin_target call in libomptarget/src/CMakeLists.txt
Finally, I've had to set a number of test cases to UNSUPPORTED on
s390x-ibm-linux-gnu; all these tests were already marked as UNSUPPORTED
for x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu and are failing on
s390x for what seems to be the same reason.
In addition, this also requires support for BE ELF files in
plugins-nextgen: https://github.com/llvm/llvm-project/pull/85246
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.
To fix this, I've refactored the interface to use generic types, in
particular by using (a unique_ptr to) ObjectFile instead of
ELF64LEObjectFile, and ELFSymbolRef instead of ELF64LE::Sym.
This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
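A rough sketch of the generic shape described here, using the standard
`llvm::object` types (error handling elided):

```cpp
#include "llvm/Object/ELFObjectFile.h"
using namespace llvm;
using namespace llvm::object;

// Open the image generically rather than hard-coding ELF64LE.
Expected<std::unique_ptr<ObjectFile>> openImage(MemoryBufferRef Image) {
  return ObjectFile::createELFObjectFile(Image);
}

// ELFSymbolRef works uniformly across the ELF format variants.
void visitSymbols(ObjectFile &Obj) {
  if (auto *ELFObj = dyn_cast<ELFObjectFileBase>(&Obj))
    for (ELFSymbolRef Sym : ELFObj->symbols())
      (void)Sym; // ... look up globals by name, etc.
}
```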
After 3ecd38c8e, the Handler.getELFObjectFile routine is no
longer available. Call ELF64LEObjectFile::create directly,
which should always be suitable for CUDA images.
The plugin was not getting built as the build_generic_elf64 macro
assumes the LLVM triple processor name matches the CMake processor name,
which is unfortunately not the case for SystemZ.
Fix this by providing two separate arguments instead.
Actually building the plugin exposed a number of other issues causing
various test failures. Specifically, I've had to add the SystemZ target
to
- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)
- LIBOMPTARGET_ALL_TARGETS in libomptarget/CMakeLists.txt
- a check_plugin_target call in libomptarget/src/CMakeLists.txt
Finally, I've had to set a number of test cases to UNSUPPORTED on
s390x-ibm-linux-gnu; all these tests were already marked as UNSUPPORTED
for x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu and are failing on
s390x for what seems to be the same reason.
In addition, this also requires support for BE ELF files in
plugins-nextgen: https://github.com/llvm/llvm-project/pull/83976
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.
To fix this, I've refactored the interface to push all ELF specific
types into Utils/ELF. Specifically, this patch removes both the
getSymbol and getSymbolAddress routines and replaces them with a
single findSymbolInImage, which gets a StringRef identifying the
raw object file image as input, and returns a StringRef covering
the data addressed by the symbol (address and size) if found, or
std::nullopt otherwise.
This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
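From this description, the new routine's shape is roughly the following
(paraphrased from the text above, not copied from the patch):

```cpp
#include "llvm/ADT/StringRef.h"
#include <optional>

namespace utils {
namespace elf {
// Returns a StringRef covering the symbol's data (address and size) in
// the raw object-file image, or std::nullopt if the symbol is absent.
std::optional<llvm::StringRef> findSymbolInImage(llvm::StringRef Image,
                                                 llvm::StringRef Name);
} // namespace elf
} // namespace utils
```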
Summary:
We use `add_llvm_library` as a shorthand for setting up all the
dependencies and libraries we need for the OpenMP offloading runtime as
they depend on a lot of the LLVM utilities. However, we always
explicitly installed these manually. Behind the scenes, the function
would then install them again. This went unnoticed because until now the
destinations matched. Now that we want it to optionally go into the
other directory, it is duplicated. Fix this by stating that this is a
build-tree-only library so we can handle installation ourselves.
Summary:
This directory is left over from when we handled both AMDGPU and NVPTX in
the same build and merged them into a pseudo triple. Now the only thing
it contains is the RPC server header. This gets rid of it, but now that
it's in the base install directory we should make it clear that it's an
LLVM libc header.
Summary:
This is a massive patch because it reworks the entire build and
everything that depends on it. This is not split up because various bots
would fail otherwise. I will attempt to describe the necessary changes
here.
This patch completely reworks how the GPU build is built and targeted.
Previously, we used a standard runtimes build and handled both NVPTX and
AMDGPU in a single build via multi-targeting. This added a lot of
divergence in the build system and prevented us from doing various
things like building for the CPU / GPU at the same time, or exporting
the startup libraries or running tests without a full rebuild.
The new approach is to handle the GPU builds as strict cross-compiling
runtimes. The first step required
https://github.com/llvm/llvm-project/pull/81557 to allow the `LIBC`
target to build for the GPU without touching the other targets. This
means that the GPU uses all the same handling as the other builds in
`libc`.
The new expected way to build the GPU libc is with
`LLVM_LIBC_RUNTIME_TARGETS=amdgcn-amd-amdhsa;nvptx64-nvidia-cuda`.
The second step was reworking how we generated the embedded GPU library
by moving it into the library install step. Where we previously had one
`libcgpu.a` we now have `libcgpu-amdgpu.a` and `libcgpu-nvptx.a`. This
patch includes the necessary clang / OpenMP changes to make that not
break the bots when this lands.
We unfortunately still require that the NVPTX target has an `internal`
target for tests. This is because the NVPTX target needs to do LTO for
the provided version (the offloading toolchain can handle it) but cannot
use it for the native toolchain, which is used for building tests.
This approach is vastly superior in every way, allowing us to treat the
GPU as a standard cross-compiling target. We can now install the GPU
utilities to do things like use the offload tests and other fun things.
Certain utilities need to be built with `--target=${LLVM_HOST_TRIPLE}`
as well. I think this is a fine workaround, as we will always assume
that the GPU `libc` is a cross-build with a functioning host.
Depends on https://github.com/llvm/llvm-project/pull/81557
Summary:
This function is always false in the current implementation and is not
even considered required. Just remove it and if someone needs it in the
future they can add it back in. This is done to simplify the interface
prior to other changes.
Summary:
This patch removes the bulk of the handling of the
`__tgt_offload_entries` out of the plugins itself. The reason for this
is because the plugins themselves should not be handling this
implementation detail of the OpenMP runtime. Instead, we expose two new
plugin API functions to get the device pointer associated with a global
as well as a kernel type.
This required introducing a new type to represent a binary image that
has been loaded on a device. We can then use this to load the addresses
as needed. The creation of the mapping table is then handled just in
`libomptarget` where we simply look up each address individually. This
should allow us to expose these operations more generically when we
provide a separate API.
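Illustratively, the lookup loop in `libomptarget` could then look
something like the following (the types and plugin functions here are
hypothetical stand-ins, not the actual API):

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in for the real __tgt_offload_entry.
struct OffloadEntry { void *addr; char *name; size_t size; };
struct BinaryHandle;
struct PluginAPI {
  void *get_global(BinaryHandle *, uint64_t Size, const char *Name);
  void *get_function(BinaryHandle *, const char *Name);
};

void buildMappingTable(PluginAPI &Plugin, BinaryHandle *Binary,
                       std::vector<OffloadEntry> &Entries,
                       std::vector<std::pair<void *, void *>> &Table) {
  for (OffloadEntry &Entry : Entries) {
    // Data entries resolve as globals, zero-sized entries as kernels.
    void *DevicePtr = Entry.size
                          ? Plugin.get_global(Binary, Entry.size, Entry.name)
                          : Plugin.get_function(Binary, Entry.name);
    Table.emplace_back(Entry.addr, DevicePtr);
  }
}
```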
This patch enables applications that did not request OpenMP
unified_shared_memory to run with the same zero-copy behavior, where
mapped memory does not result in extra memory allocations and memory
copies, but CPU-allocated memory is accessed from the device. The name
for this behavior is "automatic zero-copy" and it relies on detecting:
that the runtime is running on an MI300A, that the user did not select
unified_shared_memory in their program, and that XNACK (unified memory
support) is enabled in the current GPU configuration. If all these
conditions are met, then automatic zero-copy is triggered.
This patch also introduces an environment variable, OMPX_APU_MAPS,
that, if set, triggers automatic zero-copy on non-APU GPUs too (e.g.,
on discrete GPUs).
This patch is still missing support for global variables, which will be
provided in a subsequent patch.
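The triggering logic, as described above, amounts to something like the
following sketch (the helper names are hypothetical; `OMPX_APU_MAPS` is
the real env-var from this patch):

```cpp
#include <cstdlib>

// Hypothetical helpers standing in for the real runtime queries.
bool isMI300A();
bool isXnackEnabled();
bool requestedUnifiedSharedMemory();

bool useAutomaticZeroCopy() {
  // OMPX_APU_MAPS extends the behavior to non-APU (e.g. discrete) GPUs.
  bool Eligible = isMI300A() || std::getenv("OMPX_APU_MAPS") != nullptr;
  return Eligible && !requestedUnifiedSharedMemory() && isXnackEnabled();
}
```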
Co-authored-by: Thorsten Blass <thorsten.blass@amd.com>
Summary:
This patch is an attempt to make the `find_package(FFI)` support in LLVM
prefer the static library version if present. The static library is
currently optional when building `libffi`, and its presence implies that
it should likely be used. This patch attempts to fix some problems
observed when testing programs linked against `libffi` on many different
systems that could have conflicting paths; linking it statically
prevents this.
This patch adds the `ffi_static` target for this library.
Summary:
This patch cleans up some of the JIT handling for AMDGPU as well as
removing its temporary files. Previously these would be left in the
temporary directory after the program was run. Cleaning them up costs
some extra time; the correct solution to avoid that is to create a
sufficient entry point into `ld.lld` that we can simply pass a memory
buffer into.
Summary:
The constructors and destructors quickly look up a symbol in the ELF to
determine if they need to be run on the GPU. This allows us to avoid the
much slower lookup through the vendor API.
One problem occurs with how we handle the lifetime of these images.
Right now there is no invariant to specify the lifetime of the
underlying binary image that is loaded. In the typical case, this comes
from the binary itself in the `.llvm.offloading` section, meaning that
the lifetime of the binary should match the executable itself. This
would work fine, if it weren't for the fact that the plugin is loaded
via `dlopen` and can have a teardown order out of sync with the main
executable.
This was likely what was occurring when this failed on some systems but
not others. A potential solution would be to simply copy images into
memory so the runtime does not rely on external references. Another
would be to manually zero these out after initialization as to prevent
this mistake from happening accidentally. The former has the benefit of
making some checks easier and allowing constant initialization to be
done on the ELF itself (normally we can't do this because writing to a
constant section, e.g. `.llvm.offloading`, is a segfault). The downside
would be the extra time required to copy the image in bulk (although we
are likely doing this in the vendor runtimes as well).
This patch went with a quick solution to simply set a boolean value at
initialization time if we need to call destructors.
Fixes: https://github.com/llvm/llvm-project/issues/77798
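A minimal sketch of that quick fix (the flag name is hypothetical):

```cpp
// Checked once while the image is loaded and the ELF is known-valid,
// instead of re-reading the image at teardown time.
struct GenericDeviceImageTy {
  bool HasDestructors = false; // Set at initialization if dtors exist.
};
```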
Summary:
Recently a patch added an assertion in the GlobalHandler to indicate
when an ELF was not used. This began to fire whenever NVPTX JIT was
used, because the JIT pass outputs a PTX file instead of an ELF. The
`cuModuleLoad` method consumes `.s` internally and compiles it to a
cubin,
however, this is too late as we perform several checks on the ELF
directly for the presence of certain symbols and to read some necessary
constants. This results in inconsistent behaviour.
To address this, this patch simply calls `ptxas` manually, similar to
how `lld` is called for the AMDGPU JIT pass. This is inevitably going to
be slower than simply passing it to the CUDA routine due to the overhead
involved in file IO and a fork call, but it's necessary for correctness.
CUDA provides an API for compiling PTX manually. However, this only
started showing up in CUDA 11.1 and is only provided "officially" in a
static library. The `libnvidia-ptxjitcompiler.so` next to the CUDA
driver has the same symbols and can likely be used as a replacement.
This would be the faster solution. However, given that it's not
documented it may have some issues.
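Schematically, the manual `ptxas` invocation taken here can be done with
LLVM's process utilities, similar to the AMDGPU `lld` path (the flags
and paths below are illustrative):

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Program.h"

// Fork-and-exec ptxas to turn the JIT-produced PTX into a cubin ELF.
int compilePTX(llvm::StringRef PtxasPath, llvm::StringRef PTXFile,
               llvm::StringRef CubinFile) {
  llvm::SmallVector<llvm::StringRef> Args = {
      PtxasPath, "-m64", "-O3", PTXFile, "-o", CubinFile};
  return llvm::sys::ExecuteAndWait(PtxasPath, Args);
}
```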
Summary:
The current code logic is supposed to skip plugins that aren't found or
could not be loaded. However, the plugin contained a call to `abort` if
it failed, which prevented us from continuing if initialization of the
plugin failed (such as if `dlopen` failed for the dynamic plugins).
Summary:
The LLVM style uses /*Foo=*/ when indicating the name of a constant. See
https://llvm.org/docs/CodingStandards.html#comment-formatting. This is
useful for consistency, as well as because `clang-format` understands
this syntax and formats it more cleanly. Do a bulk update of this
syntax.
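For example (with a made-up function):

```cpp
void initDevice(int DeviceId, bool Verbose);

// Before: initDevice(0, true); -- the literals' meaning is opaque.
// After: the names are visible, and clang-format keeps the comments
// attached to their arguments.
void example() { initDevice(/*DeviceId=*/0, /*Verbose=*/true); }
```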
Summary:
The previous behaviour, before I made it dynamically open libFFI, was
that these tests would be ignored if FFI was not found. Now tests can be
run without the dependency, and thus they fail on some buildbots. This
simply makes it not build the tests if FFI is not present.
Summary:
The CPU targets currently rely on `libffi` to invoke the "kernel"
functions. Previously we would not build these if this dependency was
not found. This patch copies the approach used for things like CUDA and
HSA to dynamically load this if it is not found.
The one sketchy thing this does is hard-code the default ABI for the
target. These are normally defined on a per-file basis in the FFI
source, so I had to fish out the expected values. We only use two types,
so ideally we will always be able to use the default ABI.
It's possible we could remove this dependency entirely in the future as
well.
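A hedged sketch of the dlopen-based approach (the library name is a
common default and error handling is simplified; `ffi.h` is used here
only for the prototype types):

```cpp
#include <dlfcn.h>
#include <ffi.h>

// Resolve the one entry point we need at runtime rather than linking
// against libffi directly; returns nullptr if the library is missing.
static decltype(&ffi_call) getFFICall() {
  void *Handle = dlopen("libffi.so", RTLD_NOW | RTLD_GLOBAL);
  if (!Handle)
    return nullptr;
  return reinterpret_cast<decltype(&ffi_call)>(dlsym(Handle, "ffi_call"));
}
```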