clang-p2996

Author	SHA1	Message	Date
dhruvachak	cc8c6b037c	[OpenMP] [amdgpu] Added a synchronous version of data exchange. (#87032 ) Similar to H2D and D2H, use synchronous mode for large data transfers beyond a certain size for D2D as well. As with H2D and D2H, this size is controlled by an env-var.	2024-03-29 13:33:43 -07:00
agozillon	8612fa0d84	[MLIR][OpenMP] Refactor bounds offsetting and fix to apply to all directives (#84349 ) This PR refactors bounds offsetting by combining the two differing implementations (one applying to initial derived type member map implementation for descriptors and the other for regular arrays, effectively allocatable array vs regular array in fortran) now that it's a little simpler to do. The PR also moves the utilization of createAlteredByCaptureMap into genMapInfoOp, where it will be correctly applied to all MapInfoData, appropriately offsetting and altering Pointer data set in the kernel argument structure on the host. This primarily means bounds offsets will now correctly apply to enter/exit/update map clauses as opposed to just the Target directive that is currently the case. A few fortran runtime tests have been added to verify this new behavior. This PR depends on: https://github.com/llvm/llvm-project/pull/84328 and is an extraction of the larger derived type member map PR stack (so a requirement for it to land).	2024-03-22 15:32:39 +01:00
Ulrich Weigand	fe1341248d	[OpenMP] Disable workshare_chunk.c test case on SystemZ The test added by https://github.com/llvm/llvm-project/pull/83261 has been consistently failing. Mark as UNSUPPORTED just like on x86_64 and aarch64.	2024-03-20 11:12:26 +01:00
Joseph Huber	5cccc405e3	Revert "[OpenMP] Disable flaky barrier fence test (#85093 )" This reverts commit `cd8843f87a`. Originally disabled to try to unstick the AMD build bot, didn't make a difference after a week so it goes back in.	2024-03-19 14:46:03 -05:00
Dominik Adamski	b9a41b9e9b	[NFC][OpenMP] Add test checking clang offload chunking policy (#83261 ) Verify how clang handles `dist_schedule(static, block_chunk)` and `schedule(static, thread_chunk)` clauses for OpenMP offload loop workshare pragmas.	2024-03-19 10:29:35 -07:00
Akash Banerjee	e9da5f0083	[OpenMP] Fix target data region codegen being omitted for device pass (#85218 ) This patch enables the BodyCodeGen callback to still trigger for the TargetData nested region during the device pass. There may be Target code nested within the TargetData region for which this is required. Also add tests for the same.	2024-03-19 13:04:23 +00:00
Ulrich Weigand	c9062e8f78	Reapply [libomptarget] Build plugins-nextgen for SystemZ (#83978 ) The plugin was not getting built as the build_generic_elf64 macro assumes the LLVM triple processor name matches the CMake processor name, which is unfortunately not the case for SystemZ. Fix this by providing two separate arguments instead. Actually building the plugin exposed a number of other issues causing various test failures. Specifically, I've had to add the SystemZ target to - CompilerInvocation::ParseLangArgs - linkDevice in ClangLinuxWrapper.cpp - OMPContext::OMPContext (to set the device_kind_cpu trait) - LIBOMPTARGET_ALL_TARGETS in libomptarget/CMakeLists.txt - a check_plugin_target call in libomptarget/src/CMakeLists.txt Finally, I've had to set a number of test cases to UNSUPPORTED on s390x-ibm-linux-gnu; all these tests were already marked as UNSUPPORTED for x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu and are failing on s390x for what seem to be the same reason. In addition, this also requires support for BE ELF files in plugins-nextgen: https://github.com/llvm/llvm-project/pull/85246	2024-03-15 19:06:43 +01:00
Joseph Huber	cd8843f87a	[OpenMP] Disable flaky barrier fence test (#85093 ) Summary: This test is flaky on all targets I know of. We should disable it for now so running the test suite doesn't randomly fail 50% of the time.	2024-03-13 15:05:22 -05:00
Anchu Rajendran S	c03fd37d9b	[flang] Changes to map variables in link clause of declare target (#83643 ) As per the OpenMP standard, "If a variable appears in a link clause on a declare target directive that does not have a device_type clause with the nohost device-type-description then it is treated as if it had appeared in a map clause with a map-type of tofrom" is an implicit mapping rule. Before this change, such variables were mapped as to by default.	2024-03-07 08:23:58 -08:00
Ulrich Weigand	70677c81de	Revert "[libomptarget] Build plugins-nextgen for SystemZ (#83978 )" This reverts commit `3ecd38c8e1`.	2024-03-06 21:37:43 +01:00
Ulrich Weigand	3ecd38c8e1	[libomptarget] Build plugins-nextgen for SystemZ (#83978 ) The plugin was not getting built as the build_generic_elf64 macro assumes the LLVM triple processor name matches the CMake processor name, which is unfortunately not the case for SystemZ. Fix this by providing two separate arguments instead. Actually building the plugin exposed a number of other issues causing various test failures. Specifically, I've had to add the SystemZ target to - CompilerInvocation::ParseLangArgs - linkDevice in ClangLinuxWrapper.cpp - OMPContext::OMPContext (to set the device_kind_cpu trait) - LIBOMPTARGET_ALL_TARGETS in libomptarget/CMakeLists.txt - a check_plugin_target call in libomptarget/src/CMakeLists.txt Finally, I've had to set a number of test cases to UNSUPPORTED on s390x-ibm-linux-gnu; all these tests were already marked as UNSUPPORTED for x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu and are failing on s390x for what seem to be the same reason. In addition, this also requires support for BE ELF files in plugins-nextgen: https://github.com/llvm/llvm-project/pull/83976	2024-03-06 20:50:01 +01:00
Joseph Huber	ea174c0934	[Libomptarget] Remove global ctor and use reference counting (#80499 ) Summary: Currently we rely on global constructors to initialize and shut down the OpenMP runtime library and plugin manager. This causes some issues because we do not have a defined lifetime that we can rely on to release and allocate resources. This patch instead adds some simple reference counted initialization and deinitialization functions. A future patch will use the `deinit` interface to more intelligently handle plugin deinitialization. Right now we do nothing and rely on `atexit` inside of the plugins to tear them down. This isn't great because it limits our ability to control these things. Note that I made the `__tgt_register_lib` functions do the initialization instead of adding calls to the new runtime functions in the linker wrapper. The reason for this is because in the past it's been easier to not introduce a new function call, since sometimes the user's compiler will link against an older `libomptarget`. Maybe if we change the name with offloading in the future we can simplify this. Depends on https://github.com/llvm/llvm-project/pull/80460	2024-02-22 12:01:52 -06:00
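The reference-counted initialization described in the commit above can be illustrated with a minimal Python sketch. This is the general pattern only, not libomptarget's code: the class name `RefCountedRuntime` and the `setup`/`teardown` callbacks are hypothetical. Unlike the commit, which says deinitialization currently does nothing, this sketch tears down on the last release to show where that step would go.

```python
import threading


class RefCountedRuntime:
    """Hypothetical stand-in for a runtime with a defined lifetime."""

    def __init__(self, setup, teardown):
        self._lock = threading.Lock()
        self._count = 0
        self._setup = setup        # runs on the first init() only
        self._teardown = teardown  # runs on the last deinit() only

    def init(self):
        with self._lock:
            if self._count == 0:
                self._setup()
            self._count += 1

    def deinit(self):
        with self._lock:
            if self._count == 0:
                raise RuntimeError("deinit() without matching init()")
            self._count -= 1
            if self._count == 0:
                self._teardown()


events = []
runtime = RefCountedRuntime(
    setup=lambda: events.append("setup"),
    teardown=lambda: events.append("teardown"),
)

# Two registered libraries share one runtime instance.
runtime.init()
runtime.init()
runtime.deinit()
runtime.deinit()
print(events)
```

Because setup and teardown happen at explicit calls rather than in global constructors, their order relative to other work is under the caller's control.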
Joseph Huber	cc374d8056	[OpenMP] Remove `register_requires` global constructor (#80460 ) Summary: Currently, OpenMP handles the `omp requires` clause by emitting a global constructor into the runtime for every translation unit that requires it. However, this is not a great solution because it prevents us from having a defined order in which the runtime is accessed and used. This patch changes the approach to no longer use global constructors, but to instead group the flag with the other offloading entries that we already handle. This has the effect of still registering each flag per requires TU, but now we have a single constructor that handles everything. This patch removes support for the old `__tgt_register_requires` and replaces it with a warning message. We just had a recent release, and the OpenMP policy for the past four releases since we switched to LLVM is that we do not provide strict backwards compatibility between major LLVM releases now that the library is versioned. This means that a user will need to recompile if they have an old binary that relied on `register_requires` having the old behavior. It is important that we actively deprecate this, as otherwise it would not solve the problem of having no defined init and shutdown order for `libomptarget`. The problem of `libomptarget` not having a defined init and shutdown order cascades into a lot of other issues so I have a strong incentive to be rid of it. It is worth noting that the current `__tgt_offload_entry` only has space for a 32-bit integer here. I am planning to overhaul these at some point as well.	2024-02-21 11:33:32 -06:00
agozillon	95fe47ca7e	[Flang][OpenMP] Initial mapping of Fortran pointers and allocatables for target devices (#71766 ) This patch seeks to add an initial lowering for pointers and allocatable variables captured by implicit and explicit map in Flang OpenMP for Target operations that take map clauses e.g. Target, Target Update, Target Exit/Enter etc. Currently this is done by treating the type that lowers to a descriptor (allocatable/pointer/assumed shape) as a map of a record type (e.g. a structure) as that's effectively what descriptor types lower to in LLVM-IR and what they're represented as in the Fortran runtime (written in C/C++). The descriptor effectively lowers to a structure containing scalar and array elements that represent various aspects of the underlying data being mapped (lower bound, upper bound, extent being the main ones of interest in most cases) and a pointer to the allocated data. In this current iteration of the mapping we map the structure in its entirety and then attach the underlying data pointer and map the data to the device; this allows most of the required data to be resident on the device for use. Currently we do not support the addendum (another block of pointer data), but it shouldn't be too difficult to extend this to support it. The MapInfoOp generation for descriptor types is primarily handled in an optimization pass, where it expands BoxType (descriptor types) map captures into two maps, one for the structure (scalar elements) and the other for the pointer data (base address) and links them in a Parent <-> Child relationship. The later lowering processes will then treat them as a conjoined structure with a pointer member map.	2024-02-05 18:45:07 +01:00
Joseph Huber	2333865546	[Libomptarget] Fix data mapping on dynamic loads (#80559 ) Summary: The current logic tries to map target mapping tables to the current device. Right now it assumes that data is only mapped a single time per device. This is only true if we have a single instance of the runtime running on a single program. However, in the case of dynamic library loads or shared libraries, this may happen multiple times. Given a case of a simple dynamic library load which has its own target kernel instruction, the current logic had only the first call to `__tgt_target_kernel` do the data mapping for that device. Then, when the next dynamic library load got called, it would see that the globals were already mapped for that device and skip registering its own entries, even though they were distinct. This resulted in none of the mappings being done and hitting an assertion. This patch simply gets rid of this per-device check. The check should instead be on the host offloading entries. We already have logic that calls `continue` if we already have entries for that pointer, so we can simply rely on that instead.	2024-02-03 15:28:20 -06:00
Joseph Huber	254287658f	[Libomptarget] Remove handling of old ctor / dtor entries (#80153 ) Summary: A previous patch removed creating these entries in clang in favor of the backend emitting a callable kernel and having the runtime call that if present. The support for the old style was kept around in LLVM 18.0 but now that we have forked to 19.0 we should remove the support. The effect of this would be that an application linking against a newer libomptarget that still had the old constructors will no longer be called. In that case, they can either recompile or use the `libomptarget.so.18` that comes with the previous release.	2024-01-31 11:48:07 -06:00
Kareem Ergawy	383d488b0b	[openmp][flang][offloading] Do not use fixed device IDs in checks (#78973 ) Fixes a small issue in an offloading test where the test depended on the host and device being assigned certain numeric IDs. This however is not stable and fails in situations where any of the devices is assigned an ID different from the expected value. The fix just checks that offloading succeeded by making sure the IDs are different. The test was failing locally for me.	2024-01-24 11:52:06 +01:00
Jan Patrick Lehr	181c4c331a	[OpenMP][Fix] Require USM capability in force-usm test (#79059 ) This should fix the AMDGPU buildbot breakage from #76571	2024-01-22 15:21:31 -06:00
Jan Patrick Lehr	fa4780fa6c	[OpenMP][USM] Introduces -fopenmp-force-usm flag (#76571 ) This flag forces the compiler to generate code for OpenMP target regions as if the user specified the #pragma omp requires unified_shared_memory in each source file. The option does not have a -fno-* friend since OpenMP requires the unified_shared_memory clause to be present in all source files. Since this flag does no harm if the clause is present, it can be used in conjunction. My understanding is that USM should not be turned off selectively, hence, no -fno- version. This adds a basic test to check the correct generation of double indirect access to declare target globals in USM mode vs non-USM mode, which I think is the only difference observable in code generation. This runtime test checks for the (non-)occurrence of data movement between host and device. It does one run without the flag and one with the flag to also see that both versions behave as expected. In the case w/o the new flag data movement between host and device is expected. In the case with the flag such data movement should not be present / reported.	2024-01-22 21:59:26 +01:00
Dominik Adamski	21199f9842	[OpenMP][OMPIRBuilder] Fix LLVM IR codegen for collapsed device loop (#78708 ) When we generate the loop body function, we need to be sure, that all original loop counters are replaced by the new counter. We need to save all items which use the original loop counter and then perform replacement of the original loop counter. If we don't do it, there is a risk that some values are not updated.	2024-01-22 09:24:45 +01:00
Dominik Adamski	8930c5a4be	[NFC][OpenMP] Fix typo in CHECK line (#78586 ) Typo in test: openmp/libomptarget/test/offloading/fortran/basic-target-parallel-do.f90	2024-01-18 15:40:15 +01:00
Dominik Adamski	d87a53a960	[NFC][OpenMP][Flang] Add test for OpenMP target parallel do (#77776 ) Added test which proves that end-to-end compilation of `omp target parallel do` construct is successful for Flang compiler.	2024-01-18 15:26:39 +01:00
Joseph Huber	ab02372c23	[OpenMP] Fix or disable NVPTX tests failing currently (#77844 ) Summary: This patch is an attempt to get a clean run of `check-openmp` running on an NVPTX machine. I simply took the lists of tests that failed on my `sm_89` machine and disabled them or fixed them. A lot of these tests are disabled on AMDGPU already, so it makes sense that NVPTX fails. The others are simply problems with NVPTX optimized debugging which will need to be fixed. I opened an issue on one of them.	2024-01-11 19:17:08 -06:00
Dominik Adamski	ee431288a6	[NFC][OpenMP][Flang] Add smoke test for omp target parallel (#77579 ) Added test which proves that end-to-end compilation of omp target parallel construct is successful for Flang compiler.	2024-01-11 10:18:11 +01:00
Andrew Gozillon	8ca07e57c3	[Flang][OpenMP][Offloading][Test] Adjust slightly incorrect tests now cmake configuration works These tests were slightly broken, in one case a failing test that now works. In the other case some accidentally left over code during a name change that broke compilation due to missing symbols.	2024-01-10 16:20:33 -06:00
Kareem Ergawy	75be7bb3fc	[flang][OpenMP][Offloading][AMDGPU] Add test for `target update` (#76355 ) Adds a new test for offloading `target update` directive to AMD GPUs.	2024-01-02 09:50:27 +01:00
Gheorghe-Teodor Bercea	a01b58aef0	[OpenMP][libomptarget][Fix] Add missing array initialization (#76457 ) Add missing array initialization as the array was not initialized and the value zero was assumed.	2023-12-27 12:58:41 -05:00
Fabian Mora	12250c4092	Reland [OpenMP][Fix] libomptarget Fortran tests (#76189 ) This patch fixes the erroneous multiple-target requirement in Fortran offloading tests. Additionally, it adds two new variables (test_flags_clang, test_flags_flang) to lit.cfg so that compiler-specific flags for Clang and Flang can be specified. This patch re-lands: #74543. The error was caused by having:
```
config.substitutions.append(("%flags", config.test_flags))
config.substitutions.append(("%flags_clang", config.test_flags_clang))
config.substitutions.append(("%flags_flang", config.test_flags_flang))
```
when instead it has to be:
```
config.substitutions.append(("%flags_clang", config.test_flags_clang))
config.substitutions.append(("%flags_flang", config.test_flags_flang))
config.substitutions.append(("%flags", config.test_flags))
```
because LIT replaces with the first longest sub-string match.	2023-12-21 14:18:36 -08:00
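The ordering problem in the commit above can be reproduced with a simplified model in which substitutions are applied one after another in list order. This is a sketch for illustration, not lit's actual implementation; the helper `apply_substitutions` and the flag values are hypothetical.

```python
def apply_substitutions(command, substitutions):
    """Apply (pattern, replacement) pairs in order, as plain string replaces."""
    for pattern, replacement in substitutions:
        command = command.replace(pattern, replacement)
    return command


# Hypothetical flag values, chosen only to make the effect visible.
bad_order = [
    ("%flags", "-O2"),
    ("%flags_clang", "-fclang-only"),
    ("%flags_flang", "-fflang-only"),
]
good_order = [
    ("%flags_clang", "-fclang-only"),
    ("%flags_flang", "-fflang-only"),
    ("%flags", "-O2"),
]

cmd = "clang %flags %flags_clang test.c"
bad = apply_substitutions(cmd, bad_order)
good = apply_substitutions(cmd, good_order)
print(bad)   # "%flags" consumed the prefix of "%flags_clang"
print(good)  # longer patterns were handled before their prefix
```

In this model, a pattern that is a prefix of another has to come after the longer one, which matches the reordering made in the re-landed patch.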
Shilei Tian	7e4c6f6cb2	[OpenMP] Reduce the size of heap memory required by the test `malloc_parallel.c` (#75885 ) This patch reduces the size of heap memory required by the test `malloc_parallel.c` and `malloc.c`. The original size is too large such that `malloc` returns `nullptr` on many threads, causing illegal memory access.	2023-12-20 15:03:01 -08:00
Fabian Mora	ac82c8b925	Revert "[OpenMP][Fix] libomptarget Fortran tests" (#75953 ) Reverts llvm/llvm-project#74543	2023-12-19 12:11:08 -05:00
Gheorghe-Teodor Bercea	65909177e3	[OpenMP][libomptarget][Fix] Disable test on NVIDIA platforms (#75949 ) The test doesn't seem to work for NVIDIA so disabling it for now.	2023-12-19 11:58:10 -05:00
Fabian Mora	49efb082cc	[OpenMP][Fix] libomptarget Fortran tests (#74543 ) This patch fixes the erroneous multiple-target requirement in Fortran offloading tests. Additionally, it adds two new variables (`test_flags_clang`, `test_flags_flang`) to `lit.cfg` so that compiler-specific flags for Clang and Flang can be specified.	2023-12-19 11:35:14 -05:00
Shilei Tian	3768039913	[OpenMP] Directly use user's grid and block size in kernel language mode (#70612 ) In kernel language mode, use the user's grid and block size directly. No validity check is performed, which means that if the user's values are too large, the launch will fail, similar to what CUDA and HIP are doing right now.	2023-12-18 12:26:18 -05:00
Gheorghe-Teodor Bercea	cd1038a46a	[OpenMP][libomptarget][Fix]Require presence of libomptarget-debug for newly added test (#75807 ) Require presence of libomptarget-debug fixes https://github.com/llvm/llvm-project/pull/75642	2023-12-18 10:07:52 -05:00
Gheorghe-Teodor Bercea	4ef6587715	[Clang][OpenMP] Fix mapping of structs to device (#75642 ) Fix mapping of structs to device. The following example fails:
```
#include <stdio.h>
#include <stdlib.h>

struct Descriptor {
  int *datum;
  long int x;
  int xi;
  long int arr[1][30];
};

int main() {
  Descriptor dat = Descriptor();
  dat.datum = (int *)malloc(sizeof(int) * 10);
  dat.xi = 3;
  dat.arr[0][0] = 1;

#pragma omp target enter data map(to: dat.datum[:10]) map(to: dat)

#pragma omp target
  {
    dat.xi = 4;
    dat.datum[dat.arr[0][0]] = dat.xi;
  }

#pragma omp target exit data map(from: dat)

  return 0;
}
```
This is a rework of the previous attempt: https://github.com/llvm/llvm-project/pull/72410	2023-12-18 09:47:59 -05:00
Gheorghe-Teodor Bercea	5fc76e6b6d	[OpenMP][Fix] Fix test initialization (#74801 ) Fix test initialization	2023-12-07 22:20:32 -05:00
Gheorghe-Teodor Bercea	1216a31cae	Revert "[OpenMP][Fix] Fix test array initialization. (#74799 )" (#74800 ) This reverts commit `d413681344`.	2023-12-07 22:14:12 -05:00
Gheorghe-Teodor Bercea	d413681344	[OpenMP][Fix] Fix test array initialization. (#74799 ) Fix test array initialization.	2023-12-07 22:09:08 -05:00
jyu2-git	8e8bff3397	Fix test. (#74745 ) Just add // REQUIRES: libomptarget-debug So that test will not run with release compiler.	2023-12-07 10:45:59 -08:00
jyu2-git	0113722d82	[OpenMP] Fix runtime problem due to wrong map size. (#74692 ) Currently we do not set the upper-boundary address for FinalArraySection as the highest element in partial struct data. Currently for: #pragma omp target map(D.a) map(D.b[:2]) The size is: %a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0 %b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1 %arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0 %2 = getelementptr float, ptr %arrayidx, i32 1 %3 = ptrtoint ptr %2 to i64 %4 = ptrtoint ptr %a to i64 %5 = sub i64 %3, %4 %6 = sdiv exact i64 %5, ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64) Where %2 is wrong, since for (D.b[:2]) it is based on the pointer to the first element of the array section. It should be based on the pointer to the last element of the array section. The fix is to emit the pointer to the last element of the array section and use this pointer as the highest element in partial struct data. After change IR: %a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0 %b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1 %arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0 %b1 = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1 %arrayidx2 = getelementptr inbounds [2 x float], ptr %b1, i64 0, i64 1 %1 = getelementptr float, ptr %arrayidx2, i32 1 %2 = ptrtoint ptr %1 to i64 %3 = ptrtoint ptr %a to i64 %4 = sub i64 %2, %3 %5 = sdiv exact i64 %4, ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64)	2023-12-07 09:38:56 -08:00
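The size arithmetic in the IR above (one past the highest mapped element, minus the address of the lowest member) can be checked with a small Python sketch using `ctypes`. The layout is an assumption: the IR shows only that field 1 of `DataTy` is `[2 x float]`, so the type of `a` (taken here as `float`) and the helper `map_size` are hypothetical.

```python
import ctypes


class DataTy(ctypes.Structure):
    # Assumed layout: `a` is taken to be a float; `b` is [2 x float] as in the IR.
    _fields_ = [("a", ctypes.c_float), ("b", ctypes.c_float * 2)]


FLOAT_SIZE = ctypes.sizeof(ctypes.c_float)


def map_size(last_index):
    """Bytes from the start of `a` to one past element `last_index` of `b`."""
    one_past_highest = DataTy.b.offset + (last_index + 1) * FLOAT_SIZE
    return one_past_highest - DataTy.a.offset


# Before the fix the highest element was b[0]; after the fix it is b[1].
size_before_fix = map_size(0)
size_after_fix = map_size(1)
print(size_before_fix, size_after_fix, ctypes.sizeof(DataTy))
```

Under this assumed layout, the size computed before the fix falls one `float` short of covering `D.b[:2]`, while the size after the fix spans both `D.a` and the whole section.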
Johannes Doerfert	13b8826508	Revert " [OpenMP][NFC] Remove `DelayedBinDesc`" (#74679 ) Reverts llvm/llvm-project#74360 As I wrote in the analysis of #74360: Since `bc4e0c048a` we will not add PluginAdaptors into the container of all plugin adaptors until the plugin is ready. The error is thereby gone. When an old HSA loads other libraries they can call register_image but that will simply not register the image with the plugin we are currently initializing. That seems like reasonable behavior, though it is good to keep in mind if we ever want a kernel library (@jhuber6 @mjklemm). We can still have a standalone kernel library though or load it late after all plugins are set up (which seems reasonable). I did not expect one of our tests actually doing exactly what this will not allow anymore, at least when you use rocm <5.5.0. Need to figure out if we want this behavior (for rocm <5.5.0).	2023-12-06 16:04:23 -08:00
Johannes Doerfert	0ace6ee73a	[OpenMP][FIX] Ensure we do not read outside the device image (#74669 ) Before we expected all symbols in the device image to be backed up with data that we could read. However, uninitialized values are not. We now check for this case and avoid reading random memory. This also replaces the correct readGlobalFromImage call with a isSymbolInImage check after https://github.com/llvm/llvm-project/pull/74550 picked the wrong one. Fixes: https://github.com/llvm/llvm-project/issues/74582	2023-12-06 14:57:57 -08:00
Johannes Doerfert	dcbb1968a8	[OpenMP][FIX] Use unique library name to avoid clashes with other tests We probably should use a temporary name, but having stable names helps debugging.	2023-12-06 14:50:28 -08:00
Johannes Doerfert	d552ce2638	[OpenMP][NFC] Remove `DelayedBinDesc` (#74360 ) Remove `DelayedBinDesc` as it is not necessary since `bc4e0c048a`. See https://github.com/llvm/llvm-project/pull/74360#issuecomment-1843603736 for details.	2023-12-06 14:48:23 -08:00
JP Lehr	a65363d989	[OpenMP] Disable offloading/barrier_fence test Unblock build bot, while investigating. Issue is tracked under llvm https://github.com/llvm/llvm-project/issues/74582	2023-12-06 04:32:48 -06:00
Johannes Doerfert	20da662656	[OpenMP][FIX] Fixup test that doesn't work with lit's `env` substitute	2023-12-05 16:32:19 -08:00
Johannes Doerfert	9f87509b19	[OpenMP][FIX] Ensure we allow shared libraries without kernels (#74532 ) This fixes two bugs and adds a test for them: - A shared library with declare target functions but without kernels should not error out due to missing globals. - Enabling LIBOMPTARGET_INFO=32 should not deadlock in the presence of indirect declare targets.	2023-12-05 15:25:10 -08:00
Johannes Doerfert	e469f8474b	[OpenMP][FIX] Fixup test	2023-12-01 15:22:51 -08:00
Johannes Doerfert	7169c45efa	[OpenMP][NFCI] Organize offload entry logic This moves the offload entry logic into classes and provides convenient accessors. No functional change intended but we can now print all offload entries (and later look them up), tested via `OMPTARGET_DUMP_OFFLOAD_ENTRIES=<device_no>`.	2023-12-01 15:10:52 -08:00
Johannes Doerfert	5fe741f08e	[OpenMP] Separate Requirements into a standalone header (#74126 ) This is not completely NFC since we now check all 4 requirements and the test is checking the good and the bad case for combining flags.	2023-12-01 14:47:00 -08:00


209 Commits