Commit Graph

182 Commits

Author SHA1 Message Date
Fabian Mora
12250c4092 Reland [OpenMP][Fix] libomptarget Fortran tests (#76189)
This patch fixes the erroneous multiple-target requirement in Fortran
offloading tests. Additionally, it adds two new variables
(test_flags_clang, test_flags_flang) to lit.cfg so that
compiler-specific flags for Clang and Flang can be specified.

This patch re-lands: #74543. The error was caused by having:
```
config.substitutions.append(("%flags", config.test_flags))
config.substitutions.append(("%flags_clang", config.test_flags_clang))
config.substitutions.append(("%flags_flang", config.test_flags_flang))
```
when instead it has to be:
```
config.substitutions.append(("%flags_clang", config.test_flags_clang))
config.substitutions.append(("%flags_flang", config.test_flags_flang))
config.substitutions.append(("%flags", config.test_flags))
```
because LIT replaces with the first longest sub-string match.
2023-12-21 14:18:36 -08:00
Shilei Tian
7e4c6f6cb2 [OpenMP] Reduce the size of heap memory required by the test malloc_parallel.c (#75885)
This patch reduces the size of heap memory required by the test
`malloc_parallel.c` and `malloc.c`. The original size is too large such
that `malloc` returns `nullptr` on many threads, causing illegal
memory access.
2023-12-20 15:03:01 -08:00
Fabian Mora
ac82c8b925 Revert "[OpenMP][Fix] libomptarget Fortran tests" (#75953)
Reverts llvm/llvm-project#74543
2023-12-19 12:11:08 -05:00
Gheorghe-Teodor Bercea
65909177e3 [OpenMP][libomptarget][Fix] Disable test on NVIDIA platforms (#75949)
The tests doesn't seem to work for NVIDIA so disabling it for now.
2023-12-19 11:58:10 -05:00
Fabian Mora
49efb082cc [OpenMP][Fix] libomptarget Fortran tests (#74543)
This patch fixes the erroneous multiple-target requirement in Fortran
offloading tests. Additionally, it adds two new variables
(`test_flags_clang`, `test_flags_flang`) to `lit.cfg` so that
compiler-specific flags for Clang and Flang can be specified.
2023-12-19 11:35:14 -05:00
Shilei Tian
3768039913 [OpenMP] Directly use user's grid and block size in kernel language mode (#70612)
In kernel language mode, use user's grid and blocks size directly. No
validity
check, which means if user's values are too large, the launch will fail,
similar
to what CUDA and HIP are doing right now.
2023-12-18 12:26:18 -05:00
Gheorghe-Teodor Bercea
cd1038a46a [OpenMP][libomptarget][Fix]Require presence of libomptarget-debug for newly added test (#75807)
Require presence of libomptarget-debug fixes https://github.com/llvm/llvm-project/pull/75642
2023-12-18 10:07:52 -05:00
Gheorghe-Teodor Bercea
4ef6587715 [Clang][OpenMP] Fix mapping of structs to device (#75642)
Fix mapping of structs to device.

The following example fails:

```
#include <stdio.h>
#include <stdlib.h>

struct Descriptor {
  int *datum;
  long int x;
  int xi;
  long int arr[1][30];
};

int main() {
  Descriptor dat = Descriptor();
  dat.datum = (int *)malloc(sizeof(int)*10);
  dat.xi = 3;
  dat.arr[0][0] = 1;

  #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat)

  #pragma omp target
  {
    dat.xi = 4;
    dat.datum[dat.arr[0][0]] = dat.xi;
  }

  #pragma omp target exit data map(from: dat)

 return 0;
}
```

This is a rework of the previous attempt:
https://github.com/llvm/llvm-project/pull/72410
2023-12-18 09:47:59 -05:00
Gheorghe-Teodor Bercea
5fc76e6b6d [OpenMP][Fix] Fix test initialization (#74801)
Fix test initialization
2023-12-07 22:20:32 -05:00
Gheorghe-Teodor Bercea
1216a31cae Revert "[OpenMP][Fix] Fix test array initialization. (#74799)" (#74800)
This reverts commit d413681344.
2023-12-07 22:14:12 -05:00
Gheorghe-Teodor Bercea
d413681344 [OpenMP][Fix] Fix test array initialization. (#74799)
Fix test array initialization.
2023-12-07 22:09:08 -05:00
jyu2-git
8e8bff3397 Fix test. (#74745)
Just add
// REQUIRES: libomptarget-debug

So that test will not run with release compiler.
2023-12-07 10:45:59 -08:00
jyu2-git
0113722d82 [OpenMP] Fix runtime problem due to wrong map size. (#74692)
Currently we are missing set up-boundary address for FinalArraySection
as highests elements in partial struct data.

Currently for:
\#pragma omp target map(D.a) map(D.b[:2])
The size is:
  %a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0
  %b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
  %arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0
  %2 = getelementptr float, ptr %arrayidx, i32 1
  %3 = ptrtoint ptr %2 to i64
  %4 = ptrtoint ptr %a to i64
  %5 = sub i64 %3, %4
%6 = sdiv exact i64 %5, ptrtoint (ptr getelementptr (i8, ptr null, i32
1) to i64)

Where %2 is wrong for (D.b[:2]) is pointer to first element of array
section. It should pointe to last element of array section.
  
The fix is to emit the pointer to the last element of array section and
use this pointer as the highest element in partial struct data.

After change IR:
  %a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0
  %b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
  %arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0
  %b1 = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
  %arrayidx2 = getelementptr inbounds [2 x float], ptr %b1, i64 0, i64 1
  %1 = getelementptr float, ptr %arrayidx2, i32 1
  %2 = ptrtoint ptr %1 to i64
  %3 = ptrtoint ptr %a to i64
  %4 = sub i64 %2, %3
%5 = sdiv exact i64 %4, ptrtoint (ptr getelementptr (i8, ptr null, i32
1) to i64)
2023-12-07 09:38:56 -08:00
Johannes Doerfert
13b8826508 Revert " [OpenMP][NFC] Remove DelayedBinDesc" (#74679)
Reverts llvm/llvm-project#74360

As I wrote in the analysis of #74360:

Since
bc4e0c048a
we will not add PluginAdaptors into the container of all plugin adaptors
before the plugin is not ready. The error is thereby gone. When and old
HSA loads other libraries they can call register_image but that will
simply not register the image with the plugin we are currently
initializing. That seems like reasonable behavior, thought it is good to
keep in mind if we ever want a kernel library (@jhuber6 @mjklemm). We
can still have a standalone kernel library though or load it late after
all plugins are setup (which seems reasonable).

I did not expect one our tests actually doing exactly what this will not
allow anymore, at least when you use rocm <5.5.0. Need to figure out if
we want this behavior (for rocm <5.5.0).
2023-12-06 16:04:23 -08:00
Johannes Doerfert
0ace6ee73a [OpenMP][FIX] Ensure we do not read outside the device image (#74669)
Before we expected all symbols in the device image to be backed up with
data that we could read. However, uninitialized values are not. We now
check for this case and avoid reading random memory.

This also replaces the correct readGlobalFromImage call with a
isSymbolInImage check after
https://github.com/llvm/llvm-project/pull/74550 picked the wrong one.

Fixes: https://github.com/llvm/llvm-project/issues/74582
2023-12-06 14:57:57 -08:00
Johannes Doerfert
dcbb1968a8 [OpenMP][FIX] Use unique library name to avoid clashes with other tests
We probably should use a temporary name, but having stable names helps
debugging.
2023-12-06 14:50:28 -08:00
Johannes Doerfert
d552ce2638 [OpenMP][NFC] Remove DelayedBinDesc (#74360)
Remove `DelayedBinDesc` as it is not necessary since
bc4e0c048a.
See
https://github.com/llvm/llvm-project/pull/74360#issuecomment-1843603736
for details.
2023-12-06 14:48:23 -08:00
JP Lehr
a65363d989 [OpenMP] Disable offloading/barrier_fence test
Unblock build bot, while investigating. Issue is tracked under llvm
https://github.com/llvm/llvm-project/issues/74582
2023-12-06 04:32:48 -06:00
Johannes Doerfert
20da662656 [OpenMP][FIX] Fixup test that doesn't work with lit's env substitute 2023-12-05 16:32:19 -08:00
Johannes Doerfert
9f87509b19 [OpenMP][FIX] Ensure we allow shared libraries without kernels (#74532)
This fixes two bugs and adds a test for them:
- A shared library with declare target functions but without kernels
should not error out due to missing globals.
- Enabling LIBOMPTARGET_INFO=32 should not deadlock in the presence of
indirect declare targets.
2023-12-05 15:25:10 -08:00
Johannes Doerfert
e469f8474b [OpenMP][FIX] Fixup test 2023-12-01 15:22:51 -08:00
Johannes Doerfert
7169c45efa [OpenMP][NFCI] Organize offload entry logic
This moves the offload entry logic into classes and provides convenient
accessors. No functional change intended but we can now print all
offload entries (and later look them up), tested via
`OMPTARGET_DUMP_OFFLOAD_ENTRIES=<device_no>`.
2023-12-01 15:10:52 -08:00
Johannes Doerfert
5fe741f08e [OpenMP] Separate Requirements into a standalone header (#74126)
This is not completely NFC since we now check all 4 requirements and the
test is checking the good and the bad case for combining flags.
2023-12-01 14:47:00 -08:00
Shraiysh
abaeaf3823 [OpenMP][flang] Adding more tests for commonblock with target map (#71146)
This patch addresses the concern about multiple devices and also adds
more tests for `map(to:)`, `map(from:)` and named common blocks.
2023-12-01 10:59:01 -06:00
Johannes Doerfert
5d57041d39 [OpenMP][NFC] Move debug declares into CMAKE out of "private.h" (#73732)
Everywhere else we define this in the CMakeLists.txt and "private.h"
needs to go. Rename "Libomptarget" into "omptarget", no benefit from
"lib".
2023-11-28 17:38:49 -08:00
Akash Banerjee
f1d773863d [Flang][OpenMP] Remove use of non reference values from MapInfoOp (#72444)
This patch removes the val field from the `MapInfoOp`.

Previously when lowering `TargetOp`, the bounds information for the
`BoxValues` were also being mapped. Instead these ops are now cloned
inside the target region to prevent mapping of non reference typed
values.
2023-11-24 11:33:19 +00:00
Fabian Mora
be9fa9dee5 [flang][NVPTX] Add initial support to the NVPTX target (#71992)
This patch adds initial support to the NVPTX target, enabling `flang` to
produce OpenMP offload code for NVPTX targets.
2023-11-16 11:34:28 -05:00
agozillon
718793ce6a [OpenMP][OMPIRBuilder] Handle replace uses of ConstantExpr's inside of Target regions (#71891)
Currently there's an edge cases where constant indexing in target
regions can lead to incorrect results as we do not correctly replace
uses of mapped variables in generated target functions with the target
arguments (and accessor instructions) that replace them. This patch
seeks to fix that by extending the current logic in the OMPIRBuilder.

Things like GEP's can come in the form of Constants/ConstantExprs,
Constants and ConstantExpr's do not have access to the knowledge of what
they're contained in, so we must dig a little to find an instruction so
we can tell if they're used inside of the function we're outlining so we
can be sure they are replaceable and we are not accidentally replacing a
usage somewhere else in the module that's still necessary.

This patch handles these by replacing the original constant expression
with a new instruction equivalent; an instruction as it allows easy
modification in the following loop, as we can now know the constant
(instruction) is owned by our target function (as it holds this
knowledge) and replaceUsesOfWith can now be invoked on it (cannot do
this with constants it seems), a brand new one also allows us to be
cautious as it is perhaps possible the old expression was used inside of
the function but exists and is used externally (unlikely by the nature
of a Constant, but still a positive side affect).
2023-11-15 15:45:32 +01:00
Johannes Doerfert
7318fe6334 [OpenMP][FIX] Ensure device reduction geps work for multi-var reductions
If we have more than one reduction variable we need to be consistent
wrt. indexing. In 3de645efe3 we broke this
as the buffer type was reduced to a singleton but the index computation
was not adjusted to account for that offset. This fixes it by
interleaving the reduction variables properly in a array-of-struct
style. We can revert it back to struct-of-array in a follow up if turns
out to be a problem. I doubt it since half the accesses should benefit
from the locallity this layout offers and only the other half were
consecutive before.
2023-11-10 14:34:46 -08:00
Anton Rydahl
446e11acef [OpenMP ]Adding more libomptarget reduction tests (#71616)
Based on https://github.com/llvm/llvm-project/pull/70766 I think it
would be good to have a few more offloading reduction tests, so we do
not accidentally break minimum and maximum reductions another time.
2023-11-07 20:39:08 -08:00
Johannes Doerfert
2d739f13d4 [OpenMP][Offload] Automatically map indirect function pointers (#71462)
We already have all the information to automatically map function
pointers that have been declared as `indirect` declare target by the
user. This is just enabling and testing the functionality by looking
through the one level of indirection.
2023-11-07 08:33:39 -08:00
Akash Banerjee
be59fe5028 [OpenMP][Flang]Fix some of the Fortan OpenMP Offloading tests
target_map_common_block2.f90
	- Fix the extra space in the print message.
	- #67164 fixes this. So moving it outside of failing and also removing XFAIL marker.

basic-target-region-3D-array.f90
	- Corrected the check to account for the new lines printed.

Depends on #67319
2023-11-06 13:24:02 +00:00
Shilei Tian
db37d25c53 Revert "[OpenMP] Simplify parallel reductions (#70983)"
This reverts commit e9a48f9e05 because it breaks
3 sollve 5.0 tests:

test_loop_reduction_and_device.c
test_loop_reduction_bitand_device.c
test_loop_reduction_multiply_device.c
2023-11-05 22:51:59 -05:00
Johannes Doerfert
e9a48f9e05 [OpenMP] Simplify parallel reductions (#70983)
A lot of the code was from a time when we had multiple parallel levels.
The new runtime is much simpler, the code can be simplified a lot which
should speed up reductions too.
2023-11-02 15:50:05 -07:00
Johannes Doerfert
a8152086ff [Attributor][FIX] Ensure new BBs are registered 2023-11-01 12:12:14 -07:00
Johannes Doerfert
a273d17d4a [OpenMP][FIX] Do not add implicit argument to device Ctors and Dtors
Constructors and destructors on the device do not take any arguments,
also not the implicit dyn_ptr argument other kernels automatically take.
2023-11-01 11:18:11 -07:00
Johannes Doerfert
f9a89e6b9c [OpenMP][FIX] Allocate per launch memory for GPU team reductions (#70752)
We used to perform team reduction on global memory allocated in the
runtime and by clang. This was racy as multiple instances of a kernel,
or different kernels with team reductions, would use the same locations.
Since we now have the kernel launch environment, we can allocate dynamic
memory per-launch, allowing us to move all the state into a non-racy
place.

Fixes: https://github.com/llvm/llvm-project/issues/70249
2023-11-01 11:11:48 -07:00
Johannes Doerfert
e137af60cd [OpenMP][NFC] Fix test to actually check for the result 2023-10-30 17:15:41 -07:00
Andrew Gozillon
68c384676c [Flang][MLIR][OpenMP] Temporarily re-add basic handling of uses in target regions to avoid gfortran test-suite regressions
This was a regression introduced by myself in:

 6a62707c04

where I too hastily removed the basic handling of implicit captures
we have currently. This will be superseded by all implicit captures
being added to target operations map_info entries in a soon landing
series of patches, however, that is currently not the case so we must
continue to do some basic handling of these captures for the time
being. This patch re-adds that behaviour to avoid regressions.

Unfortunately this means some test changes as well as
getUsedValuesDefinedAbove grabs constants used outside
of the target region which aren't handled particularly
well currently.
2023-10-30 15:10:12 -05:00
Shilei Tian
0d5b7dd25c [OpenMP] Add a test for D158802 (#70678)
In D158802 we honored user's `thread_limit` value even with the
optimization
introduced in D152014. This patch adds a simple test.
2023-10-30 15:59:05 -04:00
agozillon
6a62707c04 [Flang][OpenMP][MLIR] Initial array section mapping MLIR -> LLVM-IR lowering utilising omp.bounds (#68689)
This patch seeks to add initial lowering of OpenMP array sections within
target region map clauses from MLIR to LLVM IR.

This patch seeks to support fixed sized contiguous (don't think OpenMP
supports anything other than contiguous sections from my reading but i
could be wrong) arrays initially, before looking toward assumed size and
shaped arrays. The patch also currently does not include stride, it's
left for future work.

Although, assumed size works in some fashion (dummy arguments) with some
minor alterations to the OMPEarlyOutliner, so it is possible changes
made in the IsolatedFromAbove series may allow this to work with no
further required patches.

It utilises the generated omp.bounds to calculate the size of the mapped
OpenMP array (both for sectioned and un-sectioned arrays) as well as the
offset to be passed to the kernel argument structure.

Alongside these changes some refactoring of how map data is handled is
attempted, using a new MapData structure to keep track of information
utilised in the lowering of mapped values.

The initial addition of a more complex createDeviceArgumentAccessor that
utilises capture kinds similarly to (and loosely based on) Clang to
generate different kernel argument accesses is also added.

A similar function for altering how the kernel argument is passed to the
kernel argument structure on the host is also utilised
(createAlteredByCaptureMap), which allows modification of the
pointer/basePointer based on their capture (and bounds information).
It's of note ByRef, is the default for explicit mappings and ByCopy will
be the default for implicit captures, so the former is currently tested
in this patch and the latter is not for the moment.
2023-10-30 16:00:23 +01:00
Johannes Doerfert
d346c82435 [OpenMP] Associate the KernelEnvironment with the GenericKernelTy (#70383)
By associating the kernel environment with the generic kernel we can
access middle-end information easily, including the launch bounds ranges
that are acceptable. By constraining the number of threads accordingly,
we now obey the user-provided bounds that were passed via attributes.
2023-10-29 11:35:34 -07:00
Shraiysh
03485a0406 [openmp][flang] Add tests for map clause (#70394)
This patch adds basic tests for map clause on target construct for
commonblocks. There will be more tests to add, which will be added in
future patches. Currently failing tests are added in a separate folder
with XFAIL. They should be moved as they are fixed.
2023-10-27 09:35:06 -07:00
Johannes Doerfert
86bb713142 [OpenMP][FIX] Enlarge thread state array, improve test and add second 2023-10-22 17:47:00 -07:00
Johannes Doerfert
9f3b06d8be [OpenMP][FIX] Fix memset oversight to partially unblock test
The tests "unoptimized" version is still broken, disabled for now.
2023-10-22 14:29:11 -07:00
Johannes Doerfert
f3ff0a67be [OpenMP][FIX] Ensure test runs correct with (at least) 2 threads 2023-10-22 13:22:36 -07:00
Johannes Doerfert
87dac9f168 [OpenMP] Rewrite test to check the correct (CPU) result
The test initially showed we do no crash but compute the wrong GPU
result, now we show the CPU result is correct and disable GPU testing.
2023-10-21 14:55:15 -07:00
Johannes Doerfert
d3921e4670 [OpenMP] Basic BumpAllocator for (AMD)GPUs (#69806)
The patch contains a basic BumpAllocator for (AMD)GPUs to allow us to
run more tests. The allocator implements `malloc`, both internally and
externally, while we continue to default to the NVIDIA `malloc` when we
target NVIDIA GPUs. Once we have smarter or customizable allocators we
should consider this choice, for now, this allocator is better than
none. It traps if it is out of memory, making it easy to debug. Heap
size is configured via `LIBOMPTARGET_HEAP_SIZE` and defaults to 512MB.
It allows to track allocation statistics via
`LIBOMPTARGET_DEVICE_RTL_DEBUG=8` (together with
`-fopenmp-target-debug=8`). Two tests were added, and one was enabled.

This is the next step towards fixing
 https://github.com/llvm/llvm-project/issues/66708
2023-10-21 14:49:30 -07:00
Johannes Doerfert
d571af7f62 [OpenMP][FIX] Ensure thread states do not crash on the GPU
The nested parallelism causes thread states which still do not properly
work but at least don't crash anymore.
2023-10-21 14:43:09 -07:00
Jon Chesterfield
7ac516a119 [amdgpu] Disable openmp test that is blocking CI after changing hardware, need to diagnose memory fault 2023-10-16 13:59:49 +01:00