Commit Graph

237 Commits

Author SHA1 Message Date
Valentin Clement
a25da1a921 [mlir][openacc] Add device_type support for compute operations (#75864)
Re-land PR after being reverted because of buildbot failures.

This patch adds representation for `device_type` clause information on
compute construct (parallel, kernels, serial).

The `device_type` clause on compute construct impacts clauses that
appear after it. The values impacted by `device_type` are now tied with
an attribute array that represents the device_type associated with them.
`DeviceType::None` is used to represent the value produced by a clause
before any `device_type`. The operands and the attribute information are
parsed/printed together.

This is an example with `vector_length` clause. The first value (64) is
not impacted by `device_type` so it will be represented with
DeviceType::None. None is not printed. The second value (128) is tied
with the `device_type(multicore)` clause.
```
!$acc parallel vector_length(64) device_type(multicore) vector_length(128)
```
```
acc.parallel vector_length(%c64 : i32, %c128 : i32 [#acc.device_type<multicore>]) {
}
```
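
The pairing rule above can be modeled outside of MLIR. The following
Python sketch is an illustration only, not the parser/printer from this
patch: `pair_with_device_type`, `print_operands` and the tuple-based
clause list are hypothetical names introduced for the example.

```python
# Hypothetical model: clauses are (name, value) tuples in source order.
NONE = "none"  # stands in for DeviceType::None


def pair_with_device_type(clauses, target):
    """Return [(value, device_type)] for every `target` clause, in order."""
    current = NONE
    pairs = []
    for name, value in clauses:
        if name == "device_type":
            current = value  # applies to every clause that appears after it
        elif name == target:
            pairs.append((value, current))
    return pairs


def print_operands(pairs):
    """Render the pairs; the `none` device type is not printed."""
    parts = []
    for value, dev in pairs:
        text = f"%c{value} : i32"
        if dev != NONE:
            text += f" [#acc.device_type<{dev}>]"
        parts.append(text)
    return ", ".join(parts)


clauses = [
    ("vector_length", 64),
    ("device_type", "multicore"),
    ("vector_length", 128),
]
pairs = pair_with_device_type(clauses, "vector_length")
```

Calling `print_operands(pairs)` reproduces the operand list of the
`acc.parallel` example: the first value carries no printed attribute and
the second one is tied to `multicore`.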

When multiple values can be produced for a single clause like
`num_gangs` and `wait`, an extra attribute describes the number of values
belonging to each `device_type`. Values and attributes are
parsed/printed together.

```
acc.parallel num_gangs({%c2 : i32, %c4 : i32}, {%c4 : i32} [#acc.device_type<nvidia>])
```
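
One way to picture the extra attribute is as a list of segment sizes
next to a flat operand list. The Python sketch below assumes that
layout; the function names and the three parallel lists are hypothetical
and are not the attribute names used by the operation.

```python
def flatten_groups(groups):
    """groups: list of (values, device_type) -> flat operands plus metadata."""
    operands, segments, device_types = [], [], []
    for values, dev in groups:
        operands.extend(values)
        segments.append(len(values))  # number of values for this device_type
        device_types.append(dev)
    return operands, segments, device_types


def regroup(operands, segments, device_types):
    """Inverse of flatten_groups: rebuild the per-device_type groups."""
    groups, start = [], 0
    for count, dev in zip(segments, device_types):
        groups.append((operands[start:start + count], dev))
        start += count
    return groups


# num_gangs({2, 4}, {4} [nvidia]) from the example above
groups = [([2, 4], "none"), ([4], "nvidia")]
flat = flatten_groups(groups)
```

The segment sizes are what make the flat operand list unambiguous:
without them, `[2, 4, 4]` could not be split back into its two groups.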

While preparing this patch I noticed that the wait devnum is not part of
the operations and is not lowered. It will be added in a follow-up
patch.
2023-12-20 20:36:09 -08:00
Valentin Clement
553748356c Revert "[mlir][openacc] Add device_type support for compute operations (#75864)"
This reverts commit 8b885eb90f.
2023-12-20 16:08:10 -08:00
Valentin Clement
e98082d90a Revert "[flang][openacc] Remove unused waitdevnum"
This reverts commit 8fdc3b98b8.
2023-12-20 16:07:57 -08:00
Valentin Clement
8fdc3b98b8 [flang][openacc] Remove unused waitdevnum 2023-12-20 14:01:51 -08:00
Valentin Clement (バレンタイン クレメン)
8b885eb90f [mlir][openacc] Add device_type support for compute operations (#75864)
This patch adds representation for `device_type` clause information on
compute construct (parallel, kernels, serial).

The `device_type` clause on compute construct impacts clauses that
appear after it. The values impacted by `device_type` are now tied with
an attribute array that represents the device_type associated with them.
`DeviceType::None` is used to represent the value produced by a clause
before any `device_type`. The operands and the attribute information are
parsed/printed together.

This is an example with `vector_length` clause. The first value (64) is
not impacted by `device_type` so it will be represented with
DeviceType::None. None is not printed. The second value (128) is tied
with the `device_type(multicore)` clause.
```
!$acc parallel vector_length(64) device_type(multicore) vector_length(128)
```
```
acc.parallel vector_length(%c64 : i32, %c128 : i32 [#acc.device_type<multicore>]) {
}
```

When multiple values can be produced for a single clause like
`num_gangs` and `wait`, an extra attribute describes the number of values
belonging to each `device_type`. Values and attributes are
parsed/printed together.

```
acc.parallel num_gangs({%c2 : i32, %c4 : i32}, {%c4 : i32} [#acc.device_type<nvidia>])
```

While preparing this patch I noticed that the wait devnum is not part of
the operations and is not lowered. It will be added in a follow-up
patch.
2023-12-20 13:45:47 -08:00
Razvan Lupusoru
a711b042fd [acc] Initial implementation of MemoryEffects on acc operations (#75970)
The `acc` dialect operations now implement MemoryEffects interfaces in
the following ways:
- Data entry operations which may read host memory via `varPtr` are now
marked as such. The majority of them do NOT actually read the host memory.
For example, `acc.present` works on the basis of the presence of a pointer
and not necessarily what the data points to, so such operations are not
marked as reading the host memory. They still use `varPtr`, but this
dependency is reflected through SSA.
- Data clause operations which may mutate the data pointed to by
`accPtr` are marked as doing so.
- Data clause operations which update required structured or dynamic
runtime counters are marked as reading and writing the newly defined
`RuntimeCounters` resource. Some operations, like `acc.getdeviceptr` do
not actually use the runtime counters - but are marked as reading them
since the address obtained depends on the mapping operations which do
update the runtime counters. Namely, `acc.getdeviceptr` cannot be moved
across other mapping operations.
- Constructs are marked as writing to the `ConstructResource`. This may
be too strict but is needed for the following reasons: 1) Structured
constructs may not use `accPtr` and instead use `varPtr` - when this is
the case, data actions may be removed even when used. 2) Unstructured
constructs are currently used to aggregate multiple data actions. We do
not want such constructs removed or moved for now.
- Terminators are marked as `Pure` as in other dialects.
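
The practical consequence of these markings is easiest to see with a
simplified conflict check. The Python sketch below is not MLIR's
analysis; the effect table is a reduced, illustrative reading of the
list above, and `conflicts` is a hypothetical helper.

```python
# Illustrative effect sets: (effect kind, resource). Simplified on purpose.
EFFECTS = {
    "acc.getdeviceptr": {("read", "RuntimeCounters")},
    "acc.copyin": {
        ("read", "host memory"),
        ("read", "RuntimeCounters"),
        ("write", "RuntimeCounters"),
    },
    "acc.parallel": {("write", "ConstructResource")},
}


def conflicts(a, b):
    """True if one op writes a resource that the other op reads or writes."""
    for kind_a, res_a in EFFECTS[a]:
        for kind_b, res_b in EFFECTS[b]:
            if res_a == res_b and "write" in (kind_a, kind_b):
                return True
    return False
```

Under this model `acc.getdeviceptr` conflicts with `acc.copyin` through
the runtime counters, so it cannot be moved across it, while two
`acc.getdeviceptr` operations only read and do not conflict.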

The current approach has the following limitations which may require
further improvements:
- Subsequent `acc.copyin` operations on the same data do not actually read
host memory pointed to by `varPtr` but are still marked as doing so.
- Two `acc.delete` operations on the same data may not mutate `accPtr` until
the runtime counters are zero (but are still marked as mutating).
- The `varPtrPtr` argument, when present, points to the address of the
location of `varPtr`. When mapping to the target device, an `accPtrPtr`
needs to be computed and this memory is mutated. This effect is not captured
since the current operations do not produce `accPtrPtr`.
- Runtime counter effects are imprecise since two operations with
differing `varPtr` increment/decrement different counters. Additionally,
operations with `varPtrPtr` mutate attachment counters.
- The `ConstructResource` is too strict and likely can be relaxed with
better modeling.
2023-12-20 07:11:19 -08:00
Valentin Clement (バレンタイン クレメン)
22426d9ecd [flang][openacc/mp] Do not read bounds on absent box (#75252)
Make sure we only load box and read its bounds when it is present.
- Add `AddrAndBoundInfo` struct to be able to carry around the `addr`
and `isPresent` values. This is likely to grow so we can perform all the
accesses in a single `fir.if` operation.
2023-12-15 13:02:40 -08:00
Valentin Clement (バレンタイン クレメン)
711809f37a [flang][openacc/mp][NFC] Fix order of template arguments (#75538)
Some template parameters for the bounds ops generation were
inverted. The order should consistently be `BoundsOp, BoundsType`.
2023-12-14 21:13:38 -08:00
Valentin Clement (バレンタイン クレメン)
a9a5af8270 [flang][openacc] Support early return in acc.loop (#73841)
Early return is accepted in an OpenACC loop that is not directly nested in a
compute construct. Since the `acc.loop` operation has a region, the
`func.return` operation cannot be used directly inside the region.
An early return is materialized by an `acc.yield` operation returning a
`true` value. The standard end of the `acc.loop` region yields a `false`
value in this case.
A conditional branch operation on the `acc.loop` result branches to
the `finalBlock` if an early exit was produced in the `acc.loop`, or to
the continue block otherwise.
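
The control flow can be mimicked in plain Python. This is a sketch of
the idea only, not the generated IR: `run_loop` stands in for the
`acc.loop` region and `subroutine` for the enclosing function, and both
names are hypothetical.

```python
def run_loop(items, should_return):
    """Model of the loop region: the result flags whether an early return hit."""
    visited = []
    for item in items:
        visited.append(item)
        if should_return(item):
            return True, visited  # plays the role of acc.yield with `true`
    return False, visited  # standard end of the region yields `false`


def subroutine(items, should_return):
    """Branch on the loop result, as the conditional branch does."""
    early_exit, visited = run_loop(items, should_return)
    if early_exit:
        return "finalBlock", visited
    return "continue", visited
```

For example, `subroutine([1, 2, 3], lambda x: x == 2)` stops after the
second item and takes the `finalBlock` path.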
2023-11-30 14:25:03 -08:00
Valentin Clement (バレンタイン クレメン)
9365ed1e10 [flang][openacc] Add ability to link acc.declare_enter with acc.declare_exit ops (#72476) 2023-11-16 16:41:50 -08:00
Valentin Clement (バレンタイン クレメン)
a3700cc29d [flang][openacc] Make implicit declare region unstructured (#71591)
Using an op with a region causes issues with unstructured code. This
patch makes use of acc.declare_enter and acc.declare_exit to represent
the implicit declare region.
2023-11-14 14:42:11 -08:00
Valentin Clement (バレンタイン クレメン)
90da688bac [flang][openacc] Avoid creation of duplicate global ctor (#71846)
PR #70698 relaxed the duplication rule in acc declare clauses. This led
to potential duplicate creation of the global constructor/destructor.
This patch makes sure not to generate a duplicate ctor/dtor.
2023-11-09 12:57:30 -08:00
Valentin Clement (バレンタイン クレメン)
edfaae8726 [flang][openacc] Correctly lower acc routine in interface block (#71451)
When the acc routine directive was in an interface block in a
subroutine, the routine information was attached to the wrong
subroutine. This patch fixes this by retrieving the subroutine name in
the interface.
2023-11-06 17:48:45 -08:00
Valentin Clement (バレンタイン クレメン)
3c356eef31 [flang][openacc] Support variable from equivalence in data clauses (#71434)
The value for a var in an equivalence is represented by a `fir.ptr`.
Support this type in the recipe creation.
2023-11-06 15:49:40 -08:00
Slava Zakharin
ecb1fbaa13 [flang][openacc] Generate data bounds for array addressing. (#71254)
In cases like `copy(array(N))` it is still useful to represent
the data operand uniformly with `copy(array(N:N))`.
This change generates data bounds even if it is not an array
section with the triplets. The lower and the upper bounds
are the same and the extent is one in this case.
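
A small Python sketch of the uniform representation, assuming
one-dimensional subscripts and ignoring strides and any lower bound
normalization done by the real lowering. `data_bounds` and its
dictionary result are hypothetical.

```python
def data_bounds(subscript):
    """subscript: int N for `array(N)`, or (lower, upper) for `array(lower:upper)`."""
    if isinstance(subscript, int):
        lower = upper = subscript  # single element: both bounds are the same
    else:
        lower, upper = subscript
    return {"lower": lower, "upper": upper, "extent": upper - lower + 1}
```

With this model `data_bounds(5)` and `data_bounds((5, 5))` give the same
result, with an extent of one.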
2023-11-06 14:45:46 -08:00
Valentin Clement
ad584a27f2 [flang][openacc][NFC] Remove unused variable 2023-11-06 14:43:36 -08:00
Valentin Clement (バレンタイン クレメン)
fdf3823c0e [flang][openacc] Support variable in equivalence in declare directive (#71242)
A variable in an equivalence shares its storage units with one or more
objects. When lowered to FIR, the global created for the equivalence has
the name of one of the objects. The variable also has an offset in the
storage unit.
This patch takes all of this into account for a variable that is part of
an equivalence and used in a declare directive.
2023-11-06 14:36:24 -08:00
Valentin Clement (バレンタイン クレメン)
32d91449ef [flang][openacc] Only issue a warning when acc routine func is not found (#70964)
Do not issue a hard error when the function in acc routine directive is
not present in the current translation unit. Only issue a warning.
2023-11-01 12:59:59 -07:00
Razvan Lupusoru
f5a5142571 [openacc] Update acc.loop to expose data operands (#70954)
The compute and data constructs implement getNumDataOperands and
getDataOperand. The acc.loop operation similarly has multiple data
operands - thus it makes sense to expose them the same way.

For loop, only private and reduction operands are exposed this way.
Technically, acc.loop also holds cache operands - but these are hints
not a data attribute.
2023-11-01 11:52:31 -07:00
Valentin Clement (バレンタイン クレメン)
0f8615f4dc [flang][openacc][openmp] Set correct location on atomic operations (#70680)
The location set on atomic operations in both OpenMP and OpenACC was
completely off. The real location needs to be created from the source
CharBlock of the parse tree node of the respective atomic statement.
This patch updates locations in lowering for atomic operations.
2023-10-30 10:35:43 -07:00
Valentin Clement (バレンタイン クレメン)
f706837e2b [flang][mlir][openacc] Switch device_type representation to an enum (#70250)
Switch the representation from a scalar integer to an enumeration. The
parser transforms the string in the input to the correct enumeration.
2023-10-30 09:51:42 -07:00
Razvan Lupusoru
62ae549f57 [flang][openacc] Add implicit copy for reduction in combined construct (#70148)
After PR#69417, lowering for combined constructs was updated to adhere
to OpenACC 3.3, section 2.11: `A private or reduction clause on a
combined construct is treated as if it appeared on the loop construct.`

However, the second part of that paragraph notes `In addition, a
reduction clause on a combined construct implies a copy clause`. Since
the acc dialect decomposes combined constructs, it is important to
distinguish between the case where an explicit data clause is required
(as noted in section 2.6.2) and the case where an implicit data action
must be generated by the compiler.
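
The decomposition rule quoted from section 2.11 can be sketched as
follows. This is a simplified Python model, not the lowering code: the
`implicit_copy` marker and `decompose_combined` are hypothetical, and
the handling of variables that already appear in an explicit data
clause is left out.

```python
def decompose_combined(clauses):
    """clauses: list of (name, var). Returns (compute_clauses, loop_clauses)."""
    compute, loop = [], []
    for name, var in clauses:
        if name in ("private", "reduction"):
            # Treated as if it appeared on the loop construct.
            loop.append((name, var))
            if name == "reduction":
                # A reduction on a combined construct implies a copy clause.
                compute.append(("implicit_copy", var))
        else:
            compute.append((name, var))
    return compute, loop


compute, loop = decompose_combined(
    [("copyin", "a"), ("private", "t"), ("reduction", "s")]
)
```

Keeping the implied copy under a distinct marker mirrors the point made
above: an implicit data action stays distinguishable from an explicit
data clause.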
2023-10-25 07:27:57 -07:00
Valentin Clement (バレンタイン クレメン)
828674395b [flang][openacc] Allow acc routine at the top level (#69936)
Some compilers allow the `$acc routine(<name>)` to be placed at the
program unit level. To be compatible, this patch enables the use of acc
routine at this level. These acc routine directives must have a name.
2023-10-24 09:17:48 -07:00
Razvan Lupusoru
54e46ba447 [flang][openacc] Fix post_alloc declare function ordering (#69980)
The declare actions were introduced to capture semantics dealing with
allocation of descriptor-based variables. However, the post_alloc action
has an ordering error. It needs to update the descriptor before the
mapping action of the data. The reason for this is that implicit attach
must occur during mapping action - but updating the descriptor
synchronizes it with the host copy (which would hold a host pointer).
2023-10-23 18:25:17 -07:00
Valentin Clement (バレンタイン クレメン)
d2e7a15dfb [flang][openacc] Warn about misplaced end loop directive and ignore it (#69512)
Instead of raising an error for a misplaced `end loop directive`, just
warn about it and ignore it. This directive is an extension and is
optional.
2023-10-19 08:49:01 -07:00
Slava Zakharin
d0e8f3321e [flang][openacc] Fixed private/reduction for combined constructs. (#69417)
According to OpenACC 3.2, section 2.11, a private or reduction clause
on the combined construct is treated as if it appeared
on the loop construct.
2023-10-18 08:10:50 -07:00
Valentin Clement (バレンタイン クレメン)
d9568bd4aa [flang][openacc] Support array with dynamic extents in firstprivate recipe (#69026)
Add lowering support for arrays with dynamic extents in the firstprivate
recipe. Generalize the lowering so static shaped arrays and arrays with
dynamic extents use the same path.

Some cleanup code is taken from #68836, which has not landed yet.
2023-10-16 12:51:01 -07:00
Valentin Clement (バレンタイン クレメン)
f74b85c678 [flang][openacc] Support array with dynamic extents in reduction recipe (#68829)
Add support for arrays with dynamic extents in lowering of the reduction
recipe.
2023-10-16 12:50:39 -07:00
Valentin Clement (バレンタイン クレメン)
468d3b1b78 [flang][openacc][NFC] Simplify lowering of recipe (#68836)
Refactor some of the lowering in the reduction and firstprivate recipe
to avoid duplicated code.
2023-10-16 09:35:50 -07:00
Valentin Clement
acbb260a48 [flang][openacc][NFC] Fix TODO messages 2023-10-10 10:10:15 -07:00
Valentin Clement (バレンタイン クレメン)
e2f493bed3 [flang][openacc] Support assumed shape array in firstprivate recipe (#68640)
Add support for assumed shape arrays in lowering of the copy region of
the firstprivate recipe. Information is passed in block arguments as it
is done for the reduction recipe.
2023-10-10 09:55:55 -07:00
Valentin Clement (バレンタイン クレメン)
c8b5f4c07e [flang][openacc] Support array with dynamic extent in private recipe (#68624)
Add lowering support for arrays with dynamic extents in the private recipe.
The extents are passed as block arguments and used in the alloca
operation. The shape also uses this information for the hlfir.declare
operation.
2023-10-09 17:43:03 -07:00
Valentin Clement
470b65270b [flang][openacc] Support allocatable and pointer array in private recipe
Add support for pointer and allocatable arrays in private clause.
2023-10-09 10:20:58 -07:00
Valentin Clement
2615de5867 Revert "[flang][openacc] Support allocatable and pointer array in private recipe (#68422)"
This reverts commit e85cdb94cc.

This fails some buildbots
2023-10-09 09:33:26 -07:00
Valentin Clement (バレンタイン クレメン)
e85cdb94cc [flang][openacc] Support allocatable and pointer array in private recipe (#68422)
Add support for pointer and allocatable arrays in private clause.
2023-10-09 09:22:43 -07:00
Valentin Clement (バレンタイン クレメン)
1556ddf255 [flang][openacc] Do not generate duplicate routine op (#68348)
This patch updates the lowering of OpenACC routine directive to avoid
creating duplicate acc.routine operations when all the clauses are
identical. If clauses differ an error is raised.
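
The deduplication rule can be sketched with a lookup table keyed by
function name. This Python model is an assumption about the shape of
the check, not the actual lowering code; `add_routine` and the table
are hypothetical.

```python
def add_routine(table, func_name, clauses):
    """Record a routine. Returns True if a new entry was created."""
    new = frozenset(clauses)
    if func_name in table:
        if table[func_name] != new:
            # Same function, different clauses: reported as an error.
            raise ValueError(f"conflicting acc routine clauses for {func_name}")
        return False  # identical clauses: reuse the existing entry
    table[func_name] = new
    return True
```

A second directive with identical clauses leaves the table unchanged,
and one with different clauses raises.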
2023-10-06 08:10:12 -07:00
Valentin Clement (バレンタイン クレメン)
964a252202 [flang][openacc] Add support for allocatable and pointer arrays in reduction (#68261)
This patch adds support for allocatable and pointer arrays in the
reduction recipe lowering.
2023-10-05 13:05:43 -07:00
Valentin Clement (バレンタイン クレメン)
4f98fb2e93 [mlir][openacc][NFC] Remove useless OptionalAttr with UnitAttr (#68337)
`UnitAttr` either exists or not, so wrapping it in `OptionalAttr` is not
necessary. This patch cleans this up for the `RoutineOp`.
2023-10-05 10:45:04 -07:00
Valentin Clement (バレンタイン クレメン)
438a6ef277 [flang][openacc] Use the array section for assumed shape array reduction (#68147)
Use the bounds information in the reduction recipe for assumed shape
arrays.
2023-10-03 13:38:02 -07:00
Valentin Clement (バレンタイン クレメン)
e0cd781f3b [flang][openacc] Fix getBoundsString for reduction recipe name (#68146)
`getBoundsString` is used to generate the reduction recipe names when an
array section is provided. The lowerbound and upperbound were swapped.
This patch fixes it.
2023-10-03 13:08:00 -07:00
Jie Fu
57c639deb4 [flang][openacc] Fix -Wunused-variable in OpenACC.cpp (NFC)
/llvm-project/flang/lib/Lower/OpenACC.cpp:940:16: error: unused variable 'nbRangeArgs' [-Werror,-Wunused-variable]
      unsigned nbRangeArgs =
               ^
1 error generated.
2023-10-03 09:14:32 +08:00
Valentin Clement (バレンタイン クレメン)
185eab1974 [flang][openacc] Keep constant bounds in reduction recipe when it is all constants (#67827)
Following #67719, propagate the constant bounds in the combiner region
when all bounds are constant. Otherwise, bounds information is
propagated as block arguments as defined in #67719.
2023-10-02 17:59:41 -07:00
Kazu Hirata
aa5a1b267a [flang] Fix a typo 2023-09-28 14:28:19 -07:00
Kazu Hirata
33f5087e1b [flang] Fix an unused variable warning
This patch fixes:

  flang/lib/Lower/OpenACC.cpp:876:14: error: unused variable
  'nbRangeArgs' [-Werror,-Wunused-variable]
2023-09-28 14:25:54 -07:00
Valentin Clement (バレンタイン クレメン)
d28a782542 [flang][openacc] Use bounds information in reduction recipe combiner (#67719)
This patch makes use of the bounds in the combiner region for known
shape arrays. Until now, the combiner region iterated over the
whole array.
Lowerbound, upperbound and step are passed as block arguments after the
two values.

A follow up patch will make use of this information for the assumed
shape arrays as well.
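
The effect of passing the bounds can be sketched in Python. The example
assumes zero-based inclusive bounds and a `+` reduction, neither of
which is stated above, and `combine_section` is a hypothetical name.

```python
def combine_section(dest, src, lower, upper, step=1):
    """Combine src into dest only for indices lower..upper (inclusive)."""
    for i in range(lower, upper + 1, step):
        dest[i] += src[i]  # `+` stands in for the reduction operator
    return dest
```

Elements outside the section are left untouched, where a combiner that
iterates over the whole array would have updated all of them.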
2023-09-28 13:11:06 -07:00
Valentin Clement (バレンタイン クレメン)
49f1232ea1 [flang][openacc] Support assumed shape arrays in private recipe (#67701)
This patch adds correct support for the assumed shape arrays in the
privatization recipes.
This follows the same IR generation as in #67610.
2023-09-28 12:40:51 -07:00
Valentin Clement (バレンタイン クレメン)
ef1eb502e0 [flang][openacc] Support assumed shape arrays in reduction (#67610)
Assumed shape arrays use a descriptor and must be handled differently
from known shape arrays. This patch adds support to generate the `init`
and `combiner` regions for the reduction recipe operation with assumed
shape arrays by using the descriptor and the HLFIR lowering path.

`createTempFromMold` function is moved from
`flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp` to
`flang/include/flang/Optimizer/Builder/HLFIRTools.h` to be reused to
create the private copy.
2023-09-28 08:36:19 -07:00
Valentin Clement (バレンタイン クレメン)
b163e52b78 [flang][openacc][hlfir] Add declare op in reduction recipe (#67484)
Same change as #67368 but for the reduction recipe.
2023-09-26 13:36:57 -07:00
Valentin Clement (バレンタイン クレメン)
4da01a636b [flang][openacc][hlfir] Add declare op in private recipe (#67368)
Following #66099, the generation of private (and firstprivate) recipe
needs to add a declare op. This patch adds the declare op for the case
currently supported.

This will fix issue #66105.
2023-09-26 10:32:47 -07:00
Andrew Gozillon
8fde6f41a0 [Flang][OpenMP] Add lowering from PFT to new MapEntry and Bounds operations and tie them to relevant Target operations
This patch builds on top of a prior patch in review which adds a new map
and bounds operation by modifying the OpenMP PFT lowering to support
these operations and generate them from the PFT.

A significant amount of the support for the Bounds operation is borrowed
from OpenACC's own current implementation and lowering, just ported
over to OpenMP.

The patch also adds very preliminary/initial support for lowering to
a new Capture attribute, which is stored on the new Map Operation,
which helps the later lowering from OpenMP -> LLVM IR by indicating
how a map argument should be handled. This capture type will
influence how a map argument is accessed on device and passed by
the host (different load/store handling etc.). It is reflective of a
similar piece of information stored in the Clang AST which performs a
similar role.

It also makes some minor adjustments to how the map type (map bitshift
which dictates to the runtime how it should handle an argument) is
generated, to support more use cases for future patches that
build on this work.

Finally, it adds the map entry operation creation and ties it to the relevant
target operations, adds some new tests, and alters previous tests
to support the new changes.

Depends on D158732

reviewers: kiranchandramohan, TIFitis, clementval, razvanlupusoru

Differential Revision: https://reviews.llvm.org/D158734
2023-09-19 08:26:46 -05:00