For unstructured constructs, the blocks are created in advance inside the
function body. This causes issues when the unstructured construct is
inside an OpenACC region operation. This patch applies the same fix as the
OpenMP lowering and re-creates the blocks inside the op region.
Initial OpenMP fix: 29f167abcf
Since the OpenACC atomics specification is a subset of OpenMP atomics,
the same lowering implementation can be used. This change extracts out
the necessary pieces from the OpenMP lowering and puts them in a shared
spot. The shared spot is a header file so that each implementation can
template specialize directly.
After putting the OpenMP implementation in a common spot, the following
changes were needed to make it work for OpenACC:
* Ensure parsing works correctly by avoiding hardcoded offsets.
* Templatize based on atomic type.
* Whether the lowering is for OpenMP or OpenACC is determined by checking for
OmpAtomicClauseList (OpenACC does not implement this, so we just
templatize with void). Checking this is preferable to checking the atomic
type because in some cases, like atomic capture, the read/write/update
implementations are called, and we want compile-time evaluation of
these conditional parts.
* The memory order and hint are used only for OpenMP.
* Generate acc dialect operations instead of omp dialect operations.
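The compile-time dispatch described in the list above can be sketched roughly
as follows. This is a standalone illustration, not the actual flang code:
`FakeOmpClauseList`, `lowerAtomicRead`, and the returned strings are
hypothetical stand-ins. Only the idea of templating on the clause-list type
and instantiating with `void` for OpenACC comes from the description above.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the OpenMP atomic clause list type.
struct FakeOmpClauseList {
  std::string memoryOrder; // e.g. "seq_cst"
};

// Shared lowering helper, templated on the clause-list type.
// OpenACC has no such list, so it instantiates the helper with void.
template <typename AtomicListT>
std::string lowerAtomicRead(const AtomicListT *clauses) {
  if constexpr (std::is_same_v<AtomicListT, void>) {
    // OpenACC path: there is no memory order or hint to process.
    (void)clauses;
    return "acc.atomic.read";
  } else {
    // OpenMP path: the memory order is only looked at here. This branch
    // is never instantiated when AtomicListT is void.
    assert(clauses && "OpenMP path expects a clause list");
    return "omp.atomic.read memory_order(" + clauses->memoryOrder + ")";
  }
}
```

Because the branch is selected with `if constexpr`, a capture-style caller
could reuse the same read/write/update helpers for both dialects without any
run-time check.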
The cache directive is attached directly to the acc.loop operation when
the directive appears in the loop. When it appears before a loop, the
OpenACCCacheConstruct is saved and attached when the acc.loop is
created.
Directives that cannot be attached to a loop are silently discarded.
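The attach-now / attach-later / discard behavior described above could be
modeled along these lines. This is only a sketch under stated assumptions:
`FakeLoopOp` and `CacheTracker` are hypothetical names, and a plain string
stands in for the saved OpenACCCacheConstruct.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an acc.loop operation.
struct FakeLoopOp {
  std::vector<std::string> cacheOperands;
};

// Hypothetical lowering state that remembers a cache directive
// seen before its loop.
class CacheTracker {
public:
  // Cache directive seen inside a loop: attach it immediately.
  void attachInLoop(FakeLoopOp &loop, std::string var) {
    loop.cacheOperands.push_back(std::move(var));
  }

  // Cache directive seen before a loop: save it for later.
  void saveForNextLoop(std::string var) { pending = std::move(var); }

  // Called when a loop op is created: consume the saved directive, if any.
  void onLoopCreated(FakeLoopOp &loop) {
    if (pending) {
      loop.cacheOperands.push_back(std::move(*pending));
      pending.reset();
    }
  }

  // Called when no loop followed: drop whatever is left.
  // Returns true if a directive was discarded.
  bool discardPending() {
    bool hadPending = pending.has_value();
    pending.reset();
    return hadPending;
  }

private:
  std::optional<std::string> pending;
};
```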
Depends on #65521
The `cache` directive may appear at the top of (inside of) a loop. It
specifies array elements or subarrays that should be fetched into the
highest level of the cache for the body of the loop.
The `cache` directive is modeled as data entry operands attached to
the acc.loop operation.
Some compilers accept `!$acc data` without any clauses. For portability
reasons, this patch relaxes the strict error to a simple portability warning.
Reviewed By: razvanlupusoru, vzakhari
Differential Revision: https://reviews.llvm.org/D159019
getSymbolFromAccObject was hitting a fatal error when
trying to retrieve the symbol of an array section.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158881
This patch propagates the acc routine information
to the module file so it can be used by the caller.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158541
Lower the acc declare directive in a function/subroutine
to the newly introduced acc.declare operation. Only a single
acc.declare operation is produced in a function or subroutine
so they don't end up nested.
Depends on D158314
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158315
The routine directive can appear in the specification part of
a subroutine, function or module and therefore appear before the
function or subroutine is lowered. We keep track of the created
routine info attributes and attach them to the function at the end
of the lowering if the directive appeared before the function was
lowered.
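The deferred attachment described above might look roughly like the sketch
below. All names (`FakeFuncOp`, `ModuleLowering`, `addRoutineInfo`,
`finalize`) are hypothetical and strings stand in for the routine info
attributes. It illustrates the bookkeeping only, not the flang
implementation.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a lowered function operation.
struct FakeFuncOp {
  std::vector<std::string> routineInfos;
};

// Hypothetical module-level lowering state.
struct ModuleLowering {
  std::map<std::string, FakeFuncOp> funcs;                  // lowered so far
  std::map<std::string, std::vector<std::string>> deferred; // func -> infos

  // Called when a routine directive is lowered.
  void addRoutineInfo(const std::string &funcName, const std::string &info) {
    auto it = funcs.find(funcName);
    if (it != funcs.end())
      it->second.routineInfos.push_back(info); // function exists: attach now
    else
      deferred[funcName].push_back(info); // not lowered yet: remember it
  }

  // Called when a function or subroutine is lowered.
  void lowerFunction(const std::string &funcName) { funcs[funcName]; }

  // Called once at the end of lowering: attach everything that was deferred.
  void finalize() {
    for (auto &[name, infos] : deferred) {
      auto it = funcs.find(name);
      if (it == funcs.end())
        continue; // no such function was lowered; nothing to attach to
      for (const auto &info : infos)
        it->second.routineInfos.push_back(info);
    }
    deferred.clear();
  }
};
```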
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158204
This patch makes use of the HLFIR box produced for hlfir.declare
in place of the FIR box (the memref of hlfir.declare) when possible.
This makes the representation a little bit more clear, because
all accesses are made via a single box.
This reduces the live range of the original box, because the new
temporary box produced by embox/rebox is used from then on.
Apparently, this works around some issues in the current HLFIR codegen,
for example, see the LIT test changes around fir.array_coor
produced by hlfir.designate codegen - using the FIR box for fir.array_coor
might result in using incorrect lbounds.
Apparently, this change enables more intrinsics simplifications
because the SimplifyIntrinsicsPass looks for explicit embox/rebox
in findBoxDef() to decide whether to apply the optimization.
This change also provides better association of the base addresses
referenced by OpenACC clauses with the corresponding boxes
that might be used explicitly in OpenACC regions (e.g. for reading
the lbounds).
Reviewed By: razvanlupusoru, clementval
Differential Revision: https://reviews.llvm.org/D158119
Lower the bind clause to the corresponding attribute.
Depends on D158120
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158121
Not all declare clauses have an exit operation attached to them, and
therefore no dealloc function is generated for those clauses. Attach
the pre/post deallocation attribute only for the clauses that have
an exit operation.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158106
Lowering did not generate the pre/post alloc/dealloc
functions for the acc declare variables. This patch adds their generation.
These functions take the descriptor as their only argument.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158103
This patch lowers the simple acc routine directive
with no clauses and no name inside a function/subroutine.
A patch to handle the name and clauses will follow.
A patch to add the attribute to the original routine will follow as well.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157919
This patch adds the acc.declare_action attributes on
post allocate operations and pre/post deallocate operations.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157915
The exit operation for the declare device_resident in function/subroutine
is set to delete. Make it consistent and set it for global declare as well.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157537
Generate the function dealing with the action on deallocation (pre/post) of
a declare variable.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157530
Always generate the acc.declare_exit operation so it
matches the acc.declare_enter operation. This is to ensure the
lifetime of data in the implicit region.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157524
When creating declare entry operation, the variable needs to be flagged
with the declare attribute. This was not done for device_resident, link and
deviceptr.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157522
Generate the register function for global declare
variables. This function is meant to be called after the
actual data is allocated. A patch to insert the function call
and attribute will follow.
Depends on D157338
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157339
The global ctor for acc declare is treated differently when the variable
is a descriptor. The descriptor is implicitly copied in.
An additional registering function will be generated to deal with
the data pointer when the data is actually allocated. This will come in
a follow-up patch.
The descriptor is not a user-visible detail but an implementation detail.
The intent for declare is that the lifetime is implicitly managed and the
data must be on the device. Since the descriptor holds a pointer to the data,
it makes sense to also make it available on the device at the same time.
Copyin is used because it contains relevant details about the data such
as bounds.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157338
Lower the copyout clause for the OpenACC declare directive.
Depends on D156738
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156824
Lower the present clause on the OpenACC declare construct in
function/subroutine.
Depends on D156572
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156721
Lower the create clause on the OpenACC declare construct in
function/subroutine.
Depends on D156568
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156572
This patch adds lowering for the exit part of the OpenACC declare construct
in function/subroutine.
Depends on D156560
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156568
This patch adds lowering for the entry part of the OpenACC declare construct
in function/subroutine. The exit part will come in a follow-up patch.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156560
The OpenACC 3.3 specification does not allow the `zero` modifier
on the `create` clause used with the declare directive.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156703
Fix the value of the structured attribute for entry operation in the
global constructor noted in D156353.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156481
This patch adds support to lower the link clause on the OpenACC
declare construct in a module declaration.
Depends on D156463
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156464
This patch adds support to lower the device_resident clause on the OpenACC
declare construct in a module declaration.
Depends on D156457
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156463
This patch adds support to lower the copyin clause on the OpenACC
declare construct in a module declaration.
Depends on D156353
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156457
Add the acc.global_dtor when lowering the OpenACC declare
construct.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156353
This patch adds the skeleton and the basic lowering for the OpenACC declare
construct when located in the module declaration. This patch only lowers the
create clause with or without modifier. Other clauses and global destructor
lowering will come in follow-up patches to keep this one small enough for
review.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156266