fir.call side effects are hard to describe in a useful way using
`MemoryEffectOpInterface` because it is impossible to list which memory
location a user procedure read/write without doing a data flow analysis
of its body (even PURE procedures may read from any module variable,
Fortran SIMPLE procedure from F2023 will allow that, but they are far
from common at that point).
Fortran language specifications allow the compiler to deduce
that a procedure call cannot access a variable in many cases
This patch leverages this to extend `fir::AliasAnalysis::getModRef` to
deal with fir.call.
This will allow implementing "array = array_function()" optimization in
a future patch.
This patch adds a flag to mark hlfir.declare of host variables that are
captured in some internal procedure.
It enables implementing a simple fir.call handling in
fir::AliasAnalysis::getModRef leveraging Fortran language specifications
and without a data flow analysis.
This will allow implementing an optimization for "array =
array_function()" where array storage is passed directly into the hidden
result argument to "array_function" when it can be proven that
arraY_function does not reference "array".
Captured host variables are very tricky because they may be accessed
indirectly in any calls if the internal procedure address was captured
via some global procedure pointer. Without flagging them, there is no
way around doing a complex inter procedural data flow analysis:
- checking that the call is not made to an internal procedure is not
enough because of the possibility of indirect calls made to internal
procedures inside the callee.
- checking that the current func.func has no internal procedure is not
enough because this would be invalid with inlining when an procedure
with internal procedures is inlined inside a procedure without internal
procedure.
getElementType() was missing from Sequence and Vector types. Did a
replace of the obvious places getEleTy() was used for these two types
and updated to use this name instead.
Co-authored-by: Scott Manley <scmanley@nvidia.com>
Descriptor for module variable with cuda attribute must be set with the
correct allocator index. This patch updates the embox operation used in
the global to carry the allocator index.
CUDA unified variable where set to use the same allocator than managed
variable. This patch adds a specific allocator for the unified
variables. Currently it will call the managed allocator underneath but
we want to have the flexibility to change that in the future.
Local descriptor for cuda allocatable need to be handled on host and
device. One solution is to duplicate the descriptor (one on the host and
one on the device) and keep them in sync or have the descriptor in
managed/unified memory so we don't to take care of any sync.
The second solution is probably the one we will implement. In order to
have more flexibility on how descriptor representing cuda allocatable
are allocated, this patch updates the lowering to use the cuf operations
alloc and free to managed them.
With `createUnallocatedBox` utility change from #96106 , the TODO for assumed-rank in entry
can simply be lifted and test is added.
The key is that a unallocated assumed-rank descriptor is created with
rank zero in the entry where an assumed-rank dummy from some other entry
do not appear as a dummy (the symbol must still be mapped to some valid
value because the symbol could be used in code that would be unreachable
at runtime, but that the compiler must still generate).
The frontend computes the necessary alignment for COMMON blocks but this
information is never carried over to the code generation and can lead to
segfault for COMMON block that requires a non default alignment.
This patch add an optional attribute on fir.global and carries over the
information.
Lower allocatable and pointers specification parts. Nothing special is
required to allocate the descriptor given they are required to be dummy
arguments, however, care must be taken with INTENT(OUT) to use the
runtime to deallocate them (inlined fir.embox + store is not possible).
Enable lowering of assumed-ranks in specification parts under a debug
flag. I am using a debug flag because many cryptic TODOs/issues may be
hit until more support is added. The development should not take too
long, so I want to stay away from the noise of adding an actual
experimental flag to flang-new.
The number of operations dedicated to CUF grew and where all still in
FIR. In order to have a better organization, the CUF operations,
attributes and code is moved into their specific dialect and files. CUF
dialect is tightly coupled with HLFIR/FIR and their types.
The CUF attributes are bundled into their own library since some
HLFIR/FIR operations depend on them and the CUF dialect depends on the
FIR types. Without having the attributes into a separate library there
would be a dependency cycle.
The lowering produces fir.dummy_scope operation if the current
function has dummy arguments. Each hlfir.declare generated
for a dummy argument is then using the result of fir.dummy_scope
as its dummy_scope operand. This is only done for HLFIR.
I was not able to find a reliable way to identify dummy symbols
in `genDeclareSymbol`, so I added a set of registered dummy symbols
that is alive during the variables instantiation for the current
function. The set is initialized during the mapping of the dummy
argument symbols to their MLIR values. It is reset right after
all variables are instantiated - this is done to avoid generating
hlfir.declare operations with dummy_scope for the clones of
the dummy symbols (e.g. this happens with OpenMP privatization).
If this can be done in a cleaner way, please advise.
Lower locals allocation of cuda device, managed and unified variables to
fir.cuda_alloc. Add fir.cuda_free in the function context finalization.
@vzakhari For some reason the PR #90526 has been closed when I merged PR
#90525. Just reopening one.
…ted. (#89998)" (#90250)
This partially reverts commit 7aedd7dc75.
This change removes calls to the deprecated member functions. It does
not mark the functions deprecated yet and does not disable the
deprecation warning in TypeSwitch. This seems to cause problems with
MSVC.
Automatic deallocation of allocatable that are cuda device variable must
use the fir.cuda_deallocate operation. This patch update the automatic
deallocation code generation to use this operation when the variable is
a cuda variable.
This patch has also the side effect to correctly call
`attachDeclarePostDeallocAction` for OpenACC declare variable on
automatic deallocation as well. Update the code in
`attachDeclarePostDeallocAction` so we do not attach on fir.result but
on the correct last op.
Automatic deallocation of allocatable that are cuda device variable must
use the fir.cuda_deallocate operation. This patch update the automatic
deallocation code generation to use this operation when the variable is
a cuda variable.
Fortran mandates "CHARACTER(1), VALUE" be passed as a C "char" in calls
to BIND(C) procedures (F'2023 18.3.7 (4)). Lowering passed them by
memory instead. Update call interface lowering code to pass them by
register. Fix related test and update it to use HLFIR.
The all one masks was not properly created for i128 types because
builder.createIntegerConstant ended-up truncating -1 to something
positive.
Add a builder.createAllOnesInteger/createMinusOneInteger helpers and use
them where createIntegerConstant(..., -1) was used.
Add an assert in createIntegerConstant to catch negative numbers for
i128 type.
This PR is to address `TODO(loc, "procedure pointer component default
initialization");`.
It handles default init for procedure pointer components in a derived
type that is 32 bytes or larger (Default init for smaller size type has
already been handled).
```
interface
subroutine sub()
end
end interface
type dt
real :: r1 = 5.0
procedure(real), pointer, nopass :: pp1 => null()
real, pointer :: rp1 => null()
procedure(), pointer, nopass :: pp2 => sub
end type
type(dt) :: dd1
end
```
Cray pointee symbols can be host associated from a module or host
procedure while the related cray pointer is not explicitly associated.
This caused the "not yet implemented: lowering symbol to HLFIR" to fire
when lowering a reference to the cray pointee and fetching the cray
pointer.
This patch:
- Ensures cray pointers are always instantiated when instantiating a
cray pointee.
- Fix internal procedure lowering to deal with cray pointee host
association like it does for pointers (the lowering strategy for cray
pointee is to create a pointer that is updated with the cray pointer
value before being fetched).
This should fix the bug reported in
https://github.com/llvm/llvm-project/issues/85420.
The current lowering did not handle sequence associated argument passed
by descriptor. This case is special because sequence association implies
that the actual and dummy argument need to to agree in rank and shape.
Usually, arguments that can be sequence associated are passed by raw
address, and the shape mistmatch is transparent. But there are three
cases of explicit and assumed-size arrays passed by descriptors:
- polymorphic arguments
- BIND(C) assumed-length arguments (F'2023 18.3.7 (5)).
- length parametrized derived types (TBD)
The callee side is expecting a descriptor containing the dummy rank and
shape. This was not the case. This patch fix that by evaluating the
dummy shape on the caller side using the interface (that has to be
available when arguments are passed by descriptors).
CUDA attribute are correctly propagated to the module file but were not
imported currently so they did not appear on the hlfir.declare and
fir.global operations for module variables.
The newly introduced `CUDAAttribute` is meant for CUDA attributes
associated with variable. In order to not clash with the future
attribute for function/subroutine, rename `CUDAAttribute` to
`CUDADataAttribute`.
Lower CUDA attribute for simple dummy argument. This is done in a
similar way than `TARGET`, `OPTIONAL` and so on.
This patch also move the `Fortran::common::CUDADataAttr` to
`fir::CUDAAttributeAttr` mapping to
`flang/include/flang/Optimizer/Support/Utils.h` so that it can be reused
where needed.
This is a first simple patch to introduce a new FIR attribute to carry
the CUDA variable attribute information to hlfir.declare and fir.declare
operations. It currently lowers this information for local variables.
The texture attribute is omitted since it is rejected by semantic and
will not make its way to MLIR.
This new attribute is added as optional attribute to the hlfir.declare
and fir.declare operations.
Runtime globals are compiler generated globals injected in user scopes.
They are never referred to directly in lowering code, we only need th
fur.global for them. Yet lowering was creating hlfir.declare for them in
module procedures. In modern fortran apps, this blows up the generated
IR for nothing (Types with dozens of components, type bound procedures
and parents can create in the order of 10 000 runtime info globals to
describe them, if there is a 100 module procedure, that is that is a few
million operations generated and processed in each pass for nothing).
Start implementing assumed-rank support as described in
https://github.com/llvm/llvm-project/blob/main/flang/docs/AssumedRank.md
This commit holds the minimal support for lowering calls to procedure
with assumed-rank arguments where the procedure implementation is done
in C.
The case for passing assumed-size to assumed-rank is left TODO since it
will be done a change in assumed-size lowering that is better done in
another patch.
Care is taken to set the lower bounds to zero when passing non allocatable no pointer as descriptor
to a BIND(C) procedure as required per 18.5.3 point 3. This was not done before while the requirements also applies to non assumed-rank descriptors. This change required special attention with IGNORE_TKR(t) to avoid emitting invalid fir.rebox operations (the actual argument type must be used in this case as the output type).
Implementation of Fortran procedure with assumed-rank arguments is still
TODO.
Currently lowering sets the extents of assumed-size array to "undef"
which was OK as long as the value was not expected to be read.
But when interfacing with the runtime and when passing assumed-size to
assumed-rank, this last extent may be read and must be -1 as specified
in the BIND(C) case in 18.5.3 point 5.
Set this value to -1, and update all the lowering code that was looking
for an undef defining op to identify assumed-size: much safer to
propagate and use semantic info here, the previous check actually did
not work if the array was used in an internal procedure (defining op not
visible anymore).
@clementval and @agozillon, I left assumed-size extent to zero in the
acc/omp bounds op as it was, please double check that is what you want
(I can imagine -1 may create troubles here, and 0 makes some sense as it
would lead to no data transfer).
This also allows removing special cases in UBOUND/LBOUND lowering.
Also disable allocation of cray pointee. This was never intended and
would now lead to crashes with the -1 value for assumed-size cray
pointee.
Lower initialized BIND(C) module variable as regular module variable,
except that the fir.global symbol name is the binding label.
For uninitialized variables, add the common linkage so that C code may
define the variables. The standard does not provide a way to indicate
that a variable is defined in C, but there are use cases.
Beware that if the module file compiled object is added to a shared
library, the variable will become a regular global definition and may
override the C variable depending on the linking order.
Lowering was instantiating component symbols (but the last) in initial
target designator as if they were whole objects, leading to collisions
and bugs.
Fixes https://github.com/llvm/llvm-project/issues/75728
VALUE derived type are passed by reference outside of BIND(C) interface.
The ABI is much simpler and it is possible for these arguments to have
the OPTIONAL attribute.
In the BIND(C) context, these arguments must follow the C ABI for
struct, which may lead the data to be passed in register. OPTIONAL is
also forbidden for those arguments, so it is safe to directly use the
fir.type<T> type for the func.func argument.
Codegen is in charge of later applying the C passing ABI according to
the target (https://github.com/llvm/llvm-project/pull/74829).
The function `genCommonBlockMember` is not specific to OpenMP, and it
could very well be a common utility. Move it to ConvertVariable.cpp
where it logically belongs.
**Scope of the PR:**
1. Lowering global and local procedure pointer declaration statement
with explicit or implicit interface. The explicit interface can be from
an interface block, a module procedure or an internal procedure.
2. Lowering procedure pointer assignment, where the target procedure
could be external, module or internal procedures.
3. Lowering reference to procedure pointers so that it works end to end.
**PR notes:**
1. The first commit of the PR does not include testing. I would like to
collect some comments first, which may alter the output. Once I confirm
the implementation, I will add some testing as a follow up commit to
this PR.
2. No special handling of the host-associated entities when an internal
procedure is the target of a procedure pointer assignment in this PR.
**Implementation notes:**
1. The implementation is using the HLFIR path.
2. Flang currently uses `getUntypedBoxProcType` to get the
`fir::BoxProcType` for `ProcedureDesignator` when getting the address of
a procedure in order to pass it as an actual argument. This PR inherits
the same design decision for procedure pointer as the `fir::StoreOp`
requires the same memory type.
Note: this commit is actually resubmitting the original commit from
PR #70461 that was reverted. See PR #73221.
**Scope of the PR:**
1. Lowering global and local procedure pointer declaration statement
with explicit or implicit interface. The explicit interface can be from
an interface block, a module procedure or an internal procedure.
2. Lowering procedure pointer assignment, where the target procedure
could be external, module or internal procedures.
3. Lowering reference to procedure pointers so that it works end to end.
**PR notes:**
1. The first commit of the PR does not include testing. I would like to
collect some comments first, which may alter the output. Once I confirm
the implementation, I will add some testing as a follow up commit to
this PR.
2. No special handling of the host-associated entities when an internal
procedure is the target of a procedure pointer assignment in this PR.
**Implementation notes:**
1. The implementation is using the HLFIR path.
2. Flang currently uses `getUntypedBoxProcType` to get the
`fir::BoxProcType` for `ProcedureDesignator` when getting the address of
a procedure in order to pass it as an actual argument. This PR inherits
the same design decision for procedure pointer as the `fir::StoreOp`
requires the same memory type.