The number of operations dedicated to CUF grew and where all still in
FIR. In order to have a better organization, the CUF operations,
attributes and code is moved into their specific dialect and files. CUF
dialect is tightly coupled with HLFIR/FIR and their types.
The CUF attributes are bundled into their own library since some
HLFIR/FIR operations depend on them and the CUF dialect depends on the
FIR types. Without having the attributes into a separate library there
would be a dependency cycle.
The lowering produces fir.dummy_scope operation if the current
function has dummy arguments. Each hlfir.declare generated
for a dummy argument is then using the result of fir.dummy_scope
as its dummy_scope operand. This is only done for HLFIR.
I was not able to find a reliable way to identify dummy symbols
in `genDeclareSymbol`, so I added a set of registered dummy symbols
that is alive during the variables instantiation for the current
function. The set is initialized during the mapping of the dummy
argument symbols to their MLIR values. It is reset right after
all variables are instantiated - this is done to avoid generating
hlfir.declare operations with dummy_scope for the clones of
the dummy symbols (e.g. this happens with OpenMP privatization).
If this can be done in a cleaner way, please advise.
Lower locals allocation of cuda device, managed and unified variables to
fir.cuda_alloc. Add fir.cuda_free in the function context finalization.
@vzakhari For some reason the PR #90526 has been closed when I merged PR
#90525. Just reopening one.
…ted. (#89998)" (#90250)
This partially reverts commit 7aedd7dc75.
This change removes calls to the deprecated member functions. It does
not mark the functions deprecated yet and does not disable the
deprecation warning in TypeSwitch. This seems to cause problems with
MSVC.
Automatic deallocation of allocatable that are cuda device variable must
use the fir.cuda_deallocate operation. This patch update the automatic
deallocation code generation to use this operation when the variable is
a cuda variable.
This patch has also the side effect to correctly call
`attachDeclarePostDeallocAction` for OpenACC declare variable on
automatic deallocation as well. Update the code in
`attachDeclarePostDeallocAction` so we do not attach on fir.result but
on the correct last op.
Automatic deallocation of allocatable that are cuda device variable must
use the fir.cuda_deallocate operation. This patch update the automatic
deallocation code generation to use this operation when the variable is
a cuda variable.
Fortran mandates "CHARACTER(1), VALUE" be passed as a C "char" in calls
to BIND(C) procedures (F'2023 18.3.7 (4)). Lowering passed them by
memory instead. Update call interface lowering code to pass them by
register. Fix related test and update it to use HLFIR.
The all one masks was not properly created for i128 types because
builder.createIntegerConstant ended-up truncating -1 to something
positive.
Add a builder.createAllOnesInteger/createMinusOneInteger helpers and use
them where createIntegerConstant(..., -1) was used.
Add an assert in createIntegerConstant to catch negative numbers for
i128 type.
This PR is to address `TODO(loc, "procedure pointer component default
initialization");`.
It handles default init for procedure pointer components in a derived
type that is 32 bytes or larger (Default init for smaller size type has
already been handled).
```
interface
subroutine sub()
end
end interface
type dt
real :: r1 = 5.0
procedure(real), pointer, nopass :: pp1 => null()
real, pointer :: rp1 => null()
procedure(), pointer, nopass :: pp2 => sub
end type
type(dt) :: dd1
end
```
Cray pointee symbols can be host associated from a module or host
procedure while the related cray pointer is not explicitly associated.
This caused the "not yet implemented: lowering symbol to HLFIR" to fire
when lowering a reference to the cray pointee and fetching the cray
pointer.
This patch:
- Ensures cray pointers are always instantiated when instantiating a
cray pointee.
- Fix internal procedure lowering to deal with cray pointee host
association like it does for pointers (the lowering strategy for cray
pointee is to create a pointer that is updated with the cray pointer
value before being fetched).
This should fix the bug reported in
https://github.com/llvm/llvm-project/issues/85420.
The current lowering did not handle sequence associated argument passed
by descriptor. This case is special because sequence association implies
that the actual and dummy argument need to to agree in rank and shape.
Usually, arguments that can be sequence associated are passed by raw
address, and the shape mistmatch is transparent. But there are three
cases of explicit and assumed-size arrays passed by descriptors:
- polymorphic arguments
- BIND(C) assumed-length arguments (F'2023 18.3.7 (5)).
- length parametrized derived types (TBD)
The callee side is expecting a descriptor containing the dummy rank and
shape. This was not the case. This patch fix that by evaluating the
dummy shape on the caller side using the interface (that has to be
available when arguments are passed by descriptors).
CUDA attribute are correctly propagated to the module file but were not
imported currently so they did not appear on the hlfir.declare and
fir.global operations for module variables.
The newly introduced `CUDAAttribute` is meant for CUDA attributes
associated with variable. In order to not clash with the future
attribute for function/subroutine, rename `CUDAAttribute` to
`CUDADataAttribute`.
Lower CUDA attribute for simple dummy argument. This is done in a
similar way than `TARGET`, `OPTIONAL` and so on.
This patch also move the `Fortran::common::CUDADataAttr` to
`fir::CUDAAttributeAttr` mapping to
`flang/include/flang/Optimizer/Support/Utils.h` so that it can be reused
where needed.
This is a first simple patch to introduce a new FIR attribute to carry
the CUDA variable attribute information to hlfir.declare and fir.declare
operations. It currently lowers this information for local variables.
The texture attribute is omitted since it is rejected by semantic and
will not make its way to MLIR.
This new attribute is added as optional attribute to the hlfir.declare
and fir.declare operations.
Runtime globals are compiler generated globals injected in user scopes.
They are never referred to directly in lowering code, we only need th
fur.global for them. Yet lowering was creating hlfir.declare for them in
module procedures. In modern fortran apps, this blows up the generated
IR for nothing (Types with dozens of components, type bound procedures
and parents can create in the order of 10 000 runtime info globals to
describe them, if there is a 100 module procedure, that is that is a few
million operations generated and processed in each pass for nothing).
Start implementing assumed-rank support as described in
https://github.com/llvm/llvm-project/blob/main/flang/docs/AssumedRank.md
This commit holds the minimal support for lowering calls to procedure
with assumed-rank arguments where the procedure implementation is done
in C.
The case for passing assumed-size to assumed-rank is left TODO since it
will be done a change in assumed-size lowering that is better done in
another patch.
Care is taken to set the lower bounds to zero when passing non allocatable no pointer as descriptor
to a BIND(C) procedure as required per 18.5.3 point 3. This was not done before while the requirements also applies to non assumed-rank descriptors. This change required special attention with IGNORE_TKR(t) to avoid emitting invalid fir.rebox operations (the actual argument type must be used in this case as the output type).
Implementation of Fortran procedure with assumed-rank arguments is still
TODO.
Currently lowering sets the extents of assumed-size array to "undef"
which was OK as long as the value was not expected to be read.
But when interfacing with the runtime and when passing assumed-size to
assumed-rank, this last extent may be read and must be -1 as specified
in the BIND(C) case in 18.5.3 point 5.
Set this value to -1, and update all the lowering code that was looking
for an undef defining op to identify assumed-size: much safer to
propagate and use semantic info here, the previous check actually did
not work if the array was used in an internal procedure (defining op not
visible anymore).
@clementval and @agozillon, I left assumed-size extent to zero in the
acc/omp bounds op as it was, please double check that is what you want
(I can imagine -1 may create troubles here, and 0 makes some sense as it
would lead to no data transfer).
This also allows removing special cases in UBOUND/LBOUND lowering.
Also disable allocation of cray pointee. This was never intended and
would now lead to crashes with the -1 value for assumed-size cray
pointee.
Lower initialized BIND(C) module variable as regular module variable,
except that the fir.global symbol name is the binding label.
For uninitialized variables, add the common linkage so that C code may
define the variables. The standard does not provide a way to indicate
that a variable is defined in C, but there are use cases.
Beware that if the module file compiled object is added to a shared
library, the variable will become a regular global definition and may
override the C variable depending on the linking order.
Lowering was instantiating component symbols (but the last) in initial
target designator as if they were whole objects, leading to collisions
and bugs.
Fixes https://github.com/llvm/llvm-project/issues/75728
VALUE derived type are passed by reference outside of BIND(C) interface.
The ABI is much simpler and it is possible for these arguments to have
the OPTIONAL attribute.
In the BIND(C) context, these arguments must follow the C ABI for
struct, which may lead the data to be passed in register. OPTIONAL is
also forbidden for those arguments, so it is safe to directly use the
fir.type<T> type for the func.func argument.
Codegen is in charge of later applying the C passing ABI according to
the target (https://github.com/llvm/llvm-project/pull/74829).
The function `genCommonBlockMember` is not specific to OpenMP, and it
could very well be a common utility. Move it to ConvertVariable.cpp
where it logically belongs.
**Scope of the PR:**
1. Lowering global and local procedure pointer declaration statement
with explicit or implicit interface. The explicit interface can be from
an interface block, a module procedure or an internal procedure.
2. Lowering procedure pointer assignment, where the target procedure
could be external, module or internal procedures.
3. Lowering reference to procedure pointers so that it works end to end.
**PR notes:**
1. The first commit of the PR does not include testing. I would like to
collect some comments first, which may alter the output. Once I confirm
the implementation, I will add some testing as a follow up commit to
this PR.
2. No special handling of the host-associated entities when an internal
procedure is the target of a procedure pointer assignment in this PR.
**Implementation notes:**
1. The implementation is using the HLFIR path.
2. Flang currently uses `getUntypedBoxProcType` to get the
`fir::BoxProcType` for `ProcedureDesignator` when getting the address of
a procedure in order to pass it as an actual argument. This PR inherits
the same design decision for procedure pointer as the `fir::StoreOp`
requires the same memory type.
Note: this commit is actually resubmitting the original commit from
PR #70461 that was reverted. See PR #73221.
**Scope of the PR:**
1. Lowering global and local procedure pointer declaration statement
with explicit or implicit interface. The explicit interface can be from
an interface block, a module procedure or an internal procedure.
2. Lowering procedure pointer assignment, where the target procedure
could be external, module or internal procedures.
3. Lowering reference to procedure pointers so that it works end to end.
**PR notes:**
1. The first commit of the PR does not include testing. I would like to
collect some comments first, which may alter the output. Once I confirm
the implementation, I will add some testing as a follow up commit to
this PR.
2. No special handling of the host-associated entities when an internal
procedure is the target of a procedure pointer assignment in this PR.
**Implementation notes:**
1. The implementation is using the HLFIR path.
2. Flang currently uses `getUntypedBoxProcType` to get the
`fir::BoxProcType` for `ProcedureDesignator` when getting the address of
a procedure in order to pass it as an actual argument. This PR inherits
the same design decision for procedure pointer as the `fir::StoreOp`
requires the same memory type.
This change is required for hlfir.declares of host-associated symbols in
the OpenMP regions.
Added A FIXME to correctly use the symbol attributes for VOLATILE and
ASYNCHRONOUS.
Type extension is currently handled in FIR by inlining the parents
components as the first member of the record type.
This is not correct from a memory layout point of view since the storage
size of the parent type may be bigger than the sum of the size of its
component (due to alignment requirement). To avoid making FIR types
target dependent and fix this issue, make the parent component a single
component with the parent type at the beginning of the record type.
This also simplifies addressing since parent component is now a "normal"
component that can be designated with hlfir.designate.
StructureComponent lowering however is a bit more complex since the
symbols in the structure component may refer to subcomponents of parent
types.
Notes:
1. The fix is only done in HLFIR for now, a similar fix should be done
in ConvertExpr.cpp to fix the path without HLFIR (I will likely still do
it in a new patch since it would be an annoying bug to investigate for
people testing flang without HLFIR).
2. The private component extra mangling is useless after this patch. I
will remove it after 1.
3. The "parent component" TODO in constant CTOR is free to implement for
HLFIR after this patch, but I would rather remove it and test it in a
different patch.
Non POINTER/ALLOCATABLE INTENT(OUT) dummy arguments with allocatable
components were reset without a proper deallocation if needed. Add a
call to Destroy runtime to deallocate the components on entry.
Notes:
1. The same logic is not needed on the callee side of BIND(C) call
because BIND(C) arguments cannot be derived type with allocatable
components (C1806).
2. When the argument is an INTENT(OUT) polymorphic, the dynamic type of
the actual may contain allocatable components. This case is covered by
the call to Destroy that uses dynamic type and was already inserted for
INTENT(OUT) polymorphic dummies.
When calling a statement function with a character actual argument with
a constant length mismatching the dummy length, HLFIR lowering created
an hlfir.declare with the actual argument length for the dummy, causing
bugs when lowering the statement function expression.
Ensure character dummies are always cast to the dummy type when lowering
dummy declarations.
Fixes https://github.com/llvm/llvm-project/issues/67658
The bug was that when instantiating a character array result variable,
the code inserted a cast from the result buffer to the proper array type
if it could see an fir.unboxchar op. But this is wrong for results and
on caller side because the fir.emboxchar is visible so,
charHelper.genUnboxChar() just takes the operand from that instead of
generating an unboxchar.
The fix is simply to move the cast at the place where fir.boxchar<>
argument are dealt with. The cast when creating fir.emboxchar is also
removed: it adds noise and causes constant length result type to be
lowered to fir.char<?>.
The main change from this patch is to deal with the lit test fallout of
this cast move and removal.
Follow up up of https://github.com/llvm/llvm-project/pull/67693
- Zero initialize uninitialized components of saved derived type entity
with a default initial value.
- Zero initialize uninitialized storage of common blocks with a member
with an initial value.
- Zero initialized uninitialized saved equivalence
This removes all the cases where fir.global are created with an initial
value that results in an undef in LLVM for part of the global, leading
in surprising LLVM optimizations at -O2 for Fortran folks that expects
there saved variables to be zero initialized if there is no explicit or
default initial value.
This is not standard but is vastly expected by existing code.
This was implemented by https://reviews.llvm.org/D149877 for simple
scalars, but MLIR lacked a generic way to deal with aggregate types
(arrays and derived type).
Support was recently added in
https://github.com/llvm/llvm-project/pull/65508. Leverage it to zero
initialize all types.
Implement automatic deallocation of unsaved local alloctables when
reaching the end of their scope of block as described in Fortran 2018
9.7.3.2 point 2. and 3.
Uses genDeallocateIfAllocated used for intent(out) deallocation and the
"function context" already used for finalization at end of scope.
There are currently several places that automatically deallocate
allocatble if they are allocated:
- INTENT(OUT) allocatable are deallocated on entry in the callee
- INTENT(OUT) allocatable are also deallocated on the caller side of
BIND(C) function in case the implementation is in C.
- Results of function returning allocatable are deallocated after usage.
- OPENMP privatized allocatable are deallocated at the end of OPENMP
region.
Introduce genDeallocateIfAllocated that centralize all this code, except
for the function return that use genFreememIfAllocated since
finalization is done separately currently.
`fir::factory::genFinalization` and
`fir::factory::genInlinedDeallocation` are removed and replaced by
genFreemem since their name were misleading: finalization was not
called.
There is a fallout in the tests because previous generated code did not
check the allocated status when doing inline deallocation. This was OK
since free(null) is guaranteed to be a no-op, but this makes compiler
code more complex, is a bit surprising in the generated IR IMHO, and it
relied on knowing when genDeallocateBox inserts runtime calls or uses
inlined code.
This patch adds `host_assoc` attribute for operations that implement
FortranVariableInterface (e.g. `hlfir.declare`). The attribute is used
by the alias analysis to make better conclusions about memory overlap.
For example, a dummy argument of an inner subroutine and a host's
variable used inside the inner subroutine cannot refer to the same
object (if the dummy argument does not satisify exceptions in F2018
15.5.2.13).
This closes a performance gap between HLFIR optimization pipeline
and FIR ArrayValueCopy for Polyhedron/nf.
The POINTER= and TARGET= arguments to the intrinsic function
ASSOCIATED() can be the results of references to functions that return
object pointers or procedure pointers. NULL() was working well but not
program-defined pointer-valued functions. Correct the validation of
ASSOCIATED() and extend the infrastructure used to detect and
characterize procedures and pointers.
It is possible for a derived type extending a type with private
components to define components with the same name as the private
components.
This was not properly handled by lowering where several fir.record type
component names could end-up being the same, leading to bad generated
code (only the first component was accessed via fir.field_index, leading
to bad generated code).
This patch handles the situation by adding the derived type mangled name
to private component.
Outside of BIND(C), assumed length character scalar and explicit shape
are passed by address + an extra length argument (fir.boxchar in FIR).
The standard mandates that they be passed via CFI descriptor in BIND(C)
interface (fir.box in FIR). This patch fix the handling for this case.
According to 7.5.6.3 point 3, finalization occurs when
> A nonpointer, nonallocatable object that is not a dummy argument or
function result is finalized immediately before it would become
undefined due to execution of a RETURN or END statement (19.6.6, item
(3)).
We were not calling the finalization on empty derived-type. There is no
such restriction so this patch updates the code so the finalization is
called for empty type as well.