hlfir.assign currently has the `MemoryEffects<[MemWrite]>` trait, which makes it
look like it can write to anything. This is good for some cases where
the assign effect cannot be precisely described through the MLIR side
effect API (e.g., when the LHS is a descriptor and it is not possible to
get an OpOperand describing the data address, or when derived types are
involved and finalization could be called, or user defined assignment
for some components). For the most common case of hlfir.assign on
intrinsic types without whole allocatable LHS, this is pessimistic.
This patch implements a finer description of the side effects when
possible, and also adds the proper read/allocate/free effects when
relevant.
The ultimate goal is to suppress the generation of a temporary for the LHS
address when dealing with an assignment to a vector subscripted LHS
where the vector subscript is an array constructor that does not refer
to the LHS (as in `x([a,b]) = y`).
Two more patches will follow to enable this.
This patch adds more precise side effects to the current ops with memory
effects, allowing us to determine which OpOperand/OpResult/BlockArgument
the operation reads or writes, rather than just recording the reading and
writing of values. This allows for convenient use of precise side effects
to achieve analysis and optimization.
Related discussions:
https://discourse.llvm.org/t/rfc-add-operandindex-to-sideeffect-instance/79243
The runtime API for copy-in copy-out currently has an entry only
for the copy-out. This entry has a "skipInit" boolean that is never set
to false by lowering and it does not deal with the deallocation of the
temporary.
The generated code was a mix of inline code and runtime calls. This is not a big deal,
but this is unneeded compiler and generated code complexity.
With assumed-rank, it is also more cumbersome to establish a
temporary descriptor.
Instead, this patch:
- Adds a CopyInAssignment API that deals with establishing the temporary
descriptor and does the copy.
- Removes unused arg to CopyOutAssign, and pushes
destruction/deallocation responsibility inside it.
Note that these runtime APIs are still not responsible for deciding the
need of copying-in and out. This is kept as a separate runtime call to
IsContiguous, which is easier to inline/replace by inline code with the
hope of removing the copy-in/out calls after user function inlining.
@vzakhari has already shown that always inlining all the copy part
increases Fortran compilation time due to loop optimization attempts for
loops that are known to have little optimization profitability (the
variable being copied from and to is not contiguous).
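The split between the contiguity check and the copy-in/copy-out calls can be pictured with a toy model. This is only a sketch: `call_with_section` and `double_in_place` are hypothetical names, the section is modeled as a Python list slice, and the real runtime works on descriptors rather than lists.

```python
def double_in_place(values):
    """Stand-in for a callee that requires a contiguous argument."""
    for i in range(len(values)):
        values[i] *= 2


def call_with_section(storage, stride, callee):
    """Pass storage[::stride] to callee; return True if a temporary was used."""
    if stride == 1:
        # The contiguity check is done by the caller, so the contiguous
        # case passes the variable directly and skips both copies.
        callee(storage)
        return False
    temp = storage[::stride]      # copy-in: build the contiguous temporary
    callee(temp)
    storage[::stride] = temp      # copy-out: write back; temp is then dropped
    return True
```

Keeping the check outside the copy routines means the copy calls are only reached on the non-contiguous path, which is the path where inlined copy loops would have little optimization benefit.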
The number of operations dedicated to CUF grew and were all still in
FIR. In order to have a better organization, the CUF operations,
attributes and code are moved into their specific dialect and files. CUF
dialect is tightly coupled with HLFIR/FIR and their types.
The CUF attributes are bundled into their own library since some
HLFIR/FIR operations depend on them and the CUF dialect depends on the
FIR types. Without having the attributes in a separate library there
would be a dependency cycle.
The lowering produces a fir.dummy_scope operation if the current
function has dummy arguments. Each hlfir.declare generated
for a dummy argument is then using the result of fir.dummy_scope
as its dummy_scope operand. This is only done for HLFIR.
I was not able to find a reliable way to identify dummy symbols
in `genDeclareSymbol`, so I added a set of registered dummy symbols
that is alive during the variables instantiation for the current
function. The set is initialized during the mapping of the dummy
argument symbols to their MLIR values. It is reset right after
all variables are instantiated - this is done to avoid generating
hlfir.declare operations with dummy_scope for the clones of
the dummy symbols (e.g. this happens with OpenMP privatization).
If this can be done in a cleaner way, please advise.
The new operation is just an abstract attribute that is attached to
[hl]fir.declare operations of dummy arguments of a subroutine.
Dummy arguments of the same subroutine refer to the same
fir.dummy_scope, so they can be recognized as such during FIR AliasAnalysis.
Note that the fir.dummy_scope must be specific to the runtime
instantiation of a subroutine, so any MLIR inlining/cloning should duplicate and
unique it vs using the same fir.dummy_scope for different runtime instantiations.
This is why I made it an operation rather than an attribute.
The new operation uses a write effect on DebuggingResource, same as
[hl]fir.declare, to avoid optimizing it away.
…ted. (#89998)" (#90250)
This partially reverts commit 7aedd7dc75.
This change removes calls to the deprecated member functions. It does
not mark the functions deprecated yet and does not disable the
deprecation warning in TypeSwitch. This seems to cause problems with
MSVC.
A mold argument needs to be added to the hlfir.elemental_addr and set in
lowering so that when the hlfir.elemental_addr needs to be turned into an
hlfir.elemental operation because the designator must be turned into a
value, the mold can be set on the hlfir.elemental to later allocate the
temporary according to the dynamic type.
This situation happens whenever the vector subscripted polymorphic
designator does not appear as an assignment left-hand side, or as an
IO-input item.
I initially thought retrieving the mold would be tricky if the dynamic
type of the designator was set by a part-ref on the right of the vector
subscripts ("array(vector)%polymorphic_comp"), but this turned out to be
impossible because:
1. A derived type component can be polymorphic only if it has the
POINTER or ALLOCATABLE attribute (F2023 C708).
2. Vector-subscripted parts are ranked and F2023 C919 prohibits any
part-ref on the right of the ranked part from having the POINTER or
ALLOCATABLE attribute.
=> If a vector subscripted designator is polymorphic, the vector
subscripted part is the rightmost part, and the mold is the base of the
vector subscripted part. This makes the retrieval of the mold easy in
lowering. The mold argument is always set to be the base of the vector
subscripted part when lowering the vector subscripted part, and it is
removed at the end of the designator lowering if the designator is not
polymorphic. This way there is no need to find back the mold from the
inside of the hlfir.elemental_addr body.
Rewriting an op can invalidate the range of users being iterated on.
Store the users in a separate list, and iterate over the list instead.
This was detected by address sanitizer.
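The fix follows a general pattern that can be sketched outside MLIR. The toy below is an illustration only: `Value`, `erase_user`, and both loop helpers are hypothetical, and Python silently skips elements where the C++ code would touch freed memory.

```python
class Value:
    """Toy SSA value that tracks the operations using it."""

    def __init__(self, users):
        self.users = list(users)  # shrinks whenever a user is rewritten


def erase_user(value, user, log):
    """Stand-in for a rewrite that unlinks `user` from `value`."""
    value.users.remove(user)
    log.append(user)


def rewrite_users_unsafe(value, log):
    # Iterates the live list while it is being mutated: users get skipped.
    for user in value.users:
        erase_user(value, user, log)


def rewrite_users_safe(value, log):
    # Snapshot the users first, then iterate over the snapshot.
    for user in list(value.users):
        erase_user(value, user, log)


unsafe_log, safe_log = [], []
unsafe_value = Value(["load0", "load1", "load2", "load3"])
safe_value = Value(["load0", "load1", "load2", "load3"])
rewrite_users_unsafe(unsafe_value, unsafe_log)
rewrite_users_safe(safe_value, safe_log)
```

With the snapshot, all four users are rewritten; without it, every other user is missed in this toy model.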
The newly introduced `CUDAAttribute` is meant for CUDA attributes
associated with variables. In order to not clash with the future
attribute for function/subroutine, rename `CUDAAttribute` to
`CUDADataAttribute`.
This is a first simple patch to introduce a new FIR attribute to carry
the CUDA variable attribute information to hlfir.declare and fir.declare
operations. It currently lowers this information for local variables.
The texture attribute is omitted since it is rejected by semantics and
will not make its way to MLIR.
This new attribute is added as an optional attribute to the hlfir.declare
and fir.declare operations.
The verifiers are currently very strict: requiring intrinsic operations
to be used only in cases where the Fortran standard permits the
intrinsic to be used.
There have now been a lot of cases where these verifiers have caused
bugs in corner cases. In a recent ticket, @jeanPerier pointed out that
it could be useful for future optimizations if somewhat invalid uses of
these operations could be allowed in dead code. See this comment:
https://github.com/llvm/llvm-project/issues/79995#issuecomment-1918118234
In response to all of this, I have decided to relax the intrinsic
operation verifiers. The intention is now to only disallow operation
uses that are likely to crash the compiler. Other checks are still
available under `-strict-intrinsic-verifier`.
The disadvantage of this approach is that IR can now represent intrinsic
invocations which are incorrect. The lowering and implementation of
these intrinsic functions is unlikely to do the right thing in all of
these cases, and as they should mostly be impossible to generate using
normal Fortran code, these edge cases will see very little testing,
before some new optimization causes them to become more common.
Fixes #79995
This adds an hlfir minloc intrinsic, similar to the minval intrinsic
already added, to help in the lowering of minloc. The idea is to later
add maxloc too, and from there add a simplification for producing minloc
with inlined elemental and hopefully fewer temporaries.
This got "lost" in the HLFIR transformation. This patch applies the old
attribute to the AssociateOp that needs it, and forwards it to the
AllocaOp that is generated when lowering to FIR.
This set of commits resolves some of the issues with elemental calls producing
results that may require finalization, and also some memory leak issues due to
the missing deallocation of allocatable components of the temporary buffers
created by the bufferization pass.
- [flang][runtime] Expose Finalize API for derived types.
- [flang][hlfir] Add 'finalize' attribute for DestroyOp.
- [flang][hlfir] Postpone result finalization for elemental calls.
The results of elemental calls generated inside hlfir.elemental must not
be finalized/destructed before they are copied into the resulting
array. The finalization must be done on the array as a whole
(e.g. there might be different scalar and array finalization routines).
The finalization work is left to the hlfir.destroy corresponding
to this hlfir.elemental.
- [flang][hlfir] Tighten requirements on hlfir.end_associate operand.
If component deallocation might be required for the operand of
hlfir.end_associate, we have to be able to get the variable
shape/params to create a descriptor for calling the runtime.
This commit adds verification that we can do so.
- [flang][hlfir] Lower argument clean-ups using valid hlfir.end_associate.
The operand must be a Fortran entity, when allocatable component
deallocation may be required.
- [flang][hlfir] Properly clean-up temporary buffers in bufferization pass.
This commit combines changes for proper finalization and component
deallocation of the temporary buffers. The finalization part
relates to hlfir.destroy operations with 'finalize' attribute.
The component deallocation might be invoked for both hlfir.destroy
and hlfir.end_associate, if the operand is of a derived type
with allocatable component(s).
The changes are mostly in one function, so I decided not to split them.
- [flang][hlfir] Disable optimizations for hlfir.elemental requiring finalization.
If hlfir.elemental is coupled with hlfir.destroy with 'finalize' attribute,
the temporary array result of hlfir.elemental needs to be created
for the purpose of finalization. We cannot do certain optimizations
on such hlfir.elemental operations.
I was not able to come up with a test for the OptimizedBufferization pass,
but I put the check there as well.
Anything that produces a hlfir.expr should have an allocation side
effect so that it is not removed by CSE (which would result in two
hlfir.destroy operations for the same expression). Similarly for
hlfir.associate, which has hlfir.end_associate.
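Why the allocation effect matters for CSE can be shown with a toy pass. This is a sketch under stated assumptions: ops are modeled as `(name, operands, allocates)` tuples and `toy_cse` is a hypothetical stand-in, not MLIR's CSE implementation.

```python
def toy_cse(ops):
    """Drop an op when an identical, side-effect-free op was already seen."""
    seen = set()
    kept = []
    for name, operands, allocates in ops:
        key = (name, operands)
        if not allocates and key in seen:
            continue  # merged into the earlier identical op
        seen.add(key)
        kept.append((name, operands, allocates))
    return kept


def count(ops, name):
    return sum(1 for op in ops if op[0] == name)


# Two identical expression producers, each paired with its own destroy.
without_effect = [("make_expr", ("%a",), False), ("make_expr", ("%a",), False)]
with_effect = [("make_expr", ("%a",), True), ("make_expr", ("%a",), True)]
```

In this model, the producers without the effect collapse into one while both destroys would remain; marking them as allocating keeps one producer per destroy.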
Also adds read effects on arguments which are pointer-like or boxes.
I see no regressions from this change when running llvm-testsuite with
optimization enabled, or from SPEC2017 rate benchmarks.
To test this, I have added MLIR's pass for testing side effect
interfaces to fir-opt.
Differential Revision: https://reviews.llvm.org/D158662
To properly create a temporary array for a polymorphic result
of hlfir.elemental we need to keep the mold as its operand.
This patch adds just the basic support.
Reviewed By: clementval, tblah
Differential Revision: https://reviews.llvm.org/D157315
This patch sets 'polymorphic' attribute of hlfir::ExprType when
the value is created from a polymorphic entity.
Memoization of such ExprType involves creating a mutable descriptor
on the stack, which is initialized (as a null box) and passed to
AllocatableApplyMold with the mold being the entity from which
the ExprType value is being created.
This patch fixes "creating polymorphic temporary" TODO and also
several cases of "'fir.convert' op invalid type conversion" error.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D155541
If the size of one of the operand arrays is not known at compile
time, do not issue a size mismatch error since they could match at
runtime.
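The rule can be sketched as a small predicate. The names are hypothetical and `None` stands in for an extent that is not known at compile time; this is not the verifier code itself.

```python
from typing import Optional

UNKNOWN = None  # hypothetical marker for an extent unknown at compile time


def has_size_mismatch(lhs: Optional[int], rhs: Optional[int]) -> bool:
    """Report a mismatch only when both extents are compile-time constants."""
    if lhs is UNKNOWN or rhs is UNKNOWN:
        return False  # the sizes could still match at runtime
    return lhs != rhs
```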
Fixes the compilation error in polyhedron/induct2.
Reviewed By: tblah, vzakhari
Differential Revision: https://reviews.llvm.org/D155302
The problem appeared as a segfault for a case like this:
```
type t
character(11), allocatable :: c
end type
character(12), allocatable :: x
type(t) y
y = t(x)
```
The frontend represents `y = t(x)` as `y=t(c=%SET_LENGTH(x,11_8))`.
When 'x' is unallocated the hlfir.set_length lowering results in
a segfault. It could probably be handled in hlfir.set_length lowering
by using NULL base for the hlfir.declare depending on the allocation
status of 'x', but I am not sure if !hlfir.expr, in general, is supposed
to represent an expression created from unallocated allocatable.
I believe in Fortran that would mean referencing an unallocated
allocatable, which is not allowed.
I decided to special case `SET_LENGTH` in structure constructor,
so that we use its 'x' operand as the RHS for the assign operation
implying the isAllocatable check for cases when 'x' is allocatable.
This requires setting keep_lhs_length_if_realloc flag for the assign
operation. Note that when the component being initialized has
deferred length the frontend does not produce `SET_LENGTH`.
Differential Revision: https://reviews.llvm.org/D155151
We will use hlfir.get_length to lower inquiries of char length
applied to hlfir.expr character values.
Reviewed By: tblah, jeanPerier
Differential Revision: https://reviews.llvm.org/D154560
This patch adds 'unordered' attribute handling to the HLFIR elementals'
builders and fixes the attribute handling in lowering and transformations.
Depends on D154031, D154032
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154035
This patch adds support for vector subscripted assignment left-hand
side. It does not yet add support for the cases where the LHS must be
saved because its evaluation could be impacted by the assignment.
The implementation adds an hlfir::ElementalOpInterface to share the
elemental inlining utility and some other tools between
hlfir::ElementalOp and hlfir::ElementalAddrOp.
It adds generateYieldedLHS() to allow retrieving the LHS value
in lowering, whether or not it is vector subscripted. If it is vector
subscripted, this utility creates a loop nest iterating over the
elements and returns the address of an element.
Differential Revision: https://reviews.llvm.org/D153759
When `AssignOp` is used with an LHS that is a compiler generated temporary,
special care must be taken to initialize the temporary and avoid
finalizations of its components. This change-set adds optional
`temporary_lhs` attribute for `AssignOp` to convey this information
to HLFIR-to-FIR conversion pass. Currently, this results in
calling `AssignTemporary` runtime for doing the assignment.
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D152482
This patch adds an hlfir operation called `char_extremum`, which takes the
lexicographic comparison between a variadic number (minimum of 2 arguments) of
characters.
Discussion for this work can be found in the draft revision found
[here](https://reviews.llvm.org/D143326). The reason I'm not promoting that draft to
a true patch for review is that I needed to separate out the op
definition/codegen and lowering as two separate patches, as preferred by
@jeanPerier.
Differential Revision: https://reviews.llvm.org/D152474
Lower user defined assignment inside the hlfir.region_assign
"userDefinedAssignment" mlir region.
This is done by adding an entry point to ConvertCall.h in order
to call genUserCall with the region block arguments as arguments.
The codegen for hlfir.region_assign with user defined assignment
will be added in a later patch.
Differential Revision: https://reviews.llvm.org/D153404
Adds a new HLFIR operation for the COUNT intrinsic according to
the design set out in flang/docs/HighLevel.md. This patch includes all
the necessary changes to create a new HLFIR operation and lower it into
the fir runtime call.
Author was @jacob-crawley. Minor adjustments by @tblah
Differential Revision: https://reviews.llvm.org/D152521
The verifiers for hlfir.matmul and hlfir.transpose try to ensure that
the shape of the result value makes sense given the shapes of the input
argument(s).
There are some cases in the gfortran tests where lowering knows a bit
more about shape information than (HL)FIR. I think the cases here will be
solved when hlfir.shape_meet is implemented.
But in the meantime, and to improve robustness, I've relaxed the
verifier to allow the return type to have more precise shape information
than can be deduced from the argument type(s).
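One way to picture the relaxed check for the transpose case is shown below. It is a sketch only: shapes are tuples with `None` for an unknown extent, both helpers are hypothetical, and rejecting a result that is less precise than the deduced shape is an assumption of this example rather than something the patch states.

```python
def deduced_transpose_shape(arg_shape):
    """Shape deduced for a rank-2 transpose from its argument type alone."""
    rows, cols = arg_shape
    return (cols, rows)


def result_shape_is_acceptable(deduced, actual):
    """Accept a result that agrees with every extent the verifier can deduce.

    Where the deduced extent is unknown (None), the result may carry a
    constant extent, i.e. more precise information than the argument type.
    """
    if len(deduced) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(deduced, actual))
```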
Differential Revision: https://reviews.llvm.org/D152254
Adds a new HLFIR operation for the DOT_PRODUCT intrinsic according to
the design set out in flang/docs/HighLevel.md. This patch includes all
the necessary changes to create a new HLFIR operation and lower it into
the fir runtime call.
Differential Revision: https://reviews.llvm.org/D152252
Adds a new HLFIR operation for the ALL intrinsic according to the
design set out in flang/docs/HighLevel.md
Differential Revision: https://reviews.llvm.org/D151090
It seems the canonicalization was not correct: it cannot return that
it failed if it did modify the IR.
This was exposed by a new MLIR sanity check added in
https://reviews.llvm.org/D144552.
I am not sure it is legit to return success if the operation being
canonicalized is not modified either. So only remove the loads if
they are the only uses of the forall_index.
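The contract described above (report success only if the IR changed, and change nothing when reporting failure) can be sketched with a toy pattern. The function and the list-of-strings model of uses are hypothetical and do not reflect the MLIR pattern API.

```python
def canonicalize_forall_index(uses):
    """Toy pattern: `uses` lists the kinds of ops using a forall_index.

    Returns True (success) only after rewriting, and leaves `uses`
    untouched whenever it returns False (failure).
    """
    if not uses or any(kind != "load" for kind in uses):
        return False  # nothing was modified, so failure is consistent
    uses.clear()      # every load is replaced by the index value
    return True


only_loads = ["load", "load"]
mixed = ["load", "store"]
first = canonicalize_forall_index(only_loads)
second = canonicalize_forall_index(mixed)
```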
Should fix (intermittent?) bot failures like
https://lab.llvm.org/buildbot/#/builders/179/builds/6251
since the new MLIR check was added.
Differential Revision: https://reviews.llvm.org/D151487
Comments in the recent patch https://reviews.llvm.org/D149964
mentioned that using hlfir_ExprType in cases where intrinsics
return simple scalars adds unnecessary abstraction that isn't
needed unless an array type is being used.
This patch modifies the HLFIR operations for product, sum and any
so that they only return a hlfir_ExprType when the result is an array,
otherwise they will return just the simple scalar type.
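The resulting rule can be sketched as a function from the argument rank and the presence of DIM to a result type string. This is a hypothetical helper for illustration; it assumes dynamic extents (`?`) for every result dimension.

```python
def reduction_result_type(elem_type, arg_rank, has_dim):
    """Result type of a sum/product/any-style reduction in this toy model."""
    if has_dim and arg_rank > 1:
        # Reducing along one dimension leaves an array of rank arg_rank - 1.
        return "!hlfir.expr<" + "?x" * (arg_rank - 1) + elem_type + ">"
    # Full reduction, or DIM on a rank-1 array: plain scalar type.
    return elem_type
```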
Differential Revision: https://reviews.llvm.org/D150877
Adds a HLFIR operation for the ANY intrinsic according to the
design set out in flang/docs/HighLevel.md
Differential Revision: https://reviews.llvm.org/D149964
This is the last piece required to lower Forall (except pointer
assignments, where an operation may be needed to deal with bounds
remapping).
Lowering requires symbols to be mapped to memory SSA values produced
by a fir_FortranVariableOpInterface operation. This applies to
forall index-values, that are symbols.
fir.alloca/fir.store/hlfir.declare are not allowed inside the body of
an hlfir.forall that only accepts operations with the
hlfir_OrderedAssignmentTreeOpInterface so that the forall structure is
well defined and easy to transform.
Allowing such operations in the forall body would open the doors to
generating ill-formed programs where such operations would be used for
non index-values.
Instead, add an hlfir.forall_index with both required interfaces to
produce a memory address for a forall index.
As a bonus, since forall index-values are by nature read-only, the
loads of hlfir.forall_index can be canonicalized, which will help
simplify the hlfir.forall nested code (it is unclear we will be
able to tell MLIR enough about hlfir.forall and hlfir.where structure
so that it could safely do a generic mem-to-reg inside it, and getting
rid of read-effect operations will benefit the forall rewrite pass).
Differential Revision: https://reviews.llvm.org/D149836
Adds a HLFIR operation for the PRODUCT intrinsic according to
the design set out in flang/doc/HighLevelFIR.md
Since the PRODUCT intrinsic is essentially identical to SUM
in terms of its arguments and result characteristics in the
Fortran Standard, the operation definition and subsequent
tests also take the same form.
Differential Revision: https://reviews.llvm.org/D147624
Add hlfir.forall_mask, hlfir.where, and hlfir.elsewhere operations,
which hold (optionally for hlfir.elsewhere) the
evaluation of a logical mask that controls the evaluation of nested
operations.
They allow representing Fortran forall control mask, as well as where
and elsewhere statements/constructs.
They use the OrderedAssignmentTreeOpInterface since they can all be used
inside Forall and their masks should be fully evaluated for all the
index-value set induced by parent Forall before any of the nested
operations in their body is evaluated.
I initially tried making them into a single operation with some attributes
to make a difference, but I felt this made the verifier/parser/printer and
usages messier/tricky compared to making three distinct operations that
represent the three Fortran features in a vanilla way.
Differential Revision: https://reviews.llvm.org/D149754
This patch adds the hlfir.forall operation and the
OrderedAssignmentTreeOpInterface that allows representing Fortran forall.
It uses regions to keep Fortran expression evaluation independent from
each other in the IR. Forall assignments inside hlfir.forall are
represented with hlfir.region_assign which also keeps the IR generated
for each expression independent.
The goal of this representation is to provide a representation that is
straightforward to generate from Fortran parse tree without any analysis, while
providing enough structure information so that an optimization pass can decide
how to schedule, and save if needed, the evaluations of the Forall and Where
expression and statements. It allows the data dependency analysis to be done at
the HLFIR level.
The OrderedAssignmentTreeOpInterface allows ensuring that the Forall/Where
tree structure is kept in the IR. It will allow visiting this tree in
the IR without hard coding the operation structures in the pass.
Differential Revision: https://reviews.llvm.org/D149734
hlfir.region_assign is a Region based version of hlfir.assign: the
right-hand side and left-hand-side are evaluated in their own region,
and an optional region can be added to implement user defined
assignment.
This will be used for:
- assignments inside where and forall
- user defined assignments
- assignments to vector subscripted entities.
Rationale:
Forall and Where lowering requires solving an expression/assignment
evaluation scheduling problem based on data dependencies between the
variables being assigned and the ones used in the expressions.
Keeping left-hand side and right-hand side in their own region will
make it really easy to analyse the dependency and move around the
expression evaluation as a whole. Operation DAGs are hard to scissor out
when the LHS and RHS evaluation are lowered in the same block. The pass
dealing with further forall/where lowering in HLFIR will need to
succeed. It is not acceptable for them to fail splitting the RHS/LHS
evaluation code. Keeping them in independent blocks is an approach that
cannot fail.
For user defined assignments, having a region allows implementing all
the call details in lowering, and even to allow inlining of the user
assignment, before it is decided if a temporary for the LHS or RHS is
required or not.
The operation description mentions "hlfir.elemental_addr" (operation that
will be used for vector subscripted LHS) and "ordered assignment trees"
(concept/interface that will be used to represent forall/where structure
in HLFIR). These will be pushed in follow-up patches, but I do not want to
scissor out the descriptions.
Differential Revision: https://reviews.llvm.org/D149442
This operation fetches an extent value from a fir.shape. The operation
could just as easily live in the fir namespace, but is only needed for
hlfir lowering so I put it here.
This operation is required to allow one to defer getting the extents of a shape
generated by hlfir.get_shape until after that shape has been resolved
(after bufferization of the hlfir.expr).
This operation will be lowered to FIR as an arith.constant created using
the definition of the fir.shape argument.
Depends on: D146830
Differential Revision: https://reviews.llvm.org/D148220