The all one masks was not properly created for i128 types because
builder.createIntegerConstant ended-up truncating -1 to something
positive.
Add a builder.createAllOnesInteger/createMinusOneInteger helpers and use
them where createIntegerConstant(..., -1) was used.
Add an assert in createIntegerConstant to catch negative numbers for
i128 type.
Whenever lowering is checking if a function or global already exists in
the mlir::Module, it was doing module->lookup.
On big programs (~5000 globals and functions), this causes important
slowdowns because these lookups are linear. Use mlir::SymbolTable to
speed-up these lookups. The SymbolTable has to be created from the
ModuleOp and maintained in sync. It is therefore placed in the
converter, and FirOPBuilders can take a pointer to it to speed-up the
lookups.
This patch does not bring mlir::SymbolTable to FIR/HLFIR passes, but
some passes creating a lot of runtime calls could benefit from it too.
More analysis will be needed.
As an example of the speed-ups, this patch speeds-up compilation of
Whizard compare_amplitude_UFO.F90 from 5 mins to 2 mins on my machine
(there is still room for speed-ups).
Advise to place the alloca at the start of the first block of whichever
region (init or combiner) we are currently inside.
It probably isn't safe to put an alloca inside of a combiner region
because this will be executed multiple times. But that would be a bug to
fix in Lower/OpenMP.cpp, not here.
OpenMP array reductions 1/6
Next PR: https://github.com/llvm/llvm-project/pull/84953
This will be useful for OpenMP too.
I changed the definition slightly to use `fir::isa_ref_type` (which also
includes llvm pointers) because I think it reads better using the common
type helpers. There shouldn't be any llvm pointers in lowering so this
isn't a functional change.
Internal procedures cannot be called directly from outside the host
procedure, so there is no point giving them external linkage. The only
reason flang did is because it is the default in MLIR.
Giving external linkage to them:
- prevents deleting them when not used/inlined by LLVM
- causes bugs with shared libraries (at least on linux x86-64) because
the call to the internal function could lead to a dynamic loader call
that would overwrite r10 register (the static chain pointer) due to
system calls and did not restore (it seems it does not expect r10 to be
used for PLT calls).
This patch gives internal linkage to internal procedures:
Note: the llvm.linkage attribute name cannot be obtained via a
getLinkageAttrName since it is not the same name as the one used in the
LLVM dialect. It is just a placeholder defined in
mlir/lib/Conversion/FuncToLLVM/FuncToLLVM.cpp until the func dialect
gets a real linkage model. So simply avoid hard coding it too many times
in lowering.
The newly introduced `CUDAAttribute` is meant for CUDA attributes
associated with variable. In order to not clash with the future
attribute for function/subroutine, rename `CUDAAttribute` to
`CUDADataAttribute`.
Start implementing assumed-rank support as described in
https://github.com/llvm/llvm-project/blob/main/flang/docs/AssumedRank.md
This commit holds the minimal support for lowering calls to procedure
with assumed-rank arguments where the procedure implementation is done
in C.
The case for passing assumed-size to assumed-rank is left TODO since it
will be done a change in assumed-size lowering that is better done in
another patch.
Care is taken to set the lower bounds to zero when passing non allocatable no pointer as descriptor
to a BIND(C) procedure as required per 18.5.3 point 3. This was not done before while the requirements also applies to non assumed-rank descriptors. This change required special attention with IGNORE_TKR(t) to avoid emitting invalid fir.rebox operations (the actual argument type must be used in this case as the output type).
Implementation of Fortran procedure with assumed-rank arguments is still
TODO.
**Scope of the PR:**
1. Lowering global and local procedure pointer declaration statement
with explicit or implicit interface. The explicit interface can be from
an interface block, a module procedure or an internal procedure.
2. Lowering procedure pointer assignment, where the target procedure
could be external, module or internal procedures.
3. Lowering reference to procedure pointers so that it works end to end.
**PR notes:**
1. The first commit of the PR does not include testing. I would like to
collect some comments first, which may alter the output. Once I confirm
the implementation, I will add some testing as a follow up commit to
this PR.
2. No special handling of the host-associated entities when an internal
procedure is the target of a procedure pointer assignment in this PR.
**Implementation notes:**
1. The implementation is using the HLFIR path.
2. Flang currently uses `getUntypedBoxProcType` to get the
`fir::BoxProcType` for `ProcedureDesignator` when getting the address of
a procedure in order to pass it as an actual argument. This PR inherits
the same design decision for procedure pointer as the `fir::StoreOp`
requires the same memory type.
Note: this commit is actually resubmitting the original commit from
PR #70461 that was reverted. See PR #73221.
**Scope of the PR:**
1. Lowering global and local procedure pointer declaration statement
with explicit or implicit interface. The explicit interface can be from
an interface block, a module procedure or an internal procedure.
2. Lowering procedure pointer assignment, where the target procedure
could be external, module or internal procedures.
3. Lowering reference to procedure pointers so that it works end to end.
**PR notes:**
1. The first commit of the PR does not include testing. I would like to
collect some comments first, which may alter the output. Once I confirm
the implementation, I will add some testing as a follow up commit to
this PR.
2. No special handling of the host-associated entities when an internal
procedure is the target of a procedure pointer assignment in this PR.
**Implementation notes:**
1. The implementation is using the HLFIR path.
2. Flang currently uses `getUntypedBoxProcType` to get the
`fir::BoxProcType` for `ProcedureDesignator` when getting the address of
a procedure in order to pass it as an actual argument. This PR inherits
the same design decision for procedure pointer as the `fir::StoreOp`
requires the same memory type.
Change the separator in the `uniqueCGIdent` method to `X`. This change
is required to enable OpenMP offloading for the NVPTX target, as dots
are not valid identifiers in PTX and `uniqueCGIdent` is used to mangle
some literals. Follow up patches will change the remainder of `.`
appearances in names to `X` and add support for the NVPTX target.
Conversion of `hlfir.assign` operations inside OpenACC recipe operations
may result in `fir.alloca` insertion. FIRBuilder can only handle
alloca insertion inside FuncOp's and outlineable OpenMP operations.
I added a simple interface for OpenACC recipe operations that have
executable code inside all their regions, and alloca may be inserted
into the entry blocks of those regions always.
With our current approach the OptimizedBufferization pass is supposed
to lower these `hlfir.assign` operations into loops, because there
should not be conflicts between lhs/rhs. The pass is currently
only working on FuncOp, and this is why it does not optimize
`hlfir.assign` inside the recipes. I will fix it in a separate commit.
Since we run OptimizedBufferization only at >O0, these changes
should still be useful.
Note that the OpenACC codegen that applies the recipes should be aware
of potential alloca operations and produce appropriate stack clean-ups.
HLFIR was always calling Destroy runtime when doing derived type scalar
assignments because the IR did not contain the info of whether
finalization was needed or not.
This info is now available in fir.type_info which allow skipping the
runtime call when not needed.
Also, when finalization is needed, simply use Assign runtime. This makes
no difference from a semantic point of view with the current code that
generated a call to Destroy and did the assignment inline, but if some
piece of runtime must be called anyway, it is simpler to just call
Assign that deals with everything.
The goal is to progressively propagate all the derived type info that is
currently in the runtime type info globals into a FIR operation that can
be easily queried and used by FIR/HLFIR passes.
When this will be complete, the last step will be to stop generating the
runtime info global in lowering, but to do that later in or just before
codegen to keep the FIR files readable (on the added type-info.f90
tests, the lowered runtime info globals takes a whooping 2.6 millions
characters on 1600 lines of the FIR textual output. The fir.type_info that
contains all the info required to generate those globals for such
"trivial" types takes 1721 characters on 9 lines).
So far this patch simply starts by replacing the fir.dispatch_table
operation by the fir.type_info operation and to add the noinit/
nofinal/nodestroy flags to it. These flags will soon be used in HLFIR to
better rewrite hlfir.assign with derived types.
HLFIR lowering always adds hlfir.declare when symbols are bound to their
address allocated on the stack. Ensure that the declare is placed along
with the alloca if it is hoisted. And always return the mlir value that
is bound to the symbol (i.e the alloca in FIR lowering and the declare
in HLFIR lowering).
Context: Loop index variables in OpenMP parallel regions should be
privatised to work correctly.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D158594
When hlfir.assign is lowered into simple load/store,
we may still need to finalize the LHS.
The patch passes `needFinalization` to `genScalarAssignment`
for LHS of any derived type, so some `Destroy` calls
might be redundant. They can be removed later by propagating/deducing
IsFinalizable information about the LHS type.
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D155664
On platforms which support COMDAT sections we should use them when
linkonce or linkonce_odr linkage is requested. This is required on
Windows (PE/COFF) and provides better behaviour than weak symbols on
ELF-based platforms.
This patch also reverts string literals to use linkonce instead of
internal linkage now that comdats are supported.
Differential Revision: https://reviews.llvm.org/D153768
When `AssignOp` is used with LHS that is a compiler generated temporary
special care must be taken to initialize the temporary and avoid
finalizations of its components. This change-set adds optional
`temporary_lhs` attribute for `AssignOp` to convey this information
to HLFIR-to-FIR conversion pass. Currently, this results in
calling `AssignTemporary` runtime for doing the assignment.
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D152482
The issue affected type select tests, such that inside the type guard
block the associated variable was using default lbounds instead of
inheriting it from the original variable.
The bridge's `genExprBox` ended up creating BoxValue from a MutableBoxValue
without setting non-default lbounds. So the hlfir.declare generated
for the associated name inside the type guard block was also using
the default lbounds. The fix is to read the value of the mutable box
and propagate the lbounds to the new BoxValue.
The fix might affect more than just select type cases.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D152413
On Windows, global string literals with "linkonce" linkage is not
supported without using comdat. As a simpler fix than adding comdat
support we can use internal linkage instead.
This fixes a bug where two string literals with the same value in
different fortran files would cause a linker error due to the use
of linkonce linkage.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D149859
A case was missed for the bfloat type in the getRealType function
in FIRBuilder. This can cause a crash in some situations like in
the provided test. The patch adds the missing case.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D149469
The dyanmic type must be carried over in a PolymorphicValue when the address is
loaded from an unlimited polymorphic allocatable.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D146525
Just use conversion from assuemd type to assumed type
because a rebox needs more information that are not available
with assumed type.
Also use `isAssumedType` instead of `isBoxNone` since assumed
type can have sequence information.
Depends on D146207
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D146209
When passing an argument to an assumed type dummy argument, embox
it directly to a !fir.box<none> box.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D146207
Adapat the fix made in D146079 to just avoid the type
to be wrapped with an extra fir.box or fir.class. The potential
load is delegated to the code that is after.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D146120
When a subroutine has an entry statement, the non-used argument
will be a fir.alloca and result in a fir.ref<fir.class<T>> for
polymorphic entities. In createBox, just load the box instead of
creating a wrong box.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D146079
Fortran allows type mismatch when passing actual arguments to
procedures and most cases were already being handled correctly by
Flang. However, conversion of data types to and from procedures and
conversion between procedures and char procedures were not always
handled properly. The missing cases were added and these
conversions are supported now.
Fixes#60550
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D145601
Optional character function arguments were not being lowered
properly. As they are passed as a tuple, containing the (boxed)
function address and the character length, it is not possible for
fir.absent to handle it directly. Instead, a tuple needs to be
created and filled with an absent function address and a dummy
character length.
Fixes#60225
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D144743
Lower the cases that require runtime support to deal
with the allocation, reallocation, or copy of ac-values to
the array constructor storage.
Differential Revision: https://reviews.llvm.org/D144513
Derived-type assignment was already handling the case of derived-type with
allocatable components with the runtime. Extend the code so the polymorphic
allocatable component are also taken into account.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D144473
This patch updates the code added in D143888 to avoid
overwriting some part of the types when updating it
for unlimited polymorphic types.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D143995
When emboxing an intrinsic type to a polymorphic descriptor, directly set its
type to `fir.class<none>`.
`fir.class<i32>` is not a real type used anywhere in lowering so make it right directly
avoid unnecessary convert op to `fir.class<none>`. Also `fir.class<i32>` would not be
recognized as unlimited polymorphic.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D143888
genRecordAssignment is emitting code to call Assign in the runtime for some cases.
In these cases, the finalization is done by the runtime so we do not need to do it in
a separate cal to avoid multiple finalization..
Also refactor the code in Bridge so the actual finalization of allocatable
is done before any reallocation. We might need to push this into ReallocIfNeeded.
It is not clear if the allocatable lhs needs to be finalized in any cases or only if it is
reallocated.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D143186
Until now, only the address of the type descriptor was hold in
a PolymorphicValue. In some cases, the element size and the
type code are also needed when creating new polymorphic
descriptors from an element of a polymorphic entity.
This patch updates PolymorphicValue to carry the source
descriptor from which the element is extracted. The source
descriptor is then used when emboxing the element to a new
polymorphic descriptor.
This simplify the code done in D141274 and will be used
when creating polymorphic temporary as well.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D141609
The motivation is to have it accessible in HLFIROps.cpp to
use it in hlfir.set_length builder to build the result length
type as best as possible.
Differential Revision: https://reviews.llvm.org/D140214
std::optional::value() has undesired exception checking semantics and is
unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The
call sites block std::optional migration.
Without any optimization or when it cannot be optimized before
bufferization, an hlfir.elemental lowers to an array temporary.
Its codegen consists in:
- allocating a temp given the type, shape, and length parameter arguments.
- generating a loop nest given the elemental shape
- inlining the body of the elemental inside the loops, and replacing the
yield_element by an assignment to an element of the temp.
Differential Revision: https://reviews.llvm.org/D140093
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Use BaseBoxType in `genComponentByComponentAssignment`
so the component by component assignment involving polymorphic entities
does not fall back to scalar assignment.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D138927
Perform a rebox instead of a convert operation when the input type is
polymorphic and the output type is a boxed derived-type.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D138831
When the rhs is non-polymorphic the type descriptor should not
be propagated. An error in the EmboxOp verifier was raised in that case.
This patch propagate the type descriptor only if the result type of the
EmboxOp operation is polymorphic.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D138442
Create the fir.dispatch_table operation based on semantics
information. The fir.dispatch_table will be used for static devirtualization
as well as for fir.select_type conversion.
Depends on D138129
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D138131
Plugged in propagation of nnan/nsz/arcp/afn/reassoc related options
to lowering/FirOpBuilder.
Reviewed By: jeanPerier, tblah, awarzynski
Differential Revision: https://reviews.llvm.org/D137580
Added MathOptionsBase to share fastmath config between different
components. Frontend driver translates LangOptions into MathOptionsBase.
FirConverter configures FirOpBuilder using MathOptionsBase
config passed to it via LoweringOptions.
Depends on D137390
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D137391