C_FUNLOC was not handling procedure pointer argument correctly, the
issue lied in `hlfir::convertToBox` that did not handle procedure
pointers.
I modified the interface of `hlfir::convertToXXX` to take values on the
way because hlfir::Entity are fundamentally an mlir::Value with type
guarantees, so they should be dealt with by value as mlir::Value are
(they are very small).
This is a lot less complex than the data case where the shape has to be
accounted for, so the implementation is done inline.
One corner case will not be supported correctly for now: the case where
POINTER and TARGET points to the same internal procedure may return
false because lowering is creating fir.embox_proc each time the address
of an internal procedure is taken, so different thunk for the same
internal procedure/host link may be created and compare to false. This
will be fixed in a later patch that moves creating of internal procedure
fir.embox_proc in the host so that the addresses are the same when the
host link is the same. This change is required to properly support the
required lifetime of internal procedure addresses anyway (should be the
always be the lifetime of the host, even when the address is taken in an
internal procedure).
This patch enables more numeric (mod, sum, matmul, etc.) APIs,
and some others.
I added new macros to disable warnings about using C++ STD methods
like operators of std::complex, which do not have __device__ attribute.
This may probably result in unresolved references, if the header files
implementation relies on libstdc++. I will need to follow up on this.
Lower procedure pointer components, except in the context of structure
constructor (left TODO).
Procedure pointer components lowering share most of the lowering logic
of procedure poionters with the following particularities:
- They are components, so an hlfir.designate must be generated to
retrieve the procedure pointer address from its derived type base.
- They may have a PASS argument. While there is no dispatching as with
type bound procedure, special care must be taken to retrieve the derived
type component base in this case since semantics placed it in the
argument list and not in the evaluate::ProcedureDesignator.
These components also bring a new level of recursive MLIR types since a
fir.type may now contain a component with an MLIR function type where
one of the argument is the fir.type itself. This required moving the
"derived type in construction" stackto the converter so that the object
and function type lowering utilities share the same state (currently the
function type utilty would end-up creating a new stack when lowering its
arguments, leading to infinite loops). The BoxedProcedurePass also
needed an update to deal with this recursive aspect.
The function `instantiateVariable` in Bridge.cpp has the following code:
```
if (var.getSymbol().test(
Fortran::semantics::Symbol::Flag::OmpThreadprivate))
Fortran::lower::genThreadprivateOp(*this, var);
if (var.getSymbol().test(
Fortran::semantics::Symbol::Flag::OmpDeclareTarget))
Fortran::lower::genDeclareTargetIntGlobal(*this, var);
```
Implement `handleOpenMPSymbolProperties` in OpenMP.cpp, move the above
code there, and have `instantiateVariable` call this function instead.
This would further separate OpenMP-related details into OpenMP.cpp.
There was some discussion on discourse[1] about allowing call to FIR
generation functions from other part of lowering belonging to OpenMP.
This solution exposes a simple `genEval` member function on the
`AbstractConverter` so that IR generation for PFT Evaluation objects can
be called from lowering outside of the FirConverter but not exposing it.
[1] https://discourse.llvm.org/t/openmp-lowering-from-pft-to-fir/75263
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
The adds a hlfir minloc intrinsic, similar to the minval intrinsic
already added, to help in the lowering of minloc. The idea is to later
add maxloc too, and from there add a simplification for producing minloc
with inlined elemental and hopefully less temporaries.
In the context of C/Fortran interoperability (BIND(C)), it is possible
to give the VALUE attribute to a BIND(C) derived type dummy, which
according to Fortran 2018 18.3.6 - 2. (4) implies that it must be passed
like the equivalent C structure value. The way C structure value are
passed is ABI dependent.
LLVM does not implement the C struct ABI passing for LLVM aggregate type
arguments. It is up to the front-end, like clang is doing, to split the
struct into registers or pass the struct on the stack (llvm "byval") as
required by the target ABI.
So the logic for C struct passing sits in clang. Using it from flang
requires setting up a lot of clang context and to bridge FIR/MLIR
representation to clang AST representation for function signatures (in
both directions). It is a non trivial task.
See
https://stackoverflow.com/questions/39438033/passing-structs-by-value-in-llvm-ir/75002581#75002581.
Since BIND(C) struct are rather limited as opposed to generic C struct
(e.g. no bit fields). It is easier to provide a limited implementation
of it for the case that matter to Fortran.
This patch:
- Updates the generic target rewrite pass to keep track of both the new
argument type and attributes. The motivation for this is to be able to
tell if a previously marshalled argument is passed in memory (it is a C
pointer), or if it is being passed on the stack (has the byval llvm
attributes).
- Adds an entry point in the target specific codegen to marshal struct
arguments, and use it in the generic target rewrite pass.
- Implements limited support for the X86-64 case. So far, the support
allows telling if a struct must be passed in register or on the stack,
and to deal with the stack case. The register case is left TODO in this
patch.
The X86-64 ABI implemented is the System V ABI for AMD64 version 1.0
Lowering doesn't use it, and won't need it in the future; it emits
in-line code for conversions and for assignments to scalars, and uses
the general Assign() routine for arrays.
Defined unformatted I/O is not allowed except from/to an external unit.
This restriction prohibits an INQUIRE(IOLENGTH=n) statement from using
derived types with defined unformatted output in its I/O list. The
runtime currently detects this case and crashes with an internal error;
this patch defines a new I/O error enum and causes the program to crash
with a more useful message.
The function `genCommonBlockMember` is not specific to OpenMP, and it
could very well be a common utility. Move it to ConvertVariable.cpp
where it logically belongs.
Preliminary patch to change lowering/code generation to use
llvm::DataLayout information instead of generating "sizeof" GEP (see
https://github.com/llvm/llvm-project/issues/71507).
Fortran Semantic analysis needs to know about the target type size and
alignment to deal with common blocks, and intrinsics like
C_SIZEOF/TRANSFER. This information should be obtained from the
llvm::DataLayout so that it is consistent during the whole compilation
flow.
This change is changing flang-new and bbc drivers to:
1. Create the llvm::TargetMachine so that the data layout of the target
can be obtained before semantics.
2. Sharing bbc/flang-new set-up of the
SemanticConstext.targetCharateristics from the llvm::TargetMachine. For
now, the actual part that set-up the Fortran type size and alignment
from the llvm::DataLayout is left TODO so that this change is mostly an
NFC impacting the drivers.
3. Let the lowering bridge set-up the mlir::Module datalayout attributes
since it is doing it for the target attribute, and that allows the llvm
data layout information to be available during lowering.
For flang-new, the changes are code shuffling: the `llvm::TargetMachine`
instance is moved to `CompilerInvocation` class so that it can be used
to set-up the semantic contexts. `setMLIRDataLayout` is moved to
`flang/Optimizer/Support/DataLayout.h` (it will need to be used from
codegen pass for fir-opt target independent testing.)), and the code
setting-up semantics targetCharacteristics is moved to
`Tools/TargetSetup.h` so that it can be shared with bbc.
As a consequence, LLVM targets must be registered when running
semantics, and it is not possible to run semantics for a target that is
not registered with the -triple option (hence the power pc specific
modules can only be built if the PowerPC target is available.
`llvm.fcmp` does support fast math attributes therefore so should
`arith.cmpf`.
The heavy churn in flang tests are because flang sets
`fastmath<contract>` by default on all operations that support the fast
math interface. Downstream users of MLIR should not be so effected.
This was requested in https://github.com/llvm/llvm-project/issues/74263
This patch is fixing two issue relative to the dynamic dispatch for
polymorphic entities.
1. Fix the `requireDispatchCall` function. It was checking for the first
symbol of the component but this is not the one to be checked. Instead
the last symbol of the base of the component object is the one to check
to know if it is polymorphic object with a dispatch call or not. This is
demonstrated in the new added test in `flang/test/Lower/dispatch.f90`
where the first symbol would point to `q` which is monomorphic and would
result in a simple `fir.call`
2. Fix the pass object in a no pass situation. In a no pass situation
the pass object is lowered anyway to be able to do the lookup in the
binding table. It was previously lowered wrongly an lead to unresolved
lookup. The base of the component is the passed object and should be
lowered. To achieve this, the `gen(DataRef)` entry point is exposed form
`ConvertExprToHLFIR` through a `convertDataRefToValue` function. The
same test added in `flang/test/Lower/dispatch.f90` is checking for the
correct passed object.
In addition couple of tests were updated to HLFIR since the lowering
used only works with it.
The `llvm::sys::fs::getMainExecutable(nullptr, nullptr)` is not able to
obtain the correct executable path on AIX without Argv0 due to the lack
of a current process on AIX's `proc` filesystem. This causes a build
failure on AIX as intrinsic module directory is missing.
---------
Co-authored-by: Mark Danial <mak.danial@ibm.com>
Enable by default for optimization levels higher than 0 (same behavior
as clang).
For simplicity, only forward the flag to the frontend driver when it
contradicts what is implied by the optimization level.
This was first landed in
https://github.com/llvm/llvm-project/pull/73111 but was later reverted
due to a performance regression. That regression was fixed by
https://github.com/llvm/llvm-project/pull/74065.
We pass TBAA alias information with separate TBAA trees per function (to
prevent incorrect alias information after inlining). These TBAA trees
are identified by a unique string per function. Naturally, we use the
mangled name of the function.
TBAA tags are added in two places: during a dedicated pass relatively
early (structured control flow makes fir::AliasAnalysis more accurate),
then again during CodeGen (when implied box loads and stores become
visible). In between these two passes, the ExternalNameConversion pass
changes the name of some functions.
These functions with changed names previously ended up with separate
TBAA trees from the TBAA tags pass and from CodeGen - leading LLVM to
think that all data accesses alias with all descriptor accesses.
This patch solves this by storing the original name of a function in an
attribute during the ExternalNameConversion pass, and using the name
from that attribute when creating TBAA trees during CodeGen.
This patch expose the type and attribute names in C++ as methods in the
`AbstractType` and `AbstractAttribute` classes, and keep a map of names
to `AbstractType` and `AbstractAttribute` in the `MLIRContext`. Type and
attribute names should be unique.
It adds support in ODS to generate the `getName` methods in
`AbstractType` and `AbstractAttribute`, through the use of two new
variables, `typeName` and `attrName`. It also adds names to C++-defined
type and attributes.
Early return is accepted in OpenACC loop not directly nested in a
compute construct. Since acc.loop operation has a region, the
`func.return` operation cannot be directly used inside the region.
An early return is materialized by an `acc.yield` operation returning a
`true` value. The standard end of the `acc.loop` region yield a `false`
value in this case.
A conditional branch operation on the `acc.loop` result will branch to
the `finalBlock` or just to the continue block whether an early exit was
produce in the acc.loop.
…ortran_binding.h
... and into the ISO_Fortran_binding_wrapper.h header, through which the
compiler and runtime access its contents. This change ensures that user
code that #includes ISO_Fortran_binding.h within 'extern "C" {' doesn't
encounter mysterious namespace errors.
…arrays
When comparing dummy array extents, cope with references to symbols
better (including references to other dummy arguments), and emit
warnings in dubious cases that are not equivalent but not provably
incompatible.
Moves the defintion of `SemanticsContext` within the Flang driver.
Rather than in `CompilerInvocation`, semantic context fits better within
`CompilerInstance` that encapsulates the objects that are required to
run the
frontend. `CompilerInvocation` is better suited for objects
encapsulating compiler configuration (e.g. set-up resulting from user
input or host set-up).
This update makes the user visible messages relating to features that
are not yet implemented be more consistent. I also cleaned up some of
the code.
For NYI messages that refer to intrinsics, I made sure the the message
begins with "not yet implemented: intrinsic:" to make them easier to
recognize.
I created some utility functions for NYI reporting that I put into
.../include/Optimizer/Support/Utils.h. These mainly convert MLIR types
to their Fortran equivalents.
I converted the NYI code to use the newly created utility functions.
This got "lost" in the HLFIR transformation. This patch applies the old
attribute to the AssociateOp that needs it, and forwards it to the
AllocaOp that is generated when lowering to FIR.
This operation allows computing the address of descriptor fields. It is
needed to help attaching descriptors in OpenMP/OpenACC target region.
The pointers inside the descriptor structure must be mapped too, but the
fir.box is abstract, so these fields cannot be computed with
fir.coordinate_of.
To preserve the abstraction of the descriptor layout in FIR, introduce
an operation specifically to !fir.ref<fir.box<>> address fields based on
field names (base_addr or derived_type).
Information about code object version can be configured by the user for
AMD GPU target and it needs to be placed in LLVM IR generated by Flang.
Information about code object version in MLIR generated by the parser
can be reused by other tools. There is no need to specify extra flags if
we want to invoke MLIR tools (like fir-opt) separately.
Changes in comparison to a8ac93:
* added information about required targets for test
flang/test/Driver/driver-help.f90
Information about code object version can be configured by the user for
AMD GPU target and it needs to be placed in LLVM IR generated by Flang.
Information about code object version in MLIR generated by the parser
can be reused by other tools. There is no need to specify extra flags if
we want to invoke MLIR tools (like fir-opt) separately.
Enable by default for optimization levels higher than 0 (same behavior
as clang).
For simplicity, only forward the flag to the frontend driver when it
contradicts what is implied by the optimization level.
Since https://github.com/llvm/llvm-project/pull/72903 there are now no
known performance regressions.
**Scope of the PR:**
1. Lowering global and local procedure pointer declaration statement
with explicit or implicit interface. The explicit interface can be from
an interface block, a module procedure or an internal procedure.
2. Lowering procedure pointer assignment, where the target procedure
could be external, module or internal procedures.
3. Lowering reference to procedure pointers so that it works end to end.
**PR notes:**
1. The first commit of the PR does not include testing. I would like to
collect some comments first, which may alter the output. Once I confirm
the implementation, I will add some testing as a follow up commit to
this PR.
2. No special handling of the host-associated entities when an internal
procedure is the target of a procedure pointer assignment in this PR.
**Implementation notes:**
1. The implementation is using the HLFIR path.
2. Flang currently uses `getUntypedBoxProcType` to get the
`fir::BoxProcType` for `ProcedureDesignator` when getting the address of
a procedure in order to pass it as an actual argument. This PR inherits
the same design decision for procedure pointer as the `fir::StoreOp`
requires the same memory type.
Note: this commit is actually resubmitting the original commit from
PR #70461 that was reverted. See PR #73221.
**Scope of the PR:**
1. Lowering global and local procedure pointer declaration statement
with explicit or implicit interface. The explicit interface can be from
an interface block, a module procedure or an internal procedure.
2. Lowering procedure pointer assignment, where the target procedure
could be external, module or internal procedures.
3. Lowering reference to procedure pointers so that it works end to end.
**PR notes:**
1. The first commit of the PR does not include testing. I would like to
collect some comments first, which may alter the output. Once I confirm
the implementation, I will add some testing as a follow up commit to
this PR.
2. No special handling of the host-associated entities when an internal
procedure is the target of a procedure pointer assignment in this PR.
**Implementation notes:**
1. The implementation is using the HLFIR path.
2. Flang currently uses `getUntypedBoxProcType` to get the
`fir::BoxProcType` for `ProcedureDesignator` when getting the address of
a procedure in order to pass it as an actual argument. This PR inherits
the same design decision for procedure pointer as the `fir::StoreOp`
requires the same memory type.
This patch adds a --dependent-lib option to flang -fc1 on Windows to
embed library link options into the object file. This is needed to
properly select the Windows CRT to link against.
Patch 3/3 of the transition step 1 described in
https://discourse.llvm.org/t/rfc-enabling-the-hlfir-lowering-by-default/72778/7
This patch changes bbc and flang-new driver to use HLFIR lowering by
default.
`-hlfir=false` can be used with bbc and `-flang-deprecated-no-hlfir`
with flang-new to get the previous default lowering behavior, but these
options will only be available for a limited period of time.
If any user needs these options to workaround bugs, they should open an
issue against flang in llvm github repo so that the regression can be
fixed in HLFIR.
Using an op with a region cause some issue with unstructured code. This
patch make use of acc.declare_enter and acc.declare_exit to represent
the implicit declare region.
Currently, when deleting the device functions in the second stage of filtering during MLIR to LLVM translation we can end up with invalid calls to these functions. This is because of the removal of the EarlyOutliningPass which would have otherwise gotten rid of any such calls.
This patch aims to alter the function filtering pass in the following way:
- Any host function is completely removed.
- Call to the host function are also removed and their uses replaced with Undef values.
- Any host function with target region code is marked to be removed during the the second stage.
- Calls to such functions are still removed and their uses replaced with Undef values.
Co-authored-by: Sergio Afonso <sergio.afonsofumero@amd.com>
Before emitting a warning message, code should check that the usage in
question should be diagnosed by calling ShouldWarn(). A fair number of
sites in the code do not, and can emit portability warnings
unconditionally, which can confuse a user that hasn't asked for them
(-pedantic) and isn't terribly concerned about portability *to* other
compilers.
Add calls to ShouldWarn() or IsEnabled() around messages that need them,
and add -pedantic to tests that now require it to test their portability
messages, and add more expected message lines to those tests when
-pedantic causes other diagnostics to fire.