Commit Graph

248 Commits

Author SHA1 Message Date
jeanPerier
c2601f1769 [flang][NFC] remove unused fir.constc operation (#110821)
As part of [RFC to replace fir.complex usages by mlir.complex
type](https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292).

fir.constc is unused so instead of porting it, just remove it.
Complex constants are currently created with inserts in lowering
already. When using mlir complex, we may just want to start using
[complex.constant](4f6ad17adc/mlir/include/mlir/Dialect/Complex/IR/ComplexOps.td (L131C5-L131C16)).
2024-10-02 16:16:57 +02:00
Sirui Mu
fde3c16ac9 [mlir][LLVM] Add operand bundle support (#108933)
This PR adds LLVM [operand
bundle](https://llvm.org/docs/LangRef.html#operand-bundles) support to
MLIR LLVM dialect. It affects these 3 operations related to making
function calls: `llvm.call`, `llvm.invoke`, and `llvm.call_intrinsic`.

This PR adds two new parameters to each of the 3 operations. The first
parameter is a variadic operand `op_bundle_operands` that contains the
SSA values for operand bundles. The second parameter is a property
`op_bundle_tags` which holds an array of strings that represent the tags
of each operand bundle.
2024-09-26 07:59:37 +02:00
Jan Leyonberg
4290e34ebd [flang][AMDGPU] Convert math ops to AMD GPU library calls instead of libm calls (#99517)
This patch invokes a pass when compiling for an AMDGPU target to lower
math operations to AMD GPU library calls library calls instead of libm
calls.
2024-09-10 09:48:55 -04:00
Nikita Popov
67e19e5bb1 [flang] Set isSigned=true for negative constant (NFC)
We're providing this as a negative signed value, so set the flag.
Currently doesn't make a difference, but will assert in the future.

Split out of https://github.com/llvm/llvm-project/pull/80309.
2024-09-05 15:25:05 +02:00
Peter Klausler
9e53e77265 [flang] Fix warnings from more recent GCCs (#106567)
While experimenting with some more recent C++ features, I ran into
trouble with warnings from GCC 12.3.0 and 14.2.0. These warnings looked
legitimate, so I've tweaked the code to avoid them.
2024-09-04 10:52:51 -07:00
Slava Zakharin
cfd4c1805e [RFC][flang] Replace special symbols in uniqued global names. (#104859)
This change addresses more "issues" as the one resolved in #71338.
Some targets (e.g. NVPTX) do not accept global names containing
`.`. In particular, the global variables created to represent
the runtime information of derived types use `.` in their names.
A derived type's descriptor object may be used in the device code,
e.g. to initialize a descriptor of a variable of this type.
Thus, the runtime type info objects may need to be compiled
for the device.

Moreover, at least the derived types' descriptor objects
may need to be registered (think of `omp declare target`)
for the host-device association so that the addendum pointer
can be properly mapped to the device for descriptors using
a derived type's descriptor as their addendum pointer.
The registration implies knowing the name of the global variable
in the device image so that proper host code can be created.
So it is better to name the globals the same way for the host
and the device.

CompilerGeneratedNamesConversion pass renames all uniqued globals
such that the special symbols (currently `.`) are replaced
with `X`. The pass is supposed to be run for the host and the device.

An option is added to FIR-to-LLVM conversion pass to indicate
whether the new pass has been run before or not. This setting
affects how the codegen computes the names of the derived types'
descriptors for FIR derived types.

fir::NameUniquer now allows `X` to be part of a name, because
the name deconstruction may be applied to the mangled names
after CompilerGeneratedNamesConversion pass.
2024-08-21 13:37:03 -07:00
Valentin Clement (バレンタイン クレメン)
15e1e3b234 [flang] Read the extra field from the in box when doing reboxing (#102992)
Updated version of #102686. The issue was that in some rebox case the
addendum presence flag should be updated and not always taken from the
"from" box. This is the case when reboxing a fir.class to a fir.box that
doesn't require an addendum for example.

Open a new review since there is a bit of additional code in the CodeGen
part.
2024-08-14 11:23:56 -07:00
Valentin Clement (バレンタイン クレメン)
8fc9b4efd2 Revert "[flang] Read the extra field from the in box when doing reboxing" (#102931)
Reverts llvm/llvm-project#102686 as it might be the source of buildbot
failures https://lab.llvm.org/buildbot/#/builders/143/builds/1392.
2024-08-12 09:35:50 -07:00
Valentin Clement (バレンタイン クレメン)
dab7e3c30d [flang] Read the extra field from the in box when doing reboxing (#102686)
The extra field in the descriptor carries multiple information and
cannot be deducted anymore when doing a reboxing. This patch updates the
codegen to retrieve the extra field value from the inboc and set it in
the new box.
2024-08-12 08:48:27 -07:00
Valentin Clement (バレンタイン クレメン)
0def9a923d [flang] Add allocator_idx attribute on fir.embox and fircg.ext_embox (#101212)
#100690 introduces allocator registry with the ability to store
allocator index in the descriptor. This patch adds an attribute to
fir.embox and fircg.ext_embox to be able to set the allocator index
while populating the descriptor fields.
2024-08-01 12:49:17 -07:00
Valentin Clement (バレンタイン クレメン)
6df4e7c25f [flang] Add ability to have special allocator for descriptor data (#100690)
This patch enhances the descriptor with the ability to have specialized
allocator. The allocators are registered in a dedicated registry and the
index of the desired allocator is stored in the descriptor. The default
allocator, std::malloc, is registered at index 0.

In order to have this allocator index in the descriptor, the f18Addendum
field is repurposed to be able to hold the presence flag for the
addendum (lsb) and the allocator index.

Since this is a change in the semantic and name of the 7th field of the
descriptor, the CFI_VERSION is bumped to the date of the initial change.

This patch only adds the ability to have this features as part of the
descriptor but does not add specific allocator yet. CUDA fortran will be
the first user of this feature to allocate descriptor data in the
different type of device memory base on the CUDA attribute.

---------

Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
2024-08-01 09:39:53 -07:00
jeanPerier
d8b672dac9 [flang][NFC] rename fircg op operand index accessors (#100584)
fircg operations have xxxOffset members to give the operand index of
operand xxx. This is a bit weird when looking at usage (e.g.
`arrayCoor.shiftOffset` reads like it is shifting some offset). Rename
them to getXxxOperandIndex.
2024-07-25 18:02:03 +02:00
Alexis Perry-Holby
f1d3fe7aae Add basic -mtune support (#98517)
Initial implementation for the -mtune flag in Flang.

This PR is a clean version of PR #96688, which is a re-land of PR #95043
2024-07-16 16:48:24 +01:00
Matthias Springer
e73cf2f0c5 [flang] Remove materialization workaround in type converter (#98743)
This change is in preparation of #97903, which adds extra checks for
materializations: it is now enforced that they produce an SSA value of
the correct type, so the current workaround no longer works.

The original workaround avoided target materializations by directly
returning the to-be-converted SSA value from the materialization
callback. This can be avoided by initializing the lowering patterns that
insert the materializations without a type converter. For
`cg::XEmboxOp`, the existing workaround that skips
`unrealized_conversion_cast` ops is still in place.

Also remove the lowering pattern for `unrealized_conversion_cast`. This
pattern has no effect because `unrealized_conversion_cast` ops that are
inserted by the dialect conversion framework are never matched by the
pattern driver.
2024-07-15 16:07:48 +02:00
Ramkumar Ramachandra
db791b278a mlir/LogicalResult: move into llvm (#97309)
This patch is part of a project to move the Presburger library into
LLVM.
2024-07-02 10:42:33 +01:00
Tarun Prabhu
8dd9494056 Revert "[flang] Add basic -mtune support" (#96678)
Reverts llvm/llvm-project#95043
2024-06-25 13:25:39 -06:00
Alexis Perry-Holby
a790279bf2 [flang] Add basic -mtune support (#95043)
This PR adds -mtune as a valid flang flag and passes the information
through to LLVM IR as an attribute on all functions. No specific
architecture optimizations are added at this time.
2024-06-25 18:39:35 +01:00
Vijay Kandiah
6d340e4c44 [flang] fixing alloca hoisting for blocks having single op. (#96009)
This change fixes the issue
https://github.com/llvm/llvm-project/issues/95977 due to commit
c0cba51981 inserting allocas after the
terminator op in the insertion block in the case where the block had
only a single operation, its terminator, in it. With this change, the
hoisted constant-sized allocas are placed at the front of the insertion
block, rather than right after the first operation in it.
2024-06-19 18:45:23 -05:00
jeanPerier
a786919256 [flang] allow assumed-rank box in fir.store (#95980)
Codegen is done with a memcpy using the rank from the "value" descriptor
like for the fir.load case.
Rational described in
https://github.com/llvm/llvm-project/blob/main/flang/docs/AssumedRank.md.
2024-06-19 10:12:19 +02:00
Vijay Kandiah
c0cba51981 [Flang] Hoisting constant-sized allocas at flang codegen. (#95310)
This change modifies the `AllocaOpConversion` in flang codegen to insert
constant-sized LLVM allocas at the entry block of `LLVMFuncOp` or
OpenACC/OpenMP Op, rather than in-place at the `fir.alloca`. This
effectively hoists constant-sized FIR allocas to the proper block.

When compiling the example subroutine below with `flang-new`, we get a
llvm.stacksave/stackrestore pair around a constant-sized `fir.alloca
i32`.

```
subroutine test(n)
    block
      integer :: n
      print *, n
    end block
  end subroutine test
```

Without the proposed change, downstream LLVM compilation cannot hoist
this constant-sized alloca out of the stacksave/stackrestore region
which may lead to missed downstream optimizations:

```
*** IR Dump After Safe Stack instrumentation pass (safe-stack) ***
define void @test_(ptr %0) !dbg !3 {
  %2 = call ptr @llvm.stacksave.p0(), !dbg !7
  %3 = alloca i32, i64 1, align 4, !dbg !8
  %4 = call ptr @_FortranAioBeginExternalListOutput(i32 6, ptr @_QQclX62c91d05f046c7a656e7978eb13f2821, i32 4), !dbg !9
  %5 = load i32, ptr %3, align 4, !dbg !10, !tbaa !11
  %6 = call i1 @_FortranAioOutputInteger32(ptr %4, i32 %5), !dbg !10
  %7 = call i32 @_FortranAioEndIoStatement(ptr %4), !dbg !9
  call void @llvm.stackrestore.p0(ptr %2), !dbg !15
  ret void, !dbg !16
}
```

With this change, the `llvm.alloca` is already hoisted out of the
stacksave/stackrestore region during flang codegen:

```
// -----// IR Dump After FIRToLLVMLowering (fir-to-llvm-ir) //----- //
  llvm.func @test_(%arg0: !llvm.ptr {fir.bindc_name = "n"}) attributes {fir.internal_name = "_QPtest"} {
    %0 = llvm.mlir.constant(4 : i32) : i32
    %1 = llvm.mlir.constant(1 : i64) : i64
    %2 = llvm.alloca %1 x i32 {bindc_name = "n"} : (i64) -> !llvm.ptr
    %3 = llvm.mlir.constant(6 : i32) : i32
    %4 = llvm.mlir.undef : i1
    %5 = llvm.call @llvm.stacksave.p0() {fastmathFlags = #llvm.fastmath<contract>} : () -> !llvm.ptr
    %6 = llvm.mlir.addressof @_QQclX62c91d05f046c7a656e7978eb13f2821 : !llvm.ptr
    %7 = llvm.call @_FortranAioBeginExternalListOutput(%3, %6, %0) {fastmathFlags = #llvm.fastmath<contract>} : (i32, !llvm.ptr, i32) -> !llvm.ptr
    %8 = llvm.load %2 {tbaa = [#tbaa_tag]} : !llvm.ptr -> i32
    %9 = llvm.call @_FortranAioOutputInteger32(%7, %8) {fastmathFlags = #llvm.fastmath<contract>} : (!llvm.ptr, i32) -> i1
    %10 = llvm.call @_FortranAioEndIoStatement(%7) {fastmathFlags = #llvm.fastmath<contract>} : (!llvm.ptr) -> i32
    llvm.call @llvm.stackrestore.p0(%5) {fastmathFlags = #llvm.fastmath<contract>} : (!llvm.ptr) -> ()
    llvm.return
  }
```

---------

Co-authored-by: Vijay Kandiah <vkandiah@sky6.pgi.net>
2024-06-14 11:36:05 -05:00
Valentin Clement (バレンタイン クレメン)
c1654c38e8 [flang] Carry over alignment computed by frontend for COMMON (#94280)
The frontend computes the necessary alignment for COMMON blocks but this
information is never carried over to the code generation and can lead to
segfault for COMMON block that requires a non default alignment.

This patch add an optional attribute on fir.global and carries over the
information.
2024-06-04 11:15:31 -07:00
jeanPerier
fd8b2d2046 [flang] lower RANK intrinsic (#93694)
First commit is reviewed in
https://github.com/llvm/llvm-project/pull/93682.

Lower RANK using fir.box_rank. This patches updates fir.box_rank to
accept box reference, this avoids the need of generating an assumed-rank
fir.load just for the sake of reading ALLOCATABLE/POINTER rank. The
fir.load would generate a "dynamic" memcpy that is hard to optimize
without further knowledge. A read effect is conditionally given to the
operation.
2024-05-30 11:02:09 +02:00
jeanPerier
e398383f9a [flang][fir] add codegen for fir.load of assumed-rank fir.box (#93569)
- Update LLVM type conversion of assumed-rank fir.box/class to generate
the type of the maximum ranked descriptor. That way, alloca for assumed
rank descriptor copies are always big enough. This is needed in the
fir.load case that generates a new storage for the value
- Add a "computeBoxSize" helper to compute the dynamic size of a
descriptor.
- Use that size to generate an llvm.memcpy intrinsic to copy the input
descriptor into the new storage.

Looking at https://reviews.llvm.org/D108221?id=404635, it seems valid to
add the TBAA node on the memcpy, which I did.

In a further patch, I think we should likely always use a memcpy since
LLVM seems to have a better time optimizing it than fir.load/fir.store
patterns.
2024-05-30 09:30:27 +02:00
jeanPerier
26e0ce0b36 [flang] update fir.box_rank and fir.is_array codegen (#93541)
fir.box_rank codegen was invalid, it was assuming the rank field in the
descriptor was an i32. This is not correct. Do not hard code the type,
use the named position to find the type, and convert as needed in the
patterns.
2024-05-28 17:32:27 +02:00
Krzysztof Parzyszek
101f977f2c [flang][CodeGen] Avoid out-of-bounds memory access in SelectCaseOp (#92955)
`SelectCaseOp::getCompareOperands` may return an empty range for the
"default" case. Do not dereference the range until it is expected to be
non-empty.

This was detected by address-sanitizer.
2024-05-22 10:52:17 -05:00
Abid Qadeer
f156b9ce7a [flang] Add debug information for module variables. (#91582)
This PR add debug info for module variables. The module variables are
added as global variables but their scope is set to module instead of
compile unit. The scope of function declared inside a module is also set
accordingly.

After this patch, a module variable could be evaluated in the GDB as `p
helper::gli` where helper is name of the module and gli is the name of
the variable. A future patch will add the import module functionality
which will remove the need to prefix the name with helper::.

The line number where is module is declared is a best guess at the
moment as this information is not part of the GlobalOp.
2024-05-22 10:59:29 +01:00
Abid Qadeer
cd5ee2715e [reland][flang] Initial debug info support for local variables (#92304)
This is same as #90905 with an added fix. The issue was that we
generated variable info even when user asked for line-tables-only. This
caused llvm dwarf generation code to fail an assertion as it expected an
empty variable list.

Fixed by not generating debug info for variables when user wants only
line table. I also updated a test check for this case.
2024-05-16 09:10:59 +01:00
Pete Steinfeld
468357114c Revert "[flang] Initial debug info support for local variables. (#909… (#92302)
…05)"

This reverts commit 61da6366d0.

Update #90905 was causing many tests to fail.

See comments in #90905.
2024-05-15 11:30:30 -07:00
Abid Qadeer
61da6366d0 [flang] Initial debug info support for local variables. (#90905)
We need the information in the `DeclareOp` to generate debug information
for variables.  Currently, cg-rewrite removes the `DeclareOp`. As
`AddDebugInfo` runs after that, it cannot process the `DeclareOp`. My
initial plan was to make the `AddDebugInfo` pass run before the cg-rewrite
but that has few issues.
    
1. Initially I was thinking to use the memref op to carry the variable
attr. But as @tblah suggested in the #86939, it makes more sense to
carry that information on `DeclareOp`. It also makes it easy to handle
it in codegen and there is no special handling needed for arguments. For
this reason, we need to preserve the `DeclareOp` till the codegen.
    
2. Running earlier, we will miss the changes in passes that run between
cg-rewrite and codegen.
    
But not removing the DeclareOp in cg-rewrite has the issue that ShapeOp
remains and it causes errors during codegen. To solve this problem, I
convert DeclareOp to XDeclareOp in cg-rewrite instead of removing
it. This was mentioned as possible solution by @jeanPerier in
https://reviews.llvm.org/D136254
    
The conversion follows similar logic as used for other operators in that
file. The FortranAttr and CudaAttr are currently not converted but left
as TODO when the need arise.

Now `AddDebugInfo` pass can extracts information about local variables
from `XDeclareOp` and creates `DILocalVariableAttr`. These are attached
to `XDeclareOp` using `FusedLoc` approach. Codegen can use them to
create `DbgDeclareOp`.  I have added tests that checks the debug
information in mlir from and also in llvm ir.

Currently we only handle very limited types. Rest are given a place
holder type. The previous placeholder type was basic type with
`DW_ATE_address` encoding. When variables are added, it started
causing assertions in the llvm debug info generation logic for some
types. It has been changed to an interger type to prevent these issues
until we handle those types properly.
2024-05-15 15:20:27 +01:00
Kazu Hirata
f841ca0c35 Use StringRef::operator== instead of StringRef::equals (NFC) (#91864)
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.

- StringRef::operator==/!= outnumber StringRef::equals by a factor of
  276 under llvm-project/ in terms of their usage.

- The elimination of StringRef::equals brings StringRef closer to
  std::string_view, which has operator== but not equals.

- S == "foo" is more readable than S.equals("foo"), especially for
  !Long.Expression.equals("str") vs Long.Expression != "str".
2024-05-12 23:08:40 -07:00
Christian Sigg
bd9fdce69b [flang] Use isa/dyn_cast/cast/... free functions. (#90432)
The corresponding member functions are deprecated.
2024-04-29 09:16:22 +02:00
Christian Sigg
fac349a169 Reapply "[mlir] Mark isa/dyn_cast/cast/... member functions depreca… (#90406)
…ted. (#89998)" (#90250)

This partially reverts commit 7aedd7dc75.

This change removes calls to the deprecated member functions. It does
not mark the functions deprecated yet and does not disable the
deprecation warning in TypeSwitch. This seems to cause problems with
MSVC.
2024-04-28 22:01:42 +02:00
dyung
7aedd7dc75 Revert "[mlir] Mark isa/dyn_cast/cast/... member functions deprecated. (#89998)" (#90250)
This reverts commit 950b7ce0b8.

This change is causing build failures on a bot
https://lab.llvm.org/buildbot/#/builders/216/builds/38157
2024-04-26 12:09:13 -07:00
Christian Sigg
950b7ce0b8 [mlir] Mark isa/dyn_cast/cast/... member functions deprecated. (#89998)
See https://mlir.llvm.org/deprecation and
https://discourse.llvm.org/t/preferred-casting-style-going-forward.
2024-04-26 16:28:30 +02:00
Jeff Niu
e553ac4d81 [mlir][llvm] Port overflowFlags to a native operation property (RELAND) (#89410)
This PR changes the LLVM dialect's IntegerOverflowFlags to be stored on
operations as native properties.

Reland to fix flang
2024-04-19 09:23:00 -07:00
Tom Eccles
a5ae54ab05 [flang][NFC] Unify getIfConstantIntValue helpers (#87633)
There were different helpers for attempting to fetch compile time
constants from MLIR: one in fir::getIntIfConstant and one in CodeGen.
Unify the two.
2024-04-05 12:39:24 +01:00
Valentin Clement (バレンタイン クレメン)
e9639e9c06 [flang][NFC] Extract FIROpConversion to its own files (#86213)
This PR extracts `FIROpConversion` and `FIROpAndTypeConversion`
templated base patterns to a header file. All the functions from
FIROpConversion that do not require the template argument are moved to a
base class named `ConvertFIRToLLVMPattern`.
This move is done so the `FIROpConversion` pattern and all its utility
functions can be reused outside of the codegen pass.

For the most part the code is only moved to the new files and not
modified. The only update is that addition of the PatternBenefit
argument with a default value to the constructor so it can be forwarded
to the `ConversionPattern` ctor.

This split is done in a similar way for the `ConvertOpToLLVMPattern`
base pattern that is based on the `ConvertToLLVMPattern` base class in
`mlir/include/mlir/Conversion/LLVMCommon/Pattern.h`.
2024-03-22 12:56:45 -07:00
Sergio Afonso
d84252e064 [MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393)
This patch proposes the renaming of certain OpenMP dialect operations with the
goal of improving readability and following a uniform naming convention for
MLIR operations and associated classes. In particular, the following operations
are renamed:

- `omp.map_info` -> `omp.map.info`
- `omp.target_update_data` -> `omp.target_update`
- `omp.ordered_region` -> `omp.ordered.region`
- `omp.cancellationpoint` -> `omp.cancellation_point`
- `omp.bounds` -> `omp.map.bounds`
- `omp.reduction.declare` -> `omp.declare_reduction`

Also, the following MLIR operation classes have been renamed:

- `omp::TaskLoopOp` -> `omp::TaskloopOp`
- `omp::TaskGroupOp` -> `omp::TaskgroupOp`
- `omp::DataBoundsOp` -> `omp::MapBoundsOp`
- `omp::DataOp` -> `omp::TargetDataOp`
- `omp::EnterDataOp` -> `omp::TargetEnterDataOp`
- `omp::ExitDataOp` -> `omp::TargetExitDataOp`
- `omp::UpdateDataOp` -> `omp::TargetUpdateOp`
- `omp::ReductionDeclareOp` -> `omp::DeclareReductionOp`
- `omp::WsLoopOp` -> `omp::WsloopOp`
2024-03-20 11:19:38 +00:00
Valentin Clement (バレンタイン クレメン)
85d7fef6c1 [flang][NFC] Fix include style (#85655) 2024-03-18 11:17:32 -07:00
Tom Eccles
e12b46fef7 [flang] support fir.alloca operations inside of omp reduction ops (#84952)
Advise to place the alloca at the start of the first block of whichever
region (init or combiner) we are currently inside.

It probably isn't safe to put an alloca inside of a combiner region
because this will be executed multiple times. But that would be a bug to
fix in Lower/OpenMP.cpp, not here.

OpenMP array reductions 1/6
Next PR: https://github.com/llvm/llvm-project/pull/84953
2024-03-15 11:46:12 +00:00
Valentin Clement (バレンタイン クレメン)
408d4e131e [flang][NFC] Expose FIR to LLVM patterns (#83492)
The FIR dialect has been initiated before many interfaces have been
introduced to MLIR. This patch expose the FIR to LLVM patterns in a
`populateFIRToLLVMConversionPatterns` function. The idea is to be able
to add the `ConvertToLLVMPatternInterface`. This is not directly
possible since the FIR dialect does not currently use the table
infrastructure for its definition.

Follow up patches will move the FIR dialect definition to table gen and
then implement the interface.
2024-03-04 09:11:59 +01:00
jeanPerier
d26a6464f8 [flang] Do not leave length parameters uninitialized in descriptor addendums (#81858)
Descriptor addendum have a field to hold length parameters (currently
only one). This field is currently never used because flang does not
lowered derived types with length parameters.

However, leaving it uninitialized is causing bugs in code like gFTL
where the code is trying to sort POINTERs (see [1]). More precisely, it
is an issue when two pointers should compare equal (same base address),
because the uninitialized values in the addendum may differ depending on
the "stack history" and optimization level.

Always initialized the length parameters field in the addendum to zero.

[1]:
dc93a5fc2f/include/v1/templates/set_impl.inc (L312)

The type being transferred to an integer array may look like:

```
   TYPE :: localwrapper
    TYPE(T), POINTER :: item
   END TYPE localwrapper
```

Which in flang case ends-up transferring a descriptor to an integer
array, the code in gFTL later compare the integer arrays. This logic is
used when building set data structures in gFTL.
2024-02-16 08:51:03 +01:00
agozillon
95fe47ca7e [Flang][OpenMP] Initial mapping of Fortran pointers and allocatables for target devices (#71766)
This patch seeks to add an initial lowering for pointers and allocatable variables 
captured by implicit and explicit map in Flang OpenMP for Target operations that 
take map clauses e.g. Target, Target Update. Target Exit/Enter etc.

Currently this is done by treating the type that lowers to a descriptor 
(allocatable/pointer/assumed shape) as a map of a record type (e.g. a structure) as that's
effectively what descriptor types lower to in LLVM-IR and what they're represented as
in the Fortran runtime (written in C/C++). The descriptor effectively lowers to a structure
containing scalar and array elements that represent various aspects of the underlying
data being mapped (lower bound, upper bound, extent being the main ones of interest
in most cases) and a pointer to the allocated data. In this current iteration of the mapping
we map the structure in it's entirety and then attach the underlying data pointer and map
the data to the device, this allows most of the required data to be resident on the device
for use. Currently we do not support the addendum (another block of pointer data), but
it shouldn't be too difficult to extend this to support it.

The MapInfoOp generation for descriptor types is primarily handled in an optimization
pass, where it expands BoxType (descriptor types) map captures into two maps, one for
the structure (scalar elements) and the other for the pointer data (base address) and
links them in a Parent <-> Child relationship. The later lowering processes will then treat
them as a conjoined structure with a pointer member map.
2024-02-05 18:45:07 +01:00
Sergio Afonso
837bff11cb [Flang][Lower] Attach target_cpu and target_features attributes to MLIR functions (#78289)
This patch forwards the target CPU and features information from the
Flang frontend to MLIR func.func operation attributes, which are later
used to populate the target_cpu and target_features llvm.func
attributes.

This is achieved in two stages:

1. Introduce the `fir.target_cpu` and `fir.target_features` module
attributes with information from the target machine immediately after
the initial creation of the MLIR module in the lowering bridge.

2. Update the target rewrite flang pass to get this information from the
module and pass it along to all func.func MLIR operations, respectively
as attributes named `target_cpu` and `target_features`. These attributes
will be automatically picked up during Func to LLVM dialect lowering and
used to initialize the corresponding llvm.func named attributes.

The target rewrite and FIR to LLVM lowering passes are updated with the
ability to override these module attributes, and the `CodeGenSpecifics`
optimizer class is augmented to make this information available to
target-specific MLIR transformations.

This completes a full flow by which target CPU and features make it all
the way from compiler options to LLVM IR function attributes.
2024-01-30 13:45:56 +00:00
jeanPerier
4f62a183d9 [flang] Allow user to define free via BIND(C) (#78428)
A user defining and using free/malloc via BIND(C) would previously cause
flang to crash when generating LLVM IR with error "redefinition of
symbol named 'free'". This was caused by flang codegen not expecting to
find a mlir::func::FuncOp definition of these function and emitting a
new mlir::LLVM::FuncOp that later conflicted when translating the
mlir::func::FuncOp.
2024-01-18 09:37:44 +01:00
agozillon
abeb6c9f58 [Flang][MLIR] Add basic initial support for alloca and program address space handling in FIR->LLVMIR codegen (#77518)
This is a slightly more slimmed down and up-to-date version of the older
PR from here: https://reviews.llvm.org/D144203, written by @jsjodin,
which has already under gone some review.

This PR places allocas in the alloca address space specified by the
provided data layout (default is 0 for all address spaces, unless
explicitly specified by the layout), and then will cast these alloca's
to the program address space if this address space is different from the
allocation address space. For most architectures data layouts, this will
be a no-op, as they have a flat address space. But in the case of AMDGPU
it will result in allocas being placed in the correct address space (5,
private), and then casted into the correct program address space (0,
generic). This results in correct (partially, a follow up PR will be
forthcoming soon) generation of allocations inside of device code.

This PR is in addition to the work by @skatrak in this PR:
https://github.com/llvm/llvm-project/pull/69599 and adds seperate and
neccesary functionality of casting alloca's from their address space to
the program address space, both are independent PRs, although there is
some minor overlap e.g. this PR incorporates some of the useful helper
functions from 69599, so whichever lands first will need a minor rebase.

Co-author: jsjodin
2024-01-17 17:37:16 +01:00
Matthias Springer
5fcf907b34 [mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260)
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`

The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.

Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
2024-01-17 11:08:59 +01:00
jeanPerier
27d9a479c0 [flang] Add struct passing target rewrite hooks and partial X86-64 impl (#74829)
In the context of C/Fortran interoperability (BIND(C)), it is possible
to give the VALUE attribute to a BIND(C) derived type dummy, which
according to Fortran 2018 18.3.6 - 2. (4) implies that it must be passed
like the equivalent C structure value. The way C structure value are
passed is ABI dependent.

LLVM does not implement the C struct ABI passing for LLVM aggregate type
arguments. It is up to the front-end, like clang is doing, to split the
struct into registers or pass the struct on the stack (llvm "byval") as
required by the target ABI.
So the logic for C struct passing sits in clang. Using it from flang
requires setting up a lot of clang context and to bridge FIR/MLIR
representation to clang AST representation for function signatures (in
both directions). It is a non trivial task.
See
https://stackoverflow.com/questions/39438033/passing-structs-by-value-in-llvm-ir/75002581#75002581.

Since BIND(C) struct are rather limited as opposed to generic C struct
(e.g. no bit fields). It is easier to provide a limited implementation
of it for the case that matter to Fortran.

This patch:
- Updates the generic target rewrite pass to keep track of both the new
argument type and attributes. The motivation for this is to be able to
tell if a previously marshalled argument is passed in memory (it is a C
pointer), or if it is being passed on the stack (has the byval llvm
attributes).
- Adds an entry point in the target specific codegen to marshal struct
arguments, and use it in the generic target rewrite pass.
- Implements limited support for the X86-64 case. So far, the support
allows telling if a struct must be passed in register or on the stack,
and to deal with the stack case. The register case is left TODO in this
patch.

The X86-64 ABI implemented is the System V ABI for AMD64 version 1.0
2023-12-12 11:52:39 +01:00
Tom Eccles
bdacd56fd1 [flang][CodeGen] add nsw to address calculations (#74709)
`nsw` is a flag for LLVM arithmetic operations meaning "no signed wrap".
If this keyword is present, the result of the operation is a poison
value if overflow occurs. Adding this keyword permits LLVM to re-order
integer arithmetic more aggressively.

In

https://discourse.llvm.org/t/rfc-changes-to-fircg-xarray-coor-codegen-to-allow-better-hoisting/75257/16
@vzakhari observed that adding nsw is useful to enable hoisting of
address calculations after some loops (or is at least a step in that
direction).

Classic flang also adds nsw to address calculations.
2023-12-08 10:51:20 +00:00
Tom Eccles
fcd06d774d [mlir][flang] add fast math attribute to fcmp (#74315)
`llvm.fcmp` does support fast math attributes therefore so should
`arith.cmpf`.

The heavy churn in flang tests are because flang sets
`fastmath<contract>` by default on all operations that support the fast
math interface. Downstream users of MLIR should not be so effected.

This was requested in https://github.com/llvm/llvm-project/issues/74263
2023-12-06 10:19:48 +00:00