Commit Graph

373 Commits

Author SHA1 Message Date
Renaud Kauffmann
b9978f8c77 [flang][cuda] Adding variable registration in constructor (#113976)
1) Adding variable registration in constructor
2) Applying feedback from PR
https://github.com/llvm/llvm-project/pull/112989
2024-10-29 11:48:48 -07:00
Valentin Clement (バレンタイン クレメン)
b05fec97d5 [flang][cuda] Convert gpu.launch_func to CUFLaunchClusterKernel when cluster dims are present (#113959)
Kernel launch in CUF are converted to `gpu.launch_func`. When the kernel
has `cluster_dims` specified these get carried over to the
`gpu.launch_func` operation. This patch updates the special conversion
of `gpu.launch_func` when cluster dims are present to the newly added
entry point.
2024-10-29 10:02:08 -07:00
Abid Qadeer
8239ea3918 [flang][debug] Support IndexType. (#113921) 2024-10-29 12:22:43 +00:00
Renaud Kauffmann
0eb5c9d2ef [flang][cuda] Copying device globals in the gpu module (#113955) 2024-10-28 15:34:27 -07:00
Yusuke MINATO
bd6ab32e6e Revert "[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv" (#113901)
Reverts llvm/llvm-project#110063 due to the performance regression on
503.bwaves_r in SPEC2017.
2024-10-28 14:19:20 +00:00
jeanPerier
64d7e45c40 Revert "[flang][debug] Support mlir::NoneType." (#113769)
Reverts llvm/llvm-project#113550

It turns out this causes compiler crashes with assumed-type arrays and -g.
See https://github.com/llvm/llvm-project/pull/113769 for a reproducer.
2024-10-26 21:38:54 +02:00
Renaud Kauffmann
3acf856b50 Adding CUFCommon.{h,cpp} for CUF utilities (#113740) 2024-10-25 16:08:45 -07:00
Abid Qadeer
85af1926f7 [flang][debug] Support mlir::NoneType. (#113550) 2024-10-25 11:43:25 +01:00
Yusuke MINATO
96bb375f5c [flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv (#110063)
nsw is now added to do-variable increment when -fno-wrapv is enabled as
GFortran seems to do.
That means the option introduced by #91579 isn't necessary any more.

Note that the feature of -flang-experimental-integer-overflow is enabled
by default.
2024-10-25 15:20:23 +09:00
Abid Qadeer
37832d5de2 [flang][debug] Support fir.vector type. (#112951)
This PR converts the `fir.vector<>` to
`DICompositeTypeAttr(DW_TAG_array_type)` with `vector` flag set.
2024-10-24 13:37:32 +01:00
Abid Qadeer
47c1abf4af [flang][debug] Fix array lower bounds in derived type members. (#113183)
The lower bound information for the array members of a derived type
can't be obtained from the `DeclareOp`. It has to be extracted from the
`TypeInfoOp`. That was left as FIXME in the code. This PR adds the
missing functionality to fix the issue.

I tried the following approaches before settling on the current one that
is to generate `DITypeAttr` for array members right where the components
are being processed.

1. Generate a temp XDeclareOp with the shift information obtained from
the `TypeInfoOp`. This caused a few issues mostly related to
`unrealized_conversion_cast`.

2. Change the shift operands in the `declOp` that was passed in the
function before calling `convertType`. The code can be seen in the
abcf031a8e5a02f0081e7f293858302e7bf47bec. It essentially looked like the
following. It works correctly but I was not sure if temporarily changing
the `declOp` is the safe thing to do.

```
mlir::OperandRange originalShift = declOp.getShift();
mlir::MutableOperandRange mutableOpRange = declOp.getShiftMutable();
mutableOpRange.assign(shiftOpers);
elemTy = convertType(fieldTy, fileAttr, scope, declOp);
mutableOpRange.assign(originalShift);
```

Fixes #113178.
2024-10-24 13:22:28 +01:00
Abid Qadeer
c07abf7272 [flang][debug] Support fir::ReferenceType. (#113480) 2024-10-24 11:38:17 +01:00
Valentin Clement (バレンタイン クレメン)
4e40b71c51 [flang][cuda] Add specialized gpu.launch_func conversion (#113493) 2024-10-23 15:28:51 -07:00
Renaud Kauffmann
f1e59dcb45 Renaming Cuf passes to CUF (#113351)
For consistency with other dialects and other CUF passes and files, this
patch renames passes CufOpConversion to CUFOpConversion,
CufImplicitDeviceGlobal to CUFDeviceGlobal.
It also renames the file.
2024-10-22 12:50:31 -07:00
Abid Qadeer
95b4128c6a [flang][debug] Don't generate debug for compiler-generated variables (#112423)
Flang generates many globals to handle derived types. There was a check
in debug info to filter them based on the information that their names
start with a period. This changed since PR#104859 where 'X' is being
used instead of '.'.

This PR fixes this issue by also adding 'X' in that list. As user
variables gets lower cased by the NameUniquer, there is no risk that
those will be filtered out. I added a test for that to be sure.
2024-10-21 11:27:34 +01:00
Valentin Clement (バレンタイン クレメン)
d37bc32a65 [flang][cuda] Translate cuf.register_kernel and cuf.register_module (#112972)
Add LLVM IR Translation for `cuf.register_module` and
`cuf.register_kernel`. These are lowered to function call to the CUF
runtime entries.
2024-10-18 21:31:47 -07:00
Valentin Clement (バレンタイン クレメン)
5406834cda [flang][cuda] Add cuf.register_module operation (#112971)
Add a new operation to register the fatbin and pass it to
`cuf.register_kernel`
2024-10-18 21:30:38 -07:00
Renaud Kauffmann
864902e9b4 [flang][cuda] Call CUFGetDeviceAddress to get global device address from host address (#112989) 2024-10-18 17:35:38 -07:00
Valentin Clement (バレンタイン クレメン)
85880140be [flang][cuda] Add kernel registration in CUF constructor (#112416)
Update the CUF constructor with the cuf.register_kernel operations.
2024-10-15 14:18:37 -07:00
jeanPerier
367c3c968e [flang] correctly deal with bind(c) derived type result ABI (#111969)
Derived type results of BIND(C) function should be returned according
the the C ABI for returning the related C struct type.

This currently did not happen since the abstract-result pass was forcing
the Fortran ABI for all derived type results.
use the bind_c attribute that was added on call/func/dispatch in FIR to
prevent such rewrite in the abstract result pass, and update the
target-rewrite pass to deal with the struct return ABI.

So far, the target specific part of the target-rewrite is only
implemented for X86-64 according to the "System V Application Binary
Interface AMD64 v1", the other targets will hit a TODO, just like for
BIND(C), VALUE derived type arguments.

This intends to deal with #102113.

This is a re-land of #111678 with an extra commit to keep rewriting `type(c_ptr)`
results to `!fir.ref<none>` in the abstract result pass regardless of the ABIs.
2024-10-14 09:35:29 +02:00
donald chen
4b3f251bad [mlir] [dataflow] unify semantics of program point (#110344)
The concept of a 'program point' in the original data flow framework is
ambiguous. It can refer to either an operation or a block itself. This
representation has different interpretations in forward and backward
data-flow analysis. In forward data-flow analysis, the program point of
an operation represents the state after the operation, while in backward
data flow analysis, it represents the state before the operation. When
using forward or backward data-flow analysis, it is crucial to carefully
handle this distinction to ensure correctness.

This patch refactors the definition of program point, unifying the
interpretation of program points in both forward and backward data-flow
analysis.

How to integrate this patch?

For dense forward data-flow analysis and other analysis (except dense
backward data-flow analysis), the program point corresponding to the
original operation can be obtained by `getProgramPointAfter(op)`, and
the program point corresponding to the original block can be obtained by
`getProgramPointBefore(block)`.

For dense backward data-flow analysis, the program point corresponding
to the original operation can be obtained by
`getProgramPointBefore(op)`, and the program point corresponding to the
original block can be obtained by `getProgramPointAfter(block)`.

NOTE: If you need to get the lattice of other data-flow analyses in
dense backward data-flow analysis, you should still use the dense
forward data-flow approach. For example, to get the Executable state of
a block in dense backward data-flow analysis and add the dependency of
the current operation, you should write:

``getOrCreateFor<Executable>(getProgramPointBefore(op),
getProgramPointBefore(block))``

In case above, we use getProgramPointBefore(op) because the analysis we
rely on is dense backward data-flow, and we use
getProgramPointBefore(block) because the lattice we query is the result
of a non-dense backward data flow computation.

related dsscussion:
https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8
corresponding PSA:
https://discourse.llvm.org/t/psa-program-point-semantics-change/81479
2024-10-11 21:59:05 +08:00
jeanPerier
4ddc756bcc Revert "[flang] correctly deal with bind(c) derived type result ABI" (#111858)
Reverts llvm/llvm-project#111678

Causes ARM failure in test suite. TYPE(C_PTR) result should not regress
even if struct ABI no implemented for the target.
https://lab.llvm.org/buildbot/#/builders/143/builds/2731
I need to revisit this.
2024-10-10 17:25:57 +02:00
jeanPerier
480e7f0667 [flang] correctly deal with bind(c) derived type result ABI (#111678)
Derived type results of BIND(C) function should be returned according
the the C ABI for returning the related C struct type.

This currently did not happen since the abstract-result pass was forcing
the Fortran ABI for all derived type results.
use the bind_c attribute that was added on call/func/dispatch in FIR to
prevent such rewrite in the abstract result pass, and update the
target-rewrite pass to deal with the struct return ABI.

So far, the target specific part of the target-rewrite is only
implemented for X86-64 according to the "System V Application Binary
Interface AMD64 v1", the other targets will hit a TODO, just like for
BIND(C), VALUE derived type arguments.

This intends to deal with
https://github.com/llvm/llvm-project/issues/102113.
2024-10-10 15:37:19 +02:00
Walter Erquinigo
2918e779a9 [mlir][debuginfo] Add support for subprogram annotations (#110946)
LLVM already supports `DW_TAG_LLVM_annotation` entries for subprograms,
but this hasn't been surfaced to the LLVM dialect.
I'm doing the minimal amount of work to support string-based
annotations, which is useful for attaching metadata to
functions, which is useful for debuggers to offer features beyond basic
DWARF.
As LLVM already supports this, this patch is not controversial.
2024-10-07 17:51:08 -04:00
Tom Eccles
91d6e77d8b [flang][debug] set DW_AT_main_subprogram for fortran main function (#111350)
Requested here
https://github.com/llvm/llvm-project/pull/111022#issuecomment-2396287781
2024-10-07 13:59:41 +01:00
Matthias Springer
206fad0e21 [mlir][NFC] Mark type converter in populate... functions as const (#111250)
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.

Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate pattern now have
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.

Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
2024-10-05 21:32:40 +02:00
Renaud Kauffmann
72f38040dd Removing CUF runtime dependency with llvm::EnableABIBreakingChecks (#111200)
getMemType happens to only be used in CufOpConversion.cpp. So, moving it
here for now. If it needs to be shared with the runtime, then care
should be taken in not bringing the include `#include
"flang/Optimizer/Dialect/CUF/Attributes/CUFAttr.h"` which introduces the
dependency with llvm::EnableABIBreakingChecks
2024-10-04 13:18:08 -07:00
Tom Eccles
f6f4c177ef [flang][debug] Use PROGRAM name for main function name (#111022)
For example, in

        PROGRAM test_program
          ...
        END PROGRAM

This allows a user to break on the main function with `break
test_program`. This matches what classic flang and gfortran do.
2024-10-04 10:46:58 +01:00
jeanPerier
1753de2d95 [flang][FIR] remove fir.complex type and its fir.real element type (#111025)
Final patch of
https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292

Since fir.real was only still used as fir.complex element type, this
patch removes it at the same time.
2024-10-04 09:57:03 +02:00
Abid Qadeer
fc4b1a303b [flang][debug] Handle array types with variable size/bounds. (#110686)
The debug information generated by flang did not handle the cases where
dimension or lower bounds of the arrays were variable. This PR fixes
this issue. It will help distinguish assumed size arrays from cases
where array size are variable. It also handles the variable lower bounds
for assumed shape arrays.
    
Fixes #98879.
2024-10-03 21:29:47 +01:00
jeanPerier
c4204c0b29 [flang] replace fir.complex usages with mlir complex (#110850)
Core patch of
https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292.
After that, the last step is to remove fir.complex from FIR types.
2024-10-03 17:10:57 +02:00
jeanPerier
a78359c2ed [flang] add procedure flags to fir.dispatch (#110970)
Currently, it is not possible to distinguish between BIND(C) from
non-BIND(C) type bound procedure call at the FIR level.
This will be a problem when dealing with derived type BIND(C) function
where the ABI differ between BIND(C)/non-BIND(C) but the FIR signature
looks like the same at the FIR level.

Fix this by adding the Fortran procedure attributes to fir.distpatch,
and propagating it until the related fir.call is generated in
fir.dispatch codegen.
2024-10-03 17:10:03 +02:00
Abid Qadeer
1094ee71da [flang][debug] Better handle array lower bound of assumed shape arrays. (#110302)
As mentioned in #108633, we don't respect the lower bound of the assumed
shape arrays if those were specified. It happens in both cases:
1. When caller has non-default lower bound and callee has default
2. When callee has non-default lower bound and caller has default

This PR tries to fix this issue by improving our generation of lower
bound attribute on DICompositeTypeAttr. If we see a lower bound in the
declaration, we respect that. Note that same function is also used for
allocatable/pointer variables. We make sure that we get the lower bound
from descriptor in those cases. Please note that DWARF assumes a lower
bound of 1 so in many cases we don't need to generate the lower bound.

Fixes #108633.
2024-09-30 20:31:08 +01:00
Valentin Clement (バレンタイン クレメン)
3e5e48a173 [flang][cuda] Fix buildbot failure (#110540)
https://lab.llvm.org/buildbot/#/builders/89/builds/7488
2024-09-30 10:41:59 -07:00
Valentin Clement (バレンタイン クレメン)
7dbc664549 [flang][cuda] Convert data transfer between scalar and arrays (#110180)
Add conversion of data transfer between scalars or between arrays. 

Scalar to array are not handled yet.
2024-09-30 10:27:07 -07:00
Valentin Clement (バレンタイン クレメン)
39e254ec91 [flang][cuda] Convert cuf.alloc and cuf.free for scalar and arrays (#110055)
This patch adds more conversion of cuf.alloc and cuf.free for scalars,
constant size arrays and dynamic size arrays
2024-09-30 09:48:25 -07:00
Abid Qadeer
d556e38fe8 [flang][debug] Support derived type components with box types. (#109424)
Our support for derived types uses `getTypeSizeAndAlignment` to
calculate the offset of the members. The `fir.box` was not supported in
that function. It meant that any member which required descriptor was
not supported in the derived type.
    
We convert the type into an llvm type and then use the DataLayout to
calculate the size/offset of a member. There is no dependency on
`getTypeSizeAndAlignment` to get the size of the types.

There are 2 other changes in this PR:

1. The `recID` field is used to handle cases where we have a member
references its parent type.

2. A type cache is maintained to avoid duplication. It is also needed
for circular reference case.


Fixes #108001.
2024-09-30 10:31:56 +01:00
Abid Qadeer
69ef3b102c [flang][debug] Allow variable length for dummy char arguments. (#109448)
As pointed out by @jeanPerier
[here](https://github.com/llvm/llvm-project/pull/108283#discussion_r1764528809),
we don't need to restrict the length of the dummy character argument
location to `fir.unboxchar`. This PR removes that restriction.
2024-09-26 10:08:48 +01:00
Valentin Clement (バレンタイン クレメン)
b15bd3fc65 [flang][cuda] Add global constructor for allocators registration (#109854)
This pass creates the constructor function to call the allocator
registration and adds it to the global_ctors.
2024-09-24 17:04:54 -07:00
Valentin Clement (バレンタイン クレメン)
f760db1249 [flang][cuda][NFC] Expose conversion patterns from CUF to FIR calls (#109465)
Expose conversion patterns so they can be reused outside of this pass.
2024-09-20 22:28:10 -07:00
Valentin Clement (バレンタイン クレメン)
2e89e6b59a [flang][cuda] Flag globals used in device function (#109460) 2024-09-20 18:03:25 -07:00
Valentin Clement
156035ed4d [flang][cuda] Convert module allocation/deallocation to runtime calls
Convert `cuf.allocate` and `cuf.deallocate` to the runtime entry points added
in #109213

Was reviewed in https://github.com/llvm/llvm-project/pull/109214 but the
parent branch was closed for some reason.
2024-09-18 20:49:08 -07:00
Valentin Clement (バレンタイン クレメン)
cdf447baa5 [flang][cuda] Add function to allocate and deallocate device module variable (#109213)
This patch adds new runtime entry points that perform the simple
allocation/deallocation of module allocatable variable with cuda
attributes.
When the allocation is initiated on the host, the descriptor on the
device is synchronized. Both descriptors point to the same data on the
device.

This is the first PR of a stack.
2024-09-18 20:22:06 -07:00
Abid Qadeer
76347ee958 [flang][debug] Improve handling of dummy character arguments. (#108283)
As described in #107998, we were not handling the case well when length
of the character is not part of the type. This PR handles one of the
case when the length can be calculated by looking at the result of
corresponding `fir.unboxchar`.

The DIStringTypeAttr have a `stringLength` field that can be a variable.
We create an artificial variable that will hold the length and used as
value of `stringLength` field. The variable is then attached with
a `DbgValueOp`.

Fixes #107998.
2024-09-18 13:52:23 +01:00
Valentin Clement (バレンタイン クレメン)
0bbebf6f3a [flang][cuda] Convert cuf.data_transfer with descriptors (#108890)
Convert cuf.data_transfer operations involving descriptors to the newly
introduced entry points (#108244).
2024-09-17 11:00:31 -07:00
Abid Qadeer
b6f72fc1e2 [flang][debug] Generate correct subroutine type. (#108605)
We pass a list of types when creating a subroutine type. The first one
is supposed to be return type and the rest are the argument types. A
subroutine does not have a return type so an argument type could be
confused as a return type. To fix this, if there is no return type, we
generate a null type as a place holder.

Fixes #108564.
2024-09-17 11:07:23 +01:00
Abid Qadeer
1fc288bf48 [flang][debug] Handle lower bound in assumed size arrays. (#108523)
Fixes #108411
2024-09-17 11:02:10 +01:00
Tom Eccles
1e64864c6f [flang][StackArrays] run in parallel on different functions (#108842)
Since #108562, StackArrays no longer has to create function declarations
at the module level to use stacksave/stackrestore LLVM intrinsics. This
will allow it to run in parallel on multiple functions at the same time.
2024-09-17 10:25:37 +01:00
Tom Eccles
5aaf384b16 [flang][NFC] use llvm.intr.stacksave/restore instead of opaque calls (#108562)
The new LLVM stack save/restore intrinsic operations are more convenient
than function calls because they do not add function declarations to the
module and therefore do not block the parallelisation of passes.
Furthermore they could be much more easily marked with memory effects
than function calls if that ever proved useful.

This builds on top of #107879.

Resolves #108016
2024-09-16 12:33:37 +01:00
Abid Qadeer
db64e69fa2 [flang][debug] Handle 'used' module. (#107626)
As described in #98883, we have to qualify a module variable name in
debugger to get its value. This PR tries to remove this limitation.
    
LLVM provides `DIImportedEntity` to handle such cases but the PR is made
more complicated due to the following 2 issues.
    
1. The MLIR attributes are readonly and we have a circular dependency
here. This has to be handled using the recursive interface provided by
the MLIR. This requires us to first create a place holder
`DISubprogramAttr` which is used in creating `DIImportedEntityAttr`.
Later another `DISubprogramAttr` is created which replaces the place
holder.
    
2. The flang IR does not provide any information about the 'used' module
so this has to be extracted by doing a pass over the
`DeclareOp` in the function. This presents certain limitation as 'only'
and module variable renaming may not be handled properly.
    
Due to the change in `DISubprogramAttr`, some tests also needed to be
adjusted.
    
Fixes #98883.
2024-09-11 09:31:53 +01:00