Kernel launch in CUF are converted to `gpu.launch_func`. When the kernel
has `cluster_dims` specified these get carried over to the
`gpu.launch_func` operation. This patch updates the special conversion
of `gpu.launch_func` when cluster dims are present to the newly added
entry point.
nsw is now added to do-variable increment when -fno-wrapv is enabled as
GFortran seems to do.
That means the option introduced by #91579 isn't necessary any more.
Note that the feature of -flang-experimental-integer-overflow is enabled
by default.
The lower bound information for the array members of a derived type
can't be obtained from the `DeclareOp`. It has to be extracted from the
`TypeInfoOp`. That was left as FIXME in the code. This PR adds the
missing functionality to fix the issue.
I tried the following approaches before settling on the current one that
is to generate `DITypeAttr` for array members right where the components
are being processed.
1. Generate a temp XDeclareOp with the shift information obtained from
the `TypeInfoOp`. This caused a few issues mostly related to
`unrealized_conversion_cast`.
2. Change the shift operands in the `declOp` that was passed in the
function before calling `convertType`. The code can be seen in the
abcf031a8e5a02f0081e7f293858302e7bf47bec. It essentially looked like the
following. It works correctly but I was not sure if temporarily changing
the `declOp` is the safe thing to do.
```
mlir::OperandRange originalShift = declOp.getShift();
mlir::MutableOperandRange mutableOpRange = declOp.getShiftMutable();
mutableOpRange.assign(shiftOpers);
elemTy = convertType(fieldTy, fileAttr, scope, declOp);
mutableOpRange.assign(originalShift);
```
Fixes#113178.
For consistency with other dialects and other CUF passes and files, this
patch renames passes CufOpConversion to CUFOpConversion,
CufImplicitDeviceGlobal to CUFDeviceGlobal.
It also renames the file.
Flang generates many globals to handle derived types. There was a check
in debug info to filter them based on the information that their names
start with a period. This changed since PR#104859 where 'X' is being
used instead of '.'.
This PR fixes this issue by also adding 'X' in that list. As user
variables gets lower cased by the NameUniquer, there is no risk that
those will be filtered out. I added a test for that to be sure.
Derived type results of BIND(C) function should be returned according
the the C ABI for returning the related C struct type.
This currently did not happen since the abstract-result pass was forcing
the Fortran ABI for all derived type results.
use the bind_c attribute that was added on call/func/dispatch in FIR to
prevent such rewrite in the abstract result pass, and update the
target-rewrite pass to deal with the struct return ABI.
So far, the target specific part of the target-rewrite is only
implemented for X86-64 according to the "System V Application Binary
Interface AMD64 v1", the other targets will hit a TODO, just like for
BIND(C), VALUE derived type arguments.
This intends to deal with #102113.
This is a re-land of #111678 with an extra commit to keep rewriting `type(c_ptr)`
results to `!fir.ref<none>` in the abstract result pass regardless of the ABIs.
The concept of a 'program point' in the original data flow framework is
ambiguous. It can refer to either an operation or a block itself. This
representation has different interpretations in forward and backward
data-flow analysis. In forward data-flow analysis, the program point of
an operation represents the state after the operation, while in backward
data flow analysis, it represents the state before the operation. When
using forward or backward data-flow analysis, it is crucial to carefully
handle this distinction to ensure correctness.
This patch refactors the definition of program point, unifying the
interpretation of program points in both forward and backward data-flow
analysis.
How to integrate this patch?
For dense forward data-flow analysis and other analysis (except dense
backward data-flow analysis), the program point corresponding to the
original operation can be obtained by `getProgramPointAfter(op)`, and
the program point corresponding to the original block can be obtained by
`getProgramPointBefore(block)`.
For dense backward data-flow analysis, the program point corresponding
to the original operation can be obtained by
`getProgramPointBefore(op)`, and the program point corresponding to the
original block can be obtained by `getProgramPointAfter(block)`.
NOTE: If you need to get the lattice of other data-flow analyses in
dense backward data-flow analysis, you should still use the dense
forward data-flow approach. For example, to get the Executable state of
a block in dense backward data-flow analysis and add the dependency of
the current operation, you should write:
``getOrCreateFor<Executable>(getProgramPointBefore(op),
getProgramPointBefore(block))``
In case above, we use getProgramPointBefore(op) because the analysis we
rely on is dense backward data-flow, and we use
getProgramPointBefore(block) because the lattice we query is the result
of a non-dense backward data flow computation.
related dsscussion:
https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8
corresponding PSA:
https://discourse.llvm.org/t/psa-program-point-semantics-change/81479
Derived type results of BIND(C) function should be returned according
the the C ABI for returning the related C struct type.
This currently did not happen since the abstract-result pass was forcing
the Fortran ABI for all derived type results.
use the bind_c attribute that was added on call/func/dispatch in FIR to
prevent such rewrite in the abstract result pass, and update the
target-rewrite pass to deal with the struct return ABI.
So far, the target specific part of the target-rewrite is only
implemented for X86-64 according to the "System V Application Binary
Interface AMD64 v1", the other targets will hit a TODO, just like for
BIND(C), VALUE derived type arguments.
This intends to deal with
https://github.com/llvm/llvm-project/issues/102113.
LLVM already supports `DW_TAG_LLVM_annotation` entries for subprograms,
but this hasn't been surfaced to the LLVM dialect.
I'm doing the minimal amount of work to support string-based
annotations, which is useful for attaching metadata to
functions, which is useful for debuggers to offer features beyond basic
DWARF.
As LLVM already supports this, this patch is not controversial.
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.
Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate pattern now have
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.
Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
getMemType happens to only be used in CufOpConversion.cpp. So, moving it
here for now. If it needs to be shared with the runtime, then care
should be taken in not bringing the include `#include
"flang/Optimizer/Dialect/CUF/Attributes/CUFAttr.h"` which introduces the
dependency with llvm::EnableABIBreakingChecks
For example, in
PROGRAM test_program
...
END PROGRAM
This allows a user to break on the main function with `break
test_program`. This matches what classic flang and gfortran do.
The debug information generated by flang did not handle the cases where
dimension or lower bounds of the arrays were variable. This PR fixes
this issue. It will help distinguish assumed size arrays from cases
where array size are variable. It also handles the variable lower bounds
for assumed shape arrays.
Fixes#98879.
Currently, it is not possible to distinguish between BIND(C) from
non-BIND(C) type bound procedure call at the FIR level.
This will be a problem when dealing with derived type BIND(C) function
where the ABI differ between BIND(C)/non-BIND(C) but the FIR signature
looks like the same at the FIR level.
Fix this by adding the Fortran procedure attributes to fir.distpatch,
and propagating it until the related fir.call is generated in
fir.dispatch codegen.
As mentioned in #108633, we don't respect the lower bound of the assumed
shape arrays if those were specified. It happens in both cases:
1. When caller has non-default lower bound and callee has default
2. When callee has non-default lower bound and caller has default
This PR tries to fix this issue by improving our generation of lower
bound attribute on DICompositeTypeAttr. If we see a lower bound in the
declaration, we respect that. Note that same function is also used for
allocatable/pointer variables. We make sure that we get the lower bound
from descriptor in those cases. Please note that DWARF assumes a lower
bound of 1 so in many cases we don't need to generate the lower bound.
Fixes#108633.
Our support for derived types uses `getTypeSizeAndAlignment` to
calculate the offset of the members. The `fir.box` was not supported in
that function. It meant that any member which required descriptor was
not supported in the derived type.
We convert the type into an llvm type and then use the DataLayout to
calculate the size/offset of a member. There is no dependency on
`getTypeSizeAndAlignment` to get the size of the types.
There are 2 other changes in this PR:
1. The `recID` field is used to handle cases where we have a member
references its parent type.
2. A type cache is maintained to avoid duplication. It is also needed
for circular reference case.
Fixes#108001.
This patch adds new runtime entry points that perform the simple
allocation/deallocation of module allocatable variable with cuda
attributes.
When the allocation is initiated on the host, the descriptor on the
device is synchronized. Both descriptors point to the same data on the
device.
This is the first PR of a stack.
As described in #107998, we were not handling the case well when length
of the character is not part of the type. This PR handles one of the
case when the length can be calculated by looking at the result of
corresponding `fir.unboxchar`.
The DIStringTypeAttr have a `stringLength` field that can be a variable.
We create an artificial variable that will hold the length and used as
value of `stringLength` field. The variable is then attached with
a `DbgValueOp`.
Fixes#107998.
We pass a list of types when creating a subroutine type. The first one
is supposed to be return type and the rest are the argument types. A
subroutine does not have a return type so an argument type could be
confused as a return type. To fix this, if there is no return type, we
generate a null type as a place holder.
Fixes#108564.
Since #108562, StackArrays no longer has to create function declarations
at the module level to use stacksave/stackrestore LLVM intrinsics. This
will allow it to run in parallel on multiple functions at the same time.
The new LLVM stack save/restore intrinsic operations are more convenient
than function calls because they do not add function declarations to the
module and therefore do not block the parallelisation of passes.
Furthermore they could be much more easily marked with memory effects
than function calls if that ever proved useful.
This builds on top of #107879.
Resolves#108016
As described in #98883, we have to qualify a module variable name in
debugger to get its value. This PR tries to remove this limitation.
LLVM provides `DIImportedEntity` to handle such cases but the PR is made
more complicated due to the following 2 issues.
1. The MLIR attributes are readonly and we have a circular dependency
here. This has to be handled using the recursive interface provided by
the MLIR. This requires us to first create a place holder
`DISubprogramAttr` which is used in creating `DIImportedEntityAttr`.
Later another `DISubprogramAttr` is created which replaces the place
holder.
2. The flang IR does not provide any information about the 'used' module
so this has to be extracted by doing a pass over the
`DeclareOp` in the function. This presents certain limitation as 'only'
and module variable renaming may not be handled properly.
Due to the change in `DISubprogramAttr`, some tests also needed to be
adjusted.
Fixes#98883.