Commit Graph

172 Commits

Author SHA1 Message Date
Nikita Popov
29441e4f5f [IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
Abid Qadeer
afa4681ce4 [flang][debug] Add support for common blocks. (#112398)
This PR adds debug support for common blocks in flang. As variables that
are part of a common block don't have a special marker to recognize
them, we use the following check to find them.

%0 = fir.address_of(@a)
%1 = fir.convert %0
%2 = fir.coordinate_of %1, %c0
%3 = fir.convert %2
%4 = fircg.ext_declare %3

If the memref of a fircg.ext_declare points to a fir.coordinate_of and
that in turn points to a fir.address_of (ignoring intermediate
fir.convert) then we assume that it is a common block variable. The
fir.address_of gives us the global symbol which is the storage for the
common block and fir.coordinate_of provides the offset in this storage.
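
The check described above can be sketched in plain C++ as a walk over defining operations. This is only an illustration: `Op`, `OpKind`, `skipConverts` and `findCommonBlockSymbol` are hypothetical stand-ins, not the MLIR or flang API.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a FIR operation; not the real MLIR API.
enum class OpKind { AddressOf, Convert, CoordinateOf, ExtDeclare, Other };

struct Op {
  OpKind kind;
  const Op *operand = nullptr; // defining op of the memref/base operand
  std::string symbol;          // global symbol, only set for AddressOf
};

// Skip over any chain of convert-like ops.
inline const Op *skipConverts(const Op *op) {
  while (op && op->kind == OpKind::Convert)
    op = op->operand;
  return op;
}

// Return the global symbol of the common block if `declare` matches
// declare -> [convert]* -> coordinate_of -> [convert]* -> address_of.
inline std::optional<std::string> findCommonBlockSymbol(const Op &declare) {
  assert(declare.kind == OpKind::ExtDeclare && "expected a declare op");
  const Op *coord = skipConverts(declare.operand);
  if (!coord || coord->kind != OpKind::CoordinateOf)
    return std::nullopt;
  const Op *addr = skipConverts(coord->operand);
  if (!addr || addr->kind != OpKind::AddressOf)
    return std::nullopt;
  return addr->symbol;
}
```

A declare whose memref reaches the global directly, without a coordinate op in between, is not treated as a common block member in this sketch.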

The debug hierarchy looks as follows:

subroutine f3
  integer :: x, y
  common /a/ x, y
end subroutine

@a_ = global { ... } { ... }, !dbg !26, !dbg !28

!23 = !DISubprogram(name: "f3"...)
!24 = !DICommonBlock(scope: !23, name: "a", ...)
!25 = !DIGlobalVariable(name: "x", scope: !24 ...)
!26 = !DIGlobalVariableExpression(var: !25, expr: !DIExpression())
!27 = !DIGlobalVariable(name: "y", scope: !24 ...)
!28 = !DIGlobalVariableExpression(var: !27, expr:
!DIExpression(DW_OP_plus_uconst, 4))

This required the following changes:

1. Instead of using DIGlobalVariableAttr in the FusedLoc of GlobalOp, we
use DIGlobalVariableExpressionAttr. This allows us to generate the
DIExpression where we have the information.

2. Previously, only one DIGlobalVariableExpressionAttr could be linked
to one global op. I recently removed this restriction in mlir. To make
use of it, we add an ArrayAttr to the FusedLoc of a GlobalOp. This
allows us to pass multiple DIGlobalVariableExpressionAttr.

3. I was depending on the name of global for the name of the common
block. The name gets a '_' appended. I could not find a utility function
in flang to remove it so I have to brute force it.
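
A minimal sketch of what such a brute-force removal could look like, assuming the only decoration is a single trailing `_`. The helper name `commonBlockName` is hypothetical and is not a flang utility.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: drop one trailing '_' from the global's name to
// recover the common block name. Names without the suffix are unchanged.
inline std::string commonBlockName(std::string_view globalName) {
  if (!globalName.empty() && globalName.back() == '_')
    globalName.remove_suffix(1);
  return std::string(globalName);
}
```

With this helper, a global named `a_` would yield the common block name `a`.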
2025-01-28 12:54:15 +00:00
Kareem Ergawy
1e2d5f7943 [NFC][mlir][OpenMP] Remove mentions of target from generic loop rewrite (#124528)
This removes mentions of `target` from the generic `loop` rewrite pass
since there is no need for it anyway. It is enough to detect `loop`'s
nesting within `teams` or `parallel` directives.
2025-01-27 16:44:17 +01:00
Kareem Ergawy
29f7392c73 [flang][OpenMP] Rewrite standalone loop (without bind) directives to simd (#122632)
Extends conversion support for `loop` directives. This PR handles
standalone `loop` constructs that do not have a `bind` clause attached
by rewriting them to equivalent `simd` constructs. The reasoning behind
that decision is documented in the rewrite function itself.
2025-01-21 14:56:00 +01:00
Valentin Clement (バレンタイン クレメン)
12ba74e181 [flang] Do not produce result for void runtime call (#123155)
Runtime function calls to void functions produce an SSA value
because the FunctionType result is set to NoneType, which is later
translated to an empty struct. This is not an issue when going to LLVM IR
but it breaks when lowering a gpu module to PTX. This patch updates the
RTModel to correctly set the FunctionType result type to nothing.

This is one runtime call before this patch at the LLVM IR dialect step.
```
%45 = llvm.call @_FortranAAssign(%arg0, %1, %44, %4) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
```

After the patch the call is correctly formed:
```
llvm.call @_FortranAAssign(%arg0, %1, %44, %4) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> ()
```

Without the patch it would lead to errors like:
```
ptxas /tmp/mlir-cuda_device_mod-nvptx64-nvidia-cuda-sm_60-e804b6.ptx, line 10; error   : Output parameter cannot be an incomplete array.
ptxas /tmp/mlir-cuda_device_mod-nvptx64-nvidia-cuda-sm_60-e804b6.ptx, line 125; error   : Call has wrong number of parameters
```

The change is pretty much mechanical.
2025-01-16 12:34:38 -08:00
Slava Zakharin
e3cd88a7be [flang] Fixed StackArrays assertion after #121919. (#122550)
`findAllocaLoopInsertionPoint()` hit an assertion because it could not
find the `fir.freemem` behind the `fir.convert`.
I think it is better to look for `fir.freemem` the same way,
with the look-through walk.
2025-01-13 11:56:11 -08:00
macurtis-amd
d291e45909 [flang] Teach omp-map-info-finalization to reuse descriptor allocas (#122507)
Internal testing shows improvements in some SPEC HPC benchmarks with
this change.
2025-01-11 07:27:19 -06:00
Tom Eccles
303249c449 [flang][StackArrays] track pointers through fir.convert (#121919)
This does add a little computational complexity because now every
freemem operation has to be tested for every allocation. This could be
improved with some more memoisation but I think it is easier to read
this way. Let me know if you would prefer me to change this to
pre-compute the normalised addresses each freemem operation is using.
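
The pairing of frees with allocations through converts can be sketched as follows. This is an illustrative model only: `Value`, `normalise` and `countMatchingFrees` are hypothetical names, and the real pass works on MLIR values rather than this toy struct.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an SSA value; `convertedFrom` is non-null when
// the value is the result of a convert-like cast of another value.
struct Value {
  const Value *convertedFrom = nullptr;
};

// Walk back through any chain of converts to the underlying address.
inline const Value *normalise(const Value *v) {
  assert(v != nullptr && "expected a value");
  while (v->convertedFrom)
    v = v->convertedFrom;
  return v;
}

// Count the frees that release `allocation`, comparing normalised addresses.
// Every freed pointer is tested against the allocation, without memoisation.
inline std::size_t countMatchingFrees(const Value &allocation,
                                      const std::vector<const Value *> &freed) {
  std::size_t matches = 0;
  for (const Value *ptr : freed)
    if (normalise(ptr) == normalise(&allocation))
      ++matches;
  return matches;
}
```

Calling this once per allocation gives the freemem-times-allocation cost mentioned above; caching the result of `normalise` per freed pointer would be the memoised variant.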

Weirdly, this change resulted in a verifier failure for the fir.declare
in the previous test case. Maybe it was previously removed as dead code
and now it isn't. Anyway I fixed that too.
2025-01-08 10:05:21 +00:00
agozillon
5137c209f0 [Flang][OpenMP] Fix allocating arrays with size intrinsic (#119226)
Attempt to prevent the following example from causing an assert or ICE:

```
   subroutine test(a)
        implicit none
        integer :: i
        real(kind=real64), dimension(:) :: a
        real(kind=real64), dimension(size(a, 1)) :: b

!$omp target map(tofrom: b)
        do i = 1, 10
            b(i) = i
        end do
!$omp end target
end subroutine
```

Where we utilise a Fortran intrinsic (size) to calculate the size of
allocatable arrays and then map it to device.
2025-01-03 16:46:15 +01:00
Slava Zakharin
711419e302 [flang] Enable loop-versioning for slices. (#120344)
Loops resulting from array expressions like array(:,i)
may be versioned for the unit stride of the innermost dimension,
when the initial array is an assumed-shape array (which are contiguous
in many Fortran programs).
This speeds up facerec by about 12% due to further vectorization
of the innermost loop produced for the total SUM reduction.
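
As a rough illustration of the idea, a versioned loop checks the stride at run time and takes a contiguous fast path when it is 1. The function `sumStrided` is a hypothetical C++ analogue, not the code flang generates.

```cpp
#include <cassert>
#include <cstddef>

// Sum `n` elements starting at `base`, stepping `stride` elements each time.
inline double sumStrided(const double *base, std::size_t n,
                         std::ptrdiff_t stride) {
  assert((base != nullptr || n == 0) && "null base with non-zero length");
  double total = 0.0;
  if (stride == 1) {
    // Versioned fast path: contiguous accesses that are easy to vectorize.
    for (std::size_t i = 0; i < n; ++i)
      total += base[i];
  } else {
    // Generic fallback: arbitrary stride.
    for (std::size_t i = 0; i < n; ++i)
      total += base[static_cast<std::ptrdiff_t>(i) * stride];
  }
  return total;
}
```

Both branches compute the same result; only the access pattern seen by the optimizer differs.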
2024-12-23 07:53:10 -08:00
Kareem Ergawy
e532241b02 Re-apply (#117867): [flang][OpenMP] Implicitly map allocatable record fields (#120374)
This re-applies #117867 with a small fix that hopefully prevents build
bot failures. The fix is avoiding `dyn_cast` for the result of
`getOperation()`. Instead we can assign the result to `mlir::ModuleOp`
directly since the type of the operation is known statically (`OpT` in
`OperationPass`).
2024-12-18 09:19:45 +01:00
Kareem Ergawy
dc936f3c19 Revert "[flang][OpenMP] Implicitly map allocatable record fields (#117867)" (#120360) 2024-12-18 06:52:24 +01:00
Kareem Ergawy
db09014a07 [flang][OpenMP] Implicitly map allocatable record fields (#117867)
This is a starting PR to implicitly map allocatable record fields.

This PR contains the following changes:
1. Re-purposes some of the utils used in `Lower/OpenMP.cpp` so that
   these utils work on the `mlir::Value` level rather than the
   `semantics::Symbol` level. This takes one step towards enabling
   MLIR passes to more easily do some lowering themselves (e.g. creating
   `omp.map.bounds` ops for implicitly captured data like this PR
   does).
2. Adds support for implicitly capturing and mapping allocatable fields
   in record types.

There is quite some distance still to cover to have full support for
this. I added a number of todos to guide further development.

Co-authored-by: Andrew Gozillon <andrew.gozillon@amd.com>

2024-12-18 05:37:58 +01:00
Michael Klemm
261a4026e8 [Flang][OpenMP] Use internal linkage for OpenMP code-gen'ed helper functions (#117911)
When compiling WORKSHARE constructs in different compilation units, a
linker error occurred if two identical WORKSHARE constructs with a copy
operation were compiled:

```
/usr/bin/ld: module2.o: in function `_workshare_copy_f64':
FIRModule:(.text+0x0): multiple definition of `_workshare_copy_f64'; module1.o:FIRModule:(.text+0x0): first defined here
```

The reason is that the generated copy function has the wrong linkage:

```
0000000000000000 T _workshare_copy_f64
```

while it should be

```
0000000000000000 t _workshare_copy_f64
```
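
For readers more familiar with C++, the same distinction can be sketched with an unnamed namespace, which gives a function internal linkage. The helper name `workshareCopy` is hypothetical and only mirrors the role of the generated copy function.

```cpp
#include <cassert>
#include <cstddef>

namespace {
// Internal linkage: each translation unit gets its own private copy, so two
// object files that both define this helper do not clash at link time.
void workshareCopy(const double *src, double *dst, std::size_t n) {
  assert(((src && dst) || n == 0) && "null buffer with non-zero length");
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];
}
} // namespace
```

Without the unnamed namespace (or `static`), the function would have external linkage and two definitions in different object files would collide, as in the linker error above.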
2024-11-28 17:28:56 +01:00
s-watanabe314
f3cf24fcc4 [flang] Apply nocapture attribute to dummy arguments (#116182)
Apply the llvm.nocapture attribute to dummy arguments that do not have the
target, asynchronous, volatile, or pointer attributes in a procedure
that is not bind(c). This was discussed in
https://discourse.llvm.org/t/applying-the-nocapture-attribute-to-reference-passed-arguments-in-fortran-subroutines/81401
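
The condition stated in this commit message can be written as a small predicate. This is a sketch of the rule as worded here; `DummyArgAttrs` and `canMarkNoCapture` are hypothetical names, not flang's implementation.

```cpp
#include <cassert>

// Hypothetical summary of the Fortran attributes of one dummy argument.
struct DummyArgAttrs {
  bool isTarget = false;
  bool isAsynchronous = false;
  bool isVolatile = false;
  bool isPointer = false;
};

// True when the argument may be marked nocapture under the stated rule:
// none of the four attributes is present and the procedure is not bind(c).
inline bool canMarkNoCapture(const DummyArgAttrs &arg, bool procedureIsBindC) {
  if (procedureIsBindC)
    return false;
  return !(arg.isTarget || arg.isAsynchronous || arg.isVolatile ||
           arg.isPointer);
}
```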
2024-11-28 15:39:26 +09:00
Kareem Ergawy
81f544d465 [flang][OpenMP] Rewrite omp.loop to semantically equivalent ops (#115443)
Introduces a new conversion pass that rewrites `omp.loop` ops to their
semantically equivalent op nests based on the surrounding/binding
context of the `loop` op. Not all forms of `omp.loop` are supported yet.
See `isLoopConversionSupported` for more info on which forms are
supported.
2024-11-28 05:15:06 +01:00
Ivan R. Ivanov
2ed8c5de58 [flang][OpenMP] Fix handling of nested loop wrappers in LowerWorkshare (#117275) 2024-11-26 09:30:27 +09:00
Ivan R. Ivanov
5d38e6e42a [flang] Introduce hlfir.elemental lowerings to omp.workshare_loop_nest (#104748)
This patch adds parallelization support for the following expression in OpenMP
workshare constructs:

* Elemental procedures in array expressions

(reapplied with linking fix)
2024-11-21 11:14:21 +09:00
Ivan Radanov Ivanov
fa22100d57 Revert "[flang] Introduce hlfir.elemental lowerings to omp.workshare_loop_nest (#104748)"
This reverts commit 40c8938ff8.

Linking errors in buildbot build
2024-11-20 10:56:55 +09:00
Ivan R. Ivanov
40c8938ff8 [flang] Introduce hlfir.elemental lowerings to omp.workshare_loop_nest (#104748)
This patch adds parallelization support for the following expression in OpenMP
workshare constructs:

* Elemental procedures in array expressions
2024-11-20 10:49:34 +09:00
Ivan R. Ivanov
e7e5541616 [flang] Lower omp.workshare to other omp constructs (#101446)
Add a new pass that lowers an `omp.workshare` with its binding `omp.workshare.loop_wrapper` loop nests into other OpenMP constructs that can be lowered to LLVM.

More specifically, in order to preserve the sequential execution semantics of the contained code, it wraps portions that need to be executed on a single thread in `omp.single` blocks, converts code that must be parallelized into `omp.wsloop` nests and inserts the appropriate synchronization.
2024-11-19 17:02:16 +09:00
Abid Qadeer
030179c2cb [flang][debug] Support ClassType. (#114809)
This PR adds the handling of `ClassType`. It is treated as a pointer to
the underlying type. Note that a `ClassType`, when passed to a function,
has double indirection, so it is represented as a pointer to the type
(compared to other types which may have a single indirection).

If `ClassType` wraps a pointer or allocatable then we take care to
generate it as PTR -> type (and not PTR -> PTR -> type).

This is how it looks in the debugger.

```
subroutine test_proc (this)
    class(test_type), intent (inout) :: this
    allocate (this%b (3, 2))
    call fill_array_2d (this%b)
    print *, this%a
end
```

```
(gdb) p this
$6 = (PTR TO -> ( Type test_type )) 0x2052a0
(gdb) p this%a
$7 = 0
(gdb) p this%b
$8 = ((1, 2, 3) (4, 5, 6))

```
2024-11-18 11:26:35 +00:00
agozillon
e508bacce4 [Flang][OpenMP] Derived type explicit allocatable member mapping (#113557)
This PR is one of 3 in a PR stack, this is the primary change set which
seeks to extend the current derived type explicit member mapping support
to handle descriptor member mapping at arbitrary levels of nesting. The
PR stack seems to do this reasonably (from testing so far) but as you
can create quite complex mappings with derived types (in particular when
adding allocatable derived types or arrays of allocatable derived types)
I imagine there will be hiccups, which I am more than happy to address.
There will also be further extensions to this work to handle the
implicit auto-magical mapping of descriptor members in derived types and
a few other changes planned for the future (with some ideas on
optimizing things).

The changes in this PR primarily occur in the OpenMP lowering and the
OMPMapInfoFinalization pass.

In the OpenMP lowering several utility functions were added or extended
to support the generation of appropriate intermediate member mappings
which are currently required when the parent (or multiple parents) of a
mapped member are descriptor types. We need to map the entirety of these
types or do a "deep copy" for lack of a better term, where we map both
the base address and the descriptor as without the copying of both of
these we lack the information in the case of the descriptor to access
the member or attach the pointer's data to the pointer and in the latter
case we require the base address to map the chunk of data. Currently we
do not segment descriptor based derived types as we do with regular
non-descriptor derived types, we effectively map their entirety in all
cases at the moment, I hope to address this at some point in the future
as it adds a fair bit of a performance penalty to having nestings of
allocatable derived types as an example. The process of mapping all
intermediate descriptor members in a members path only occurs if a
member has an allocatable or object parent in its symbol path or the
member itself is a member or allocatable. This occurs in the
createParentSymAndGenIntermediateMaps function, which will also generate
the appropriate address for the allocatable member within the derived
type to use as the varPtr field of the map (for intermediate
allocatable maps and final allocatable mappings). In this case it's
necessary as we can't utilise the usual Fortran::lower functionality
such as gatherDataOperandAddrAndBounds without causing issues later in
the lowering due to extra allocas being spawned which seem to affect the
pointer attachment (at least this is my current assumption, it results
in memory access errors on the device due to incorrect map information
generation). This is similar to why we do not use the MLIR value
generated for this and utilise the original symbol provided when mapping
descriptor types external to derived types. Hopefully this can be
rectified in the future so this function can be simplified and more
closely aligned to the other type mappings. We also make use of
fir::CoordinateOp as opposed to the HLFIR version as the HLFIR version
doesn't support the appropriate lowering to FIR necessary at the moment,
we also cannot use a single CoordinateOp (similarly to a single GEP) as
when we index through a descriptor operation (BoxType) we encounter
issues later in the lowering, however in either case we need access to
intermediate descriptors so individual CoordinateOp's aid this
(although, being able to compress them into a smaller amount of
CoordinateOp's may simplify the IR and perhaps result in a better end
product, something to consider for the future).

The other large change area was in the OMPMapInfoFinalization pass,
where the pass had to be extended to support the expansion of box types
(or multiple nestings of box types) within derived types, or box type
derived types. This requires expanding each BoxType mapping from one
into two maps and then modifying all of the existing member indices of
the overarching parent mapping to account for the addition of these new
members alongside adjusting the existing member indices to support the
addition of these new maps which extend the original member indices (as
a base address of a box type is currently considered a member of the box
type at a position of 0 as when lowered to LLVM-IR it's a pointer
contained at this position in the descriptor type, however, this means
extending mapped children of this expanded descriptor type to
additionally incorporate the new member index in the correct location in
its own index list). I believe there is a reasonable amount of comments
that should aid in understanding this better, alongside the test
alterations for the pass.

A subset of the changes were also aimed at making some of the utilities
for packing and unpacking the DenseIntElementsAttr containing the member
indices shareable across the lowering and OMPMapInfoFinalization, this
required moving some functions to the Lower/Support/Utils.h header, and
transforming the lowering structure containing the member index data
into something more similar to the version used in
OMPMapInfoFinalization. There were also some other attempts at tidying
things up in relation to the member index data generation in the
lowering, some of which required creating a logical operator for the
OpenMP ID class so it can be utilised as a map key (it simply utilises
the symbol address for the moment as ordering isn't particularly
important).

Otherwise I have added a set of new tests encompassing some of the
mappings currently supported by this PR (unfortunately as you can have
arbitrary nestings of all shapes and types it's not very feasible to
cover them all).
2024-11-16 12:28:37 +01:00
agozillon
d84d0caf28 [Flang][OpenMP] Update MapInfoFinalization to use BlockArgs Interface and modify use_device_ptr/addr to be order independent (#113919)
This patch primarily updates the MapInfoFinalization pass to utilise the
BlockArgument interface. It also moves the arguments newly added by the
MapInfoFinalization pass to the end of the BlockArg/relevant MapInfo
lists, instead of placing them one position before the owning descriptor type.

During this it was noted that the use_device_ptr/addr handling of target
data was a little bit too order dependent so I've attempted to make it
less so, as we cannot depend on argument ordering to be the same as
Fortran for any future frontends.
2024-11-14 15:47:37 +01:00
Abid Qadeer
a993dfcdbf [flang][debug] Support assumed-rank arrays. (#114404)
Assumed-rank arrays are represented by DIGenericSubrange in debug
metadata. We have to provide 2 things.

1. An expression to get the rank value at runtime from the descriptor.

2. Assuming the dimension number for which we want the array information
has been put on the DWARF expression stack, expressions which will
extract the lowerBound, count and stride information from the descriptor
for the said dimension.

With this patch in place, this is how I see an assumed_rank variable
being evaluated by GDB.

```
function mean(x) result(y)
integer, intent(in) :: x(..)
...
end

program main
use mod
implicit none
integer :: x1,xvec(3),xmat(3,3),xtens(3,3,3)
x1 = 5
xvec = 6
xmat = 7
xtens = 8
print *,mean(xvec), mean(xmat), mean(xtens), mean(x1)
end program main

(gdb) p x
$1 = (6, 6, 6)

(gdb) p x
$2 = ((7, 7, 7) (7, 7, 7) (7, 7, 7))

(gdb) p x
$3 = (((8, 8, 8) (8, 8, 8) (8, 8, 8)) ((8, 8, 8) (8, 8, 8) (8, 8, 8)) ((8, 8, 8) (8, 8, 8) (8, 8, 8)))

(gdb) p x
$4 = 5
```
2024-11-05 18:49:29 +00:00
Abid Qadeer
652988b658 [flang][debug] Support TupleType. (#113917)
Handling is similar to RecordType with the following differences:

1. No check for cyclic references
2. No extra processing for lower bounds of array members.
3. No line information as TupleType is a lowering artefact and does not
really represent an entity in the code.
2024-10-30 09:52:56 +00:00
Abid Qadeer
8239ea3918 [flang][debug] Support IndexType. (#113921) 2024-10-29 12:22:43 +00:00
jeanPerier
64d7e45c40 Revert "[flang][debug] Support mlir::NoneType." (#113769)
Reverts llvm/llvm-project#113550

It turns out this causes compiler crashes with assumed-type arrays and -g.
See https://github.com/llvm/llvm-project/pull/113769 for a reproducer.
2024-10-26 21:38:54 +02:00
Abid Qadeer
85af1926f7 [flang][debug] Support mlir::NoneType. (#113550) 2024-10-25 11:43:25 +01:00
Abid Qadeer
37832d5de2 [flang][debug] Support fir.vector type. (#112951)
This PR converts the `fir.vector<>` to
`DICompositeTypeAttr(DW_TAG_array_type)` with `vector` flag set.
2024-10-24 13:37:32 +01:00
Abid Qadeer
47c1abf4af [flang][debug] Fix array lower bounds in derived type members. (#113183)
The lower bound information for the array members of a derived type
can't be obtained from the `DeclareOp`. It has to be extracted from the
`TypeInfoOp`. That was left as FIXME in the code. This PR adds the
missing functionality to fix the issue.

I tried the following approaches before settling on the current one, which
is to generate `DITypeAttr` for array members right where the components
are being processed.

1. Generate a temp XDeclareOp with the shift information obtained from
the `TypeInfoOp`. This caused a few issues mostly related to
`unrealized_conversion_cast`.

2. Change the shift operands in the `declOp` that was passed in the
function before calling `convertType`. The code can be seen in the
abcf031a8e5a02f0081e7f293858302e7bf47bec. It essentially looked like the
following. It works correctly but I was not sure if temporarily changing
the `declOp` is the safe thing to do.

```
mlir::OperandRange originalShift = declOp.getShift();
mlir::MutableOperandRange mutableOpRange = declOp.getShiftMutable();
mutableOpRange.assign(shiftOpers);
elemTy = convertType(fieldTy, fileAttr, scope, declOp);
mutableOpRange.assign(originalShift);
```

Fixes #113178.
2024-10-24 13:22:28 +01:00
Abid Qadeer
c07abf7272 [flang][debug] Support fir::ReferenceType. (#113480) 2024-10-24 11:38:17 +01:00
Abid Qadeer
95b4128c6a [flang][debug] Don't generate debug for compiler-generated variables (#112423)
Flang generates many globals to handle derived types. There was a check
in debug info to filter them based on the information that their names
start with a period. This changed with PR #104859, where 'X' is
used instead of '.'.

This PR fixes this issue by also adding 'X' to that list. As user
variables get lower-cased by the NameUniquer, there is no risk that
those will be filtered out. I added a test for that to be sure.
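
The filter described here amounts to a first-character test. The sketch below assumes it is applied to the variable name after any mangling has been removed; `isCompilerGeneratedName` is a hypothetical name, not the function used in flang.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical filter: treat names starting with '.' or 'X' as
// compiler-generated. User variable names are lower-cased, so an
// upper-case 'X' prefix cannot come from user code.
inline bool isCompilerGeneratedName(std::string_view name) {
  return !name.empty() && (name.front() == '.' || name.front() == 'X');
}
```

A user variable called `x` is kept because the comparison is case-sensitive.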
2024-10-21 11:27:34 +01:00
Pranav Bhandarkar
11dad2fa51 [flang][OpenMP] - Add MapInfoOp instances for target private variables when needed (#109862)
This PR adds an OpenMP dialect related pass for FIR/HLFIR which creates
`MapInfoOp` instances for certain privatized symbols. For example, if an
allocatable variable is used in a private clause attached to an
`omp.target` op, then the allocatable variable's descriptor will be
needed on the device (e.g. GPU). This descriptor needs to be separately
mapped onto the device. This pass creates the necessary `omp.map.info`
ops for this.
2024-10-20 01:01:39 -05:00
Abid Qadeer
cd12ffb622 [mlir][debug] Allow multiple DIGlobalVariableExpression on globals. (#111981)
Currently, we allow only one DIGlobalVariableExpressionAttr per global.
It is especially evident in import where we pick the first from the list
and ignore the rest. In contrast, LLVM allows multiple
DIGlobalVariableExpression to be attached to the global. They are needed
for correct working of things like DICommonBlock. This PR removes this
restriction in mlir. Changes are mostly mechanical. One thing on which I
went a bit back and forth was the representation inside GlobalOp. I
would be happy to change if there are better ways to do this.

---------

Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
2024-10-13 23:36:00 +01:00
Tom Eccles
91d6e77d8b [flang][debug] set DW_AT_main_subprogram for fortran main function (#111350)
Requested here
https://github.com/llvm/llvm-project/pull/111022#issuecomment-2396287781
2024-10-07 13:59:41 +01:00
Tom Eccles
f6f4c177ef [flang][debug] Use PROGRAM name for main function name (#111022)
For example, in

        PROGRAM test_program
          ...
        END PROGRAM

This allows a user to break on the main function with `break
test_program`. This matches what classic flang and gfortran do.
2024-10-04 10:46:58 +01:00
jeanPerier
1753de2d95 [flang][FIR] remove fir.complex type and its fir.real element type (#111025)
Final patch of
https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292

Since fir.real was only still used as fir.complex element type, this
patch removes it at the same time.
2024-10-04 09:57:03 +02:00
Abid Qadeer
fc4b1a303b [flang][debug] Handle array types with variable size/bounds. (#110686)
The debug information generated by flang did not handle the cases where
dimension or lower bounds of the arrays were variable. This PR fixes
this issue. It will help distinguish assumed size arrays from cases
where array sizes are variable. It also handles the variable lower bounds
for assumed shape arrays.
    
Fixes #98879.
2024-10-03 21:29:47 +01:00
jeanPerier
c4204c0b29 [flang] replace fir.complex usages with mlir complex (#110850)
Core patch of
https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292.
After that, the last step is to remove fir.complex from FIR types.
2024-10-03 17:10:57 +02:00
Sergio Afonso
cdb3ebf1e6 [MLIR][OpenMP] Normalize representation of entry block arg-defining clauses (#109809)
This patch updates printing and parsing of operations including clauses
that define entry block arguments to the operation's region. This
impacts `in_reduction`, `map`, `private`, `reduction` and
`task_reduction`.

The proposed representation to be used by all such clauses is the
following:
```
<clause_name>([byref] [@<sym>] %value -> %block_arg [, ...] : <type>[, ...]) {
  ...
}
```

The `byref` tag is only allowed for reduction-like clauses and the
`@<sym>` is required and only allowed for the `private` and
reduction-like clauses. The `map` clause does not accept either of
these.

This change fixes some currently broken op representations, like
`omp.teams` or `omp.sections` reduction:
```
omp.teams reduction([byref] @<sym> -> %value : <type>) {
^bb0(%block_arg : <type>):
  ...
}
```

Additionally, it addresses some redundancy in the representation of the
previously mentioned cases, as well as e.g. `map` in `omp.target`. The
problem is that the block argument name after the arrow is not checked
in any way, which makes some misleading representations legal:
```mlir
omp.target map_entries(%x -> %arg1, %y -> %arg0, %z -> %doesnt_exist : !llvm.ptr, !llvm.ptr, !llvm.ptr) {
^bb0(%arg0 : !llvm.ptr, %arg1 : !llvm.ptr, %arg2 : !llvm.ptr):
  ...
}
```

In that case, `%x` maps to `%arg0`, contrary to what the representation
states, and `%z` maps to `%arg2`. `%doesnt_exist` is not resolved, so it
would likely cause issues if used anywhere inside of the operation's
region.

The solution implemented in this patch makes it so that values
introduced after the arrow on the representation of these clauses
implicitly define the corresponding entry block arguments, removing the
potential for these problematic representations. This is what is already
implemented for the `private` and `reduction` clauses of `omp.parallel`.

There are a couple of consequences of this change:
- Entry block argument-defining clauses must come at the end of the
operation's representation and in alphabetical order. This is because
they are printed/parsed as part of the region and a standardized
ordering is needed to reliably match op arguments with their
corresponding entry block arguments via the `BlockArgOpenMPOpInterface`.
- We can no longer define per-clause assembly formats to be reused by
all operations that take these clauses, since they must be passed to a
custom printer including the region and arguments of all other entry
block argument-defining clauses. Code duplication and potential for
introducing issues is minimized by providing the generic
`{print,parse}BlockArgRegion` helpers and associated structures.
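
To illustrate why a fixed ordering helps, the sketch below computes where each clause's entry block arguments start. It assumes the block arguments follow the same alphabetical clause order as the printed form and covers only the five clauses named in this message; `kClauses` and `firstBlockArgIndex` are hypothetical and not part of `BlockArgOpenMPOpInterface`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// The five entry block argument-defining clauses named in this commit
// message, in alphabetical order (assumed to be the block argument order).
constexpr std::array<std::string_view, 5> kClauses = {
    "in_reduction", "map", "private", "reduction", "task_reduction"};

// Index of the first entry block argument that belongs to `clause`, given
// how many values each clause contributes (same order as kClauses).
inline std::size_t firstBlockArgIndex(std::string_view clause,
                                      const std::array<std::size_t, 5> &counts) {
  std::size_t offset = 0;
  for (std::size_t i = 0; i < kClauses.size(); ++i) {
    if (kClauses[i] == clause)
      return offset;
    offset += counts[i];
  }
  assert(false && "unknown clause");
  return offset;
}
```

With two `map` values, one `private` value and three `reduction` values, the `reduction` arguments would start at index 3 under this assumption.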

MLIR and Flang lowering unit tests are updated due to changes in the
order and formatting of impacted operations.
2024-10-01 16:18:36 +01:00
Abid Qadeer
1094ee71da [flang][debug] Better handle array lower bound of assumed shape arrays. (#110302)
As mentioned in #108633, we don't respect the lower bound of the assumed
shape arrays if those were specified. It happens in both cases:
1. When caller has non-default lower bound and callee has default
2. When callee has non-default lower bound and caller has default

This PR tries to fix this issue by improving our generation of lower
bound attribute on DICompositeTypeAttr. If we see a lower bound in the
declaration, we respect that. Note that the same function is also used for
allocatable/pointer variables. We make sure that we get the lower bound
from descriptor in those cases. Please note that DWARF assumes a lower
bound of 1 so in many cases we don't need to generate the lower bound.
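
The last point can be sketched as a small helper that only produces a lower bound when it differs from the default of 1. `lowerBoundToEmit` is a hypothetical name; it illustrates the idea for a statically known bound and is not the flang code.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Return the lower bound to record in debug info, or nothing when the
// declared bound equals the default of 1 that the debugger assumes.
inline std::optional<std::int64_t>
lowerBoundToEmit(std::int64_t declaredLowerBound) {
  constexpr std::int64_t kDefaultLowerBound = 1;
  if (declaredLowerBound == kDefaultLowerBound)
    return std::nullopt;
  return declaredLowerBound;
}
```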

Fixes #108633.
2024-09-30 20:31:08 +01:00
Abid Qadeer
d556e38fe8 [flang][debug] Support derived type components with box types. (#109424)
Our support for derived types uses `getTypeSizeAndAlignment` to
calculate the offset of the members. The `fir.box` was not supported in
that function. It meant that any member which required a descriptor was
not supported in the derived type.

We now convert the type into an llvm type and then use the DataLayout to
calculate the size/offset of a member. There is no dependency on
`getTypeSizeAndAlignment` to get the size of the types.

There are 2 other changes in this PR:

1. The `recID` field is used to handle cases where a member
references its parent type.

2. A type cache is maintained to avoid duplication. It is also needed
for circular reference case.


Fixes #108001.
2024-09-30 10:31:56 +01:00
Abid Qadeer
69ef3b102c [flang][debug] Allow variable length for dummy char arguments. (#109448)
As pointed out by @jeanPerier
[here](https://github.com/llvm/llvm-project/pull/108283#discussion_r1764528809),
we don't need to restrict the length of the dummy character argument
location to `fir.unboxchar`. This PR removes that restriction.
2024-09-26 10:08:48 +01:00
Abid Qadeer
76347ee958 [flang][debug] Improve handling of dummy character arguments. (#108283)
As described in #107998, we were not handling the case well when the length
of the character is not part of the type. This PR handles one of the
cases, where the length can be calculated by looking at the result of the
corresponding `fir.unboxchar`.

The DIStringTypeAttr has a `stringLength` field that can be a variable.
We create an artificial variable that holds the length and is used as the
value of the `stringLength` field. The variable is then attached using
a `DbgValueOp`.

Fixes #107998.
2024-09-18 13:52:23 +01:00
Abid Qadeer
b6f72fc1e2 [flang][debug] Generate correct subroutine type. (#108605)
We pass a list of types when creating a subroutine type. The first one
is supposed to be return type and the rest are the argument types. A
subroutine does not have a return type so an argument type could be
confused as a return type. To fix this, if there is no return type, we
generate a null type as a place holder.
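
The layout of the type list can be sketched as follows, with an empty optional standing in for the null type. `TypeRef` and `buildSubprogramTypes` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical type handle; an empty optional plays the role of the null
// type used as the placeholder for "no return type".
using TypeRef = std::optional<std::string>;

// Build the type list for a subprogram: slot 0 is always the result type
// (or the null placeholder), followed by the argument types.
inline std::vector<TypeRef>
buildSubprogramTypes(const TypeRef &resultType,
                     const std::vector<std::string> &argTypes) {
  std::vector<TypeRef> types;
  types.reserve(argTypes.size() + 1);
  types.push_back(resultType);
  for (const std::string &arg : argTypes)
    types.emplace_back(arg);
  return types;
}
```

For a subroutine, passing `std::nullopt` as the result keeps the first argument type from being read as a return type.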

Fixes #108564.
2024-09-17 11:07:23 +01:00
Abid Qadeer
1fc288bf48 [flang][debug] Handle lower bound in assumed size arrays. (#108523)
Fixes #108411
2024-09-17 11:02:10 +01:00
Tom Eccles
5aaf384b16 [flang][NFC] use llvm.intr.stacksave/restore instead of opaque calls (#108562)
The new LLVM stack save/restore intrinsic operations are more convenient
than function calls because they do not add function declarations to the
module and therefore do not block the parallelisation of passes.
Furthermore they could be much more easily marked with memory effects
than function calls if that ever proved useful.

This builds on top of #107879.

Resolves #108016
2024-09-16 12:33:37 +01:00
Abid Qadeer
db64e69fa2 [flang][debug] Handle 'used' module. (#107626)
As described in #98883, we have to qualify a module variable name in
debugger to get its value. This PR tries to remove this limitation.
    
LLVM provides `DIImportedEntity` to handle such cases but the PR is made
more complicated due to the following 2 issues.
    
1. The MLIR attributes are readonly and we have a circular dependency
here. This has to be handled using the recursive interface provided by
the MLIR. This requires us to first create a place holder
`DISubprogramAttr` which is used in creating `DIImportedEntityAttr`.
Later another `DISubprogramAttr` is created which replaces the place
holder.
    
2. The flang IR does not provide any information about the 'used' module
so this has to be extracted by doing a pass over the
`DeclareOp` in the function. This presents certain limitations, as 'only'
and module variable renaming may not be handled properly.
    
Due to the change in `DISubprogramAttr`, some tests also needed to be
adjusted.
    
Fixes #98883.
2024-09-11 09:31:53 +01:00
Abid Qadeer
4f3f09e787 [flang][debug] Add stride information for assumed shape array. (#106703)
Without this information, the debugger could present wrong values for arrays
in certain cases as shown in issue #105646.

Fixes #105646.
2024-09-04 11:13:10 +01:00