Commit Graph

7519 Commits

Author SHA1 Message Date
Kareem Ergawy
52b7045fbb [flang][MLIR][OpenMP] Emit UpdateDataOp from !$omp target update (#75345)
Emits MLIR op corresponding to `!$omp target update` directive. So far,
only motion types: `to` and `from` are supported. Motion modifiers:
`present`, `mapper`, and `iterator` are not supported yet.

This is a follow up to #75047 & #75159, only the last commit is relevant
to this PR.
2023-12-22 20:02:31 +01:00
jeanPerier
0ac1dfa311 [flang] lower c_f_procpointer (#76071)
This is equivalent to a procedure pointer assignment, except that the
target is a C_FUNPTR.
2023-12-22 11:01:03 +01:00
jeanPerier
30a1c0aa27 [flang] c_funloc - handle pocedure pointers in convertToBox (#76070)
C_FUNLOC was not handling procedure pointer argument correctly, the
issue lied in `hlfir::convertToBox` that did not handle procedure
pointers.

I modified the interface of `hlfir::convertToXXX` to take values on the
way because hlfir::Entity are fundamentally an mlir::Value with type
guarantees, so they should be dealt with by value as mlir::Value are
(they are very small).
2023-12-22 10:59:59 +01:00
jeanPerier
f3fa603d74 [flang] lower ASSOCIATED for procedure pointers (#76067)
This is a lot less complex than the data case where the shape has to be
accounted for, so the implementation is done inline.

One corner case will not be supported correctly for now: the case where
POINTER and TARGET points to the same internal procedure may return
false because lowering is creating fir.embox_proc each time the address
of an internal procedure is taken, so different thunk for the same
internal procedure/host link may be created and compare to false. This
will be fixed in a later patch that moves creating of internal procedure
fir.embox_proc in the host so that the addresses are the same when the
host link is the same. This change is required to properly support the
required lifetime of internal procedure addresses anyway (should be the
always be the lifetime of the host, even when the address is taken in an
internal procedure).
2023-12-22 10:59:01 +01:00
Pete Steinfeld
0cf3af0c51 Revert "[Flang] Allow Intrinsic simpification with min/maxloc dim and… (#76184)
… scalar result. (#75820)"

This reverts commit 701f647905.

The commit breaks some uses of the 'maxloc' intrinsic.

See PR #75820
2023-12-21 13:14:05 -08:00
Kazu Hirata
c50de57feb [flang] Fix a warning
This patch fixes:

  flang/lib/Optimizer/Transforms/StackArrays.cpp:452:7: error:
  ignoring return value of function declared with 'nodiscard'
  attribute [-Werror,-Wunused-result]
2023-12-21 10:30:36 -08:00
madanial0
6b505406a3 [Flang] remove whole-archive option for AIX linker (#76039)
The AIX linker does not support the `--whole-archive` option, removing
the option if the OS is AIX.

---------

Co-authored-by: Mark Danial <mark.danial@ibm.com>
2023-12-21 10:22:30 -05:00
Krzysztof Parzyszek
791200b3bc [flang][OpenMP] Avoid captures of references to structured bindings
Handle one more case missed in ad37c8694e.
2023-12-21 08:41:30 -06:00
Radu Salavat
3107f313f1 [Flang, Clang] Enable and test 'rdynamic' flag (#75598)
Enable and test 'rdynamic' flag
2023-12-21 14:37:51 +00:00
madanial0
3d9fc3fed0 [flang] add no-cpp-dep test for AIX 64 bit (#74637)
Add a new test for no-cpp-dep on AIX as it requires 64 bit OBJECT_MODE
since only 64-bit AIX is supported. AIX does not allow `-o /dev/null`
and requires `-lpthread` flag to be added.

---------

Co-authored-by: Mark Danial <mark.danial@ibm.com>
2023-12-21 08:58:55 -05:00
Yi Wu
18af032c0e [flang] add GETLOG runtime and extension implementation: get login username (#74628)
Get login username, ussage:
```
CHARACTER(32) :: login
CALL getlog(login)
WRITE(*,*) login
```
getlog is required for an exascale proxyapp.
https://proxyapps.exascaleproject.org/app/minismac2d/

f904467142/ref/smac2d.f (L615)

f904467142/ref/smac2d.f (L1570)

---------

Co-authored-by: Yi Wu <43659785+PAX-12-WU@users.noreply.github.com>
Co-authored-by: Yi Wu <yiwu02@wdev-yiwu02.arm.com>
Co-authored-by: Kiran Chandramohan <kiranchandramohan@gmail.com>
2023-12-21 10:35:28 +00:00
Valentin Clement
a25da1a921 [mlir][openacc] Add device_type support for compute operations (#75864)
Re-land PR after being reverted because of buildbot failures.

This patch adds representation for `device_type` clause information on
compute construct (parallel, kernels, serial).

The `device_type` clause on compute construct impacts clauses that
appear after it. The values impacted by `device_type` are now tied with
an attribute array that represent the device_type associated with them.
`DeviceType::None` is used to represent the value produced by a clause
before any `device_type`. The operands and the attribute information are
parser/printed together.

This is an example with `vector_length` clause. The first value (64) is
not impacted by `device_type` so it will be represented with
DeviceType::None. None is not printed. The second value (128) is tied
with the `device_type(multicore)` clause.
```
!$acc parallel vector_length(64) device_type(multicore) vector_length(256)
```
```
acc.parallel vector_length(%c64 : i32, %c128 : i32 [#acc.device_type<multicore>]) {
}
```

When multiple values can be produced for a single clause like
`num_gangs` and `wait`, an extra attribute describe the number of values
belonging to each `device_type`. Values and attributes are
parsed/printed together.

```
acc.parallel num_gangs({%c2 : i32, %c4 : i32}, {%c4 : i32} [#acc.device_type<nvidia>])
```

While preparing this patch I noticed that the wait devnum is not part of
the operations and is not lowered. It will be added in a follow up
patch.
2023-12-20 20:36:09 -08:00
Valentin Clement
553748356c Revert "[mlir][openacc] Add device_type support for compute operations (#75864)"
This reverts commit 8b885eb90f.
2023-12-20 16:08:10 -08:00
Valentin Clement
e98082d90a Revert "[flang][openacc] Remove unused waitdevnum"
This reverts commit 8fdc3b98b8.
2023-12-20 16:07:57 -08:00
Valentin Clement
8fdc3b98b8 [flang][openacc] Remove unused waitdevnum 2023-12-20 14:01:51 -08:00
Valentin Clement (バレンタイン クレメン)
8b885eb90f [mlir][openacc] Add device_type support for compute operations (#75864)
This patch adds representation for `device_type` clause information on
compute construct (parallel, kernels, serial).

The `device_type` clause on compute construct impacts clauses that
appear after it. The values impacted by `device_type` are now tied with
an attribute array that represent the device_type associated with them.
`DeviceType::None` is used to represent the value produced by a clause
before any `device_type`. The operands and the attribute information are
parser/printed together.

This is an example with `vector_length` clause. The first value (64) is
not impacted by `device_type` so it will be represented with
DeviceType::None. None is not printed. The second value (128) is tied
with the `device_type(multicore)` clause.
```
!$acc parallel vector_length(64) device_type(multicore) vector_length(256)
```
```
acc.parallel vector_length(%c64 : i32, %c128 : i32 [#acc.device_type<multicore>]) {
}
```

When multiple values can be produced for a single clause like
`num_gangs` and `wait`, an extra attribute describe the number of values
belonging to each `device_type`. Values and attributes are
parsed/printed together.

```
acc.parallel num_gangs({%c2 : i32, %c4 : i32}, {%c4 : i32} [#acc.device_type<nvidia>])
```

While preparing this patch I noticed that the wait devnum is not part of
the operations and is not lowered. It will be added in a follow up
patch.
2023-12-20 13:45:47 -08:00
Krzysztof Parzyszek
7ffad37c86 [flang][OpenMP] Avoid captures of references to structured bindings
Fixes build break caused by 400c32cbf9.
2023-12-20 15:31:49 -06:00
Krzysztof Parzyszek
400c32cbf9 [flang][OpenMP] Use llvm::enumerate in few places, NFC (#76095)
Use `llvm::enumerate` instead of iterating over a range and keeping a
separate counter.
2023-12-20 15:09:37 -06:00
Slava Zakharin
b4b23ff7f8 [flang][runtime] Enable more APIs in the offload build. (#75996)
This patch enables more numeric (mod, sum, matmul, etc.) APIs,
and some others.

I added new macros to disable warnings about using C++ STD methods
like operators of std::complex, which do not have __device__ attribute.
This may probably result in unresolved references, if the header files
implementation relies on libstdc++. I will need to follow up on this.
2023-12-20 11:52:51 -08:00
Razvan Lupusoru
a711b042fd [acc] Initial implementation of MemoryEffects on acc operations (#75970)
The `acc` dialect operations now implement MemoryEffects interfaces in
the following ways:
- Data entry operations which may read host memory via `varPtr` are now
marked as so. The majority of them do NOT actually read the host memory.
For example, `acc.present` works on the basis of presence of pointer and
not necessarily what the data points to - so they are not marked as
reading the host memory. They still use `varPtr` though but this
dependency is reflected through ssa.
- Data clause operations which may mutate the data pointed to by
`accPtr` are marked as doing so.
- Data clause operations which update required structured or dynamic
runtime counters are marked as reading and writing the newly defined
`RuntimeCounters` resource. Some operations, like `acc.getdeviceptr` do
not actually use the runtime counters - but are marked as reading them
since the address obtained depends on the mapping operations which do
update the runtime counters. Namely, `acc.getdeviceptr` cannot be moved
across other mapping operations.
- Constructs are marked as writing to the `ConstructResource`. This may
be too strict but is needed for the following reasons: 1) Structured
constructs may not use `accPtr` and instead use `varPtr` - when this is
the case, data actions may be removed even when used. 2) Unstructured
constructs are currently used to aggregate multiple data actions. We do
not want such constructs removed or moved for now.
- Terminators are marked as `Pure` as in other dialects.

The current approach has the following limitations which may require
further improvements:
- Subsequent `acc.copyin` operations on same data do not actually read
host memory pointed to by `varPtr` but are still marked as so.
- Two `acc.delete` operations on same data may not mutate `accPtr` until
the runtime counters are zero (but are still marked as mutating).
- The `varPtrPtr` argument, when present, points to the address of
location of `varPtr`. When mapping to target device, an `accPtrPtr`
needs computed and this memory is mutated. This effect is not captured
since the current operations do not produce `accPtrPtr`.
- Runtime counter effects are imprecise since two operations with
differing `varPtr` increment/decrement different counters. Additionally,
operations with `varPtrPtr` mutate attachment counters.
- The `ConstructResource` is too strict and likely can be relaxed with
better modeling.
2023-12-20 07:11:19 -08:00
David Green
701f647905 [Flang] Allow Intrinsic simpification with min/maxloc dim and scalar result. (#75820)
This makes an adjustment to the existing fir minloc/maxloc generation
code to handle functions with a dim=1 that produce a scalar result. This
should allow us to get the same benefits as the existing generated
minmax reductions.
2023-12-20 12:12:12 +00:00
jeanPerier
36a073a5f4 [flang] Add option to skip struct argument rewrite in target-rewrite (#75939)
Be consistent with complex and character rewrite so that the pass can be
run selectively.
2023-12-20 10:15:09 +01:00
Matthias Springer
f10302e3fa [mlir] Require folders to produce Values of same type (#75887)
This commit adds extra assertions to `OperationFolder` and `OpBuilder`
to ensure that the types of the folded SSA values match with the result
types of the op. There used to be checks that discard the folded results
if the types do not match. This commit makes these checks stricter and
turns them into assertions.

Discarding folded results with the wrong type (without failing
explicitly) can hide bugs in op folders. Two such bugs became apparent
in MLIR (and some more in downstream projects) and are fixed with this
change.

Note: The existing type checks were introduced in
https://reviews.llvm.org/D95991.

Migration guide: If you see failing assertions (`folder produced value
of incorrect type`; make sure to run with assertions enabled!), run with
`-debug` or dump the operation right before the failing assertion. This
will point you to the op that has the broken folder. A common mistake is
a mismatch between static/dynamic dimensions (e.g., input has a static
dimension but folded result has a dynamic dimension).
2023-12-20 14:39:22 +09:00
jeanPerier
c373f58134 [flang] Lower procedure pointer components (#75453)
Lower procedure pointer components, except in the context of structure
constructor (left TODO).

Procedure pointer components lowering share most of the lowering logic
of procedure poionters with the following particularities:
- They are components, so an hlfir.designate must be generated to
retrieve the procedure pointer address from its derived type base.
- They may have a PASS argument. While there is no dispatching as with
type bound procedure, special care must be taken to retrieve the derived
type component base in this case since semantics placed it in the
argument list and not in the evaluate::ProcedureDesignator.

These components also bring a new level of recursive MLIR types since a
fir.type may now contain a component with an MLIR function type where
one of the argument is the fir.type itself. This required moving the
"derived type in construction" stackto the converter so that the object
and function type lowering utilities share the same state (currently the
function type utilty would end-up creating a new stack when lowering its
arguments, leading to infinite loops). The BoxedProcedurePass also
needed an update to deal with this recursive aspect.
2023-12-19 17:17:09 +01:00
jeanPerier
41096d19ab [flang] Do not instantiate components in initial targets as objects (#75778)
Lowering was instantiating component symbols (but the last) in initial
target designator as if they were whole objects, leading to collisions
and bugs.

Fixes https://github.com/llvm/llvm-project/issues/75728
2023-12-19 10:10:24 +01:00
jeanPerier
1d57b9a5b1 [flang] Pass one element struct by register on X86-64 (#75802)
Implement the C struct passing ABI on X86-64 for the trivial case where
the structs have one element. This is required to cover some cases of
BIND(C) derived type pass with the VALUE attribute.
2023-12-19 09:50:58 +01:00
harishch4
482a37b860 [Flang][OpenMp]Add testcase for threadprivate with blank common block (#74969) 2023-12-18 18:27:34 +05:30
Kiran Chandramohan
a4deb14e35 [Flang][OpenMP] Add check-dag to private-clause-fixes test 2023-12-18 11:19:38 +00:00
David Green
9bb47f7f8b [Flang] Add Maxloc to fir simplify intrinsics pass (#75463)
This takes the code from D144103 and extends it to maxloc, to allow the
simplifyMinMaxlocReduction method to work with both min and max
intrinsics by switching condition and limit/initial value.
2023-12-18 07:59:51 +00:00
Valentin Clement (バレンタイン クレメン)
22426d9ecd [flang][openacc/mp] Do not read bounds on absent box (#75252)
Make sure we only load box and read its bounds when it is present.
- Add `AddrAndBoundInfo` struct to be able to carry around the `addr`
and `isPresent` values. This is likely to grow so we can make all the
access in a single `fir.if` operation.
2023-12-15 13:02:40 -08:00
vdonaldson
d6a3607ff5 [flang] legacy branch target (#75628)
Branching to an endif statement from outside of the if is nonconformant:

  subroutine jump(n)
    goto 6
    if (n == 3) then
      goto 7
  6 end if
    print *, 'pass'
    return
  7 print *, 'fail'
  end

However, this branch was permitted up to f90. Account for this usage
when rewriting if constructs and if statements by suppressing rewriting
if the end statement is labeled.
2023-12-15 11:21:53 -08:00
Philip Reames
0b7dda3d4c Revert "[flang][nfc] Refactor linker invocation logic (#75534)"
This reverts commit 71bbfabd08.  Breaks check-flang on x86_64 host.
2023-12-15 11:08:09 -08:00
Valentin Clement (バレンタイン クレメン)
43cb8f00f0 [flang][openacc/mp][NFC] Remove unused baseAddr argument (#75537)
`baseAddr` is not used in `genBaseBoundsOps` just remove it.
2023-12-15 09:46:47 -08:00
Krzysztof Parzyszek
82e91b91ca [flang][OpenMP] Move handling of OpenMP symbol flags to OpenMP.cpp (#75523)
The function `instantiateVariable` in Bridge.cpp has the following code:
```
  if (var.getSymbol().test(
          Fortran::semantics::Symbol::Flag::OmpThreadprivate))
    Fortran::lower::genThreadprivateOp(*this, var);

  if (var.getSymbol().test(
          Fortran::semantics::Symbol::Flag::OmpDeclareTarget))
    Fortran::lower::genDeclareTargetIntGlobal(*this, var);
```

Implement `handleOpenMPSymbolProperties` in OpenMP.cpp, move the above
code there, and have `instantiateVariable` call this function instead.

This would further separate OpenMP-related details into OpenMP.cpp.
2023-12-15 09:32:57 -06:00
Andrzej Warzyński
48e96e97d1 [flang][driver][nfc] Rename one variable (res -> invoc) (#75535)
The new name better reflects what the variable represents.
2023-12-15 15:04:54 +00:00
Andrzej Warzyński
71bbfabd08 [flang][nfc] Refactor linker invocation logic (#75534)
Refactor how the Fortran runtime libs are added to the linker
invocation. This is a non-functional change.
2023-12-15 15:04:37 +00:00
Krzysztof Parzyszek
aeb482106c [flang][OpenMP] Move nested eval conversion to OpenMP.cpp, NFC (#75502)
This is the first step towards exploiting `genEval` functionality from
inside of OpenMP-generating functions.

This follows discourse discussion:
https://discourse.llvm.org/t/openmp-lowering-from-pft-to-fir/75263
2023-12-15 09:01:08 -06:00
David Green
2812cb065a [Flang] HLFIR maxloc intrinsic (#75450)
Similar to minloc from #74436, this adds a hlfir maxloc intrinsic so
that we can keep them symmetrical. It's just a bit of copy and pasting.
2023-12-15 09:32:15 +00:00
David Green
34eee5d647 [Flang] Remove kind from CountOp (#75466)
The kind is already represented in the return type of the operation.
Like we did for minloc, this removes the kind parameter from CountOp.
2023-12-15 09:31:52 +00:00
Valentin Clement (バレンタイン クレメン)
711809f37a [flang][openacc/mp][NFC] Fix order of template arguments (#75538)
Some template parameters for the bounds ops generation have been
inverted. It should be consistent to be `BoundsOp, BoundsType`.
2023-12-14 21:13:38 -08:00
madanial0
af06c5f634 [Flang] fix ppc-vec intrinsics testcases on AIX (NFC) (#74347)
Modify ppc-vec intrinsic test cases to include support for both little
and big endianness

Co-authored-by: Mark Danial <mark.danial@ibm.com>
2023-12-14 19:15:59 -05:00
Jacques Pienaar
ee2deb4cf7 [mlir] Handle simple commutative cases in CSE.
Tried to keep this simple while handling obvious CSE instances. For more
complicated cases the expectation is still that the sorting pass would
run before. While simple, this case did turn up in a real deployed
instance where it had a large (>10% e2e) impact. This can of course be
refined.
2023-12-14 16:09:05 -08:00
Andrzej Warzyński
1b6c8280b9 [flang][driver] Don't use -whole-archive on Darwin (#75393)
Direct follow-up of #73124 - the linker on Darwin does not support
`-whole-archive`, so that needs to be removed from the linker
invocation.

For context:
  * https://github.com/llvm/llvm-project/pull/73124
2023-12-14 21:11:25 +00:00
Rainer Orth
6e87672150 [flang] Adjust _FORTRAN_RUNTIME_IEEE_FENV_T_EXTENT for Solaris (#74590)
Even after 13e2200fa4 (Solaris lacks
`femode_t`, too), the Solaris `flang` build is still broken:
```
/vol/llvm/src/llvm-project/local/flang/runtime/exceptions.cpp:87:5: error: static assertion failed due to requirement 'sizeof(fenv_t) <= sizeof(int) * 8': increase ieee_status_type size
   87 |     sizeof(fenv_t) <= sizeof(int) * _FORTRAN_RUNTIME_IEEE_FENV_T_EXTENT,      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/vol/llvm/src/llvm-project/local/flang/runtime/exceptions.cpp:87:20: note: expression evaluates to '200 <= 32'
   87 |     sizeof(fenv_t) <= sizeof(int) * _FORTRAN_RUNTIME_IEEE_FENV_T_EXTENT,      |     ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
This patch fixes this by removing the assertion.

Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.
2023-12-14 22:03:45 +01:00
Valentin Clement (バレンタイン クレメン)
fedc54bf35 [flang] Add genEval to the AbstractConverter (#75140)
There was some discussion on discourse[1] about allowing call to FIR
generation functions from other part of lowering belonging to OpenMP.

This solution exposes a simple `genEval` member function on the
`AbstractConverter` so that IR generation for PFT Evaluation objects can
be called from lowering outside of the FirConverter but not exposing it.

[1] https://discourse.llvm.org/t/openmp-lowering-from-pft-to-fir/75263
2023-12-14 09:25:27 -08:00
Krzysztof Parzyszek
9cf9721dcf [flang][OpenMP] Avoid unnecessary init loop, use constructor instead,… (#75482)
… NFC

SmallVector has a constructor that fills it with a number of copies of a
given value. Use it instead of a loop that does the same thing.
2023-12-14 11:24:17 -06:00
Kazu Hirata
11efccea8f [flang] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 23:48:53 -08:00
vdonaldson
1220edc6e0 [flang] module namelist IO with renaming (#75264)
The test:
```
  module mmm
    real rrr
    namelist /aaa/ rrr
  end

    use mmm, bbb => aaa
    rrr = 3.
    write(*,bbb)
  end
```
Should output:  &AAA RRR= 3./

not:            &BBB RRR= 3./
2023-12-13 10:26:12 -08:00
Pete Steinfeld
2a1d222010 [flang] Fix compilation error due to variable no being used (#75210)
My builds were failing because the variable 'dim' was not used. This
produced a warning, and my builds have warnings set as errors.
2023-12-12 08:15:53 -08:00
David Green
a216115433 [Flang] Add a HLFIR Minloc intrinsic (#74436)
The adds a hlfir minloc intrinsic, similar to the minval intrinsic
already added, to help in the lowering of minloc. The idea is to later
add maxloc too, and from there add a simplification for producing minloc
with inlined elemental and hopefully less temporaries.
2023-12-12 12:39:21 +00:00