Files
clang-p2996/mlir/test/Conversion/OpenMPToLLVM
khaki3 a63f915771 [flang][openacc][openmp] Support implicit casting on the atomic interface (#114390)
ACCMP atomics do not support type conversion. Specifically, I have
encountered semantically incorrect code for atomic reads.

Example:

```
program main
  implicit none
  real(8) :: n
  integer :: x
  x = 1.0
  !$acc atomic capture
  n = x
  x = n
  !$acc end atomic
end program main
```

We have this error when compiling it with flang-new: `error:
loc("rep.f90":6:9): expected three operations in atomic.capture region
(one terminator, and two atomic ops)`

Yet, in the following generated FIR code, we observe three issues.

1. `fir.convert` intrudes into the capture region.
2. An incorrect temporary (`%2`) is being updated instead of `n`.
3. If we allow `n` in place of `%2`, the operand types of `atomic.read`
do not match. Introducing a `!fir.ref<i32> -> !fir.ref<f64>` conversion
on `x` is inaccurate because we need to convert the value of `x`.

```
    %2 = "fir.alloca"() <{in_type = i32, operandSegmentSizes = array<i32: 0, 0>}> : () -> !fir.ref<i32>
    %3 = "fir.alloca"() <{bindc_name = "n", in_type = f64, operandSegmentSizes = array<i32: 0, 0>, uniq_name = "_QFEn"}> : () -> !fir.ref<f64>
    %4:2 = "hlfir.declare"(%3) <{operandSegmentSizes = array<i32: 1, 0, 0, 0>, uniq_name = "_QFEn"}> : (!fir.ref<f64>) -> (!fir.ref<f64>, !fir.ref<f64>)
    %5 = "fir.alloca"() <{bindc_name = "x", in_type = i32, operandSegmentSizes = array<i32: 0, 0>, uniq_name = "_QFEx"}> : () -> !fir.ref<i32>
    %6:2 = "hlfir.declare"(%5) <{operandSegmentSizes = array<i32: 1, 0, 0, 0>, uniq_name = "_QFEx"}> : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
    %7 = "arith.constant"() <{value = 1 : i32}> : () -> i32
    "hlfir.assign"(%7, %6#0) : (i32, !fir.ref<i32>) -> ()
    %8 = "fir.load"(%4#0) : (!fir.ref<f64>) -> f64
    %9 = "fir.convert"(%8) : (f64) -> i32
    "fir.store"(%9, %2) : (i32, !fir.ref<i32>) -> ()
    %10 = "fir.load"(%6#0) : (!fir.ref<i32>) -> i32
    %11 = "fir.convert"(%10) : (i32) -> f64
    "acc.atomic.capture"() ({
      "acc.atomic.read"(%2, %6#1) <{element_type = f64}> : (!fir.ref<i32>, !fir.ref<i32>) -> ()
      %12 = "fir.convert"(%11) : (f64) -> i32
      "acc.atomic.write"(%2, %12) : (!fir.ref<i32>, i32) -> ()
      "acc.terminator"() : () -> ()
    }) : () -> ()
```

This PR updates `flang/lib/Lower/DirectivesCommon.h` to solve the issues
by taking the following approaches (from top to bottom):

1. Move `fir.convert` for `atomic.write` out of the capture region.
2. Remove the `!fir.ref<i32> -> !fir.ref<f64>` conversion found in
`genOmpAccAtomicRead`.
3. Eliminate unnecessary `genExprAddr` calls on the RHS, which create an
invalid temporary for `x = 1.0`.
4. When generating a capture operation, refer to the original LHS
instead of the type-casted RHS.

Here, we have to allow for the cases where the operand types of
`atomic.read` differ from one another. Thus, this PR also removes the
`AllTypesMatch` trait from both `acc.atomic.read` and `omp.atomic.read`.

The example code is converted as follows:

```
    %0 = fir.alloca f64 {bindc_name = "n", uniq_name = "_QFEn"}
    %1:2 = hlfir.declare %0 {uniq_name = "_QFEn"} : (!fir.ref<f64>) -> (!fir.ref<f64>, !fir.ref<f64>)
    %2 = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFEx"}
    %3:2 = hlfir.declare %2 {uniq_name = "_QFEx"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
    %c1_i32 = arith.constant 1 : i32
    hlfir.assign %c1_i32 to %3#0 : i32, !fir.ref<i32>
    %4 = fir.load %1#0 : !fir.ref<f64>
    %5 = fir.convert %4 : (f64) -> i32
    acc.atomic.capture {
      acc.atomic.read %1#1 = %3#1 : !fir.ref<f64>, !fir.ref<i32>, i32
      acc.atomic.write %3#1 = %5 : !fir.ref<i32>, i32
    }
```

Fixes #112911.
2024-11-05 07:52:45 -08:00
..