This commit makes reductions part of the terminator. Instead of
`scf.yield`, `scf.reduce` now terminates the body of `scf.parallel` ops.
`scf.reduce` may contain an arbitrary number of reductions, with one
region per reduction.
Example:
```mlir
%init = arith.constant 0.0 : f32
%r:2 = scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init, %init)
    -> f32, f32 {
  %elem_to_reduce1 = memref.load %buffer1[%iv] : memref<100xf32>
  %elem_to_reduce2 = memref.load %buffer2[%iv] : memref<100xf32>
  // One region per reduction: the first combines values of %elem_to_reduce1,
  // the second combines values of %elem_to_reduce2.
  scf.reduce(%elem_to_reduce1, %elem_to_reduce2 : f32, f32) {
    ^bb0(%lhs : f32, %rhs: f32):
      %res = arith.addf %lhs, %rhs : f32
      scf.reduce.return %res : f32
  }, {
    ^bb0(%lhs : f32, %rhs: f32):
      %res = arith.mulf %lhs, %rhs : f32
      scf.reduce.return %res : f32
  }
}
```
`scf.reduce` operations can no longer be interleaved with other ops in
the body of `scf.parallel`. This simplifies the op and makes it possible
to assign the `RecursiveMemoryEffects` trait to `scf.reduce`. (This was
not possible before: the op was not a terminator, so with that trait it
would have been treated as dead code and DCE'd.)
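A minimal sketch of what this means when writing loop bodies, assuming hypothetical values `%buffer`, `%out`, `%lb`, `%ub`, `%step`, and `%init` that are defined elsewhere: all ordinary computation is placed first and a single `scf.reduce` closes the body. A loop without reductions ends in an operand-less `scf.reduce`.

```mlir
// Hypothetical values: %buffer, %out, %lb, %ub, %step, and %init are assumed
// to be defined earlier in the enclosing function.
%sum = scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init) -> f32 {
  // All ordinary computation comes first ...
  %elem = memref.load %buffer[%iv] : memref<100xf32>
  %squared = arith.mulf %elem, %elem : f32
  // ... and the single `scf.reduce` terminator comes last.
  scf.reduce(%squared : f32) {
    ^bb0(%lhs : f32, %rhs: f32):
      %res = arith.addf %lhs, %rhs : f32
      scf.reduce.return %res : f32
  }
}

// A loop without reductions is terminated by an operand-less `scf.reduce`.
scf.parallel (%iv) = (%lb) to (%ub) step (%step) {
  %elem = memref.load %buffer[%iv] : memref<100xf32>
  memref.store %elem, %out[%iv] : memref<100xf32>
  scf.reduce
}
```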
35 lines
2.3 KiB
MLIR
// First test various sets of invalid arguments
// RUN: not mlir-opt -allow-unregistered-dialect %s -pass-pipeline='builtin.module(func.func(test-scf-parallel-loop-collapsing))' 2>&1 | FileCheck %s --check-prefix=CL0
// CL0: No collapsed-indices were specified. This pass is only for testing and does not automatically collapse all parallel loops or similar

// RUN: not mlir-opt -allow-unregistered-dialect %s -pass-pipeline='builtin.module(func.func(test-scf-parallel-loop-collapsing{collapsed-indices-1=1}))' 2>&1 | FileCheck %s --check-prefix=CL1
// CL1: collapsed-indices-1 specified but not collapsed-indices-0

// RUN: not mlir-opt -allow-unregistered-dialect %s -pass-pipeline='builtin.module(func.func(test-scf-parallel-loop-collapsing{collapsed-indices-0=1 collapsed-indices-2=2}))' 2>&1 | FileCheck %s --check-prefix=CL2
// CL2: collapsed-indices-2 specified but not collapsed-indices-1

// RUN: not mlir-opt -allow-unregistered-dialect %s -pass-pipeline='builtin.module(func.func(test-scf-parallel-loop-collapsing{collapsed-indices-0=1 collapsed-indices-1=2}))' 2>&1 | FileCheck %s --check-prefix=NON-ZERO
// NON-ZERO: collapsed-indices arguments must include all values [0,N).

// RUN: not mlir-opt -allow-unregistered-dialect %s -pass-pipeline='builtin.module(func.func(test-scf-parallel-loop-collapsing{collapsed-indices-0=0 collapsed-indices-1=2}))' 2>&1 | FileCheck %s --check-prefix=NON-CONTIGUOUS
// NON-CONTIGUOUS: collapsed-indices arguments must include all values [0,N).


// Then test for invalid combinations of argument+input-ir
// RUN: mlir-opt -allow-unregistered-dialect %s -pass-pipeline='builtin.module(func.func(test-scf-parallel-loop-collapsing{collapsed-indices-0=0,1}))' -verify-diagnostics
func.func @too_few_iters(%arg0: index, %arg1: index, %arg2: index) {
  // expected-error @+1 {{op has 1 iter args while this limited functionality testing pass was configured only for loops with exactly 2 iter args.}}
  scf.parallel (%arg3) = (%arg0) to (%arg1) step (%arg2) {
    scf.reduce
  }
  return
}

func.func @too_many_iters(%arg0: index, %arg1: index, %arg2: index) {
  // expected-error @+1 {{op has 3 iter args while this limited functionality testing pass was configured only for loops with exactly 2 iter args.}}
  scf.parallel (%arg3, %arg4, %arg5) = (%arg0, %arg0, %arg0) to (%arg1, %arg1, %arg1) step (%arg2, %arg2, %arg2) {
    scf.reduce
  }
  return
}