Files
clang-p2996/mlir/test/Dialect/Vector/vector-break-down-bitcast.mlir
Quinn Dawkins 650f04feda [mlir][vector] Add pattern to break down vector.bitcast
The pattern added here is intended as a last resort for targets like
SPIR-V where there are vector size restrictions and we need to be able
to break down large vector types. Vectorizing loads/stores for small
bitwidths (e.g. i8) relies on bitcasting to a larger element type and
patterns to bubble bitcast ops to where they can cancel.
This fails for cases such as
```
%1 = arith.trunci %0 : vector<2x32xi32> to vector<2x32xi8>
vector.transfer_write %1, %destination[%c0, %c0] {in_bounds = [true, true]} : vector<2x32xi8>, memref<2x32xi8>
```
where the `arith.trunci` op essentially does the job of one of the
bitcasts, leading to a bitcast that need to be further broken down
```
vector.bitcast %0 : vector<16xi8> to vector<4xi32>
```

Differential Revision: https://reviews.llvm.org/D149065
2023-04-25 20:18:02 -04:00

42 lines
3.0 KiB
MLIR

// RUN: mlir-opt -split-input-file -test-vector-break-down-bitcast %s | FileCheck %s
// CHECK-LABEL: func.func @bitcast_f16_to_f32
// CHECK-SAME: (%[[INPUT:.+]]: vector<8xf16>)
func.func @bitcast_f16_to_f32(%input: vector<8xf16>) -> vector<4xf32> {
%0 = vector.bitcast %input : vector<8xf16> to vector<4xf32>
return %0: vector<4xf32>
}
// CHECK: %[[INIT:.+]] = arith.constant dense<0.000000e+00> : vector<4xf32>
// CHECK: %[[EXTRACT0:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [0], sizes = [4], strides = [1]} : vector<8xf16> to vector<4xf16>
// CHECK: %[[CAST0:.+]] = vector.bitcast %[[EXTRACT0]] : vector<4xf16> to vector<2xf32>
// CHECK: %[[INSERT0:.+]] = vector.insert_strided_slice %[[CAST0]], %[[INIT]] {offsets = [0], strides = [1]} : vector<2xf32> into vector<4xf32>
// CHECK: %[[EXTRACT1:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [4], sizes = [4], strides = [1]} : vector<8xf16> to vector<4xf16>
// CHECK: %[[CAST1:.+]] = vector.bitcast %[[EXTRACT1]] : vector<4xf16> to vector<2xf32>
// CHECK: %[[INSERT1:.+]] = vector.insert_strided_slice %[[CAST1]], %[[INSERT0]] {offsets = [2], strides = [1]} : vector<2xf32> into vector<4xf32>
// CHECK: return %[[INSERT1]]
// -----
// CHECK-LABEL: func.func @bitcast_i8_to_i32
// CHECK-SAME: (%[[INPUT:.+]]: vector<16xi8>)
func.func @bitcast_i8_to_i32(%input: vector<16xi8>) -> vector<4xi32> {
%0 = vector.bitcast %input : vector<16xi8> to vector<4xi32>
return %0: vector<4xi32>
}
// CHECK: %[[INIT:.+]] = arith.constant dense<0> : vector<4xi32>
// CHECK: %[[EXTRACT0:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [0], sizes = [4], strides = [1]} : vector<16xi8> to vector<4xi8>
// CHECK: %[[CAST0:.+]] = vector.bitcast %[[EXTRACT0]] : vector<4xi8> to vector<1xi32>
// CHECK: %[[INSERT0:.+]] = vector.insert_strided_slice %[[CAST0]], %[[INIT]] {offsets = [0], strides = [1]} : vector<1xi32> into vector<4xi32>
// CHECK: %[[EXTRACT1:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [4], sizes = [4], strides = [1]} : vector<16xi8> to vector<4xi8>
// CHECK: %[[CAST1:.+]] = vector.bitcast %[[EXTRACT1]] : vector<4xi8> to vector<1xi32>
// CHECK: %[[INSERT1:.+]] = vector.insert_strided_slice %[[CAST1]], %[[INSERT0]] {offsets = [1], strides = [1]} : vector<1xi32> into vector<4xi32>
// CHECK: %[[EXTRACT2:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [8], sizes = [4], strides = [1]} : vector<16xi8> to vector<4xi8>
// CHECK: %[[CAST2:.+]] = vector.bitcast %[[EXTRACT2]] : vector<4xi8> to vector<1xi32>
// CHECK: %[[INSERT2:.+]] = vector.insert_strided_slice %[[CAST2]], %[[INSERT1]] {offsets = [2], strides = [1]} : vector<1xi32> into vector<4xi32>
// CHECK: %[[EXTRACT3:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [12], sizes = [4], strides = [1]} : vector<16xi8> to vector<4xi8>
// CHECK: %[[CAST3:.+]] = vector.bitcast %[[EXTRACT3]] : vector<4xi8> to vector<1xi32>
// CHECK: %[[INSERT3:.+]] = vector.insert_strided_slice %[[CAST3]], %[[INSERT2]] {offsets = [3], strides = [1]} : vector<1xi32> into vector<4xi32>
// CHECK: return %[[INSERT3]]