### Description

This patch improves the folding efficiency of `vector.insert` and `vector.extract` operations by not returning early after successfully converting dynamic indices to static indices.

This PR also renames the test pass `TestConstantFold` to `TestSingleFold` and adds comprehensive documentation explaining the single-pass folding behavior.

### Motivation

Since the `OpBuilder::createOrFold` function only calls `fold` **once**, the current `fold` methods of `vector.insert` and `vector.extract` may leave the op in a state that can be folded further. For example, consider the following unfolded IR:

```
%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%c0 = arith.constant 0 : index
%e2 = vector.extract %v1[%c0] : f32 from vector<128xf32>
```

If we use `createOrFold` to create the `vector.extract` op, then the result will be:

```
%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%e2 = vector.extract %v1[0] : f32 from vector<128xf32>
```

But this is not the optimal result: `createOrFold` should have returned `%e1`. The reason is that `fold` returns immediately after `extractInsertFoldConstantOp` converts the dynamic index to a static one, so the subsequent folding logic is skipped.

---------

Co-authored-by: Yang Bai <yangb@nvidia.com>
24 lines
820 B
MLIR
// RUN: mlir-opt --test-single-fold %s | FileCheck %s

// CHECK-LABEL: func @test_const
func.func @test_const(%arg0 : index) -> tensor<4xi32> {
  // CHECK: tosa.const
  %0 = "tosa.const"() {values = dense<[3, 0, 1, 2]> : tensor<4xi32>} : () -> tensor<4xi32>
  return %0 : tensor<4xi32>
}

// CHECK-LABEL: func @test_const_i64
func.func @test_const_i64(%arg0 : index) -> tensor<4xi64> {
  // CHECK: tosa.const
  %0 = "tosa.const"() {values = dense<[3, 0, 1, 2]> : tensor<4xi64>} : () -> tensor<4xi64>
  return %0 : tensor<4xi64>
}

// CHECK-LABEL: func @try_fold_equal_with_unranked_tensor
func.func @try_fold_equal_with_unranked_tensor(%arg0: tensor<4xi32>, %arg1: tensor<1xi32>) {
  // CHECK: tosa.equal
  // CHECK-NEXT: return
  %0 = tosa.equal %arg0, %arg1 : (tensor<4xi32>, tensor<1xi32>) -> tensor<*xi1>
  return
}