Commit Graph

803 Commits

Author SHA1 Message Date
Andrzej Warzyński
d50705ed5d [mlir][vector] Support scalable vec in TransferReadAfterWriteToBroadcast (#79162)
Makes `TransferReadAfterWriteToBroadcast` correctly propagate
scalability flags.
2024-01-24 08:18:08 +00:00
Andrzej Warzynski
75b0c913a5 [mlir][nfc] Update comments
1. Updates and clarifies a few comments related to hooks for
   vector.{insert|extract}_strided_slice.

2. For consistency with vector.insert_strided_slice, removes a TODO from
   vector.extract_strided_slice Op def. It's self-explenatory that
   adding support for non-unit strides is a "TODO".
2024-01-22 14:25:27 +00:00
Jerry Wu
dedc7d4d36 [mlir] Exclude masked ops in VectorDropLeadUnitDim (#76468)
Don't insert cast ops for ops in `vector.mask` region in
`VectorDropLeadUnitDim`.
2024-01-20 19:37:46 -05:00
Benjamin Chetioui
35121add2e [mlir][NFC] Remove unused variable. 2024-01-19 11:32:19 +00:00
Han-Chung Wang
12b676de72 [mlir][vector] Drop innermost unit dims on transfer_write. (#78554) 2024-01-19 03:15:13 -08:00
Matthias Springer
5fcf907b34 [mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260)
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`

The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.

Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
2024-01-17 11:08:59 +01:00
Matthias Springer
510626fa65 [mlir][vector] Fix invalid IR in RewriteBitCastOfTruncI (#78146)
This commit fixes `Dialect/Vector/vector-rewrite-narrow-types.mlir` when
running with `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`.

```
within split at llvm-project/mlir/test/Dialect/Vector/vector-rewrite-narrow-types.mlir:1 offset :118:8: error: 'arith.trunci' op operand type 'vector<3xi16>' and result type 'vector<3xi16>' are cast incompatible
  %1 = vector.bitcast %0 : vector<16xi3> to vector<3xi16>
       ^
within split at llvm-project/mlir/test/Dialect/Vector/vector-rewrite-narrow-types.mlir:1 offset :118:8: note: see current operation: %48 = "arith.trunci"(%47) : (vector<3xi16>) -> vector<3xi16>
LLVM ERROR: IR failed to verify after pattern application
```
2024-01-16 09:45:38 +01:00
Matthias Springer
c0a354dfab [mlir][vector] Fix invalid IR in ContractionOpLowering (#78130)
If a rewrite pattern returns "failure", it must not have modified the
IR. This commit fixes
`Dialect/Vector/vector-contract-to-outerproduct-transforms-unsupported.mlir`
when running with `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`.

```
  * Pattern (anonymous namespace)::ContractionOpToOuterProductOpLowering : 'vector.contract -> ()' {
Trying to match "(anonymous namespace)::ContractionOpToOuterProductOpLowering"
    ** Insert  : 'vector.transpose'(0x5625b3a8cb30)
    ** Insert  : 'vector.transpose'(0x5625b3a8cbc0)
"(anonymous namespace)::ContractionOpToOuterProductOpLowering" result 0
  } -> failure : pattern failed to match
} -> failure : pattern failed to match

LLVM ERROR: pattern returned failure but IR did change
```

Note: `vector-contract-to-outerproduct-transforms-unsupported.mlir` is
merged into `vector-contract-to-outerproduct-matvec-transforms.mlir`.
The `greedy pattern application failed` error is not longer produced.
This error indicates that the greedy pattern rewrite did not
convergence; it does not mean that a pattern could not be applied.
2024-01-16 09:40:24 +01:00
Kazu Hirata
8e8bbbd48e [mlir] Use llvm::is_contained (NFC) 2024-01-12 22:08:29 -08:00
Matthias Springer
ad100b36e7 [mlir][vector] Fix dominance error in warp vector distribution (#77771)
This commit fixes a test in `vector-warp-distribute.mlir` when
`MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS` is enabled.

```
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: error: operand #0 does not dominate this use
    %1 = vector.extract %0[9] : f32 from vector<64xf32>
         ^
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: note: see current operation: %1 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 ceildiv 2)>}> : (index) -> index
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: note: operand defined here (op in a child region)
"func.func"() <{function_type = (index) -> f32, sym_name = "vector_extract_1d"}> ({
^bb0(%arg0: index):
  %0:2 = "vector.warp_execute_on_lane_0"(%arg0) <{warp_size = 32 : i64}> ({
    %7 = "some_def"() : () -> vector<64xf32>
    %8 = "arith.constant"() <{value = 9 : index}> : () -> index
    %9 = "vector.extractelement"(%7, %8) : (vector<64xf32>, index) -> f32
    "vector.yield"(%9, %7) : (f32, vector<64xf32>) -> ()
  }) : (index) -> (f32, vector<2xf32>)
  %1 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 ceildiv 2)>}> : (index) -> index
  %2 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 mod 2)>}> : (index) -> index
  %3 = "vector.extractelement"(%0#1, %2) : (vector<2xf32>, index) -> f32
  %4 = "arith.index_cast"(%1) : (index) -> i32
  %5 = "arith.constant"() <{value = 32 : i32}> : () -> i32
  %6:2 = "gpu.shuffle"(%3, %4, %5) <{mode = #gpu<shuffle_mode idx>}> : (f32, i32, i32) -> (f32, i1)
  "func.return"(%6#0) : (f32) -> ()
}) : () -> ()
LLVM ERROR: IR failed to verify after pattern application
```

The position at which `vector.extractelement` extracts must also be
distributed. The fix in `WarpOpExtractElement` is similar to
`WarpOpInsertElement`.
2024-01-12 15:08:13 +01:00
Matthias Springer
35c19fdde2 [mlir][vector] Support warp distribution of transfer_read with dependencies (#77779)
Support distribution of `vector.transfer_read` ops when operands are
defined inside of the region of `warp_execute_on_lane_0` (except for the
buffer from which the op is reading).

Such IR was previously not supported. This commit changes the
implementation such that indices and the padding value are also
distributed.

This commit simplifies the implementation considerably: the original
implementation created a new `transfer_read` op and then checked if this
new op is valid. If not, the rewrite pattern failed. This was a bit
hacky. It was also a violation of the rewrite pattern API (detected by
`MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`) because the IR was modified,
but the pattern returned "failure".
2024-01-12 11:55:37 +01:00
Andrzej Warzyński
81df51fb31 [mlir][vector] Don't treat memrefs with empty stride as non-contiguous (#76848)
As per the docs [1]:

```
In absence of an explicit layout, a memref is considered to have a
multi-dimensional identity affine map layout.
```

This patch makes sure that MemRefs with no strides (i.e. no explicit
layout) are treated as contiguous when checking whether a particular
vector is a contiguous slice of the given MemRef.

[1] https://mlir.llvm.org/docs/Dialects/Builtin/#layout

Follow-up for #76428.
2024-01-09 08:13:31 +00:00
Balaji V. Iyer
21fe8b635c [mlir] Check if the stride tensor is empty. (#76428)
Added a check to see if the stride tensor is empty. If so then return
false for isContiguousSlice function.

Possible fix for #74463
2024-01-03 10:00:15 -06:00
Andrzej Warzyński
354adb44c9 [mlir][vector] Extend CreateMaskFolder (#75842)
Extends `CreateMaskFolder` pattern so that the following:
```mlir
  %c8 = arith.constant 8 : index
  %c16 = arith.constant 16 : index
  %0 = vector.vscale
  %1 = arith.muli %0, %c16 : index
  %10 = vector.create_mask %c8, %1 : vector<8x[16]xi1>
```

is folded as:

```mlir
  %0 = vector.constant_mask [8, 16] : vector<8x[16]xi1>
```
2023-12-20 11:08:54 +00:00
Matthias Springer
f10302e3fa [mlir] Require folders to produce Values of same type (#75887)
This commit adds extra assertions to `OperationFolder` and `OpBuilder`
to ensure that the types of the folded SSA values match with the result
types of the op. There used to be checks that discard the folded results
if the types do not match. This commit makes these checks stricter and
turns them into assertions.

Discarding folded results with the wrong type (without failing
explicitly) can hide bugs in op folders. Two such bugs became apparent
in MLIR (and some more in downstream projects) and are fixed with this
change.

Note: The existing type checks were introduced in
https://reviews.llvm.org/D95991.

Migration guide: If you see failing assertions (`folder produced value
of incorrect type`; make sure to run with assertions enabled!), run with
`-debug` or dump the operation right before the failing assertion. This
will point you to the op that has the broken folder. A common mistake is
a mismatch between static/dynamic dimensions (e.g., input has a static
dimension but folded result has a dynamic dimension).
2023-12-20 14:39:22 +09:00
Jakub Kuderski
560564f51c [mlir][vector][gpu] Align minf/maxf reduction kind names with arith (#75901)
This is to avoid confusion when dealing with reduction/combining kinds.
For example, see a recent PR comment:
https://github.com/llvm/llvm-project/pull/75846#discussion_r1430722175.

Previously, they were picked to mostly mirror the names of the llvm
vector reduction intrinsics:
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic. In
isolation, it was not clear if `<maxf>` has `arith.maxnumf` or
`arith.maximumf` semantics. The new reduction kind names map 1:1 to
arith ops, which makes it easier to tell/look up their semantics.

Because both the vector and the gpu dialect depend on the arith dialect,
it's more natural to align names with those in arith than with the
lowering to llvm intrinsics.

Issue: https://github.com/llvm/llvm-project/issues/72354
2023-12-20 00:14:43 -05:00
Jakub Kuderski
9f74e6e615 [mlir][vector][gpu] Use makeArithReduction in lowering patterns. NFC. (#75952)
Use the `vector::makeArithReduction` helper as the source-of-truth of
reduction to arith ops lowering.
2023-12-19 19:04:27 -05:00
Jakub Kuderski
07677113ff [mlir][vector] Add pattern to break down reductions into arith ops (#75727)
The number of vector elements considered 'small' enough to extract is
parameterized.                                                   
                                                                 
This is to avoid going into specialized reduction lowering when a
single/couple of arith ops can do. Targets without dedicated reduction  
intrinsics can use that as an emulation path too.                  
                                                                   
Depends on https://github.com/llvm/llvm-project/pull/75846.
2023-12-18 17:54:54 -05:00
Jakub Kuderski
a528cee224 [mlir][vector] Improve makeArithReduction expansion (#75846)
Propagate fast math flags.
Distinguish `minf`/`maxf` and `minimumf`/`maximumf`.

Required for future patterns in
https://github.com/llvm/llvm-project/pull/75727.
2023-12-18 17:47:46 -05:00
Hsiangkai Wang
f643eec892 [mlir][vector] Add emulation patterns for vector masked load/store (#74834)
In this patch, it will convert

```
vector.maskedload %base[%idx_0, %idx_1], %mask, %pass_thru
```

to

```
%ivalue = %pass_thru
%m = vector.extract %mask[0]
%result0 = scf.if %m {
  %v = memref.load %base[%idx_0, %idx_1]
  %combined = vector.insert %v, %ivalue[0]
  scf.yield %combined
} else {
  scf.yield %ivalue
}
%m = vector.extract %mask[1]
%result1 = scf.if %m {
  %v = memref.load %base[%idx_0, %idx_1 + 1]
  %combined = vector.insert %v, %result0[1]
  scf.yield %combined
} else {
  scf.yield %result0
}
...
```

It will convert

```
vector.maskedstore %base[%idx_0, %idx_1], %mask, %value
```

to

```
%m = vector.extract %mask[0]
scf.if %m {
  %extracted = vector.extract %value[0]
  memref.store %extracted, %base[%idx_0, %idx_1]
}
%m = vector.extract %mask[1]
scf.if %m {
  %extracted = vector.extract %value[1]
  memref.store %extracted, %base[%idx_0, %idx_1 + 1]
}
...
```
2023-12-15 11:35:48 +00:00
Jerry Wu
2c9ba9c34a [mlir] Fix type transformation in DropUnitDimFromElementwiseOps (#75430)
Use operand and result types to build the corresponding new types in
`DropUnitDimFromElementwiseOps`.
2023-12-14 12:20:54 -05:00
Fangrui Song
71ba8bb4a7 [mlir,vector] Fix -Wunused-variable 2023-12-13 13:28:17 -08:00
Andrzej Warzyński
c02d07fdf0 [mlir][vector] Add pattern to drop unit dim from elementwise(a, b)) (#74817)
For vectors with either leading or trailing unit dim, replaces:

    elementwise(a, b)

with:

    sc_a = shape_cast(a)
    sc_b = shape_cast(b)
    res = elementwise(sc_a, sc_b)
    return shape_cast(res)

The newly inserted shape_cast Ops fold (before elementwise Op) and then
restore (after elementwise Op) the unit dim. Vectors `a` and `b` are
required to be rank > 1.

Example:
```mlir
  %mul = arith.mulf %B_row, %A_row : vector<1x[4]xf32>
  %cast = vector.shape_cast %mul : vector<1x[4]xf32> to vector<[4]xf32>
```

gets converted to:

```mlir
  %B_row_sc = vector.shape_cast %B_row : vector<1x[4]xf32> to vector<[4]xf32>
  %A_row_sc = vector.shape_cast %A_row : vector<1x[4]xf32> to vector<[4]xf32>
  %mul = arith.mulf %B_row_sc, %A_row_sc : vector<[4]xf32>
  %mul_sc = vector.shape_cast %mul : vector<[4]xf32> to vector<1x[4]xf32>
  %cast = vector.shape_cast %mul_sc : vector<1x[4]xf32> to vector<[4]xf32>
```

In practice, the bottom 2 shape_cast(s) will be folded away.
2023-12-13 20:29:12 +00:00
Jakub Kuderski
8063622721 [mlir][vector] Allow vector distribution with multiple written elements (#75122)
Add a configuration option to allow vector distribution with multiple
elements written by a single lane.

This is so that we can perform vector multi-reduction with multiple
results per workgroup.
2023-12-12 13:15:17 -05:00
Andrzej Warzyński
07919cf895 Revert "[mlir][vector] Make TransposeOpLowering configurable (#73915)" (#75062)
Reverting a workaround intended specifically for SPRI-V. That workaround
emerged from this discussion:

  * https://github.com/llvm/llvm-project/pull/72105

AFAIK, it hasn't been required in practice. This is based on IREE
(https://github.com/openxla/iree), which has just bumped it's fork of
LLVM without using it (*).

(*) cef31e775e

This reverts commit bbd2b08b95.
2023-12-11 21:32:23 +00:00
Rik Huijzer
51e5f677c8 [mlir][vector] Fix crash on invalid permutation_map (#74925)
Without this patch, MLIR crashes with
```
Assertion failed: (getNumDims() == map.getNumResults() && "Number of results mismatch"), function compose, file AffineMap.cpp, line 537.
```
during parsing.
2023-12-11 12:07:41 +01:00
Adam Paszke
34df53739a Revert "[mlir][Vector] Add fold transpose(shape_cast) -> shape_cast (#73951)" (#74579)
This reverts commit f42b7615b8.

The fold pattern is incorrect, because it does not even look at the
permutation of non-unit dims and is happy to replace a pattern such as
```
%22 = vector.shape_cast %21 : vector<1x256x256xf32> to vector<256x256xf32>
%23 = vector.transpose %22, [1, 0] : vector<256x256xf32> to vector<256x256xf32>
```
with
```
%22 = vector.shape_cast %21 : vector<1x256x256xf32> to vector<256x256xf32>
```
which is obviously incorrect.
2023-12-06 11:15:47 +01:00
Andrzej Warzyński
2eb9e33cc5 [mlir][Vector] Update patterns for flattening vector.xfer Ops (2/N) (#73523)
Updates patterns for flattening `vector.transfer_read` by relaxing the
requirement that the "collapsed" indices are all zero. This enables
collapsing cases like this one:

```mlir
  %2 = vector.transfer_read %arg4[%c0, %arg0, %arg1, %c0] ... :
    memref<1x43x4x6xi32>, vector<1x2x6xi32>
```

Previously only the following case would be consider for collapsing
(all indices are 0):

```mlir
  %2 = vector.transfer_read %arg4[%c0, %c0, %c0, %c0] ... :
    memref<1x43x4x6xi32>, vector<1x2x6xi32>
```

Also adds some new comments and renames the `firstContiguousInnerDim`
parameter as `firstDimToCollapse` (the latter better matches the actual
meaning).

Similar updates for `vector.transfer_write` will be implemented in a
follow-up patch.
2023-12-05 08:35:58 +00:00
Andrzej Warzyński
bbd2b08b95 [mlir][vector] Make TransposeOpLowering configurable (#73915)
Following the discussion here:

  * https://github.com/llvm/llvm-project/pull/72105

this patch makes the `TransposeOpLowering` configurable so that one can select
whether to favour `vector.shape_cast` over `vector.transpose`.

As per the discussion in #72105, using `vector.shape_cast` is very beneficial
and desirable when targeting `LLVM IR` (CPU lowering), but won't work when
targeting `SPIR-V` today (GPU lowering). Hence the need for a mechanism to be
able to disable/enable the pattern introduced in #72105. This patch proposes one
such mechanism.

While this should solve the problem that we are facing today, it's understood to
be a temporary workaround. It should be removed once support for lowering
`vector.shape_cast` to SPIR-V is added. Also, (once implemented) the following
proposal might make this workaround redundant:

  * https://discourse.llvm.org/t/improving-handling-of-unit-dimensions-in-the-vector-dialect/
2023-12-04 16:56:43 +00:00
Andrzej Warzyński
8171eac23f [mlir][Vector] Update patterns for flattening vector.xfer Ops (1/N) (#73522)
Updates "flatten vector" patterns to support more cases, namely Ops that
read/write vectors with leading unit dims. For example:

```mlir
%0 = vector.transfer_read %arg0[%c0, %c0, %c0, %c0] ... :
  memref<5x4x3x2xi8, strided<[24, 6, 2, 1], offset: ?>>, vector<1x1x2x2xi8>
```

Currently, the `vector.transfer_read` above would not be flattened. With
this
change, it will be rewritten as follows:
```mlir
%collapse_shape = memref.collapse_shape %arg0 [[0, 1, 2, 3]] :
  memref<5x4x3x2xi8, strided<[24, 6, 2, 1], offset: ?>>
  into memref<120xi8, strided<[1], offset: ?>>
%0 = vector.transfer_read %collapse_shape[%c0] ... :
  memref<120xi8, strided<[1], offset: ?>>, vector<4xi8>
%1 = vector.shape_cast %0 : vector<4xi8> to vector<1x1x2x2xi8>
```

`hasMatchingInnerContigousShape` is generalised and renamed as
`isContiguousSlice` to better match the updated functionality. A few
test names are updated to better highlight what case is being exercised.
2023-12-04 10:21:32 +00:00
Quinn Dawkins
fdf84cbf87 [mlir][vector] Fix unit dim dropping pattern for masked writes (#74038)
This does the same as #72142 for vector.transfer_write. Previously the
pattern would silently drop the mask.
2023-12-01 10:01:28 -05:00
Benjamin Maxwell
f42b7615b8 [mlir][Vector] Add fold transpose(shape_cast) -> shape_cast (#73951)
This folds transpose(shape_cast) into a new shape_cast, when the
transpose just permutes a unit dim from the result of the shape_cast.

Example:

```
%0 = vector.shape_cast %vec : vector<[4]xf32> to vector<[4]x1xf32>
%1 = vector.transpose %0, [1, 0] : vector<[4]x1xf32> to vector<1x[4]xf32>
```

Folds to:
```
%0 = vector.shape_cast %vec : vector<[4]xf32> to vector<1x[4]xf32>
```

This is an (alternate) fix for lowering matmuls to ArmSME.
2023-12-01 14:24:36 +00:00
Han-Chung Wang
7f82c90621 [mlir][vector] Add support for vector.maskedstore sub-type emulation. (#73871)
The idea is similar to vector.maskedload + vector.store emulation. What
the emulation does is:

1. Get a compressed mask and load the data from destination.
2. Bitcast the data to original vector type.
3. Select values between `op.valueToStore` and the data from load using
original mask.
4. Bitcast the new value and store it to destination using compressed
masked.
2023-11-30 11:27:06 -08:00
Andrzej Warzyński
a383817b7e [mlir][Vector] Add a rewrite pattern for gather over a strided memref (#72991)
This patch adds a rewrite pattern for `vector.gather` over a strided
memref like the following:

```mlir
%subview = memref.subview %arg0[0, 0] [100, 1] [1, 1] :
    memref<100x3xf32> to memref<100xf32, strided<[3]>>
%gather = vector.gather %subview[%c0] [%idxs], %cst_0, %cst :
    memref<100xf32, strided<[3]>>, vector<4xindex>, vector<4xi1>, vector<4xf32>
    into vector<4xf32>
```

After the pattern added in this patch:
```mlir
%collapse_shape = memref.collapse_shape %arg0 [[0, 1]] :
    memref<100x3xf32> into memref<300xf32>
%1 = arith.muli %arg3, %cst : vector<4xindex>
%gather = vector.gather %collapse_shape[%c0] [%1], %cst_1, %cst_0 :
    memref<300xf32>, vector<4xindex>, vector<4xi1>, vector<4xf32>
    into vector<4xf32>
```

Fixes https://github.com/openxla/iree/issues/15364.
2023-11-30 16:33:20 +00:00
Andrzej Warzyński
4b2ba5a61a [mlir][sve] Add an e2e for linalg.matmul with mixed types (#73773)
Apart from the test itself, this patch also updates a few patterns to
fix how new VectorType(s) are created. Namely, it makes sure that
"scalability" is correctly propagated.

Regression tests will be updated seperately while auditing Vector
dialect tests in the context of scalable vectors:
  * https://github.com/orgs/llvm/projects/23
2023-11-29 21:21:10 +00:00
Quinn Dawkins
f385f6c93b [mlir][vector] Distribute all non-permutation or broadcasted masked transfer reads (#73539)
The primary difficulty with distribution of masked transfers is when the
permutation map permutes the vector, in which case the distribution
logic needs to make sure the correct mask elements end up with the
distributed transfer. This is only tricky when the permutation map has a
permutation in it, so we can relax the condition for distribution.
2023-11-27 16:23:48 -05:00
Quinn Dawkins
6e8f7d5966 [mlir][vector] Fix patterns for dropping leading unit dims from masks (#73525)
Previously the pattern only worked when the permutation map was a minor
identity. Infer the new mask type from the new transfer map after
dropping leading unit dims.
2023-11-27 12:35:32 -05:00
Jakub Kuderski
d33bad66d8 [mlir][vector] Add patterns to simplify chained reductions (#73048)
Chained reductions get created during vector unrolling. These patterns
simplify them into a series of adds followed by a final reductions.

This is preferred on GPU targets like SPIR-V/Vulkan where vector
reduction gets lowered into subgroup operations that are generally more
expensive than simple vector additions.

For now, only the `add` combining kind is handled.
2023-11-22 10:30:04 -05:00
MaheshRavishankar
1f141737c7 Revert "[mlir][vector] Move transpose with unit-dim to shape_cast pattern (#72493)" (#72918)
This reverts commit 95acb33b45.
2023-11-21 06:18:51 -08:00
Matthias Springer
32c3decb77 [mlir][vector] Modernize vector.transpose op (#72594)
* Declare arguments/results with `let` statements.
* Rename `transp` to `permutation`.
* Change type of `transp` from `I64ArrayAttr` to `DenseI64ArrayAttr`
(provides direct access to `ArrayRef<int64_t>` instead of `ArrayAttr`).
2023-11-20 11:25:35 +01:00
Cullen Rhodes
bf897d5d77 [mlir][vector] Extend TransferReadDropUnitDimsPattern to support partially-static memrefs (#72142)
This patch extends TransferReadDropUnitDimsPattern to support dropping
unit dims from partially-static memrefs, for example:

%v = vector.transfer_read %base[%c0, %c0], %pad {in_bounds = [true, true]} :
  memref<?x1xi8, strided<[?, ?], offset: ?>>, vector<[16]x1xi8>

Is rewritten as:

%dim0 = memref.dim %base, %c0 : memref<?x1xi8, strided<[?, ?], offset: ?>>
%subview = memref.subview %base[0, 0] [%dim0, 1] [1, 1] :
  memref<?x1xi8, strided<[?, ?], offset: ?>> to memref<?xi8, #map1>
%v = vector.transfer_read %subview[%c0], %pad {in_bounds = [true]}
  : memref<?xi8, #map1>, vector<[16]xi8>

Scalable vectors are now also supported, the scalable dims were being
dropped when creating the rank-reduced vector type. The xfer op can also
have a mask of type 'vector.create_mask', which gets rewritten as long
as the mask of the unit dim is a constant of 1.
2023-11-20 08:39:34 +00:00
Cullen Rhodes
95acb33b45 [mlir][vector] Move transpose with unit-dim to shape_cast pattern (#72493)
Moved from lowering to canonicalization.
2023-11-17 14:06:03 +00:00
Matthias Springer
8a40fcaf35 [mlir][vector] Clean up VectorTransferOpInterface (#72353)
- Better documentation.
- Rename interface methods: `source` -> `getSource`, `indices` ->
`getIndices`, etc. to conform with MLIR naming conventions. A default
implementation is not needed.
- Turn many interface methods into helper functions. Most of the
previous interface methods were not meant to be overridden, and if some
were overridden without others, the op would be have been broken.
2023-11-16 10:35:46 +09:00
Cullen Rhodes
b7b6d54004 [mlir][vector] Add vector.transpose with unit-dim to vector.shape_cast pattern (#72105)
This patch extends the vector.transpose lowering to replace:

vector.transpose %0, [1, 0] : vector<nx1x<eltty>> to vector<1xnx<eltty>>

with:

  vector.shape_cast %0 : vector<nx1x<eltty>> to vector<1xnx<eltty>>

Source with leading unit-dim (inverse) is also replaced. Unit dim must
be fixed. Non-unit dim can be scalable.

A check is also added to bail out for scalable vectors before unrolling.
2023-11-15 14:14:33 +00:00
long.chen
1609f1c2a5 [mlir][affine][nfc] cleanup deprecated T.cast style functions (#71269)
detail see the docment: https://mlir.llvm.org/deprecation/

Not all changes are made manually, most of them are made through a clang
tool I wrote https://github.com/lipracer/cpp-refactor.
2023-11-14 13:01:19 +08:00
Felix Schneider
d5a0fb39ae [mlir][vector] Handle empty MaskOp in LowerVectorMask, MaskOpRewritePattern (#72031)
This patch adds handling of an empty `MaskOp` to `MaskOpRewritePattern`
and thereby fixes a crash.
It also pulls the `MaskOp` canonicalization patterns into
`LowerVectorMask` so that empty `MaskOp`s are folded away in the Pass.

Fix https://github.com/llvm/llvm-project/issues/71036
2023-11-12 08:12:28 +01:00
Quinn Dawkins
bc81f8c87e [mlir][vector] Drop incorrect startRootUpdate calls in vector distribution (#71988)
Fixes asan failures in
https://lab.llvm.org/buildbot/#/builders/5/builds/38191 introduced by
#71964.
2023-11-10 17:07:39 -05:00
Quinn Dawkins
aa2376a083 [mlir][vector] Notify the rewriter when sinking out of warp ops (#71964)
A number of the warp distribution patterns work by rewriting a warp op
in place by moving a contained op outside. This notifies the rewriter
that the warp op is changing in this case.
2023-11-10 14:45:18 -05:00
Han-Chung Wang
2bac720101 [mlir][vector] Take dim sizes into account in DropInnerMostUnitDims. (#71752)
The `stride == 1` does not imply that we can drop it. Because it could
load more than 1 elements. We should also take source sizes and vector
sizes into account. Otherwise it generates invalid IRs. E.g.,

```mlir
func.func @foo(%arg0: memref<1x1xf32>) -> vector<4x8xf32> {
  %c0 = arith.constant 0 : index
  %cst = arith.constant 0.000000e+00 : f32
  %0 = vector.transfer_read %arg0[%c0, %c0], %cst : memref<1x1xf32>, vector<4x8xf32>
  return %0 : vector<4x8xf32>
}
```

Fixes https://github.com/openxla/iree/issues/15493
2023-11-10 09:27:59 -08:00
Quinn Dawkins
d4d2891447 [mlir][vector] Add distribution pattern for vector.create_mask (#71619)
This is the last step needed for basic support for distributing masked
vector code. The lane id gets delinearized based on the distributed mask
shape and then compared against the original mask sizes to compute the
bounds for the distributed mask. Note that the distribution of masks is
implicit on the shape specified by the warp op. As a result, it is the
responsibility of the consumer of the mask to ensure the distributed
mask will match its own distribution semantics.
2023-11-10 10:09:37 -05:00