Commit Graph

245 Commits

Author SHA1 Message Date
Cullen Rhodes
067bd7d051 [mlir][vector] Use optional for outerproduct accumulator instead of variadic
This was introduced before the Optional directive and uses Variadic, but
it's really optional.

Reviewed By: nicolasvasilache, benmxwl-arm, dcaballe

Differential Revision: https://reviews.llvm.org/D159259
2023-09-01 05:50:01 +00:00
Benjamin Maxwell
296d5cb60c [mlir][BuiltinTypes] Return VectorType from VectorType::Builder conversion operator
0-D vectors are now supported, so the special case of returning the just
the element type can now be removed.

A few callers that relied on the old behaviour have been updated.

Reviewed By: awarzynski, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D159122
2023-08-30 13:47:06 +00:00
yzhang93
f4bef787bc Add narrow type emulation pattern for vector.transfer_read
Reviewed By: mravishankar, hanchung

Differential Revision: https://reviews.llvm.org/D158757
2023-08-29 13:15:19 -07:00
Lei Zhang
d243378722 [mlir][vector] Use dyn_cast in if conditions
Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D158336
2023-08-22 08:27:40 -07:00
Andrzej Warzynski
f9070b2dfb [mlir][vector] Enable CastAwayElementwiseLeadingOneDim for scalable vec
This patch effectively enables the CastAwayElementwiseLeadingOneDim
rewrite pattern for scalable vectors. To this end,
`ExtractOp::inferReturnTypes` is updated so that scalable dimensions are
correctly recognised.

The change to ExtractOp will likely make also other conversion patterns
valid for scalable vectors, but this patch focuses on just one case.
Other conversion patterns will be enabled in the forthcoming patches.

Depends on D157993

Differential Revision: https://reviews.llvm.org/D158335
2023-08-22 11:40:46 +00:00
Andrzej Warzynski
576b184d6e [mlir][vector] Add support for scalable vectors in trimLeadingOneDims
This patch updates one specific hook in "VectorDropLeadUnitDim.cpp" to
make sure that "scalable dims" are handled correctly. While this change
affects multiple patterns, I am only adding one regression tests that
captures one specific case that affects me right now.

I am also adding Vector dialect to the list of dependencies of
`-test-vector-to-vector-lowering`. Otherwise my test case won't work as
a standalone test.

Differential Revision: https://reviews.llvm.org/D157993
2023-08-22 08:45:59 +00:00
Lei Zhang
199442ea2c [mlir][vector] Fix uniform transfer_read distribution
If the original shape and the distributed shape is the same,
we don't distribute at all--every thread is handling the whole.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D158235
2023-08-17 17:38:55 -07:00
Mahesh Ravishankar
0f8bab8d59 [mlir] Revamp implementation of sub-byte load/store emulation.
When handling sub-byte emulation, the sizes of the converted `memref`s
also need to be updated (this was not done in the current
implementation). This adds the additional complexity of having to
linearize the `memref`s as well. Consider a `memref<3x3xi4>` where the
`i4` elements are packed. This has a overall size of 5 bytes (rounded
up to number of bytes). This can only be represented by a
`memref<5xi8>`. A `memref<3x2xi8>` would imply an implicit padding of
4 bits at the end of each row. So incorporate linearization into the
sub-byte load-store emulation.

This patch also updates some of the utility functions to make better
use of statically available information using `OpFoldResult` and
`makeComposedFoldedAffineApplyOps`.

Reviewed By: hanchung, yzhang93

Differential Revision: https://reviews.llvm.org/D158125
2023-08-17 20:27:53 +00:00
Lei Zhang
73ddc4474b [mlir][vector] Enable distribution over multiple dimensions
This commit starts enabling vector distruction over multiple
dimensions. It requires delinearize the lane ID to match the
expected rank. shape_cast and transfer_read now can properly
handle multiple dimensions.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D157931
2023-08-16 12:08:43 -07:00
Matthias Springer
a02ad6c177 [mlir][bufferization] Generalize getAliasingOpResults to getAliasingValues
This revision is needed to support bufferization of `cf.br`/`cf.cond_br`. It will also be useful for better analysis of loop ops.

This revision generalizes `getAliasingOpResults` to `getAliasingValues`. An OpOperand can now not only alias with OpResults but also with BlockArguments. In the case of `cf.br` (will be added in a later revision): a `cf.br` operand will alias with the corresponding argument of the destination block.

If an op does not implement the `BufferizableOpInterface`, the analysis in conservative. It previously assumed that an OpOperand may alias with each OpResult. It now assumes that an OpOperand may alias with each OpResult and each BlockArgument of the entry block.

Differential Revision: https://reviews.llvm.org/D157957
2023-08-15 15:02:47 +02:00
Andrzej Warzynski
12b4951866 [mlir][vector] Add missing support for scalable vectors
This patch adds the missing logic so that the
`TransferReadPermutationLowering` can be used for scalable vectors. To
this end:
  * TransferOp custom C++ builder is updated to support scalable
    vectors,
  * `TransferOpReduceRank` is also updated to support scalable vectors.

This pattern is relevant when lowering `linalg.matmul` via
`vector_multi_reduction` for scalable vectors.

I've also updated relevant code in `TransferOpReduceRank` not to use
`llvm::to_vector` for constructing `SmallVector` from `ArrayRef`. That
hook doesn't work for `ArraryRef<bool>` (*), so for consistency I
switched to an explicit constructor (so that both `newShape` and
`newScalableDim` are constructed in a similar fashion).

(*) IIUC, that's due how implicit narrowing conversions between `bool`
and `*bool` work. Note that these narrowing conversions change when
using initializer lists, see
  * https://en.cppreference.com/w/cpp/language/list_initialization.

Depends on D157092

Differential Revision: https://reviews.llvm.org/D157268
2023-08-10 09:08:30 +00:00
Diego Caballero
15a08cf27c [mlir][Vector] Fold selects of single-element i1 vectors
This patch adds a folding to select operation between an all-true and all-false vector.
For now, only single element vectors (i.e., vector<1xi1>) are supported. Multi-element
cases are caught by InstCombine.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D154682
2023-08-09 18:57:36 +00:00
Andrzej Warzynski
5c581720b9 [mlir][Vector] Add support for scalable vectors in multi_reduction
Support for scalable vectors in vector.multi_reduction is added by
simply updating MultiDimReductionOp::verify.

Also, the conversion pattern for reducing n-D vector.multi_reduction to
2D vector.multi_reduction is updated.

Differential Revision: https://reviews.llvm.org/D157092
2023-08-08 17:01:59 +00:00
Matthias Springer
16b75cd2bb [mlir][vector] Use DenseI64ArrayAttr for ExtractOp/InsertOp positions
`DenseI64ArrayAttr` provides a better API than `I64ArrayAttr`. E.g., accessors returning `ArrayRef<int64_t>` (instead of `ArrayAttr`) are generated.

Differential Revision: https://reviews.llvm.org/D156684
2023-07-31 15:25:37 +02:00
Matthias Springer
b1d2687501 [mlir][IR] Remove duplicate isLastMemrefDimUnitStride functions
This function is duplicated in various dialects.

Differential Revision: https://reviews.llvm.org/D155462
2023-07-17 16:31:04 +02:00
Matthias Springer
fd5cda3393 [mlir][vector][NFC] Minor VectorTransferOpInterface cleanup
* Rename functions with underscore to camel case.
* Return C++ bools of "in_bounds" values instead of an `ArrayAttr`.

Differential Revision: https://reviews.llvm.org/D155277
2023-07-14 15:41:21 +02:00
Matthias Springer
6040044f2f [mlir][vector] VectorToSCF: Omit redundant out-of-bounds check
There was a bug in `TransferWriteNonPermutationLowering`, a pattern that extends the permutation map of a TransferWriteOp with leading transfer dimensions of size ones. These newly added transfer dimensions are always in-bounds, because the starting point of any dimension is in-bounds. VectorToSCF inserts out-of-bounds checks based on the "in_bounds" attribute and dims that are marked as out-of-bounds but that are actually always in-bounds lead to unnecessary "scf.if" ops.

Differential Revision: https://reviews.llvm.org/D155196
2023-07-14 09:50:37 +02:00
Hanhan Wang
8fc433f055 [mlir][MemRef] Move narrow type emulation common methods to MemRefUtils.
It also unifies the computation of StridedLayoutAttr. If the stride is
static known value, we can just use it.

Differential Revision: https://reviews.llvm.org/D155017
2023-07-13 14:43:21 -07:00
Quinn Dawkins
5b6b2caf3c [mlir][vector] Handle memory space conflicts in VectorTransferSplit patterns
Currently the transfer splitting patterns will generate an invalid cast
when the source memref for a transfer op has a non-default memory space.
This is handled by first introducing a `memref.memory_space_cast` in
such cases.

Differential Revision: https://reviews.llvm.org/D154515
2023-07-11 22:58:23 -04:00
yzhang93
9a7677d8ee [mlir] Narrow bitwidth emulation for vector.load
This patch is a following for the previous patch https://reviews.llvm.org/D151519.
With this patch, vector.load op with narrow bitwidth (e.g., i4) can be converted to
supported wider bitwidth (e.g., i8).

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D154178
2023-07-11 13:38:15 -07:00
Matthias Springer
867afe5e53 [mlir][vector] Remove duplicate tensor subset <-> vector transfer patterns
Remove patterns that fold tensor subset ops into vector transfer ops from the vector dialect. These patterns already exist in the tensor dialect.

Differential Revision: https://reviews.llvm.org/D154932
2023-07-11 11:12:29 +02:00
Matthias Springer
a7a5641bdc [mlir][vector] Fix bug in TransferWriteNonPermutationLowering
This pattern expands the rank of the vector. However, the rank of the mask was not expanded.

Differential Revision: https://reviews.llvm.org/D154849
2023-07-10 17:21:03 +02:00
Matthias Springer
cb7bda2ace [mlir][NFC] Use getConstantIntValue instead of casting to ConstantIndexOp
`getConstantIntValue` extracts constant values from all constant-like ops, not just `arith::ConstantIndexOp`.

Differential Revision: https://reviews.llvm.org/D154356
2023-07-04 14:08:37 +02:00
Matthias Springer
030b18fe14 [mlir][vector] Clean up some dimension size checks
* Add `memref::getMixedSize` (same as in the tensor dialect).
* Simplify in-bounds check in `VectorTransferSplitRewritePatterns.cpp` and fix off-by-one error in the static in-bounds check.
* Use "memref::DimOp" instead of `createOrFoldDimOp` when possible.

Differential Revision: https://reviews.llvm.org/D154218
2023-07-03 09:10:00 +02:00
Andrzej Warzynski
f22af204ed [mlir][VectorType] Remove numScalableDims from the vector type
This is a follow-up of https://reviews.llvm.org/D153372 in which
`numScalableDims` (single integer) was effectively replaced with
`isScalableDim` bitmask.

This change is a part of a larger effort to enable scalable
vectorisation in Linalg. See this RFC for more context:
  * https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/

Differential Revision: https://reviews.llvm.org/D153412
2023-06-28 13:53:45 +01:00
Matthias Springer
efc290ce9c [mlir][affine] More efficient makeComposedFolded... helpers
The old code used to materialize constants as ops, immediately folded them into the resulting affine map and then deleted the constant ops again. Instead, directly fold the attributes into the affine map. Furthermore, all helpers accept `OpFoldResult` instead of `Value` now. This makes the code at call sites more efficient, because it is no longer necessary to materialize a `Value`, just to be able to use these helper functions.

Note: The API has changed (accepts OpFoldResult instead of Value), otherwise this change is NFC.

Differential Revision: https://reviews.llvm.org/D153324
2023-06-22 10:47:38 +02:00
Andrzej Warzynski
4d339ec91e [mlir][Vector] Add pattern to reorder elementwise and broadcast ops
The new pattern will replace elementwise(broadcast) with
broadcast(elementwise) when safe.

This change affects tests for vectorising nD-extract. In one case
("vectorize_nd_tensor_extract_with_tensor_extract") I just trimmed the
test and only preserved the key parts (scalar and contiguous load from
the original Op). We could do the same with some other tests if that
helps maintainability.

Differential Revision: https://reviews.llvm.org/D152812
2023-06-15 10:13:41 +01:00
Cullen Rhodes
1e41a29d73 Revert "[mlir][ArmSME] Add initial dialect with basic lowering of vector.transfer write to zero"
Apologies I shouldn't have comitted this, need to wait until the planned
MLIR ODM:

  https://discourse.llvm.org/t/rfc-creating-a-armsme-dialect/67208/76

This reverts commit a48fe89885.
2023-06-14 09:03:10 +00:00
Cullen Rhodes
a48fe89885 [mlir][ArmSME] Add initial dialect with basic lowering of vector.transfer write to zero
This patch adds support for lowering a `vector.transfer_write` of zeroes
and type `vector<[16x16]xi8>` to the SME `zero {za}` instruction [1],
which zeroes the entire accumulator.

This contributes to supporting a path from `linalg.fill` to SME.

[1] https://developer.arm.com/documentation/ddi0602/2022-06/SME-Instructions/ZERO--Zero-a-list-of-64-bit-element-ZA-tiles-

Reviewed By: awarzynski, dcaballe

Differential Revision: https://reviews.llvm.org/D152508
2023-06-14 08:46:53 +00:00
Matthias Springer
80853a1673 [mlir][vector][bufferize] Better analysis for vector.transfer_write
The destination operand does not bufferize to a memory read if it is completely overwritten.

Differential Revision: https://reviews.llvm.org/D152823
2023-06-14 09:38:51 +02:00
Nicolas Vasilache
e35ff2605f [mlir][vector] NFC - Add debug information to vector unrolling patterns 2023-06-08 08:06:47 +00:00
Quentin Colombet
1dd00d3903 [mlir][Vector] Fix a propagation bug with broadcast
In the vector distribute patterns, we used to move
`vector.broadcast`s out of `vector.warp_execute_on_lane0`s
irrespectively of how they were defined.

This could create broadcast operations with invalid semantic.
E.g.,
```
%r = warop ...[32] ... -> vector<1x2xf32> {
  %val = broadcast %in : vector<64xf32> to vetor<1x64xf32>
  vector.yield %val : vector<1x64xf32>
}
```
=>
```
%r = warop ...[32] ... -> vector<64xf32> {
  vector.yield %in : vector<64xf32>
}
// Broadcasting to a narrower type!
broadcast %r : vector<64xf32> to vector<1x2xf32>
```

The root issue is we are trying to broadcast something that is not the same
for each thread, so there is actually nothing to propagate here.

The fix checks that the broadcast we want to create actually makes sense.

Differential Revision: https://reviews.llvm.org/D152154
2023-06-06 16:40:15 +02:00
Manish Gupta
9a795f0c59 [mlir][Vector] Adds a pattern to fold arith.extf into vector.contract
Consider mixed precision data type, i.e., F16 input lhs, F16 input rhs, F32 accumulation, and F32 output. This is typically written as F32 <= F16*F16 + F32.

During vectorization from linalg to vector for mixed precision data type (F32 <= F16*F16 + F32), linalg.matmul introduces arith.extf on input lhs and rhs operands.

"linalg.matmul"(%lhs, %rhs, %acc) ({
      ^bb0(%arg1: f16, %arg2: f16, %arg3: f32):
        %lhs_f32 = "arith.extf"(%arg1) : (f16) -> f32
        %rhs_f32 = "arith.extf"(%arg2) : (f16) -> f32
       %mul = "arith.mulf"(%lhs_f32, %rhs_f32) : (f32, f32) -> f32
        %acc = "arith.addf"(%arg3, %mul) : (f32, f32) -> f32
      "linalg.yield"(%acc) : (f32) -> ()
    })
There are backend that natively supports mixed-precision data type and does not need the arith.extf. For example, NVIDIA A100 GPU has mma.sync.aligned.*.f32.f16.f16.f32 that can support mixed-precision data type. However, the presence of arith.extf in the IR, introduces the unnecessary casting targeting F32 Tensor Cores instead of F16 Tensor Cores for NVIDIA backend. This patch adds a folding pattern to fold arith.extf into vector.contract

Differential Revision: https://reviews.llvm.org/D151918
2023-06-05 23:22:20 +00:00
Quentin Colombet
018d8ac974 [mlir][Vector] Fix a propagation bug with transfer_read
In the vector distribute patterns, we used to move
`vector.transfer_read`s out of `vector.warp_execute_on_lane0`s
irrespectively of how they were defined.

This could create transfer_read operations that would read values from
within the warpOp's body from outside of the body.
E.g.,
```
warpop {
  %defined_in_body
  %read = transfer_read %defined_in_body
  vector.yield %read
}
```
=>
```
warpop {
  %defined_in_body
  vector.yield ...
}
// %defined_in_body is referenced outside of its scope.
%read = transfer_read %defined_in_body
```

The fix consists in checking that all the values feeding the new
`transfer_read` are defined outside of warpOp's body.

Note: We could do this check before creating any operation, but that would
mean knowing what `affine::makeComposedAffineApply` actually do. So the
current fix is a trade off of coupling the implementations of this
propagation and `makeComposedAffineApply` versus compile time.

Differential Revision: https://reviews.llvm.org/D152149
2023-06-05 15:52:26 +02:00
Matthias Springer
01128d4baf [mlir][vector][NFC] Clean up headers
Certain functions were declared in `VectorOps.h` instead of `VectorTransforms.h` or `VectorRewritePatterns.h`.

Differential Revision: https://reviews.llvm.org/D152146
2023-06-05 15:16:20 +02:00
Diego Caballero
834fcfed24 Reland "[mlir][Vector] Extend xfer drop unit dim patterns"
This reverts commit 76d71f3792.
2023-06-01 22:22:16 +00:00
Diego Caballero
d3e1398bef [mlir][Vector] Prevent vector-to-scalar xfer patterns from triggering on sub-vectors
Patterns that convert extract(transfer_read) into a scalar load where
incorrectly triggering for cases where a sub-vector instead of a scalar
was extracted.

Reviewed By: nicolasvasilache, hanchung, awarzynski

Differential Revision: https://reviews.llvm.org/D151862
2023-06-01 22:22:16 +00:00
Diego Caballero
0935c0556b [mlir][Vector] Add support for 0-D 'vector.shape_cast' lowering
This PR adds support for shape casting from and to 0-D vectors.

Reviewed By: nicolasvasilache, hanchung, awarzynski

Differential Revision: https://reviews.llvm.org/D151851
2023-06-01 22:22:16 +00:00
Diego Caballero
76d71f3792 Revert "[mlir][Vector] Extend xfer drop unit dim patterns"
This reverts commit a53cd03dea.

This commit is exposing some implementation gaps in other patterns.
Reverting for now.
2023-05-31 18:20:05 +00:00
Diego Caballero
a53cd03dea [mlir][Vector] Extend xfer drop unit dim patterns
This patch extends the transfer drop unit dim patterns to support cases where the vector shape should also be reduced
(e.g., transfer_read(memref<1x4x1xf32>, vector<1x4x1xf32>) -> transfer_read(memref<4xf32>, vector<4xf32>).

Reviewed By: hanchung, pzread

Differential Revision: https://reviews.llvm.org/D151007
2023-05-23 20:58:51 +00:00
Diego Caballero
14726cd691 [mlir][Vector] Extend xfer_read(extract)->scalar load to support multiple uses
This patch extends the vector.extract(vector.transfer_read) -> scalar
load patterns to support vector.transfer_read with multiple uses. For
now, we check that all the uses are vector.extract operations.
Supporting multiple uses is predicated under a flag.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D150812
2023-05-19 21:03:18 +00:00
Lei Zhang
e000b62a34 [mlir][vector] Separate out vector transfer + tensor slice patterns
These patterns touches the structure generated from tiling so it
affects later steps like bufferization and vector hoisting.
Instead of putting them in canonicalization, this commit creates
separate entry points for them to be called explicitly.

This is NFC regarding the functionality and tests of those patterns.
It also addresses two TODO items in the codebase.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D150702
2023-05-17 09:01:19 -07:00
Matthias Springer
61223c49dd [mlir][GPU] Rename MLIRGPUOps CMake target to MLIRGPUDialect
This is for consistency with other dialects.

Differential Revision: https://reviews.llvm.org/D150659
2023-05-16 14:25:08 +02:00
Tres Popp
c1fa60b4cd [mlir] Update method cast calls to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.

Context:

* https://mlir.llvm.org/deprecation/ at "Use the free function variants for dyn_cast/cast/isa/…"
* Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This follows a previous patch that updated calls
`op.cast<T>()-> cast<T>(op)`. However some cases could not handle an
unprefixed `cast` call due to occurrences of variables named cast, or
occurring inside of class definitions which would resolve to the method.
All C++ files that did not work automatically with `cast<T>()` are
updated here to `llvm::cast` and similar with the intention that they
can be easily updated after the methods are removed through a
find-replace.

See https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
for the clang-tidy check that is used and then update printed
occurrences of the function to include `llvm::` before.

One can then run the following:
```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
                 -export-fixes /tmp/cast/casts.yaml mlir/*\
                 -header-filter=mlir/ -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc
```

Differential Revision: https://reviews.llvm.org/D150348
2023-05-12 11:21:30 +02:00
Tres Popp
5550c82189 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast
   - Some files had not included a header file that defines the cast
     functions
   - Some files are definitions of the classes that have the casting
     methods, so the code still refers to the method instead of the
     function without adding a prefix or removing the method declaration
     at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
            mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
            mlir/lib/**/IR/\
            mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
            mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
            mlir/test/lib/Dialect/Test/TestTypes.cpp\
            mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
            mlir/test/lib/Dialect/Test/TestAttributes.cpp\
            mlir/unittests/TableGen/EnumsGenTest.cpp\
            mlir/test/python/lib/PythonTestCAPI.cpp\
            mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123
2023-05-12 11:21:25 +02:00
Hanhan Wang
25cc5a71b3 [mlir][vector] Generalize vector.transpose lowering to n-D vectors
The existing vector.transpose lowering patterns only triggers if the
input vector is 2D. The revision extends the pattern to handle n-D
vectors which are effectively 2-D vectors (e.g., vector<1x4x1x8x1).

It refactors a common check about 2-D vectors from X86Vector
lowering to VectorUtils.h so it can be reused by both sides.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D149908
2023-05-08 10:48:26 -07:00
Quinn Dawkins
650f04feda [mlir][vector] Add pattern to break down vector.bitcast
The pattern added here is intended as a last resort for targets like
SPIR-V where there are vector size restrictions and we need to be able
to break down large vector types. Vectorizing loads/stores for small
bitwidths (e.g. i8) relies on bitcasting to a larger element type and
patterns to bubble bitcast ops to where they can cancel.
This fails for cases such as
```
%1 = arith.trunci %0 : vector<2x32xi32> to vector<2x32xi8>
vector.transfer_write %1, %destination[%c0, %c0] {in_bounds = [true, true]} : vector<2x32xi8>, memref<2x32xi8>
```
where the `arith.trunci` op essentially does the job of one of the
bitcasts, leading to a bitcast that need to be further broken down
```
vector.bitcast %0 : vector<16xi8> to vector<4xi32>
```

Differential Revision: https://reviews.llvm.org/D149065
2023-04-25 20:18:02 -04:00
Quinn Dawkins
435f7d4c2e [mlir][vector] Add unroll pattern for vector.gather
This pattern is useful for SPIR-V to unroll to a supported vector size
before later lowerings. The unrolling pattern is closer to an
elementwise op than the transfer ops because the index values from which
to extract elements are captured by the index vector and thus there is
no need to update the base offsets when unrolling gather.

Differential Revision: https://reviews.llvm.org/D149066
2023-04-24 14:02:59 -04:00
Hanhan Wang
8d163e5045 [mlir][Vector] Add 16x16 strategy to vector.transpose lowering.
It adds a `shuffle_16x16` strategy LowerVectorTranspose and renames `shuffle` to `shuffle_1d`. The idea is similar to 8x8 cases in x86Vector::avx2. The general algorithm is:

```
interleave 32-bit lanes using
    8x _mm512_unpacklo_epi32
    8x _mm512_unpackhi_epi32
interleave 64-bit lanes using
    8x _mm512_unpacklo_epi64
    8x _mm512_unpackhi_epi64
permute 128-bit lanes using
   16x _mm512_shuffle_i32x4
permute 256-bit lanes using again
   16x _mm512_shuffle_i32x4
```

After the first stage, they got transposed to

```
 0  16   1  17   4  20   5  21   8  24   9  25  12  28  13  29
 2  18   3  19   6  22   7  23  10  26  11  27  14  30  15  31
32  48  33  49 ...
34  50  35  51 ...
64  80  65  81 ...
...
```

After the second stage, they got transposed to

```
 0  16  32  48 ...
 1  17  33  49 ...
 2  18  34  49 ...
 3  19  35  51 ...
64  80  96 112 ...
65  81  97 114 ...
66  82  98 113 ...
67  83  99 115 ...
...
```

After the thrid stage, they got transposed to

```
  0  16  32  48   8  24  40  56  64  80  96  112 ...
  1  17  33  49 ...
  2  18  34  50 ...
  3  19  35  51 ...
  4  20  36  52 ...
  5  21  37  53 ...
  6  22  38  54 ...
  7  23  39  55 ...
128 144 160 176 ...
129 145 161 177 ...
...
```

After the last stage, they got transposed to

```
0  16  32  48  64  80  96 112 ... 240
1  17  33  49  66  81  97 113 ... 241
2  18  34  50  67  82  98 114 ... 242
...
15  31  47  63  79  96 111 127 ... 255
```

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D148685
2023-04-23 11:05:41 -07:00
Lei Zhang
eca7698a97 [mlir][vector] NFC: Expose castAwayContractionLeadingOneDim
This commit exposes the transformation behind the pattern.
It is useful for more targeted application on a specific op
for once.

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D148758
2023-04-21 09:41:14 -07:00