Commit Graph

949 Commits

Author SHA1 Message Date
Ivan Butygin
f54cdc5d6e [mlir] IntegerRangeAnalysis: add support for vector type (#112292)
Treat integer range for vector type as union of ranges of individual
elements. With this semantics, most arith ops on vectors will work out
of the box, the only special handling needed for constants and vector
elements manipulation ops.

The end goal of these changes is to be able to optimize vectorized index
calculations.
2024-11-01 23:58:16 +03:00
Manupa Karunaratne
a6e72f9392 [MLIR][Vector] Add Lowering for vector.step (#113655)
Currently, the lowering for vector.step lives
under a folder. This is not ideal if we want
to do transformation on it and defer the
 materizaliztion of the constants much later.

This commits adds a rewrite pattern that
could be used by using
`transform.structured.vectorize_children_and_apply_patterns`
transform dialect operation.

Moreover, the rewriter of vector.step is also
now used in -convert-vector-to-llvm pass where
it handles scalable and non-scalable types as
LLVM expects it.

As a consequence of removing the vector.step
lowering as its folder, linalg vectorization
will keep vector.step intact.
2024-11-01 16:38:36 +00:00
lialan
2c313259c6 [MLIR] VectorEmulateNarrowType to support loading of unaligned vectors (#113411)
Previously, the pass only supported emulation of loading vector sizes
that are multiples of the emulated data type. This patch expands its
support for emulating sizes that are not multiples of byte sizes. In
such cases, the element values are packed back-to-back to preserve
memory space.

To give a concrete example: if an input has type `memref<3x3xi2>`, it is
actually occupying 3 bytes in memory, with the first 18 bits storing the
values and the last 6 bits as padding. The slice of `vector<3xi2>` at
index `[2, 0]` is stored in memory from bit 12 to bit 18. To properly
load the elements from bit 12 to bit 18 from memory, first load byte 2
and byte 3, and convert it to a vector of `i2` type; then extract bits 4
to 10 (element index 2-5) to form a `vector<3xi2>`.

A limitation of this patch is that the linearized index of the unaligned
vector has to be known at compile time. Extra code needs to be emitted
to handle it if the condition does not hold.

The following ops are updated:
* `vector::LoadOp`
* `vector::TransferReadOp`
* `vector::MaskedLoadOp`
2024-10-29 20:04:48 -07:00
Kunwar Grover
2c5eea0e88 [mlir][Vector] Fix vector.insert folder for scalar to 0-d inserts (#113828)
The current vector.insert folder tries to replace a scalar with a 0-rank
vector. This patch fixes this crash by not folding unless they types of
the result and replacement are same.
2024-10-29 22:47:44 +00:00
Andrzej Warzyński
0cf7aaf300 [MLIR][Vector] Update Transfer{Read|Write}DropUnitDimsPattern patterns (#112394)
Updates `TransferWriteDropUnitDimsPattern` and
`TransferReadDropUnitDimsPattern` to inherit from
`MaskableOpRewritePattern` so that masked versions of
xfer_read/xfer_write Ops are also supported:

```mlir
    %v = vector.mask %mask {
      vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst :
        memref<1x1x3x2xi8, strided<[6, 6, 2, 1], offset: ?>>, vector<3x2xi8>
    } : vector<3x2xi1> -> vector<3x2xi8>
```
2024-10-26 13:54:04 +01:00
Jacques Pienaar
bb00f5b1ed [mlir][vector] Remove unneeded mask restriction (#113742)
These were added when the only mapping was to LLVM.
2024-10-25 20:45:44 -07:00
Kunwar Grover
1004865f1c [mlir][Vector] Support 0-d vectors natively in TransferOpReduceRank (#112907)
Since
ddf2d62c7d
, 0-d vectors are supported in VectorType. This patch removes 0-d vector
handling with scalars for the TransferOpReduceRank pattern. This pattern
specifically introduces tensor.extract_slice during vectorization,
causing vectorization to not fold transfer_read/transfer_write slices
properly. The changes in vectorization test files reflect this.

There are other places where lowering patterns are still side-stepping
from handling 0-d vectors properly, by turning them into scalars, but
this patch only focuses on the vector.transfer_x patterns.
2024-10-22 15:50:16 +01:00
Benoit Jacob
a9ebdbb5ac [MLIR] Vector: turn the ExtractStridedSlice rewrite pattern from #111541 into a canonicalization (#111614)
This is a reasonable canonicalization because `extract` is more
constrained than `extract_strided_slices`, so there is no loss of
semantics here, just lifting an op to a special-case higher/constrained
op. And the additional `shape_cast` is merely adding leading unit dims
to match the original result type.

Context: discussion on #111541. I wasn't sure how this would turn out,
but in the process of writing this PR, I discovered at least 2 bugs in
the pattern introduced in #111541, which shows the value of shared
canonicalization patterns which are exercised on a high number of
testcases.

---------

Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
2024-10-09 09:24:23 -04:00
Kunwar Grover
32db6fbdb9 [mlir][vector] Implement speculation for vector.transferx ops (#111533)
This patch implements speculation for
vector.transfer_read/vector.transfer_write ops, allowing these ops to
work with LICM.
2024-10-09 13:50:33 +01:00
Benoit Jacob
d905b1caf1 [MLIR] Vector dialect: Address post-merge review comments on #111541 (#111552)
Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
2024-10-08 15:35:16 -04:00
Benoit Jacob
10054ba4ac [mlir][vector] Add pattern to rewrite contiguous ExtractStridedSlice into Extract (#111541)
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
2024-10-08 11:51:01 -04:00
BARRET
1666d13078 [CMake]: Remove unnecessary dependencies on LLVM/MLIR (#111255)
Previous https://github.com/llvm/llvm-project/pull/110362 (reverted)
caused breakage. Here is the PR with fix.

My build cmdline:

```
cmake ../llvm \
    -G Ninja \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_PREFIX=install \
    -DCMAKE_C_COMPILER=gcc-9 \
    -DCMAKE_CXX_COMPILER=g++-9 \
    -DCMAKE_CUDA_COMPILER=$(which nvcc) \
    -DLLVM_ENABLE_LLD=OFF \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DLLVM_BUILD_EXAMPLES=ON \
    -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \
    -DLLVM_CCACHE_BUILD=ON \
    -DMLIR_ENABLE_BINDINGS_PYTHON=ON \
    -DBUILD_SHARED_LIBS=ON \
    -DLLVM_ENABLE_PROJECTS='llvm;mlir'
```
2024-10-07 15:52:43 +02:00
Matthias Springer
206fad0e21 [mlir][NFC] Mark type converter in populate... functions as const (#111250)
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.

Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate pattern now have
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.

Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
2024-10-05 21:32:40 +02:00
Andrzej Warzyński
56d6b56739 [mlir][vector] Relax the requirements on broadcast dims (#99341)
NOTE: This is a follow-up for #97049 in which the `in_bounds` attribute
was made mandatory.

This PR updates the semantics of the `in_bounds` attribute so that
broadcast dimensions are no longer required to be "in bounds".
Specifically, these xfer_read/xfer_write Ops become valid after this
change:

```mlir
  %read = vector.transfer_read %A[%base1, %base2], %pad
      {in_bounds = [false], permutation_map = affine_map<(d0, d1) -> (0)>}
      {permutation_map = affine_map<(d0, d1) -> (0)>}
      : memref<?x?xf32>, vector<9xf32>

  vector.transfer_write %vec, %A[%base1, %base2],
      {in_bounds = [false], permutation_map = affine_map<(d0, d1) -> (0)>}
      {permutation_map = affine_map<(d0, d1) -> (0)>}
      : vector<9xf32>, memref<?x?xf32>
```

Note that the value `false` merely means "may run out-of-bounds", i.e.,
the corresponding access can still be "in bounds". In fact, the folder
for xfer Ops is also updated (*) and will update the attribute value
corresponding to broadcast dims to `true` if all non-broadcast dims
are marked as "in bounds". 

Note that this PR doesn't change any of the lowerings. The changes in
"SuperVectorize.cpp", "Vectorization.cpp" and "AffineMap.cpp" are simple
reverts of recent changes in #97049. Those were only meant to facilitate
making `in_bounds` mandatory and to work around the extra requirements
for broadcast dims (those requirements ere removed in this PR). All
changes in tests are also reverts of changes from #97049.

For context, here's a PR in which "broadcast" dims where forced to
always be "in-bounds":
  * https://reviews.llvm.org/D102566

(*) See `foldTransferInBoundsAttribute`.
2024-10-04 07:41:20 +01:00
Quinn Dawkins
4e2efea5e8 [mlir][vector] Add all view-like ops to transfer flow opt (#110521)
`vector.transfer_*` folding and forwarding currently does not take into
account reshaping view-like memref ops (expand and collapse shape),
leading to potentially invalid store folding or value forwarding. This
patch adds tracking for those (and other) view-like ops. It is still
possible to design operations that alias memrefs without being a view
(e.g. memref in the iter_args of an `scf.for`), so these patterns may
still need revisiting in the future.
2024-10-02 00:20:44 -04:00
Andrzej Warzyński
1f5e8263b9 [mlir][vector] Add a new TD Op for patterns leveraging ShapeCastOp (#110525)
Adds a new Transform Dialect Op that collects patters for dropping unit
dims from various Ops:
  * `transform.apply_patterns.vector.drop_unit_dims_with_shape_cast`.

It excludes patterns for vector.transfer Ops - these are collected
under:
  * `apply_patterns.vector.rank_reducing_subview_patterns`,

and use ShapeCastOp _and_ SubviewOp to reduce the rank (and to eliminate
unit dims).

This new TD Ops allows us to test the "ShapeCast folder" pattern in
isolation. I've extracted the only test that I could find for that
folder from "vector-transforms.mlir" and moved it to a dedicated file:
"shape-cast-folder.mlir". I also added a test case with scalable
vectors.

Changes in VectorTransforms.cpp are not needed (added a comment with
a TODO + ordered the patterns alphabetically). I am Including them here
to avoid a separate PR.
2024-10-01 10:08:43 +01:00
Mehdi Amini
8b47711e84 Revert "CMake: Remove unnecessary dependencies on LLVM/MLIR" (#110594)
Reverts llvm/llvm-project#110362

Multiple bots are broken.
2024-10-01 00:44:21 +02:00
BARRET
4980f2177e CMake: Remove unnecessary dependencies on LLVM/MLIR (#110362)
There are some spurious libraries which can be removed.

I'm trying to bundle MLIR/LLVM library dependencies for our own
libraries. We're utilizing cmake function to recursively collect
MLIR/LLVM related dependencies. However, we identified certain library
dependencies as redundant and safe for removal.
2024-09-30 23:57:13 +02:00
Quinn Dawkins
a3b34e67e6 [mlir][vector] Add pattern for dropping unit dims from for loops (#109585)
This adds a pattern for dropping unit dims from the iter_args of scf.for
ops using vector.shape_cast. This composes with the other patterns for
dropping unit dims from elementwise ops and transposes.
2024-09-27 11:43:27 -04:00
Andrzej Warzyński
1335a11176 [mlir][vector][nfc] Clean-up VectorOps.{h|cpp} (#109316) 2024-09-19 21:45:01 +01:00
Diego Caballero
bcd65ba612 [mlir][Vector] Verify that masked ops implement MaskableOpInterface (#108123)
This PR fixes a bug in `MaskOp::verifier` that allowed `vector.mask` to
mask operations that did not implement the MaskableOpInterface.
2024-09-19 10:17:13 -07:00
Ivan Butygin
f325085878 [mlir][vector] Relax strides check for 1-element vector load/stores (#108998)
Single elememst vector load/stores are equivalent to scalar load/stores,
so they don't need memref to be contigious.
2024-09-19 13:12:32 +03:00
Jie Fu
9d8950a8d9 [mlir] 'dyn_cast' is deprecated: Use mlir::dyn_cast<U>() instead (NFC)
/llvm-project/mlir/lib/Dialect/Vector/IR/VectorOps.cpp:2923:29: error: 'dyn_cast' is deprecated: Use mlir::dyn_cast<U>() instead [-Werror,-Wdeprecated-declarations]
 2923 |     if (auto intAttr = attr.dyn_cast<IntegerAttr>()) {
      |                             ^
/llvm-project/mlir/include/mlir/IR/Attributes.h:184:14: note: 'dyn_cast' has been explicitly marked deprecated here
  184 | U Attribute::dyn_cast() const {
      |              ^
2024-09-09 09:00:15 +08:00
Rajveer Singh Bharadwaj
1c4b04ce1f [mlir] Fix crash in InsertOpConstantFolder when vector.insert operand is from a llvm.mlir.constant op (#88314)
In cases where llvm.mlir.constant has an attribute with a different type than the returned type,
the folder use to create an incorrect DenseElementsAttr and crash.

Resolves #74236
2024-09-09 01:44:30 +02:00
Longsheng Mou
50febdeb64 [mlir][vector] Bugfix of linearize vector.extract (#106836)
This patch add check for `vector.extract` with scalar type, which is not
allowed when linearize `vector.extract`. Fix #106162.
2024-09-04 16:41:56 +08:00
Benjamin Maxwell
74a512df2b [mlir][vector] Populate sink patterns in apply_patterns.vector.reduction_to_contract (#104754)
This restores the functionality to before:
42944da5ba
This fixes a buildbot failure:
https://lab.llvm.org/buildbot/#/builders/143/builds/1487
2024-08-19 11:20:18 +01:00
Andrzej Warzyński
42944da5ba [mlir][vector] Group re-order patterns together (#102856)
Group all patterns that re-order vector.transpose and vector.broadcast
Ops (*) under `populateSinkVectorOpsPatterns`. These patterns are
normally used to "sink" redundant Vector Ops, hence grouping together.
Example:

```mlir
%at = vector.transpose %a, [1, 0]: vector<4x2xf32> to vector<2x4xf32>
%bt = vector.transpose %b, [1, 0]: vector<4x2xf32> to vector<2x4xf32>
%r = arith.addf %at, %bt : vector<2x4xf32>
```
would get converted to:
```mlir
%0 = arith.addf %a, %b : vector<4x2xf32>
%r = vector.transpose %0, [1, 0] : vector<2x4xf32>
```

This patch also moves all tests for these patterns so that all of them
are:
  * run under one test-flag: `test-vector-sink-patterns`,
  * located in one file: "vector-sink.mlir".

To facilitate this change:
  * `-test-sink-vector-broadcast` is renamed as
    `test-vector-sink-patterns`,
  * "sink-vector-broadcast.mlir" is renamed as "vector-sink.mlir",
  * tests for `ReorderCastOpsOnBroadcast` and
    `ReorderElementwiseOpsOnTranspose` patterns are moved from
    "vector-reduce-to-contract.mlir" to "vector-sink.mlir",
  * `ReorderElementwiseOpsOnTranspose` patterns are removed from
    `populateVectorReductionToContractPatterns` and added to (newly
    created) `populateSinkVectorOpsPatterns`,
  * `ReorderCastOpsOnBroadcast` patterns are removed from
    `populateVectorReductionToContractPatterns` - these are already
    present in `populateSinkVectorOpsPatterns`.

This should allow us better layering and more straightforward testing.
For the latter, the goal is to be able to easily identify which pattern
a particular test is exercising (especially when it's a specific
pattern).

NOTES FOR DOWNSTREAM USERS

In order to preserve the current functionality, please make sure to add
  * `populateSinkVectorOpsPatterns`,

wherever you are using `populateVectorReductionToContractPatterns`.
Also, rename `populateSinkVectorBroadcastPatterns` as
`populateSinkVectorOpsPatterns`.

(*) I didn't notice any other re-order patterns.
2024-08-16 16:53:53 +01:00
Andrzej Warzyński
efe3db2124 [mlir][vector] Add tests for populateSinkVectorBroadcastPatterns (1/n) (#102286)
Adds tests for scalable vectors in:
  * sink-vector-broadcast.mlir

This test file excercises patterns grouped under
`populateSinkVectorBroadcastPatterns`, which includes:
  * `ReorderElementwiseOpsOnBroadcast`,
  * `ReorderCastOpsOnBroadcast`.

Right now there are only tests for the former. However, I've noticed
that "vector-reduce-to-contract.mlir" contains tests for the latter and
I've left a few TODOs to group these tests back together in one file.

Additionally, added some helpful `notifyMatchFailure` messages in
`ReorderElementwiseOpsOnBroadcast`.
2024-08-14 17:28:55 +01:00
Bangtian Liu
b5e47d2e40 [mlir][vector] Add extra check on distribute types to avoid crashes (#102952)
This PR addresses the issue detailed in
https://github.com/iree-org/iree/issues/17948.

The problem occurs when distributed types are set to NULL, leading to
compilation crashes.

---------

Signed-off-by: Bangtian Liu <liubangtian@gmail.com>
2024-08-14 08:47:38 -07:00
Benjamin Maxwell
5f26497da7 [mlir][vector] Use DenseI64ArrayAttr in vector.multi_reduction (#102637)
This prevents some unnecessary conversions to/from int64_t and
IntegerAttr.
2024-08-10 14:10:24 +01:00
Benjamin Maxwell
9b06e25e73 [mlir][vector] Add mask elimination transform (#99314)
This adds a new transform `eliminateVectorMasks()` which aims at
removing scalable `vector.create_masks` that will be all-true at
runtime. It attempts to do this by simply pattern-matching the mask
operands (similar to some canonicalizations), if that does not lead to
an answer (is all-true? yes/no), then value bounds analysis will be used
to find the lower bound of the unknown operands. If the lower bound is
>= to the corresponding mask vector type dim, then that dimension of the
mask is all true.

Note that the pattern matching prevents expensive value-bounds analysis
in cases where the mask won't be all true.

For example:
```mlir
%mask = vector.create_mask %dynamicValue, %c2 : vector<8x4xi1>
```
From looking at `%c2` we can tell this is not going to be an all-true
mask, so we don't need to run the value-bounds analysis for
`%dynamicValue` (and can exit the transform early).

Note: Eliminating create_masks here means replacing them with all-true
constants (which will then lead to the masks folding away).
2024-08-09 10:51:49 +01:00
Han-Chung Wang
201da87c3f [mlir][vector] Handle corner cases in DropUnitDimsFromTransposeOp. (#102518)
da8778e499
breaks the lowering of vector.transpose that all the dimensions are unit
dimensions. The revision fixes the issue and adds a test.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2024-08-08 17:29:09 -07:00
Benjamin Maxwell
da492d4387 [mlir][vector] Fix return of DropUnitDimsFromTransposeOp pattern (#102478)
This accidentally returned `failure()` (rather than `success()`) when it
applied.
2024-08-08 16:31:11 +01:00
Andrzej Warzyński
22a130220c [mlir][vector] Add more tests for ConvertVectorToLLVM (1/n) (#101936)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.bitcast
  * vector.broadcast

Note, this has uncovered some missing logic in `BroadcastOpLowering`.
This PR fixes the most basic cases where the scalable flags were dropped
and the generated code was incorrect. Also, the conditions in
`vector::isBroadcastableTo` are relaxed to allow cases like this:
```mlir
%0 = vector.broadcast %arg0 : vector<1xf32> to vector<[4]xf32>
```

The `BroadcastOpLowering` pattern is effectively disabled for scalable
vectors in more complex cases where an SCF loop would be required to
loop over the scalable dims, e.g.:
```mlir
 %0 = vector.broadcast %arg0 : vector<[4]x1x2xf32> to vector<[4]x3x2xf32>
```

These cases are marked as "Stretch not at start" in the code. In those
cases, support for scalable vectors is left as a TODO.
2024-08-08 15:57:36 +01:00
Benjamin Maxwell
da8778e499 [mlir][vector] Add pattern to drop unit dims from vector.transpose (#102017)
Example:

BEFORE:
```mlir
%transpose = vector.transpose %vector, [3, 0, 1, 2]
  : vector<1x1x4x[4]xf32> to vector<[4]x1x1x4xf32>
```

AFTER:
```mlir
%dropDims = vector.shape_cast %vector
  : vector<1x1x4x[4]xf32> to vector<4x[4]xf32>
%transpose = vector.transpose %0, [1, 0]
  : vector<4x[4]xf32> to vector<[4]x4xf32>
%restoreDims = vector.shape_cast %transpose
  : vector<[4]x4xf32> to vector<[4]x1x1x4xf32>
```
2024-08-08 11:42:27 +01:00
Andrzej Warzyński
1919db907b [mlir][vector] Clarify the semantics of BroadcastOp (#101928)
Clarifies the semantics of `vector.broadcast` in the context of scalable
vectors. In particular, broadcasting a unit scalable dim, `[1]`, is not
valid unless there's a match between the output and the input dims.
See the examples below for an illustration:

```mlir
// VALID
 %0 = vector.broadcast %arg0 : vector<[1]xf32> to vector<4x[1]xf32>
// INVALID
 %0 = vector.broadcast %arg0 : vector<[1]xf32> to vector<[4]xf32>
// VALID FIXED-WIDTH EQUIVALENT
 %0 = vector.broadcast %arg0 : vector<1xf32> to vector<4xf32>
```

Documentation, the Op verifier and tests are updated accordingly.
2024-08-08 09:11:19 +01:00
Andrzej Warzyński
cb89457ff8 [nlir][vector] Constrain ContractionOpToMatmulOpLowering (#102225)
Disables `ContractionOpToMatmulOpLowering` for scalable vectors. This
pattern is meant to enable lowering to `llvm.matrix.multiply` - I'm not
aware of any use of that in the context of scalable vectors.
2024-08-07 10:18:14 +01:00
Nikhil Kalra
84cc1865ef [mlir] Support DialectRegistry extension comparison (#101119)
`PassManager::run` loads the dependent dialects for each pass into the
current context prior to invoking the individual passes. If the
dependent dialect is already loaded into the context, this should be a
no-op. However, if there are extensions registered in the
`DialectRegistry`, the dependent dialects are unconditionally registered
into the context.

This poses a problem for dynamic pass pipelines, however, because they
will likely be executing while the context is in an immutable state
(because of the parent pass pipeline being run).

To solve this, we'll update the extension registration API on
`DialectRegistry` to require a type ID for each extension that is
registered. Then, instead of unconditionally registered dialects into a
context if extensions are present, we'll check against the extension
type IDs already present in the context's internal `DialectRegistry`.
The context will only be marked as dirty if there are net-new extension
types present in the `DialectRegistry` populated by
`PassManager::getDependentDialects`.

Note: this PR removes the `addExtension` overload that utilizes
`std::function` as the parameter. This is because `std::function` is
copyable and potentially allocates memory for the contained function so
we can't use the function pointer as the unique type ID for the
extension.

Downstream changes required:
- Existing `DialectExtension` subclasses will need a type ID to be
registered for each subclass. More details on how to register a type ID
can be found here:
8b68e06731/mlir/include/mlir/Support/TypeID.h (L30)
- Existing uses of the `std::function` overload of `addExtension` will
need to be refactored into dedicated `DialectExtension` classes with
associated type IDs. The attached `std::function` can either be inlined
into or called directly from `DialectExtension::apply`.

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2024-08-06 01:32:36 +02:00
Benjamin Maxwell
dd8a9e2b8c [mlir][vector] Remove vector.reshape operation (#101645)
This operation was added five years ago and has no lowerings or uses
within upstream MLIR (and no reported uses downstream). There’s only a
handful of round-trip tests.

See related RFC:

https://discourse.llvm.org/t/rfc-should-vector-reshape-be-removed/80478/3
2024-08-05 10:15:02 +01:00
Kazu Hirata
5262865aac [mlir] Construct SmallVector with ArrayRef (NFC) (#101896) 2024-08-04 11:43:05 -07:00
Benjamin Maxwell
b4444dca47 [mlir][vector] Use DenseI64ArrayAttr for shuffle masks (#101163)
Follow on from #100997. This again removes from boilerplate conversions
to/from IntegerAttr and int64_t (otherwise, this is a NFC).
2024-07-30 15:00:14 +01:00
Benjamin Maxwell
0d9b439408 [mlir][vector] Use DenseI64ArrayAttr for constant_mask dim sizes (#100997)
This prevents a bunch of boilerplate conversions to/from IntegerAttrs
and int64_ts. Other than that this is a NFC.
2024-07-29 18:08:37 +01:00
Andrzej Warzyński
fe07d9aa41 [mlir][vector] Switch to using getNumScalableDims (nfc) (#100806) 2024-07-27 08:11:00 +01:00
Andrzej Warzyński
98c73d5df7 [mlir][vector] Restrict vector.shape_cast (scalable vectors) (#100331)
Updates the verifier for `vector.shape_cast` so that incorrect cases
where "scalability" is dropped are immediately rejected. For example:
```mlir
  vector.shape_cast %vec : vector<1x1x[4]xindex> to vector<4xindex>
```

Also, as a separate PR, I've prepared a fix for the Linalg vectorizer to
avoid generating such shape casts (*):
* https://github.com/llvm/llvm-project/pull/100325

(*) Note, that's just one specific case that I've identified so far.
2024-07-25 09:50:25 +01:00
Andrzej Warzyński
1f5807eb35 [mlir][vector][nfc] Simplify in_bounds attr update (#100334)
Since the `in_bounds` attribute is mandatory, there's no need for logic
like this (`readOp.getInBounds()` is guaranteed to return a non-empty
ArrayRef):

```cpp
ArrayAttr inBoundsAttr = readOp.getInBounds()
    ? rewriter.getArrayAttr( readOp.getInBoundsAttr().getValue().drop_back(dimsToDrop))
    : ArrayAttr();
```

Instead, we can do this:
```cpp
ArrayAttr inBoundsAttr = rewriter.getArrayAttr(
    readOp.getInBoundsAttr().getValue().drop_back(dimsToDrop));
```

This is a small follow-up for #97049 - this change should've been
included there.
2024-07-24 14:40:19 +01:00
Hugo Trachino
de61875e9d [MLIR][Vector] Generalize DropUnitDimFromElementwiseOps to non leading / trailing dimensions. (#98455)
Generalizes DropUnitDimFromElementwiseOps to support inner unit
dimensions.
This change stems from improving lowering of contractionOps for Arm SME.
Where we end up with inner unit dimensions on MulOp, BroadcastOp and
TransposeOp, preventing the generation of outerproducts.
discussed
[here](https://discourse.llvm.org/t/on-improving-arm-sme-lowering-resilience-in-mlir/78543/17?u=nujaa).

Fix after : https://github.com/llvm/llvm-project/pull/97652 showed an
unhandled edge case when all dimensions are one. The generated target
VectorType would be `vector<f32>` which is apparently not supported by
the mulf.
In case all dimensions are dropped, the target vectorType is
vector<1xf32>

---------

Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>
2024-07-17 10:22:25 +01:00
Andrzej Warzyński
2ee5586ac7 [mlir][vector] Make the in_bounds attribute mandatory (#97049)
At the moment, the in_bounds attribute has two confusing/contradicting
properties:
  1. It is both optional _and_ has an effective default-value.
  2. The default value is "out-of-bounds" for non-broadcast dims, and
     "in-bounds" for broadcast dims.

(see the `isDimInBounds` vector interface method for an example of this
"default" behaviour [1]).

This PR aims to clarify the logic surrounding the `in_bounds` attribute
by:
  * making the attribute mandatory (i.e. it is always present),
  * always setting the default value to "out of bounds" (that's
    consistent with the current behaviour for the most common cases).

#### Broadcast dimensions in tests

As per [2], the broadcast dimensions requires the corresponding
`in_bounds` attribute to be `true`:
```
  vector.transfer_read op requires broadcast dimensions to be in-bounds
```

The changes in this PR mean that we can no longer rely on the
default value in cases like the following (dim 0 is a broadcast dim):
```mlir
  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>
```

Instead, the broadcast dimension has to explicitly be marked as "in
bounds:

```mlir
  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {in_bounds = [true, false], permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>
```

All tests with broadcast dims are updated accordingly.

#### Changes in "SuperVectorize.cpp" and "Vectorization.cpp"

The following patterns in "Vectorization.cpp" are updated to explicitly
set the `in_bounds` attribute to `false`:
* `LinalgCopyVTRForwardingPattern` and `LinalgCopyVTWForwardingPattern`

Also, `vectorizeAffineLoad` (from "SuperVectorize.cpp") and
`vectorizeAsLinalgGeneric` (from "Vectorization.cpp") are updated to
make sure that xfer Ops created by these hooks set the dimension
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would complain

Note that there is no mechanism to verify whether the corresponding
memory access are indeed in bounds. Still, this is consistent with the
current behaviour where the broadcast dim would be implicitly assumed
to be "in bounds".

[1]
4145ad2bac/mlir/include/mlir/Interfaces/VectorInterfaces.td (L243-L246)
[2]
https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop
2024-07-16 16:49:52 +01:00
Andrzej Warzyński
6479a5a438 [mlir][vector] Restrict DropInnerMostUnitDimsTransfer{Read|Write} (#96218)
Restrict `DropInnerMostUnitDimsTransfer{Read|Write}` so that it fails
when one of the indices to be dropped could be != 0 and "out of bounds":

```mlir
func.func @negative_example(%arg0: memref<16x1xf32>, %arg1: vector<8x1xf32>, %idx_1: index, %idx_2: index) {
  vector.transfer_write %arg1, %arg0[%idx_1, %idx_2] {in_bounds = [true, false]} : vector<8x1xf32>, memref<16x1xf32>
  return
}
```

This is an edge case that could represent an out-of-bounds access,
though that will depend on the actual value of %i. Importantly, without
this change it would be transformed as follows:

```mlir
func.func @negative_example(%arg0: memref<16x1xf32>, %arg1: vector<8x1xf32>, %arg2: index, %arg3: index) {
  %subview = memref.subview %arg0[0, 0] [16, 1] [1, 1] : memref<16x1xf32> to memref<16xf32, strided<[1]>>
  %0 = vector.shape_cast %arg1 : vector<8x1xf32> to vector<8xf32>
  vector.transfer_write %0, %subview[%arg2] {in_bounds = [true]} : vector<8xf32>, memref<16xf32, strided<[1]>>
  return
}
```

This is incorrect - `%idx_2` is ignored and the "out of bounds" flags is
not propagated. Hence the extra restriction to avoid such cases.

NOTE: This is a follow-up for: #94904
2024-07-12 09:32:01 +01:00
Matthias Springer
3bb2563641 [mlir][vector] Fix crash in vector.insert canonicalization (#97801)
The `InsertOpConstantFolder` assumed that whenever the destination can
be folded to a constant attribute, that attribute must be a
`DenseElementsAttr`. That is is not necessarily the case.
2024-07-05 17:32:51 +02:00
Cullen Rhodes
67b302c52f [mlir][vector] Add vector.step operation (#96776)
This patch adds a new vector.step operation to the Vector dialect. It
produces a linear sequence of index values from 0 to N, where N is the
number of elements in the result vector, and can be used to create
vectors of indices.

It supports both fixed-width and scalable vectors. For fixed the
canonical representation is `arith.constant dense<[0, .., N]>`. A
scalable step cannot be represented as a constant and is lowered to the
`llvm.experimental.stepvector` intrinsic [1].

This op enables scalable vectorization of linalg.index ops, see #96778. It can
also be used in the SparseVectorizer in-place of lower-level stepvector
intrinsic, see [2] (patch to follow).

[1] https://llvm.org/docs/LangRef.html#llvm-experimental-stepvector-intrinsic
[2] acf675b63f/mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp (L385-L388)
2024-07-04 08:57:02 +01:00