Commit Graph

181 Commits

Author SHA1 Message Date
Matthias Springer
0693b9e9cc [mlir][Vector] Clean up populateVectorToLLVMConversionPatterns (#119975)
Clean up `populateVectorToLLVMConversionPatterns` so that it populates
only conversion patterns. All rewrite patterns that do not lower to LLVM
should be populated into a separate greedy pattern rewrite.

The current combination of rewrite patterns and conversion patterns
triggered an edge case when merging the 1:1 and 1:N dialect conversions.

Depends on #119973.
2024-12-17 11:37:17 +01:00
Matthias Springer
8cd8b5079b [mlir][Vector] Move mask materialization patterns to greedy rewrite (#119973)
The mask materialization patterns during `VectorToLLVM` are rewrite
patterns. They should run as part of the greedy pattern rewrite and not
the dialect conversion. (Rewrite patterns and conversion patterns are
not generally compatible.)

The current combination of rewrite patterns and conversion patterns
triggered an edge case when merging the 1:1 and 1:N dialect conversions.
2024-12-17 11:26:31 +01:00
Andrzej Warzyński
bc624a56c7 [mlir][vector][nfc] Update vector-to-llvm.mlir (#118112)
* Adds extra comments to group Ops
* Unifies the test function naming, i.e.
  * `@vector_{op_name}_{variant}` -> `@{op_name}_{variant}`
* Unifies input variable names (`%input` -> `%arg0`)
* Capitalises LIT variable names (e.g. `%[[insert]]` --> `%[[INSERT]]`)
* Moves `@step_scalable()` _below_ its "fixed-width" counterpart
  (to follow the existing consistency within this file).

There are still some inconsistencies within this file - I'm happy to send
more updates if folks find it useful, but I'd definitely recommend
splitting them across multiple PRs (otherwise it's hard to review).
2024-12-07 10:29:48 +00:00
Kunwar Grover
a8f927161b [mlir][Vector] Fix vector.extract lowering to llvm for 0-d vectors (#117731)
The current implementation of the vector.extract lowering to LLVM
incorrectly assumes that if the number of indices is zero, the operation
can be folded away. This PR removes this condition and relies on the
folder to do it instead.

This PR also unifies the logic for scalar extracts and slice extracts,
which as a side effect also enables lowering of n-D vector.extract with
a dynamic innermost dimension. (This was previously prevented only by a
conservative check in the old implementation.)
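
For illustration (not taken from the PR itself), the kinds of extracts this
lowering now handles directly might look like:

```mlir
// 0-D extract: previously assumed to be foldable, now lowered directly.
%s = vector.extract %v[] : f32 from vector<f32>
// n-D extract with a dynamic innermost index.
%e = vector.extract %w[2, %i] : f32 from vector<4x8xf32>
```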
2024-12-04 17:26:53 +00:00
Andrzej Warzyński
204d715362 [mlir][vector] Add more tests for ConvertVectorToLLVM (11/n) (#117160)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:

  * `vector.splat`.

In addition:
* Removed `@make_fixed_vector_of_scalable_vect`, which duplicated
  `@broadcast_vec2d_from_scalar_scalable` (and wasn't grouped with other
  tests for `vector.broadcast`).
* Moved `@vector_bitcast_2d` near other tests for `vector.bitcast` and
  added a variant with scalable vectors.
2024-11-27 08:40:21 +00:00
Andrzej Warzyński
4086ead63c [mlir][vector] Add more tests for ConvertVectorToLLVM (10/n) (#117041)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:

  * `vector.maskedload`,
  * `vector.maskedstore`,
  * `vector.gather`,
  * `vector.scatter`.

In addition:
* For consistency with other tests, renamed test function names
  (e.g. `@masked_load_op` -> `@masked_load_op`)
* Made some test names more descriptive, e.g. `@gather_op_2d` ->
  `@gather_1d_from_2d`.
2024-11-21 09:27:05 +00:00
Andrzej Warzyński
1ff22f8a71 [mlir][vector] Add more tests for ConvertVectorToLLVM (9/n) (#116795)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:

  * `vector.load`,
  * `vector.store`.

In addition:
* For consistency with other tests, renamed test function names
  (e.g. `@vector_load_op_nontemporal` -> `@vector_load_nontemporal`)
* Moved `@vector_load_0d` near other tests for `vector.load` (as opposed
  to next to `@vector_store_0d`).
2024-11-20 07:50:30 +00:00
Andrzej Warzyński
844fe8f662 [mlir][nfc] Rename @genbool_* as @constant_mask_* (#115335)
Renames `@genbool_*` tests as `@constant_mask_*`. That's to better
highlight which Op is tested and for better consistency with other tests.

In addition, `@genbool_2d` is moved _above_ its counterparts with
scalable vectors (again, for consistency).
2024-11-08 13:50:04 +00:00
Manupa Karunaratne
a6e72f9392 [MLIR][Vector] Add Lowering for vector.step (#113655)
Currently, the lowering for vector.step lives
under a folder. This is not ideal if we want
to do transformations on it and defer the
materialization of the constants until much later.

This commit adds a rewrite pattern that
can be applied via the
`transform.structured.vectorize_children_and_apply_patterns`
transform dialect operation.

Moreover, the vector.step rewrite is also
now used in the -convert-vector-to-llvm pass, where
it handles scalable and non-scalable types as
LLVM expects.

As a consequence of removing the vector.step
lowering from its folder, linalg vectorization
will keep vector.step intact.
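
As a rough sketch (not the literal pattern output), the fixed-width case
simply materializes a constant, while the scalable case instead maps onto
LLVM's stepvector intrinsic:

```mlir
// Before the rewrite:
%0 = vector.step : vector<4xindex>

// After the rewrite (fixed-width case), roughly:
%0 = arith.constant dense<[0, 1, 2, 3]> : vector<4xindex>
```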
2024-11-01 16:38:36 +00:00
Benjamin Kramer
4d228e1ebd [mlir][vector] Escape variable usage in test
Otherwise the shell might expand this in the command line.
2024-10-17 12:43:32 +02:00
Andrzej Warzyński
3187a4917d [mlir][vector] Add more tests for ConvertVectorToLLVM (8/n) (#111997)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:

* `vector.transfer_read`,
* `vector.transfer_write`.

In addition:

* Duplicate tests from "vector-mask-to-llvm.mlir" are removed.
* Tests for xfer_read/xfer_write are moved to a newly created test file,
  "vector-xfer-to-llvm.mlir". This follows an existing pattern among
  VectorToLLVM conversion tests.
* Tests that test both xfer_read and xfer_write have their names updated
  to capture that (e.g. @transfer_read_1d_mask ->
  @transfer_read_write_1d_mask)
* @transfer_write_1d_scalable_mask and @transfer_read_1d_scalable_mask
  are re-written as @transfer_read_write_1d_mask_scalable. This is to
  make it clear that this case is meant to complement
  @transfer_read_write_1d_mask.
* @transfer_write_tensor is updated to also test xfer_read.
2024-10-15 13:15:36 +01:00
Andrzej Warzyński
f7eb271542 [mlir][vector] Add more tests for ConvertVectorToLLVM (7/n) (#111895)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.fma
  * vector.reduce
2024-10-11 14:36:26 +01:00
Andrzej Warzyński
f58e85a972 [mlir][vector] Add more tests for ConvertVectorToLLVM (6/n) (#111121)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * `vector.insert_strided_slice`

With this change, for every test with fixed-width vectors, there should
be a corresponding example with scalable vectors (for
`vector.insert_strided_slice`). In addition:
  * Test function names are updated to more accurately reflect the case
    being exercised (e.g. `@insert_strided_index_slice1` ->
    `@insert_strided_index_slice_index_2d_into_3d`)
  * For consistency, took the liberty of updating some of the function
    names for `vector.extract_strided_slice`
  * `@insert_strided_slice_scalable` is effectively replaced with
    `@insert_strided_slice_f32_2d_into_3d_scalable`
2024-10-07 10:05:16 +01:00
Longsheng Mou
a4b0153c4f [mlir][vector] Support for extracting 1-element vectors in VectorExtractOpConversion (#107549)
This patch adds support for converting `vector.extract` that extract
1-element vectors into LLVM, fixing a crash in such cases.
E.g., `vector.extract %1[0]: vector<1xf32> from vector<2xf32>`. Fix
#61372.
2024-09-11 17:10:58 +08:00
Longsheng Mou
a8f3d30312 [mlir] Add dependent TensorDialect to ConvertVectorToLLVM pass (#108045)
This patch registers the tensor dialect as a dependency of the
ConvertVectorToLLVM pass.
This fixes a crash when `vector.transfer_write` is used with a
dynamic tensor type: the MaterializeTransferMask pattern calls
`vector::createOrFoldDimOp`, which creates a `tensor.dim` operation.
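
A hypothetical reproducer sketch (names and shapes are illustrative, not
taken from the issue):

```mlir
// A transfer_write into a dynamically shaped tensor; materializing its
// mask needs the dynamic size, which is queried via tensor.dim.
%w = vector.transfer_write %vec, %dest[%c0] : vector<4xf32>, tensor<?xf32>
```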

Fixes #107805.
2024-09-11 17:08:44 +08:00
Andrzej Warzyński
a9c71d3665 [mlir][vector] Add more tests for ConvertVectorToLLVM (5/n) (#106510) 2024-09-02 12:19:00 +01:00
Maciej Gabka
95d2d1cba0 Move stepvector intrinsic out of experimental namespace (#98043)
This patch moves the stepvector intrinsic out of the experimental
namespace.

This intrinsic has existed in LLVM for several years now, and is widely used.
2024-08-28 12:48:20 +01:00
Hugo Trachino
749ba7f6b2 [mlir][vector] Add more tests for ConvertVectorToLLVM (5/n) (#104784)
This patch aims to disambiguate test names for some of the
Vector-To-LLVM conversion tests.
Covers the following Ops:
  * vector.extractelement
  * vector.extract
  * vector.insertelement
  * vector.insert
  
1. Tests targeting `vector.{insert|extract}` Ops no longer have names like
`{insert|extract}_element*`, which was easy to confuse with the tests
targeting `vector.{insert|extract}element` ops.
2. Tests mention the type of the target/source buffer. e.g.
`@extractelement` => `@extractelement_from_vec_1d`
3. Align LIT lines consistently with other tests.
4. Tests with a different type for position have a name updated
accordingly. `@extractelement_index` =>`@extractelement_index_position`
5. Tests with a dynamic value for position have a name updated
accordingly. `@extract_element_with_value_1d`
=>`@extract_scalar_dynamic_position_from_vec_1d`
6. Added the scalable flavour of the tests
`insert_scalar_into_vec_2d_dynamic_position` and
`@extract_scalar_from_vec_2d_dynamic_position`
2024-08-21 08:55:50 +01:00
Andrzej Warzyński
6fceb3e865 [mlir][vector] Add more tests for ConvertVectorToLLVM (4/n) (#103391)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.insertelement
  * vector.insert

I have also renamed some function names from `@insert_element{}` to
`@insertelement{}` - that's to make a clearer distinction between
tests for `vector.insertelement` (tested by `@insertelement{}`) and
`vector.insert` (tested by `@insert_element{}`).
2024-08-16 16:48:16 +01:00
Andrzej Warzyński
f0f5afe968 [mlir][vector] Add more tests for ConvertVectorToLLVM (3/n) (#102854)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.extractelement
  * vector.extract

I have also renamed some function names from `@extract_element{}` to
`@extractelement{}` - that's to make a clearer distinction between
tests for `vector.extractelement` (tested by `@extractelement{}`) and
`vector.extract` (tested by `@extract_element{}`).
2024-08-13 13:03:35 +01:00
Andrzej Warzyński
7e175b307e [mlir][vector] Add more tests for ConvertVectorToLLVM (2/n) (#102203)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.outerproduct
2024-08-09 09:06:57 +01:00
Andrzej Warzyński
22a130220c [mlir][vector] Add more tests for ConvertVectorToLLVM (1/n) (#101936)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.bitcast
  * vector.broadcast

Note, this has uncovered some missing logic in `BroadcastOpLowering`.
This PR fixes the most basic cases where the scalable flags were dropped
and the generated code was incorrect. Also, the conditions in
`vector::isBroadcastableTo` are relaxed to allow cases like this:
```mlir
%0 = vector.broadcast %arg0 : vector<1xf32> to vector<[4]xf32>
```

The `BroadcastOpLowering` pattern is effectively disabled for scalable
vectors in more complex cases where an SCF loop would be required to
loop over the scalable dims, e.g.:
```mlir
 %0 = vector.broadcast %arg0 : vector<[4]x1x2xf32> to vector<[4]x3x2xf32>
```

These cases are marked as "Stretch not at start" in the code. In those
cases, support for scalable vectors is left as a TODO.
2024-08-08 15:57:36 +01:00
Cullen Rhodes
1e7d6d3455 [mlir][vector] Propagate scalability to gather/scatter ptrs vector (#97584)
In convert-vector-to-llvm the first operand (vector of pointers holding
all memory addresses to read) to the masked.gather (and scatter)
intrinsic has a fixed vector type.

This may result in intrinsics where the scalable flag has been dropped:
```
  %0 = llvm.intr.masked.gather %1, %2, %3 {alignment = 4 : i32}
    : (!llvm.vec<4 x ptr>, vector<[4]xi1>, vector<[4]xi32>) -> vector<[4]xi32>
```
Fortunately the operand is overloaded on the result type so we end up
with the correct IR when lowering to LLVM, but this is still incorrect.
This patch fixes it by propagating scalability.
2024-07-09 09:06:25 +01:00
Cullen Rhodes
67b302c52f [mlir][vector] Add vector.step operation (#96776)
This patch adds a new vector.step operation to the Vector dialect. It
produces a linear sequence of index values from 0 to N-1, where N is the
number of elements in the result vector, and can be used to create
vectors of indices.

It supports both fixed-width and scalable vectors. For fixed-width
vectors, the canonical representation is `arith.constant dense<[0, .., N-1]>`.
A scalable step cannot be represented as a constant and is lowered to the
`llvm.experimental.stepvector` intrinsic [1].
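
For example (illustrative, not taken from the patch description):

```mlir
%0 = vector.step : vector<4xindex>    // [0, 1, 2, 3]
%1 = vector.step : vector<[4]xindex>  // [0, 1, ..., 4 * vscale - 1]
```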

This op enables scalable vectorization of linalg.index ops, see #96778. It can
also be used in the SparseVectorizer in place of the lower-level stepvector
intrinsic, see [2] (patch to follow).

[1] https://llvm.org/docs/LangRef.html#llvm-experimental-stepvector-intrinsic
[2] acf675b63f/mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp (L385-L388)
2024-07-04 08:57:02 +01:00
Matthias Springer
c6ff2446a4 [mlir][vector] Add vector.from_elements op (#95938)
This commit adds a new operation to the vector dialect:
`vector.from_elements`

The op constructs a new vector from a given list of scalar values. It is
similar to `tensor.from_elements`.
```mlir
%0 = vector.from_elements %a, %b, %c, %a, %a, %a : vector<2x3xf32>
```

Constructing a new vector from elements was tedious before this op
existed: a typical way was to define an `arith.constant ... :
vector<...>`, followed by a chain of `vector.insert`.
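
A sketch of that pre-existing idiom (values and shapes are hypothetical):

```mlir
%cst = arith.constant dense<0.0> : vector<2x3xf32>
%1 = vector.insert %a, %cst [0, 0] : f32 into vector<2x3xf32>
%2 = vector.insert %b, %1 [0, 1] : f32 into vector<2x3xf32>
// ... one vector.insert per element ...
```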

Folders/canonicalizations are added that can fold `vector.extract` ops
and convert the `vector.from_elements` op into a `vector.splat` op.

The LLVM lowering generates an `llvm.mlir.undef`, followed by a sequence
of scalar insertions in the form of `llvm.insertelement`. Only 0-D and
1-D vectors are currently supported in the LLVM lowering.
2024-06-19 09:58:37 +02:00
Matthias Springer
13d983e730 [mlir][Transforms][NFC] Dialect Conversion: Resolve insertion point TODO (#95653)
Remove a TODO in the dialect conversion code base when materializing
unresolved conversions:
```
// FIXME: Determine a suitable insertion location when there are multiple
// inputs.
```

The implementation used to select an insertion point as follows:
- If the cast has exactly one operand: right after the definition of the
SSA value.
- Otherwise: right before the cast op.

However, it is not necessary to change the insertion point. Unresolved
materializations (`UnrealizedConversionCastOp`) are built during
`buildUnresolvedArgumentMaterialization` or
`buildUnresolvedTargetMaterialization`. In the former case, the op is
inserted at the beginning of the block. In the latter case, only one
operand is supported in the dialect conversion, and the op is inserted
right after the definition of the SSA value. I.e., the
`UnrealizedConversionCastOp` is already inserted at the right place and
it is not necessary to change the insertion point for the resolved
materialization op.

Note: The generated IR changes slightly because the
`unrealized_conversion_cast` ops at the beginning of a block are no
longer doubly-inverted (by setting the insertion point to the beginning of
the block when inserting the `unrealized_conversion_cast` and again when
inserting the resolved conversion op). All affected test cases were
fixed by using `CHECK-DAG` instead of `CHECK`.

Also improve the quality of multiple test cases that did not check for
the correct operands.

Note: This commit is in preparation of decoupling the
argument/source/target materialization logic of the type converter from
the dialect conversion (to reduce its complexity and make that
functionality usable from a new dialect conversion driver).
2024-06-17 19:56:40 +02:00
Zhaoshi Zheng
abcbbe7114 [MLIR][VectorToLLVM] Handle scalable dim in createVectorLengthValue() (#93361)
LLVM's Vector Predication Intrinsics require an explicit vector length
parameter:
https://llvm.org/docs/LangRef.html#vector-predication-intrinsics.

For a scalable vector type, this should be calculated as VectorScaleOp
multiplied by the base vector length, e.g. for vector<[4]xf32> we should
return vscale * 4.
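
A sketch of the value being computed, written here in vector/arith terms for
readability (the pass itself builds the equivalent LLVM dialect ops):

```mlir
// Explicit vector length for vector<[4]xf32>: vscale * 4.
%vscale = vector.vscale
%c4 = arith.constant 4 : index
%vl = arith.muli %vscale, %c4 : index
```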
2024-06-13 09:06:05 -07:00
Mubashar Ahmad
b87a80d4eb [mlir][vector] Add n-d deinterleave lowering (#94237)
This patch implements the lowering of vector
deinterleave for n-dimensional vectors. The process
involves unrolling the n-D vector into a series
of one-dimensional vectors. The deinterleave
operation is then applied to these vectors.

From:
```
%0, %1 = vector.deinterleave %arg0 : vector<2x8xi32> -> vector<2x4xi32>
```

To:
```
%cst = arith.constant dense<0> : vector<2x4xi32>
%0 = vector.extract %arg0[0] : vector<8xi32> from vector<2x8xi32>
%res1, %res2 = vector.deinterleave %0 : vector<8xi32> -> vector<4xi32>
%1 = vector.insert %res1, %cst [0] : vector<4xi32> into vector<2x4xi32>
%2 = vector.insert %res2, %cst [0] : vector<4xi32> into vector<2x4xi32>
%3 = vector.extract %arg0[1] : vector<8xi32> from vector<2x8xi32>
%res1_0, %res2_1 = vector.deinterleave %3 : vector<8xi32> -> vector<4xi32>
%4 = vector.insert %res1_0, %1 [1] : vector<4xi32> into vector<2x4xi32>
%5 = vector.insert %res2_1, %2 [1] : vector<4xi32> into vector<2x4xi32>
...etc.
```
2024-06-07 10:57:00 +01:00
Han-Chung Wang
0ea1271ee1 [mlir][vector] Add support for unrolling vector.bitcast ops. (#94064)
The revision unrolls vector.bitcast like:

```mlir
%0 = vector.bitcast %arg0 : vector<2x4xi32> to vector<2x2xi64>
```

to

```mlir
%cst = arith.constant dense<0> : vector<2x2xi64>
%0 = vector.extract %arg0[0] : vector<4xi32> from vector<2x4xi32>
%1 = vector.bitcast %0 : vector<4xi32> to vector<2xi64>
%2 = vector.insert %1, %cst [0] : vector<2xi64> into vector<2x2xi64>
%3 = vector.extract %arg0[1] : vector<4xi32> from vector<2x4xi32>
%4 = vector.bitcast %3 : vector<4xi32> to vector<2xi64>
%5 = vector.insert %4, %2 [1] : vector<2xi64> into vector<2x2xi64>
```

Scalable vectors are not supported because of a limitation of
`vector::createUnrollIterator`: the targetRank may not match the final
rank during unrolling, and there is no direct way to query what the final
rank is from the object.
2024-06-03 16:39:52 -07:00
Mubashar Ahmad
bc946f5287 [mlir][vector] Add 1D vector.deinterleave lowering (#93042)
This patch implements the lowering of vector.deinterleave
for 1-D vectors.

For fixed vector types, the operation is lowered to two
llvm shufflevector operations: one for the even-indexed
elements and the other for the odd-indexed elements. A
poison value is used to satisfy the second operand of
each shufflevector.

For scalable vectors, the llvm vector.deinterleave2
intrinsic is used for lowering. The two results are then
obtained by extracting them from the struct returned by
the intrinsic.
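
A hedged sketch of the fixed-width case (operand names are illustrative):

```mlir
%0, %1 = vector.deinterleave %v : vector<8xi8> -> vector<4xi8>

// roughly lowers to:
%poison = llvm.mlir.poison : vector<8xi8>
%even = llvm.shufflevector %v, %poison [0, 2, 4, 6] : vector<8xi8>
%odd = llvm.shufflevector %v, %poison [1, 3, 5, 7] : vector<8xi8>
```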
2024-05-30 09:42:35 +01:00
Jakub Kuderski
714aee31e1 [mlir][vector] Add result type to interleave assembly format (#93392)
This is to make it more obvious what the result type is, especially
in some less trivial cases like 0-d inputs resulting in 1-d results, or
interaction with scalable vector types. Note that `vector.deinterleave`
uses the same format with an explicit result type.
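
For example, with the explicit result type (illustrative):

```mlir
%0 = vector.interleave %a, %b : vector<[4]xf32> -> vector<[8]xf32>
```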

Also improve examples and clean up surrounding code.
2024-05-27 11:03:36 -04:00
Maciej Gabka
bfc0317153 Move several vector intrinsics out of experimental namespace (#88748)
This patch moves the following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice

out of the experimental namespace.

All these intrinsics have existed in LLVM for more than a year now, and are
widely used, so they should no longer be considered experimental.
2024-04-29 10:16:45 +01:00
Diego Caballero
42a6ad7bad [mlir][Vector] Fix n-D vector.extract/insert lowering to LLVM (#87591)
The lowering of n-D vector.extract/insert ops to LLVM is not supported
but if one of these accidentally reaches the vector-to-llvm conversion
patterns, we end up with a kind of puzzling crash. This PR fixes that
crash and gracefully bails out in those cases.
2024-04-05 15:01:20 -07:00
Benjamin Maxwell
a1a6860314 [mlir][VectorOps] Add unrolling for n-D vector.interleave ops (#80967)
This unrolls n-D vector.interleave ops like:

```mlir
vector.interleave %i, %j : vector<6x3xf32>
```

To a sequence of 1-D operations:
```mlir
%i_0 = vector.extract %i[0] 
%j_0 = vector.extract %j[0] 
%res_0 = vector.interleave %i_0, %j_0 : vector<3xf32>
vector.insert %res_0, %result[0] :
// ... repeated x6
```

The 1-D operations can then be directly lowered to LLVM.

Depends on: #80966
2024-02-20 14:33:33 +00:00
Benjamin Maxwell
79ce2c93ae [mlir][VectorOps] Add conversion of 1-D vector.interleave ops to LLVM (#80966)
The 1-D case directly maps to LLVM intrinsics. The n-D case will be
handled by unrolling to 1-D first (in a later patch).

Depends on: #80965
2024-02-13 10:47:33 +00:00
Andrzej Warzyński
9ddbcee25e [mlir][vector] Extend vector.{insert|extract}_strided_slice (#79052)
Extends `vector.insert_strided_slice` and `vector.extract_strided_slice`
to allow scalable input and output vectors. For scalable sizes, the
corresponding slice size has to match the corresponding dimension in the
output/input vector (insert/extract, respectively).

This is supported:
```mlir
vector.extract_strided_slice %1 {
  offsets = [0, 3, 0],
  sizes = [1, 1, 4],
  strides = [1, 1, 1] } : vector<1x4x[4]xi32> to vector<1x1x[4]xi32>
```

This is not supported:
```mlir
vector.extract_strided_slice %1 {
  offsets = [0, 3, 0],
  sizes = [1, 1, 2],
  strides = [1, 1, 1] } : vector<1x4x[4]xi32> to vector<1x1x[2]xi32>
```
2024-01-25 19:01:28 +00:00
Krzysztof Drewniak
5cfe24eee4 [mlir][Vector] Add nontemporal attribute, mirroring memref (#76752)
Since vector loads and stores from scalar memrefs translate to
llvm.load/store, add the ability to tag said loads and stores as
nontemporal. This mirrors functionality available in memref.load/store.
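
A sketch of the intended usage (the attribute spelling is assumed to mirror
memref.load/store):

```mlir
%0 = vector.load %base[%i] {nontemporal = true} : memref<?xf32>, vector<4xf32>
vector.store %0, %base[%i] {nontemporal = true} : memref<?xf32>, vector<4xf32>
```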
2024-01-09 11:05:20 -06:00
Matthias Springer
bb6d5c2200 [mlir][Transforms] GreedyPatternRewriteDriver: Do not CSE constants during iterations (#75897)
The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply
rewrite patterns to ops. It has special handling for constants: they are
CSE'd and sometimes moved to parent regions to allow for additional
CSE'ing. This happens in `OperationFolder`.

To allow for efficient CSE'ing, `OperationFolder` maintains an internal
lookup data structure to find the existing constant ops with the same
value for each `IsolatedFromAbove` region:
```c++
/// A mapping between an insertion region and the constants that have been
/// created within it.
DenseMap<Region *, ConstantMap> foldScopes;
```

Rewrite patterns are allowed to modify operations. In particular, they
may move operations (including constants) from one region to another
one. Such an IR rewrite can make the above lookup data structure
inconsistent.

We encountered such a bug in a downstream project. This bug materialized
in the form of an op that uses the result of a constant op from a
different `IsolatedFromAbove` region (that is not accessible).

This commit changes the behavior of the `GreedyPatternRewriteDriver`
such that `OperationFolder` is used to CSE constants at the beginning of
each iteration (as the worklist is populated), but no longer during an
iteration. `OperationFolder` is no longer used after populating the
worklist, so we do not have to care about inconsistent state in the
`OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver`
now performs the op folding by itself instead of calling
`OperationFolder::tryToFold`.

This change changes the order of constant ops in test cases, but not the
region in which they appear. All broken test cases were fixed by turning
`CHECK` into `CHECK-DAG`.

Alternatives considered: The state of `OperationFolder` could be
partially invalidated with every `notifyOperationModified` notification.
That is more fragile than the solution in this commit because incorrect
rewriter API usage can lead to missing notifications and hard-to-debug
`IsolatedFromAbove` violations. (It did not fix the above-mentioned bug in
a downstream project, which could be due to incorrect rewriter API usage
or due to another conceptual problem that I missed.) Moreover, ops are
frequently getting modified during a greedy pattern rewrite, so we would
likely keep invalidating large parts of the state of `OperationFolder`
over and over.

Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant
ops are no longer folded during a greedy pattern rewrite. If you rely on
folding (and rematerialization) of constant ops during a greedy pattern
rewrite, turn the folder into a pattern.
2024-01-05 09:22:18 +01:00
Matthias Springer
c99670ba51 [mlir][vector] LoadOp/StoreOp: Allow 0-D vectors (#76134)
Similar to `vector.transfer_read`/`vector.transfer_write`, allow 0-D
vectors.
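
For example (illustrative):

```mlir
%0 = vector.load %m[] : memref<f32>, vector<f32>
vector.store %0, %m[] : memref<f32>, vector<f32>
```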

This commit fixes
`mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir`
when verifying the IR after each pattern (#74270). That test produces a
temporary 0-D load/store op.
2023-12-22 11:12:58 +09:00
Jakub Kuderski
560564f51c [mlir][vector][gpu] Align minf/maxf reduction kind names with arith (#75901)
This is to avoid confusion when dealing with reduction/combining kinds.
For example, see a recent PR comment:
https://github.com/llvm/llvm-project/pull/75846#discussion_r1430722175.

Previously, they were picked to mostly mirror the names of the llvm
vector reduction intrinsics:
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic. In
isolation, it was not clear if `<maxf>` has `arith.maxnumf` or
`arith.maximumf` semantics. The new reduction kind names map 1:1 to
arith ops, which makes it easier to tell/look up their semantics.

Because both the vector and the gpu dialect depend on the arith dialect,
it's more natural to align names with those in arith than with the
lowering to llvm intrinsics.

Issue: https://github.com/llvm/llvm-project/issues/72354
2023-12-20 00:14:43 -05:00
Jakub Kuderski
a528cee224 [mlir][vector] Improve makeArithReduction expansion (#75846)
Propagate fast math flags.
Distinguish `minf`/`maxf` and `minimumf`/`maximumf`.

Required for future patterns in
https://github.com/llvm/llvm-project/pull/75727.
2023-12-18 17:47:46 -05:00
Christian Ulmann
ceb4dc4477 [MLIR][VectorToLLVM] Remove typed pointer support (#71075)
This commit removes the support for lowering Vector to LLVM dialect with
typed pointers. Typed pointers have been deprecated for a while now and
it's planned to soon remove them from the LLVM dialect.

Related PSA:
https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502
2023-11-03 11:16:11 +01:00
Benjamin Maxwell
3be3883e6d [mlir][VectorOps] Support string literals in vector.print (#68695)
Printing strings within integration tests is currently quite annoyingly
verbose, and can't be tucked into shared helpers as the types depend on
the length of the string:

```
llvm.mlir.global internal constant @hello_world("Hello, World!\0")

func.func @entry() {
  %0 = llvm.mlir.addressof @hello_world : !llvm.ptr<array<14 x i8>>
  %1 = llvm.mlir.constant(0 : index) : i64
  %2 = llvm.getelementptr %0[%1, %1]
    : (!llvm.ptr<array<14 x i8>>, i64, i64) -> !llvm.ptr<i8>
  llvm.call @printCString(%2) : (!llvm.ptr<i8>) -> ()
  return
}
```

So this patch adds a simple extension to `vector.print` to simplify
this:
```
func.func @entry() {
   // Print a vector of characters ;)
   vector.print str "Hello, World!"
   return
}
```

Most of the logic for this is now shared with `cf.assert` which already
does something similar.

Depends on #68694
2023-10-24 09:34:14 +01:00
Quinn Dawkins
78c49743c7 [MLIR][Vector] Allow non-default memory spaces in gather/scatter lowerings (#67500)
GPU targets can gather on non-default address spaces (e.g. global), so
this removes the check for the default memory space.
2023-09-28 19:20:32 -04:00
Cullen Rhodes
9816edc9f3 [mlir][vector] add result type to vector.extract assembly format (#66499)
The vector.extract assembly format currently only contains the source
type, for example:

  %1 = vector.extract %0[1] : vector<3x7x8xf32>

it's not immediately obvious if this is the source or result type. This
patch improves the assembly format to make this clearer, so the above
becomes:

  %1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>
2023-09-28 11:11:16 +01:00
Diego Caballero
98f6289a34 [mlir][Vector] Add support for Value indices to vector.extract/insert
`vector.extract/insert` ops currently only support constant indices. This PR
extends them so that arbitrary values can be used instead.
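
For example, with a dynamic position (shown in the current assembly format;
names are illustrative):

```mlir
%0 = vector.extract %v[%idx] : f32 from vector<8xf32>
%1 = vector.insert %f, %v [%idx] : f32 into vector<8xf32>
```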

This work is part of the RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops

Differential Revision: https://reviews.llvm.org/D155034
2023-09-22 00:39:32 +00:00
Nicolas Vasilache
1b8b556443 [mlir][Vector] Add fastmath flags to vector.reduction (#66905)
This revision pipes the fastmath attribute support through the
vector.reduction op. This seemingly simple first step already requires
quite some genuflexions, file and builder reorganization. In the
process, retire the boolean reassoc flag deep in the LLVM dialect
builders and just use the fastmath attribute.
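
For example (illustrative; the exact printed form of the attribute may differ):

```mlir
%0 = vector.reduction <add>, %v fastmath<reassoc> : vector<16xf32> into f32
```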

During conversions, templated builders for predicated intrinsics are
partially cleaned up. In the future, to finalize the cleanups, one
should consider adding fastmath to the VPIntrinsic ops.
2023-09-20 16:57:20 +02:00
Benjamin Maxwell
2f11ce5579 [mlir][VectorOps] Extend vector.constant_mask to support 'all true' scalable dims (#66638)
This extends `vector.constant_mask` so that mask dim sizes that
correspond to a scalable dimension are treated as if they're implicitly
multiplied by vscale. Currently this is limited to mask dim sizes of 0
or the size of the dim/vscale. This allows constant masks to represent
all true and all false scalable masks (and some variations):

```
// All true scalable mask
%mask = vector.constant_mask [8] : vector<[8]xi1>

// All false scalable mask
%mask = vector.constant_mask [0] : vector<[8]xi1>

// First two scalable rows
%mask = vector.constant_mask [2,4] : vector<4x[4]xi1>
```
2023-09-20 14:54:42 +01:00
Benjamin Maxwell
665995b918 [mlir][Conversion] Allow lowering to fixed arrays of scalable vectors
This allows lowering vector types like vector<3x[4]xf32> or vector<3x2x[4]xf32>
to LLVM IR, i.e. vectors where the trailing dim is scalable.

This is contingent on:
https://discourse.llvm.org/t/rfc-enable-arrays-of-scalable-vector-types/72935

More tests will be added in later patches, however, some MLIR fixes are
needed first.

Depends on: D158517

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D158752
2023-09-15 09:33:18 +00:00
Daniil Dudkin
8f5d519458 [mlir][vector] Implement Workaround Lowerings for Masked fm**imum Reductions
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

Within LLVM, there are no masked reduction counterparts for vector reductions such as `fmaximum` and `fminimum`.
More information can be found here: https://github.com/llvm/llvm-project/issues/64940#issuecomment-1690694156.

To address this issue in MLIR, where we need to generate appropriate lowerings for these cases, we employ regular non-masked intrinsics.
However, we modify the input vector using the `arith.select` operation to effectively deactivate undesired elements using a "neutral mask value".
The neutral mask value is the smallest possible value for the `fmaximum` reduction and the largest possible value for the `fminimum` reduction.
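
A hedged sketch of the idea for a masked `fmaximum` reduction, using current
reduction-kind names and an illustrative neutral value:

```mlir
// Replace inactive lanes with the neutral value for fmaximum (here -inf),
// then apply the regular, non-masked reduction.
%neutral = arith.constant dense<0xFF800000> : vector<4xf32>
%active = arith.select %mask, %v, %neutral : vector<4xi1>, vector<4xf32>
%r = vector.reduction <maximumf>, %active : vector<4xf32> into f32
```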

Depends on D158618

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D158773
2023-09-13 22:49:08 +00:00