Commit Graph

172 Commits

Author SHA1 Message Date
Benjamin Kramer
4d228e1ebd [mlir][vector] Escape variable usage in test
Otherwise the shell might expand this in the command line.
2024-10-17 12:43:32 +02:00
Andrzej Warzyński
3187a4917d [mlir][vector] Add more tests for ConvertVectorToLLVM (8/n) (#111997)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:

* `vector.transfer_read`,
* `vector.transfer_write`.
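
As a rough illustration of the kind of case being added (a sketch only; the function name, shapes and in_bounds setting are assumed, not copied from the test file):

```mlir
// Scalable 1-D transfer_read from a dynamically shaped memref (sketch).
func.func @transfer_read_1d_scalable(%mem: memref<?xf32>, %idx: index) -> vector<[4]xf32> {
  %pad = arith.constant 0.0 : f32
  %0 = vector.transfer_read %mem[%idx], %pad {in_bounds = [true]} : memref<?xf32>, vector<[4]xf32>
  return %0 : vector<[4]xf32>
}
```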

In addition:

* Duplicate tests from "vector-mask-to-llvm.mlir" are removed.
* Tests for xfer_read/xfer_write are moved to a newly created test file,
  "vector-xfer-to-llvm.mlir". This follows an existing pattern among
  VectorToLLVM conversion tests.
* Tests that test both xfer_read and xfer_write have their names updated
  to capture that (e.g. @transfer_read_1d_mask ->
  @transfer_read_write_1d_mask)
* @transfer_write_1d_scalable_mask and @transfer_read_1d_scalable_mask
  are re-written as @transfer_read_write_1d_mask_scalable. This is to
  make it clear that this case is meant to complement
  @transfer_read_write_1d_mask.
* @transfer_write_tensor is updated to also test xfer_read.
2024-10-15 13:15:36 +01:00
Andrzej Warzyński
f7eb271542 [mlir][vector] Add more tests for ConvertVectorToLLVM (7/n) (#111895)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.fma
  * vector.reduce
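
For reference, minimal scalable forms of these ops (a sketch; SSA names are assumed, and the op behind `vector.reduce` is spelled `vector.reduction` in the dialect):

```mlir
// Scalable FMA and reduction (sketch; operands assumed to be defined elsewhere).
%fma = vector.fma %a, %b, %c : vector<[8]xf32>
%red = vector.reduction <add>, %v : vector<[4]xf32> into f32
```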
2024-10-11 14:36:26 +01:00
Andrzej Warzyński
f58e85a972 [mlir][vector] Add more tests for ConvertVectorToLLVM (6/n) (#111121)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * `vector.insert_strided_slice`

With this change, for every test with fixed-width vectors, there should
be a corresponding example with scalable vectors (for
`vector.insert_strided_slice`). In addition:
  * Test function names are updated to more accurately reflect the case
    being exercised (e.g. `@insert_strided_index_slice1` ->
    `@insert_strided_index_slice_index_2d_into_3d`)
  * For consistency, took the liberty of updating some of the function
    names for `vector.extract_strided_slice`
  * `@insert_strided_slice_scalable` is effectively replaced with
    `@insert_strided_slice_f32_2d_into_3d_scalable`
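
For reference, a scalable `vector.insert_strided_slice` of the kind now covered looks roughly like this (names and shapes are assumed; the scalable dim of the slice must match the corresponding dim of the destination):

```mlir
// Insert a scalable 1-D slice into a 2-D vector with a scalable trailing dim (sketch).
%0 = vector.insert_strided_slice %src, %dst
  {offsets = [0, 0], strides = [1]} : vector<[4]xf32> into vector<2x[4]xf32>
```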
2024-10-07 10:05:16 +01:00
Longsheng Mou
a4b0153c4f [mlir][vector] Support for extracting 1-element vectors in VectorExtractOpConversion (#107549)
This patch adds support for converting `vector.extract` ops that extract
1-element vectors into LLVM, fixing a crash in such cases,
e.g. `vector.extract %1[0] : vector<1xf32> from vector<2xf32>`.
Fixes #61372.
2024-09-11 17:10:58 +08:00
Longsheng Mou
a8f3d30312 [mlir] Add dependent TensorDialect to ConvertVectorToLLVM pass (#108045)
This patch registers the tensor dialect as a dependency of the
ConvertVectorToLLVM pass, which fixes a crash when `vector.transfer_write`
is used with a dynamic tensor type. The MaterializeTransferMask pattern
would call `vector::createOrFoldDimOp`, which creates a `tensor.dim`
operation.

Fixes #107805.
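
A hypothetical reproducer sketch (function name, shapes and in_bounds are assumed, not taken from the issue): an out-of-bounds 1-D transfer_write into a dynamically sized tensor, where materializing the transfer mask needs the dynamic dimension size and hence a `tensor.dim` op.

```mlir
// Sketch only: a transfer_write into a dynamic tensor that needs a materialized mask.
func.func @write_dyn_tensor(%vec: vector<4xf32>, %dest: tensor<?xf32>, %idx: index) -> tensor<?xf32> {
  %0 = vector.transfer_write %vec, %dest[%idx] {in_bounds = [false]} : vector<4xf32>, tensor<?xf32>
  return %0 : tensor<?xf32>
}
```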
2024-09-11 17:08:44 +08:00
Andrzej Warzyński
a9c71d3665 [mlir][vector] Add more tests for ConvertVectorToLLVM (5/n) (#106510) 2024-09-02 12:19:00 +01:00
Maciej Gabka
95d2d1cba0 Move stepvector intrinsic out of experimental namespace (#98043)
This patch moves the stepvector intrinsic out of the experimental
namespace.

This intrinsic has existed in LLVM for several years and is widely used.
2024-08-28 12:48:20 +01:00
Hugo Trachino
749ba7f6b2 [mlir][vector] Add more tests for ConvertVectorToLLVM (5/n) (#104784)
This patch disambiguates test names for some of the Vector-To-LLVM
conversion tests.
Covers the following Ops:
  * vector.extractelement
  * vector.extract
  * vector.insertelement
  * vector.insert
  
1. Tests targeting `vector.{insert|extract}` ops no longer have names like
`{insert|extract}_element*`, which were easy to confuse with tests
targeting the `vector.{insert|extract}element` ops (see the short example
after this list).
2. Tests mention the type of the target/source buffer. e.g.
`@extractelement` => `@extractelement_from_vec_1d`
3. Align LIT lines consistently with other tests.
4. Tests with a different type for position have a name updated
accordingly. `@extractelement_index` =>`@extractelement_index_position`
5. Tests with a dynamic value for position have a name updated
accordingly. `@extract_element_with_value_1d`
=>`@extract_scalar_dynamic_position_from_vec_1d`
6. Added the scalable flavour of the tests
`insert_scalar_into_vec_2d_dynamic_position` and
`@extract_scalar_from_vec_2d_dynamic_position`
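
To illustrate the distinction from point 1 (a sketch only; values and shapes are assumed, not taken from the test file):

```mlir
// vector.extractelement takes its (dynamic) position as an SSA operand ...
%e = vector.extractelement %v[%i : i32] : vector<4xf32>
// ... whereas vector.extract spells the position inside the op itself.
%s = vector.extract %v[1] : f32 from vector<4xf32>
```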
2024-08-21 08:55:50 +01:00
Andrzej Warzyński
6fceb3e865 [mlir][vector] Add more tests for ConvertVectorToLLVM (4/n) (#103391)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.insertelement
  * vector.insert

I have also renamed some test functions from `@insert_element{}` to
`@insertelement{}` to make a clearer distinction between
tests for `vector.insertelement` (tested by `@insertelement{}`) and
`vector.insert` (tested by `@insert_element{}`).
2024-08-16 16:48:16 +01:00
Andrzej Warzyński
f0f5afe968 [mlir][vector] Add more tests for ConvertVectorToLLVM (3/n) (#102854)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.extractelement
  * vector.extract

I have also renamed some test functions from `@extract_element{}` to
`@extractelement{}` to make a clearer distinction between
tests for `vector.extractelement` (tested by `@extractelement{}`) and
`vector.extract` (tested by `@extract_element{}`).
2024-08-13 13:03:35 +01:00
Andrzej Warzyński
7e175b307e [mlir][vector] Add more tests for ConvertVectorToLLVM (2/n) (#102203)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.outerproduct
2024-08-09 09:06:57 +01:00
Andrzej Warzyński
22a130220c [mlir][vector] Add more tests for ConvertVectorToLLVM (1/n) (#101936)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.bitcast
  * vector.broadcast

Note, this has uncovered some missing logic in `BroadcastOpLowering`.
This PR fixes the most basic cases where the scalable flags were dropped
and the generated code was incorrect. Also, the conditions in
`vector::isBroadcastableTo` are relaxed to allow cases like this:
```mlir
%0 = vector.broadcast %arg0 : vector<1xf32> to vector<[4]xf32>
```

The `BroadcastOpLowering` pattern is effectively disabled for scalable
vectors in more complex cases where an SCF loop would be required to
loop over the scalable dims, e.g.:
```mlir
 %0 = vector.broadcast %arg0 : vector<[4]x1x2xf32> to vector<[4]x3x2xf32>
```

These cases are marked as "Stretch not at start" in the code. In those
cases, support for scalable vectors is left as a TODO.
2024-08-08 15:57:36 +01:00
Cullen Rhodes
1e7d6d3455 [mlir][vector] Propagate scalability to gather/scatter ptrs vector (#97584)
In convert-vector-to-llvm, the first operand to the masked.gather (and
scatter) intrinsic (the vector of pointers holding all memory addresses
to read) has a fixed vector type.

This may result in intrinsics where the scalable flag has been dropped:
```
  %0 = llvm.intr.masked.gather %1, %2, %3 {alignment = 4 : i32}
    : (!llvm.vec<4 x ptr>, vector<[4]xi1>, vector<[4]xi32>) -> vector<[4]xi32>
```
Fortunately the operand is overloaded on the result type, so we end up
with the correct IR when translating to LLVM IR, but the intermediate
MLIR is still incorrect. This patch fixes it by propagating scalability.
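
With scalability propagated, the same example should presumably use a scalable vector-of-pointers type instead:

```mlir
%0 = llvm.intr.masked.gather %1, %2, %3 {alignment = 4 : i32}
  : (!llvm.vec<? x 4 x ptr>, vector<[4]xi1>, vector<[4]xi32>) -> vector<[4]xi32>
```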
2024-07-09 09:06:25 +01:00
Cullen Rhodes
67b302c52f [mlir][vector] Add vector.step operation (#96776)
This patch adds a new vector.step operation to the Vector dialect. It
produces a linear sequence of index values from 0 to N, where N is the
number of elements in the result vector, and can be used to create
vectors of indices.
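
For example (a minimal sketch):

```mlir
// Fixed-width and scalable step vectors.
%fixed = vector.step : vector<4xindex>
%scalable = vector.step : vector<[4]xindex>
```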

It supports both fixed-width and scalable vectors. For fixed-width vectors
the canonical representation is `arith.constant dense<[0, .., N]>`. A
scalable step cannot be represented as a constant and is lowered to the
`llvm.experimental.stepvector` intrinsic [1].

This op enables scalable vectorization of linalg.index ops, see #96778. It can
also be used in the SparseVectorizer in place of the lower-level stepvector
intrinsic, see [2] (patch to follow).

[1] https://llvm.org/docs/LangRef.html#llvm-experimental-stepvector-intrinsic
[2] acf675b63f/mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp (L385-L388)
2024-07-04 08:57:02 +01:00
Matthias Springer
c6ff2446a4 [mlir][vector] Add vector.from_elements op (#95938)
This commit adds a new operation to the vector dialect:
`vector.from_elements`

The op constructs a new vector from a given list of scalar values. It is
similar to `tensor.from_elements`.
```mlir
%0 = vector.from_elements %a, %b, %c, %a, %a, %a : vector<2x3xf32>
```

Constructing a new vector from elements was tedious before this op
existed: a typical way was to define an `arith.constant ... :
vector<...>`, followed by a chain of `vector.insert`.

Folders/canonicalizations are added that can fold `vector.extract` ops
and convert the `vector.from_elements` op into a `vector.splat` op.

The LLVM lowering generates an `llvm.mlir.undef`, followed by a sequence
of scalar insertions in the form of `llvm.insertelement`. Only 0-D and
1-D vectors are currently supported in the LLVM lowering.
2024-06-19 09:58:37 +02:00
Matthias Springer
13d983e730 [mlir][Transforms][NFC] Dialect Conversion: Resolve insertion point TODO (#95653)
Remove a TODO in the dialect conversion code base when materializing
unresolved conversions:
```
// FIXME: Determine a suitable insertion location when there are multiple
// inputs.
```

The implementation used to select an insertion point as follows:
- If the cast has exactly one operand: right after the definition of the
SSA value.
- Otherwise: right before the cast op.

However, it is not necessary to change the insertion point. Unresolved
materializations (`UnrealizedConversionCastOp`) are built during
`buildUnresolvedArgumentMaterialization` or
`buildUnresolvedTargetMaterialization`. In the former case, the op is
inserted at the beginning of the block. In the latter case, only one
operand is supported in the dialect conversion, and the op is inserted
right after the definition of the SSA value. I.e., the
`UnrealizedConversionCastOp` is already inserted at the right place and
it is not necessary to change the insertion point for the resolved
materialization op.

Note: The IR changes slightly because the
`unrealized_conversion_cast` ops at the beginning of a block are no
longer doubly-inverted (by setting the insertion point to the beginning of
the block when inserting the `unrealized_conversion_cast` and again when
inserting the resolved conversion op). All affected test cases were
fixed by using `CHECK-DAG` instead of `CHECK`.

Also improve the quality of multiple test cases that did not check for
the correct operands.

Note: This commit is in preparation of decoupling the
argument/source/target materialization logic of the type converter from
the dialect conversion (to reduce its complexity and make that
functionality usable from a new dialect conversion driver).
2024-06-17 19:56:40 +02:00
Zhaoshi Zheng
abcbbe7114 [MLIR][VectorToLLVM] Handle scalable dim in createVectorLengthValue() (#93361)
LLVM's Vector Predication Intrinsics require an explicit vector length
parameter:
https://llvm.org/docs/LangRef.html#vector-predication-intrinsics.

For a scalable vector type, this should be calculated as VectorScaleOp
multiplied by the base vector length, e.g. for <[4]xf32> we should return
vscale * 4.
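
Expressed at the vector/arith level (a sketch; the pass itself builds the equivalent LLVM dialect ops):

```mlir
// Vector length for vector<[4]xf32>: vscale * 4.
%vscale = vector.vscale
%c4 = arith.constant 4 : index
%vl = arith.muli %vscale, %c4 : index
```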
2024-06-13 09:06:05 -07:00
Mubashar Ahmad
b87a80d4eb [mlir][vector] Add n-d deinterleave lowering (#94237)
This patch implements the lowering of vector.deinterleave for
n-dimensional vectors. The process involves unrolling the n-D vector
into a series of one-dimensional vectors. The deinterleave operation is
then applied to these vectors.

From:
```
%0, %1 = vector.deinterleave %a : vector<2x8xi8> -> vector<2x4xi8>
```

To:
```
%cst = arith.constant dense<0> : vector<2x4xi32>
%0 = vector.extract %arg0[0] : vector<8xi32> from vector<2x8xi32>
%res1, %res2 = vector.deinterleave %0 : vector<8xi32> -> vector<4xi32>
%1 = vector.insert %res1, %cst [0] : vector<4xi32> into vector<2x4xi32>
%2 = vector.insert %res2, %cst [0] : vector<4xi32> into vector<2x4xi32>
%3 = vector.extract %arg0[1] : vector<8xi32> from vector<2x8xi32>
%res1_0, %res2_1 = vector.deinterleave %3 : vector<8xi32> -> vector<4xi32>
%4 = vector.insert %res1_0, %1 [1] : vector<4xi32> into vector<2x4xi32>
%5 = vector.insert %res2_1, %2 [1] : vector<4xi32> into vector<2x4xi32>
...etc.
```
2024-06-07 10:57:00 +01:00
Han-Chung Wang
0ea1271ee1 [mlir][vector] Add support for unrolling vector.bitcast ops. (#94064)
The revision unrolls vector.bitcast like:

```mlir
%0 = vector.bitcast %arg0 : vector<2x4xi32> to vector<2x2xi64>
```

to

```mlir
%cst = arith.constant dense<0> : vector<2x2xi64>
%0 = vector.extract %arg0[0] : vector<4xi32> from vector<2x4xi32>
%1 = vector.bitcast %0 : vector<4xi32> to vector<2xi64>
%2 = vector.insert %1, %cst [0] : vector<2xi64> into vector<2x2xi64>
%3 = vector.extract %arg0[1] : vector<4xi32> from vector<2x4xi32>
%4 = vector.bitcast %3 : vector<4xi32> to vector<2xi64>
%5 = vector.insert %4, %2 [1] : vector<2xi64> into vector<2x2xi64>
```

Scalable vectors are not supported because of a limitation of
`vector::createUnrollIterator`: the targetRank could mismatch the final
rank during unrolling, and there is no direct way to query the final
rank from the object.
2024-06-03 16:39:52 -07:00
Mubashar Ahmad
bc946f5287 [mlir][vector] Add 1D vector.deinterleave lowering (#93042)
This patch implements the lowering of vector.deinterleave 
for 1D vectors.

For fixed vector types, the operation is lowered to two
LLVM shufflevector operations: one for the even-indexed
elements and the other for the odd-indexed elements. A poison
value is used to satisfy the second operand of each shufflevector.
    
For scalable vectors, the LLVM vector.deinterleave2
intrinsic is used for the lowering; the two results are
then extracted from the struct returned by the intrinsic.
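
A sketch of the fixed-width case (SSA names are assumed; the shuffle masks pick even and odd lanes, with poison as the unused second operand):

```mlir
%poison = llvm.mlir.poison : vector<8xi32>
%even = llvm.shufflevector %v, %poison [0, 2, 4, 6] : vector<8xi32>
%odd  = llvm.shufflevector %v, %poison [1, 3, 5, 7] : vector<8xi32>
```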
2024-05-30 09:42:35 +01:00
Jakub Kuderski
714aee31e1 [mlir][vector] Add result type to interleave assembly format (#93392)
This makes it more obvious what the result type is, especially in less
trivial cases such as 0-D inputs producing 1-D results, or interaction
with scalable vector types. Note that `vector.deinterleave` uses the same
format with an explicit result type.

Also improve examples and clean up surrounding code.
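
With the explicit result type, a scalable example reads:

```mlir
%0 = vector.interleave %a, %b : vector<[4]xf32> -> vector<[8]xf32>
```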
2024-05-27 11:03:36 -04:00
Maciej Gabka
bfc0317153 Move several vector intrinsics out of experimental namespace (#88748)
This patch moves the following intrinsics out of the experimental
namespace:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice

All these intrinsics have existed in LLVM for more than a year and are
widely used, so they should no longer be considered experimental.
2024-04-29 10:16:45 +01:00
Diego Caballero
42a6ad7bad [mlir][Vector] Fix n-D vector.extract/insert lowering to LLVM (#87591)
The lowering of n-D vector.extract/insert ops to LLVM is not supported
but if one of these accidentally reaches the vector-to-llvm conversion
patterns, we end up with a kind of puzzling crash. This PR fixes that
crash and gracefully bails out in those cases.
2024-04-05 15:01:20 -07:00
Benjamin Maxwell
a1a6860314 [mlir][VectorOps] Add unrolling for n-D vector.interleave ops (#80967)
This unrolls n-D vector.interleave ops like:

```mlir
vector.interleave %i, %j : vector<6x3xf32>
```

To a sequence of 1-D operations:
```mlir
%i_0 = vector.extract %i[0] 
%j_0 = vector.extract %j[0] 
%res_0 = vector.interleave %i_0, %j_0 : vector<3xf32>
vector.insert %res_0, %result[0] :
// ... repeated x6
```

The 1-D operations can then be directly lowered to LLVM.

Depends on: #80966
2024-02-20 14:33:33 +00:00
Benjamin Maxwell
79ce2c93ae [mlir][VectorOps] Add conversion of 1-D vector.interleave ops to LLVM (#80966)
The 1-D case directly maps to LLVM intrinsics. The n-D case will be
handled by unrolling to 1-D first (in a later patch).

Depends on: #80965
2024-02-13 10:47:33 +00:00
Andrzej Warzyński
9ddbcee25e [mlir][vector] Extend vector.{insert|extract}_strided_slice (#79052)
Extends `vector.insert_strided_slice` and `vector.extract_strided_slice`
to allow scalable input and output vectors. For scalable sizes, the
corresponding slice size has to match the corresponding dimension in the
output/input vector (insert/extract, respectively).

This is supported:
```mlir
vector.extract_strided_slice %1 {
  offsets = [0, 3, 0],
  sizes = [1, 1, 4],
  strides = [1, 1, 1] } : vector<1x4x[4]xi32> to vector<1x1x[4]xi32>
```

This is not supported:
```mlir
vector.extract_strided_slice %1 {
  offsets = [0, 3, 0],
  sizes = [1, 1, 2],
  strides = [1, 1, 1] } : vector<1x4x[4]xi32> to vector<1x1x[2]xi32>
```
2024-01-25 19:01:28 +00:00
Krzysztof Drewniak
5cfe24eee4 [mlir][Vector] Add nontemporal attribute, mirroring memref (#76752)
Since vector loads and stores from scalar memrefs translate to
llvm.load/store, add the ability to tag said loads and stores as
nontemporal. This mirrors functionality available in memref.load/store.
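
A sketch of the tagged ops, assuming the attribute spelling mirrors `memref.load`/`memref.store` (names and shapes are assumed):

```mlir
// Nontemporal vector load/store (sketch).
%v = vector.load %mem[%i] {nontemporal = true} : memref<64xf32>, vector<4xf32>
vector.store %v, %mem[%i] {nontemporal = true} : memref<64xf32>, vector<4xf32>
```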
2024-01-09 11:05:20 -06:00
Matthias Springer
bb6d5c2200 [mlir][Transforms] GreedyPatternRewriteDriver: Do not CSE constants during iterations (#75897)
The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply
rewrite patterns to ops. It has special handling for constants: they are
CSE'd and sometimes moved to parent regions to allow for additional
CSE'ing. This happens in `OperationFolder`.

To allow for efficient CSE'ing, `OperationFolder` maintains an internal
lookup data structure to find the existing constant ops with the same
value for each `IsolatedFromAbove` region:
```c++
/// A mapping between an insertion region and the constants that have been
/// created within it.
DenseMap<Region *, ConstantMap> foldScopes;
```

Rewrite patterns are allowed to modify operations. In particular, they
may move operations (including constants) from one region to another
one. Such an IR rewrite can make the above lookup data structure
inconsistent.

We encountered such a bug in a downstream project. This bug materialized
in the form of an op that uses the result of a constant op from a
different `IsolatedFromAbove` region (that is not accessible).

This commit changes the behavior of the `GreedyPatternRewriteDriver`
such that `OperationFolder` is used to CSE constants at the beginning of
each iteration (as the worklist is populated), but no longer during an
iteration. `OperationFolder` is no longer used after populating the
worklist, so we do not have to care about inconsistent state in the
`OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver`
now performs the op folding by itself instead of calling
`OperationFolder::tryToFold`.

This changes the order of constant ops in test cases, but not the
region in which they appear. All broken test cases were fixed by turning
`CHECK` into `CHECK-DAG`.

Alternatives considered: The state of `OperationFolder` could be
partially invalidated with every `notifyOperationModified` notification.
That is more fragile than the solution in this commit because incorrect
rewriter API usage can lead to missing notifications and hard-to-debug
`IsolatedFromAbove` violations. (It did not fix the above-mentioned bug in
a downstream project, which could be due to incorrect rewriter API usage
or due to another conceptual problem that I missed.) Moreover, ops are
frequently getting modified during a greedy pattern rewrite, so we would
likely keep invalidating large parts of the state of `OperationFolder`
over and over.

Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant
ops are no longer folded during a greedy pattern rewrite. If you rely on
folding (and rematerialization) of constant ops during a greedy pattern
rewrite, turn the folder into a pattern.
2024-01-05 09:22:18 +01:00
Matthias Springer
c99670ba51 [mlir][vector] LoadOp/StoreOp: Allow 0-D vectors (#76134)
Similar to `vector.transfer_read`/`vector.transfer_write`, allow 0-D
vectors.

This commit fixes
`mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir`
when verifying the IR after each pattern (#74270). That test produces a
temporary 0-D load/store op.
2023-12-22 11:12:58 +09:00
Jakub Kuderski
560564f51c [mlir][vector][gpu] Align minf/maxf reduction kind names with arith (#75901)
This is to avoid confusion when dealing with reduction/combining kinds.
For example, see a recent PR comment:
https://github.com/llvm/llvm-project/pull/75846#discussion_r1430722175.

Previously, they were picked to mostly mirror the names of the llvm
vector reduction intrinsics:
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic. In
isolation, it was not clear if `<maxf>` has `arith.maxnumf` or
`arith.maximumf` semantics. The new reduction kind names map 1:1 to
arith ops, which makes it easier to tell/look up their semantics.
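
For example (a sketch with assumed values), the renamed kinds read:

```mlir
%a = vector.reduction <maxnumf>, %v : vector<4xf32> into f32   // arith.maxnumf semantics
%b = vector.reduction <maximumf>, %v : vector<4xf32> into f32  // arith.maximumf semantics
```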

Because both the vector and the gpu dialect depend on the arith dialect,
it's more natural to align names with those in arith than with the
lowering to llvm intrinsics.

Issue: https://github.com/llvm/llvm-project/issues/72354
2023-12-20 00:14:43 -05:00
Jakub Kuderski
a528cee224 [mlir][vector] Improve makeArithReduction expansion (#75846)
Propagate fast math flags.
Distinguish `minf`/`maxf` and `minimumf`/`maximumf`.

Required for future patterns in
https://github.com/llvm/llvm-project/pull/75727.
2023-12-18 17:47:46 -05:00
Christian Ulmann
ceb4dc4477 [MLIR][VectorToLLVM] Remove typed pointer support (#71075)
This commit removes the support for lowering Vector to LLVM dialect with
typed pointers. Typed pointers have been deprecated for a while now and
it's planned to soon remove them from the LLVM dialect.

Related PSA:
https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502
2023-11-03 11:16:11 +01:00
Benjamin Maxwell
3be3883e6d [mlir][VectorOps] Support string literals in vector.print (#68695)
Printing strings within integration tests is currently quite annoyingly
verbose, and can't be tucked into shared helpers as the types depend on
the length of the string:

```
llvm.mlir.global internal constant @hello_world("Hello, World!\0")

func.func @entry() {
  %0 = llvm.mlir.addressof @hello_world : !llvm.ptr<array<14 x i8>>
  %1 = llvm.mlir.constant(0 : index) : i64
  %2 = llvm.getelementptr %0[%1, %1]
    : (!llvm.ptr<array<14 x i8>>, i64, i64) -> !llvm.ptr<i8>
  llvm.call @printCString(%2) : (!llvm.ptr<i8>) -> ()
  return
}
```

So this patch adds a simple extension to `vector.print` to simplify
this:
```
func.func @entry() {
   // Print a vector of characters ;)
   vector.print str "Hello, World!"
   return
}
```

Most of the logic for this is now shared with `cf.assert` which already
does something similar.

Depends on #68694
2023-10-24 09:34:14 +01:00
Quinn Dawkins
78c49743c7 [MLIR][Vector] Allow non-default memory spaces in gather/scatter lowerings (#67500)
GPU targets can gather on non-default address spaces (e.g. global), so
this removes the check for the default memory space.
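
For example (a sketch with assumed names and shapes; memory space 1 stands in for a GPU global address space):

```mlir
%0 = vector.gather %base[%c0][%idxs], %mask, %pass
  : memref<16xf32, 1>, vector<4xi32>, vector<4xi1>, vector<4xf32> into vector<4xf32>
```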
2023-09-28 19:20:32 -04:00
Cullen Rhodes
9816edc9f3 [mlir][vector] add result type to vector.extract assembly format (#66499)
The vector.extract assembly format currently only contains the source
type, for example:

  %1 = vector.extract %0[1] : vector<3x7x8xf32>

it's not immediately obvious if this is the source or result type. This
patch improves the assembly format to make this clearer, so the above
becomes:

  %1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>
2023-09-28 11:11:16 +01:00
Diego Caballero
98f6289a34 [mlir][Vector] Add support for Value indices to vector.extract/insert
`vector.extract/insert` ops only support constant indices. This PR
extends them so that arbitrary values can be used instead.
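
In the current assembly format, a sketch of dynamic positions (SSA names are assumed):

```mlir
// Positions given as SSA values rather than constants.
%e = vector.extract %v[%i] : f32 from vector<8xf32>
%w = vector.insert %s, %v[%i] : f32 into vector<8xf32>
```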

This work is part of the RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops

Differential Revision: https://reviews.llvm.org/D155034
2023-09-22 00:39:32 +00:00
Nicolas Vasilache
1b8b556443 [mlir][Vector] Add fastmath flags to vector.reduction (#66905)
This revision pipes the fastmath attribute support through the
vector.reduction op. This seemingly simple first step already requires
quite some genuflexions, file and builder reorganization. In the
process, retire the boolean reassoc flag deep in the LLVM dialect
builders and just use the fastmath attribute.

During conversions, templated builders for predicated intrinsics are
partially cleaned up. In the future, to finalize the cleanups, one
should consider adding fastmath to the VPIntrinsic ops.
2023-09-20 16:57:20 +02:00
Benjamin Maxwell
2f11ce5579 [mlir][VectorOps] Extend vector.constant_mask to support 'all true' scalable dims (#66638)
This extends `vector.constant_mask` so that mask dim sizes that
correspond to a scalable dimension are treated as if they're implicitly
multiplied by vscale. Currently this is limited to mask dim sizes of 0
or the size of the dim/vscale. This allows constant masks to represent
all true and all false scalable masks (and some variations):

```
// All true scalable mask
%mask = vector.constant_mask [8] : vector<[8]xi1>

// All false scalable mask
%mask = vector.constant_mask [0] : vector<[8]xi1>

// First two scalable rows
%mask = vector.constant_mask [2,4] : vector<4x[4]xi1>
```
2023-09-20 14:54:42 +01:00
Benjamin Maxwell
665995b918 [mlir][Conversion] Allow lowering to fixed arrays of scalable vectors
This allows lowering vector types like: vector<3x[4]> or vector<3x2x[4]>
to LLVM IR, i.e. vectors where the trailing dim is scalable.
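
For instance (element type assumed), such a type converts to an LLVM array of scalable vectors:

```mlir
// vector<3x[4]xf32> converts to:
!llvm.array<3 x vector<[4]xf32>>
```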

This is contingent on:
https://discourse.llvm.org/t/rfc-enable-arrays-of-scalable-vector-types/72935

More tests will be added in later patches, however, some MLIR fixes are
needed first.

Depends on: D158517

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D158752
2023-09-15 09:33:18 +00:00
Daniil Dudkin
8f5d519458 [mlir][vector] Implement Workaround Lowerings for Masked fm**imum Reductions
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

Within LLVM, there are no masked reduction counterparts for vector reductions such as `fmaximum` and `fminimum`.
More information can be found here: https://github.com/llvm/llvm-project/issues/64940#issuecomment-1690694156.

To address this issue in MLIR, where we need to generate appropriate lowerings for these cases, we employ regular non-masked intrinsics.
However, we modify the input vector using the `arith.select` operation to effectively deactivate undesired elements using a "neutral mask value".
The neutral mask value is the smallest possible value for the `fmaximum` reduction and the largest possible value for the `fminimum` reduction.
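
A sketch of the expansion for a masked `fmaximum` reduction (SSA names are assumed; -inf is used here as the neutral value for f32):

```mlir
// Deactivate masked-off lanes with the neutral value, then reduce unmasked.
%neutral = arith.constant dense<0xFF800000> : vector<4xf32>  // -inf
%masked  = arith.select %mask, %input, %neutral : vector<4xi1>, vector<4xf32>
%result  = vector.reduction <maximumf>, %masked : vector<4xf32> into f32
```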

Depends on D158618

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D158773
2023-09-13 22:49:08 +00:00
Daniil Dudkin
709b27427b [mlir][vector] Bring back maxf/minf reductions
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

In line with the mentioned RFC, this patch  tackles tasks 2.3 and 2.4.
It adds LLVM conversions for the `maxf`/`minf` reductions to the non-NaN-propagating LLVM intrinsics.

Depends on D158618

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D158659
2023-09-13 22:49:07 +00:00
Daniil Dudkin
4a831250b8 [mlir][vector] Rename vector reductions: maxf → maximumf, minf → minimumf
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

Here, we are addressing task 2.1 from the plan, which involves renaming the vector reductions to align with the semantics of the corresponding LLVM intrinsics.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D158618
2023-09-13 22:49:07 +00:00
Daniil Dudkin
8a6e54c9b3 [mlir][arith] Rename operations: maxf → maximumf, minf → minimumf (#65800)
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

This commit addresses Task 1.2 of the mentioned RFC. By renaming these operations, we align their names with LLVM intrinsics that have corresponding semantics.
2023-09-11 22:02:19 -07:00
Benjamin Maxwell
ccef726d09 [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes (in VectorToLLVM)
This is a follow-on to D158753, and allows the lowering of a
transfer read/write of n-D vectors with a single trailing scalable dimension
to primitive vector ops.

The final conversion to LLVM depends on D158517 and D158752, without
these patches type conversion will fail (or an assert is hit in the LLVM
backend) if the final IR contains an array of scalable vectors.

This patch adds `transform.apply_patterns.vector.lower_create_mask`
which allows the lowering of vector.create_mask/constant_mask to be
tested independently of --convert-vector-to-llvm.

Reviewed By: c-rhodes, awarzynski, dcaballe

Differential Revision: https://reviews.llvm.org/D159482
2023-09-11 16:47:51 +00:00
Benjamin Maxwell
f36e909da0 [mlir][VectorOps] Use SCF for vector.print and allow scalable vectors
Reland of the original patch after updating the Python binding tests,
a few CUDA/GPU MLIR tests, and ensuring the assembly format is
round-trippable.

This patch splits the lowering of vector.print into first converting
an n-D print into a loop of scalar prints of the elements, then a second
pass that converts those scalar prints into the runtime calls. The
former is done in VectorToSCF and the latter in VectorToLLVM.

The main reason for this is to allow printing scalable vector types,
which are not possible to fully unroll at compile time, though this
also avoids fully unrolling very large vectors.

To allow VectorToSCF to add the necessary punctuation between vectors
and elements, a "punctuation" attribute has been added to vector.print.
This abstracts calling the runtime functions such as printNewline(),
without leaking the LLVM details into the higher abstraction levels.
For example:

  vector.print punctuation <comma>

lowers to

  llvm.call @printComma() : () -> ()

The output format and runtime functions remain the same, which avoids
the need to alter a large number of tests (aside from the pipelines).

Reviewed By: awarzynski, c-rhodes, aartbik

Differential Revision: https://reviews.llvm.org/D156519
2023-08-11 09:29:54 +00:00
Mehdi Amini
1b272d21c8 Revert "[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors"
This reverts commit 490dae26cb.

Bot is broken, seems like there is a problem of ambiguity in the parser.
2023-08-09 19:37:01 -07:00
Benjamin Maxwell
490dae26cb [mlir][VectorOps] Use SCF for vector.print and allow scalable vectors
Reland of the original patch after updating the Python binding tests and
a few CUDA/GPU MLIR tests.

This patch splits the lowering of vector.print into first converting
an n-D print into a loop of scalar prints of the elements, then a second
pass that converts those scalar prints into the runtime calls. The
former is done in VectorToSCF and the latter in VectorToLLVM.

The main reason for this is to allow printing scalable vector types,
which are not possible to fully unroll at compile time, though this
also avoids fully unrolling very large vectors.

To allow VectorToSCF to add the necessary punctuation between vectors
and elements, a "punctuation" attribute has been added to vector.print.
This abstracts calling the runtime functions such as printNewline(),
without leaking the LLVM details into the higher abstraction levels.
For example:

  vector.print <comma>

lowers to

  llvm.call @printComma() : () -> ()

The output format and runtime functions remain the same, which avoids
the need to alter a large number of tests (aside from the pipelines).

Reviewed By: awarzynski, c-rhodes, aartbik

Differential Revision: https://reviews.llvm.org/D156519
2023-08-09 11:47:18 +00:00
Benjamin Maxwell
b160442dd2 Revert "[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors"
This reverts commit 3875804a07.

This caused some test failures for the MLIR python bindings. Reverting
until those are addressed.
2023-08-09 09:54:05 +00:00
Benjamin Maxwell
3875804a07 [mlir][VectorOps] Use SCF for vector.print and allow scalable vectors
This patch splits the lowering of vector.print into first converting
an n-D print into a loop of scalar prints of the elements, then a second
pass that converts those scalar prints into the runtime calls. The
former is done in VectorToSCF and the latter in VectorToLLVM.

The main reason for this is to allow printing scalable vector types,
which are not possible to fully unroll at compile time, though this
also avoids fully unrolling very large vectors.

To allow VectorToSCF to add the necessary punctuation between vectors
and elements, a "punctuation" attribute has been added to vector.print.
This abstracts calling the runtime functions such as printNewline(),
without leaking the LLVM details into the higher abstraction levels.
For example:

  vector.print <comma>

lowers to

  llvm.call @printComma() : () -> ()

The output format and runtime functions remain the same, which avoids
the need to alter a large number of tests (aside from the pipelines).

Reviewed By: awarzynski, c-rhodes, aartbik

Differential Revision: https://reviews.llvm.org/D156519
2023-08-09 09:38:05 +00:00