Commit Graph

214 Commits

Author SHA1 Message Date
Quentin Colombet
1dd00d3903 [mlir][Vector] Fix a propagation bug with broadcast
In the vector distribute patterns, we used to move
`vector.broadcast`s out of `vector.warp_execute_on_lane0`s
irrespectively of how they were defined.

This could create broadcast operations with invalid semantic.
E.g.,
```
%r = warop ...[32] ... -> vector<1x2xf32> {
  %val = broadcast %in : vector<64xf32> to vetor<1x64xf32>
  vector.yield %val : vector<1x64xf32>
}
```
=>
```
%r = warop ...[32] ... -> vector<64xf32> {
  vector.yield %in : vector<64xf32>
}
// Broadcasting to a narrower type!
broadcast %r : vector<64xf32> to vector<1x2xf32>
```

The root issue is we are trying to broadcast something that is not the same
for each thread, so there is actually nothing to propagate here.

The fix checks that the broadcast we want to create actually makes sense.

Differential Revision: https://reviews.llvm.org/D152154
2023-06-06 16:40:15 +02:00
Manish Gupta
9a795f0c59 [mlir][Vector] Adds a pattern to fold arith.extf into vector.contract
Consider mixed precision data type, i.e., F16 input lhs, F16 input rhs, F32 accumulation, and F32 output. This is typically written as F32 <= F16*F16 + F32.

During vectorization from linalg to vector for mixed precision data type (F32 <= F16*F16 + F32), linalg.matmul introduces arith.extf on input lhs and rhs operands.

"linalg.matmul"(%lhs, %rhs, %acc) ({
      ^bb0(%arg1: f16, %arg2: f16, %arg3: f32):
        %lhs_f32 = "arith.extf"(%arg1) : (f16) -> f32
        %rhs_f32 = "arith.extf"(%arg2) : (f16) -> f32
       %mul = "arith.mulf"(%lhs_f32, %rhs_f32) : (f32, f32) -> f32
        %acc = "arith.addf"(%arg3, %mul) : (f32, f32) -> f32
      "linalg.yield"(%acc) : (f32) -> ()
    })
There are backend that natively supports mixed-precision data type and does not need the arith.extf. For example, NVIDIA A100 GPU has mma.sync.aligned.*.f32.f16.f16.f32 that can support mixed-precision data type. However, the presence of arith.extf in the IR, introduces the unnecessary casting targeting F32 Tensor Cores instead of F16 Tensor Cores for NVIDIA backend. This patch adds a folding pattern to fold arith.extf into vector.contract

Differential Revision: https://reviews.llvm.org/D151918
2023-06-05 23:22:20 +00:00
Quentin Colombet
018d8ac974 [mlir][Vector] Fix a propagation bug with transfer_read
In the vector distribute patterns, we used to move
`vector.transfer_read`s out of `vector.warp_execute_on_lane0`s
irrespectively of how they were defined.

This could create transfer_read operations that would read values from
within the warpOp's body from outside of the body.
E.g.,
```
warpop {
  %defined_in_body
  %read = transfer_read %defined_in_body
  vector.yield %read
}
```
=>
```
warpop {
  %defined_in_body
  vector.yield ...
}
// %defined_in_body is referenced outside of its scope.
%read = transfer_read %defined_in_body
```

The fix consists in checking that all the values feeding the new
`transfer_read` are defined outside of warpOp's body.

Note: We could do this check before creating any operation, but that would
mean knowing what `affine::makeComposedAffineApply` actually do. So the
current fix is a trade off of coupling the implementations of this
propagation and `makeComposedAffineApply` versus compile time.

Differential Revision: https://reviews.llvm.org/D152149
2023-06-05 15:52:26 +02:00
Matthias Springer
01128d4baf [mlir][vector][NFC] Clean up headers
Certain functions were declared in `VectorOps.h` instead of `VectorTransforms.h` or `VectorRewritePatterns.h`.

Differential Revision: https://reviews.llvm.org/D152146
2023-06-05 15:16:20 +02:00
Diego Caballero
834fcfed24 Reland "[mlir][Vector] Extend xfer drop unit dim patterns"
This reverts commit 76d71f3792.
2023-06-01 22:22:16 +00:00
Diego Caballero
d3e1398bef [mlir][Vector] Prevent vector-to-scalar xfer patterns from triggering on sub-vectors
Patterns that convert extract(transfer_read) into a scalar load where
incorrectly triggering for cases where a sub-vector instead of a scalar
was extracted.

Reviewed By: nicolasvasilache, hanchung, awarzynski

Differential Revision: https://reviews.llvm.org/D151862
2023-06-01 22:22:16 +00:00
Diego Caballero
0935c0556b [mlir][Vector] Add support for 0-D 'vector.shape_cast' lowering
This PR adds support for shape casting from and to 0-D vectors.

Reviewed By: nicolasvasilache, hanchung, awarzynski

Differential Revision: https://reviews.llvm.org/D151851
2023-06-01 22:22:16 +00:00
Diego Caballero
76d71f3792 Revert "[mlir][Vector] Extend xfer drop unit dim patterns"
This reverts commit a53cd03dea.

This commit is exposing some implementation gaps in other patterns.
Reverting for now.
2023-05-31 18:20:05 +00:00
Diego Caballero
a53cd03dea [mlir][Vector] Extend xfer drop unit dim patterns
This patch extends the transfer drop unit dim patterns to support cases where the vector shape should also be reduced
(e.g., transfer_read(memref<1x4x1xf32>, vector<1x4x1xf32>) -> transfer_read(memref<4xf32>, vector<4xf32>).

Reviewed By: hanchung, pzread

Differential Revision: https://reviews.llvm.org/D151007
2023-05-23 20:58:51 +00:00
Diego Caballero
14726cd691 [mlir][Vector] Extend xfer_read(extract)->scalar load to support multiple uses
This patch extends the vector.extract(vector.transfer_read) -> scalar
load patterns to support vector.transfer_read with multiple uses. For
now, we check that all the uses are vector.extract operations.
Supporting multiple uses is predicated under a flag.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D150812
2023-05-19 21:03:18 +00:00
Lei Zhang
e000b62a34 [mlir][vector] Separate out vector transfer + tensor slice patterns
These patterns touches the structure generated from tiling so it
affects later steps like bufferization and vector hoisting.
Instead of putting them in canonicalization, this commit creates
separate entry points for them to be called explicitly.

This is NFC regarding the functionality and tests of those patterns.
It also addresses two TODO items in the codebase.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D150702
2023-05-17 09:01:19 -07:00
Matthias Springer
61223c49dd [mlir][GPU] Rename MLIRGPUOps CMake target to MLIRGPUDialect
This is for consistency with other dialects.

Differential Revision: https://reviews.llvm.org/D150659
2023-05-16 14:25:08 +02:00
Tres Popp
c1fa60b4cd [mlir] Update method cast calls to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.

Context:

* https://mlir.llvm.org/deprecation/ at "Use the free function variants for dyn_cast/cast/isa/…"
* Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This follows a previous patch that updated calls
`op.cast<T>()-> cast<T>(op)`. However some cases could not handle an
unprefixed `cast` call due to occurrences of variables named cast, or
occurring inside of class definitions which would resolve to the method.
All C++ files that did not work automatically with `cast<T>()` are
updated here to `llvm::cast` and similar with the intention that they
can be easily updated after the methods are removed through a
find-replace.

See https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
for the clang-tidy check that is used and then update printed
occurrences of the function to include `llvm::` before.

One can then run the following:
```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
                 -export-fixes /tmp/cast/casts.yaml mlir/*\
                 -header-filter=mlir/ -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc
```

Differential Revision: https://reviews.llvm.org/D150348
2023-05-12 11:21:30 +02:00
Tres Popp
5550c82189 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast
   - Some files had not included a header file that defines the cast
     functions
   - Some files are definitions of the classes that have the casting
     methods, so the code still refers to the method instead of the
     function without adding a prefix or removing the method declaration
     at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
            mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
            mlir/lib/**/IR/\
            mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
            mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
            mlir/test/lib/Dialect/Test/TestTypes.cpp\
            mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
            mlir/test/lib/Dialect/Test/TestAttributes.cpp\
            mlir/unittests/TableGen/EnumsGenTest.cpp\
            mlir/test/python/lib/PythonTestCAPI.cpp\
            mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123
2023-05-12 11:21:25 +02:00
Hanhan Wang
25cc5a71b3 [mlir][vector] Generalize vector.transpose lowering to n-D vectors
The existing vector.transpose lowering patterns only triggers if the
input vector is 2D. The revision extends the pattern to handle n-D
vectors which are effectively 2-D vectors (e.g., vector<1x4x1x8x1).

It refactors a common check about 2-D vectors from X86Vector
lowering to VectorUtils.h so it can be reused by both sides.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D149908
2023-05-08 10:48:26 -07:00
Quinn Dawkins
650f04feda [mlir][vector] Add pattern to break down vector.bitcast
The pattern added here is intended as a last resort for targets like
SPIR-V where there are vector size restrictions and we need to be able
to break down large vector types. Vectorizing loads/stores for small
bitwidths (e.g. i8) relies on bitcasting to a larger element type and
patterns to bubble bitcast ops to where they can cancel.
This fails for cases such as
```
%1 = arith.trunci %0 : vector<2x32xi32> to vector<2x32xi8>
vector.transfer_write %1, %destination[%c0, %c0] {in_bounds = [true, true]} : vector<2x32xi8>, memref<2x32xi8>
```
where the `arith.trunci` op essentially does the job of one of the
bitcasts, leading to a bitcast that need to be further broken down
```
vector.bitcast %0 : vector<16xi8> to vector<4xi32>
```

Differential Revision: https://reviews.llvm.org/D149065
2023-04-25 20:18:02 -04:00
Quinn Dawkins
435f7d4c2e [mlir][vector] Add unroll pattern for vector.gather
This pattern is useful for SPIR-V to unroll to a supported vector size
before later lowerings. The unrolling pattern is closer to an
elementwise op than the transfer ops because the index values from which
to extract elements are captured by the index vector and thus there is
no need to update the base offsets when unrolling gather.

Differential Revision: https://reviews.llvm.org/D149066
2023-04-24 14:02:59 -04:00
Hanhan Wang
8d163e5045 [mlir][Vector] Add 16x16 strategy to vector.transpose lowering.
It adds a `shuffle_16x16` strategy LowerVectorTranspose and renames `shuffle` to `shuffle_1d`. The idea is similar to 8x8 cases in x86Vector::avx2. The general algorithm is:

```
interleave 32-bit lanes using
    8x _mm512_unpacklo_epi32
    8x _mm512_unpackhi_epi32
interleave 64-bit lanes using
    8x _mm512_unpacklo_epi64
    8x _mm512_unpackhi_epi64
permute 128-bit lanes using
   16x _mm512_shuffle_i32x4
permute 256-bit lanes using again
   16x _mm512_shuffle_i32x4
```

After the first stage, they got transposed to

```
 0  16   1  17   4  20   5  21   8  24   9  25  12  28  13  29
 2  18   3  19   6  22   7  23  10  26  11  27  14  30  15  31
32  48  33  49 ...
34  50  35  51 ...
64  80  65  81 ...
...
```

After the second stage, they got transposed to

```
 0  16  32  48 ...
 1  17  33  49 ...
 2  18  34  49 ...
 3  19  35  51 ...
64  80  96 112 ...
65  81  97 114 ...
66  82  98 113 ...
67  83  99 115 ...
...
```

After the thrid stage, they got transposed to

```
  0  16  32  48   8  24  40  56  64  80  96  112 ...
  1  17  33  49 ...
  2  18  34  50 ...
  3  19  35  51 ...
  4  20  36  52 ...
  5  21  37  53 ...
  6  22  38  54 ...
  7  23  39  55 ...
128 144 160 176 ...
129 145 161 177 ...
...
```

After the last stage, they got transposed to

```
0  16  32  48  64  80  96 112 ... 240
1  17  33  49  66  81  97 113 ... 241
2  18  34  50  67  82  98 114 ... 242
...
15  31  47  63  79  96 111 127 ... 255
```

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D148685
2023-04-23 11:05:41 -07:00
Lei Zhang
eca7698a97 [mlir][vector] NFC: Expose castAwayContractionLeadingOneDim
This commit exposes the transformation behind the pattern.
It is useful for more targeted application on a specific op
for once.

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D148758
2023-04-21 09:41:14 -07:00
Diego Caballero
eb7f9feedb [Vector][NFC] Fix DEBUG_TYPE in LowerVectorTranspose.cpp
Fix wrong debug type.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D148729
2023-04-21 01:16:22 +00:00
Rahul Kayaith
6089d612a5 [mlir] Prevent implicit downcasting to interfaces
Currently conversions to interfaces may happen implicitly (e.g.
`Attribute -> TypedAttr`), failing a runtime assert if the interface
isn't actually implemented. This change marks the `Interface(ValueT)`
constructor as explicit so that a cast is required.

Where it was straightforward to I adjusted code to not require casts,
otherwise I just made them explicit.

Depends on D148491, D148492

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D148493
2023-04-20 16:31:54 -04:00
Matthias Springer
4c48f016ef [mlir][Affine][NFC] Wrap dialect in "affine" namespace
This cleanup aligns the affine dialect with all the other dialects.

Differential Revision: https://reviews.llvm.org/D148687
2023-04-20 11:19:21 +09:00
Benjamin Kramer
37a867a5a8 [vector] When trimming leading insertion dimensions, base the final result on the ranks
This was incorrect when the number of dropped source dims was smaller
than the number of dropped dst dims. We still need to insert zeros if
there is anything dropped from the src.

Differential Revision: https://reviews.llvm.org/D148636
2023-04-18 18:49:29 +02:00
Lei Zhang
5041fe8439 [mlir][vector] Fix integer promotion type mismatch
We need to create a new type with transposed shape after
transposing the operand in `CanonicalizeContractMatmulToMMT`.

Reviewed By: kuhar, dcaballe

Differential Revision: https://reviews.llvm.org/D148470
2023-04-17 11:29:23 -07:00
tyb0807
942b403ff1 [mlir] Fix casting of leading unit dims for vector.insert
When dropping leading unit dims of vector.insert's operands and creating
a new vector.insert, its new position rank should be computed explicitly
in two steps: first based on the numbers of leading unit dims dropped
from the vector.insert's destination, then based on the numbers of
leading unit dims dropped from its source.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D147280
2023-03-31 12:12:35 +00:00
Jakub Kuderski
72c662a47f [mlir][vector][NFC] Clean up vector gather lowering comments
These got relocated recently.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D147257
2023-03-30 17:13:14 -04:00
Diego Caballero
7b70baa9ef [mlir][Vector] Remove lhs and rhs masks from vector.contract
This patch removes the historical lhs and rhs masks in vector.contract,
now that vector.mask supports vector.contract and the lhs and rhs masks
are barely supported by all the vector.contract lowerings and
transformations.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D144430
2023-03-29 19:53:29 +00:00
Nicolas Vasilache
553cebde06 [mlir][Vector] Use a RewriterBase for IR rewrites in VectorTransferOpTransforms 2023-03-25 01:48:50 -07:00
Nicolas Vasilache
8b51340740 [mlir][Vector][Transforms] Improve the control over individual vector lowerings and transforms
This revision adds vector transform operations that allow us to better inspect the composition
of various lowerings that were previously very opaque.

This commit is NFC in that it does not change patterns beyond adding `rewriter.notifyFailure` messages
and it does not change the tests beyond breaking them into pieces and using transforms instead of
throwaway opaque test passes.

Reviewed By: ftynse, springerm

Co-authored-by: Alex Zinenko <zinenko@google.com>

Differential Revision: https://reviews.llvm.org/D146755
2023-03-24 14:01:39 +00:00
Nicolas Vasilache
2bc4c3e920 [mlir][Vector] NFC - Reorganize vector patterns
Vector dialect patterns have grown enormously in the past year to a point where they are now impenetrable.
Start reorganizing them towards finer-grained control.

Differential Revision: https://reviews.llvm.org/D146736
2023-03-23 11:30:25 -07:00
Jakub Kuderski
8c258fda1f [ADT][mlir][NFCI] Do not use non-const lvalue-refs with enumerate
Replace references to enumerate results with either result_pairs
(reference wrapper type) or structured bindings. I did not use
structured bindings everywhere as it wasn't clear to me it would
improve readability.

This is in preparation to the switch to zip semantics which won't
support non-const lvalue reference to elements:
https://reviews.llvm.org/D144503.

I chose to use values instead of const lvalue-refs because MLIR is
biased towards avoiding `const` local variables. This won't degrade
performance because currently `result_pair` is cheap to copy (size_t
+ iterator), and in the future, the enumerator iterator dereference
will return temporaries anyway.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D146006
2023-03-15 10:43:56 -04:00
Jakub Kuderski
f80a976acd [mlir][vector] Add gather lowering patterns
This is for targets that do not support gather-like ops, e.g., SPIR-V.

Gather is expanded into lower-level vector ops with memory accesses
guarded with `scf.if`.

I also considered generating `vector.maskedload`s, but decided against
it to keep the `memref` and `tensor` codepath closer together. There's a
good chance that if a target doesn't support gather it does not support
masked loads either.

Issue: https://github.com/llvm/llvm-project/issues/60905

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D145942
2023-03-14 10:59:30 -04:00
Nicolas Vasilache
203fad476b [mlir][DialectUtils] Cleanup IndexingUtils and provide more affine variants while reusing implementations
Differential Revision: https://reviews.llvm.org/D145784
2023-03-14 03:44:59 -07:00
Emilio Cota
350d7e33ff [mlir][vector] remove unnecessary VectorTransformOps include
While at it, add a dep that we missed in https://reviews.llvm.org/D145638.

Reviewed By: kuhar, dcaballe

Differential Revision: https://reviews.llvm.org/D145731
2023-03-09 18:23:36 -05:00
Jakub Kuderski
fb7ef637a8 [mlir][vector][nvgpu] Move MMA contraction preparation to VectorUtils
This pattern is not specific to nvgpu; I intend to use in SPIR-V codegen. `VectorTransforms` seems like a more generally useful place.

In addition:
-  Fix a bug in the second condition (the dimensions were swapped for RHS).
-  Add tests.
-  Add support for externally provided filter functions, similar to other vector transforms.
-  Prefer to transpose before zero/sign-extending inputs.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D145638
2023-03-09 14:56:21 -05:00
Matthias Springer
7ecc921deb [mlir][vector] Fix incorrect API usage in RewritePatterns
Incorrect API usage was detected by D144552.

Differential Revision: https://reviews.llvm.org/D145153
2023-03-02 13:58:37 +01:00
Diego Caballero
3a3ab2147d [mlir][Vector] Add support for high-order masked contractions
This patch adds support for masked vector.contract ops that needs to be
decomposed using the ContractionOpLowering pattern. It just slices the
mask according to the rest of the lowering.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D144427
2023-02-22 06:54:02 +00:00
Diego Caballero
c339f9e1c3 [mlir][Vector] Support masking for more contraction flavors
This patch adds masking support for more contraction flavors including those
with any combiner operation (add, mul, min, max, and, or, etc.) and
regular matmul contractions.

Combiner operations that are performing vertical reductions (and,
therefore, they are not represented with a horizontal reduction
operation) can be executed unmasked. However, the previous value of
the accumulator must be propagated for lanes that shouldn't accumulate.
We achieve this goal by introducing a select operation after the
accumulator to choose between the combined and the previous accumulator
value. This design decision is made to avoid introducing masking support
to all the arithmetic and logical operations in the Arith dialect. VP
intrinsics do not support pass-thru values either so we would have to
generate the same sequence when lowering to LLVM. The op + select
pattern is peepholed by some backend with native masking support for those
operations.

Consequently, this patch removes masking support from the vector.fma
operation to follow the same approach for all the combiner operations.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D144239
2023-02-22 01:47:44 +00:00
Kazu Hirata
9e5d2495ac Use APInt::isOne instead of APInt::isOneValue (NFC)
Note that isOneValue has been soft-deprecated in favor of isOne.
2023-02-19 23:06:36 -08:00
Thomas Raoux
e3a88a41af Revert "[mlir][vector] Prevent duplicating operations during vector distribute"
This reverts commit 2fc3c5c34c.
2023-02-17 03:07:16 +00:00
Lei Zhang
a1aad28d29 [mlir][vector] NFC: Improve vector type accessor methods
Plain `getVectorType()` can be quite confusing and error-prone
given that, well, vector ops always work on vector types, and
it can commonly involve both source and result vectors. So this
commit makes various such accessor methods to be explicit w.r.t.
source or result vectors.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D144159
2023-02-16 04:08:33 +00:00
Kazu Hirata
5382d28815 [mlir] Use std::optional instead of llvm::Optional (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-02-15 19:40:10 -08:00
Diego Caballero
1ac874c9aa [mlir][Vector] Add support for masked vector gather ops
This patch adds support for masked vector.gather ops using the
vector.mask representation. It includes the implementation of the
MaskableOpInterface, Linalg vectorizer support and lowering to LLVM.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D143939
2023-02-15 06:10:22 +00:00
Diego Caballero
9452356ddc [mlir][Vector] Add support for masked vector.contract
This patch adds support for masking vector.contract ops with the
vector.mask approach. This also includes the lowering of vector.contract
through the vector.outerproduct path to LLVM. For now, this only adds
support for one of the many potential flavors of
vector.contract/vector.outerproduct but unsupported cases will fail
gratefully.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D143965
2023-02-15 06:10:22 +00:00
Matthias Springer
9fa6b3504b [mlir][bufferization] Improve aliasing OpOperand/OpResult property
`getAliasingOpOperands`/`getAliasingOpResults` now encodes OpOperand/OpResult, buffer relation and a degree of certainty. E.g.:
```
// aliasingOpOperands(%r) = {(%t, EQUIV, DEFINITE)}
// aliasingOpResults(%t) = {(%r, EQUIV, DEFINITE)}
%r = tensor.insert %f into %t[%idx] : tensor<?xf32>

// aliasingOpOperands(%r) = {(%t0, EQUIV, MAYBE), (%t1, EQUIV, MAYBE)}
// aliasingOpResults(%t0) = {(%r, EQUIV, MAYBE)}
// aliasingOpResults(%t1) = {(%r, EQUIV, MAYBE)}
%r = arith.select %c, %t0, %t1 : tensor<?xf32>
```

`BufferizableOpInterface::bufferRelation` is removed, as it is now part of `getAliasingOpOperands`/`getAliasingOpResults`.

This change allows for better analysis, in particular wrt. equivalence. This allows additional optimizations and better error checking (which is sometimes overly conservative). Examples:

* EmptyTensorElimination can eliminate `tensor.empty` inside `scf.if` blocks. This requires a modeling of equivalence: It is not a per-OpResult property anymore. Instead, it can be specified for each OpOperand and OpResult. This is important because `tensor.empty` may be eliminated only if all values on the SSA use-def chain to the final consumer (`tensor.insert_slice`) are equivalent.
* The detection of "returning allocs from a block" can be improved. (Addresses a TODO in `assertNoAllocsReturned`.) This allows us to bufferize IR such as "yielding a `tensor.extract_slice` result from an `scf.if` branch", which currently fails to bufferize because the alloc detection is too conservative.
* Better bufferization of loops. Aliases of the iter_arg can be yielded (even if they are not equivalent) without having to realloc and copy the entire buffer on each iteration.

The above-mentioned examples are not yet implemented with this change. This change just improves the BufferizableOpInterface, its implementations and related helper functions, so that better aliasing information is available for each op.

Differential Revision: https://reviews.llvm.org/D142129
2023-02-09 11:35:03 +01:00
Thomas Raoux
2fc3c5c34c [mlir][vector] Prevent duplicating operations during vector distribute
We should distribute ops that have other uses than the yield op as this
would duplicate those ops.

Differential Revision: https://reviews.llvm.org/D143629
2023-02-09 08:26:35 +00:00
Diego Caballero
b1d82057ed [mlir][Vector] Add lowering support for 1-D masked multi-reductions
1-D multi-reductions follow a different lowering path (they are
converted to 2-D multi-reductions) so masked variants need to be
supported explicitly.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D143453
2023-02-07 20:03:38 +00:00
Matthias Springer
1ac248e485 [mlir][bufferization][NFC] Rename getAliasingOpOperand/getAliasingOpResult
* `getAliasingOpOperand` => `getAliasingOpOperands`
* `getAliasingOpResult` => `getAliasingOpResults`

Also a few minor code cleanups and better documentation.

Differential Revision: https://reviews.llvm.org/D142979
2023-02-01 10:07:41 +01:00
Matthias Springer
017be821da [mlir][vector][bufferize] Fix Windows build failure introduced by D141686 2023-01-31 09:36:20 +01:00
Matthias Springer
199f368e35 [mlir][vector][bufferize] Bufferize vector.mask and vector.yield
The masked op can currently not bufferize out-of-place. Such IR would be rejected by the One-Shot Bufferize because it would mean that a new buffer allocation is yielded from a block. Furthermore, only one operation is currently allowed inside `vector.mask`.

Differential Revision: https://reviews.llvm.org/D141686
2023-01-31 09:02:27 +01:00