When applying vector masking we may create a mask and then transpose it.
Transpositions are extremely expensive so this patch introduces a new
canonicalization pattern that remove the tranpose operation and create a
new transposed mask.
Differential Revision: https://reviews.llvm.org/D146193
This patch adds support for folding trivial masked reductions and
multi-reductions (e.g., multi-reductions with only parallel dims,
reductions of a single element, etc.). To support those foldings in
a composable way we also add support for folding different flavors of
empty vector.mask opertions.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D144414
This patch adds masking support for more contraction flavors including those
with any combiner operation (add, mul, min, max, and, or, etc.) and
regular matmul contractions.
Combiner operations that are performing vertical reductions (and,
therefore, they are not represented with a horizontal reduction
operation) can be executed unmasked. However, the previous value of
the accumulator must be propagated for lanes that shouldn't accumulate.
We achieve this goal by introducing a select operation after the
accumulator to choose between the combined and the previous accumulator
value. This design decision is made to avoid introducing masking support
to all the arithmetic and logical operations in the Arith dialect. VP
intrinsics do not support pass-thru values either so we would have to
generate the same sequence when lowering to LLVM. The op + select
pattern is peepholed by some backend with native masking support for those
operations.
Consequently, this patch removes masking support from the vector.fma
operation to follow the same approach for all the combiner operations.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D144239
Plain `getVectorType()` can be quite confusing and error-prone
given that, well, vector ops always work on vector types, and
it can commonly involve both source and result vectors. So this
commit makes various such accessor methods to be explicit w.r.t.
source or result vectors.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D144159
This patch adds support for masked vector.gather ops using the
vector.mask representation. It includes the implementation of the
MaskableOpInterface, Linalg vectorizer support and lowering to LLVM.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D143939
This patch adds support for masking vector.contract ops with the
vector.mask approach. This also includes the lowering of vector.contract
through the vector.outerproduct path to LLVM. For now, this only adds
support for one of the many potential flavors of
vector.contract/vector.outerproduct but unsupported cases will fail
gratefully.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D143965
This is a similar to the existing folder for f16 to f32 added with
D96041 but instead for integer types where destination bits > source bits.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D142922
This commit updates vector.contract documentation to clarify
the promotion behavior if operands and the result have different
bitwidths. It also adds a check to disable signed/unsigned integer
types and only allow signless integers.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D142915
Reductions can't be folded into plain arith ops until we can mask
those arith ops.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D141645
This will probably be the first in a series of patches that tries to
enable code generation for ARM SME (extension of SVE).
Since SME's core operation is the outer product instruction, I figured
that it would probably be a good idea to enable the outer product
operation to properly accept and generate scalable vectors.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D138718
This patch enables vectorization of reductions in Linalg vectorizer
using the vector.mask operation. It also introduces the logic to slice
and propagate the vector mask of a masked multi-reduction to their
respective lowering operations.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D141571
The verifier already had support for multiple result types, but the op definition assumed a single, optional result.
Differential Revision: https://reviews.llvm.org/D141683
This will probably be the first in a series of patches that tries to
enable code generation for ARM SME (extension of SVE).
Since SME's core operation is the outer product instruction, I figured
that it would probably be a good idea to enable the outer product
operation to properly accept and generate scalable vectors.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D138718
Ops that use TypesMatchWith to constrain result types for verification
and to infer result types during parser generation should also be able
to have the `inferReturnTypes` method auto generated. This patch
upgrades the logic for generating `inferReturnTypes` to handle the
TypesMatchWith trait by building a type inference graph where each edge
corresponds to "type of A can be inferred from type of B", supporting
transformers other than `"$_self"`.
Reviewed By: lattner, rriddle
Differential Revision: https://reviews.llvm.org/D141231
The patch adds operations to `BlockAndValueMapping` and renames it to `IRMapping`. When operations are cloned, old operations are mapped to the cloned operations. This allows mapping from an operation to a cloned operation. Example:
```
Operation *opWithRegion = ...
Operation *opInsideRegion = &opWithRegion->front().front();
IRMapping map
Operation *newOpWithRegion = opWithRegion->clone(map);
Operation *newOpInsideRegion = map.lookupOrNull(opInsideRegion);
```
Migration instructions:
All includes to `mlir/IR/BlockAndValueMapping.h` should be replaced with `mlir/IR/IRMapping.h`. All uses of `BlockAndValueMapping` need to be renamed to `IRMapping`.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D139665
We were missing to check for transpose when folding.
Also add a new file to test folding independently of
canonicalization as canonicalization was hiding the bug.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D140533
This is part of an effort to migrate from llvm::Optional to
std::optional. This patch changes the way mlir-tblgen generates .inc
files, and modifies tests and documentation appropriately. It is a "no
compromises" patch, and doesn't leave the user with an unpleasant mix of
llvm::Optional and std::optional.
A non-trivial change has been made to ControlFlowInterfaces to split one
constructor into two, relating to a build failure on Windows.
See also: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Signed-off-by: Ramkumar Ramachandra <r@artagnon.com>
Differential Revision: https://reviews.llvm.org/D138934
This patch introduces the initial bits to support vector masking
using the `vector.mask` operation. Vectorization changes should be
NFC for non-masked cases. We can't test masked cases directly until
we extend the Transform dialect to support masking.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D137690
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
- Use `zip_equal` where iteratees are supposted to have equal lenght.
- Use `zip_first` where the first iteratee is supposed to be the
shortest.
- Use `llvm::enumerate` instead of calculating index manually.
- Use structured bindings to unpack tuples where appropriate.
- Fix a bug in a comparison in `intersectsWhereNonNegative`.
Both `zip_first` (after D138858) and `zip_equal` (introduced in D138865)
assert interatee lengths, which allows us to more precisely convey
whether we want to iterate over the common prefix (`zip`), or expect all
lengths to be the same (`zip_equal`).
Reviewed By: dcaballe, antiagainst
Differential Revision: https://reviews.llvm.org/D139022
This helper handles non trivial cases of broadcast + optional transpose creation
that should not leak to the outside world.
Differential Revision: https://reviews.llvm.org/D139003
Fold InsertStridedOp(ConstantOp into ConstantOp) -> ConstantOp.
This pattern comes with vector size threshold to make sure we do not
introduce too many large constants.
This help clean up code created by the Wide Integer Emulation pass.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D138739
This revision fixes a bug in the vector.extract folding that was missing
handling the "dim-1" broadcasting case in vector.broadcast.
Differential Revision: https://reviews.llvm.org/D138804
This pattern comes with vector size threshold to make sure we do not
introduce too many large constants.
This help clean up code created by the Wide Integer Emulation pass.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D138733
This allows us to better canonicalize/clean-up code created by the Wide
Integer Emulation pass.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D138606
This generalizes the existing fold for `ExtractOp(non-splat constant)`
to work with vector results. The vector case is handled by extracting
the subrange of attribute array.
My main use it to clean up code generated by the Wide Integer Emulation
pass.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D138690