This has been a major TODO for a very long time, and is necessary for establishing a proper
dialect-free dependency layering for the Transforms library. Code was moved to effectively
two main locations:
* Affine/
There was quite a bit of affine-dialect-related code in Transforms/ due to historical
reasons (dating back to MLIR's early days). The following headers were moved:
Transforms/LoopFusionUtils.h -> Dialect/Affine/LoopFusionUtils.h
Transforms/LoopUtils.h -> Dialect/Affine/LoopUtils.h
Transforms/Utils.h -> Dialect/Affine/Utils.h
The following transforms were also moved:
AffineLoopFusion, AffinePipelineDataTransfer, LoopCoalescing
* SCF/
Only one SCF pass was in Transforms/ (likely accidentally placed here): ParallelLoopCollapsing
The SCF specific utilities in LoopUtils have been moved to SCF/Utils.h
* Misc:
mlir::moveLoopInvariantCode was also moved to LoopLikeInterface.h given
that it is a simple utility defined in terms of LoopLikeOpInterface.
Differential Revision: https://reviews.llvm.org/D117848
After removing the range type, Linalg does not define any types. This revision thus consolidates LinalgOps.h and LinalgTypes.h into a single Linalg.h header. Additionally, LinalgTypes.cpp is renamed to LinalgDialect.cpp to follow the convention adopted by other dialects such as the tensor dialect.
Depends On D115727
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D115728
The 0-D case gets lowered in almost the same way that the 1-D case does
in VectorCreateMaskOpConversion. I also had to slightly update the
verifier for the op to always require exactly 1 operand in the 0-D case.
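For illustration, a minimal sketch of the resulting 0-D form (operand names hypothetical):

```mlir
// Hypothetical sketch: the 0-D create_mask takes exactly one operand; the
// resulting mask is all-true iff that operand is nonzero.
%c1 = arith.constant 1 : index
%mask = vector.create_mask %c1 : vector<i1>
```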
Depends On D115220
Reviewed by: ftynse
Differential revision: https://reviews.llvm.org/D115221
The new form of printing attributes in the declarative assembly elides the `#dialect.mnemonic` prefix and keeps only the `<....>` part.
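For illustration, a hedged sketch using the vector dialect's combining-kind attribute (era syntax assumed):

```mlir
// The attribute's full form is `#vector.kind<add>`; inside the op's
// declarative assembly it now prints as just `<add>`:
%r = vector.multi_reduction <add>, %v [1] : vector<4x8xf32> to vector<4xf32>
```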
Differential Revision: https://reviews.llvm.org/D113873
To support creating 0-D masks with a single `true` or `false` value,
I had to relax the restriction in the verifier that the rank is always equal to
the length of the attribute array; in other words, we now allow:
- `vector.constant_mask [0] : vector<i1>` which gets lowered to
`arith.constant dense<false> : vector<i1>`
- `vector.constant_mask [1] : vector<i1>` which gets lowered to
`arith.constant dense<true> : vector<i1>`
(the attribute list for the 0-D case must be a singleton containing
either `0` or `1`)
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D115023
This revision adds 0-d vector support to vector.transfer ops.
In the process, numerous cleanups are applied, in particular around normalizing
and reducing the number of builders.
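A minimal sketch of a 0-D transfer pair (operand names hypothetical):

```mlir
// Hypothetical sketch: 0-D transfers on a 0-D memref use empty index lists.
%f0 = arith.constant 0.0 : f32
%v = vector.transfer_read %m[], %f0 : memref<f32>, vector<f32>
vector.transfer_write %v, %m[] : vector<f32>, memref<f32>
```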
Reviewed By: ThomasRaoux, springerm
Differential Revision: https://reviews.llvm.org/D114803
This changes the op to produce `AnyVectorOfAnyRank` following mostly the code for 1-D vectors.
Depends On D114598
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114550
Instead of using the shape_cast op in the pattern removing leading unit
dimensions, we use extract/broadcast ops. This is part of the effort to
further restrict ShapeCastOp in the future and only allow it to
convert to or from 1-D vectors.
This also adds extra canonicalizations to fill the gaps in simplifying
broadcast/extract ops.
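A minimal sketch of the rewrite (era syntax, names hypothetical):

```mlir
// Before: drop the leading unit dim with a shape_cast.
//   %r = vector.shape_cast %v : vector<1x4xf32> to vector<4xf32>
// After: use extract instead (broadcast handles the symmetric case of
// reintroducing a leading unit dim).
%r = vector.extract %v[0] : vector<1x4xf32>
```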
Differential Revision: https://reviews.llvm.org/D114205
This is in anticipation of dropping them altogether and using insert/extract-based patterns.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D113928
This avoids crashes when the pattern is applied to ops with
tensor semantics.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D113415
The 2-D case can be rewritten to generate significantly fewer instructions and a single vector.shuffle, which seems to provide a nice performance boost.
Add this arrow to our quiver by exposing it with a new vector transform option.
Differential Revision: https://reviews.llvm.org/D113062
When the operand is a subview, we don't infer in_bounds, and some default cases (e.g., the case in the tests) crash with `operand is NULL` when converting to LLVM.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D112772
Add several patterns that will simplify contraction vectorization in the
future. With those canonicalizations we will be able to remove the special
case for contraction during vectorization and rely on those transformations to
avoid materializing broadcast ops.
Differential Revision: https://reviews.llvm.org/D112121
Add a pattern to take a rank-reducing subview and drop the innermost
contiguous unit dimension.
This is useful when lowering vectors to backends with 1-D vector types.
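A hedged sketch of the kind of subview the pattern targets (types and layout maps schematic, not taken from the patch):

```mlir
#map = affine_map<(d0, d1)[s0] -> (d0 * 4 + d1 + s0)>
// A rank-reducing subview whose innermost unit dim is contiguous.
%sv = memref.subview %m[%i, %j, 0] [1, 4, 1] [1, 1, 1]
    : memref<8x16x4xf32> to memref<4x1xf32, #map>
// The pattern drops the trailing unit dim, exposing the equivalent
//   memref<4xf32, affine_map<(d0)[s0] -> (d0 * 4 + s0)>>
```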
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D111561
This revision uses the newly refactored StructuredGenerator to create a simple vectorization for conv1d_nwc_wcf.
Note that the pattern is not specific to the op and is technically not even specific to the ConvolutionOpInterface (modulo minor details related to dilations and strides).
The overall design follows the same ideas as the lowering of vector::ContractionOp -> vector::OuterProduct: it seeks to be minimally complex, composable and extensible while avoiding inference analysis. Instead, we metaprogram the maps/indexings we expect and we match against them.
This is just a first stab and still needs to be evaluated for performance.
Other tradeoffs are possible that should be explored.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D111894
Precursor: https://reviews.llvm.org/D110200
Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.
Renamed all instances of operations in the codebase and in tests.
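For illustration (op spellings per the new dialects):

```mlir
// Former standard-dialect arithmetic now lives in arith; math functions in math.
%sum = arith.addi %a, %b : i32  // formerly the standard dialect's addi
%e   = math.exp %x : f32        // formerly the standard dialect's exp
```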
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797
vector.multi_reduction currently does not allow reducing down to a scalar.
This creates corner cases that are hard to handle during vectorization.
This revision extends the semantics and adds the proper transforms, lowerings and canonicalizations to allow lowering out of vector.multi_reduction to other abstractions all the way to LLVM.
In the future, when we also allow 0-D vectors, scalars will still be relevant: 0-D vectors and scalars are not equivalent on all hardware.
In the process, splice out the implementation patterns related to vector.multi_reduction into a new file.
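A minimal sketch of the newly allowed form (era syntax assumed):

```mlir
// Reducing across all dimensions now yields a scalar directly.
%r = vector.multi_reduction #vector.kind<add>, %v [0, 1]
   : vector<4x8xf32> to f32
```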
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D111442
It was bundling quite a lot of patterns that convert high-D
vector ops into low-D elementary ops. Not all of these patterns
may be desirable for a particular downstream
user. For example, `ShapeCastOpRewritePattern` rewrites
`vector.shape_cast` into data-movement extract/insert ops.
Instead, split the entry point into multiple ones so users
can pull in patterns on demand.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D111225
This patch extends Linalg core vectorization with support for min/max reductions
in linalg.generic ops. It enables the reduction detection for min/max combiner ops.
It also renames the MIN/MAX combining kinds to MINS/MAXS to make the sign explicit for
floating-point and signed integer types. MINU/MAXU should be introduced in the future
for unsigned integer types.
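A hedged sketch of the kind of reduction the detection targets (the combiner spelling, e.g. arith.minf, is an assumption, not taken from the patch):

```mlir
#id  = affine_map<(d0) -> (d0)>
#red = affine_map<(d0) -> ()>
// A float min reduction expressed as a linalg.generic.
%0 = linalg.generic {indexing_maps = [#id, #red],
                     iterator_types = ["reduction"]}
    ins(%in : tensor<64xf32>) outs(%init : tensor<f32>) {
  ^bb0(%a: f32, %acc: f32):
    %m = arith.minf %a, %acc : f32
    linalg.yield %m : f32
} -> tensor<f32>
```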
Reviewed By: pifon2a, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D110854
This revision retires a good portion of the complexity of the codegen strategy and moves the logic behind passes.
Differential revision: https://reviews.llvm.org/D110678
When splitting with linalg.copy, we cannot write into the destination alloc directly. Instead, write into a subview of the alloc.
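A hedged sketch (era linalg.copy syntax; sizes and layout map schematic):

```mlir
#map = affine_map<(d0, d1) -> (d0 * 16 + d1)>
%alloc = memref.alloc() : memref<16x16xf32>
// Write through a subview of the alloc rather than the alloc itself.
%view = memref.subview %alloc[0, 0] [8, 16] [1, 1]
      : memref<16x16xf32> to memref<8x16xf32, #map>
linalg.copy(%src, %view) : memref<8x16xf32>, memref<8x16xf32, #map>
```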
Differential Revision: https://reviews.llvm.org/D110512
The approach for handling reductions in the outermost
dimension follows that for the innermost dimensions, outlined below:
* First, transpose to move the reduction dims, if needed.
* Convert the reduction from n-D to 2-D canonical form.
* Then, for outer reductions, emit the appropriate op
(add/mul/min/max/or/and/xor) and combine the results, as sketched below.
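A hedged sketch of the outer-reduction combine in 2-D canonical form (current-style op spellings assumed):

```mlir
// Reducing dim 0 of a 4x8 vector combines the rows elementwise.
%row0 = vector.extract %v[0] : vector<4x8xf32>
%row1 = vector.extract %v[1] : vector<4x8xf32>
%acc0 = arith.addf %row0, %row1 : vector<8xf32>
// ... the remaining rows are folded in the same way ...
```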
Differential Revision: https://reviews.llvm.org/D107675
The existing vector transforms reduce the dimension of transfer_read
ops. However, beyond a certain point, the vector op actually has
to be reduced to a scalar load, since we can't load a zero-dimension
vector. This patch handles that case.
Note that in the longer term, it may be preferable to support
zero-dimension vectors; see
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097.
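A hedged sketch of the rewrite (the broadcast back to vector<1xf32> is an assumption about the surrounding uses):

```mlir
// Before: a transfer of a single element.
//   %v = vector.transfer_read %m[%i], %pad : memref<8xf32>, vector<1xf32>
// After: a scalar load, re-vectorized for the existing users.
%s = memref.load %m[%i] : memref<8xf32>
%v = vector.broadcast %s : f32 to vector<1xf32>
```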
Differential Revision: https://reviews.llvm.org/D103432
When the output indexing map has a permutation, we need to take it into
account in the contraction vector type.
Differential Revision: https://reviews.llvm.org/D106469
This simplifies the vector to LLVM lowering. Previously, both vector.load/store and vector.transfer_read/write lowered directly to LLVM. With this commit, there is a single path to LLVM vector load/store instructions and vector.transfer_read/write ops must first be lowered to vector.load/store ops.
* Remove vector.transfer_read/write to LLVM lowering.
* Allow non-unit memref strides on all but the most minor dimension for vector.load/store ops.
* Add maxTransferRank option to populateVectorTransferLoweringPatterns.
* vector.transfer_reads with changing element type can no longer be lowered to LLVM. (This functionality is needed only for SPIRV.)
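A minimal sketch of the two-step path (names hypothetical):

```mlir
// Step 1: an in-bounds transfer is rewritten to vector.load...
//   %v = vector.transfer_read %m[%i], %pad {in_bounds = [true]}
//      : memref<128xf32>, vector<16xf32>
// ...which is then the single op lowered to an LLVM vector load:
%v = vector.load %m[%i] : memref<128xf32>, vector<16xf32>
```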
Differential Revision: https://reviews.llvm.org/D106118
Simplify the vector unrolling pattern to be more aligned with the rest of the
patterns and closer to vector distribution.
The new implementation uses ExtractStridedSlice/InsertStridedSlice
instead of the Tuple ops. After this change the ops based on Tuple no longer
have any uses, so they can be removed.
This allows removing a significant amount of dead code and will allow
extending the unrolling code going forward.
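A minimal sketch of the new building blocks (shapes hypothetical):

```mlir
// Slice out a piece of the vector...
%s = vector.extract_strided_slice %v
    {offsets = [0, 0], sizes = [2, 2], strides = [1, 1]}
    : vector<4x4xf32> to vector<2x2xf32>
// ...compute on %s, then stitch the result back into the accumulator.
%r = vector.insert_strided_slice %s, %acc
    {offsets = [0, 0], strides = [1, 1]}
    : vector<2x2xf32> into vector<4x4xf32>
```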
Differential Revision: https://reviews.llvm.org/D105381
Remove `getDynOperands` and `createOrFoldDimOp` from MemRef.h to decouple MemRef a bit from Tensor. These two functions are used in other dialects/transforms.
Differential Revision: https://reviews.llvm.org/D105260
The implementation has become too unwieldy and cognitive overhead wins.
Instead compress the implementation in preparation for additional lowering paths.
This is a resubmit of https://reviews.llvm.org/D105359 without ordering ambiguities.
Differential Revision: https://reviews.llvm.org/D105367
The implementation has become too unwieldy and cognitive overhead wins.
Instead compress the implementation in preparation for additional lowering paths.
Differential Revision: https://reviews.llvm.org/D105359
* Split memref.dim into two operations: memref.dim and tensor.dim. Both ops have the same builder interface and op argument names, so that they can be used with templates in patterns that apply to both tensors and memrefs (e.g., some patterns in Linalg).
* Add constant materializer to TensorDialect (needed for folding in affine.apply etc.).
* Remove some MemRefDialect dependencies, make some explicit.
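For illustration:

```mlir
// The two ops share the same form, so patterns can be templated over them.
%d0 = memref.dim %m, %c0 : memref<?x4xf32>
%d1 = tensor.dim %t, %c0 : tensor<?x4xf32>
```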
Differential Revision: https://reviews.llvm.org/D105165
Uses the elementwise interface to generalize the canonicalization pattern and adds a new
pattern for the vector.contract case.
Differential Revision: https://reviews.llvm.org/D104343