The change is based on the proposal from the following discussion:
https://llvm.discourse.group/t/rfc-memreftype-affine-maps-list-vs-single-item/3968
* Introduce `MemRefLayoutAttr` interface to get `AffineMap` from an `Attribute`
(`AffineMapAttr` implements this interface).
* Store layout as a single generic `MemRefLayoutAttr`.
This change removes the affine map composition feature and related API.
Actually, while the `MemRefType` itself supported it, almost none of the upstream
can work with more than 1 affine map in `MemRefType`.
The introduced `MemRefLayoutAttr` allows to re-implement this feature
in a more stable way - via separate attribute class.
Also the interface allows to use different layout representations rather than affine maps.
For example, the described "stride + offset" form, which is currently supported in ASM parser only,
can now be expressed as separate attribute.
Reviewed By: ftynse, bondhugula
Differential Revision: https://reviews.llvm.org/D111553
When folding A->B->C => A->C only accept A->C that is valid shape cast
Reviewed By: ThomasRaoux, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D111473
Precursor: https://reviews.llvm.org/D110200
Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.
Renamed all instances of operations in the codebase and in tests.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797
This revision takes advantage of the recently added support for 0-d transfers and vector.multi_reduction that return a scalar.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D111626
This revision updates the op semantics, printer, parser and verifier to allow 0-d transfers.
Until 0-d vectors are available, such transfers have a special form that transits through vector<1xt>.
This is a stepping stone towards the longer term work of adding 0-d vectors and will help significantly reduce corner cases in vectorization.
Transformations and lowerings do not yet support this form, extensions will follow.
Differential Revision: https://reviews.llvm.org/D111559
vector.multi_reduction currently does not allow reducing down to a scalar.
This creates corner cases that are hard to handle during vectorization.
This revision extends the semantics and adds the proper transforms, lowerings and canonicalizations to allow lowering out of vector.multi_reduction to other abstractions all the way to LLVM.
In a future, where we will also allow 0-d vectors, scalars will still be relevant: 0-d vector and scalars are not equivalent on all hardware.
In the process, splice out the implementation patterns related to vector.multi_reduce into a new file.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D111442
This patch extends Linalg core vectorization with support for min/max reductions
in linalg.generic ops. It enables the reduction detection for min/max combiner ops.
It also renames MIN/MAX combining kinds to MINS/MAXS to make the sign explicit for
floating point and signed integer types. MINU/MAXU should be introduce din the future
for unsigned integer types.
Reviewed By: pifon2a, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D110854
The former is redundant because the later carries it as part of
its builder. Add a getContext() helper method to DialectAsmParser
to make this more convenient, and stop passing the context around
explicitly. This simplifies ODS generated parser hooks for attrs
and types.
This resolves PR51985
Recommit 4b32f8bac4 after fixing a dependency.
Differential Revision: https://reviews.llvm.org/D110796
The former is redundant because the later carries it as part of
its builder. Add a getContext() helper method to DialectAsmParser
to make this more convenient, and stop passing the context around
explicitly. This simplifies ODS generated parser hooks for attrs
and types.
This resolves PR51985
Differential Revision: https://reviews.llvm.org/D110796
The pattern is returning success even if it does no work leading to pattern application running up to the max iteration count and failing.
Reviewed By: nicolasvasilache, mravishankar
Differential Revision: https://reviews.llvm.org/D109791
This PR adds missing AtomicRMWKind::min/max cases which we would like to use for min/max reduction loop vectorizations.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D104881
This aligns the printer with the parser contract: the operation isn't part of the user-controllable part of the syntax.
Differential Revision: https://reviews.llvm.org/D108804
This simplifies the vector to LLVM lowering. Previously, both vector.load/store and vector.transfer_read/write lowered directly to LLVM. With this commit, there is a single path to LLVM vector load/store instructions and vector.transfer_read/write ops must first be lowered to vector.load/store ops.
* Remove vector.transfer_read/write to LLVM lowering.
* Allow non-unit memref strides on all but the most minor dimension for vector.load/store ops.
* Add maxTransferRank option to populateVectorTransferLoweringPatterns.
* vector.transfer_reads with changing element type can no longer be lowered to LLVM. (This functionality is needed only for SPIRV.)
Differential Revision: https://reviews.llvm.org/D106118
Simplify vector unrolling pattern to be more aligned with rest of the
patterns and be closer to vector distribution.
The new implementation uses ExtractStridedSlice/InsertStridedSlice
instead of the Tuple ops. After this change the ops based on Tuple don't
have any more used so they can be removed.
This allows removing signifcant amount of dead code and will allow
extending the unrolling code going forward.
Differential Revision: https://reviews.llvm.org/D105381
* Previously, we were only generating .h.inc files. We foresee the need to also generate implementations and this is a step towards that.
* Discussed in https://llvm.discourse.group/t/generating-cpp-inc-files-for-dialects/3732/2
* Deviates from the discussion above by generating a default constructor in the .cpp.inc file (and adding a tablegen bit that disables this in case if this is user provided).
* Generating the destructor started as a way to flush out the missing includes (produces a link error), but it is a strict improvement on its own that is worth doing (i.e. by emitting key methods in the .cpp file, we root vtables in one translation unit, which is a non-controversial improvement).
Differential Revision: https://reviews.llvm.org/D105070
Broadcast dimensions of vector transfer ops are always in-bounds. This is consistent with the fact that the starting position of a transfer is always in-bounds.
Differential Revision: https://reviews.llvm.org/D102566
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.
This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.
Differential Revision: https://reviews.llvm.org/D101745
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.
This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.
Differential Revision: https://reviews.llvm.org/D101745
This patch adds support for vectorizing loops with 'iter_args'
implementing known reductions along the vector dimension. Comparing to
the non-vector-dimension case, two additional things are done during
vectorization of such loops:
- The resulting vector returned from the loop is reduced to a scalar
using `vector.reduce`.
- In some cases a mask is applied to the vector yielded at the end of
the loop to prevent garbage values from being written to the
accumulator.
Vectorization of reduction loops is disabled by default. To enable it, a
map from loops to array of reduction descriptors should be explicitly passed to
`vectorizeAffineLoops`, or `vectorize-reductions=true` should be passed
to the SuperVectorize pass.
Current limitations:
- Loops with a non-unit step size are not supported.
- n-D vectorization with n > 1 is not supported.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D100694
The current implementation had a bug as it was relying on the target vector
dimension sizes to calculate where to insert broadcast. If several dimensions
have the same size we may insert the broadcast on the wrong dimension. The
correct broadcast cannot be inferred from the type of the source and
destination vector.
Instead when we want to extend transfer ops we calculate an "inverse" map to the
projected permutation and insert broadcast in place of the projected dimensions.
Differential Revision: https://reviews.llvm.org/D101738
This revision adds support for vectorizing more general linalg operations with projected permutation maps.
This is achieved by eagerly broadcasting the intermediate vector to the common size
of the iteration domain of the linalg op. This allows a much more natural expression of
generalized vectorization but may introduce additional computations until all the
proper canonicalizations are implemented.
This generalization modifies the vector.transfer_read/write permutation logic and
exposes the fact that the logic employed in vector.contract was too ad-hoc.
As a consequence, changes occur in the permutation / transposition logic for contraction. In turn this prompts supporting more cases in the lowering of contract
to matrix intrinsics, which is required to make the corresponding tests pass.
Differential revision: https://reviews.llvm.org/D101165
The current implementation allows for TransferWriteOps with broadcasts that do not make sense. E.g., a broadcast could write a vector into a single (scalar) memory location, which is effectively the same as writing only the last element of the vector.
Differential Revision: https://reviews.llvm.org/D100842
Move the existing optimization for transfer op on tensor to folder and
canonicalization. This handles the write after write case and read after write
and also add write after read case.
Differential Revision: https://reviews.llvm.org/D100597
The patch enables the use of index type in vectors. It is a prerequisite to support vectorization for indexed Linalg operations. This refactoring became possible due to the newly introduced data layout infrastructure. The data layout of a module defines the bitwidth of the index type needed to verify bitcasts and similar vector operations.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D99948
Also factors out out-of-bounds mask generation from vector.transfer_read/write into a new MaterializeTransferMask pattern.
Differential Revision: https://reviews.llvm.org/D100001
This is in preparation for adding a new "mask" operand. The existing "masked" attribute was used to specify dimensions that may be out-of-bounds. Such transfers can be lowered to masked load/stores. The new "in_bounds" attribute is used to specify dimensions that are guaranteed to be within bounds. (Semantics is inverted.)
Differential Revision: https://reviews.llvm.org/D99639
Use new `MemRefType::getMemorySpace` method with generic Attribute
in cases, where there is no specific logic around the memory space.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D99154
This doesn't change APIs, this just cleans up the many in-tree uses of these
names to use the new preferred names. We'll keep the old names around for a
couple weeks to help transitions.
Differential Revision: https://reviews.llvm.org/D99127
This updates the codebase to pass the context when creating an instance of
OwningRewritePatternList, and starts removing extraneous MLIRContext
parameters. There are many many more to be removed.
Differential Revision: https://reviews.llvm.org/D99028
This allows for storage instances to store data that isn't uniqued in the context, or contain otherwise non-trivial logic, in the rare situations that they occur. Storage instances with trivial destructors will still have their destructor skipped. A consequence of this is that the storage instance definition must be visible from the place that registers the type.
Differential Revision: https://reviews.llvm.org/D98311
With the new vector.load/store operations, there is no need to go through
unmasked transfer operations (which will canonicalized to l/s anyway).
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D98056
Add a folder to rewrite a sequence such as:
```
%t1 = ...
%v = vector.transfer_read %t0[%c0...], {masked = [false...]} :
tensor<static_sizesxf32>, vector<static_sizesxf32>
%t2 = vector.transfer_write %v, %t1[%c0...] {masked = [false...]} :
vector<static_sizesxf32>, tensor<static_sizesxf32>
```
into:
```
%t0
```
The producer of t1 may or may not be DCE'd depending on whether it is a
block argument or has side effects.
Differential revision: https://reviews.llvm.org/D97934
Just a pure method renaming.
It is a preparation step for replacing "memory space as raw integer"
with more generic "memory space as attribute", which will be done in
separate commit.
The `MemRefType::getMemorySpace` method will return `Attribute` and
become the main API, while `getMemorySpaceAsInt` will be declared as
deprecated and will be replaced in all in-tree dialects (also in separate
commits).
Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D97476
Similar to mask-load/store and compress/expand, the gather and
scatter operation now allow for higher dimension uses. Note that
to support the mixed-type index, the new syntax is:
vector.gather %base [%i,%j] [%kvector] ....
The first client of this generalization is the sparse compiler,
which needs to define scatter and gathers on dense operands
of higher dimensions too.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D97422