With VectorType supporting scalable dimensions, we don't need many of
the operations currently present in ArmSVE, like mask generation and
basic arithmetic instructions. Therefore, this patch also gets
rid of those.
Having built-in scalable vector support also simplifies the lowering of
scalable vector dialects down to LLVMIR.
Scalable dimensions are indicated with the scalable dimensions
between square brackets:
vector<[4]xf32>
Is a scalable vector of 4 single precission floating point elements.
More generally, a VectorType can have a set of fixed-length dimensions
followed by a set of scalable dimensions:
vector<2x[4x4]xf32>
Is a vector with 2 scalable 4x4 vectors of single precission floating
point elements.
The scale of the scalable dimensions can be obtained with the Vector
operation:
%vs = vector.vscale
This change is being discussed in the discourse RFC:
https://llvm.discourse.group/t/rfc-add-built-in-support-for-scalable-vector-types/4484
Differential Revision: https://reviews.llvm.org/D111819
The implementation only allows to bit-cast between two 0-D vectors. We could
probably support casting from/to vectors like `vector<1xf32>`, but I wasn't
convinced that this would be important and it would require breaking the
invariant that `BitCastOp` works only on vectors with equal rank.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114854
This reverts commit 29a50c5864.
After LLVM lowering, the original patch incorrectly moved alignment
information across an unconstrained GEP operation. This is only correct
for some index offsets in the GEP. It seems that the best approach is,
in fact, to rely on LLVM to propagate information from the llvm.assume()
to users.
Thanks to Thomas Raoux for catching this.
This revision makes concrete use of 0-d vectors to extend the semantics of
InsertElementOp.
Reviewed By: dcaballe, pifon2a
Differential Revision: https://reviews.llvm.org/D114388
This revision starts making concrete use of 0-d vectors to extend the semantics of
ExtractElementOp.
In the process a new VectorOfAnyRank Tablegen OpBase.td is added to allow progressive transition to supporting 0-d vectors by gradually opting in.
Differential Revision: https://reviews.llvm.org/D114387
FMAOp -> LLVM conversion is done progressively by peeling off 1 dimension from FMAOp at each pattern iteration. Add the recursively bounded property declaration to the pattern so that the rewriter can apply it multiple times.
Without this, FMAOps with 3+D do not lower to LLVM.
Differential Revision: https://reviews.llvm.org/D113886
The current implementation invokes materializations
whenever an input operand does not have a mapping for the
desired type, i.e. it requires materialization at the earliest possible
point. This conflicts with goal of dialect conversion (and also the
current documentation) which states that a materialization is only
required if the materialization is supposed to persist after the
conversion process has finished.
This revision refactors this such that whenever a target
materialization "might" be necessary, we insert an
unrealized_conversion_cast to act as a temporary materialization.
This allows for deferring the invocation of the user
materialization hooks until the end of the conversion process,
where we actually have a better sense if it's actually
necessary. This has several benefits:
* In some cases a target materialization hook is no longer
necessary
When performing a full conversion, there are some situations
where a temporary materialization is necessary. Moving forward,
these users won't need to provide any target materializations,
as the temporary materializations do not require the user to
provide materialization hooks.
* getRemappedValue can now handle values that haven't been
converted yet
Before this commit, it wasn't well supported to get the remapped
value of a value that hadn't been converted yet (making it
difficult/impossible to convert multiple operations in many
situations). This commit updates getRemappedValue to properly
handle this case by inserting temporary materializations when
necessary.
Another code-health related benefit is that with this change we
can move a majority of the complexity related to materializations
to the end of the conversion process, instead of handling adhoc
while conversion is happening.
Differential Revision: https://reviews.llvm.org/D111620
The change is based on the proposal from the following discussion:
https://llvm.discourse.group/t/rfc-memreftype-affine-maps-list-vs-single-item/3968
* Introduce `MemRefLayoutAttr` interface to get `AffineMap` from an `Attribute`
(`AffineMapAttr` implements this interface).
* Store layout as a single generic `MemRefLayoutAttr`.
This change removes the affine map composition feature and related API.
Actually, while the `MemRefType` itself supported it, almost none of the upstream
can work with more than 1 affine map in `MemRefType`.
The introduced `MemRefLayoutAttr` allows to re-implement this feature
in a more stable way - via separate attribute class.
Also the interface allows to use different layout representations rather than affine maps.
For example, the described "stride + offset" form, which is currently supported in ASM parser only,
can now be expressed as separate attribute.
Reviewed By: ftynse, bondhugula
Differential Revision: https://reviews.llvm.org/D111553
Precursor: https://reviews.llvm.org/D110200
Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.
Renamed all instances of operations in the codebase and in tests.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797
It was bundling quite a lot of patterns that convert high-D
vector ops into low-D elementary ops. It might not be good
for all of the patterns to happen for a particular downstream
user. For example, `ShapeCastOpRewritePattern` rewrites
`vector.shape_cast` into data movement extract/insert ops.
Instead, split the entry point into multiple ones so users
can pull in patterns on demand.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D111225
This patch extends Linalg core vectorization with support for min/max reductions
in linalg.generic ops. It enables the reduction detection for min/max combiner ops.
It also renames MIN/MAX combining kinds to MINS/MAXS to make the sign explicit for
floating point and signed integer types. MINU/MAXU should be introduce din the future
for unsigned integer types.
Reviewed By: pifon2a, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D110854
This commits updates the remaining usages of the ArrayRef<Value> based
matchAndRewrite/rewrite methods in favor of the new OpAdaptor
overload.
Differential Revision: https://reviews.llvm.org/D110360
The StringAttr version doesn't need a context, so we can just use the
existing `SymbolRefAttr::get` form. The StringRef version isn't preferred
so we want to encourage people to use StringAttr.
There is an additional form of getSymbolRefAttr that takes a (SymbolTrait
implementing) operation. This should also be moved, but I'll do that as
a separate patch.
Differential Revision: https://reviews.llvm.org/D108922
This simplifies the vector to LLVM lowering. Previously, both vector.load/store and vector.transfer_read/write lowered directly to LLVM. With this commit, there is a single path to LLVM vector load/store instructions and vector.transfer_read/write ops must first be lowered to vector.load/store ops.
* Remove vector.transfer_read/write to LLVM lowering.
* Allow non-unit memref strides on all but the most minor dimension for vector.load/store ops.
* Add maxTransferRank option to populateVectorTransferLoweringPatterns.
* vector.transfer_reads with changing element type can no longer be lowered to LLVM. (This functionality is needed only for SPIRV.)
Differential Revision: https://reviews.llvm.org/D106118
The dialect-specific cast between builtin (ex-standard) types and LLVM
dialect types was introduced long time before built-in support for
unrealized_conversion_cast. It has a similar purpose, but is restricted
to compatible builtin and LLVM dialect types, which may hamper
progressive lowering and composition with types from other dialects.
Replace llvm.mlir.cast with unrealized_conversion_cast, and drop the
operation that became unnecessary.
Also make unrealized_conversion_cast legal by default in
LLVMConversionTarget as the majority of convesions using it are partial
conversions that actually want the casts to persist in the IR. The
standard-to-llvm conversion, which is still expected to run last, cleans
up the remaining casts standard-to-llvm conversion, which is still
expected to run last, cleans up the remaining casts
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105880
After the MemRef has been split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the MemRef
dialect operations to LLVM into a separate library and a separate
conversion pass.
Reviewed By: herhut, silvas
Differential Revision: https://reviews.llvm.org/D105625
Simplify vector unrolling pattern to be more aligned with rest of the
patterns and be closer to vector distribution.
The new implementation uses ExtractStridedSlice/InsertStridedSlice
instead of the Tuple ops. After this change the ops based on Tuple don't
have any more used so they can be removed.
This allows removing signifcant amount of dead code and will allow
extending the unrolling code going forward.
Differential Revision: https://reviews.llvm.org/D105381
This commit moves the type translator from LLVM to MLIR to a public header for use by external projects or other code.
Unlike a previous attempt (https://reviews.llvm.org/D104726), this patch moves the type conversion into separate files which remedies the linker error which was only caught by CI.
Differential Revision: https://reviews.llvm.org/D104834
Lower a 1D vector transfer op to LLVM if the last dim stride is 1. Also fixes a bug in the original unit stride computation.
Differential Revision: https://reviews.llvm.org/D102897
vector.transfer_read and vector.transfer_write operations are converted
to llvm intrinsics with specific alignment information, however there
doesn't seem to be a way in llvm to take information from llvm.assume
intrinsics and change this alignment information. In any
event, due the to the structure of the llvm.assume instrinsic, applying
this information at the llvm level is more cumbersome. Instead, let's
generate the masked vector load and store instrinsic with the right
alignment information from MLIR in the first place. Since
we're bothering to do this, lets just emit the proper alignment for
loads, stores, scatter, and gather ops too.
Differential Revision: https://reviews.llvm.org/D100444
This is a hook that allows for providing custom initialization of the pattern, e.g. if it has bounded recursion, setting the debug name, etc., without needing to define a custom constructor. A non-virtual hook was chosen to avoid polluting the vtable with code that we really just want to be inlined when constructing the pattern. The alternative to this would be to just define a constructor for each pattern, this unfortunately creates a lot of otherwise unnecessary boiler plate for a lot of patterns and a hook provides a much simpler/cleaner interface for the very common case.
Differential Revision: https://reviews.llvm.org/D102440
Move TransposeOp lowering in its own populate function as in some cases
it is better to keep it during ContractOp lowering to better
canonicalize it rather than emiting scalar insert/extract.
Differential Revision: https://reviews.llvm.org/D101647
ArmSVE dialect is behind the recent changes in how the Vector dialect
interacts with backend vector dialects and the MLIR -> LLVM IR
translation module. This patch cleans up ArmSVE initialization within
Vector and removes the need for an LLVMArmSVE dialect.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D100171
We will soon be adding non-AVX512 operations to MLIR, such as AVX's rsqrt. In https://reviews.llvm.org/D99818 several possibilities were discussed, namely to (1) add non-AVX512 ops to the AVX512 dialect, (2) add more dialects (e.g. AVX dialect for AVX rsqrt), and (3) expand the scope of the AVX512 to include these SIMD x86 ops, thereby renaming the dialect to something more accurate such as X86Vector.
Consensus was reached on option (3), which this patch implements.
Reviewed By: aartbik, ftynse, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D100119
The patch enables the use of index type in vectors. It is a prerequisite to support vectorization for indexed Linalg operations. This refactoring became possible due to the newly introduced data layout infrastructure. The data layout of a module defines the bitwidth of the index type needed to verify bitcasts and similar vector operations.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D99948
Also factors out out-of-bounds mask generation from vector.transfer_read/write into a new MaterializeTransferMask pattern.
Differential Revision: https://reviews.llvm.org/D100001
This is in preparation for adding a new "mask" operand. The existing "masked" attribute was used to specify dimensions that may be out-of-bounds. Such transfers can be lowered to masked load/stores. The new "in_bounds" attribute is used to specify dimensions that are guaranteed to be within bounds. (Semantics is inverted.)
Differential Revision: https://reviews.llvm.org/D99639
This doesn't change APIs, this just cleans up the many in-tree uses of these
names to use the new preferred names. We'll keep the old names around for a
couple weeks to help transitions.
Differential Revision: https://reviews.llvm.org/D99127
This updates the codebase to pass the context when creating an instance of
OwningRewritePatternList, and starts removing extraneous MLIRContext
parameters. There are many many more to be removed.
Differential Revision: https://reviews.llvm.org/D99028
The Intel Advanced Matrix Extensions (AMX) provides a tile matrix
multiply unit (TMUL), a tile control register (TILECFG), and eight
tile registers TMM0 through TMM7 (TILEDATA). This new MLIR dialect
provides a bridge between MLIR concepts like vectors and memrefs
and the lower level LLVM IR details of AMX.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D98470
The dialect separation was introduced to demarkate ops operating in different
type systems. This is no longer the case after the LLVM dialect has migrated to
using built-in vector types, so the original reason for separation is no longer
valid. Squash the two dialects into one.
The code size decrease isn't quite large: the ops originally in LLVM_AVX512 are
preserved because they match LLVM IR intrinsics specialized for vector element
bitwidth. However, it is still conceptually beneficial to have only one
dialect. I originally considered to use Tablegen multiclasses to define both
the type-polymorphic op and its two intrinsic-related instantiations, but
decided against it given both the complexity of the required Tablegen input and
its dissimilarity with the rest of ODS-defined ops, both potentially resulting
in very poor maintainability.
Depends On D98327
Reviewed By: nicolasvasilache, springerm
Differential Revision: https://reviews.llvm.org/D98328
The two dialects are largely redundant. The former was introduced as a mirror
of the latter operating on LLVM dialect types. This is no longer necessary
since the LLVM dialect operates on built-in types. Combine the two dialects.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D98060
There is no need for the interface implementations to be exposed, opaque
registration functions are sufficient for all users, similarly to passes.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97852
Just a pure method renaming.
It is a preparation step for replacing "memory space as raw integer"
with more generic "memory space as attribute", which will be done in
separate commit.
The `MemRefType::getMemorySpace` method will return `Attribute` and
become the main API, while `getMemorySpaceAsInt` will be declared as
deprecated and will be replaced in all in-tree dialects (also in separate
commits).
Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D97476
Similar to mask-load/store and compress/expand, the gather and
scatter operation now allow for higher dimension uses. Note that
to support the mixed-type index, the new syntax is:
vector.gather %base [%i,%j] [%kvector] ....
The first client of this generalization is the sparse compiler,
which needs to define scatter and gathers on dense operands
of higher dimensions too.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D97422
It's not necessarily the case on all architectures that all memory is
addressable in addrspace 0, so casting the pointer to addrspace 0 is
liable to cause problems.
Reviewed By: aartbik, ftynse, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96380
Align the vector gather/scatter/expand/compress API with
the vector load/store/maskedload/maskedstore API.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D96396
This patch adds the 'vector.load' and 'vector.store' ops to the Vector
dialect [1]. These operations model *contiguous* vector loads and stores
from/to memory. Their semantics are similar to the 'affine.vector_load' and
'affine.vector_store' counterparts but without the affine constraints. The
most relevant feature is that these new vector operations may perform a vector
load/store on memrefs with a non-vector element type, unlike 'std.load' and
'std.store' ops. This opens the representation to model more generic vector
load/store scenarios: unaligned vector loads/stores, perform scalar and vector
memory access on the same memref, decouple memory allocation constraints from
memory accesses, etc [1]. These operations will also facilitate the progressive
lowering of both Affine vector loads/stores and Vector transfer reads/writes
for those that read/write contiguous slices from/to memory.
In particular, this patch adds the 'vector.load' and 'vector.store' ops to the
Vector dialect, implements their lowering to the LLVM dialect, and changes the
lowering of 'affine.vector_load' and 'affine.vector_store' ops to the new vector
ops. The lowering of Vector transfer reads/writes will be implemented in the
future, probably as an independent pass. The API of 'vector.maskedload' and
'vector.maskedstore' has also been changed slightly to align it with the
transfer read/write ops and the vector new ops. This will improve reusability
among all these operations. For example, the lowering of 'vector.load',
'vector.store', 'vector.maskedload' and 'vector.maskedstore' to the LLVM dialect
is implemented with a single template conversion pattern.
[1] https://llvm.discourse.group/t/memref-type-and-data-layout/
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96185
This reverts commit 511dd4f438 along with
a couple fixes.
Original message:
Now the context is the first, rather than the last input.
This better matches the rest of the infrastructure and makes
it easier to move these types to being declaratively specified.
Phabricator: https://reviews.llvm.org/D96111
Now the context is the first, rather than the last input.
This better matches the rest of the infrastructure and makes
it easier to move these types to being declaratively specified.
Differential Revision: https://reviews.llvm.org/D96111