Buffers are no longer deallocated by One-Shot Bufferize. This is now
done by a separate buffer deallocation pass.
Also fix a bug in the `vector.mask` folding, which was triggered by
`-buffer-deallocation-pipeline`, which runs the canonicalizer.
This PR adds promised interface declarations for all interfaces declared
in `InitAllDialects.h`.
Promised interfaces allow a dialect to declare that it will have an
implementation of a particular interface, crashing the program if one
isn't provided when the interface is used.
This reverts commit 5cdb8c0c88.
This pattern is producing incorrect IR. For example,
```mlir
func.func @extract_subvector_from_constant_mask() -> vector<16xi1> {
%mask = vector.constant_mask [2, 3] : vector<16x16xi1>
%extract = vector.extract %mask[8] : vector<16xi1> from vector<16x16xi1>
return %extract : vector<16xi1>
}
```
Canonicalizes to
```mlir
func.func @extract_subvector_from_constant_mask() -> vector<16xi1> {
%0 = vector.constant_mask [3] : vector<16xi1>
return %0 : vector<16xi1>
}
```
Where it should be a zero mask because the extraction index (8) is
greater than the constant mask size along that dim (2).
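The expected result can be checked with a small model. The Python sketch below is not the MLIR folder; it represents a constant mask by its list of per-dimension mask sizes and uses a hypothetical helper name, `extract_from_constant_mask`. An element is set only if its index is below the mask size in every dimension, so extracting at an index at or beyond a leading mask size must give an all-false mask.

```python
def extract_from_constant_mask(mask_dim_sizes, position):
    """Model vector.extract applied to a vector.constant_mask.

    mask_dim_sizes: per-dimension mask sizes, e.g. [2, 3].
    position: static extraction indices for the leading dimensions.
    Returns the mask sizes of the extracted sub-mask.
    """
    remaining = list(mask_dim_sizes[len(position):])
    for index, size in zip(position, mask_dim_sizes):
        if index >= size:
            # The extracted slice lies outside the masked region.
            return [0] * len(remaining)
    return remaining
```

For the example above, `extract_from_constant_mask([2, 3], [8])` gives `[0]`, whereas the reverted pattern effectively produced `[3]`.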
Extends `vector.insert_strided_slice` and `vector.extract_strided_slice`
to allow scalable input and output vectors. For scalable sizes, the
corresponding slice size has to match the corresponding dimension in the
output/input vector (insert/extract, respectively).
This is supported:
```mlir
vector.extract_strided_slice %1 {
offsets = [0, 3, 0],
sizes = [1, 1, 4],
strides = [1, 1, 1] } : vector<1x4x[4]xi32> to vector<1x1x[4]xi32>
```
This is not supported:
```mlir
vector.extract_strided_slice %1 {
offsets = [0, 3, 0],
sizes = [1, 1, 2],
strides = [1, 1, 1] } : vector<1x4x[4]xi32> to vector<1x1x[2]xi32>
```
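The rule for the extract case can be sketched as a small Python check. This is a model under stated assumptions, not the Op verifier: shapes are given as base sizes plus a per-dimension scalable flag, and the helper name `is_supported_strided_slice` is hypothetical.

```python
def is_supported_strided_slice(source_shape, scalable_dims, sizes):
    """Model the scalable-size rule for vector.extract_strided_slice.

    source_shape: base dimension sizes, e.g. [1, 4, 4] for vector<1x4x[4]xi32>.
    scalable_dims: one bool per dimension, True where the dim is scalable.
    sizes: requested slice sizes for the leading dimensions.
    """
    for dim, size in enumerate(sizes):
        # A scalable dimension must be sliced in full.
        if scalable_dims[dim] and size != source_shape[dim]:
            return False
    return True
```

With the two examples above, sizes `[1, 1, 4]` pass the check and sizes `[1, 1, 2]` do not.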
1. Updates and clarifies a few comments related to hooks for
vector.{insert|extract}_strided_slice.
2. For consistency with vector.insert_strided_slice, removes a TODO from
vector.extract_strided_slice Op def. It's self-explanatory that
adding support for non-unit strides is a "TODO".
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`
The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.
Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
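For downstream code, the rename is mechanical. The following Python sketch of a migration helper applies the four renames listed above to source text; the helper name `migrate_source` is hypothetical and it only rewrites whole identifiers, so review the result before committing.

```python
import re

# Old name -> new name, as listed in this commit.
RENAMES = {
    "updateRootInPlace": "modifyOpInPlace",
    "startRootUpdate": "startOpModification",
    "finalizeRootUpdate": "finalizeOpModification",
    "cancelRootUpdate": "cancelOpModification",
}

# Match any old name as a whole identifier only.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate_source(text):
    """Return `text` with the old rewriter API names replaced."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], text)
```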
This commit adds extra assertions to `OperationFolder` and `OpBuilder`
to ensure that the types of the folded SSA values match with the result
types of the op. There used to be checks that discard the folded results
if the types do not match. This commit makes these checks stricter and
turns them into assertions.
Discarding folded results with the wrong type (without failing
explicitly) can hide bugs in op folders. Two such bugs became apparent
in MLIR (and some more in downstream projects) and are fixed with this
change.
Note: The existing type checks were introduced in
https://reviews.llvm.org/D95991.
Migration guide: If you see failing assertions (`folder produced value
of incorrect type`; make sure to run with assertions enabled!), run with
`-debug` or dump the operation right before the failing assertion. This
will point you to the op that has the broken folder. A common mistake is
a mismatch between static/dynamic dimensions (e.g., input has a static
dimension but folded result has a dynamic dimension).
This is to avoid confusion when dealing with reduction/combining kinds.
For example, see a recent PR comment:
https://github.com/llvm/llvm-project/pull/75846#discussion_r1430722175.
Previously, they were picked to mostly mirror the names of the llvm
vector reduction intrinsics:
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic. In
isolation, it was not clear if `<maxf>` has `arith.maxnumf` or
`arith.maximumf` semantics. The new reduction kind names map 1:1 to
arith ops, which makes it easier to tell/look up their semantics.
Because both the vector and the gpu dialect depend on the arith dialect,
it's more natural to align names with those in arith than with the
lowering to llvm intrinsics.
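The difference between the two arith ops is their NaN handling. The Python sketch below models only that aspect (it ignores signed zeros, which is a simplification); the function names mirror the arith op names but are plain Python helpers.

```python
import math


def maximumf(a, b):
    """Model of arith.maximumf: the result is NaN if either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return max(a, b)


def maxnumf(a, b):
    """Model of arith.maxnumf: if one operand is NaN, return the other."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)
```

The two agree on ordinary numbers and differ only when a NaN is involved.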
Issue: https://github.com/llvm/llvm-project/issues/72354
Without this patch, MLIR crashes with
```
Assertion failed: (getNumDims() == map.getNumResults() && "Number of results mismatch"), function compose, file AffineMap.cpp, line 537.
```
during parsing.
This reverts commit f42b7615b8.
The fold pattern is incorrect, because it does not even look at the
permutation of non-unit dims and is happy to replace a pattern such as
```
%22 = vector.shape_cast %21 : vector<1x256x256xf32> to vector<256x256xf32>
%23 = vector.transpose %22, [1, 0] : vector<256x256xf32> to vector<256x256xf32>
```
with
```
%22 = vector.shape_cast %21 : vector<1x256x256xf32> to vector<256x256xf32>
```
which is obviously incorrect.
This folds transpose(shape_cast) into a new shape_cast, when the
transpose just permutes a unit dim from the result of the shape_cast.
Example:
```
%0 = vector.shape_cast %vec : vector<[4]xf32> to vector<[4]x1xf32>
%1 = vector.transpose %0, [1, 0] : vector<[4]x1xf32> to vector<1x[4]xf32>
```
Folds to:
```
%0 = vector.shape_cast %vec : vector<[4]xf32> to vector<1x[4]xf32>
```
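The legality condition can be sketched in Python. This is a model, not the actual pattern: it assumes shapes are given as base sizes, treats a dimension as unit when its base size is 1 (so a scalable `[1]` dim would need separate handling), and uses a hypothetical helper name.

```python
def transpose_only_moves_unit_dims(shape, permutation):
    """Return True if the transpose keeps all non-unit dims in order.

    shape: shape of the transpose input (the shape_cast result).
    permutation: result dim i takes its data from input dim permutation[i].
    """
    non_unit_before = [dim for dim in range(len(shape)) if shape[dim] != 1]
    non_unit_after = [dim for dim in permutation if shape[dim] != 1]
    # Element order is unchanged only if non-unit dims keep their order.
    return non_unit_before == non_unit_after
```

The example above (`[4, 1]` with permutation `[1, 0]`) satisfies the check, while a `[256, 256]` transpose with permutation `[1, 0]` does not.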
This is an (alternate) fix for lowering matmuls to ArmSME.
Previously the pattern only worked when the permutation map was a minor
identity. Infer the new mask type from the new transfer map after
dropping leading unit dims.
* Declare arguments/results with `let` statements.
* Rename `transp` to `permutation`.
* Change type of `transp` from `I64ArrayAttr` to `DenseI64ArrayAttr`
(provides direct access to `ArrayRef<int64_t>` instead of `ArrayAttr`).
- Better documentation.
- Rename interface methods: `source` -> `getSource`, `indices` ->
`getIndices`, etc. to conform with MLIR naming conventions. A default
implementation is not needed.
- Turn many interface methods into helper functions. Most of the
previous interface methods were not meant to be overridden, and if some
were overridden without others, the op would have been broken.
When the mask bounds of a `vector.constant_mask` exactly equal the shape
of the vector, any transfer op consuming that mask will be unaffected by
it. Drop the mask in such cases.
The IR is valid, but UB: there is an out-of-bound index for the position
to insert inside the vector. We should just ignore this in the folder.
Fixes #70884
If a dimension does not appear in the permutation map of a vector
transfer op, the size of the accessed slice in that dimension is `1`.
Before this fix, `getTransferChunkAccessed` used to return `0` for such
dimensions, which would mean that `0` elements in the underlying
tensor/memref are accessed.
Note: There is no test case that fails due to this bug and because this
interface method is currently only used in one place, it is hard to
write a regression test. This fix is in preparation of subset hoisting
functionality that will be added in subsequent commits.
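The intended behaviour can be sketched in Python. This is a model of the idea, not the implementation of `getTransferChunkAccessed`: the permutation map is represented as a list of source dimension indices, one per vector dimension, and `None` is an assumed stand-in for a broadcast result.

```python
def transfer_chunk_accessed(source_rank, permutation_results, vector_shape):
    """Size of the accessed slice in each source dimension.

    permutation_results[i]: source dim that vector dim i maps to,
    or None for a broadcast dimension (assumption of this sketch).
    """
    # Dimensions that do not appear in the map are accessed with size 1.
    chunk = [1] * source_rank
    for vector_dim, source_dim in enumerate(permutation_results):
        if source_dim is not None:
            chunk[source_dim] = vector_shape[vector_dim]
    return chunk
```

For a map `(d0, d1, d2) -> (d2, d1)` and vector shape `4x8`, this gives `[1, 8, 4]` rather than `[0, 8, 4]`.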
Adds an end-to-end test for `vector.contract` that targets SVE (i.e.
scalable vectors). Note that this requires lifting the restriction on
`vector.outerproduct` (to which `vector.contract` is lowered) that
would deem the following as invalid by the Op verifier (*):
```
vector.outerproduct %27, %28, %26 {kind = #vector.kind<add>} : vector<3xf32>, vector<[2]xf32>
```
This is indeed valid as the end-to-end test demonstrates (at least when
compiling for SVE).
This allows folding extracts from `vector.create_mask` ops that have a
known value. Currently, there's no fold for this, but the same effect is
obtained from the unrolling in LowerVectorMask (part of
-convert-vector-to-llvm) followed by subsequent folds. However, for a future
patch, this simplification needs to be done before lowering to LLVM,
hence the need for this fold.
E.g.:
```
%0 = vector.create_mask %c1, %dimA, %dimB : vector<1x[4]x[4]xi1>
%1 = vector.extract %0[0] : vector<[4]x[4]xi1> from vector<1x[4]x[4]xi1>
```
->
```
%0 = vector.create_mask %dimA, %dimB : vector<[4]x[4]xi1>
```
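A rough Python model of this fold follows. It is a sketch under assumptions, not the actual folder: mask operands are modelled as `int` when they are known constants and as SSA-name strings when dynamic, the extracted leading dimensions are assumed to be fixed-size, and the helper name is hypothetical.

```python
def fold_extract_from_create_mask(mask_operands, position):
    """Model folding vector.extract of a vector.create_mask.

    Returns the operands of the new create_mask, a list of zeros for an
    all-false result, or None when the fold does not apply.
    """
    remaining = list(mask_operands[len(position):])
    for index, bound in zip(position, mask_operands):
        if not isinstance(bound, int):
            # The bound is not known at compile time: no fold.
            return None
        if index >= bound:
            # The extracted slice is entirely masked off.
            return [0] * len(remaining)
    return remaining
```

For the example above, `fold_extract_from_create_mask([1, "%dimA", "%dimB"], [0])` gives `["%dimA", "%dimB"]`.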
This is just a slight specialization of `TypesMatchWith` that returns
success if an optional parameter is missing.
There may be other places this could help e.g.:
eb21049b4b/mlir/include/mlir/Dialect/X86Vector/X86Vector.td (L58-L59)
...but I'm leaving those to avoid some churn.
This constraint will be handy for us in some later patches. It's a
formalization of a short circuiting trick with the `comparator` of the
`TypesMatchWith` constraint (devised for #69195).
```
TypesMatchWith<
"padding type matches element type of result (if present)",
"result", "padding",
"::llvm::cast<VectorType>($_self).getElementType()",
// This returns true if no padding is present, or it's present with a type that matches the element type of `result`.
"!getPadding() || std::equal_to<>()">
```
This is a little non-obvious, so after this patch you can instead do:
```
OptionalTypesMatchWith<
"padding type matches element type of result (if present)",
"result", "padding",
"::llvm::cast<VectorType>($_self).getElementType()">
```
Recent changes (https://github.com/llvm/llvm-project/pull/66930)
disabled vector transfer ops hoisting with view-like intermediate ops.
The recommended way is to fold subview ops into transfer op indices
before invoking hoisting. As a result, transfer op indices now involve
dynamic values, where previously (with subview ops) they were static
constants. Hoisting therefore no longer kicks in, which breaks
downstream users.
To fix it, this commit enables hoisting transfer ops with dynamic
indices by using `ValueBoundsConstraintSet` to prove ranges are disjoint
in `isDisjointTransferIndices`. Because that utility is used in many
places, including op folders, this commit adds a flag to it and sets the
flag to true only for "heavy" transforms in hoisting and load-store
forwarding.
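The kind of reasoning this enables can be illustrated with a simplified Python model. It is not how `ValueBoundsConstraintSet` works internally: each index is assumed to be a pair of a symbolic base name and a constant offset, and only the case "same base, constant distance" is handled. All names here are hypothetical.

```python
def is_disjoint_transfer_indices(indices_a, indices_b, access_sizes):
    """Model a disjointness proof for two transfers on the same source.

    indices_a, indices_b: one (base, offset) pair per dimension, where
    base is a symbolic value name (or None) and offset is a constant.
    access_sizes: number of elements accessed along each dimension.
    """
    for (base_a, off_a), (base_b, off_b), size in zip(
            indices_a, indices_b, access_sizes):
        if base_a != base_b:
            # Unrelated dynamic values: nothing can be proven here.
            continue
        if abs(off_a - off_b) >= size:
            # The accessed ranges cannot overlap in this dimension.
            return True
    return False
```

Two accesses at `%i` and `%i + 4` with four elements each are proven disjoint even though `%i` is dynamic; accesses at `%i` and `%j` are conservatively treated as possibly overlapping.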
This is not yet supported and previously led to a confusing crash where
an extract op with a kDynamic marker, but no dynamic positions was
created. The verifier has also been updated to check for this, and hint
at where the problem is likely to be.
The vector.extract assembly format currently only contains the source
type, for example:
%1 = vector.extract %0[1] : vector<3x7x8xf32>
It's not immediately obvious whether this is the source or result type. This
patch improves the assembly format to make this clearer, so the above
becomes:
%1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>
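The relationship between the two types in the new format can be sketched in Python. This is an illustrative model for fixed-size vectors only, with hypothetical helper names; it assumes that extracting at full rank yields the scalar element type.

```python
def vector_type(shape, element_type):
    """Render a fixed-size vector type such as vector<7x8xf32>."""
    dims = "x".join(str(dim) for dim in shape)
    return f"vector<{dims}x{element_type}>"


def format_extract(source, position, source_shape, element_type):
    """Render a vector.extract op in the `result from source` format."""
    # The result drops one leading dimension per extraction index.
    result_shape = source_shape[len(position):]
    if result_shape:
        result_type = vector_type(result_shape, element_type)
    else:
        # Assumption: a full-rank extraction yields a scalar.
        result_type = element_type
    indices = ", ".join(str(index) for index in position)
    source_type = vector_type(source_shape, element_type)
    return f"vector.extract {source}[{indices}] : {result_type} from {source_type}"
```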
This revision pipes the fastmath attribute support through the
vector.reduction op. This seemingly simple first step already requires
quite some genuflexions, file and builder reorganization. In the
process, retire the boolean reassoc flag deep in the LLVM dialect
builders and just use the fastmath attribute.
During conversions, templated builders for predicated intrinsics are
partially cleaned up. In the future, to finalize the cleanups, one
should consider adding fastmath to the VPIntrinsic ops.
This extends `vector.constant_mask` so that mask dim sizes that
correspond to a scalable dimension are treated as if they're implicitly
multiplied by vscale. Currently this is limited to mask dim sizes of 0
or the size of the dim/vscale. This allows constant masks to represent
all true and all false scalable masks (and some variations):
```
// All true scalable mask
%mask = vector.constant_mask [8] : vector<[8]xi1>
// All false scalable mask
%mask = vector.constant_mask [0] : vector<[8]xi1>
// First two scalable rows
%mask = vector.constant_mask [2,4] : vector<4x[4]xi1>
```
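The current restriction can be written as a small Python check. This is a sketch of the rule as described above, not the Op verifier; shapes are given as base sizes with a per-dimension scalable flag, and the helper name is hypothetical.

```python
def is_valid_scalable_constant_mask(mask_dim_sizes, vector_shape, scalable_dims):
    """Model the mask-size rule for vector.constant_mask.

    mask_dim_sizes: the mask sizes, e.g. [2, 4].
    vector_shape: base dimension sizes, e.g. [4, 4] for vector<4x[4]xi1>.
    scalable_dims: one bool per dimension, True where the dim is scalable.
    """
    for size, dim, scalable in zip(mask_dim_sizes, vector_shape, scalable_dims):
        if size < 0 or size > dim:
            return False
        # Scalable dims may only be fully masked off or fully enabled.
        if scalable and size not in (0, dim):
            return False
    return True
```

The three examples above pass this check, while a mask such as `[2, 2]` on `vector<4x[4]xi1>` does not.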
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.
This commit addresses Task 1.2 of the mentioned RFC. By renaming these operations, we align their names with LLVM intrinsics that have corresponding semantics.
This is just a small fix that makes sure that `vector.contract` works
with scalable vectors.
Rather than duplicating all the roundtrip tests for vector.contract, I'm
treating scalable vectors as an edge case and just adding a couple to
verify that this works.
This was introduced before the Optional directive and uses Variadic, but
it's really optional.
Reviewed By: nicolasvasilache, benmxwl-arm, dcaballe
Differential Revision: https://reviews.llvm.org/D159259
0-D vectors are now supported, so the special case of returning just
the element type can now be removed.
A few callers that relied on the old behaviour have been updated.
Reviewed By: awarzynski, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D159122