This fixes an "infinite" loop bug, where the incoming IR was repeatedly
rewritten while adding identical cast operations. The test for
compatible types should include the notion of an encoding. If it
differs, then a
naive fusion into the consumer is invalid.
Transform interfaces are implemented, direction or via extensions, in
libraries belonging to multiple other dialects. Those dialects don't
need to depend on the non-interface part of the transform dialect, which
includes the growing number of ops and transitive dependency footprint.
Split out the interfaces into a separate library. This in turn requires
flipping the dependency from the interface on the dialect that has crept
in because both co-existed in one library. The interface shouldn't
depend on the transform dialect either.
As a consequence of splitting, the capability of the interpreter to
automatically walk the payload IR to identify payload ops of a certain
kind based on the type used for the entry point symbol argument is
disabled. This is a good move by itself as it simplifies the interpreter
logic. This functionality can be trivially replaced by a
`transform.structured.match` operation.
This PR adds promised interface declarations for all interfaces declared
in `InitAllDialects.h`.
Promised interfaces allow a dialect to declare that it will have an
implementation of a particular interface, crashing the program if one
isn't provided when the interface is used.
The current canonicalization of `memref.dim` operating on the result of
`memref.reshape` into `memref.load` is incorrect as it doesn't check
whether the `index` operand of `memref.dim` dominates the source
`memref.reshape` op. It always introduces `memref.load` right after
`memref.reshape` to ensure the `memref` is not mutated before the
`memref.load` call. As a result, the following error is observed:
```
$> mlir-opt --canonicalize input.mlir
func.func @reshape_dim(%arg0: memref<*xf32>, %arg1: memref<?xindex>, %arg2: index) -> index {
%c4 = arith.constant 4 : index
%reshape = memref.reshape %arg0(%arg1) : (memref<*xf32>, memref<?xindex>) -> memref<*xf32>
%0 = arith.muli %arg2, %c4 : index
%dim = memref.dim %reshape, %0 : memref<*xf32>
return %dim : index
}
```
results in:
```
dominator.mlir:22:12: error: operand #1 does not dominate this use
%dim = memref.dim %reshape, %0 : memref<*xf32>
^
dominator.mlir:22:12: note: see current operation: %1 = "memref.load"(%arg1, %2) <{nontemporal = false}> : (memref<?xindex>, index) -> index
dominator.mlir:21:10: note: operand defined here (op in the same block)
%0 = arith.muli %arg2, %c4 : index
```
Properly fixing this issue requires a dominator analysis which is
expensive to run within a canonicalization pattern. So, this patch fixes
the canonicalization pattern by being more strict/conservative about the
legality condition in which we perform this canonicalization.
The more general pattern is also added to `tensor.dim`. Since tensors are
immutable we don't need to worry about where to introduce the
`tensor.extract` call after canonicalization.
Before: op verifiers failed if the input and output ranks were the same
(i.e. no expansion or collapse). This behavior requires users of these
shape ops to verify manually that they are not creating identity
versions of these ops every time they build them -- problematic. This PR
removes this strict verification, and introduces folders for the the
identity cases.
The PR also removes the special case handling of rank-0 tensors for
expand_shape and collapse_shape, there doesn't seem to be any reason to
treat them differently.
The revision does not refactor the inferStaticShape for pack and unpack
ops because they can diverge quickly. Because there are more dimensions
can be inferred (i.e., with inner_tile_sizes) if the pack op does not
have padding value.
This is a follow-up of https://github.com/llvm/llvm-project/pull/80848
Previously, the `tensor.pack` verifier detects unconditional runtime
errors only when tile sizes are static. Now, dynamic tiles are
considered and we only require that the input and either corresponding
tile or output size are static to determine if it will unconditionally
produce errors at runtime.
Previously, `InsertSliceOpSourceCastInserter` was incorrectly applied to
a case when tensor types have an encoding attribute attached to them.
The type `newSrcType` was missing that attribute from the old `srcType`,
which made the expression `srcType == newSrcType` false, since
`tensor<2x2xf32, "foo">` is not equal to `tensor<2x2xf32>`. That lead to
an endless back and forth between `InsertSliceOpSourceCastInserter` that
would introduce a cast and `InsertSliceOpCastFolder` that would remove
it right after.
The canonicalization incrementally converts foldable dynamic hi/lo
padding to static hi/lo values. During this canonicalization the
static-fied valued should be removed from the dynamic values.
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`
The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.
Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
The revision moves pack/unpack related patterns to
PackAndUnpackPatterns.cpp. This follows the convention like other tensor
ops.
It also renames `populateSimplifyTensorPack` to
`populateSimplifyPackAndUnpackPatterns` and adds a TODO item for
tensor.unpack op.
This feature had been marked as `TODO` in the `tensor.splat`
documentation for a while. This MR includes:
- Support for dynamically shaped tensors in the return type of
`tensor.splat` with the syntax suggested in the `TODO` comment.
- Updated op documentation.
- Bufferization support.
- Updates in op folders affected by the new feature.
- Unit tests for valid/invalid syntax, valid/invalid folding, and
lowering through bufferization.
- Additional op builders resembling those available in `tensor.empty`.
When the concatenated dim is statically sized but the inputs are
dynamically sized, reifyResultShapes must return the static shape. Fixes
the implementation of the interface for tensor.concat in such cases.
The folder for `tensor.extract` is not operating correctly when it is
consuming the result of a `tensor.from_elements` operation.
The existing unit test named `@extract_from_tensor.from_elements_3d` in
`mlir/test/Dialect/Tensor/canonicalize.mlir` seems an attempt to stress
this code. However, this unit tests creates a `tensor.from_elements` op
exclusively from constants, which gets folded away into a single
constant tensor. Therefore, the buggy code was never executed in unit
tests.
I have added a new unit test named
`@extract_from_tensor.from_elements_variable_3d` that makes sure the
`tensor.from_elements` op is not folded away by having its input
operands come directly from function arguments. The original folder code
would have made this test fail.
This bug was notably affecting the lowering of the `tosa.pad` op in the
`tosa-to-tensor` pass, where the generated code is likely to contain a
`tensor.from_elements` + `tensor.extract` op sequence.
Op verifiers should verify only local properties of an op. The dynamic
sizes of a `tensor.generate` op should not be verified. Dynamic sizes
that have a negative constant value should not prevent the
`tensor.generate` op from verifying.
Also share some code between the `tensor.empty` and `tensor.generate`
"dynamic dim -> static dim" canonicalization patterns.
Remove the `invalid-canonicalize.mlir` file and move the test case to
`canonicalize.mlir`. Canonicalization no longer produces IR that does
not verify (and leaves the op as is).
Without folding the result of the initial tensor.dim, the
ReifyResultShapes implementation would be incorrect because it would
return a dynamic shape for a static result shape.
This adds an operation for concatenating ranked tensors along a static
dimension, as well as a decomposition mirroring the existing lowering
from TOSA to Tensor. This offers a convergence point for "input" like
dialects that include various lowerings for concatenation operations,
easing later analysis. In the future, this op can implement the
necessary interfaces for tiling, as well as potentially add conversions
to some kind of linalg and/or memref counterpart.
This patch adds the op, the decomposition, and some basic
folding/canonicalization. Replacing lowerings with the op (such as the
TOSA lowering) will come as a follow up.
See
https://discourse.llvm.org/t/rfc-tensor-add-a-tensor-concatenate-operation/74858
This commit fixes a crash of the canonicalizer when there are slice ops
with offset/size SSA values that have a negative constant value. Such
ops are invalid if they are reachable and their offsets/sizes should not
be folded to static integer values. (But such ops may appear in
non-reachable block.)
This commit fixes#71150.
In #71153, the `memref.subview` canonicalizer crashes due to a negative
`size` being passed as an operand. During `SubViewOp::verify` this
negative `size` is not yet detectable since it is dynamic and only
available after constant folding, which happens during the
canonicalization passes. As discussed in
<https://discourse.llvm.org/t/rfc-more-opfoldresult-and-mixed-indices-in-ops-that-deal-with-shaped-values/72510>,
the verifier should not be extended as it should "only verify local
aspects of an operation".
This patch fixes#71153 by not folding in aforementioned situation.
Also, this patch adds a basic offset and size check in the
`OffsetSizeAndStrideOpInterface` verifier.
Note: only `offset` and `size` are checked because `stride` is allowed
to be negative
(54d81e49e3).
Fixes https://github.com/llvm/llvm-project/issues/60656.
This patch implements a basic fold for various reshape/resize tensor
operations. Specifically, the folding removes tensor reshape/resize ops
when they are applied to a constant tensor. For example, the following
function:
```mlir
func.func @main(%dest : tensor<8x16x8x32xf32>) -> tensor<8x16x8x32xf32> {
%cst = arith.constant dense<1.000000e-01> : tensor<64x128xf32>
%0 = tensor.pack %cst outer_dims_perm = [1, 0] inner_dims_pos = [0, 1]
inner_tiles = [8, 32] into %dest : tensor<64x128xf32> -> tensor<8x16x8x32xf32>
return %0 : tensor<8x16x8x32xf32>
}
```
will be changed into the following with `mlir-opt -canonicalize`:
```mlir
func.func @main(%arg0: tensor<8x16x8x32xf32>) -> tensor<8x16x8x32xf32> {
%cst = arith.constant dense<1.000000e-01> : tensor<8x16x8x32xf32>
return %cst : tensor<8x16x8x32xf32>
}
```
As a side-note, this patch is essentially an extension of
f79f430d4b.
The destination operand of the `tensor.unpack` operation is only needed
to carry shape information. So if the producer of the destination
operand implements the `DestinationStyleOpInterface`, then fold it into
the `tensor.unpack` operation by replacing the destination operand with
the destination for the source.
`scf::ForallOp` has `shared_outs` tensor operands which are used to
insert partial results into in the parallel terminator. The
`scf::ForallOp` returns one tensor for each `shared_out` which then
contains the combined result from all threads. Since the parallel
terminator cannot change the shape of the `shared_out`, ForallOp is a
`DestinationStyleOp` and this patch implements the interface by
declaring the `outputs` operands as `inits` in the language of the DPS
interface.
For this change to work, we need to add an exception to the Pattern that
folds `tensor.cast` Ops into DPS Ops because `scf::Forall` needs special
handling of its `BlockArgument` Type during this folding.
This patch addresses a crash that occurs when negative dynamic sizes are
provided in tensor.emptyOp by adding a check to ensure that dynamic
sizes are non-negative.
Fixes#64064
This enables canonicalization to fold away unnecessary tensor.dim ops
which in turn enables folding away of other operations, as can be seen
in conv_tensors_dynamic where affine.min operations were folded away.
This commit provides a default implementation for all ops that implement
the `DestinationStyleOpInterface`. Result values of such ops are tied to
operand, and those have the same type.
`reifyResultShapes` should return an `Attribute` if and only if the respective dimension is static.
This fixes#64256.
Differential Revision: https://reviews.llvm.org/D158166
We are able to fuse the pack op only if inner tiles are not tiled or
they are fully used. Otherwise, it could generate a sequence of
non-trivial ops.
Differential Revision: https://reviews.llvm.org/D157932
* Move `foldDynamicIndexList` to `DialectUtils` and simplify function.
* Move `OpWithOffsetSizesAndStridesConstantArgumentFolder` to `ViewLikeInterface` and add documentation.
Differential Revision: https://reviews.llvm.org/D156581
In https://reviews.llvm.org/D151611, a check was added to the tensor verifier to
emit an error on negative tensor dimensions. This check allowed for dynamic
dimensions, hence negative dimensions were still able to get through the verifier.
This is a problem in situations such as #60558, where the dynamic dimension is
converted to a static (and possibly negative) dimension by another pass in the
compiler. This patch fixes that by doing another check during the
`StaticTensorGenerate` conversion, and return a failure if the dimension is
negative.
As a side-note, I have to admit that I do not know why returning a failure in
`StaticTensorGenerate` gives a nice "tensor dimensions must be non-negative"
error. I suspect that the verifier runs again when `return failure()` is called,
but I am not sure.
Fixes#60558.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D155728
* Remove duplicate functions. `tensor::getMixedSize` and `tensor::getMixedSizes` should be used.
* Use `tensor::getMixedSize` instead of `createOrFold<tensor::DimOp>`. This is more efficient. `createOrFold` will create an op an immediately try to fold it. In case of a static dimension size, an attribute can be used directly.
Differential Revision: https://reviews.llvm.org/D153332
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.
Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This patch updates all remaining uses of the deprecated functionality in
mlir/. This was done with clang-tidy as described below and further
modifications to GPUBase.td and OpenMPOpsInterfaces.td.
Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
additional check:
main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
them to a pure state.
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-header-filter=mlir/ mlir/* -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
```
Differential Revision: https://reviews.llvm.org/D151542
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.
Context:
* https://mlir.llvm.org/deprecation/ at "Use the free function variants for dyn_cast/cast/isa/…"
* Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This follows a previous patch that updated calls
`op.cast<T>()-> cast<T>(op)`. However some cases could not handle an
unprefixed `cast` call due to occurrences of variables named cast, or
occurring inside of class definitions which would resolve to the method.
All C++ files that did not work automatically with `cast<T>()` are
updated here to `llvm::cast` and similar with the intention that they
can be easily updated after the methods are removed through a
find-replace.
See https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
for the clang-tidy check that is used and then update printed
occurrences of the function to include `llvm::` before.
One can then run the following:
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-export-fixes /tmp/cast/casts.yaml mlir/*\
-header-filter=mlir/ -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
```
Differential Revision: https://reviews.llvm.org/D150348