Commit Graph

247 Commits

Author SHA1 Message Date
Han-Chung Wang
c3e3d59fab [mlir][tensor] Fix tensor::PackOp fold() handling of padding value (#87296)
We can't just check if it is a splat constant or not. We should also
check if the value match.
2024-04-02 13:49:28 -07:00
Prashant Kumar
aa7ae1ba0b [mlir][tensor] Fold producer linalg transpose with consumer unpack an… (#86795)
…d viceversa

-- Adds folding of producer linalg transpose op with consumer unpack op,
also adds folding of producer unpack op and consumer transpose op.
-- Minor bug fixes w.r.t. to the test cases.
2024-03-28 23:13:33 +05:30
Jerry Wu
f566b079f1 [MLIR] Add pattern to fold insert_slice of extract_slice (#86328)
Fold the `tensor.insert_slice` of `tensor.extract_slice` into
`tensor_extract_slice` when the `insert_slice` simply expand some unit
dims dropped by the `extract_slice`.
2024-03-28 11:18:47 -04:00
Jianbang Yang
4bb9f918ff [mlir][tensor] fix out-of-bound index in tensor.dim (#85901)
fix a crash when fold tensor.dim with out-of-bound index.

Fixes: https://github.com/llvm/llvm-project/issues/70183
2024-03-25 21:08:18 +08:00
Matthias Springer
35d3b3430e [mlir][bufferization] Add "bottom-up from terminators" analysis heuristic (#83964)
One-Shot Bufferize currently does not support loops where a yielded
value bufferizes to a buffer that is different from the buffer of the
region iter_arg. In such a case, the bufferization fails with an error
such as:
```
Yield operand #0 is not equivalent to the corresponding iter bbArg
    scf.yield %0 : tensor<5xf32>
```

One common reason for non-equivalent buffers is that an op on the path
from the region iter_arg to the terminator bufferizes out-of-place. Ops
that are analyzed earlier are more likely to bufferize in-place.

This commit adds a new heuristic that gives preference to ops that are
reachable on the reverse SSA use-def chain from a region terminator and
are within the parent region of the terminator. This is expected to work
better than the existing heuristics for loops where an iter_arg is
written to multiple times within a loop, but only one write is fed into
the terminator.

Current users of One-Shot Bufferize are not affected by this change.
"Bottom-up" is still the default heuristic. Users can switch to the new
heuristic manually.

This commit also turns the "fuzzer" pass option into a heuristic,
cleaning up the code a bit.
2024-03-21 14:16:02 +09:00
Oleksandr "Alex" Zinenko
5a9bdd85ee [mlir] split transform interfaces into a separate library (#85221)
Transform interfaces are implemented, direction or via extensions, in
libraries belonging to multiple other dialects. Those dialects don't
need to depend on the non-interface part of the transform dialect, which
includes the growing number of ops and transitive dependency footprint.

Split out the interfaces into a separate library. This in turn requires
flipping the dependency from the interface on the dialect that has crept
in because both co-existed in one library. The interface shouldn't
depend on the transform dialect either.

As a consequence of splitting, the capability of the interpreter to
automatically walk the payload IR to identify payload ops of a certain
kind based on the type used for the entry point symbol argument is
disabled. This is a good move by itself as it simplifies the interpreter
logic. This functionality can be trivially replaced by a
`transform.structured.match` operation.
2024-03-20 22:15:17 +01:00
Sayan Saha
26722f5b61 [MLIR] Fix incorrect memref::DimOp canonicalization, add tensor::DimOp canonicalization (#84225)
The current canonicalization of `memref.dim` operating on the result of
`memref.reshape` into `memref.load` is incorrect as it doesn't check
whether the `index` operand of `memref.dim` dominates the source
`memref.reshape` op. It always introduces `memref.load` right after
`memref.reshape` to ensure the `memref` is not mutated before the
`memref.load` call. As a result, the following error is observed:

```
$> mlir-opt --canonicalize input.mlir

func.func @reshape_dim(%arg0: memref<*xf32>, %arg1: memref<?xindex>, %arg2: index) -> index {
    %c4 = arith.constant 4 : index
    %reshape = memref.reshape %arg0(%arg1) : (memref<*xf32>, memref<?xindex>) -> memref<*xf32>
    %0 = arith.muli %arg2, %c4 : index
    %dim = memref.dim %reshape, %0 : memref<*xf32>
    return %dim : index
  }
```

results in:

```
dominator.mlir:22:12: error: operand #1 does not dominate this use
    %dim = memref.dim %reshape, %0 : memref<*xf32>
           ^
dominator.mlir:22:12: note: see current operation: %1 = "memref.load"(%arg1, %2) <{nontemporal = false}> : (memref<?xindex>, index) -> index
dominator.mlir:21:10: note: operand defined here (op in the same block)
    %0 = arith.muli %arg2, %c4 : index
```

Properly fixing this issue requires a dominator analysis which is
expensive to run within a canonicalization pattern. So, this patch fixes
the canonicalization pattern by being more strict/conservative about the
legality condition in which we perform this canonicalization.
The more general pattern is also added to `tensor.dim`. Since tensors are
immutable we don't need to worry about where to introduce the
`tensor.extract` call after canonicalization.
2024-03-11 19:37:33 -07:00
James Newling
67ef4ae2c3 [MLIR][Tensor,MemRef] Fold expand_shape and collapse_shape if identity (#80658)
Before: op verifiers failed if the input and output ranks were the same
(i.e. no expansion or collapse). This behavior requires users of these
shape ops to verify manually that they are not creating identity
versions of these ops every time they build them -- problematic. This PR
removes this strict verification, and introduces folders for the the
identity cases.

The PR also removes the special case handling of rank-0 tensors for
expand_shape and collapse_shape, there doesn't seem to be any reason to
treat them differently.
2024-03-12 10:11:58 +09:00
Max191
e3b93a1620 [mlir] Fix bug in pack and unpack op canonicalization for folding dynamic dims (#82539)
This PR fixes a bug in the inference of pack and unpack static shapes
that should be using an inverse permutation.
2024-02-28 17:39:22 -05:00
Han-Chung Wang
eac8604d98 [mlir][tensor] Add support for tensor.unpack static shapes inference. (#81702)
The revision does not refactor the inferStaticShape for pack and unpack
ops because they can diverge quickly. Because there are more dimensions
can be inferred (i.e., with inner_tile_sizes) if the pack op does not
have padding value.

This is a follow-up of https://github.com/llvm/llvm-project/pull/80848
2024-02-19 16:26:12 -08:00
srcarroll
9466c4e629 [MLIR][tensor] Improve tensor.pack verifier to catch more cases with unconditional runtime errors (#77217)
Previously, the `tensor.pack` verifier detects unconditional runtime
errors only when tile sizes are static. Now, dynamic tiles are
considered and we only require that the input and either corresponding
tile or output size are static to determine if it will unconditionally
produce errors at runtime.
2024-02-19 12:27:24 -06:00
Han-Chung Wang
bc08cc2ac8 [mlir][tensor] Add support for tensor.pack static shapes inference. (#80848)
Fixes https://github.com/openxla/iree/issues/16317
2024-02-13 20:20:24 -08:00
Alexey Z
4759890f85 [mlir][tensor] Fix bug in insert_slice canonical. with tensor encoding (#81045)
Previously, `InsertSliceOpSourceCastInserter` was incorrectly applied to
a case when tensor types have an encoding attribute attached to them.
The type `newSrcType` was missing that attribute from the old `srcType`,
which made the expression `srcType == newSrcType` false, since
`tensor<2x2xf32, "foo">` is not equal to `tensor<2x2xf32>`. That lead to
an endless back and forth between `InsertSliceOpSourceCastInserter` that
would introduce a cast and `InsertSliceOpCastFolder` that would remove
it right after.
2024-02-08 20:22:27 -05:00
Rob Suderman
70eb0e37a8 [mlir][tensor] Fix tensor.pad to remove newly static values (#79938)
The canonicalization incrementally converts foldable dynamic hi/lo
padding to static hi/lo values. During this canonicalization the
static-fied valued should be removed from the dynamic values.
2024-01-29 20:32:15 -08:00
MaheshRavishankar
76ead96c1d [mlir][TilingInterface] Use LoopLikeOpInterface in tiling using SCF to unify tiling with scf.for and scf.forall. (#77874)
Using `LoopLikeOpInterface` as the basis for the implementation unifies
all the tiling logic for both `scf.for` and `scf.forall`. The only
difference is the actual loop generation. This is a follow up to
https://github.com/llvm/llvm-project/pull/72178
Instead of many entry points for each loop type, the loop type is now
passed as part of the options passed to the tiling method.

This is a breaking change with the following changes

1) The `scf::tileUsingSCFForOp` is renamed to `scf::tileUsingSCF`
2) The `scf::tileUsingSCFForallOp` is deprecated. The same
   functionality is obtained by using `scf::tileUsingSCF` and setting
   the loop type in `scf::SCFTilingOptions` passed into this method to
   `scf::SCFTilingOptions::LoopType::ForallOp` (using the
   `setLoopType` method).
3) The `scf::tileConsumerAndFusedProducerGreedilyUsingSCFForOp` is
   renamed to `scf::tileConsumerAndFuseProducerUsingSCF`. The use of
   the `controlFn` in `scf::SCFTileAndFuseOptions` allows implementing
   any strategy with the default callback implemeting the greedy fusion.
4) The `scf::SCFTilingResult` and `scf::SCFTileAndFuseResult` now use
   `SmallVector<LoopLikeOpInterface>`.
5) To make `scf::ForallOp` implement the parts of
   `LoopLikeOpInterface` needed, the `getOutputBlockArguments()`
   method is replaced with `getRegionIterArgs()`

These changes now bring the tiling and fusion capabilities using
`scf.forall` on par with what was already supported by `scf.for`
2024-01-25 21:26:23 -08:00
Han-Chung Wang
ad3cda7a04 [mlir][tensor] Enhance SimplifyUnPackToCollapseShape for unit dim cases. (#79262) 2024-01-25 06:54:33 -08:00
Han-Chung Wang
f59eef6515 [mlir][tensor] Enhance SimplifyPackToExpandShape for unit dim cases. (#79247)
Progress on https://github.com/openxla/iree/issues/16181
2024-01-24 18:47:25 -08:00
Quinn Dawkins
42b160356f [mlir][transform] Add an op for replacing values with function calls (#78398)
Adds `transform.func.cast_and_call` that takes a set of inputs and
outputs and replaces the uses of those outputs with a call to a function
at a specified insertion point.

The idea with this operation is to allow users to author independent IR
outside of a to-be-compiled module, and then match and replace a slice
of the program with a call to the external function.

Additionally adds a mechanism for populating a type converter with a set
of conversion materialization functions that allow insertion of
casts on the inputs/outputs to and from the types of the function
signature.
2024-01-19 13:21:52 -05:00
lorenzo chelini
6bc7e3764c [MLIR][Tensor] Fix checks for fold-into-pack-and-unpack.mlir (#77622)
Fix after 113bce0
2024-01-10 11:23:02 -06:00
Han-Chung Wang
2472c45ba3 [mlir][tensor] Enhance pack/unpack simplification for identity outer_dims_perm cases. (#77409)
They can be simplified to reshape ops if outer_dims_perm is an identity
permutation. The revision adds a `isIdentityPermutation` method to
IndexingUtils.
2024-01-10 08:30:34 -08:00
Prathamesh Tagore
113bce0c79 [mlir][tensor] Fold producer linalg transpose with consumer tensor pack (#75658)
Successor to https://github.com/llvm/llvm-project/pull/74206 

Partial fix to https://github.com/openxla/iree/issues/15367
2024-01-10 06:55:27 -08:00
Matthias Springer
bb6d5c2200 [mlir][Transforms] GreedyPatternRewriteDriver: Do not CSE constants during iterations (#75897)
The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply
rewrite patterns to ops. It has special handling for constants: they are
CSE'd and sometimes moved to parent regions to allow for additional
CSE'ing. This happens in `OperationFolder`.

To allow for efficient CSE'ing, `OperationFolder` maintains an internal
lookup data structure to find the existing constant ops with the same
value for each `IsolatedFromAbove` region:
```c++
/// A mapping between an insertion region and the constants that have been
/// created within it.
DenseMap<Region *, ConstantMap> foldScopes;
```

Rewrite patterns are allowed to modify operations. In particular, they
may move operations (including constants) from one region to another
one. Such an IR rewrite can make the above lookup data structure
inconsistent.

We encountered such a bug in a downstream project. This bug materialized
in the form of an op that uses the result of a constant op from a
different `IsolatedFromAbove` region (that is not accessible).

This commit changes the behavior of the `GreedyPatternRewriteDriver`
such that `OperationFolder` is used to CSE constants at the beginning of
each iteration (as the worklist is populated), but no longer during an
iteration. `OperationFolder` is no longer used after populating the
worklist, so we do not have to care about inconsistent state in the
`OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver`
now performs the op folding by itself instead of calling
`OperationFolder::tryToFold`.

This change changes the order of constant ops in test cases, but not the
region in which they appear. All broken test cases were fixed by turning
`CHECK` into `CHECK-DAG`.

Alternatives considered: The state of `OperationFolder` could be
partially invalidated with every `notifyOperationModified` notification.
That is more fragile than the solution in this commit because incorrect
rewriter API usage can lead to missing notifications and hard-to-debug
`IsolatedFromAbove` violations. (It did not fix the above mention bug in
a downstream project, which could be due to incorrect rewriter API usage
or due to another conceptual problem that I missed.) Moreover, ops are
frequently getting modified during a greedy pattern rewrite, so we would
likely keep invalidating large parts of the state of `OperationFolder`
over and over.

Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant
ops are no longer folded during a greedy pattern rewrite. If you rely on
folding (and rematerialization) of constant ops during a greedy pattern
rewrite, turn the folder into a pattern.
2024-01-05 09:22:18 +01:00
Han-Chung Wang
76cb0bb7a4 [mlir][tensor] Add a pattern to simplify tensor.unpack to collpase shape (#76607) 2024-01-03 09:34:52 -08:00
Han-Chung Wang
78348b6915 [mlir][tensor] Improve tensor.pack simplication pattern. (#76606)
A tensor.pack op can be rewritten to a tensor.expand_shape op if the
packing only happens on inner most dimension.

This also formats the lit checks better.
2024-01-02 09:34:24 -08:00
Han-Chung Wang
4b14205bc0 [mlir][tensor] Centralize pack/unpack related patterns. (#76603)
The revision moves pack/unpack related patterns to
PackAndUnpackPatterns.cpp. This follows the convention like other tensor
ops.

It also renames `populateSimplifyTensorPack` to
`populateSimplifyPackAndUnpackPatterns` and adds a TODO item for
tensor.unpack op.
2023-12-30 11:40:40 -08:00
Rafael Ubal
214d32ccd2 Support for dynamic dimensions in 'tensor.splat' (#74626)
This feature had been marked as `TODO` in the `tensor.splat`
documentation for a while. This MR includes:

- Support for dynamically shaped tensors in the return type of
`tensor.splat` with the syntax suggested in the `TODO` comment.

- Updated op documentation.

- Bufferization support.

- Updates in op folders affected by the new feature.

- Unit tests for valid/invalid syntax, valid/invalid folding, and
lowering through bufferization.

- Additional op builders resembling those available in `tensor.empty`.
2023-12-15 13:54:45 +00:00
Quinn Dawkins
fcd54b368e [mlir][tensor] Fix tensor.concat reifyResultShapes for static result dims (#75558)
When the concatenated dim is statically sized but the inputs are
dynamically sized, reifyResultShapes must return the static shape. Fixes
the implementation of the interface for tensor.concat in such cases.
2023-12-15 08:43:58 -05:00
Prathamesh Tagore
f397bdf5ae [mlir][tensor] Fold consumer linalg transpose with producer tensor pack (#74206)
Partial fix to https://github.com/openxla/iree/issues/15367
2023-12-13 14:26:19 -08:00
Rafael Ubal
a8f3860bcb [mlir][tensor] Fix bug in tensor.extract(tensor.from_elements) folder (#75109)
The folder for `tensor.extract` is not operating correctly when it is
consuming the result of a `tensor.from_elements` operation.

The existing unit test named `@extract_from_tensor.from_elements_3d` in
`mlir/test/Dialect/Tensor/canonicalize.mlir` seems an attempt to stress
this code. However, this unit tests creates a `tensor.from_elements` op
exclusively from constants, which gets folded away into a single
constant tensor. Therefore, the buggy code was never executed in unit
tests.

I have added a new unit test named
`@extract_from_tensor.from_elements_variable_3d` that makes sure the
`tensor.from_elements` op is not folded away by having its input
operands come directly from function arguments. The original folder code
would have made this test fail.

This bug was notably affecting the lowering of the `tosa.pad` op in the
`tosa-to-tensor` pass, where the generated code is likely to contain a
`tensor.from_elements` + `tensor.extract` op sequence.
2023-12-12 15:36:52 +00:00
Matthias Springer
75f6cad8e9 [mlir][tensor] tensor.generate: do not verify dynamic sizes (#74568)
Op verifiers should verify only local properties of an op. The dynamic
sizes of a `tensor.generate` op should not be verified. Dynamic sizes
that have a negative constant value should not prevent the
`tensor.generate` op from verifying.

Also share some code between the `tensor.empty` and `tensor.generate`
"dynamic dim -> static dim" canonicalization patterns.

Remove the `invalid-canonicalize.mlir` file and move the test case to
`canonicalize.mlir`. Canonicalization no longer produces IR that does
not verify (and leaves the op as is).
2023-12-07 08:36:07 +09:00
Quinn Dawkins
005c83380a [mlir][tensor] Fix ReifyResultShapes implementation for tensor.concat (#74157)
Without folding the result of the initial tensor.dim, the
ReifyResultShapes implementation would be incorrect because it would
return a dynamic shape for a static result shape.
2023-12-01 19:29:56 -05:00
Quinn Dawkins
f310a5d2c1 [mlir][tensor] Add a tensor.concat operation (#72779)
This adds an operation for concatenating ranked tensors along a static
dimension, as well as a decomposition mirroring the existing lowering
from TOSA to Tensor. This offers a convergence point for "input" like
dialects that include various lowerings for concatenation operations,
easing later analysis. In the future, this op can implement the
necessary interfaces for tiling, as well as potentially add conversions
to some kind of linalg and/or memref counterpart.

This patch adds the op, the decomposition, and some basic
folding/canonicalization. Replacing lowerings with the op (such as the
TOSA lowering) will come as a follow up.

See
https://discourse.llvm.org/t/rfc-tensor-add-a-tensor-concatenate-operation/74858
2023-12-01 15:05:29 -05:00
Han-Chung Wang
171cac95a7 [mlir][tensor] Fold padding_value away for pack ops when possible. (#74005)
If we can infer statically that there are no incomplete tiles, we can
remove the optional padding operand.

Fixes https://github.com/openxla/iree/issues/15417
2023-12-01 11:12:58 -08:00
Matthias Springer
68386a74ba [mlir][tensor] Fix crash when canonicalizing invalid IR (#72888)
This commit fixes a crash of the canonicalizer when there are slice ops
with offset/size SSA values that have a negative constant value. Such
ops are invalid if they are reachable and their offsets/sizes should not
be folded to static integer values. (But such ops may appear in
non-reachable block.)

This commit fixes #71150.
2023-11-21 09:20:18 +01:00
MaheshRavishankar
4a020018ce [NFC] Simplify the tiling implementation using cloning. (#72178)
The current implementation of tiling using `scf.for` is convoluted to
make sure that the destination passing style of the untiled program is
preserved. The addition of support to tile using `scf.forall` (adapted
from the transform operation in Linalg) in
https://github.com/llvm/llvm-project/pull/67083 used cloning of the
tiled operations to better streamline the implementation. This PR adapts
the other tiling methods to use a similar approach, making the
transformations (and handling destination passing style semantics) more
systematic.

---------

Co-authored-by: Abhishek-Varma <avarma094@gmail.com>
2023-11-20 09:05:48 -08:00
Rik Huijzer
d0da3d8393 [mlir][tensor] Fold when source is const (#71643)
Fixes https://github.com/llvm/llvm-project/issues/60656.

This patch implements a basic fold for various reshape/resize tensor
operations. Specifically, the folding removes tensor reshape/resize ops
when they are applied to a constant tensor. For example, the following
function:

```mlir
func.func @main(%dest : tensor<8x16x8x32xf32>) -> tensor<8x16x8x32xf32> {
  %cst = arith.constant dense<1.000000e-01> : tensor<64x128xf32>
  %0 = tensor.pack %cst outer_dims_perm = [1, 0] inner_dims_pos = [0, 1]
    inner_tiles = [8, 32] into %dest : tensor<64x128xf32> -> tensor<8x16x8x32xf32>
  return %0 : tensor<8x16x8x32xf32>
}
```
will be changed into the following with `mlir-opt -canonicalize`:
```mlir
func.func @main(%arg0: tensor<8x16x8x32xf32>) -> tensor<8x16x8x32xf32> {
  %cst = arith.constant dense<1.000000e-01> : tensor<8x16x8x32xf32>
  return %cst : tensor<8x16x8x32xf32>
}
```

As a side-note, this patch is essentially an extension of
f79f430d4b.
2023-11-09 20:36:32 +01:00
MaheshRavishankar
14e7846d6e [mlir][Tensor] Fold destination-style ops into tensor.unpack operation. (#71468)
The destination operand of the `tensor.unpack` operation is only needed
to carry shape information. So if the producer of the destination
operand implements the `DestinationStyleOpInterface`, then fold it into
the `tensor.unpack` operation by replacing the destination operand with
the destination for the source.
2023-11-07 21:42:32 -08:00
Oleksandr "Alex" Zinenko
e4384149b5 [mlir] use transform-interpreter in test passes (#70040)
Update most test passes to use the transform-interpreter pass instead of
the test-transform-dialect-interpreter-pass. The new "main" interpreter
pass has a named entry point instead of looking up the top-level op with
`PossibleTopLevelOpTrait`, which is arguably a more understandable
interface. The change is mechanical, rewriting an unnamed sequence into
a named one and wrapping the transform IR in to a module when necessary.

Add an option to the transform-interpreter pass to target a tagged
payload op instead of the root anchor op, which is also useful for repro
generation.

Only the test in the transform dialect proper and the examples have not
been updated yet. These will be updated separately after a more careful
consideration of testing coverage of the transform interpreter logic.
2023-10-24 16:12:34 +02:00
Adrian Kuegel
1c27899e24 [mlir][SCF] Pass result of getAsOpFoldResult to getBoundedTileSize.
A recent change modified the parameter tileSize from Value to
OpFoldResult. Therefore we should call getAsOpFoldResult before passing
on the tileSize.
Adjust a test regarding this new behavior.
2023-10-20 10:25:32 +00:00
Matthias Springer
ea71d2d0fe [mlir][tensor][bufferize] Reshapes: Fix memory side effects and memory space (#68195)
* `tensor.collapse_shape` may bufferize to a memory read because the op
may have to reallocate the source buffer.
* `tensor.reshape` should not use `bufferization.clone` for
reallocation. This op has requirements wrt. the order of buffer
writes/reads. Use `memref.alloc` and `memref.copy` instead. Also fix a
bug where the memory space of the source buffer was not propagated to
the reallocated buffer.
2023-10-05 14:33:04 +02:00
Matthias Springer
464dfeba44 [mlir][tensor][bufferize] tensor.empty bufferizes to an allocation (#68080)
Make `tensor.empty` bufferizable, so that the
`-empty-tensor-to-alloc-tensor` pass becomes optional. This makes the
bufferization easier to use. `tensor.empty` used to be non-bufferizable,
so that there two separate ops, one that can be optimized away
(`tensor.empty`) and one that is guaranteed to bufferize to an
allocation (`bufferization.alloc_tensor`). With the recent improvements
of "empty tensor elimination" this is no longer needed and
`bufferization.alloc_tensor` can be phased out.
2023-10-03 16:00:37 +02:00
Oleksandr "Alex" Zinenko
96ff0255f2 [mlir] cleanup of structured.tile* transform ops (#67320)
Rename and restructure tiling-related transform ops from the structured
extension to be more homogeneous. In particular, all ops now follow a
consistent naming scheme:

 - `transform.structured.tile_using_for`;
 - `transform.structured.tile_using_forall`;
 - `transform.structured.tile_reduction_using_for`;
 - `transform.structured.tile_reduction_using_forall`.

This drops the "_op" naming artifact from `tile_to_forall_op` that
shouldn't have been included in the first place, consistently specifies
the name of the control flow op to be produced for loops (instead of
`tile_reduction_using_scf` since `scf.forall` also belongs to `scf`),
and opts for the `using` connector to avoid ambiguity.

The loops produced by tiling are now systematically placed as *trailing*
results of the transform op. While this required changing 3 out of 4 ops
(except for `tile_using_for`), this is the only choice that makes sense
when producing multiple `scf.for` ops that can be associated with a
variadic number of handles. This choice is also most consistent with
*other* transform ops from the structured extension, in particular with
fusion ops, that produce the structured op as the leading result and the
loop as the trailing result.
2023-09-26 09:14:29 +02:00
Spenser Bauman
0a0c7e8978 [mlir][tensor] Bufferize tensor.reshape with non-identity layouts (#65654)
Bufferization of tensor.reshape generates a memref.reshape operation.
memref.reshape requires the source memref to have an identity layout.
The bufferization process may result in the source memref having a
non-identity layout, resulting in a verification failure.

This change causes the bufferization interface for tensor.reshape to
copy the source memref to a new buffer when the source has a
non-identity layout.
2023-09-19 09:50:43 +09:00
Martin Erhart
6bf043e743 [mlir][bufferization] Remove allow-return-allocs and create-deallocs pass options, remove bufferization.escape attribute (#66619)
This commit removes the deallocation capabilities of
one-shot-bufferization. One-shot-bufferization should never deallocate
any memrefs as this should be entirely handled by the
ownership-based-buffer-deallocation pass going forward. This means the
`allow-return-allocs` pass option will default to true now,
`create-deallocs` defaults to false and they, as well as the escape
attribute indicating whether a memref escapes the current region, will
be removed. A new `allow-return-allocs-from-loops` option is added as a
temporary workaround for some bufferization limitations.
2023-09-18 16:44:48 +02:00
Martin Erhart
c199f7dc62 Revert "[mlir][bufferization] Remove allow-return-allocs and create-deallocs pass options, remove bufferization.escape attribute"
This reverts commit 6a91dfedeb.

This caused problems in downstream projects. We are reverting to give
them more time for integration.
2023-09-13 13:53:48 +00:00
Martin Erhart
6a91dfedeb [mlir][bufferization] Remove allow-return-allocs and create-deallocs pass options, remove bufferization.escape attribute
This is the first commit in a series with the goal to rework the
BufferDeallocation pass. Currently, this pass heavily relies on copies
to perform correct deallocations, which leads to very slow code and
potentially high memory usage. Additionally, there are unsupported cases
such as returning memrefs which this series of commits aims to add
support for as well.

This first commit removes the deallocation capabilities of
one-shot-bufferization.One-shot-bufferization should never deallocate any
memrefs as this should be entirely handled by the buffer-deallocation pass
going forward. This means the allow-return-allocs pass option will
default to true now, create-deallocs defaults to false and they, as well
as the escape attribute indicating whether a memref escapes the current region,
will be removed.

The documentation should w.r.t. these pass option changes should also be
updated in this commit.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D156662
2023-09-13 09:30:22 +00:00
Kohei Yamaguchi
ca8cf90c8c [mlir][tensor] Check the EmptyOp's dynamicSize to be non-negative (#65577)
This patch addresses a crash that occurs when negative dynamic sizes are
provided in tensor.emptyOp by adding a check to ensure that dynamic
sizes are non-negative.

Fixes #64064
2023-09-10 18:38:54 -07:00
lorenzo chelini
e5137e7c33 [MLIR][Linalg] Retire tile_to_scf_for (#65633)
Both `TileOp` and `TileToScfForOp` use the tiling interface and the
`tileUsingSCFForOp` method. This duplication was introduced in
44cfea0279
as a way to retire `linalg::tileLinalgOp,` now there is not more need
for this duplication, and it seems that `tileOp` has more recent
changes, thus retire `TileToScfForOp.`
2023-09-07 16:13:23 -04:00
Christopher Bate
9bd19bb703 [mlir][tensor] Fix bug in utility tensor::isCastLikeExtractSliceOp
Fixes an issue where `isCastLikeExtractSliceOp` did not account for the fact
that `tensor.extract_slice` may drop non-unit dimensions. This change makes the
utility function behave inline with its name/description. The only user of this
function is in the `FindPayloadReplacementOpInterface` for the
`tensor::ExtractSliceOp`. This can potentially cause downstream projects to have
more "listener could not find replacement op" errors when interpreting Transform
IR, but the behavior is inline with the documented conservative behavior of the
Transform dialect's TrackingListener.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D158635
2023-08-28 11:17:11 -06:00
Matthias Springer
dfa96cfd7c [mlir][tensor] Fix ReifyRankedShapedTypeOpInterface impl. of reshape ops
`reifyResultShapes` should return an `Attribute` if and only if the respective dimension is static.

This fixes #64256.

Differential Revision: https://reviews.llvm.org/D158166
2023-08-24 12:23:10 +02:00