Commit Graph

1461 Commits

Author SHA1 Message Date
Jakub Kuderski
560564f51c [mlir][vector][gpu] Align minf/maxf reduction kind names with arith (#75901)
This is to avoid confusion when dealing with reduction/combining kinds.
For example, see a recent PR comment:
https://github.com/llvm/llvm-project/pull/75846#discussion_r1430722175.

Previously, they were picked to mostly mirror the names of the llvm
vector reduction intrinsics:
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic. In
isolation, it was not clear if `<maxf>` has `arith.maxnumf` or
`arith.maximumf` semantics. The new reduction kind names map 1:1 to
arith ops, which makes their semantics easier to identify and look up.
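
For illustration, a minimal before/after sketch on `vector.reduction`
(surrounding values are hypothetical):

```mlir
// Before: in isolation, <maxf> could plausibly mean arith.maxnumf or
// arith.maximumf.
%r0 = vector.reduction <maxf>, %v : vector<4xf32> into f32
// After: the kind name maps 1:1 to an arith op (arith.maxnumf).
%r1 = vector.reduction <maxnumf>, %v : vector<4xf32> into f32
```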

Because both the vector and the gpu dialect depend on the arith dialect,
it's more natural to align names with those in arith than with the
lowering to llvm intrinsics.

Issue: https://github.com/llvm/llvm-project/issues/72354
2023-12-20 00:14:43 -05:00
Matthias Springer
3a087c1592 [mlir][linalg] Fix invalid IR in Linalg op fusion (#74425)
Linalg op fusion (`Linalg/Transforms/Fusion.cpp`) used to generate
invalid fused producer ops:
```
error: 'linalg.conv_2d_nhwc_hwcf' op expected type of operand #2 ('tensor<1x8x16x4xf32>') to match type of corresponding result ('tensor<?x?x?x?xf32>')
note: see current operation:
%24 = "linalg.conv_2d_nhwc_hwcf"(%21, %22, %23) <{dilations = dense<1> : tensor<2xi64>, operandSegmentSizes = array<i32: 2, 1>, strides = dense<2> : tensor<2xi64>}> ({
^bb0(%arg9: f32, %arg10: f32, %arg11: f32):
  %28 = "arith.mulf"(%arg9, %arg10) <{fastmath = #arith.fastmath<none>}> : (f32, f32) -> f32
  %29 = "arith.addf"(%arg11, %28) <{fastmath = #arith.fastmath<none>}> : (f32, f32) -> f32
  "linalg.yield"(%29) : (f32) -> ()
}) {linalg.memoized_indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d1 * 2 + d4, d2 * 2 + d5, d6)>, affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d4, d5, d6, d3)>, affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d1, d2, d3)>]} : (tensor<1x?x?x3xf32>, tensor<3x3x3x4xf32>, tensor<1x8x16x4xf32>) -> tensor<?x?x?x?xf32>
```

This is a problem because the input IR to the greedy pattern rewriter
during `-test-linalg-greedy-fusion` is invalid. This commit fixes tests such as
`mlir/test/Dialect/Linalg/tile-and-fuse-tensors.mlir` when verifying the
IR after each pattern application (#74270).
2023-12-19 14:17:10 +09:00
srcarroll
b26ee97537 [MLIR][Linalg] Support dynamic sizes in lower_unpack (#75494) 2023-12-18 19:02:04 +01:00
Quinn Dawkins
82ab0f7f36 [mlir][linalg] Fix rank-reduced cases for extract/insert slice in DropUnitDims (#74723)
Inferring the reshape reassociation indices for extract/insert slice ops
based on the read sizes of the original slicing op will generate an
invalid expand/collapse shape op for already rank-reduced cases. Instead
just infer from the shape of the slice.
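
For illustration, a hypothetical rank-reduced slice of the kind this fix
targets:

```mlir
// The read sizes are [1, 8, 4], but the result type already drops the
// unit dim. Inferring the reassociation from the slice shape (8x4), rather
// than from the read sizes, avoids an invalid expand/collapse shape op.
%s = tensor.extract_slice %t[0, 0, 0] [1, 8, 4] [1, 1, 1]
    : tensor<4x8x4xf32> to tensor<8x4xf32>
```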

Ported from Differential Revision: https://reviews.llvm.org/D147488
2023-12-16 10:08:51 -05:00
Andrzej Warzyński
f11bda78c8 [mlir][linalg] Use vector.shuffle to flatten conv filter (#75038)
Updates the vectorisation of 1D depthwise convolution when flattening
the channel dimension (introduced in #71918), in particular how the
convolution filter is "flattened". At the moment, the vectoriser uses
`vector.shape_cast`:

```mlir
  %b_filter = vector.broadcast %filter : vector<4xf32> to vector<3x2x4xf32>
  %sc_filter = vector.shape_cast %b_filter : vector<3x2x4xf32> to vector<3x8xf32>
```

This lowering is not ideal - `vector.shape_cast` can be convenient when
it's folded away, but that's not happening in this case. Instead, this
patch updates the vectoriser to use `vector.shuffle` (the overall result
is identical):

```mlir
  %sh_filter = vector.shuffle %filter, %filter
      [0, 1, 2, 3, 0, 1, 2, 3] : vector<4xf32>, vector<4xf32>
  %b_filter = vector.broadcast %sh_filter : vector<8xf32> to vector<3x8xf32>
```
2023-12-15 17:56:59 +00:00
Amir Bishara
cf2d625a5d [mlir][linalg] Expose getPreservedProducerResults method from ElementwiseOpFusion file (#73850)
Declare the `getPreservedProducerResults` function, which returns the
results of the producer `linalg.generic` operation that are preserved
after elementwise fusion.
2023-12-08 11:50:33 +02:00
Matthias Springer
986287e7f3 [mlir][SparseTensor] Fix invalid API usage in patterns (#74690)
Rewrite patterns must return `success` if the IR was modified. This
commit fixes sparse tensor tests such as
`SparseTensor/sparse_fusion.mlir`,
`SparseTensor/CPU/sparse_reduce_custom.mlir`,
`SparseTensor/CPU/sparse_semiring_select.mlir` when running with
`MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`.
2023-12-07 12:05:20 +09:00
Andrzej Warzyński
03c2f5d8bb [mlir][linalg][conv] Flatten the channel dimension when vectorizing (#71918)
The current vectorization of 1D depthwise convolutions in Linalg is
_sub-optimal_ for tensors with a small number of channels, e.g.:

```mlir
linalg.depthwise_conv_1d_nwc_wc
    {dilations = dense<1> : vector<1xi64>,
    strides = dense<1> : vector<1xi64>}
    ins(%input, %filter : tensor<1x8x3xi8>, tensor<1x3xi8>)
    outs(%output : tensor<1x8x3xi8>) -> tensor<1x8x3xi8>
```

That's due to the fact that ultimately (i.e. at LLVM level),
vectorization happens along the trailing dimension (i.e. the channel
dimension). In this case it leads to vectors with 3 elements (or worse,
if there's e.g. only 1 channel). For comparison, a 128-bit wide
vector register can hold 16 x i8 elements.

Instead, this patch adds an option to flatten/collapse the channel
dimension into the width dimension of the input/filter/output using
`vector.shape_cast` operation:

```mlir
    %sc_input = vector.shape_cast %input : vector<1x8x3xi8> to vector<1x24xi8>
    %sc_output = vector.shape_cast %output : vector<1x8x3xi8> to vector<1x24xi8>
    %b_filter = vector.broadcast %filter : vector<3xi8> to vector<1x8x3xi8>
    %sc_filter = vector.shape_cast %b_filter : vector<1x8x3xi8> to vector<1x24xi8>
```

This new vectorization mode is implemented in `depthwiseConv` by
inserting `vector.shape_cast` Ops before and after 
`depthwiseConv1dSliceAsMulAcc` is invoked. It can be selected through
e.g. a transform dialect attribute:

```mlir
  transform.structured.vectorize_children_and_apply_patterns %conv {flatten_1d_depthwise_conv}
```

A forthcoming patch will implement a strategy to automatically switch
between the two implementations, depending on the shape of the input
tensors.

Co-authored-by: Bradley Smith <bradley.smith@arm.com>
2023-12-06 21:35:03 +00:00
Jack Frankland
4a3d2088d6 [mlir][linalg] Add TransposeConv2D Transform Op (#68567)
* Add a Linalg pass to convert 2D convolutions and quantized 2D
convolutions that have the `FHWC` filter channel ordering into a
transpose followed by 2D convolutions that have the `HWCF` channel
ordering (see the sketch below).

* Add a lit test to check that the semantics of the transformation are
correct for both quantized and unquantized variants.
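
A minimal sketch of the rewrite, with hypothetical shapes (filter in
`FHWC` with f = 8, h = w = 3, c = 4):

```mlir
%init = tensor.empty() : tensor<3x3x4x8xf32>
// Transpose the filter from FHWC to HWCF.
%hwcf = linalg.transpose ins(%filter : tensor<8x3x3x4xf32>)
    outs(%init : tensor<3x3x4x8xf32>) permutation = [1, 2, 3, 0]
// The convolution can then use the existing HWCF variant.
%res = linalg.conv_2d_nhwc_hwcf
    {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
    ins(%input, %hwcf : tensor<1x6x6x4xf32>, tensor<3x3x4x8xf32>)
    outs(%acc : tensor<1x4x4x8xf32>) -> tensor<1x4x4x8xf32>
```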

Signed-off-by: Jack Frankland <jack.frankland@arm.com>
2023-11-28 09:56:12 +00:00
Oleksandr "Alex" Zinenko
8134a8fc3f [mlir] use TypeSize and uint64_t in DataLayout (#72874)
Data layout queries may be issued for types whose size exceeds the range
of a 32-bit integer, as well as for types that don't have a size known at
compile time, such as scalable vectors. Use best practices from LLVM IR
and adopt `llvm::TypeSize` for size-related queries and `uint64_t` for
alignment-related queries.

See #72678.
2023-11-21 16:12:27 +01:00
long.chen
1609f1c2a5 [mlir][affine][nfc] cleanup deprecated T.cast style functions (#71269)
For details, see the documentation: https://mlir.llvm.org/deprecation/

Not all changes were made manually; most of them were made with a clang
tool I wrote: https://github.com/lipracer/cpp-refactor.
2023-11-14 13:01:19 +08:00
Tim Harvey
c43e627457 Changed the phrase sparse-compiler to sparsifier in comments (#71578)
When the Powers That Be decided that the name "sparse compiler" should
be changed to "sparsifier", we negected to change some of the comments
in the code; this pull request completes the name change.
2023-11-07 20:55:00 +00:00
Han-Chung Wang
03529b99b3 [mlir][linalg] Add support for vectorizing dynamic elementwise named ops (#71454)
We are already able to vectorize them in `linalg.generic` form; we just
need to relax the condition so that named ops can be vectorized as well.
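
For illustration, a dynamically shaped named elementwise op of the kind
that now vectorizes (the specific op and shapes are illustrative, not
taken from the patch):

```mlir
// Previously only the equivalent linalg.generic form was vectorizable;
// the relaxed condition accepts the named form as well.
%0 = linalg.add ins(%a, %b : tensor<?x?xf32>, tensor<?x?xf32>)
    outs(%init : tensor<?x?xf32>) -> tensor<?x?xf32>
```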
2023-11-06 15:35:50 -08:00
Matthias Springer
437c62178c [mlir][memref] Remove redundant memref.tensor_store op (#71010)
`bufferization.materialize_in_destination` should be used instead. Both
ops bufferize to a memcpy. This change also conceptually cleans up the
memref dialect a bit: the memref dialect no longer contains ops that
operate on tensor values.
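
A minimal migration sketch, assuming a rank-1 buffer (the replacement form
follows the memref-destination syntax of
`bufferization.materialize_in_destination` shown further down this log):

```mlir
// Before (op removed by this change):
memref.tensor_store %t, %m : memref<10xf32>
// After: same memcpy semantics, expressed in the bufferization dialect.
bufferization.materialize_in_destination %t in writable %m
    : (tensor<10xf32>, memref<10xf32>) -> ()
```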
2023-11-05 12:47:18 +09:00
Matthias Springer
6529c9a1db [mlir][linalg][NFC] Remove linalg subset hoisting (#70636)
Remove `SubsetHoisting.cpp` and migrate all remaining uses to the newly
added loop-invariant subset hoisting transform in `mlir/Transforms`.
2023-11-05 11:55:15 +09:00
Nicolas Vasilache
3a223f4414 [mlir][Bufferization] Add support for controlled bufferization of alloc_tensor (#70957)
This revision adds support to
`transform.structured.bufferize_to_allocation` to bufferize
`bufferization.alloc_tensor()` ops.
    
This is useful as a way to control the bufferization of
`tensor.empty` ops that have previously been
`bufferization.empty_tensor_to_alloc_tensor`'ed.
2023-11-02 11:34:10 +01:00
Matthias Springer
1abd8d1a8d [mlir][Interfaces] Add SubsetOpInterface and SubsetExtractionOpInterface (#70617)
There is currently an op interface for subset insertion ops
(`SubsetInsertionOpInterface`), but not for subset extraction ops. This
commit adds `SubsetExtractionOpInterface` to `mlir/Interfaces`, as well
as a common dependent op interface: `SubsetOpInterface`.

- `SubsetOpInterface` is for ops that operate on tensor subsets. It
provides interface methods to check if two subset ops operate on
equivalent or disjoint subsets. Ops that implement this interface must
implement either `SubsetExtractionOpInterface` or
`SubsetInsertionOpInterface`.
- `SubsetExtractionOpInterface` is for ops that extract from a tensor at
a subset, e.g., `tensor.extract_slice`, `tensor.gather`,
`vector.transfer_read`. Currently implemented only on
`tensor.extract_slice` (see the sketch below).
- `SubsetInsertionOpInterface` is for ops that insert into a destination
tensor at a subset. E.g., `tensor.insert_slice`,
`tensor.parallel_insert_slice`, `tensor.scatter`,
`vector.transfer_write`. Currently only implemented on
`tensor.insert_slice`, `tensor.parallel_insert_slice`.

Other changes:
- Rename `SubsetInsertionOpInterface.td` to `SubsetOpInterface.td`.
- Add helper functions to `ValueBoundsOpInterface.cpp` for checking
whether two slices are disjoint.

The new interfaces will be utilized by a new "loop-invariant subset
hoisting" transformation. (This new transform is roughly what
`Linalg/Transforms/SubsetHoisting.cpp` is doing, but in a generic and
interface-driven way.)
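
A hypothetical IR sketch of the extraction/insertion pairing these
interfaces describe (`test.compute` is a placeholder op):

```mlir
// tensor.extract_slice implements SubsetExtractionOpInterface; the matching
// tensor.insert_slice implements SubsetInsertionOpInterface. Both ops touch
// the same [0, 0] [4, 4] subset, which SubsetOpInterface methods can
// classify as equivalent (or as disjoint, for non-overlapping offsets).
%e = tensor.extract_slice %t[0, 0] [4, 4] [1, 1]
    : tensor<8x8xf32> to tensor<4x4xf32>
%r = "test.compute"(%e) : (tensor<4x4xf32>) -> tensor<4x4xf32>
%i = tensor.insert_slice %r into %t[0, 0] [4, 4] [1, 1]
    : tensor<4x4xf32> into tensor<8x8xf32>
```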
2023-11-01 10:26:31 +09:00
Matthias Springer
98a6edd38f [mlir][Interfaces] LoopLikeOpInterface: Expose tied loop results (#70535)
Expose loop results, which correspond to the region iter_arg values that
are returned from the loop when there are no more iterations. Exposing
loop results is optional because some loops (e.g., `scf.while`) do not
have a 1-to-1 mapping between region iter_args and op results.

Also add additional helper functions to query tied
results/iter_args/inits.
2023-11-01 08:34:14 +09:00
lorenzo chelini
6cbcb79350 [MLIR][Linalg] Introduce SpecializeOp (#70326)
Introduce an operation to specialize linalg.generics, for example,
detecting a linalg.generic that is semantically equivalent to a
linalg.copy and replacing the former with the latter. Such specialization
also helps code generation, where named operations are easier to lower to
vendor-optimized libraries.
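
For illustration, a hypothetical `linalg.generic` that specialization can
detect as a copy:

```mlir
#map = affine_map<(d0, d1) -> (d0, d1)>
// The body only yields its input unchanged...
%0 = linalg.generic
    {indexing_maps = [#map, #map], iterator_types = ["parallel", "parallel"]}
    ins(%a : tensor<4x8xf32>) outs(%b : tensor<4x8xf32>) {
^bb0(%in: f32, %out: f32):
  linalg.yield %in : f32
} -> tensor<4x8xf32>
// ...so the op is semantically equivalent to:
%1 = linalg.copy ins(%a : tensor<4x8xf32>)
    outs(%b : tensor<4x8xf32>) -> tensor<4x8xf32>
```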
2023-10-31 10:07:35 +01:00
Matthias Springer
a8d0c86174 [mlir][Interfaces][NFC] Move SubsetInsertionOpInterface to mlir/Interfaces (#70615)
`SubsetInsertionOpInterface` is an interface for ops that insert into a
destination tensor at a subset. It is currently used by the
bufferization framework to support efficient
`tensor.extract_slice/insert_slice` bufferization and to drive "empty
tensor elimination".

This commit moves the interface to `mlir/Interfaces`. This is in
preparation of adding a new "loop-invariant subset hoisting"
transformation to
`mlir/Transforms/Utils/LoopInvariantCodeMotionUtils.cpp`, which will
utilize `SubsetInsertionOpInterface`. (This new transform is roughly
what `Linalg/Transforms/SubsetHoisting.cpp` is doing, but in a generic
and interface-driven way.)
2023-10-30 13:42:44 +09:00
Matthias Springer
3cd2a0bc1a [mlir][Interfaces] LoopLikeOpInterface: Add helpers to query tied inits/iter_args (#70408)
The `LoopLikeOpInterface` already has interface methods to query inits
and iter_args. This commit adds helper functions to query tied
init/iter_arg pairs and removes the corresponding functions for
`scf::ForOp`.
2023-10-28 12:10:36 +09:00
Aviad Cohen
a7d6039f3e [mlir][linalg] Replace CopyOp from memref to linalg in linalg PromoteOp (#69154)
linalg::CopyOp is much more generic and useful for promoting buffers. In addition, this is a Linalg transform, so it makes more sense to use Linalg operations where possible.
2023-10-26 18:54:23 +03:00
Aviad Cohen
5c3ed392fc [mlir][linalg] Enable CollapseLinalgDimensions to collapse linalg::CopyOp (#68526) 2023-10-23 09:42:30 +03:00
Jacques Pienaar
b858309ddc [mlir] Only attempt to vectorize conv if conv.
Avoids hitting assertions due to unsupported convolution patterns.

See https://github.com/openxla/iree/issues/15207#issuecomment-1767650797
2023-10-18 20:34:39 -07:00
Matthias Springer
ab737a8699 [mlir][Interfaces] LoopLikeOpInterface: Add helper to get yielded values (#67305)
Add a new interface method that returns the yielded values.
    
Also add a verifier that checks the number of inits/iter_args/yielded
values. Most of the checked invariants (but not all of them) are already
covered by the `RegionBranchOpInterface`, but the `LoopLikeOpInterface`
now provides (additional) error messages that are easier to read.
2023-10-16 08:45:48 +09:00
Lei Zhang
3049ac44e6 [mlir][vector] Enable transfer op hoisting with dynamic indices (#68500)
Recent changes (https://github.com/llvm/llvm-project/pull/66930)
disabled vector transfer ops hoisting with view-like intermediate ops.
The recommended way is to fold subview ops into transfer op indices
before invoking hoisting. That means transfer op indices now involve
dynamic values instead of the static constant values previously provided
by subview ops, so hoisting no longer kicks in. This breaks
downstream users.

To fix it, this commit enables hoisting transfer ops with dynamic
indices by using `ValueBoundsConstraintSet` to prove ranges are disjoint
in `isDisjointTransferIndices`. Given that this utility is used in many
places, including op folders, we introduce a flag and, for now, only set
it to true for "heavy" transforms, i.e. hoisting and load-store
forwarding.
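
For illustration, a hypothetical pair of transfers whose disjointness is
provable from value bounds rather than from constant indices:

```mlir
// Both indices are dynamic, but %j = %i + 4 and both accesses are 4
// elements wide, so ValueBoundsConstraintSet can prove the ranges are
// disjoint.
%j = affine.apply affine_map<(d0) -> (d0 + 4)>(%i)
%r = vector.transfer_read %m[%i], %pad {in_bounds = [true]}
    : memref<?xf32>, vector<4xf32>
vector.transfer_write %v, %m[%j] {in_bounds = [true]}
    : vector<4xf32>, memref<?xf32>
```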
2023-10-15 16:37:54 -07:00
Aviad Cohen
7060422265 [mlir][Linalg]: Optimize linalg generic in transform::PromoteOp to avoid unnecessary copies (#68555)
If an operand is not used in the payload of a linalg generic operation, there is no need to copy it before the operation.
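
A hypothetical example: the second input below never appears in the
payload, so promotion can skip copying it:

```mlir
#map = affine_map<(d0) -> (d0)>
// %b is declared as an input but unused in the body; copying it into a
// promoted buffer would be wasted work.
linalg.generic
    {indexing_maps = [#map, #map, #map], iterator_types = ["parallel"]}
    ins(%a, %b : memref<8xf32>, memref<8xf32>) outs(%c : memref<8xf32>) {
^bb0(%in: f32, %in_1: f32, %out: f32):
  linalg.yield %in : f32
}
```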
2023-10-14 10:40:45 +03:00
Jack Frankland
92e751d426 [mlir][linalg] Add NHWC + FHWC Img2Col (#68708)
Adds the Img2Col transformation for the fhwc channel ordering in a
Conv2D. Because of how the channel ordering affects the matrix
dimensions in the flattened filter this results in a slightly different
implementation of the actual "matrix multiplication". Instead of doing a
regular row-column dot-product this arrangement requires a row-row dot
product, otherwise the filter matrix would first need to be transposed.

Adds a lit test to the transform dialect to check that the semantics of
the optimization are correct.

Signed-off-by: Jack Frankland <jack.frankland@arm.com>
2023-10-13 10:20:18 +01:00
Aviad Cohen
d4ae7ee662 [mlir][linalg] Enable CollapseLinalgDimensions to collapse memref based operations (#68522) 2023-10-10 23:36:11 +03:00
qcolombet
7050ff4615 [mlir] Fix lower_unpack when dynamic dimensions are involved (#68423)
When lowering `tensor.unpack`, we need to use the sizes of the
destination tensor in the final `tensor.extract_slice` operation. Prior
to this patch, when the destination tensor had dynamic dimensions, we
would compute them from the result of the `tensor.unpack` operation
instead of its destination argument.

This would produce invalid IR because the `tensor.dim` operations would
need to appear before the `tensor.extract_slice` operation, but the
input of the `tensor.dim` operations would consume the final result of
the lowering of `tensor.unpack`, which happens after the
`tensor.extract_slice` operation. In other words, the definition
wouldn't dominate its uses.

I.e., we were generating:
```
%dynDim = tensor.dim %defLater, ... <-- %defLater defined below
%res = tensor.extract_slice ..., %dynDim, ...
%defLater = linalg.copy (ins %res)
```

Note: I checked the implementation of `lower_pack` and the code is
correct as far as I can tell.
2023-10-06 22:09:58 +02:00
Matthias Springer
0fcaca2fea [mlir][bufferization] MaterializeInDestinationOp: Support memref destinations (#68074)
Extend `bufferization.materialize_in_destination` to support memref
destinations. This op can now be used to indicate that a tensor
computation should materialize in a given buffer (that may have been
allocated by another component/runtime). The op still participates in
"empty tensor elimination".

Example:
```mlir
func.func @test(%out: memref<10xf32>) {
  %t = tensor.empty() : tensor<10xf32>
  %c = linalg.generic ... outs(%t: tensor<10xf32>) -> tensor<10xf32>
  bufferization.materialize_in_destination %c in restrict writable %out : (tensor<10xf32>, memref<10xf32>) -> ()
  return
}
```
After "empty tensor elimination", the above IR can bufferize without an
allocation:
```mlir
func.func @test(%out: memref<10xf32>) {
  linalg.generic ... outs(%out: memref<10xf32>)
  return
}
```

This change also clarifies the meaning of the `restrict` unit attribute
on `bufferization.to_tensor` ops.
2023-10-06 11:57:10 +02:00
Matthias Springer
8823e961f6 [mlir][ODS] Change get...Mutable to return OpOperand & for single operands (#66519)
The TableGen code generator now generates C++ code that returns a single
`OpOperand &` for `get...Mutable` of operands that are not variadic and
not optional. `OpOperand::set`/`assign` can be used to set a value (same
as `MutableOperandRange::assign`). This is safer than
`MutableOperandRange` because only single values (and no longer
`ValueRange`) can be assigned.

E.g.:
```
// Assignment of multiple values to non-variadic operand.
// Before: Compiles, but produces invalid op.
// After: Compilation error.
extractSliceOp.getSourceMutable().assign({v1, v2});
```
2023-10-04 08:35:40 +02:00
Andrzej Warzyński
94c04772bc [mlir][vector] Prevent incorrect vector.transfer_{read|write} hoisting (#66930)
At the moment, `hoistRedundantVectorTransfers` would hoist the
`vector.transfer_read`/`vector.transfer_write` pair in this function:

```mlir
func.func @no_hoisting_write_to_memref(%rhs: i32, %arg1: vector<1xi32>) {
  %c0_i32 = arith.constant 0 : i32
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c4 = arith.constant 4 : index
  %c20 = arith.constant 20 : index
  %alloca = memref.alloca() {alignment = 64 : i64} : memref<1x1x2xi32>
  %cast = memref.cast %alloca : memref<1x1x2xi32> to memref<1x1x2xi32>
  %collapsed_1 = memref.collapse_shape %alloca [[0, 1, 2]] : memref<1x1x2xi32> into memref<2xi32>
  scf.for %_ = %c0 to %c20 step %c4 {
    %collapsed_2 = memref.collapse_shape %alloca [[0, 1, 2]] : memref<1x1x2xi32> into memref<2xi32>
    %lhs = vector.transfer_read %collapsed_1[%c0], %c0_i32 {in_bounds = [true]} : memref<2xi32>, vector<1xi32>
    %acc = vector.transfer_read %collapsed_2[%c0], %c0_i32 {in_bounds = [true]} : memref<2xi32>, vector<1xi32>
    %op = vector.outerproduct %lhs, %rhs, %acc {kind = #vector.kind<add>} : vector<1xi32>, i32
    vector.transfer_write %op, %collapsed_1[%c0] {in_bounds = [true]} : vector<1xi32>, memref<2xi32>
  }
  return
}
```
as follows:
```mlir
  func.func @no_hoisting_write_to_memref(%arg0: i32, %arg1: vector<1xi32>) {
    %c0_i32 = arith.constant 0 : i32
    %c0 = arith.constant 0 : index
    %c4 = arith.constant 4 : index
    %c20 = arith.constant 20 : index
    %alloca = memref.alloca() {alignment = 64 : i64} : memref<1x1x2xi32>
    %collapse_shape = memref.collapse_shape %alloca [[0, 1, 2]] : memref<1x1x2xi32> into memref<2xi32>
    %collapse_shape_0 = memref.collapse_shape %alloca [[0, 1, 2]] : memref<1x1x2xi32> into memref<2xi32>
    %0 = vector.transfer_read %collapse_shape[%c0], %c0_i32 {in_bounds = [true]} : memref<2xi32>, vector<1xi32>
    %1 = vector.transfer_read %collapse_shape_0[%c0], %c0_i32 {in_bounds = [true]} : memref<2xi32>, vector<1xi32>
    %2 = scf.for %arg2 = %c0 to %c20 step %c4 iter_args(%arg3 = %0) -> (vector<1xi32>) {
      %3 = vector.outerproduct %arg3, %arg0, %1 {kind = #vector.kind<add>} : vector<1xi32>, i32
      scf.yield %3 : vector<1xi32>
    }
    vector.transfer_write %2, %collapse_shape[%c0] {in_bounds = [true]} : vector<1xi32>, memref<2xi32>
    return
  }
```

This is not safe. While one argument for `vector.outerproduct` (`%lhs`
from the original loop) is correctly being forwarded via `iter_args`,
the other one (`%acc` from the original loop) is not.

This patch disables hoisting in cases where the source of "candidate"
`vector.transfer_read` aliases with some other `memref`. A more generic
approach would be to make sure that all values are correctly forwarded
via `iter_args`, but that would require involving alias analysis.

Based on https://github.com/openxla/iree/issues/14994.
2023-09-29 15:34:37 +01:00
Matthias Springer
913286baed [mlir][linalg] Add SubsetInsertionOpInterface to linalg.copy (#67524)
This commit enables empty tensor elimination on `linalg.copy` ops.
2023-09-27 10:04:37 +02:00
Matthias Springer
63086d6aa0 [mlir][Interfaces] LoopLikeOpInterface: Add replaceWithAdditionalYields (#67121)
`affine::replaceForOpWithNewYields` and `replaceLoopWithNewYields` (for
"scf.for") are now interface methods and additional loop-carried
variables can now be added to "scf.for"/"affine.for" uniformly. (No more
`TypeSwitch` needed.)

Note: `scf.while` and other loops with loop-carried variables can
implement `replaceWithAdditionalYields`, but to keep this commit small,
that is not done in this commit.
2023-09-27 07:53:39 +02:00
Kazu Hirata
3bca659556 Use llvm::is_contained (NFC) 2023-09-22 17:20:50 -07:00
qcolombet
a44b787e06 [MLIR][linalg] Fix unpack rewriter for dynamic shapes (#67096)
Prior to this patch, `GeneralizeOuterUnitDimsUnPackOpPattern` would
hit an assertion because a `tensor.empty` operation cannot be created
with dynamic shapes unless the dynamic sizes are supplied.

The problem stems from the fact that we were not using the right builder
for the `tensor.empty` operation: each dynamic dimension needs to be
specified by an input SSA value.

Simply provide the dynamic dimensions to the `tensor.empty` builder to
fix that.
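
For illustration, in IR form each dynamic dimension of a `tensor.empty`
takes a corresponding index operand (values hypothetical):

```mlir
// One index operand per '?' in the result type.
%c0 = arith.constant 0 : index
%d0 = tensor.dim %src, %c0 : tensor<?x8xf32>
%empty = tensor.empty(%d0) : tensor<?x8xf32>
```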
2023-09-22 12:23:47 +02:00
Matthias Springer
0b2197b0cf [mlir][Interfaces] Clean up DestinationStyleOpInterface (#67015)
* "init" operands are specified with `MutableOperandRange` (which gives
access to the underlying `OpOperand *`). No more magic numbers.
* Remove most interface methods and make them helper functions. Only
`getInitsMutable` should be implemented.
* Provide separate helper functions for accessing mutable/immutable
operands (`OpOperand`/`Value`, in line with #66515): `getInitsMutable`
and `getInits` (same naming convention as auto-generated op accessors).
`getInputOperands` was not renamed because this function cannot return a
`MutableOperandRange` (because the operands are not necessarily
consecutive). `OpOperandVector` is no longer needed.
* The new `getDpsInits`/`getDpsInitsMutable` is more efficient than the
old `getDpsInitOperands` because no `SmallVector` is created. The new
functions return a range of operands.
* Fix a bug in `getDpsInputOperands`: out-of-bounds operands were
potentially returned.
2023-09-21 18:04:08 +02:00
Matthias Springer
9b5ef2bea8 [mlir][Interfaces] LoopLikeOpInterface: Support ops with multiple regions (#66754)
This commit implements `LoopLikeOpInterface` on `scf.while`. This
enables LICM (and potentially other transforms) on `scf.while`.

`LoopLikeOpInterface::getLoopBody()` is renamed to `getLoopRegions` and
can now return multiple regions.

Also fix a bug in the default implementation of
`LoopLikeOpInterface::isDefinedOutsideOfLoop()`, which returned "false"
for some values that are defined outside of the loop (in a nested op, in
such a way that the value does not dominate the loop). This interface is
currently only used for LICM and there is no way to trigger this bug, so
no test is added.
2023-09-19 17:35:38 +02:00
Matthias Springer
d69293c1c8 [mlir][SCF] ForOp: Remove getIterArgNumberForOpOperand (#66629)
This function was inconsistent with the rest of the API because it
accepted `OpOperand &`s that do not belong to the op, whereas all the
other functions assert in that case. This helper function is also not really necessary, as
the iter_arg number is identical to the result number.
2023-09-19 17:33:40 +02:00
Ingo Müller
159e94a0c3 [mlir][linalg][transform] Add some debug output to vectorization. (NFC) (#66520)
This helps to understand what the problem is when vectorization of
structured ops fails due to mismatching vector sizes.
2023-09-19 10:34:24 +02:00
Matthias Springer
0f952cfe24 [mlir][IR] Change MutableOperandRange::operator[] to return an OpOperand & (#66515)
`operator[]` returns `OpOperand &` instead of `Value`.

* This allows users to get OpOperands by name instead of "magic" number.
E.g., `extractSliceOp->getOpOperand(0)` can be written as
`extractSliceOp.getSourceMutable()[0]`.
* `OperandRange` provides a read-only API to operands: `operator[]`
returns `Value`. `MutableOperandRange` now provides a mutable API:
`operator[]` returns `OpOperand &`, which can be used to set operands.

Note: The TableGen code generator could be changed to return `OpOperand
&` (instead of `MutableOperandRange`) for non-variadic and non-optional
arguments in a subsequent change. Then the `[0]` part in the above
example would no longer be necessary.
2023-09-18 09:43:03 +02:00
Matthias Springer
5cf714bb2f [mlir][SCF] scf.for: Consistent API around initArgs (#66512)
* Always use the auto-generated `getInitArgs` function. Remove the
hand-written `getInitOperands` duplicate.
* Remove `hasIterOperands` and `getNumIterOperands`. The names were
inconsistent because the "arg" is called `initArgs` in TableGen. Use
`getInitArgs().size()` instead.
* Fix verification around ops with no results.
2023-09-18 09:13:43 +02:00
Daniil Dudkin
4a831250b8 [mlir][vector] Rename vector reductions: maxf → maximumf, minf → minimumf
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

Here, we are addressing task 2.1 from the plan, which involves renaming the vector reductions to align with the semantics of the corresponding LLVM intrinsics.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D158618
2023-09-13 22:49:07 +00:00
Matthias Springer
91464e1d6a [mlir][bufferization][NFC] Rename copy_tensor op to materialize_in_destination (#65467)
The previous name was badly chosen. The op is used to ensure that a
computation materializes in the future buffer of a certain tensor.
2023-09-12 15:20:41 +02:00
Andrzej Warzyński
22f96ab6fb [mlir][vector] Refine vector.transfer_read hoisting/forwarding (#65770)
Make sure that when analysing a `vector.transfer_read` that's a
candidate for either hoisting or store-to-load forwarding,
`memref.collapse_shape` Ops are correctly included in the alias
analysis. This is done by either
* making sure that relevant users are taken into account, or
* source Ops are correctly identified.
2023-09-12 10:33:58 +01:00
Daniil Dudkin
8a6e54c9b3 [mlir][arith] Rename operations: maxf → maximumf, minf → minimumf (#65800)
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

This commit addresses Task 1.2 of the mentioned RFC. By renaming these operations, we align their names with LLVM intrinsics that have corresponding semantics.
2023-09-11 22:02:19 -07:00
MaheshRavishankar
fd8349cdb3 [mlir][Linalg] Move linalg.fill -> linalg.pack pattern into fill canonicalization patterns. (#66002)
This pattern fits better with the other canonicalization patterns that
exist for `linalg.fill`.
2023-09-11 13:41:38 -07:00
Martin Erhart
412c2fd270 [mlir][linalg] Optional dealloc insertion for bufferize_to_allocation (#65610)
This commit makes it possible to omit the insertion of the memref.dealloc
operation when linalg.structured.bufferize_to_allocation is run, and makes
this the default behavior. This is desirable when the
buffer-deallocation pipeline is run after bufferization to handle buffer
deallocation.
2023-09-07 17:49:48 +02:00
Aviad Cohen
d6a2014eb8 [mlir][Linalg]: Add memory space to linalg transform::PromoteOp
This patch makes it possible to supply an optional memory space for the
promoted buffer.

Differential Revision: https://reviews.llvm.org/D159074
2023-09-07 17:35:32 +03:00