Commit Graph

252 Commits

Author SHA1 Message Date
Max191
7880b2c858 [mlir] Add direct vectorization lowering for tensor.pack ops (#78660)
This PR adds a direct vectorization lowering of `tensor.pack` into
`mask(vector.transfer_read)`->`vector.shape_cast`->`vector.transpose`->`vector.transfer_write`.
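Below is a minimal sketch of the resulting IR; the shapes, tile sizes, and value names are illustrative rather than taken from the patch (a 32x8 source packed with 8x8 inner tiles):

```mlir
// Mask sized to the un-padded source extent.
%mask = vector.create_mask %c32, %c8 : vector<32x8xi1>
%read = vector.mask %mask {
  vector.transfer_read %src[%c0, %c0], %pad : tensor<32x8xf32>, vector<32x8xf32>
} : vector<32x8xi1> -> vector<32x8xf32>
// Split every source dim into an (outer, inner tile) pair ...
%cast = vector.shape_cast %read : vector<32x8xf32> to vector<4x8x1x8xf32>
// ... then move the inner-tile dims to the back to match the packed layout.
%tr = vector.transpose %cast, [0, 2, 1, 3] : vector<4x8x1x8xf32> to vector<4x1x8x8xf32>
%write = vector.transfer_write %tr, %dest[%c0, %c0, %c0, %c0]
    : vector<4x1x8x8xf32>, tensor<4x1x8x8xf32>
```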
2024-02-07 14:11:11 -05:00
Mehdi Amini
2e0909025e Apply clang-tidy fixes for readability-simplify-boolean-expr in Vectorization.cpp (NFC) 2024-01-22 17:34:55 -08:00
Matthias Springer
5fcf907b34 [mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260)
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`

The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.

Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
2024-01-17 11:08:59 +01:00
Matthias Springer
0a8e3dd432 [mlir][Interfaces] DestinationStyleOpInterface: Rename hasTensor/BufferSemantics (#77574)
Rename interface functions as follows:
* `hasTensorSemantics` -> `hasPureTensorSemantics`
* `hasBufferSemantics` -> `hasPureBufferSemantics`

These two functions return "true" if the op has tensor (resp. buffer)
operands and no buffer (resp. tensor) operands.

Also drop the "ranked" part from the interface, i.e., do not distinguish
between ranked/unranked types.

The new function names describe the functions more accurately. They also
align their semantics with the notion of "tensor semantics" in the
bufferization framework. (An op is supposed to be bufferized if it has
tensor operands, and we don't care if it also has memref operands.)

This change is in preparation of #75273, which adds
`BufferizableOpInterface::hasTensorSemantics`. By renaming the functions
in the `DestinationStyleOpInterface`, we can avoid name clashes between
the two interfaces.
2024-01-12 10:02:54 +01:00
Andrzej Warzyński
db9a16eaed [mlir][nfc] Update comments in the Linalg vectoriser (#76797) 2024-01-04 17:24:22 +00:00
Jakub Kuderski
560564f51c [mlir][vector][gpu] Align minf/maxf reduction kind names with arith (#75901)
This is to avoid confusion when dealing with reduction/combining kinds.
For example, see a recent PR comment:
https://github.com/llvm/llvm-project/pull/75846#discussion_r1430722175.

Previously, they were picked to mostly mirror the names of the llvm
vector reduction intrinsics:
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic. In
isolation, it was not clear if `<maxf>` has `arith.maxnumf` or
`arith.maximumf` semantics. The new reduction kind names map 1:1 to
arith ops, which makes it easier to tell/look up their semantics.

Because both the vector and the gpu dialect depend on the arith dialect,
it's more natural to align names with those in arith than with the
lowering to llvm intrinsics.

Issue: https://github.com/llvm/llvm-project/issues/72354
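For illustration (values and types here are made up), the renamed kinds now spell out which arith op defines their semantics:

```mlir
%a = vector.reduction <maxnumf>, %v : vector<8xf32> into f32   // arith.maxnumf semantics
%b = vector.reduction <maximumf>, %v : vector<8xf32> into f32  // arith.maximumf semantics
```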
2023-12-20 00:14:43 -05:00
Andrzej Warzyński
f11bda78c8 [mlir][linalg] Use vector.shuffle to flatten conv filter (#75038)
Updates the vectorisation of 1D depthwise convolution when flattening
the channel dimension (introduced in #71918). In particular - how the
convolution filter is "flattened". ATM, the vectoriser will use
`vector.shape_cast`:

```mlir
  %b_filter = vector.broadcast %filter : vector<4xf32> to vector<3x2x4xf32>
  %sc_filter = vector.shape_cast %b_filter : vector<3x2x4xf32> to vector<3x8xf32>
```

This lowering is not ideal - `vector.shape_cast` can be convenient when
it's folded away, but that's not happening in this case. Instead, this
patch updates the vectoriser to use `vector.shuffle` (the overall result
is identical):

```mlir
  %sh_filter = vector.shuffle %filter, %filter
      [0, 1, 2, 3, 0, 1, 2, 3] : vector<4xf32>, vector<4xf32>
  %b_filter = vector.broadcast %sh_filter : vector<8xf32> to vector<3x8xf32>
```
2023-12-15 17:56:59 +00:00
Andrzej Warzyński
03c2f5d8bb [mlir][linalg][conv] Flatten the channel dimension when vectorizing (#71918)
The current vectorization of 1D depthwise convolutions in Linalg is
_sub-optimal_ for tensors with a low number of channels, e.g.:

```mlir
linalg.depthwise_conv_1d_nwc_wc
    {dilations = dense<1> : vector<1xi64>,
    strides = dense<1> : vector<1xi64>}
    ins(%input, %filter : tensor<1x8x3xi8>, tensor<1x3xi8>)
    outs(%output : tensor<1x8x3xi8>) -> tensor<1x8x3xi8>
```

That's due to the fact that ultimately (i.e. at LLVM level),
vectorization happens along the trailing dimension (i.e. the channel
dimension). In this case it leads to vectors with 3 elements (or worse,
if there's e.g. only 1 channel). For comparison, a 128-bit wide vector
register can hold 16 x i8.

Instead, this patch adds an option to flatten/collapse the channel
dimension into the width dimension of the input/filter/output using
`vector.shape_cast` operation:

```mlir
    %sc_input = vector.shape_cast %input : vector<1x8x3xi8> to vector<1x24xi8>
    %sc_output = vector.shape_cast %output : vector<1x8x3xi8> to vector<1x24xi8>
    %b_filter = vector.broadcast %filter : vector<3xi8> to vector<1x8x3xi8>
    %sc_filter = vector.shape_cast %b_filter : vector<1x8x3xi8> to vector<1x24xi8>
```

This new vectorization mode is implemented in `depthwiseConv` by
inserting `vector.shape_cast` Ops before and after 
`depthwiseConv1dSliceAsMulAcc` is invoked. It can be selected through
e.g. a transform dialect attribute:

```mlir
  transform.structured.vectorize_children_and_apply_patterns %conv {flatten_1d_depthwise_conv}
```

A forthcoming patch will implement a strategy to automatically switch
between the two implementations, depending on the shape of the input
tensors.

Co-authored by: Bradley Smith <bradley.smith@arm.com>
2023-12-06 21:35:03 +00:00
long.chen
1609f1c2a5 [mlir][affine][nfc] cleanup deprecated T.cast style functions (#71269)
For details, see the document: https://mlir.llvm.org/deprecation/

Not all changes were made manually; most were made with a clang
tool I wrote: https://github.com/lipracer/cpp-refactor.
2023-11-14 13:01:19 +08:00
Han-Chung Wang
03529b99b3 [mlir][linalg] Add support for vectorizing dynamic elementwise named ops (#71454)
We are already able to vectorize these ops in their linalg.generic form;
we just need to relax the precondition so that named ops are also vectorized.
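A hedged sketch of the kind of op this enables (names and shapes are illustrative):

```mlir
// A dynamically shaped named elementwise op; it can now be vectorized
// directly, just like its linalg.generic equivalent.
%0 = linalg.elemwise_binary {fun = #linalg.binary_fn<add>}
       ins(%lhs, %rhs : tensor<?x?xf32>, tensor<?x?xf32>)
       outs(%init : tensor<?x?xf32>) -> tensor<?x?xf32>
```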
2023-11-06 15:35:50 -08:00
Jacques Pienaar
b858309ddc [mlir] Only attempt to vectorize conv if conv.
Avoids hitting assertions due to unsupported convolution patterns.

See https://github.com/openxla/iree/issues/15207#issuecomment-1767650797
2023-10-18 20:34:39 -07:00
Kazu Hirata
3bca659556 Use llvm::is_contained (NFC) 2023-09-22 17:20:50 -07:00
Matthias Springer
0b2197b0cf [mlir][Interfaces] Clean up DestinationStyleOpInterface (#67015)
* "init" operands are specified with `MutableOperandRange` (which gives
access to the underlying `OpOperand *`). No more magic numbers.
* Remove most interface methods and make them helper functions. Only
`getInitsMutable` should be implemented.
* Provide separate helper functions for accessing mutable/immutable
operands (`OpOperand`/`Value`, in line with #66515): `getInitsMutable`
and `getInits` (same naming convention as auto-generated op accessors).
`getInputOperands` was not renamed because this function cannot return a
`MutableOperandRange` (because the operands are not necessarily
consecutive). `OpOperandVector` is no longer needed.
* The new `getDpsInits`/`getDpsInitsMutable` is more efficient than the
old `getDpsInitOperands` because no `SmallVector` is created. The new
functions return a range of operands.
* Fix a bug in `getDpsInputOperands`: out-of-bounds operands were
potentially returned.
2023-09-21 18:04:08 +02:00
Ingo Müller
159e94a0c3 [mlir][linalg][transform] Add some debug output to vectorization. (NFC) (#66520)
This helps to understand what the problem is when vectorization of
structured ops fails due to mismatching vector sizes.
2023-09-19 10:34:24 +02:00
Daniil Dudkin
4a831250b8 [mlir][vector] Rename vector reductions: maxf → maximumf, minf → minimumf
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

Here, we are addressing task 2.1 from the plan, which involves renaming the vector reductions to align with the semantics of the corresponding LLVM intrinsics.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D158618
2023-09-13 22:49:07 +00:00
Daniil Dudkin
8a6e54c9b3 [mlir][arith] Rename operations: maxf → maximumf, minf → minimumf (#65800)
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

This commit addresses Task 1.2 of the mentioned RFC. By renaming these operations, we align their names with LLVM intrinsics that have corresponding semantics.
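As a hedged illustration of the semantics distinction (example values are made up; `arith.maxnumf` is the companion op introduced as part of the same RFC):

```mlir
%a = arith.maximumf %x, %y : f32  // NaN-propagating, matches llvm.maximum
%b = arith.maxnumf %x, %y : f32   // returns the non-NaN operand, matches llvm.maxnum
```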
2023-09-11 22:02:19 -07:00
Fangrui Song
7557530f42 [mlir] Fix duplicate word typos; NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 20:53:08 -07:00
Andrzej Warzynski
5f6c03636d [mlir][linalg] Extend scalable vectorisation support
This patch simply removes one of the pre-conditions of scalable
vectorisation. Namely, that only the trailing vector dimension can be
scalable, e.g.

  vector<2x4x[8]xi32>

This limitation can be lifted following the recent work to support
scalable vectorisation in Linalg, most notably:
  * https://reviews.llvm.org/D154336
  * https://reviews.llvm.org/D153372

A test is added that demonstrates scalable vectorisation of
`linalg.matmul`. This is simply a copy of a similar test for fixed-width
vectorisation. The latter is updated - check lines are simplified and
annotated with regex patterns. This is in the spirit of [1]:

  Tests should be minimal, and only check what is absolutely necessary.

Note that this change is just one step towards scalable vectorisation in
Linalg. It will allow us to exercise new code paths in the context of
scalable vectors in Linalg and hence make further progress in the
forthcoming patches.

[1] https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices

Differential Revision: https://reviews.llvm.org/D158423
2023-08-22 09:40:34 +00:00
Andrzej Warzynski
84d96947ef [mlir][linalg] Fix vectorisation of tensor.extract with dynamic shapes
The Linalg vectoriser incorrectly recognises the following
`tensor.extract` as contiguous:
```
func.func @example(%in: tensor<123x321xf32>, %arg1: tensor<1x?x8xf32>) -> tensor<1x?x8xf32> {
  %c0 = arith.constant 1 : index
  %2 = linalg.generic {
    indexing_maps = [#map1],
    iterator_types = ["parallel", "parallel", "parallel"]
  } outs(%arg1 : tensor<1x?x8xf32>)
  {
  ^bb0(%arg3: f32):
    %idx_0 = linalg.index 0 : index
    %idx_1 = linalg.index 1 : index
    %idx = arith.addi %idx_0, %idx_1 : index
    %7 = tensor.extract %in[%c0, %idx] : tensor<123x321xf32>
    linalg.yield %7 : f32
  } -> tensor<1x?x8xf32>
  return %2 : tensor<1x?x8xf32>
}
```

However, the following index Op corresponds to the dynamic dimension
in the iteration space:
```
    %idx_1 = linalg.index 1 : index
```
The vectoriser should assume that:
  * this index Op _is not_ loop invariant,
  * the resulting memory access is a gather load
This is what this patch fixes.

Differential Revision: https://reviews.llvm.org/D155373
2023-07-17 17:28:17 +00:00
Andrzej Warzynski
2712b2805b [mlir][linalg] Vectorize 0-d tensor extract
This patch adds the missing logic to vectorise `tensor.extract` for 0-d
tensors.
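A minimal sketch of the newly supported case (names are illustrative):

```mlir
// Extracting from a 0-d tensor inside a linalg.generic body; tensor<f32>
// takes no indices.
%s = tensor.extract %arg0[] : tensor<f32>
linalg.yield %s : f32
```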

Fixes #63688

Differential Revision: https://reviews.llvm.org/D154518
2023-07-06 08:34:51 +01:00
Matthias Springer
be6d96e9f8 [mlir][linalg] Remove redundant dimension size helper functions
Differential Revision: https://reviews.llvm.org/D154211
2023-07-03 09:07:18 +02:00
Andrzej Warzynski
f22af204ed [mlir][VectorType] Remove numScalableDims from the vector type
This is a follow-up of https://reviews.llvm.org/D153372 in which
`numScalableDims` (single integer) was effectively replaced with
`isScalableDim` bitmask.

This change is a part of a larger effort to enable scalable
vectorisation in Linalg. See this RFC for more context:
  * https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/

Differential Revision: https://reviews.llvm.org/D153412
2023-06-28 13:53:45 +01:00
Andrzej Warzynski
79c83e12c8 [mlir][VectorType] Allow arbitrary dimensions to be scalable
At the moment, only the trailing dimensions in the vector type can be
scalable, i.e. this is supported:

    vector<2x[4]xf32>

and this is not allowed:

    vector<[2]x4xf32>

This patch extends the vector type so that arbitrary dimensions can be
scalable. To this end, an array of bool values is added to every vector
type to denote whether the corresponding dimensions are scalable or not.
For example, for this vector:

  vector<[2]x[3]x4xf32>

the following array would be created:

  {true, true, false}.

Additionally, the current syntax:

  vector<[2x3]x4xf32>

is replaced with:

  vector<[2]x[3]x4xf32>

This is primarily to simplify parsing (this way, the parser can easily
process one dimension at a time rather than e.g. tracking whether
"scalable block" has been entered/left).

NOTE: The `isScalableDim` parameter of `VectorType` (introduced in this
patch) makes `numScalableDims` redundant. For the time being,
`numScalableDims` is preserved to facilitate the transition between the
two parameters. `numScalableDims` will be removed in one of the
subsequent patches.

This change is a part of a larger effort to enable scalable
vectorisation in Linalg. See this RFC for more context:
  * https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/

Differential Revision: https://reviews.llvm.org/D153372
2023-06-27 19:21:59 +01:00
Diego Caballero
13f15e8f14 [mlir][Vector] Fix vectorization of generic ops with transposed outputs
This patch fixes a bug in the way we compute the vector type for vector
transfer writes when the value to store needs to be transposed.

Reviewed By: nicolasvasilache, mravishankar

Differential Revision: https://reviews.llvm.org/D153687
2023-06-26 20:24:29 +00:00
Diego Caballero
3aeb27d69a [mlir][Vector] Fix 0-D tensor vectorization in Linalg
It looks like scalable vector support broke vectorization for 0-D
tensors and we didn't have any test covering that case. This patch
provides a fix and a test.

Differential Revision: https://reviews.llvm.org/D153181
2023-06-16 23:45:03 +00:00
Rob Suderman
2a00891107 [mlir][linalg] Fix linalg.conv vectorization for mixed int-fp types
The code previously assumed same-type values even for mixed-type ops.
Instead of ExtF or ExtSI, we need SIToFP when integer values must be
promoted to floating point.
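A hedged sketch of the promotion (types are illustrative):

```mlir
// Promote the i8 convolution input to the f32 accumulator type before the
// multiply-accumulate.
%lhs_f32 = arith.sitofp %lhs : vector<1x8x3xi8> to vector<1x8x3xf32>
```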

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D152982
2023-06-15 11:13:18 -07:00
Diego Caballero
77a5ea2e67 [mlir][Vector] Add basic scalable vectorization support to Linalg vectorizer
For now, only elementwise operations are supported. Operations that perform any
kind of data permutation require changes in the representation of scalable
dimensions in VectorType.
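A hedged sketch of what this enables (shape and names are illustrative):

```mlir
// An elementwise op vectorized with a scalable trailing dimension.
%sum = arith.addf %lhs, %rhs : vector<[4]xf32>
```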

Differential Revision: https://reviews.llvm.org/D152599
2023-06-13 23:55:15 +00:00
Andrzej Warzynski
678360fd9d [mlir][linalg] Add scalar broadcast load case to the vectoriser
This patch extends the Linalg vectoriser so that scalar loads are
correctly identified as scalar rather than gather loads. Below is an
example of a scalar load (note that both indices are loop invariant):
```
func.func @example(%arg0: tensor<80x16xf32>, %arg2: tensor<1x4xf32>) -> tensor<1x4xf32> {
%c8 = arith.constant 8 : index
%c16 = arith.constant 16 : index
%1 = linalg.generic {
    indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>],
    iterator_types = ["parallel", "parallel"]
  } outs(%arg2 : tensor<1x4xf32>) {
  ^bb0(%out: f32):
    %2 = linalg.index 0 : index
    %extracted = tensor.extract %arg0[%2, %c16] : tensor<80x16xf32>
    linalg.yield %extracted : f32
  } -> tensor<1x4xf32>
  return %1 : tensor<1x4xf32>
}
```

This patch also makes sure that these scalar loads are indeed lowered to
a scalar load followed by a broadcast:
```
    %extracted = tensor.extract %arg0[%1, %c16] : tensor<80x16xf32>
    %2 = vector.broadcast %extracted : f32 to vector<1x4xf32>
```

Differential Revision: https://reviews.llvm.org/D149678
2023-06-12 15:18:42 +01:00
Andrzej Warzynski
7a52f79126 [mlir][transform] Add support for expressing scalable vector sizes
This patch enables specifying scalable vector sizes when using the
Transform dialect to drive vectorisation, e.g.:
```
transform.structured.masked_vectorize %0 vector_sizes [8, 16, [4]]
```
This is implemented by extending the MaskedVectorizeOp with a dedicated
attribute for "scalability" and by overloading `parseDynamicIndexList`
so that MaskedVectorizeOp can continue using the auto-generated parser
and printer.

At the moment, only the trailing vec size can be scalable. The following
is not yet supported:
```
transform.structured.masked_vectorize %0 vector_sizes [8, [16], [4]]
```

As the vectoriser does not support scalable vectorisation just yet, a
warning is issued when scalable vector sizes are used. You can also use
the debug output, `--debug-only=linalg-vectorization`, to check whether
scalable vectorisation has been switched on.

This change is a part of a larger effort to enable scalable
vectorisation in Linalg. See this RFC for more context:
  * https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/
Similar patch for tiling: https://reviews.llvm.org/D150944

Differential Revision: https://reviews.llvm.org/D151892
2023-06-08 20:54:17 +01:00
Tres Popp
68f58812e3 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call, as this was decided to be more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This patch updates all remaining uses of the deprecated functionality in
mlir/. This was done with clang-tidy as described below and further
modifications to GPUBase.td and OpenMPOpsInterfaces.td.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc
```

Differential Revision: https://reviews.llvm.org/D151542
2023-05-26 10:29:55 +02:00
Hanhan Wang
3669d07987 [mlir][linalg] Only apply masking on xfer_write when needed.
If the input vector sizes are the same as the tensor.pad result shape,
masking is not needed. Otherwise, masking is needed and the mask
operands should match those of the tensor.empty op.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D151391
2023-05-24 18:24:19 -07:00
Goran Flegar
ffd9d6e0ed [mlir][linalg] Fix unused variable on opt build 2023-05-19 14:00:38 +02:00
Hanhan Wang
6abd8e30f4 [mlir][linalg] Add support for dynamic pad op masking vectorization.
Reviewed By: dcaballe, awarzynski

Differential Revision: https://reviews.llvm.org/D150497
2023-05-18 15:05:18 -07:00
Hanhan Wang
0d871fef18 [mlir][linalg] Unify generic vectorization interface.
It breaks the logic of maskedVectorize (on tensor.pad ops) into
precondition checks and vectorization implementation; unifies the
interface.

The revision also renames `vectorizeLinalgOpPrecondition` to
`vectorizeOpPrecondition` because we can vectorize ops other
than LinalgOps.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D150495
2023-05-18 12:58:50 -07:00
Tres Popp
5550c82189 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call, as this was decided to be more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast
   - Some files had not included a header file that defines the cast
     functions
   - Some files are definitions of the classes that have the casting
     methods, so the code still refers to the method instead of the
     function without adding a prefix or removing the method declaration
     at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
            mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
            mlir/lib/**/IR/\
            mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
            mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
            mlir/test/lib/Dialect/Test/TestTypes.cpp\
            mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
            mlir/test/lib/Dialect/Test/TestAttributes.cpp\
            mlir/unittests/TableGen/EnumsGenTest.cpp\
            mlir/test/python/lib/PythonTestCAPI.cpp\
            mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123
2023-05-12 11:21:25 +02:00
Hanhan Wang
b2fdb1417b [mlir][linalg] Relax masked_vectorize(pad) to allow zero low Values.
It only accepted zero attributes before this revision. It now accepts
constant-like zero values as well.
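A hedged sketch of what is now accepted (shapes and names are illustrative):

```mlir
// Low padding given as constant-like zero Values rather than a static
// zero attribute.
%c0 = arith.constant 0 : index
%padded = tensor.pad %t low[%c0, %c0] high[5, 7] {
^bb0(%i: index, %j: index):
  tensor.yield %pad_value : f32
} : tensor<?x?xf32> to tensor<?x?xf32>
```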

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D149474
2023-05-01 14:07:53 -07:00
Diego Caballero
c8557c7c3e [MLIR][Vector] Enable masked vectorization of contraction ops
This patch enables the vectorization of contraction ops using vector
masking. Support for vectorizing contractions is already there so this
is just adding contraction ops to the list of supported ops in
`vectorizeDynamicLinalgOpPrecondition` and adding a test.

Reviewed By: hanchung, awarzynski

Differential Revision: https://reviews.llvm.org/D148865
2023-04-21 19:19:01 +00:00
Andrzej Warzynski
a3ae3931d4 [mlir][linalg] Refine tensor.extract vectorisation
This patch updates the vectorisation of the extract Op so that the
permutation map for the transfer_read Op is defined explicitly by the
vectoriser (as opposed to being constructed implicitly by the
transfer_read builder).

This change is needed for cases where the rank of the source tensor is
lower than the rank of the output vector generated by the vectoriser:
```mlir
    %17 = vector.transfer_read %arg1[%14, %16], %cst_4 {in_bounds = [true, true]} : tensor<257x24xf32>, vector<1x1x4xf32>
```
In cases like this, the vectoriser will create the following permutation map:
```
  (d0, d1) -> (0, d0, d1)
```

In other cases the behaviour remains unchanged.

Fixes https://github.com/openxla/iree/issues/13036. That's also where
the test case was extracted from.

Differential Revision: https://reviews.llvm.org/D148537
2023-04-21 08:47:55 +01:00
Matthias Springer
4c48f016ef [mlir][Affine][NFC] Wrap dialect in "affine" namespace
This cleanup aligns the affine dialect with all the other dialects.

Differential Revision: https://reviews.llvm.org/D148687
2023-04-20 11:19:21 +09:00
Lei Zhang
7517e246ac [mlir][linalg] Promote operands for convolution vectorization
We are already doing this for depthwise convolution and pooling.
This helps to preserve the promotion semantics from Linalg op
definitions to lower layers.

Along the way, fixed the type mismatch issue in the existing
`promote` implementation.

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D148471
2023-04-17 16:37:06 -07:00
Nicolas Vasilache
8c5ad0a2f6 [mlir][Vector] Add a masked vectorization of tensor.pad
This revision takes advantage of masking support to introduce a vectorized
version of pad that does not require lowering to lower-level form.

Lowering to lower-level form (if/else + generate + fill + copy + insert_slice)
creates unnecessary complexity that can be completely sidestepped by using
masked vectorization properly.
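A hedged sketch of the masked form (shapes and names are illustrative, not taken from the patch):

```mlir
// Read the pad source with a mask sized to the source extent; the pad value
// doubles as the transfer_read padding, so no if/else, fill or insert_slice
// is needed.
%mask = vector.create_mask %d0, %d1 : vector<8x16xi1>
%read = vector.mask %mask {
  vector.transfer_read %src[%c0, %c0], %pad_value : tensor<?x?xf32>, vector<8x16xf32>
} : vector<8x16xi1> -> vector<8x16xf32>
%res = vector.transfer_write %read, %init[%c0, %c0]
    : vector<8x16xf32>, tensor<8x16xf32>
```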

Differential Revision: https://reviews.llvm.org/D148261
2023-04-13 13:20:29 -07:00
Nicolas Vasilache
1edfb4b93d [mlir][Linalg] Allow linalg.copy to be vectorized with masking
Differential Revision: https://reviews.llvm.org/D148095
2023-04-12 05:35:17 -07:00
Diego Caballero
5217782014 [mlir][Vector] Enable masking for ops with index semantics
Masking was already supported for linalg.index and n-D extract but
disabled while waiting for some n-D extract vectorization patches to
land. This patch is just enabling masking for them and adding a couple
of tests.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D147359
2023-04-03 21:58:27 +00:00
Diego Caballero
04798db4ea [mlir][Vector][NFC] Small vector masking clean-up
We stored static (int) and dynamic (Value) iteration space dims separately
and then merged them by creating constant ops for the static ones. This
merge happened multiple times during vectorization. This PR changes that
to perform the merge once and store the merged result in the state
instead of the dynamic values in isolation.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D147351
2023-04-03 21:58:27 +00:00
Diego Caballero
f18a861299 [mlir][Vector] Enable masked vectorization of linalg.fill
linalg.fill was already vectorizable with masks but not supported in the
dynamic pre-checks.
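A hedged sketch of a dynamically shaped fill that now passes the pre-checks (names are illustrative):

```mlir
%filled = linalg.fill ins(%cst : f32) outs(%out : tensor<?x?xf32>) -> tensor<?x?xf32>
```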

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D146856
2023-03-29 19:53:29 +00:00
Nicolas Vasilache
1cff4cbda3 [mlir][Transform] NFC - Various API cleanups and use RewriterBase in lieu of PatternRewriter
Depends on: D145685

Differential Revision: https://reviews.llvm.org/D145977
2023-03-14 04:23:12 -07:00
Devajith Valaparambil Sreeramaswamy
5299953aba [mlir][linalg] Add vectorization support for conv_1d
This MR adds vectorization support for the linalg.conv_1d operation.

Reviewed By: nicolasvasilache, hanchung, dcaballe, vmurali

Differential Revision: https://reviews.llvm.org/D145160
2023-03-08 14:23:36 -08:00
Andrzej Warzynski
7a078b65fb [mlir][linalg] Refine how contiguous loads are identified
Vectorization of `tensor.extract` using contiguous loads
(`vector.transfer_read`) was introduced in [1]. This patch updates and
refines the existing logic (so that more cases of contiguous loads can be
identified), as well as adds more tests.

Specifically, contiguous load operations are identified by making sure
that:
  1. non-trailing indices for `tensor.extract` are loop invariant (so,
     e.g., there are no "jumps" from one row to the other between
     iterations),
  2. the trailing index for `tensor.extract` increments by 1 with every
     loop iteration (so that it's always adjacent elements that are
     loaded).
This patch introduces:
  * `isLoopInvariantIdx` for step 1., and
  * `isContiguousLoadIdx` for step 2.
These new methods replace:
  * `isContiguousLoadIdx`, and `isBasedOnIndexOp`.

Both approaches lead to a similar end result (none of the existing tests
required updating). However, with the updated approach, it's much easier
to treat the trailing and non-trailing indices separately and to add
more cases for which contiguous loads can be used.

[1] https://reviews.llvm.org/D141998

Differential Revision: https://reviews.llvm.org/D145385
2023-03-08 07:49:04 +00:00
Andrzej Warzynski
8ece85a682 [mlir][linalg] Vectorize tensor.extract using contiguous loads
This patch implements vectorization of tensor.extract for n-D tensor (n
>= 2) using contiguous load operations, i.e. `vector.transfer_read`. This
is a follow-up of https://reviews.llvm.org/D137660 in which gather loads
were used, i.e. `vector.gather`.

It is always safe to use gather load operations when the underlying
memory pattern is contiguous, but not vice-verse. At the moment, the
following conditions have to be met for contiguous loads to be
generated:
  1. The _output tensor_ must be a 1-D vector with the trailing dim > 1,
     e.g. `tensor<1x1x4xi32>`,
  2. The trailing dim in the _input tensor_ must be > 1, e.g.
     `tensor<1x1x4xi32>` would be fine, but not `tensor<1x4x1xi32>`.
If these conditions are not satisfied, gather loads are generated
instead.

Condition 1 guarantees that the iteration space of the corresponding
`linalg.generic` Op is relatively simple. That makes analysing the
indices for `tensor.extract` rather straightforward.

Condition 2 is mostly there to avoid weird vectorisation patterns
resulting in vectors like: `vector<1x1x1xi32>`. In practice, tensors
like `tensor<1x4x1xi32>` should be collapsed to `tensor<1x4xi32>` before
vectorisation, but that's beyond the scope of this patch.

If needed, both conditions can be relaxed. I've not been able to find a
good motivating example for these, hence skipping. For reference,
`tosa.resize` (lowered to Linalg) was the driving example used here.

As a bonus, the test from "vectorization-unsupported.mlir" is moved to
"vectorization.mlir" with proper CHECK lines added.

NOTE: This relands 89b144ece3 (added extra
test, refined comments and variable names).

Differential Revision: https://reviews.llvm.org/D141998

Co-authored-by: Diego Caballero <diegocaballero@google.com>
2023-03-02 09:20:37 +00:00
Benjamin Kramer
e28bbfea5d Revert "[mlir][linalg] Vectorize tensor.extract using contiguous loads"
This reverts commit 89b144ece3. See
https://reviews.llvm.org/D141998 for a test case where this goes wrong.
2023-02-28 13:33:11 +01:00