Commit Graph

387 Commits

Author SHA1 Message Date
Matthias Springer
1534177f8f [mlir][bufferization][NFC] Move OpFilter out of BufferizationOptions
Differential Revision: https://reviews.llvm.org/D126568
2022-05-28 01:47:39 +02:00
Thomas Raoux
89aaa2d033 [mlir][vector] Add new lowering mode to vector.contractionOp
Add lowering for cases where the reduction dimension is fully unrolled.
It is common to unroll the reduction dimension, therefore we would want
to lower the contractions to an elementwise vector op in this case.

Differential Revision: https://reviews.llvm.org/D126120
2022-05-24 14:19:08 +00:00
Thomas Raoux
4c1b65e7bc [mlir][vector] Fix crash in DropInnerMostUnitDims pattern
Fix number of dimensions when incrementally replacing dimensions in
affine map.

Differential Revision: https://reviews.llvm.org/D125984
2022-05-19 17:38:04 +00:00
Chris Lattner
1d7b5cd5bf [ParseResult] Mark this as LLVM_NODISCARD (like LogicalResult) and fix issues.
There are a lot of cases where we accidentally ignored the result of some
parsing hook.  Mark ParseResult as LLVM_NODISCARD just like ParseResult is.
This exposed some stuff to clean up, so do.

Differential Revision: https://reviews.llvm.org/D125549
2022-05-13 16:28:53 +01:00
Thomas Raoux
d02f10d96d [mlir][vector] Add lowering pattern for vector.warp_execute_on_lane_0 op
Add lowering of the vector.warp_execute_on_lane_0 into scf.if plus memory
transfer for the operands and yield values.

This also add an integration test running on GPU warp. The same tests can be
later re-used with different comment lines to tests distribution
transformations.

This is mostly from @springerm contribution.

Differential Revision: https://reviews.llvm.org/D125430
2022-05-12 13:27:43 +00:00
Chris Lattner
5dedf911de [AsmParser] Rework logic around "region argument parsing"
The asm parser had a notional distinction between parsing an
operand (like "%foo" or "%4#3") and parsing a region argument
(which isn't supposed to allow a result number like #3).

Unfortunately the implementation has two problems:

1) It didn't actually check for the result number and reject
   it.  parseRegionArgument and parseOperand were identical.
2) It had a lot of machinery built up around it that paralleled
   operand parsing.  This also was functionally identical, but
   also had some subtle differences (e.g. the parseOptional
   stuff had a different result type).

I thought about just removing all of this, but decided that the
missing error checking was important, so I reimplemented it with
a `allowResultNumber` flag on parseOperand.  This keeps the
codepaths unified and adds the missing error checks.

Differential Revision: https://reviews.llvm.org/D124470
2022-04-28 11:12:44 -07:00
Alex Zinenko
4c807f2f57 [mlir][vector] insert allocas outside of loops
After https://reviews.llvm.org/D119743 added the `AutomaticAllocationScope`
trait to loop-like constructs, the vector transfer full/partial splitting pass
started inserting allocations for temporaries within the closest loop rather
than the closest function (or other allocation scope such as `async.execute`).
While this is correct as long as the lowered code takes care of automatic
deallocation at the end of each iteration of the loop, this interferes with
downstream optimizations that expect `alloca`s to be at the function level.
Step over loops when looking for the closest allocation scope in vector
transfer full/partial splitting pass thus restoring the original behavior.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D124366
2022-04-25 10:49:09 +02:00
River Riddle
eda6f907d2 [mlir][NFC] Shift a bunch of dialect includes from the .h to the .cpp
Now that dialect constructors are generated in the .cpp file, we can
drop all of the dependent dialect includes from the .h file.

Differential Revision: https://reviews.llvm.org/D124298
2022-04-23 01:09:29 -07:00
Lei Zhang
6f28fd0bf7 [mlir][vector] Fold 1-element reduction into extract or arith ops
If there is only one single element in the vector, then we can
just extract the element to compute the final result.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D124129
2022-04-22 14:24:46 -04:00
Lei Zhang
fc760c0260 [mlir][vector] Fold cancelling vector.shape_cast(vector.broadcast)
vector.broadcast can inject all size one dimensions. If it's
followed by a vector.shape_cast to the original type, we can
cancel the op pair, like cancelling consecutive shape_cast ops.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D124094
2022-04-22 08:58:26 -04:00
jacquesguan
61baf2ffa7 [mlir][Vector] Add check of supported reduction kind for ScanOp.
This patch adds check of supported reduction kind for ScanOp to avoid using and/or/xor for floating point type.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D123977
2022-04-20 02:42:19 +00:00
jacquesguan
5479044bfc [mlir][Vector] Fold transpose splat to splat with transposed type.
This revision folds transpose splat to a new splat with the transposed vector type. For a splat, there is no need to actually do transpose for it, it would be more effective to just build a new splat as the result.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D123765
2022-04-18 03:00:17 +00:00
Thomas Raoux
b4bcef05b7 [mlir][vector] Fix bug in extractFromBroadcast folding
extract was incorrectly folded when the source was coming from a
broadcast that was both adding new rank and broadcasting the inner
dimension.

Differential Revision: https://reviews.llvm.org/D123867
2022-04-15 19:21:45 +00:00
Lei Zhang
4db65e279b [mlir][vector] Reorder elementwise(transpose)
Similar to the existing pattern for reodering cast(transpose),
this makes transpose following transpose and increases the chance
of embedding the transposition inside contraction op. Actually
cast ops are just special instances of elementwise ops.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D123596
2022-04-15 09:05:35 -04:00
Thomas Raoux
59058c441a [mlir][vector] Add operations used for Vector distribution
Add vector op warp_execute_on_lane_0 that will be used to do incremental
vector distribution in order to target warp level vector programming for
architectures with GPU-like SIMT programming model.
The idea behing the op is discussed further on discourse:
https://discourse.llvm.org/t/vector-vector-distribution-large-vector-to-small-vector/1983/23

Differential Revision: https://reviews.llvm.org/D123703
2022-04-15 03:47:52 +00:00
Lei Zhang
e54236dfb5 [mlir][vector] Cast away leading one dims for insert ops
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D123621
2022-04-14 08:57:32 -04:00
Lei Zhang
bc408afbfe [mlir][vector] Fold splat constant transpose
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D123595
2022-04-14 08:51:25 -04:00
Mehdi Amini
35f48edb91 Apply clang-tidy fixes for llvm-qualified-auto in VectorTransforms.cpp (NFC) 2022-04-14 09:42:37 +00:00
Thomas Raoux
5b1b7108c8 [mlir][vector] Add unrolling pattern for TransposeOp
Support unrolling for vector.transpose following the same interface as
other vector unrolling ops.

Differential Revision: https://reviews.llvm.org/D123688
2022-04-13 19:44:16 +00:00
gysit
39b9336474 [mlir][vector] Swap ExtractSliceOp(TransferWriteOp).
Rewrite tensor::ExtractSliceOp(vector::TransferWriteOp) to vector::TransferWriteOp(tensor::ExtractSliceOp) if the full slice is overwritten and inserted into another tensor. After this rewrite, the operations bufferize in-place since all of them work on the same %iter_arg slice.

For example:
```mlir
  %0 = vector.transfer_write %vec, %init_tensor[%c0, %c0]
       : vector<8x16xf32>, tensor<8x16xf32>
  %1 = tensor.extract_slice %0[0, 0] [%sz0, %sz1] [1, 1]
       : tensor<8x16xf32> to tensor<?x?xf32>
  %r = tensor.insert_slice %1 into %iter_arg[%iv0, %iv1] [%sz0, %sz1] [1, 1]
       : tensor<?x?xf32> into tensor<27x37xf32>
```
folds to
```mlir
  %0 = tensor.extract_slice %iter_arg[%iv0, %iv1] [%sz0, %sz1] [1, 1]
       : tensor<27x37xf32> to tensor<?x?xf32>
  %1 = vector.transfer_write %vec, %0[%c0, %c0]
       : vector<8x16xf32>, tensor<?x?xf32>
  %r = tensor.insert_slice %1 into %iter_arg[%iv0, %iv1] [%sz0, %sz1] [1, 1]
       : tensor<?x?xf32> into tensor<27x37xf32>

Reviewed By: nicolasvasilache, hanchung

Differential Revision: https://reviews.llvm.org/D123190
2022-04-11 10:28:53 +00:00
jacquesguan
e79b7f501c [mlir][Vector] Fold extractelement splat.
This revision supports to fold vector.extractelement (splat X) -> X.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D122960
2022-04-08 07:54:37 +00:00
Lei Zhang
7becf0f6cd [mlir][vector] Fold extract(broadcast) of same rank
This case is handled in neither the folding or canonicalization
patterns. The folding pattern cannot generate new broadcast ops,
so it should be handled by the canonicalization pattern.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D123307
2022-04-07 12:59:54 -04:00
Bill Wendling
1acba8a4b5 [mlir] Reinstate the variable
Mid-air collition of patches.
2022-04-05 13:57:14 -07:00
Bill Wendling
4169650537 [mlir] Remove an unused variable and correct types.
No functionality change.
2022-04-05 13:44:12 -07:00
Benjamin Kramer
e7f0552682 [mlir] Fix unused variable warning. NFCI. 2022-04-05 21:24:05 +02:00
Lei Zhang
59d3a9e087 [mlir][vector] Separate high-D insert/extract strided slice rewrite
Right now `populateVectorInsertExtractStridedSliceTransforms` contains
two categories of patterns, one for decomposing high-D insert/extract
strided slices, the other for lowering them to shuffle ops.
They are at different levels---the former is in the middle, while
the latter is a step of final lowering. Split them to give users
more control of which pattern to pick.

This means break down the previous `VectorExtractStridedSliceOpRewritePattern`,
which is doing two things together.

Also renamed those patterns to be clearer.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D123137
2022-04-05 15:00:50 -04:00
jacquesguan
bc37077947 [mlir][Vector] Add constant folder for extractelement.
This revision adds constant folder for vector.extractelement.

Differential Revision: https://reviews.llvm.org/D122886
2022-04-02 11:10:42 +08:00
jacquesguan
262823612d [mlir][Vector] Add constant folder for insertelement.
This revision adds constant folder for vector.insertelement.

Differential Revision: https://reviews.llvm.org/D122721
2022-04-02 10:20:19 +08:00
Lei Zhang
a480d75fe4 [mlir][vector] Fold transpose(broadcast(<scalar>))
For such cases, the transpose op can be elided.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D122903
2022-04-01 14:51:36 -04:00
Lei Zhang
57b101bdec [mlir][vector] Handle scalars in extract_strided_slice(broadcast)
For such cases we cannot generate extract_strided_slice ops.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D122902
2022-04-01 12:07:47 -04:00
jacquesguan
01ad70fd1d [mlir][Vector] Fold ShuffleOp if result is identical to one of source vectors.
For example, we could do the following eliminations:
  fold vector.shuffle V1, V2, [0, 1, 2, 3] : <4xi32>, <2xi32> -> V1
  fold vector.shuffle V1, V2, [4, 5] : <4xi32>, <2xi32> -> V2

Differential Revision: https://reviews.llvm.org/D122706
2022-03-31 10:46:13 +08:00
Javier Setoain
7bc8ad5109 [mlir][vector][nfc] Rename index optimizations option
We are using "enable-index-optimizations" and "indexOptimizations" as
names for an optimization that consists of using i32 for indices within
a vector. For instance, when building a vector comparison for mask
generation. The name is confusing and suggests a scope beyond these
vector indices.  This change makes the function of the option explicit
in its name.

Differential Revision: https://reviews.llvm.org/D122415
2022-03-29 11:33:22 +01:00
Jacques Pienaar
7c38fd605b [mlir] Flip Vector dialect accessors used to prefixed form.
This has been on _Both for a couple of weeks. Flip usages in core with
intention to flip flag to _Prefixed in follow up. Needed to add a couple
of helper methods in AffineOps and Linalg to facilitate a pure flag flip
in follow up as some of these classes are used in templates and so
sensitive to Vector dialect changes.

Differential Revision: https://reviews.llvm.org/D122151
2022-03-28 11:24:47 -07:00
Benjamin Kramer
12bd1ef37c [bazel] Add missing dependency after a75a46db89 2022-03-25 12:02:36 +01:00
Javier Setoain
a75a46db89 [mlir][Vector] Enable create_mask for scalable vectors
The way vector.create_mask is currently lowered is
vector-length-dependent, and therefore incompatible with scalable vector
types. This patch adds an alternative lowering path for create_mask
operations that return a scalable vector mask.

Differential Revision: https://reviews.llvm.org/D118248
2022-03-25 10:48:59 +00:00
Chia-hung Duan
14ecafd0bd [mlir] Make OpBuilder::createOperation to accept raw inputs
This provides a way to create an operation without manipulating
OperationState directly. This is useful for creating unregistered ops.

Reviewed By: rriddle, mehdi_amini

Differential Revision: https://reviews.llvm.org/D120787
2022-03-23 22:13:48 +00:00
Markus Böck
e13d23bc6c [mlir] Rename OpAsmParser::OperandType to OpAsmParser::UnresolvedOperand
I am not sure about the meaning of Type in the name (was it meant be interpreted as Kind?), and given the importance and meaning of Type in the context of MLIR, its probably better to rename it. Given the comment in the source code, the suggestion in the GitHub issue and the final discussions in the review, this patch renames the OperandType to UnresolvedOperand.

Fixes https://github.com/llvm/llvm-project/issues/54446

Differential Revision: https://reviews.llvm.org/D122142
2022-03-21 21:42:13 +01:00
River Riddle
77eee5795e [mlir] Refactor DialectRegistry delayed interface support into a general DialectExtension mechanism
The current dialect registry allows for attaching delayed interfaces, that are added to attrs/dialects/ops/etc.
when the owning dialect gets loaded. This is clunky for quite a few reasons, e.g. each interface type has a
separate tracking structure, and is also quite limiting. This commit refactors this delayed mutation of
dialect constructs into a more general DialectExtension mechanism. This mechanism is essentially a registration
callback that is invoked when a set of dialects have been loaded. This allows for attaching interfaces directly
on the loaded constructs, and also allows for loading new dependent dialects. The latter of which is
extremely useful as it will now enable dependent dialects to only apply in the contexts in which they
are necessary. For example, a dialect dependency can now be conditional on if a user actually needs the
interface that relies on it.

Differential Revision: https://reviews.llvm.org/D120367
2022-03-16 22:15:25 -07:00
River Riddle
3655069234 [mlir] Move the Builtin FuncOp to the Func dialect
This commit moves FuncOp out of the builtin dialect, and into the Func
dialect. This move has been planned in some capacity from the moment
we made FuncOp an operation (years ago). This commit handles the
functional aspects of the move, but various aspects are left untouched
to ease migration: func::FuncOp is re-exported into mlir to reduce
the actual API churn, the assembly format still accepts the unqualified
`func`. These temporary measures will remain for a little while to
simplify migration before being removed.

Differential Revision: https://reviews.llvm.org/D121266
2022-03-16 17:07:03 -07:00
Matthias Springer
9597b16aa9 [mlir][bufferize][NFC] Split BufferizationState into AnalysisState/BufferizationState
Differential Revision: https://reviews.llvm.org/D121361
2022-03-15 17:35:47 +09:00
Matthias Springer
de5022c7d7 [mlir][vector] Implement unrolling of ReductionOp
Differential Revision: https://reviews.llvm.org/D121597
2022-03-15 01:21:24 +09:00
Thomas Raoux
f69175b1e6 [mlir][vector] Add unrolling pattern for multidim_reduce op
Implement the vectorLoopUnroll interface for MultiDimReduceOp and add a
pattern to do the unrolling following the same interface other vector
unroll patterns.

Differential Revision: https://reviews.llvm.org/D121263
2022-03-14 15:22:24 +00:00
gysit
7294be2b8e [mlir][linalg] Replace linalg.fill by OpDSL variant.
The revision removes the linalg.fill operation and renames the OpDSL generated linalg.fill_tensor operation to replace it. After the change, all named structured operations are defined via OpDSL and there are no handwritten operations left.

A side-effect of the change is that the pretty printed form changes from:
```
%1 = linalg.fill(%cst, %0) : f32, tensor<?x?xf32> -> tensor<?x?xf32>
```
changes to
```
%1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<?x?xf32>) -> tensor<?x?xf32>
```
Additionally, the builder signature now takes input and output value ranges as it is the case for all other OpDSL operations:
```
rewriter.create<linalg::FillOp>(loc, val, output)
```
changes to
```
rewriter.create<linalg::FillOp>(loc, ValueRange{val}, ValueRange{output})
```
All other changes remain minimal. In particular, the canonicalization patterns are the same and the `value()`, `output()`, and `result()` methods are now implemented by the FillOpInterface.

Depends On D120726

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D120728
2022-03-14 10:51:08 +00:00
Diego Caballero
f71f9958b9 [mlir][Vector] Modernize default lowering of vector transpose
This patch removes an old recursive implementation to lower vector.transpose to extract/insert operations
and replaces it with a iterative approach that leverages newer linearization/delinearization utilities.
The patch should be NFC except by the order in which the extract/insert ops are generated.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D121321
2022-03-10 22:33:14 +00:00
River Riddle
171850c55a [mlir][Vector] Drop use of FuncOp in transferOpflowOpt
FuncOp isn't really important to hardcode here, it is only used to act
as a root operation for the transformation.

Differential Revision: https://reviews.llvm.org/D121195
2022-03-08 12:25:32 -08:00
Javier Setoain
f2b89c7ae0 [mlir][Vector] Use create_mask in transfer mask materializations
Currently, the transfer mask is materialized by generating the vector
comparison: [offset + 0, .., offset + length - 1] < [dim, .., dim]

A better alternative is to materialize the transfer mask by using the
operation: `vector.create_mask (dim - offset)`, which will generate
simpler code and compose better with scalable vectors.

Differential Revision: https://reviews.llvm.org/D120487
2022-03-08 09:02:50 +00:00
Hanhan Wang
1538bd518c [mlir][Vector] Add patterns to reorder elementwise ops and broadcast/transpose ops.
In quantized comutation, there are casting ops around computation ops.
Reorder the ops to make reduce-to-contract actually work.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D120760
2022-03-07 12:52:12 -08:00
Diego Caballero
917d95fc8a [mlir][Vector] Improve default lowering of vector transpose operations
The default lowering of vector transpose operations generates a large sequence of
scalar extract/insert operations, one pair for each scalar element in the input tensor.
In other words, the vector transpose is scalarized. However, there are transpose
patterns where one or more adjacent high-order dimensions are not transposed (for
example, in the transpose pattern [1, 0, 2, 3], dimensions 2 and 3 are not transposed).
This patch improves the lowering of those cases by not scalarizing them and extracting/
inserting a full n-D vector, where 'n' is the number of adjacent high-order dimensions
not being transposed. By doing so, we prevent the scalarization of the code and generate a
more performant vector version.

Paradoxically, this patch shouldn't improve the performance of transpose operations if
we are using LLVM. The LLVM pipeline is able to optimize away some of the extract/insert
operations and the SLP vectorizer is converting the scalar operations back to its vector
form. However, scalarizing a vector version of the code in MLIR and relying on the SLP
vectorizer to reconstruct the vector code again is highly undesirable for several reasons.

Reviewed By: nicolasvasilache, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D120601
2022-03-07 17:56:02 +00:00
River Riddle
23aa5a7446 [mlir] Rename the Standard dialect to the Func dialect
The last remaining operations in the standard dialect all revolve around
FuncOp/function related constructs. This patch simply handles the initial
renaming (which by itself is already huge), but there are a large number
of cleanups unlocked/necessary afterwards:

* Removing a bunch of unnecessary dependencies on Func
* Cleaning up the From/ToStandard conversion passes
* Preparing for the move of FuncOp to the Func dialect

See the discussion at https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061

Differential Revision: https://reviews.llvm.org/D120624
2022-03-01 12:10:04 -08:00
Benjamin Kramer
3ce2ee28f0 [mlir][ODS] Infer return types if the operands are variadic but the results are not
Clean up code that worked around this limitation.

Differential Revision: https://reviews.llvm.org/D120119
2022-02-18 15:29:06 +01:00