Commit Graph

217 Commits

Author SHA1 Message Date
Matthias Springer
4cd7362083 [mlir][SCF] foreach_thread: Capture shared output tensors explicitly
This change refines the semantics of scf.foreach_thread. Tensors that are inserted into in the terminator must now be passed to the region explicitly via `shared_outs`. Inside of the body of the op, those tensors are then accessed via block arguments.

The body of a scf.foreach_thread is now treated as a repetitive region. I.e., op dominance can no longer be used in conflict detection when using a value that is defined outside of the body. Such uses may now be considered as conflicts (if there is at least one read and one write in the body), effectively privatizing the tensor. Shared outputs are not privatized when they are used via their corresponding block arguments.

As part of this change, it was also necessary to update the "tiling to scf.foreach_thread", such that the generated tensor.extract_slice ops use the scf.foreach_thread's block arguments. This is implemented by cloning the TilingInterface op inside the scf.foreach_thread, rewriting all of its outputs with block arguments and then calling the tiling implementation. Afterwards, the cloned op is deleted again.

Differential Revision: https://reviews.llvm.org/D133114
2022-09-02 14:54:04 +02:00
Matthias Springer
547942841f [mlir][interfaces] Drop dest/tileDestOperands from TilingInterface
`getTiledImplementation`/`generateResultTileValue` only computes the tiled operation, but does not insert the result into any tensor.

Differential Revision: https://reviews.llvm.org/D133015
2022-09-01 08:53:53 +02:00
Michele Scuttari
67d0d7ac0a [MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Review: https://reviews.llvm.org/D132838
2022-08-31 12:28:45 +02:00
Michele Scuttari
039b969b32 Revert "[MLIR] Update pass declarations to new autogenerated files"
This reverts commit 2be8af8f0e.
2022-08-30 22:21:55 +02:00
Michele Scuttari
2be8af8f0e [MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Review: https://reviews.llvm.org/D132838
2022-08-30 21:56:31 +02:00
Adrian Kuegel
e0568fa763 [mlir] Apply ClangTidy performance finding (NFC). 2022-08-29 09:15:35 +02:00
Thomas Raoux
06c02d5dbb [mlir][linalg] Fix tiling interface implementation offset calculation
The tiling interface implementation was making assumption on the code
generated by makeTiledShape which were wrong. The ExtractSliceOp create
may be combined with other ExtractSliceOp. To solve that we compute
directly the offset using the new utilities.

Differential Revision: https://reviews.llvm.org/D132182
2022-08-19 00:16:33 +00:00
Mahesh Ravishankar
f365e85c83 [mlir] Revisit LinalgLoopDistributionOptions.
This patch cleans up the way `LinalgLoopDistributionOptions` are meant
to be used. The option just contains a call back that takes the list
of loop ranges that represent the loops that are to be distributed.
These loops are the outer parallel loops of the tiled operation which
have non-zero tile sizes specified. The call back returns for each of
the loops,
- The procId to use,
- The number of processors,
- The distribution method to use for that loop.

Reviewed By: antiagainst, hanchung

Differential Revision: https://reviews.llvm.org/D131232
2022-08-15 15:56:17 +00:00
Benjamin Kramer
9fa59e7643 [mlir] Use C++17 structured bindings instead of std::tie where applicable. NFCI 2022-08-09 13:34:17 +02:00
Alex Zinenko
26821f75ed [mlir][NFC] accept plain OpBuidler in folded construction helpers
A group of functions in the Affine dialect provides a mechanism for
buliding folded-by-construction operations. These functions used to
accept a `RewriterBase` reference because they may need to erase the
operations that were folded and notify the rewriter when called from
rewrite patterns. Adopt a different approach: postpone the builder
notification of the op creation until we are certain that the op will
not be folded away. This removes the need to notify the rewriter about
op deletion following op construction in case of successful folding, and
removes a bunch of one-off `IRRewriter` instances in transform code that
may mess up insertion points.

Reviewed By: springerm, mravishankar

Differential Revision: https://reviews.llvm.org/D130616
2022-07-29 16:01:56 +00:00
Alex Zinenko
e99fae8997 [mlir] more aggressive folding in tiling/fusion transformations
Combine the recently added utilities for folded-by-construction affine
operations with the attribute-based Range to enable more folding. This
decreases the amount of emitted code but has little effect on test
precisely because the tests are not checking for the spurious constants.
The difference in the shape of affine maps comes from the internals of
affine folding.

Depends on D129633

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D130167
2022-07-27 08:52:18 +00:00
Alex Zinenko
70e99f387a [mlir] Make ViewLikeInterface Range work with attributes
While most of methods in ViewLikeInterface accept an `OpFoldResult` for
the offset/size/stride that may be static, represented as `Attribute`,
or dynamic, represented as `Value`, the `Range` abstraction only
accepted `Values`. This can often lead to known-constant
offset/size/strides being materialized into constant operations and
hinder further constant propagation without explicitly running the
constant folding pass. This often leads to a more complicated than
necessary addressing code being emitted. Switch `Range` to use
`OpFoldResult`. Code that uses `Range` currently keeps materializing the
constants to minimize the effect of this change on the IR. Further
commits will make use of this.

Reviewed By: nicolasvasilache, mravishankar

Differential Revision: https://reviews.llvm.org/D129633
2022-07-27 08:52:13 +00:00
Kazu Hirata
6fa6901bf0 Use has_value instead of hasValue (NFC) 2022-07-22 23:04:38 -07:00
Christopher Bate
297ba167de [mlir][linalg] Add tile_size option to structured.tile_to_foreach_thread_op
This change modifies `structured.tile_to_foreach_thread_op` so that
it accepts either `tile_sizes` or `num_threads` parameters. If
`tile_sizes` are specified, then the number of threads required is
derived the tile sizes rather than the other way around. In both cases,
more aggressive folding of loop parameters is enabled during the
transformation, allowing for the potential elimination of `affine.min`
and `affine.max` operations in the static shape case when calculating
the final adjusted tile size.

Differential Revision: https://reviews.llvm.org/D130139
2022-07-21 10:32:01 -06:00
Nicolas Vasilache
18b92c66fe [mlir][Linalg] Add a TileToForeachThread transform.
This revision adds a new transformation to tile a TilingInterface `op` to a tiled `scf.foreach_thread`, applying
tiling by `num_threads`.
If non-empty, the `threadDimMapping` is added as an attribute to the resulting `scf.foreach_thread`.
0-tile sizes (i.e. tile by the full size of the data) are used to encode
that a dimension is not tiled.

Differential Revision: https://reviews.llvm.org/D129577
2022-07-19 04:56:11 -07:00
Alex Zinenko
81b62f7feb [mlir] Handle linalg.index correctly in TilingInterface
The existing implementation of the TilingInterface for Linalg ops was not
modifying the `linalg.index` ops contained within other Linalg ops (they need
to be summed up with the values of respective tile loop induction variables),
which led to the interface-based tiling being incorrect for any Linalg op with
index semantics.

In the process, fix the function performing the index offsetting to use the
pattern rewriter API instead of RAUW as it is being called from patterns and
may mess up the internal state of the rewriter. Also rename the function to
clearly catch all uses.

Depends On D129365

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D129366
2022-07-12 12:36:33 +00:00
Alex Zinenko
3963b4d0dc [mlir] Transform op for multitile size generation
Introduce a structured transform op that emits IR computing the multi-tile
sizes with requested parameters (target size and divisor) for the given
structured op. The sizes may fold to arithmetic constant operations when the
shape is constant. These operations may then be used to call the existing
tiling transformation with a single non-zero dynamic size (i.e. perform
strip-mining) for each of the dimensions separately, thus achieving multi-size
tiling with optional loop interchange. A separate test exercises the entire
script.

Depends On D129217

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D129287
2022-07-12 12:36:28 +00:00
Alex Zinenko
ff6e5508d6 [mlir] Structured transforms: introduce op splitting
Introduce a new transformation on structured ops that splits the iteration
space into two parts along the specified dimension. The index at which the
splitting happens may be static or dynamic. This transformation can be seen as
a rudimentary form of index-set splitting that only supports the splitting
along hyperplanes parallel to the iteration space hyperplanes, and is therefore
decomposable into per-dimension application.

It is a key low-level transformation that enables independent scheduling for
different parts of the iteration space of the same op, which hasn't been
possible previously. It may be used to implement, e.g., multi-sized tiling. In
future, peeling can be implemented as a combination of split-off amount
computation and splitting.

The transformation is conceptually close to tiling in its separation of the
iteration and data spaces, but cannot be currently implemented on top of
TilingInterface as the latter does not properly support `linalg.index`
offsetting.

Note that the transformation intentionally bypasses folding of
`tensor.extract_slice` operations when creating them as this folding was found
to prevent repeated splitting of the same operation because due to internal
assumptions about extract/insert_slice combination in dialect utilities.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D129090
2022-07-07 13:19:44 +02:00
Jacques Pienaar
04235d07ad [mlir] Update flipped accessors (NFC)
Follow up with memref flipped and flipping any intermediate changes
made.
2022-06-28 13:11:26 -07:00
Alex Zinenko
8b68da2c7d [mlir] move SCF headers to SCF/{IR,Transforms} respectively
This aligns the SCF dialect file layout with the majority of the dialects.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D128049
2022-06-20 10:18:01 +02:00
Mahesh Ravishankar
cf6a7c1947 [mlir][TilingInterface] Add pattern to tile using TilingInterface and implement TilingInterface for Linalg ops.
This patch adds support for tiling operations that implement the
TilingInterface.
- It separates the loop constructs that are used to iterate over tile
  from the implementation of the tiling itself. For example, the use
  of destructive updates is more related to use of scf.for for
  iterating over tiles that are tensors.
- To test the transformation, TilingInterface is implemented for
  LinalgOps. The separation of the looping constructs used from the
  implementation of tile code generation greatly simplifies the
  latter.
- The implementation of TilingInterface for LinalgOp is kept as an
  external model for now till this approach can be fully flushed out
  to replace the existing tiling + fusion approaches in Linalg.

Differential Revision: https://reviews.llvm.org/D127133
2022-06-13 20:37:44 +00:00
River Riddle
58ceae9561 [mlir:NFC] Remove the forward declaration of FuncOp in the mlir namespace
FuncOp has been moved to the `func` namespace for a little over a month, the
using directive can be dropped now.
2022-04-18 12:01:55 -07:00
Okwan Kwon
65bdeddb1e [mlir] Bubble up tensor.extract_slice above linalg operation
Bubble up extract_slice above Linalg operation.

A sequence of operations

    %0 = linalg.<op> ... arg0, arg1, ...
    %1 = tensor.extract_slice %0 ...

can be replaced with

    %0 = tensor.extract_slice %arg0
    %1 = tensor.extract_slice %arg1
    %2 = linalg.<op> ... %0, %1, ...

This results in the reduce computation of the linalg operation.

The implementation uses the tiling utility functions. One difference
from the tiling process is that we don't need to insert the checking
code for the out-of-bound accesses. The use of the slice itself
represents that the code writer is sure about the boundary condition.
To avoid adding the boundary condtion check code, `omitPartialTileCheck`
is introduced for the tiling utility functions.

Differential Revision: https://reviews.llvm.org/D122437
2022-03-31 16:48:38 +00:00
Diego Caballero
f71f9958b9 [mlir][Vector] Modernize default lowering of vector transpose
This patch removes an old recursive implementation to lower vector.transpose to extract/insert operations
and replaces it with a iterative approach that leverages newer linearization/delinearization utilities.
The patch should be NFC except by the order in which the extract/insert ops are generated.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D121321
2022-03-10 22:33:14 +00:00
Alexander Belyaev
1a829d2d06 [mlir] Purge linalg.tiled_loop.
Differential Revision: https://reviews.llvm.org/D119415
2022-02-28 09:05:18 +01:00
Alexander Belyaev
c962038914 [mlir][nfc] Expose linalg tiling helpers.
Differential Revision: https://reviews.llvm.org/D119330
2022-02-09 15:26:06 +01:00
Alexander Belyaev
fd0c6f5391 [mlir] Move linalg::PadTensorOp to tensor::PadOp.
RFC: https://llvm.discourse.group/t/rfc-move-linalg-padtensorop-to-tensor-padop/5785

Differential Revision: https://reviews.llvm.org/D117892
2022-01-21 20:02:39 +01:00
River Riddle
4157455425 [mlir][Pass] Deprecate FunctionPass in favor of OperationPass<FuncOp>
The only benefit of FunctionPass is that it filters out function
declarations. This isn't enough to justify carrying it around, as we can
simplify filter out declarations when necessary within the pass. We can
also explore with better scheduling primitives to filter out declarations
at the pipeline level in the future.

The definition of FunctionPass is left intact for now to allow time for downstream
users to migrate.

Differential Revision: https://reviews.llvm.org/D117182
2022-01-18 19:52:44 -08:00
Nicolas Vasilache
8a8f0a00b2 [mlir][Linalg] Relax PadTensor tiling constraints and expose it to strategies.
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D117334
2022-01-17 17:13:55 +00:00
Nicolas Vasilache
4a661602ef [mlir][Linalg] NFC - Modernize APIs and get rid of unnecessary tiling paterns.
Tiling patterns can be reduced to a single pattern by using interface-based patterns.

Differential Revision: https://reviews.llvm.org/D116733
2022-01-06 16:27:35 -05:00
Mehdi Amini
e4853be2f1 Apply clang-tidy fixes for performance-for-range-copy to MLIR (NFC) 2022-01-02 22:19:56 +00:00
Mehdi Amini
1fc096af1e Apply clang-tidy fixes for performance-unnecessary-value-param to MLIR (NFC)
Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D116250
2022-01-02 01:45:18 +00:00
gysit
b7f2c108eb [mlir][linalg] Replace LinalgOps.h and LinalgTypes.h by a single header.
After removing the range type, Linalg does not define any type. The revision thus consolidates the LinalgOps.h and LinalgTypes.h into a single Linalg.h header. Additionally, LinalgTypes.cpp is renamed to LinalgDialect.cpp to follow the convention adopted by other dialects such as the tensor dialect.

Depends On D115727

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115728
2021-12-15 12:15:03 +00:00
Nicolas Vasilache
61ba9f9110 [mlir][Linalg] NFC - Extend the TilingInterface to allow better composition with out-of-tree dialects.
Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D115233
2021-12-07 13:06:27 +00:00
MaheshRavishankar
526dfe3f4d [mlir][Linalg] Do not return failure when all tile sizes are zero.
Returning failure when tile sizes are all zero prevents the change in
the marker. This makes pattern rewriter run the pattern multiple times
only to exit when it hits a limit. Instead just clone the operation
(since tiling is essentially cloning in this case). Then the
transformation filter kicks in to avoid the pattern rewriter to be
invoked many times.

Differential Revision: https://reviews.llvm.org/D113949
2021-11-18 09:28:25 -08:00
River Riddle
195730a650 [mlir][NFC] Replace references to Identifier with StringAttr
This is part of the replacement of Identifier with StringAttr.

Differential Revision: https://reviews.llvm.org/D113953
2021-11-16 17:36:26 +00:00
Nicolas Vasilache
489fec2777 [mlir][Linalg] NFC - Drop Optional in favor of FailureOr
Differential revision: https://reviews.llvm.org/D112332
2021-10-22 19:28:18 +00:00
Mogball
cb3aa49ec0 [MLIR][arith] fix references to std.constant in comments
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D111820
2021-10-14 20:38:47 +00:00
Mogball
a54f4eae0e [MLIR] Replace std ops with arith dialect ops
Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.

Renamed all instances of operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D110797
2021-10-13 03:07:03 +00:00
Tobias Gysi
8ed2e8e04f [mlir][linalg] Retire Linalg ConvOp.
The convolution op is one of the remaining hard coded Linalg operations that have no region attached. It got obsolete due to the OpDSL convolution operations. Removing it allows us to delete specialized code and tests that are not needed for the OpDSL counterparts that rely on the standard code paths.

Test needed due to specialized implementations are removed. Tiling and fusion tests are replaced by variants using linalg.conv_2d.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111233
2021-10-08 06:56:37 +00:00
Lei Zhang
a3f425946d [mlir][linalg] Include InitTensorOp in tiling canonicalization
Tiling can create dim ops and those dim ops can take `InitTensorOp`
as input. Including it in the tiling canonicalization patterns
allows us to fold those dim ops away.

Also sorted the existing ops along the way.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D110876
2021-10-01 14:13:19 -04:00
Matthias Springer
8dc16ba8d2 [mlir][linalg] Merge all tiling passes into a single one.
Passes such as `linalg-tile-to-tiled-loop` are merged into `linalg-tile`.

Differential Revision: https://reviews.llvm.org/D110214
2021-09-24 10:16:46 +09:00
Tobias Gysi
9072f1b5f8 [mlir][linalg] Add isPermutation helper (NFC).
Add a helper method to check if an index vector contains a permutation of its indices. Additionally, refactor applyPermutationToVector to take int64_t.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110135
2021-09-21 15:07:39 +00:00
Tobias Gysi
90b7817e03 [mlir][linalg] Add helper to update IndexOps after tiling (NFC).
Add the addTileLoopIvsToIndexOpResults method to shift the IndexOp results after tiling.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D109761
2021-09-17 15:17:33 +00:00
Tobias Gysi
16488dc300 [mlir][linalg] Pass all operands to tile to the tile loop region builder (NFC).
Extend the signature of the tile loop nest region builder to take all operand values to use and not just the scf::For iterArgs. This change allows us to pass in all block arguments of TiledLoop and use them directly instead of replacing them after the loop generation.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D109569
2021-09-10 08:35:11 +00:00
Matthias Springer
c95a7246a3 [mlir][linalg] Tiling: Use loop ub in extract_slice size computation if possible
When tiling a LinalgOp, extract_slice/insert_slice pairs are inserted. To avoid going out-of-bounds when the tile size does not divide the shape size evenly (at the boundary), AffineMin ops are inserted. Some ops have assumptions regarding the dimensions of inputs/outputs. E.g., in a `A * B` matmul, `dim(A, 1) == dim(B, 0)`. However, loop bounds use either `dim(A, 1)` or `dim(B, 0)`.

With this change, AffineMin ops are expressed in terms of loop bounds instead of tensor sizes. (Both have the same runtime value.) This simplifies canonicalizations.

Differential Revision: https://reviews.llvm.org/D109267
2021-09-09 11:06:22 +09:00
MaheshRavishankar
b686fdbf92 [mlir][Linalg] Drop output tensor from linalg.pad_tensor op.
The output tensor was added for tiling purposes. With use of
`TilingInterface` for tiling pad operations, there is no need for an
explicit operand for the shape of result of `linalg.pad_tensor`
op. The interface allows the tiling pattern to query the value that
can be used for the "init" needed for tiling dynamically.

Differential Revision: https://reviews.llvm.org/D108613
2021-08-31 11:12:24 -07:00
MaheshRavishankar
ba72cfe734 [mlir] Add an interface to allow operations to specify how they can be tiled.
An interface to allow for tiling of operations is introduced. The
tiling of the linalg.pad_tensor operation is modified to use this
interface.

Differential Revision: https://reviews.llvm.org/D108611
2021-08-30 16:31:18 -07:00
Matthias Springer
d18ffd61d4 [mlir][SCF] Canonicalize dim(x) where x is an iter_arg
* Add `DimOfIterArgFolder`.
* Move existing cross-dialect canonicalization patterns to `LoopCanonicalization.cpp`.
* Rename `SCFAffineOpCanonicalization` pass to `SCFForLoopCanonicalization`.
* Expand documentaton of scf.for: The type of loop-carried variables may not change with iterations. (Not even the dynamic type.)

Differential Revision: https://reviews.llvm.org/D108806
2021-08-30 01:39:56 +00:00
Matthias Springer
2de2dbef2a [mlir][linalg] Replace AffineMinSCFCanonicalizationPattern with SCF reimplementation
Use the new canonicalization pattern in the SCF dialect.

Differential Revision: https://reviews.llvm.org/D107732
2021-08-25 08:52:56 +09:00