Commit Graph

359 Commits

Author SHA1 Message Date
Kazu Hirata
70c73d1b72 [mlir] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 17:23:50 -08:00
Kazu Hirata
1a36588ec6 [mlir] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03 18:50:27 -08:00
River Riddle
b74192b7ae [mlir] Remove support for non-prefixed accessors
This finishes off a year long pursuit to LLVMify the generated
operation accessors, prefixing them with get/set. Support for
any other accessor naming is fully removed after this commit.

https://discourse.llvm.org/t/psa-raw-accessors-are-being-removed/65629

Differential Revision: https://reviews.llvm.org/D136727
2022-12-02 13:32:36 -08:00
Hanhan Wang
b1d3afc93e [mlir] Factor more common utils to IndexingUtils
Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D139159
2022-12-02 13:27:01 -08:00
Nicolas Vasilache
a8850312c1 [mlir][Transform][NFC] Use a single rewriter instead of duplicating it everywhere
Differential Revision: https://reviews.llvm.org/D139094
2022-12-01 03:54:31 -08:00
Benjamin Kramer
652592c5ca [MLIR][Transform] Disambiguate ternary operator for MSVC
mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp(42): error C2446: ':': no conversion from 'OpTy' to 'OpTy'
        with
        [
            OpTy=mlir::scf::ForOp
        ]
        and
        [
            OpTy=mlir::AffineForOp
        ]
mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp(42): note: No user-defined-conversion operator available that can perform this conversion, or the operator cannot be called
2022-12-01 08:59:58 +01:00
Christian Sigg
be065c41d8 [mlir] Change scf::LoopNest to store 'results'.
This fixes the case where scf::LoopNest::loops is empty.

Change LoopVector and ValueVector to SmallVector.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D136926
2022-12-01 06:51:45 +01:00
Amy Wang
e4e64eaade [MLIR][Transform] Consolidate the transform ops of get_parent_for and loop unroll from affine and scf dialects.
This patch consolidates the two transform ops from the affine dialect
and the scf dialect to avoid code duplication.

This is to address the review comments from
https://reviews.llvm.org/D137997.

The transform ops directory / file structure for the affine dialect is
kept for the purpose of forth-coming transform ops
for affine, but get_parent_for and unroll are removed.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D138980
2022-11-30 11:07:44 -05:00
Matthias Springer
0d9761d50e [mlir][SCF] Add tensor.dim(scf.foreach_thread) folding
Dim sizes of `scf.foreach_thread` op results match the dim sizes of their respective tied shared_outs operands.

Differential Revision: https://reviews.llvm.org/D138484
2022-11-22 11:28:27 +01:00
Lei Zhang
9bb633741a [mlir][bufferization] Support general Attribute as memory space
MemRef has been accepting a general Attribute as memory space for
a long time. This commits updates bufferization side to catch up,
which allows downstream users to plugin customized symbolic memory
space. This also eliminates quite a few `getMemorySpaceAsInt`
calls, which is deprecated.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D138330
2022-11-21 09:40:50 -05:00
Che-Yu Wu
4fdf624b78 Increase the limit of SCF nested tiling loop to 10
Differential Revision: https://reviews.llvm.org/D138156
2022-11-17 08:00:21 +00:00
Mahesh Ravishankar
fc367dfa67 [mlir] Remove Transforms/SideEffectUtils.h and move the methods into Interface/SideEffectInterfaces.h.
The methods in `SideEffectUtils.h` (and their implementations in
`SideEffectUtils.cpp`) seem to have similar intent to methods already
existing in `SideEffectInterfaces.h`. Move the decleration (and
implementation) from `SideEffectUtils.h` (and `SideEffectUtils.cpp`)
into `SideEffectInterfaces.h` (and `SideEffectInterface.cpp`).

Also drop the `SideEffectInterface::hasNoEffect` method in favor of
`mlir::isMemoryEffectFree` which actually recurses into the operation
instead of just relying on the `hasRecursiveMemoryEffectTrait`
exclusively.

Differential Revision: https://reviews.llvm.org/D137857
2022-11-15 20:07:35 +00:00
Mehdi Amini
fbfca43e6d Apply clang-tidy fixes for llvm-qualified-auto in TileUsingInterface.cpp (NFC) 2022-11-15 18:14:01 +00:00
Mohammed Anany
77533d79f7 [mlir][SCF] Adding custom builder to SCF::WhileOp.
This is a similar builder to the one for SCF::IfOp which allows users to pass region builders to it. Refer to the builders for IfOp.

Reviewed By: tpopp

Differential Revision: https://reviews.llvm.org/D137709
2022-11-15 18:16:49 +01:00
Nicolas Vasilache
f0a411da77 [mlir][Transform]Significantly cleanup scf.foreach_thread and GPU transform permutation handling
Previously, the need for a dense permutation leaked into the thread_dim_mapping specification.
This revision allows to use a sparse specification of the thread_dim_mapping and the proper completion / sorting is applied automatically.

In the process, the sematics of scf.foreach_thread is tightened to require a matching number of thread dimensions and mappings.
The relevant negative test is added.

Differential Revision: https://reviews.llvm.org/D137906
2022-11-14 09:19:49 -08:00
Alexander Belyaev
07665e78cb [mlir] Fix forward the fix for incorrect Optional<ArrayAttr> usage. 2022-11-11 10:53:04 +01:00
Alexander Belyaev
b7162e136e [mlir] Fix incorrect access to the Optional<ArrayAttr> underlying values. 2022-11-11 10:46:04 +01:00
Guray Ozen
6663f34704 [mlir] Introduce device mapper attribute for thread_dim_map and mapped to dims
`scf.foreach_thread` defines mapping its loops to processors via an integer array, see an example below. A lowering can use this mapping. However, expressing mapping as an integer array is very confusing, especially when there are multiple levels of parallelism. In addition, the op does not verify the integer array. This change introduces device mapping attribute to make mapping descriptive and verifiable. Then it makes GPU transform dialect use it.

```
scf.foreach_thread (%i, %j) in (%c1, %c2) {
	scf.foreach_thread (%i2, %j2) in (%c1, %c2)
	{...} { thread_dim_mapping = [0, 1]}
} { thread_dim_mapping = [0, 1]}
```

It first introduces a `DeviceMappingInterface` which is an attribute interface. `scf.foreach_thread` defines its mapping via this interface. A lowering must define its attributes and implement this interface as well. This way gives us a clear validation.

The change also introduces two new attributes (`#gpu.thread<x/y/z>` and `#gpu.block<x,y,z>` ). After this change, the above code prints as below, as seen here, this way clarifies the loop mappings. The change also implements consuming of these two new attribute by the transform dialect. Transform dialect binds the outermost loops to the thread blocks and innermost loops to threads.

```
scf.foreach_thread (%i, %j) in (%c1, %c2) {
	scf.foreach_thread (%i2, %j2) in (%c1, %c2)
	{...} { thread_dim_mapping = [#gpu.thread<x>, #gpu.thread<y>]}
} { thread_dim_mapping = [#gpu.block<x>, #gpu.block<y>]}
```

Reviewed By: ftynse, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D137413
2022-11-11 08:44:57 +01:00
Mehdi Amini
34233d4995 Apply clang-tidy fixes for readability-redundant-smartptr-get in SCF.cpp (NFC) 2022-11-09 00:41:30 +00:00
Mehdi Amini
f5865c8701 Apply clang-tidy fixes for llvm-qualified-auto in SCF.cpp (NFC) 2022-11-09 00:41:30 +00:00
Kazu Hirata
585e35a998 [mlir] Use llvm::is_contained (NFC) 2022-11-06 19:56:15 -08:00
Hanhan Wang
52ffc72818 [mlir][tiling] Relax tiling to accept generating multiple operations.
Some operations need to generate multiple operations when implementing
the tiling interface. Here is a sound example in IREE, see
https://github.com/iree-org/iree/pull/10905 for more details.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D137300
2022-11-04 13:59:24 -07:00
Thomas Raoux
3310fe55d9 [mlir][linalg] Add reduction tiling transformation
Add a transformation to tile reduction ops into a parallel operation
followed by a merge operation. This is equivalent to the existing
reduction spliting transformation but using loops instead of using
higher dimensions linalg.

Differential Revision: https://reviews.llvm.org/D136586
2022-11-03 23:07:12 +00:00
Peiming Liu
1ca119728e [mlir][scf] support 1:N type conversion for scf.if/while/condition
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D137100
2022-11-02 16:53:36 +00:00
Peiming Liu
f4cd3674ea [mlir][scf] refactor scf structuralOpConversion to better support 1:N type conversion
This patch moves the 1:N type mapping into its own classes to allow better code reuse in D137100.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D137099
2022-11-02 16:45:39 +00:00
Nicolas Vasilache
d4c4e49196 [mlir][Linalg] Drop usage of tileWithLinalgTilingOptions in the structured.tile transform
This is on a path to deprecation.
Context: https://discourse.llvm.org/t/psa-retire-tileandfuselinalgops-method/63850

As the interface-based transformation is more generic, some additional folding of AffineMin/MaxOp and some extra canonicalizations are needed.
This can be further evolved.

Differential Revision: https://reviews.llvm.org/D137195
2022-11-01 14:36:24 -07:00
Sanjoy Das
788390c130 Make scf.for and affine.for conditionally speculatable
for (I = Start; I < End; I += 1) always terminates so mark
{scf|affine}.for as RecursivelySpeculatable when step is known to be
1.

Reviewed By: chelini

Differential Revision: https://reviews.llvm.org/D136376
2022-10-30 16:08:42 -07:00
Hanhan Wang
71cf48a62a [mlir][scf] Enhance sizes computation in tileUsingSCFForOp.
The boundary is always 1 if the tile size is 1.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D136884
2022-10-28 13:03:10 -07:00
Alexander Belyaev
b4db15a949 [mlir] Rename getInputs->getDpsInputs and getOutputs->getDpsInits in DPS interface.
https://discourse.llvm.org/t/rfc-interface-for-destination-style-ops/64056

Differential Revision: https://reviews.llvm.org/D136943
2022-10-28 15:41:12 +02:00
Matthias Springer
c9b3638126 [mlir][scf][bufferize] Fix bufferizesToMemoryRead with 0 loop iterations
There was a bug in scf.for loop bufferization that could lead to a missing buffer copy (alloc was there, but not the copy).

Differential Revision: https://reviews.llvm.org/D135053
2022-10-24 14:34:41 +02:00
Matthias Springer
b169643f3a [mlir][interfaces] Remove getDestinationOperands from TilingInterface
`getDestinationOperands` was almost a duplicate of `DestinationStyleOpInterface::getOutputOperands`. Now that the interface has been moved to mlir/Interfaces, it is no longer needed.

Differential Revision: https://reviews.llvm.org/D136240
2022-10-24 09:26:19 +02:00
Peiming Liu
d3f5f33067 [mlir][scf] support 1:N type conversion for scf.for.
scf.for used to only support 1:1 type conversion, this patch add support for 1:N type conversion.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D136314
2022-10-21 21:11:55 +00:00
Jeff Niu
07d8fe9391 [mlir][scf] Add an IndexSwitchOp
The `scf.index_switch` is a control-flow operation that branches to one of the
given regions based on the values of the argument and the cases. The
argument is always of type `index`.

Example:

```mlir
%0 = scf.index_switch %arg0 -> i32
case 2 {
  %1 = arith.constant 10 : i32
  scf.yield %1 : i32
}
case 5 {
  %2 = arith.constant 20 : i32
  scf.yield %2 : i32
}
default {
  %3 = arith.constant 30 : i32
  scf.yield %3 : i32
}
```

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D136003
2022-10-21 09:21:10 -07:00
Jeff Niu
6005a1d8af [mlir][scf] Match any constants instead of arith.constant
By matching `arith.constant` specifically, SCF canonicalizers/folders
are incompatible with other kinds of constants. Use the generic
matchers instead.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D135517
2022-10-12 18:01:57 -07:00
Mehdi Amini
6d4baa7442 Apply clang-tidy fixes for performance-unnecessary-value-param in TileUsingInterface.cpp (NFC) 2022-10-12 05:03:45 +00:00
Mehdi Amini
2a6f0fb34a Apply clang-tidy fixes for performance-for-range-copy in TileUsingInterface.cpp (NFC) 2022-10-12 05:03:45 +00:00
Mehdi Amini
23f989a2e3 Apply clang-tidy fixes for readability-simplify-boolean-expr in BufferizableOpInterfaceImpl.cpp (NFC) 2022-10-12 01:16:36 +00:00
Alex Zinenko
59bb8af4c3 [mlir] switch the transform loop extension to use types
Add types to the Loop (SCF) extension of the transform dialect.

See https://discourse.llvm.org/t/rfc-type-system-for-the-transform-dialect/65702

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D135587
2022-10-11 09:55:23 +00:00
Nicolas Vasilache
7915027926 [mlir][Linalg] Retire LinalgStrategyTileAndFusePass and filter-based pattern.
Context: https://discourse.llvm.org/t/psa-retire-linalg-filter-based-patterns/63785

In the process, also retire `tileConsumerAndFuseProducers` that is now replaced by `tileConsumerAndFuseProducerGreedilyUsingSCFForOp`.

Context: https://discourse.llvm.org/t/psa-retire-tileandfuselinalgops-method/63850

When performing this replacement, a change of behavior appeared: the older `tileConsumerAndFuseProducers` would split the parallel
and non-parallel dimensions automatically and perform a first level of tile-and-fuse on parallel dimensions only and then introduce a
second level of tiling-only on the reduction dimensions. The newer `tileConsumerAndFuseProducerGreedilyUsingSCFForOp` on the other hand
does not perform this breakdown. As a consequence, the transform specification is evolved to produce the same output.

Additionally, replace some uses of `unsigned` by `int64_t` where possible without pulling in larger interface changes (left for a future PR).

Context: https://www.youtube.com/watch?v=Puio5dly9N8

Lastly, tests that were performing tile and fuse and distribute on tensors are retired: the generated IR mixing scf.for, tensors and
distributed processor ids was racy at best ..

Differential Revision: https://reviews.llvm.org/D135559
2022-10-10 07:04:01 -07:00
Nicolas Vasilache
af664e4459 [mlir][Transform] Add a transform.split_handles operation and fix general silenceable bugs.
The transform.split_handles op is useful for ensuring a statically known number of operations are
tracked by the source `handle` and to extract them into individual handles
that can be further manipulated in isolation.

In the process of making the op robust wrt to silenceable errors and the suppress mode, issues were
uncovered and fixed.

The main issue was that silenceable errors were short-circuited too early and the payloads were not
set. This resulted in suppressed silenceable errors not propagating correctly.
Fixing the issue triggered a few test failures: silenceable error returns now must properly set the results state.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D135426
2022-10-07 09:01:34 -07:00
Adrian Kuegel
67bcf9825a [mlir][SCF] Apply ClangTidyPerformance finding (NFC) 2022-09-30 12:47:32 +02:00
Mahesh Ravishankar
0d5cb90f6c [mlir][scf] Simplify the logic for replaceLoopWithNewYields for perfectly nested loops.
Based on discussion in https://reviews.llvm.org/D134411, instead of
first modifying the inner most loop first followed by modifying the
outer loops from inside out, this patch restructures the logic to
start the modification from the outer most loop.

Differential Revision: https://reviews.llvm.org/D134832
2022-09-29 16:52:02 +00:00
Jakub Kuderski
abc362a107 [mlir][arith] Change dialect name from Arithmetic to Arith
Suggested by @lattner in https://discourse.llvm.org/t/rfc-define-precise-arith-semantics/65507/22.

Tested with:
`ninja check-mlir check-mlir-integration check-mlir-mlir-spirv-cpu-runner check-mlir-mlir-vulkan-runner check-mlir-examples`

and `bazel build --config=generic_clang @llvm-project//mlir:all`.

Reviewed By: lattner, Mogball, rriddle, jpienaar, mehdi_amini

Differential Revision: https://reviews.llvm.org/D134762
2022-09-29 11:23:28 -04:00
Mahesh Ravishankar
97f919820b [mlir][TilingInterface] NFC Refactor of tile and fuse using TilingInterface.
This patch refactors the tiling and tile + fuse implementation using
`TilingInterface`. Primarily, it exposes the functionality as simple
utility functions instead of as a Pattern to allow calling it from a
pattern as it is done in the test today or from within the transform
dialect (in the future). This is a step towards deprecating similar
methods in Linalg dialect.

- The utility methods do not erase the root operations.
- The return value provides the values to use for replacements.

Differential Revision: https://reviews.llvm.org/D134144
2022-09-28 20:25:33 +00:00
Mahesh Ravishankar
7ee34550f5 [mlir][TilingInterface] Fix iter_args handling in tile (and fuse).
The current approach for handling `iter_args` was to replace all uses
of the value that is used as `init` value with the corresponding
region block argument within the `scf.for`. This is not always
correct. Instead a more deliberate approach needs to be taken to
handle these. If the slice being fused represents a slice of the
destination operand of the untiled op, then
- Make the destination of the fused producer the `init` value of the
  loop nest
- For the tiled and fused producer op created, replace the slice of
  the destination operand with a slice of the corresponding region
  iter arg of the innermost loop of the generated loop nest

Differential Revision: https://reviews.llvm.org/D134411
2022-09-26 19:09:29 +00:00
Johannes Reifferscheid
eaf20c4fc2 [mlir] Fix a cast that should be a dyn_cast.
This fixes a crash for certain IR, see the new test case for an
example.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D134424
2022-09-22 13:13:21 +02:00
Christopher Bate
f5fe92f693 [mlir][SCF] Fix loop pipelining unable to handle ops with regions
This change allows the SCF LoopPipelining transform to handle ops with
nested regions within the pipelined `scf.for` body. The op and nested
regions are treated as a single unit from the transform's perspective.
This change also makes explicit the requirement that only ops whose
parent Block is the loop body Block are allowed to be scheduled by the
caller.

Reviewed By: ThomasRaoux, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D133965
2022-09-20 21:58:53 -06:00
Peiming Liu
52887071ea [mlir][scf] Support simple symbolic expression without depending on AffineDialect to simply trivial loops.
Remove dependence of AffineDialect

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D134291
2022-09-20 18:13:05 +00:00
Peiming Liu
d518fc28b6 [mlir][scf] Support simple symbolic expression when simplify loops
Reviewed By: aartbik, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D134204
2022-09-19 21:50:01 +00:00
Guray Ozen
233de4e808 [mlir] Add map_nested_foreach_thread_to_gpu_threads op to transform dialect
This revision adds a new op `map_nested_foreach_thread_to_gpu_threads` to transform dialect. The op searches `scf.foreach_threads` inside the `gpu_launch` and distributes them with `gpu.thread_id` attribute.

Loop mapping is explicit and given by the `map_nested_foreach_thread_to_gpu_threads` op. Mapping is done one-to-one, therefore the loops dissappear.

The dynamic trip count or trip count that are larger than thread size are not supported for the time being. However, we can indeed support them by generating a loop inside with cyclic scheduling.

For the time being, trip counts that are dynamic or bigger than thread sizes are not supported. However, in the future the compiler can indeed generate a loop with static cyclic scheduling to support these cases.

Current mechanism allows `scf.foreach_threads` to be siblings or nested. There cannot be interleaving code between the loops when they are nested.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D133950
2022-09-19 16:27:30 +02:00