clang-p2996

Author	SHA1	Message	Date
Matthias Springer	0ba3e96be1	[mlir][SCF][NFC] `ValueBoundsConstraintSet`: Simplify `scf.for` implementation (#87862 ) This commit simplifies the implementation of the `ValueBoundsOpInterface` for `scf.for` based on the newly added `ValueBoundsConstraintSet::compare` API and adds additional documentation. Previously, the interface implementation created a new constraint set just to check if the yielded value and iter_arg are equal. This was inefficient because constraints were added multiple times (to two different constraint sets) for ops that are inside the loop. Note: This is a re-upload of #86239.	2024-04-06 15:30:26 +09:00
Matthias Springer	76435f2dca	[mlir][SCF] `ValueBoundsConstraintSet`: Support `scf.if` (branches) (#87860 ) This commit adds support for `scf.if` to `ValueBoundsConstraintSet`. Example: ``` %0 = scf.if ... -> index { scf.yield %a : index } else { scf.yield %b : index } ``` The following constraints hold for %0: * %0 >= min(%a, %b) * %0 <= max(%a, %b) Such constraints cannot be added to the constraint set; min/max is not supported by `IntegerRelation`. However, if we know which one of %a and %b is larger, we can add constraints for %0. E.g., if %a <= %b: * %0 >= %a * %0 <= %b This commit required a few minor changes to the `ValueBoundsConstraintSet` infrastructure, so that values can be compared while we are still in the process of traversing the IR/adding constraints. Note: This is a re-upload of #85895, which was reverted. The bug that caused the failure was fixed in #87859.	2024-04-06 13:04:49 +09:00
Mehdi Amini	8487e05967	Revert "[mlir][SCF] `ValueBoundsConstraintSet`: Support `scf.if` (branches) (#85895 )" This reverts commit `6b30ffef28`. gcc7 bot is broken	2024-04-05 03:00:35 -07:00
Mehdi Amini	e5e1bc0ad8	Revert "[mlir][SCF][NFC] `ValueBoundsConstraintSet`: Simplify `scf.for` implementation (#86239 )" This reverts commit `24e4429980`. gcc7 bot is broken	2024-04-05 03:00:29 -07:00
Matthias Springer	24e4429980	[mlir][SCF][NFC] `ValueBoundsConstraintSet`: Simplify `scf.for` implementation (#86239 ) This commit simplifies the implementation of the `ValueBoundsOpInterface` for `scf.for` based on the newly added `ValueBoundsConstraintSet::compare` API and adds additional documentation. Previously, the interface implementation created a new constraint set just to check if the yielded value and iter_arg are equal. This was inefficient because constraints were added multiple times (to two different constraint sets) for ops that are inside the loop.	2024-04-05 13:27:50 +09:00
Matthias Springer	6b30ffef28	[mlir][SCF] `ValueBoundsConstraintSet`: Support `scf.if` (branches) (#85895 ) This commit adds support for `scf.if` to `ValueBoundsConstraintSet`. Example: ``` %0 = scf.if ... -> index { scf.yield %a : index } else { scf.yield %b : index } ``` The following constraints hold for %0: * %0 >= min(%a, %b) * %0 <= max(%a, %b) Such constraints cannot be added to the constraint set; min/max is not supported by `IntegerRelation`. However, if we know which one of %a and %b is larger, we can add constraints for %0. E.g., if %a <= %b: * %0 >= %a * %0 <= %b This commit required a few minor changes to the `ValueBoundsConstraintSet` infrastructure, so that values can be compared while we are still in the process of traversing the IR/adding constraints.	2024-04-05 13:14:00 +09:00
MaheshRavishankar	5aeb604c7c	[mlir][SCF] Modernize `coalesceLoops` method to handle `scf.for` loops with iter_args (#87019 ) As part of this extension this change also does some general cleanup 1) Make all the methods take `RewriterBase` as arguments instead of creating their own builders that tend to crash when used within pattern rewrites 2) Split `coalesePerfectlyNestedLoops` into two separate methods, one for `scf.for` and other for `affine.for`. The templatization didnt seem to be buying much there. Also general clean up of tests.	2024-04-04 13:44:24 -07:00
Matthias Springer	5e4a44380e	[mlir][Interfaces][NFC] `ValueBoundsConstraintSet`: Pass stop condition in the constructor (#86099 ) This commit changes the API of `ValueBoundsConstraintSet`: the stop condition is now passed to the constructor instead of `processWorklist`. That makes it easier to add items to the worklist multiple times and process them in a consistent manner. The current `ValueBoundsConstraintSet` is passed as a reference to the stop function, so that the stop function can be defined before the the `ValueBoundsConstraintSet` is constructed. This change is in preparation of adding support for branches.	2024-04-04 17:05:47 +09:00
Ivan Butygin	a6d932bca8	[mlir][scf] Align `scf.while` `before` block args in canonicalizer (#76195 ) If `before` block args are directly forwarded to `scf.condition` make sure they are passed in the same order. This is needed for `scf.while` uplifting https://github.com/llvm/llvm-project/pull/76108	2024-04-02 17:54:06 +03:00
Rolf Morel	eacda36c7d	[SCF][Transform] Add support for scf.for in LoopFuseSibling op (#81495 ) Adds support for fusing two scf.for loops occurring in the same block. Uses the rudimentary checks already in place for scf.forall (like the target loop's operands being dominated by the source loop). - Fixes a bug in the dominance check whereby it was checked that values in the target loop themselves dominated the source loop rather than the ops that define these operands. - Renames the LoopFuseSibling op to LoopFuseSiblingOp. - Updates LoopFuseSiblingOp's description. - Adds tests for using LoopFuseSiblingOp on scf.for loops, including one which fails without the fix for the dominance check. - Adds tests checking the different failure modes of the dominance checker. - Adds test for case whereby scf.yield is automatically generated when there are no loop-carried variables.	2024-03-28 14:13:08 +01:00
Justin Fargnoli	35d55f2894	[NFC][mlir] Reorder `declarePromisedInterface()` operands (#86628 ) Reorder the template operands of `declarePromisedInterface()` to match `declarePromisedInterfaces()`.	2024-03-27 10:30:17 -07:00
Vivian	a9d1fead96	Fix the condition for peeling the first iteration (#86350 ) This PR fixes the condition used in loop peeling of the first iteration. Using ceilDiv instead of floorDiv when calculating the loop counts, so that the first iteration gets peeled as needed.	2024-03-25 09:53:57 -07:00
Oleksandr "Alex" Zinenko	5a9bdd85ee	[mlir] split transform interfaces into a separate library (#85221 ) Transform interfaces are implemented, direction or via extensions, in libraries belonging to multiple other dialects. Those dialects don't need to depend on the non-interface part of the transform dialect, which includes the growing number of ops and transitive dependency footprint. Split out the interfaces into a separate library. This in turn requires flipping the dependency from the interface on the dialect that has crept in because both co-existed in one library. The interface shouldn't depend on the transform dialect either. As a consequence of splitting, the capability of the interpreter to automatically walk the payload IR to identify payload ops of a certain kind based on the type used for the entry point symbol argument is disabled. This is a good move by itself as it simplifies the interpreter logic. This functionality can be trivially replaced by a `transform.structured.match` operation.	2024-03-20 22:15:17 +01:00
Justin Fargnoli	513cdb8222	[mlir] Declare promised interfaces for all dialects (#78368 ) This PR adds promised interface declarations for all interfaces declared in `InitAllDialects.h`. Promised interfaces allow a dialect to declare that it will have an implementation of a particular interface, crashing the program if one isn't provided when the interface is used.	2024-03-15 20:23:20 -07:00
Matthias Springer	492e8ba038	[mlir] Fix memory leaks after #81759 (#82762 ) This commit fixes memory leaks that were introduced by #81759. The way ops and blocks are erased changed slightly. The leaks were caused by an incorrect implementation of op builders: blocks must be created with the supplied builder object. Otherwise, they are not properly tracked by the dialect conversion and can leak during rollback.	2024-02-23 14:28:57 +01:00
mlevesquedion	d4fd20258f	[mlir] Use arith max or min ops instead of cmp + select (#82178 ) I believe the semantics should be the same, but this saves 1 op and simplifies the code. For example, the following two instructions: ``` %2 = cmp sgt %0, %1 %3 = select %2, %0, %1 ``` Are equivalent to: ``` %2 = maxsi %0 %1 ```	2024-02-21 12:28:05 -08:00
Mehdi Amini	e546a6e95d	Apply clang-tidy fixes for modernize-loop-convert in Utils.cpp (NFC)	2024-02-12 13:27:49 -08:00
Mehdi Amini	8189db978f	Apply clang-tidy fixes for performance-unnecessary-value-param in BufferizableOpInterfaceImpl.cpp (NFC)	2024-02-12 13:27:48 -08:00
Mehdi Amini	5370425772	Apply clang-tidy fixes for bugprone-argument-comment in SCF.cpp (NFC)	2024-02-12 13:27:48 -08:00
Jerry Wu	f7201505a6	[mlir] Add transformation to wrap scf::while in zero-trip-check (#81050 ) Add `scf::wrapWhileLoopInZeroTripCheck` to wrap scf while loop in zero-trip-check.	2024-02-08 17:52:09 -05:00
Ivan Butygin	6050cf2884	[mlir][scf] Add reductions support to `scf.parallel` fusion (#75955 ) Properly handle fusion of loops with reductions: * Check there are no first loop results users between loops * Create new loop op with merged reduction init values * Update `scf.reduce` op to contain reductions from both loops * Update loops users with new loop results	2024-02-01 18:37:19 +03:00
Hsiangkai Wang	c3eb2978a6	[mlir][scf] Considering defining operators of indices when fusing scf::ParallelOp (#80145 ) When checking the load indices of the second loop coincide with the store indices of the first loop, it only considers the index values are the same or not. However, there are some cases the index values defined by other operators. In these cases, it will treat them as different even the results of defining operators are the same. We already check if the iteration space is the same in isFusionLegal(). When checking operands of defining operators, we only need to consider the operands come from the same induction variables. If so, we know the results of defining operators are the same.	2024-02-01 13:57:31 +00:00
fabrizio-indirli	d17b005e46	[mlir][scf] Relax requirements for loops fusion (#79187 ) Enable the fusion of parallel loops also when the 1st loop contains multiple write accesses to the same buffer, if the accesses are always on the same indices. Fix LIT test cases whose loops were not being fused. Signed-off-by: Fabrizio Indirli <Fabrizio.Indirli@arm.com>	2024-01-30 20:17:06 +03:00
MaheshRavishankar	76ead96c1d	[mlir][TilingInterface] Use `LoopLikeOpInterface` in tiling using SCF to unify tiling with `scf.for` and `scf.forall`. (#77874 ) Using `LoopLikeOpInterface` as the basis for the implementation unifies all the tiling logic for both `scf.for` and `scf.forall`. The only difference is the actual loop generation. This is a follow up to https://github.com/llvm/llvm-project/pull/72178 Instead of many entry points for each loop type, the loop type is now passed as part of the options passed to the tiling method. This is a breaking change with the following changes 1) The `scf::tileUsingSCFForOp` is renamed to `scf::tileUsingSCF` 2) The `scf::tileUsingSCFForallOp` is deprecated. The same functionality is obtained by using `scf::tileUsingSCF` and setting the loop type in `scf::SCFTilingOptions` passed into this method to `scf::SCFTilingOptions::LoopType::ForallOp` (using the `setLoopType` method). 3) The `scf::tileConsumerAndFusedProducerGreedilyUsingSCFForOp` is renamed to `scf::tileConsumerAndFuseProducerUsingSCF`. The use of the `controlFn` in `scf::SCFTileAndFuseOptions` allows implementing any strategy with the default callback implemeting the greedy fusion. 4) The `scf::SCFTilingResult` and `scf::SCFTileAndFuseResult` now use `SmallVector<LoopLikeOpInterface>`. 5) To make `scf::ForallOp` implement the parts of `LoopLikeOpInterface` needed, the `getOutputBlockArguments()` method is replaced with `getRegionIterArgs()` These changes now bring the tiling and fusion capabilities using `scf.forall` on par with what was already supported by `scf.for`	2024-01-25 21:26:23 -08:00
Matthias Springer	5cc0f76d34	[mlir][IR] Add rewriter API for moving operations (#78988 ) The pattern rewriter documentation states that "all IR mutations [...] are required to be performed via the `PatternRewriter`." This commit adds two functions that were missing from the rewriter API: `moveOpBefore` and `moveOpAfter`. After an operation was moved, the `notifyOperationInserted` callback is triggered. This allows listeners such as the greedy pattern rewrite driver to react to IR changes. This commit narrows the discrepancy between the kind of IR modification that can be performed and the kind of IR modifications that can be listened to.	2024-01-25 11:01:28 +01:00
Matthias Springer	5fcf907b34	[mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260 ) This commit renames 4 pattern rewriter API functions: * `updateRootInPlace` -> `modifyOpInPlace` * `startRootUpdate` -> `startOpModification` * `finalizeRootUpdate` -> `finalizeOpModification` * `cancelRootUpdate` -> `cancelOpModification` The term "root" is a misnomer. The root is the op that a rewrite pattern matches against (https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional). A rewriter must be notified of all in-place op modifications, not just in-place modifications of the root (https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old function names were confusing and have contributed to various broken rewrite patterns. Note: The new function names use the term "modify" instead of "update" for consistency with the `RewriterBase::Listener` terminology (`notifyOperationModified`).	2024-01-17 11:08:59 +01:00
Matthias Springer	c1730f4221	[mlir][SCF] Do not verify step size of `scf.for` (#78141 ) An op verifier should verify only local properties. This commit removes the verification of `scf.for` step sizes. (Verifiers can check attributes but should not follow SSA values.) This verification could reject IR that is actually valid, e.g.: ```mlir scf.if %always_false { // Branch is never entered. scf.for ... step %c0 { ... } } ``` This commit fixes `for-loop-peeling.mlir` when running with `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`: ``` within split at llvm-project/mlir/test/Dialect/SCF/for-loop-peeling.mlir:293 offset :9:3: note: see current operation: "scf.for"(%0, %3, %2) ({ ^bb0(%arg1: index): %4 = "arith.index_cast"(%arg1) : (index) -> i64 "memref.store"(%4, %arg0) : (i64, memref<i64>) -> () "scf.yield"() : () -> () }) {__peeled_loop__} : (index, index, index) -> () LLVM ERROR: IR failed to verify after folding ``` Note: `%2` is `arith.constant 0 : index`.	2024-01-15 17:46:54 +01:00
Felix Schneider	f6f1ab9d90	[mlir][scf] Fix `for-loop-peeling` crash (#77697 ) Before applying the peeling patterns, it can happen that the `ForOp` gets a step of zero during folding. This leads to a division-by-zero down the line. This patch adds an additional check for a constant-zero step and a test. Fix https://github.com/llvm/llvm-project/issues/75758	2024-01-12 19:08:16 +01:00
MaheshRavishankar	aa2a96a24a	[mlir][TilingInterface] Move TilingInterface tests to use transform dialect ops. (#77204 ) In the process a couple of test transform dialect ops are added just for testing. These operations are not intended to use as full flushed out of transformation ops, but are rather operations added for testing. A separate operation is added to `LinalgTransformOps.td` to convert a `TilingInterface` operation to loops using the `generateScalarImplementation` method implemented by the operation. Eventually this and other operations related to tiling using the `TilingInterface` need to move to a better place (i.e. out of `Linalg` dialect)	2024-01-11 21:31:03 -08:00
Thomas Raoux	c933bd8185	[MLIR][SCF] Add checks to verify that the pipeliner schedule is correct. (#77083 ) Add a check to validate that the schedule passed to the pipeliner transformation is valid and won't cause the pipeliner to break SSA. This checks that the for each operation in the loop operations are scheduled after their operands.	2024-01-10 04:25:57 -08:00
MaheshRavishankar	4435ced949	[mlir][TilingInterface] Allow controlling what fusion is done within tile and fuse (#76871 ) Currently the `tileConsumerAndFuseProducerGreedilyUsingSCFFor` method greedily fuses through all slices that are generated during the tile and fuse flow. That is not the normal use case. Ideally the caller would like to control which slices get fused and which dont. This patch introduces a new field to the `SCFTileAndFuseOptions` to specify this control. The contol function also allows the caller to specify if the replacement for the fused producer needs to be yielded from within the tiled computation. This allows replacing the fused producers in case they have other uses. Without this the original producers still survive negating the utility of the fusion. The change here also means that the name of the function `tileConsumerAndFuseProducerGreedily...` can be updated. Defering that to a later stage to reduce the churn of API changes.	2024-01-08 13:26:10 -08:00
Arseniy Obolenskiy	59569eb756	[mlir] Fix support for loop normalization with integer indices (#76566 ) Choose correct type for updated loop boundaries after scf loop normalization, do not force chosen type to IndexType	2024-01-05 17:49:21 +03:00
Matthias Springer	10056c821a	[mlir][SCF] `scf.parallel`: Make reductions part of the terminator (#75314 ) This commit makes reductions part of the terminator. Instead of `scf.yield`, `scf.reduce` now terminates the body of `scf.parallel` ops. `scf.reduce` may contain an arbitrary number of reductions, with one region per reduction. Example: ```mlir %init = arith.constant 0.0 : f32 %r:2 = scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init, %init) -> f32, f32 { %elem_to_reduce1 = load %buffer1[%iv] : memref<100xf32> %elem_to_reduce2 = load %buffer2[%iv] : memref<100xf32> scf.reduce(%elem_to_reduce1, %elem_to_reduce2 : f32, f32) { ^bb0(%lhs : f32, %rhs: f32): %res = arith.addf %lhs, %rhs : f32 scf.reduce.return %res : f32 }, { ^bb0(%lhs : f32, %rhs: f32): %res = arith.mulf %lhs, %rhs : f32 scf.reduce.return %res : f32 } } ``` `scf.reduce` operations can no longer be interleaved with other ops in the body of `scf.parallel`. This simplifies the op and makes it possible to assign the `RecursiveMemoryEffects` trait to `scf.reduce`. (This was not possible before because the op was not a terminator, causing the op to be DCE'd.)	2023-12-20 11:06:27 +09:00
Han-Chung Wang	899c2bed9e	[mlir][TilingInterface] Early return cloned ops if tile sizes are zeros. (#75410 ) It is a trivial early-return case. If the cloned ops are not returned, it will generate `extract_slice` op that extracts the whole slice. However, it is not folded away. Early-return to avoid the case. E.g., ```mlir func.func @matmul_tensors( %arg0: tensor<?x?xf32>, %arg1: tensor<?x?xf32>, %arg2: tensor<?x?xf32>) -> tensor<?x?xf32> { %0 = linalg.matmul ins(%arg0, %arg1: tensor<?x?xf32>, tensor<?x?xf32>) outs(%arg2: tensor<?x?xf32>) -> tensor<?x?xf32> return %0 : tensor<?x?xf32> } module attributes {transform.with_named_sequence} { transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) { %0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op %1 = transform.structured.tile_using_for %0 [0, 0, 0] : (!transform.any_op) -> (!transform.any_op) transform.yield } } ``` Apply the transforms and canonicalize the IR: ``` mlir-opt --transform-interpreter -canonicalize input.mlir ``` we will get ```mlir module { func.func @matmul_tensors(%arg0: tensor<?x?xf32>, %arg1: tensor<?x?xf32>, %arg2: tensor<?x?xf32>) -> tensor<?x?xf32> { %c1 = arith.constant 1 : index %c0 = arith.constant 0 : index %dim = tensor.dim %arg0, %c0 : tensor<?x?xf32> %dim_0 = tensor.dim %arg0, %c1 : tensor<?x?xf32> %dim_1 = tensor.dim %arg1, %c1 : tensor<?x?xf32> %extracted_slice = tensor.extract_slice %arg0[0, 0] [%dim, %dim_0] [1, 1] : tensor<?x?xf32> to tensor<?x?xf32> %extracted_slice_2 = tensor.extract_slice %arg1[0, 0] [%dim_0, %dim_1] [1, 1] : tensor<?x?xf32> to tensor<?x?xf32> %extracted_slice_3 = tensor.extract_slice %arg2[0, 0] [%dim, %dim_1] [1, 1] : tensor<?x?xf32> to tensor<?x?xf32> %0 = linalg.matmul ins(%extracted_slice, %extracted_slice_2 : tensor<?x?xf32>, tensor<?x?xf32>) outs(%extracted_slice_3 : tensor<?x?xf32>) -> tensor<?x?xf32> return %0 : tensor<?x?xf32> } } ``` The revision early-return the case so we can get: ```mlir func.func @matmul_tensors(%arg0: tensor<?x?xf32>, %arg1: tensor<?x?xf32>, %arg2: tensor<?x?xf32>) -> tensor<?x?xf32> { %0 = linalg.matmul ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>) outs(%arg2 : tensor<?x?xf32>) -> tensor<?x?xf32> return %0 : tensor<?x?xf32> } ```	2023-12-19 09:14:43 -08:00
Ivan Butygin	c0d2ea9d42	[mlir][scf] Improve `scf.parallel` fusion pass (#75852 ) Abort fusion if memref load may alias write, but not the exact alias. Add alias check hook to `naivelyFuseParallelOps`, so user can customize alias checking. Use builtin alias analysis in `ParallelLoopFusion` pass.	2023-12-19 18:07:46 +03:00
Vivian	bd6a2452ae	[mlir][SCF] Add support for peeling the first iteration out of the loop (#74015 ) There is a use case that we need to peel the first iteration out of the for loop so that the peeled forOp can be canonicalized away and the fillOp can be fused into the inner forall loop. For example, we have nested loops as below ``` linalg.fill ins(...) outs(...) scf.for %arg = %lb to %ub step %step scf.forall ... ``` After the peeling transform, it is expected to be ``` scf.forall ... linalg.fill ins(...) outs(...) scf.for %arg = %(lb + step) to %ub step %step scf.forall ... ``` This patch makes the most use of the existing peeling functions and adds support for peeling the first iteration out of the loop.	2023-12-14 17:03:52 -08:00
Keren Zhou	e66f97e8a8	[mlir] Fix loop pipelining when the operand of `yield` is not defined in the loop body (#75423 )	2023-12-13 19:19:13 -08:00
Thomas Raoux	ef112833e1	[MLIR][SCF] Add support for pipelining dynamic loops (#74350 ) Support loops without static boundaries. Since the number of iteration is not known we need to predicate prologue and epilogue in case the number of iterations is smaller than the number of stages. This patch includes work from @chengjunlu	2023-12-10 22:32:11 -08:00
Matthias Springer	77f5b33c46	[mlir][SCF] Retire SCF-specific `to_memref`/`to_tensor` canonicalization patterns (#74551 ) The partial bufferization framework has been replaced with One-Shot Bufferize. SCF-specific canonicalization patterns for `to_memref`/`to_tensor` are no longer needed.	2023-12-07 08:24:17 +09:00
Matthias Springer	df7545e4be	[mlir][SCF] Fix invalid IR in `ParallelOpSingleOrZeroIterationDimsFolder` pattern (#74552 ) `ParallelOpSingleOrZeroIterationDimsFolder` used to produce invalid IR: ``` within split at mlir/test/Dialect/SCF/canonicalize.mlir:1 offset :11:3: error: 'scf.parallel' op expects region #0 to have 0 or 1 blocks scf.parallel (%i0, %i1, %i2) = (%c0, %c3, %c7) to (%c1, %c6, %c10) step (%c1, %c2, %c3) { ^ within split at mlir/test/Dialect/SCF/canonicalize.mlir:1 offset :11:3: note: see current operation: "scf.parallel"(%4, %5, %3) <{operandSegmentSizes = array<i32: 1, 1, 1, 0>}> ({ ^bb0(%arg1: index): "memref.store"(%0, %arg0, %1, %arg1, %6) : (i32, memref<?x?x?xi32>, index, index, index) -> () "scf.yield"() : () -> () ^bb1(%8: index): // no predecessors "scf.yield"() : () -> () }) : (index, index, index) -> () ``` Together with #74551, this commit fixes `mlir/test/Dialect/SCF/canonicalize.mlir` when verifying the IR after each pattern application (#74270).	2023-12-06 15:14:29 +09:00
Thomas Raoux	19e068b048	[MLIR][SCF] Handle more cases in pipelining transform (#74007 ) -Fix case where an op is scheduled in stage 0 and used with a distance of 1 -Fix case where we don't peel the epilogue and a value not part of the last stage is used outside the loop.	2023-12-01 21:28:21 -08:00
MaheshRavishankar	ec1086f2a0	Fix build error from #72178 (#72905 )	2023-11-20 23:09:59 -08:00
Mehdi Amini	26a0b27736	Make MLIR Value more consistent in terms of `const` "correctness" (NFC) (#72765 ) MLIR can't really be const-correct (it would need a `ConstValue` class alongside the `Value` class really, like `ArrayRef` and `MutableArrayRef`). This is however making is more consistent: method that are directly modifying the Value shouldn't be marked const.	2023-11-20 20:52:15 -08:00
Jie Fu	3e6ae77950	[mlir] Non-void lambda does not return a value in all control paths in yieldReplacementForFusedProducer (NFC) /llvm-project/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp:703:5: error: non-void lambda does not return a value in all control paths [-Werror,-Wreturn-type] }; ^ 1 error generated.	2023-11-21 09:11:50 +08:00
MaheshRavishankar	4a020018ce	[NFC] Simplify the tiling implementation using cloning. (#72178 ) The current implementation of tiling using `scf.for` is convoluted to make sure that the destination passing style of the untiled program is preserved. The addition of support to tile using `scf.forall` (adapted from the transform operation in Linalg) in https://github.com/llvm/llvm-project/pull/67083 used cloning of the tiled operations to better streamline the implementation. This PR adapts the other tiling methods to use a similar approach, making the transformations (and handling destination passing style semantics) more systematic. --------- Co-authored-by: Abhishek-Varma <avarma094@gmail.com>	2023-11-20 09:05:48 -08:00
Matthias Springer	96901f1b02	[mlir][SCF] Do not peel already peeled loops (#71900 ) Loop peeling is not beneficial if the step size already divides "ub - lb". There are currently some simple checks to prevent peeling in such cases when lb, ub, step are constants. This commit adds support for IR that is the result of loop peeling in the general case; i.e., lb, ub, step do not necessarily have to be constants. This change adds a new affine_map simplification rule for semi-affine maps that appear during loop peeling and are guaranteed to evaluate to a constant zero. Affine maps such as: ``` (1) affine_map<()[ub, step] -> ((ub - ub mod step) mod step) (2) affine_map<()[ub, lb, step] -> ((ub - (ub - lb) mod step - lb) mod step) (3) ^ may contain additional summands ``` Other affine maps with modulo expressions are not supported by the new simplification rule. This fixes #71469.	2023-11-16 11:47:57 +09:00
long.chen	1609f1c2a5	[mlir][affine][nfc] cleanup deprecated T.cast style functions (#71269 ) detail see the docment: https://mlir.llvm.org/deprecation/ Not all changes are made manually, most of them are made through a clang tool I wrote https://github.com/lipracer/cpp-refactor.	2023-11-14 13:01:19 +08:00
Nicolas Vasilache	644cd0724d	[mlir][SCF] Add folding for IndexSwitchOp (#70924 )	2023-11-01 14:30:42 +01:00
Matthias Springer	98a6edd38f	[mlir][Interfaces] `LoopLikeOpInterface`: Expose tied loop results (#70535 ) Expose loop results, which correspond to the region iter_arg values that are returned from the loop when there are no more iterations. Exposing loop results is optional because some loops (e.g., `scf.while`) do not have a 1-to-1 mapping between region iter_args and op results. Also add additional helper functions to query tied results/iter_args/inits.	2023-11-01 08:34:14 +09:00
Matthias Springer	04736c7f7a	[mlir][SCF] Use `transform.get_parent_op` instead of `transform.loop.get_parent_for` (#70757 ) Add a new attribute to `get_parent_op` to get the n-th parent. Remove `transform.loop.get_parent_for`, which is no longer needed.	2023-10-31 18:36:40 +09:00

1 2 3 4 5 ...

567 Commits