This commit adds support for `scf.if` to `ValueBoundsConstraintSet`.
Example:
```
%0 = scf.if ... -> index {
  scf.yield %a : index
} else {
  scf.yield %b : index
}
```
The following constraints hold for %0:
* %0 >= min(%a, %b)
* %0 <= max(%a, %b)
Such constraints cannot be added to the constraint set; min/max is not
supported by `IntegerRelation`. However, if we know which one of %a and
%b is larger, we can add constraints for %0. E.g., if %a <= %b:
* %0 >= %a
* %0 <= %b
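For instance (a hypothetical sketch; %cond and %a are assumed to be defined elsewhere), if %b is defined as %a + 1, the comparison %a <= %b is decidable and both bounds can be added:
```mlir
%c1 = arith.constant 1 : index
%b = arith.addi %a, %c1 : index
// %a <= %b is provable, so %0 >= %a and %0 <= %b can be added.
%0 = scf.if %cond -> (index) {
  scf.yield %a : index
} else {
  scf.yield %b : index
}
```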
This commit required a few minor changes to the
`ValueBoundsConstraintSet` infrastructure, so that values can be
compared while we are still in the process of traversing the IR/adding
constraints.
Note: This is a re-upload of #85895, which was reverted. The bug that
caused the failure was fixed in #87859.
As part of this extension, this change also does some general cleanup:
1) Make all the methods take `RewriterBase` as an argument instead of
creating their own builders, which tend to crash when used within
pattern rewrites.
2) Split `coalescePerfectlyNestedLoops` into two separate methods, one
for `scf.for` and the other for `affine.for`. The templatization didn't
seem to be buying much there.
Also general cleanup of tests.
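For context, this is the kind of rewrite the coalescing helpers perform on a perfectly nested `scf.for` pair (a hypothetical sketch; the constants and the div/rem recovery are illustrative):
```mlir
// Before: a perfectly nested pair with trip counts 4 and 8.
scf.for %i = %c0 to %c4 step %c1 {
  scf.for %j = %c0 to %c8 step %c1 {
    "test.use"(%i, %j) : (index, index) -> ()
  }
}

// After coalescing: one loop over 4 * 8 iterations; the original
// induction variables are recovered with div/rem.
scf.for %k = %c0 to %c32 step %c1 {
  %i = arith.divui %k, %c8 : index
  %j = arith.remui %k, %c8 : index
  "test.use"(%i, %j) : (index, index) -> ()
}
```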
If `before` block args are directly forwarded to `scf.condition`, make
sure they are passed in the same order. This is needed for `scf.while`
uplifting:
https://github.com/llvm/llvm-project/pull/76108
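A minimal sketch of the forwarding case in question (hypothetical values and ops); the `before` block args %arg0 and %arg1 are passed to `scf.condition` in the order they were received:
```mlir
%res:2 = scf.while (%arg0 = %init0, %arg1 = %init1) : (i32, i32) -> (i32, i32) {
  %cond = "test.make_condition"() : () -> i1
  // Direct forwarding, in the same order as the block args.
  scf.condition(%cond) %arg0, %arg1 : i32, i32
} do {
^bb0(%a: i32, %b: i32):
  %next = "test.step"(%a) : (i32) -> i32
  scf.yield %next, %b : i32, i32
}
```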
Adds support for fusing two scf.for loops occurring in the same block.
Uses the rudimentary checks already in place for scf.forall (like the
target loop's operands being dominated by the source loop).
- Fixes a bug in the dominance check whereby it was checked that the
values used in the target loop themselves dominated the source loop,
rather than the ops that define these values.
- Renames the LoopFuseSibling op to LoopFuseSiblingOp.
- Updates LoopFuseSiblingOp's description.
- Adds tests for using LoopFuseSiblingOp on scf.for loops, including one
which fails without the fix for the dominance check.
- Adds tests checking the different failure modes of the dominance
checker.
- Adds a test for the case whereby scf.yield is automatically generated
when there are no loop-carried variables.
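For illustration, a sketch of the rewrite (hypothetical values; %A, %B, %v1, %v2, and the bounds are assumed to be defined and the loops to be independent):
```mlir
// Before: two sibling loops with identical lower bound, upper bound, and step.
scf.for %i = %c0 to %ub step %c1 {
  memref.store %v1, %A[%i] : memref<128xf32>
}
scf.for %i = %c0 to %ub step %c1 {
  memref.store %v2, %B[%i] : memref<128xf32>
}

// After fusion: one loop containing both bodies.
scf.for %i = %c0 to %ub step %c1 {
  memref.store %v1, %A[%i] : memref<128xf32>
  memref.store %v2, %B[%i] : memref<128xf32>
}
```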
This PR fixes the condition used when peeling the first iteration out
of a loop: the loop count is now calculated with ceilDiv instead of
floorDiv, so that the first iteration gets peeled as needed.
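As a worked example (illustrative numbers; assuming the trip count is computed as ceilDiv(ub - lb, step)):
```
lb = 0, ub = 5, step = 4
ceilDiv(5 - 0, 4)  = 2   // iterations iv = 0 and iv = 4: there is a first iteration to peel
floorDiv(5 - 0, 4) = 1   // would undercount and skip peeling the first iteration
```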
One-Shot Bufferize currently does not support loops where a yielded
value bufferizes to a buffer that is different from the buffer of the
region iter_arg. In such a case, the bufferization fails with an error
such as:
```
Yield operand #0 is not equivalent to the corresponding iter bbArg
scf.yield %0 : tensor<5xf32>
```
One common reason for non-equivalent buffers is that an op on the path
from the region iter_arg to the terminator bufferizes out-of-place. Ops
that are analyzed earlier are more likely to bufferize in-place.
This commit adds a new heuristic that gives preference to ops that are
reachable on the reverse SSA use-def chain from a region terminator and
are within the parent region of the terminator. This is expected to work
better than the existing heuristics for loops where an iter_arg is
written to multiple times within a loop, but only one write is fed into
the terminator.
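A sketch of the motivating pattern (hypothetical; assumes %x, %y : f32 and %i, %j : index are defined above). With the new heuristic, %b is analyzed before %a because it is reachable from `scf.yield` on the reverse use-def chain:
```mlir
%r = scf.for %iv = %lb to %ub step %step iter_args(%t = %init) -> (tensor<5xf32>) {
  // Write that does not feed the terminator.
  %a = tensor.insert %x into %t[%i] : tensor<5xf32>
  "test.use"(%a) : (tensor<5xf32>) -> ()
  // Write on the reverse use-def chain of the terminator.
  %b = tensor.insert %y into %t[%j] : tensor<5xf32>
  scf.yield %b : tensor<5xf32>
}
```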
Current users of One-Shot Bufferize are not affected by this change.
"Bottom-up" is still the default heuristic. Users can switch to the new
heuristic manually.
This commit also turns the "fuzzer" pass option into a heuristic,
cleaning up the code a bit.
Properly handle fusion of loops with reductions:
* Check that there are no users of the first loop's results between the
two loops.
* Create a new loop op with the merged reduction init values.
* Update the `scf.reduce` op to contain the reductions from both loops.
* Update the loops' users with the new loop results.
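For example, fusing a loop that reduces with addf and a loop that reduces with mulf might produce a merged loop like this (hypothetical sketch; values assumed to be defined):
```mlir
%res:2 = scf.parallel (%iv) = (%lb) to (%ub) step (%s)
    init (%init1, %init2) -> f32, f32 {
  %e1 = memref.load %A[%iv] : memref<100xf32>
  %e2 = memref.load %B[%iv] : memref<100xf32>
  // One scf.reduce carrying the reduction regions of both source loops.
  scf.reduce(%e1, %e2 : f32, f32) {
  ^bb0(%lhs: f32, %rhs: f32):
    %0 = arith.addf %lhs, %rhs : f32
    scf.reduce.return %0 : f32
  }, {
  ^bb0(%lhs: f32, %rhs: f32):
    %0 = arith.mulf %lhs, %rhs : f32
    scf.reduce.return %0 : f32
  }
}
```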
When checking whether the load indices of the second loop coincide with
the store indices of the first loop, the check only considers whether
the index values are the same SSA values. However, in some cases the
index values are defined by other operations. In those cases, the
indices are treated as different even when the results of the defining
operations are the same.
We already check that the iteration space is the same in
`isFusionLegal()`. When checking the operands of the defining
operations, we only need to verify that the operands come from the same
induction variables; if so, we know the results of the defining
operations are the same.
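A sketch of the case this enables (hypothetical values; %idx and %idx2 are distinct SSA values but provably equal because the same operation is applied to the same induction variable):
```mlir
scf.parallel (%i) = (%c0) to (%n) step (%c1) {
  %idx = arith.addi %i, %c1 : index
  memref.store %v, %buf[%idx] : memref<100xf32>
}
scf.parallel (%i) = (%c0) to (%n) step (%c1) {
  %idx2 = arith.addi %i, %c1 : index
  %x = memref.load %buf[%idx2] : memref<100xf32>
  "test.use"(%x) : (f32) -> ()
}
```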
Enable the fusion of parallel loops also when the first loop contains
multiple write accesses to the same buffer, provided the accesses are
always on the same indices.
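A sketch of the newly supported case (hypothetical values; both writes in the first loop target the same index %i):
```mlir
scf.parallel (%i) = (%c0) to (%n) step (%c1) {
  memref.store %a, %buf[%i] : memref<100xf32>
  memref.store %b, %buf[%i] : memref<100xf32>
}
scf.parallel (%i) = (%c0) to (%n) step (%c1) {
  %x = memref.load %buf[%i] : memref<100xf32>
  "test.use"(%x) : (f32) -> ()
}
```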
Fix LIT test cases whose loops were not being fused.
Signed-off-by: Fabrizio Indirli <Fabrizio.Indirli@arm.com>
An op verifier should verify only local properties. This commit removes
the verification of `scf.for` step sizes. (Verifiers can check
attributes but should not follow SSA values.) This verification could
reject IR that is actually valid, e.g.:
```mlir
scf.if %always_false {
// Branch is never entered.
scf.for ... step %c0 { ... }
}
```
This commit fixes `for-loop-peeling.mlir` when running with
`MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`:
```
within split at llvm-project/mlir/test/Dialect/SCF/for-loop-peeling.mlir:293 offset :9:3: note: see current operation:
"scf.for"(%0, %3, %2) ({
^bb0(%arg1: index):
%4 = "arith.index_cast"(%arg1) : (index) -> i64
"memref.store"(%4, %arg0) : (i64, memref<i64>) -> ()
"scf.yield"() : () -> ()
}) {__peeled_loop__} : (index, index, index) -> ()
LLVM ERROR: IR failed to verify after folding
```
Note: `%2` is `arith.constant 0 : index`.
Before applying the peeling patterns, it can happen that the `ForOp`
gets a step of zero during folding. This leads to a division-by-zero
down the line.
This patch adds an additional check for a constant-zero step and a
test.
Fix https://github.com/llvm/llvm-project/issues/75758
Introduce a new extension for simple print-debugging of transform
dialect scripts. The initial version of this extension consists of two
ops that print the payload objects associated with transform dialect
values. Similar ops were already available in the test extension and
several downstream projects, and were extensively used for testing.
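A hedged sketch of how such a print op might be used from a transform script (the op name `transform.debug.emit_remark_at` and the exact syntax are assumptions based on the extension as introduced):
```mlir
module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%root: !transform.any_op {transform.readonly}) {
    %f = transform.structured.match ops{["func.func"]} in %root
        : (!transform.any_op) -> !transform.any_op
    // Emits a remark at every payload op mapped to %f.
    transform.debug.emit_remark_at %f, "matched func" : !transform.any_op
    transform.yield
  }
}
```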
Add a check to validate that the schedule passed to the pipeliner
transformation is valid and won't cause the pipeliner to break SSA. The
check verifies that, for each operation in the loop, the operation is
scheduled after its operands.
The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply
rewrite patterns to ops. It has special handling for constants: they are
CSE'd and sometimes moved to parent regions to allow for additional
CSE'ing. This happens in `OperationFolder`.
To allow for efficient CSE'ing, `OperationFolder` maintains an internal
lookup data structure to find the existing constant ops with the same
value for each `IsolatedFromAbove` region:
```c++
/// A mapping between an insertion region and the constants that have been
/// created within it.
DenseMap<Region *, ConstantMap> foldScopes;
```
Rewrite patterns are allowed to modify operations. In particular, they
may move operations (including constants) from one region to another
one. Such an IR rewrite can make the above lookup data structure
inconsistent.
We encountered such a bug in a downstream project. This bug materialized
in the form of an op that uses the result of a constant op from a
different `IsolatedFromAbove` region (that is not accessible).
This commit changes the behavior of the `GreedyPatternRewriteDriver`
such that `OperationFolder` is used to CSE constants at the beginning of
each iteration (as the worklist is populated), but no longer during an
iteration. `OperationFolder` is no longer used after populating the
worklist, so we do not have to care about inconsistent state in the
`OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver`
now performs the op folding by itself instead of calling
`OperationFolder::tryToFold`.
This commit changes the order of constant ops in test cases, but not the
region in which they appear. All broken test cases were fixed by turning
`CHECK` into `CHECK-DAG`.
Alternatives considered: The state of `OperationFolder` could be
partially invalidated with every `notifyOperationModified` notification.
That is more fragile than the solution in this commit because incorrect
rewriter API usage can lead to missing notifications and hard-to-debug
`IsolatedFromAbove` violations. (It did not fix the above-mentioned bug in
a downstream project, which could be due to incorrect rewriter API usage
or due to another conceptual problem that I missed.) Moreover, ops are
frequently getting modified during a greedy pattern rewrite, so we would
likely keep invalidating large parts of the state of `OperationFolder`
over and over.
Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant
ops are no longer folded during a greedy pattern rewrite. If you rely on
folding (and rematerialization) of constant ops during a greedy pattern
rewrite, turn the folder into a pattern.
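For example, a typical test migration looks like this:
```mlir
// Before: relies on a fixed order of materialized constants.
// CHECK: %[[C0:.*]] = arith.constant 0 : index
// CHECK: %[[C1:.*]] = arith.constant 1 : index

// After: accepts the constants in either order.
// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[C1:.*]] = arith.constant 1 : index
```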
This commit makes reductions part of the terminator. Instead of
`scf.yield`, `scf.reduce` now terminates the body of `scf.parallel` ops.
`scf.reduce` may contain an arbitrary number of reductions, with one
region per reduction.
Example:
```mlir
%init = arith.constant 0.0 : f32
%r:2 = scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init, %init)
    -> f32, f32 {
  %elem_to_reduce1 = load %buffer1[%iv] : memref<100xf32>
  %elem_to_reduce2 = load %buffer2[%iv] : memref<100xf32>
  scf.reduce(%elem_to_reduce1, %elem_to_reduce2 : f32, f32) {
  ^bb0(%lhs : f32, %rhs: f32):
    %res = arith.addf %lhs, %rhs : f32
    scf.reduce.return %res : f32
  }, {
  ^bb0(%lhs : f32, %rhs: f32):
    %res = arith.mulf %lhs, %rhs : f32
    scf.reduce.return %res : f32
  }
}
```
`scf.reduce` operations can no longer be interleaved with other ops in
the body of `scf.parallel`. This simplifies the op and makes it possible
to assign the `RecursiveMemoryEffects` trait to `scf.reduce`. (This was
not possible before because the op was not a terminator, causing the op
to be DCE'd.)
Abort fusion if a memref load may alias a write but is not an exact
alias. Add an alias check hook to `naivelyFuseParallelOps` so users can
customize alias checking. Use the builtin alias analysis in the
`ParallelLoopFusion` pass.
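A sketch of a fusion that is now aborted (hypothetical function; %A and %B are unrelated arguments that may alias):
```mlir
func.func @may_alias(%A: memref<100xf32>, %B: memref<100xf32>,
                     %c0: index, %c1: index, %n: index, %v: f32) {
  scf.parallel (%i) = (%c0) to (%n) step (%c1) {
    memref.store %v, %A[%i] : memref<100xf32>
  }
  // The load below may read the store above if %A and %B alias, so the
  // loops must not be fused unless alias analysis proves otherwise.
  scf.parallel (%i) = (%c0) to (%n) step (%c1) {
    %x = memref.load %B[%i] : memref<100xf32>
    "test.use"(%x) : (f32) -> ()
  }
  return
}
```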
There is a use case where we need to peel the first iteration out of
the for loop so that the peeled forOp can be canonicalized away and the
fillOp can be fused into the inner forall loop. For example, we have
nested loops as below:
```
linalg.fill ins(...) outs(...)
scf.for %arg = %lb to %ub step %step
  scf.forall ...
```
After the peeling transform, it is expected to be
```
scf.forall ...
  linalg.fill ins(...) outs(...)
scf.for %arg = %(lb + step) to %ub step %step
  scf.forall ...
```
This patch reuses the existing peeling functions as much as possible
and adds support for peeling the first iteration out of the loop.
Support loops without static bounds. Since the number of iterations is
not known, we need to predicate the prologue and epilogue in case the
number of iterations is smaller than the number of stages.
This patch includes work from @chengjunlu
The partial bufferization framework has been replaced with One-Shot
Bufferize. SCF-specific canonicalization patterns for
`to_memref`/`to_tensor` are no longer needed.
- Fix a case where an op is scheduled in stage 0 and used with a
distance of 1.
- Fix a case where we don't peel the epilogue and a value not part of
the last stage is used outside the loop.
Loop peeling is not beneficial if the step size already divides "ub -
lb". There are currently some simple checks to prevent peeling in such
cases when lb, ub, step are constants. This commit adds support for IR
that is the result of loop peeling in the general case; i.e., lb, ub,
step do not necessarily have to be constants.
This change adds a new affine_map simplification rule for semi-affine
maps that appear during loop peeling and are guaranteed to evaluate to a
constant zero. Affine maps such as:
```
(1) affine_map<()[ub, step] -> ((ub - ub mod step) mod step)>
(2) affine_map<()[ub, lb, step] -> ((ub - (ub - lb) mod step - lb) mod step)>
(3) ^ may contain additional summands
```
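For instance, map (1) is guaranteed to be zero because subtracting `ub mod step` from `ub` leaves an exact multiple of `step`; the same argument applies to (2) with `ub - lb` in place of `ub`:
```
ub - (ub mod step) = step * floor(ub / step)
=> ((ub - ub mod step) mod step) = 0
```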
Other affine maps with modulo expressions are not supported by the new
simplification rule.
This fixes #71469.
Expose loop results, which correspond to the region iter_arg values that
are returned from the loop when there are no more iterations. Exposing
loop results is optional because some loops (e.g., `scf.while`) do not
have a 1-to-1 mapping between region iter_args and op results.
Also add additional helper functions to query tied
results/iter_args/inits.
Update most test passes to use the transform-interpreter pass instead
of the test-transform-dialect-interpreter pass. The new "main"
interpreter pass has a named entry point instead of looking up the
top-level op with `PossibleTopLevelOpTrait`, which is arguably a more
understandable interface. The change is mechanical: rewriting an
unnamed sequence into a named one and wrapping the transform IR into a
module when necessary.
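The rewrite looks roughly like this (sketch; `transform.print` stands in for arbitrary transform ops):
```mlir
// Before (test-transform-dialect-interpreter):
transform.sequence failures(propagate) {
^bb0(%arg0: !transform.any_op):
  transform.print %arg0 : !transform.any_op
}

// After (transform-interpreter):
module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%arg0: !transform.any_op {transform.readonly}) {
    transform.print %arg0 : !transform.any_op
    transform.yield
  }
}
```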
Add an option to the transform-interpreter pass to target a tagged
payload op instead of the root anchor op, which is also useful for repro
generation.
Only the tests in the transform dialect proper and the examples have
not been updated yet. These will be updated separately after a more
careful consideration of the testing coverage of the transform
interpreter logic.
Fix a crash reported in #64331. The crash is described in the following
comment:
> It looks like the bug is being caused by the command line argument
> --scf-parallel-loop-tiling=parallel-loop-tile-sizes=0. More
> specifically, --scf-parallel-loop-tiling=parallel-loop-tile-sizes sets
> the tileSize variable to 0 on [this
> line](7cc1bfaf37/mlir/lib/Dialect/SCF/Transforms/ParallelLoopTiling.cpp (L67)).
> tileSize is then used on [this
> line](7cc1bfaf37/mlir/lib/Dialect/SCF/Transforms/ParallelLoopTiling.cpp (L117))
> causing a divide by zero exception.
This PR will:
1. Call `signalPassFailure()` when 0 is passed as a tile size.
2. Avoid the divide by zero that causes the crash.
Note: This is my first PR for MLIR, so please liberally critique it.
In particular, `upperBoundUnrolledCst` may be larger than `ubCst` when:
1. the step size is greater than 1;
2. `ub - lb` is not evenly divisible by the step size; and
3. the loop's trip count is evenly divisible by the unroll factor.
This is okay since the non-unit step size ensures that the unrolled loop
maintains the same trip count as the original loop. Added a test case
for this.
Fixes #61832.
Co-authored-by: Stephen Chou <stephenchou@google.com>
Add a new interface method that returns the yielded values.
Also add a verifier that checks the number of inits/iter_args/yielded
values. Most of the checked invariants (but not all of them) are already
covered by the `RegionBranchOpInterface`, but the `LoopLikeOpInterface`
now provides (additional) error messages that are easier to read.
The `BufferizableOpInterface` implementation of `scf.for` currently
assumes that an OpResult does not alias with any tensor apart from the
corresponding init OpOperand. Newly allocated buffers (inside of the
loop) are also allowed. The current implementation checks whether the
respective init_arg and yielded value are equivalent. This is overly
strict and causes extra buffer allocations/copies when yielding a new
buffer allocation from a loop.
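A sketch of the pattern that used to be rejected (hypothetical shapes; assumes %cst, %lb, %ub, %s, and %init are defined):
```mlir
%r = scf.for %iv = %lb to %ub step %s iter_args(%t = %init) -> (tensor<5xf32>) {
  // The yielded value is a new allocation, not an in-place update of %t.
  %empty = tensor.empty() : tensor<5xf32>
  %filled = linalg.fill ins(%cst : f32) outs(%empty : tensor<5xf32>) -> tensor<5xf32>
  scf.yield %filled : tensor<5xf32>
}
```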
This patch updates `transform.loop.peel` so that this Op returns two
rather than one handle:
* one for the peeled loop, and
* one for the remainder loop.
Also, following this change this Op will fail if peeling fails. This is
consistent with other similar Ops that also fail if no transformation
takes place.
Relands #67482 with an extra fix for transform_loop_ext.py
Rename and restructure tiling-related transform ops from the structured
extension to be more homogeneous. In particular, all ops now follow a
consistent naming scheme:
- `transform.structured.tile_using_for`;
- `transform.structured.tile_using_forall`;
- `transform.structured.tile_reduction_using_for`;
- `transform.structured.tile_reduction_using_forall`.
This drops the "_op" naming artifact from `tile_to_forall_op` that
shouldn't have been included in the first place, consistently specifies
the name of the control flow op to be produced for loops (instead of
`tile_reduction_using_scf` since `scf.forall` also belongs to `scf`),
and opts for the `using` connector to avoid ambiguity.
The loops produced by tiling are now systematically placed as *trailing*
results of the transform op. While this required changing 3 out of 4 ops
(except for `tile_using_for`), this is the only choice that makes sense
when producing multiple `scf.for` ops that can be associated with a
variadic number of handles. This choice is also most consistent with
*other* transform ops from the structured extension, in particular with
fusion ops, that produce the structured op as the leading result and the
loop as the trailing result.
Add a straightforward sequentialization transform from `scf.forall` to a
nest of `scf.for` in absence of results and expose it as a transform op.
This is helpful in combination with other transform ops, particularly
fusion, that work best on parallel-by-construction `scf.forall` but
later need to target sequential `for` loops.
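The transformation in a nutshell (sketch; `test.payload` is a stand-in for the loop body and the constants are assumed):
```mlir
// Before: parallel-by-construction, no results.
scf.forall (%i, %j) in (4, 8) {
  "test.payload"(%i, %j) : (index, index) -> ()
}

// After sequentialization: a nest of scf.for.
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c4 = arith.constant 4 : index
%c8 = arith.constant 8 : index
scf.for %i = %c0 to %c4 step %c1 {
  scf.for %j = %c0 to %c8 step %c1 {
    "test.payload"(%i, %j) : (index, index) -> ()
  }
}
```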
This commit removes the deallocation capabilities of
one-shot-bufferization. One-shot-bufferization should never deallocate
any memrefs as this should be entirely handled by the
ownership-based-buffer-deallocation pass going forward. This means the
`allow-return-allocs` pass option now defaults to true and
`create-deallocs` defaults to false; both options, as well as the
escape attribute indicating whether a memref escapes the current
region, will be removed. A new `allow-return-allocs-from-loops` option
is added as a
temporary workaround for some bufferization limitations.
Use the new ownership based deallocation pass pipeline in the regression
and integration tests. Some one-shot bufferization tests tested one-shot
bufferize and deallocation at the same time. I removed the deallocation
pass there because the deallocation pass is already thoroughly tested by
itself.
Fixed version of #66471
* Always use the auto-generated `getInitArgs` function. Remove the
hand-written `getInitOperands` duplicate.
* Remove `hasIterOperands` and `getNumIterOperands`. The names were
inconsistent because the "arg" is called `initArgs` in TableGen. Use
`getInitArgs().size()` instead.
* Fix verification around ops with no results.
This change adds a method to modify the ConversionTarget used during
`transform.apply_conversion_patterns` to the
`ConversionPatternDescriptorOpInterface`. This is needed when the TypeConverter
is used to dictate the dynamic legality of operations, as in "structural"
conversion patterns present in, for example, the SCF and func dialects.
As a first use case/test, this change also adds a
`transform.apply_patterns.scf.structural_conversions` operation to the SCF
dialect.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D158672
The `scf.forall.in_parallel` terminator operation has a nested graph
region with the `NoTerminator` trait. Such regions are not supported by
the default implementations. Therefore, this commit adds a specialized
implementation for this operation that only covers the case where the
nested region is empty. This is because, after bufferization, ops like
`tensor.parallel_insert_slice` have already been converted to memref
operations residing in the `scf.forall` body only, and the nested
region of `scf.forall.in_parallel` ends up empty.