- Change walkReturnOperations() to be a non-template and look at block terminator
for ReturnLike trait.
- Clarify description of validateSupportedControlFlow
- Eliminate unused argument in Backedges::recurse.
- Eliminate repeated calls to getFunction()
- Fix wording for non-SCF loop failure
Differential Revision: https://reviews.llvm.org/D106373
Changes include the following:
1. Single iteration reduction loops being sibling fused at innermost insertion level
are skipped from being considered as sequential loops.
Otherwise, the slice bounds of these loops is reset.
2. Promote loops that are skipped in previous step into outer loops.
3. Two utility function - buildSliceTripCountMap, getSliceIterationCount - are moved from
mlir/lib/Transforms/Utils/LoopFusionUtils.cpp to mlir/lib/Analysis/Utils.cpp
Reviewed By: bondhugula, vinayaka-polymage
Differential Revision: https://reviews.llvm.org/D104249
* Split memref.dim into two operations: memref.dim and tensor.dim. Both ops have the same builder interface and op argument names, so that they can be used with templates in patterns that apply to both tensors and memrefs (e.g., some patterns in Linalg).
* Add constant materializer to TensorDialect (needed for folding in affine.apply etc.).
* Remove some MemRefDialect dependencies, make some explicit.
Differential Revision: https://reviews.llvm.org/D105165
Without it BufferDeallocationPass process only CloneOps created during pass itself and ignore all CloneOps that were already present in IR.
For our specific usecase:
```
func @dealloc_existing_clones(%arg0: memref<?x?xf64>, %arg1: memref<?x?xf64>) -> memref<?x?xf64> {
return %arg0 : memref<?x?xf64>
}
```
Input arguments will be freed immediately after return from function and we want to prolong lifetime for the returned argument.
To achieve this we explicitly add clones to all input memrefs and expect that BufferDeallocationPass will add correct deallocs to them (unnessesary clone+dealloc pairs will be canonicalized away later).
Differential Revision: https://reviews.llvm.org/D104973
During slice computation of affine loop fusion, detect one id as the mod
of another id w.r.t a constant in a more generic way. Restrictions on
co-efficients of the ids is removed. Also, information from the
previously calculated ids is used for simplification of affine
expressions, e.g.,
If `id1` = `id2`,
`id_n - divisor * id_q - id_r + id1 - id2 = 0`, is simplified to:
`id_n - divisor * id_q - id_r = 0`.
If `c` is a non-zero integer,
`c*id_n - c*divisor * id_q - c*id_r = 0`, is simplified to:
`id_n - divisor * id_q - id_r = 0`.
Reviewed By: bondhugula, ayzhuang
Differential Revision: https://reviews.llvm.org/D104614
The main goal of this commit is to remove the dependency of Standard dialect on the Tensor dialect.
* Rename SubTensorOp -> tensor.extract_slice, SubTensorInsertOp -> tensor.insert_slice.
* Some helper functions are (already) duplicated between the Tensor dialect and the MemRef dialect. To keep this commit smaller, this will be cleaned up in a separate commit.
* Additional dialect dependencies: Shape --> Tensor, Tensor --> Standard
* Remove dialect dependencies: Standard --> Tensor
* Move canonicalization test cases to correct dialect (Tensor/MemRef).
Note: This is a fixed version of https://reviews.llvm.org/D104499, which was reverted due to a missing update to two CMakeFile.txt.
Differential Revision: https://reviews.llvm.org/D104676
The main goal of this commit is to remove the dependency of Standard dialect on the Tensor dialect.
* Rename ops: SubTensorOp --> ExtractTensorOp, SubTensorInsertOp --> InsertTensorOp
* Some helper functions are (already) duplicated between the Tensor dialect and the MemRef dialect. To keep this commit smaller, this will be cleaned up in a separate commit.
* Additional dialect dependencies: Shape --> Tensor, Tensor --> Standard
* Remove dialect dependencies: Standard --> Tensor
* Move canonicalization test cases to correct dialect (Tensor/MemRef).
Differential Revision: https://reviews.llvm.org/D104499
Based on dicussion in
[this](https://llvm.discourse.group/t/remove-canonicalizer-for-memref-dim-via-shapedtypeopinterface/3641)
thread the pattern to resolve the `memref.dim` of a value that is a
result of an operation that implements the
`InferShapedTypeOpInterface` is moved to a separate pass instead of
running it as a canonicalization pass. This allows shape resolution to
happen when explicitly required, instead of automatically through a
canonicalization.
Differential Revision: https://reviews.llvm.org/D104321
This patch changes the (not recommended) static registration API from:
static PassRegistration<MyPass> reg("my-pass", "My Pass Description.");
to:
static PassRegistration<MyPass> reg;
And the explicit registration from:
void registerPass("my-pass", "My Pass Description.",
[] { return createMyPass(); });
To:
void registerPass([] { return createMyPass(); });
It is expected that Pass implementations overrides the getArgument() method
instead. This will ensure that pipeline description can be printed and parsed
back.
Differential Revision: https://reviews.llvm.org/D104421
This allows for dialects to do different post-processing depending on operations with the inliner (my use case requires different attribute propagation rules depending on call op). This hook runs before the regular processInlinedBlocks method.
Differential Revision: https://reviews.llvm.org/D104399
This revision allows for attaching "debug labels" to patterns, and provides to FrozenRewritePatternSet for filtering patterns based on these labels (in addition to the debug name of the pattern). This will greatly simplify the ability to write tests targeted towards specific patterns (in cases where many patterns may interact), will also simplify debugging pattern application by observing how application changes when enabling/disabling specific patterns.
To enable better reuse of pattern rewrite options between passes, this revision also adds a new PassUtil.td file to the Rewrite/ library that will allow for passes to easily hook into a common interface for pattern debugging. Two options are used to seed this utility, `disable-patterns` and `enable-patterns`, which are used to enable the filtering behavior indicated above.
Differential Revision: https://reviews.llvm.org/D102441
* Add `hasCanonicalizer` option to Dialect.
* Initialize canonicalizer with dialect-wide canonicalization patterns.
* Add test case to TestDialect.
Dialect-wide canonicalization patterns are useful if a canonicalization pattern does not conceptually associate with any single operation, i.e., it should not be registered as part of an operation's `getCanonicalizationPatterns` function. E.g., this is the case for canonicalization patterns that match an op interface.
Differential Revision: https://reviews.llvm.org/D103226
This provides a sizable compile time improvement by seeding
the worklist in an order that leads to less iterations of the
worklist.
This patch only changes the behavior of the Canonicalize pass
itself, it does not affect other passes that use the
GreedyPatternRewrite driver
Differential Revision: https://reviews.llvm.org/D103053
Steps for normalizing dynamic memrefs for tiled layout map
1. Check if original map is tiled layout. Only tiled layout is supported.
2. Create normalized memrefType. Dimensions that include dynamic dimensions
in the map output will be dynamic dimensions.
3. Create new maps to calculate each dimension size of new memref.
In tiled layout, the dimension size can be calculated by replacing
"floordiv <tile size>" with "ceildiv <tile size>" and
"mod <tile size>" with "<tile size>".
4. Create AffineApplyOp to apply the new maps. The output of AffineApplyOp is
dynamicSizes for new AllocOp.
5. Add the new dynamic sizes in new AllocOp.
This patch also set MemRefsNormalizable trant in CastOp and DimOp since
they used with dynamic memrefs.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D97655
This adds the ability to specify a location when creating BlockArguments.
Notably Value::getLoc() will return this correctly, which makes diagnostics
more precise (e.g. the example in test-legalize-type-conversion.mlir).
This is currently optional to avoid breaking any existing code - if
absent, the BlockArgument defaults to using the location of its enclosing
operation (preserving existing behavior).
The bulk of this change is plumbing location tracking through the parser
and printer to make sure it can round trip (in -mlir-print-debuginfo
mode). This is complete for generic operations, but requires manual
adoption for custom ops.
I added support for function-like ops to round trip their argument
locations - they print correctly, but when parsing the locations are
dropped on the floor. I intend to fix this, but it will require more
invasive plumbing through "function_like_impl" stuff so I think it
best to split it out to its own patch.
This is a reapply of the patch here: https://reviews.llvm.org/D102567
with an additional change: we now never defer block argument locations,
guaranteeing that we can round trip correctly.
This isn't required in all cases, but allows us to hill climb here and
works around unrelated bugs like https://bugs.llvm.org/show_bug.cgi?id=50451
Differential Revision: https://reviews.llvm.org/D102991
"[mlir] Speed up Lexer::getEncodedSourceLocation"
This reverts commit 3043be9d2d and commit
861d69a525.
This change resulted in printing textual MLIR that can't be parsed; see
review thread https://reviews.llvm.org/D102567 for details.
This adds the ability to specify a location when creating BlockArguments.
Notably Value::getLoc() will return this correctly, which makes diagnostics
more precise (e.g. the example in test-legalize-type-conversion.mlir).
This is currently optional to avoid breaking any existing code - if
absent, the BlockArgument defaults to using the location of its enclosing
operation (preserving existing behavior).
The bulk of this change is plumbing location tracking through the parser
and printer to make sure it can round trip (in -mlir-print-debuginfo
mode). This is complete for generic operations, but requires manual
adoption for custom ops.
I added support for function-like ops to round trip their argument
locations - they print correctly, but when parsing the locations are
dropped on the floor. I intend to fix this, but it will require more
invasive plumbing through "function_like_impl" stuff so I think it
best to split it out to its own patch.
Differential Revision: https://reviews.llvm.org/D102567
During affine loop fusion, create private memrefs for escaping memrefs
too under the conditions that:
-- the source is not removed after fusion, and
-- the destination does not write to the memref.
This creates more fusion opportunities as illustrated in the test case.
Reviewed By: bondhugula, ayzhuang
Differential Revision: https://reviews.llvm.org/D102604
In the buffer deallocation pass, unranked memref types are not properly supported.
After investigating this issue, it turns out that the Clone and Dealloc operation
does not support unranked memref types in the current implementation.
This patch adds the missing feature and enables the transformation of any memref
type.
This patch solves this bug: https://bugs.llvm.org/show_bug.cgi?id=48385
Differential Revision: https://reviews.llvm.org/D101760
We weren't properly visiting region successors when the terminator wasn't return like, which could create incorrect results in the analysis. This revision ensures that we properly visit region successors, to avoid optimistically assuming a value is constant when it isn't.
Differential Revision: https://reviews.llvm.org/D101783
Add two canoncalizations for scf.if.
1) A canonicalization that allows users of a condition within an if to assume the condition
is true if in the true region, etc.
2) A canonicalization that removes yielded statements that are equivalent to the condition
or its negation
Differential Revision: https://reviews.llvm.org/D101012
When allocLikeOp is updated in alloc constant folding,
alighnment attribute was ignored. This patch fixes it.
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D99882
MLIR test Transforms/canonicalize.mlir tries to check for the absence of
a sequence of instructions with several CHECK-NOT with one of those
directives using a variable defined in another. However CHECK-NOT are
checked independently so that is using a variable defined in a pattern
that should not occur in the input.
This commit removes the dependency between those CHECK-NOT by replacing
occurences of variables by the regex that were used to define them.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D99958
Fixes a bug in affine fusion pipeline where an incorrect slice is computed.
After the slice computation is done, original domain of the the source is
compared with the new domain that will result if the fusion succeeds. If the
new domain must be a subset of the original domain for the slice to be
valid. If the slice computed is incorrect, fusion based on such a slice is
avoided.
Relevant test cases are added/edited.
Fixes https://bugs.llvm.org/show_bug.cgi?id=49203
Differential Revision: https://reviews.llvm.org/D98239
If an operation has been inserted as a key in to the known values
hashtable, then it can not be modified in a way which changes its hash.
This change avoids modifying the operands of any previously recorded
operation, which prevents their hash from changing.
In an SSACFG region, it is impossible to visit an operation before
visiting its operands, so this is not a problem. This situation can only
happen in regions without strict dominance, such as graph regions.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D99486
This allows for the conversion to match `A(B()) -> C()` with a pattern matching
`A` and marking `B` for deletion.
Also add better assertions when an operation is erased while still having uses.
Differential Revision: https://reviews.llvm.org/D99442
A new `InterfaceMethod` is added to `InferShapedTypeOpInterface` that
allows an operation to return the `Value`s for each dim of its
results. It is intended for the case where the `Value` returned for
each dim is computed using the operands and operation attributes. This
interface method is for cases where the result dim of an operation can
be computed independently, and it avoids the need to aggregate all
dims of a result into a single shape value. This also implies that
this is not suitable for cases where the result type is unranked (for
which the existing interface methods is to be used).
Also added is a canonicalization pattern that uses this interface and
resolves the shapes of the output in terms of the shapes of the
inputs. Moving Linalg ops to use this interface, so that many
canonicalization patterns implemented for individual linalg ops to
achieve the same result can be removed in favor of the added
canonicalization pattern.
Differential Revision: https://reviews.llvm.org/D97887
Add a new clone operation to the memref dialect. This operation implicitly
copies data from a source buffer to a new buffer. In contrast to the linalg.copy
operation, this operation does not accept a target buffer as an argument.
Instead, this operation performs a conceptual allocation which does not need to
be performed manually.
Furthermore, this operation resolves the dependency from the linalg-dialect
in the BufferDeallocation pass. In addition, we also extended the canonicalization
patterns to fold clone operations. The copy removal pass has been removed.
Differential Revision: https://reviews.llvm.org/D99172
This reverts commit 361b7d125b by Chris
Lattner <clattner@nondot.org> dated Fri Mar 19 21:22:15 2021 -0700.
The change to the greedy rewriter driver picking a different order was
made without adequate analysis of the trade-offs and experimentation. A
change like this has far reaching consequences on transformation
pipelines, and a major impact upstream and downstream. For eg., one
can’t be sure that it doesn’t slow down a large number of cases by small
amounts or create other issues. More discussion here:
https://llvm.discourse.group/t/speeding-up-canonicalize/3015/25
Reverting this so that improvements to the traversal order can be made
on a clean slate, in bigger steps, and higher bar.
Differential Revision: https://reviews.llvm.org/D99329
In particular for Graph Regions, the terminator needs is just a
historical artifact of the generalization of MLIR from CFG region.
Operations like Module don't need a terminator, and before Module
migrated to be an operation with region there wasn't any needed.
To validate the feature, the ModuleOp is migrated to use this trait and
the ModuleTerminator operation is deleted.
This patch is likely to break clients, if you're in this case:
- you may iterate on a ModuleOp with `getBody()->without_terminator()`,
the solution is simple: just remove the ->without_terminator!
- you created a builder with `Builder::atBlockTerminator(module_body)`,
just use `Builder::atBlockEnd(module_body)` instead.
- you were handling ModuleTerminator: it isn't needed anymore.
- for generic code, a `Block::mayNotHaveTerminator()` may be used.
Differential Revision: https://reviews.llvm.org/D98468
This reapplies b5d9a3c / https://reviews.llvm.org/D98609 with a one line fix in
processExistingConstants to skip() when erasing a constant we've already seen.
Original commit message:
1) Change the canonicalizer to walk the function in top-down order instead of
bottom-up order. This composes well with the "top down" nature of constant
folding and simplification, reducing iterations and re-evaluation of ops in
simple cases.
2) Explicitly enter existing constants into the OperationFolder table before
canonicalizing. Previously we would "constant fold" them and rematerialize
them, wastefully recreating a bunch fo constants, which lead to pointless
memory traffic.
Both changes together provide a 33% speedup for canonicalize on some mid-size
CIRCT examples.
One artifact of this change is that the constants generated in normal pattern
application get inserted at the top of the function as the patterns are applied.
Because of this, we get "inverted" constants more often, which is an aethetic
change to the IR but does permute some testcases.
Differential Revision: https://reviews.llvm.org/D99006
When deleting operations in DCE, the algorithm uses a post-order walk of
the IR to ensure that value uses were erased before value defs. Graph
regions do not have the same structural invariants as SSA CFG, and this
post order walk could delete value defs before uses. This problem is
guaranteed to occur when there is a cycle in the use-def graph.
This change stops DCE from visiting the operations and blocks in any
meaningful order. Instead, we rely on explicitly dropping all uses of a
value before deleting it.
Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D98919
This reverts commit b5d9a3c923.
The commit introduced a memory error in canonicalization/operation
walking that is exposed when compiled with ASAN. It leads to crashes in
some "release" configurations.
Two changes:
1) Change the canonicalizer to walk the function in top-down order instead of
bottom-up order. This composes well with the "top down" nature of constant
folding and simplification, reducing iterations and re-evaluation of ops in
simple cases.
2) Explicitly enter existing constants into the OperationFolder table before
canonicalizing. Previously we would "constant fold" them and rematerialize
them, wastefully recreating a bunch fo constants, which lead to pointless
memory traffic.
Both changes together provide a 33% speedup for canonicalize on some mid-size
CIRCT examples.
One artifact of this change is that the constants generated in normal pattern
application get inserted at the top of the function as the patterns are applied.
Because of this, we get "inverted" constants more often, which is an aethetic
change to the IR but does permute some testcases.
Differential Revision: https://reviews.llvm.org/D98609