These properties were useful for a few things before traits had a better integration story, but don't really carry their weight well these days. Most of these properties are already checked via traits in most of the code. It is better to align the system around traits, and improve the performance/cost of traits in general.
Differential Revision: https://reviews.llvm.org/D96088
This makes ignoring a result explicit by the user, and helps to prevent accidental errors with dropped results. Marking LogicalResult as no discard was always the intention from the beginning, but got lost along the way.
Differential Revision: https://reviews.llvm.org/D95841
In dialect conversion infrastructure, source materialization applies as part of
the finalization procedure to results of the newly produced operations that
replace previously existing values with values having a different type.
However, such operations may be created to replace operations created in other
patterns. At this point, it is possible that the results of the _original_
operation are still in use and have mismatching types, but the results of the
_intermediate_ operation that performed the type change are not in use leading
to the absence of source materialization. For example,
%0 = dialect.produce : !dialect.A
dialect.use %0 : !dialect.A
can be replaced with
%0 = dialect.other : !dialect.A
%1 = dialect.produce : !dialect.A // replaced, scheduled for removal
dialect.use %1 : !dialect.A
and then with
%0 = dialect.final : !dialect.B
%1 = dialect.other : !dialect.A // replaced, scheduled for removal
%2 = dialect.produce : !dialect.A // replaced, scheduled for removal
dialect.use %2 : !dialect.A
in the same rewriting, but only the %1->%0 replacement is currently considered.
Change the logic in dialect conversion to look up all values that were replaced
by the given value and performing source materialization if any of those values
is still in use with mismatching types. This is performed by computing the
inverse value replacement mapping. This arguably expensive manipulation is
performed only if there were some type-changing replacements. An alternative
could be to consider all replaced operations and not only those that resulted
in type changes, but it would harm pattern-level composability: the pattern
that performed the non-type-changing replacement would have to be made aware of
the type converter in order to call the materialization hook.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95626
We could extend this with an interface to allow dialect to perform a type
conversion, but that would make the folder creating operation which isn't
the case at the moment, and isn't necessarily always desirable.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95991
In dialect conversion, signature conversions essentially perform block argument
replacement and are added to the general value remapping. However, the replaced
values were not tracked, so if a signature conversion was rolled back, the
construction of operand lists for the following patterns could have obtained
block arguments from the mapping and give them to the pattern leading to
use-after-free. Keep track of signature conversions similarly to normal block
argument replacement, and erase such replacements from the general mapping when
the conversion is rolled back.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95688
Currently, for a scf.parallel (i,j,k) after the loop collapsing to 1D is done, the
IVs would be traversed as for an scf.parallel(k,j,i).
Differential Revision: https://reviews.llvm.org/D95693
This patch adds support for producer-consumer fusion scenarios with
multiple producer stores to the AffineLoopFusion pass. The patch
introduces some changes to the producer-consumer algorithm, including:
* For a given consumer loop, producer-consumer fusion iterates over its
producer candidates until a fixed point is reached.
* Producer candidates are gathered beforehand for each iteration of the
consumer loop and visited in reverse program order (not strictly guaranteed)
to maximize the number of loops fused per iteration.
In general, these changes were needed to simplify the multi-store producer
support and remove some of the workarounds that were introduced in the past
to support more fusion cases under the single-store producer limitation.
This patch also preserves the existing functionality of AffineLoopFusion with
one minor change in behavior. Producer-consumer fusion didn't fuse scenarios
with escaping memrefs and multiple outgoing edges (from a single store).
Multi-store producer scenarios will usually (always?) have multiple outgoing
edges so we couldn't fuse any with escaping memrefs, which would greatly limit
the applicability of this new feature. Therefore, the patch enables fusion for
these scenarios. Please, see modified tests for specific details.
Reviewed By: andydavis1, bondhugula
Differential Revision: https://reviews.llvm.org/D92876
This extracts the implementation of getType, setType, and getBody from
FunctionSupport.h into the mlir::impl namespace and defines them
generically in FunctionSupport.cpp. This allows them to be used
elsewhere for any FunctionLike ops that use FunctionType for their
type signature.
Using the new helpers, FuncOpSignatureConversion is generalized to
work with all such FunctionLike ops. Convenience helpers are added to
configure the pattern for a given concrete FunctionLike op type.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95021
This patch adds support for producer-consumer fusion scenarios with
multiple producer stores to the AffineLoopFusion pass. The patch
introduces some changes to the producer-consumer algorithm, including:
* For a given consumer loop, producer-consumer fusion iterates over its
producer candidates until a fixed point is reached.
* Producer candidates are gathered beforehand for each iteration of the
consumer loop and visited in reverse program order (not strictly guaranteed)
to maximize the number of loops fused per iteration.
In general, these changes were needed to simplify the multi-store producer
support and remove some of the workarounds that were introduced in the past
to support more fusion cases under the single-store producer limitation.
This patch also preserves the existing functionality of AffineLoopFusion with
one minor change in behavior. Producer-consumer fusion didn't fuse scenarios
with escaping memrefs and multiple outgoing edges (from a single store).
Multi-store producer scenarios will usually (always?) have multiple outgoing
edges so we couldn't fuse any with escaping memrefs, which would greatly limit
the applicability of this new feature. Therefore, the patch enables fusion for
these scenarios. Please, see modified tests for specific details.
Reviewed By: andydavis1, bondhugula
Differential Revision: https://reviews.llvm.org/D92876
Add a check if regions do not implement the RegionBranchOpInterface. This is not
allowed in the current deallocation steps. Furthermore, we handle edge-cases,
where a single region is attached and the parent operation has no results.
This fixes: https://bugs.llvm.org/show_bug.cgi?id=48575
Differential Revision: https://reviews.llvm.org/D94586
This revision adds a new `replaceOpWithIf` hook that replaces uses of an operation that satisfy a given functor. If all uses are replaced, the operation gets erased in a similar manner to `replaceOp`. DialectConversion support will be added in a followup as this requires adjusting how replacements are tracked there.
Differential Revision: https://reviews.llvm.org/D94632
This corrects the last 2 issues caught by tests when causing dialect
conversion rollbacks to occur.
Differential Revision: https://reviews.llvm.org/D94623
getDynOperands behavior is commonly used in a number of passes. Refactored to
use a helper function and avoid code reuse.
Differential Revision: https://reviews.llvm.org/D94340
This revision adds a new `initialize(MLIRContext *)` hook to passes that allows for them to initialize any heavy state before the first execution of the pass. A concrete use case of this is with patterns that rely on PDL, given that PDL is compiled at run time it is imperative that compilation results are cached as much as possible. The first use of this hook is in the Canonicalizer, which has the added benefit of reducing the number of expensive accesses to the context when collecting patterns.
Differential Revision: https://reviews.llvm.org/D93147
This class used to serve a few useful purposes:
* Allowed containing a null DictionaryAttr
* Provided some simple mutable API around a DictionaryAttr
The first of which is no longer an issue now that there is much better caching support for attributes in general, and a cache in the context for empty dictionaries. The second results in more trouble than it's worth because it mutates the internal dictionary on every action, leading to a potentially large number of dictionary copies. NamedAttrList is a much better alternative for the second use case, and should be modified as needed to better fit it's usage as a DictionaryAttrBuilder.
Differential Revision: https://reviews.llvm.org/D93442
This better matches the rest of the infrastructure, is much simpler, and makes it easier to move these types to being declaratively specified.
Differential Revision: https://reviews.llvm.org/D93432
The current condition implies that the target materialization will be
called even if the type is the new operand type is legal, but slightly
different. For example, if there is a bufferization pattern that changes
memref layout, then target materialization for an illegal type
(TensorType) would be called.
Differential Revision: https://reviews.llvm.org/D93126
Now that passes have support for running nested pipelines, the inliner can now allow for users to provide proper nested pipelines to use for optimization during inlining. This revision also changes the behavior of optimization during inlining to optimize before attempting to inline, which should lead to a more accurate cost model and prevents the need for users to schedule additional duplicate cleanup passes before/after the inliner that would already be run during inlining.
Differential Revision: https://reviews.llvm.org/D91211
This reverts commit 0d48d265db.
This reapplies the following commit, with a fix for CAPI/ir.c:
[mlir] Start splitting the `tensor` dialect out of `std`.
This starts by moving `std.extract_element` to `tensor.extract` (this
mirrors the naming of `vector.extract`).
Curiously, `std.extract_element` supposedly works on vectors as well,
and this patch removes that functionality. I would tend to do that in
separate patch, but I couldn't find any downstream users relying on
this, and the fact that we have `vector.extract` made it seem safe
enough to lump in here.
This also sets up the `tensor` dialect as a dependency of the `std`
dialect, as some ops that currently live in `std` depend on
`tensor.extract` via their canonicalization patterns.
Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2
Differential Revision: https://reviews.llvm.org/D92991
This starts by moving `std.extract_element` to `tensor.extract` (this
mirrors the naming of `vector.extract`).
Curiously, `std.extract_element` supposedly works on vectors as well,
and this patch removes that functionality. I would tend to do that in
separate patch, but I couldn't find any downstream users relying on
this, and the fact that we have `vector.extract` made it seem safe
enough to lump in here.
This also sets up the `tensor` dialect as a dependency of the `std`
dialect, as some ops that currently live in `std` depend on
`tensor.extract` via their canonicalization patterns.
Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2
Differential Revision: https://reviews.llvm.org/D92991
OperationFolder currently uses ConstantOp as a backup when trying to materialize a constant after an operation is folded. This dependency isn't really useful or necessary given that dialects can/should provide a `materializeConstant` implementation.
Fixes PR#44866
Differential Revision: https://reviews.llvm.org/D92980
This fixes a subtle bug where SCCP could incorrectly optimize a private callable while waiting for its arguments to be resolved.
Fixes PR#48457
Differential Revision: https://reviews.llvm.org/D92976
This is part of a larger refactoring the better congregates the builtin structures under the BuiltinDialect. This also removes the problematic "standard" naming that clashes with the "standard" dialect, which is not defined within IR/. A temporary forward is placed in StandardTypes.h to allow time for downstream users to replaced references.
Differential Revision: https://reviews.llvm.org/D92435
Memrefs with affine_map in the results of normalizable operation were
not normalized by `--normalize-memrefs` option. This patch normalizes
them.
Differential Revision: https://reviews.llvm.org/D88719
Extended promote buffers to stack pass to support dynamically shaped allocas.
The conversion is limited by the rank of the underlying tensor.
An option is added to the pass to adjust the given rank.
Differential Revision: https://reviews.llvm.org/D91969
Given that OpState already implicit converts to Operator*, this seems reasonable.
The alternative would be to add more functions to OpState which forward to Operation.
Reviewed By: rriddle, ftynse
Differential Revision: https://reviews.llvm.org/D92266
- Address TODO in scf-bufferize: the argument materialization issue is
now fixed and the code is now in Transforms/Bufferize.cpp
- Tighten up finalizing-bufferize to avoid creating invalid IR when
operand types potentially change
- Tidy up the testing of func-bufferize, and move appropriate tests
to a new finalizing-bufferize.mlir
- The new stricter checking in finalizing-bufferize revealed that we
needed a DimOp conversion pattern (found when integrating into npcomp).
Previously, the converion infrastructure was blindly changing the
operand type during finalization, which happened to work due to
DimOp's tensor/memref polymorphism, but is generally not encouraged
(the new pattern is the way to tell the conversion infrastructure that
it is legal to change that type).
The rewrite logic has an optimization to drop a cast operation after
rewriting block arguments if the cast operation has no users. This is
unsafe as there might be a pending rewrite that replaced the cast operation
itself and hence would trigger a second free.
Instead, do not remove the casts and leave it up to a later canonicalization
to do so.
Differential Revision: https://reviews.llvm.org/D92184
This enables partial bufferization that includes function signatures. To test this, this
change also makes the func-bufferize partial and adds a dedicated finalizing-bufferize pass.
Differential Revision: https://reviews.llvm.org/D92032
Block merging in MLIR will incorrectly merge blocks with operations whose values are used outside of that block. This change forbids this behavior and provides a test where it is illegal to perform such a merge.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D91745
These utilities are more closely associated with the buffer
optimizations and buffer deallocation than with the dialect conversion
stuff in Bufferize.h. So move them out.
This makes Bufferize.h very easy to understand and completely focused on
dialect conversion.
Differential Revision: https://reviews.llvm.org/D91563
Refactoring/clean-up step needed to add support for producer-consumer fusion
with multi-store producer loops and, in general, to implement more general
loop fusion strategies in Affine. It introduces the following changes:
- AffineLoopFusion pass now uses loop fusion utilities more broadly to compute
fusion legality (canFuseLoops utility) and perform the fusion transformation
(fuseLoops utility).
- Loop fusion utilities have been extended to deal with AffineLoopFusion
requirements and assumptions while preserving both loop fusion utilities and
AffineLoopFusion current functionality within a unified implementation.
'FusionStrategy' has been introduced for this purpose and, in the future, it
will allow us to have a single loop fusion core implementation that will produce
different fusion outputs depending on the strategy used.
- Improve separation of concerns for legality and profitability analysis:
'isFusionProfitable' no longer filters out illegal scenarios that 'canFuse'
didn't detect, or the other way around. 'canFuse' now takes loop dependences
into account to determine the fusion loop depth (producer-consumer fusion only).
- As a result, maximal fusion now doesn't require any profitability analysis.
- Slices are now computed only once and reused across the legality, profitability
and fusion transformation steps (producer-consumer).
- Refactor some utilities and remove redundant copies of them.
This patch is NFCI and should preserve the existing functionality of both the
AffineLoopFusion pass and the affine fusion utilities.
Reviewed By: andydavis1, bondhugula
Differential Revision: https://reviews.llvm.org/D90798
These includes have been deprecated in favor of BuiltinDialect.h, which contains the definitions of ModuleOp and FuncOp.
Differential Revision: https://reviews.llvm.org/D91572
Some rewriters take more iterations to converge, add a parameter to overwrite
the built-in maximum iteration count.
Fix PR48073.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D91553
This replaces the old type decomposition logic that was previously mixed
into bufferization, and makes it easily accessible.
This also deletes TestFinalizingBufferize, because after we remove the type
decomposition, it doesn't do anything that is not already provided by
func-bufferize.
Differential Revision: https://reviews.llvm.org/D90899
The index type does not have a bitsize and hence the size of corresponding allocations cannot be computed. Instead, the promotion pass now has an explicit option to specify the size of index.
Differential Revision: https://reviews.llvm.org/D91360
The previous logic for inlining a region A with N blocks into region B
would produce incorrect results on rollback for N greater than 1. This
rollback logic would leave blocks 1..N in region B and only move block 0
to region A.
The new inlining action recording stores the block move actions from N-1
to 0. Now on roll back, block 0 is moved to region A and then 1..N is
appended to the list of blocks in region A.
Differential Revision: https://reviews.llvm.org/D91185