`scf.foreach_thread` results alias with the underlying `scf.foreach_thread.parallel_insert_slice` destination operands
and they bufferize to equivalent buffers in the absence of other conflicts.
`scf.foreach_thread.parallel_insert_slice` conflict detection is similar to `tensor.insert_slice` conflict detection.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D126769
Add an option to predicate the epilogue within the kernel instead of
peeling the epilogue. This is a useful option to prevent generating
large amount of code for deep pipeline. This currently require a user
lamdba to implement operation predication.
Differential Revision: https://reviews.llvm.org/D126753
These attributes can carry useful information, e.g., pipelines
might use them to organize and chain patterns.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D126320
Before this fix, the bufferization implementation made the incorrect assumption that the values yielded from the "before" region must match with the values yielded from the "after" region.
Differential Revision: https://reviews.llvm.org/D125835
Instead of recomputing memref types from tensor types, try to infer them when possible. This results in more precise layout maps.
Differential Revision: https://reviews.llvm.org/D125614
This follows the same implementation strategy as scf::ForOp and common functionality is extracted into helper functions.
This implementation works well in cases where each yielded value (from either body/condition region) is equivalent to the corresponding bbArg of the parent block. In that case, each OpResult of the loop may be aliasing with the corresponding OpOperand of the loop (and with no other OpOperand).
In the absence of said equivalence relationship, new buffer copies must be inserted, so that the aliasing OpOperand/OpResult contract of scf::WhileOp is honored. In essence, by yielding a newly allocated buffer, we can enforce the specified may-alias relationship. (Newly allocated buffers cannot alias with any OpOperands of the loop.)
Differential Revision: https://reviews.llvm.org/D124929
The previous error message was technically incorrect. We do not compare equivalence of YieldOp operands and ForOp operands.
Differential Revision: https://reviews.llvm.org/D124934
Now that dialect constructors are generated in the .cpp file, we can
drop all of the dependent dialect includes from the .h file.
Differential Revision: https://reviews.llvm.org/D124298
Lookup iter_arg buffers using `lookupBuffer` instead of always creating a new `ToMemrefOp`. Also cast all yielded buffers (if necessary), regardless of whether they are an equivalent buffer or a new allocation.
Note: This should have been part of D123369.
Differential Revision: https://reviews.llvm.org/D123383
The current dialect registry allows for attaching delayed interfaces, that are added to attrs/dialects/ops/etc.
when the owning dialect gets loaded. This is clunky for quite a few reasons, e.g. each interface type has a
separate tracking structure, and is also quite limiting. This commit refactors this delayed mutation of
dialect constructs into a more general DialectExtension mechanism. This mechanism is essentially a registration
callback that is invoked when a set of dialects have been loaded. This allows for attaching interfaces directly
on the loaded constructs, and also allows for loading new dependent dialects. The latter of which is
extremely useful as it will now enable dependent dialects to only apply in the contexts in which they
are necessary. For example, a dialect dependency can now be conditional on if a user actually needs the
interface that relies on it.
Differential Revision: https://reviews.llvm.org/D120367
This commit moves FuncOp out of the builtin dialect, and into the Func
dialect. This move has been planned in some capacity from the moment
we made FuncOp an operation (years ago). This commit handles the
functional aspects of the move, but various aspects are left untouched
to ease migration: func::FuncOp is re-exported into mlir to reduce
the actual API churn, the assembly format still accepts the unqualified
`func`. These temporary measures will remain for a little while to
simplify migration before being removed.
Differential Revision: https://reviews.llvm.org/D121266
This removes a restriction wrt. scf.for loops during One-Shot Bufferization. Such IR was previously rejected. It is still rejected by default because the bufferized IR could be slow. But such IR can now be bufferized with `allow-return-allocs`.
Differential Revision: https://reviews.llvm.org/D121529
This improves the modularity of the bufferization.
From now on, all ops that do not implement BufferizableOpInterface are considered hoisting barriers. Previously, all ops that do not implement the interface were not considered barriers and such ops had to be marked as barriers explicitly. This was unsafe because we could've hoisted across unknown ops where it was not safe to hoist.
As a side effect, this allows for cleaning up AffineBufferizableOpInterfaceImpl. This build unit no longer needed and can be deleted.
Differential Revision: https://reviews.llvm.org/D121519
The Func has a large number of legacy dependencies carried over from the old
Standard dialect, which was pervasive and contained a large number of varied
operations. With the split of the standard dialect and its demise, a lot of lingering
dead dependencies have survived to the Func dialect. This commit removes a
large majority of then, greatly reducing the dependence surface area of the
Func dialect.
The last remaining operations in the standard dialect all revolve around
FuncOp/function related constructs. This patch simply handles the initial
renaming (which by itself is already huge), but there are a large number
of cleanups unlocked/necessary afterwards:
* Removing a bunch of unnecessary dependencies on Func
* Cleaning up the From/ToStandard conversion passes
* Preparing for the move of FuncOp to the Func dialect
See the discussion at https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061
Differential Revision: https://reviews.llvm.org/D120624
Add `BufferizableOpInterface::verifyAnalysis`. Ops can implement this method to check for expected invariants and limitations.
The purpose of this change is to introduce a modular way of checking assertions such as `assertScfForAliasingProperties`.
Differential Revision: https://reviews.llvm.org/D120189
insert is soft deprecated, so remove all references so it's less likely
to be used and can be easily removed in the future.
Differential Revision: https://reviews.llvm.org/D120021
This allow user to register a callback that can annotate operations
during software pipelining. This allows user potential annotate op to
know what part of the pipeline they correspond to.
Differential Revision: https://reviews.llvm.org/D119866
This makes getAliasingOpResult symmetric to getAliasingOpOperand. The previous implementation was confusing for users and implemented in such a way only because there are currently no bufferizable ops that have multiple aliasing OpResults.
Differential Revision: https://reviews.llvm.org/D119259
They used to be classes with a virtual `run` function. This was inconvenient because post analysis steps are stored in BufferizationOptions. Because of this design choice, BufferizationOptions were not copyable.
Differential Revision: https://reviews.llvm.org/D119258
Induction variable calculation was ignoring scf.for step value. Fix it to get
the correct induction variable value in the prologue.
Differential Revision: https://reviews.llvm.org/D118932
This has been a major TODO for a very long time, and is necessary for establishing a proper
dialect-free dependency layering for the Transforms library. Code was moved to effectively
two main locations:
* Affine/
There was quite a bit of affine dialect related code in Transforms/ do to historical reasons
(of a time way into MLIR's past). The following headers were moved to:
Transforms/LoopFusionUtils.h -> Dialect/Affine/LoopFusionUtils.h
Transforms/LoopUtils.h -> Dialect/Affine/LoopUtils.h
Transforms/Utils.h -> Dialect/Affine/Utils.h
The following transforms were also moved:
AffineLoopFusion, AffinePipelineDataTransfer, LoopCoalescing
* SCF/
Only one SCF pass was in Transforms/ (likely accidentally placed here): ParallelLoopCollapsing
The SCF specific utilities in LoopUtils have been moved to SCF/Utils.h
* Misc:
mlir::moveLoopInvariantCode was also moved to LoopLikeInterface.h given
that it is a simple utility defined in terms of LoopLikeOpInterface.
Differential Revision: https://reviews.llvm.org/D117848
BlockArguments gained the ability to have locations attached a while ago, but they
have always been optional. This goes against the core tenant of MLIR where location
information is a requirement, so this commit updates the API to require locations.
Fixes#53279
Differential Revision: https://reviews.llvm.org/D117633
The only benefit of FunctionPass is that it filters out function
declarations. This isn't enough to justify carrying it around, as we can
simplify filter out declarations when necessary within the pass. We can
also explore with better scheduling primitives to filter out declarations
at the pipeline level in the future.
The definition of FunctionPass is left intact for now to allow time for downstream
users to migrate.
Differential Revision: https://reviews.llvm.org/D117182
The current state of the top level Analysis/ directory is that it contains two libraries;
a generic Analysis library (free from dialect dependencies), and a LoopAnalysis library
that contains various analysis utilities that originated from Affine loop transformations.
This commit moves the LoopAnalysis to the more appropriate home of `Dialect/Affine/Analysis/`,
given the use and intention of the majority of the code within it. After the move, if there
are generic utilities that would fit better in the top-level Analysis/ directory, we can move
them.
Differential Revision: https://reviews.llvm.org/D117351
This revision refactors the implementation of outlineIfOp to expose
a finer-grain functionality `outlineSingleBlockRegion` that will be
reused in other contexts.
Differential Revision: https://reviews.llvm.org/D116591
These functions are generic utility functions that operates on
affine ops within SCF regions. Moving them to their own files
for a better code structure, instead of mixing with loop
specialization logic.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D115245
* Implement `FlatAffineConstraints::getConstantBound(EQ)`.
* Inject a simpler constraint for loops that have at most 1 iteration.
* Taking into account constant EQ bounds of FlatAffineConstraint dims/symbols during canonicalization of the resulting affine map in `canonicalizeMinMaxOp`.
Differential Revision: https://reviews.llvm.org/D114138
This change is NFC. There were two issues when passing/reading upper bounds into/from FlatAffineConstraints that negate each other, so the bug was not apparent. However, it made debugging harder because some constraints in the FlatAffineConstraints were off by one when dumping all constraints.
Differential Revision: https://reviews.llvm.org/D114137
Precursor: https://reviews.llvm.org/D110200
Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.
Renamed all instances of operations in the codebase and in tests.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797
This has been a TODO for a long time, and it brings about many advantages (namely nice accessors, and less fragile code). The existing overloads that accept ArrayRef are now treated as deprecated and will be removed in a followup (after a small grace period). Most of the upstream MLIR usages have been fixed by this commit, the rest will be handled in a followup.
Differential Revision: https://reviews.llvm.org/D110293
This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and indctuion variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop.
Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable.
This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable).
Differential Revision: https://reviews.llvm.org/D108454
This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and indctuion variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop.
Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable.
This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable).
Differential Revision: https://reviews.llvm.org/D108454