I believe the semantics should be the same, but this saves 1 op and simplifies the code.
For example, the following two instructions:
```
%2 = cmp sgt %0, %1
%3 = select %2, %0, %1
```
Are equivalent to:
```
%2 = maxsi %0 %1
```
This commit makes reductions part of the terminator. Instead of
`scf.yield`, `scf.reduce` now terminates the body of `scf.parallel` ops.
`scf.reduce` may contain an arbitrary number of reductions, with one
region per reduction.
Example:
```mlir
%init = arith.constant 0.0 : f32
%r:2 = scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init, %init)
-> f32, f32 {
%elem_to_reduce1 = load %buffer1[%iv] : memref<100xf32>
%elem_to_reduce2 = load %buffer2[%iv] : memref<100xf32>
scf.reduce(%elem_to_reduce1, %elem_to_reduce2 : f32, f32) {
^bb0(%lhs : f32, %rhs: f32):
%res = arith.addf %lhs, %rhs : f32
scf.reduce.return %res : f32
}, {
^bb0(%lhs : f32, %rhs: f32):
%res = arith.mulf %lhs, %rhs : f32
scf.reduce.return %res : f32
}
}
```
`scf.reduce` operations can no longer be interleaved with other ops in
the body of `scf.parallel`. This simplifies the op and makes it possible
to assign the `RecursiveMemoryEffects` trait to `scf.reduce`. (This was
not possible before because the op was not a terminator, causing the op
to be DCE'd.)
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.
Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.
Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.
Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
additional check:
https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
them to a pure state.
4. Some changes have been deleted for the following reasons:
- Some files had a variable also named cast
- Some files had not included a header file that defines the cast
functions
- Some files are definitions of the classes that have the casting
methods, so the code still refers to the method instead of the
function without adding a prefix or removing the method declaration
at the same time.
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-header-filter=mlir/ mlir/* -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
mlir/lib/**/IR/\
mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
mlir/test/lib/Dialect/Test/TestTypes.cpp\
mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
mlir/test/lib/Dialect/Test/TestAttributes.cpp\
mlir/unittests/TableGen/EnumsGenTest.cpp\
mlir/test/python/lib/PythonTestCAPI.cpp\
mlir/include/mlir/IR/
```
Differential Revision: https://reviews.llvm.org/D150123
The patch adds operations to `BlockAndValueMapping` and renames it to `IRMapping`. When operations are cloned, old operations are mapped to the cloned operations. This allows mapping from an operation to a cloned operation. Example:
```
Operation *opWithRegion = ...
Operation *opInsideRegion = &opWithRegion->front().front();
IRMapping map
Operation *newOpWithRegion = opWithRegion->clone(map);
Operation *newOpInsideRegion = map.lookupOrNull(opInsideRegion);
```
Migration instructions:
All includes to `mlir/IR/BlockAndValueMapping.h` should be replaced with `mlir/IR/IRMapping.h`. All uses of `BlockAndValueMapping` need to be renamed to `IRMapping`.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D139665
This is part of an effort to migrate from llvm::Optional to
std::optional. This patch changes the way mlir-tblgen generates .inc
files, and modifies tests and documentation appropriately. It is a "no
compromises" patch, and doesn't leave the user with an unpleasant mix of
llvm::Optional and std::optional.
A non-trivial change has been made to ControlFlowInterfaces to split one
constructor into two, relating to a build failure on Windows.
See also: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Signed-off-by: Ramkumar Ramachandra <r@artagnon.com>
Differential Revision: https://reviews.llvm.org/D138934
I'm planning to deprecate and eventually remove llvm::empty.
Note that no use of llvm::empty requires the ability of llvm::empty to
determine the emptiness from begin/end only.
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.
Reviewed By: mehdi_amini, rriddle
Differential Review: https://reviews.llvm.org/D132838
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.
Reviewed By: mehdi_amini, rriddle
Differential Review: https://reviews.llvm.org/D132838
This aligns the SCF dialect file layout with the majority of the dialects.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D128049
Marked all dialects that could be (reasonably) easily flipped to _Both
prefix. Updating the accessors to prefixed form will happen in follow
up, this was to flush out conflicts and to mark all dialects explicitly
as I plan to flip OpBase default to _Prefixed to avoid needing to
migrate new dialects.
Except for Standalone example which got flipped to _Prefixed.
Differential Revision: https://reviews.llvm.org/D128027
The Func has a large number of legacy dependencies carried over from the old
Standard dialect, which was pervasive and contained a large number of varied
operations. With the split of the standard dialect and its demise, a lot of lingering
dead dependencies have survived to the Func dialect. This commit removes a
large majority of then, greatly reducing the dependence surface area of the
Func dialect.
The last remaining operations in the standard dialect all revolve around
FuncOp/function related constructs. This patch simply handles the initial
renaming (which by itself is already huge), but there are a large number
of cleanups unlocked/necessary afterwards:
* Removing a bunch of unnecessary dependencies on Func
* Cleaning up the From/ToStandard conversion passes
* Preparing for the move of FuncOp to the Func dialect
See the discussion at https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061
Differential Revision: https://reviews.llvm.org/D120624
Move expandAffineMap and expandAffineApplyExpr out to AffineUtils. This
is a useful method. The child revision uses it. NFC.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D119401
This reduces the dependencies of the MLIRVector target and makes the dialect consistent with other dialects.
Differential Revision: https://reviews.llvm.org/D118533
Per the discussion in https://reviews.llvm.org/D116345 it makes sense
to move AtomicRMWOp out of the standard dialect. This was accentuated by the
need to add a fold op with a memref::cast. The only dialect
that would permit this is the memref dialect (keeping it in the standard dialect
or moving it to the arithmetic dialect would require those dialects to have a
dependency on the memref dialect, which breaks linking).
As the AtomicRMWKind enum is used throughout, this has been moved to Arith.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D116392
Precursor: https://reviews.llvm.org/D110200
Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.
Renamed all instances of operations in the codebase and in tests.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797
This patch adds support for vectorizing loops with 'iter_args'
implementing known reductions along the vector dimension. Comparing to
the non-vector-dimension case, two additional things are done during
vectorization of such loops:
- The resulting vector returned from the loop is reduced to a scalar
using `vector.reduce`.
- In some cases a mask is applied to the vector yielded at the end of
the loop to prevent garbage values from being written to the
accumulator.
Vectorization of reduction loops is disabled by default. To enable it, a
map from loops to array of reduction descriptors should be explicitly passed to
`vectorizeAffineLoops`, or `vectorize-reductions=true` should be passed
to the SuperVectorize pass.
Current limitations:
- Loops with a non-unit step size are not supported.
- n-D vectorization with n > 1 is not supported.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D100694
This enables to express more complex parallel loops in the affine framework,
for example, in cases of tiling by sizes not dividing loop trip counts perfectly
or inner wavefront parallelism, among others. One can't use affine.max/min
and supply values to the nested loop bounds since the results of such
affine.max/min operations aren't valid symbols. Making them valid symbols
isn't an option since they would introduce selection trees into memref
subscript arithmetic as an unintended and undesired consequence. Also
add support for converting such loops to SCF. Drop some API that isn't used in
the core repo from AffineParallelOp since its semantics becomes ambiguous in
presence of max/min bounds. Loop normalization is currently unavailable for
such loops.
Depends On D101171
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D101172
The existing implementation was always creating 32-bit constants for
floating-point and integer reductions regardless of the actual type, which
resulted in invalid IR being generated for any types other than f32 and i32
when lowering affine.parallel to SCF. Use the actual type instead.
Reviewed By: chelini
Differential Revision: https://reviews.llvm.org/D99942
This doesn't change APIs, this just cleans up the many in-tree uses of these
names to use the new preferred names. We'll keep the old names around for a
couple weeks to help transitions.
Differential Revision: https://reviews.llvm.org/D99127
This updates the codebase to pass the context when creating an instance of
OwningRewritePatternList, and starts removing extraneous MLIRContext
parameters. There are many many more to be removed.
Differential Revision: https://reviews.llvm.org/D99028
This commit fixes the lowering of `Affine.IfOp` to `SCF.IfOp` in the
presence of yield values. These changes have been made as a part of
`-lower-affine` pass.
Differential Revision: https://reviews.llvm.org/D98760
This commit introduced a cyclic dependency:
Memref dialect depends on Standard because it used ConstantIndexOp.
Std depends on the MemRef dialect in its EDSC/Intrinsics.h
Working on a fix.
This reverts commit 8aa6c3765b.
Create the memref dialect and move several dialect-specific ops without
dependencies to other ops from std dialect to this dialect.
Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
DeallocOp -> MemRef_DeallocOp
MemRefCastOp -> MemRef_CastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
TransposeOp -> MemRef_TransposeOp
ViewOp -> MemRef_ViewOp
The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667
Differential Revision: https://reviews.llvm.org/D96425
This patch adds the 'vector.load' and 'vector.store' ops to the Vector
dialect [1]. These operations model *contiguous* vector loads and stores
from/to memory. Their semantics are similar to the 'affine.vector_load' and
'affine.vector_store' counterparts but without the affine constraints. The
most relevant feature is that these new vector operations may perform a vector
load/store on memrefs with a non-vector element type, unlike 'std.load' and
'std.store' ops. This opens the representation to model more generic vector
load/store scenarios: unaligned vector loads/stores, perform scalar and vector
memory access on the same memref, decouple memory allocation constraints from
memory accesses, etc [1]. These operations will also facilitate the progressive
lowering of both Affine vector loads/stores and Vector transfer reads/writes
for those that read/write contiguous slices from/to memory.
In particular, this patch adds the 'vector.load' and 'vector.store' ops to the
Vector dialect, implements their lowering to the LLVM dialect, and changes the
lowering of 'affine.vector_load' and 'affine.vector_store' ops to the new vector
ops. The lowering of Vector transfer reads/writes will be implemented in the
future, probably as an independent pass. The API of 'vector.maskedload' and
'vector.maskedstore' has also been changed slightly to align it with the
transfer read/write ops and the vector new ops. This will improve reusability
among all these operations. For example, the lowering of 'vector.load',
'vector.store', 'vector.maskedload' and 'vector.maskedstore' to the LLVM dialect
is implemented with a single template conversion pattern.
[1] https://llvm.discourse.group/t/memref-type-and-data-layout/
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96185
This makes ignoring a result explicit by the user, and helps to prevent accidental errors with dropped results. Marking LogicalResult as no discard was always the intention from the beginning, but got lost along the way.
Differential Revision: https://reviews.llvm.org/D95841
This commit addresses the issue of lowering affine.for and
affine.parallel having return values. Relevant test cases are also
added.
Signed-off-by: Prateek Gupta <prateek@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D93090
This class represents a rewrite pattern list that has been frozen, and thus immutable. This replaces the uses of OwningRewritePatternList in pattern driver related API, such as dialect conversion. When PDL becomes more prevalent, this API will allow for optimizing a set of patterns once without the need to do this per run of a pass.
Differential Revision: https://reviews.llvm.org/D89104
Unsigned and Signless attributes use uintN_t and signed attributes use intN_t, where N is the fixed width. The 1-bit variants use bool.
Differential Revision: https://reviews.llvm.org/D86739
The LowerAffine psas was a FunctionPass only for legacy
reasons. Making this Op-agnostic allows it to be used from command
line when affine expressions are within operations other than
`std.func`.
Differential Revision: https://reviews.llvm.org/D84590
Multiple places in the code base were erasing Blocks or operations in them
using in-place modifications (`Block::erase` or `Block::clear`) unknown to
ConversionPatternRewriter. These operations could not be undone if the pattern
failed and could lead to inconsistent in-memory state of the IR with dangling
pointers. Use `ConversionPatternRewriter::eraseOp` and `::eraseBlock` instead.
Differential Revision: https://reviews.llvm.org/D80136
This patch adds `affine.vector_load` and `affine.vector_store` ops to
the Affine dialect and lowers them to `vector.transfer_read` and
`vector.transfer_write`, respectively, in the Vector dialect.
Reviewed By: bondhugula, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D79658