When a block is getting inlined, the destination block does not have to
be legalized. That's because the signature of the destination block is
not changed by inlining.
This commit makes the implementation consistent with this comment:
```
// If the pattern moved or created any blocks, make sure the types of block
// arguments get legalized.
```
Refactor the verifiers to make use of the common bits and make
`vector.contract` also use this interface.
In the process, the confusingly named getStaticShape has disappeared.
Note: the verifier for IndexingMapOpInterface is currently called
manually from other verifiers, as it was unclear how to avoid it taking
precedence over more meaningful error messages.
This commit adds support for non-attribute properties (such as
StringProp and I64Prop) in declarative rewrite patterns. The handling
for properties follows the handling for attributes in most cases,
including in the generation of static matchers.
Constraints that are shared between multiple types are supported by
making the constraint matcher a templated function, which is the
equivalent to passing ::mlir::Attribute for an arbitrary C++ type.
This transform takes a module and a function name, and replaces the
signature of the function by reordering the arguments and results
according to the interchange arrays. The function is expected to be
defined in the module, and the interchange arrays must match the number
of arguments and results of the function.
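The reordering itself can be sketched in a few lines of Python. This is only an illustration, not the transform's implementation: it assumes that entry `i` of an interchange array names the new position of the original argument or result `i` (the description above does not spell out the convention), and the helper names `permute` and `reorder_signature` are hypothetical.

```python
from typing import List, Sequence, Tuple


def permute(items: Sequence[str], interchange: Sequence[int]) -> List[str]:
    """Place items[i] at position interchange[i] (assumed convention)."""
    # The interchange array must be a permutation of 0..len(items)-1.
    if sorted(interchange) != list(range(len(items))):
        raise ValueError("interchange must be a permutation matching the item count")
    out = [""] * len(items)
    for old_pos, new_pos in enumerate(interchange):
        out[new_pos] = items[old_pos]
    return out


def reorder_signature(
    args: Sequence[str],
    results: Sequence[str],
    args_interchange: Sequence[int],
    results_interchange: Sequence[int],
) -> Tuple[List[str], List[str]]:
    """Return the reordered (argument types, result types) of a function."""
    return permute(args, args_interchange), permute(results, results_interchange)


new_args, new_results = reorder_signature(
    ["i32", "f32", "index"], ["f32", "i1"], [2, 0, 1], [1, 0]
)
```

An interchange array of the wrong length, or one that repeats an index, is rejected up front, mirroring the requirement that the arrays match the number of arguments and results.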
Following up from https://github.com/llvm/llvm-project/pull/143467,
this PR adds support for
`ReductionTilingStrategy::PartialReductionOuterParallel` to
`tileUsingSCF`. The implementation of
`PartialReductionTilingInterface` for `Linalg` ops has been updated to
support this strategy as well. This brings `tileUsingSCF` on
par with `linalg::tileReductionUsingForall`, which will be deprecated
subsequently.
Changes summary
- `PartialReductionTilingInterface` changes:
  - The `tileToPartialReduction` method now takes the induction
    variables of the generated tile loops. This is needed to keep the
    generated code similar to `linalg::tileReductionUsingForall`,
    specifically to create a simplified access for slicing the
    intermediate partial results tensor when tiled in `num_threads` mode.
  - The `getPartialResultTilePosition` method needs the induction
    variables of the generated tile loops for the same reason,
    and also needs the `tilingStrategy` to be passed in to generate
    correct code.
The tests in `transform-tile-reduction.mlir` testing the
`linalg::tileReductionUsingForall` have been moved over to test
`scf::tileUsingSCF` with
`ReductionTilingStrategy::PartialReductionOuterParallel`
strategy. Some of the tests that were doing further cyclic distribution
of the transformed code from tiling are removed. Those seem like two
separate transformations that were merged into one. Ideally that would
need to happen when resolving the `scf.forall` rather than during
tiling.
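As a rough illustration of what the outer-parallel partial reduction strategy computes, the following Python sketch reduces a 1-D sum in `num_threads` mode. It is not the MLIR implementation: the function name, the contiguous chunking, and the choice of a sum reduction are assumptions made for the example.

```python
from typing import List


def partial_reduction_outer_parallel(data: List[int], num_threads: int) -> int:
    """Sum `data` via per-thread partial results followed by a merge."""
    # Intermediate partial-results tensor: one slot per thread, initialised
    # with the neutral element of the reduction (0 for a sum).
    partials = [0] * num_threads
    chunk = -(-len(data) // num_threads)  # ceiling division

    # Outer parallel loop: each iteration owns the slot partials[tid], so the
    # induction variable `tid` directly indexes the intermediate tensor.
    for tid in range(num_threads):
        for x in data[tid * chunk:(tid + 1) * chunk]:
            partials[tid] += x

    # Merge step: reduce the partial results into the final value.
    return sum(partials)
```

The point of the sketch is the indexing: because each outer iteration writes only its own slot, the slice of the intermediate tensor can be addressed by the loop induction variable alone.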
Please review only the top commit. Depends on
https://github.com/llvm/llvm-project/pull/143467
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
This is a precursor to generalizing `tileUsingSCF` to handle the
`ReductionTilingStrategy::PartialReductionOuterParallel` strategy. This change
itself generalizes and refactors the current implementation, which
supports only `ReductionTilingStrategy::PartialReductionOuterReduction`.
Changes in this PR
- Move the `ReductionTilingStrategy` enum out of
  `scf::SCFTilingOptions` and make it visible to `TilingInterface`.
- `PartialReductionTilingInterface` changes
  - Pass the `tilingStrategy` used for partial reduction to
    `tileToPartialReduction`.
  - Pass the reduction dimensions along as
    `const llvm::SetVector<unsigned> &`.
- Allow `scf::SCFTilingOptions` to set the reduction dimensions that
  are to be tiled.
- Change `structured.tiled_reduction_using_for` to allow specification
  of the reduction dimensions to be partially tiled.
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
Extend the "storage class" <-> "memory space" map for the Vulkan SPIR-V
environment to include the Image class. 12 is chosen as the next
available value in the MemRef memory space list.
Signed-off-by: Jack Frankland <jack.frankland@arm.com>
This patch adds the ability to print a uint8_t/int8_t as an int
instead of a char and demonstrates support for int8_t/uint8_t properties.
Fix #144993
Previously, slices were sometimes marked as non-contiguous when they
were actually contiguous. This occurred when the vector type had leading
unit dimensions, e.g., `vector<1x1x...x1xd0xd1x...xdn-1xT>`. In such
cases, only the trailing `n` dimensions of the memref need to be
contiguous, not the entire vector rank.
This affects how `FlattenContiguousRowMajorTransfer{Read,Write}Pattern`
flattens `transfer_read` and `transfer_write` ops.
The patterns used to collapse a number of dimensions equal to the vector
rank, which missed some opportunities when the leading unit dimensions of
the vector span non-contiguous dimensions of the memref.
Now that the contiguity of the slice is determined correctly, there is a
choice of how many dimensions of the memref to collapse, ranging from
a) the number of vector dimensions after ignoring the leading unit
dimensions, up to
b) the maximum number of contiguous memref dimensions
This patch makes a choice to do minimal memref collapsing. The rationale
behind this decision is that this way the least amount of information is
discarded.
(It follows that in some cases where the patterns used to trigger and
collapse some memref dimensions, after this patch the patterns may
collapse fewer dimensions.)
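The choice between a) and b) can be illustrated with a small Python sketch. It is a simplification, not the pattern's code: it treats the slice as contiguous whenever the non-unit vector dimensions fit within the contiguous trailing memref dimensions, and the names `leading_unit_dims` and `dims_to_collapse` are hypothetical.

```python
from typing import Optional, Sequence


def leading_unit_dims(vector_shape: Sequence[int]) -> int:
    """Count the leading dimensions of size 1."""
    count = 0
    for dim in vector_shape:
        if dim != 1:
            break
        count += 1
    return count


def dims_to_collapse(
    vector_shape: Sequence[int], max_contiguous_memref_dims: int
) -> Optional[int]:
    """Return the minimal number of trailing memref dims to collapse.

    Returns None when the (simplified) contiguity requirement is not met.
    """
    # Option a): vector rank after ignoring the leading unit dimensions.
    needed = len(vector_shape) - leading_unit_dims(vector_shape)
    # Option b) is the upper limit: the contiguous trailing memref dims.
    if needed > max_contiguous_memref_dims:
        return None
    return needed
```

For a `vector<1x1x2x4>` slice this asks for two collapsed dimensions even when three trailing memref dimensions are contiguous, which is the minimal-collapsing behaviour described above.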
Previously, `erase_dead_alloc_and_stores` didn't support
`memref.alloca`. This patch introduces support for it.
---------
Signed-off-by: Vitalii Shutov <vitalii.shutov@arm.com>
Fixing buildbot errors on some platforms like
```
undefined reference to `mlir::dlti::query(mlir::Operation*, llvm::ArrayRef<llvm::StringRef>, bool)'
```
Introduced in #144716
This revision aligns padding specification in pad_tiling_interface to
that of tiling specification.
Dimensions that should be skipped are specified by "padding by 0".
Trailing dimensions that are ignored are automatically completed to "pad
to 0".
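A minimal sketch of the completion rule, assuming the padding specification is a list with one entry per iteration dimension; the helper names are hypothetical and the code is not taken from the transform.

```python
from typing import List, Sequence


def complete_padding_sizes(padding_sizes: Sequence[int], num_dims: int) -> List[int]:
    """Extend a partial padding specification with trailing zeros."""
    if len(padding_sizes) > num_dims:
        raise ValueError("more padding sizes than dimensions")
    # Trailing dimensions that were not mentioned are completed with 0,
    # which means "skip this dimension".
    return list(padding_sizes) + [0] * (num_dims - len(padding_sizes))


def padded_dims(padding_sizes: Sequence[int], num_dims: int) -> List[int]:
    """Indices of the dimensions that are actually padded."""
    completed = complete_padding_sizes(padding_sizes, num_dims)
    return [i for i, size in enumerate(completed) if size != 0]
```

This mirrors how a tiling specification treats a tile size of 0 as "do not tile this dimension".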
This revision introduces a simple variant of AffineMin folding in
makeComposedFoldedAffineApply and makes use of it in
transform.pad-tiling-interface. Since this version explicitly calls
ValueBoundsInterface, it may be too expensive and is only activated
behind a flag.
It results in better foldings when mixing tiling and padding, including
with dynamic shapes.
This commit adds 1:N support to
`ConversionPatternRewriter::replaceUsesOfBlockArgument`. This was one of
the few remaining dialect conversion APIs that does not support 1:N
conversions yet.
This commit also reuses `replaceUsesOfBlockArgument` in the
implementation of `applySignatureConversion`. This is in preparation of
the One-Shot Dialect Conversion refactoring. The goal is to bring the
`applySignatureConversion` implementation into a state where it works
both with and without rollbacks. To that end, `applySignatureConversion`
should not directly access the `mapping`.
Since #145030, `ConversionPatternRewriter::eraseBlock` no longer calls
`ConversionPatternRewriter::eraseOp`. This now happens in the rewriter
impl (during the cleanup phase). Therefore, a safety check in
`replaceOp` can now be simplified.
This commit allows zero-points used by a number of tosa operations to be
unranked. This allows the shape inference pass to propagate shape
information.
Use `isSignlessInteger` instead of comparing to types obtained via
`rewriter.getI<N>Type()`.
This also brings the function closer to a similar function in
`LowerContractionToNeonI8MMPattern.cpp` (formerly `LowerContractionToSMMLAPattern.cpp`),
which would help a potential effort to unify these patterns.
This patch enhances `MemRefType::areTrailingDimsContiguous` to also
handle memrefs with dynamic dimensions.
The implementation itself is based on a new member function
`MemRefType::getMaxCollapsableTrailingDims` that returns the maximum
number of trailing dimensions that can be collapsed: trivially all
dimensions for memrefs with identity layout, or otherwise determined by
examining the memref strides, stopping at discontiguous or statically
unknown strides.
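The stride walk can be sketched as follows. This is a Python approximation written for illustration, not the C++ member function: `None` stands in for a dynamic size or stride, unit dimensions are skipped because their stride does not matter, and the function name is only borrowed from the description above.

```python
from typing import Optional, Sequence

DYNAMIC = None  # stand-in for a statically unknown size or stride


def max_collapsable_trailing_dims(
    shape: Sequence[Optional[int]], strides: Sequence[Optional[int]]
) -> int:
    """Count the trailing dimensions that are provably contiguous."""
    n = len(shape)
    expected_stride = 1  # stride a contiguous layout would have at dim i
    for i in range(n - 1, -1, -1):
        if shape[i] == 1:
            continue  # the stride of a unit dimension is irrelevant
        if strides[i] is DYNAMIC or strides[i] != expected_stride:
            return n - i - 1  # dim i breaks contiguity; stop before it
        if shape[i] is DYNAMIC:
            # Dim i itself is fine, but the expected stride of the next
            # outer dimension is no longer statically known.
            return n - i
        expected_stride *= shape[i]
    return n
```

For an identity layout every stride matches the running product, so the walk returns the full rank, which corresponds to the trivial case mentioned above.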
Add a new API to access all blobs that are stored in the blob manager.
The main purpose (as of now) is to allow users of dialect resources to
iterate over all blobs, especially when the blobs are no longer used in
IR (e.g. the operation that uses the blob is deleted) and thus cannot be
easily accessed without manual tracking of keys.
This commit adds a check during shape inference that the calculated
output height and width are non-negative. An error is emitted if either
of them is negative.
Fixes: #142402
Previously running `-generate-runtime-verification` on an IR containing
`memref.reinterpret_cast` would crash because its implementation of the
`RuntimeVerifiableOpInterface` was removed in
https://github.com/llvm/llvm-project/pull/132547 but its associated
entry in `declarePromisedInterface` was never removed.
The crash produces an error that looks like
```
LLVM ERROR: checking for an interface (`mlir::RuntimeVerifiableOpInterface`) that was promised by dialect 'memref' but never implemented. This is generally an indication that the dialect extension implementing the interface was never registered.
```
as reported in https://github.com/llvm/llvm-project/issues/144028.
In this PR I also added all the ops that do have implementations of this
interface in
`mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp` to the
`declarePromisedInterface` for consistency.
Fixes https://github.com/llvm/llvm-project/issues/144028
Bug description: Hardware intrinsic functions created during GPU
conversion to NVVM may contain debug info metadata from the original
function which cannot be used out of that function.
This commit makes the following changes:
- Expose `map` and `mapOperands` in
`ValueBoundsConstraintSet::Variable`, so that the class can be used by
subclasses of `ValueBoundsConstraintSet`. Otherwise subclasses cannot
access those members.
- Add `ValueBoundsConstraintSet::strongCompare`. This method is similar
to `ValueBoundsConstraintSet::compare` except that it returns false when
the inverse comparison holds, and `llvm::failure()` if neither the
relation nor its inverse relation could be proven.
- Add `simplifyAffineMinOp`, `simplifyAffineMaxOp`, and
`simplifyAffineMinMaxOps` to simplify those operations using
`ValueBoundsConstraintSet`.
- Add the `SimplifyMinMaxAffineOpsOp` transform op that uses
`simplifyAffineMinMaxOps`.
- Add the `test.value_with_bounds` op to test unknown values with a
min/max range using `ValueBoundsOpInterface`.
- Add tests verifying the transform.
Example:
```mlir
func.func @overlapping_constraints() -> (index, index) {
%0 = test.value_with_bounds {min = 0 : index, max = 192 : index}
%1 = test.value_with_bounds {min = 128 : index, max = 384 : index}
%2 = test.value_with_bounds {min = 256 : index, max = 512 : index}
%r0 = affine.min affine_map<()[s0, s1, s2] -> (s0, s1, s2)>()[%0, %1, %2]
%r1 = affine.max affine_map<()[s0, s1, s2] -> (s0, s1, s2)>()[%0, %1, %2]
return %r0, %r1 : index, index
}
// Result of applying `simplifyAffineMinMaxOps` to `func.func`
#map1 = affine_map<()[s0, s1] -> (s1, s0)>
func.func @overlapping_constraints() -> (index, index) {
%0 = test.value_with_bounds {max = 192 : index, min = 0 : index}
%1 = test.value_with_bounds {max = 384 : index, min = 128 : index}
%2 = test.value_with_bounds {max = 512 : index, min = 256 : index}
%3 = affine.min #map1()[%0, %1]
%4 = affine.max #map1()[%1, %2]
return %3, %4 : index, index
}
```
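The reasoning behind the example can be reproduced with a small interval-based sketch in Python. It is not the `ValueBoundsConstraintSet` implementation: bounds are assumed to be inclusive, and `strong_compare_le`, `simplify_min`, and `simplify_max` are hypothetical names that only mimic the three-way outcome of `strongCompare` (proven, inverse proven, unknown).

```python
from typing import List, Optional, Tuple

Bounds = Tuple[int, int]  # assumed inclusive (lower, upper) bounds of a value


def strong_compare_le(lhs: Bounds, rhs: Bounds) -> Optional[bool]:
    """True if lhs <= rhs is proven, False if lhs > rhs is proven, else None."""
    if lhs[1] <= rhs[0]:
        return True
    if lhs[0] > rhs[1]:
        return False
    return None  # neither the relation nor its inverse could be proven


def _simplify(operands: List[Bounds], is_min: bool) -> List[int]:
    kept: List[int] = []
    dropped = set()
    for i, cand in enumerate(operands):
        redundant = False
        for j, other in enumerate(operands):
            if j == i or j in dropped:
                continue
            # For min, `cand` is redundant if some other operand is provably
            # <= cand; for max, if cand is provably <= some other operand.
            proven = (strong_compare_le(other, cand) if is_min
                      else strong_compare_le(cand, other))
            if proven is True:
                redundant = True
                break
        if redundant:
            dropped.add(i)
        else:
            kept.append(i)
    return kept


def simplify_min(operands: List[Bounds]) -> List[int]:
    """Indices of the operands that must stay in an affine.min."""
    return _simplify(operands, is_min=True)


def simplify_max(operands: List[Bounds]) -> List[int]:
    """Indices of the operands that must stay in an affine.max."""
    return _simplify(operands, is_min=False)


bounds = [(0, 192), (128, 384), (256, 512)]
```

With the bounds from the example, `%2` can never be smaller than `%0`, so the min keeps `%0` and `%1`; likewise `%0` can never exceed `%2`, so the max keeps `%1` and `%2`.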
---------
Co-authored-by: Nicolas Vasilache <Nico.Vasilache@amd.com>
ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to newcomers, I would like to move
away from the constructor and eventually remove it.
One of the uses of std::nullopt is in one of the constructors for
TypeRange. This patch takes care of the migration where we need
TypeRange() to facilitate perfect forwarding. Note that {} would be
ambiguous for perfect forwarding to work.
ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to newcomers, I would like to move
away from the constructor and eventually remove it.
One of the common uses of std::nullopt is in one of the constructors
for ValueRange. This patch takes care of the migration where we need
ValueRange() to facilitate perfect forwarding. Note that {} would be
ambiguous for perfect forwarding to work.
ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to newcomers, I would like to move
away from the constructor and eventually remove it.
This patch takes care of the mlir side of the migration, starting with
straightforward places like "return std::nullopt;" and ternary
expressions involving std::nullopt.
Add missing listener notifications when erasing nested
blocks/operations.
This commit also moves some of the functionality from
`ConversionPatternRewriter` to `ConversionPatternRewriterImpl`. This is
in preparation of the One-Shot Dialect Conversion refactoring: The
implementations in `ConversionPatternRewriter` should be as simple as
possible, so that a switch between "rollback allowed" and "rollback not
allowed" can be inserted at that level. (In the latter case,
`ConversionPatternRewriterImpl` can be bypassed to some degree, and
`PatternRewriter::eraseBlock` etc. can be used.)
Depends on #145018.
This patch adds the `PtrLikeTypeInterface` type interface to identify
pointer-like types. This interface is defined as:
```
A ptr-like type represents an object storing a memory address. This object
is constituted by:
- A memory address called the base pointer. This pointer is treated as a
bag of bits without any assumed structure. The bit-width of the base
pointer must be a compile-time constant. However, the bit-width may remain
opaque or unavailable during transformations that do not depend on the
base pointer. Finally, it is considered indivisible in the sense that as
a `PtrLikeTypeInterface` value, it has no metadata.
- Optional metadata about the pointer. For example, the size of the memory
region associated with the pointer.
Furthermore, all ptr-like types have two properties:
- The memory space associated with the address held by the pointer.
- An optional element type. If the element type is not specified, the
pointer is considered opaque.
```
This patch adds this interface to `!ptr.ptr` and the `memref` type.
Furthermore, this patch adds necessary ops and type to handle casting
between `!ptr.ptr` and ptr-like types.
First, it defines the `!ptr.ptr_metadata` type, an opaque type that
represents the metadata of a ptr-like type. The rationale behind adding
this type is that, at a high level, the metadata of a type like `memref`
cannot be specified, as its structure is tied to its lowering.
The `ptr.get_metadata` operation was added to extract the opaque pointer
metadata. The concrete structure of the metadata is only known when the
op is lowered.
Finally, this patch adds the `ptr.from_ptr` and `ptr.to_ptr` operations,
which allow casting back and forth between `!ptr.ptr` and ptr-like types.
```mlir
func.func @func(%mr: memref<f32, #ptr.generic_space>) -> memref<f32, #ptr.generic_space> {
%ptr = ptr.to_ptr %mr : memref<f32, #ptr.generic_space> -> !ptr.ptr<#ptr.generic_space>
%mda = ptr.get_metadata %mr : memref<f32, #ptr.generic_space>
%res = ptr.from_ptr %ptr metadata %mda : !ptr.ptr<#ptr.generic_space> -> memref<f32, #ptr.generic_space>
return %res : memref<f32, #ptr.generic_space>
}
```
It's future work to replace and remove the `bare-ptr-convention` through
the use of these ops.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Changes:
* Decouple layout propagation from subgroup distribution and move it to
an independent pass.
* Refine layout assignment to handle control-flow ops correctly (scf.for, scf.while).
* Refine test cases.