This commit adds a new `matchAndRewrite` overload to `ConversionPattern`
to support 1:N replacements. This is the first of two main PRs that
merge the 1:1 and 1:N dialect conversion drivers.
The existing `matchAndRewrite` function supports only 1:1 replacements,
as can be seen from the `ArrayRef<Value>` parameter.
```c++
LogicalResult ConversionPattern::matchAndRewrite(
Operation *op, ArrayRef<Value> operands /*adaptor values*/,
ConversionPatternRewriter &rewriter) const;
```
This commit adds a `matchAndRewrite` overload that is called by the
dialect conversion driver. By default, this new overload dispatches to
the original 1:1 `matchAndRewrite` implementation. Existing
`ConversionPattern`s do not need to be changed as long as there are no
1:N type conversions or value replacements.
```c++
LogicalResult ConversionPattern::matchAndRewrite(
Operation *op, ArrayRef<ValueRange> operands /*adaptor values*/,
ConversionPatternRewriter &rewriter) const {
// Note: getOneToOneAdaptorOperands produces a fatal error if at least one
// ValueRange has 0 or more than 1 value.
return matchAndRewrite(op, getOneToOneAdaptorOperands(operands), rewriter);
}
```
The `ConversionValueMapping`, which keeps track of value replacements
and materializations, still does not support 1:N replacements. We still
rely on argument materializations to convert N replacement values back
into a single value. The `ConversionValueMapping` will be generalized to
1:N mappings in the second main PR.
Before handing the adaptor values to a `ConversionPattern`, all argument
materializations are "unpacked". The `ConversionPattern` receives N
replacement values and does not see any argument materializations. This
implementation strategy allows us to use the 1:N infrastructure/API in
`ConversionPattern`s even though some functionality is still missing in
the driver. This strategy was chosen to keep the sizes of the PRs
smaller and to make it easier for downstream users to adapt to API
changes.
This commit also updates the the "decompose call graphs" transformation
and the "sparse tensor codegen" transformation to use the new 1:N
`ConversionPattern` API.
Note for LLVM conversion: If you are using a type converter with 1:N
type conversion rules or if your patterns are performing 1:N
replacements (via `replaceOpWithMultiple` or
`applySignatureConversion`), conversion pattern applications will start
failing (fatal LLVM error) with this error message: `pattern 'name' does
not support 1:N conversion`. The name of the failing pattern is shown in
the error message. These patterns must be updated to the new 1:N
`matchAndRewrite` API.
`getDescriptorFromTensorTuple` and `getMutDescriptorFromTensorTuple`
extract the tensor type from an `unrealized_conversion_cast` op that
serves as a workaround for missing 1:N dialect conversion support.
This commit changes these functions so that they explicitly receive the
tensor type as a function argument. This is in preparation of merging
the 1:1 and 1:N conversion drivers. The conversion patterns in this file
will soon start receiving multiple SSA values (`ValueRange`) from their
adaptors (instead of a single value that is the result of
`unrealized_conversion_cast`). It will no longer be possible to take the
tensor type from the `unrealized_conversion_cast` op. The
`unrealized_conversion_cast` workaround will disappear entirely.
This commit adds a new function
`ConversionPatternRewriter::replaceOpWithMultiple`. This function is
similar to `replaceOp`, but it accepts multiple `ValueRange`
replacements, one per op result.
Note: This function is not an overload of `replaceOp` because of
ambiguous overload resolution that would make the API difficult to use.
This commit aligns "block signature conversions" with "op replacements":
both support 1:N replacements now. Due to incomplete 1:N support in the
dialect conversion driver, an argument materialization is inserted when
an SSA value is replaced with multiple values; same as block signature
conversions already work around the problem. These argument
materializations are going to be removed in a subsequent commit that
adds full 1:N support. The purpose of this PR is to add missing features
gradually in small increments.
This commit also updates two MLIR transformations that have their custom
workarounds around missing 1:N support. These can already start using
`replaceOpWithMultiple`.
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
Currently, the lowering for vector.step lives
under a folder. This is not ideal if we want
to do transformation on it and defer the
materizaliztion of the constants much later.
This commits adds a rewrite pattern that
could be used by using
`transform.structured.vectorize_children_and_apply_patterns`
transform dialect operation.
Moreover, the rewriter of vector.step is also
now used in -convert-vector-to-llvm pass where
it handles scalable and non-scalable types as
LLVM expects it.
As a consequence of removing the vector.step
lowering as its folder, linalg vectorization
will keep vector.step intact.
This commit simplifies the result type of materialization functions.
Previously: `std::optional<Value>`
Now: `Value`
The previous implementation allowed 3 possible return values:
- Non-null value: The materialization function produced a valid
materialization.
- `std::nullopt`: The materialization function failed, but another
materialization can be attempted.
- `Value()`: The materialization failed and so should the dialect
conversion. (Previously: Dialect conversion can roll back.)
This commit removes the last variant. It is not particularly useful
because the dialect conversion will fail anyway if all other
materialization functions produced `std::nullopt`.
Furthermore, in contrast to type conversions, at least one
materialization callback is expected to succeed. In case of a failing
type conversion, the current dialect conversion can roll back and try a
different pattern. This also used to be the case for materializations,
but that functionality was removed with #107109: failed materializations
can no longer trigger a rollback. (They can just make the entire dialect
conversion fail without rollback.) With this in mind, it is even less
useful to have an additional error state for materialization functions.
This commit is in preparation of merging the 1:1 and 1:N type
converters. Target materializations will have to return multiple values
instead of a single one. With this commit, we can keep the API simple:
`SmallVector<Value>` instead of `std::optional<SmallVector<Value>>`.
Note for LLVM integration: All 1:1 materializations should return
`Value` instead of `std::optional<Value>`. Instead of `std::nullopt`
return `Value()`.
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.
Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate pattern now have
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.
Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
This PR fixes a bug in `SparseTensorDimOpRewriter` when `tensor.dim` has
an unranked tensor type. To prevent crashes, we now use
`tryGetSparseTensorType` instead of `getSparseTensorType`. Fixes
#107807.
When only all-dense "sparse" tensors occur in a function prototype, the
assembler would skip the method conversion purely based on input/output
counts. It should rewrite based on the presence of any annotation,
however.
This is described in (N2) https://pvs-studio.com/en/blog/posts/cpp/1126/
so caught by the PVS Studio analyzer.
Warning message -
V502 Perhaps the '?:' operator works in a different way than it was
expected. The '?:' operator has a lower priority than the '+' operator.
LoopEmitter.cpp 983
V502 Perhaps the '?:' operator works in a different way than it was
expected. The '?:' operator has a lower priority than the '+' operator.
LoopEmitter.cpp 1039
The assert should be
assert(bArgs.size() == reduc.size() + (needsUniv ? 1 : 0));
since + has higher precedence and ? has lower.
This further can be reduce to
assert(aArgs.size() == reduc.size() + needsUniv);
because needUniv is a bool value which is implicitly converted to 0 or
Some of the options only fed into the full sparse pipeline. However,
some backends prefer to use the sparse minipipeline. This change exposes
some important optimization flags to the pass as well. This prepares
some SIMDization of PyTorch sparsified code.
Codegen "vectors" for pos/crd/val use the capacity as memref size, not
the actual used size. Although the sparsifier itself always uses just
the defined pos/crd/val parts, printing these and passing them back to a
runtime environment could benefit from wrapping the basic pos/crd/val
getters into a proper memref view that sets the right size.
This patch generalizes tensor.expand_shape and memref.expand_shape to
consume the output shape as a list of SSA values. This enables us to
implement generic reshape operations with dynamic shapes using
collapse_shape/expand_shape pairs.
The output_shape input to expand_shape follows the static/dynamic
representation that's also used in `tensor.extract_slice`.
Differential Revision: https://reviews.llvm.org/D140821
---------
Signed-off-by: Gaurav Shukla<gaurav.shukla@amd.com>
Signed-off-by: Gaurav Shukla <gaurav.shukla@amd.com>
Co-authored-by: Ramiro Leal-Cavazos <ramiroleal050@gmail.com>
This ensures the explicit value is generated (and not a load into the
values array). Note that actually not storing values array at all is
still TBD, this is just the very first step.
Addresses issue #88716.
Some function parameter names in the affected header files did not match
the parameter names in the definitions, or were listed in a different
order.
---------
Signed-off-by: Troy-Butler <squintik@outlook.com>
This patch generalizes tensor.expand_shape and memref.expand_shape to
consume the output shape as a list of SSA values. This enables us to
implement generic reshape operations with dynamic shapes using
collapse_shape/expand_shape pairs.
The output_shape input to expand_shape follows the static/dynamic
representation that's also used in `tensor.extract_slice`.
Differential Revision: https://reviews.llvm.org/D140821
Co-authored-by: Ramiro Leal-Cavazos <ramiroleal050@gmail.com>
This commit fixes the following error when stopping the sparse compiler
pipeline after bufferization (e.g., with `test-analysis-only`):
```
LLVM ERROR: Building op `vector.print` but it isn't known in this MLIRContext: the dialect may not be loaded or this operation hasn't been added by the dialect. See also https://mlir.llvm.org/getting_started/Faq/#registered-loaded-dependent-whats-up-with-dialects-management
```
In order to support various external frameworks (JAX vs PyTorch) we need
a bit more flexibility in [dis]assembling external buffers to and from
sparse tensors in MLIR land. This PR adds a direct-out option that
avoids the rigid pre-allocated for copy-out semantics.
Note that over time, we expect the [dis]assemble operations to converge
into something that supports all sorts of external frameworks. Until
then, this option helps in experimenting with different options.