Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
The current implementation of LocationSnapshotPass takes an
OpPrintingFlags argument and stores it as member, but does not use it
for printing.
Properly implement the printing flags, also supporting command line args.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Fixes two minor issues in `findOrBuildReplacementValue`:
* Remove a redundant `mapping.map`.
* Map `repl` instead of `value`. We used to overwrite an existing
mapping, which could introduce extra materializations.
Note: We generally do not want to overwrite mappings, but create a chain
of mappings. There are still a few more places, where a mapping is
overwritten. Once those are fixed, I will put an assertion into
`ConversionValueMapping::map`.
The `buildUnresolvedMaterialization` implementation used to check if a
materialization is necessary. A materialization is not necessary if the
desired types already match the input. However, this situation can never
happen: we look for mapped values with the desired type at the call
sites before requesting a new unresolved materialization.
The previous implementation seemed incorrect because
`buildUnresolvedMaterialization` created a mapping that is never rolled
back. (When in reality that code was never executed, so it is
technically not incorrect.)
Also fix a comment that in `findOrBuildReplacementValue` that was
incorrect.
This commit updates the internal `ConversionValueMapping` data structure
in the dialect conversion driver to support 1:N replacements. This is
the last major commit for adding 1:N support to the dialect conversion
driver.
Since #116470, the infrastructure already supports 1:N replacements. But
the `ConversionValueMapping` still stored 1:1 value mappings. To that
end, the driver inserted temporary argument materializations (converting
N SSA values into 1 value). This is no longer the case. Argument
materializations are now entirely gone. (They will be deleted from the
type converter after some time, when we delete the old 1:N dialect
conversion driver.)
Note for LLVM integration: Replace all occurrences of
`addArgumentMaterialization` (except for 1:N dialect conversion passes)
with `addSourceMaterialization`.
---------
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
During a 1:N replacement (`applySignatureConversion` or
`replaceOpWithMultiple`), the dialect conversion driver used to insert
two materializations:
* Argument materialization: convert N replacement values to 1 SSA value
of the original type `S`.
* Target materialization: convert original type to legalized type `T`.
The target materialization is unnecessary. Subsequent patterns receive
the replacement values via their adaptors. These patterns have their own
type converter. When they see a replacement value of type `S`, they will
automatically insert a target materialization to type `T`. There is no
reason to do this already during the 1:N replacement. (The functionality
used to be duplicated in `remapValues` and `insertNTo1Materialization`.)
Special case: If a subsequent pattern does not have a type converter, it
does *not* insert any target materializations. That's because the
absence of a type converter indicates that the pattern does not care
about type legality. Therefore, it is correct to pass an SSA value of
type `S` (or any other type) to the pattern.
Note: Most patterns in `TestPatterns.cpp` run without a type converter.
To make sure that the tests still behave the same, some of these
patterns now have a type converter.
This commit is in preparation of adding 1:N support to the conversion
value mapping. Before making any further changes to the mapping
infrastructure, I'd like to make sure that the code base around it (that
uses the mapping) is robust.
The greedy rewriter is used in many different flows and it has a lot of
convenience (work list management, debugging actions, tracing, etc). But
it combines two kinds of greedy behavior 1) how ops are matched, 2)
folding wherever it can.
These are independent forms of greedy and leads to inefficiency. E.g.,
cases where one need to create different phases in lowering and is
required to applying patterns in specific order split across different
passes. Using the driver one ends up needlessly retrying folding/having
multiple rounds of folding attempts, where one final run would have
sufficed.
Of course folks can locally avoid this behavior by just building their
own, but this is also a common requested feature that folks keep on
working around locally in suboptimal ways.
For downstream users, there should be no behavioral change. Updating
from the deprecated should just be a find and replace (e.g., `find ./
-type f -exec sed -i
's|applyPatternsAndFoldGreedily|applyPatternsGreedily|g' {} \;` variety)
as the API arguments hasn't changed between the two.
The dialect conversion has various checks that detect incorrect API
usage in patterns. One of these checks turned out to be quite expensive
(N*M complexity where N is the number of block rewrites and M is the
total number of rewrites) in NVIDIA-internal workloads: Checking that a
block is not converted multiple times.
This check iterates over the stack of all rewrites, which can be large.
We saw `hasRewrite` being called around 45000 times with an average
rewrite stack size of 500000.
This PR moves the check to `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`.
For consistency reasons, the other `hasRewrite`-based check is also
moved there.
This PR fixes a bug in `RemoveDeadValues` where the
`FunctionOpInterface` does not have the `isDeclaration` method. As a
result, we should use the `isExternal` method instead. Fixes#116347.
This commit adds a new `matchAndRewrite` overload to `ConversionPattern`
to support 1:N replacements. This is the first of two main PRs that
merge the 1:1 and 1:N dialect conversion drivers.
The existing `matchAndRewrite` function supports only 1:1 replacements,
as can be seen from the `ArrayRef<Value>` parameter.
```c++
LogicalResult ConversionPattern::matchAndRewrite(
Operation *op, ArrayRef<Value> operands /*adaptor values*/,
ConversionPatternRewriter &rewriter) const;
```
This commit adds a `matchAndRewrite` overload that is called by the
dialect conversion driver. By default, this new overload dispatches to
the original 1:1 `matchAndRewrite` implementation. Existing
`ConversionPattern`s do not need to be changed as long as there are no
1:N type conversions or value replacements.
```c++
LogicalResult ConversionPattern::matchAndRewrite(
Operation *op, ArrayRef<ValueRange> operands /*adaptor values*/,
ConversionPatternRewriter &rewriter) const {
// Note: getOneToOneAdaptorOperands produces a fatal error if at least one
// ValueRange has 0 or more than 1 value.
return matchAndRewrite(op, getOneToOneAdaptorOperands(operands), rewriter);
}
```
The `ConversionValueMapping`, which keeps track of value replacements
and materializations, still does not support 1:N replacements. We still
rely on argument materializations to convert N replacement values back
into a single value. The `ConversionValueMapping` will be generalized to
1:N mappings in the second main PR.
Before handing the adaptor values to a `ConversionPattern`, all argument
materializations are "unpacked". The `ConversionPattern` receives N
replacement values and does not see any argument materializations. This
implementation strategy allows us to use the 1:N infrastructure/API in
`ConversionPattern`s even though some functionality is still missing in
the driver. This strategy was chosen to keep the sizes of the PRs
smaller and to make it easier for downstream users to adapt to API
changes.
This commit also updates the the "decompose call graphs" transformation
and the "sparse tensor codegen" transformation to use the new 1:N
`ConversionPattern` API.
Note for LLVM conversion: If you are using a type converter with 1:N
type conversion rules or if your patterns are performing 1:N
replacements (via `replaceOpWithMultiple` or
`applySignatureConversion`), conversion pattern applications will start
failing (fatal LLVM error) with this error message: `pattern 'name' does
not support 1:N conversion`. The name of the failing pattern is shown in
the error message. These patterns must be updated to the new 1:N
`matchAndRewrite` API.
When an unresolved materialization (`unrealized_conversion_cast` op) is
rolled back, the mapping should be rolled back as well, regardless of
whether it is a source, target or argument materialization. Otherwise,
we accumulate pointers to erased IR in the `mapping`. This is harmless
in most cases, but can cause issues when a new operation is allocated at
the same memory location and the pointer is "reused".
It is not possible to write a test case for this because I cannot
trigger the pointer reuse programmatically.
This commit adds an extra assertion to `applySignatureConversion` to
prevent incorrect API usage: The same block cannot be converted multiple
times. That would mess with the underlying conversion value mapping.
(Mappings would be overwritten.) This is similar to op replacements: The
same op cannot be replaced multiple times.
To simplify the check, `BlockTypeConversionRewrite::block` now stores
the original block. The new block is stored in an extra field. (It used
to be the other way around.)
This commit is in preparation of adding 1:N support to the conversion
value mapping. Before making any further changes to the mapping
infrastructure, I'd like to make sure that the code base around it (that
uses the mapping) is robust.
This change removes the restriction on `SymbolUserOpInterface` operators
so they can be used with operators that implement `SymbolOpInterface`,
example:
`memref.global` implements `SymbolOpInterface` so it can be used with
`memref.get_global` which implements `SymbolUserOpInterface`
```
// Define a global constant array
memref.global "private" constant @global_array : memref<10xi32> = dense<[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]> : tensor<10xi32>
// Access this global constant within a function
func @use_global() {
%0 = memref.get_global @global_array : memref<10xi32>
}
```
Reference: https://github.com/llvm/llvm-project/pull/116519 and
https://discourse.llvm.org/t/question-on-criteria-for-acceptable-ir-in-removedeadvaluespass/83131
---------
Co-authored-by: Zeeshan Siddiqui <mzs@ntdev.microsoft.com>
This is a re-upload of #116934, which was reverted.
The dialect conversion driver has three phases:
- **Create** `IRRewrite` objects as the IR is traversed.
- **Finalize** `IRRewrite` objects. During this phase, source
materializations for mismatching value types are created. (E.g., when
`Value` is replaced with a `Value` of different type, but there is a
user of the original value that was not modified because it is already
legal.)
- **Commit** `IRRewrite` objects. During this phase, all remaining IR
modifications are materialized. In particular, SSA values are actually
being replaced during this phase.
This commit removes the "finalize" phase. This simplifies the code base
a bit and avoids one traversal over the `IRRewrite` stack. Source
materializations are now built during the "commit" phase, right before
an SSA value is being replaced.
This commit also removes the "inverse mapping" of the conversion value
mapping, which was used to predict if an SSA value will be dead at the
end of the conversion. This check is replaced with an approximate check
that does not require an inverse mapping. (A false positive for `v` can
occur if another value `v2` is mapped to `v` and `v2` turns out to be
dead at the end of the conversion. This case is not expected to happen
very often.) This reduces the complexity of the driver a bit and removes
one potential source of bugs. (There have been bugs in the usage of the
inverse mapping in the past.)
`BlockTypeConversionRewrite` no longer stores a pointer to the type
converter. This pointer is now stored in `ReplaceBlockArgRewrite`.
This commit is in preparation of merging the 1:1 and 1:N dialect
conversion driver. It simplifies the upcoming changes around the
conversion value mapping. (API surface of the conversion value mapping
is reduced.)
The dialect conversion driver has three phases:
- **Create** `IRRewrite` objects as the IR is traversed.
- **Finalize** `IRRewrite` objects. During this phase, source
materializations for mismatching value types are created. (E.g., when
`Value` is replaced with a `Value` of different type, but there is a
user of the original value that was not modified because it is already
legal.)
- **Commit** `IRRewrite` objects. During this phase, all remaining IR
modifications are materialized. In particular, SSA values are actually
being replaced during this phase.
This commit removes the "finalize" phase. This simplifies the code base
a bit and avoids one traversal over the `IRRewrite` stack. Source
materializations are now built during the "commit" phase, right before
an SSA value is being replaced.
This commit also removes the "inverse mapping" of the conversion value
mapping, which was used to predict if an SSA value will be dead at the
end of the conversion. This check is replaced with an approximate check
that does not require an inverse mapping. (A false positive for `v` can
occur if another value `v2` is mapped to `v` and `v2` turns out to be
dead at the end of the conversion. This case is not expected to happen
very often.) This reduces the complexity of the driver a bit and removes
one potential source of bugs. (There have been bugs in the usage of the
inverse mapping in the past.)
`BlockTypeConversionRewrite` no longer stores a pointer to the type
converter. This pointer is now stored in `ReplaceBlockArgRewrite`.
This commit is in preparation of merging the 1:1 and 1:N dialect
conversion driver. It simplifies the upcoming changes around the
conversion value mapping. (API surface of the conversion value mapping
is reduced.)
This commit adds a new function
`ConversionPatternRewriter::replaceOpWithMultiple`. This function is
similar to `replaceOp`, but it accepts multiple `ValueRange`
replacements, one per op result.
Note: This function is not an overload of `replaceOp` because of
ambiguous overload resolution that would make the API difficult to use.
This commit aligns "block signature conversions" with "op replacements":
both support 1:N replacements now. Due to incomplete 1:N support in the
dialect conversion driver, an argument materialization is inserted when
an SSA value is replaced with multiple values; same as block signature
conversions already work around the problem. These argument
materializations are going to be removed in a subsequent commit that
adds full 1:N support. The purpose of this PR is to add missing features
gradually in small increments.
This commit also updates two MLIR transformations that have their custom
workarounds around missing 1:N support. These can already start using
`replaceOpWithMultiple`.
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
This is intended as a fast pattern rewrite driver for the cases when a
simple walk gets the job done but we would still want to implement it in
terms of rewrite patterns (that can be used with the greedy pattern
rewrite driver downstream).
The new driver is inspired by the discussion in
https://github.com/llvm/llvm-project/pull/112454 and the LLVM Dev
presentation from @matthias-springer earlier this week.
This limitation comes with some limitations:
* It does not repeat until a fixpoint or revisit ops modified in place
or newly created ops. In general, it only walks forward (in the
post-order).
* `matchAndRewrite` can only erase the matched op or its descendants.
This is verified under expensive checks.
* It does not perform folding / DCE.
We could probably relax some of these in the future without sacrificing
too much performance.
This commit changes the format of the materialization error message.
Previously: `failed to legalize unresolved materialization from ('f64')
to 'f32' that remained live after conversion`
Now: `failed to legalize unresolved materialization from ('f64') to
('f32') that remained live after conversion`
This commit is in preparation of merging the 1:1 and 1:N dialect
conversions. At that point, target materializations may create more than
one SSA value. I am sending this change as a separate PR to keep the
main PR smaller.
This commit adds extra checks/assertions to the
`ConversionPatternRewriterImpl::notifyOpReplaced` to improve its
robustness.
1. Replacing an `unrealized_conversion_cast` op that was created by the
driver is now forbidden and caught early during `replaceOp`. It may work
in some cases, but it is generally dangerous because the conversion
driver keeps track of these ops and performs some extra legalization
steps during the "finalize" phase. (Erasing is them is fine.)
2. `null` replacement values are no longer registered in the
`ConversionValueMapping`. This was an oversight in #106760. There is no
benefit in having `null` values in the `ConversionValueMapping`. (It may
even cause problems.)
This change is in preparation of merging the 1:1 and 1:N dialect
conversion drivers.
The 1:N type converter derived from the 1:1 type converter and extends
it with 1:N target materializations. This commit merges the two type
converters and stores 1:N target materializations in the 1:1 type
converter. This is in preparation of merging the 1:1 and 1:N dialect
conversion infrastructures.
1:1 target materializations (producing a single `Value`) will remain
valid. An additional API is added to the type converter to register 1:N
target materializations (producing a `SmallVector<Value>`). Internally,
all target materializations are stored as 1:N materializations.
The 1:N type converter is removed.
Note for LLVM integration: If you are using the `OneToNTypeConverter`,
simply switch all occurrences to `TypeConverter`.
---------
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
This commit simplifies the result type of materialization functions.
Previously: `std::optional<Value>`
Now: `Value`
The previous implementation allowed 3 possible return values:
- Non-null value: The materialization function produced a valid
materialization.
- `std::nullopt`: The materialization function failed, but another
materialization can be attempted.
- `Value()`: The materialization failed and so should the dialect
conversion. (Previously: Dialect conversion can roll back.)
This commit removes the last variant. It is not particularly useful
because the dialect conversion will fail anyway if all other
materialization functions produced `std::nullopt`.
Furthermore, in contrast to type conversions, at least one
materialization callback is expected to succeed. In case of a failing
type conversion, the current dialect conversion can roll back and try a
different pattern. This also used to be the case for materializations,
but that functionality was removed with #107109: failed materializations
can no longer trigger a rollback. (They can just make the entire dialect
conversion fail without rollback.) With this in mind, it is even less
useful to have an additional error state for materialization functions.
This commit is in preparation of merging the 1:1 and 1:N type
converters. Target materializations will have to return multiple values
instead of a single one. With this commit, we can keep the API simple:
`SmallVector<Value>` instead of `std::optional<SmallVector<Value>>`.
Note for LLVM integration: All 1:1 materializations should return
`Value` instead of `std::optional<Value>`. Instead of `std::nullopt`
return `Value()`.
This commit adds an optional `originalType` parameter to target
materialization functions. Without this parameter, target
materializations are underspecified.
Note: `originalType` is only needed for target materializations.
Source/argument materializations do not have it.
Consider the following example: Let's assume that a conversion pattern
"P1" replaced an SSA value "v1" (type "t1") with "v2" (type "t2"). Then
a different conversion pattern "P2" matches an op that has "v1" as an
operand. Let's furthermore assume that "P2" determines that the
legalized type of "t1" is "t3", which may be different from "t2". In
this example, the target materialization callback will be invoked with:
outputType = "t3", inputs = "v2", originalType = "t1". Note that the
original type "t1" cannot be recovered from just "t3" and "v2"; that's
why the `originalType` parameter is added.
This change is in preparation of merging the 1:1 and 1:N dialect
conversion drivers. As part of that change, argument materializations
will be removed (as they are no longer needed; they were just a
workaround because of missing 1:N support in the dialect conversion).
The new `originalType` parameter is needed when lowering MemRef to LLVM.
During that lowering, MemRef function block arguments are replaced with
the elements that make up a MemRef descriptor. The type converter is set
up in such a way that the legalized type of a MemRef type is an
`!llvm.struct` that represents the MemRef descriptor. When the bare
pointer calling convention is enabled, the function block arguments
consist of just an LLVM pointer. In such a case, a target
materialization will be invoked to construct a MemRef descriptor (output
type = `!llvm.struct<...>`) from just the bare pointer (inputs =
`!llvm.ptr`). The original MemRef type is required to construct the
MemRef descriptor, as static sizes/strides/offset cannot be inferred
from just the bare pointer.
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.
Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate pattern now have
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.
Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
This commit changes the implementation of analysis conversions, so that
no rollback is needed at the end of the analysis. Instead, the dialect conversion is run on a clone of the IR.
The purpose of this commit is to reduce the number of rollbacks in the
dialect conversion framework. (Long term goal: Remove rollback
functionality entirely.)
Fixes#107870.
We can allow the enclosing Module operation to have a symbol.
The check was likely originally not considering this case and intended
to catch symbols inside the region, not accounting that the walk would
visit the enclosing operation.
This commit simplifies the signature of `OperationConverter::finalize`.
This function always returns "success", so the return value can be
removed.
Note: Previously, this function used to return "failure" if a
materialization failed to legalize. This is now optional and happening
at a later point of time (see `config.buildMaterializations`).
There are some spurious libraries which can be removed.
I'm trying to bundle MLIR/LLVM library dependencies for our own
libraries. We're utilizing cmake function to recursively collect
MLIR/LLVM related dependencies. However, we identified certain library
dependencies as redundant and safe for removal.
Simplify the nesting structure of "if" checks in `remapValues` and
update the code comments.
This is what the comments stated in case there is no type converter:
```
// TODO: What we should do here is just set `desiredType` to `origType`
// and then handle the necessary type conversions after the conversion
// process has finished. Unfortunately a lot of patterns currently rely on
// receiving the new operands even if the types change, so we keep the
// original behavior here for now until all of the patterns relying on
// this get updated.
```
However, without a type converter it is not possible to perform any
materializations. Furthermore, the absence of a type converter indicates
that the pattern does not care about type legality. Therefore, the
current implementation is correct and this TODO can be removed.
Note: Patterns that actually require a remapped type to match the
original operand type can be equipped with a type converter that maps
each type to itself.
This TODO is outdated:
```
// TODO: There currently isn't any mechanism to do 1->N type conversion
// via the PatternRewriter replacement API, so for now we just ignore it.
```
1->N type conversions are already possible as part of block signature
conversions. It is incorrect to just ignore such cases. However, there
is currently no better way to handle 1->N conversions in this function
because of infrastructure limitations. This is now clarified in the
comments.
Remove a redundant `lookupOrDefault` that has no effect.
When no type is passed to `lookupOrDefault`, that function returns the
furthest mapped value (by following the mapping iteratively). If there
is no mapped value with the desired type, then the function also returns
the furthest mapped value.
The value that was passed to the redundant `lookupOrDefault` was
produced by this code:
```
Value newOperand = mapping.lookupOrDefault(operand, desiredType);
```
There are 2 possible cases:
- Case 1: There is no mapping to `desiredType`. Then `newOperand` is the
furthest mapped value.
- Case 2: There is a mapping to `desiredType`. Then the type of
`newOperand` is `desiredType` and the "if" branch that encloses the
redundant `lookupOrDefault` is not executed at all.
Also improve the documentation of
`ConversionValueMapping::lookupOrDefault` and simplify the
implementation a bit.
PR #106760 aligned the handling of dropped block arguments and dropped
op results. The two helper functions that insert source materializations
for uses of replaced block arguments / op results that survived the
conversion are now almost identical (`legalizeConvertedArgumentTypes`
and `legalizeConvertedOpResultTypes`). This PR merges the two functions
and moves the implementation directly into `finalize`.
This PR simplifies the code base and improves the efficiency a bit:
previously, `finalize` iterated over
`ConversionPatternRewriterImpl::rewrites` twice. Now, only one iteration
is needed.
---------
Co-authored-by: Jakub Kuderski <jakub@nod-labs.com>