Treat integer range for vector type as union of ranges of individual
elements. With this semantics, most arith ops on vectors will work out
of the box, the only special handling needed for constants and vector
elements manipulation ops.
The end goal of these changes is to be able to optimize vectorized index
calculations.
This is intended as a fast pattern rewrite driver for the cases when a
simple walk gets the job done but we would still want to implement it in
terms of rewrite patterns (that can be used with the greedy pattern
rewrite driver downstream).
The new driver is inspired by the discussion in
https://github.com/llvm/llvm-project/pull/112454 and the LLVM Dev
presentation from @matthias-springer earlier this week.
This limitation comes with some limitations:
* It does not repeat until a fixpoint or revisit ops modified in place
or newly created ops. In general, it only walks forward (in the
post-order).
* `matchAndRewrite` can only erase the matched op or its descendants.
This is verified under expensive checks.
* It does not perform folding / DCE.
We could probably relax some of these in the future without sacrificing
too much performance.
The `ValueDecomposer` in `DecomposeCallGraphTypes` was a workaround
around missing 1:N support in the dialect conversion. Since #113032, the
dialect conversion infrastructure supports 1:N type conversions and 1:N
target materializations. The `ValueDecomposer` class is no longer
needed. (However, target materializations must still be inserted
manually, until we fully merge the 1:1 and 1:N drivers.)
Note for LLVM integration: Register 1:N target materializations on the
type converter instead of "decompose value conversions" on the
`ValueDecomposer`.
This commit simplifies the result type of materialization functions.
Previously: `std::optional<Value>`
Now: `Value`
The previous implementation allowed 3 possible return values:
- Non-null value: The materialization function produced a valid
materialization.
- `std::nullopt`: The materialization function failed, but another
materialization can be attempted.
- `Value()`: The materialization failed and so should the dialect
conversion. (Previously: Dialect conversion can roll back.)
This commit removes the last variant. It is not particularly useful
because the dialect conversion will fail anyway if all other
materialization functions produced `std::nullopt`.
Furthermore, in contrast to type conversions, at least one
materialization callback is expected to succeed. In case of a failing
type conversion, the current dialect conversion can roll back and try a
different pattern. This also used to be the case for materializations,
but that functionality was removed with #107109: failed materializations
can no longer trigger a rollback. (They can just make the entire dialect
conversion fail without rollback.) With this in mind, it is even less
useful to have an additional error state for materialization functions.
This commit is in preparation of merging the 1:1 and 1:N type
converters. Target materializations will have to return multiple values
instead of a single one. With this commit, we can keep the API simple:
`SmallVector<Value>` instead of `std::optional<SmallVector<Value>>`.
Note for LLVM integration: All 1:1 materializations should return
`Value` instead of `std::optional<Value>`. Instead of `std::nullopt`
return `Value()`.
When an operation has no properties, no property struct is emitted. To avoid a compilation error, we should also skip emitting `setPropertiesFromParsedAttr`, `parseProperties` and `printProperties` in such cases.
Compilation error:
```
error: ‘Properties’ has not been declared
static ::llvm::LogicalResult setPropertiesFromParsedAttr(Properties &prop, ::mlir::Attribute attr, ::llvm::function_ref<::mlir::InFlightDiagnostic()> emitError);
```
This is a reasonable canonicalization because `extract` is more
constrained than `extract_strided_slices`, so there is no loss of
semantics here, just lifting an op to a special-case higher/constrained
op. And the additional `shape_cast` is merely adding leading unit dims
to match the original result type.
Context: discussion on #111541. I wasn't sure how this would turn out,
but in the process of writing this PR, I discovered at least 2 bugs in
the pattern introduced in #111541, which shows the value of shared
canonicalization patterns which are exercised on a high number of
testcases.
---------
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
This patch adds the capability to define dialect-specific location
attrs. This is useful in particular for defining location structure that
doesn't necessarily fit within the core MLIR location hierarchy, but
doesn't make sense to push upstream (i.e. a custom use case).
This patch adds an AttributeTrait, `IsLocation`, which is tagged onto
all the builtin location attrs, as well as the test location attribute.
This is necessary because previously LocationAttr::classof only returned
true if the attribute was one of the builtin location attributes, and
well, the point of this patch is to allow dialects to define their own
location attributes.
There was an alternate implementation I considered wherein LocationAttr
becomes an AttrInterface, but that was discarded because there are
likely to be *many* locations in a single program, and I was concerned
that forcing every MLIR user to pay the cost of the additional
lookup/dispatch was unacceptable. It also would have been a *much* more
invasive change. It would have allowed for more flexibility in terms of
pretty printing, but it's unclear how useful/necessary that flexibility
would be given how much customizability there already is for attribute
definitions.
Making the existing populateGpuLowerSubgroupReduceToShufflePatterns()
function also cover the new "clustered" subgroup reductions is proving
to be inconvenient, because certain backends may have more specific
lowerings that only cover the non-clustered type, and this creates pass
ordering constraints. This commit removes coverage of clustered
reductions from this function in favour of a new separate function,
which makes controlling the lowering much more straightforward.
This commit makes source/target/argument materializations (via the
`TypeConverter` API) optional.
By default (`ConversionConfig::buildMaterializations = true`), the
dialect conversion infrastructure tries to legalize all unresolved
materializations right after the main transformation process has
succeeded. If at least one unresolved materialization fails to resolve,
the dialect conversion fails. (With an error message such as `failed to
legalize unresolved materialization ...`.) Automatic materializations
through the `TypeConverter` API can now be deactivated. In that case,
every unresolved materialization will show up as a
`builtin.unrealized_conversion_cast` op in the output IR.
There used to be a complex and error-prone analysis in the dialect
conversion that predicted the future uses of unresolved
materializations. Based on that logic, some casts (that were deemed to
unnecessary) were folded. This analysis was needed because folding
happened at a point of time when some IR changes (e.g., op replacements)
had not materialized yet.
This commit removes that analysis. Any folding of cast ops now happens
after all other IR changes have been materialized and the uses can
directly be queried from the IR. This simplifies the analysis
significantly. And certain helper data structures such as
`inverseMapping` are no longer needed for the analysis. The folding
itself is done by `reconcileUnrealizedCasts` (which also exists as a
standalone pass).
After casts have been folded, the remaining casts are materialized
through the `TypeConverter`, as usual. This last step can be deactivated
in the `ConversionConfig`.
`ConversionConfig::buildMaterializations = false` can be used to debug
error messages such as `failed to legalize unresolved materialization
...`. (It is also useful in case automatic materializations are not
needed.) The materializations that failed to resolve can then be seen as
`builtin.unrealized_conversion_cast` ops in the resulting IR. (This is
better than running with `-debug`, because `-debug` shows IR where some
IR changes have not been materialized yet.)
Note: This is a reupload of #104668, but with correct handling of cyclic
unrealized_conversion_casts that may be generated by the dialect
conversion.
It is better to do the replacement in the caller. This avoids the
footgun if the caller needs the original operation. Instead return the
produced operation and replacement values.
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
Mem2Reg assumes SSA dependencies but did not check for graph regions.
This fixes it.
---------
Co-authored-by: Christian Ulmann <christianulmann@gmail.com>
Group all patterns that re-order vector.transpose and vector.broadcast
Ops (*) under `populateSinkVectorOpsPatterns`. These patterns are
normally used to "sink" redundant Vector Ops, hence grouping together.
Example:
```mlir
%at = vector.transpose %a, [1, 0]: vector<4x2xf32> to vector<2x4xf32>
%bt = vector.transpose %b, [1, 0]: vector<4x2xf32> to vector<2x4xf32>
%r = arith.addf %at, %bt : vector<2x4xf32>
```
would get converted to:
```mlir
%0 = arith.addf %a, %b : vector<4x2xf32>
%r = vector.transpose %0, [1, 0] : vector<2x4xf32>
```
This patch also moves all tests for these patterns so that all of them
are:
* run under one test-flag: `test-vector-sink-patterns`,
* located in one file: "vector-sink.mlir".
To facilitate this change:
* `-test-sink-vector-broadcast` is renamed as
`test-vector-sink-patterns`,
* "sink-vector-broadcast.mlir" is renamed as "vector-sink.mlir",
* tests for `ReorderCastOpsOnBroadcast` and
`ReorderElementwiseOpsOnTranspose` patterns are moved from
"vector-reduce-to-contract.mlir" to "vector-sink.mlir",
* `ReorderElementwiseOpsOnTranspose` patterns are removed from
`populateVectorReductionToContractPatterns` and added to (newly
created) `populateSinkVectorOpsPatterns`,
* `ReorderCastOpsOnBroadcast` patterns are removed from
`populateVectorReductionToContractPatterns` - these are already
present in `populateSinkVectorOpsPatterns`.
This should allow us better layering and more straightforward testing.
For the latter, the goal is to be able to easily identify which pattern
a particular test is exercising (especially when it's a specific
pattern).
NOTES FOR DOWNSTREAM USERS
In order to preserve the current functionality, please make sure to add
* `populateSinkVectorOpsPatterns`,
wherever you are using `populateVectorReductionToContractPatterns`.
Also, rename `populateSinkVectorBroadcastPatterns` as
`populateSinkVectorOpsPatterns`.
(*) I didn't notice any other re-order patterns.
Refactored @Max191's PR https://github.com/llvm/llvm-project/pull/94637
to move it to `Tensor`
From the original PR
>This PR adds fusion by expansion patterns to push a tensor.expand_shape
up through a tensor.collapse_shape with non-intersecting reassociations.
Sometimes parallel collapse_shape ops like this can block propagation of
expand_shape ops, so this allows them to pass through each other.
I'm not sure if I put the code/tests in the right places, so let me know
where those go if they aren't.
cc @MaheshRavishankar @hanhanW
---------
Co-authored-by: Max Dawkins <max.dawkins@gmail.com>
This is a fixed copy of #98145 (necessary after it got reverted).
@sogartar @yaochengji
This PR adds the following to #98145:
- `UpdateHaloOp` accepts a `memref` (instead of a tensor) and not
returning a result to clarify its inplace-semantics
- `UpdateHaloOp` accepts `split_axis` to allow multiple mesh-axes per
tensor/memref-axis (similar to `mesh.sharding`)
- The implementation of `Shardinginterface` for tensor operation
(`tensor.empty` for now) moved from the tensor library to the mesh
interface library. `spmdize` uses features from `mesh` dialect.
@rengolin agreed that `tensor` should not depend on `mesh` so this
functionality cannot live in a `tensor`s lib. The unfulfilled dependency
caused the issues leading to reverting #98145. Such cases are generally
possible and might lead to re-considering the current structure (like
for tosa ops).
- rebased onto latest main
--------------------------
Replacing `#mesh.sharding` attribute with operation `mesh.sharding`
- extended semantics now allow providing optional `halo_sizes` and
`sharded_dims_sizes`
- internally a sharding is represented as a non-IR class
`mesh::MeshSharding`
What previously was
```mlir
%sharded0 = mesh.shard %arg0 <@mesh0, [[0]]> : tensor<4x8xf32>
%sharded1 = mesh.shard %arg1 <@mesh0, [[0]]> annotate_for_users : tensor<16x8xf32>
```
is now
```mlir
%sharding = mesh.sharding @mesh0, [[0]] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding : tensor<4x8xf32>
%1 = mesh.shard %arg1 to %sharding annotate_for_users : tensor<16x8xf32>
```
and allows additional annotations to control the shard sizes:
```mlir
mesh.mesh @mesh0 (shape = 4)
%sharding0 = mesh.sharding @mesh0, [[0]] halo_sizes = [1, 2] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding0 : tensor<4x8xf32>
%sharding1 = mesh.sharding @mesh0, [[0]] sharded_dims_sizes = [3, 5, 5, 3] : !mesh.sharding
%1 = mesh.shard %arg1 to %sharding1 annotate_for_users : tensor<16x8xf32>
```
- `mesh.shard` op accepts additional optional attribute `force`, useful
for halo updates
- Some initial spmdization support for the new semantics
- Support for `tensor.empty` reacting on `sharded_dims_sizes` and
`halo_sizes` in the sharding
- New collective operation `mesh.update_halo` as a spmdized target for
shardings with `halo_sizes`
---------
Co-authored-by: frank.schlimbach <fschlimb@smtp.igk.intel.com>
Co-authored-by: Jie Fu <jiefu@tencent.com>
When a type/attribute is defined in TableGen, a type constraint can be
used for parameters, but the type constraint verification was missing.
Example:
```
def TestTypeVerification : Test_Type<"TestTypeVerification"> {
let parameters = (ins AnyTypeOf<[I16, I32]>:$param);
// ...
}
```
No verification code was generated to ensure that `$param` is I16 or
I32.
When type constraints a present, a new method will generated for types
and attributes: `verifyInvariantsImpl`. (The naming is similar to op
verifiers.) The user-provided verifier is called `verify` (no change).
There is now a new entry point to type/attribute verification:
`verifyInvariants`. This function calls both `verifyInvariantsImpl` and
`verify`. If neither of those two verifications are present, the
`verifyInvariants` function is not generated.
When a type/attribute is not defined in TableGen, but a verifier is
needed, users can implement the `verifyInvariants` function. (This
function was previously called `verify`.)
Note for LLVM integration: If you have an attribute/type that is not
defined in TableGen (i.e., just C++), you have to rename the
verification function from `verify` to `verifyInvariants`. (Most
attributes/types have no verification, in which case there is nothing to
do.)
Depends on #102657.
This adds a new transform `eliminateVectorMasks()` which aims at
removing scalable `vector.create_masks` that will be all-true at
runtime. It attempts to do this by simply pattern-matching the mask
operands (similar to some canonicalizations), if that does not lead to
an answer (is all-true? yes/no), then value bounds analysis will be used
to find the lower bound of the unknown operands. If the lower bound is
>= to the corresponding mask vector type dim, then that dimension of the
mask is all true.
Note that the pattern matching prevents expensive value-bounds analysis
in cases where the mask won't be all true.
For example:
```mlir
%mask = vector.create_mask %dynamicValue, %c2 : vector<8x4xi1>
```
From looking at `%c2` we can tell this is not going to be an all-true
mask, so we don't need to run the value-bounds analysis for
`%dynamicValue` (and can exit the transform early).
Note: Eliminating create_masks here means replacing them with all-true
constants (which will then lead to the masks folding away).
- Replacing `#mesh.sharding` attribute with operation `mesh.sharding`
- extended semantics now allow providing optional `halo_sizes` and
`sharded_dims_sizes`
- internally a sharding is represented as a non-IR class
`mesh::MeshSharding`
What previously was
```mlir
%sharded0 = mesh.shard %arg0 <@mesh0, [[0]]> : tensor<4x8xf32>
%sharded1 = mesh.shard %arg1 <@mesh0, [[0]]> annotate_for_users : tensor<16x8xf32>
```
is now
```mlir
%sharding = mesh.sharding @mesh0, [[0]] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding : tensor<4x8xf32>
%1 = mesh.shard %arg1 to %sharding annotate_for_users : tensor<16x8xf32>
```
and allows additional annotations to control the shard sizes:
```mlir
mesh.mesh @mesh0 (shape = 4)
%sharding0 = mesh.sharding @mesh0, [[0]] halo_sizes = [1, 2] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding0 : tensor<4x8xf32>
%sharding1 = mesh.sharding @mesh0, [[0]] sharded_dims_sizes = [3, 5, 5, 3] : !mesh.sharding
%1 = mesh.shard %arg1 to %sharding1 annotate_for_users : tensor<16x8xf32>
```
- `mesh.shard` op accepts additional optional attribute `force`, useful
for halo updates
- Some initial spmdization support for the new semantics
- Support for `tensor.empty` reacting on `sharded_dims_sizes` and
`halo_sizes` in the sharding
- New collective operation `mesh.update_halo` as a spmdized target for
shardings with `halo_sizes`
@sogartar @yaochengji
`PassManager::run` loads the dependent dialects for each pass into the
current context prior to invoking the individual passes. If the
dependent dialect is already loaded into the context, this should be a
no-op. However, if there are extensions registered in the
`DialectRegistry`, the dependent dialects are unconditionally registered
into the context.
This poses a problem for dynamic pass pipelines, however, because they
will likely be executing while the context is in an immutable state
(because of the parent pass pipeline being run).
To solve this, we'll update the extension registration API on
`DialectRegistry` to require a type ID for each extension that is
registered. Then, instead of unconditionally registered dialects into a
context if extensions are present, we'll check against the extension
type IDs already present in the context's internal `DialectRegistry`.
The context will only be marked as dirty if there are net-new extension
types present in the `DialectRegistry` populated by
`PassManager::getDependentDialects`.
Note: this PR removes the `addExtension` overload that utilizes
`std::function` as the parameter. This is because `std::function` is
copyable and potentially allocates memory for the contained function so
we can't use the function pointer as the unique type ID for the
extension.
Downstream changes required:
- Existing `DialectExtension` subclasses will need a type ID to be
registered for each subclass. More details on how to register a type ID
can be found here:
8b68e06731/mlir/include/mlir/Support/TypeID.h (L30)
- Existing uses of the `std::function` overload of `addExtension` will
need to be refactored into dedicated `DialectExtension` classes with
associated type IDs. The attached `std::function` can either be inlined
into or called directly from `DialectExtension::apply`.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
This code got lost in #97213 and there was no test for it. Add it back
with an MLIR test.
When a pattern is run without a type converter, we can assume that the
new block argument types of a signature conversion are legal. That's
because they were specified by the user. This won't work for 1->N
conversions due to limitations in the dialect conversion infrastructure,
so the original `FIXME` has to stay in place.
Define SCF dialect patterns rotating `scf.while` loops leveraging
existing `mlir::scf::wrapWhileLoopInZeroTripCheck`. `forceCreateCheck`
is always `false` as the pattern would lead to an infinite recursion
otherwise.
This pattern rotates `scf.while` ops, mutating them from "while" loops to
"do-while" loops. A guard checking the condition for the first iteration
is inserted. Note this guard can be optimized away if the compiler can
prove the loop will be executed at least once.
Using this pattern, the following while loop:
```mlir
scf.while (%arg0 = %init) : (i32) -> i64 {
%val = .., %arg0 : i64
%cond = arith.cmpi .., %arg0 : i32
scf.condition(%cond) %val : i64
} do {
^bb0(%arg1: i64):
%next = .., %arg1 : i32
scf.yield %next : i32
}
```
Can be transformed into:
``` mlir
%pre_val = .., %init : i64
%pre_cond = arith.cmpi .., %init : i32
scf.if %pre_cond -> i64 {
%res = scf.while (%arg1 = %va0) : (i64) -> i64 {
// Original after block
%next = .., %arg1 : i32
// Original before block
%val = .., %next : i64
%cond = arith.cmpi .., %next : i32
scf.condition(%cond) %val : i64
} do {
^bb0(%arg2: i64):
%scf.yield %arg2 : i32
}
scf.yield %res : i64
} else {
scf.yield %pre_val : i64
}
```
The test pass for `wrapWhileLoopInZeroTripCheck` has been modified to
use the new pattern when `forceCreateCheck=false`.
---------
Signed-off-by: Victor Perez <victor.perez@codeplay.com>
While we have had a Properties.td that allowed for defining
non-attribute-backed properties, such properties were not plumbed
through the basic autogeneration facilities available to attributes,
forcing those who want to migrate to the new system to write such code
by hand.
## Potentially breaking changes
- The `setFoo()` methods on `Properties` struct no longer take their
inputs by const reference. Those wishing to pass non-owned values of a
property by reference to constructors and setters should set the
interface type to `const [storageType]&`
- Adapters and operations now define getters and setters for properties
listed in ODS, which may conflict with custom getters.
- Builders now include properties listed in ODS specifications,
potentially conflicting with custom builders with the same type
signature.
## Extensions to the `Property` class
This commit adds several fields to the `Property` class, including:
- `parser`, `optionalParser`, and `printer` (for parsing/printing
properties of a given type in ODS syntax)
- `storageTypeValueOverride`, an extension of `defaultValue` to allow
the storage and interface type defaults to differ
- `baseProperty` (allowing for classes like `DefaultValuedProperty`)
Existing fields have also had their documentation comments updated.
This commit does not add a `PropertyConstraint` analogous to
`AttrConstraint`, but this is a natural evolution of the work here.
This commit also adds the concrete property kinds `I32Property`,
`I64Property`, `UnitProperty` (and special handling for it like for
UnitAttr), and `BoolProperty`.
## Property combinators
`Properties.td` also now includes several ways to combine properties.
One is `ArrayProperty<Property elem>`, which now stores a
variable-length array of some property as
`SmallVector<elem.storageType>` and uses `ArrayRef<elem.storageType>` as
its interface type. It has `IntArrayProperty` subclasses that change its
conversion to attributes to use `DenseI[N]Attr`s instead of an
`ArrayAttr`.
Similarly, `OptionalProperty<Property p>` wraps a property's storage in
`std::optional<>` and adds a `std::nullopt` default value. In the case
where the underlying property can be parsed optionally but doesn't have
its own default value, `OptionalProperty` can piggyback off the optional
parser to produce a cleaner syntax, as opposed to its general form,
which is either `none` or `some<[value]>`.
(Note that `OptionalProperty` can be nested if desired).
## Autogeneration changes
Operations and adaptors now support getters and setters for properties
like those for attributes. Unlike for attributes, there aren't separate
value and attribute forms, since there is no `FooAttr()` available for a
`getFooAttr()` to return.
The largest change is to operation formats. Previously, properties could
only be used in custom directives. Now, they can be used anywhere an
attribute could be used, and have parsers and printers defined in their
tablegen records.
These updates include special `UnitProperty` logic like that used for
`UnitAttr`.
## Misc.
Some attempt has been made to test the new functionality.
This commit takes tentative steps towards updating the documentation to
account for properties. A full update will be in order once any followup
work has been completed and the interfaces have stabilized.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Co-authored-by: Christian Ulmann <christianulmann@gmail.com>
This commit fixes a crash in the dialect conversion when applying a
signature conversion to a block inside of a detached region.
This fixes an issue reported in
4114d5be87 (r1691809730).
An attribute parser needs to parse lists of possibly negative integers
separated by x in a way which is foiled by parseInteger handling hex
formats and parseIntegerInDimensionList does not allow negatives.
---------
Co-authored-by: Jacques Pienaar <jpienaar@google.com>
Convert Linalg winograd_filter_transform, winograd_input_transform, and
winograd_output_transform into nested loops with matrix multiplication
with constant transform matrices.
Support several configurations of Winograd Conv2D, including F(2, 3),
F(4, 3) and F(2, 5). These configurations show that the implementation
can support different kernel size (3 and 5) and different output size
(2 and 4). Besides symetric kernel size 3x3 and 5x5, this patch also
supports 1x3, 3x1, 1x5, and 5x1 kernels.
The implementation is based on the paper, Fast Algorithm for
Convolutional Neural Networks. (https://arxiv.org/abs/1509.09308)
Reviewers: ftynse, Max191, GeorgeARM, nicolasvasilache, MaheshRavishankar, dcaballe, rengolin
Reviewed By: ftynse, Max191
Pull Request: https://github.com/llvm/llvm-project/pull/96183
Define high level winograd operators and convert conv_2d_nhwc_fhwc into
winograd operators. According to Winograd Conv2D algorithm, we need
three transform operators for input, filter, and output transformation.
The formula of Winograd Conv2D algorithm is
Y = A^T x [(G x g x G^T) @ (B^T x d x B)] x A
filter transform: G x g x G^T
input transform: B^T x d x B
output transform: A^T x y x A
The implementation is based on the paper, Fast Algorithm for
Convolutional Neural Networks. (https://arxiv.org/abs/1509.09308)
Reviewers: stellaraccident, ftynse, Max191, GeorgeARM, cxy-1993, nicolasvasilache, MaheshRavishankar, dcaballe, rengolin
Reviewed By: ftynse, Max191, stellaraccident
Pull Request: https://github.com/llvm/llvm-project/pull/96181
It's not easy to determine whether we want to propagate pack/unpack ops
because we don't know the (producer, consumer) information. The
revisions switch it to `OpOperand*`, so the control function can capture
the (producer, consumer) pair. E.g.,
```
Operation *producer = opOperand->get().getDefiningOp();
Operation *consumer = opOperand->getOwner();
```
https://github.com/llvm/llvm-project/pull/96242 fixed an issue where the
auto-generated parsers were not loading dialects whose namespaces are
not present in the textual IR. This required the attribute parameter to
be a tablegen def with its dialect information attached.
This fails when using parameter wrapper classes like
`OptionalParameter`. This came up because `RingAttr` uses
`OptionalParameter` for its second and third attributes.
`OptionalParameter` takes as input the C++ type as a string instead of
the tablegen def, and so it doesn't have a dialect member value to
trigger the fix from https://github.com/llvm/llvm-project/pull/96242.
The docs on this topic say the appropriate solution as overloading
`FieldParser` for a particular type.
This PR updates `FieldParser` for generic attributes to load the dialect
on demand. This requires `mlir-tblgen` to emit a `dialectName` static
field on the generated attribute class, and check for it with template
metaprogramming, since not all attribute types go through `mlir-tblgen`.
---------
Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
The `elementPtrs` has changed meaning over time and the name is now
outdated which may be confusing. This PR updates it to a name
representative of current usage.
This patch introduces pattern rewrites for reducing the rank of named
linalg contraction ops with unit spatial dim(s) to other named
contraction ops. For example `linalg.batch_matmul` with batch size 1 ->
`linalg.matmul` and `linalg.matmul` with unit LHS spatial dim ->
`linalg.vecmat`, etc. These patterns don't support reducing the rank
along reduction dimension as those don't convert to other named
contraction ops.
The mlir-translate tool calls into the parser without loading registered
dependent dialects, and the parser only loads attributes if the
fully-namespaced attribute is present in the textual IR. This causes
parsing to break when an op has an attribute that prints/parses without
the namespaced attribute.
Co-authored-by: Jeremy Kun <jkun@google.com>
This patch adds more precise side effects to the current ops with memory
effects, allowing us to determine which OpOperand/OpResult/BlockArgument
the
operation reads or writes, rather than just recording the reading and
writing
of values. This allows for convenient use of precise side effects to
achieve
analysis and optimization.
Related discussions:
https://discourse.llvm.org/t/rfc-add-operandindex-to-sideeffect-instance/79243
MLIR's LLVM dialect does not internally support debug records, only
converting to/from debug intrinsics. To smooth the transition from
intrinsics to records, there is a step prior to IR->MLIR translation
that switches the IR module to intrinsic-form; this patch adds the
equivalent conversion to record-form at MLIR->IR translation.
This is a partial reapply of
https://github.com/llvm/llvm-project/pull/95098 which can be landed once
the flang frontend has been updated by
https://github.com/llvm/llvm-project/pull/95306. This is the counterpart
to the earlier patch https://github.com/llvm/llvm-project/pull/89735
which handled the IR->MLIR conversion.
Also reverts "[MLIR][Flang][DebugInfo] Convert debug format in MLIR translators"
The patch above introduces behaviour controlled by an LLVM flag into the
Flang driver, which is incorrect behaviour.
This reverts commits:
3cc2710e0d.
460408f78b.
Presently, if name starts with a symbol it's converted to hex which may
cause the result to be invalid by starting with a digit.
Address this and add a small test.
Co-authored-by: Will Dietz <w@wdtz.org>
This commit simplifies and improves documentation for the part of the
`ConversionPatternRewriter` API that deals with signature conversions.
There are now two public functions for signature conversion:
* `applySignatureConversion` converts a single block signature. This
function used to take a `Region *` (but converted only the entry block).
It now takes a `Block *`.
* `convertRegionTypes` converts all block signatures of a region.
`convertNonEntryRegionTypes` is removed because it is not widely used
and can easily be expressed with a call to `applySignatureConversion`
inside a loop. (See `Detensorize.cpp` for an example.)
Note: For consistency, `convertRegionTypes` could be renamed to
`applySignatureConversion` (overload) in the future. (Or
`applySignatureConversion` renamed to `convertBlockTypes`.)
Also clarify when a type converter and/or signature conversion object is
needed and for what purpose.
Internal code refactoring (NFC) of `ConversionPatternRewriterImpl` (the
part that deals with signature conversions). This part of the codebase
was quite convoluted and unintuitive.
From a functional perspective, this change is NFC. However, the public
API changes, thus not marking as NFC.
Note for LLVM integration: When you see
`applySignatureConversion(region, ...)`, replace with
`applySignatureConversion(region->front(), ...)`. In the unlikely case
that you see `convertNonEntryRegionTypes`, apply the same changes as
this commit did to `Detensorize.cpp`.
---------
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
This adds a new option
`-enable-arm-streaming=if-contains-scalable-vectors`, which only applies
the selected streaming/ZA modes if the function contains scalable vector
types.
As a NFC this patch also removes the `only-` prefix from the
`if-required-by-ops` mode.
Integer range analysis will not update the range of an operation when
any of the inferred input lattices are uninitialized. In the current
behavior, all lattice values for non integer types are uninitialized.
For operations like arith.cmpf
```mlir
%3 = arith.cmpf ugt, %arg0, %arg1 : f32
```
that will result in the range of the output also being uninitialized,
and so on for any consumer of the arith.cmpf result. When control-flow
ops are involved, the lack of propagation results in incorrect ranges,
as the back edges for loop carried values are not properly joined with
the definitions from the body region.
For example, an scf.while loop whose body region produces a value that
is in a dataflow relationship with some floating-point values through an
arith.cmpf operation:
```mlir
func.func @test_bad_range(%arg0: f32, %arg1: f32) -> (index, index) {
%c4 = arith.constant 4 : index
%c1 = arith.constant 1 : index
%c0 = arith.constant 0 : index
%3 = arith.cmpf ugt, %arg0, %arg1 : f32
%1:2 = scf.while (%arg2 = %c0, %arg3 = %c0) : (index, index) -> (index, index) {
%2 = arith.cmpi ult, %arg2, %c4 : index
scf.condition(%2) %arg2, %arg3 : index, index
} do {
^bb0(%arg2: index, %arg3: index):
%4 = arith.select %3, %arg3, %arg3 : index
%5 = arith.addi %arg2, %c1 : index
scf.yield %5, %4 : index, index
}
return %1#0, %1#1 : index, index
}
```
The existing behavior results in the control condition %2 being
optimized to true, turning the while loop into an infinite loop. The
update to %arg2 through the body region is never factored into the range
calculation, as the ranges for the body ops all test as uninitialized.
This change causes all values initialized with setToEntryState to be set
to some initialized range, even if the values are not integers.
---------
Co-authored-by: Spenser Bauman <sabauma@fastmail>
This patch adapts the `test.reflect_bounds` test Op to use explicitly
signed and unsigned representation for signed and unsigned bounds of
`IntegerType`s.
This is mostly a cosmetic change as the internal representation of the
ranges is unchanged. However, it improves readability of tests.