The following logic can lead to a class name mismatch when using
`linalg.powf` in Python. This PR fixed the issue and also renamed
`NegfOp` to `NegFOp` in linalg to adhere to the naming convention, as
exemplified by `arith::NegFOp`.
173514d58e/mlir/python/mlir/dialects/linalg/opdsl/lang/dsl.py (L140-L143)
```
# linalg.powf(arg0, arg1, outs=[init_result.result])
NotImplementedError: Unknown named op_name / op_class_name: powf / PowfOp
```
Similar to other attributes in Binding, the `PyAffineMapAttribute`
should include a value attribute to enable users to directly retrieve
the `AffineMap` from the `AffineMapAttr`.
This patch enables continuous tiling of a target structured op using
diminishing tile sizes. In cases where the tensor dimensions are not
exactly divisible by the tile size, we are left with leftover tensor
chunks that are irregularly tiled. This approach enables tiling of the
leftover chunk with a smaller tile size and repeats this process
recursively using exponentially diminishing tile sizes. This eventually
generates a chain of loops that apply tiling using diminishing tile
sizes.
Adds `continuous_tile_sizes` op to the transform dialect. This op, when
given a tile size and a dimension, computes a series of diminishing tile
sizes that can be used to tile the target along the given dimension.
Additionally, this op also generates a series of chunk sizes that the
corresponding tile sizes should be applied to along the given dimension.
Adds `multiway` attribute to `transform.structured.split` that enables
multiway splitting of a single target op along the given dimension, as
specified in a list enumerating the chunk sizes.
The MLIR C and Python Bindings expose various methods from
`mlir::OpPrintingFlags` . This PR adds a binding for the `skipRegions`
method, which allows to skip the printing of Regions when printing Ops.
It also exposes this option as parameter in the python `get_asm` and
`print` methods
Following a rather direct approach to expose PDL usage from C and then
Python. This doesn't yes plumb through adding support for custom
matchers through this interface, so constrained to basics initially.
This also exposes greedy rewrite driver. Only way currently to define
patterns is via PDL (just to keep small). The creation of the PDL
pattern module could be improved to avoid folks potentially accessing
the module used to construct it post construction. No ergonomic work
done yet.
---------
Signed-off-by: Jacques Pienaar <jpienaar@google.com>
The PR implements MLIR Python Bindings for a few simple edit operations
on Block arguments, namely, `add_argument`, `erase_argument`, and
`erase_arguments`.
When an operation is erased in Python, its children may still be in the
"live" list inside Python bindings. After this, if some of the newly
allocated operations happen to reuse the same pointer address, this will
trigger an assertion in the bindings. This assertion would be incorrect
because the operations aren't actually live. Make sure we remove the
children operations from the "live" list when erasing the parent.
This also concentrates responsibility over the removal from the "live"
list and invalidation in a single place.
Note that this requires the IR to be sufficiently structurally valid so
a walk through it can succeed. If this invariant was broken by, e.g, C++
pass called from Python, there isn't much we can do.
Using `for_` is very hand with python bindings. Currently, it doesn't
support results, we had to fallback to two lines scf.for.
This PR yields results of scf.for in `for_`
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
This change adds bindings for `mlirDenseElementsAttrGet` which accepts a
list of MLIR attributes and constructs a DenseElementsAttr. This allows
for creating `DenseElementsAttr`s of types not natively supported by
Python (e.g. BF16) without requiring other dependencies (e.g. `numpy` +
`ml-dtypes`).
This patch is a first pass at making consistent syntax across the
`LinalgTransformOp`s that use dynamic index lists for size parameters.
Previously, there were two different forms: inline types in the list, or
place them in the functional style tuple. This patch goes for the
latter.
In order to do this, the `printPackedOrDynamicIndexList`,
`printDynamicIndexList` and their `parse` counterparts were modified so
that the types can be optionally provided to the corresponding custom
directives.
All affected ops now use tablegen `assemblyFormat`, so custom
`parse`/`print` functions have been removed. There are a couple ops that
will likely add dynamic size support, and once that happens it should be
made sure that the assembly remains consistent with the changes in this
patch.
The affected ops are as follows: `pack`, `pack_greedily`,
`tile_using_forall`. The `tile_using_for` and `vectorize` ops already
used this syntax, but their custom assembly was removed.
---------
Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
This patch modifies the definition of `PadOp` to take transform params
and handles for the `pad_to_multiple_of` operand.
---------
Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
1. Explicit value means the non-zero value in a sparse tensor. If
explicitVal is set, then all the non-zero values in the tensor have the
same explicit value. The default value Attribute() indicates that it is
not set.
2. Implicit value means the "zero" value in a sparse tensor. If
implicitVal is set, then the "zero" value in the tensor is equal to the
implicit value. For now, we only support `0` as the implicit value but
it could be extended in the future. The default value Attribute()
indicates that the implicit value is `0` (same type as the tensor
element type).
Example:
```
#CSR = #sparse_tensor.encoding<{
map = (d0, d1) -> (d0 : dense, d1 : compressed),
posWidth = 64,
crdWidth = 64,
explicitVal = 1 : i64,
implicitVal = 0 : i64
}>
```
Note: this PR tests that implicitVal could be set to other values as
well. The following PR will add verifier and reject any value that's not
zero for implicitVal.
The Python bindings generated for "async" dialect didn't include any of
the "async" dialect ops. This PR fixes issues with generation of Python
bindings for "async" dialect and adds a test case to use them.
If the python callback throws an error, the c++ code will throw a
py::error_already_set that needs to be caught and handled in the c++
code .
This change is inspired by the similar solution in
PySymbolTable::walkSymbolTables.
This commit adds `walk` method to PyOperationBase that uses a python
object as a callback, e.g. `op.walk(callback)`. Currently callback must
return a walk result explicitly.
We(SiFive) have implemented walk method with python in our internal
python tool for a while. However the overhead of python is expensive and
it didn't scale well for large MLIR files. Just replacing walk with this
version reduced the entire execution time of the tool by 30~40% and
there are a few configs that the tool takes several hours to finish so
this commit significantly improves tool performance.
Arithmetic constants for vector types can be constructed from objects
implementing Python buffer protocol such as `array.array`. Note that
until Python 3.12, there is no typing support for buffer protocol
implementers, so the annotations use array explicitly.
Reverts llvm/llvm-project#84103
Arithmetic constants for vector types can be constructed from objects
implementing Python buffer protocol such as `array.array`. Note that
until Python 3.12, there is no typing support for buffer protocol
implementers, so the annotations use array explicitly.
From https://reviews.llvm.org/D153245
This adds support for native PDL (and PDLL) C++ constraints to return
results.
This is useful for situations where a pattern checks for certain
constraints of multiple interdependent attributes and computes a new
attribute value based on them. Currently, for such an example it is
required to escape to C++ during matching to perform the check and after
a successful match again escape to native C++ to perform the computation
during the rewriting part of the pattern. With this work we can do the
computation in C++ during matching and use the result in the rewriting
part of the pattern. Effectively this enables a choice in the trade-off
of memory consumption during matching vs recomputation of values.
This is an example of a situation where this is useful: We have two
operations with certain attributes that have interdependent constraints.
For instance `attr_foo: one_of [0, 2, 4, 8], attr_bar: one_of [0, 2, 4,
8]` and `attr_foo == attr_bar`. The pattern should only match if all
conditions are true. The new operation should be created with a new
attribute which is computed from the two matched attributes e.g.
`attr_baz = attr_foo * attr_bar`. For the check we already escape to
native C++ and have all values at hand so it makes sense to directly
compute the new attribute value as well:
```
Constraint checkAndCompute(attr0: Attr, attr1: Attr) -> Attr;
Pattern example with benefit(1) {
let foo = op<test.foo>() {attr = attr_foo : Attr};
let bar = op<test.bar>(foo) {attr = attr_bar : Attr};
let attr_baz = checkAndCompute(attr_foo, attr_bar);
rewrite bar with {
let baz = op<test.baz> {attr=attr_baz};
replace bar with baz;
};
}
```
To achieve this the following notable changes were necessary:
PDLL:
- Remove check in PDLL parser that prevented native constraints from
returning results
PDL:
- Change PDL definition of pdl.apply_native_constraint to allow variadic
results
PDL_interp:
- Change PDL_interp definition of pdl_interp.apply_constraint to allow
variadic results
PDLToPDLInterp Pass:
The input to the pass is an arbitrary number of PDL patterns. The pass
collects the predicates that are required to match all of the pdl
patterns and establishes an ordering that allows creation of a single
efficient matcher function to match all of them. Values that are matched
and possibly used in the rewriting part of a pattern are represented as
positions. This allows fusion and thus reusing a single position for
multiple matching patterns. Accordingly, we introduce
ConstraintPosition, which records the type and index of the result of
the constraint. The problem is for the corresponding value to be used in
the rewriting part of a pattern it has to be an input to the
pdl_interp.record_match operation, which is generated early during the
pass such that its surrounding block can be referred to by branching
operations. In consequence the value has to be materialized after the
original pdl.apply_native_constraint has been deleted but before we get
the chance to generate the corresponding pdl_interp.apply_constraint
operation. We solve this by emitting a placeholder value when a
ConstraintPosition is evaluated. These placeholder values (due to fusion
there may be multiple for one constraint result) are replaced later when
the actual pdl_interp.apply_constraint operation is created.
Changes since the phabricator review:
- Addressed all comments
- In particular, removed registerConstraintFunctionWithResults and
instead changed registerConstraintFunction so that contraint functions
always have results (empty by default)
- Thus we don't need to reuse `rewriteFunctions` to store constraint
functions with results anymore, and can instead use
`constraintFunctions`
- Perform a stable sort of ConstraintQuestion, so that
ConstraintQuestion appear before other ConstraintQuestion that use their
results.
- Don't create placeholders for pdl_interp::ApplyConstraintOp. Instead
generate the `pdl_interp::ApplyConstraintOp` before generating the
successor block.
- Fixed a test failure in the pdl python bindings
Original code by @martin-luecke
Co-authored-by: martin-luecke <martinpaul.luecke@amd.com>
_SubClassValueT is only useful when it is has >1 usage in a signature.
This was not true for the signatures produced by tblgen.
For example
def call(result, callee, operands_, *, loc=None, ip=None) ->
_SubClassValueT:
...
here a type checker does not have enough information to infer a type
argument for _SubClassValueT, and thus effectively treats it as Any.
Expose the API for constructing and inspecting StructTypes from the LLVM
dialect. Separate constructor methods are used instead of overloads for
better readability, similarly to IntegerType.
…ct LevelType from LevelFormat and properties instead.
**Rationale**
We used to explicitly declare every possible combination between
`LevelFormat` and `LevelProperties`, and it now becomes difficult to
scale as more properties/level formats are going to be introduced.
1. Add python test for n out of m
2. Add more methods for python binding
3. Add verification for n:m and invalid encoding tests
4. Add e2e test for n:m
Previous PRs for n:m #80501#79935
Currently, a method exists to get the count of the operation objects
which are still alive. This helps for sanity checking, but isn't
terribly useful for debugging. This new method returns the actual
operation objects which are still alive.
This allows Python code like the following:
```
gc.collect()
live_ops = ir.Context.current._get_live_operation_objects()
for op in live_ops:
print(f"Warning: {op} is still live. Referrers:")
for referrer in gc.get_referrers(op)[0]:
print(f" {referrer}")
```
Following the discussion in
https://discourse.llvm.org/t/symboltable-and-symbol-parent-child-relationship/75446,
we should enforce that a symbol's immediate parent is a symbol table.
I changed some tests to pass the verification. In most cases, we can
wrap the func with a module, change the func to another op with regions
i.e. scf.if, or change the expected error message.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
1. C++ enum is set through enum class LevelType : uint_64.
2. C enum is set through typedef uint_64 level_type. It is due to the
limitations in Windows build: setting enum width to ui64 is not
supported in C.