1. Explicit value means the non-zero value in a sparse tensor. If
explicitVal is set, then all the non-zero values in the tensor have the
same explicit value. The default value Attribute() indicates that it is
not set.
2. Implicit value means the "zero" value in a sparse tensor. If
implicitVal is set, then the "zero" value in the tensor is equal to the
implicit value. For now, we only support `0` as the implicit value but
it could be extended in the future. The default value Attribute()
indicates that the implicit value is `0` (same type as the tensor
element type).
Example:
```
#CSR = #sparse_tensor.encoding<{
map = (d0, d1) -> (d0 : dense, d1 : compressed),
posWidth = 64,
crdWidth = 64,
explicitVal = 1 : i64,
implicitVal = 0 : i64
}>
```
Note: this PR tests that implicitVal could be set to other values as
well. The following PR will add verifier and reject any value that's not
zero for implicitVal.
If the python callback throws an error, the c++ code will throw a
py::error_already_set that needs to be caught and handled in the c++
code .
This change is inspired by the similar solution in
PySymbolTable::walkSymbolTables.
This commit adds `walk` method to PyOperationBase that uses a python
object as a callback, e.g. `op.walk(callback)`. Currently callback must
return a walk result explicitly.
We(SiFive) have implemented walk method with python in our internal
python tool for a while. However the overhead of python is expensive and
it didn't scale well for large MLIR files. Just replacing walk with this
version reduced the entire execution time of the tool by 30~40% and
there are a few configs that the tool takes several hours to finish so
this commit significantly improves tool performance.
Expose the API for constructing and inspecting StructTypes from the LLVM
dialect. Separate constructor methods are used instead of overloads for
better readability, similarly to IntegerType.
…ct LevelType from LevelFormat and properties instead.
**Rationale**
We used to explicitly declare every possible combination between
`LevelFormat` and `LevelProperties`, and it now becomes difficult to
scale as more properties/level formats are going to be introduced.
1. Add python test for n out of m
2. Add more methods for python binding
3. Add verification for n:m and invalid encoding tests
4. Add e2e test for n:m
Previous PRs for n:m #80501#79935
Currently, a method exists to get the count of the operation objects
which are still alive. This helps for sanity checking, but isn't
terribly useful for debugging. This new method returns the actual
operation objects which are still alive.
This allows Python code like the following:
```
gc.collect()
live_ops = ir.Context.current._get_live_operation_objects()
for op in live_ops:
print(f"Warning: {op} is still live. Referrers:")
for referrer in gc.get_referrers(op)[0]:
print(f" {referrer}")
```
1. C++ enum is set through enum class LevelType : uint_64.
2. C enum is set through typedef uint_64 level_type. It is due to the
limitations in Windows build: setting enum width to ui64 is not
supported in C.
This reverts commit bbc2976868.
This change seems to be at odds with the non-owning part semantics of
MlirOperation in C API. Since downstream clients can only take and
return MlirOperation, it does not sound correct to force all returns of
MlirOperation transfer ownership. Specifically, this makes it impossible
for downstreams to implement IR-traversing functions that, e.g., look at
neighbors of an operation.
The following patch triggers the exception, and there does not seem to
be an alternative way for a downstream binding writer to express this:
```
diff --git a/mlir/lib/Bindings/Python/IRCore.cpp b/mlir/lib/Bindings/Python/IRCore.cpp
index 39757dfad5be..2ce640674245 100644
--- a/mlir/lib/Bindings/Python/IRCore.cpp
+++ b/mlir/lib/Bindings/Python/IRCore.cpp
@@ -3071,6 +3071,11 @@ void mlir::python::populateIRCore(py::module &m) {
py::arg("successors") = py::none(), py::arg("regions") = 0,
py::arg("loc") = py::none(), py::arg("ip") = py::none(),
py::arg("infer_type") = false, kOperationCreateDocstring)
+ .def("_get_first_in_block", [](PyOperation &self) -> MlirOperation {
+ MlirBlock block = mlirOperationGetBlock(self.get());
+ MlirOperation first = mlirBlockGetFirstOperation(block);
+ return first;
+ })
.def_static(
"parse",
[](const std::string &sourceStr, const std::string &sourceName,
diff --git a/mlir/test/python/ir/operation.py b/mlir/test/python/ir/operation.py
index f59b1a26ba48..6b12b8da5c24 100644
--- a/mlir/test/python/ir/operation.py
+++ b/mlir/test/python/ir/operation.py
@@ -24,6 +24,25 @@ def expect_index_error(callback):
except IndexError:
pass
+@run
+def testCustomBind():
+ ctx = Context()
+ ctx.allow_unregistered_dialects = True
+ module = Module.parse(
+ r"""
+ func.func @f1(%arg0: i32) -> i32 {
+ %1 = "custom.addi"(%arg0, %arg0) : (i32, i32) -> i32
+ return %1 : i32
+ }
+ """,
+ ctx,
+ )
+ add = module.body.operations[0].regions[0].blocks[0].operations[0]
+ op = add.operation
+ # This will get a reference to itself.
+ f1 = op._get_first_in_block()
+
+
# Verify iterator based traversal of the op/region/block hierarchy.
# CHECK-LABEL: TEST: testTraverseOpRegionBlockIterators
```
This fixes a longstanding bug in the `Context._CAPICreate` method
whereby it was not taking ownership of the PyMlirContext wrapper when
casting to a Python object. The result was minimally that all such
contexts transferred in that way would leak. In addition, counter to the
documentation for the `_CAPICreate` helper (see
`mlir-c/Bindings/Python/Interop.h`) and the `forContext` /
`forOperation` methods, we were silently upgrading any unknown
context/operation pointer to steal-ownership semantics. This is
dangerous and was causing some subtle bugs downstream where this
facility is getting the most use.
This patch corrects the semantics and will only do an ownership transfer
for `_CAPICreate`, and it will further require that it is an ownership
transfer (if already transferred, it was just silently succeeding).
Removing the mis-aligned behavior made it clear where the downstream was
doing the wrong thing.
It also adds some `_testing_` functions to create unowned context and
operation capsules so that this can be fully tested upstream, reworking
the tests to verify the behavior.
In some torture testing downstream, I was not able to trigger any memory
corruption with the newly enforced semantics. When getting it wrong, a
regular exception is raised.
This adds Python abstractions for the different handle types of the
transform dialect
The abstractions allow for straightforward chaining of transforms by
calling their member functions.
As an initial PR for this infrastructure, only a single transform is
included: `transform.structured.match`.
With a future `tile` transform abstraction an example of the usage is:
```Python
def script(module: OpHandle):
module.match_ops(MatchInterfaceEnum.TilingInterface).tile(tile_sizes=[32,32])
```
to generate the following IR:
```mlir
%0 = transform.structured.match interface{TilingInterface} in %arg0
%tiled_op, %loops = transform.structured.tile_using_for %0 [32, 32]
```
These abstractions are intended to enhance the usability and flexibility
of the transform dialect by providing an accessible interface that
allows for easy assembly of complex transformation chains.
The "Dim" prefix is a legacy left-over that no longer makes sense, since
we have a very strict "Dimension" vs. "Level" definition for sparse
tensor types and their storage.
This reverts commit 0d109035c2.
Changes make Python bindings unbuildable without additional cmake
modifications (or modified `$PATH`).
```
/llvm-project/mlir/lib/Bindings/Python/IRCore.cpp:33:10: fatal error: 'funcobject.h' file not found
```
This header is provided by cpython, and we are not looking for that in
cmake.
Moreover, the nature of this change is not very clear to me. Seems to
replace one include with two dozens, presumably because the code is only
using transitively included headers, but the value for readability is
dubious. LLVM is also not strictly following IWYU.
Enables reusing the AsmState when printing from Python. Also moves the
fileObject and binary to the end (pybind11::object was resulting in the
overload not working unless `state=` was specified).
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
The existing initialization sequence always enables multi-threading at
MLIRContext construction time, making it impractical to provide a
customized thread pool.
Here, this is changed to always create the context with threading
disabled, process all site-specific init hooks (which can set thread
pools) and ultimately enable multi-threading unless if site-configured
to not do so.
This should preserve the existing user-visible initialization behavior
while also letting downstreams ensure that contexts are always created
with a shared thread pool. This was tested with IREE, which has such a
concept. Using site-specific thread tuning produced up to 2x single
compilation job behavior and customization of batch compilation (i.e. as
part of a build system) to utilize half the memory and run the entire
test suite ~2x faster. Given this, I believe that the additional
configurability can well pay for itself for implementations that use it.
We may also want to present user-level Python APIs for controlling
threading configuration in the future.
This PR adds "value casting", i.e., a mechanism to wrap `ir.Value` in a
proxy class that overloads dunders such as `__add__`, `__sub__`, and
`__mul__` for fun and great profit.
This is thematically similar to
bfb1ba7526
and
9566ee2806.
The example in the test demonstrates the value of the feature (no pun
intended):
```python
@register_value_caster(F16Type.static_typeid)
@register_value_caster(F32Type.static_typeid)
@register_value_caster(F64Type.static_typeid)
@register_value_caster(IntegerType.static_typeid)
class ArithValue(Value):
__add__ = partialmethod(_binary_op, op="add")
__sub__ = partialmethod(_binary_op, op="sub")
__mul__ = partialmethod(_binary_op, op="mul")
a = arith.constant(value=FloatAttr.get(f16_t, 42.42))
b = a + a
# CHECK: ArithValue(%0 = arith.addf %cst, %cst : f16)
print(b)
a = arith.constant(value=FloatAttr.get(f32_t, 42.42))
b = a - a
# CHECK: ArithValue(%1 = arith.subf %cst_0, %cst_0 : f32)
print(b)
a = arith.constant(value=FloatAttr.get(f64_t, 42.42))
b = a * a
# CHECK: ArithValue(%2 = arith.mulf %cst_1, %cst_1 : f64)
print(b)
```
**EDIT**: this now goes through the bindings and thus supports automatic
casting of `OpResult` (including as an element of `OpResultList`),
`BlockArgument` (including as an element of `BlockArgumentList`), as
well as `Value`.
The scalable dimension functionality was added to the vector type after
the bindings for it were defined, without the bindings being ever
updated. Fix that.
<img
src="https://github.com/llvm/llvm-project/assets/5657668/443852b6-ac25-45bb-a38b-5dfbda09d5a7"
height="400" />
<p></p>
So turns out that none of the `replace=True` things actually work
because of the map caches (except for
`register_attribute_builder(replace=True)`, which doesn't use such a
cache). This was hidden by a series of unfortunate events:
1. `register_type_caster` failure was hidden because it was the same
`TestIntegerRankedTensorType` being replaced with itself (d'oh).
2. `register_operation` failure was hidden behind the "order of events"
in the lifecycle of typical extension import/use. Since extensions are
loaded/registered almost immediately after generated builders are
registered, there is no opportunity for the `operationClassMapCache` to
be populated (through e.g., `module.body.operations[2]` or
`module.body.operations[2].opview` or something). Of course as soon as
you as actually do "late-bind/late-register" the extension, you see it's
not successfully replacing the stale one in `operationClassMapCache`.
I'll take this opportunity to propose we ditch the caches all together.
I've been cargo-culting them but I really don't understand how they
work. There's this comment above `operationClassMapCache`
```cpp
/// Cache of operation name to external operation class object. This is
/// maintained on lookup as a shadow of operationClassMap in order for repeat
/// lookups of the classes to only incur the cost of one hashtable lookup.
llvm::StringMap<pybind11::object> operationClassMapCache;
```
But I don't understand how that's true given that the canonical thing
`operationClassMap` is already a map:
```cpp
/// Map of full operation name to external operation class object.
llvm::StringMap<pybind11::object> operationClassMap;
```
Maybe it wasn't always the case? Anyway things work now but it seems
like an unnecessary layer of complexity for not much gain? But maybe I'm
wrong.
Added missing register_translations in python to replicate the same in
the C-API
Cleaned up the current calls to register passes where the other calls
are already embedded in the mlirRegisterAllPasses.
found here,
https://discourse.llvm.org/t/opencl-example/74187
`PyOperations` are Python-level handles to `Operation *` instances. When
the latter are modified by C++, the former need to be invalidated.
#69746 implements such invalidation mechanism by setting all
`PyReferences` to `invalid`. However, that is not enough: they also need
to be removed from the `liveOperations` map since other parts of the
code (such as `PyOperation::createDetached`) assume that that map only
contains valid refs.
This is required to actually solve the issue in #69730.
Fixes https://github.com/llvm/llvm-project/issues/69730 (also see
https://reviews.llvm.org/D155543).
There are two things outstanding (why I didn't land before):
1. add some C API tests for `mlirOperationWalk`;
2. potentially refactor how the invalidation in `run` works; the first
version of the code looked like this:
```cpp
if (invalidateOps) {
auto *context = op.getOperation().getContext().get();
MlirOperationWalkCallback invalidatingCallback =
[](MlirOperation op, void *userData) {
PyMlirContext *context =
static_cast<PyMlirContext *>(userData);
context->setOperationInvalid(op);
};
auto numRegions =
mlirOperationGetNumRegions(op.getOperation().get());
for (int i = 0; i < numRegions; ++i) {
MlirRegion region =
mlirOperationGetRegion(op.getOperation().get(), i);
for (MlirBlock block = mlirRegionGetFirstBlock(region);
!mlirBlockIsNull(block);
block = mlirBlockGetNextInRegion(block))
for (MlirOperation childOp =
mlirBlockGetFirstOperation(block);
!mlirOperationIsNull(childOp);
childOp = mlirOperationGetNextInBlock(childOp))
mlirOperationWalk(childOp, invalidatingCallback, context,
MlirWalkPostOrder);
}
}
```
This is verbose and ugly but it has the important benefit of not
executing `mlirOperationEqual(rootOp->get(), op)` for every op
underneath the root op.
Supposing there's no desire for the slightly more efficient but highly
convoluted approach, I can land this "posthaste".
But, since we have eyes on this now, any suggestions or approaches (or
needs/concerns) are welcome.
This PR replaces the mixin `OpView` extension mechanism with the
standard inheritance mechanism.
Why? Firstly, mixins are not very pythonic (inheritance is usually used
for this), a little convoluted, and too "tight" (can only be used in the
immediately adjacent `_ext.py`). Secondly, it (mixins) are now blocking
are correct implementation of "value builders" (see
[here](https://github.com/llvm/llvm-project/pull/68764)) where the
problem becomes how to choose the correct base class that the value
builder should call.
This PR looks big/complicated but appearances are deceiving; 4 things
were needed to make this work:
1. Drop `skipDefaultBuilders` in
`OpPythonBindingGen::emitDefaultOpBuilders`
2. Former mixin extension classes are converted to inherit from the
generated `OpView` instead of being "mixins"
a. extension classes that simply were calling into an already generated
`super().__init__` continue to do so
b. (almost all) extension classes that were calling `self.build_generic`
because of a lack of default builder being generated can now also just
call `super().__init__`
3. To handle the [lone single
use-case](https://sourcegraph.com/search?q=context%3Aglobal+select_opview_mixin&patternType=standard&sm=1&groupBy=repo)
of `select_opview_mixin`, namely
[linalg](https://github.com/llvm/llvm-project/blob/main/mlir/python/mlir/dialects/_linalg_ops_ext.py#L38),
only a small change was necessary in `opdsl/lang/emitter.py` (thanks to
the emission/generation of default builders/`__init__`s)
4. since the `extend_opview_class` decorator is removed, we need a way
to register extension classes as the desired `OpView` that `op.opview`
conjures into existence; so we do the standard thing and just enable
replacing the existing registered `OpView` i.e.,
`register_operation(_Dialect, replace=True)`.
Note, the upgrade path for the common case is to change an extension to
inherit from the generated builder and decorate it with
`register_operation(_Dialect, replace=True)`. In the slightly more
complicated case where `super().__init(self.build_generic(...))` is
called in the extension's `__init__`, this needs to be updated to call
`__init__` in `OpView`, i.e., the grandparent (see updated docs).
Note, also `<DIALECT>_ext.py` files/modules will no longer be automatically loaded.
Note, the PR has 3 base commits that look funny but this was done for
the purpose of tracking the line history of moving the
`<DIALECT>_ops_ext.py` class into `<DIALECT>.py` and updating (commit
labeled "fix").
The reason I want this is that I am writing my own Python bindings and
would like to use the insertion point from
`PyThreadContextEntry::getDefaultInsertionPoint()` to call C++ functions
that take an `OpBuilder` (I don't need to expose it in Python but it
also seems appropriate). AFAICT, there is currently no way to translate
a `PyInsertionPoint` into an `OpBuilder` because the operation is
inaccessible.
Updates:
1. Infer lvlToDim from dimToLvl
2. Add more tests for block sparsity
3. Finish TODOs related to lvlToDim, including adding lvlToDim to python
binding
Verification of lvlToDim that user provides will be implemented in the
next PR.