This is a companion to #118583, although it can be landed independently
because since #117922 dialects do not have to use the same Python
binding framework as the Python core code.
This PR ports all of the in-tree dialect and pass extensions to
nanobind, with the exception of those that remain for testing pybind11
support.
This PR also:
* removes CollectDiagnosticsToStringScope from NanobindAdaptors.h. This
was overlooked in a previous PR and it is duplicated in Diagnostics.h.
---------
Co-authored-by: Jacques Pienaar <jpienaar@google.com>
Relands #118583, with a fix for Python 3.8 compatibility. It was not
possible to set the buffer protocol accessers via slots in Python 3.8.
Why? https://nanobind.readthedocs.io/en/latest/why.html says it better
than I can, but my primary motivation for this change is to improve MLIR
IR construction time from JAX.
For a complicated Google-internal LLM model in JAX, this change improves
the MLIR
lowering time by around 5s (out of around 30s), which is a significant
speedup for simply switching binding frameworks.
To a large extent, this is a mechanical change, for instance changing
`pybind11::` to `nanobind::`.
Notes:
* this PR needs Nanobind 2.4.0, because it needs a bug fix
(https://github.com/wjakob/nanobind/pull/806) that landed in that
release.
* this PR does not port the in-tree dialect extension modules. They can
be ported in a future PR.
* I removed the py::sibling() annotations from def_static and def_class
in `PybindAdapters.h`. These ask pybind11 to try to form an overload
with an existing method, but it's not possible to form mixed
pybind11/nanobind overloads this ways and the parent class is now
defined in nanobind. Better solutions may be possible here.
* nanobind does not contain an exact equivalent of pybind11's buffer
protocol support. It was not hard to add a nanobind implementation of a
similar API.
* nanobind is pickier about casting to std::vector<bool>, expecting that
the input is a sequence of bool types, not truthy values. In a couple of
places I added code to support truthy values during casting.
* nanobind distinguishes bytes (`nb::bytes`) from strings (e.g.,
`std::string`). This required nb::bytes overloads in a few places.
Why? https://nanobind.readthedocs.io/en/latest/why.html says it better
than I can, but my primary motivation for this change is to improve MLIR
IR construction time from JAX.
For a complicated Google-internal LLM model in JAX, this change improves
the MLIR
lowering time by around 5s (out of around 30s), which is a significant
speedup for simply switching binding frameworks.
To a large extent, this is a mechanical change, for instance changing
`pybind11::`
to `nanobind::`.
Notes:
* this PR needs Nanobind 2.4.0, because it needs a bug fix
(https://github.com/wjakob/nanobind/pull/806) that landed in that
release.
* this PR does not port the in-tree dialect extension modules. They can
be ported in a future PR.
* I removed the py::sibling() annotations from def_static and def_class
in `PybindAdapters.h`. These ask pybind11 to try to form an overload
with an existing method, but it's not possible to form mixed
pybind11/nanobind overloads this ways and the parent class is now
defined in nanobind. Better solutions may be possible here.
* nanobind does not contain an exact equivalent of pybind11's buffer
protocol support. It was not hard to add a nanobind implementation of a
similar API.
* nanobind is pickier about casting to std::vector<bool>, expecting that
the input is a sequence of bool types, not truthy values. In a couple of
places I added code to support truthy values during casting.
* nanobind distinguishes bytes (`nb::bytes`) from strings (e.g.,
`std::string`). This required nb::bytes overloads in a few places.
This PR allows out-of-tree dialects to write Python dialect modules
using nanobind instead of pybind11.
It may make sense to migrate in-tree dialects and some of the ODS Python
infrastructure to nanobind, but that is a topic for a future change.
This PR makes the following changes:
* adds nanobind to the CMake and Bazel build systems. We also add
robin_map to the Bazel build, which is a dependency of nanobind.
* adds a PYTHON_BINDING_LIBRARY option to various CMake functions, such
as declare_mlir_python_extension, allowing users to select a Python
binding library.
* creates a fork of mlir/include/mlir/Bindings/Python/PybindAdaptors.h
named NanobindAdaptors.h. This plays the same role, using nanobind
instead of pybind11.
* splits CollectDiagnosticsToStringScope out of PybindAdaptors.h and
into a new header mlir/include/mlir/Bindings/Python/Diagnostics.h, since
it is code that is no way related to pybind11 or for that matter,
Python.
* changed the standalone Python extension example to have both pybind11
and nanobind variants.
* changed mlir/python/mlir/dialects/python_test.py to have both pybind11
and nanobind variants.
Notes:
* A slightly unfortunate thing that I needed to do in the CMake
integration was to use FindPython in addition to FindPython3, since
nanobind's CMake integration expects the Python_ names for variables.
Perhaps there's a better way to do this.
Following a rather direct approach to expose PDL usage from C and then
Python. This doesn't yes plumb through adding support for custom
matchers through this interface, so constrained to basics initially.
This also exposes greedy rewrite driver. Only way currently to define
patterns is via PDL (just to keep small). The creation of the PDL
pattern module could be improved to avoid folks potentially accessing
the module used to construct it post construction. No ergonomic work
done yet.
---------
Signed-off-by: Jacques Pienaar <jpienaar@google.com>
The Python bindings generated for "async" dialect didn't include any of
the "async" dialect ops. This PR fixes issues with generation of Python
bindings for "async" dialect and adds a test case to use them.
Expose the API for constructing and inspecting StructTypes from the LLVM
dialect. Separate constructor methods are used instead of overloads for
better readability, similarly to IntegerType.
This adds Python abstractions for the different handle types of the
transform dialect
The abstractions allow for straightforward chaining of transforms by
calling their member functions.
As an initial PR for this infrastructure, only a single transform is
included: `transform.structured.match`.
With a future `tile` transform abstraction an example of the usage is:
```Python
def script(module: OpHandle):
module.match_ops(MatchInterfaceEnum.TilingInterface).tile(tile_sizes=[32,32])
```
to generate the following IR:
```mlir
%0 = transform.structured.match interface{TilingInterface} in %arg0
%tiled_op, %loops = transform.structured.tile_using_for %0 [32, 32]
```
These abstractions are intended to enhance the usability and flexibility
of the transform dialect by providing an accessible interface that
allows for easy assembly of complex transformation chains.
This PR replaces the mixin `OpView` extension mechanism with the
standard inheritance mechanism.
Why? Firstly, mixins are not very pythonic (inheritance is usually used
for this), a little convoluted, and too "tight" (can only be used in the
immediately adjacent `_ext.py`). Secondly, it (mixins) are now blocking
are correct implementation of "value builders" (see
[here](https://github.com/llvm/llvm-project/pull/68764)) where the
problem becomes how to choose the correct base class that the value
builder should call.
This PR looks big/complicated but appearances are deceiving; 4 things
were needed to make this work:
1. Drop `skipDefaultBuilders` in
`OpPythonBindingGen::emitDefaultOpBuilders`
2. Former mixin extension classes are converted to inherit from the
generated `OpView` instead of being "mixins"
a. extension classes that simply were calling into an already generated
`super().__init__` continue to do so
b. (almost all) extension classes that were calling `self.build_generic`
because of a lack of default builder being generated can now also just
call `super().__init__`
3. To handle the [lone single
use-case](https://sourcegraph.com/search?q=context%3Aglobal+select_opview_mixin&patternType=standard&sm=1&groupBy=repo)
of `select_opview_mixin`, namely
[linalg](https://github.com/llvm/llvm-project/blob/main/mlir/python/mlir/dialects/_linalg_ops_ext.py#L38),
only a small change was necessary in `opdsl/lang/emitter.py` (thanks to
the emission/generation of default builders/`__init__`s)
4. since the `extend_opview_class` decorator is removed, we need a way
to register extension classes as the desired `OpView` that `op.opview`
conjures into existence; so we do the standard thing and just enable
replacing the existing registered `OpView` i.e.,
`register_operation(_Dialect, replace=True)`.
Note, the upgrade path for the common case is to change an extension to
inherit from the generated builder and decorate it with
`register_operation(_Dialect, replace=True)`. In the slightly more
complicated case where `super().__init(self.build_generic(...))` is
called in the extension's `__init__`, this needs to be updated to call
`__init__` in `OpView`, i.e., the grandparent (see updated docs).
Note, also `<DIALECT>_ext.py` files/modules will no longer be automatically loaded.
Note, the PR has 3 base commits that look funny but this was done for
the purpose of tracking the line history of moving the
`<DIALECT>_ops_ext.py` class into `<DIALECT>.py` and updating (commit
labeled "fix").
This PR creates the necessary files to support bindings for operations
in the affine dialect.
This is the first of many PRs which will progressively introduce
affine.load, affine.for, etc operations. I would like to
acknowledge the work by Nelli's author @makslevental :
https://github.com/makslevental/nelli/blob/main/nelli/mlir/affine/affine.py
which jump-starts the work.
Don't generate enums from the main VectorOps.td file as that
transitively includes enums from Arith.
---------
Co-authored-by: Nicolas Vasilache <ntv@google.com>
This PR implements python enum bindings for *all* the enums - this includes `I*Attrs` (including positional/bit) and `Dialect/EnumAttr`.
There are a few parts to this:
1. CMake: a small addition to `declare_mlir_dialect_python_bindings` and `declare_mlir_dialect_extension_python_bindings` to generate the enum, a boolean arg `GEN_ENUM_BINDINGS` to make it opt-in (even though it works for basically all of the dialects), and an optional `GEN_ENUM_BINDINGS_TD_FILE` for handling corner cases.
2. EnumPythonBindingGen.cpp: there are two weedy aspects here that took investigation:
1. If an enum attribute is not a `Dialect/EnumAttr` then the `EnumAttrInfo` record is canonical, as far as both the cases of the enum **and the `AttrDefName`**. On the otherhand, if an enum is a `Dialect/EnumAttr` then the `EnumAttr` record has the correct `AttrDefName` ("load bearing", i.e., populates `ods.ir.AttributeBuilder('<NAME>')`) but its `enum` field contains the cases, which is an instance of `EnumAttrInfo`. The solution is to generate an one enum class for both `Dialect/EnumAttr` and "independent" `EnumAttrInfo` but to make that class interopable with two builder registrations that both do the right thing (see next sub-bullet).
2. Because we don't have a good connection to cpp `EnumAttr`, i.e., only the `enum class` getters are exposed (like `DimensionAttr::get(Dimension value)`), we have to resort to parsing e.g., `Attribute.parse(f'#gpu<dim {x}>')`. This means that the set of supported `assemblyFormat`s (for the enum) is fixed at compile of MLIR (currently 2, the only 2 I saw). There might be some things that could be done here but they would require quite a bit more C API work to support generically (e.g., casting ints to enum cases and binding all the getters or going generically through the `symbolize*` methods, like `symbolizeDimension(uint32_t)` or `symbolizeDimension(StringRef)`).
A few small changes:
1. In addition, since this patch registers default builders for attributes where people might've had their own builders already written, I added a `replace` param to `AttributeBuilder.insert` (`False` by default).
2. `makePythonEnumCaseName` can't handle all the different ways in which people write their enum cases, e.g., `llvm.CConv.Intel_OCL_BI`, which gets turned into `INTEL_O_C_L_B_I` (because `llvm::convertToSnakeFromCamelCase` doesn't look for runs of caps). So I dropped it. On the otherhand regularization does need to done because some enums have `None` as a case (and others might have other python keywords).
3. I turned on `llvm` dialect generation here in order to test `nvvm.WGMMAScaleIn`, which is an enum with [[ d7e26b5620/mlir/include/mlir/IR/EnumAttr.td (L22-L25) | no explicit discriminator ]] for the `neg` case.
Note, dialects that didn't get a `GEN_ENUM_BINDINGS` don't have any enums to generate.
Let me know if I should add more tests (the three trivial ones I added exercise both the supported `assemblyFormat`s and `replace=True`).
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D157934
This patch adds a mix-in class for the only transform op of the tensor
dialect that can benefit from one: the MakeLoopIndependentOp. It adds an
overload that makes providing the return type optional.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D156918
This patch uses the new enum binding generation to add the enums of the
dialect to the Python bindings and uses them in the mix-in class where
it was still missing (namely, the `LayoutMapOption` for the
`function_boundary_type_conversion` of the `OneShotBufferizeOp`.
The patch also piggy-backs a few smaller clean-ups:
* Order the keyword-only arguments alphabetically.
* Add the keyword-only arguments to an overload where they were left out
by accident.
* Change some of the attribute values used in the tests to non-default
values such that they show up in the output IR and check for that
output.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D156664
Create a mix-in class with an overloaded constructor that makes the
return type optional.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D156561
Provide Python bindings for transform ops defined in the vector dialect.
All of these ops are sufficiently simple that no mixins are necessary
for them to be nicely usable.
Reviewed By: ingomueller-net
Differential Revision: https://reviews.llvm.org/D156554
Add an ODS (tablegen) backend to generate Python enum classes and
attribute builders for enum attributes defined in ODS. This will allow
us to keep the enum attribute definitions in sync between C++ and
Python, as opposed to handwritten enum classes in Python that may end up
using mismatching values. This also makes autogenerated bindings more
convenient even in absence of mixins.
Use this backend for the transform dialect failure propagation mode enum
attribute as demonstration.
Reviewed By: ingomueller-net
Differential Revision: https://reviews.llvm.org/D156553
This patch creates the .td files for the Python bindings of the
transform ops of the MemRef dialect and integrates them into the build
systems (CMake and Bazel).
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D156536
This patch adds mix-in classes for the Python bindings of
`EmptyTensorToAllocTensorOp` and `OneShotBufferizeOp`. For both classes,
the mix-in add overloads to the `__init__` functions that allow to
construct them without providing the return type, which is defaulted to
the only allowed type and `AnyOpType`, respectively.
Note that the mix-in do not expose the
`function_boundary_type_conversion` attribute. The attribute has a
custom type from the bufferization dialect that is currently not exposed
in the Python bindings. Handling of that attribute can be added easily
to the mix-in class when the need arises.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D155799
This patch adds a mix-in class for MapForallToBlocks with overloaded
constructors. This makes it optional to provide the return type of the
op, which is defaulte to `AnyOpType`.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D155717
The initial bring-up of the Transform dialect relied on PDL to provide
the default handle type (`!pdl.operation`) and the matching capability.
Both are now provided natively by the Transform dialect removing the
reason to have a hard dependency on the PDL dialect and its interpreter.
Move PDL-related transform operations into a separate extension.
This requires us to introduce a dialect state extension mechanism into
the Transform dialect so it no longer needs to know about PDL constraint
functions that may be injected by extensions similarly to operations and
types. This mechanism will be reused to connect pattern application
drivers and the Transform dialect.
This completes the restructuring of the Transform dialect to remove
overrilance on PDL.
Note to downstreams: flow that are using `!pdl.operation` with Transform
dialect operations will now require `transform::PDLExtension` to be
applied to the transform dialect in order to provide the transform
handle type interface for `!pdl.operation`.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D151104
Add a new OperationType handle type to the Transform dialect. This
transform type is parameterized by the name of the payload operation it
can point to. It is intended as a constraint on transformations that are
only applicable to a specific kind of payload operations. If a
transformation is applicable to a small set of operation classes, it can
be wrapped into a transform op by using a disjunctive constraint, such
as `Type<Or<[Transform_ConcreteOperation<"foo">.predicate,
Transform_ConcreteOperation<"bar">.predicate]>>` for its operand without
modifying this type. Broader sets of accepted operations should be
modeled as specific types.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D135586
tensor.empty/linalg.init_tensor produces an uninititalized tensor that can be used as a destination operand for destination-style ops (ops that implement `DestinationStyleOpInterface`).
This change makes it possible to implement `TilingInterface` for non-destination-style ops without depending on the Linalg dialect.
RFC: https://discourse.llvm.org/t/rfc-add-tensor-from-shape-operation/65101
Differential Revision: https://reviews.llvm.org/D135129
Using if (TARGET ${LLVM_NATIVE_ARCH}) only works if MLIR is built
together with LLVM, but not for standalone builds of MLIR. The
correct way to check this is
if (${LLVM_NATIVE_ARCH} IN_LIST LLVM_TARGETS_TO_BUILD), as the
LLVM build system exports LLVM_TARGETS_TO_BUILD.
To avoid repeating the same check many times, add a
MLIR_ENABLE_EXECUTION_ENGINE variable.
Differential Revision: https://reviews.llvm.org/D131071