This functionality has been replaced by TypeCasters (see D151840)
depends on D154468
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D154469
I've been trying to come up with a simple and clean implementation for
ReLU. TOSA uses `clamp`, which is probably the goal, but making that
efficient means table-gen work (attributes, lowering only `min` or `max`).
For now, `max` is a reasonable named op in its own right, beyond ReLU, so
we can start using it for tiling and fusion, and upon success, create a
more complete `clamp` op that doesn't need a whole tensor filled with
zeroes or ones to implement the different activation functions.
As with other named ops, we start by "requiring" that type casts,
broadcasts, and zero-filled constant tensors be handled by a more complex
pattern-matcher, and can slowly simplify with attributes or structured
matchers (e.g. PDL) in the future.
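For reference, here is a minimal sketch of how such an element-wise named
op can be declared in the OpDSL; this is an illustration, not the exact
upstream definition in core_named_ops.py:
```
# Sketch: an element-wise signed-max named op in the Linalg OpDSL.
# ReLU is then expressible as max(x, zero_filled_tensor) plus fusion.
from mlir.dialects.linalg.opdsl.lang import (
    linalg_structured_op, TensorDef, BinaryFn, TV)

T1 = TV.T1  # single type variable: all operands share one element type

@linalg_structured_op
def max(
    lhs=TensorDef(T1),
    rhs=TensorDef(T1),
    O=TensorDef(T1, output=True),
):
    """Takes the signed maximum of two inputs, elementwise."""
    O[None] = BinaryFn.max_signed(lhs[None], rhs[None])
```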
Differential Revision: https://reviews.llvm.org/D154703
Following binary arithmetic in previous commits, this patch adds unary
maths ops to linalg.
It also fixes a few of the previous tests, and makes the binary ops call
BinaryFn.<op> directly instead of relying on Python to recognise the
operation.
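As a sketch of what calling `UnaryFn.<op>` directly looks like in the
OpDSL (illustrative only; the exact upstream definitions may differ):
```
# Sketch: an element-wise exp named op built on UnaryFn.exp, with the
# same strict no-cast/no-broadcast semantics as the binary ops.
from mlir.dialects.linalg.opdsl.lang import (
    linalg_structured_op, TensorDef, UnaryFn, TV)

T1 = TV.T1

@linalg_structured_op
def exp(
    I=TensorDef(T1),
    O=TensorDef(T1, output=True),
):
    """Applies exp(x) elementwise."""
    O[None] = UnaryFn.exp(I[None])
```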
Differential Revision: https://reviews.llvm.org/D154618
Re-apply eda47fdd25 after implementing __truediv__ for TensorUse.
[MLIR][Linalg] Add more arith named ops to linalg
Following up on the 'add' named op, here are the remaining basic
arithmetic and maths ops, including a 'div_unsigned' for unsigned integer
values. In the same pattern as 'matmul_unsigned', the simply named 'div'
assumes signed values and the '_unsigned' variation handles the unsigned
values. It's a bit odd, but there doesn't seem to be an easy way in the
structured ops framework to restrict 'div_unsigned' to integer types only.
As with 'add', these have strict semantics regarding casts.
Unary math ops will need some massaging, so I've split them out for now
as I continue working on them.
Differential Revision: https://reviews.llvm.org/D154524
This adds the first strict element-wise named op to Linalg.
The semantics here do not allow auto-casts or broadcasts, and restrict
the operations to identical types. Any remaining semantics must come in
the form of surrounding operations on the operands, to avoid ambiguity.
Examples:
```
// Cast int-to-fp
%0 = linalg.copy ins(%in: tensor<32x32xi32>)
                 outs(%out: tensor<32x32xf32>)
%1 = linalg.add ins(%arg, %0: tensor<32x32xf32>, tensor<32x32xf32>)
                outs(%0: tensor<32x32xf32>)

// This can be lowered to
%1 = linalg.generic {...}
       ins(%arg, %in: tensor<32x32xf32>, tensor<32x32xi32>)
       outs(%0: tensor<32x32xf32>) {
  ^bb0(%a: f32, %i: i32, %out: f32):
    %f = arith.uitofp %i : i32 to f32
    %2 = arith.addf %a, %f : f32
    linalg.yield %2 : f32
}

// Broadcast
%0 = linalg.broadcast ins(%in: tensor<32xf32>)
                      init(%out: tensor<32x32xf32>)
%1 = linalg.add ins(%arg, %0: tensor<32x32xf32>, tensor<32x32xf32>)
                outs(%0: tensor<32x32xf32>)

// This can be lowered to
#bcast_map = affine_map<(d0, d1) -> (d0)>
%1 = linalg.generic {... #bcast_map] }
       ins(%arg, %in: tensor<32x32xf32>, tensor<32xf32>)
       outs(%0: tensor<32x32xf32>) {
  ^bb0(%a: f32, %b: f32, %out: f32):
    %2 = arith.addf %a, %b : f32
    linalg.yield %2 : f32
}
```
Once this gets accepted, other arithmetic and maths operations will be
added accordingly, with the same semantics.
Differential Revision: https://reviews.llvm.org/D154500
This change lifts the limitation that only the trailing dimensions/sizes
in dynamic index lists can be scalable. It allows us to extend
`MaskedVectorizeOp` and `TileOp` from the Transform dialect so that the
following is allowed:
%1, %loops:3 = transform.structured.tile %0 [4, [4], [4]]
This is also a follow up for https://reviews.llvm.org/D153372
that will enable the following (middle vector dimension is scalable):
transform.structured.masked_vectorize %0 vector_sizes [2, [4], 8]
To facilitate this change, the hooks for parsing and printing dynamic
index lists (`parseDynamicIndexList` and `printDynamicIndexList`,
respectively) are updated accordingly. `MaskedVectorizeOp` and `TileOp`
are updated to include an array attribute of bools that captures whether
the corresponding vector dimension or tile size, respectively, is
scalable or not.
NOTE 1: I am re-landing this after the initial version was reverted. To
fix the regression, and in addition to the original patch, this revision
also updates the Python bindings for the transform dialect.
NOTE 2: This change is a part of a larger effort to enable scalable
vectorisation in Linalg. See this RFC for more context:
* https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/
This relands 048764f23a with fixes.
Differential Revision: https://reviews.llvm.org/D154336
* Rename op to `transform.get_parent_op`
* Match parents by "is isolated from above" and/or op name, or just the direct parent.
* Deduplication of result payload ops is optional.
Differential Revision: https://reviews.llvm.org/D154071
"transform.structured.pad" now returns all `tensor::PadOp` in addition to the padded ops.
Also add a test case that shows how to force an allocation for "tensor.pad" ops with a custom memory space.
Differential Revision: https://reviews.llvm.org/D153555
Matmul with a transposed LHS operand allows better memory access
patterns on several architectures, including common GPUs. Having a named
op for it allows handling this kind of matmul in a more explicit way.
D141430 added the generated yaml file for (batch_)?matmul_transpose_b ops, but the source of truth, core_named_ops.py, was not updated.
This change fixes the .py file to generate the same result as the yaml file.
Differential Revision: https://reviews.llvm.org/D150059
Authored-by: kon72 <kinsei0916@gmail.com>
depends on D150839
This diff uses `MlirTypeID` to register `TypeCaster`s (i.e., `[](PyType pyType) -> DerivedTy { return pyType; }`) for all concrete types (i.e., `PyConcrete<...>`) that are then queried for (by `MlirTypeID`) and called in `struct type_caster<MlirType>::cast`. The result is that anywhere an `MlirType mlirType` is returned from a python binding, that `mlirType` is automatically cast to the correct concrete type. For example:
```
c0 = arith.ConstantOp(f32, 0.0)
# CHECK: F32Type(f32)
print(repr(c0.result.type))
unranked_tensor_type = UnrankedTensorType.get(f32)
unranked_tensor = tensor.FromElementsOp(unranked_tensor_type, [c0]).result
# CHECK: UnrankedTensorType
print(type(unranked_tensor.type).__name__)
# CHECK: UnrankedTensorType(tensor<*xf32>)
print(repr(unranked_tensor.type))
```
This functionality immediately extends to typed attributes (i.e., `attr.type`).
The diff also implements similar functionality for `mlir_type_subclass`es, but in a slightly different way - for such types (which have no corresponding cpp `class` or `struct`), the user must provide a type caster in python (similar to how `AttrBuilder` works) or in cpp as a `py::cpp_function`.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D150927
This is an ongoing series of commits that are reformatting our
Python code.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run `git checkout --ours <yourfile>` and then reformat it
with `black`.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Differential Revision: https://reviews.llvm.org/D150782
Currently blocks are always created with UnknownLocs for their arguments. This
adds an `arg_locs` argument to all block creation APIs, which takes an optional
sequence of locations to use, one per block argument. If no locations are
supplied, the current Location context is used.
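A minimal sketch of using this from Python, assuming the `arg_locs`
keyword described above (function name and file locations are made up
for illustration):
```
from mlir.ir import (Block, Context, F32Type, FunctionType, InsertionPoint,
                     Location, Module)
from mlir.dialects import func

with Context(), Location.unknown():
    module = Module.create()
    f32 = F32Type.get()
    with InsertionPoint(module.body):
        f = func.FuncOp("example", FunctionType.get([f32, f32], []))
        # One location per block argument; omitting arg_locs falls back to
        # the current Location context instead of UnknownLoc.
        arg_locs = [Location.file("in.py", 1, 1), Location.file("in.py", 2, 1)]
        entry = Block.create_at_start(f.regions[0], [f32, f32],
                                      arg_locs=arg_locs)
        with InsertionPoint(entry):
            func.ReturnOp([])
```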
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D150084
The initial bring-up of the Transform dialect relied on PDL to provide
the default handle type (`!pdl.operation`) and the matching capability.
Both are now provided natively by the Transform dialect removing the
reason to have a hard dependency on the PDL dialect and its interpreter.
Move PDL-related transform operations into a separate extension.
This requires us to introduce a dialect state extension mechanism into
the Transform dialect so that it no longer needs to know about PDL
constraint functions, which may be injected by extensions similarly to
operations and types. This mechanism will be reused to connect pattern
application drivers and the Transform dialect.
This completes the restructuring of the Transform dialect to remove the
overreliance on PDL.
Note to downstreams: flows that are using `!pdl.operation` with Transform
dialect operations will now require `transform::PDLExtension` to be
applied to the transform dialect in order to provide the transform
handle type interface for `!pdl.operation`.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D151104
2.9.0 was released on December 28, 2021, and some follow-up changes
require at least this version.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D150247
This diff adds python bindings for `MlirTypeID`. It paves the way for returning accurately typed `Type`s from python APIs (see D150927) and, further along, for building type-"conscious" `Value` APIs (see D150413).
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D150839
Add more attribute builders, such as "F32Attr", "F64Attr" and "F64ArrayAttr", which are useful for creating operations through the python bindings. For example, tosa.clamp in _tosa_ops_gen.py needs 'F32Attr'.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D150757
This change adds the following three operations and unit tests for them:
- conv_3d_ncdhw_fcdhw
- depthwise_conv_1d_ncw_cw
- depthwise_conv_3d_ncdhw_cdhw
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D150054
Add C and python bindings for InferShapedTypeOpInterface
and ShapedTypeComponents. This allows users to invoke
InferShapedTypeOpInterface for ops that implement it.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D149494
Outlining is particularly interesting when the outlined function is
replaced with something else, e.g., a microkernel. It is good to have a
handle to the call in this case.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D149849
X. Sun et al. (https://dl.acm.org/doi/10.5555/3454287.3454728) published
a paper showing that an FP format with 4 bits of exponent, 3 bits of
significand and an exponent bias of 11 would work quite well for ML
applications.
Google hardware supports a variant of this format where 0x80 is used to
represent NaN, as in the Float8E4M3FNUZ format. Just like the
Float8E4M3FNUZ format, this format does not support -0 and values which
would map to it will become +0.
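For reference, derived from the parameters above (1 sign bit s, 4 exponent
bits e with bias 11, 3 significand bits m) and assuming all non-NaN
encodings are finite as in the other FNUZ formats, values decode in the
usual way:
```
% Normal numbers (1 <= e <= 15):
x = (-1)^s \cdot 2^{e - 11} \cdot \left(1 + \frac{m}{8}\right)
% Subnormal numbers (e = 0):
x = (-1)^s \cdot 2^{-10} \cdot \frac{m}{8}
% The bit pattern 0x80 encodes NaN, there is no -0, and the largest
% finite value is 2^{4} \cdot (1 + 7/8) = 30.
```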
This format is proposed for inclusion in OpenXLA's StableHLO dialect: https://github.com/openxla/stablehlo/pull/1308
As part of inclusion in that dialect, APFloat needs to know how to
handle this format.
Differential Revision: https://reviews.llvm.org/D146441
This updates most (all?) error-diagnostic-emitting python APIs to
capture error diagnostics and include them in the raised exception's
message:
```
>>> Operation.parse('"arith.addi"() : () -> ()')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
mlir._mlir_libs.MLIRError: Unable to parse operation assembly:
error: "-":1:1: 'arith.addi' op requires one result
note: "-":1:1: see current operation: "arith.addi"() : () -> ()
```
The diagnostic information is available on the exception for users who
may want to customize the error message:
```
>>> try:
... Operation.parse('"arith.addi"() : () -> ()')
... except MLIRError as e:
... print(e.message)
... print(e.error_diagnostics)
... print(e.error_diagnostics[0].message)
...
Unable to parse operation assembly
[<mlir._mlir_libs._mlir.ir.DiagnosticInfo object at 0x7fed32bd6b70>]
'arith.addi' op requires one result
```
Error diagnostics captured in exceptions aren't propagated to diagnostic
handlers, to avoid double-reporting of errors. The context-level
`emit_error_diagnostics` option can be used to revert to the old
behaviour, causing error diagnostics to be reported to handlers instead
of as part of exceptions.
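A minimal sketch of opting back into handler-based reporting, assuming
the context-level option described above:
```
from mlir.ir import Context, Operation

with Context() as ctx:
    # Report error diagnostics to attached handlers instead of embedding
    # them in raised exceptions.
    ctx.emit_error_diagnostics = True

    def callback(diag):
        print("handler saw:", diag.message)
        return True  # mark the diagnostic as handled

    handler = ctx.attach_diagnostic_handler(callback)
    try:
        Operation.parse('"arith.addi"() : () -> ()')
    except Exception:
        # The parse failure still raises, but its diagnostics were
        # delivered to the handler above rather than attached to the
        # exception message.
        pass
    handler.detach()
```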
API changes:
- `Operation.verify` now raises an exception on verification failure,
instead of returning `false`
- The exception raised by the following methods has been changed to
`MLIRError`:
- `PassManager.run`
- `{Module,Operation,Type,Attribute}.parse`
- `{RankedTensorType,UnrankedTensorType}.get`
- `{MemRefType,UnrankedMemRefType}.get`
- `VectorType.get`
- `FloatAttr.get`
closes #60595
depends on D144804, D143830
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D143869
The raw `OpView` classes are used to bypass the constructors of `OpView`
subclasses, but having a separate class can create some confusing
behaviour, e.g.:
```
op = MyOp(...)
# fails, lhs is 'MyOp', rhs is '_MyOp'
assert type(op) == type(op.operation.opview)
```
Instead we can use `__new__` to achieve the same thing without a
separate class:
```
my_op = MyOp.__new__(MyOp)
OpView.__init__(my_op, op)
```
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D143830
Float8E5M2FNUZ and Float8E4M3FNUZ have been added to APFloat in D141863.
This change adds these types as MLIR builtin types alongside Float8E5M2
and Float8E4M3FN (added in D133823 and D138075).
Reviewed By: krzysz00
Differential Revision: https://reviews.llvm.org/D143744
Some Ubuntu 20.04 images come with PyYAML 5.3.1 pre-installed through distutils. This makes pip very angry. See https://github.com/yaml/pyyaml/issues/349.
Since older versions of PyYAML should work for mlir, relax the version requirement to ease developer setup.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D143523
Previously we only allowed a flattened list to be passed in, but the same
input can be provided here as to buildGeneric, so flatten accordingly. We
have less info here than in buildGeneric, so the error is more generic if
unpacking fails.
Differential Revision: https://reviews.llvm.org/D143240
This change pins the requirements to a specific version (or a range), depending on the requirement. A couple of considerations:
* numpy 1.24 deprecates np.object, np.bool, np.float, np.complex, np.str, and np.int which are used heavily in onnx-mlir
* not all versions of each package are available on every platform - to the best of my knowledge, these ranges should work on Ubuntu, CentOS and Windows
Adding a minimum and maximum version, or pinning to a specific version where possible, helps with two major goals - security and maintainability. It gives us an opportunity to make sure that the packages being used are not part of a security attack, as well as guaranteeing that they support the features that mlir depends on (see the note about the numpy deprecation).
Let me know if you are aware of better versions or ranges to pin to.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D142563
`applyTransforms` now takes an optional mapping to be associated with
trailing block arguments of the top-level transform op, in addition to
the payload root. This allows for more advanced forms of communication
between C++ code and the transform dialect interpreter, in particular
supplying operations without having to re-match them during
interpretation.
Reviewed By: shabalin
Differential Revision: https://reviews.llvm.org/D142559
Use the recently introduced transform dialect parameter mechanism to
perform controllable multi-size tiling with sizes computed at the
transformation time rather than at runtime.
This requires generalizing the tile and split structured transform
operations to work with any transform dialect handle types, which is
desirable in itself to avoid unchecked overuse of PDL OperationType.
Reviewed By: shabalin
Differential Revision: https://reviews.llvm.org/D140980
Conv3D has an existing linalg operation for floating point. This adds a quantized
variant and a corresponding lowering from TOSA. Numerical correctness was validated
using the TOSA conformance tests.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D140919