Commit Graph

284 Commits

Author SHA1 Message Date
annuasd
47ef5c4b7f [mlir][Bindings] Fix missing return value of functions and incorrect type hint in pyi. (#116731)
The zero points of UniformQuantizedPerAxisType should be List[int].
And there are two methods missing return value.

Co-authored-by: 牛奕博 <niuyibo@niuyibodeMacBook-Pro.local>
2024-11-19 15:24:39 -06:00
Krzysztof Drewniak
31aa7f34e0 [mlir][Affine] Let affine.[de]linearize_index omit outer bounds (#116103)
The affine.delinearize_index and affine.linearize_index operations, as
currently defined, require providing a length N basis to [de]linearize N
values. The first value in this basis is never used during lowering and
is unused during lowering. (Note that, even though it isn't used during
lowering it can still be used to, for example, remove length-1 outputs
from a delinearize).

This dead value makes sense in the original context of these operations,
which is linearizing or de-linearizing indexes to memref<>s, vector<>s,
and other shaped types, where that outer bound is avaliable and may be
useful for analysis.

However, other usecases exist where the outer bound is not known. For
example:

    %thread_id_x = gpu.thread_id x : index
%0:3 = affine.delinearize_index %thread_id_x into (4, 16) : index,index,
index

In this code, we don't know the upper bound of the thread ID, but we do
want to construct the ?x4x16 grid of delinearized values in order to
further partition the GPU threads.

In order to support such usecases, we broaden the definition of
affine.delinearize_index and affine.linearize_index to make the outer
bound optional.

In the case of affine.delinearize_index, where the number of results is
a function of the size of the passed-in basis, we augment all existing
builders with a `hasOuterBound` argument, which, for backwards
compatibilty and to preserve the natural usage of the op, defaults to
`true`. If this flag is true, the op returns one result per basis
element, if it is false, it returns one extra result in position 0.

We also update existing canonicalization patterns (and move one of them
into the folder) to handle these cases. Note that disagreements about
the outer bound now no longer prevent delinearize/linearize
cancelations.
2024-11-18 15:41:54 -06:00
Jinyun (Joey) Ye
618f231a6d [MLIR][Transform] Consolidate result of structured.split into one list (#111171)
Follow-up a review comment from
https://github.com/llvm/llvm-project/pull/82792#discussion_r1604925239
as a separate PR:

	E.g.:
	```
	%0:2 = transform.structured.split
	```
	is changed to
	```
	%t = transform.structured.split
	%0:2 = transform.split_handle %t
	```
2024-11-15 10:53:34 +08:00
Amy Wang
d50fbe43c9 [MLIR][Python] Python binding support for AffineIfOp (#108323)
Fix the AffineIfOp's default builder such that it takes in an
IntegerSetAttr. AffineIfOp has skipDefaultBuilders=1 which effectively
skips the creation of the default AffineIfOp::builder on the C++ side.
(AffineIfOp has two custom OpBuilder defined in the
extraClassDeclaration.) However, on the python side, _affine_ops_gen.py
shows that the default builder is being created, but it does not accept
IntegerSet and thus is useless. This fix at line 411 makes the default
python AffineIfOp builder take in an IntegerSet input and does not
impact the C++ side of things.
2024-11-13 16:27:46 -05:00
Md Asghar Ahmad Shahid
3ad0148020 [MLIR][Linalg] Re-land linalg.matmul move to ODS. + Remove/update failing obsolete OpDSL tests. (#115319)
The earlier PR(https://github.com/llvm/llvm-project/pull/104783) which
introduces
transpose and broadcast semantic to linalg.matmul was reverted due to
two failing
OpDSL test for linalg.matmul.

Since linalg.matmul is now defined using TableGen ODS instead of
Python-based OpDSL,
these test started failing and needs to be removed/updated.

This commit removes/updates the failing obsolete tests from below files.
All other files
were part of earlier PR and just cherry picked.
    "mlir/test/python/integration/dialects/linalg/opsrun.py"
    "mlir/test/python/integration/dialects/transform.py"

---------

Co-authored-by: Renato Golin <rengolin@systemcall.eu>
2024-11-07 14:51:02 +00:00
Krzysztof Drewniak
704808c275 [mlir][affine] Add static basis support to affine.delinearize (#113846)
This commit makes `affine.delinealize` join other indexing operators,
like `vector.extract`, which store a mixed static/dynamic set of sizes,
offsets, or such. In this case, the `basis` (the set of values that will
be used to decompose the linear index) is now stored as an array of
index attributes where the basis is statically known, eliminating the
need to cretae constants.

This commit also adds copies of the delinearize utility in the affine
dialect to allow it to take an array of `OpFoldResult`s and extends te
DynamicIndexList parser/printer to allow specifying the delimiters in
tablegen (this is needed to avoid breaking existing syntax).

---------

Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
2024-11-04 14:59:13 -06:00
Andrzej Warzyński
a758bcdbd9 [mlir][td] Rename pack_paddings in structured.pad (#111036)
The pack_paddings attribute in the structure.pad TD Op is used to set
the `nofold` attribute in the generated tensor.pad Op. The current name
is confusing and suggests that there's a relation with the tensor.pack
Op. This patch renames it as `nofold_flags` to better match the actual
usage.
2024-10-15 19:24:43 +01:00
Emilio Cota
1276ce9e97 Revert "[mlir][linalg] Introduce transpose semantic to 'linalg.matmul' ops. (#104783)"
This reverts commit 03483737a7 and
99c8557, which is a fix-up on top of the former.

I'm reverting because this commit broke two tests:
  mlir/test/python/integration/dialects/linalg/opsrun.py
  mlir/test/python/integration/dialects/transform.py
See https://lab.llvm.org/buildbot/#/builders/138/builds/4872

I'm not familiar with the tests, so I'm leaving it to the original author
to either remove or adapt the broken tests, as discussed here:
  https://github.com/llvm/llvm-project/pull/104783#issuecomment-2406390905
2024-10-11 05:22:56 -04:00
Md Asghar Ahmad Shahid
03483737a7 [mlir][linalg] Introduce transpose semantic to 'linalg.matmul' ops. (#104783)
The main goal of this patch is to extend the semantic of 'linalg.matmul'
named op to include per operand transpose semantic while also laying out
a way to move ops definition from OpDSL to tablegen. Hence, it is
implemented in tablegen. Transpose semantic is as follows.

By default 'linalg.matmul' behavior will remain as is. Transpose
semantics can be appiled on per input operand by specifying the optional
permutation attributes (namely 'permutationA' for 1st input and
'permutationB' for 2nd input) for each operand explicitly as needed. By
default, no transpose is mandated for any of the input operand.

    Example:
    ```
%val = linalg.matmul ins(%arg0, %arg1 : memref<5x3xf32>,
memref<5x7xf32>)
              outs(%arg2: memref<3x7xf32>)
              permutationA = [1, 0]
              permutationB = [0, 1]
    ```
2024-10-10 17:00:58 +01:00
Mateusz Sokół
a9746675a5 [MLIR][Python] Add encoding argument to tensor.empty Python function (#110656)
Hi @xurui1995 @makslevental,

I think in https://github.com/llvm/llvm-project/pull/103087 there's
unintended regression where user can no longer create sparse tensors
with `tensor.empty`.

Previously I could pass:
```python
out = tensor.empty(tensor_type, [])
```
where `tensor_type` contained `shape`, `dtype`, and `encoding`.

With the latest 
```python
tensor.empty(sizes: Sequence[Union[int, Value]], element_type: Type, *, loc=None, ip=None)
```
it's no longer possible.

I propose to add `encoding` argument which is passed to
`RankedTensorType.get(static_sizes, element_type, encoding)` (I updated
one of the tests to check it).
2024-10-01 16:48:00 -04:00
Matt Hofmann
ad89e617c7 [MLIR][Python] Fix detached operation coming from IfOp constructor (#107286)
Without this fix, `scf.if` operations would be created without a parent.
Since `scf.if` operations often have no results, this caused silent bugs
where the generated code was straight-up missing the operation.
2024-09-05 00:12:03 -04:00
Kasper Nielsen
3766ba44a8 [mlir][python] Fix how the mlir variadic Python accessor _ods_equally_sized_accessor is used (#101132) (#106003)
As reported in https://github.com/llvm/llvm-project/issues/101132, this
fixes two bugs:

1. When accessing variadic operands inside an operation, it must be
accessed as `self.operation.operands` instead of `operation.operands`
2. The implementation of the `equally_sized_accessor` function is doing
wrong arithmetics when calculating the resulting index and group sizes.

I have added a test for the `equally_sized_accessor` function, which did
not have a test previously.
2024-08-31 03:17:33 -04:00
Fabian Mora
016e1eb9c8 [mlir][gpu] Add metadata attributes for storing kernel metadata in GPU objects (#95292)
This patch adds the `#gpu.kernel_metadata` and `#gpu.kernel_table`
attributes. The `#gpu.kernel_metadata` attribute allows storing metadata
related to a compiled kernel, for example, the number of scalar
registers used by the kernel. The attribute only has 2 required
parameters, the name and function type. It also has 2 optional
parameters, the arguments attributes and generic dictionary for storing
all other metadata.

The `#gpu.kernel_table` stores a table of `#gpu.kernel_metadata`,
mapping the name of the kernel to the metadata.

Finally, the function `ROCDL::getAMDHSAKernelsELFMetadata` was added to
collect ELF metadata from a binary, and to test the class methods in
both attributes.

Example:
```mlir
gpu.binary @binary [#gpu.object<#rocdl.target<chip = "gfx900">, kernels = #gpu.kernel_table<[
    #gpu.kernel_metadata<"kernel0", (i32) -> (), metadata = {sgpr_count = 255}>,
    #gpu.kernel_metadata<"kernel1", (i32, f32) -> (), arg_attrs = [{llvm.read_only}, {}]>
  ]> , bin = "BLOB">]

```
The motivation behind these attributes is to provide useful information
for things like tunning.

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2024-08-27 18:44:50 -04:00
Bimo
4eefc8d4ce [MLIR][Python] enhance python api for tensor.empty (#103087)
Since we have extended `EmptyOp`, maybe we should also provide a
corresponding `tensor.empty` method. In the downstream usage, I tend to
use APIs with all lowercase letters to create ops, so having a
`tensor.empty` to replace the extended `tensor.EmptyOp` would keep my
code style consistent.
2024-08-19 09:06:48 +08:00
Andrzej Warzyński
2ee5586ac7 [mlir][vector] Make the in_bounds attribute mandatory (#97049)
At the moment, the in_bounds attribute has two confusing/contradicting
properties:
  1. It is both optional _and_ has an effective default-value.
  2. The default value is "out-of-bounds" for non-broadcast dims, and
     "in-bounds" for broadcast dims.

(see the `isDimInBounds` vector interface method for an example of this
"default" behaviour [1]).

This PR aims to clarify the logic surrounding the `in_bounds` attribute
by:
  * making the attribute mandatory (i.e. it is always present),
  * always setting the default value to "out of bounds" (that's
    consistent with the current behaviour for the most common cases).

#### Broadcast dimensions in tests

As per [2], the broadcast dimensions requires the corresponding
`in_bounds` attribute to be `true`:
```
  vector.transfer_read op requires broadcast dimensions to be in-bounds
```

The changes in this PR mean that we can no longer rely on the
default value in cases like the following (dim 0 is a broadcast dim):
```mlir
  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>
```

Instead, the broadcast dimension has to explicitly be marked as "in
bounds:

```mlir
  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {in_bounds = [true, false], permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>
```

All tests with broadcast dims are updated accordingly.

#### Changes in "SuperVectorize.cpp" and "Vectorization.cpp"

The following patterns in "Vectorization.cpp" are updated to explicitly
set the `in_bounds` attribute to `false`:
* `LinalgCopyVTRForwardingPattern` and `LinalgCopyVTWForwardingPattern`

Also, `vectorizeAffineLoad` (from "SuperVectorize.cpp") and
`vectorizeAsLinalgGeneric` (from "Vectorization.cpp") are updated to
make sure that xfer Ops created by these hooks set the dimension
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would complain

Note that there is no mechanism to verify whether the corresponding
memory access are indeed in bounds. Still, this is consistent with the
current behaviour where the broadcast dim would be implicitly assumed
to be "in bounds".

[1]
4145ad2bac/mlir/include/mlir/Interfaces/VectorInterfaces.td (L243-L246)
[2]
https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop
2024-07-16 16:49:52 +01:00
Maksim Levental
9315645834 [mlir][python] auto attribute casting (#97786) 2024-07-05 10:43:51 -05:00
Bimo
bfa762a5a5 [MLIR][Python] fix class name of powf and negf in linalg (#97696)
The following logic can lead to a class name mismatch when using
`linalg.powf` in Python. This PR fixed the issue and also renamed
`NegfOp` to `NegFOp` in linalg to adhere to the naming convention, as
exemplified by `arith::NegFOp`.


173514d58e/mlir/python/mlir/dialects/linalg/opdsl/lang/dsl.py (L140-L143)
```
# linalg.powf(arg0, arg1, outs=[init_result.result])
NotImplementedError: Unknown named op_name / op_class_name: powf / PowfOp
```
2024-07-05 09:23:12 +08:00
muneebkhan85
a9efcbf490 [MLIR] Add continuous tiling to transform dialect (#82792)
This patch enables continuous tiling of a target structured op using
diminishing tile sizes. In cases where the tensor dimensions are not
exactly divisible by the tile size, we are left with leftover tensor
chunks that are irregularly tiled. This approach enables tiling of the
leftover chunk with a smaller tile size and repeats this process
recursively using exponentially diminishing tile sizes. This eventually
generates a chain of loops that apply tiling using diminishing tile
sizes.

Adds `continuous_tile_sizes` op to the transform dialect. This op, when
given a tile size and a dimension, computes a series of diminishing tile
sizes that can be used to tile the target along the given dimension.
Additionally, this op also generates a series of chunk sizes that the
corresponding tile sizes should be applied to along the given dimension.

Adds `multiway` attribute to `transform.structured.split` that enables
multiway splitting of a single target op along the given dimension, as
specified in a list enumerating the chunk sizes.
2024-06-21 16:39:43 +02:00
Guray Ozen
7f58ffd09b [mlir][python] Yield results of scf.for_ (#93610)
Using `for_` is very hand with python bindings. Currently, it doesn't
support results, we had to fallback to two lines scf.for.

This PR yields results of scf.for in `for_`

---------

Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
2024-05-29 08:43:13 +02:00
klensy
f0b0c02504 [mlir][test] Fix filecheck annotation typos (#92897)
Moved fixes for mlir from
https://github.com/llvm/llvm-project/pull/91854, plus few additional in
second commit.

---------

Co-authored-by: klensy <nightouser@gmail.com>
2024-05-24 09:24:59 +02:00
srcarroll
2c1c67674c [mlir][transform] Consistent linalg transform op syntax for dynamic index lists (#90897)
This patch is a first pass at making consistent syntax across the
`LinalgTransformOp`s that use dynamic index lists for size parameters.
Previously, there were two different forms: inline types in the list, or
place them in the functional style tuple. This patch goes for the
latter.

In order to do this, the `printPackedOrDynamicIndexList`,
`printDynamicIndexList` and their `parse` counterparts were modified so
that the types can be optionally provided to the corresponding custom
directives.

All affected ops now use tablegen `assemblyFormat`, so custom
`parse`/`print` functions have been removed. There are a couple ops that
will likely add dynamic size support, and once that happens it should be
made sure that the assembly remains consistent with the changes in this
patch.

The affected ops are as follows: `pack`, `pack_greedily`,
`tile_using_forall`. The `tile_using_for` and `vectorize` ops already
used this syntax, but their custom assembly was removed.

---------

Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
2024-05-08 09:11:53 -05:00
Yuanqiang Liu
10ec0d2089 [MLIR] fix _f64ElementsAttr in ir.py (#91176) 2024-05-06 20:08:47 +08:00
srcarroll
f2f65eddc5 [mlir][transform] Add support for transform.param pad multiples in PadOp (#90755)
This patch modifies the definition of `PadOp` to take transform params
and handles for the `pad_to_multiple_of` operand.

---------

Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
2024-05-04 17:34:40 -05:00
Yinying Li
a10d67f9fb [mlir][sparse] Enable explicit and implicit value in sparse encoding (#88975)
1. Explicit value means the non-zero value in a sparse tensor. If
explicitVal is set, then all the non-zero values in the tensor have the
same explicit value. The default value Attribute() indicates that it is
not set.

2. Implicit value means the "zero" value in a sparse tensor. If
implicitVal is set, then the "zero" value in the tensor is equal to the
implicit value. For now, we only support `0` as the implicit value but
it could be extended in the future. The default value Attribute()
indicates that the implicit value is `0` (same type as the tensor
element type).

Example:

```
#CSR = #sparse_tensor.encoding<{
  map = (d0, d1) -> (d0 : dense, d1 : compressed),
  posWidth = 64,
  crdWidth = 64,
  explicitVal = 1 : i64,
  implicitVal = 0 : i64
}>
```

Note: this PR tests that implicitVal could be set to other values as
well. The following PR will add verifier and reject any value that's not
zero for implicitVal.
2024-04-24 16:20:25 -07:00
Oleksandr "Alex" Zinenko
ff57f40673 [mlir][py] fix option passing in transform interpreter (#89922)
There was a typo in dispatch trampoline.
2024-04-24 19:40:53 +02:00
Maksim Levental
79d4d16563 [mlir][python] extend LLVM bindings (#89797)
Add bindings for LLVM pointer type.
2024-04-24 07:43:05 -05:00
Abhishek Kulkarni
37fe3c6788 [mlir][python] Fix generation of Python bindings for async dialect (#75960)
The Python bindings generated for "async" dialect didn't include any of
the "async" dialect ops. This PR fixes issues with generation of Python
bindings for "async" dialect and adds a test case to use them.
2024-04-20 20:49:39 -05:00
Maksim Levental
6e6da74c8b [mlir][python] add binding to #gpu.object (#88992) 2024-04-18 16:31:55 -05:00
Guray Ozen
4f88c23111 [mlir][py] Add NVGPU's TensorMapDescriptorType in py bindings (#88855)
This PR adds NVGPU dialects' TensorMapDescriptorType in the py bindings.

This is a follow-up issue from [this
PR](https://github.com/llvm/llvm-project/pull/87153#discussion_r1546193095)
2024-04-17 15:59:18 +02:00
Oleksandr "Alex" Zinenko
73140daebb [mlir] expose transform dialect symbol merge to python (#87690)
This functionality is available in C++, make it available in Python
directly to operate on transform modules.
2024-04-17 15:01:59 +02:00
srcarroll
b79db39659 [mlir][linalg] Support ParamType in vector_sizes option of VectorizeOp transform (#87557) 2024-04-09 15:52:40 -05:00
Steven Varoumas
eb861acd49 [mlir][python] Enable python bindings for Index dialect (#85827)
This small patch enables python bindings for the index dialect.

---------

Co-authored-by: Steven Varoumas <steven.varoumas1@huawei.com>
2024-03-20 16:56:22 +01:00
Oleksandr "Alex" Zinenko
5d59fa90ce Reapply "[mlir][py] better support for arith.constant construction" (#84142)
Arithmetic constants for vector types can be constructed from objects
implementing Python buffer protocol such as `array.array`. Note that
until Python 3.12, there is no typing support for buffer protocol
implementers, so the annotations use array explicitly.

Reverts llvm/llvm-project#84103
2024-03-07 17:14:08 +01:00
Mehdi Amini
96fc54828a Revert "[mlir][py] better support for arith.constant construction" (#84103)
Reverts llvm/llvm-project#83259

This broke an integration test on Windows
2024-03-05 18:57:45 -08:00
Oleksandr "Alex" Zinenko
a691f65a84 [mlir][py] better support for arith.constant construction (#83259)
Arithmetic constants for vector types can be constructed from objects
implementing Python buffer protocol such as `array.array`. Note that
until Python 3.12, there is no typing support for buffer protocol
implementers, so the annotations use array explicitly.
2024-03-05 16:09:59 +01:00
Matthias Gehre
8ec28af8ea Reapply "[mlir][PDL] Add support for native constraints with results (#82760)"
with a small stack-use-after-scope fix in getConstraintPredicates()

This reverts commit c80e6edba4.
2024-03-02 20:57:30 +01:00
Matthias Gehre
c80e6edba4 Revert "[mlir][PDL] Add support for native constraints with results (#82760)"
Due to buildbot failure https://lab.llvm.org/buildbot/#/builders/88/builds/72130

This reverts commit dca32a3b59.
2024-03-01 07:44:30 +01:00
Matthias Gehre
dca32a3b59 [mlir][PDL] Add support for native constraints with results (#82760)
From https://reviews.llvm.org/D153245

This adds support for native PDL (and PDLL) C++ constraints to return
results.

This is useful for situations where a pattern checks for certain
constraints of multiple interdependent attributes and computes a new
attribute value based on them. Currently, for such an example it is
required to escape to C++ during matching to perform the check and after
a successful match again escape to native C++ to perform the computation
during the rewriting part of the pattern. With this work we can do the
computation in C++ during matching and use the result in the rewriting
part of the pattern. Effectively this enables a choice in the trade-off
of memory consumption during matching vs recomputation of values.

This is an example of a situation where this is useful: We have two
operations with certain attributes that have interdependent constraints.
For instance `attr_foo: one_of [0, 2, 4, 8], attr_bar: one_of [0, 2, 4,
8]` and `attr_foo == attr_bar`. The pattern should only match if all
conditions are true. The new operation should be created with a new
attribute which is computed from the two matched attributes e.g.
`attr_baz = attr_foo * attr_bar`. For the check we already escape to
native C++ and have all values at hand so it makes sense to directly
compute the new attribute value as well:

```
Constraint checkAndCompute(attr0: Attr, attr1: Attr) -> Attr;

Pattern example with benefit(1) {
    let foo = op<test.foo>() {attr = attr_foo : Attr};
    let bar = op<test.bar>(foo) {attr = attr_bar : Attr};
    let attr_baz = checkAndCompute(attr_foo, attr_bar);
    rewrite bar with {
        let baz = op<test.baz> {attr=attr_baz};
        replace bar with baz;
    };
}
```
To achieve this the following notable changes were necessary:
PDLL:
- Remove check in PDLL parser that prevented native constraints from
returning results

PDL:
- Change PDL definition of pdl.apply_native_constraint to allow variadic
results

PDL_interp:
- Change PDL_interp definition of pdl_interp.apply_constraint to allow
variadic results

PDLToPDLInterp Pass:
The input to the pass is an arbitrary number of PDL patterns. The pass
collects the predicates that are required to match all of the pdl
patterns and establishes an ordering that allows creation of a single
efficient matcher function to match all of them. Values that are matched
and possibly used in the rewriting part of a pattern are represented as
positions. This allows fusion and thus reusing a single position for
multiple matching patterns. Accordingly, we introduce
ConstraintPosition, which records the type and index of the result of
the constraint. The problem is for the corresponding value to be used in
the rewriting part of a pattern it has to be an input to the
pdl_interp.record_match operation, which is generated early during the
pass such that its surrounding block can be referred to by branching
operations. In consequence the value has to be materialized after the
original pdl.apply_native_constraint has been deleted but before we get
the chance to generate the corresponding pdl_interp.apply_constraint
operation. We solve this by emitting a placeholder value when a
ConstraintPosition is evaluated. These placeholder values (due to fusion
there may be multiple for one constraint result) are replaced later when
the actual pdl_interp.apply_constraint operation is created.

Changes since the phabricator review:
- Addressed all comments
- In particular, removed registerConstraintFunctionWithResults and
instead changed registerConstraintFunction so that contraint functions
always have results (empty by default)
- Thus we don't need to reuse `rewriteFunctions` to store constraint
functions with results anymore, and can instead use
`constraintFunctions`
- Perform a stable sort of ConstraintQuestion, so that
ConstraintQuestion appear before other ConstraintQuestion that use their
results.
- Don't create placeholders for pdl_interp::ApplyConstraintOp. Instead
generate the `pdl_interp::ApplyConstraintOp` before generating the
successor block.
- Fixed a test failure in the pdl python bindings


Original code by @martin-luecke

Co-authored-by: martin-luecke <martinpaul.luecke@amd.com>
2024-03-01 07:29:49 +01:00
Alexander Belyaev
bfcd3fa825 [mlir] Add result name for gpu.block_id and gpu.thread_id ops. (#83393)
expand-arith-ops.mlir fails on windows, but this is unrelated to this PR
2024-02-29 10:57:09 +01:00
Peiming Liu
56d58295dd [mlir][sparse] Introduce batch level format. (#83082) 2024-02-26 16:08:28 -08:00
Oleksandr "Alex" Zinenko
91f1161133 [mlir] expose transform interpreter to Python (#82365)
Transform interpreter functionality can be used standalone without going
through the interpreter pass, make it available in Python.
2024-02-21 11:01:00 +01:00
Oleksandr "Alex" Zinenko
bd8fcf75df [mlir][python] expose LLVMStructType API (#81672)
Expose the API for constructing and inspecting StructTypes from the LLVM
dialect. Separate constructor methods are used instead of overloads for
better readability, similarly to IntegerType.
2024-02-14 15:03:04 +01:00
Peiming Liu
429919e328 [mlir][sparse][pybind][CAPI] remove LevelType enum from CAPI, constru… (#81682)
…ct LevelType from LevelFormat and properties instead.

**Rationale**
We used to explicitly declare every possible combination between
`LevelFormat` and `LevelProperties`, and it now becomes difficult to
scale as more properties/level formats are going to be introduced.
2024-02-13 16:45:22 -08:00
Rolf Morel
4c654b7b91 [MLIR][Python] Add missing peel_front argument to LoopPeelOp's extension class (#81424) 2024-02-12 11:35:43 -06:00
Yinying Li
2a6b521b36 [mlir][sparse] Add more tests and verification for n:m (#81186)
1. Add python test for n out of m
2. Add more methods for python binding
3. Add verification for n:m and invalid encoding tests
4. Add e2e test for n:m

Previous PRs for n:m #80501 #79935
2024-02-09 14:34:36 -05:00
Yinying Li
e5924d6499 [mlir][sparse] Implement parsing n out of m (#79935)
1. Add parsing methods for block[n, m].
2. Encode n and m with the newly extended 64-bit LevelType enum.
3. Update 2:4 methods names/comments to n:m.
2024-02-08 14:38:42 -05:00
Yinying Li
cd481fa827 [mlir][sparse] Change LevelType enum to 64 bit (#80501)
1. C++ enum is set through enum class LevelType : uint_64.
2. C enum is set through typedef uint_64 level_type. It is due to the
limitations in Windows build: setting enum width to ui64 is not
supported in C.
2024-02-05 17:00:52 -05:00
Jacques Pienaar
59eadcd28f [mlir][py] Reduce size of allocation memrefs in test. 2024-02-01 16:27:20 -08:00
Maksim Levental
404af14f92 [mlir][python] enable memref.subview (#79393) 2024-01-30 16:21:56 -06:00
Guray Ozen
12c241b365 [MLIR][NVVM] Explicit Data Type for Output in wgmma.mma_async (#78713)
The current implementation of `nvvm.wgmma.mma_async` Op deduces the data
type of the output matrix from the data type of struct member, which can be
non-intuitive, especially in cases where types like `2xf16` are packed
into `i32`.

This PR addresses this issue by improving the Op to include an explicit
data type for the output matrix.

The modified Op now includes an explicit data type for Matrix-D (<f16>),
and looks as follows:

```
%result = llvm.mlir.undef : !llvm.struct<(struct<(i32, i32, ...
nvvm.wgmma.mma_async
    %descA, %descB, %result,
    #nvvm.shape<m = 64, n = 32, k = 16>,
    D [<f16>, #nvvm.wgmma_scale_out<zero>],
    A [<f16>, #nvvm.wgmma_scale_in<neg>, <col>],
    B [<f16>, #nvvm.wgmma_scale_in<neg>, <col>]
```
2024-01-22 08:37:20 +01:00