Previously, references to regions and successors were incorrectly disallowed outside the top-level assembly form. This change enables the use of bound regions and successors as variables in custom directives.
This threads through proper error handling and reporting capabilities to
avoid hitting llvm_unreachable while parsing linalg ops.
Fixes #132755
Fixes #132740
Fixes #129185
This commit adds support for non-attribute properties (such as
StringProp and I64Prop) in declarative rewrite patterns. The handling
for properties follows the handling for attributes in most cases,
including in the generation of static matchers.
Constraints that are shared between multiple types are supported by
making the constraint matcher a templated function, which is the
equivalent of passing ::mlir::Attribute for an arbitrary C++ type.
This PR introduces the `vector.to_elements` op, which decomposes a
vector into its scalar elements. This operation is symmetrical to the
existing `vector.from_elements`.
Examples:
```
// Decompose a 0-D vector.
%0 = vector.to_elements %v0 : vector<f32>
// %0 = %v0[0]
// Decompose a 1-D vector.
%0:2 = vector.to_elements %v1 : vector<2xf32>
// %0#0 = %v1[0]
// %0#1 = %v1[1]
// Decompose a 2-D vector.
%0:6 = vector.to_elements %v2 : vector<2x3xf32>
// %0#0 = %v2[0, 0]
// %0#1 = %v2[0, 1]
// %0#2 = %v2[0, 2]
// %0#3 = %v2[1, 0]
// %0#4 = %v2[1, 1]
// %0#5 = %v2[1, 2]
```
This op is aimed at reducing code size when modeling "structured" vector
extractions and simplifying canonicalizations of large sequences of
`vector.extract` and `vector.insert` ops into `vector.shuffle` and other
sophisticated ops that can re-arrange vector elements.
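The examples above list the results in row-major order. A minimal Python
sketch of that ordering, assuming a hypothetical helper named
`to_elements_positions` (not part of MLIR) that returns the source position
of each result:

```python
from itertools import product


def to_elements_positions(shape):
    """Return the vector position of each result, in result order.

    Positions are enumerated in row-major order (last dimension varies
    fastest). An empty shape models a 0-D vector and yields one result.
    """
    return [list(pos) for pos in product(*(range(dim) for dim in shape))]


# Positions for vector<2x3xf32>: result #i reads positions[i].
positions = to_elements_positions([2, 3])
```

Here `positions[4]` is `[1, 1]`, matching `%0#4 = %v2[1, 1]` in the 2-D
example.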
### Description
This patch improves the folding efficiency of `vector.insert` and
`vector.extract` operations by not returning early after successfully
converting dynamic indices to static indices.
This PR also renames the test pass `TestConstantFold` to
`TestSingleFold` and adds comprehensive documentation explaining the
single-pass folding behavior.
### Motivation
Since the `OpBuilder::createOrFold` function only calls `fold` **once**,
the current `fold` methods of `vector.insert` and `vector.extract` may
leave the op in a state that can be folded further. For example,
consider the following un-folded IR:
```
%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%c0 = arith.constant 0 : index
%e2 = vector.extract %v1[%c0] : f32 from vector<128xf32>
```
If we use `createOrFold` to create the `vector.extract` op, then the
result will be:
```
%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%e2 = vector.extract %v1[0] : f32 from vector<128xf32>
```
But this is not the optimal result. `createOrFold` should have returned
`%e1`.
The reason is that `fold` returns immediately after
`extractInsertFoldConstantOp`, causing the subsequent folding logic to be
skipped.
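The effect of the early return can be illustrated with a toy model in
Python. This is only a sketch: the dictionary-based op representation and
the names `make_extract`, `fold_early_return` and `fold_continue` are
hypothetical and do not mirror the actual C++ folders.

```python
def make_extract():
    """Build a toy `vector.extract` whose source is a `vector.insert` at [0]."""
    insert = {"index": 0, "value": "%e1"}
    return {"index": "%c0", "constants": {"%c0": 0}, "source_insert": insert}


def fold_early_return(op):
    """One fold call that stops after the dynamic-to-static conversion."""
    if op["index"] in op["constants"]:
        op["index"] = op["constants"][op["index"]]
        return op  # in-place update reported, later folds are skipped
    if op["source_insert"]["index"] == op["index"]:
        return op["source_insert"]["value"]
    return op


def fold_continue(op):
    """One fold call that keeps going after the conversion."""
    if op["index"] in op["constants"]:
        op["index"] = op["constants"][op["index"]]
    if op["source_insert"]["index"] == op["index"]:
        return op["source_insert"]["value"]
    return op


early = fold_early_return(make_extract())   # still an extract op
full = fold_continue(make_extract())        # the inserted value
```

With a single call, only `fold_continue` reaches the inserted value `%e1`,
which is the behavior a caller of `createOrFold` relies on.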
---------
Co-authored-by: Yang Bai <yangb@nvidia.com>
Backward-slice matching already supports limiting traversal by
specifying the desired depth level. This pull request adds support
for limiting traversal with a nested matcher, and also adds
forward-slice support. It also adds support for variadic operators,
including `anyOf` and `allOf`. Rather than simply stopping traversal
when an operation named foo is encountered, one can now define a matcher
that specifies different exit conditions. The variadic support
implementation within mlir-query is very similar to the one in clang-query.
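As a rough illustration of the idea, the following Python sketch combines
variadic matchers with a backward traversal that stops at ops accepted by a
nested matcher. All names (`has_op_name`, `any_of`, `all_of`,
`backward_slice`) and the dictionary-based op model are hypothetical, and
whether the op that ends the traversal is itself reported is an assumption
of this sketch, not a statement about mlir-query.

```python
def has_op_name(name):
    """Matcher accepting ops with the given name."""
    return lambda op: op["name"] == name


def any_of(*matchers):
    """Variadic matcher: accepts an op if at least one sub-matcher does."""
    return lambda op: any(m(op) for m in matchers)


def all_of(*matchers):
    """Variadic matcher: accepts an op only if every sub-matcher does."""
    return lambda op: all(m(op) for m in matchers)


def backward_slice(root, stop):
    """Collect defining ops of `root`, not walking past ops matching `stop`."""
    seen, names, stack = set(), [], list(root["operands"])
    while stack:
        op = stack.pop()
        if id(op) in seen:
            continue
        seen.add(id(op))
        names.append(op["name"])
        if not stop(op):  # exit condition given by the nested matcher
            stack.extend(op["operands"])
    return names


generic = {"name": "linalg.generic", "operands": []}
collapse = {"name": "tensor.collapse_shape", "operands": [generic]}
const = {"name": "arith.constant", "operands": []}
extract = {"name": "tensor.extract", "operands": [collapse, const]}
root = {"name": "arith.addf", "operands": [extract, extract]}

stop = any_of(has_op_name("tensor.collapse_shape"), has_op_name("foo"))
slice_names = backward_slice(root, stop)
```

In this example the traversal ends at `tensor.collapse_shape`, so
`linalg.generic` is not part of the slice.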
Add `gen-attr-constraint-decls` and `gen-attr-constraint-defs`, which
generate public C++ functions for attribute constraints. The name of the C++
function is specified in the `cppFunctionName` field.
This generalizes `cppFunctionName` from `TypeConstraint` introduced in
https://github.com/llvm/llvm-project/pull/104577 to be usable also in `AttrConstraint`.
Below is the original commit description. In addition, this commit applies a
[fix](33a26b9ca2)
to CMakeLists.txt.
The issue occurs during a downstream pass which does dialect conversion,
where both
[`FuncOpConversion`](cde67b6663/mlir/lib/Conversion/FuncToLLVM/FuncToLLVM.cpp (L480))
and
[`SubviewFolder`](cde67b6663/mlir/lib/Dialect/MemRef/Transforms/ExpandStridedMetadata.cpp (L187))
are run together. The original starting IR is:
```mlir
module {
func.func @foo(%arg0: memref<100x100xf32>, %arg1: index, %arg2: index, %arg3: index, %arg4: index) -> memref<?x?xf32, strided<[100, 1], offset: ?>> {
%subview = memref.subview %arg0[%arg1, %arg2] [%arg3, %arg4] [1, 1] : memref<100x100xf32> to memref<?x?xf32, strided<[100, 1], offset: ?>>
return %subview : memref<?x?xf32, strided<[100, 1], offset: ?>>
}
}
```
After `FuncOpConversion` runs, the IR looks like:
```mlir
"builtin.module"() ({
"llvm.func"() <{CConv = #llvm.cconv<ccc>, function_type = !llvm.func<struct<(ptr, ptr, i64, array<2 x i64>, array<2 x i64>)> (ptr, ptr, i64, i64, i64, i64, i64, i64, i64, i64, i64)>, linkage = #llvm.linkage<external>, sym_name = "foo", visibility_ = 0 : i64}> ({
^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr, %arg2: i64, %arg3: i64, %arg4: i64, %arg5: i64, %arg6: i64, %arg7: i64, %arg8: i64, %arg9: i64, %arg10: i64):
%0 = "memref.subview"(<<UNKNOWN SSA VALUE>>, <<UNKNOWN SSA VALUE>>, <<UNKNOWN SSA VALUE>>, <<UNKNOWN SSA VALUE>>, <<UNKNOWN SSA VALUE>>) <{operandSegmentSizes = array<i32: 1, 2, 2, 0>, static_offsets = array<i64: -9223372036854775808, -9223372036854775808>, static_sizes = array<i64: -9223372036854775808, -9223372036854775808>, static_strides = array<i64: 1, 1>}> : (memref<100x100xf32>, index, index, index, index) -> memref<?x?xf32, strided<[100, 1], offset: ?>>
"func.return"(%0) : (memref<?x?xf32, strided<[100, 1], offset: ?>>) -> ()
}) : () -> ()
"func.func"() <{function_type = (memref<100x100xf32>, index, index, index, index) -> memref<?x?xf32, strided<[100, 1], offset: ?>>, sym_name = "foo"}> ({
}) : () -> ()
}) {llvm.data_layout = "", llvm.target_triple = ""} : () -> ()
```
The `<<UNKNOWN SSA VALUE>>`'s here are block arguments of a separate
unlinked block, which is disconnected from the rest of the IR (so not
only is the IR verifier-invalid, it can't even be parsed). This IR is
created by signature conversion in the dialect conversion infra.
Now `SubviewFolder` is applied, and the utility function here is called
on one of these disconnected block arguments, causing a crash.
The TestMemRefToLLVMWithTransforms pass is introduced to exercise the
bug, and it can be reused by other contributors in the future.
Co-authored-by: Rahul Kayaith <rkayaith@gmail.com>
---------
Signed-off-by: hanhanW <hanhan0912@gmail.com>
The class "ClauseVal" actually represents a definition of an enumeration
value, and in itself it is not bound to any clause. Rename it to EnumVal
and add a comment clarifying how it's translated into an actual enum
definition in the generated source code.
There is no change in functionality.
This patch wraps `populateLowerContractionToSMMLAPatternPatterns` into a
new TD Op `apply_patterns.arm_neon.vector_contract_to_i8mm`.
It also removes the "test-lower-to-arm-neon" pass.
Previously, the registered dialects were fixed per LSP binary. This works
as long as the dialects of interest across all the projects in which one
uses the LSP are disjoint. This change extends that to support cases where
dialects overlap in name but their usage is separate per project. The
alternative is to build multiple binaries and switch the LSP used in the
editor per project (which adds some extra complexity in hosted instances).
This handles a simple (and, I believe, common) case where the registry can
be determined from the path while keeping a single binary. The cost of
doing so dynamically based on path is either keeping different registries
to return or repopulating the dialect and extension maps.
Now that `Property` is a `PropConstraint`, hook it up to the same
constraint-uniquing machinery that other types of constraints use. This
will primarily save on code size for types, like enums, that have
inherent constraints which are shared across many operations.
This PR introduces a new tool, mlir-irdl-to-cpp, that converts IRDL to
C++ definitions.
The C++ definitions allow use of the IRDL-defined dialect in MLIR C++
infrastructure, enabling the use of conversion patterns with IRDL
dialects for example. This PR also adds CMake utilities to easily
integrate the IRDL dialects into MLIR projects.
Note that most IRDL features are not supported. In general, we are only
able to define simple types and operations.
- The only type constraint supported is `irdl.any`.
- Variadic operands and results are not supported.
- Verifiers for the IRDL constraints are not generated.
- Attributes are not supported.
---------
Co-authored-by: Théo Degioanni <theo.degioanni.llvm.deluge062@simplelogin.fr>
Co-authored-by: Fehr Mathieu <mathieu.fehr@gmail.com>
Background issue: #139813
In
[emitEitherOperandMatch()](e62fc14a5d/mlir/tools/mlir-tblgen/RewriterGen.cpp (L774))
we check if `op.getArg(argIndex)` is a `NamedTypeConstraint`:
```cpp
} else if (isa<NamedTypeConstraint *>(op.getArg(argIndex))) {
emitOperandMatch(tree, opName, /*operandName=*/formatv("v{0}", i).str(),
operandIndex,
/*operandMatcher=*/eitherArgTree.getArgAsLeaf(i),
/*argName=*/eitherArgTree.getArgName(i), argIndex,
/*variadicSubIndex=*/std::nullopt);
++operandIndex;
}
```
but in `emitOperandMatch()` we cast on `op.getArg(operandIndex)`, which
is incorrect if the operation has attributes or other non-operand
arguments before its operands.
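The index mismatch can be shown with a small Python sketch. The argument
list and the helper `operand_to_arg_index` are hypothetical and only model
the difference between the two index spaces; they are not the fix applied in
`RewriterGen.cpp`.

```python
# Hypothetical op whose argument list has an attribute before its operands.
args = [("attr", "predicate"), ("operand", "lhs"), ("operand", "rhs")]


def operand_to_arg_index(arguments, operand_index):
    """Map the n-th operand to its index in the combined argument list."""
    seen = 0
    for arg_index, (kind, _name) in enumerate(arguments):
        if kind == "operand":
            if seen == operand_index:
                return arg_index
            seen += 1
    raise IndexError(f"no operand with index {operand_index}")


# Indexing the argument list with an operand index picks the attribute.
wrong_kind = args[0][0]
# Translating the operand index first picks the intended operand.
right_kind = args[operand_to_arg_index(args, 0)][0]
```

Here `wrong_kind` is `"attr"` while `right_kind` is `"operand"`, which is
why the two indices cannot be used interchangeably.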
This commit takes the `summary` and `description` from TableGen files and
generates C++ comments on top of the declarations of the generated C++
classes.
The main motivation is to improve the developer experience. When people
work on compilers from an IDE, they will be able to hover over the
symbols (e.g. `"ADialect::BOp"`) in their C++ code and see the summary
and descriptions without having to refer to the `.td` files.
Improve the mlir-query tool by implementing `getBackwardSlice` and
`getForwardSlice` matchers. In addition, `SetQuery` needed to be
added to enable custom configuration for each query, e.g. `inclusive`,
`omitUsesFromAbove`, `omitBlockArguments`.
Note: the backwardSlice and forwardSlice algorithms are the same as the ones
in `mlir/lib/Analysis/SliceAnalysis.cpp`.
Example of the current matcher. The query was run on the file
`mlir/test/mlir-query/complex-test.mlir`:
```mlir
./mlir-query /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir -c "match getDefinitions(hasOpName(\"arith.addf\"),2)"
Match #1:
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:5:8:
%0 = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg0 : tensor<5x5xf32>) outs(%arg1 : tensor<5x5xf32>) {
^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:7:10: note: "root" binds here
%2 = arith.addf %in, %in : f32
^
Match #2:
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:10:16:
%collapsed = tensor.collapse_shape %0 [[0, 1]] : tensor<5x5xf32> into tensor<25xf32>
^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:13:11:
%c2 = arith.constant 2 : index
^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:14:18:
%extracted = tensor.extract %collapsed[%c2] : tensor<25xf32>
^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:15:10: note: "root" binds here
%2 = arith.addf %extracted, %extracted : f32
^
2 matches.
```
Similar to vector ops, XeGPU ops need to be unrolled into smaller shapes
such that they can be dispatched into a hardware instruction. This PR
marks the initial phase of a series dedicated to incorporating unroll
patterns for XeGPU operations. In this installment, we introduce
patterns for the following operations:
1. createNd
2. updateNd
3. prefetchNd
4. loadNd
5. storeNd
6. dpas
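Unrolling amounts to splitting a large shape into tiles of a target shape
and emitting one smaller op per tile. A minimal Python sketch of the tile
enumeration, assuming a hypothetical helper `tile_offsets` and example
shapes that are not taken from any actual hardware specification:

```python
from itertools import product


def tile_offsets(shape, tile):
    """Return the start offset of every tile, in row-major order.

    Assumes each dimension of `shape` is a multiple of the matching
    dimension of `tile`.
    """
    if len(shape) != len(tile):
        raise ValueError("shape and tile must have the same rank")
    if any(s % t != 0 for s, t in zip(shape, tile)):
        raise ValueError("shape must be divisible by the tile shape")
    ranges = (range(0, s, t) for s, t in zip(shape, tile))
    return [list(offsets) for offsets in product(*ranges)]


# A 16x32 shape split into 8x16 tiles gives four smaller ops.
offsets = tile_offsets([16, 32], [8, 16])
```

Each offset in `offsets` identifies where one unrolled op would read or
write within the original shape.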
Since the introduction of `OpAsm{Type,Attr}Interface` (#121187), it is
possible to generate an alias in AsmPrinter solely from the type/attribute
itself without consulting the `OpAsmDialectInterface`. This means the
behavior can be put in the tablegen file near the type/attribute definition.
A common pattern is to just use the type/attr mnemonic as the alias.
Previously, as in #130479/#130481/#130483, this meant adding a default
implementation to `extraClassDeclaration` in the `LLVM_Attr` base class.
However, as an attribute definition may override `extraClassDeclaration`,
it may be preferable to have a new field in tablegen to specify this
behavior.
This commit adds a `genMnemonicAlias` field to `AttrOrTypeDef` which, when
enabled, makes `mlir-tblgen` emit a default implementation of `getAlias`
using the mnemonic. When `OpAsm{Attr,Type}Interface` is not specified by the
user, `tblgen` will automatically add the interface.
Users wanting other alias behavior can ignore this field and
still use the `extraClassDeclaration` approach.
This patch fixes:
mlir/tools/mlir-tblgen/AttrOrTypeFormatGen.cpp:586:14: error:
variable 'realParam' set but not used
[-Werror,-Wunused-but-set-variable]
This PR extends the `struct` directive in tablegen to support nested
`custom` directives. Note that this assumes/verifies that the `custom`
directive has a single parameter.
This enables defining custom field parsing and printing functions if the
`struct` directive doesn't suffice. There is some existing potential
downstream usage for it:
a3c7de9242/stablehlo/dialect/StablehloOps.cpp (L3102)
In Record only store the direct superclasses instead of all
superclasses. getSuperClasses recurses to find all superclasses when
necessary.
This gives a small reduction in memory usage. On lib/Target/X86/X86.td I
measured about 2.0% reduction in total bytes allocated (measured by
valgrind) and 1.3% reduction in peak memory usage (measured by
/usr/bin/time -v).
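The storage scheme can be sketched in Python as follows. The `Record` class
here is a toy model with hypothetical field and method names, and the order
in which superclasses are returned is an assumption of the sketch rather
than a description of the TableGen implementation.

```python
class Record:
    """Toy record that stores only its direct superclasses."""

    def __init__(self, name, direct_supers=()):
        self.name = name
        self.direct_supers = list(direct_supers)

    def get_super_classes(self):
        """Recursively collect all superclasses, each exactly once."""
        result, seen = [], set()

        def visit(record):
            for sup in record.direct_supers:
                if sup.name in seen:
                    continue
                seen.add(sup.name)
                visit(sup)               # ancestors first
                result.append(sup.name)

        visit(self)
        return result


a = Record("A")
b = Record("B", [a])
c = Record("C", [a])
d = Record("D", [b, c])
all_supers = d.get_super_classes()
```

`D` stores two entries instead of three, and the full list is recomputed
only when a caller asks for it.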
---------
Co-authored-by: Min-Yih Hsu <min@myhsu.dev>
The error is triggered when an attribute or type uses an APInt typed
parameter with the generated equality operator. If the compared APInts
have different bit widths, the equality operator triggers an assert. This
is dangerous, since `StorageUniquer` for types and attributes uses the
equality operator when a hash collision occurs. As such, it is
necessary to use a custom comparator or `APIntParameter`, which
already provides one.
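The hazard and the shape of a width-aware comparison can be modeled in
Python. The `APInt` class below is a toy stand-in and `safe_equal` is a
hypothetical name; the sketch only illustrates checking the bit width before
comparing values, not the actual comparator used by `APIntParameter`.

```python
class APInt:
    """Toy fixed-width integer whose equality requires matching widths."""

    def __init__(self, width, value):
        self.width = width
        self.value = value % (1 << width)

    def __eq__(self, other):
        if self.width != other.width:
            raise AssertionError("comparison requires equal bit widths")
        return self.value == other.value


def safe_equal(lhs, rhs):
    """Compare widths first so mismatched widths yield False, not an assert."""
    return lhs.width == rhs.width and lhs == rhs


same = safe_equal(APInt(8, 1), APInt(8, 1))
different_width = safe_equal(APInt(8, 1), APInt(16, 1))
```

With `safe_equal`, two values that collide in a hash table but differ in
width simply compare unequal.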
This commit also replaces uses of the raw `APInt` parameter with the
`APIntParameter` and removes the no longer necessary custom StorageClass
for the `BitVectorAttr` from the SMT dialect that was a workaround for
the described issue.
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
The 1:N dialect conversion driver has been deprecated. Use the regular
dialect conversion driver instead. This commit deletes the 1:N dialect
conversion driver.
Note for LLVM integration: If you are already using the regular dialect conversion, but still have argument materializations in your code base, simply delete all `addArgumentMaterialization` calls.
For details, see
https://discourse.llvm.org/t/rfc-merging-1-1-and-1-n-dialect-conversions/82513.
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses dyn_cast
because we have a call to dyn_cast earlier in the function, implying
that attrOrProp is nonnull.
Current one-shot bufferization infrastructure operates on top of
TensorType and BaseMemRefType. These are non-extensible base classes of
the respective builtins: tensor and memref. Thus, the infrastructure is
bound to work only with builtin tensor/memref types. At the same time,
there are customization points that allow one to provide custom logic to
control the bufferization behavior.
This patch introduces new type interfaces: tensor-like and buffer-like
that aim to supersede TensorType/BaseMemRefType within the bufferization
dialect and allow custom tensors / memrefs to be used. Additionally,
these new type interfaces are attached to the respective builtin types
so that the switch is seamless.
Note that this patch does very minimal initial work; it does NOT
refactor bufferization infrastructure.
See https://discourse.llvm.org/t/rfc-changing-base-types-for-tensors-and-memrefs-from-c-base-classes-to-type-interfaces/85509
Ops that are already snake case (like [`ROCDL_wmma_*`
ops](66b0b0466b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (L411)))
produce python "value-builders" that collide with the class names:
```python
class wmma_bf16_16x16x16_bf16(_ods_ir.OpView):
OPERATION_NAME = "rocdl.wmma.bf16.16x16x16.bf16"
...
def wmma_bf16_16x16x16_bf16(res, args, *, loc=None, ip=None) -> _ods_ir.Value:
return wmma_bf16_16x16x16_bf16(res=res, args=args, loc=loc, ip=ip).result
```
and thus cannot be emitted (because of recursive self-calls).
This PR fixes that by affixing `_` to the value builder names.
I would've preferred to just rename the ops but that would be a breaking
change 🤷.
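The collision and the effect of a distinct builder name can be reproduced in
plain Python. This sketch uses a stand-in `OpView` class and a shortened
hypothetical op name, and it assumes the underscore is appended as a suffix;
the text above only says the name is affixed with `_`.

```python
class OpView:
    """Stand-in for the generated base class (not the real _ods_ir.OpView)."""

    def __init__(self, res):
        self.result = res


class wmma_bf16(OpView):
    OPERATION_NAME = "example.wmma.bf16"  # hypothetical op name


def wmma_bf16_(res):
    # The builder has its own name, so `wmma_bf16` still refers to the class.
    return wmma_bf16(res).result


def colliding_builder_recurses():
    """Rebind a class name with a same-named function, as in the bug."""

    class op(OpView):
        pass

    def op(res):  # shadows the class defined just above
        return op(res).result  # resolves to this function, not the class

    try:
        op(1)
    except RecursionError:
        return True
    return False


value = wmma_bf16_(42)
```

Calling `colliding_builder_recurses()` shows that the same-named builder
only ever calls itself, while `wmma_bf16_` reaches the class as intended.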
The current inliner disables inlining when the caller is in a region with
the SingleBlock trait while the callee function contains multiple blocks.
The SingleBlock trait is used in operations such as do/while loops, for
example fir.do_loop, fir.iterate_while and fir.if. Typically, calls within
loops are good candidates for inlining. However, functions with multiple
blocks are also common. For example, any function with "if () then
return" will result in multiple blocks in MLIR.
This change gives a customized inliner the flexibility to handle such
cases.
- doClone: clones instructions and other information from the callee
function into the caller function.
- canHandleMultipleBlocks: checks if functions with multiple blocks can be
inlined into a region with the SingleBlock trait.
The default behavior of the inliner remains unchanged.
---------
Co-authored-by: jeanPerier <jean.perier.polytechnique@gmail.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
I observed that the codebase has boundary comments like:
```
//===----------------------------------------------------------------------===//
// ...
//===----------------------------------------------------------------------===//
```
I also observed that there are incomplete boundary comments. The
revision is generated by a script that completes the boundary comments.
```
//===----------------------------------------------------------------------===//
// ...
...
```
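A script of this kind could look roughly like the following Python sketch.
It is not the script used for this revision; the function name
`complete_boundary_comments` and the rule it applies (a boundary line
followed by `//` title lines must be closed by another boundary line) are
assumptions made for illustration.

```python
BOUNDARY = "//===" + "-" * 70 + "===//"


def complete_boundary_comments(lines):
    """Add the closing boundary line where a banner comment lacks one."""
    out = []
    i = 0
    while i < len(lines):
        out.append(lines[i])
        if lines[i] != BOUNDARY:
            i += 1
            continue
        # Copy the title lines that follow the opening boundary.
        j = i + 1
        while j < len(lines) and lines[j].startswith("//") and lines[j] != BOUNDARY:
            out.append(lines[j])
            j += 1
        if j > i + 1:
            out.append(BOUNDARY)  # closing boundary, existing or added
            if j < len(lines) and lines[j] == BOUNDARY:
                j += 1            # skip the one that was already there
        i = j
    return out


incomplete = [BOUNDARY, "// Helpers", "void foo();"]
completed = complete_boundary_comments(incomplete)
```

Running the function on already complete input returns it unchanged, so the
sketch can be applied repeatedly without stacking boundary lines.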
Signed-off-by: hanhanW <hanhan0912@gmail.com>
This commit pulls apart the inherent attribute dependence of classes
like EnumAttrInfo and EnumAttrCase, factoring them out into simpler
EnumCase and EnumInfo variants. This allows specifying the cases of an
enum without needing to make the cases, or the EnumInfo itself, a
subclass of SignlessIntegerAttrBase.
The existing classes are retained as subclasses of the new ones, both
for backwards compatibility and to allow attribute-specific information.
In addition, the new BitEnum class changes its default printer/parser
behavior: cases when multiple keywords appear, like having both nuw and
nsw in overflow flags, will no longer be quoted by the operator<<, and
the FieldParser instance will now expect multiple keywords. All
instances of BitEnumAttr retain the old behavior.
This moves the EnumAttrCase and EnumAttr classes from Attribute.h/.cpp
to a new EnumInfo.h/.cpp and renames them to EnumCase and EnumInfo,
respectively.
This doesn't change any of the tablegen files or any user-facing aspects
of the enum attribute generation system; it just reorganizes code in order
to make the main PR (#132148) shorter.