This PR removes one OpRewritePattern `shape_cast(shape_cast(x)) -> x`
that is already handled by `ShapeCastOp::fold`.
Note that this might affect downstream users who indirectly call
`populateShapeCastFoldingPatterns(RewritePatternSet &patterns,
PatternBenefit)` and then use `patterns` with a `GreedyRewriteConfig
config` that has `config.fold = false`. (only user I've checked is IREE,
that never uses config.fold = false).
This patch restricts `vector.gather` to only accept tensors and memrefs
as valid sources. Currently, the source is typed as `AnyShaped`, which
also includes vectors—allowing the following (invalid) construct to pass
verification:
```mlir
%0 = vector.gather %base[%c0][%indices], %mask, %pass_thru
: vector<16xf32>, vector<16xi32>, vector<16xi1>, vector<16xf32> into vector<16xf32>
```
(Note: the source %base here is a vector, which is incorrect.)
In contrast, `vector.scatter` currently only accepts memrefs, so some
asymmetry remains between the two ops. This PR is a step toward aligning
their semantics.
Add APIntParameter with custom implementation for comparison and use it
in llvm.constant_range attribute. This is necessary because the default
equality operator of APInt asserts when the bit widths of the compared
APInts differ. The comparison is used by StorageUniquer when hashes of
two ranges with different bit widths collide.
Motivation: amdgpu buffer load instruction will return all zeros when
loading sub-word values. For example, assuming the buffer size is
exactly one word and we attempt to invoke
`llvm.amdgcn.raw.ptr.buffer.load.v2i32` starting from byte 2 of the
word, we will not receive the actual value of the buffer but all zeros
for the first word. This is because the boundary has been crossed for
the first word.
This PR come up with a fix to this problem, such that, it creates a
bounds check against the buffer load instruction. It will compare the
offset + vector size to see if the upper bound of the address will
exceed the buffer size. If it does, masked transfer read will be
optimized to `vector.load` + `arith.select`, else, it will continue to
fall back to default lowering of the masked vector load.
The semantics of `createReadOrMaskedRead` and `createWriteOrMaskedWrite`
are currently a bit inconsistent and not fully documented:
* The input vector sizes are passed as `readShape` and
`inputVectorSizes`, respectively — inconsistent naming.
* Currently, the input vector sizes in `createWriteOrMaskedWrite` are
not required to be complete: any missing trailing sizes are inferred
from the destination tensor. This only works when the destination
tensor is statically shaped.
* Unlike `createReadOrMaskedRead`, the documentation for
`createWriteOrMaskedWrite` does not specify that write offsets are
hard-coded to 0.
This PR only updates the documentation and unifies the naming. As such,
it is NFC.
A follow-up PR will generalize and unify the implementation to support,
for example, dynamically shaped destination tensors — a requirement for
enabling scalable vectorization of `linalg.pack` and `linalg.unpack`.
Current one-shot bufferization infrastructure operates on top of
TensorType and BaseMemRefType. These are non-extensible base classes of
the respective builtins: tensor and memref. Thus, the infrastructure is
bound to work only with builtin tensor/memref types. At the same time,
there are customization points that allow one to provide custom logic to
control the bufferization behavior.
This patch introduces new type interfaces: tensor-like and buffer-like
that aim to supersede TensorType/BaseMemRefType within the bufferization
dialect and allow custom tensors / memrefs to be used. Additionally,
these new type interfaces are attached to the respective builtin types
so that the switch is seamless.
Note that this patch does very minimal initial work, it does NOT
refactor bufferization infrastructure.
See https://discourse.llvm.org/t/rfc-changing-base-types-for-tensors-and-memrefs-from-c-base-classes-to-type-interfaces/85509
MLIR's data flow analysis (especially dense data flow analysis)
constructs a lattice at every lattice anchor (which, for dense data
flow, means every program point). As the program grows larger, the time
and space complexity can become unmanageable.
However, in many programs, the lattice values at numerous lattice
anchors are actually identical. We can leverage this observation to
improve the complexity of data flow analysis. This patch introducing
equivalence lattice anchor to group lattice anchors that must contains
identical lattice on certain state to improve the time and space
footprint of data flow.
`LegalizeDataValuesInRegion` is intended to replace the SSA values used
in a region with the output of data operations, but misses the handling
of the OpenACC `host_data` construct. As a result, currently
```
!$acc host_data use_device(%var)
...%var...
!$acc end host_data
```
is lowered to
```
%dev_var = acc.use_device(%var)
acc.host_data data_operands(%dev_var) {
...%var...
}
```
This pull request updates the LegalizeDataValuesInRegion to handle
HostDataOp such that lowering results in
```
%dev_var = acc.use_device(%var)
acc.host_data data_operands(%dev_var) {
...%dev_var...
}
```
This commit improves the `EnumProp` class, causing it to wrap around an
`EnumInfo` just like` EnumAttr` does. This EnumProp also has logic for
converting to/from an integer attribute and for being read and written
as bitcode.
The following variants of `EnumProp` are provided:
- `EnumPropWithAttrForm` - an EnumProp that can be constructed from (and
will be converted to, if `storeInCustomAttribute` is true) a custom
attribute, like an `EnumAttr`, instead of a plain integer. This is meant
for backwards compatibility with code that uses enum attributes.
`NamedEnumProp` adds a "`mnemonic` `<` $enum `>`" syntax around the
enum, replicating a common pattern seen in MLIR printers and allowing
for reduced ambiguity.
`NamedEnumPropWithAttrForm` combines both of these extensions.
(Sadly, bytecode auto-upgrade is hampered by the lack of the ability to
optionally parse an attribute.)
Depends on #132148
This reverts commit 54e70ac765 which
itself fixed an [asan
leak](https://lab.llvm.org/buildbot/#/builders/55/builds/9761) from the
original upstreaming commit. The leak was due to op allocations not
being `free`ed.
~~The necessary change was to explicitly `->destroy()` the ops at the
end of the tests. I believe this is because the rewriter used in the
tests doesn't actually insert them into a module and so without an
explicit `->destroy()` no bookkeeping process is able to take care of
them.~~
The necessary change was to use `OwningOpRef` which calls `op->erase()`
in its [own
destructor](89cfae41ec/mlir/include/mlir/IR/OwningOpRef.h (L39)).
This PR upstreams the `SMT` dialect from the CIRCT project. Here we only
check in the dialect/op/types/attributes and lit tests. Follow up PRs
will add conversions in and out and etc.
Co-authored-by: Bea Healy <beahealy22@gmail.com>
Co-authored-by: Martin Erhart <maerhart@outlook.com>
Co-authored-by: Mike Urbach <mikeurbach@gmail.com>
Co-authored-by: Will Dietz <will.dietz@sifive.com>
Co-authored-by: fzi-hielscher <hielscher@fzi.de>
Co-authored-by: Fehr Mathieu <mathieu.fehr@gmail.com>
This PR is after #135253 and #134935 to fix the error reported by
https://github.com/llvm/llvm-project/pull/135253#issuecomment-2796077024.
This PR Adds typedef declarations for `MlirLinalgContractionDimensions`
and `MlirLinalgConvolutionDimensions` in the C API to ensure
compatibility with pure C code.
I confirm that this fix resolves the reported error based on my testing.
Signed-off-by: Bangtian Liu <liubangtian@gmail.com>
Addition of `inlinehint` attributes for CallOps in MLIR in order to be
able to say to a function call that the inlining is desirable without
having the attribute on the FuncOp.
This PR is mainly about exposing the python bindings for
`linalg::isaConvolutionOpInterface` and `linalg::inferConvolutionDims`.
---------
Signed-off-by: Bangtian Liu <liubangtian@gmail.com>
This PR implements `verify-diagnostics=only-expected` which is a "best
effort" verification - i.e., `unexpected`s and `near-misses` will not be
considered failures. The purpose is to enable narrowly scoped checking
of verification remarks (just as we have for lit where only a subset of
lines get `CHECK`ed).
The LLVM dialect no longer has its own vector types. It uses
`mlir::VectorType` everywhere. Remove
`LLVM::getFixedVectorType/getScalableVectorType` and use
`VectorType::get` instead. This commit addresses a
[comment](https://github.com/llvm/llvm-project/pull/133286#discussion_r2022192500)
on the PR that deleted the LLVM vector types.
This PR is mainly about exposing the python bindings for`
linalg::isaContractionOpInterface` and` linalg::inferContractionDims`.
---------
Signed-off-by: Bangtian Liu <liubangtian@gmail.com>
The LLVM dialect no longer has its own vector types. It uses
`mlir::VectorType` everywhere. Remove `LLVM::getVectorElementType` and
use `cast<VectorType>(ty).getElementType()` instead. This commit
addresses a
[comment](https://github.com/llvm/llvm-project/pull/133286#discussion_r2022192500)
on the PR that deleted the LLVM vector types.
Also improve vector type constraints by specifying the
`mlir::VectorType` C++ class, so that explicit casts to `VectorType` can
be avoided in some places.
Replaces separate x86vector named intrinsic operations with direct calls
to LLVM intrinsic functions.
This rework reduces the number of named ops leaving only high-level MLIR
equivalents of whole intrinsic classes e.g., variants of AVX512 dot on
BF16 inputs. Dialect conversion applies LLVM intrinsic name mangling
further simplifying lowering logic.
The separate conversion step translating x86vector intrinsics into LLVM
IR is also eliminated. Instead, this step is now performed by the
existing llvm dialect infrastructure.
RFC:
https://discourse.llvm.org/t/rfc-simplify-x86-intrinsic-generation/85581
Improve error messages when parsing an incorrect type.
Before:
```
invalid kind of type specified
```
After:
```
invalid kind of type specified: expected builtin.tensor, but found 'tensor<*xi32>'
```
This error message is produced when a certain operand/result type is
expected according to an op's TableGen definition, but a different type
is parsed. Type constraints (which may have nice error messages) are
checked after parsing a type. If an incorrect type is parsed, we never
get to the point of printing type constraint error messages. This may
discourage users from specifying C++ classes with type constraints.
(Explicitly specifying C++ classes is beneficial because the
auto-generated C++ code will have richer type information; explicit
casts are unnecessary, etc.) See #134981 for an example where specifying
additional type information with type constraints (e.g.,
`LLVM_AnyVector`) lead to worse error messages.
Note: In order to generate a better error message, the parser must
retrieve a type's name from the C++ class. TableGen-generated type
classes always have a `name` field, but hand-written C++ type classes
may not. The `HasStaticName` template was copied from
`DialectImplementation.h` (`HasStaticDialectName`).
This patch revisits op verifiers for `LoopWrapperInterface` operations
to improve consistency across operations and to properly cover some
previously misreported cases.
Checks that should be done for these kinds of operations are documented
in the interface description.
Updates the description to align with the specification. Also includes
some small cleanup to `sigmoid`, to avoid confusion.
Signed-off-by: Luke Hutton <luke.hutton@arm.com>
This PR improves the SGMapAttr to enable workgroup-level programming, representing the first step in expanding the XeGPU dialect from subgroup to workgroup level, and renames it to LayoutAttr
Since #125690, the MLIR vector type supports `!llvm.ptr` as an element
type. The only remaining element type for `LLVMFixedVectorType` is now
`LLVMPPCFP128Type`.
This commit turns `LLVMPPCFP128Type` into a proper FP type (by
implementing `FloatTypeInterface`), so that the MLIR vector type accepts
it as an element type. This makes `LLVMFixedVectorType` obsolete.
`LLVMScalableVectorType` is also obsolete. This commit deletes
`LLVMFixedVectorType` and `LLVMScalableVectorType`.
Note for LLVM integration: Use `VectorType` instead of
`LLVMFixedVectorType` and `LLVMScalableVectorType`.
This commit extends the MLIR vector type to support pointer-like types
such as `!llvm.ptr` and `!ptr.ptr`, as indicated by the newly added
`VectorTypeElementInterface`. This makes the LLVM dialect closer to LLVM
IR. LLVM IR already supports pointers as vector element type.
Only integers, floats, pointers and index are valid vector element types
for now. Additional vector element types may be added in the future
after further discussions. The interface is still evolving and may
eventually turn into one of the alternatives that were discussed on the
RFC.
This commit also disallows `!llvm.ptr` as an element type of
`!llvm.vec`. This type exists due to limitations of the MLIR vector
type.
RFC:
https://discourse.llvm.org/t/rfc-allow-pointers-as-element-type-of-vector/85360
Currently if a developer uses the flag `--mlir-enable-debugger-hook` the
debugger hook is not actually enabled. It seems the DebugConfig and the
MainMLIROptConfig are not connected.
To fix this we can move the `enableDebuggerHook` CL Option to the
DebugConfigCLOptions struct so that it can get registered and enabled
along with the other debugger flags. AFAICS there are no other uses of
the flag so this should be safe.
This also adds a small LIT test to check that the hook is enabled by
checking the std::cerr output for the log message.
Defining a new `amdgpu.global_load` op, which is a thin wrap around
ROCDL `global_load_lds` intrinsic, along with its lowering logics to
`rocdl.global.load.lds`.