This patch adds `convertInstruction` and `getSupportedInstructions`
to `LLVMImportInterface`, allowing any non-LLVM dialect to specify how
to import LLVM IR instructions and to override the default import of LLVM instructions.
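A minimal sketch of what a dialect-side implementation might look like, assuming the new hooks parallel the existing `convertIntrinsic`/`getSupportedIntrinsics` hooks on `LLVMImportDialectInterface`; the exact signatures, the `MyDialectLLVMImportInterface` name, and the choice of `llvm::Instruction::Load` are illustrative assumptions, not the actual API:
```cpp
#include "mlir/Target/LLVMIR/LLVMImportInterface.h"
#include "llvm/IR/Instruction.h"

using namespace mlir;

struct MyDialectLLVMImportInterface : public LLVMImportDialectInterface {
  using LLVMImportDialectInterface::LLVMImportDialectInterface;

  // Called for every LLVM instruction whose opcode is listed in
  // getSupportedInstructions(), instead of the default LLVM dialect import.
  LogicalResult convertInstruction(OpBuilder &builder, llvm::Instruction *inst,
                                   ArrayRef<llvm::Value *> llvmOperands,
                                   LLVM::ModuleImport &moduleImport) const override {
    // Build the replacement operation(s) for `inst` with `builder` and map
    // the result value(s) through `moduleImport` here.
    return failure();
  }

  // Opcodes this dialect wants to import itself.
  ArrayRef<unsigned> getSupportedInstructions() const override {
    static const unsigned supported[] = {llvm::Instruction::Load};
    return supported;
  }
};
```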
Emitting trivial getters that amount to `(*this)->getOperand(1)` or
`getProperties().foo` out-of-line is a pretty significant performance
hit (3-4x) on these basic MLIR APIs for manipulating ops. Emit them
inline instead (without adding additional dependencies to header files).
Tablegen generates uninitialized arrays of size 1 for raw operands and
types. In the current state this causes static analysis warnings about
"uninitialized fixed-size arrays", as their initialization is separated from
their declaration. Since these are single-entry arrays, we can just use a
plain variable instead of an array here.
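For illustration, the change in the generated code looks roughly like the following (the function and identifier names are hypothetical, not the exact ones tablegen emits):
```cpp
#include "mlir/IR/Builders.h"
#include "mlir/IR/OperationSupport.h"

// Before: a fixed-size array of one element whose initialization is separate
// from its declaration, which static analyzers flag as uninitialized.
void buildBefore(mlir::OpBuilder &odsBuilder, mlir::OperationState &odsState) {
  ::mlir::Type odsBuildableTypes[1];
  odsBuildableTypes[0] = odsBuilder.getIntegerType(32);
  odsState.addTypes(odsBuildableTypes);
}

// After: a plain variable, since only a single entry is ever needed.
void buildAfter(mlir::OpBuilder &odsBuilder, mlir::OperationState &odsState) {
  ::mlir::Type odsBuildableType = odsBuilder.getIntegerType(32);
  odsState.addTypes(odsBuildableType);
}
```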
Co-authored-by: Orest Chura <orest.chura@intel.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
ODS was still generating the old `Operation::setAttr` hooks in the ODS
methods for setting attributes, even after the backing implementation of the
attributes was changed to properties. No idea how this wasn't noticed
until now.
Operations must be created with the supplied builder. Otherwise, the
dialect conversion / greedy pattern rewrite driver can break.
This commit fixes a crash in the dialect conversion:
```
within split at llvm-project/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-invalid.mlir:1 offset :8:8: error: failed to legalize operation 'tosa.add'
%0 = tosa.add %1, %arg2 : (tensor<10x10xf32>, tensor<*xf32>) -> tensor<*xf32>
^
within split at llvm-project/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-invalid.mlir:1 offset :8:8: note: see current operation: %9 = "tosa.add"(%8, %arg2) : (tensor<10x10xf32>, tensor<*xf32>) -> tensor<*xf32>
mlir-opt: llvm-project/mlir/include/mlir/IR/UseDefLists.h:198: mlir::IRObjectWithUseList<mlir::OpOperand>::~IRObjectWithUseList() [OperandType = mlir::OpOperand]: Assertion `use_empty() && "Cannot destroy a value that still has uses!"' failed.
```
This commit is the proper fix for #87297 (which was reverted).
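As a reminder of the rule, here is a minimal sketch of a rewrite pattern that creates its replacement ops through the supplied rewriter; the pattern itself (rewriting `subf(a, b)` into `addf(a, negf(b))`) is hypothetical and not part of this change:
```cpp
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// All new ops go through the PatternRewriter handed to the pattern, so the
// dialect conversion / greedy driver can track them; creating them with a
// separately constructed OpBuilder would bypass that tracking.
struct SubToAddNeg : public OpRewritePattern<arith::SubFOp> {
  using OpRewritePattern::OpRewritePattern;

  LogicalResult matchAndRewrite(arith::SubFOp op,
                                PatternRewriter &rewriter) const override {
    Value negRhs = rewriter.create<arith::NegFOp>(op.getLoc(), op.getRhs());
    Value repl =
        rewriter.create<arith::AddFOp>(op.getLoc(), op.getLhs(), negRhs);
    rewriter.replaceOp(op, repl);
    return success();
  }
};
```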
The composite pass allows running a sequence of passes in a loop until a fixed
point or a maximum number of iterations is reached. The usual candidates
are canonicalize + CSE, as canonicalize can open up more opportunities for CSE
and vice versa.
Add an operation mapping to the LLVM
`llvm.experimental.constrained.fptrunc.*` intrinsic.
The new operation implements the new
`LLVM::FPExceptionBehaviorOpInterface` and
`LLVM::RoundingModeOpInterface` interfaces.
---------
Signed-off-by: Victor Perez <victor.perez@codeplay.com>
Using `range.size()` "as is" means we accumulate 'size_t' values into an
'int32_t' variable. This may produce narrowing-conversion warnings
(particularly on MSVC). The surrounding code already casts `<x>.size()`
to 'int32_t', so following this practice seems safe enough.
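A minimal sketch of the pattern in isolation (the container and names are illustrative, not the actual code):
```cpp
#include <cstdint>
#include <vector>

int32_t accumulateSizes(const std::vector<std::vector<int64_t>> &ranges) {
  int32_t total = 0;
  for (const auto &range : ranges) {
    // range.size() is a size_t; cast explicitly to avoid the narrowing
    // conversion warning when accumulating into an int32_t (notably on MSVC).
    total += static_cast<int32_t>(range.size());
  }
  return total;
}
```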
Co-authored-by: Ovidiu Pintican <ovidiu.pintican@intel.com>
MSVC fails to parse this construct, leading to:
MlirTranslateMain.cpp(70): error C2065: 'inputSplitMarker': undeclared identifier
Just switching to brace initialization works around the issue.
This allows defining custom split markers, which is interesting for
non-MLIR inputs and outputs to `mlir-translate`. For example, one may
use `; -----` as a split marker for `.ll` files. The markers are now passed
as arguments into `splitAndProcessBuffer`: the input split marker defaults
to the previous default (`// -----`) and the output split marker defaults
to the empty string, which also corresponds to the previous default.

The behavior of the input split marker should not change at all; however,
outputs now have one newline *more* than before if there is no marker
(old: `insertMarkerInOutput = false`, new: `outputSplitMarker = ""`) and
one newline *less* if there is one.

The value of the input split marker is exposed as a command-line option of
`mlir-translate` and other tools, as an optional value to the previously
existing flag `-split-input-file`, which defaults to the default marker if
not specified. The value of the output split marker is exposed with the new
`-output-split-marker`, which defaults to the empty string in `mlir-translate`
and to the default marker in the other tools. In short, the previous usage or
omission of the flags should result in the previous behavior (modulo the
newlines mentioned above).
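For illustration only, here is a self-contained sketch of marker-based splitting of a buffer; this is not the `splitAndProcessBuffer` implementation, and the input is a made-up `.ll`-style snippet:
```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/raw_ostream.h"

int main() {
  // Two chunks of an .ll-style input, separated by a custom input split marker.
  llvm::StringRef input = "define void @a() { ret void }\n"
                          "; -----\n"
                          "define void @b() { ret void }\n";
  llvm::SmallVector<llvm::StringRef> chunks;
  input.split(chunks, "; -----");

  // Re-emit the chunks, joined by an output split marker.
  for (size_t i = 0, e = chunks.size(); i != e; ++i) {
    if (i != 0)
      llvm::outs() << "// -----\n";
    llvm::outs() << chunks[i].trim() << "\n";
  }
  return 0;
}
```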
We need to generate `.has_value()` for `OptionalParseResult`, and also ensure
that `auto result` doesn't conflict with `result`, which is the variable
name for the `OperationState`.
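A hedged sketch of the shape of the generated parsing code (the function and identifier names are illustrative, not the exact ones ODS emits):
```cpp
#include "mlir/IR/OpImplementation.h"

static ::mlir::ParseResult parseExampleOp(::mlir::OpAsmParser &parser,
                                          ::mlir::OperationState &result) {
  ::mlir::Attribute valueAttr;
  ::mlir::OptionalParseResult parseResult =
      parser.parseOptionalAttribute(valueAttr);
  // OptionalParseResult is checked with .has_value(); the local is named
  // `parseResult` so it does not shadow `result`, the OperationState variable.
  if (parseResult.has_value()) {
    if (failed(*parseResult))
      return ::mlir::failure();
    result.addAttribute("value", valueAttr);
  }
  return ::mlir::success();
}
```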
This patch adds the `LowerVectorToArmNeonPattern` patterns to the
ArmNeon dialect.
This pattern inspects `vector.contract` ops that can be 1-1 mapped to an
`arm.neon.smmla` intrinsic. The contract ops must be separated into
tiles whose inputs fit those of a single smmla op (`2x8xi32` inputs
and `2x2xi32` output). The `vector.contract` inputs must be sign-extended
from narrow types (<=i8) to be converted. If all conditions are
met, an smmla op is inserted with additional `vector.shape_cast`s to
handle linearizing the input and output dimensions.
This is another follow-up to #83004, which made the same change for
`MLIR_CUDA_CONVERSIONS_ENABLED`. Like the previous PR, this commit
exposes the mentioned CMake variable through `mlir-config.h` and uses the
macro that is introduced with the same name. This replaces the macro
`MLIR_ENABLE_DEPRECATED_GPU_SERIALIZATION`, which the CMake files
previously defined manually.
When a variadic argument is expected but not provided, compilation
fails later with a difficult-to-follow error. Add a simple
check to catch one such case.
This is not yet fully general, as it doesn't yet check leaf nodes.
Define all special member functions for mlir::Pass, mlir::OperationPass,
mlir::PassWrapper and PassGen types, since these classes explicitly
specify a copy-ctor. This, subsequently, should silence static analysis
checkers that report rule-of-3 / rule-of-5 violations.
Given the nature of these types, however, mark the other special member
functions deleted: the semantics of a Pass-type object seem to be that
it is only ever created wrapped in a smart pointer, so the
special member functions are never to be used externally (except for the
copy-ctor, which is "special" since it is a "delegating" ctor for derived
pass types to use during cloning - see https://reviews.llvm.org/D104302
for details).
Deleting the other member functions means that `Pass x(std::move(y))`, which
used to silently work (via the copy-ctor), would now fail to compile. Yet,
as the copy-ctors throughout the hierarchy are under 'protected' access,
the issue is unlikely to appear in practice.
Co-authored-by: Asya Pronina <anastasiya.pronina@intel.com>
Co-authored-by: Harald Rotuna <harald.razvan.rotuna@intel.com>
`isContiguousAccess` is an important affine analysis utility but is only
tested very indirectly via passes like vectorization, and it is not exposed.
Expose it and add a test pass for it that will make it easier/feasible to
write test cases. This is especially needed since the utility can be
significantly enhanced in power, and we need a test pass to exercise it
directly.
In the future, this pass can also be used to test utilities for invariant
accesses.
The ArmSME compilation pipeline has evolved significantly and is now
sufficiently complex that it warrants a proper lowering pipeline
encapsulating the various passes and orderings. Currently the
pipeline is loosely defined in our integration tests, but these have
diverged and are not using the same passes or ordering everywhere.
This patch introduces a test-lower-to-arm-sme pipeline mirroring
test-lower-to-llvm that provides some sanity when running e2e examples
and can be used as a reference for targeting ArmSME in MLIR.
All the integration tests are updated to use this pipeline. The
intention is to productize the pipeline once it becomes more mature.
_SubClassValueT is only useful when it has more than one usage in a
signature. This was not true for the signatures produced by tblgen.
For example:
def call(result, callee, operands_, *, loc=None, ip=None) -> _SubClassValueT:
    ...
Here a type checker does not have enough information to infer a type
argument for _SubClassValueT, and thus effectively treats it as Any.
This is a new ODS feature that allows dialects to define a list of
key/value pairs, each representing an attribute type and a name.
This will generate helper classes on the dialect to be able to
manage discardable attributes on operations in a type-safe way.
For example, the `test` dialect can define:
```
let discardableAttrs = (ins
"mlir::IntegerAttr":$discardable_attr_key,
);
```
And the following will be generated in the TestDialect class:
```
/// Helper to manage the discardable attribute `discardable_attr_key`.
class DiscardableAttrKeyAttrHelper {
  ::mlir::StringAttr name;

public:
  static constexpr ::llvm::StringLiteral getNameStr() {
    return "test.discardable_attr_key";
  }
  constexpr ::mlir::StringAttr getName() {
    return name;
  }
  DiscardableAttrKeyAttrHelper(::mlir::MLIRContext *ctx)
      : name(::mlir::StringAttr::get(ctx, getNameStr())) {}

  mlir::IntegerAttr getAttr(::mlir::Operation *op) {
    return op->getAttrOfType<mlir::IntegerAttr>(name);
  }
  void setAttr(::mlir::Operation *op, mlir::IntegerAttr val) {
    op->setAttr(name, val);
  }
  bool isAttrPresent(::mlir::Operation *op) {
    return op->hasAttrOfType<mlir::IntegerAttr>(name);
  }
  void removeAttr(::mlir::Operation *op) {
    assert(op->hasAttrOfType<mlir::IntegerAttr>(name));
    op->removeAttr(name);
  }
};

DiscardableAttrKeyAttrHelper getDiscardableAttrKeyAttrHelper() {
  return discardableAttrKeyAttrName;
}
```
User code that has an instance of the TestDialect can then manipulate this
attribute on operations using:
```
auto helper = testDialect.getDiscardableAttrKeyAttrHelper();
helper.setAttr(op, value);
helper.isAttrPresent(op);
...
```
This is to avoid warnings when invoked from the flang documentation
generation build.
The warning can be seen in the CI
(https://lab.llvm.org/buildbot/#/builders/89/builds/57451).
```
/home/buildbot/as-worker-4/publish-sphinx-docs/llvm-project/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp:2956:38: warning: unused variable ‘operand’ [-Wunused-variable]
```
This op is the inverse of all-gather. It is useful to have an explicit
concise representation instead of having a blob of slicing logic.
Add lowering for the op that slices from the tensor based on the
in-group process index.
Make resharding generate an all-slice instead of inserting the slicing
logic directly.
This removes 3 dead function declarations from mlir-opt:
- registerTestLowerToNVVM - recently removed in #75775 when NVVM was
productized.
- registerTestPreparationPassWithAllowedMemrefResults - removed in
D90778 (f7bc568266).
- registerTestGenericIRVisitorsInterruptPass - added in D116230
(8067ced144), but the pass never existed; it is registered by
registerTestGenericIRVisitorsPass.
Currently ODS generates `populateDefaultAttrs` or
`populateDefaultProperties` based on whether the dialect opted into
usePropertiesForAttributes. But since individual ops might opt into
using properties (as long as they have at least one property), it should
actually just check whether the op itself uses properties. Otherwise
`populateDefaultAttrs` will overwrite existing attrs inside properties
when creating an op. Understandably this becomes moot once everything
switches over to using properties, but this fixes it for now.
This PR makes ODS generate `populateDefaultProperties` as long as the op
itself uses properties.
The generated code for dependent dialects is awkwardly formatted, making
the code harder to read. This change reformats the whitespace to align
code in its context and avoid unnecessary empty lines.
Also included are some typo fixes.
Below are examples of the codegen for a dialect before and after the
change.
Before:
```
GPUDialect::GPUDialect(::mlir::MLIRContext *context)
: ::mlir::Dialect(getDialectNamespace(), context, ::mlir::TypeID::get<GPUDialect>()) {
getContext()->loadDialect<arith::ArithDialect>();
initialize();
}
```
After:
```
GPUDialect::GPUDialect(::mlir::MLIRContext *context)
: ::mlir::Dialect(getDialectNamespace(), context, ::mlir::TypeID::get<GPUDialect>()) {
getContext()->loadDialect<arith::ArithDialect>();
initialize();
}
```
Below are examples of the codegen for a pass before and after the
change.
Before:
```
/// Return the dialect that must be loaded in the context before this pass.
void getDependentDialects(::mlir::DialectRegistry &registry) const override {
registry.insert<func::FuncDialect>();
registry.insert<tensor::TensorDialect>();
registry.insert<tosa::TosaDialect>();
}
```
After:
```
/// Register the dialects that must be loaded in the context before this pass.
void getDependentDialects(::mlir::DialectRegistry &registry) const override {
registry.insert<func::FuncDialect>();
registry.insert<tensor::TensorDialect>();
registry.insert<tosa::TosaDialect>();
}
```
Rename interface functions as follows:
* `hasTensorSemantics` -> `hasPureTensorSemantics`
* `hasBufferSemantics` -> `hasPureBufferSemantics`
These two functions return "true" if the op has tensor/buffer operands
but no buffer/tensor operands, respectively.
Also drop the "ranked" part from the interface, i.e., do not distinguish
between ranked/unranked types.
The new function names describe the functions more accurately. They also
align their semantics with the notion of "tensor semantics" in the
bufferization framework. (An op is supposed to be bufferized if it has
tensor operands, and we don't care if it also has memref operands.)
This change is in preparation for #75273, which adds
`BufferizableOpInterface::hasTensorSemantics`. By renaming the functions
in the `DestinationStyleOpInterface`, we can avoid name clashes between
the two interfaces.
In the process, a couple of test transform dialect ops are added just
for testing. These operations are not intended to be used as fully
fleshed-out transformation ops, but are rather operations added for testing.
A separate operation is added to `LinalgTransformOps.td` to convert a
`TilingInterface` operation to loops using the
`generateScalarImplementation` method implemented by the
operation. Eventually this and other operations related to tiling
using the `TilingInterface` need to move to a better place (i.e., out
of the `Linalg` dialect).
* Rename mesh.process_index -> mesh.process_multi_index.
* Add mesh.process_linear_index op.
* Add lowering of mesh.process_multi_index into an expression using
mesh.process_linear_index, mesh.cluster_shape and
affine.delinearize_index.
This is useful for lowering mesh ops and preparing them for further lowering
where the runtime may have only the linear index of a device/process.
For example, in MPI we have a rank (a linear index) within a communicator.
According to
https://mlir.llvm.org/docs/DefiningDialects/Operations/#custom-directives,
custom directives support attr-dict:
> attr-dict Directive: NamedAttrList &
But they don't support prop-dict, which was introduced into MLIR recently.
It is useful to have tblgen support prop-dict like attr-dict, so this PR
enables tblgen to support prop-dict. Previously, using prop-dict in a custom
directive was rejected with:
```bash
error: only variables and types may be used as parameters to a custom directive
... custom<Print>(prop-dict)
```
Co-authored-by: Fung Xie <ftse@nvidia.com>
Support WalkResult for the `AffineExpr` walk and support interrupting walks
along the lines of `Operation::walk`. This allows interrupting a walk when a
condition is met. Also, switch from std::function to llvm::function_ref
for the walk function.
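A minimal sketch of an interruptible walk, assuming the new overload mirrors `Operation::walk` in returning a `WalkResult`; the helper and its use of `AffineExprKind::Mod` are illustrative, not from the patch:
```cpp
#include "mlir/IR/AffineExpr.h"
#include "mlir/IR/Visitors.h"

using namespace mlir;

// Returns true as soon as the expression tree contains a modulo, interrupting
// the walk instead of visiting the remaining sub-expressions.
static bool containsMod(AffineExpr expr) {
  WalkResult result = expr.walk([](AffineExpr e) {
    if (e.getKind() == AffineExprKind::Mod)
      return WalkResult::interrupt();
    return WalkResult::advance();
  });
  return result.wasInterrupted();
}
```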
Make it so that PDL in pattern rewrites can be optionally disabled.
PDL is still enabled by default and is not optional in Bazel, so this should
be a NOP for most folks while enabling others to disable it.
This only works with tests disabled. With tests enabled this still
compiles, but tests fail as there is no lit config yet to disable tests that
depend on PDL rewrites.
Make it so that PDL in pattern rewrites can be optionally disabled.
PDL is still enabled by default and is not optional in Bazel, so this should
be a NOP for most folks while enabling others to disable it.
This is piped through the mlir-tblgen invocation and that could be
changed/avoided by splitting up the passes file instead.
This only works with tests disabled. With tests enabled this still
compiles, but tests fail as there is no lit config yet to disable tests that
depend on PDL rewrites.
There is a bit of an issue with how `mlir-linalg-ods-yaml-gen` is
classified in the MLIR build. Because it is a tool, it is excluded
from the install when using `-DLLVM_BUILD_TOOLS=OFF`. However, it is a
necessary component of the build, so excluding it can cause build issues for
users of the installed LLVM, and I think it should not be excluded.
It is a tablegen-like tool, so my solution is to reclassify it that way
in the build.
The new patterns break down subgroup reduce ops with vector values into
a sequence of subgroup reductions that fit the native shuffle size. The
maximum/native shuffle size is parametrized.
The overall goal is to be able to perform multi-element reductions with
a sequence of `gpu.shuffle` ops.
The `test-lower-to-nvvm` pipeline serves as the common and proper
pipeline for nvvm+host compilation, and it's used across our CUDA
integration tests.
This PR renames the `test-lower-to-nvvm` pipeline to `gpu-lower-to-nvvm`
and moves it into `InitAllPasses.h`. The aim is to be able to call it from
Python and to have a standardized compilation process for NVVM.