* Rename functions with underscores to camel case.
* Return C++ bools for "in_bounds" values instead of an `ArrayAttr`.
Differential Revision: https://reviews.llvm.org/D155277
There was a bug in `TransferWriteNonPermutationLowering`, a pattern that extends the permutation map of a TransferWriteOp with leading transfer dimensions of size one. These newly added transfer dimensions are always in-bounds, because the starting point of any dimension is in-bounds. VectorToSCF inserts out-of-bounds checks based on the "in_bounds" attribute, so dims that are marked as out-of-bounds but are actually always in-bounds lead to unnecessary "scf.if" ops.
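For illustration, a rough sketch of the kind of rewrite the pattern performs; the shapes, maps, and names below are made up for this example and are not taken from the patch:
```
func.func @before(%vec: vector<4xf32>, %mem: memref<8x16xf32>, %i: index, %j: index) {
  // The permutation map does not use d0, so the pattern adds a leading unit
  // transfer dimension.
  vector.transfer_write %vec, %mem[%i, %j]
      {permutation_map = affine_map<(d0, d1) -> (d1)>, in_bounds = [false]}
      : vector<4xf32>, memref<8x16xf32>
  return
}
func.func @after(%vec: vector<4xf32>, %mem: memref<8x16xf32>, %i: index, %j: index) {
  %v = vector.broadcast %vec : vector<4xf32> to vector<1x4xf32>
  // The newly added leading dim is always in-bounds, so it must be marked as
  // such; otherwise VectorToSCF emits an unnecessary scf.if for it.
  vector.transfer_write %v, %mem[%i, %j]
      {permutation_map = affine_map<(d0, d1) -> (d0, d1)>, in_bounds = [true, false]}
      : vector<1x4xf32>, memref<8x16xf32>
  return
}
```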
Differential Revision: https://reviews.llvm.org/D155196
It also unifies the computation of StridedLayoutAttr. If the stride is a
statically known value, we can just use it.
Differential Revision: https://reviews.llvm.org/D155017
Clarify a few diagnostics so that they are more consistent with the
corresponding condition. For example:
```
if (positionAttr.size() >
    static_cast<unsigned>(getSourceVectorType().getRank()))
```
should lead to ("no greater than"):
```
return emitOpError(
    "expected position attribute of rank no greater than vector rank");
```
as opposed to ("smaller"):
```
return emitOpError(
    "expected position attribute of rank smaller than vector rank");
```
Differential Revision: https://reviews.llvm.org/D154998
Currently the transfer splitting patterns will generate an invalid cast
when the source memref for a transfer op has a non-default memory space.
This is handled by first introducing a `memref.memory_space_cast` in
such cases.
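As a sketch of the idea (the memory space, the shapes, and the exact cast produced by the splitting pattern are assumptions for illustration, not taken from the patch), the memory space is cast explicitly before the type cast:
```
func.func @sketch(%src: memref<4x8xf32, 3>) -> memref<?x8xf32> {
  // Cast away the non-default memory space first ...
  %0 = memref.memory_space_cast %src : memref<4x8xf32, 3> to memref<4x8xf32>
  // ... so that the subsequent memref.cast is valid.
  %1 = memref.cast %0 : memref<4x8xf32> to memref<?x8xf32>
  return %1 : memref<?x8xf32>
}
```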
Differential Revision: https://reviews.llvm.org/D154515
This patch is a follow-up to the previous patch https://reviews.llvm.org/D151519.
With this patch, a vector.load op with a narrow bitwidth (e.g., i4) can be converted
to a load of a supported wider bitwidth (e.g., i8).
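A rough before/after sketch (the shapes, names, and the i8 view of the buffer are made up for illustration; the index and size computations of the actual emulation are omitted):
```
// Before: a direct load of the narrow type.
func.func @before(%src: memref<8xi4>, %c0: index) -> vector<8xi4> {
  %v = vector.load %src[%c0] : memref<8xi4>, vector<8xi4>
  return %v : vector<8xi4>
}
// After (conceptually): the same bits are loaded as i8 and bit-cast back to i4.
// %src_i8 stands for the i8 view of the buffer produced by the conversion.
func.func @after(%src_i8: memref<4xi8>, %c0: index) -> vector<8xi4> {
  %w = vector.load %src_i8[%c0] : memref<4xi8>, vector<4xi8>
  %v = vector.bitcast %w : vector<4xi8> to vector<8xi4>
  return %v : vector<8xi4>
}
```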
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D154178
Remove patterns that fold tensor subset ops into vector transfer ops from the vector dialect. These patterns already exist in the tensor dialect.
Differential Revision: https://reviews.llvm.org/D154932
`getConstantIntValue` extracts constant values from all constant-like ops, not just `arith::ConstantIndexOp`.
Differential Revision: https://reviews.llvm.org/D154356
* Add `memref::getMixedSize` (same as in the tensor dialect).
* Simplify the in-bounds check in `VectorTransferSplitRewritePatterns.cpp` and fix an off-by-one error in the static in-bounds check.
* Use `memref::DimOp` instead of `createOrFoldDimOp` when possible.
Differential Revision: https://reviews.llvm.org/D154218
At the moment, only the trailing dimensions in the vector type can be
scalable, i.e. this is supported:
vector<2x[4]xf32>
and this is not allowed:
vector<[2]x4xf32>
This patch extends the vector type so that arbitrary dimensions can be
scalable. To this end, an array of bool values is added to every vector
type to denote whether the corresponding dimensions are scalable or not.
For example, for this vector:
vector<[2]x[3]x4xf32>
the following array would be created:
{true, true, false}.
Additionally, the current syntax:
vector<[2x3]x4xf32>
is replaced with:
vector<[2]x[3]x4xf32>
This is primarily to simplify parsing (this way, the parser can easily
process one dimension at a time rather than e.g. tracking whether
"scalable block" has been entered/left).
NOTE: The `isScalableDim` parameter of `VectorType` (introduced in this
patch) makes `numScalableDims` redundant. For the time being,
`numScalableDims` is preserved to facilitate the transition between the
two parameters. `numScalableDims` will be removed in one of the
subsequent patches.
This change is a part of a larger effort to enable scalable
vectorisation in Linalg. See this RFC for more context:
* https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/
Differential Revision: https://reviews.llvm.org/D153372
The old code used to materialize constants as ops, immediately fold them into the resulting affine map, and then delete the constant ops again. Instead, directly fold the attributes into the affine map. Furthermore, all helpers now accept `OpFoldResult` instead of `Value`. This makes the code at call sites more efficient, because it is no longer necessary to materialize a `Value` just to be able to use these helper functions.
Note: The API has changed (accepts OpFoldResult instead of Value), otherwise this change is NFC.
Differential Revision: https://reviews.llvm.org/D153324
The new pattern will replace elementwise(broadcast) with
broadcast(elementwise) when safe.
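A minimal sketch of the reorder (the op, types, and names are illustrative, not taken from the patch):
```
func.func @before(%a: f32, %b: f32) -> vector<4xf32> {
  // Broadcast first, then the elementwise op on vectors.
  %b0 = vector.broadcast %a : f32 to vector<4xf32>
  %b1 = vector.broadcast %b : f32 to vector<4xf32>
  %r = arith.addf %b0, %b1 : vector<4xf32>
  return %r : vector<4xf32>
}
func.func @after(%a: f32, %b: f32) -> vector<4xf32> {
  // The elementwise op runs on the scalars and a single broadcast remains.
  %s = arith.addf %a, %b : f32
  %r = vector.broadcast %s : f32 to vector<4xf32>
  return %r : vector<4xf32>
}
```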
This change affects tests for vectorising nD-extract. In one case
("vectorize_nd_tensor_extract_with_tensor_extract") I just trimmed the
test and only preserved the key parts (scalar and contiguous load from
the original Op). We could do the same with some other tests if that
helps maintainability.
Differential Revision: https://reviews.llvm.org/D152812
For now, only elementwise operations are supported. Operations that perform any
kind of data permutation require changes in the representation of scalable
dimensions in VectorType.
Differential Revision: https://reviews.llvm.org/D152599
In the vector distribute patterns, we used to move
`vector.broadcast`s out of `vector.warp_execute_on_lane0`s
irrespective of how they were defined.
This could create broadcast operations with invalid semantics.
E.g.,
```
%r = warpop ...[32] ... -> vector<1x2xf32> {
  %val = broadcast %in : vector<64xf32> to vector<1x64xf32>
  vector.yield %val : vector<1x64xf32>
}
```
=>
```
%r = warpop ...[32] ... -> vector<64xf32> {
  vector.yield %in : vector<64xf32>
}
// Broadcasting to a narrower type!
broadcast %r : vector<64xf32> to vector<1x2xf32>
```
The root issue is that we are trying to broadcast something that is not the same
for each thread, so there is actually nothing to propagate here.
The fix checks that the broadcast we want to create actually makes sense.
Differential Revision: https://reviews.llvm.org/D152154
* Remove `transform::PatternRegistry`.
* Add a new op for each currently registered pattern set.
* Change names of vector dialect pattern selector ops, so that they are consistent with the remaining code base.
* Remove redundant `transform.vector.extract_address_computations` op.
Differential Revision: https://reviews.llvm.org/D152249
Consider a mixed-precision data type, i.e., F16 input lhs, F16 input rhs, F32 accumulation, and F32 output. This is typically written as F32 <= F16*F16 + F32.
During vectorization from linalg to vector for this mixed-precision case (F32 <= F16*F16 + F32), linalg.matmul introduces arith.extf ops on the input lhs and rhs operands.
"linalg.matmul"(%lhs, %rhs, %acc) ({
^bb0(%arg1: f16, %arg2: f16, %arg3: f32):
%lhs_f32 = "arith.extf"(%arg1) : (f16) -> f32
%rhs_f32 = "arith.extf"(%arg2) : (f16) -> f32
%mul = "arith.mulf"(%lhs_f32, %rhs_f32) : (f32, f32) -> f32
%acc = "arith.addf"(%arg3, %mul) : (f32, f32) -> f32
"linalg.yield"(%acc) : (f32) -> ()
})
There are backends that natively support mixed-precision data types and do not need the arith.extf. For example, the NVIDIA A100 GPU has mma.sync.aligned.*.f32.f16.f16.f32, which supports mixed precision. However, the presence of arith.extf in the IR introduces unnecessary casts and targets F32 Tensor Cores instead of F16 Tensor Cores on the NVIDIA backend. This patch adds a folding pattern to fold arith.extf into vector.contract.
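A rough sketch of the folding (the shapes and indexing maps below are illustrative, not taken from the patch):
```
func.func @before(%lhs: vector<4x8xf16>, %rhs: vector<8x4xf16>,
                  %acc: vector<4x4xf32>) -> vector<4x4xf32> {
  %lhs32 = arith.extf %lhs : vector<4x8xf16> to vector<4x8xf32>
  %rhs32 = arith.extf %rhs : vector<8x4xf16> to vector<8x4xf32>
  %r = vector.contract {indexing_maps = [affine_map<(m, n, k) -> (m, k)>,
                                         affine_map<(m, n, k) -> (k, n)>,
                                         affine_map<(m, n, k) -> (m, n)>],
                        iterator_types = ["parallel", "parallel", "reduction"],
                        kind = #vector.kind<add>}
        %lhs32, %rhs32, %acc : vector<4x8xf32>, vector<8x4xf32> into vector<4x4xf32>
  return %r : vector<4x4xf32>
}
func.func @after(%lhs: vector<4x8xf16>, %rhs: vector<8x4xf16>,
                 %acc: vector<4x4xf32>) -> vector<4x4xf32> {
  // The extf ops are folded away; the contract consumes the f16 operands directly.
  %r = vector.contract {indexing_maps = [affine_map<(m, n, k) -> (m, k)>,
                                         affine_map<(m, n, k) -> (k, n)>,
                                         affine_map<(m, n, k) -> (m, n)>],
                        iterator_types = ["parallel", "parallel", "reduction"],
                        kind = #vector.kind<add>}
        %lhs, %rhs, %acc : vector<4x8xf16>, vector<8x4xf16> into vector<4x4xf32>
  return %r : vector<4x4xf32>
}
```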
Differential Revision: https://reviews.llvm.org/D151918
In the vector distribute patterns, we used to move
`vector.transfer_read`s out of `vector.warp_execute_on_lane0`s
irrespective of how they were defined.
This could create transfer_read operations that read values defined
within the warpOp's body from outside of that body.
E.g.,
```
warpop {
  %defined_in_body
  %read = transfer_read %defined_in_body
  vector.yield %read
}
```
=>
```
warpop {
  %defined_in_body
  vector.yield ...
}
// %defined_in_body is referenced outside of its scope.
%read = transfer_read %defined_in_body
```
The fix consists of checking that all the values feeding the new
`transfer_read` are defined outside of the warpOp's body.
Note: We could do this check before creating any operation, but that would
mean knowing what `affine::makeComposedAffineApply` actually does. So the
current fix trades some compile time for not coupling the implementations
of this propagation and `makeComposedAffineApply`.
Differential Revision: https://reviews.llvm.org/D152149
Certain functions were declared in `VectorOps.h` instead of `VectorTransforms.h` or `VectorRewritePatterns.h`.
Differential Revision: https://reviews.llvm.org/D152146
Patterns that convert extract(transfer_read) into a scalar load were
incorrectly triggering for cases where a sub-vector instead of a scalar
was extracted.
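To illustrate the distinction (a sketch using the `vector.extract` syntax of the time; the names are made up):
```
func.func @sketch(%read: vector<2x4xf32>) -> (f32, vector<4xf32>) {
  // Scalar extract: a valid candidate for the scalar-load rewrite.
  %s = vector.extract %read[0, 1] : vector<2x4xf32>
  // Sub-vector extract: must not trigger the scalar-load rewrite.
  %v = vector.extract %read[0] : vector<2x4xf32>
  return %s, %v : f32, vector<4xf32>
}
```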
Reviewed By: nicolasvasilache, hanchung, awarzynski
Differential Revision: https://reviews.llvm.org/D151862
This PR adds support for shape casting from and to 0-D vectors.
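For example, casts like the following (illustrative types) become legal:
```
func.func @sketch(%v0: vector<f32>, %v1: vector<1xf32>) -> (vector<1xf32>, vector<f32>) {
  %a = vector.shape_cast %v0 : vector<f32> to vector<1xf32>
  %b = vector.shape_cast %v1 : vector<1xf32> to vector<f32>
  return %a, %b : vector<1xf32>, vector<f32>
}
```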
Reviewed By: nicolasvasilache, hanchung, awarzynski
Differential Revision: https://reviews.llvm.org/D151851
The `vector.extract` folding patterns do not support 0-D vectors
(actually, 0-D vector support couldn't even be implemented as a folding
pattern, as it would require replacing `vector.extract` with a
`vector.extractelement` op). This patch bails out of the folding when 0-D
vectors are found.
Reviewed By: nicolasvasilache, hanchung
Differential Revision: https://reviews.llvm.org/D151847
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call, as the latter has been decided to be more
consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this change does not currently include work to
support a functional cast/isa call for them.
Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This patch updates all remaining uses of the deprecated functionality in
mlir/. This was done with clang-tidy as described below and further
modifications to GPUBase.td and OpenMPOpsInterfaces.td.
Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
additional check:
main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
them to a pure state.
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-header-filter=mlir/ mlir/* -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
```
Differential Revision: https://reviews.llvm.org/D151542
This patch extends the transfer drop unit dim patterns to support cases where the vector shape should also be reduced
(e.g., transfer_read(memref<1x4x1xf32>, vector<1x4x1xf32>) -> transfer_read(memref<4xf32>, vector<4xf32>)).
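A conceptual sketch in IR form (the rank-reduced view of the source and the exact shape-restoring op are assumptions for illustration):
```
// Before.
func.func @before(%m: memref<1x4x1xf32>, %pad: f32) -> vector<1x4x1xf32> {
  %c0 = arith.constant 0 : index
  %r = vector.transfer_read %m[%c0, %c0, %c0], %pad {in_bounds = [true, true, true]}
       : memref<1x4x1xf32>, vector<1x4x1xf32>
  return %r : vector<1x4x1xf32>
}
// After (conceptually); %m1d stands for a rank-reduced view of %m, elided here.
func.func @after(%m1d: memref<4xf32>, %pad: f32) -> vector<1x4x1xf32> {
  %c0 = arith.constant 0 : index
  %r = vector.transfer_read %m1d[%c0], %pad {in_bounds = [true]}
       : memref<4xf32>, vector<4xf32>
  %res = vector.shape_cast %r : vector<4xf32> to vector<1x4x1xf32>
  return %res : vector<1x4x1xf32>
}
```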
Reviewed By: hanchung, pzread
Differential Revision: https://reviews.llvm.org/D151007
This patch adds support for shape casting a vector<1x1x1...1xElementType> to
a vector<ElementType> and the other way around.
Differential Revision: https://reviews.llvm.org/D151169
This patch extends the vector.extract(vector.transfer_read) -> scalar
load patterns to support vector.transfer_read with multiple uses. For
now, we check that all the uses are vector.extract operations.
Supporting multiple uses is gated behind a flag.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D150812
These patterns touch the structure generated from tiling, so they
affect later steps like bufferization and vector hoisting.
Instead of putting them in canonicalization, this commit creates
separate entry points so that they can be called explicitly.
This is NFC regarding the functionality and tests of those patterns.
It also addresses two TODO items in the codebase.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D150702
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call, as the latter has been decided to be more
consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this change does not currently include work to
support a functional cast/isa call for them.
Context:
* https://mlir.llvm.org/deprecation/ at "Use the free function variants for dyn_cast/cast/isa/…"
* Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This follows a previous patch that updated calls
`op.cast<T>()` -> `cast<T>(op)`. However, some cases could not handle an
unprefixed `cast` call due to occurrences of variables named `cast`, or
because the call occurred inside a class definition where it would resolve
to the method.
All C++ files that did not work automatically with `cast<T>()` are
updated here to `llvm::cast` and similar with the intention that they
can be easily updated after the methods are removed through a
find-replace.
See https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
for the clang-tidy check that is used; printed occurrences of the function
were then updated to include an `llvm::` prefix.
One can then run the following:
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-export-fixes /tmp/cast/casts.yaml mlir/*\
-header-filter=mlir/ -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
```
Differential Revision: https://reviews.llvm.org/D150348
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call, as the latter has been decided to be more
consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this change does not currently include work to
support a functional cast/isa call for them.
Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.
Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.
Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
additional check:
https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
them to a pure state.
4. Some changes have been deleted for the following reasons:
- Some files had a variable also named cast
- Some files had not included a header file that defines the cast
functions
- Some files are definitions of the classes that have the casting
methods, so the code would still refer to the method instead of the
function unless a prefix is added or the method declaration is removed
at the same time.
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-header-filter=mlir/ mlir/* -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
mlir/lib/**/IR/\
mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
mlir/test/lib/Dialect/Test/TestTypes.cpp\
mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
mlir/test/lib/Dialect/Test/TestAttributes.cpp\
mlir/unittests/TableGen/EnumsGenTest.cpp\
mlir/test/python/lib/PythonTestCAPI.cpp\
mlir/include/mlir/IR/
```
Differential Revision: https://reviews.llvm.org/D150123
The existing vector.transpose lowering pattern only triggers if the
input vector is 2-D. The revision extends the pattern to handle n-D
vectors which are effectively 2-D vectors (e.g., vector<1x4x1x8x1xf32>).
It refactors a common check about 2-D vectors from X86Vector
lowering to VectorUtils.h so it can be reused by both sides.
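For example, a transpose like the following (illustrative shapes) is effectively 2-D, since only the 4 and 8 dims are non-unit, and can now reuse the 2-D lowering:
```
func.func @sketch(%v: vector<1x4x1x8x1xf32>) -> vector<1x8x1x4x1xf32> {
  // Only the two non-unit dims (4 and 8) are actually swapped.
  %t = vector.transpose %v, [0, 3, 2, 1, 4]
       : vector<1x4x1x8x1xf32> to vector<1x8x1x4x1xf32>
  return %t : vector<1x8x1x4x1xf32>
}
```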
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D149908
This new feature enables dedicating custom storage inline within operations.
This storage can be used as an alternative to attributes to store data that is
specific to an operation. Attributes can also be stored inside the properties
storage if desired, but any other kind of data can be present as well. This offers
a way to store and mutate data without uniquing in the Context the way Attributes
require.
See the OpPropertiesTest.cpp for an example where a struct with a
std::vector<> is attached to an operation and mutated in-place:
struct TestProperties {
  int a = -1;
  float b = -1.;
  std::vector<int64_t> array = {-33};
};
More complex schemes (including reference counting) are also possible.
The only constraint to enable storing a C++ object as "properties" on an
operation is to implement three functions:
- convert from the candidate object to an Attribute
- convert from the Attribute to the candidate object
- hash the object
Optionally, the parsing and printing can also be customized with two extra
functions.
A new option is introduced to ODS to allow dialects to specify:
let usePropertiesForAttributes = 1;
When set to true, the inherent attributes for all the ops in this dialect
will use properties instead of being stored alongside discardable
attributes.
The TestDialect showcases this feature.
Another change is that we introduce new APIs on the Operation class
to access the inherent attributes separately from the discardable ones.
We envision deprecating and removing `getAttr()`, `getAttrsDictionary()`,
and other similar methods which don't make the distinction explicit, leading
to an entirely separate namespace for discardable attributes.
Recommit d572cd1b06 after fixing python bindings build.
Differential Revision: https://reviews.llvm.org/D141742