[mlir][linalg][transform] Rename {masked_vectorize => vectorize => vectorize_children_and...}. (#66575)
This PR renames the vectorization transform ops as follows:

* `structured.masked_vectorize` => `structured.vectorize`. This reflects the fact that since [recently](https://reviews.llvm.org/D157774) the op can also handle the unmasked case.
* `structured.vectorize` => `structured.vectorize_children_and_apply_patterns`. This reflects the fact that the op does not just vectorize the given payload op but all vectorizable children contained in it, and applies patterns before and after for preparation and clean-up.

This rename was first discussed [here](https://reviews.llvm.org/D157774). The PR also adapts and cleans up the tablegen description of the `VectorizeChildrenAndApplyPatternsOp` (formerly `VectorizeOp`).
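At the level of transform scripts, the rename amounts to the following spelling change (a sketch distilled from the test updates below; `%target` and `%parent` are placeholder handles):

```mlir
// Before this patch:
transform.structured.masked_vectorize %target vector_sizes [1, 4] : !transform.any_op
%v = transform.structured.vectorize %parent : (!transform.any_op) -> !transform.any_op

// After this patch:
transform.structured.vectorize %target vector_sizes [1, 4] : !transform.any_op
%v = transform.structured.vectorize_children_and_apply_patterns %parent : (!transform.any_op) -> !transform.any_op
```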
@@ -1947,37 +1947,39 @@ def TileToForallOp :
 }

 //===----------------------------------------------------------------------===//
-// VectorizeOp
+// VectorizeChildrenAndApplyPatternsOp
 //===----------------------------------------------------------------------===//

-def VectorizeOp : Op<Transform_Dialect, "structured.vectorize",
+def VectorizeChildrenAndApplyPatternsOp :
+    Op<Transform_Dialect, "structured.vectorize_children_and_apply_patterns",
     [FunctionalStyleTransformOpTrait, MemoryEffectsOpInterface,
      TransformEachOpTrait, TransformOpInterface,
      ReportTrackingListenerFailuresOpTrait]> {
   let description = [{
-    Indicates that the given `target` op all the ops it contains should be
-    vectorized with the configuration specified by the attributes of this op.
-    This vectorization only handles structured ops that operate on shaped types
-    and does not vectorize loops or straight-line. Internally, it applies a
-    set of rewrite patterns, some of which enable vectorization and some of
-    which clean up the results. Therefore, it can only be applied to an op with
-    the "isolated from above property". If finer granularity is required, it can
-    be achieved by outlining the target part of the payload IR into, e.g., a
-    function, performing the transformation, and inlining it back. This
-    transformation only fails if the entire pattern rewriting failed, i.e., it
-    does **not** fail when no ops were vectorized.
+    Vectorizes all children contained in the given `target` using the
+    configuration specified by the attributes of this op. This only vectorizes
+    structured ops that operate on shaped types and does not vectorize loops or
+    straight-line. Internally, it applies a set of rewrite patterns, some of
+    which enable vectorization and some of which clean up the results.
+    Therefore, it can only be applied to an op with the "isolated from above"
+    property. This transformation only fails if the entire pattern rewriting
+    failed, i.e., it does **not** fail when no ops were vectorized.

-    Note that this transformation is invalidating the handles to any payload IR
+    Finer granularity can be achieved either with the `VectorizeOp` for
+    individual ops or by outlining the target part of the payload IR into, e.g.,
+    a function, performing this transformation, and inlining it back.
+
+    Note that this transformation invalidates the handles to any payload IR
     operation that is contained inside the vectorization target.

     This transformation supports the following attributes:
-    - `vectorize_padding`: a UnitAttr to activate the vectorization of
+    - `vectorize_padding`: a `UnitAttr` to activate the vectorization of
       `tensor.pad` ops. Different pipelines may prefer to lower such ops to
       loops.
-    - `disable_multi_reduction_to_contract_patterns`: a UnitAttr to deactivate
+    - `disable_multi_reduction_to_contract_patterns`: a `UnitAttr` to deactivate
       the rewrite of `vector.multi_reduction` to `vector.contract`. This is
       intended to be used in tests only.
-    - `disable_transfer_permutation_map_lowering_patterns`: a UnitAttr to
+    - `disable_transfer_permutation_map_lowering_patterns`: a `UnitAttr` to
       deactivate the rewrite of `vector.transfer` with permutation maps into
       explicit `vector.transpose` operations. This is intended to be used in
       tests only but may be promoted to a first class attribute in the future.
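For illustration, a minimal sketch of the renamed op with one of the documented attributes enabled (syntax as in the updated tests later in this diff; `%1` is assumed to be an isolated-from-above parent such as a `func.func`):

```mlir
%2 = transform.structured.vectorize_children_and_apply_patterns %1 {vectorize_padding}
    : (!transform.any_op) -> !transform.any_op
```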
@@ -2015,7 +2017,7 @@ def VectorizeOp : Op<Transform_Dialect, "structured.vectorize",
|
||||
}];
|
||||
}
|
||||
|
||||
def MaskedVectorizeOp : Op<Transform_Dialect, "structured.masked_vectorize",
|
||||
def VectorizeOp : Op<Transform_Dialect, "structured.vectorize",
|
||||
[DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
|
||||
TransformOpInterface, ReportTrackingListenerFailuresOpTrait]> {
|
||||
let description = [{
|
||||
@@ -2029,9 +2031,9 @@ def MaskedVectorizeOp : Op<Transform_Dialect, "structured.masked_vectorize",
|
||||
|
||||
```mlir
|
||||
# Masked vectorization - vector sizes are specified explicitly
|
||||
transform.structured.masked_vectorize %target vector_sizes [1, 4] : !transform.any_op
|
||||
transform.structured.vectorize %target vector_sizes [1, 4] : !transform.any_op
|
||||
# Regular vectorization - vector sizes are inferred from the target Op
|
||||
transform.structured.masked_vectorize %target : !transform.any_op
|
||||
transform.structured.vectorize %target : !transform.any_op
|
||||
```
|
||||
|
||||
The vector sizes can be either static or dynamic (SSA values). In case of
|
||||
|
||||
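Besides the static sizes shown above, the updated tests later in this diff also spell scalable vector sizes with bracketed entries, e.g. (a sketch reusing that syntax):

```mlir
// The second size, [32], is scalable (vector-length agnostic); 8 is fixed.
transform.structured.vectorize %0 vector_sizes [8, [32]] : !transform.any_op
```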
@@ -2904,27 +2904,31 @@ LogicalResult TileToForallOp::verify() {
 }

 //===----------------------------------------------------------------------===//
-// VectorizeOp
+// VectorizeChildrenAndApplyPatternsOp
 //===----------------------------------------------------------------------===//

-void transform::VectorizeOp::build(OpBuilder &builder, OperationState &result,
-                                   Value target, bool vectorizePadding,
-                                   bool vectorizeExtract) {
+void transform::VectorizeChildrenAndApplyPatternsOp::build(
+    OpBuilder &builder, OperationState &result, Value target,
+    bool vectorizePadding, bool vectorizeExtract) {
   result.addOperands(target);
   if (vectorizePadding) {
-    result.addAttribute(VectorizeOp::getVectorizePaddingAttrName(result.name),
-                        builder.getUnitAttr());
+    result.addAttribute(
+        VectorizeChildrenAndApplyPatternsOp::getVectorizePaddingAttrName(
+            result.name),
+        builder.getUnitAttr());
   }
   if (vectorizeExtract) {
-    result.addAttribute(VectorizeOp::getVectorizeNdExtractAttrName(result.name),
-                        builder.getUnitAttr());
+    result.addAttribute(
+        VectorizeChildrenAndApplyPatternsOp::getVectorizeNdExtractAttrName(
+            result.name),
+        builder.getUnitAttr());
   }
   result.addTypes(transform::AnyOpType::get(builder.getContext()));
 }

 namespace {
 /// This is an helper only to call vectorize via a pattern inside of
-/// VectorizeOp::applyToOne.
+/// VectorizeChildrenAndApplyPatternsOp::applyToOne.
 struct VectorizationPattern : public RewritePattern {
   explicit VectorizationPattern(MLIRContext *context,
                                 bool vectorizeExtract = false)
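The two boolean builder flags above surface as unit attributes on the built op; in transform IR the result looks like this sketch (the attribute spellings `vectorize_padding` and `vectorize_nd_extract` are taken from the tests in this diff; combining both on one op is an assumption):

```mlir
%2 = transform.structured.vectorize_children_and_apply_patterns %1
    {vectorize_padding, vectorize_nd_extract} : (!transform.any_op) -> !transform.any_op
```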
@@ -2947,10 +2951,10 @@ private:
 } // namespace

 DiagnosedSilenceableFailure
-transform::VectorizeOp::applyToOne(transform::TransformRewriter &rewriter,
-                                   Operation *target,
-                                   transform::ApplyToEachResultList &results,
-                                   transform::TransformState &state) {
+transform::VectorizeChildrenAndApplyPatternsOp::applyToOne(
+    transform::TransformRewriter &rewriter, Operation *target,
+    transform::ApplyToEachResultList &results,
+    transform::TransformState &state) {
   if (!target->hasTrait<OpTrait::IsIsolatedFromAbove>()) {
     auto diag = this->emitOpError("requires isolated-from-above targets");
     diag.attachNote(target->getLoc()) << "non-isolated target";
@@ -2992,9 +2996,9 @@ transform::VectorizeOp::applyToOne(transform::TransformRewriter &rewriter,
 }

 //===----------------------------------------------------------------------===//
-// MaskedVectorizeOp
+// VectorizeOp
 //===----------------------------------------------------------------------===//
-DiagnosedSilenceableFailure transform::MaskedVectorizeOp::apply(
+DiagnosedSilenceableFailure transform::VectorizeOp::apply(
     transform::TransformRewriter &rewriter,
     mlir::transform::TransformResults &transformResults,
     mlir::transform::TransformState &state) {
@@ -3058,19 +3062,19 @@ DiagnosedSilenceableFailure transform::MaskedVectorizeOp::apply(
   return DiagnosedSilenceableFailure::success();
 }

-void transform::MaskedVectorizeOp::getEffects(
+void transform::VectorizeOp::getEffects(
     SmallVectorImpl<MemoryEffects::EffectInstance> &effects) {
   consumesHandle(getTarget(), effects);
   onlyReadsHandle(getVectorSizes(), effects);
   modifiesPayload(effects);
 }

-SmallVector<OpFoldResult> MaskedVectorizeOp::getMixedVectorSizes() {
+SmallVector<OpFoldResult> VectorizeOp::getMixedVectorSizes() {
   OpBuilder b(getContext());
   return getMixedValues(getStaticVectorSizes(), getVectorSizes(), b);
 }

-LogicalResult transform::MaskedVectorizeOp::verify() {
+LogicalResult transform::VectorizeOp::verify() {
   if (getStaticVectorSizes().size() != getScalableSizes().size())
     return emitOpError("expected same number of vector sizes (")
            << getStaticVectorSizes().size() << ") and scalable sizes ("
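The verifier above requires one scalability flag per vector size. In the transform IR that pairing is implicit in the bracket syntax, e.g. (reusing a test from later in this diff; the flag list in the comment is an assumption inferred from `getScalableSizes()`):

```mlir
// Three vector sizes, of which only the second ([16]) is scalable, i.e.
// the scalable-size flags would be [false, true, false].
transform.structured.vectorize %matmul vector_sizes [8, [16], 4] : !transform.any_op
```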
@@ -360,8 +360,8 @@ class MapCopyToThreadsOp:
     )


-class MaskedVectorizeOp:
-    """Specialization for MaskedVectorizeOp class."""
+class VectorizeOp:
+    """Specialization for VectorizeOp class."""

     def __init__(
         self,
@@ -730,8 +730,8 @@ class TileToForallOp:
     )


-class VectorizeOp:
-    """Specialization for VectorizeOp class."""
+class VectorizeChildrenAndApplyPatternsOp:
+    """Specialization for VectorizeChildrenAndApplyPatternsOp class."""

     def __init__(
         self,
@@ -17,7 +17,7 @@ transform.sequence failures(propagate) {
   %0 = transform.structured.match ops{["linalg.matmul"]} in %module_op : (!transform.any_op) -> !transform.any_op
   %1, %loops:3 = transform.structured.tile %0 [2, 2, 2] : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
   %2 = get_parent_op %1 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-  transform.structured.vectorize %2 : (!transform.any_op) -> !transform.any_op
+  transform.structured.vectorize_children_and_apply_patterns %2 : (!transform.any_op) -> !transform.any_op
   %b = transform.bufferization.one_shot_bufferize layout{IdentityLayoutMap}
       %module_op {bufferize_function_boundaries = true}
       : (!transform.any_op) -> !transform.any_op
@@ -80,7 +80,7 @@ transform.sequence failures(propagate) {
     : (!transform.any_op) -> (!transform.any_op, !transform.any_op)

   // Apply masked vectorization to padding ops.
-  transform.structured.masked_vectorize %tiled_pad_op vector_sizes [128, 4]
+  transform.structured.vectorize %tiled_pad_op vector_sizes [128, 4]
     : !transform.any_op

   // Assign shared memory buffer to padding.
@@ -105,7 +105,7 @@ transform.sequence failures(propagate) {
     : (!transform.any_op) -> !transform.any_op
   %bufferized_copy_back = transform.structured.match ops{["linalg.copy"]} in %func_op_2
     : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize
+  transform.structured.vectorize
     %bufferized_copy_back vector_sizes [128, 4] : !transform.any_op

   // Canonicalize, cleanup and vector lowering. This step also removes buffer
@@ -192,7 +192,7 @@ transform.sequence failures(propagate) {
 }

   // Apply masked vectorization to padding ops.
-  transform.structured.masked_vectorize %tiled_pad_op vector_sizes [128, 4]
+  transform.structured.vectorize %tiled_pad_op vector_sizes [128, 4]
     : !transform.any_op

   // Assign shared memory buffer to padding.
@@ -217,7 +217,7 @@ transform.sequence failures(propagate) {
     : (!transform.any_op) -> !transform.any_op
   %bufferized_copy_back = transform.structured.match ops{["linalg.copy"]} in %func_op_2
     : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize
+  transform.structured.vectorize
     %bufferized_copy_back vector_sizes [128, 4] : !transform.any_op

   // Canonicalize, cleanup and vector lowering. This step also removes buffer
@@ -111,7 +111,7 @@ transform.sequence failures(propagate) {
     padding_dimensions=[0, 1, 2],
     pack_paddings=[1, 1, 1]
   } : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)
-  transform.structured.masked_vectorize %pad vector_sizes [10, 12] : !transform.any_op
+  transform.structured.vectorize %pad vector_sizes [10, 12] : !transform.any_op
   %vector_write = transform.structured.match ops{["vector.transfer_write"]} in %arg1 : (!transform.any_op) -> !transform.any_op
   %mask_op = transform.get_parent_op %vector_write {op_name = "vector.mask"} : (!transform.any_op) -> !transform.any_op
   %buffer, %new_ops = transform.structured.bufferize_to_allocation %mask_op {memory_space = 3, emit_dealloc} : !transform.any_op
@@ -26,7 +26,7 @@ transform.sequence failures(propagate) {
     : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
   %tiled_linalg_op_0, %loops_1:3 = transform.structured.tile %tiled_linalg_op[8, 8, 8]
     : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
-  transform.structured.masked_vectorize %tiled_linalg_op_0 vector_sizes [8, 8, 8]
+  transform.structured.vectorize %tiled_linalg_op_0 vector_sizes [8, 8, 8]
     : !transform.any_op

   %func = transform.structured.match ops{["func.func"]} in %module
@@ -31,7 +31,7 @@ transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
   %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-  %2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
+  %2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
   transform.apply_patterns to %2 {
     transform.apply_patterns.vector.lower_contraction lowering_strategy = "outerproduct"
   } : !transform.any_op
@@ -20,7 +20,7 @@ transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
   %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-  %2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
+  %2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
 }

 // -----
@@ -45,7 +45,7 @@ transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
   %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-  %2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
+  %2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
 }

 // -----
@@ -65,7 +65,7 @@ transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.copy"]} in %arg1 : (!transform.any_op) -> !transform.any_op
   %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-  %2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
+  %2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
 }

 // -----
@@ -111,7 +111,7 @@ transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
   %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-  %2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
+  %2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
 }

 // -----
@@ -159,7 +159,7 @@ transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
   %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-  %2 = transform.structured.vectorize %1 {vectorize_padding} : (!transform.any_op) -> !transform.any_op
+  %2 = transform.structured.vectorize_children_and_apply_patterns %1 {vectorize_padding} : (!transform.any_op) -> !transform.any_op
 }

 // -----
@@ -176,5 +176,5 @@ transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
   // expected-error @below {{op requires isolated-from-above targets}}
-  %2 = transform.structured.vectorize %0 : (!transform.any_op) -> !transform.any_op
+  %2 = transform.structured.vectorize_children_and_apply_patterns %0 : (!transform.any_op) -> !transform.any_op
 }
@@ -1,514 +0,0 @@
-// RUN: mlir-opt %s -test-transform-dialect-interpreter -split-input-file | FileCheck %s
-
-func.func @vectorize_dynamic_identity(%arg0: tensor<?xf32>,
-                                      %arg1: tensor<?xf32>,
-                                      %arg2: tensor<?xf32>) -> tensor<?xf32> {
-  %0 = linalg.generic { indexing_maps = [affine_map<(d0) -> (d0)>,
-                                         affine_map<(d0) -> (d0)>,
-                                         affine_map<(d0) -> (d0)>],
-                        iterator_types = ["parallel"] }
-    ins(%arg0, %arg1 : tensor<?xf32>, tensor<?xf32>)
-    outs(%arg2 : tensor<?xf32>) {
-  ^bb(%in0: f32, %in1: f32, %out: f32) :
-    %0 = arith.addf %in0, %in1 : f32
-    linalg.yield %0 : f32
-  } -> tensor<?xf32>
-  return %0 : tensor<?xf32>
-}
-
-// CHECK-LABEL: @vectorize_dynamic_identity
-// CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
-// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?xf32>
-// CHECK: %[[VAL_7:.*]] = vector.create_mask %[[VAL_4]] : vector<4xi1>
-// CHECK: %[[VAL_8:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
-// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
-// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
-// CHECK: %[[VAL_13:.*]] = arith.addf %[[VAL_8]], %[[VAL_10]] : vector<4xf32>
-// CHECK: %[[VAL_14:.*]] = vector.mask %[[VAL_7]] { vector.transfer_write %{{.*}} {in_bounds = [true]} : vector<4xf32>, tensor<?xf32> } : vector<4xi1> -> tensor<?xf32>
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [4] : !transform.any_op
-}
-
-// -----
-
-func.func @vectorize_dynamic_1d_broadcast(%arg0: tensor<?xf32>,
-                                          %arg1: tensor<?xf32>,
-                                          %arg2: tensor<?xf32>) -> tensor<?xf32> {
-  %0 = linalg.generic { indexing_maps = [affine_map<(d0) -> (0)>,
-                                         affine_map<(d0) -> (d0)>,
-                                         affine_map<(d0) -> (d0)>],
-                        iterator_types = ["parallel"] }
-    ins(%arg0, %arg1 : tensor<?xf32>, tensor<?xf32>)
-    outs(%arg2 : tensor<?xf32>) {
-  ^bb(%in0: f32, %in1: f32, %out: f32) :
-    %0 = arith.addf %in0, %in1 : f32
-    linalg.yield %0 : f32
-  } -> tensor<?xf32>
-  return %0 : tensor<?xf32>
-}
-
-// CHECK-LABEL: @vectorize_dynamic_1d_broadcast
-// CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
-// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?xf32>
-// CHECK: %[[VAL_7:.*]] = vector.transfer_read %{{.*}} {permutation_map = #{{.*}}} : tensor<?xf32>, vector<4xf32>
-// CHECK: %[[VAL_9:.*]] = vector.create_mask %[[VAL_4]] : vector<4xi1>
-// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
-// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
-// CHECK: %[[VAL_13:.*]] = arith.addf %[[VAL_7]], %[[VAL_10]] : vector<4xf32>
-// CHECK: %[[VAL_14:.*]] = vector.mask %{{.*}} { vector.transfer_write %[[VAL_13]], {{.*}} {in_bounds = [true]} : vector<4xf32>, tensor<?xf32> } : vector<4xi1> -> tensor<?xf32>
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [4] : !transform.any_op
-}
-
-// -----
-
-func.func @vectorize_dynamic_2d_transpose(%arg0: tensor<?x?xf32>,
-                                          %arg1: tensor<?x?xf32>,
-                                          %arg2: tensor<?x?xf32>) -> tensor<?x?xf32> {
-  %0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (d1, d0)>,
-                                         affine_map<(d0, d1) -> (d0, d1)>,
-                                         affine_map<(d0, d1) -> (d0, d1)>],
-                        iterator_types = ["parallel", "parallel"] }
-    ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>)
-    outs(%arg2 : tensor<?x?xf32>) {
-  ^bb(%in0: f32, %in1: f32, %out: f32) :
-    %0 = arith.addf %in0, %in1 : f32
-    linalg.yield %0 : f32
-  } -> tensor<?x?xf32>
-  return %0 : tensor<?x?xf32>
-}
-
-// CHECK-LABEL: @vectorize_dynamic_2d_transpose
-// CHECK: %[[VAL_3:.*]] = arith.constant 1 : index
-// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?x?xf32>
-// CHECK: %[[VAL_5:.*]] = arith.constant 0 : index
-// CHECK: %[[VAL_6:.*]] = tensor.dim %{{.*}}, %[[VAL_5]] : tensor<?x?xf32>
-// CHECK: %[[VAL_9:.*]] = vector.create_mask %[[VAL_6]], %[[VAL_4]] : vector<8x4xi1>
-// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true, true], permutation_map = #{{.*}}} : tensor<?x?xf32>, vector<4x8xf32> } : vector<8x4xi1> -> vector<4x8xf32>
-// CHECK: %[[VAL_12:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]] : vector<4x8xi1>
-// CHECK: %[[VAL_13:.*]] = vector.mask %[[VAL_12]] { vector.transfer_read %{{.*}} {in_bounds = [true, true]} : tensor<?x?xf32>, vector<4x8xf32> } : vector<4x8xi1> -> vector<4x8xf32>
-// CHECK: %[[VAL_14:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK: %[[VAL_15:.*]] = vector.mask %[[VAL_12]] { vector.transfer_read %{{.*}} {in_bounds = [true, true]} : tensor<?x?xf32>, vector<4x8xf32> } : vector<4x8xi1> -> vector<4x8xf32>
-// CHECK: %[[VAL_16:.*]] = arith.addf %[[VAL_10]], %[[VAL_13]] : vector<4x8xf32>
-// CHECK: %[[VAL_17:.*]] = vector.mask %[[VAL_12]] { vector.transfer_write %[[VAL_16]], %{{.*}} {in_bounds = [true, true]} : vector<4x8xf32>, tensor<?x?xf32> } : vector<4x8xi1> -> tensor<?x?xf32>
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [4, 8] : !transform.any_op
-}
-
-// -----
-
-func.func @vectorize_dynamic_generic_2d_broadcast(%arg0: tensor<?x?xf32>,
-                                                  %arg1: tensor<?x?xf32>,
-                                                  %arg2: tensor<?x?xf32>) -> tensor<?x?xf32> {
-  %0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (0, d1)>,
-                                         affine_map<(d0, d1) -> (d0, d1)>,
-                                         affine_map<(d0, d1) -> (d0, d1)>],
-                        iterator_types = ["parallel", "parallel"] }
-    ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>)
-    outs(%arg2 : tensor<?x?xf32>) {
-  ^bb(%in0: f32, %in1: f32, %out: f32) :
-    %0 = arith.addf %in0, %in1 : f32
-    linalg.yield %0 : f32
-  } -> tensor<?x?xf32>
-  return %0 : tensor<?x?xf32>
-}
-
-// CHECK-LABEL: @vectorize_dynamic_generic_2d_broadcast
-// CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
-// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?x?xf32>
-// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index
-// CHECK: %[[VAL_6:.*]] = tensor.dim %{{.*}}, %[[VAL_5]] : tensor<?x?xf32>
-// CHECK: %[[VAL_9:.*]] = vector.create_mask %[[VAL_6]] : vector<8xi1>
-// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true, true], permutation_map = #{{.*}}} : tensor<?x?xf32>, vector<4x8xf32> } : vector<8xi1> -> vector<4x8xf32>
-// CHECK: %[[VAL_12:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]] : vector<4x8xi1>
-// CHECK: %[[VAL_13:.*]] = vector.mask %[[VAL_12]] { vector.transfer_read %{{.*}} {in_bounds = [true, true]} : tensor<?x?xf32>, vector<4x8xf32> } : vector<4x8xi1> -> vector<4x8xf32>
-// CHECK: %[[VAL_15:.*]] = vector.mask %[[VAL_12]] { vector.transfer_read %{{.*}} {in_bounds = [true, true]} : tensor<?x?xf32>, vector<4x8xf32> } : vector<4x8xi1> -> vector<4x8xf32>
-// CHECK: %[[VAL_16:.*]] = arith.addf %[[VAL_10]], %[[VAL_13]] : vector<4x8xf32>
-// CHECK: %[[VAL_18:.*]] = vector.mask %[[VAL_12]] { vector.transfer_write %{{.*}} {in_bounds = [true, true]} : vector<4x8xf32>, tensor<?x?xf32> } : vector<4x8xi1> -> tensor<?x?xf32>
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [4, 8] : !transform.any_op
-}
-
-// -----
-
-func.func @vectorize_dynamic_reduction(%arg0: tensor<?x?xf32>,
-                                       %arg1: tensor<?xf32>) -> tensor<?xf32> {
-  %0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
-                                         affine_map<(d0, d1) -> (d0)>],
-                        iterator_types = ["parallel", "reduction"] }
-    ins(%arg0 : tensor<?x?xf32>)
-    outs(%arg1 : tensor<?xf32>) {
-  ^bb(%in: f32, %out: f32) :
-    %0 = arith.addf %in, %out : f32
-    linalg.yield %0 : f32
-  } -> tensor<?xf32>
-  return %0 : tensor<?xf32>
-}
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [4, 8] : !transform.any_op
-}
-
-// CHECK-LABEL: @vectorize_dynamic_reduction(
-// CHECK-SAME: %[[VAL_0:.*]]: tensor<?x?xf32>,
-// CHECK-SAME: %[[VAL_1:.*]]: tensor<?xf32>) -> tensor<?xf32> {
-// CHECK: %[[VAL_2:.*]] = arith.constant 0 : index
-// CHECK: %[[VAL_3:.*]] = tensor.dim %[[VAL_0]], %[[VAL_2]] : tensor<?x?xf32>
-// CHECK: %[[VAL_4:.*]] = arith.constant 1 : index
-// CHECK: %[[VAL_5:.*]] = tensor.dim %[[VAL_0]], %[[VAL_4]] : tensor<?x?xf32>
-// CHECK: %[[VAL_8:.*]] = vector.create_mask %[[VAL_3]], %[[VAL_5]] : vector<4x8xi1>
-// CHECK: %[[VAL_9:.*]] = vector.mask %[[VAL_8]] { vector.transfer_read %[[VAL_0]]{{.*}} {in_bounds = [true, true]} : tensor<?x?xf32>, vector<4x8xf32> } : vector<4x8xi1> -> vector<4x8xf32>
-// CHECK: %[[VAL_11:.*]] = vector.create_mask %[[VAL_3]] : vector<4xi1>
-// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_11]] { vector.transfer_read %[[VAL_1]]{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
-// CHECK: %[[VAL_13:.*]] = vector.mask %[[VAL_8]] { vector.multi_reduction <add>, %[[VAL_9]], %[[VAL_12]] [1] : vector<4x8xf32> to vector<4xf32> } : vector<4x8xi1> -> vector<4xf32>
-// CHECK: %[[VAL_15:.*]] = vector.mask %[[VAL_11]] { vector.transfer_write %[[VAL_13]], %[[VAL_1]]{{.*}} {in_bounds = [true]} : vector<4xf32>, tensor<?xf32> } : vector<4xi1> -> tensor<?xf32>
-// CHECK: return %[[VAL_15]] : tensor<?xf32>
-// CHECK: }
-
-// -----
-
-func.func @vectorize_dynamic_transpose_reduction(%arg0: tensor<?x?x?xf32>,
-                                                 %arg1: tensor<?x?xf32>) -> tensor<?x?xf32> {
-  %0 = linalg.generic { indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>,
-                                         affine_map<(d0, d1, d2) -> (d2, d1)>],
-                        iterator_types = ["reduction", "parallel", "parallel"] }
-    ins(%arg0 : tensor<?x?x?xf32>)
-    outs(%arg1 : tensor<?x?xf32>) {
-  ^bb(%in: f32, %out: f32) :
-    %0 = arith.addf %in, %out : f32
-    linalg.yield %0 : f32
-  } -> tensor<?x?xf32>
-  return %0 : tensor<?x?xf32>
-}
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [4, 8, 16] : !transform.any_op
-}
-
-// CHECK-LABEL: @vectorize_dynamic_transpose_reduction(
-// CHECK-SAME: %[[VAL_0:.*]]: tensor<?x?x?xf32>,
-// CHECK-SAME: %[[VAL_1:.*]]: tensor<?x?xf32>) -> tensor<?x?xf32> {
-// CHECK: %[[VAL_2:.*]] = arith.constant 0 : index
-// CHECK: %[[VAL_3:.*]] = tensor.dim %[[VAL_0]], %[[VAL_2]] : tensor<?x?x?xf32>
-// CHECK: %[[VAL_4:.*]] = arith.constant 1 : index
-// CHECK: %[[VAL_5:.*]] = tensor.dim %[[VAL_0]], %[[VAL_4]] : tensor<?x?x?xf32>
-// CHECK: %[[VAL_6:.*]] = arith.constant 2 : index
-// CHECK: %[[VAL_7:.*]] = tensor.dim %[[VAL_0]], %[[VAL_6]] : tensor<?x?x?xf32>
-// CHECK: %[[VAL_10:.*]] = vector.create_mask %[[VAL_3]], %[[VAL_5]], %[[VAL_7]] : vector<4x8x16xi1>
-// CHECK: %[[VAL_11:.*]] = vector.mask %[[VAL_10]] { vector.transfer_read %[[VAL_0]]{{.*}} {in_bounds = [true, true, true]} : tensor<?x?x?xf32>, vector<4x8x16xf32> } : vector<4x8x16xi1> -> vector<4x8x16xf32>
-// CHECK: %[[VAL_13:.*]] = vector.create_mask %[[VAL_7]], %[[VAL_5]] : vector<16x8xi1>
-// CHECK: %[[VAL_14:.*]] = vector.mask %[[VAL_13]] { vector.transfer_read %[[VAL_1]]{{.*}} {in_bounds = [true, true], permutation_map = #{{.*}}} : tensor<?x?xf32>, vector<8x16xf32> } : vector<16x8xi1> -> vector<8x16xf32>
-// CHECK: %[[VAL_15:.*]] = vector.mask %[[VAL_10]] { vector.multi_reduction <add>, %[[VAL_11]], %[[VAL_14]] [0] : vector<4x8x16xf32> to vector<8x16xf32> } : vector<4x8x16xi1> -> vector<8x16xf32>
-// CHECK: %[[VAL_17:.*]] = vector.mask %[[VAL_13]] { vector.transfer_write %[[VAL_15]], %{{.*}} {in_bounds = [true, true], permutation_map = #{{.*}}} : vector<8x16xf32>, tensor<?x?xf32> } : vector<16x8xi1> -> tensor<?x?xf32>
-
-// -----
-
-func.func @vectorize_partial_dynamic_identity(%arg0: tensor<8x?xf32>,
-                                              %arg1: tensor<8x?xf32>,
-                                              %arg2: tensor<8x?xf32>) -> tensor<8x?xf32> {
-  %0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
-                                         affine_map<(d0, d1) -> (d0, d1)>,
-                                         affine_map<(d0, d1) -> (d0, d1)>],
-                        iterator_types = ["parallel", "parallel"] }
-    ins(%arg0, %arg1 : tensor<8x?xf32>, tensor<8x?xf32>)
-    outs(%arg2 : tensor<8x?xf32>) {
-  ^bb(%in0: f32, %in1: f32, %out: f32) :
-    %0 = arith.addf %in0, %in1 : f32
-    linalg.yield %0 : f32
-  } -> tensor<8x?xf32>
-  return %0 : tensor<8x?xf32>
-}
-
-// CHECK-LABEL: func.func @vectorize_partial_dynamic_identity(
-// CHECK-SAME: %[[VAL_0:.*]]: tensor<8x?xf32>, %[[VAL_1:.*]]: tensor<8x?xf32>, %[[VAL_2:.*]]: tensor<8x?xf32>) -> tensor<8x?xf32> {
-// CHECK-DAG: %[[VAL_3:.*]] = arith.constant 1 : index
-// CHECK-DAG: %[[VAL_4:.*]] = tensor.dim %[[VAL_0]], %[[VAL_3]] : tensor<8x?xf32>
-// CHECK-DAG: %[[VAL_5:.*]] = arith.constant 0 : index
-// CHECK-DAG: %[[VAL_6:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK-DAG: %[[VAL_7:.*]] = arith.constant 8 : index
-// CHECK: %[[VAL_8:.*]] = vector.create_mask %[[VAL_7]], %[[VAL_4]] : vector<8x32xi1>
-// CHECK: %[[VAL_9:.*]] = vector.mask %[[VAL_8]] { vector.transfer_read %[[VAL_0]][%[[VAL_5]], %[[VAL_5]]], %[[VAL_6]] {in_bounds = [true, true]} : tensor<8x?xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
-// CHECK: %[[VAL_10:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK: %[[VAL_11:.*]] = vector.mask %[[VAL_8]] { vector.transfer_read %[[VAL_1]][%[[VAL_5]], %[[VAL_5]]], %[[VAL_10]] {in_bounds = [true, true]} : tensor<8x?xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
-// CHECK: %[[VAL_12:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK: %[[VAL_13:.*]] = vector.mask %[[VAL_8]] { vector.transfer_read %[[VAL_2]][%[[VAL_5]], %[[VAL_5]]], %[[VAL_12]] {in_bounds = [true, true]} : tensor<8x?xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
-// CHECK: %[[VAL_14:.*]] = arith.addf %[[VAL_9]], %[[VAL_11]] : vector<8x32xf32>
-// CHECK: %[[VAL_15:.*]] = arith.constant 0 : index
-// CHECK: %[[VAL_16:.*]] = vector.mask %[[VAL_8]] { vector.transfer_write %[[VAL_14]], %[[VAL_2]][%[[VAL_15]], %[[VAL_15]]] {in_bounds = [true, true]} : vector<8x32xf32>, tensor<8x?xf32> } : vector<8x32xi1> -> tensor<8x?xf32>
-
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [8, 32] : !transform.any_op
-}
-
-// -----
-
-func.func @do_not_generate_masks(%arg0: tensor<8x32xf32>,
-                                 %arg1: tensor<8x32xf32>,
-                                 %arg2: tensor<8x32xf32>) -> tensor<8x32xf32> {
-  %0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
-                                         affine_map<(d0, d1) -> (d0, d1)>,
-                                         affine_map<(d0, d1) -> (d0, d1)>],
-                        iterator_types = ["parallel", "parallel"] }
-    ins(%arg0, %arg1 : tensor<8x32xf32>, tensor<8x32xf32>)
-    outs(%arg2 : tensor<8x32xf32>) {
-  ^bb(%in0: f32, %in1: f32, %out: f32) :
-    %0 = arith.addf %in0, %in1 : f32
-    linalg.yield %0 : f32
-  } -> tensor<8x32xf32>
-  return %0 : tensor<8x32xf32>
-}
-
-// CHECK-LABEL: func.func @do_not_generate_masks
-// CHECK-NOT: vector.mask
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [8, 32] : !transform.any_op
-}
-
-// -----
-
-func.func @vectorize_static_shape_with_mask(%arg0: tensor<8x30xf32>,
-                                            %arg1: tensor<8x30xf32>,
-                                            %arg2: tensor<8x30xf32>) -> tensor<8x30xf32> {
-  %0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
-                                         affine_map<(d0, d1) -> (d0, d1)>,
-                                         affine_map<(d0, d1) -> (d0, d1)>],
-                        iterator_types = ["parallel", "parallel"] }
-    ins(%arg0, %arg1 : tensor<8x30xf32>, tensor<8x30xf32>)
-    outs(%arg2 : tensor<8x30xf32>) {
-  ^bb(%in0: f32, %in1: f32, %out: f32) :
-    %0 = arith.addf %in0, %in1 : f32
-    linalg.yield %0 : f32
-  } -> tensor<8x30xf32>
-  return %0 : tensor<8x30xf32>
-}
-
-// CHECK-LABEL: func.func @vectorize_static_shape_with_mask(
-// CHECK-SAME: %[[VAL_0:.*]]: tensor<8x30xf32>, %[[VAL_1:.*]]: tensor<8x30xf32>, %[[VAL_2:.*]]: tensor<8x30xf32>) -> tensor<8x30xf32> {
-// CHECK-DAG: %[[VAL_3:.*]] = arith.constant 0 : index
-// CHECK-DAG: %[[VAL_4:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK-DAG: %[[VAL_5:.*]] = arith.constant 8 : index
-// CHECK-DAG: %[[VAL_6:.*]] = arith.constant 30 : index
-// CHECK: %[[VAL_7:.*]] = vector.create_mask %[[VAL_5]], %[[VAL_6]] : vector<8x32xi1>
-// CHECK: %[[VAL_8:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %[[VAL_0]][%[[VAL_3]], %[[VAL_3]]], %[[VAL_4]] {in_bounds = [true, true]} : tensor<8x30xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
-// CHECK: %[[VAL_9:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %[[VAL_1]][%[[VAL_3]], %[[VAL_3]]], %[[VAL_9]] {in_bounds = [true, true]} : tensor<8x30xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
-// CHECK: %[[VAL_11:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %[[VAL_2]][%[[VAL_3]], %[[VAL_3]]], %[[VAL_11]] {in_bounds = [true, true]} : tensor<8x30xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
-// CHECK: %[[VAL_13:.*]] = arith.addf %[[VAL_8]], %[[VAL_10]] : vector<8x32xf32>
-// CHECK: %[[VAL_14:.*]] = arith.constant 0 : index
-// CHECK: %[[VAL_15:.*]] = vector.mask %[[VAL_7]] { vector.transfer_write %[[VAL_13]], %[[VAL_2]][%[[VAL_14]], %[[VAL_14]]] {in_bounds = [true, true]} : vector<8x32xf32>, tensor<8x30xf32> } : vector<8x32xi1> -> tensor<8x30xf32>
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [8, 32] : !transform.any_op
-}
-
-// -----
-
-func.func @vectorize_dynamic_fill(%A : tensor<?x?xf32>, %arg0 : f32) -> tensor<?x?xf32> {
-  %0 = linalg.fill ins(%arg0 : f32) outs(%A : tensor<?x?xf32>) -> tensor<?x?xf32>
-  return %0 : tensor<?x?xf32>
-}
-
-// CHECK-LABEL: func.func @vectorize_dynamic_fill
-// CHECK: %[[DIM0:.*]] = tensor.dim
-// CHECK: %[[DIM1:.*]] = tensor.dim
-// CHECK: %[[MASK:.*]] = vector.create_mask %[[DIM0]], %[[DIM1]] : vector<8x16xi1>
-// CHECK: %[[BCAST:.*]] = vector.broadcast %{{.*}} : f32 to vector<8x16xf32>
-// CHECK: vector.mask %[[MASK]] { vector.transfer_write %[[BCAST]], {{.*}} {in_bounds = [true, true]} : vector<8x16xf32>, tensor<?x?xf32> } : vector<8x16xi1>
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.fill"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [8, 16] : !transform.any_op
-}
-
-// -----
-
-// CHECK-LABEL: func @test_masked_vectorize_linalg_copy
-func.func @test_masked_vectorize_linalg_copy(%A : memref<?x?xf32>, %B : memref<?x?xf32>) {
-  // CHECK: %[[c0:.*]] = arith.constant 0 : index
-  // CHECK: %[[d0:.*]] = memref.dim %{{.*}}, %[[c0]] : memref<?x?xf32>
-  // CHECK: %[[c1:.*]] = arith.constant 1 : index
-  // CHECK: %[[d1:.*]] = memref.dim %{{.*}}, %[[c1]] : memref<?x?xf32>
-  // CHECK: %[[mask:.*]] = vector.create_mask %[[d0]], %[[d1]] : vector<2x4xi1>
-  // CHECK: vector.mask %[[mask]] {{.*}} vector.transfer_read %{{.*}} {in_bounds = [true, true]} : memref<?x?xf32>, vector<2x4xf32> } : vector<2x4xi1> -> vector<2x4xf32>
-  // CHECK: vector.mask %[[mask]] {{.*}} vector.transfer_write %{{.*}} {in_bounds = [true, true]} : vector<2x4xf32>, memref<?x?xf32> } : vector<2x4xi1>
-  linalg.copy ins(%A : memref<?x?xf32>) outs(%B : memref<?x?xf32>)
-  return
-}
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["linalg.copy"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [2, 4] : !transform.any_op
-}
-
-// -----
-
-// CHECK-LABEL: func @test_masked_vectorize_pad
-func.func @test_masked_vectorize_pad(
-    %0 : tensor<?x?xf32>, %h0 : index, %h1 : index)
-  -> tensor<2x4xf32>
-{
-  // CHECK-DAG: %[[c42:.*]] = arith.constant 4.243000e+01 : f32
-  // CHECK-DAG: %[[c0:.*]] = arith.constant 0 : index
-  // CHECK-DAG: %[[empty:.*]] = tensor.empty() : tensor<2x4xf32>
-  // CHECK: %[[d0:.*]] = tensor.dim {{.*}} : tensor<?x?xf32>
-  // CHECK: %[[d1:.*]] = tensor.dim {{.*}} : tensor<?x?xf32>
-  // CHECK: %[[mask:.*]] = vector.create_mask %[[d0]], %[[d1]] : vector<2x4xi1>
-  // CHECK-DAG: %[[c0_2:.*]] = arith.constant 0 : index
-  // CHECK: %[[masked_read:.*]] = vector.mask %[[mask]] {
-  // CHECK-SAME:   vector.transfer_read %{{.*}}[%[[c0_2]], %[[c0_2]]], %[[c42]]
-  // CHECK-SAME:   {in_bounds = [true, true]} : tensor<?x?xf32>, vector<2x4xf32>
-  // CHECK-SAME: } : vector<2x4xi1> -> vector<2x4xf32>
-  // CHECK: vector.transfer_write %[[masked_read]], %[[empty]][%[[c0_2]], %[[c0_2]]]
-  // CHECK-SAME:   {in_bounds = [true, true]} : vector<2x4xf32>, tensor<2x4xf32>
-  %cst = arith.constant 42.43 : f32
-  %c0 = arith.constant 0 : index
-  %1 = tensor.pad %0 low[0, %c0] high[%h0, %h1] {
-    ^bb0(%hh1: index, %hh2: index):
-      tensor.yield %cst : f32
-    } : tensor<?x?xf32> to tensor<2x4xf32>
-  return %1: tensor<2x4xf32>
-}
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["tensor.pad"]} in %arg1
-    : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [2, 4] : !transform.any_op
-}
-
-// -----
-
-// CHECK: #[[MAP:.+]] = affine_map<()[s0, s1] -> (s0 + s1)>
-// CHECK: func @test_masked_vectorize_dynamic_pad
-func.func @test_masked_vectorize_dynamic_pad(
-    %0 : tensor<?x?xf32>, %h0 : index, %h1 : index)
-  -> tensor<?x?xf32>
-{
-  // CHECK-DAG: %[[c42:.*]] = arith.constant 4.243000e+01 : f32
-  // CHECK-DAG: %[[c0:.*]] = arith.constant 0 : index
-  // CHECK-DAG: %[[res_d0:.+]] = affine.apply #[[MAP]]()
-  // CHECK-DAG: %[[res_d1:.+]] = affine.apply #[[MAP]]()
-  // CHECK-DAG: %[[empty:.*]] = tensor.empty(%[[res_d0]], %[[res_d1]]) : tensor<?x?xf32>
-  // CHECK: %[[d0:.*]] = tensor.dim {{.*}} : tensor<?x?xf32>
-  // CHECK: %[[d1:.*]] = tensor.dim {{.*}} : tensor<?x?xf32>
-  // CHECK: %[[mask:.*]] = vector.create_mask %[[d0]], %[[d1]] : vector<2x4xi1>
-  // CHECK-DAG: %[[c0_2:.*]] = arith.constant 0 : index
-  // CHECK: %[[masked_read:.*]] = vector.mask %[[mask]] {
-  // CHECK-SAME:   vector.transfer_read %{{.*}}[%[[c0_2]], %[[c0_2]]], %[[c42]]
-  // CHECK-SAME:   {in_bounds = [true, true]} : tensor<?x?xf32>, vector<2x4xf32>
-  // CHECK-SAME: } : vector<2x4xi1> -> vector<2x4xf32>
-  // CHECK: %[[mask_2:.*]] = vector.create_mask %[[res_d0]], %[[res_d1]] : vector<2x4xi1>
-  // CHECK: %[[masked_write:.*]] = vector.mask %[[mask_2]] {
-  // CHECK-SAME:   vector.transfer_write %[[masked_read]], %[[empty]][%[[c0_2]], %[[c0_2]]]
-  // CHECK-SAME:   {in_bounds = [true, true]} : vector<2x4xf32>, tensor<?x?xf32>
-  // CHECK: return %[[masked_write]] : tensor<?x?xf32>
-  %cst = arith.constant 42.43 : f32
-  %c0 = arith.constant 0 : index
-  %1 = tensor.pad %0 low[0, %c0] high[%h0, %h1] {
-    ^bb0(%hh1: index, %hh2: index):
-      tensor.yield %cst : f32
-    } : tensor<?x?xf32> to tensor<?x?xf32>
-  return %1: tensor<?x?xf32>
-}
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %0 = transform.structured.match ops{["tensor.pad"]} in %arg1
-    : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [2, 4] : !transform.any_op
-}
-
-// -----
-
-func.func @matmul(%A: memref<?x?xf32>, %B: memref<?x?xf32>, %C: memref<?x?xf32>) {
-  linalg.matmul ins(%A, %B: memref<?x?xf32>, memref<?x?xf32>)
-                outs(%C: memref<?x?xf32>)
-  return
-}
-
-// CHECK-LABEL: func.func @matmul(
-// CHECK-SAME: %[[A:.*]]: memref<?x?xf32>, %[[B:.*]]: memref<?x?xf32>, %[[C:.*]]: memref<?x?xf32>) {
-// CHECK-DAG: %[[VAL_3:.*]] = arith.constant 0 : index
-// CHECK-DAG: %[[VAL_4:.*]] = memref.dim %[[A]], %[[VAL_3]] : memref<?x?xf32>
-// CHECK-DAG: %[[VAL_5:.*]] = arith.constant 1 : index
-// CHECK-DAG: %[[VAL_6:.*]] = memref.dim %[[B]], %[[VAL_5]] : memref<?x?xf32>
-// CHECK-DAG: %[[VAL_7:.*]] = arith.constant 1 : index
-// CHECK-DAG: %[[VAL_8:.*]] = memref.dim %[[A]], %[[VAL_7]] : memref<?x?xf32>
-// CHECK: %[[MASK_A:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_8]] : vector<8x4xi1>
-// CHECK: %[[LOAD_A:.*]] = vector.mask %[[MASK_A]] { vector.transfer_read %[[A]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true, true], permutation_map = #{{.*}}} : memref<?x?xf32>, vector<8x16x4xf32> } : vector<8x4xi1> -> vector<8x16x4xf32>
-// CHECK: %[[MASK_B:.*]] = vector.create_mask %[[VAL_8]], %[[VAL_6]] : vector<4x16xi1>
-// CHECK: %[[LOAD_B:.*]] = vector.mask %[[MASK_B]] { vector.transfer_read %[[B]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true, true], permutation_map = #{{.*}}} : memref<?x?xf32>, vector<8x16x4xf32> } : vector<4x16xi1> -> vector<8x16x4xf32>
-// CHECK: %[[MASK_C:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]] : vector<8x16xi1>
-// CHECK: %[[LOAD_C:.*]] = vector.mask %[[MASK_C]] { vector.transfer_read %[[C]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true]} : memref<?x?xf32>, vector<8x16xf32> } : vector<8x16xi1> -> vector<8x16xf32>
-// CHECK: %[[MULF:.*]] = arith.mulf %[[LOAD_A]], %[[LOAD_B]] : vector<8x16x4xf32>
-// CHECK: %[[MASK_MULIT_RED:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]], %[[VAL_8]] : vector<8x16x4xi1>
-// CHECK: %[[MULTI_RED:.*]] = vector.mask %[[MASK_MULIT_RED]] { vector.multi_reduction <add>, %[[MULF]], %[[LOAD_C]] [2] : vector<8x16x4xf32> to vector<8x16xf32> } : vector<8x16x4xi1> -> vector<8x16xf32>
-// CHECK: %[[C2:.*]] = arith.constant 0 : index
-// CHECK: vector.mask %[[MASK_C]] { vector.transfer_write %[[MULTI_RED]], %[[C]]{{\[}}%[[C2]], %[[C2]]] {in_bounds = [true, true]} : vector<8x16xf32>, memref<?x?xf32> } : vector<8x16xi1>
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %matmul = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %matmul vector_sizes [8, 16, 4] : !transform.any_op
-}
-
-// -----
-
-func.func @matmul_scalable(%A: memref<?x?xf32>, %B: memref<?x?xf32>, %C: memref<?x?xf32>) {
-  linalg.matmul ins(%A, %B: memref<?x?xf32>, memref<?x?xf32>)
-                outs(%C: memref<?x?xf32>)
-  return
-}
-
-// CHECK-LABEL: func.func @matmul_scalable(
-// CHECK-SAME: %[[A:.*]]: memref<?x?xf32>, %[[B:.*]]: memref<?x?xf32>, %[[C:.*]]: memref<?x?xf32>) {
-// CHECK-DAG: %[[VAL_3:.*]] = arith.constant 0 : index
-// CHECK-DAG: %[[VAL_4:.*]] = memref.dim %[[A]], %[[VAL_3]] : memref<?x?xf32>
-// CHECK-DAG: %[[VAL_5:.*]] = arith.constant 1 : index
-// CHECK-DAG: %[[VAL_6:.*]] = memref.dim %[[B]], %[[VAL_5]] : memref<?x?xf32>
-// CHECK-DAG: %[[VAL_7:.*]] = arith.constant 1 : index
-// CHECK-DAG: %[[VAL_8:.*]] = memref.dim %[[A]], %[[VAL_7]] : memref<?x?xf32>
-// CHECK: %[[MASK_A:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_8]] : vector<8x4xi1>
-// CHECK: %[[LOAD_A:.*]] = vector.mask %[[MASK_A]] { vector.transfer_read %[[A]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true, true], permutation_map = #{{.*}}} : memref<?x?xf32>, vector<8x[16]x4xf32> } : vector<8x4xi1> -> vector<8x[16]x4xf32>
-// CHECK: %[[MASK_B:.*]] = vector.create_mask %[[VAL_8]], %[[VAL_6]] : vector<4x[16]xi1>
-// CHECK: %[[LOAD_B:.*]] = vector.mask %[[MASK_B]] { vector.transfer_read %[[B]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true, true], permutation_map = #{{.*}}} : memref<?x?xf32>, vector<8x[16]x4xf32> } : vector<4x[16]xi1> -> vector<8x[16]x4xf32>
-// CHECK: %[[MASK_C:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]] : vector<8x[16]xi1>
-// CHECK: %[[LOAD_C:.*]] = vector.mask %[[MASK_C]] { vector.transfer_read %[[C]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true]} : memref<?x?xf32>, vector<8x[16]xf32> } : vector<8x[16]xi1> -> vector<8x[16]xf32>
-// CHECK: %[[MULF:.*]] = arith.mulf %[[LOAD_A]], %[[LOAD_B]] : vector<8x[16]x4xf32>
-// CHECK: %[[MASK_MULIT_RED:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]], %[[VAL_8]] : vector<8x[16]x4xi1>
-// CHECK: %[[MULTI_RED:.*]] = vector.mask %[[MASK_MULIT_RED]] { vector.multi_reduction <add>, %[[MULF]], %[[LOAD_C]] [2] : vector<8x[16]x4xf32> to vector<8x[16]xf32> } : vector<8x[16]x4xi1> -> vector<8x[16]xf32>
-// CHECK: %[[C2:.*]] = arith.constant 0 : index
-// CHECK: vector.mask %[[MASK_C]] { vector.transfer_write %[[MULTI_RED]], %[[C]]{{\[}}%[[C2]], %[[C2]]] {in_bounds = [true, true]} : vector<8x[16]xf32>, memref<?x?xf32> } : vector<8x[16]xi1>
-
-transform.sequence failures(propagate) {
-^bb1(%arg1: !transform.any_op):
-  %matmul = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %matmul vector_sizes [8, [16], 4] : !transform.any_op
-}
@@ -29,7 +29,7 @@ func.func @vectorize_dynamic_identity(%arg0: tensor<?xf32>,
|
||||
transform.sequence failures(propagate) {
|
||||
^bb1(%arg1: !transform.any_op):
|
||||
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
|
||||
transform.structured.masked_vectorize %0 vector_sizes [[4]] : !transform.any_op
|
||||
transform.structured.vectorize %0 vector_sizes [[4]] : !transform.any_op
|
||||
}
|
||||
|
||||
// -----
|
||||
@@ -71,7 +71,7 @@ func.func @vectorize_partial_dynamic_identity(%arg0: tensor<8x?xf32>,
|
||||
transform.sequence failures(propagate) {
|
||||
^bb1(%arg1: !transform.any_op):
|
||||
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
|
||||
transform.structured.masked_vectorize %0 vector_sizes [8, [32]] : !transform.any_op
|
||||
transform.structured.vectorize %0 vector_sizes [8, [32]] : !transform.any_op
|
||||
}
|
||||
|
||||
// -----
|
||||
@@ -111,7 +111,7 @@ func.func @vectorize_static_shape_with_mask(%arg0: tensor<8x30xf32>,
|
||||
transform.sequence failures(propagate) {
|
||||
^bb1(%arg1: !transform.any_op):
|
||||
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
|
||||
transform.structured.masked_vectorize %0 vector_sizes [8, [32]] : !transform.any_op
|
||||
transform.structured.vectorize %0 vector_sizes [8, [32]] : !transform.any_op
|
||||
}
|
||||
|
||||
// -----
|
||||
@@ -131,6 +131,6 @@ func.func @vectorize_dynamic_fill(%A : tensor<?x?xf32>, %arg0 : f32) -> tensor<?
|
||||
transform.sequence failures(propagate) {
|
||||
^bb1(%arg1: !transform.any_op):
|
||||
%0 = transform.structured.match ops{["linalg.fill"]} in %arg1 : (!transform.any_op) -> !transform.any_op
|
||||
transform.structured.masked_vectorize %0 vector_sizes [8, [16]] : !transform.any_op
|
||||
transform.structured.vectorize %0 vector_sizes [8, [16]] : !transform.any_op
|
||||
}
|
||||
|
||||
|
||||
mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir: new file, 1787 lines (diff suppressed because it is too large).
@@ -28,7 +28,7 @@ func.func @masked_static_vectorize_nd_tensor_extract_with_affine_apply_contiguou
 transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
+  transform.structured.vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
 }

 // -----
@@ -83,7 +83,7 @@ func.func @masked_dynamic_vectorize_nd_tensor_extract_with_affine_apply_contiguo
 transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
+  transform.structured.vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
 }

 // -----
@@ -121,7 +121,7 @@ func.func @masked_vectorize_nd_tensor_extract_with_affine_apply_gather(%6: tenso
 transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
+  transform.structured.vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
 }

 // -----
@@ -176,7 +176,7 @@ func.func @masked_dynamic_vectorize_nd_tensor_extract_with_affine_apply_gather(%
 transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
+  transform.structured.vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
 }

 // -----
@@ -226,7 +226,7 @@ func.func @extract_masked_vectorize(%arg0: tensor<?x?xf32>, %arg1: tensor<?x?xf3
 transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [3, 3] vectorize_nd_extract : !transform.any_op
+  transform.structured.vectorize %0 vector_sizes [3, 3] vectorize_nd_extract : !transform.any_op
 }

 // -----
@@ -269,5 +269,5 @@ func.func @tensor_extract_dynamic_shape(%arg1: tensor<123x321xf32>, %arg2: tenso
 transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  transform.structured.masked_vectorize %0 vector_sizes [1, 3, 8] vectorize_nd_extract : !transform.any_op
+  transform.structured.vectorize %0 vector_sizes [1, 3, 8] vectorize_nd_extract : !transform.any_op
 }
@@ -31,7 +31,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
}

// -----
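The hunk above is the first occurrence of the second rename: `transform.structured.vectorize` applied to an isolated-from-above parent becomes `transform.structured.vectorize_children_and_apply_patterns`. A minimal sketch of the renamed form, assembled from these test updates (handle names are illustrative):

transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  // The op requires an isolated-from-above target, e.g. the enclosing function.
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  // Vectorizes every vectorizable child of %1 and runs preparation and clean-up patterns.
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
}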
@@ -65,7 +65,7 @@ func.func @vectorize_nd_tensor_extract_constant_idx(%arg0: tensor<3x3xf32>, %arg
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  transform.structured.masked_vectorize %0 { vectorize_nd_extract } : !transform.any_op
  transform.structured.vectorize %0 { vectorize_nd_extract } : !transform.any_op
}

// -----
@@ -104,7 +104,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}

// -----
@@ -156,7 +156,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}

// -----
@@ -204,7 +204,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----

@@ -248,7 +248,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}

// -----
@@ -290,7 +290,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}

// -----
@@ -332,7 +332,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}

// -----
@@ -376,7 +376,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}

// -----
@@ -416,7 +416,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}

// -----
@@ -456,7 +456,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}

// -----
@@ -495,7 +495,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}

// -----
@@ -522,5 +522,5 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}

@@ -80,7 +80,7 @@ transform.with_pdl_patterns {
    transform.structured.tile %0 [4, 4, 4] : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
    %1 = pdl_match @pdl_target_attrC in %arg1 : (!transform.any_op) -> !transform.any_op
    %2 = get_parent_op %1 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
    transform.structured.vectorize %2 : (!transform.any_op) -> !transform.any_op
    transform.structured.vectorize_children_and_apply_patterns %2 : (!transform.any_op) -> !transform.any_op
  }
}

@@ -125,7 +125,7 @@ transform.with_pdl_patterns {
  ^bb1(%arg1: !transform.any_op):
    %0 = pdl_match @pdl_target in %arg1 : (!transform.any_op) -> !transform.any_op
    %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
    transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
    transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
  }
}

@@ -150,5 +150,5 @@ func.func @vectorize_all(

transform.sequence failures(propagate) {
^bb0(%arg0: !transform.any_op):
  transform.structured.vectorize %arg0 : (!transform.any_op) -> !transform.any_op
  transform.structured.vectorize_children_and_apply_patterns %arg0 : (!transform.any_op) -> !transform.any_op
}

@@ -19,7 +19,7 @@ transform.sequence failures(propagate) {
  %1, %loops:3 = transform.structured.tile %0 [8, 4, 2]
    : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
  %2 = get_parent_op %1 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  transform.structured.vectorize %2 : (!transform.any_op) -> !transform.any_op
  transform.structured.vectorize_children_and_apply_patterns %2 : (!transform.any_op) -> !transform.any_op
  %b = transform.bufferization.one_shot_bufferize
    layout{IdentityLayoutMap} %module_op
    {bufferize_function_boundaries = true, allow_return_allocs = true}

@@ -112,7 +112,7 @@ func.func @entry() {
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.fill"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  transform.structured.masked_vectorize %0 vector_sizes [[4], [4]] : !transform.any_op
  transform.structured.vectorize %0 vector_sizes [[4], [4]] : !transform.any_op
}

llvm.func @printCString(!llvm.ptr<i8>)

@@ -49,7 +49,7 @@ func.func @entry() {
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.fill"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  transform.structured.masked_vectorize %0 vector_sizes [[4]] : !transform.any_op
  transform.structured.vectorize %0 vector_sizes [[4]] : !transform.any_op
}

llvm.func @printCString(!llvm.ptr<i8>)

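The doubly bracketed entries in `vector_sizes` above appear to denote scalable vector dimensions; fixed and scalable sizes can be mixed, as the Python CHECK lines further below (`vector_sizes [16, ..., [4], [8]]`) suggest. A minimal sketch mixing both (sizes are illustrative):

  transform.structured.vectorize %0 vector_sizes [4, [4]] : !transform.any_op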
@@ -51,7 +51,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  %func_op = get_parent_op %0 : (!transform.any_op) -> !transform.op<"func.func">
  transform.structured.masked_vectorize %0 vector_sizes [4, 4, 2] : !transform.any_op
  transform.structured.vectorize %0 vector_sizes [4, 4, 2] : !transform.any_op
  transform.apply_patterns to %func_op {
    transform.apply_patterns.vector.lower_multi_reduction lowering_strategy = "innerreduction"
  } : !transform.op<"func.func">

@@ -171,68 +171,66 @@ def testMatchOpNamesList(target):

@run
@create_sequence
def testMaskedVectorizeNoArgs(target):
    structured.MaskedVectorizeOp(target)
# CHECK-LABEL: TEST: testMaskedVectorizeNoArgs
def testVectorizeNoArgs(target):
    structured.VectorizeOp(target)
# CHECK-LABEL: TEST: testVectorizeNoArgs
# CHECK: transform.sequence
# CHECK: transform.structured.masked_vectorize
# CHECK: transform.structured.vectorize
# CHECK-NOT: vector_sizes


@run
@create_sequence
def testMaskedVectorizeStatic(target):
    structured.MaskedVectorizeOp(target, [16, 4])
# CHECK-LABEL: TEST: testMaskedVectorizeStatic
def testVectorizeStatic(target):
    structured.VectorizeOp(target, [16, 4])
# CHECK-LABEL: TEST: testVectorizeStatic
# CHECK: transform.sequence
# CHECK: transform.structured.masked_vectorize
# CHECK: transform.structured.vectorize
# CHECK-SAME: vector_sizes [16, 4]


@run
@create_sequence
def testMaskedVectorizeArray(target):
def testVectorizeArray(target):
    sizes = Attribute.parse("[16, 4]")
    structured.MaskedVectorizeOp(target, sizes)
# CHECK-LABEL: TEST: testMaskedVectorizeArray
    structured.VectorizeOp(target, sizes)
# CHECK-LABEL: TEST: testVectorizeArray
# CHECK: transform.sequence
# CHECK: transform.structured.masked_vectorize
# CHECK: transform.structured.vectorize
# CHECK-SAME: vector_sizes [16, 4]


@run
@create_sequence
def testMaskedVectorizeMixed(target):
def testVectorizeMixed(target):
    sz1 = structured.MatchOp.match_op_names(target, ["arith.constant"])
    sz2 = Attribute.parse("4")
    structured.MaskedVectorizeOp(target, [sz1, sz2])
# CHECK-LABEL: TEST: testMaskedVectorizeMixed
    structured.VectorizeOp(target, [sz1, sz2])
# CHECK-LABEL: TEST: testVectorizeMixed
# CHECK: transform.sequence
# CHECK: %[[V0:.*]] = transform.structured.match
# CHECK: transform.structured.masked_vectorize
# CHECK: transform.structured.vectorize
# CHECK-SAME: vector_sizes [%[[V0]] : !transform.any_op, 4]


@run
@create_sequence
def testMaskedVectorizeScalable(target):
def testVectorizeScalable(target):
    sz1 = structured.MatchOp.match_op_names(target, ["arith.constant"])
    sz2 = Attribute.parse("4")
    structured.MaskedVectorizeOp(target, [16, [sz1], [sz2], [8]])
# CHECK-LABEL: TEST: testMaskedVectorizeScalable
    structured.VectorizeOp(target, [16, [sz1], [sz2], [8]])
# CHECK-LABEL: TEST: testVectorizeScalable
# CHECK: transform.sequence
# CHECK-DAG: %[[V0:.*]] = transform.structured.match
# CHECK-DAG: transform.structured.masked_vectorize
# CHECK-DAG: transform.structured.vectorize
# CHECK-SAME: vector_sizes [16, [%[[V0]] : !transform.any_op], [4], [8]]


@run
@create_sequence
def testMaskedVectorizeArgs(target):
    structured.MaskedVectorizeOp(target, [16, 4], vectorize_nd_extract=True)
# CHECK-LABEL: TEST: testMaskedVectorizeArgs
def testVectorizeArgs(target):
    structured.VectorizeOp(target, [16, 4], vectorize_nd_extract=True)
# CHECK-LABEL: TEST: testVectorizeArgs
# CHECK: transform.sequence
# CHECK: transform.structured.masked_vectorize
# CHECK: transform.structured.vectorize
# CHECK-SAME: vectorize_nd_extract

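On the Python side, the renamed builder keeps its arguments; only the class name changes from `MaskedVectorizeOp` to `VectorizeOp`. Assembling the CHECK lines above, a call such as `structured.VectorizeOp(target, [16, 4], vectorize_nd_extract=True)` should print IR roughly like this sketch:

transform.sequence failures(propagate) {
^bb0(%arg0: !transform.any_op):
  transform.structured.vectorize %arg0 vector_sizes [16, 4] vectorize_nd_extract : !transform.any_op
}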
@@ -497,15 +495,15 @@ def testTileToForallMapping(target):

@run
@create_sequence
def testVectorizeAllAttrs(target):
    structured.VectorizeOp(
def testVectorizeChildrenAndApplyPatternsAllAttrs(target):
    structured.VectorizeChildrenAndApplyPatternsOp(
        target,
        disable_multi_reduction_to_contract_patterns=True,
        disable_transfer_permutation_map_lowering_patterns=True,
        vectorize_nd_extract=True,
        vectorize_padding=True,
    )
# CHECK-LABEL: TEST: testVectorizeAllAttrs
# CHECK-LABEL: TEST: testVectorizeChildrenAndApplyPatternsAllAttrs
# CHECK: transform.sequence
# CHECK: = transform.structured.vectorize
# CHECK-SAME: disable_multi_reduction_to_contract_patterns
@@ -516,15 +514,15 @@ def testVectorizeAllAttrs(target):

@run
@create_sequence
def testVectorizeNoAttrs(target):
    structured.VectorizeOp(
def testVectorizeChildrenAndApplyPatternsNoAttrs(target):
    structured.VectorizeChildrenAndApplyPatternsOp(
        target,
        disable_multi_reduction_to_contract_patterns=False,
        disable_transfer_permutation_map_lowering_patterns=False,
        vectorize_nd_extract=False,
        vectorize_padding=False,
    )
# CHECK-LABEL: TEST: testVectorizeNoAttrs
# CHECK-LABEL: TEST: testVectorizeChildrenAndApplyPatternsNoAttrs
# CHECK: transform.sequence
# CHECK: = transform.structured.vectorize
# CHECK-NOT: disable_multi_reduction_to_contract_patterns

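Note that the `# CHECK: = transform.structured.vectorize` lines above still pass after the rename, since FileCheck matches substrings and `transform.structured.vectorize_children_and_apply_patterns` begins with that prefix. For reference, `structured.VectorizeChildrenAndApplyPatternsOp(target)` should print roughly this line (a sketch based on the CHECK lines, not verified output):

  %0 = transform.structured.vectorize_children_and_apply_patterns %arg0 : (!transform.any_op) -> !transform.any_op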