[mlir][linalg][transform] Rename {masked_vectorize => vectorize => vectorize_children_and...}. (#66575)

This PR renames the vectorization transform ops as follows:

* `structured.masked_vectorize` => `structured.vectorize`. This reflects
the fact that since [recently](https://reviews.llvm.org/D157774) the op
can also handle the unmasked case.
* `structured.vectorize` =>
`structured.vectorize_children_and_apply_patterns`. This reflects the
fact that the op does not just vectorize the given payload op but all
vectorizable children contained in it, and applies patterns before and
after for preparation and clean-up.
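
In transform scripts, the rename amounts to the following spelling change (illustrative snippet composed from usages that appear in the tests updated by this PR):

```mlir
// Before this PR:
//   transform.structured.masked_vectorize %0 vector_sizes [4] : !transform.any_op
//   %v = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
// After this PR:
transform.structured.vectorize %0 vector_sizes [4] : !transform.any_op
%v = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
```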

This rename was discussed first
[here](https://reviews.llvm.org/D157774).

The PR also adapts and cleans up the tablegen description of the
`VectorizeChildrenAndApplyPatternsOp` (formerly `VectorizeOp`).
Author: Ingo Müller
Committed: 2023-09-21 15:38:29 +02:00 (via GitHub)
Parent: c00f49cf12
Commit: 69bc1cbbff
21 changed files with 2369 additions and 2365 deletions


@@ -1947,37 +1947,39 @@ def TileToForallOp :
}
//===----------------------------------------------------------------------===//
// VectorizeOp
// VectorizeChildrenAndApplyPatternsOp
//===----------------------------------------------------------------------===//
def VectorizeOp : Op<Transform_Dialect, "structured.vectorize",
def VectorizeChildrenAndApplyPatternsOp :
Op<Transform_Dialect, "structured.vectorize_children_and_apply_patterns",
[FunctionalStyleTransformOpTrait, MemoryEffectsOpInterface,
TransformEachOpTrait, TransformOpInterface,
ReportTrackingListenerFailuresOpTrait]> {
let description = [{
Indicates that the given `target` op and all the ops it contains should be
vectorized with the configuration specified by the attributes of this op.
This vectorization only handles structured ops that operate on shaped types
and does not vectorize loops or straight-line code. Internally, it applies a
set of rewrite patterns, some of which enable vectorization and some of
which clean up the results. Therefore, it can only be applied to an op with
the "isolated from above" property. If finer granularity is required, it can
be achieved by outlining the target part of the payload IR into, e.g., a
function, performing the transformation, and inlining it back. This
transformation only fails if the entire pattern rewriting failed, i.e., it
does **not** fail when no ops were vectorized.
Vectorizes all children contained in the given `target` using the
configuration specified by the attributes of this op. This only vectorizes
structured ops that operate on shaped types and does not vectorize loops or
straight-line code. Internally, it applies a set of rewrite patterns, some of
which enable vectorization and some of which clean up the results.
Therefore, it can only be applied to an op with the "isolated from above"
property. This transformation only fails if the entire pattern rewriting
failed, i.e., it does **not** fail when no ops were vectorized.
Note that this transformation is invalidating the handles to any payload IR
Finer granularity can be achieved either with the `VectorizeOp` for
individual ops or by outlining the target part of the payload IR into, e.g.,
a function, performing this transformation, and inlining it back.
Note that this transformation invalidates the handles to any payload IR
operation that is contained inside the vectorization target.
This transformation supports the following attributes:
- `vectorize_padding`: a UnitAttr to activate the vectorization of
- `vectorize_padding`: a `UnitAttr` to activate the vectorization of
`tensor.pad` ops. Different pipelines may prefer to lower such ops to
loops.
- `disable_multi_reduction_to_contract_patterns`: a UnitAttr to deactivate
- `disable_multi_reduction_to_contract_patterns`: a `UnitAttr` to deactivate
the rewrite of `vector.multi_reduction` to `vector.contract`. This is
intended to be used in tests only.
- `disable_transfer_permutation_map_lowering_patterns`: a UnitAttr to
- `disable_transfer_permutation_map_lowering_patterns`: a `UnitAttr` to
deactivate the rewrite of `vector.transfer` with permutation maps into
explicit `vector.transpose` operations. This is intended to be used in
tests only but may be promoted to a first class attribute in the future.
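
Taken together, the attributes above compose as in this illustrative transform script (op and attribute spellings taken from the tests updated in this diff):

```mlir
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
  %0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
  // The target must be isolated from above, hence the hop to the parent op.
  %1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
  // Also vectorize tensor.pad ops instead of lowering them to loops.
  %2 = transform.structured.vectorize_children_and_apply_patterns %1 {vectorize_padding} : (!transform.any_op) -> !transform.any_op
}
```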
@@ -2015,7 +2017,7 @@ def VectorizeOp : Op<Transform_Dialect, "structured.vectorize",
}];
}
def MaskedVectorizeOp : Op<Transform_Dialect, "structured.masked_vectorize",
def VectorizeOp : Op<Transform_Dialect, "structured.vectorize",
[DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
TransformOpInterface, ReportTrackingListenerFailuresOpTrait]> {
let description = [{
@@ -2029,9 +2031,9 @@ def MaskedVectorizeOp : Op<Transform_Dialect, "structured.masked_vectorize",
```mlir
// Masked vectorization - vector sizes are specified explicitly
transform.structured.masked_vectorize %target vector_sizes [1, 4] : !transform.any_op
transform.structured.vectorize %target vector_sizes [1, 4] : !transform.any_op
// Regular vectorization - vector sizes are inferred from the target Op
transform.structured.masked_vectorize %target : !transform.any_op
transform.structured.vectorize %target : !transform.any_op
```
The vector sizes can be either static or dynamic (SSA values). In case of


@@ -2904,27 +2904,31 @@ LogicalResult TileToForallOp::verify() {
}
//===----------------------------------------------------------------------===//
// VectorizeOp
// VectorizeChildrenAndApplyPatternsOp
//===----------------------------------------------------------------------===//
void transform::VectorizeOp::build(OpBuilder &builder, OperationState &result,
Value target, bool vectorizePadding,
bool vectorizeExtract) {
void transform::VectorizeChildrenAndApplyPatternsOp::build(
OpBuilder &builder, OperationState &result, Value target,
bool vectorizePadding, bool vectorizeExtract) {
result.addOperands(target);
if (vectorizePadding) {
result.addAttribute(VectorizeOp::getVectorizePaddingAttrName(result.name),
builder.getUnitAttr());
result.addAttribute(
VectorizeChildrenAndApplyPatternsOp::getVectorizePaddingAttrName(
result.name),
builder.getUnitAttr());
}
if (vectorizeExtract) {
result.addAttribute(VectorizeOp::getVectorizeNdExtractAttrName(result.name),
builder.getUnitAttr());
result.addAttribute(
VectorizeChildrenAndApplyPatternsOp::getVectorizeNdExtractAttrName(
result.name),
builder.getUnitAttr());
}
result.addTypes(transform::AnyOpType::get(builder.getContext()));
}
namespace {
/// This is a helper only to call vectorize via a pattern inside of
/// VectorizeOp::applyToOne.
/// VectorizeChildrenAndApplyPatternsOp::applyToOne.
struct VectorizationPattern : public RewritePattern {
explicit VectorizationPattern(MLIRContext *context,
bool vectorizeExtract = false)
@@ -2947,10 +2951,10 @@ private:
} // namespace
DiagnosedSilenceableFailure
transform::VectorizeOp::applyToOne(transform::TransformRewriter &rewriter,
Operation *target,
transform::ApplyToEachResultList &results,
transform::TransformState &state) {
transform::VectorizeChildrenAndApplyPatternsOp::applyToOne(
transform::TransformRewriter &rewriter, Operation *target,
transform::ApplyToEachResultList &results,
transform::TransformState &state) {
if (!target->hasTrait<OpTrait::IsIsolatedFromAbove>()) {
auto diag = this->emitOpError("requires isolated-from-above targets");
diag.attachNote(target->getLoc()) << "non-isolated target";
@@ -2992,9 +2996,9 @@ transform::VectorizeOp::applyToOne(transform::TransformRewriter &rewriter,
}
//===----------------------------------------------------------------------===//
// MaskedVectorizeOp
// VectorizeOp
//===----------------------------------------------------------------------===//
DiagnosedSilenceableFailure transform::MaskedVectorizeOp::apply(
DiagnosedSilenceableFailure transform::VectorizeOp::apply(
transform::TransformRewriter &rewriter,
mlir::transform::TransformResults &transformResults,
mlir::transform::TransformState &state) {
@@ -3058,19 +3062,19 @@ DiagnosedSilenceableFailure transform::MaskedVectorizeOp::apply(
return DiagnosedSilenceableFailure::success();
}
void transform::MaskedVectorizeOp::getEffects(
void transform::VectorizeOp::getEffects(
SmallVectorImpl<MemoryEffects::EffectInstance> &effects) {
consumesHandle(getTarget(), effects);
onlyReadsHandle(getVectorSizes(), effects);
modifiesPayload(effects);
}
SmallVector<OpFoldResult> MaskedVectorizeOp::getMixedVectorSizes() {
SmallVector<OpFoldResult> VectorizeOp::getMixedVectorSizes() {
OpBuilder b(getContext());
return getMixedValues(getStaticVectorSizes(), getVectorSizes(), b);
}
LogicalResult transform::MaskedVectorizeOp::verify() {
LogicalResult transform::VectorizeOp::verify() {
if (getStaticVectorSizes().size() != getScalableSizes().size())
return emitOpError("expected same number of vector sizes (")
<< getStaticVectorSizes().size() << ") and scalable sizes ("


@@ -360,8 +360,8 @@ class MapCopyToThreadsOp:
)
class MaskedVectorizeOp:
"""Specialization for MaskedVectorizeOp class."""
class VectorizeOp:
"""Specialization for VectorizeOp class."""
def __init__(
self,
@@ -730,8 +730,8 @@ class TileToForallOp:
)
class VectorizeOp:
"""Specialization for VectorizeOp class."""
class VectorizeChildrenAndApplyPatternsOp:
"""Specialization for VectorizeChildrenAndApplyPatternsOp class."""
def __init__(
self,


@@ -17,7 +17,7 @@ transform.sequence failures(propagate) {
%0 = transform.structured.match ops{["linalg.matmul"]} in %module_op : (!transform.any_op) -> !transform.any_op
%1, %loops:3 = transform.structured.tile %0 [2, 2, 2] : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
%2 = get_parent_op %1 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
transform.structured.vectorize %2 : (!transform.any_op) -> !transform.any_op
transform.structured.vectorize_children_and_apply_patterns %2 : (!transform.any_op) -> !transform.any_op
%b = transform.bufferization.one_shot_bufferize layout{IdentityLayoutMap}
%module_op {bufferize_function_boundaries = true}
: (!transform.any_op) -> !transform.any_op


@@ -80,7 +80,7 @@ transform.sequence failures(propagate) {
: (!transform.any_op) -> (!transform.any_op, !transform.any_op)
// Apply masked vectorization to padding ops.
transform.structured.masked_vectorize %tiled_pad_op vector_sizes [128, 4]
transform.structured.vectorize %tiled_pad_op vector_sizes [128, 4]
: !transform.any_op
// Assign shared memory buffer to padding.
@@ -105,7 +105,7 @@ transform.sequence failures(propagate) {
: (!transform.any_op) -> !transform.any_op
%bufferized_copy_back = transform.structured.match ops{["linalg.copy"]} in %func_op_2
: (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize
transform.structured.vectorize
%bufferized_copy_back vector_sizes [128, 4] : !transform.any_op
// Canonicalize, cleanup and vector lowering. This step also removes buffer
@@ -192,7 +192,7 @@ transform.sequence failures(propagate) {
}
// Apply masked vectorization to padding ops.
transform.structured.masked_vectorize %tiled_pad_op vector_sizes [128, 4]
transform.structured.vectorize %tiled_pad_op vector_sizes [128, 4]
: !transform.any_op
// Assign shared memory buffer to padding.
@@ -217,7 +217,7 @@ transform.sequence failures(propagate) {
: (!transform.any_op) -> !transform.any_op
%bufferized_copy_back = transform.structured.match ops{["linalg.copy"]} in %func_op_2
: (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize
transform.structured.vectorize
%bufferized_copy_back vector_sizes [128, 4] : !transform.any_op
// Canonicalize, cleanup and vector lowering. This step also removes buffer


@@ -111,7 +111,7 @@ transform.sequence failures(propagate) {
padding_dimensions=[0, 1, 2],
pack_paddings=[1, 1, 1]
} : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)
transform.structured.masked_vectorize %pad vector_sizes [10, 12] : !transform.any_op
transform.structured.vectorize %pad vector_sizes [10, 12] : !transform.any_op
%vector_write = transform.structured.match ops{["vector.transfer_write"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%mask_op = transform.get_parent_op %vector_write {op_name = "vector.mask"} : (!transform.any_op) -> !transform.any_op
%buffer, %new_ops = transform.structured.bufferize_to_allocation %mask_op {memory_space = 3, emit_dealloc} : !transform.any_op


@@ -26,7 +26,7 @@ transform.sequence failures(propagate) {
: (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
%tiled_linalg_op_0, %loops_1:3 = transform.structured.tile %tiled_linalg_op[8, 8, 8]
: (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
transform.structured.masked_vectorize %tiled_linalg_op_0 vector_sizes [8, 8, 8]
transform.structured.vectorize %tiled_linalg_op_0 vector_sizes [8, 8, 8]
: !transform.any_op
%func = transform.structured.match ops{["func.func"]} in %module


@@ -31,7 +31,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
transform.apply_patterns to %2 {
transform.apply_patterns.vector.lower_contraction lowering_strategy = "outerproduct"
} : !transform.any_op


@@ -20,7 +20,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -45,7 +45,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -65,7 +65,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.copy"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -111,7 +111,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -159,7 +159,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 {vectorize_padding} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 {vectorize_padding} : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -176,5 +176,5 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
// expected-error @below {{op requires isolated-from-above targets}}
%2 = transform.structured.vectorize %0 : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %0 : (!transform.any_op) -> !transform.any_op
}


@@ -1,514 +0,0 @@
// RUN: mlir-opt %s -test-transform-dialect-interpreter -split-input-file | FileCheck %s
func.func @vectorize_dynamic_identity(%arg0: tensor<?xf32>,
%arg1: tensor<?xf32>,
%arg2: tensor<?xf32>) -> tensor<?xf32> {
%0 = linalg.generic { indexing_maps = [affine_map<(d0) -> (d0)>,
affine_map<(d0) -> (d0)>,
affine_map<(d0) -> (d0)>],
iterator_types = ["parallel"] }
ins(%arg0, %arg1 : tensor<?xf32>, tensor<?xf32>)
outs(%arg2 : tensor<?xf32>) {
^bb(%in0: f32, %in1: f32, %out: f32) :
%0 = arith.addf %in0, %in1 : f32
linalg.yield %0 : f32
} -> tensor<?xf32>
return %0 : tensor<?xf32>
}
// CHECK-LABEL: @vectorize_dynamic_identity
// CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?xf32>
// CHECK: %[[VAL_7:.*]] = vector.create_mask %[[VAL_4]] : vector<4xi1>
// CHECK: %[[VAL_8:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
// CHECK: %[[VAL_13:.*]] = arith.addf %[[VAL_8]], %[[VAL_10]] : vector<4xf32>
// CHECK: %[[VAL_14:.*]] = vector.mask %[[VAL_7]] { vector.transfer_write %{{.*}} {in_bounds = [true]} : vector<4xf32>, tensor<?xf32> } : vector<4xi1> -> tensor<?xf32>
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [4] : !transform.any_op
}
// -----
func.func @vectorize_dynamic_1d_broadcast(%arg0: tensor<?xf32>,
%arg1: tensor<?xf32>,
%arg2: tensor<?xf32>) -> tensor<?xf32> {
%0 = linalg.generic { indexing_maps = [affine_map<(d0) -> (0)>,
affine_map<(d0) -> (d0)>,
affine_map<(d0) -> (d0)>],
iterator_types = ["parallel"] }
ins(%arg0, %arg1 : tensor<?xf32>, tensor<?xf32>)
outs(%arg2 : tensor<?xf32>) {
^bb(%in0: f32, %in1: f32, %out: f32) :
%0 = arith.addf %in0, %in1 : f32
linalg.yield %0 : f32
} -> tensor<?xf32>
return %0 : tensor<?xf32>
}
// CHECK-LABEL: @vectorize_dynamic_1d_broadcast
// CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?xf32>
// CHECK: %[[VAL_7:.*]] = vector.transfer_read %{{.*}} {permutation_map = #{{.*}}} : tensor<?xf32>, vector<4xf32>
// CHECK: %[[VAL_9:.*]] = vector.create_mask %[[VAL_4]] : vector<4xi1>
// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
// CHECK: %[[VAL_13:.*]] = arith.addf %[[VAL_7]], %[[VAL_10]] : vector<4xf32>
// CHECK: %[[VAL_14:.*]] = vector.mask %{{.*}} { vector.transfer_write %[[VAL_13]], {{.*}} {in_bounds = [true]} : vector<4xf32>, tensor<?xf32> } : vector<4xi1> -> tensor<?xf32>
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [4] : !transform.any_op
}
// -----
func.func @vectorize_dynamic_2d_transpose(%arg0: tensor<?x?xf32>,
%arg1: tensor<?x?xf32>,
%arg2: tensor<?x?xf32>) -> tensor<?x?xf32> {
%0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (d1, d0)>,
affine_map<(d0, d1) -> (d0, d1)>,
affine_map<(d0, d1) -> (d0, d1)>],
iterator_types = ["parallel", "parallel"] }
ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>)
outs(%arg2 : tensor<?x?xf32>) {
^bb(%in0: f32, %in1: f32, %out: f32) :
%0 = arith.addf %in0, %in1 : f32
linalg.yield %0 : f32
} -> tensor<?x?xf32>
return %0 : tensor<?x?xf32>
}
// CHECK-LABEL: @vectorize_dynamic_2d_transpose
// CHECK: %[[VAL_3:.*]] = arith.constant 1 : index
// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?x?xf32>
// CHECK: %[[VAL_5:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_6:.*]] = tensor.dim %{{.*}}, %[[VAL_5]] : tensor<?x?xf32>
// CHECK: %[[VAL_9:.*]] = vector.create_mask %[[VAL_6]], %[[VAL_4]] : vector<8x4xi1>
// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true, true], permutation_map = #{{.*}}} : tensor<?x?xf32>, vector<4x8xf32> } : vector<8x4xi1> -> vector<4x8xf32>
// CHECK: %[[VAL_12:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]] : vector<4x8xi1>
// CHECK: %[[VAL_13:.*]] = vector.mask %[[VAL_12]] { vector.transfer_read %{{.*}} {in_bounds = [true, true]} : tensor<?x?xf32>, vector<4x8xf32> } : vector<4x8xi1> -> vector<4x8xf32>
// CHECK: %[[VAL_14:.*]] = arith.constant 0.000000e+00 : f32
// CHECK: %[[VAL_15:.*]] = vector.mask %[[VAL_12]] { vector.transfer_read %{{.*}} {in_bounds = [true, true]} : tensor<?x?xf32>, vector<4x8xf32> } : vector<4x8xi1> -> vector<4x8xf32>
// CHECK: %[[VAL_16:.*]] = arith.addf %[[VAL_10]], %[[VAL_13]] : vector<4x8xf32>
// CHECK: %[[VAL_17:.*]] = vector.mask %[[VAL_12]] { vector.transfer_write %[[VAL_16]], %{{.*}} {in_bounds = [true, true]} : vector<4x8xf32>, tensor<?x?xf32> } : vector<4x8xi1> -> tensor<?x?xf32>
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [4, 8] : !transform.any_op
}
// -----
func.func @vectorize_dynamic_generic_2d_broadcast(%arg0: tensor<?x?xf32>,
%arg1: tensor<?x?xf32>,
%arg2: tensor<?x?xf32>) -> tensor<?x?xf32> {
%0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (0, d1)>,
affine_map<(d0, d1) -> (d0, d1)>,
affine_map<(d0, d1) -> (d0, d1)>],
iterator_types = ["parallel", "parallel"] }
ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>)
outs(%arg2 : tensor<?x?xf32>) {
^bb(%in0: f32, %in1: f32, %out: f32) :
%0 = arith.addf %in0, %in1 : f32
linalg.yield %0 : f32
} -> tensor<?x?xf32>
return %0 : tensor<?x?xf32>
}
// CHECK-LABEL: @vectorize_dynamic_generic_2d_broadcast
// CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?x?xf32>
// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index
// CHECK: %[[VAL_6:.*]] = tensor.dim %{{.*}}, %[[VAL_5]] : tensor<?x?xf32>
// CHECK: %[[VAL_9:.*]] = vector.create_mask %[[VAL_6]] : vector<8xi1>
// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true, true], permutation_map = #{{.*}}} : tensor<?x?xf32>, vector<4x8xf32> } : vector<8xi1> -> vector<4x8xf32>
// CHECK: %[[VAL_12:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]] : vector<4x8xi1>
// CHECK: %[[VAL_13:.*]] = vector.mask %[[VAL_12]] { vector.transfer_read %{{.*}} {in_bounds = [true, true]} : tensor<?x?xf32>, vector<4x8xf32> } : vector<4x8xi1> -> vector<4x8xf32>
// CHECK: %[[VAL_15:.*]] = vector.mask %[[VAL_12]] { vector.transfer_read %{{.*}} {in_bounds = [true, true]} : tensor<?x?xf32>, vector<4x8xf32> } : vector<4x8xi1> -> vector<4x8xf32>
// CHECK: %[[VAL_16:.*]] = arith.addf %[[VAL_10]], %[[VAL_13]] : vector<4x8xf32>
// CHECK: %[[VAL_18:.*]] = vector.mask %[[VAL_12]] { vector.transfer_write %{{.*}} {in_bounds = [true, true]} : vector<4x8xf32>, tensor<?x?xf32> } : vector<4x8xi1> -> tensor<?x?xf32>
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [4, 8] : !transform.any_op
}
// -----
func.func @vectorize_dynamic_reduction(%arg0: tensor<?x?xf32>,
%arg1: tensor<?xf32>) -> tensor<?xf32> {
%0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
affine_map<(d0, d1) -> (d0)>],
iterator_types = ["parallel", "reduction"] }
ins(%arg0 : tensor<?x?xf32>)
outs(%arg1 : tensor<?xf32>) {
^bb(%in: f32, %out: f32) :
%0 = arith.addf %in, %out : f32
linalg.yield %0 : f32
} -> tensor<?xf32>
return %0 : tensor<?xf32>
}
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [4, 8] : !transform.any_op
}
// CHECK-LABEL: @vectorize_dynamic_reduction(
// CHECK-SAME: %[[VAL_0:.*]]: tensor<?x?xf32>,
// CHECK-SAME: %[[VAL_1:.*]]: tensor<?xf32>) -> tensor<?xf32> {
// CHECK: %[[VAL_2:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_3:.*]] = tensor.dim %[[VAL_0]], %[[VAL_2]] : tensor<?x?xf32>
// CHECK: %[[VAL_4:.*]] = arith.constant 1 : index
// CHECK: %[[VAL_5:.*]] = tensor.dim %[[VAL_0]], %[[VAL_4]] : tensor<?x?xf32>
// CHECK: %[[VAL_8:.*]] = vector.create_mask %[[VAL_3]], %[[VAL_5]] : vector<4x8xi1>
// CHECK: %[[VAL_9:.*]] = vector.mask %[[VAL_8]] { vector.transfer_read %[[VAL_0]]{{.*}} {in_bounds = [true, true]} : tensor<?x?xf32>, vector<4x8xf32> } : vector<4x8xi1> -> vector<4x8xf32>
// CHECK: %[[VAL_11:.*]] = vector.create_mask %[[VAL_3]] : vector<4xi1>
// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_11]] { vector.transfer_read %[[VAL_1]]{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
// CHECK: %[[VAL_13:.*]] = vector.mask %[[VAL_8]] { vector.multi_reduction <add>, %[[VAL_9]], %[[VAL_12]] [1] : vector<4x8xf32> to vector<4xf32> } : vector<4x8xi1> -> vector<4xf32>
// CHECK: %[[VAL_15:.*]] = vector.mask %[[VAL_11]] { vector.transfer_write %[[VAL_13]], %[[VAL_1]]{{.*}} {in_bounds = [true]} : vector<4xf32>, tensor<?xf32> } : vector<4xi1> -> tensor<?xf32>
// CHECK: return %[[VAL_15]] : tensor<?xf32>
// CHECK: }
// -----
func.func @vectorize_dynamic_transpose_reduction(%arg0: tensor<?x?x?xf32>,
%arg1: tensor<?x?xf32>) -> tensor<?x?xf32> {
%0 = linalg.generic { indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>,
affine_map<(d0, d1, d2) -> (d2, d1)>],
iterator_types = ["reduction", "parallel", "parallel"] }
ins(%arg0 : tensor<?x?x?xf32>)
outs(%arg1 : tensor<?x?xf32>) {
^bb(%in: f32, %out: f32) :
%0 = arith.addf %in, %out : f32
linalg.yield %0 : f32
} -> tensor<?x?xf32>
return %0 : tensor<?x?xf32>
}
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [4, 8, 16] : !transform.any_op
}
// CHECK-LABEL: @vectorize_dynamic_transpose_reduction(
// CHECK-SAME: %[[VAL_0:.*]]: tensor<?x?x?xf32>,
// CHECK-SAME: %[[VAL_1:.*]]: tensor<?x?xf32>) -> tensor<?x?xf32> {
// CHECK: %[[VAL_2:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_3:.*]] = tensor.dim %[[VAL_0]], %[[VAL_2]] : tensor<?x?x?xf32>
// CHECK: %[[VAL_4:.*]] = arith.constant 1 : index
// CHECK: %[[VAL_5:.*]] = tensor.dim %[[VAL_0]], %[[VAL_4]] : tensor<?x?x?xf32>
// CHECK: %[[VAL_6:.*]] = arith.constant 2 : index
// CHECK: %[[VAL_7:.*]] = tensor.dim %[[VAL_0]], %[[VAL_6]] : tensor<?x?x?xf32>
// CHECK: %[[VAL_10:.*]] = vector.create_mask %[[VAL_3]], %[[VAL_5]], %[[VAL_7]] : vector<4x8x16xi1>
// CHECK: %[[VAL_11:.*]] = vector.mask %[[VAL_10]] { vector.transfer_read %[[VAL_0]]{{.*}} {in_bounds = [true, true, true]} : tensor<?x?x?xf32>, vector<4x8x16xf32> } : vector<4x8x16xi1> -> vector<4x8x16xf32>
// CHECK: %[[VAL_13:.*]] = vector.create_mask %[[VAL_7]], %[[VAL_5]] : vector<16x8xi1>
// CHECK: %[[VAL_14:.*]] = vector.mask %[[VAL_13]] { vector.transfer_read %[[VAL_1]]{{.*}} {in_bounds = [true, true], permutation_map = #{{.*}}} : tensor<?x?xf32>, vector<8x16xf32> } : vector<16x8xi1> -> vector<8x16xf32>
// CHECK: %[[VAL_15:.*]] = vector.mask %[[VAL_10]] { vector.multi_reduction <add>, %[[VAL_11]], %[[VAL_14]] [0] : vector<4x8x16xf32> to vector<8x16xf32> } : vector<4x8x16xi1> -> vector<8x16xf32>
// CHECK: %[[VAL_17:.*]] = vector.mask %[[VAL_13]] { vector.transfer_write %[[VAL_15]], %{{.*}} {in_bounds = [true, true], permutation_map = #{{.*}}} : vector<8x16xf32>, tensor<?x?xf32> } : vector<16x8xi1> -> tensor<?x?xf32>
// -----
func.func @vectorize_partial_dynamic_identity(%arg0: tensor<8x?xf32>,
%arg1: tensor<8x?xf32>,
%arg2: tensor<8x?xf32>) -> tensor<8x?xf32> {
%0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
affine_map<(d0, d1) -> (d0, d1)>,
affine_map<(d0, d1) -> (d0, d1)>],
iterator_types = ["parallel", "parallel"] }
ins(%arg0, %arg1 : tensor<8x?xf32>, tensor<8x?xf32>)
outs(%arg2 : tensor<8x?xf32>) {
^bb(%in0: f32, %in1: f32, %out: f32) :
%0 = arith.addf %in0, %in1 : f32
linalg.yield %0 : f32
} -> tensor<8x?xf32>
return %0 : tensor<8x?xf32>
}
// CHECK-LABEL: func.func @vectorize_partial_dynamic_identity(
// CHECK-SAME: %[[VAL_0:.*]]: tensor<8x?xf32>, %[[VAL_1:.*]]: tensor<8x?xf32>, %[[VAL_2:.*]]: tensor<8x?xf32>) -> tensor<8x?xf32> {
// CHECK-DAG: %[[VAL_3:.*]] = arith.constant 1 : index
// CHECK-DAG: %[[VAL_4:.*]] = tensor.dim %[[VAL_0]], %[[VAL_3]] : tensor<8x?xf32>
// CHECK-DAG: %[[VAL_5:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[VAL_6:.*]] = arith.constant 0.000000e+00 : f32
// CHECK-DAG: %[[VAL_7:.*]] = arith.constant 8 : index
// CHECK: %[[VAL_8:.*]] = vector.create_mask %[[VAL_7]], %[[VAL_4]] : vector<8x32xi1>
// CHECK: %[[VAL_9:.*]] = vector.mask %[[VAL_8]] { vector.transfer_read %[[VAL_0]][%[[VAL_5]], %[[VAL_5]]], %[[VAL_6]] {in_bounds = [true, true]} : tensor<8x?xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
// CHECK: %[[VAL_10:.*]] = arith.constant 0.000000e+00 : f32
// CHECK: %[[VAL_11:.*]] = vector.mask %[[VAL_8]] { vector.transfer_read %[[VAL_1]][%[[VAL_5]], %[[VAL_5]]], %[[VAL_10]] {in_bounds = [true, true]} : tensor<8x?xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
// CHECK: %[[VAL_12:.*]] = arith.constant 0.000000e+00 : f32
// CHECK: %[[VAL_13:.*]] = vector.mask %[[VAL_8]] { vector.transfer_read %[[VAL_2]][%[[VAL_5]], %[[VAL_5]]], %[[VAL_12]] {in_bounds = [true, true]} : tensor<8x?xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
// CHECK: %[[VAL_14:.*]] = arith.addf %[[VAL_9]], %[[VAL_11]] : vector<8x32xf32>
// CHECK: %[[VAL_15:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_16:.*]] = vector.mask %[[VAL_8]] { vector.transfer_write %[[VAL_14]], %[[VAL_2]][%[[VAL_15]], %[[VAL_15]]] {in_bounds = [true, true]} : vector<8x32xf32>, tensor<8x?xf32> } : vector<8x32xi1> -> tensor<8x?xf32>
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [8, 32] : !transform.any_op
}
// -----
func.func @do_not_generate_masks(%arg0: tensor<8x32xf32>,
%arg1: tensor<8x32xf32>,
%arg2: tensor<8x32xf32>) -> tensor<8x32xf32> {
%0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
affine_map<(d0, d1) -> (d0, d1)>,
affine_map<(d0, d1) -> (d0, d1)>],
iterator_types = ["parallel", "parallel"] }
ins(%arg0, %arg1 : tensor<8x32xf32>, tensor<8x32xf32>)
outs(%arg2 : tensor<8x32xf32>) {
^bb(%in0: f32, %in1: f32, %out: f32) :
%0 = arith.addf %in0, %in1 : f32
linalg.yield %0 : f32
} -> tensor<8x32xf32>
return %0 : tensor<8x32xf32>
}
// CHECK-LABEL: func.func @do_not_generate_masks
// CHECK-NOT: vector.mask
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [8, 32] : !transform.any_op
}
// -----
func.func @vectorize_static_shape_with_mask(%arg0: tensor<8x30xf32>,
%arg1: tensor<8x30xf32>,
%arg2: tensor<8x30xf32>) -> tensor<8x30xf32> {
%0 = linalg.generic { indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
affine_map<(d0, d1) -> (d0, d1)>,
affine_map<(d0, d1) -> (d0, d1)>],
iterator_types = ["parallel", "parallel"] }
ins(%arg0, %arg1 : tensor<8x30xf32>, tensor<8x30xf32>)
outs(%arg2 : tensor<8x30xf32>) {
^bb(%in0: f32, %in1: f32, %out: f32) :
%0 = arith.addf %in0, %in1 : f32
linalg.yield %0 : f32
} -> tensor<8x30xf32>
return %0 : tensor<8x30xf32>
}
// CHECK-LABEL: func.func @vectorize_static_shape_with_mask(
// CHECK-SAME: %[[VAL_0:.*]]: tensor<8x30xf32>, %[[VAL_1:.*]]: tensor<8x30xf32>, %[[VAL_2:.*]]: tensor<8x30xf32>) -> tensor<8x30xf32> {
// CHECK-DAG: %[[VAL_3:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[VAL_4:.*]] = arith.constant 0.000000e+00 : f32
// CHECK-DAG: %[[VAL_5:.*]] = arith.constant 8 : index
// CHECK-DAG: %[[VAL_6:.*]] = arith.constant 30 : index
// CHECK: %[[VAL_7:.*]] = vector.create_mask %[[VAL_5]], %[[VAL_6]] : vector<8x32xi1>
// CHECK: %[[VAL_8:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %[[VAL_0]][%[[VAL_3]], %[[VAL_3]]], %[[VAL_4]] {in_bounds = [true, true]} : tensor<8x30xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
// CHECK: %[[VAL_9:.*]] = arith.constant 0.000000e+00 : f32
// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %[[VAL_1]][%[[VAL_3]], %[[VAL_3]]], %[[VAL_9]] {in_bounds = [true, true]} : tensor<8x30xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
// CHECK: %[[VAL_11:.*]] = arith.constant 0.000000e+00 : f32
// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_7]] { vector.transfer_read %[[VAL_2]][%[[VAL_3]], %[[VAL_3]]], %[[VAL_11]] {in_bounds = [true, true]} : tensor<8x30xf32>, vector<8x32xf32> } : vector<8x32xi1> -> vector<8x32xf32>
// CHECK: %[[VAL_13:.*]] = arith.addf %[[VAL_8]], %[[VAL_10]] : vector<8x32xf32>
// CHECK: %[[VAL_14:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_15:.*]] = vector.mask %[[VAL_7]] { vector.transfer_write %[[VAL_13]], %[[VAL_2]][%[[VAL_14]], %[[VAL_14]]] {in_bounds = [true, true]} : vector<8x32xf32>, tensor<8x30xf32> } : vector<8x32xi1> -> tensor<8x30xf32>
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [8, 32] : !transform.any_op
}
// -----
func.func @vectorize_dynamic_fill(%A : tensor<?x?xf32>, %arg0 : f32) -> tensor<?x?xf32> {
%0 = linalg.fill ins(%arg0 : f32) outs(%A : tensor<?x?xf32>) -> tensor<?x?xf32>
return %0 : tensor<?x?xf32>
}
// CHECK-LABEL: func.func @vectorize_dynamic_fill
// CHECK: %[[DIM0:.*]] = tensor.dim
// CHECK: %[[DIM1:.*]] = tensor.dim
// CHECK: %[[MASK:.*]] = vector.create_mask %[[DIM0]], %[[DIM1]] : vector<8x16xi1>
// CHECK: %[[BCAST:.*]] = vector.broadcast %{{.*}} : f32 to vector<8x16xf32>
// CHECK: vector.mask %[[MASK]] { vector.transfer_write %[[BCAST]], {{.*}} {in_bounds = [true, true]} : vector<8x16xf32>, tensor<?x?xf32> } : vector<8x16xi1>
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.fill"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [8, 16] : !transform.any_op
}
// -----
// CHECK-LABEL: func @test_masked_vectorize_linalg_copy
func.func @test_masked_vectorize_linalg_copy(%A : memref<?x?xf32>, %B : memref<?x?xf32>) {
// CHECK: %[[c0:.*]] = arith.constant 0 : index
// CHECK: %[[d0:.*]] = memref.dim %{{.*}}, %[[c0]] : memref<?x?xf32>
// CHECK: %[[c1:.*]] = arith.constant 1 : index
// CHECK: %[[d1:.*]] = memref.dim %{{.*}}, %[[c1]] : memref<?x?xf32>
// CHECK: %[[mask:.*]] = vector.create_mask %[[d0]], %[[d1]] : vector<2x4xi1>
// CHECK: vector.mask %[[mask]] {{.*}} vector.transfer_read %{{.*}} {in_bounds = [true, true]} : memref<?x?xf32>, vector<2x4xf32> } : vector<2x4xi1> -> vector<2x4xf32>
// CHECK: vector.mask %[[mask]] {{.*}} vector.transfer_write %{{.*}} {in_bounds = [true, true]} : vector<2x4xf32>, memref<?x?xf32> } : vector<2x4xi1>
linalg.copy ins(%A : memref<?x?xf32>) outs(%B : memref<?x?xf32>)
return
}
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.copy"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [2, 4] : !transform.any_op
}
// -----
// CHECK-LABEL: func @test_masked_vectorize_pad
func.func @test_masked_vectorize_pad(
%0 : tensor<?x?xf32>, %h0 : index, %h1 : index)
-> tensor<2x4xf32>
{
// CHECK-DAG: %[[c42:.*]] = arith.constant 4.243000e+01 : f32
// CHECK-DAG: %[[c0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[empty:.*]] = tensor.empty() : tensor<2x4xf32>
// CHECK: %[[d0:.*]] = tensor.dim {{.*}} : tensor<?x?xf32>
// CHECK: %[[d1:.*]] = tensor.dim {{.*}} : tensor<?x?xf32>
// CHECK: %[[mask:.*]] = vector.create_mask %[[d0]], %[[d1]] : vector<2x4xi1>
// CHECK-DAG: %[[c0_2:.*]] = arith.constant 0 : index
// CHECK: %[[masked_read:.*]] = vector.mask %[[mask]] {
// CHECK-SAME: vector.transfer_read %{{.*}}[%[[c0_2]], %[[c0_2]]], %[[c42]]
// CHECK-SAME: {in_bounds = [true, true]} : tensor<?x?xf32>, vector<2x4xf32>
// CHECK-SAME: } : vector<2x4xi1> -> vector<2x4xf32>
// CHECK: vector.transfer_write %[[masked_read]], %[[empty]][%[[c0_2]], %[[c0_2]]]
// CHECK-SAME: {in_bounds = [true, true]} : vector<2x4xf32>, tensor<2x4xf32>
%cst = arith.constant 42.43 : f32
%c0 = arith.constant 0 : index
%1 = tensor.pad %0 low[0, %c0] high[%h0, %h1] {
^bb0(%hh1: index, %hh2: index):
tensor.yield %cst : f32
} : tensor<?x?xf32> to tensor<2x4xf32>
return %1: tensor<2x4xf32>
}
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["tensor.pad"]} in %arg1
: (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [2, 4] : !transform.any_op
}
// -----
// CHECK: #[[MAP:.+]] = affine_map<()[s0, s1] -> (s0 + s1)>
// CHECK: func @test_masked_vectorize_dynamic_pad
func.func @test_masked_vectorize_dynamic_pad(
%0 : tensor<?x?xf32>, %h0 : index, %h1 : index)
-> tensor<?x?xf32>
{
// CHECK-DAG: %[[c42:.*]] = arith.constant 4.243000e+01 : f32
// CHECK-DAG: %[[c0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[res_d0:.+]] = affine.apply #[[MAP]]()
// CHECK-DAG: %[[res_d1:.+]] = affine.apply #[[MAP]]()
// CHECK-DAG: %[[empty:.*]] = tensor.empty(%[[res_d0]], %[[res_d1]]) : tensor<?x?xf32>
// CHECK: %[[d0:.*]] = tensor.dim {{.*}} : tensor<?x?xf32>
// CHECK: %[[d1:.*]] = tensor.dim {{.*}} : tensor<?x?xf32>
// CHECK: %[[mask:.*]] = vector.create_mask %[[d0]], %[[d1]] : vector<2x4xi1>
// CHECK-DAG: %[[c0_2:.*]] = arith.constant 0 : index
// CHECK: %[[masked_read:.*]] = vector.mask %[[mask]] {
// CHECK-SAME: vector.transfer_read %{{.*}}[%[[c0_2]], %[[c0_2]]], %[[c42]]
// CHECK-SAME: {in_bounds = [true, true]} : tensor<?x?xf32>, vector<2x4xf32>
// CHECK-SAME: } : vector<2x4xi1> -> vector<2x4xf32>
// CHECK: %[[mask_2:.*]] = vector.create_mask %[[res_d0]], %[[res_d1]] : vector<2x4xi1>
// CHECK: %[[masked_write:.*]] = vector.mask %[[mask_2]] {
// CHECK-SAME: vector.transfer_write %[[masked_read]], %[[empty]][%[[c0_2]], %[[c0_2]]]
// CHECK-SAME: {in_bounds = [true, true]} : vector<2x4xf32>, tensor<?x?xf32>
// CHECK: return %[[masked_write]] : tensor<?x?xf32>
%cst = arith.constant 42.43 : f32
%c0 = arith.constant 0 : index
%1 = tensor.pad %0 low[0, %c0] high[%h0, %h1] {
^bb0(%hh1: index, %hh2: index):
tensor.yield %cst : f32
} : tensor<?x?xf32> to tensor<?x?xf32>
return %1: tensor<?x?xf32>
}
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["tensor.pad"]} in %arg1
: (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [2, 4] : !transform.any_op
}
// -----
func.func @matmul(%A: memref<?x?xf32>, %B: memref<?x?xf32>, %C: memref<?x?xf32>) {
linalg.matmul ins(%A, %B: memref<?x?xf32>, memref<?x?xf32>)
outs(%C: memref<?x?xf32>)
return
}
// CHECK-LABEL: func.func @matmul(
// CHECK-SAME: %[[A:.*]]: memref<?x?xf32>, %[[B:.*]]: memref<?x?xf32>, %[[C:.*]]: memref<?x?xf32>) {
// CHECK-DAG: %[[VAL_3:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[VAL_4:.*]] = memref.dim %[[A]], %[[VAL_3]] : memref<?x?xf32>
// CHECK-DAG: %[[VAL_5:.*]] = arith.constant 1 : index
// CHECK-DAG: %[[VAL_6:.*]] = memref.dim %[[B]], %[[VAL_5]] : memref<?x?xf32>
// CHECK-DAG: %[[VAL_7:.*]] = arith.constant 1 : index
// CHECK-DAG: %[[VAL_8:.*]] = memref.dim %[[A]], %[[VAL_7]] : memref<?x?xf32>
// CHECK: %[[MASK_A:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_8]] : vector<8x4xi1>
// CHECK: %[[LOAD_A:.*]] = vector.mask %[[MASK_A]] { vector.transfer_read %[[A]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true, true], permutation_map = #{{.*}}} : memref<?x?xf32>, vector<8x16x4xf32> } : vector<8x4xi1> -> vector<8x16x4xf32>
// CHECK: %[[MASK_B:.*]] = vector.create_mask %[[VAL_8]], %[[VAL_6]] : vector<4x16xi1>
// CHECK: %[[LOAD_B:.*]] = vector.mask %[[MASK_B]] { vector.transfer_read %[[B]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true, true], permutation_map = #{{.*}}} : memref<?x?xf32>, vector<8x16x4xf32> } : vector<4x16xi1> -> vector<8x16x4xf32>
// CHECK: %[[MASK_C:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]] : vector<8x16xi1>
// CHECK: %[[LOAD_C:.*]] = vector.mask %[[MASK_C]] { vector.transfer_read %[[C]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true]} : memref<?x?xf32>, vector<8x16xf32> } : vector<8x16xi1> -> vector<8x16xf32>
// CHECK: %[[MULF:.*]] = arith.mulf %[[LOAD_A]], %[[LOAD_B]] : vector<8x16x4xf32>
// CHECK:           %[[MASK_MULTI_RED:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]], %[[VAL_8]] : vector<8x16x4xi1>
// CHECK:           %[[MULTI_RED:.*]] = vector.mask %[[MASK_MULTI_RED]] { vector.multi_reduction <add>, %[[MULF]], %[[LOAD_C]] [2] : vector<8x16x4xf32> to vector<8x16xf32> } : vector<8x16x4xi1> -> vector<8x16xf32>
// CHECK: %[[C2:.*]] = arith.constant 0 : index
// CHECK: vector.mask %[[MASK_C]] { vector.transfer_write %[[MULTI_RED]], %[[C]]{{\[}}%[[C2]], %[[C2]]] {in_bounds = [true, true]} : vector<8x16xf32>, memref<?x?xf32> } : vector<8x16xi1>
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%matmul = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %matmul vector_sizes [8, 16, 4] : !transform.any_op
}
// -----
func.func @matmul_scalable(%A: memref<?x?xf32>, %B: memref<?x?xf32>, %C: memref<?x?xf32>) {
linalg.matmul ins(%A, %B: memref<?x?xf32>, memref<?x?xf32>)
outs(%C: memref<?x?xf32>)
return
}
// CHECK-LABEL: func.func @matmul_scalable(
// CHECK-SAME: %[[A:.*]]: memref<?x?xf32>, %[[B:.*]]: memref<?x?xf32>, %[[C:.*]]: memref<?x?xf32>) {
// CHECK-DAG: %[[VAL_3:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[VAL_4:.*]] = memref.dim %[[A]], %[[VAL_3]] : memref<?x?xf32>
// CHECK-DAG: %[[VAL_5:.*]] = arith.constant 1 : index
// CHECK-DAG: %[[VAL_6:.*]] = memref.dim %[[B]], %[[VAL_5]] : memref<?x?xf32>
// CHECK-DAG: %[[VAL_7:.*]] = arith.constant 1 : index
// CHECK-DAG: %[[VAL_8:.*]] = memref.dim %[[A]], %[[VAL_7]] : memref<?x?xf32>
// CHECK: %[[MASK_A:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_8]] : vector<8x4xi1>
// CHECK: %[[LOAD_A:.*]] = vector.mask %[[MASK_A]] { vector.transfer_read %[[A]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true, true], permutation_map = #{{.*}}} : memref<?x?xf32>, vector<8x[16]x4xf32> } : vector<8x4xi1> -> vector<8x[16]x4xf32>
// CHECK: %[[MASK_B:.*]] = vector.create_mask %[[VAL_8]], %[[VAL_6]] : vector<4x[16]xi1>
// CHECK: %[[LOAD_B:.*]] = vector.mask %[[MASK_B]] { vector.transfer_read %[[B]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true, true], permutation_map = #{{.*}}} : memref<?x?xf32>, vector<8x[16]x4xf32> } : vector<4x[16]xi1> -> vector<8x[16]x4xf32>
// CHECK: %[[MASK_C:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]] : vector<8x[16]xi1>
// CHECK: %[[LOAD_C:.*]] = vector.mask %[[MASK_C]] { vector.transfer_read %[[C]]{{\[}}%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true, true]} : memref<?x?xf32>, vector<8x[16]xf32> } : vector<8x[16]xi1> -> vector<8x[16]xf32>
// CHECK: %[[MULF:.*]] = arith.mulf %[[LOAD_A]], %[[LOAD_B]] : vector<8x[16]x4xf32>
// CHECK:           %[[MASK_MULTI_RED:.*]] = vector.create_mask %[[VAL_4]], %[[VAL_6]], %[[VAL_8]] : vector<8x[16]x4xi1>
// CHECK:           %[[MULTI_RED:.*]] = vector.mask %[[MASK_MULTI_RED]] { vector.multi_reduction <add>, %[[MULF]], %[[LOAD_C]] [2] : vector<8x[16]x4xf32> to vector<8x[16]xf32> } : vector<8x[16]x4xi1> -> vector<8x[16]xf32>
// CHECK: %[[C2:.*]] = arith.constant 0 : index
// CHECK: vector.mask %[[MASK_C]] { vector.transfer_write %[[MULTI_RED]], %[[C]]{{\[}}%[[C2]], %[[C2]]] {in_bounds = [true, true]} : vector<8x[16]xf32>, memref<?x?xf32> } : vector<8x[16]xi1>
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%matmul = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %matmul vector_sizes [8, [16], 4] : !transform.any_op
}
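The transform scripts in the listing above still use the pre-rename spelling `transform.structured.masked_vectorize`. As a hedged sketch (a plain helper for updating such scripts, not part of MLIR or of this commit), the rename described in the commit message can be applied mechanically; note that `masked_vectorize` must be handled before the plain `vectorize` spelling, and the result of the first rename must not be rewritten again by the second:

```python
def migrate(line: str) -> str:
    """Rewrite one line of a transform script per this commit's renames.

    structured.masked_vectorize -> structured.vectorize
    structured.vectorize        -> structured.vectorize_children_and_apply_patterns

    The checks are ordered so a line renamed by the first rule is returned
    immediately and never re-renamed by the second rule.
    """
    old_masked = "transform.structured.masked_vectorize"
    old_plain = "transform.structured.vectorize"
    if old_masked in line:
        return line.replace(old_masked, "transform.structured.vectorize")
    if old_plain in line:
        return line.replace(
            old_plain,
            "transform.structured.vectorize_children_and_apply_patterns",
        )
    return line
```

Lines that mention neither op (e.g. `transform.structured.match` or plain MLIR) pass through unchanged.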


@@ -29,7 +29,7 @@ func.func @vectorize_dynamic_identity(%arg0: tensor<?xf32>,
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [[4]] : !transform.any_op
transform.structured.vectorize %0 vector_sizes [[4]] : !transform.any_op
}
// -----
@@ -71,7 +71,7 @@ func.func @vectorize_partial_dynamic_identity(%arg0: tensor<8x?xf32>,
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [8, [32]] : !transform.any_op
transform.structured.vectorize %0 vector_sizes [8, [32]] : !transform.any_op
}
// -----
@@ -111,7 +111,7 @@ func.func @vectorize_static_shape_with_mask(%arg0: tensor<8x30xf32>,
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [8, [32]] : !transform.any_op
transform.structured.vectorize %0 vector_sizes [8, [32]] : !transform.any_op
}
// -----
@@ -131,6 +131,6 @@ func.func @vectorize_dynamic_fill(%A : tensor<?x?xf32>, %arg0 : f32) -> tensor<?
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.fill"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [8, [16]] : !transform.any_op
transform.structured.vectorize %0 vector_sizes [8, [16]] : !transform.any_op
}

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -28,7 +28,7 @@ func.func @masked_static_vectorize_nd_tensor_extract_with_affine_apply_contiguou
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
transform.structured.vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
}
// -----
@@ -83,7 +83,7 @@ func.func @masked_dynamic_vectorize_nd_tensor_extract_with_affine_apply_contiguo
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
transform.structured.vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
}
// -----
@@ -121,7 +121,7 @@ func.func @masked_vectorize_nd_tensor_extract_with_affine_apply_gather(%6: tenso
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
transform.structured.vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
}
// -----
@@ -176,7 +176,7 @@ func.func @masked_dynamic_vectorize_nd_tensor_extract_with_affine_apply_gather(%
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
transform.structured.vectorize %0 vector_sizes [1, 4] vectorize_nd_extract : !transform.any_op
}
// -----
@@ -226,7 +226,7 @@ func.func @extract_masked_vectorize(%arg0: tensor<?x?xf32>, %arg1: tensor<?x?xf3
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [3, 3] vectorize_nd_extract : !transform.any_op
transform.structured.vectorize %0 vector_sizes [3, 3] vectorize_nd_extract : !transform.any_op
}
// -----
@@ -269,5 +269,5 @@ func.func @tensor_extract_dynamic_shape(%arg1: tensor<123x321xf32>, %arg2: tenso
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [1, 3, 8] vectorize_nd_extract : !transform.any_op
transform.structured.vectorize %0 vector_sizes [1, 3, 8] vectorize_nd_extract : !transform.any_op
}


@@ -31,7 +31,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -65,7 +65,7 @@ func.func @vectorize_nd_tensor_extract_constant_idx(%arg0: tensor<3x3xf32>, %arg
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 { vectorize_nd_extract } : !transform.any_op
transform.structured.vectorize %0 { vectorize_nd_extract } : !transform.any_op
}
// -----
@@ -104,7 +104,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -156,7 +156,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -204,7 +204,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -248,7 +248,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -290,7 +290,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -332,7 +332,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -376,7 +376,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -416,7 +416,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -456,7 +456,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -495,7 +495,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}
// -----
@@ -522,5 +522,5 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
%2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
}


@@ -80,7 +80,7 @@ transform.with_pdl_patterns {
transform.structured.tile %0 [4, 4, 4] : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
%1 = pdl_match @pdl_target_attrC in %arg1 : (!transform.any_op) -> !transform.any_op
%2 = get_parent_op %1 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
transform.structured.vectorize %2 : (!transform.any_op) -> !transform.any_op
transform.structured.vectorize_children_and_apply_patterns %2 : (!transform.any_op) -> !transform.any_op
}
}
@@ -125,7 +125,7 @@ transform.with_pdl_patterns {
^bb1(%arg1: !transform.any_op):
%0 = pdl_match @pdl_target in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
transform.structured.vectorize %1 : (!transform.any_op) -> !transform.any_op
transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
}
}
@@ -150,5 +150,5 @@ func.func @vectorize_all(
transform.sequence failures(propagate) {
^bb0(%arg0: !transform.any_op):
transform.structured.vectorize %arg0 : (!transform.any_op) -> !transform.any_op
transform.structured.vectorize_children_and_apply_patterns %arg0 : (!transform.any_op) -> !transform.any_op
}


@@ -19,7 +19,7 @@ transform.sequence failures(propagate) {
%1, %loops:3 = transform.structured.tile %0 [8, 4, 2]
: (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
%2 = get_parent_op %1 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
transform.structured.vectorize %2 : (!transform.any_op) -> !transform.any_op
transform.structured.vectorize_children_and_apply_patterns %2 : (!transform.any_op) -> !transform.any_op
%b = transform.bufferization.one_shot_bufferize
layout{IdentityLayoutMap} %module_op
{bufferize_function_boundaries = true, allow_return_allocs = true}


@@ -112,7 +112,7 @@ func.func @entry() {
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.fill"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [[4], [4]] : !transform.any_op
transform.structured.vectorize %0 vector_sizes [[4], [4]] : !transform.any_op
}
llvm.func @printCString(!llvm.ptr<i8>)


@@ -49,7 +49,7 @@ func.func @entry() {
transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.fill"]} in %arg1 : (!transform.any_op) -> !transform.any_op
transform.structured.masked_vectorize %0 vector_sizes [[4]] : !transform.any_op
transform.structured.vectorize %0 vector_sizes [[4]] : !transform.any_op
}
llvm.func @printCString(!llvm.ptr<i8>)


@@ -51,7 +51,7 @@ transform.sequence failures(propagate) {
^bb1(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%func_op = get_parent_op %0 : (!transform.any_op) -> !transform.op<"func.func">
transform.structured.masked_vectorize %0 vector_sizes [4, 4, 2] : !transform.any_op
transform.structured.vectorize %0 vector_sizes [4, 4, 2] : !transform.any_op
transform.apply_patterns to %func_op {
transform.apply_patterns.vector.lower_multi_reduction lowering_strategy = "innerreduction"
} : !transform.op<"func.func">


@@ -171,68 +171,66 @@ def testMatchOpNamesList(target):
@run
@create_sequence
def testMaskedVectorizeNoArgs(target):
structured.MaskedVectorizeOp(target)
# CHECK-LABEL: TEST: testMaskedVectorizeNoArgs
def testVectorizeNoArgs(target):
structured.VectorizeOp(target)
# CHECK-LABEL: TEST: testVectorizeNoArgs
# CHECK: transform.sequence
# CHECK: transform.structured.masked_vectorize
# CHECK: transform.structured.vectorize
# CHECK-NOT: vector_sizes
@run
@create_sequence
def testMaskedVectorizeStatic(target):
structured.MaskedVectorizeOp(target, [16, 4])
# CHECK-LABEL: TEST: testMaskedVectorizeStatic
def testVectorizeStatic(target):
structured.VectorizeOp(target, [16, 4])
# CHECK-LABEL: TEST: testVectorizeStatic
# CHECK: transform.sequence
# CHECK: transform.structured.masked_vectorize
# CHECK: transform.structured.vectorize
# CHECK-SAME: vector_sizes [16, 4]
@run
@create_sequence
def testMaskedVectorizeArray(target):
def testVectorizeArray(target):
sizes = Attribute.parse("[16, 4]")
structured.MaskedVectorizeOp(target, sizes)
# CHECK-LABEL: TEST: testMaskedVectorizeArray
structured.VectorizeOp(target, sizes)
# CHECK-LABEL: TEST: testVectorizeArray
# CHECK: transform.sequence
# CHECK: transform.structured.masked_vectorize
# CHECK: transform.structured.vectorize
# CHECK-SAME: vector_sizes [16, 4]
@run
@create_sequence
def testMaskedVectorizeMixed(target):
def testVectorizeMixed(target):
sz1 = structured.MatchOp.match_op_names(target, ["arith.constant"])
sz2 = Attribute.parse("4")
structured.MaskedVectorizeOp(target, [sz1, sz2])
# CHECK-LABEL: TEST: testMaskedVectorizeMixed
structured.VectorizeOp(target, [sz1, sz2])
# CHECK-LABEL: TEST: testVectorizeMixed
# CHECK: transform.sequence
# CHECK: %[[V0:.*]] = transform.structured.match
# CHECK: transform.structured.masked_vectorize
# CHECK: transform.structured.vectorize
# CHECK-SAME: vector_sizes [%[[V0]] : !transform.any_op, 4]
@run
@create_sequence
def testMaskedVectorizeScalable(target):
def testVectorizeScalable(target):
sz1 = structured.MatchOp.match_op_names(target, ["arith.constant"])
sz2 = Attribute.parse("4")
structured.MaskedVectorizeOp(target, [16, [sz1], [sz2], [8]])
# CHECK-LABEL: TEST: testMaskedVectorizeScalable
structured.VectorizeOp(target, [16, [sz1], [sz2], [8]])
# CHECK-LABEL: TEST: testVectorizeScalable
# CHECK: transform.sequence
# CHECK-DAG: %[[V0:.*]] = transform.structured.match
# CHECK-DAG: transform.structured.masked_vectorize
# CHECK-DAG: transform.structured.vectorize
# CHECK-SAME: vector_sizes [16, [%[[V0]] : !transform.any_op], [4], [8]]
@run
@create_sequence
def testMaskedVectorizeArgs(target):
structured.MaskedVectorizeOp(target, [16, 4], vectorize_nd_extract=True)
# CHECK-LABEL: TEST: testMaskedVectorizeArgs
def testVectorizeArgs(target):
structured.VectorizeOp(target, [16, 4], vectorize_nd_extract=True)
# CHECK-LABEL: TEST: testVectorizeArgs
# CHECK: transform.sequence
# CHECK: transform.structured.masked_vectorize
# CHECK: transform.structured.vectorize
# CHECK-SAME: vectorize_nd_extract
@@ -497,15 +495,15 @@ def testTileToForallMapping(target):
@run
@create_sequence
def testVectorizeAllAttrs(target):
structured.VectorizeOp(
def testVectorizeChildrenAndApplyPatternsAllAttrs(target):
structured.VectorizeChildrenAndApplyPatternsOp(
target,
disable_multi_reduction_to_contract_patterns=True,
disable_transfer_permutation_map_lowering_patterns=True,
vectorize_nd_extract=True,
vectorize_padding=True,
)
# CHECK-LABEL: TEST: testVectorizeAllAttrs
# CHECK-LABEL: TEST: testVectorizeChildrenAndApplyPatternsAllAttrs
# CHECK: transform.sequence
# CHECK: = transform.structured.vectorize
# CHECK-SAME: disable_multi_reduction_to_contract_patterns
@@ -516,15 +514,15 @@ def testVectorizeAllAttrs(target):
@run
@create_sequence
def testVectorizeNoAttrs(target):
structured.VectorizeOp(
def testVectorizeChildrenAndApplyPatternsNoAttrs(target):
structured.VectorizeChildrenAndApplyPatternsOp(
target,
disable_multi_reduction_to_contract_patterns=False,
disable_transfer_permutation_map_lowering_patterns=False,
vectorize_nd_extract=False,
vectorize_padding=False,
)
# CHECK-LABEL: TEST: testVectorizeNoAttrs
# CHECK-LABEL: TEST: testVectorizeChildrenAndApplyPatternsNoAttrs
# CHECK: transform.sequence
# CHECK: = transform.structured.vectorize
# CHECK-NOT: disable_multi_reduction_to_contract_patterns
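At the Python-binding level, the same renames show up as class renames in the tests above (`MaskedVectorizeOp` => `VectorizeOp`, `VectorizeOp` => `VectorizeChildrenAndApplyPatternsOp`). A minimal sketch of that mapping as a plain helper (hypothetical, not part of the MLIR Python bindings):

```python
def migrate_binding_class(name: str) -> str:
    """Map an old transform-binding class name to its post-commit name.

    The two renames must not be chained: an old MaskedVectorizeOp becomes
    VectorizeOp and stops there; only a pre-existing VectorizeOp becomes
    VectorizeChildrenAndApplyPatternsOp.
    """
    if name == "MaskedVectorizeOp":
        return "VectorizeOp"
    if name == "VectorizeOp":
        return "VectorizeChildrenAndApplyPatternsOp"
    return name
```

Any other class name (e.g. `MatchOp`) is returned unchanged.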