At the moment, they are a part of EmptyOp::getCanonicalizationPatterns. When
extract_slice(tensor.empty) is rewritten as a new tensor.empty, it could
happen that we end up with two tensor.empty ops, since the original
tensor.empty can have two users. After bufferization such cases result in two
allocations.
Differential Revision: https://reviews.llvm.org/D139308
It introduces a pattern that swaps `linalg.generic + tensor.pack` to
`tensor.pack + linalg.generic`. It requires all the iteration types
being parallel; the indexing map of output operand is identiy. They can
all be relaxed in the future.
The user can decide whether the propagation should be applied or not by
passing a control function.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D138882
This revision adapts the pattern in LinAlg to work on DPS interface, and
adds it to canonicalization patterns of tensor dialect. The
InsertSliceOp is skipped in the pattern because it has its own logic
about folding tensor.cast ops.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D139375
We can compute the offsets and sizes for the slice of input because the
iteration domain is defined over outer loops. If the dimension is tiled,
the i-th index is the product of offset_i and inner_tile_i.
Different from tiling a pad op, we do not have to deal with reading zero
data from input. Because the tiling sizes are indicated to packed outer
dimensions. We will read either the entire tile or partial tile for each
packed tile. The scf.if and tensor.generate ops are not needed in this
context.
Co-authored-by: Lorenzo Chelini <l.chelini@icloud.com>
Reviewed By: rengolin, mravishankar
Differential Revision: https://reviews.llvm.org/D138631
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
The `paddingValue` and `outerDimsPerm` are optional to the op;
`innerTiles` can be variadic in terms of static sizes and dynamic sizes.
Add a custom builder for building pack op easier.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D138860
This reverts D132662 (apart from overall cleanups), which introduced a too aggressive optimization for tensor.insert_slice bufferization. Instead, bufferizesToMemoryRead is improved to handle some of these cases. The remaining cases can still bufferize efficiently when running the canonicalizer before the bufferization.
Differential Revision: https://reviews.llvm.org/D138745
Do not generate CollapseShapeOps/ExpandShapeOps that have the same source and result shape. Generate casts instead. Such reshapes became invalid with D138498.
Differential Revision: https://reviews.llvm.org/D138557
Avoid duplicate code by using an existing helper function to interchange
a vector based on a permutation. Address comments emerged after landing
D138119.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D138480
Pack and Unpack return new tensors within which the individual elements
are reshuffled according to the packing specification. This has the
consequence of modifying the canonical order in which a given operator
(i.e., Matmul) accesses the individual elements. After bufferization,
this typically translates to increased access locality and cache
behavior improvement, e.g., eliminating cache line splitting.
Co-authored-by: Mahesh Ravishankar <ravishankarm@google.com>
Co-authored-by: Han-Chung Wang <hanchung@google.com>
RFC: https://discourse.llvm.org/t/rfc-tensor-pack-and-tensor-unpack/66408/1
Reviewed By: nicolasvasilache, rengolin, hanchung
Differential Revision: https://reviews.llvm.org/D138119
MemRef has been accepting a general Attribute as memory space for
a long time. This commits updates bufferization side to catch up,
which allows downstream users to plugin customized symbolic memory
space. This also eliminates quite a few `getMemorySpaceAsInt`
calls, which is deprecated.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D138330
The `RankedTensorType` can have an optional encoding
attribute. Allowing the builders of `tensor.empty` to accept the
encoding attribute (optionally), allows building empty tensors with
the type having the encoding attribute.
Reviewed By: nicolasvasilache, hanchung, springerm
Differential Revision: https://reviews.llvm.org/D137297
This change adds memory space support to tensor.pad. (tensor.generate and tensor.from_elements do not support memory spaces yet.)
The memory space is inferred from the buffer of the source tensor.
Instead of lowering tensor.pad to tensor.generate + tensor.insert_slice, it is now lowered to bufferization.alloc_tensor (with the correct memory space) + linalg.map + tensor.insert_slice.
Memory space support for the remaining two tensor ops is left for a later point, as this requires some more design discussions.
Differential Revision: https://reviews.llvm.org/D136265
There is no memref equivalent of tensor.generate. The purpose of this change is to avoid creating scf.parallel loops during bufferization.
Differential Revision: https://reviews.llvm.org/D136767
tensor.insert and tensor.insert_slice (as destination style ops) do no longer need to implement the entire BufferizableOpInterface.
Differential Revision: https://reviews.llvm.org/D136347
When writing a tensor.extract/tensor.insert, the rank of the tensor is implied by the number of specified indices. When extracting from/inserting into an unranked tensor, it should first be casted to a ranked version.
Differential Revision: https://reviews.llvm.org/D136756
Drop the `createPadScalarOp` from Utils.h since it is a duplicate of
the `build` method added here.
Differential Revision: https://reviews.llvm.org/D136493
`getDestinationOperands` was almost a duplicate of `DestinationStyleOpInterface::getOutputOperands`. Now that the interface has been moved to mlir/Interfaces, it is no longer needed.
Differential Revision: https://reviews.llvm.org/D136240
Prior to this change, the "ExtractSliceFromReshape" pattern would transform
```
%collapsed = tensor.collapse_shape %input [[0, 1], [2]]
: tensor<1x11x100xf32> into tensor<11x100xf32>
%slice = tensor.extract_slice %collapsed [%offt, 0] [%size, 100] [1, 1]
: tensor<11x100xf32> to tensor<?x100xf32>
```
into a loop that iterated over the range `%size - %offt`, that pieces
together multiple sub-slices of `%input` along the first dimension. This
is correct but obviously inefficient. The technical condition is that
collapsing at-most-one non-unit dimension of `%src` will not result in a
subsequent slice along the corresponding dimension of `%collapsed`
mapping across discontinuities in the index space of `%src`. Thus, the
definition of a "linearized dimension" (from the perspective of
`tensor.collapse_shape`) is updated to reflect this condition.
The transform will now generate
```
%slice = tensor.extract_slice %input [0, %offt, 0][1, %size, 100] [1, 1]
: tensor<1x11x100xf32> to tensor<1x?x100xf32>
%result = tensor.collapse_shape [[0, 1], [2]]
: tensor<1x?x100xf32> to tensor<?x100xf32>
```
which can be further canonicalized.
Additional tests are added to check this family of edge cases.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D135726
The assert is misplaced as the result type is allowed to be null. A few
lines below the result type is inferred if it is passed a nullptr.
Besides, this behavior is described in the documentation of the builder.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D136262
These operations have undefined behavior if the index is not less than the rank of the source tensor / memref, so they cannot be freely speculated like they were before this patch. After this patch we speculate them only if we can prove that they don't have UB.
Depends on D135505.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D135748
Inserting a tensor into an equivalent tensor is a no-op after bufferization. No alloc is needed.
Differential Revision: https://reviews.llvm.org/D132662
tensor.empty/linalg.init_tensor produces an uninititalized tensor that can be used as a destination operand for destination-style ops (ops that implement `DestinationStyleOpInterface`).
This change makes it possible to implement `TilingInterface` for non-destination-style ops without depending on the Linalg dialect.
RFC: https://discourse.llvm.org/t/rfc-add-tensor-from-shape-operation/65101
Differential Revision: https://reviews.llvm.org/D135129
This interface is implemented by memref.dim and tensor.dim. This change makes it possible to remove a build dependency of the Affine dialect on the Tensor dialect (and maybe also the MemRef dialect in the future).
Differential Revision: https://reviews.llvm.org/D133595
The utility function should live in `StaticValueUtils.h` as it provides
a convenient way to convert a vector of OpFoldResults into a vector of
Values.
Reviewed By: nicolasvasilache, cota
Differential Revision: https://reviews.llvm.org/D134451
The utility function should live in `StaticValueUtils.h` as it provides
a convenient way to convert a vector of OpFoldResults into a vector of
Values.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D134451
Summary:
Use the new enum in TilingIterface and verify that `iterator_type` attribute in
LinalgOp interface is compatible with the enum values. Later IteratorType enum
will be used in LinalgInterface to replace the current `iterator_type` attribute
array of string.
Existing enums in Linalg are moved into a separate td file and tablegen build
target. This is necessary, have one I32EnumAttr in a shared space that generated
enum class definition and EnumAttrs is dialect-specific location. Otherwise
there might be a conflict that I32EnumAttr generates enum definitions in
multiple places.
Differential Revision: https://reviews.llvm.org/D134634
This reverts commit 730ae80d3e.
It fails with a linking errors: `undefined reference to
`mlir::getValueOrCreateConstantIndexOp` in `libMLIRDialectUtils`.
The utility function should live in `StaticValueUtils.h` as it provides
a convenient way to convert a vector of OpFoldResults into a vector of
Values.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D134451
So that these utility functions can also be used ViewLikeInterface
ops not in the memref dialect.
Reviewed By: mravishankar, christopherbate
Differential Revision: https://reviews.llvm.org/D134487
This relands commit 5d4603a02d.
It cludes fixes to GCC test failures and simplification to
the implementation.
Co-authored-by: Mahesh Ravishankar <ravishankarm@google.com>
Co-authored-by: Christopher Bate <cbate@nvidia.com>