The current implementation used explicit index->int64_t casts for some, but
not all, instances of passing values of type "index" into and out of the sparse
support library. This revision makes the situation more consistent by
using the new "index_t" type at all such places (which reduces the amount of
trivial casting in the generated MLIR code). Note that the current revision still
assumes that "index" is 64 bits wide. If we want to support targets with
alternative "index" bit widths, we need to build the support library differently,
but the current revision is a step forward by making this assumption explicit
and more visible.
Reviewed By: wrengr
Differential Revision: https://reviews.llvm.org/D112122
This revision lifts the artificial restriction that source and destination type
shapes must match exactly. A static size may become dynamic. We still
reject changing a dynamic size into a static size to avoid the need for a
runtime "assert" on the conversion. This revision also refactors some of the
conversion code to share same-content buffers.
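For example, with #CSR standing for some sparse_tensor.encoding alias (not shown
here), a conversion like the following sketch is now accepted, while the reverse
direction (dynamic source into a static destination) is still rejected:

  %d = sparse_tensor.convert %s
      : tensor<32x32xf64, #CSR> to tensor<?x?xf64, #CSR>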
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D111915
Next step towards supporting sparse tensor outputs.
Also some minor refactoring of enum constants as well
as replacing tensor arguments with proper buffer arguments
(the latter is required for more general size arguments to
the sparse_tensor.init operation, as well as for more general
sparse_tensor.convert operations later).
Reviewed By: wrengr
Differential Revision: https://reviews.llvm.org/D111771
Precursor: https://reviews.llvm.org/D110200
Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.
Renamed all instances of operations in the codebase and in tests.
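For instance (an illustrative sketch, not an exhaustive list of the renamed
operations):

  // before (standard dialect)
  %0 = addi %a, %b : i32
  %1 = cmpf olt, %x, %y : f64
  // after (arith dialect)
  %0 = arith.addi %a, %b : i32
  %1 = arith.cmpf olt, %x, %y : f64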
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797
This relaxes vectorization of dense memrefs a bit so that affine expressions
are allowed in more outer dimensions. Vectorization of non-unit-stride
references is disabled, though, since this seems ineffective anyway.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D111469
We have several ways to materialize sparse tensors (new and convert) but no explicit operation to release the underlying sparse storage scheme at runtime (other than making an explicit delSparseTensor() library call). To simplify memory management, a sparse_tensor.release operation has been introduced that lowers to the runtime library call while keeping tensors, opaque pointers, and memrefs transparent in the initial IR.
*Note* There is obviously some tension between the concept of immutable tensors and memory management methods. This tension is addressed by simply stating that, after the "release" call, no further memref-related operations are allowed on the tensor value. We expect the design to evolve over time, however, and to arrive at a more satisfactory view of tensors and buffers eventually.
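A minimal sketch of the intended usage (assuming #CSR is a sparse_tensor.encoding
alias defined elsewhere):

  %t = sparse_tensor.new %fileName : !llvm.ptr<i8> to tensor<?x?xf64, #CSR>
  // ... kernels that consume %t ...
  sparse_tensor.release %t : tensor<?x?xf64, #CSR>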
Bug: http://llvm.org/pr52046
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D111099
Details about the lack of a negi operation leaked from the merger class into the codegen part.
Also, the special case for vector code was not needed; the type can be used directly!
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110677
This revision makes sure that when the output buffer materializes locally
(in contrast with the passing in of output tensors either in-place or not
in-place), the zero initialization assumption is preserved. This also adds
a bit more documentation on our sparse kernel assumptions (viz. the TACO
assumptions).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110442
The sparse constant provides a constant tensor in coordinate format. We first split the sparse constant into a constant tensor for indices and a constant tensor for values. We then generate a loop to fill a sparse tensor in coordinate format using the tensors for the indices and the values. Finally, we convert the sparse tensor in coordinate format to the destination sparse tensor format.
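For illustration, a sketch of the kind of input that now lowers through these
steps (#CSR is an assumed encoding alias; arith spellings are used here):

  // coordinate-format constant: indices [[0,0],[1,2]] with values [1.0, 5.0]
  %cst = arith.constant sparse<[[0, 0], [1, 2]], [1.0, 5.0]> : tensor<2x3xf64>
  %st  = sparse_tensor.convert %cst : tensor<2x3xf64> to tensor<2x3xf64, #CSR>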
Add tests.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D110373
This has been a TODO for a long time, and it brings about many advantages (namely nicer accessors and less fragile code). The existing overloads that accept ArrayRef are now treated as deprecated and will be removed in a followup (after a small grace period). Most of the upstream MLIR usages have been fixed by this commit; the rest will be handled in a followup.
Differential Revision: https://reviews.llvm.org/D110293
When generating code to add an element to SparseTensorCOO (e.g., when doing dense=>sparse conversion), we used to check for nonzero values on the runtime side, whereas now we generate MLIR code to do that check.
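Roughly, the generated IR now looks like the following sketch (schematic; the
surrounding loop nest and the actual support library call are omitted):

  %zero = arith.constant 0.0 : f64
  %nonzero = arith.cmpf une, %v, %zero : f64
  scf.if %nonzero {
    // append the element (its indices and value %v) to the SparseTensorCOO
  }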
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D110121
Now not just SUM, but also PRODUCT, AND, OR, and XOR. The reductions
MIN and MAX are still to be done (this also depends on recognizing
these operations in cmp-select constructs).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110203
This change adds automatic wrapper functions with emit_c_interface
to all methods in the sparse support library that deal with MEMREFs.
The wrappers will take care of passing MEMREFs by value internally
and by pointer externally, thereby avoiding ABI issues across platforms.
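A sketch of what such a declaration looks like on the MLIR side (the function
name and types here are illustrative, not the actual library API):

  func private @readSparseValuesF64(!llvm.ptr<i8>) -> memref<?xf64>
      attributes { llvm.emit_c_interface }
  // lowering to LLVM then emits a _mlir_ciface_readSparseValuesF64 wrapper
  // that passes the memref descriptor by pointer across the C ABI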
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110219
This enables the sparsification of more kernels, such as convolutions
where there is an x(i+j) subscript. It also enables more tensor invariants
such as x(1) or other affine subscripts such as x(i+1). Currently, we
reject sparsity altogether for such tensors. Despite this restriction,
however, we can already handle a lot more kernels with compound subscripts
for dense access (viz. convolution with dense input and sparse filter).
Some unit tests and an integration test demonstrate new capability.
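A hedged sketch of the kind of kernel this enables (dense input, sparse filter;
#SparseVector is an assumed sparse_tensor.encoding alias and arith spellings
are used):

  #trait = {
    indexing_maps = [
      affine_map<(i, j) -> (i + j)>,  // dense input x(i+j): compound subscript
      affine_map<(i, j) -> (j)>,      // sparse filter f(j)
      affine_map<(i, j) -> (i)>       // output y(i)
    ],
    iterator_types = ["parallel", "reduction"]
  }
  %0 = linalg.generic #trait
      ins(%x, %f : tensor<?xf64>, tensor<3xf64, #SparseVector>)
      outs(%y : tensor<?xf64>) {
    ^bb0(%a: f64, %b: f64, %c: f64):
      %m = arith.mulf %a, %b : f64
      %s = arith.addf %c, %m : f64
      linalg.yield %s : f64
  } -> tensor<?xf64>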
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D109783
The sparse index order must always be satisfied, but this
may give a choice in topsorts for several cases. We broke
ties in favor of any dense index order, since this gives
good locality. However, breaking ties in favor of pushing
unrelated indices into sparse iteration spaces gives better
asymptotic complexity. This revision improves the heuristic.
Note that in the long run, we are really interested in using
ML for ML to find the best loop ordering as a replacement for
such heuristics.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D109100
SymbolRefAttr is fundamentally a base string plus a sequence
of nested references. Instead of storing the string data as
a copied StringRef, store it as an already-uniqued StringAttr.
This makes a lot of things simpler and more efficient because:
1) References to the symbol are already stored as StringAttrs:
there is no need to copy the string data into MLIRContext
multiple times.
2) This allows pointer comparisons instead of string
comparisons (or redundant uniquing) within SymbolTable.cpp.
3) This allows SymbolTable to hold a DenseMap instead of a
StringMap (which again copies the string data and slows
lookup).
This is a moderately invasive patch, so I kept a lot of
compatibility APIs around. It would be nice to explore changing
getName() to return a StringAttr for example (right now you have
to use getNameAttr()), and eliminate things like the StringRef
version of getSymbol.
Differential Revision: https://reviews.llvm.org/D108899
Rationale:
Passing in a pointer to the memref data in order to implement the
dense to sparse conversion was a bit too low-level. This revision
improves upon that approach with a cleaner solution of generating
a loop nest in MLIR code itself that prepares the COO object before
passing it to our "swiss army knife" setup. This is much more
intuitive *and* now also allows for dynamic shapes.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D108491
Apply the "for loop peeling" pattern from SCF dialect transforms. This pattern splits scf.for loops into full and partial iterations. In the full iteration, all masked loads/stores are canonicalized to unmasked loads/stores.
Differential Revision: https://reviews.llvm.org/D107733
This shares more code with existing utilities. Also, to be consistent,
we moved dimension permutation on the DimOp to the tensor lowering phase.
This way, both pre-existing DimOps on sparse tensors (not likely, but
possible) and compiler-generated DimOps are handled consistently.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D108309
Implements lowering dense to sparse conversion, for static tensor types only.
First step towards general sparse_tensor.convert support.
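A minimal example of what is now supported (static shapes only; #CSR is an
assumed encoding alias):

  %s = sparse_tensor.convert %d : tensor<8x8xf64> to tensor<8x8xf64, #CSR>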
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D107681
Introduces a conversion from one (sparse) tensor type to another
(sparse) tensor type. See the operation doc for details. Actual
codegen for all cases is still TBD.
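A sketch of the op (with #CSR and #CSC assumed encoding aliases):

  // dense to sparse
  %a = sparse_tensor.convert %t0 : tensor<64x64xf64> to tensor<64x64xf64, #CSR>
  // sparse to sparse (codegen for this case is still TBD at this point)
  %b = sparse_tensor.convert %t1 : tensor<64x64xf64, #CSR> to tensor<64x64xf64, #CSC>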
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D107205
This format was missing from the support library. Although there are some
subtleties in reading in an external format for int64 as double, there is no
good reason to omit support for this data type from the support library.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D106016
Adds zero-preserving unary operators from std. Also adds xor.
Performs minor refactoring to remove the "zero" node, and pushes
the irregular logic for negi (not supported in std) into one place.
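For reference, a sketch of the irregularity (using current arith spellings):
floating-point negation maps to a single op, whereas integer negation has no
direct counterpart and is emitted as a subtraction from zero.

  %nf = arith.negf %a : f64
  %c0 = arith.constant 0 : i32
  %ni = arith.subi %c0, %b : i32   // negi emulated as 0 - b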
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105928
Integral AND and OR follow the simple conjunction and disjunction rules
for lattice building. This revision also completes some of the Merger
refactoring by moving the remaining merger-specific parts from
sparsification into the utils files.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105851
Right now, we only accept x/c with nonzero c, since this
conceptually can be treated as an x*(1/c) conjunction for both
FP and INT as far as lattice computations go. The codegen
keeps the division, though, to preserve precise semantics.
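Sketch (arith spelling): the generated code still contains the division itself.

  %q = arith.divf %x, %c : f64   // lattice-wise treated like x * (1/c); codegen keeps x / c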
See discussion:
https://llvm.discourse.group/t/sparse-tensors-in-mlir/3389/28
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105731
Remove `getDynOperands` and `createOrFoldDimOp` from MemRef.h to decouple MemRef a bit from Tensor. These two functions are used in other dialects/transforms.
Differential Revision: https://reviews.llvm.org/D105260
This revision extends the sparse compiler support from fp/int addition and multiplication to fp/int negation and subtraction, thereby increasing the scope of sparse kernels that can be compiled.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105306
To make TensorExp clearer, this change refactors the e0/e1 fields into a union: e0/e1 for a binary op tensor expression, and tensor_num for a tensor-kinded tensor expression.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D105303
Rationale:
Follow-up on migrating lattice and tensor expression related methods into the new utility.
This also prepares the next step of generalizing the op kinds that are handled.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105219
* Split memref.dim into two operations: memref.dim and tensor.dim (see the sketch after this list). Both ops have the same builder interface and op argument names, so that they can be used with templates in patterns that apply to both tensors and memrefs (e.g., some patterns in Linalg).
* Add constant materializer to TensorDialect (needed for folding in affine.apply etc.).
* Remove some MemRefDialect dependencies, make some explicit.
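A small sketch of the split mentioned in the first bullet (arith spelling used
for the constant):

  %c0 = arith.constant 0 : index
  %dt = tensor.dim %t, %c0 : tensor<?x4xf32>   // dimension of a tensor value
  %dm = memref.dim %m, %c0 : memref<?x4xf32>   // dimension of a memref value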
Differential Revision: https://reviews.llvm.org/D105165
Moves iteration lattice/merger code into new SparseTensor/Utils directory. A follow-up CL will add lattice/merger unit tests.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D104757
Slowly we are moving toward full support of sparse tensor *outputs*. The first
step was support for all-dense annotated "sparse" tensors. This step adds
support for truly sparse tensors, but only for operations in which the values
of a tensor change, but not the nonzero structure (this was referred to as
"simply dynamic" in the [Bik96] thesis).
Some background text was posted on discourse:
https://llvm.discourse.group/t/sparse-tensors-in-mlir/3389/25
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D104577
* Remove dependency: Standard --> MemRef
* Add dependencies: GPUToNVVMTransforms --> MemRef, Linalg --> MemRef, MemRef --> Tensor
* Note: The `subtensor_insert_propagate_dest_cast` test case in MemRef/canonicalize.mlir will be moved to Tensor/canonicalize.mlir in a subsequent commit, which moves over the remaining Tensor ops from the Standard dialect to the Tensor dialect.
Differential Revision: https://reviews.llvm.org/D104506
We have several ways of introducing a scalar invariant value into
linalg generic ops (should we limit this somewhat?). This revision
makes sure we handle all of them correctly in the sparse compiler.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D104335
This is a very careful start with allowing sparse tensors on the
left-hand side of tensor index expressions (viz. sparse output).
Note that there is a subtle difference between non-annotated tensors
(dense, remain n-dim, handled by classic bufferization) and all-dense
annotated "sparse" tensors (linearized to 1-dim without overhead
storage, bufferized by sparse compiler, backed by runtime support library).
This revision gently introduces some new IR to facilitate annotated outputs,
to be generalized to truly sparse tensors in the future.
Reviewed By: gussmith23, bixia
Differential Revision: https://reviews.llvm.org/D104074
Up to now, all structured op operands are assumed to be shaped. The patch relaxes this assumption and allows scalar input operands. In contrast to shaped operands, scalar operands are not indexed and are directly forwarded to the body of the operation. Like all other operands, scalar operands are associated with an indexing map, which in the case of a scalar or a 0D-operand has an empty range.
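A hedged sketch of a structured op with a scalar operand (the empty-range
indexing map is the point; arith spellings are used):

  #scalar = affine_map<(i) -> ()>   // scalar operand: empty result range
  #vec    = affine_map<(i) -> (i)>
  %0 = linalg.generic
      { indexing_maps = [#scalar, #vec], iterator_types = ["parallel"] }
      ins(%scale : f32)
      outs(%out : tensor<?xf32>) {
    ^bb0(%s: f32, %o: f32):
      %r = arith.mulf %o, %s : f32
      linalg.yield %r : f32
  } -> tensor<?xf32>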
We will use scalar operands as a replacement for the capture mechanism. In contrast to captures, the approach ensures we can generate the function signature from the operand list and it prevents outdated capture values in case a transformation updates only the capture operand but not the hidden body of a named operation.
Removing captures and updating existing operations such as linalg.fill is left for a later patch.
The patch depends on https://reviews.llvm.org/D103891 and https://reviews.llvm.org/D103890.
Differential Revision: https://reviews.llvm.org/D104109
Controlled by a compiler option: if 32-bit indices can be handled
with zero- and sign-extension alike (viz. no concerns about negative
indices), scatter/gather operations can use the more efficient
32-bit SIMD version.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D103632