When converting to NVVM, lowering gpu.printf to vprintf allows us to
support printing when running on CUDA.
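A minimal sketch of the op this lowering targets; the kernel boilerplate and argument are illustrative, and the exact assembly syntax may vary slightly across MLIR versions:
```mlir
gpu.module @kernels {
  gpu.func @hello(%arg0: i32) kernel {
    // On the NVVM path this is now lowered to a call to CUDA's vprintf,
    // with the arguments packed into a buffer as vprintf expects.
    gpu.printf "Hello from thread %d\n" %arg0 : i32
    gpu.return
  }
}
```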
Differential Revision: https://reviews.llvm.org/D141049
Use an array of structures to represent the indices for the trailing COO region
of a sparse tensor.
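A rough sketch of the layout change, with buffer names invented for illustration:
```mlir
// Before (struct-of-arrays): one index buffer per trailing COO level,
// so entry e of a 2-d COO region reads %i0[e] and %i1[e]:
//   %i0 : memref<?xindex>, %i1 : memref<?xindex>
//
// After (array-of-structures): a single interleaved buffer where entry e
// keeps all of its indices contiguously at positions [e * rank + d]:
//   %idx : memref<?xindex>  // i0(0), i1(0), i0(1), i1(1), ...
```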
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D140870
The assert message was previously ignored. The lowered IR now prints it via `puts` when an assertion fails.
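A minimal sketch; the condition and message are made up:
```mlir
// On failure, the lowered IR now passes this string to `puts` before
// aborting, instead of silently dropping it.
cf.assert %cond, "expected a non-negative input"
```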
Differential Revision: https://reviews.llvm.org/D138647
Add support for loading, computing, and storing `gpu.subgroup` WMMA ops
in transpose mode. Update the GPU-to-NVVM lowerings to support the
`transpose` mode and update the integration tests accordingly.
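A sketch of a transposed load; the shapes and leadDimension value are illustrative:
```mlir
// The `transpose` unit attribute requests a transposed load of the
// 16x16 "AOp" fragment from the source memref.
%a = gpu.subgroup_mma_load_matrix %src[%i, %j]
    {leadDimension = 32 : index, transpose}
    : memref<32x32xf16> -> !gpu.mma_matrix<16x16xf16, "AOp">
```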
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D139021
This commit updates how the `SparseTensorConversion` pass handles `NewOp`.

It breaks up the underlying `openSparseTensor` function into two parts (`SparseTensorReader::create` and `SparseTensorReader::readSparseTensor`) so that the pass can inject code for constructing `lvlSizes` between those two parts. Migrating the construction of `lvlSizes` out of the runtime and into the pass is a necessary first step toward fully supporting non-permutations. (The alternative would be for the pass to generate a `FuncOp` for performing the construction and then pass that to the runtime, which doesn't seem to have any benefits over the design of this commit.)

Since the pass now generates the code to call these two functions, this change also removes the `Action::kFromFile` value from the enum used by `_mlir_ciface_newSparseTensor`.
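A rough sketch of the call sequence the pass now emits; the wrapper symbols below are hypothetical stand-ins for the runtime entry points that expose `SparseTensorReader::create` and `SparseTensorReader::readSparseTensor`:
```mlir
// 1. Create the reader (wraps SparseTensorReader::create).
//    @createSparseTensorReader is a hypothetical name.
%reader = func.call @createSparseTensorReader(%fileName)
    : (!llvm.ptr<i8>) -> !llvm.ptr<i8>
// 2. The pass injects code *here* to construct the lvlSizes buffer,
//    which previously was computed inside the runtime.
// 3. Read the tensor (wraps SparseTensorReader::readSparseTensor).
//    @readSparseTensor is likewise a hypothetical name.
%t = func.call @readSparseTensor(%reader, %lvlSizes)
    : (!llvm.ptr<i8>, memref<?xindex>) -> !llvm.ptr<i8>
```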
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D138363
This reverts commit d0650d1089.
Original commit message:
Subviews are supposed to be expanded before we hit the lowering
code. The expansion is done with the pass called
expand-strided-metadata.

Add a test that demonstrates how these passes can be linked up to achieve
the desired lowering.

This patch is NFC in spirit but not in practice, because `subview` gets
lowered into `reinterpret_cast(extract_strided_metadata, <some math>)`,
which lowers into two memref descriptors (one for `reinterpret_cast` and
one for `extract_strided_metadata`). This creates some noise of the
form `extractvalue(unrealized_cast(extractvalue[0]))[0]` that is
currently not simplified within MLIR but is really just a no-op in
that case.
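As an illustration (shapes chosen arbitrarily), the expansion rewrites a static subview roughly as follows:
```mlir
// Input: a subview of a 16x16 buffer at offset (4, 8).
%0 = memref.subview %arg[4, 8] [8, 2] [1, 1]
    : memref<16x16xf32>
    to memref<8x2xf32, strided<[16, 1], offset: 72>>

// After expand-strided-metadata: extract the base metadata, then
// reinterpret_cast with the folded offset 4 * 16 + 8 = 72.
%base, %off, %sizes:2, %strides:2 = memref.extract_strided_metadata %arg
    : memref<16x16xf32> -> memref<f32>, index, index, index, index, index
%1 = memref.reinterpret_cast %base to
    offset: [72], sizes: [8, 2], strides: [16, 1]
    : memref<f32> to memref<8x2xf32, strided<[16, 1], offset: 72>>
```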
Differential Revision: https://reviews.llvm.org/D136377
This patch is part of a larger effort to simplify vector transfer
operations. It removes the `lower-permutation-maps` flag from the
VectorToSCF conversion and enables the lowering of permutation maps
by default. This means that VectorToSCF will always lower permutation
maps to independent broadcast/transpose operations before lowering
vector operations to SCF.
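As an illustration (shapes made up), a transposing transfer is now always split before the SCF lowering, roughly as:
```mlir
// Before: the vector's leading dimension walks memref dimension d1.
%v = vector.transfer_read %A[%c0, %c0], %pad
    {permutation_map = affine_map<(d0, d1) -> (d1, d0)>}
    : memref<4x8xf32>, vector<8x4xf32>

// After: a minor-identity transfer followed by an explicit transpose.
%0 = vector.transfer_read %A[%c0, %c0], %pad
    : memref<4x8xf32>, vector<4x8xf32>
%1 = vector.transpose %0, [1, 0] : vector<4x8xf32> to vector<8x4xf32>
```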
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D138742
This change was generated by running
```
sed --in-place 's/[[:space:]]\+$//' mlir/**/*.td
sed --in-place 's/[[:space:]]\+$//' mlir/**/*.mlir
```
Reviewed By: rriddle, dcaballe
Differential Revision: https://reviews.llvm.org/D138866
The attribute tells the operator to handle symmetric structures for 2D tensors.
By default, the operator assumes the input tensor is not symmetric.
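A sketch of the intended usage; the `expand_symmetry` attribute name and the encoding alias are written from memory and may not match the final spelling:
```mlir
#SparseMatrix = #sparse_tensor.encoding<{
  dimLevelType = ["compressed", "compressed"]
}>

// Reads a 2-d tensor stored in symmetric form and materializes the
// implied symmetric entries; without the attribute the input is
// assumed to be non-symmetric.
%t = sparse_tensor.new %src {expand_symmetry}
    : !llvm.ptr<i8> to tensor<?x?xf64, #SparseMatrix>
```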
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D138230
Fix a problem in the convert op rewriting where it used the original index for
ToIndicesOp.
Extend the concatenate op rewriting to handle dense destinations and
destinations with dynamic shapes.
Make the concatenate op integration test run on the codegen path.
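A sketch of the newly supported dense-destination case; the shapes and encoding alias are invented for illustration:
```mlir
#SparseMatrix = #sparse_tensor.encoding<{
  dimLevelType = ["compressed", "compressed"]
}>

// Concatenate two sparse matrices along dimension 0 into a dense
// result, which the rewriting now handles.
%r = sparse_tensor.concatenate %a, %b {dimension = 0 : index}
    : tensor<2x4xf64, #SparseMatrix>, tensor<3x4xf64, #SparseMatrix>
    to tensor<5x4xf64>
```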
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D138057
Modify the integration test to check `number_of_entries` and use it to limit
the number of sparse tensor values printed.
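A sketch of the query; the tensor type and encoding alias are invented for illustration:
```mlir
#SparseMatrix = #sparse_tensor.encoding<{
  dimLevelType = ["compressed", "compressed"]
}>

// Returns the number of stored entries as an index, used to bound how
// many values the test prints.
%n = sparse_tensor.number_of_entries %t : tensor<64x64xf64, #SparseMatrix>
```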
Reviewed By: aartbik, Peiming
Differential Revision: https://reviews.llvm.org/D138046
Refactor the rewriting of sparse_tensor.sort to support the implementation of
sparse_tensor.sort_coo.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D137522
Permutations weren't handled correctly. Add a test for the rewriting.
Extend an integration test to run with enable_runtime_library=false so that
it also exercises the rewriting.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D137845
(1) also fixes a memory leak in the sparse2dense rewriting
(2) still needs a fix in dense2sparse to skip zeros
Reviewed By: wrengr
Differential Revision: https://reviews.llvm.org/D137736
This revision generalizes lowering the sparse_tensor.insert op into actual code that directly operates on the memrefs of a sparse storage scheme. The current insertion strategy does *not* rely on a cursor anymore, which introduces some testing overhead for each insertion (but still proportional to the rank, as before). Over time, we can optimize the code generation, but this version enables us to finish the effort to migrate from library to actual codegen. A sketch of the op being lowered follows the list below.
Things to do:
(1) carefully deal with (un)ordered and (not)unique
(2) omit overhead when not needed
(3) optimize and specialize
(4) try to avoid the pointer "cleanup" (at HasInserts), and make sure the storage scheme is consistent at every insertion point (so that it can "escape" without concerns).
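A sketch of the op being lowered; the shapes and encoding are invented for illustration:
```mlir
#CSR = #sparse_tensor.encoding<{
  dimLevelType = ["dense", "compressed"]
}>

// The generated code now updates the pointers/indices/values memrefs of
// the storage scheme directly, without a runtime cursor.
sparse_tensor.insert %val into %t[%i, %j] : tensor<128x128xf64, #CSR>
```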
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D137457