Commit Graph

89 Commits

Author SHA1 Message Date
Aart Bik
0a7b8cc5dd [mlir][sparse] fully implement sparse tensor to sparse tensor conversions
with rigorous integration test

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108721
2021-08-27 15:08:18 -07:00
Aart Bik
d5f7f356ce [mlir][sparse] add sparse-dense cases to storage integration test
Reviewed By: grosul1

Differential Revision: https://reviews.llvm.org/D108685
2021-08-25 11:33:20 -07:00
Aart Bik
c5735fada4 [mlir][sparse] enable a few vectorized runs in integration tests
Recent changes outside sparse compiler exposed the requirement of running a
new pass (lower-affine) but this only became apparent with private testing.
By adding some vectorized runs to integration test, we will detect the need
for such changes earlier and also widen codegen coverage of course.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D108667
2021-08-24 16:08:01 -07:00
Rob Suderman
871c812483 [mlir][linalg] Finish refactor of TC ops to YAML
Multiple operations were still defined as TC ops that had equivalent versions
as YAML operations. Reducing to a single compilation path guarantees that
frontends can lower to their equivalent operations without missing the
optimized fastpath.

Some operations are maintained purely for testing purposes (mainly conv{1,2,3}D
as they are included as sole tests in the vectorizaiton transforms.

Differential Revision: https://reviews.llvm.org/D108169
2021-08-20 12:35:04 -07:00
Robert Suderman
65532ea6dd [mlir][linalg] Clear unused linalg tc operations
These operations are not lowered to from any source dialect and are only
used for redundant tests. Removing these named ops, along with their
associated tests, will make migration to YAML operations much more
convenient.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D107993
2021-08-16 11:55:45 -07:00
Aart Bik
8cf8349eaa [mlir][sparse] add an elaborate sparse storage scheme integration test
Looks "under the hood" of the sparse stogage schemes.
Users should typically not be interested in these details
(hey, that is why we have "sparse compilers"!) but this
test makes sure the compact contents are as expected.

Reviewed By: ThomasRaoux, bixia

Differential Revision: https://reviews.llvm.org/D107683
2021-08-09 12:54:15 -07:00
Aart Bik
05c7f450df [mlir][sparse] add dense to sparse conversion implementation
Implements lowering dense to sparse conversion, for static tensor types only.
First step towards general sparse_tensor.convert support.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D107681
2021-08-09 12:12:39 -07:00
Gus Smith
0bd2d4c4b1 [mlir][sparse] Remove comment w/ code in it
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D107484
2021-08-04 21:41:36 +00:00
Aart Bik
2b013a6c8a [mlir][sparse] use proper type alias for filename ptr
Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D106904
2021-07-28 10:25:24 -07:00
Yi Zhang
deebf18512 [mlir][linalg] Add pooling_nchw_max, conv_2d_nchw as yaml ops.
- Add pooling_nchw_max.
- Move conv_2d_nchw to yaml ops and add strides and dilation attributes.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D106658
2021-07-23 17:37:15 +00:00
Eugene Zhulenev
6c1f655818 [mlir] Async: special handling for parallel loops with zero iterations
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106590
2021-07-23 01:22:59 -07:00
Yi Zhang
381c3b9299 Dyanamic shape support for memref reassociation reshape ops
Only memref with identity layout map is supported for now.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D106180
2021-07-19 15:14:36 -07:00
Aart Bik
e6e79b3f0b [mlir][sparse] remove linalg-to-loops from integration tests
With the migration from linalg.copy to memref.copy, this pass
(which was there solely to handle the linalg.copy op) is no
longer required for the end-to-end path for sparse compilation.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D106073
2021-07-15 09:14:46 -07:00
Aart Bik
7039dfc6dd [mlir][memref] adjust integration tests to new lowering passes
these tests run under the emulator and thus were overlooked

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D105855
2021-07-13 09:14:41 -07:00
Alex Zinenko
75e5f0aac9 [mlir] factor memref-to-llvm lowering out of std-to-llvm
After the MemRef has been split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the MemRef
dialect operations to LLVM into a separate library and a separate
conversion pass.

Reviewed By: herhut, silvas

Differential Revision: https://reviews.llvm.org/D105625
2021-07-09 14:49:52 +02:00
thomasraoux
291025389c [mlir][vector] Refactor Vector Unrolling and remove Tuple ops
Simplify vector unrolling pattern to be more aligned with rest of the
patterns and be closer to vector distribution.
The new implementation uses ExtractStridedSlice/InsertStridedSlice
instead of the Tuple ops. After this change the ops based on Tuple don't
have any more used so they can be removed.

This allows removing signifcant amount of dead code and will allow
extending the unrolling code going forward.

Differential Revision: https://reviews.llvm.org/D105381
2021-07-07 11:11:26 -07:00
Nicolas Vasilache
d0b282e10b [mlir][Linalg] Rewrite PadTensorOp to enable its comprehensive bufferization.
Add the rewrite of PadTensorOp to InitTensor + InsertSlice before the
bufferization analysis starts.

This is exercised via a more advanced integration test.

Since the new behavior triggers folding, 2 tests need to be updated.
One of those seems to exhibit a folding issue with `switch` and is modified.

Differential Revision: https://reviews.llvm.org/D105549
2021-07-07 12:39:22 +00:00
Yi Zhang
35df2f6fbd Refactor GenericPadTensorOpVectorizationPattern
Refactor the original code to rewrite a PadTensorOp into a
sequence of InitTensorOp, FillOp and InsertSliceOp without
vectorization by default. `GenericPadTensorOpVectorizationPattern`
provides a customized OptimizeCopyFn to vectorize the
copying step.

Reviewed By: silvas, nicolasvasilache, springerm

Differential Revision: https://reviews.llvm.org/D105293
2021-07-07 11:44:32 +00:00
Nicolas Vasilache
231b9dd9de [mlir][Linalg] Add comprehensive bufferization support for linalg::InitTensor and tensor::CastOp (11/n)
Also add an integration test that connects all the dots end to end, including with cast to unranked tensor for external library calls.

Differential Revision: https://reviews.llvm.org/D105106
2021-07-01 11:26:01 +00:00
Eugene Zhulenev
f57b2420b2 [mlir:Async] Add an async reference counting pass based on the user defined policy
Depends On D104999

Automatic reference counting based on the liveness analysis can add a lot of reference counting overhead at runtime. If the IR is known to be constrained to few particular "shapes", it's much more efficient to provide a custom reference counting policy that will specify where it is required to update the async value reference count.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D105037
2021-06-29 12:53:09 -07:00
Eugene Zhulenev
9ccdaac8f9 [mlir:Async] Fix a bug in automatic refence counting around function calls
Depends On D104998

Function calls "transfer ownership" to the callee and it puts additional constraints on the reference counting optimization pass

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D104999
2021-06-29 09:35:43 -07:00
Eugene Zhulenev
86ad0af870 [mlir:Async] Implement recursive async work splitting for scf.parallel operation (async-parallel-for pass)
Depends On D104780

Recursive work splitting instead of sequential async tasks submission gives ~20%-30% speedup in microbenchmarks.

Algorithm outline:
1. Collapse scf.parallel dimensions into a single dimension
2. Compute the block size for the parallel operations from the 1d problem size
3. Launch parallel tasks
4. Each parallel task reconstructs its own bounds in the original multi-dimensional iteration space
5. Each parallel task computes the original parallel operation body using scf.for loop nest

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D104850
2021-06-25 10:34:39 -07:00
Tobias Gysi
a21a6f51bc [mlir][linalg] Change the pretty printed FillOp operand order.
The patch changes the pretty printed FillOp operand order from output, value to value, output. The change is a follow up to https://reviews.llvm.org/D104121 that passes the fill value using a scalar input instead of the former capture semantics.

Differential Revision: https://reviews.llvm.org/D104356
2021-06-23 07:03:00 +00:00
Aart Bik
b13cbf537f [mlir][sparse] integration test for "simply dynamic" sparse output tensors
Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D104583
2021-06-22 14:28:02 -07:00
Matthias Springer
060208b4c8 [mlir][NFC] Move SubTensorOp and SubTensorInsertOp to TensorDialect
The main goal of this commit is to remove the dependency of Standard dialect on the Tensor dialect.

* Rename SubTensorOp -> tensor.extract_slice, SubTensorInsertOp -> tensor.insert_slice.
* Some helper functions are (already) duplicated between the Tensor dialect and the MemRef dialect. To keep this commit smaller, this will be cleaned up in a separate commit.
* Additional dialect dependencies: Shape --> Tensor, Tensor --> Standard
* Remove dialect dependencies: Standard --> Tensor
* Move canonicalization test cases to correct dialect (Tensor/MemRef).

Note: This is a fixed version of https://reviews.llvm.org/D104499, which was reverted due to a missing update to two CMakeFile.txt.

Differential Revision: https://reviews.llvm.org/D104676
2021-06-22 17:55:53 +09:00
Mehdi Amini
60d97fb4cf Revert "[mlir][NFC] Move SubTensorOp and SubTensorInsertOp to TensorDialect"
This reverts commit 83bf801f5f.

This breaks the build with -DBUILD_SHARED_LIBS=ON
2021-06-21 16:39:24 +00:00
Matthias Springer
83bf801f5f [mlir][NFC] Move SubTensorOp and SubTensorInsertOp to TensorDialect
The main goal of this commit is to remove the dependency of Standard dialect on the Tensor dialect.

* Rename ops: SubTensorOp --> ExtractTensorOp, SubTensorInsertOp --> InsertTensorOp
* Some helper functions are (already) duplicated between the Tensor dialect and the MemRef dialect. To keep this commit smaller, this will be cleaned up in a separate commit.
* Additional dialect dependencies: Shape --> Tensor, Tensor --> Standard
* Remove dialect dependencies: Standard --> Tensor
* Move canonicalization test cases to correct dialect (Tensor/MemRef).

Differential Revision: https://reviews.llvm.org/D104499
2021-06-22 00:11:21 +09:00
Gus Smith
22911585bb [mlir][sparse] Add Matricized Tensor Times Khatri-Rao Product (MTTKRP) integration test
See this documentation from taco:
http://tensor-compiler.org/docs/data_analytics/index.html

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D104417
2021-06-17 16:53:12 +00:00
Gus Smith
f9a6d47c36 Add sparse matrix multiplication integration test
Adds an integration test for the SPMM (sparse matrix multiplication) kernel, which multiplies a sparse matrix by a dense matrix, resulting in a dense matrix. This is just a simple modification on the existing matrix-vector multiplication kernel.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D104334
2021-06-16 13:20:20 -07:00
Aart Bik
ec8910c4ad [mlir][sparse] integration test for all-dense annotated "sparse" output
Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D104277
2021-06-15 15:44:11 -07:00
thomasraoux
750799b7bc [mlir][NFC] Don't outline kernel in MMA integration tests
This matches better how other gpu integration tests are done.

Differential Revision: https://reviews.llvm.org/D103099
2021-05-27 09:43:54 -07:00
thomasraoux
b44007bec2 [mlir][gpu] Relax restriction on MMA store op to allow chain of mma ops.
In order to allow large matmul operations using the MMA ops we need to chain
operations this is not possible unless "DOp" and "COp" type have matching
layout so remove the "DOp" layout and force accumulator and result type to
match.
Added a test for the case where the MMA value is accumulated.

Differential Revision: https://reviews.llvm.org/D103023
2021-05-27 09:13:51 -07:00
Aart Bik
ca446e58c8 [sparse][mlir] simplify sparse runtime support library
Removed some of the older raw "MLIRized" versions that are
no longer needed now that the sparse runtime support library
can focus on the proper sparse tensor types rather than the
opague pointer approach of the past. This avoids legacy...

Reviewed By: penpornk

Differential Revision: https://reviews.llvm.org/D102960
2021-05-25 09:39:14 -07:00
Matthias Springer
5017b0f88b [mlir] Check only last dim stride in transfer op lowering
Lower a 1D vector transfer op to LLVM if the last dim stride is 1. Also fixes a bug in the original unit stride computation.

Differential Revision: https://reviews.llvm.org/D102897
2021-05-25 17:53:24 +09:00
thomasraoux
dae9038611 [mlir] Lower sm version for TensorCore intergration tests
Those tests only require sm70, this allows to run those integration
tests on more hardware.

Differential Revision: https://reviews.llvm.org/D103049
2021-05-24 14:45:24 -07:00
Navdeep Kumar
e552fa28da [MLIR][GPU] Add CUDA Tensor core WMMA test
Add a test case to test the complete execution of WMMA ops on a Nvidia
GPU with tensor cores. These tests are enabled under
MLIR_RUN_CUDA_TENSOR_CORE_TESTS.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D95334
2021-05-22 16:19:36 +05:30
Aart Bik
c194b49c9c [mlir][sparse] add full dimension ordering support
This revision completes the "dimension ordering" feature
of sparse tensor types that enables the programmer to
define a preferred order on dimension access (other than
the default left-to-right order). This enables e.g. selection
of column-major over row-major storage for sparse matrices,
but generalized to any rank, as in:

dimOrdering = affine_map<(i,j,k,l,m,n,o,p) -> (p,o,j,k,i,l,m,n)>

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102856
2021-05-21 12:35:13 -07:00
Matthias Springer
fb7ec1f187 [mlir] Use VectorTransferPermutationMapLoweringPatterns in VectorToSCF
VectorTransferPermutationMapLoweringPatterns can be enabled via a pass option. These additional patterns lower permutation maps to minor identity maps with broadcasting, if possible, allowing for more efficient vector load/stores. The option is deactivated by default.

Differential Revision: https://reviews.llvm.org/D102593
2021-05-19 14:46:19 +09:00
Nicolas Vasilache
f8dbd61074 [mlir][Linalg] Drop spuriously long matmul_column_major benchmark 2021-05-18 10:07:19 +00:00
Aart Bik
5879da496c [mlir][sparse] replace experimental flag with inplace attribute
The experimental flag for "inplace" bufferization in the sparse
compiler can be replaced with the new inplace attribute. This gives
a uniform way of expressing the more efficient way of bufferization.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102538
2021-05-17 11:43:44 -07:00
Matthias Springer
6774e5a995 [mlir] Fix in_bounds attr handling in TransferReadPermutationLowering
The in_bounds attribute should also be transposed.

Differential Revision: https://reviews.llvm.org/D102572
2021-05-17 15:28:16 +09:00
Matthias Springer
0f24163870 [mlir] Replace vector-to-scf with progressive-vector-to-scf
Depends On D102388

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D102101
2021-05-13 23:27:31 +09:00
Matthias Springer
bf068e1077 [mlir] Do not use pass labels in unrolled ProgressiveVectorToSCF
Do not rely on pass labels to detect if the pattern was already applied in the past (which allows for more some extra optimizations to avoid extra InsertOps and ExtractOps). Instead, check if these optimizations can be applied on-the-fly.

This also fixes a bug, where vector.insert and vector.extract ops sometimes disappeared in the middle of the pass because they get folded away, but the next application of the pattern expected them to be there.

Differential Revision: https://reviews.llvm.org/D102206
2021-05-13 22:01:08 +09:00
Matthias Springer
9b77be5583 [mlir] Unrolled progressive-vector-to-scf.
Instead of an SCF for loop, these pattern generate fully unrolled loops with no temporary buffer allocations.

Differential Revision: https://reviews.llvm.org/D101981
2021-05-13 13:08:48 +09:00
Matthias Springer
c52cbe63e4 [mlir] Fix masked vector transfer ops with broadcasts
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.

This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.

Differential Revision: https://reviews.llvm.org/D101745
2021-05-13 12:46:03 +09:00
Matthias Springer
6555e53ab0 Revert "[mlir] Fix masked vector transfer ops with broadcasts"
This reverts commit c9087788f7.

Accidentally pushed old version of the commit.
2021-05-13 11:55:00 +09:00
Matthias Springer
c9087788f7 [mlir] Fix masked vector transfer ops with broadcasts
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.

This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.

Differential Revision: https://reviews.llvm.org/D101745
2021-05-13 11:37:36 +09:00
Aart Bik
96a23911f6 [mlir][sparse] complete migration to sparse tensor type
A very elaborate, but also very fun revision because all
puzzle pieces are finally "falling in place".

1. replaces lingalg annotations + flags with proper sparse tensor types
2. add rigorous verification on sparse tensor type and sparse primitives
3. removes glue and clutter on opaque pointers in favor of sparse tensor types
4. migrates all tests to use sparse tensor types

NOTE: next CL will remove *all* obsoleted sparse code in Linalg

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102095
2021-05-10 12:55:22 -07:00
Aart Bik
a2c9d4bb04 [mlir][sparse] Introduce proper sparsification passes
This revision migrates more code from Linalg into the new permanent home of
SparseTensor. It replaces the test passes with proper compiler passes.

NOTE: the actual removal of the last glue and clutter in Linalg will follow

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D101811
2021-05-04 17:10:09 -07:00
Aart Bik
319072f4e3 [mlir][sparse] migrate sparse operations into new sparse tensor dialect
This is the very first step toward removing the glue and clutter from linalg and
replace it with proper sparse tensor types. This revision migrates the LinalgSparseOps
into SparseTensorOps of a sparse tensor dialect. This also provides a new home for
sparse tensor related transformation.

NOTE: the actual replacement with sparse tensor types (and removal of linalg glue/clutter)
will follow but I am trying to keep the amount of changes per revision manageable.

Differential Revision: https://reviews.llvm.org/D101573
2021-04-29 15:52:35 -07:00