clang-p2996

Author	SHA1	Message	Date
Aart Bik	378f1885e3	[mlir][sparse] enhance sparse reduction support Formerly, we accepted and/prod reductions as a standard reduction but these change the semantics after sparsification by not looking at implicit zeros. Therefore, we only accept standard reductions that are insensitive to implicit vs. explicit zeros, and leave the more complex reductions to the sparse_tensor.reduce custom reduction implementation. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D151929	2023-06-01 16:30:21 -07:00
wren romano	540d5e0ce6	[mlir][sparse] Updating STEA parser/printer to use the name "dimSlices" Depends On D151505 Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D151513	2023-05-30 15:50:07 -07:00
wren romano	76647fce13	[mlir][sparse] Combining `dimOrdering`+`higherOrdering` fields into `dimToLvl` This is a major step along the way towards the new STEA design. While a great deal of this patch is simple renaming, there are several significant changes as well. I've done my best to ensure that this patch retains the previous behavior and error-conditions, even though those are at odds with the eventual intended semantics of the `dimToLvl` mapping. Since the majority of the compiler does not yet support non-permutations, I've also added explicit assertions in places that previously had implicitly assumed it was dealing with permutations. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151505	2023-05-30 15:19:50 -07:00
Aart Bik	22caafc9f3	[mlir][sparse][gpu] end to end test for matmul (1) minor bug fix in copy back [always nice to run stuff ;-)] (2) run with and without lib (even though some fall back to CPU) Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D151507	2023-05-25 16:10:22 -07:00
Peiming Liu	f7b8b005ff	[mlir][sparse] fix bugs when computing the memory size when lowering pack op. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151481	2023-05-25 19:19:52 +00:00
Aart Bik	bcb698bfdc	[mlir][sparse][gpu] various cuSparse refinements (1) keep all cuSparse ops on single stream without wait() in right order (2) use more type precise memref types for COO (3) use ToTensor on resulting memref (even though it folds away again) Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D151404	2023-05-24 22:32:52 -07:00
Peiming Liu	b2e6b73544	[mlir][sparse] extend unpack operation to unpack arbitrary encodings. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151174	2023-05-23 22:34:01 +00:00
Aart Bik	b75d6a40f1	[mlir][sparse][gpu] recognize SpMM cuSparse during sparsification Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D150715	2023-05-19 17:22:59 -07:00
Peiming Liu	de56088866	[mlir][sparse] Support packing external data into arbitrary sparse tensor encoding. We previously only supported packing two arrays (values and coordinates) into COO tensors. This patch allows packing inputs into arbitrary sparse tensor format. It also deletes the "implicit" data canonicalization performed inside sparse compiler, but instead requires users to canonicalize the data before passing it to the sparse compiler. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D150916	2023-05-19 17:41:49 +00:00
wren romano	d9a2f89bee	[mlir][sparse] Adjusting error message wording, to better match new field names This is a followup to D150330, split out because it's not purely mechanical. Depends On D150330 Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D150409	2023-05-18 15:02:08 -07:00
wren romano	7f5fb90bbb	[mlir][sparse] Fixing GPU tests (followup to D150330) The GPU tests weren't updated when rebasing D150330, so this patch fixes that. Reviewed By: anlunx Differential Revision: https://reviews.llvm.org/D150822	2023-05-17 15:29:54 -07:00
wren romano	a0615d020a	[mlir][sparse] Renaming the STEA field `dimLevelType` to `lvlTypes` This commit is part of the migration towards the new STEA syntax/design. In particular, this commit includes the following changes: * Renaming compiler-internal functions/methods: * `SparseTensorEncodingAttr::{getDimLevelType => getLvlTypes}` * `Merger::{getDimLevelType => getLvlType}` (for consistency) * `sparse_tensor::{getDimLevelType => buildLevelType}` (to help reduce confusion vs actual getter methods) * Renaming external facets to match: * the STEA parser and printer * the C and Python bindings * PyTACO However, the actual renaming of the `DimLevelType` itself (along with all the "dlt" names) will be handled in a separate commit. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D150330	2023-05-17 14:24:09 -07:00
Anlun Xu	6116ca67ab	[mlir][sparse] Add sparse rewriting rules for tensor::ReshapeOp Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D149564	2023-05-16 14:56:33 -07:00
Aart Bik	ee42e23614	[mlir][sparse][gpu] first implementation of the GPU libgen approach The sparse compiler now has two prototype strategies for GPU acceleration: * CUDA codegen: this converts sparsified code to CUDA threads * CUDA libgen: this converts pre-sparsified code to cuSPARSE library calls This revision introduces the first steps required for the second approach. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D150170	2023-05-15 08:49:38 -07:00
Aart Bik	9a018a7b48	[mlir][sparse] relax constraints on tensor.cast with pre-rewriting Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D149489	2023-05-01 16:03:44 -07:00
Peiming Liu	d4db528938	[mlir][sparse] extend unpack operation to support unpacking a batched COO type Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D149103	2023-05-01 18:17:29 +00:00
Aart Bik	86888e420c	[mlir][sparse][gpu] generate proper memcpy in/out host and device The host registration is a convenient way to get CUDA kernels running, but it may be slow and does not work for all buffers (like global constants). This revision uses the proper alloc copy dealloc chains for buffers, using asynchronous chains to increase overlap. The host registration mechanism is kept under a flag for the output, just for experimentation purposes while this project ramps up. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D148682	2023-04-21 09:30:42 -07:00
Peiming Liu	a7cfcc686b	[mlir][sparse] fix crash when generating coiteration loop with compressed-hi DLT. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148842	2023-04-20 21:15:49 +00:00
Peiming Liu	fd2211d84a	use heap memory for position buffer allocated for PackOp. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148818	2023-04-20 20:26:01 +00:00
Peiming Liu	7864d736cf	[mlir][sparse] extend pack operation to support packing a batched COO type Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148670	2023-04-20 01:35:30 +00:00
Peiming Liu	abd66d918a	[mlir][sparse] support iteration over compressed-hi dimension level in loop emitter Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148668	2023-04-20 00:57:08 +00:00
Peiming Liu	b9589545c4	[mlir][sparse] introduce a new compressed(hi) dimension level type `compressed(hi)` is similar to `compressed`, but instead of reusing the previous position high as the current position low, it uses a pair of positions for each sparse index. The patch only introduces the definition (syntax) but does not provide codegen implementation. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148664	2023-04-18 23:26:11 +00:00
Aart Bik	4889214a48	[mlir][sparse][gpu] generate single module, unique kernel names This fixes a TODO in the first version. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D148406	2023-04-15 17:25:36 -07:00
Peiming Liu	5fd9d80135	[mlir][sparse] extend loop emitter to emit slice driven loops Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D142930	2023-04-13 03:29:40 +00:00
Aart Bik	19466ebc7f	[mlir][sparse][gpu] a first prototype sparse GPU code generator This implements a proof-of-concept GPU code generator to the sparse compiler pipeline, currently only capable of generating CUDA threads for outermost parallel loops. The objective, obviously, is to grow this concept to a full blown GPU code generator, capable of the right combination of code generation as well as exploiting idiomatic kernels or vector specific libraries (think cuSparse). Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D147483	2023-04-05 11:32:06 -07:00
Peiming Liu	7b86f7c5d4	[mlir][sparse] support sparse bufferization.alloc_tensor with copy argument. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D147358	2023-03-31 22:27:23 +00:00
bixia1	6071f6fd67	[mlir][sparse] Fix a problem in handling data type conversion. Previously, the genCast function generates arith.trunci for converting f32 to i32. Fix the function to use mlir::convertScalarToDtype to correctly handle conversion cases beyond index casting. Add a test case for codegen the sparse_tensor.convert op. Reviewed By: aartbik, Peiming, wrengr Differential Revision: https://reviews.llvm.org/D147272	2023-03-30 14:54:53 -07:00
Peiming Liu	c24547e969	[mlir][sparse] avoid creating temporary unordered COO buffer when reshape sparse tensor. Reviewed By: aartbik, wrengr Differential Revision: https://reviews.llvm.org/D147192	2023-03-30 01:29:55 +00:00
Peiming Liu	33267f4007	[mlir][sparse] convert a sparse tensor slice to sparse tensor correctly. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D147074	2023-03-28 21:39:31 +00:00
Peiming Liu	c44d307c55	[mlir][sparse] add create-sparse-deallocs options to match the create-deallocs in BufferizationOption. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D147010	2023-03-27 23:18:32 +00:00
Peiming Liu	2b21327fee	[mlir][sparse] fix crash when using pure constant index in indexing mapping (fixes #61530 ) To address https://github.com/llvm/llvm-project/issues/61530 Reviewed By: aartbik, wrengr Differential Revision: https://reviews.llvm.org/D146563	2023-03-21 23:45:20 +00:00
bixia1	abb05014f9	[mlir][sparse] Modify the pivot selection method for quick sort. Previously, we chose the median of three values. We now choose the median of five values when the number of values being sorted exceeds a threshold (currently 100). This is similar to std::sort. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145534	2023-03-15 13:53:00 -07:00
bixia1	2ef416273f	[mlir][sparse] Improve sort operation by generating inlined code to compare values. Previously, we generated function calls to compare values for sorting. It turns out that the compiler doesn't inline those function calls. We now directly generate inlined code. Also, modify the code for comparing values to use fewer branches. This improves all sort implementations in general. For arabic-2005.mtx CSR, the improvement is around 25%. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145442	2023-03-14 15:14:49 -07:00
bixia1	f6424d11cb	[mlir][sparse] Improve quick sort by using a loop to sort the bigger partition. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145440	2023-03-10 20:43:08 -08:00
Peiming Liu	6db397a8d4	[mlir][sparse] support dynamic sparse tensor slices. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D141532	2023-03-10 23:12:41 +00:00
Peiming Liu	8237cac612	[mlir][sparse] extend storage specifier operations for slices. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D141641	2023-03-10 18:58:47 +00:00
Peiming Liu	ab99b5d1f6	[mlir][sparse] deduplicate non-unique coordinates unconditionally Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145621	2023-03-09 21:59:57 +00:00
Peiming Liu	6df483c9a0	[mlir][sparse] add a check test for foreach operation on constant sparse tensor Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145728	2023-03-09 21:25:37 +00:00
Peiming Liu	41089f86e3	[mlir][sparse] fix bugs when convert coo to coo but with different dim ordering Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145723	2023-03-09 20:55:03 +00:00
Peiming Liu	55270f56d2	[mlir][sparse] fix a bug in unpack op that used wrong compare predicate. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145603	2023-03-08 19:52:09 +00:00
Peiming Liu	cc009334eb	[mlir][sparse] deduplicate non-unique coordinates when coiterating COO tensors Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145518	2023-03-07 21:52:38 +00:00
wren romano	84cd51bb97	[mlir][sparse] Renaming "pointer/index" to "position/coordinate" The old "pointer/index" names often cause confusion since these names clash with names of unrelated things in MLIR; so this change rectifies this by changing everything to use "position/coordinate" terminology instead. In addition to the basic terminology, there have also been various conventions for making certain distinctions like: (1) the overall storage for coordinates in the sparse-tensor, vs the particular collection of coordinates of a given element; and (2) particular coordinates given as a `Value` or `TypedValue<MemRefType>`, vs particular coordinates given as `ValueRange` or similar. I have striven to maintain these distinctions as follows: * "p/c" are used for individual position/coordinate values, when there is no risk of confusion. (Just like we use "d/l" to abbreviate "dim/lvl".) * "pos/crd" are used for individual position/coordinate values, when a longer name is helpful to avoid ambiguity or to form compound names (e.g., "parentPos"). (Just like we use "dim/lvl" when we need a longer form of "d/l".) I have also used these forms for a handful of compound names where the old name had been using a three-letter form previously, even though a longer form would be more appropriate. I've avoided renaming these to use a longer form purely for expediency sake, since changing them would require a cascade of other renamings. They should be updated to follow the new naming scheme, but that can be done in future patches. * "coords" is used for the complete collection of crd values associated with a single element. In the runtime library this includes both `std::vector` and raw pointer representations. In the compiler, this is used specifically for buffer variables with C++ type `Value`, `TypedValue<MemRefType>`, etc. The bare form "coords" is discouraged, since it fails to make the dim/lvl distinction; so the compound names "dimCoords/lvlCoords" should be used instead. 
(Though there may exist a rare few cases where it is appropriate to be intentionally ambiguous about what coordinate-space the coords live in; in which case the bare "coords" is appropriate.) There is seldom the need for the pos variant of this notion. In most circumstances we use the term "cursor", since the same buffer is reused for a 'moving' pos-collection. * "dcvs/lcvs" is used in the compiler as the `ValueRange` analogue of "dimCoords/lvlCoords". (The "vs" stands for "`Value`s".) I haven't found the need for it, but "pvs" would be the obvious name for a pos-`ValueRange`. The old "ind"-vs-"ivs" naming scheme does not seem to have been sustained in more recent code, which instead prefers other mnemonics (e.g., adding "Buf" to the end of the names for `TypeValue<MemRefType>`). I have cleaned up a lot of these to follow the "coords"-vs-"cvs" naming scheme, though haven't done an exhaustive cleanup. * "positions/coordinates" are used for larger collections of pos/crd values; in particular, these are used when referring to the complete sparse-tensor storage components. I also prefer to use these unabbreviated names in the documentation, unless there is some specific reason why using the abbreviated forms helps resolve ambiguity. In addition to making this terminology change, this change also does some cleanup along the way: * correcting the dim/lvl terminology in certain places. * adding `const` when it requires no other code changes. * miscellaneous cleanup that was entailed in order to make the proper distinctions. Most of these are in CodegenUtils.{h,cpp} Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144773	2023-03-06 12:23:33 -08:00
Peiming Liu	fc126022e8	[mlir][sparse] fuse collapse_shape on sparse tensor with GenericOp. Instead of always materializing a new sparse tensor after reshape, this patch tries to fuse the reshape (currently only on COO) with GenericOp and coiterates with the reshaped tensors without allocating a new sparse tensor. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145016	2023-03-01 19:05:48 +00:00
bixia1	2c81d43241	[mlir][sparse] Improve the implementation of sparse_tensor.new for the codegen path. Rewrite a NewOp into a NewOp of a sorted COO tensor and a ConvertOp for converting the sorted COO tensor to the destination tensor type. Codegen a NewOp of a sorted COO tensor to use the new bulk reader API and sort the elements only when the input is not sorted. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144504	2023-03-01 07:29:49 -08:00
Peiming Liu	849529ba8a	[mlir][sparse] fix performance bug in matmul with a sparse rhs due to suboptimal iteration graphs. While dense tensors support random accesses, it is critical to visit them in a row-major order for better cache locality. However, we previously considered dense inputs and outputs together when computing constraints for building the iteration graph, which could lead us to less efficient iteration graphs. This patch adds a new `SortMask::kIncludeDenseInput` to treat dense inputs/outputs separately when building the iteration graph, thus increasing the chance for us to construct a better iteration graph. A more fine-grained approach is to treat each input separately. Note, related to: https://github.com/llvm/llvm-project/issues/51651 Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144932	2023-02-28 21:02:17 +00:00
Peiming Liu	85dbb3fc4b	[mlir][sparse] support sparse tensor element type conversion in codegen path Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144578	2023-02-23 17:49:50 +00:00
Peiming Liu	44ff23d5e4	[mlir][sparse] unconditionally use IndexType for sparse_tensor.specifier Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144574	2023-02-22 20:21:34 +00:00
Peiming Liu	b7d86f3f1c	[mlir][sparse] revert optimization for dense->csc conversion. Eliminating the sort seems to make the whole conversion slower (probably because loop rotation leads to bad locality). Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144517	2023-02-21 21:34:01 +00:00
Peiming Liu	9e8d9316ce	[mlir][sparse] allow foreach operation to generate out-of-order loop on non-annotated tensor. No need for a temp COO and sort even when converting dense -> CSC, we can instead rotate the loop to yield ordered coordinates at the beginning. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144213	2023-02-16 23:23:20 +00:00
bixia1	c2e248c6ae	[mlir][sparse] Remove the expansion of symmetric MTX in the sparse tensor storage. We will support symmetric MTX without expanding the data in the sparse tensor storage. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144059	2023-02-16 13:02:17 -08:00


330 Commits