clang-p2996

Author	SHA1	Message	Date
Aart Bik	ee42e23614	[mlir][sparse][gpu] first implementation of the GPU libgen approach The sparse compiler now has two prototype strategies for GPU acceleration: * CUDA codegen: this converts sparsified code to CUDA threads * CUDA libgen: this converts pre-sparsified code to cuSPARSE library calls This revision introduces the first steps required for the second approach. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D150170	2023-05-15 08:49:38 -07:00
Aart Bik	9a018a7b48	[mlir][sparse] relax constraints on tensor.cast with pre-rewriting Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D149489	2023-05-01 16:03:44 -07:00
Peiming Liu	d4db528938	[mlir][sparse] extend unpack operation to support unpacking a batched COO type Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D149103	2023-05-01 18:17:29 +00:00
Aart Bik	86888e420c	[mlir][sparse][gpu] generate proper memcpy in/out host and device The host registration is a convenient way to get CUDA kernels running, but it may be slow and does not work for all buffers (like global constants). This revision uses the proper alloc copy dealloc chains for buffers, using asynchronous chains to increase overlap. The host registration mechanism is kept under a flag for the output, just for experimentation purposes while this project ramps up. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D148682	2023-04-21 09:30:42 -07:00
Peiming Liu	a7cfcc686b	[mlir][sparse] fix crash when generating coiteration loop with compressed-hi DLT. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148842	2023-04-20 21:15:49 +00:00
Peiming Liu	fd2211d84a	use heap memory for position buffer allocated for PackOp. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148818	2023-04-20 20:26:01 +00:00
Peiming Liu	7864d736cf	[mlir][sparse] extend pack operation to support packing a batched COO type Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148670	2023-04-20 01:35:30 +00:00
Peiming Liu	abd66d918a	[mlir][sparse] support iteration over compressed-hi dimension level in loop emitter Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148668	2023-04-20 00:57:08 +00:00
Peiming Liu	b9589545c4	[mlir][sparse] introduce a new compressed(hi) dimension level type `compressed(hi)` is similar to `compressed`, but instead of reusing the previous position high as the current position low, it uses a pair of positions for each sparse index. The patch only introduces the definition (syntax) but does not provide codegen implementation. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148664	2023-04-18 23:26:11 +00:00
Aart Bik	4889214a48	[mlir][sparse][gpu] generate single module, unique kernel names This fixes a TODO in the first version. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D148406	2023-04-15 17:25:36 -07:00
Peiming Liu	5fd9d80135	[mlir][sparse] extend loop emitter to emit slice driven loops Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D142930	2023-04-13 03:29:40 +00:00
Aart Bik	19466ebc7f	[mlir][sparse][gpu] a first prototype sparse GPU code generator This implements a proof-of-concept GPU code generator to the sparse compiler pipeline, currently only capable of generating CUDA threads for outermost parallel loops. The objective, obviously, is to grow this concept to a full-blown GPU code generator, capable of the right combination of code generation as well as exploiting idiomatic kernels or vector specific libraries (think cuSPARSE). Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D147483	2023-04-05 11:32:06 -07:00
Peiming Liu	7b86f7c5d4	[mlir][sparse] support sparse bufferization.alloc_tensor with copy argument. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D147358	2023-03-31 22:27:23 +00:00
bixia1	6071f6fd67	[mlir][sparse] Fix a problem in handling data type conversion. Previously, the genCast function generates arith.trunci for converting f32 to i32. Fix the function to use mlir::convertScalarToDtype to correctly handle conversion cases beyond index casting. Add a test case for codegen the sparse_tensor.convert op. Reviewed By: aartbik, Peiming, wrengr Differential Revision: https://reviews.llvm.org/D147272	2023-03-30 14:54:53 -07:00
Peiming Liu	c24547e969	[mlir][sparse] avoid creating temporary unordered COO buffer when reshape sparse tensor. Reviewed By: aartbik, wrengr Differential Revision: https://reviews.llvm.org/D147192	2023-03-30 01:29:55 +00:00
Peiming Liu	33267f4007	[mlir][sparse] convert a sparse tensor slice to sparse tensor correctly. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D147074	2023-03-28 21:39:31 +00:00
Peiming Liu	c44d307c55	[mlir][sparse] add create-sparse-deallocs options to match the create-deallocs in BufferizationOption. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D147010	2023-03-27 23:18:32 +00:00
Peiming Liu	2b21327fee	[mlir][sparse] fix crash when using pure constant index in indexing mapping (fixes #61530 ) To address https://github.com/llvm/llvm-project/issues/61530 Reviewed By: aartbik, wrengr Differential Revision: https://reviews.llvm.org/D146563	2023-03-21 23:45:20 +00:00
bixia1	abb05014f9	[mlir][sparse] Modify the pivot selection method for quick sort. Previously, we chose the median of three values. We now choose the median of five values when the number of values being sorted exceeds a threshold (currently 100). This is similar to std::sort. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145534	2023-03-15 13:53:00 -07:00
bixia1	2ef416273f	[mlir][sparse] Improve sort operation by generating inlined code to compare values. Previously, we generated function calls to compare values for sorting. It turns out that the compiler doesn't inline those function calls. We now directly generate inlined code. Also, modify the code for comparing values to use fewer branches. This improves all sort implementations in general. For arabic-2005.mtx CSR, the improvement is around 25%. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145442	2023-03-14 15:14:49 -07:00
bixia1	f6424d11cb	[mlir][sparse] Improve quick sort by using a loop to sort the bigger partition. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145440	2023-03-10 20:43:08 -08:00
Peiming Liu	6db397a8d4	[mlir][sparse] support dynamic sparse tensor slices. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D141532	2023-03-10 23:12:41 +00:00
Peiming Liu	8237cac612	[mlir][sparse] extend storage specifier operations for slices. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D141641	2023-03-10 18:58:47 +00:00
Peiming Liu	ab99b5d1f6	[mlir][sparse] deduplicate non-unique coordinates unconditionally Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145621	2023-03-09 21:59:57 +00:00
Peiming Liu	6df483c9a0	[mlir][sparse] add a check test for foreach operation on constant sparse tensor Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145728	2023-03-09 21:25:37 +00:00
Peiming Liu	41089f86e3	[mlir][sparse] fix bugs when convert coo to coo but with different dim ordering Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145723	2023-03-09 20:55:03 +00:00
Peiming Liu	55270f56d2	[mlir][sparse] fix a bug in unpack op that used wrong compare predicate. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145603	2023-03-08 19:52:09 +00:00
Peiming Liu	cc009334eb	[mlir][sparse] deduplicate non-unique coordinates when coiterating COO tensors Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145518	2023-03-07 21:52:38 +00:00
wren romano	84cd51bb97	[mlir][sparse] Renaming "pointer/index" to "position/coordinate" The old "pointer/index" names often cause confusion since these names clash with names of unrelated things in MLIR; so this change rectifies this by changing everything to use "position/coordinate" terminology instead. In addition to the basic terminology, there have also been various conventions for making certain distinctions like: (1) the overall storage for coordinates in the sparse-tensor, vs the particular collection of coordinates of a given element; and (2) particular coordinates given as a `Value` or `TypedValue<MemRefType>`, vs particular coordinates given as `ValueRange` or similar. I have striven to maintain these distinctions as follows: * "p/c" are used for individual position/coordinate values, when there is no risk of confusion. (Just like we use "d/l" to abbreviate "dim/lvl".) * "pos/crd" are used for individual position/coordinate values, when a longer name is helpful to avoid ambiguity or to form compound names (e.g., "parentPos"). (Just like we use "dim/lvl" when we need a longer form of "d/l".) I have also used these forms for a handful of compound names where the old name had been using a three-letter form previously, even though a longer form would be more appropriate. I've avoided renaming these to use a longer form purely for expediency sake, since changing them would require a cascade of other renamings. They should be updated to follow the new naming scheme, but that can be done in future patches. * "coords" is used for the complete collection of crd values associated with a single element. In the runtime library this includes both `std::vector` and raw pointer representations. In the compiler, this is used specifically for buffer variables with C++ type `Value`, `TypedValue<MemRefType>`, etc. The bare form "coords" is discouraged, since it fails to make the dim/lvl distinction; so the compound names "dimCoords/lvlCoords" should be used instead. 
(Though there may exist a rare few cases where it is appropriate to be intentionally ambiguous about what coordinate-space the coords live in; in which case the bare "coords" is appropriate.) There is seldom the need for the pos variant of this notion. In most circumstances we use the term "cursor", since the same buffer is reused for a 'moving' pos-collection. * "dcvs/lcvs" is used in the compiler as the `ValueRange` analogue of "dimCoords/lvlCoords". (The "vs" stands for "`Value`s".) I haven't found the need for it, but "pvs" would be the obvious name for a pos-`ValueRange`. The old "ind"-vs-"ivs" naming scheme does not seem to have been sustained in more recent code, which instead prefers other mnemonics (e.g., adding "Buf" to the end of the names for `TypedValue<MemRefType>`). I have cleaned up a lot of these to follow the "coords"-vs-"cvs" naming scheme, though haven't done an exhaustive cleanup. * "positions/coordinates" are used for larger collections of pos/crd values; in particular, these are used when referring to the complete sparse-tensor storage components. I also prefer to use these unabbreviated names in the documentation, unless there is some specific reason why using the abbreviated forms helps resolve ambiguity. In addition to making this terminology change, this change also does some cleanup along the way: * correcting the dim/lvl terminology in certain places. * adding `const` when it requires no other code changes. * miscellaneous cleanup that was entailed in order to make the proper distinctions. Most of these are in CodegenUtils.{h,cpp} Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144773	2023-03-06 12:23:33 -08:00
Peiming Liu	fc126022e8	[mlir][sparse] fuse collapse_shape on sparse tensor with GenericOp. Instead of always materializing a new sparse tensor after reshape, this patch tries to fuse the reshape (currently only on COO) with GenericOp and coiterates with the reshaped tensors without allocating a new sparse tensor. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D145016	2023-03-01 19:05:48 +00:00
bixia1	2c81d43241	[mlir][sparse] Improve the implementation of sparse_tensor.new for the codegen path. Rewrite a NewOp into a NewOp of a sorted COO tensor and a ConvertOp for converting the sorted COO tensor to the destination tensor type. Codegen a NewOp of a sorted COO tensor to use the new bulk reader API and sort the elements only when the input is not sorted. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144504	2023-03-01 07:29:49 -08:00
Peiming Liu	849529ba8a	[mlir][sparse] fix performance bug in matmul with a sparse rhs due to suboptimal iteration graphs. While dense tensors support random accesses, it is critical to visit them in a row-major order for better cache locality. However, we previously considered dense inputs and outputs together when computing constraints for building the iteration graph, which could lead us to less efficient iteration graphs. This patch adds a new `SortMask::kIncludeDenseInput` to treat dense inputs/outputs separately when building the iteration graph, thus increasing the chance for us to construct a better iteration graph. A more fine-grained approach is to treat each input separately. Note, related to: https://github.com/llvm/llvm-project/issues/51651 Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144932	2023-02-28 21:02:17 +00:00
Peiming Liu	85dbb3fc4b	[mlir][sparse] support sparse tensor element type conversion in codegen path Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144578	2023-02-23 17:49:50 +00:00
Peiming Liu	44ff23d5e4	[mlir][sparse] unconditionally use IndexType for sparse_tensor.specifier Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144574	2023-02-22 20:21:34 +00:00
Peiming Liu	b7d86f3f1c	[mlir][sparse] revert optimization for dense->csc conversion. Eliminating the sort seems to make the whole conversion slower (probably because loop rotation leads to bad locality). Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144517	2023-02-21 21:34:01 +00:00
Peiming Liu	9e8d9316ce	[mlir][sparse] allow foreach operation to generate out-of-order loop on non-annotated tensor. No need for a temp COO and sort even when converting dense -> CSC, we can instead rotate the loop to yield ordered coordinates at the beginning. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144213	2023-02-16 23:23:20 +00:00
bixia1	c2e248c6ae	[mlir][sparse] Remove the expansion of symmetric MTX in the sparse tensor storage. We will support symmetric MTX without expanding the data in the sparse tensor storage. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144059	2023-02-16 13:02:17 -08:00
Peiming Liu	c738b430c4	[mlir][sparse] introduce operations to query sparse tensor slice offset/strides at the given dimension Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D141442	2023-02-16 00:31:11 +00:00
Peiming Liu	e2e83f4c8f	[mlir][sparse] support coiteration over sparse tensor slices Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D140736	2023-02-15 23:52:22 +00:00
wren romano	f708a549b8	[mlir][sparse] Factoring out SparseTensorType class This change adds a new `SparseTensorType` class for making the "dim" vs "lvl" distinction more overt, and for abstracting over the differences between sparse-tensors and dense-tensors. In addition, this change also adds new type aliases `Dimension`, `Level`, and `FieldIndex` to make code more self-documenting. Although the diff is very large, the majority of the changes are mechanical in nature (e.g., changing types to use the new aliases, updating variable names to match, etc). Along the way I also made many variables `const` when they could be; the majority of which required only adding the keyword. A few places had conditional definitions of these variables, requiring actual code changes; however, that was only done when the overall change was extremely local and easy to extract. All these changes are included in the current patch only because it would be too onerous to split them off into a separate patch. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D143800	2023-02-14 19:17:19 -08:00
Peiming Liu	81cb70e46e	[mlir][sparse] fix a bug in UnpackOp converter. UnpackOp Converter used to create reallocOp unconditionally, but this might cause issues when the requested memory size is smaller than the actual storage. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144065	2023-02-15 02:36:00 +00:00
Peiming Liu	dc6427d687	[mlir][sparse] implement lowering rules for sparse_tensor::unpack Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D143672	2023-02-11 01:05:46 +00:00
Peiming Liu	6dbca86d83	[mlir][sparse] introduce sparse_tensor::unpack operation An inverse operation of sparse_tensor::pack Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D143669	2023-02-11 01:01:35 +00:00
Jim Kitchen	81d0d2b2a0	[mlir][sparse] Sparse reduction in lex order no longer produces dense output Previously, when performing a reduction on a sparse tensor, the result would be different depending on iteration order. For expanded access pattern, an empty row would contribute no entry in the output. For lex ordering, the identity would end up in the output. This code changes that behavior and keeps track of whether any entries were actually reduced in lex ordering, making the output consistent between the two iteration styles. Differential Revision: https://reviews.llvm.org/D142050	2023-02-10 13:09:28 -06:00
bixia1	a150766880	[mlir][sparse] Implement hybrid quick sort for sparse_tensor.sort. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D143227	2023-02-08 14:06:31 -08:00
Aart Bik	3bd82f30dc	[mlir][sparse] compute allocation size_hint This adds the hint to a number of tensor allocations in codegens, shaving off quite some time from e.g. reading in sparse matrices due to zero-reallocation scheme. Note that we can probably provide hints on all allocations, and refine the heuristics that use them for general tensors. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D143309	2023-02-06 14:08:53 -08:00
Peiming Liu	a41672e16a	[mlir][sparse] implement lowering rules for sparse_tensor.pack operation Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D143230	2023-02-03 23:51:36 +00:00
Peiming Liu	1f07853f2b	[mlir][sparse] introduce sparse_tensor.pack operation Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D143224	2023-02-03 22:30:52 +00:00
Aart Bik	e2e6e7a6a3	[mlir][sparse] start using size_hint provided in allocation op Even though we introduced the size_hint, we never used it. This is a very first step, using the hint during the codegen path. Note that we can refine the heuristics. Also, we need to start adding the hint on all allocation generated for reading tensors, converting tensors, etc. Reviewed By: Peiming, bixia Differential Revision: https://reviews.llvm.org/D143292	2023-02-03 14:02:41 -08:00
bixia1	3b1c86cd0f	[mlir][sparse] Implement heap sort for sparse_tensor.sort. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D142913	2023-02-02 15:36:38 -08:00

317 Commits