(1) uses the previously introduced API to reuse the AffineExpr parser without code duplication
(2) solves the look-ahead problem when parsing the level spec (see the sketch below)
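A sketch of the two level-spec forms that require the look-ahead:

  (d0, d1) -> (l0 = d0 : dense, l1 = d1 : compressed)  // explicit lvl vars
  (d0, d1) -> (d0 : dense, d1 : compressed)            // direct dim use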
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D154254
This might simplify the frontend implementation by avoiding recomputation for the same value.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D154244
We are in the process of migrating to a much improved surface syntax for the Sparse Tensor Encoding Attribute (STEA).
You can see a preview of this in the StableHLO RFC at
https://github.com/openxla/stablehlo/blob/main/rfcs/20230210-sparsity.md
//**This design is courtesy of Wren Romano.**//
This initial revision
(1) Introduces the first version of a new parser written by Wren Romano
(2) Introduces a simple "migration plan" using NEW_SYNTAX on the STEA, which will allow us to test the new parser with new examples, as well as migrate existing examples over without the need to rewrite them all
This first "drop" merely provides the entry points to parse the new syntax. The parser is still under active development. For example, we need to address the "lookahead" issue when parsing the lvl spec (viz. do we see l0 = d0 or a direct d0). Another larger task is to actually implement "affine" parsing (since the MLIR affine parser is not accessible in other parts of the tree).
EXAMPLE:
Currently, CSR looks like
#CSR = #sparse_tensor.encoding<{
  lvlTypes = ["dense", "compressed"],
  dimToLvl = affine_map<(i,j) -> (i,j)>
}>
but you can "force" the new parser with
#CSR = #sparse_tensor.encoding<{
  NEW_SYNTAX =
    (d0, d1) -> (l0 = d0 : dense, l1 = d1 : compressed)
}>
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D153997
The old pattern was missing some cases (e.g. swapping the arguments),
but it also allowed too many cases (e.g. a non-empty "absent" region, or
different arguments for add/mul). This fixes both issues.
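For illustration, the "absent" branch refers to regions such as the one on
sparse_tensor.unary; a minimal sketch with the (required) empty absent
region, using hypothetical values:

  %r = sparse_tensor.unary %x : f64 to f64
    present={
      ^bb0(%a: f64):
        %ret = arith.negf %a : f64
        sparse_tensor.yield %ret : f64
    }
    absent={}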
Reviewed By: K-Wu
Differential Revision: https://reviews.llvm.org/D153487
We prevent merging a sparse-in/dense-out kernel with dense-in
kernels because the result is usually not sparsifiable.
Dense kernels and sparse kernels are still fused, obviously.
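A hedged sketch of the now-rejected fusion, with hypothetical shapes and
trait; kernel 1 is sparse-in/dense-out, kernel 2 is dense-in, and merging
them would yield a kernel that usually cannot be sparsified:

  #SV = #sparse_tensor.encoding<{ lvlTypes = ["compressed"] }>
  #trait = {
    indexing_maps = [affine_map<(i) -> (i)>, affine_map<(i) -> (i)>],
    iterator_types = ["parallel"]
  }
  // Kernel 1: sparse input, dense output.
  %0 = linalg.generic #trait
      ins(%a : tensor<16xf64, #SV>) outs(%d : tensor<16xf64>) {
    ^bb0(%x: f64, %o: f64):
      %s = arith.addf %x, %x : f64
      linalg.yield %s : f64
  } -> tensor<16xf64>
  // Kernel 2: dense input %0; no longer merged into kernel 1.
  %1 = linalg.generic #trait
      ins(%0 : tensor<16xf64>) outs(%e : tensor<16xf64>) {
    ^bb0(%x: f64, %o: f64):
      %m = arith.mulf %x, %x : f64
      linalg.yield %m : f64
  } -> tensor<16xf64>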
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D153077
The tensor levels are now explicitly categorized into different `LoopCondKind`s to instruct the LoopEmitter to generate different code for different kinds of conditions (e.g., `SparseCond`, `SparseSliceCond`, `SparseAffineIdxCond`, etc.).
The process of generating a while loop is now decomposed into three steps, which are dispatched to the appropriate `LoopCondKind` handlers (a sketch of the emitted loop follows the steps):
1. Generate LoopCondition (e.g., `pos <= posHi` for `SparseCond`, `slice.isNonEmpty` for `SparseAffineIdxCond`)
2. Generate LoopBody (e.g., compute the coordinates)
3. Generate ExtraChecks (e.g., `if (onSlice(crd))` for `SparseSliceCond`)
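For a plain `SparseCond` over one compressed level, the emitted loop roughly
looks as follows (a hedged sketch with hypothetical SSA names):

  %r = scf.while (%pos = %posLo) : (index) -> index {
    // Step 1: LoopCondition, the position-range test for SparseCond.
    %cond = arith.cmpi ult, %pos, %posHi : index
    scf.condition(%cond) %pos : index
  } do {
  ^bb0(%p: index):
    // Step 2: LoopBody, e.g. load the coordinate at this position.
    %crd = memref.load %crdBuf[%p] : memref<?xindex>
    // Step 3 would wrap ExtraChecks, e.g. if (onSlice(crd)) for
    // SparseSliceCond, around the user payload here.
    %next = arith.addi %p, %c1 : index
    scf.yield %next : index
  }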
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D152464
Formerly, we accepted and/prod reductions as standard
reductions, but these change the semantics after sparsification,
since implicit zeros are never visited. Therefore, we only accept
standard reductions that are insensitive to implicit vs.
explicit zeros, and leave the more complex reductions to
the sparse_tensor.reduce custom reduction implementation,
as illustrated below.
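For example, a product reduction, which is sensitive to implicit zeros,
must be expressed explicitly through the custom reduction op (with the
identity passed as the third operand):

  %cf1 = arith.constant 1.0 : f64
  %result = sparse_tensor.reduce %x, %y, %cf1 : f64 {
    ^bb0(%a: f64, %b: f64):
      %ret = arith.mulf %a, %b : f64
      sparse_tensor.yield %ret : f64
  }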
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D151929
This is a major step along the way towards the new STEA design. While a great deal of this patch is simple renaming, there are several significant changes as well. I've done my best to ensure that this patch retains the previous behavior and error conditions, even though those are at odds with the eventual intended semantics of the `dimToLvl` mapping. Since the majority of the compiler does not yet support non-permutations, I've also added explicit assertions in places that previously had implicitly assumed they were dealing with permutations.
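For example, a 2x2 block-sparse format uses a non-permutation `dimToLvl`
mapping, which is what the eventual semantics should support (a sketch):

  #BSR = #sparse_tensor.encoding<{
    lvlTypes = ["compressed", "compressed", "dense", "dense"],
    dimToLvl = affine_map<(i, j) -> (i floordiv 2, j floordiv 2,
                                     i mod 2, j mod 2)>
  }>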
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D151505
(1) minor bug fix in the copy back [always nice to run stuff ;-)]
(2) run with and without lib (even though some fall back to CPU)
Reviewed By: wrengr
Differential Revision: https://reviews.llvm.org/D151507
(1) keep all cuSparse ops on a single stream, in the right order, without intermediate wait() calls
(2) use more precisely typed memref types for COO
(3) use ToTensor on the resulting memref (even though it folds away again)
Reviewed By: K-Wu
Differential Revision: https://reviews.llvm.org/D151404
We previously only supported packing two arrays (values and coordinates) into COO tensors.
This patch allows packing inputs into arbitrary sparse tensor formats.
It also deletes the "implicit" data canonicalization performed inside the sparse compiler,
and instead requires users to canonicalize the data before passing it to the sparse compiler.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D150916
This is a followup to D150330, split out because it's not purely mechanical.
Depends On D150330
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D150409
The GPU tests weren't updated when rebasing D150330, so this patch fixes that.
Reviewed By: anlunx
Differential Revision: https://reviews.llvm.org/D150822
This commit is part of the migration towards the new STEA syntax/design. In particular, this commit includes the following changes:
* Renaming compiler-internal functions/methods:
* `SparseTensorEncodingAttr::{getDimLevelType => getLvlTypes}`
* `Merger::{getDimLevelType => getLvlType}` (for consistency)
* `sparse_tensor::{getDimLevelType => buildLevelType}` (to help reduce confusion vs actual getter methods)
* Renaming external facets to match:
* the STEA parser and printer
* the C and Python bindings
* PyTACO
However, the actual renaming of the `DimLevelType` itself (along with all the "dlt" names) will be handled in a separate commit.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D150330
The sparse compiler now has two prototype strategies for GPU acceleration:
* CUDA codegen: this converts sparsified code to CUDA threads
* CUDA libgen: this converts pre-sparsified code to cuSPARSE library calls
This revision introduces the first steps required for the second approach.
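For example, the libgen path targets kernels such as SpMV over a CSR
matrix, which can be mapped to a cuSparse call rather than generated
CUDA threads (a sketch with hypothetical operands):

  #CSR = #sparse_tensor.encoding<{ lvlTypes = ["dense", "compressed"] }>
  %y = linalg.matvec
      ins(%A, %x : tensor<?x?xf64, #CSR>, tensor<?xf64>)
      outs(%y0 : tensor<?xf64>) -> tensor<?xf64>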
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D150170
The host registration is a convenient way to get CUDA kernels
running, but it may be slow and does not work for all buffers
(like global constants). This revision uses proper alloc/copy/dealloc
chains for buffers, using asynchronous chains
to increase overlap. The host registration mechanism is
kept under a flag for the output, just for experimentation
purposes while this project ramps up.
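A hedged sketch of such an asynchronous chain (hypothetical buffers, one
device allocation):

  %t0 = gpu.wait async
  %devA, %t1 = gpu.alloc async [%t0] (%n) : memref<?xf64>
  %t2 = gpu.memcpy async [%t1] %devA, %hostA : memref<?xf64>, memref<?xf64>
  // ... kernel launches chained on %t2 ...
  %t3 = gpu.memcpy async [%t2] %hostA, %devA : memref<?xf64>, memref<?xf64>
  %t4 = gpu.dealloc async [%t3] %devA : memref<?xf64>
  gpu.wait [%t4]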
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D148682
`compressed(hi)` is similar to `compressed`, but instead of reusing the previous position high as the current position low, it uses a pair of positions for each sparse index.
The patch only introduces the definition (syntax) but does not provide codegen implementation.
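A sketch of what the definition admits (assuming the level-type string is
spelled "compressed-hi"; no codegen behind it yet):

  #CSR_hi = #sparse_tensor.encoding<{
    lvlTypes = ["dense", "compressed-hi"]
  }>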
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D148664
This implements a proof-of-concept GPU code generator
for the sparse compiler pipeline, currently only capable
of generating CUDA threads for outermost parallel loops.
The objective, obviously, is to grow this concept
into a full-blown GPU code generator, capable of the
right combination of code generation as well as exploiting
idiomatic kernels or vendor-specific libraries (think cuSparse).
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D147483
Previously, the genCast function generated arith.trunci for converting f32 to
i32. Fix the function to use mlir::convertScalarToDtype to correctly handle
conversion cases beyond index casting.
Also add a test case for codegen of the sparse_tensor.convert op.
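For instance, for the f32 to i32 case (a minimal sketch):

  // Before (wrong): arith.trunci only operates on integer types.
  //   %0 = arith.trunci %f : f32 to i32
  // After: the proper float-to-signed-integer conversion is chosen.
  %0 = arith.fptosi %f : f32 to i32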
Reviewed By: aartbik, Peiming, wrengr
Differential Revision: https://reviews.llvm.org/D147272