clang-p2996

Author	SHA1	Message	Date
Peiming Liu	298412b578	[mlir][sparse] setup `SparseIterator` to help generating code to traverse a sparse tensor level. (#78345 )	2024-01-24 11:33:06 -08:00
Oleksandr "Alex" Zinenko	2798b72ae7	[mlir] introduce debug transform dialect extension (#77595 ) Introduce a new extension for simple print-debugging of the transform dialect scripts. The initial version of this extension consists of two ops that are printing the payload objects associated with transform dialect values. Similar ops were already available in the test extension and several downstream projects, and were extensively used for testing.	2024-01-12 13:24:02 +01:00
Matthias Springer	bb6d5c2200	[mlir][Transforms] `GreedyPatternRewriteDriver`: Do not CSE constants during iterations (#75897 ) The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply rewrite patterns to ops. It has special handling for constants: they are CSE'd and sometimes moved to parent regions to allow for additional CSE'ing. This happens in `OperationFolder`. To allow for efficient CSE'ing, `OperationFolder` maintains an internal lookup data structure to find the existing constant ops with the same value for each `IsolatedFromAbove` region: ```c++ /// A mapping between an insertion region and the constants that have been /// created within it. DenseMap<Region *, ConstantMap> foldScopes; ``` Rewrite patterns are allowed to modify operations. In particular, they may move operations (including constants) from one region to another one. Such an IR rewrite can make the above lookup data structure inconsistent. We encountered such a bug in a downstream project. This bug materialized in the form of an op that uses the result of a constant op from a different `IsolatedFromAbove` region (that is not accessible). This commit changes the behavior of the `GreedyPatternRewriteDriver` such that `OperationFolder` is used to CSE constants at the beginning of each iteration (as the worklist is populated), but no longer during an iteration. `OperationFolder` is no longer used after populating the worklist, so we do not have to care about inconsistent state in the `OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver` now performs the op folding by itself instead of calling `OperationFolder::tryToFold`. This change changes the order of constant ops in test cases, but not the region in which they appear. All broken test cases were fixed by turning `CHECK` into `CHECK-DAG`. Alternatives considered: The state of `OperationFolder` could be partially invalidated with every `notifyOperationModified` notification. That is more fragile than the solution in this commit because incorrect rewriter API usage can lead to missing notifications and hard-to-debug `IsolatedFromAbove` violations. (It did not fix the above-mentioned bug in a downstream project, which could be due to incorrect rewriter API usage or due to another conceptual problem that I missed.) Moreover, ops are frequently getting modified during a greedy pattern rewrite, so we would likely keep invalidating large parts of the state of `OperationFolder` over and over. Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant ops are no longer folded during a greedy pattern rewrite. If you rely on folding (and rematerialization) of constant ops during a greedy pattern rewrite, turn the folder into a pattern.	2024-01-05 09:22:18 +01:00
Aart Bik	41a07e668c	[mlir][sparse] recognize NVidia 2:4 type for matmul (#76758 ) This removes the temporary DENSE24 attribute and replaces it with proper recognition of dense to 24 conversion. The compression will be performed on the device prior to performing the matrix mult. Note that we no longer need to start with the linalg version, we can lift this to the proper named linalg op. Also renames some files into more consistent names.	2024-01-02 14:44:24 -08:00
Matthias Springer	10056c821a	[mlir][SCF] `scf.parallel`: Make reductions part of the terminator (#75314 ) This commit makes reductions part of the terminator. Instead of `scf.yield`, `scf.reduce` now terminates the body of `scf.parallel` ops. `scf.reduce` may contain an arbitrary number of reductions, with one region per reduction. Example: ```mlir %init = arith.constant 0.0 : f32 %r:2 = scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init, %init) -> f32, f32 { %elem_to_reduce1 = load %buffer1[%iv] : memref<100xf32> %elem_to_reduce2 = load %buffer2[%iv] : memref<100xf32> scf.reduce(%elem_to_reduce1, %elem_to_reduce2 : f32, f32) { ^bb0(%lhs : f32, %rhs: f32): %res = arith.addf %lhs, %rhs : f32 scf.reduce.return %res : f32 }, { ^bb0(%lhs : f32, %rhs: f32): %res = arith.mulf %lhs, %rhs : f32 scf.reduce.return %res : f32 } } ``` `scf.reduce` operations can no longer be interleaved with other ops in the body of `scf.parallel`. This simplifies the op and makes it possible to assign the `RecursiveMemoryEffects` trait to `scf.reduce`. (This was not possible before because the op was not a terminator, causing the op to be DCE'd.)	2023-12-20 11:06:27 +09:00
Peiming Liu	6c06bde7c4	[mlir][sparse] support loop range query using SparseTensorLevel. (#75670 )	2023-12-15 16:33:31 -08:00
Yinying Li	31b72b0742	[mlir][sparse]Make isBlockSparsity more robust (#75113 ) 1. A single dimension can either be blocked (with floordiv and mod pair) or non-blocked. Mixing them would be invalid. 2. Block size should be non-zero value.	2023-12-12 13:43:03 -05:00
Aart Bik	d96f46dd20	[mlir][sparse] fix bug in custom reduction scalarization code (#74898 ) Bug found with BSR of "spy" SDDMM method	2023-12-11 10:22:17 -08:00
Peiming Liu	baa192ea65	[mlir][sparse] optimize memory loads to SSA values when generating sp… (#74787 ) …arse conv.	2023-12-08 09:22:19 -08:00
Peiming Liu	097d2f1417	[mlir][sparse] optimize memory load to SSA value when generating spar… (#74750 ) …se conv kernel.	2023-12-07 12:00:25 -08:00
Peiming Liu	b6cad75e07	[mlir][sparse] refactoring: using util functions to query the index to load from position array for slice-driven loop. (#73986 )	2023-11-30 16:40:11 -08:00
Peiming Liu	2cc4b3d07c	[mlir][sparse] code cleanup using the assumption that dim2lvl maps ar… (#72894 ) …e simplified.	2023-11-20 10:25:42 -08:00
Peiming Liu	573c4db947	[mlir][sparse] refine reinterpret_map test cases (#72684 )	2023-11-17 10:04:56 -08:00
Aart Bik	83cf0dc982	[mlir][sparse] implement direct IR alloc/empty/new for non-permutations (#72585 ) This change implements the correct level sizes set up for the direct IR codegen fields in the sparse storage scheme. This brings libgen and codegen together again. This is step 3 out of 3 to make sparse_tensor.new work for BSR	2023-11-16 17:17:41 -08:00
Yinying Li	c5a67e16b6	[mlir][sparse] Use variable instead of inlining sparse encoding (#72561 ) Example: #CSR = #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 : compressed), }> // CHECK: #[[$CSR.]] = #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 : compressed) }> // CHECK-LABEL: func private @sparse_csr( // CHECK-SAME: tensor<?x?xf32, #[[$CSR]]*>) func.func private @sparse_csr(tensor<?x?xf32, #CSR>)	2023-11-16 19:30:21 -05:00
Peiming Liu	06a65ce500	[mlir][sparse] schedule sparse kernels in a separate pass from sparsification. (#72423 )	2023-11-15 12:16:05 -08:00
Tim Harvey	dce7a7cf69	Changed all code and comments that used the phrase "sparse compiler" to instead use "sparsifier" (#71875 ) The changes in this p.r. mostly center around the tests that use the flag sparse_compiler (also: sparse-compiler).	2023-11-15 20:12:35 +00:00
Aart Bik	a40900211a	[mlir][sparse] set rwx permissions to consistent values (#72311 ) some files had "x" permission set, others were missing "r"	2023-11-14 13:32:55 -08:00
Aart Bik	5f32bcfbae	[mlir][sparse][gpu] re-enable all GPU libgen tests (#72185 ) Previous change no longer properly used the GPU libgen pass (even though most tests still passed falling back to CPU). This revision puts the proper pass order into place. Also bit of a cleanup of CPU codegen vs. libgen setup.	2023-11-14 09:06:15 -08:00
Peiming Liu	269685545e	[mlir][sparse] remove filter-loop based algorithm support to handle a… (#71840 ) …ffine subscript expressions.	2023-11-13 11:36:49 -08:00
Peiming Liu	c99951d491	[mlir][sparse] end-to-end matmul between Dense and BSR tensors (#71448 )	2023-11-08 11:28:00 -08:00
Tim Harvey	c43e627457	Changed the phrase sparse-compiler to sparsifier in comments (#71578 ) When the Powers That Be decided that the name "sparse compiler" should be changed to "sparsifier", we neglected to change some of the comments in the code; this pull request completes the name change.	2023-11-07 20:55:00 +00:00
Aart Bik	a4eadd7fb6	[mlir][sparse][gpu] add GPU BSR SDDMM check test (#71491 ) also minor edits in other GPU check tests	2023-11-06 22:36:25 -08:00
Christian Ulmann	7ed96b1c0d	[MLIR][LLVM] Remove last typed pointer remnants from tests (#71232 ) This commit removes all LLVM dialect typed pointers from the lit tests. Typed pointers have been deprecated for a while now and it's planned to soon remove them from the LLVM dialect. Related PSA: https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502	2023-11-04 14:13:31 +01:00
Peiming Liu	c0d78c4232	[mlir][sparse] Implement rewriters to reinterpret maps on alloc_tenso… (#70993 ) …r operation	2023-11-01 18:15:11 -07:00
Peiming Liu	3426d330a7	[mlir][sparse] Implement rewriters to reinterpret maps on foreach (#70868 )	2023-11-01 12:11:47 -07:00
Aart Bik	e599978760	[mlir][sparse] first proof-of-concept non-permutation rewriter (#70863 ) Rather than extending sparsifier codegen with higher order non-permutations, we follow the path of rewriting linalg generic ops into higher order operations. That way, code generation will simply work out of the box. This is a very first proof-of-concept rewriting of that idea.	2023-10-31 16:19:27 -07:00
Christian Ulmann	dcae289d3a	[MLIR][SparseTensor] Introduce opaque pointers in LLVM dialect lowering (#70570 ) This commit changes the SparseTensor LLVM dialect lowering from using `llvm.ptr<i8>` to `llvm.ptr`. This change ensures that the lowering now properly relies on opaque pointers, instead of working with already type erased i8 pointers.	2023-10-31 07:34:49 +01:00
Peiming Liu	ef100c228a	[mlir][sparse] implements tensor.insert on sparse tensors. (#70737 )	2023-10-30 16:04:41 -07:00
Peiming Liu	f82bee1367	[mlir][sparse] split post-sparsification-rewriting into two passes. (#70727 )	2023-10-30 15:22:21 -07:00
Peiming Liu	7d608ee2bb	[mlir][sparse] unify sparse_tensor.out rewriting rules (#70518 )	2023-10-27 16:46:58 -07:00
Aart Bik	7cfac1bedd	[mlir][sparse] add boilerplate code for a new reinterpret map pass (#70393 ) The interesting stuff is of course still coming ;-)	2023-10-26 17:57:46 -07:00
Peiming Liu	d808d922b4	[mlir][sparse] introduce sparse_tensor.reinterpret_map operation. (#70378 )	2023-10-26 15:04:09 -07:00
Aart Bik	ff94061a9f	[mlir][sparse] remove reshape dot test (#70359 ) This no longer tests a required feature.	2023-10-26 11:14:44 -07:00
Aart Bik	0cbaff815c	[mlir][sparse] cleanup conversion test (#70356 ) Various TODOs had been added that actually removed the actual test. This puts the CHECK tests back and removes the TODOs that have no immediate plans.	2023-10-26 10:48:29 -07:00
Aart Bik	7e83a1af5d	[mlir][sparse] add verification of absent value in sparse_tensor.unary (#70248 ) This value should always be a plain constant or something invariant computed outside the surrounding linalg operation, since there is no co-iteration defined on anything done in this branch. Fixes: https://github.com/llvm/llvm-project/issues/69395	2023-10-25 13:56:43 -07:00
Aart Bik	a12d057be9	[mlir][sparse] update block24 example (#70145 ) Removes TODO, shows how to define 8-bit crd (lacking 2-bit for now)	2023-10-25 08:29:31 -07:00
Peiming Liu	c780352de9	[mlir][sparse] implement sparse_tensor.lvl operation. (#69993 )	2023-10-24 13:23:28 -07:00
Oleksandr "Alex" Zinenko	e4384149b5	[mlir] use transform-interpreter in test passes (#70040 ) Update most test passes to use the transform-interpreter pass instead of the test-transform-dialect-interpreter-pass. The new "main" interpreter pass has a named entry point instead of looking up the top-level op with `PossibleTopLevelOpTrait`, which is arguably a more understandable interface. The change is mechanical, rewriting an unnamed sequence into a named one and wrapping the transform IR in to a module when necessary. Add an option to the transform-interpreter pass to target a tagged payload op instead of the root anchor op, which is also useful for repro generation. Only the test in the transform dialect proper and the examples have not been updated yet. These will be updated separately after a more careful consideration of testing coverage of the transform interpreter logic.	2023-10-24 16:12:34 +02:00
Peiming Liu	f0f5fdf73d	[mlir][sparse] introduce sparse_tensor.lvl operation. (#69978 )	2023-10-23 15:49:39 -07:00
Peiming Liu	ff21a90e51	[mlir][sparse] introduce sparse_tensor.crd_translate operation (#69630 )	2023-10-19 15:42:09 -07:00
Yinying Li	7b9fb1c228	[mlir][sparse] Update verifier for block sparsity and singleton (#69389 ) Updates: 1. Verification of block sparsity. 2. Verification that a singleton level type can only follow compressed or loose_compressed levels, and that all level types after a singleton are singleton. 3. Added getBlockSize function. 4. Added an invalid encoding test for an incorrect lvlToDim map that the user provides.	2023-10-19 12:34:18 -04:00
Yinying Li	d4088e7d5f	[mlir][sparse] Populate lvlToDim (#68937 ) Updates: 1. Infer lvlToDim from dimToLvl 2. Add more tests for block sparsity 3. Finish TODOs related to lvlToDim, including adding lvlToDim to python binding Verification of lvlToDim that user provides will be implemented in the next PR.	2023-10-17 16:09:39 -04:00
Peiming Liu	71c97c735c	[mlir][sparse] avoid tensor to memref conversion in sparse tensor rewri… (#69362 ) …ting rules.	2023-10-17 11:34:06 -07:00
Aart Bik	d392073f67	[mlir][sparse] simplify reader construction of new sparse tensor (#69036 ) Making the materialize-from-reader method part of the Swiss army knife suite again removes a lot of redundant boiler plate code and unifies the parameter setup into a single centralized utility. Furthermore, we now have minimized the number of entry points into the library that need a non-permutation map setup, simplifying what comes next	2023-10-16 10:25:37 -07:00
Aart Bik	2045cca0c3	[mlir][sparse] add a forwarding insertion to SparseTensorStorage (#68939 )	2023-10-12 21:03:07 -07:00
Peiming Liu	f248d0b28d	[mlir][sparse] implement sparse_tensor.reorder_coo (#68916 ) As a side effect of the change, it also unifies the convertOp implementation between lib/codegen path.	2023-10-12 13:22:45 -07:00
Peiming Liu	0aacc2137a	[mlir][sparse] introduce sparse_tensor.reorder_coo operation (#68827 )	2023-10-12 09:42:12 -07:00
Peiming Liu	325576196b	[mlir][sparse] remove tests (#68826 )	2023-10-11 11:23:25 -07:00
Peiming Liu	dda3dc5e38	[mlir][sparse] simplify ConvertOp rewriting rules (#68350 ) Canonicalize complex convertOp into multiple stages, such that it can either be done by a direct conversion or by sorting.	2023-10-11 09:34:11 -07:00
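
The singleton ordering rule described in commit 7b9fb1c228 above can be illustrated with a small standalone sketch. This is not the verifier code from that commit: `LevelKind` and `verifySingletonLevels` are hypothetical names introduced only for illustration, and the sketch assumes that a singleton level may also directly follow another singleton level, which is one reading of the two rules taken together.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the level types named in the commit message;
// this is not the real MLIR sparse tensor level-type representation.
enum class LevelKind { Dense, Compressed, LooseCompressed, Singleton };

// Returns true if the level sequence satisfies both rules:
//  1. a singleton level follows a compressed or loose_compressed level
//     (or, by assumption, another singleton level);
//  2. every level after a singleton level is itself a singleton.
bool verifySingletonLevels(const std::vector<LevelKind> &levels) {
  for (std::size_t i = 0; i < levels.size(); ++i) {
    const bool isSingleton = levels[i] == LevelKind::Singleton;
    if (i == 0) {
      // A singleton has nothing to follow at the outermost level.
      if (isSingleton)
        return false;
      continue;
    }
    const LevelKind prev = levels[i - 1];
    // Rule 2: once a singleton appears, only singletons may follow.
    if (prev == LevelKind::Singleton && !isSingleton)
      return false;
    // Rule 1: a singleton may not directly follow a dense level.
    if (isSingleton && prev == LevelKind::Dense)
      return false;
  }
  return true;
}
```

Under these assumptions, a COO-like sequence such as `{Compressed, Singleton}` passes, while `{Dense, Singleton}` and `{Compressed, Singleton, Dense}` are rejected.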

447 Commits