clang-p2996

Author	SHA1	Message	Date
Aart Bik	f388a3a446	[mlir][sparse] update doc and examples of the [dis]assemble operations (#88213 ) The doc and examples of the [dis]assemble operations did not reflect all the recent changes on order of the operands. Also clarified some of the text.	2024-04-10 09:42:12 -07:00
Matthias Springer	a27d886ce4	[mlir][linalg][bufferize] Fix element-wise access optimization for sparse tensors (#87305 ) `linalg.generic` ops with sparse tensors do not necessarily bufferize to element-wise access, because insertions into a sparse tensor may change the layout of (or reallocate) the underlying sparse data structures.	2024-04-03 09:57:25 +09:00
Aart Bik	3324f4d4f4	[mlir][sparse] avoid incompatible linalg fuse-into-consumer (#86752 ) This fixes an "infinite" loop bug, where the incoming IR was repeatedly rewritten while adding identical cast operations. The test for compatible types should include the notion of an encoding. If it differs, then a naive fusion into the consumer is invalid.	2024-03-26 17:16:03 -07:00
Aart Bik	2e6b18b3f3	[mlir][sparse] add example to new operation doc, and roundtrip test (#85711 )	2024-03-18 17:06:02 -07:00
Aart Bik	f3a8af07fa	[mlir][sparse] best effort finalization of escaping empty sparse tensors (#85482 ) This change lifts the restriction that purely allocated empty sparse tensors cannot escape the method. Instead it makes a best effort to add a finalizing operation before the escape. This assumes that (1) we never build sparse tensors across method boundaries (e.g. allocate in one, insert in other method) (2) if we have other uses of the empty allocation in the same method, we assume that either that op will fail or will do the finalization for us. This is best-effort, but fixes some very obvious missing cases.	2024-03-15 16:43:09 -07:00
Peiming Liu	94e27c265a	[mlir][sparse] reuse tensor.insert operation to insert elements into … (#84987 ) …a sparse tensor.	2024-03-12 16:59:17 -07:00
Peiming Liu	fc9f1d49aa	[mlir][sparse] use a consistent order between [dis]assembleOp and sto… (#84079 ) …rage layout.	2024-03-06 09:57:41 -08:00
Peiming Liu	52b69aa32f	[mlir][sparse] support sparsifying batch levels (#83898 )	2024-03-04 14:39:06 -08:00
Aart Bik	8dfa1d878d	[mlir][sparse] add roundtrip and invalid tests for sparse_tensor.print (#83349 )	2024-02-28 14:52:43 -08:00
Peiming Liu	0d1f95760b	[mlir][sparse] support type conversion from batched sparse tensors to… (#83163 ) … memrefs.	2024-02-27 12:05:28 -08:00
Peiming Liu	56d58295dd	[mlir][sparse] Introduce batch level format. (#83082 )	2024-02-26 16:08:28 -08:00
Peiming Liu	f40ee6e83f	[mlir][sparse] assemble SoA COO correctly. (#82449 )	2024-02-20 18:46:34 -08:00
Peiming Liu	f740366fa6	[mlir][sparse] support type conversion from SoA COO to memrefs. (#82398 )	2024-02-20 11:19:13 -08:00
Peiming Liu	11705afc19	[mlir][sparse] deallocate tmp coo buffer generated during stage-spars… (#82017 ) …e-ops pass.	2024-02-17 12:17:57 -08:00
Peiming Liu	088c7ce429	[mlir][sparse] introduce SoA level property on singleton level. (#81942 )	2024-02-15 16:41:10 -08:00
Aart Bik	4d273b948e	[mlir][sparse] ensure [dis]assembler wrapper methods properly inline (#81907 )	2024-02-15 11:39:32 -08:00
Yinying Li	2a6b521b36	[mlir][sparse] Add more tests and verification for n:m (#81186 ) 1. Add python test for n out of m 2. Add more methods for python binding 3. Add verification for n:m and invalid encoding tests 4. Add e2e test for n:m Previous PRs for n:m #80501 #79935	2024-02-09 14:34:36 -05:00
Yinying Li	e5924d6499	[mlir][sparse] Implement parsing n out of m (#79935 ) 1. Add parsing methods for block[n, m]. 2. Encode n and m with the newly extended 64-bit LevelType enum. 3. Update 2:4 methods names/comments to n:m.	2024-02-08 14:38:42 -05:00
Aart Bik	5a9af39aab	[mlir][sparse] made sparse vectorizer more robust on position of invariants (#80766 ) Because the sparse vectorizer relies on the code coming out of the sparsifier, the "patterns" are not always made very general. However, a recent change in the generated code revealed an obvious situation where the subscript analysis could be made a bit more robust. Fixes: https://github.com/llvm/llvm-project/issues/79897	2024-02-05 16:12:47 -08:00
Yinying Li	cd481fa827	[mlir][sparse] Change LevelType enum to 64 bit (#80501 ) 1. C++ enum is set through enum class LevelType : uint64_t. 2. C enum is set through typedef uint64_t level_type. It is due to the limitations in Windows build: setting enum width to ui64 is not supported in C.	2024-02-05 17:00:52 -05:00
Aart Bik	d00e6d07b1	[mlir][sparse] refine sparse assembler strategy (#80521 ) Rewrite all public methods, making original internal, private methods, and exposing wrappers under the original name. This works a bit better in practice (when combined with c-interface mechanism of torch-mlir for example).	2024-02-05 10:48:18 -08:00
Peiming Liu	4a653b4df5	[mlir][sparse] Support pretty print to debug sparse iteration. (#80207 )	2024-02-01 15:28:36 -08:00
Peiming Liu	07bf1ddb4e	[mlir][sparse] support non-id map for [Dis]assembleOp (#80355 )	2024-02-01 15:11:33 -08:00
Aart Bik	33b463ad99	[mlir][sparse] external entry method wrapper for sparse tensors (#80326 ) Similar to the emit_c_interface, this pull request adds a pass that converts public entry methods that use sparse tensors as input parameters and/or output return values into wrapper functions that [dis]assemble the individual tensors that constitute the actual storage used externally into MLIR sparse tensors. This pass can be used to prepare the public entry methods of a program that is compiled by the MLIR sparsifier to interface with an external runtime, e.g., when passing sparse tensors as numpy arrays from and to Python. Note that eventual bufferization decisions (e.g. who [de]allocates the underlying memory) should be resolved in agreement with the external runtime (Python, PyTorch, JAX, etc.)	2024-02-01 13:32:52 -08:00
Peiming Liu	298412b578	[mlir][sparse] setup `SparseIterator` to help generating code to traverse a sparse tensor level. (#78345 )	2024-01-24 11:33:06 -08:00
Oleksandr "Alex" Zinenko	2798b72ae7	[mlir] introduce debug transform dialect extension (#77595 ) Introduce a new extension for simple print-debugging of the transform dialect scripts. The initial version of this extension consists of two ops that are printing the payload objects associated with transform dialect values. Similar ops were already available in the test extension and several downstream projects, and were extensively used for testing.	2024-01-12 13:24:02 +01:00
Matthias Springer	bb6d5c2200	[mlir][Transforms] `GreedyPatternRewriteDriver`: Do not CSE constants during iterations (#75897 ) The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply rewrite patterns to ops. It has special handling for constants: they are CSE'd and sometimes moved to parent regions to allow for additional CSE'ing. This happens in `OperationFolder`. To allow for efficient CSE'ing, `OperationFolder` maintains an internal lookup data structure to find the existing constant ops with the same value for each `IsolatedFromAbove` region: ```c++ /// A mapping between an insertion region and the constants that have been /// created within it. DenseMap<Region *, ConstantMap> foldScopes; ``` Rewrite patterns are allowed to modify operations. In particular, they may move operations (including constants) from one region to another one. Such an IR rewrite can make the above lookup data structure inconsistent. We encountered such a bug in a downstream project. This bug materialized in the form of an op that uses the result of a constant op from a different `IsolatedFromAbove` region (that is not accessible). This commit changes the behavior of the `GreedyPatternRewriteDriver` such that `OperationFolder` is used to CSE constants at the beginning of each iteration (as the worklist is populated), but no longer during an iteration. `OperationFolder` is no longer used after populating the worklist, so we do not have to care about inconsistent state in the `OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver` now performs the op folding by itself instead of calling `OperationFolder::tryToFold`. This change changes the order of constant ops in test cases, but not the region in which they appear. All broken test cases were fixed by turning `CHECK` into `CHECK-DAG`. Alternatives considered: The state of `OperationFolder` could be partially invalidated with every `notifyOperationModified` notification. 
That is more fragile than the solution in this commit because incorrect rewriter API usage can lead to missing notifications and hard-to-debug `IsolatedFromAbove` violations. (It did not fix the above-mentioned bug in a downstream project, which could be due to incorrect rewriter API usage or due to another conceptual problem that I missed.) Moreover, ops are frequently getting modified during a greedy pattern rewrite, so we would likely keep invalidating large parts of the state of `OperationFolder` over and over. Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant ops are no longer folded during a greedy pattern rewrite. If you rely on folding (and rematerialization) of constant ops during a greedy pattern rewrite, turn the folder into a pattern.	2024-01-05 09:22:18 +01:00
Aart Bik	41a07e668c	[mlir][sparse] recognize NVidia 2:4 type for matmul (#76758 ) This removes the temporary DENSE24 attribute and replaces it with proper recognition of dense to 24 conversion. The compression will be performed on the device prior to performing the matrix mult. Note that we no longer need to start with the linalg version, we can lift this to the proper named linalg op. Also renames some files into more consistent names.	2024-01-02 14:44:24 -08:00
Matthias Springer	10056c821a	[mlir][SCF] `scf.parallel`: Make reductions part of the terminator (#75314 ) This commit makes reductions part of the terminator. Instead of `scf.yield`, `scf.reduce` now terminates the body of `scf.parallel` ops. `scf.reduce` may contain an arbitrary number of reductions, with one region per reduction. Example: ```mlir %init = arith.constant 0.0 : f32 %r:2 = scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init, %init) -> f32, f32 { %elem_to_reduce1 = load %buffer1[%iv] : memref<100xf32> %elem_to_reduce2 = load %buffer2[%iv] : memref<100xf32> scf.reduce(%elem_to_reduce1, %elem_to_reduce2 : f32, f32) { ^bb0(%lhs : f32, %rhs: f32): %res = arith.addf %lhs, %rhs : f32 scf.reduce.return %res : f32 }, { ^bb0(%lhs : f32, %rhs: f32): %res = arith.mulf %lhs, %rhs : f32 scf.reduce.return %res : f32 } } ``` `scf.reduce` operations can no longer be interleaved with other ops in the body of `scf.parallel`. This simplifies the op and makes it possible to assign the `RecursiveMemoryEffects` trait to `scf.reduce`. (This was not possible before because the op was not a terminator, causing the op to be DCE'd.)	2023-12-20 11:06:27 +09:00
Peiming Liu	6c06bde7c4	[mlir][sparse] support loop range query using SparseTensorLevel. (#75670 )	2023-12-15 16:33:31 -08:00
Yinying Li	31b72b0742	[mlir][sparse]Make isBlockSparsity more robust (#75113 ) 1. A single dimension can either be blocked (with floordiv and mod pair) or non-blocked. Mixing them would be invalid. 2. Block size should be non-zero value.	2023-12-12 13:43:03 -05:00
Aart Bik	d96f46dd20	[mlir][sparse] fix bug in custom reduction scalarization code (#74898 ) Bug found with BSR of "spy" SDDMM method	2023-12-11 10:22:17 -08:00
Peiming Liu	baa192ea65	[mlir][sparse] optimize memory loads to SSA values when generating sp… (#74787 ) …arse conv.	2023-12-08 09:22:19 -08:00
Peiming Liu	097d2f1417	[mlir][sparse] optimize memory load to SSA value when generating spar… (#74750 ) …se conv kernel.	2023-12-07 12:00:25 -08:00
Peiming Liu	b6cad75e07	[mlir][sparse] refactoring: using util functions to query the index to load from position array for slice-driven loop. (#73986 )	2023-11-30 16:40:11 -08:00
Peiming Liu	2cc4b3d07c	[mlir][sparse] code cleanup using the assumption that dim2lvl maps ar… (#72894 ) …e simplified.	2023-11-20 10:25:42 -08:00
Peiming Liu	573c4db947	[mlir][sparse] refine reinterpret_map test cases (#72684 )	2023-11-17 10:04:56 -08:00
Aart Bik	83cf0dc982	[mlir][sparse] implement direct IR alloc/empty/new for non-permutations (#72585 ) This change implements the correct level sizes set up for the direct IR codegen fields in the sparse storage scheme. This brings libgen and codegen together again. This is step 3 out of 3 to make sparse_tensor.new work for BSR	2023-11-16 17:17:41 -08:00
Yinying Li	c5a67e16b6	[mlir][sparse] Use variable instead of inlining sparse encoding (#72561 ) Example: #CSR = #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 : compressed), }> // CHECK: #[[$CSR.]] = #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 : compressed) }> // CHECK-LABEL: func private @sparse_csr( // CHECK-SAME: tensor<?x?xf32, #[[$CSR]]*>) func.func private @sparse_csr(tensor<?x?xf32, #CSR>)	2023-11-16 19:30:21 -05:00
Peiming Liu	06a65ce500	[mlir][sparse] schedule sparse kernels in a separate pass from sparsification. (#72423 )	2023-11-15 12:16:05 -08:00
Tim Harvey	dce7a7cf69	Changed all code and comments that used the phrase "sparse compiler" to instead use "sparsifier" (#71875 ) The changes in this p.r. mostly center around the tests that use the flag sparse_compiler (also: sparse-compiler).	2023-11-15 20:12:35 +00:00
Aart Bik	a40900211a	[mlir][sparse] set rwx permissions to consistent values (#72311 ) some files had "x" permission set, others were missing "r"	2023-11-14 13:32:55 -08:00
Aart Bik	5f32bcfbae	[mlir][sparse][gpu] re-enable all GPU libgen tests (#72185 ) Previous change no longer properly used the GPU libgen pass (even though most tests still passed falling back to CPU). This revision puts the proper pass order into place. Also bit of a cleanup of CPU codegen vs. libgen setup.	2023-11-14 09:06:15 -08:00
Peiming Liu	269685545e	[mlir][sparse] remove filter-loop based algorithm support to handle a… (#71840 ) …ffine subscript expressions.	2023-11-13 11:36:49 -08:00
Peiming Liu	c99951d491	[mlir][sparse] end-to-end matmul between Dense and BSR tensors (#71448 )	2023-11-08 11:28:00 -08:00
Tim Harvey	c43e627457	Changed the phrase sparse-compiler to sparsifier in comments (#71578 ) When the Powers That Be decided that the name "sparse compiler" should be changed to "sparsifier", we neglected to change some of the comments in the code; this pull request completes the name change.	2023-11-07 20:55:00 +00:00
Aart Bik	a4eadd7fb6	[mlir][sparse][gpu] add GPU BSR SDDMM check test (#71491 ) also minor edits in other GPU check tests	2023-11-06 22:36:25 -08:00
Christian Ulmann	7ed96b1c0d	[MLIR][LLVM] Remove last typed pointer remnants from tests (#71232 ) This commit removes all LLVM dialect typed pointers from the lit tests. Typed pointers have been deprecated for a while now and it's planned to soon remove them from the LLVM dialect. Related PSA: https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502	2023-11-04 14:13:31 +01:00
Peiming Liu	c0d78c4232	[mlir][sparse] Implement rewriters to reinterpret maps on alloc_tenso… (#70993 ) …r operation	2023-11-01 18:15:11 -07:00
Peiming Liu	3426d330a7	[mlir][sparse] Implement rewriters to reinterpret maps on foreach (#70868 )	2023-11-01 12:11:47 -07:00

471 Commits