clang-p2996

Author	SHA1	Message	Date
Peiming Liu	fb8f492a1c	[mlir][sparse] clone an empty sparse tensor when fuse convert into producer. (#92158 )	2024-05-14 13:26:49 -07:00
Aart Bik	70e227a404	[mlir][sparse] recognize ReLu operation during sparsification (#92016 ) This is a proof of concept recognition of the most basic forms of ReLu operations, used to show-case sparsification of end-to-end PyTorch models. In the long run, we must avoid lowering such constructs too early (with this need for raising them back). See discussion at https://discourse.llvm.org/t/min-max-abs-relu-recognition-starter-project/78918	2024-05-13 14:02:29 -07:00
Peiming Liu	37ffbbb195	[mlir][tensor][sparse] don't drop encoding when infer result type (#91817 ) A general question is: is it possible to support hooks here to infer the encoding? E.g., when the extracted tensor slice is rank-reduced, the encoding needs to be updated accordingly as well.	2024-05-13 09:53:15 -07:00
Peiming Liu	13af97a70e	[mlir][sparse] allow multiple COO segments in sparse encodings. (#91786 ) NOTE: we still have implementation holes when handling multiple COO segments in the encoding. But the format should be considered to be legal.	2024-05-10 11:36:01 -07:00
Yinying Li	83f3b1cb48	[mlir][sparse] Add verification for explicit/implicit value (#90111 ) 1. Verify that the type of explicit/implicit values should be the same as the tensor element type. 2. Verify that implicit value could only be zero. 3. Verify that explicit/implicit values should be numeric. 4. Fix the type change issue caused by SparseTensorType(enc).	2024-05-07 20:28:39 -04:00
Aart Bik	5c5116556f	[mlir][sparse] force a properly sized view on pos/crd/val under codegen (#91288 ) Codegen "vectors" for pos/crd/val use the capacity as memref size, not the actual used size. Although the sparsifier itself always uses just the defined pos/crd/val parts, printing these and passing them back to a runtime environment could benefit from wrapping the basic pos/crd/val getters into a proper memref view that sets the right size.	2024-05-07 09:20:56 -07:00
Aart Bik	fc398a112d	[mlir][sparse] test optimization of binary-valued operations (#90986 ) Make sure consumer-producer fusion happens (to avoid the temporary dense tensor) and constant folding occurs in the generated code.	2024-05-03 10:41:16 -07:00
Peiming Liu	fc83eda46e	[mlir][sparse] make sparse compiler more admissible. (#90927 )	2024-05-02 18:53:38 -07:00
Yinying Li	e71eacc5b1	[mlir][sparse] Support explicit/implicit value for complex type (#90771 )	2024-05-02 12:28:34 -04:00
Peiming Liu	78885395c8	[mlir][sparse] support tensor.pad on CSR tensors (#90687 )	2024-05-01 15:37:38 -07:00
Gaurav Shukla	97069a8619	[MLIR] Generalize expand_shape to take shape as explicit input (#90040 ) This patch generalizes tensor.expand_shape and memref.expand_shape to consume the output shape as a list of SSA values. This enables us to implement generic reshape operations with dynamic shapes using collapse_shape/expand_shape pairs. The output_shape input to expand_shape follows the static/dynamic representation that's also used in `tensor.extract_slice`. Differential Revision: https://reviews.llvm.org/D140821 Signed-off-by: Gaurav Shukla <gaurav.shukla@amd.com> Co-authored-by: Ramiro Leal-Cavazos <ramiroleal050@gmail.com>	2024-04-30 09:28:35 -07:00
Aart Bik	65ee8f10b2	[mlir][sparse] fold explicit value during sparsification (#90530 ) This ensures the explicit value is generated (and not a load into the values array). Note that actually not storing values array at all is still TBD, this is just the very first step.	2024-04-29 18:06:07 -07:00
Peiming Liu	3aeb28b93f	[mlir][sparse] fold sparse convert into producer linalg op. (#89999 )	2024-04-26 10:48:15 -07:00
Yinying Li	a10d67f9fb	[mlir][sparse] Enable explicit and implicit value in sparse encoding (#88975 ) 1. Explicit value means the non-zero value in a sparse tensor. If explicitVal is set, then all the non-zero values in the tensor have the same explicit value. The default value Attribute() indicates that it is not set. 2. Implicit value means the "zero" value in a sparse tensor. If implicitVal is set, then the "zero" value in the tensor is equal to the implicit value. For now, we only support `0` as the implicit value but it could be extended in the future. The default value Attribute() indicates that the implicit value is `0` (same type as the tensor element type). Example: ``` #CSR = #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 : compressed), posWidth = 64, crdWidth = 64, explicitVal = 1 : i64, implicitVal = 0 : i64 }> ``` Note: this PR tests that implicitVal could be set to other values as well. The following PR will add verifier and reject any value that's not zero for implicitVal.	2024-04-24 16:20:25 -07:00
Peiming Liu	ea3eeb483f	[mlir][sparse] fuse concat and extract_slice op if possible. (#89825 )	2024-04-24 13:51:41 -07:00
Mehdi Amini	8c0341df02	Revert "[MLIR] Generalize expand_shape to take shape as explicit input" (#89540 ) Reverts llvm/llvm-project#69267 this broke some bots.	2024-04-21 14:33:48 +02:00
Gaurav Shukla	e095d978ba	[MLIR] Generalize expand_shape to take shape as explicit input (#69267 ) This patch generalizes tensor.expand_shape and memref.expand_shape to consume the output shape as a list of SSA values. This enables us to implement generic reshape operations with dynamic shapes using collapse_shape/expand_shape pairs. The output_shape input to expand_shape follows the static/dynamic representation that's also used in `tensor.extract_slice`. Differential Revision: https://reviews.llvm.org/D140821 Co-authored-by: Ramiro Leal-Cavazos <ramiroleal050@gmail.com>	2024-04-21 07:37:02 -04:00
Peiming Liu	481bd5d416	[mlir][sparse] introduce `sparse_tensor.extract_iteration_space` operation. (#88554 ) A `sparse_tensor.extract_space %tensor at %iterator` extracts a sparse iteration space defined by `%tensor`; the operation to traverse the iteration space will be introduced in following PRs.	2024-04-16 11:32:30 -07:00
Peiming Liu	b9556532c7	Revert "[mlir][sparse] introduce sparse_tensor.iterate operation" (#88953 ) Reverts llvm/llvm-project#88807 (merged by mistake)	2024-04-16 11:31:33 -07:00
Peiming Liu	8debcf03c5	[mlir][sparse] introduce sparse_tensor.iterate operation (#88807 ) A `sparse_tensor.iterate` iterates over a sparse iteration space extracted from `sparse_tensor.extract_iteration_space` operation introduced in https://github.com/llvm/llvm-project/pull/88554. DO NOT MERGE before https://github.com/llvm/llvm-project/pull/88554	2024-04-16 11:31:09 -07:00
Kunwar Grover	6f1e23b47d	[MLIR][Bufferization] Choose default memory space in tensor copy insertion (#88500 ) Tensor copy insertion currently uses memory_space = 0 when creating a tensor copy using alloc_tensor. This memory space should instead be the default memory space provided in bufferization options.	2024-04-12 17:56:46 +02:00
Aart Bik	5122a2c232	[mlir][sparse] allow for direct-out passing of sparse tensor buffers (#88327 ) In order to support various external frameworks (JAX vs PyTorch) we need a bit more flexibility in [dis]assembling external buffers to and from sparse tensors in MLIR land. This PR adds a direct-out option that avoids the rigid pre-allocated for copy-out semantics. Note that over time, we expect the [dis]assemble operations to converge into something that supports all sorts of external frameworks. Until then, this option helps in experimenting with different options.	2024-04-11 10:07:24 -07:00
Aart Bik	f388a3a446	[mlir][sparse] update doc and examples of the [dis]assemble operations (#88213 ) The doc and examples of the [dis]assemble operations did not reflect all the recent changes on order of the operands. Also clarified some of the text.	2024-04-10 09:42:12 -07:00
Matthias Springer	a27d886ce4	[mlir][linalg][bufferize] Fix element-wise access optimization for sparse tensors (#87305 ) `linalg.generic` ops with sparse tensors do not necessarily bufferize to element-wise access, because insertions into a sparse tensor may change the layout of (or reallocate) the underlying sparse data structures.	2024-04-03 09:57:25 +09:00
Aart Bik	3324f4d4f4	[mlir][sparse] avoid incompatible linalg fuse-into-consumer (#86752 ) This fixes an "infinite" loop bug, where the incoming IR was repeatedly rewritten while adding identical cast operations. The test for compatible types should include the notion of an encoding. If it differs, then a naive fusion into the consumer is invalid.	2024-03-26 17:16:03 -07:00
Aart Bik	2e6b18b3f3	[mlir][sparse] add example to new operation doc, and roundtrip test (#85711 )	2024-03-18 17:06:02 -07:00
Aart Bik	f3a8af07fa	[mlir][sparse] best effort finalization of escaping empty sparse tensors (#85482 ) This change lifts the restriction that purely allocated empty sparse tensors cannot escape the method. Instead it makes a best effort to add a finalizing operation before the escape. This assumes that (1) we never build sparse tensors across method boundaries (e.g. allocate in one, insert in other method) (2) if we have other uses of the empty allocation in the same method, we assume that either that op will fail or will do the finalization for us. This is best-effort, but fixes some very obvious missing cases.	2024-03-15 16:43:09 -07:00
Peiming Liu	94e27c265a	[mlir][sparse] reuse tensor.insert operation to insert elements into a sparse tensor. (#84987 )	2024-03-12 16:59:17 -07:00
Peiming Liu	fc9f1d49aa	[mlir][sparse] use a consistent order between [dis]assembleOp and storage layout. (#84079 )	2024-03-06 09:57:41 -08:00
Peiming Liu	52b69aa32f	[mlir][sparse] support sparsifying batch levels (#83898 )	2024-03-04 14:39:06 -08:00
Aart Bik	8dfa1d878d	[mlir][sparse] add roundtrip and invalid tests for sparse_tensor.print (#83349 )	2024-02-28 14:52:43 -08:00
Peiming Liu	0d1f95760b	[mlir][sparse] support type conversion from batched sparse tensors to memrefs. (#83163 )	2024-02-27 12:05:28 -08:00
Peiming Liu	56d58295dd	[mlir][sparse] Introduce batch level format. (#83082 )	2024-02-26 16:08:28 -08:00
Peiming Liu	f40ee6e83f	[mlir][sparse] assemble SoA COO correctly. (#82449 )	2024-02-20 18:46:34 -08:00
Peiming Liu	f740366fa6	[mlir][sparse] support type conversion from SoA COO to memrefs. (#82398 )	2024-02-20 11:19:13 -08:00
Peiming Liu	11705afc19	[mlir][sparse] deallocate tmp coo buffer generated during stage-sparse-ops pass. (#82017 )	2024-02-17 12:17:57 -08:00
Peiming Liu	088c7ce429	[mlir][sparse] introduce SoA level property on singleton level. (#81942 )	2024-02-15 16:41:10 -08:00
Aart Bik	4d273b948e	[mlir][sparse] ensure [dis]assembler wrapper methods properly inline (#81907 )	2024-02-15 11:39:32 -08:00
Yinying Li	2a6b521b36	[mlir][sparse] Add more tests and verification for n:m (#81186 ) 1. Add python test for n out of m 2. Add more methods for python binding 3. Add verification for n:m and invalid encoding tests 4. Add e2e test for n:m Previous PRs for n:m #80501 #79935	2024-02-09 14:34:36 -05:00
Yinying Li	e5924d6499	[mlir][sparse] Implement parsing n out of m (#79935 ) 1. Add parsing methods for block[n, m]. 2. Encode n and m with the newly extended 64-bit LevelType enum. 3. Update 2:4 methods names/comments to n:m.	2024-02-08 14:38:42 -05:00
Aart Bik	5a9af39aab	[mlir][sparse] made sparse vectorizer more robust on position of invariants (#80766 ) Because the sparse vectorizer relies on the code coming out of the sparsifier, the "patterns" are not always made very general. However, a recent change in the generated code revealed an obvious situation where the subscript analysis could be made a bit more robust. Fixes: https://github.com/llvm/llvm-project/issues/79897	2024-02-05 16:12:47 -08:00
Yinying Li	cd481fa827	[mlir][sparse] Change LevelType enum to 64 bit (#80501 ) 1. C++ enum is set through enum class LevelType : uint64_t. 2. C enum is set through typedef uint64_t level_type. It is due to the limitations in Windows build: setting enum width to ui64 is not supported in C.	2024-02-05 17:00:52 -05:00
Aart Bik	d00e6d07b1	[mlir][sparse] refine sparse assembler strategy (#80521 ) Rewrite all public methods, making original internal, private methods, and exposing wrappers under the original name. This works a bit better in practice (when combined with c-interface mechanism of torch-mlir for example).	2024-02-05 10:48:18 -08:00
Peiming Liu	4a653b4df5	[mlir][sparse] Support pretty print to debug sparse iteration. (#80207 )	2024-02-01 15:28:36 -08:00
Peiming Liu	07bf1ddb4e	[mlir][sparse] support non-id map for [Dis]assembleOp (#80355 )	2024-02-01 15:11:33 -08:00
Aart Bik	33b463ad99	[mlir][sparse] external entry method wrapper for sparse tensors (#80326 ) Similar to the emit_c_interface, this pull request adds a pass that converts public entry methods that use sparse tensors as input parameters and/or output return values into wrapper functions that [dis]assemble the individual tensors that constitute the actual storage used externally into MLIR sparse tensors. This pass can be used to prepare the public entry methods of a program that is compiled by the MLIR sparsifier to interface with an external runtime, e.g., when passing sparse tensors as numpy arrays from and to Python. Note that eventual bufferization decisions (e.g. who [de]allocates the underlying memory) should be resolved in agreement with the external runtime (Python, PyTorch, JAX, etc.)	2024-02-01 13:32:52 -08:00
Peiming Liu	298412b578	[mlir][sparse] setup `SparseIterator` to help generating code to traverse a sparse tensor level. (#78345 )	2024-01-24 11:33:06 -08:00
Oleksandr "Alex" Zinenko	2798b72ae7	[mlir] introduce debug transform dialect extension (#77595 ) Introduce a new extension for simple print-debugging of the transform dialect scripts. The initial version of this extension consists of two ops that are printing the payload objects associated with transform dialect values. Similar ops were already available in the test extension and several downstream projects, and were extensively used for testing.	2024-01-12 13:24:02 +01:00
Matthias Springer	bb6d5c2200	[mlir][Transforms] `GreedyPatternRewriteDriver`: Do not CSE constants during iterations (#75897 ) The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply rewrite patterns to ops. It has special handling for constants: they are CSE'd and sometimes moved to parent regions to allow for additional CSE'ing. This happens in `OperationFolder`. To allow for efficient CSE'ing, `OperationFolder` maintains an internal lookup data structure to find the existing constant ops with the same value for each `IsolatedFromAbove` region: ```c++ /// A mapping between an insertion region and the constants that have been /// created within it. DenseMap<Region *, ConstantMap> foldScopes; ``` Rewrite patterns are allowed to modify operations. In particular, they may move operations (including constants) from one region to another one. Such an IR rewrite can make the above lookup data structure inconsistent. We encountered such a bug in a downstream project. This bug materialized in the form of an op that uses the result of a constant op from a different `IsolatedFromAbove` region (that is not accessible). This commit changes the behavior of the `GreedyPatternRewriteDriver` such that `OperationFolder` is used to CSE constants at the beginning of each iteration (as the worklist is populated), but no longer during an iteration. `OperationFolder` is no longer used after populating the worklist, so we do not have to care about inconsistent state in the `OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver` now performs the op folding by itself instead of calling `OperationFolder::tryToFold`. This change changes the order of constant ops in test cases, but not the region in which they appear. All broken test cases were fixed by turning `CHECK` into `CHECK-DAG`. Alternatives considered: The state of `OperationFolder` could be partially invalidated with every `notifyOperationModified` notification. That is more fragile than the solution in this commit because incorrect rewriter API usage can lead to missing notifications and hard-to-debug `IsolatedFromAbove` violations. (It did not fix the above-mentioned bug in a downstream project, which could be due to incorrect rewriter API usage or due to another conceptual problem that I missed.) Moreover, ops are frequently getting modified during a greedy pattern rewrite, so we would likely keep invalidating large parts of the state of `OperationFolder` over and over. Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant ops are no longer folded during a greedy pattern rewrite. If you rely on folding (and rematerialization) of constant ops during a greedy pattern rewrite, turn the folder into a pattern.	2024-01-05 09:22:18 +01:00
Aart Bik	41a07e668c	[mlir][sparse] recognize NVidia 2:4 type for matmul (#76758 ) This removes the temporary DENSE24 attribute and replaces it with proper recognition of dense to 24 conversion. The compression will be performed on the device prior to performing the matrix mult. Note that we no longer need to start with the linalg version, we can lift this to the proper named linalg op. Also renames some files into more consistent names.	2024-01-02 14:44:24 -08:00
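
Several entries above refer to the pos/crd/val buffers of a sparse tensor, to buffers whose capacity exceeds the size actually used (#91288), and to an encoding whose explicit value makes every stored non-zero identical (#88975). The following is only a rough plain-Python sketch of those ideas for a CSR-like layout. It is not MLIR code and does not use any MLIR API; the function names `to_csr`, `from_csr`, and `used_view` are hypothetical and chosen purely for illustration.

```python
def to_csr(dense):
    """Build CSR-style pos/crd/val lists from a dense matrix (list of rows)."""
    pos, crd, val = [0], [], []
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                crd.append(j)   # column coordinate of the stored entry
                val.append(x)   # stored value
        pos.append(len(crd))    # end position of this row's entries
    return pos, crd, val


def from_csr(pos, crd, val, ncols, explicit_val=None):
    """Rebuild the dense matrix.

    If explicit_val is given, every stored entry is assumed to have that
    value, so the val buffer is not read at all (it may be None).
    """
    dense = []
    for i in range(len(pos) - 1):
        row = [0] * ncols
        for k in range(pos[i], pos[i + 1]):
            row[crd[k]] = explicit_val if explicit_val is not None else val[k]
        dense.append(row)
    return dense


def used_view(buffer, used):
    """Trim a buffer allocated with spare capacity to its used prefix."""
    return buffer[:used]
```

In this sketch, `pos[-1]` is the number of stored entries, so `used_view(crd, pos[-1])` yields a properly sized coordinate buffer even when `crd` was over-allocated.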


493 Commits
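
The n:m entries in the log (#79935, #81186, and the 2:4 case in #76758) concern structured sparsity, where each block of m consecutive elements holds at most n non-zeros. As a hedged illustration of that property only, and not of the verification the commits actually add, a check over a single row could look like the sketch below. The helper name `is_n_out_of_m` is hypothetical.

```python
def is_n_out_of_m(row, n, m):
    """Return True if every block of m consecutive entries has at most n non-zeros.

    Assumption for this sketch: a row whose length is not a multiple of m
    is rejected rather than padded.
    """
    if m <= 0 or len(row) % m != 0:
        return False
    for start in range(0, len(row), m):
        block = row[start:start + m]
        nonzeros = sum(1 for x in block if x != 0)
        if nonzeros > n:
            return False
    return True
```

For the 2:4 case, a row such as `[1, 0, 2, 0, 0, 0, 3, 4]` satisfies the property, while `[1, 2, 3, 0]` does not because its single block holds three non-zeros.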