clang-p2996

Author	SHA1	Message	Date
Kun Wu	d46bad7b55	[mlir][sparse][gpu] add the 2:4 spmm integration test from linalg Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D155351	2023-07-15 06:01:03 +00:00
Aart Bik	4df01dc270	[mlir][sparse][gpu][nvidia] add pruning step and check to 2:4 matrix multiplication (1) without the check, the results may silently be wrong, so check is needed (2) add pruning step to guarantee 2:4 property Note, in the longer run, we may want to split out the pruning step somehow, or make it optional. Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D155320	2023-07-14 12:08:13 -07:00
Aart Bik	f6f817d0d7	[mlir][sparse][gpu] minor improvements in 2:4 example Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D155244	2023-07-13 16:20:27 -07:00
Guray Ozen	22a32f7d9c	[mlir][gpu] Add dump-ptx option When targeting NVIDIA GPUs, seeing the generated PTX is important. Currently, we don't have a simple way to do it. This work adds dump-ptx to the gpu-to-cubin pass. One can use it like `gpu-to-cubin{chip=sm_90 features=+ptx80 dump-ptx}`. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155166	2023-07-13 21:14:57 +02:00
Peiming Liu	fc5d8fce7d	[mlir][sparse] support dual sparse convolution. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152601	2023-07-10 16:49:32 +00:00
Kun Wu	be2dd22b8f	[mlir][sparse][gpu] reuse CUDA environment handle throughout instance lifetime Differential Revision: https://reviews.llvm.org/D153173	2023-06-30 21:52:34 +00:00
Peiming Liu	a63d6a0014	[mlir][sparse] make UnpackOp return the actual filled length of unpacked memory This might simplify frontend implementation by avoiding recomputation for the same value. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D154244	2023-06-30 21:35:15 +00:00
Peiming Liu	e7df82816b	[mlir][sparse] rewrite arith::SelectOp to semiring operations to sparsify it. Reviewed By: aartbik, K-Wu Differential Revision: https://reviews.llvm.org/D153397	2023-06-21 21:22:18 +00:00
Aart Bik	cdbdf93bf0	[mlir][sparse][gpu] extend SDDMM gpu test Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D153378	2023-06-20 16:12:12 -07:00
Kun Wu	632ccc538c	[mlir][sparse][gpu] remove tuple as one of the spmm_buffer_size output type Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D153188	2023-06-19 15:57:50 +00:00
Kun Wu	9167dd46ba	[mlir][sparse][gpu] recognizing sddmm pattern in GPU libgen path Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151582	2023-06-15 23:48:11 +00:00
Kun Wu	b1c683f5c4	[mlir][sparse][gpu] enable sm80+ sparsity integration test only when explicitly set Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152966	2023-06-15 17:44:38 +00:00
Peiming Liu	faf7cd97d0	[mlir][sparse] merger extension to support sparsifying arith::CmpI/CmpF operation Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152761	2023-06-15 17:26:50 +00:00
Kun Wu	8f3fcbc687	[mlir][sparse][GPU] add 2:4 integration test Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152287	2023-06-13 02:10:26 +00:00
Aart Bik	80fe3168b5	[mlir][sparse] add support for direct prod/and/min/max reductions We recently fixed a bug in "sparsifying" such reductions, since it incorrectly changed this into reductions over stored elements only, which only works for add/sub/or/xor. However, we still want to be able to "sparsify" the reductions even in the general case, and this is a first step by rewriting them into a custom reduction that feeds in the implicit zeros. NOTE HOWEVER, that in the long run we want to do this better and feed in any implicit zero only ONCE for efficiency. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D152580	2023-06-12 09:27:47 -07:00
Aart Bik	e2167d89db	[mlir][sparse] refine absent branch feeding into custom op Document better that unary/binary may only feed to the output or the input of a custom reduction (not even a regular reduction since it may have "no value"!). Also fixes a bug when present branch is empty and feeds into custom reduction. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D152224	2023-06-06 09:57:15 -07:00
Peiming Liu	23dc96bbe4	[mlir][sparse] fix crashes when using custom reduce with unary operation. The test case is directly copied from https://reviews.llvm.org/D152179 authored by @aartbik Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152204	2023-06-05 23:41:26 +00:00
Peiming Liu	e7b4c93f5e	[mlir][sparse] fix crash when using sparse_tensor::UnaryOp and ReduceOp. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152048	2023-06-03 01:19:05 +00:00
Aart Bik	6a38c772d4	[mlir][sparse] fixed bug with unary op, dense output Note that by sparse compiler convention, dense output is zeroed out when not set, so complement results in zeros where elements were present. Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D152046	2023-06-02 18:15:33 -07:00
Peiming Liu	ce6f8c5afe	[mlir][sparse] fix various bugs to support sparse pooling Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151776	2023-06-02 17:34:47 +00:00
Aart Bik	378f1885e3	[mlir][sparse] enhance sparse reduction support Formerly, we accepted and/prod reductions as a standard reduction but these change the semantics after sparsification by not looking at implicit zeros. Therefore, we only accept standard reductions that are insensitive to implicit vs. explicit zeros, and leave the more complex reductions to the sparse_tensor.reduce custom reduction implementation. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D151929	2023-06-01 16:30:21 -07:00
Peiming Liu	54ac02dd16	[mlir][sparse] fix crashes when generating conv_2d_nchw_fchw with Compressed Dense Compressed Dense sparse encoding. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151773	2023-05-31 18:06:01 +00:00
wren romano	540d5e0ce6	[mlir][sparse] Updating STEA parser/printer to use the name "dimSlices" Depends On D151505 Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D151513	2023-05-30 15:50:07 -07:00
wren romano	76647fce13	[mlir][sparse] Combining `dimOrdering`+`higherOrdering` fields into `dimToLvl` This is a major step along the way towards the new STEA design. While a great deal of this patch is simple renaming, there are several significant changes as well. I've done my best to ensure that this patch retains the previous behavior and error-conditions, even though those are at odds with the eventual intended semantics of the `dimToLvl` mapping. Since the majority of the compiler does not yet support non-permutations, I've also added explicit assertions in places that previously had implicitly assumed it was dealing with permutations. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151505	2023-05-30 15:19:50 -07:00
Peiming Liu	db7f639b90	[mlir][sparse] fix a crash when generating sparse convolution with nchw input Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151744	2023-05-30 20:16:54 +00:00
Tobias Hieta	f9008e6366	[NFC][Py Reformat] Reformat python files in mlir subdir This is an ongoing series of commits that are reformatting our Python code. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Differential Revision: https://reviews.llvm.org/D150782	2023-05-26 08:05:40 +02:00
Aart Bik	22caafc9f3	[mlir][sparse][gpu] end to end test for matmul (1) minor bug fix in copy back [always nice to run stuff ;-)] (2) run with and without lib (even though some fall back to CPU) Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D151507	2023-05-25 16:10:22 -07:00
Peiming Liu	f7b8b005ff	[mlir][sparse] fix bugs when computing the memory size when lowering pack op. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151481	2023-05-25 19:19:52 +00:00
Aart Bik	76b7dca47d	[mlir][sparse][gpu] fixed typo in CUDA test Test was printing same result twice Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D151370	2023-05-24 18:00:23 -07:00
Peiming Liu	b2e6b73544	[mlir][sparse] extend unpack operation to unpack arbitrary encodings. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151174	2023-05-23 22:34:01 +00:00
Peiming Liu	de56088866	[mlir][sparse] Support packing external data into arbitrary sparse tensor encoding. We previously only supported packing two arrays (values and coordinates) into COO tensors. This patch allows packing inputs into arbitrary sparse tensor format. It also deletes the "implicit" data canonicalization performed inside the sparse compiler, but instead requires users to canonicalize the data before passing it to the sparse compiler. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D150916	2023-05-19 17:41:49 +00:00
wren romano	f56a7383f0	[mlir][sparse] Fixing sparse_reshape.mlir integration test (followup to D150822) For some reason, even though D150822 passed the buildbot, it failed to catch this test Reviewed By: anlunx Differential Revision: https://reviews.llvm.org/D150830	2023-05-17 16:56:47 -07:00
wren romano	7f5fb90bbb	[mlir][sparse] Fixing GPU tests (followup to D150330) The GPU tests weren't updated when rebasing D150330, so this patch fixes that. Reviewed By: anlunx Differential Revision: https://reviews.llvm.org/D150822	2023-05-17 15:29:54 -07:00
wren romano	a0615d020a	[mlir][sparse] Renaming the STEA field `dimLevelType` to `lvlTypes` This commit is part of the migration towards the new STEA syntax/design. In particular, this commit includes the following changes: * Renaming compiler-internal functions/methods: * `SparseTensorEncodingAttr::{getDimLevelType => getLvlTypes}` * `Merger::{getDimLevelType => getLvlType}` (for consistency) * `sparse_tensor::{getDimLevelType => buildLevelType}` (to help reduce confusion vs actual getter methods) * Renaming external facets to match: * the STEA parser and printer * the C and Python bindings * PyTACO However, the actual renaming of the `DimLevelType` itself (along with all the "dlt" names) will be handled in a separate commit. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D150330	2023-05-17 14:24:09 -07:00
Anlun Xu	6116ca67ab	[mlir][sparse] Add sparse rewriting rules for tensor::ReshapeOp Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D149564	2023-05-16 14:56:33 -07:00
Aart Bik	7c1fb94150	[mlir][sparse] change runners to c_runners Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D150628	2023-05-15 18:17:52 -07:00
Aart Bik	c820f9e6ae	[mlir][sparse][gpu] end-to-end integration test of GPU libgen approach Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D150172	2023-05-15 10:57:14 -07:00
Andrzej Warzynski	20bf8c403c	[mlir][SparseTensor][ArmSVE] Disable scalable vectorisation in a test The MLIR SVE integration tests are now enabled in the clang-aarch64-full-2stage buildbot under emulation (QEMU) and one of the sparse integration tests is failing [1]: * Integration/Dialect/SparseTensor/CPU/concatenate_dim_1.mlir That test is failing because we don't have a LIT substitution to replace: ``` ; RUN: mlir-cpu-runner <command> ``` with ``` ; RUN: <emulator> mlir-cpu-runner <command> ``` clang-aarch64-full-2stage does not support SVE natively and hence all SVE integration tests require emulation. Other SVE tests use `lli` (for which we do have the required substitution) and hence are not affected. This patch simplifies concatenate_dim_1.mlir to always use fixed-width vectorisation. We will re-enable scalable vectorisation once LIT substitutions for `mlir-cpu-runner` are updated. [1] https://lab.llvm.org/buildbot/#/builders/179/builds/6062	2023-05-02 21:14:38 +00:00
Cullen Rhodes	707b6e94b8	[mlir][SparseTensor][ArmSVE] Fix missing lli substitutions The MLIR SVE integration tests are now enabled in the clang-aarch64-full-2stage buildbot under emulation (QEMU) and two of the sparse integration tests are failing [1]: * mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sorted_coo.mlir * mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_spmm.mlir The reason for this is the SVE RUN lines use plain 'lli' rather than the '%lli_host_or_aarch64_cmd' substitution that's necessary to run under emulation. The CI doesn't support SVE so the tests will SIGILL unless run under emulation. I should note the logs don't show a SIGILL, only the non-descript: FileCheck error: '<stdin>' is empty. but I expect this is what's actually happening. https://lab.llvm.org/buildbot/#/builders/179/builds/6051/steps/12/logs/stdio	2023-05-02 14:43:48 +00:00
Peiming Liu	d4db528938	[mlir][sparse] extend unpack operation to support unpacking a batched COO type Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D149103	2023-05-01 18:17:29 +00:00
Cullen Rhodes	baafc74ab0	[mlir][test][Integration] Refactor Arm emulator configuration The logic enabling the Arm SVE (and now SME) integration tests for various dialects, that may run under emulation, is now duplicated in several places. This patch moves the configuration to the top-level MLIR integration tests Lit config and renames the '%lli' substitution in contexts where it will run exclusively (ArmSVE, ArmSME) on AArch64 (and possibly under emulation) to '%lli_aarch64_cmd', and '%lli_host_or_aarch64_cmd' for contexts where it may run AArch64 (also possibly under emulation). The latter is for integration tests that have target-specific and target-agnostic codepaths such as SparseTensor, which supports scalable vectors. The two substitutions have the same effect but the names are different to convey this information. The '%lli_aarch64_cmd' substitution could be used in the SparseTensor tests but that would be a misnomer if the host were x86 and the MLIR_RUN_SVE_TESTS=OFF. The reason for renaming the '%lli' substitution is to not prevent running other target-specific integration tests at the same time, since the same substitution '%lli' is used for lli in other integration tests: * mlir/test/Integration/Dialect/Vector/CPU/X86Vector - (AVX emulation via Intel SDE) * mlir/test/Integration/Dialect/Vector/CPU/AMX - (AMX emulation via Intel SDE) * mlir/test/Integration/Dialect/LLVMIR/CPU/test-vp-intrinsic.mlir - (RISCV emulation via QEMU if supported, native otherwise) and substituting '%lli' at the top-level with Arm specific logic would override this. Reviewed By: awarzynski Differential Revision: https://reviews.llvm.org/D148929	2023-04-26 09:57:43 +00:00
Cullen Rhodes	c8d1388e6c	[mlir][ArmSME] Add tests for Streaming SVE This patch adds a couple of tests for targeting Arm Streaming SVE (SSVE) mode, part of the Arm Scalable Matrix Extension (SME). SSVE is enabled in the backend at the function boundary by specifying the `aarch64_pstate_sm_enabled` attribute, as documented here [1]. SSVE can be targeted from MLIR by specifying this in the passthrough attributes [2] and compiling with -mattr=+sme,+sve -force-streaming-compatible-sve The passthrough will propagate to the backend where `smstart/smstop` will be emitted around the call to the SSVE function. The set of legal instructions changes in SSVE, `-force-streaming-compatible-sve` avoids the use of NEON entirely and instead lowers to (streaming-compatible) SVE. The behaviour this flag predicates will be hooked up to the function attribute in the future such that simply specifying this (should) lead to correct code-generation. Two tests are added: * A basic LLVMIR test verifying the attribute is passed through. * An integration test calling a SSVE function. The integration test can be run with QEMU. [1] https://llvm.org/docs/AArch64SME.html [2] https://mlir.llvm.org/docs/Dialects/LLVM/#attribute-pass-through Reviewed By: awarzynski, aartbik Differential Revision: https://reviews.llvm.org/D148111	2023-04-25 07:51:43 +00:00
Aart Bik	86888e420c	[mlir][sparse][gpu] generate proper memcpy in/out host and device The host registration is a convenient way to get CUDA kernels running, but it may be slow and does not work for all buffers (like global constants). This revision uses the proper alloc copy dealloc chains for buffers, using asynchronous chains to increase overlap. The host registration mechanism is kept under a flag for the output, just for experimentation purposes while this project ramps up. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D148682	2023-04-21 09:30:42 -07:00
Peiming Liu	fd2211d84a	use heap memory for position buffer allocated for PackOp. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148818	2023-04-20 20:26:01 +00:00
Peiming Liu	98f5a34097	[mlir][sparse] remove redundant integration tests. The removed tests evaluate the same kernels in existing tests, namely `sparse_conv2d.mlir` and `spares_conv3d.mlir`. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D148644	2023-04-18 18:30:43 +00:00
Peiming Liu	6a148c5aa7	[mlir][sparse] enable more sparse convolution kernels. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D147670	2023-04-17 17:43:52 +00:00
Peiming Liu	2cd15925f4	[mlir][sparse] implement index reduction on dense level (for CSR) Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D147550	2023-04-17 16:36:31 +00:00
Peiming Liu	5fd9d80135	[mlir][sparse] extend loop emitter to emit slice driven loops Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D142930	2023-04-13 03:29:40 +00:00
Aart Bik	bdea9b960d	[mlir][sparse][gpu] put sparse compiler GPU end-to-end tests back SM80 flag guards the test for targets that do not support A100 GPUs Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D147863	2023-04-11 15:37:13 -07:00
Mehdi Amini	10dbf23edc	Revert "[mlir][sparse][gpu] end-to-end example with sparse GPU pipeline" This reverts commit `bf94afa10e`. The bot is broken: https://lab.llvm.org/buildbot/#/builders/61/builds/42062	2023-04-06 19:11:27 -07:00

347 Commits