clang-p2996

Author	SHA1	Message	Date
Twice	b91d5af1ac	[MLIR][Vector] Allow any strided memref for one-element vector.load in lowering vector.gather (#122437 ) In `Gather1DToConditionalLoads`, currently we will check if the stride of the most minor dim of the input memref is 1. And if not, the rewriting pattern will not be applied. However, according to the verification of `vector.load` here: `4e32271e8b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp (L4971-L4975)` .. if the output vector type of `vector.load` contains only one element, we can ignore the requirement of the stride of the input memref, i.e. the input memref can be with any stride layout attribute in such case. So here we can allow more cases in lowering `vector.gather` by relaxing such check. As shown in the test case attached in this patch [here](`1933fbad58/mlir/test/Dialect/Vector/vector-gather-lowering.mlir (L151)`), now `vector.gather` of memref with non-trivial stride can be lowered successfully if the result vector contains only one element. --------- Signed-off-by: PragmaTwice <twice@apache.org> Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>	2025-01-12 16:02:41 +00:00
Andrzej Warzyński	21ba7aef3b	[mlir][vector][nfc] Update `alignedConversionPrecondition` (#122136 ) Adds some comments and re-name variables to clarify the usage.	2025-01-09 15:14:34 +00:00
Matthias Springer	3ace685105	[mlir][Transforms] Support 1:N mappings in `ConversionValueMapping` (#116524 ) This commit updates the internal `ConversionValueMapping` data structure in the dialect conversion driver to support 1:N replacements. This is the last major commit for adding 1:N support to the dialect conversion driver. Since #116470, the infrastructure already supports 1:N replacements. But the `ConversionValueMapping` still stored 1:1 value mappings. To that end, the driver inserted temporary argument materializations (converting N SSA values into 1 value). This is no longer the case. Argument materializations are now entirely gone. (They will be deleted from the type converter after some time, when we delete the old 1:N dialect conversion driver.) Note for LLVM integration: Replace all occurrences of `addArgumentMaterialization` (except for 1:N dialect conversion passes) with `addSourceMaterialization`. --------- Co-authored-by: Markus Böck <markus.boeck02@gmail.com>	2025-01-03 16:11:56 +01:00
Jacques Pienaar	09dfc5713d	[mlir] Enable decoupling two kinds of greedy behavior. (#104649 ) The greedy rewriter is used in many different flows and it has a lot of convenience (work list management, debugging actions, tracing, etc). But it combines two kinds of greedy behavior 1) how ops are matched, 2) folding wherever it can. These are independent forms of greedy and leads to inefficiency. E.g., cases where one need to create different phases in lowering and is required to applying patterns in specific order split across different passes. Using the driver one ends up needlessly retrying folding/having multiple rounds of folding attempts, where one final run would have sufficed. Of course folks can locally avoid this behavior by just building their own, but this is also a common requested feature that folks keep on working around locally in suboptimal ways. For downstream users, there should be no behavioral change. Updating from the deprecated should just be a find and replace (e.g., `find ./ -type f -exec sed -i 's\|applyPatternsAndFoldGreedily\|applyPatternsGreedily\|g' {} \;` variety) as the API arguments hasn't changed between the two.	2024-12-20 08:15:48 -08:00
Kazu Hirata	6e41483b84	[MemRef] Migrate away from PointerUnion::{is,get} (NFC) (#120382 ) Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.	2024-12-18 10:56:27 -08:00
Petr Kurapov	bc29fc937c	[MLIR] Create GPU utils library & move distribution utils (#119264 ) Continue the move of `warp_execute_on_lane_0` op to the gpu dialect (#116994). This patch creates a utils library in GPU and moves generic helper functions there.	2024-12-13 10:26:57 +01:00
lialan	1669ac434c	[MLIR] Refactor mask compression logic when emulating `vector.maskedload` ops (#116520 ) This patch simplifies and extends the logic used when compressing masks emitted by `vector.constant_mask` to support extracting 1-D vectors from multi-dimensional vector loads. It streamlines mask computation, making it applicable for multi-dimensional mask generation, improving the overall handling of masked load operations.	2024-11-27 13:22:13 -08:00
Petr Kurapov	ecaf2c335c	[MLIR] Move warp_execute_on_lane_0 from vector to gpu (#116994 ) Please see the related RFC here: https://discourse.llvm.org/t/rfc-move-execute-on-lane-0-from-vector-to-gpu-dialect/82989. This patch does exactly one thing - moves the op to gpu.	2024-11-22 15:30:47 +01:00
lialan	f981ee7efc	[MLIR] extend `getCompressedMaskOp` support in `VectorEmulateNarrowType` (#116122 ) Previously when `numFrontPadElems` is not zero, `getCompressedMaskOp` produces wrong result if the mask generator op is a `vector.create_mask`. This patch resolves the issue by including `numFrontPadElems` into the mask generation. Signed-off-by: Alan Li <me@alanli.org>	2024-11-19 16:49:05 -08:00
lialan	6626ed6f9f	[MLIR] Fix `BubbleDownVectorBitCastForExtract` crash on non-static index (#116518 ) Previously the patch was not expecting to handle non-static index, when the index is a non constant value it will crash. This patch is to make sure it return gracefully instead of crashing.	2024-11-18 17:25:12 -08:00
Kunwar Grover	2f925d75de	[mlir][Vector] Move insert/extractelement distribution patterns to insert/extract (#116425 ) This is a NFC-ish change that moves vector.extractelement/vector.insertelement vector distribution patterns to vector.insert/vector.extract. Before: 0-d/1-d vector.extract -> vector.extractelement -> distributed vector.extractelement 2-d+ vector.extract -> distributed vector.extract After: scalar input vector.extract -> distributed vector.extract vector.extractelement -> distributed vector.extract 2d+ vector.extract -> distributed vector.extract The same changes are done for insertelement/insert. The change allows us to remove reliance on vector.extractelement/vector.insertelement, which are soon to be depreciated: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops/71116/8 No extra tests are included because this patch doesn't introduce / remove any functionality. It only changes the chain of lowerings. This change can be completly NFC if we make the distributed operation vector.extractelement/vector.insertelement, but that is slightly weird, because you are going from extractelement -> extract -> extractelement.	2024-11-18 10:59:49 +00:00
lialan	ef92aba52a	[MLIR] Fix VectorEmulateNarrowType constant op mask bug (#116064 ) This commit adds support for handling mask constants generated by the `arith.constant` op in the `VectorEmulateNarrowType` pattern. Previously, this pattern would not match due to the lack of mask constant handling in `getCompressedMaskOp`. The changes include: 1. Updating `getCompressedMaskOp` to recognize and handle `arith.constant` ops as mask value sources. 2. Handling cases where the mask is not aligned with the emulated load width. The compressed mask is adjusted to account for the offset. Limitations: - The arith.constant op can only have 1-dimensional constant values. Resolves: #115742 Signed-off-by: Alan Li <me@alanli.org>	2024-11-15 10:06:40 -08:00
Andrzej Warzyński	7a31f3c761	[mlir][vector][nfc] Improve comments in `getCompressedMaskOp` (#115663 )	2024-11-13 17:08:42 +00:00
Kunwar Grover	8e66303916	[mlir][Vector] Remove trivial uses of vector.extractelement/vector.insertelement (1/N) (#116053 ) This patch removes trivial usages of vector.extractelement/vector.insertelement. These operations can be fully represented by vector.extract/vector.insert. See https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops/71116 for more information. Further patches will remove more usages of these ops.	2024-11-13 15:45:59 +00:00
Matthias Springer	804d3c4ce1	[mlir][IR] Add `Block::isReachable` helper function (#114928 ) Add a new helper function `isReachable` to `Block`. This function traverses all successors of a block to determine if another block is reachable from the current block. This functionality has been reimplemented in multiple places in MLIR. Possibly additional copies in downstream projects. Therefore, moving it to a common place.	2024-11-13 14:58:09 +09:00
lialan	24a8092be7	[MLIR] Avoid `vector.extract_strided_slice` when not needed (#115941 ) In `staticallyExtractSubvector`, When the extracting slice is the same as source vector, do not need to emit `vector.extract_strided_slice`. This fixes the lit test case `@vector_store_i4` in `mlir\test\Dialect\Vector\vector-emulate-narrow-type.mlir`, where converting from `vector<8xi4>` to `vector<4xi8>` does not need slice extraction. The issue was introduced in #113411 and #115070, CI failure link: https://buildkite.com/llvm-project/github-pull-requests/builds/118845 This PR does not include a lit test case because it is a fix and the above mentioned `@vector_store_i4` test actually tests the mechanism. Signed-off-by: Alan Li <me@alanli.org>	2024-11-12 13:58:58 -08:00
Andrzej Warzyński	e458434ebe	[mlir][vector] Restrict narrow-type-emulation patterns (#115612 ) All patterns in populateVectorNarrowTypeEmulationPatterns currently assume a 1-D vector load/store rather than an n-D vector load/store. This assumption is evident in ConvertVectorTransferRead, for example, here (extracted from `ConvertVectorTransferRead`): ```cpp auto newRead = rewriter.create<vector::TransferReadOp>( loc, VectorType::get(numElements, newElementType), adaptor.getSource(), getValueOrCreateConstantIndexOp(rewriter, loc, linearizedIndices), newPadding); auto bitCast = rewriter.create<vector::BitCastOp>( loc, VectorType::get(numElements * scale, oldElementType), newRead); ``` Both invocations of `VectorType::get()` here generate a 1-D vector. Attempts to use these patterns with more generic cases, such as 2-D vectors, fail. For example, trying to cast the following 2-D case to `i32`: ```mlir func.func @vector_maskedload_2d_i8_negative( %idx1: index, %idx2: index, %num_elems: index, %passthru: vector<2x4xi8>) -> vector<2x4xi8> { %0 = memref.alloc() : memref<3x4xi8> %mask = vector.create_mask %num_elems, %num_elems : vector<2x4xi1> %1 = vector.maskedload %0[%idx1, %idx2], %mask, %passthru : memref<3x4xi8>, vector<2x4xi1>, vector<2x4xi8> into vector<2x4xi8> return %1 : vector<2x4xi8> } ``` For example, casting to i32 produces: ```bash error: 'vector.bitcast' op failed to verify that all of {source, result} have same rank %1 = vector.maskedload %0[%idx1, %idx2], %mask, %passthru : ^ ``` Instead of reworking these patterns (that's going to require much more effort), I’ve marked them as 1-D only and extended "TestEmulateNarrowTypePass" with an option to disable the Memref type converter - that's to be able to add negative tests (otherwise, the type converter throws an error we can't really test for). While not ideal, this workaround should suit a test pass.	2024-11-12 19:08:54 +00:00
lialan	c3c3ccc364	[MLIR] support dynamic indexing of `vector.maskedload` in `VectorEmulateNarrowTypes` (#115070 ) Based on existing emulating scheme, this patch expands to support dynamic indexing by dynamically create intermediate new mask, new pass thru vector and dynamically insert the result into destination vector. the dynamic parts are constructed by multiple `vector.extract` and `vector.insert` to rearrange the original mask/passthru vector, as `vector.insert_strided_slice` and `vector.extract_strided_slice` only take static offsets and indices. Note: currently only supporting `vector.maskedload` with masks created by `vector.constant_mask`. `vector.create_mask` is currently not working. --------- Co-authored-by: hasekawa-takumi <167335845+hasekawa-takumi@users.noreply.github.com>	2024-11-12 09:22:16 -08:00
Andrzej Warzyński	6fe7ad8be3	[mlir][vector][nfc] Add tests + update docs for narrow-type emulation (#115460 ) The documentation for narrow-type emulation was sparse, so I’ve expanded it with additional clarifications (e.g., specifying that the example discusses `i4` -> `i8` emulation). I also noticed some inconsistencies in testing for narrow-type emulation, with several cases covered only for "loading" and missing for "storing." To address this, I’ve: * Added comments in the test file for easier reference, * Added the missing tests for `vector.maskedstore`. Additionally, I’ve renamed tests for `vector.masked{load\|store}` for clarity: * `@vector_cst_maskedload_i8` -> `@vector_maskedload_i8_constant_mask`. This makes it easier to contrast with similar functions, such as `@vector_maskedload_i8`. Lastly, I’ve added a high-level comment in VectorEmulateNarrowType.cpp to clarify the overall design and intent of the file.	2024-11-12 15:45:46 +00:00
ziereis	4c4db3c943	add pattern for arith::UIToFPOp to VectorNarrowTypeRewritePatterns (#115485 ) This pr just adds the patterns from https://github.com/llvm/llvm-project/pull/89131 for the arith::UIToFPOp. Also does some slight renaming and moving of the tests for better readability.	2024-11-11 15:17:17 +00:00
Matthias Springer	b613a54075	[mlir][IR][NFC] Cleanup insertion point API usage (#115415 ) Use `setInsertionPointToStart` / `setInsertionPointToEnd` when possible.	2024-11-08 14:31:27 +09:00
Kazu Hirata	dbb4858a8c	[mlir] Fix warnings This patch fixes: mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:137:8: error: unused variable 'vectorType' [-Werror,-Wunused-variable] mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:154:8: error: unused variable 'srcType' [-Werror,-Wunused-variable] mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:155:8: error: unused variable 'destType' [-Werror,-Wunused-variable]	2024-11-05 09:53:44 -08:00
Kazu Hirata	6a263cef2d	[mlir] Fix a warning This patch fixes: mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:202:2: error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]	2024-11-05 09:44:03 -08:00
lialan	ce112a7f44	[MLIR] support dynamic indexing in `VectorEmulateNarrowTypes` (#114169 ) * Supports `vector.load` and `vector.transfer_read` ops. * In the case of dynamic indexing, use per-element insertion/extraction to build desired narrow type vectors. * Fixed wrong function comment of `getCompressedMaskOp`. --------- Co-authored-by: Han-Chung Wang <hanhan0912@gmail.com>	2024-11-05 09:21:03 -08:00
Manupa Karunaratne	a6e72f9392	[MLIR][Vector] Add Lowering for vector.step (#113655 ) Currently, the lowering for vector.step lives under a folder. This is not ideal if we want to do transformation on it and defer the materizaliztion of the constants much later. This commits adds a rewrite pattern that could be used by using `transform.structured.vectorize_children_and_apply_patterns` transform dialect operation. Moreover, the rewriter of vector.step is also now used in -convert-vector-to-llvm pass where it handles scalable and non-scalable types as LLVM expects it. As a consequence of removing the vector.step lowering as its folder, linalg vectorization will keep vector.step intact.	2024-11-01 16:38:36 +00:00
lialan	2c313259c6	[MLIR] VectorEmulateNarrowType to support loading of unaligned vectors (#113411 ) Previously, the pass only supported emulation of loading vector sizes that are multiples of the emulated data type. This patch expands its support for emulating sizes that are not multiples of byte sizes. In such cases, the element values are packed back-to-back to preserve memory space. To give a concrete example: if an input has type `memref<3x3xi2>`, it is actually occupying 3 bytes in memory, with the first 18 bits storing the values and the last 6 bits as padding. The slice of `vector<3xi2>` at index `[2, 0]` is stored in memory from bit 12 to bit 18. To properly load the elements from bit 12 to bit 18 from memory, first load byte 2 and byte 3, and convert it to a vector of `i2` type; then extract bits 4 to 10 (element index 2-5) to form a `vector<3xi2>`. A limitation of this patch is that the linearized index of the unaligned vector has to be known at compile time. Extra code needs to be emitted to handle it if the condition does not hold. The following ops are updated: * `vector::LoadOp` * `vector::TransferReadOp` * `vector::MaskedLoadOp`	2024-10-29 20:04:48 -07:00
Andrzej Warzyński	0cf7aaf300	[MLIR][Vector] Update Transfer{Read\|Write}DropUnitDimsPattern patterns (#112394 ) Updates `TransferWriteDropUnitDimsPattern` and `TransferReadDropUnitDimsPattern` to inherit from `MaskableOpRewritePattern` so that masked versions of xfer_read/xfer_write Ops are also supported: ```mlir %v = vector.mask %mask { vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst : memref<1x1x3x2xi8, strided<[6, 6, 2, 1], offset: ?>>, vector<3x2xi8> } : vector<3x2xi1> -> vector<3x2xi8> ```	2024-10-26 13:54:04 +01:00
Kunwar Grover	1004865f1c	[mlir][Vector] Support 0-d vectors natively in TransferOpReduceRank (#112907 ) Since `ddf2d62c7d` , 0-d vectors are supported in VectorType. This patch removes 0-d vector handling with scalars for the TransferOpReduceRank pattern. This pattern specifically introduces tensor.extract_slice during vectorization, causing vectorization to not fold transfer_read/transfer_write slices properly. The changes in vectorization test files reflect this. There are other places where lowering patterns are still side-stepping from handling 0-d vectors properly, by turning them into scalars, but this patch only focuses on the vector.transfer_x patterns.	2024-10-22 15:50:16 +01:00
Benoit Jacob	a9ebdbb5ac	[MLIR] Vector: turn the ExtractStridedSlice rewrite pattern from #111541 into a canonicalization (#111614 ) This is a reasonable canonicalization because `extract` is more constrained than `extract_strided_slices`, so there is no loss of semantics here, just lifting an op to a special-case higher/constrained op. And the additional `shape_cast` is merely adding leading unit dims to match the original result type. Context: discussion on #111541. I wasn't sure how this would turn out, but in the process of writing this PR, I discovered at least 2 bugs in the pattern introduced in #111541, which shows the value of shared canonicalization patterns which are exercised on a high number of testcases. --------- Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>	2024-10-09 09:24:23 -04:00
Benoit Jacob	d905b1caf1	[MLIR] Vector dialect: Address post-merge review comments on #111541 (#111552 ) Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>	2024-10-08 15:35:16 -04:00
Benoit Jacob	10054ba4ac	[mlir][vector] Add pattern to rewrite contiguous ExtractStridedSlice into Extract (#111541 ) Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>	2024-10-08 11:51:01 -04:00
BARRET	1666d13078	[CMake]: Remove unnecessary dependencies on LLVM/MLIR (#111255 ) Previous https://github.com/llvm/llvm-project/pull/110362 (reverted) caused breakage. Here is the PR with fix. My build cmdline: ``` cmake ../llvm \ -G Ninja \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_PREFIX=install \ -DCMAKE_C_COMPILER=gcc-9 \ -DCMAKE_CXX_COMPILER=g++-9 \ -DCMAKE_CUDA_COMPILER=$(which nvcc) \ -DLLVM_ENABLE_LLD=OFF \ -DLLVM_ENABLE_ASSERTIONS=ON \ -DLLVM_BUILD_EXAMPLES=ON \ -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \ -DLLVM_CCACHE_BUILD=ON \ -DMLIR_ENABLE_BINDINGS_PYTHON=ON \ -DBUILD_SHARED_LIBS=ON \ -DLLVM_ENABLE_PROJECTS='llvm;mlir' ```	2024-10-07 15:52:43 +02:00
Matthias Springer	206fad0e21	[mlir][NFC] Mark type converter in `populate...` functions as `const` (#111250 ) This commit marks the type converter in `populate...` functions as `const`. This is useful for debugging. Patterns already take a `const` type converter. However, some `populate...` functions do not only add new patterns, but also add additional type conversion rules. That makes it difficult to find the place where a type conversion was added in the code base. With this change, all `populate...` functions that only populate pattern now have a `const` type converter. Programmers can then conclude from the function signature that these functions do not register any new type conversion rules. Also some minor cleanups around the 1:N dialect conversion infrastructure, which did not always pass the type converter as a `const` object internally.	2024-10-05 21:32:40 +02:00
Quinn Dawkins	4e2efea5e8	[mlir][vector] Add all view-like ops to transfer flow opt (#110521 ) `vector.transfer_*` folding and forwarding currently does not take into account reshaping view-like memref ops (expand and collapse shape), leading to potentially invalid store folding or value forwarding. This patch adds tracking for those (and other) view-like ops. It is still possible to design operations that alias memrefs without being a view (e.g. memref in the iter_args of an `scf.for`), so these patterns may still need revisiting in the future.	2024-10-02 00:20:44 -04:00
Andrzej Warzyński	1f5e8263b9	[mlir][vector] Add a new TD Op for patterns leveraging ShapeCastOp (#110525 ) Adds a new Transform Dialect Op that collects patters for dropping unit dims from various Ops: * `transform.apply_patterns.vector.drop_unit_dims_with_shape_cast`. It excludes patterns for vector.transfer Ops - these are collected under: * `apply_patterns.vector.rank_reducing_subview_patterns`, and use ShapeCastOp _and_ SubviewOp to reduce the rank (and to eliminate unit dims). This new TD Ops allows us to test the "ShapeCast folder" pattern in isolation. I've extracted the only test that I could find for that folder from "vector-transforms.mlir" and moved it to a dedicated file: "shape-cast-folder.mlir". I also added a test case with scalable vectors. Changes in VectorTransforms.cpp are not needed (added a comment with a TODO + ordered the patterns alphabetically). I am Including them here to avoid a separate PR.	2024-10-01 10:08:43 +01:00
Mehdi Amini	8b47711e84	Revert "CMake: Remove unnecessary dependencies on LLVM/MLIR" (#110594 ) Reverts llvm/llvm-project#110362 Multiple bots are broken.	2024-10-01 00:44:21 +02:00
BARRET	4980f2177e	CMake: Remove unnecessary dependencies on LLVM/MLIR (#110362 ) There are some spurious libraries which can be removed. I'm trying to bundle MLIR/LLVM library dependencies for our own libraries. We're utilizing cmake function to recursively collect MLIR/LLVM related dependencies. However, we identified certain library dependencies as redundant and safe for removal.	2024-09-30 23:57:13 +02:00
Quinn Dawkins	a3b34e67e6	[mlir][vector] Add pattern for dropping unit dims from for loops (#109585 ) This adds a pattern for dropping unit dims from the iter_args of scf.for ops using vector.shape_cast. This composes with the other patterns for dropping unit dims from elementwise ops and transposes.	2024-09-27 11:43:27 -04:00
Longsheng Mou	50febdeb64	[mlir][vector] Bugfix of linearize `vector.extract` (#106836 ) This patch add check for `vector.extract` with scalar type, which is not allowed when linearize `vector.extract`. Fix #106162.	2024-09-04 16:41:56 +08:00
Andrzej Warzyński	42944da5ba	[mlir][vector] Group re-order patterns together (#102856 ) Group all patterns that re-order vector.transpose and vector.broadcast Ops () under `populateSinkVectorOpsPatterns`. These patterns are normally used to "sink" redundant Vector Ops, hence grouping together. Example: ```mlir %at = vector.transpose %a, [1, 0]: vector<4x2xf32> to vector<2x4xf32> %bt = vector.transpose %b, [1, 0]: vector<4x2xf32> to vector<2x4xf32> %r = arith.addf %at, %bt : vector<2x4xf32> ``` would get converted to: ```mlir %0 = arith.addf %a, %b : vector<4x2xf32> %r = vector.transpose %0, [1, 0] : vector<2x4xf32> ``` This patch also moves all tests for these patterns so that all of them are: run under one test-flag: `test-vector-sink-patterns`, * located in one file: "vector-sink.mlir". To facilitate this change: * `-test-sink-vector-broadcast` is renamed as `test-vector-sink-patterns`, * "sink-vector-broadcast.mlir" is renamed as "vector-sink.mlir", * tests for `ReorderCastOpsOnBroadcast` and `ReorderElementwiseOpsOnTranspose` patterns are moved from "vector-reduce-to-contract.mlir" to "vector-sink.mlir", * `ReorderElementwiseOpsOnTranspose` patterns are removed from `populateVectorReductionToContractPatterns` and added to (newly created) `populateSinkVectorOpsPatterns`, * `ReorderCastOpsOnBroadcast` patterns are removed from `populateVectorReductionToContractPatterns` - these are already present in `populateSinkVectorOpsPatterns`. This should allow us better layering and more straightforward testing. For the latter, the goal is to be able to easily identify which pattern a particular test is exercising (especially when it's a specific pattern). NOTES FOR DOWNSTREAM USERS In order to preserve the current functionality, please make sure to add * `populateSinkVectorOpsPatterns`, wherever you are using `populateVectorReductionToContractPatterns`. Also, rename `populateSinkVectorBroadcastPatterns` as `populateSinkVectorOpsPatterns`. (*) I didn't notice any other re-order patterns.	2024-08-16 16:53:53 +01:00
Andrzej Warzyński	efe3db2124	[mlir][vector] Add tests for `populateSinkVectorBroadcastPatterns` (1/n) (#102286 ) Adds tests for scalable vectors in: * sink-vector-broadcast.mlir This test file excercises patterns grouped under `populateSinkVectorBroadcastPatterns`, which includes: * `ReorderElementwiseOpsOnBroadcast`, * `ReorderCastOpsOnBroadcast`. Right now there are only tests for the former. However, I've noticed that "vector-reduce-to-contract.mlir" contains tests for the latter and I've left a few TODOs to group these tests back together in one file. Additionally, added some helpful `notifyMatchFailure` messages in `ReorderElementwiseOpsOnBroadcast`.	2024-08-14 17:28:55 +01:00
Bangtian Liu	b5e47d2e40	[mlir][vector] Add extra check on distribute types to avoid crashes (#102952 ) This PR addresses the issue detailed in https://github.com/iree-org/iree/issues/17948. The problem occurs when distributed types are set to NULL, leading to compilation crashes. --------- Signed-off-by: Bangtian Liu <liubangtian@gmail.com>	2024-08-14 08:47:38 -07:00
Benjamin Maxwell	5f26497da7	[mlir][vector] Use `DenseI64ArrayAttr` in vector.multi_reduction (#102637 ) This prevents some unnecessary conversions to/from int64_t and IntegerAttr.	2024-08-10 14:10:24 +01:00
Benjamin Maxwell	9b06e25e73	[mlir][vector] Add mask elimination transform (#99314 ) This adds a new transform `eliminateVectorMasks()` which aims at removing scalable `vector.create_masks` that will be all-true at runtime. It attempts to do this by simply pattern-matching the mask operands (similar to some canonicalizations), if that does not lead to an answer (is all-true? yes/no), then value bounds analysis will be used to find the lower bound of the unknown operands. If the lower bound is >= to the corresponding mask vector type dim, then that dimension of the mask is all true. Note that the pattern matching prevents expensive value-bounds analysis in cases where the mask won't be all true. For example: ```mlir %mask = vector.create_mask %dynamicValue, %c2 : vector<8x4xi1> ``` From looking at `%c2` we can tell this is not going to be an all-true mask, so we don't need to run the value-bounds analysis for `%dynamicValue` (and can exit the transform early). Note: Eliminating create_masks here means replacing them with all-true constants (which will then lead to the masks folding away).	2024-08-09 10:51:49 +01:00
Han-Chung Wang	201da87c3f	[mlir][vector] Handle corner cases in DropUnitDimsFromTransposeOp. (#102518 ) `da8778e499` breaks the lowering of vector.transpose that all the dimensions are unit dimensions. The revision fixes the issue and adds a test. --------- Signed-off-by: hanhanW <hanhan0912@gmail.com>	2024-08-08 17:29:09 -07:00
Benjamin Maxwell	da492d4387	[mlir][vector] Fix return of `DropUnitDimsFromTransposeOp` pattern (#102478 ) This accidentally returned `failure()` (rather than `success()`) when it applied.	2024-08-08 16:31:11 +01:00
Andrzej Warzyński	22a130220c	[mlir][vector] Add more tests for ConvertVectorToLLVM (1/n) (#101936 ) Adds tests with scalable vectors for the Vector-To-LLVM conversion pass. Covers the following Ops: * vector.bitcast * vector.broadcast Note, this has uncovered some missing logic in `BroadcastOpLowering`. This PR fixes the most basic cases where the scalable flags were dropped and the generated code was incorrect. Also, the conditions in `vector::isBroadcastableTo` are relaxed to allow cases like this: ```mlir %0 = vector.broadcast %arg0 : vector<1xf32> to vector<[4]xf32> ``` The `BroadcastOpLowering` pattern is effectively disabled for scalable vectors in more complex cases where an SCF loop would be required to loop over the scalable dims, e.g.: ```mlir %0 = vector.broadcast %arg0 : vector<[4]x1x2xf32> to vector<[4]x3x2xf32> ``` These cases are marked as "Stretch not at start" in the code. In those cases, support for scalable vectors is left as a TODO.	2024-08-08 15:57:36 +01:00
Benjamin Maxwell	da8778e499	[mlir][vector] Add pattern to drop unit dims from vector.transpose (#102017 ) Example: BEFORE: ```mlir %transpose = vector.transpose %vector, [3, 0, 1, 2] : vector<1x1x4x[4]xf32> to vector<[4]x1x1x4xf32> ``` AFTER: ```mlir %dropDims = vector.shape_cast %vector : vector<1x1x4x[4]xf32> to vector<4x[4]xf32> %transpose = vector.transpose %0, [1, 0] : vector<4x[4]xf32> to vector<[4]x4xf32> %restoreDims = vector.shape_cast %transpose : vector<[4]x4xf32> to vector<[4]x1x1x4xf32> ```	2024-08-08 11:42:27 +01:00
Andrzej Warzyński	cb89457ff8	[nlir][vector] Constrain `ContractionOpToMatmulOpLowering` (#102225 ) Disables `ContractionOpToMatmulOpLowering` for scalable vectors. This pattern is meant to enable lowering to `llvm.matrix.multiply` - I'm not aware of any use of that in the context of scalable vectors.	2024-08-07 10:18:14 +01:00
Kazu Hirata	5262865aac	[mlir] Construct SmallVector with ArrayRef (NFC) (#101896 )	2024-08-04 11:43:05 -07:00

1 2 3 4 5 ...

447 Commits