clang-p2996

Author	SHA1	Message	Date
Jerry Wu	dedc7d4d36	[mlir] Exclude masked ops in VectorDropLeadUnitDim (#76468 ) Don't insert cast ops for ops in `vector.mask` region in `VectorDropLeadUnitDim`.	2024-01-20 19:37:46 -05:00
Benjamin Chetioui	35121add2e	[mlir][NFC] Remove unused variable.	2024-01-19 11:32:19 +00:00
Han-Chung Wang	12b676de72	[mlir][vector] Drop innermost unit dims on transfer_write. (#78554 )	2024-01-19 03:15:13 -08:00
Matthias Springer	5fcf907b34	[mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260 ) This commit renames 4 pattern rewriter API functions: * `updateRootInPlace` -> `modifyOpInPlace` * `startRootUpdate` -> `startOpModification` * `finalizeRootUpdate` -> `finalizeOpModification` * `cancelRootUpdate` -> `cancelOpModification` The term "root" is a misnomer. The root is the op that a rewrite pattern matches against (https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional). A rewriter must be notified of all in-place op modifications, not just in-place modifications of the root (https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old function names were confusing and have contributed to various broken rewrite patterns. Note: The new function names use the term "modify" instead of "update" for consistency with the `RewriterBase::Listener` terminology (`notifyOperationModified`).	2024-01-17 11:08:59 +01:00
Matthias Springer	510626fa65	[mlir][vector] Fix invalid IR in `RewriteBitCastOfTruncI` (#78146 ) This commit fixes `Dialect/Vector/vector-rewrite-narrow-types.mlir` when running with `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`. ``` within split at llvm-project/mlir/test/Dialect/Vector/vector-rewrite-narrow-types.mlir:1 offset :118:8: error: 'arith.trunci' op operand type 'vector<3xi16>' and result type 'vector<3xi16>' are cast incompatible %1 = vector.bitcast %0 : vector<16xi3> to vector<3xi16> ^ within split at llvm-project/mlir/test/Dialect/Vector/vector-rewrite-narrow-types.mlir:1 offset :118:8: note: see current operation: %48 = "arith.trunci"(%47) : (vector<3xi16>) -> vector<3xi16> LLVM ERROR: IR failed to verify after pattern application ```	2024-01-16 09:45:38 +01:00
Matthias Springer	c0a354dfab	[mlir][vector] Fix invalid IR in `ContractionOpLowering` (#78130 ) If a rewrite pattern returns "failure", it must not have modified the IR. This commit fixes `Dialect/Vector/vector-contract-to-outerproduct-transforms-unsupported.mlir` when running with `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`. ``` * Pattern (anonymous namespace)::ContractionOpToOuterProductOpLowering : 'vector.contract -> ()' { Trying to match "(anonymous namespace)::ContractionOpToOuterProductOpLowering" Insert : 'vector.transpose'(0x5625b3a8cb30) Insert : 'vector.transpose'(0x5625b3a8cbc0) "(anonymous namespace)::ContractionOpToOuterProductOpLowering" result 0 } -> failure : pattern failed to match } -> failure : pattern failed to match LLVM ERROR: pattern returned failure but IR did change ``` Note: `vector-contract-to-outerproduct-transforms-unsupported.mlir` is merged into `vector-contract-to-outerproduct-matvec-transforms.mlir`. The `greedy pattern application failed` error is no longer produced. This error indicates that the greedy pattern rewrite did not converge; it does not mean that a pattern could not be applied.	2024-01-16 09:40:24 +01:00
Kazu Hirata	8e8bbbd48e	[mlir] Use llvm::is_contained (NFC)	2024-01-12 22:08:29 -08:00
Matthias Springer	ad100b36e7	[mlir][vector] Fix dominance error in warp vector distribution (#77771 ) This commit fixes a test in `vector-warp-distribute.mlir` when `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS` is enabled. ``` within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: error: operand #0 does not dominate this use %1 = vector.extract %0[9] : f32 from vector<64xf32> ^ within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: note: see current operation: %1 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 ceildiv 2)>}> : (index) -> index within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: note: operand defined here (op in a child region) "func.func"() <{function_type = (index) -> f32, sym_name = "vector_extract_1d"}> ({ ^bb0(%arg0: index): %0:2 = "vector.warp_execute_on_lane_0"(%arg0) <{warp_size = 32 : i64}> ({ %7 = "some_def"() : () -> vector<64xf32> %8 = "arith.constant"() <{value = 9 : index}> : () -> index %9 = "vector.extractelement"(%7, %8) : (vector<64xf32>, index) -> f32 "vector.yield"(%9, %7) : (f32, vector<64xf32>) -> () }) : (index) -> (f32, vector<2xf32>) %1 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 ceildiv 2)>}> : (index) -> index %2 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 mod 2)>}> : (index) -> index %3 = "vector.extractelement"(%0#1, %2) : (vector<2xf32>, index) -> f32 %4 = "arith.index_cast"(%1) : (index) -> i32 %5 = "arith.constant"() <{value = 32 : i32}> : () -> i32 %6:2 = "gpu.shuffle"(%3, %4, %5) <{mode = #gpu<shuffle_mode idx>}> : (f32, i32, i32) -> (f32, i1) "func.return"(%6#0) : (f32) -> () }) : () -> () LLVM ERROR: IR failed to verify after pattern application ``` The position at which `vector.extractelement` extracts must also be distributed. The fix in `WarpOpExtractElement` is similar to `WarpOpInsertElement`.	2024-01-12 15:08:13 +01:00
Matthias Springer	35c19fdde2	[mlir][vector] Support warp distribution of `transfer_read` with dependencies (#77779 ) Support distribution of `vector.transfer_read` ops when operands are defined inside of the region of `warp_execute_on_lane_0` (except for the buffer from which the op is reading). Such IR was previously not supported. This commit changes the implementation such that indices and the padding value are also distributed. This commit simplifies the implementation considerably: the original implementation created a new `transfer_read` op and then checked if this new op is valid. If not, the rewrite pattern failed. This was a bit hacky. It was also a violation of the rewrite pattern API (detected by `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`) because the IR was modified, but the pattern returned "failure".	2024-01-12 11:55:37 +01:00
Jakub Kuderski	560564f51c	[mlir][vector][gpu] Align minf/maxf reduction kind names with arith (#75901 ) This is to avoid confusion when dealing with reduction/combining kinds. For example, see a recent PR comment: https://github.com/llvm/llvm-project/pull/75846#discussion_r1430722175. Previously, they were picked to mostly mirror the names of the llvm vector reduction intrinsics: https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic. In isolation, it was not clear if `<maxf>` has `arith.maxnumf` or `arith.maximumf` semantics. The new reduction kind names map 1:1 to arith ops, which makes it easier to tell/look up their semantics. Because both the vector and the gpu dialect depend on the arith dialect, it's more natural to align names with those in arith than with the lowering to llvm intrinsics. Issue: https://github.com/llvm/llvm-project/issues/72354	2023-12-20 00:14:43 -05:00
Jakub Kuderski	9f74e6e615	[mlir][vector][gpu] Use `makeArithReduction` in lowering patterns. NFC. (#75952 ) Use the `vector::makeArithReduction` helper as the source-of-truth of reduction to arith ops lowering.	2023-12-19 19:04:27 -05:00
Jakub Kuderski	07677113ff	[mlir][vector] Add pattern to break down reductions into arith ops (#75727 ) The number of vector elements considered 'small' enough to extract is parameterized. This is to avoid going into specialized reduction lowering when a single/couple of arith ops can do. Targets without dedicated reduction intrinsics can use that as an emulation path too. Depends on https://github.com/llvm/llvm-project/pull/75846.	2023-12-18 17:54:54 -05:00
Jakub Kuderski	a528cee224	[mlir][vector] Improve `makeArithReduction` expansion (#75846 ) Propagate fast math flags. Distinguish `minf`/`maxf` and `minimumf`/`maximumf`. Required for future patterns in https://github.com/llvm/llvm-project/pull/75727.	2023-12-18 17:47:46 -05:00
Hsiangkai Wang	f643eec892	[mlir][vector] Add emulation patterns for vector masked load/store (#74834 ) In this patch, it will convert ``` vector.maskedload %base[%idx_0, %idx_1], %mask, %pass_thru ``` to ``` %ivalue = %pass_thru %m = vector.extract %mask[0] %result0 = scf.if %m { %v = memref.load %base[%idx_0, %idx_1] %combined = vector.insert %v, %ivalue[0] scf.yield %combined } else { scf.yield %ivalue } %m = vector.extract %mask[1] %result1 = scf.if %m { %v = memref.load %base[%idx_0, %idx_1 + 1] %combined = vector.insert %v, %result0[1] scf.yield %combined } else { scf.yield %result0 } ... ``` It will convert ``` vector.maskedstore %base[%idx_0, %idx_1], %mask, %value ``` to ``` %m = vector.extract %mask[0] scf.if %m { %extracted = vector.extract %value[0] memref.store %extracted, %base[%idx_0, %idx_1] } %m = vector.extract %mask[1] scf.if %m { %extracted = vector.extract %value[1] memref.store %extracted, %base[%idx_0, %idx_1 + 1] } ... ```	2023-12-15 11:35:48 +00:00
Jerry Wu	2c9ba9c34a	[mlir] Fix type transformation in DropUnitDimFromElementwiseOps (#75430 ) Use operand and result types to build the corresponding new types in `DropUnitDimFromElementwiseOps`.	2023-12-14 12:20:54 -05:00
Fangrui Song	71ba8bb4a7	[mlir,vector] Fix -Wunused-variable	2023-12-13 13:28:17 -08:00
Andrzej Warzyński	c02d07fdf0	[mlir][vector] Add pattern to drop unit dim from elementwise(a, b)) (#74817 ) For vectors with either leading or trailing unit dim, replaces: elementwise(a, b) with: sc_a = shape_cast(a) sc_b = shape_cast(b) res = elementwise(sc_a, sc_b) return shape_cast(res) The newly inserted shape_cast Ops fold (before elementwise Op) and then restore (after elementwise Op) the unit dim. Vectors `a` and `b` are required to be rank > 1. Example: ```mlir %mul = arith.mulf %B_row, %A_row : vector<1x[4]xf32> %cast = vector.shape_cast %mul : vector<1x[4]xf32> to vector<[4]xf32> ``` gets converted to: ```mlir %B_row_sc = vector.shape_cast %B_row : vector<1x[4]xf32> to vector<[4]xf32> %A_row_sc = vector.shape_cast %A_row : vector<1x[4]xf32> to vector<[4]xf32> %mul = arith.mulf %B_row_sc, %A_row_sc : vector<[4]xf32> %mul_sc = vector.shape_cast %mul : vector<[4]xf32> to vector<1x[4]xf32> %cast = vector.shape_cast %mul_sc : vector<1x[4]xf32> to vector<[4]xf32> ``` In practice, the bottom 2 shape_cast(s) will be folded away.	2023-12-13 20:29:12 +00:00
Jakub Kuderski	8063622721	[mlir][vector] Allow vector distribution with multiple written elements (#75122 ) Add a configuration option to allow vector distribution with multiple elements written by a single lane. This is so that we can perform vector multi-reduction with multiple results per workgroup.	2023-12-12 13:15:17 -05:00
Andrzej Warzyński	07919cf895	Revert "[mlir][vector] Make `TransposeOpLowering` configurable (#73915 )" (#75062 ) Reverting a workaround intended specifically for SPIR-V. That workaround emerged from this discussion: * https://github.com/llvm/llvm-project/pull/72105 AFAIK, it hasn't been required in practice. This is based on IREE (https://github.com/openxla/iree), which has just bumped its fork of LLVM without using it (). () `cef31e775e` This reverts commit `bbd2b08b95`.	2023-12-11 21:32:23 +00:00
Andrzej Warzyński	2eb9e33cc5	[mlir][Vector] Update patterns for flattening vector.xfer Ops (2/N) (#73523 ) Updates patterns for flattening `vector.transfer_read` by relaxing the requirement that the "collapsed" indices are all zero. This enables collapsing cases like this one: ```mlir %2 = vector.transfer_read %arg4[%c0, %arg0, %arg1, %c0] ... : memref<1x43x4x6xi32>, vector<1x2x6xi32> ``` Previously only the following case would be considered for collapsing (all indices are 0): ```mlir %2 = vector.transfer_read %arg4[%c0, %c0, %c0, %c0] ... : memref<1x43x4x6xi32>, vector<1x2x6xi32> ``` Also adds some new comments and renames the `firstContiguousInnerDim` parameter as `firstDimToCollapse` (the latter better matches the actual meaning). Similar updates for `vector.transfer_write` will be implemented in a follow-up patch.	2023-12-05 08:35:58 +00:00
Andrzej Warzyński	bbd2b08b95	[mlir][vector] Make `TransposeOpLowering` configurable (#73915 ) Following the discussion here: * https://github.com/llvm/llvm-project/pull/72105 this patch makes the `TransposeOpLowering` configurable so that one can select whether to favour `vector.shape_cast` over `vector.transpose`. As per the discussion in #72105, using `vector.shape_cast` is very beneficial and desirable when targeting `LLVM IR` (CPU lowering), but won't work when targeting `SPIR-V` today (GPU lowering). Hence the need for a mechanism to be able to disable/enable the pattern introduced in #72105. This patch proposes one such mechanism. While this should solve the problem that we are facing today, it's understood to be a temporary workaround. It should be removed once support for lowering `vector.shape_cast` to SPIR-V is added. Also, (once implemented) the following proposal might make this workaround redundant: * https://discourse.llvm.org/t/improving-handling-of-unit-dimensions-in-the-vector-dialect/	2023-12-04 16:56:43 +00:00
Andrzej Warzyński	8171eac23f	[mlir][Vector] Update patterns for flattening vector.xfer Ops (1/N) (#73522 ) Updates "flatten vector" patterns to support more cases, namely Ops that read/write vectors with leading unit dims. For example: ```mlir %0 = vector.transfer_read %arg0[%c0, %c0, %c0, %c0] ... : memref<5x4x3x2xi8, strided<[24, 6, 2, 1], offset: ?>>, vector<1x1x2x2xi8> ``` Currently, the `vector.transfer_read` above would not be flattened. With this change, it will be rewritten as follows: ```mlir %collapse_shape = memref.collapse_shape %arg0 [[0, 1, 2, 3]] : memref<5x4x3x2xi8, strided<[24, 6, 2, 1], offset: ?>> into memref<120xi8, strided<[1], offset: ?>> %0 = vector.transfer_read %collapse_shape[%c0] ... : memref<120xi8, strided<[1], offset: ?>>, vector<4xi8> %1 = vector.shape_cast %0 : vector<4xi8> to vector<1x1x2x2xi8> ``` `hasMatchingInnerContigousShape` is generalised and renamed as `isContiguousSlice` to better match the updated functionality. A few test names are updated to better highlight what case is being exercised.	2023-12-04 10:21:32 +00:00
Quinn Dawkins	fdf84cbf87	[mlir][vector] Fix unit dim dropping pattern for masked writes (#74038 ) This does the same as #72142 for vector.transfer_write. Previously the pattern would silently drop the mask.	2023-12-01 10:01:28 -05:00
Han-Chung Wang	7f82c90621	[mlir][vector] Add support for vector.maskedstore sub-type emulation. (#73871 ) The idea is similar to vector.maskedload + vector.store emulation. What the emulation does is: 1. Get a compressed mask and load the data from destination. 2. Bitcast the data to original vector type. 3. Select values between `op.valueToStore` and the data from load using original mask. 4. Bitcast the new value and store it to destination using compressed mask.	2023-11-30 11:27:06 -08:00
Andrzej Warzyński	a383817b7e	[mlir][Vector] Add a rewrite pattern for gather over a strided memref (#72991 ) This patch adds a rewrite pattern for `vector.gather` over a strided memref like the following: ```mlir %subview = memref.subview %arg0[0, 0] [100, 1] [1, 1] : memref<100x3xf32> to memref<100xf32, strided<[3]>> %gather = vector.gather %subview[%c0] [%idxs], %cst_0, %cst : memref<100xf32, strided<[3]>>, vector<4xindex>, vector<4xi1>, vector<4xf32> into vector<4xf32> ``` After the pattern added in this patch: ```mlir %collapse_shape = memref.collapse_shape %arg0 [[0, 1]] : memref<100x3xf32> into memref<300xf32> %1 = arith.muli %arg3, %cst : vector<4xindex> %gather = vector.gather %collapse_shape[%c0] [%1], %cst_1, %cst_0 : memref<300xf32>, vector<4xindex>, vector<4xi1>, vector<4xf32> into vector<4xf32> ``` Fixes https://github.com/openxla/iree/issues/15364.	2023-11-30 16:33:20 +00:00
Andrzej Warzyński	4b2ba5a61a	[mlir][sve] Add an e2e for linalg.matmul with mixed types (#73773 ) Apart from the test itself, this patch also updates a few patterns to fix how new VectorType(s) are created. Namely, it makes sure that "scalability" is correctly propagated. Regression tests will be updated separately while auditing Vector dialect tests in the context of scalable vectors: * https://github.com/orgs/llvm/projects/23	2023-11-29 21:21:10 +00:00
Quinn Dawkins	f385f6c93b	[mlir][vector] Distribute all non-permutation or broadcasted masked transfer reads (#73539 ) The primary difficulty with distribution of masked transfers is when the permutation map permutes the vector, in which case the distribution logic needs to make sure the correct mask elements end up with the distributed transfer. This is only tricky when the permutation map has a permutation in it, so we can relax the condition for distribution.	2023-11-27 16:23:48 -05:00
Quinn Dawkins	6e8f7d5966	[mlir][vector] Fix patterns for dropping leading unit dims from masks (#73525 ) Previously the pattern only worked when the permutation map was a minor identity. Infer the new mask type from the new transfer map after dropping leading unit dims.	2023-11-27 12:35:32 -05:00
Jakub Kuderski	d33bad66d8	[mlir][vector] Add patterns to simplify chained reductions (#73048 ) Chained reductions get created during vector unrolling. These patterns simplify them into a series of adds followed by a final reduction. This is preferred on GPU targets like SPIR-V/Vulkan where vector reduction gets lowered into subgroup operations that are generally more expensive than simple vector additions. For now, only the `add` combining kind is handled.	2023-11-22 10:30:04 -05:00
MaheshRavishankar	1f141737c7	Revert "[mlir][vector] Move transpose with unit-dim to shape_cast pattern (#72493 )" (#72918 ) This reverts commit `95acb33b45`.	2023-11-21 06:18:51 -08:00
Matthias Springer	32c3decb77	[mlir][vector] Modernize `vector.transpose` op (#72594 ) * Declare arguments/results with `let` statements. * Rename `transp` to `permutation`. * Change type of `transp` from `I64ArrayAttr` to `DenseI64ArrayAttr` (provides direct access to `ArrayRef<int64_t>` instead of `ArrayAttr`).	2023-11-20 11:25:35 +01:00
Cullen Rhodes	bf897d5d77	[mlir][vector] Extend TransferReadDropUnitDimsPattern to support partially-static memrefs (#72142 ) This patch extends TransferReadDropUnitDimsPattern to support dropping unit dims from partially-static memrefs, for example: %v = vector.transfer_read %base[%c0, %c0], %pad {in_bounds = [true, true]} : memref<?x1xi8, strided<[?, ?], offset: ?>>, vector<[16]x1xi8> Is rewritten as: %dim0 = memref.dim %base, %c0 : memref<?x1xi8, strided<[?, ?], offset: ?>> %subview = memref.subview %base[0, 0] [%dim0, 1] [1, 1] : memref<?x1xi8, strided<[?, ?], offset: ?>> to memref<?xi8, #map1> %v = vector.transfer_read %subview[%c0], %pad {in_bounds = [true]} : memref<?xi8, #map1>, vector<[16]xi8> Scalable vectors are now also supported, the scalable dims were being dropped when creating the rank-reduced vector type. The xfer op can also have a mask of type 'vector.create_mask', which gets rewritten as long as the mask of the unit dim is a constant of 1.	2023-11-20 08:39:34 +00:00
Cullen Rhodes	95acb33b45	[mlir][vector] Move transpose with unit-dim to shape_cast pattern (#72493 ) Moved from lowering to canonicalization.	2023-11-17 14:06:03 +00:00
Matthias Springer	8a40fcaf35	[mlir][vector] Clean up VectorTransferOpInterface (#72353 ) - Better documentation. - Rename interface methods: `source` -> `getSource`, `indices` -> `getIndices`, etc. to conform with MLIR naming conventions. A default implementation is not needed. - Turn many interface methods into helper functions. Most of the previous interface methods were not meant to be overridden, and if some were overridden without others, the op would have been broken.	2023-11-16 10:35:46 +09:00
Cullen Rhodes	b7b6d54004	[mlir][vector] Add vector.transpose with unit-dim to vector.shape_cast pattern (#72105 ) This patch extends the vector.transpose lowering to replace: vector.transpose %0, [1, 0] : vector<nx1x<eltty>> to vector<1xnx<eltty>> with: vector.shape_cast %0 : vector<nx1x<eltty>> to vector<1xnx<eltty>> Source with leading unit-dim (inverse) is also replaced. Unit dim must be fixed. Non-unit dim can be scalable. A check is also added to bail out for scalable vectors before unrolling.	2023-11-15 14:14:33 +00:00
long.chen	1609f1c2a5	[mlir][affine][nfc] cleanup deprecated T.cast style functions (#71269 ) For details, see the document: https://mlir.llvm.org/deprecation/ Not all changes were made manually; most of them were made with a clang tool I wrote https://github.com/lipracer/cpp-refactor.	2023-11-14 13:01:19 +08:00
Felix Schneider	d5a0fb39ae	[mlir][vector] Handle empty `MaskOp` in `LowerVectorMask`, `MaskOpRewritePattern` (#72031 ) This patch adds handling of an empty `MaskOp` to `MaskOpRewritePattern` and thereby fixes a crash. It also pulls the `MaskOp` canonicalization patterns into `LowerVectorMask` so that empty `MaskOp`s are folded away in the Pass. Fix https://github.com/llvm/llvm-project/issues/71036	2023-11-12 08:12:28 +01:00
Quinn Dawkins	bc81f8c87e	[mlir][vector] Drop incorrect startRootUpdate calls in vector distribution (#71988 ) Fixes asan failures in https://lab.llvm.org/buildbot/#/builders/5/builds/38191 introduced by #71964.	2023-11-10 17:07:39 -05:00
Quinn Dawkins	aa2376a083	[mlir][vector] Notify the rewriter when sinking out of warp ops (#71964 ) A number of the warp distribution patterns work by rewriting a warp op in place by moving a contained op outside. This notifies the rewriter that the warp op is changing in this case.	2023-11-10 14:45:18 -05:00
Han-Chung Wang	2bac720101	[mlir][vector] Take dim sizes into account in DropInnerMostUnitDims. (#71752 ) `stride == 1` does not imply that we can drop the dim, because it could load more than one element. We should also take source sizes and vector sizes into account; otherwise the pattern generates invalid IR. E.g., ```mlir func.func @foo(%arg0: memref<1x1xf32>) -> vector<4x8xf32> { %c0 = arith.constant 0 : index %cst = arith.constant 0.000000e+00 : f32 %0 = vector.transfer_read %arg0[%c0, %c0], %cst : memref<1x1xf32>, vector<4x8xf32> return %0 : vector<4x8xf32> } ``` Fixes https://github.com/openxla/iree/issues/15493	2023-11-10 09:27:59 -08:00
Quinn Dawkins	d4d2891447	[mlir][vector] Add distribution pattern for vector.create_mask (#71619 ) This is the last step needed for basic support for distributing masked vector code. The lane id gets delinearized based on the distributed mask shape and then compared against the original mask sizes to compute the bounds for the distributed mask. Note that the distribution of masks is implicit on the shape specified by the warp op. As a result, it is the responsibility of the consumer of the mask to ensure the distributed mask will match its own distribution semantics.	2023-11-10 10:09:37 -05:00
Quinn Dawkins	df49a97ab2	[mlir][vector] Root the transfer write distribution pattern on the warp op (#71868 ) Currently when there is a mix of transfer read ops and transfer write ops that need to be distributed, because the pattern for write distribution is rooted on the transfer write, it is hard to guarantee that the write gets distributed after the read when the two aren't directly connected by SSA. This is likely still relatively unsafe when there are undistributable ops, but structurally these patterns are a bit difficult to work with. For now pattern benefits give fairly good guarantees for happy paths.	2023-11-10 08:49:33 -05:00
Quinn Dawkins	7360d5d30f	[mlir][vector] Fix cases with multiple yielded transfer_read ops (#71625 ) This fixes two bugs: 1) When deciding whether a transfer read could be propagated out of a warp op, it looked for the first yield operand that was produced by a transfer read. If this transfer read wasn't ready to be distributed, the pattern would not re-check for any other transfer reads that could have been propagated. 2) When dropping dead warp results, we do so by updating the warp op signature and splicing in the old region. This does not add the ops in the body of the warp op back to the pattern applicator's worklist, and thus those operations won't be DCE'd. This is a problem for patterns like the one for transfer reads that will still see the dead operation as a user.	2023-11-09 11:35:54 -05:00
Quinn Dawkins	771f5759df	[mlir][vector] Add pattern to distribute masked reads (#71610 ) Because the distribution is based on types, supporting general masked reads requires first materializing the permutation map in IR to align the elements of the mask with the elements read by the transfer op. For now just support cases with the trivial permutation map.	2023-11-09 09:24:26 -05:00
Quinn Dawkins	25ec1fa969	[mlir][vector] Add support for distributing masked writes (#71482 ) General distribution of masked writes requires materializing the permutation on the vector of the write in IR to ensure the vector lines up with the mask. For now just support cases with trivial permutation maps.	2023-11-07 17:54:49 -05:00
Quinn Dawkins	796d48b080	[mlir][vector] Add leading unit dim folding patterns for masked transfers (#71466 ) This handles `vector.transfer_read`, `vector.transfer_write`, and `vector.constant_mask`. The unit dims are only relevant for masks created by `create_mask` and `constant_mask` if the mask size for the unit dim is non-one, in which case all subsequent sizes must also be zero. From the perspective of the vector transfers, however, these unit dims can just be dropped directly.	2023-11-06 20:40:14 -05:00
Quinn Dawkins	98dcd98a1a	[mlir][vector] Hoist uniform scalar loop code after scf.for distribution (#71422 ) After propagation of `vector.warp_execute_on_lane_0` through `scf.for`, uniform operations like those on the loop iterators can now be hoisted out of the inner warp op.	2023-11-06 14:16:15 -05:00
saienduri	24cf476bd6	[mlir] Add support for vector.store sub-byte emulation. (#70293 )	2023-11-01 18:57:21 -07:00
Matthias Springer	1df6504ac2	[mlir][vector] LISH: Implement `SubsetOpInterface` for transfer_read/write (#70629 ) - Implement `SubsetOpInterface`, `SubsetExtractionOpInterface`, `SubsetInsertionOpInterface` for `vector.transfer_read` and `vector.transfer_write`. - Move all tensor subset hoisting test cases from `Linalg` to `loop-invariant-subset-hoisting.mlir`. (Removing 1 duplicate test case.)	2023-11-01 12:19:30 +09:00
tyb0807	674261b203	[mlir][Vector] Add narrow type emulation pattern for vector.maskedload (#68443 )	2023-10-27 10:49:58 +02:00

320 Commits