This MR adds the `lower-vector-multi-reduction` pass to lower the
`vector.multi_reduction` operation.
While the Transform Dialect includes an operation,
`transform.apply_patterns.vector.lower_multi_reduction`, intended for a
similar purpose, its utility is limited to projects that have adopted
the Transform Dialect. Since not all projects are set up to integrate
that dialect, the proposed pass serves as a standalone alternative. It
ensures that projects relying solely on the traditional pass
infrastructure can also benefit from the lowering of the
`vector.multi_reduction` operation.
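For reference, a minimal example of the kind of op the new pass lowers (shapes and value names are illustrative); the pass can be driven standalone, e.g. with something like `mlir-opt --lower-vector-multi-reduction`:
```mlir
// Reduce dimension 1 of a 2-D vector into a 1-D vector.
%red = vector.multi_reduction <add>, %src, %acc [1] : vector<4x8xf32> to vector<4xf32>
```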
---------
Co-authored-by: Xiaolei Shi <xiaoleis@nvidia.com>
Updates `castAwayContractionLeadingOneDim` to check for leading unit
dimensions before inserting `vector.transpose` ops.
Currently `castAwayContractionLeadingOneDim` removes all leading unit
dims based on the accumulator and transposes any subsequent operands to
match the accumulator indexing. It does not take into account whether the
transpose is strictly necessary, for instance when given this
vector-matrix contract:
```mlir
%result = vector.contract {
    indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>,
                     affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>,
                     affine_map<(d0, d1, d2, d3) -> (d1, d2)>],
    iterator_types = ["parallel", "parallel", "parallel", "reduction"],
    kind = #vector.kind<add>}
  %lhs, %rhs, %acc : vector<1x1x8xi32>, vector<1x8x8xi32> into vector<1x8xi32>
```
Passing this through `castAwayContractionLeadingOneDim` pattern produces
the following:
```mlir
%0 = vector.transpose %arg0, [1, 0, 2] : vector<1x1x8xi32> to vector<1x1x8xi32>
%1 = vector.extract %0[0] : vector<1x8xi32> from vector<1x1x8xi32>
%2 = vector.extract %arg2[0] : vector<8xi32> from vector<1x8xi32>
%3 = vector.contract {
       indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>,
                        affine_map<(d0, d1, d2) -> (d0, d1, d2)>,
                        affine_map<(d0, d1, d2) -> (d1)>],
       iterator_types = ["parallel", "parallel", "reduction"],
       kind = #vector.kind<add>}
     %1, %arg1, %2 : vector<1x8xi32>, vector<1x8x8xi32> into vector<8xi32>
%4 = vector.broadcast %3 : vector<8xi32> to vector<1x8xi32>
```
The `vector.transpose` introduced does not affect the underlying data
layout (it is effectively a no-op), but it cannot be folded automatically.
This change avoids inserting transposes when only leading unit
dimensions are involved.
Fixes #85691
Adds support for scalable vectors to patterns defined in
VectorLinearize.cpp.
Linearization is disabled in two notable cases (see the sketch below):
* vectors with more than 1 scalable dimension (we cannot represent
vscale^2),
* vectors initialised with arith.constant that's not a vector splat
(such arith.constant Ops cannot be flattened).
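For illustration, a minimal sketch of the two excluded cases (the IR below is hypothetical):
```mlir
// Case 1: more than one scalable dimension -- the linearized size would be
// vscale * vscale, which cannot be represented, so this op is left alone.
%sum = arith.addf %a, %b : vector<[4]x[4]xf32>

// Case 2: a non-splat arith.constant -- such constants are not flattened by
// these patterns.
%cst = arith.constant dense<[[1.0, 2.0], [3.0, 4.0]]> : vector<2x2xf32>
```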
Updates `castAwayContractionLeadingOneDim` to inherit from
`MaskableOpRewritePattern` so that this pattern can support masking.
Builds on top of #83827
Adds a generic rewrite pattern for maskable Ops, `MaskableOpRewritePattern`,
that works for both masked and un-masked cases (sketched below), e.g. for both:
* `vector.mask {vector.contract}` (masked), and
* `vector.contract` (not masked).
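For reference, a sketch of the two forms on a simple 1-D dot-product contraction (all value names and shapes here are illustrative):
```mlir
// Un-masked form:
%0 = vector.contract {indexing_maps = [affine_map<(d0) -> (d0)>,
                                       affine_map<(d0) -> (d0)>,
                                       affine_map<(d0) -> ()>],
                      iterator_types = ["reduction"], kind = #vector.kind<add>}
       %a, %b, %acc : vector<8xf32>, vector<8xf32> into f32

// Masked form:
%1 = vector.mask %m {
  vector.contract {indexing_maps = [affine_map<(d0) -> (d0)>,
                                    affine_map<(d0) -> (d0)>,
                                    affine_map<(d0) -> ()>],
                   iterator_types = ["reduction"], kind = #vector.kind<add>}
    %a, %b, %acc : vector<8xf32>, vector<8xf32> into f32
} : vector<8xi1> -> f32
```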
This helps to reduce code-duplication and standardise how we implement such
patterns.
Fixes #78787
This PR adds support for `vector.insert` to the patterns that bubble up and down `vector.bitcast` ops across `vector.extract`/`vector.extract_strided_slice`/`vector.insert_strided_slice` ops (see the sketch below).
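A sketch of the insert case (shapes and names are illustrative): a `vector.bitcast` below a `vector.insert` can be bubbled up by bitcasting the insert's operands instead:
```mlir
// Before:
%ins = vector.insert %src, %dst [0] : vector<8xi4> into vector<2x8xi4>
%res = vector.bitcast %ins : vector<2x8xi4> to vector<2x4xi8>

// After (sketch):
%src_cast = vector.bitcast %src : vector<8xi4> to vector<4xi8>
%dst_cast = vector.bitcast %dst : vector<2x8xi4> to vector<2x4xi8>
%res      = vector.insert %src_cast, %dst_cast [0] : vector<4xi8> into vector<2x4xi8>
```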
Currently n-d transfer write distribution can be inconsistent with
distribution of reductions if a value has multiple users, one of which
is a transfer_write with a non-standard distribution map, and the other
of which is a vector.reduction.
We may want to consider removing the distribution map functionality in
the future for this reason.
This PR adds support for `arith.trunci` to vector narrow type emulation for iX -> i4 truncations, for X >= 8. For now, the pattern only works for 1-D vectors and is based on `vector.shuffle` ops. We would need `vector.deinterleave` to add n-D vector support.
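For reference, the kind of op now supported (shapes here are illustrative; 1-D only):
```mlir
// i8 -> i4 truncation on a 1-D vector:
%0 = arith.trunci %a : vector<16xi8> to vector<16xi4>
// Wider sources (X >= 8) are also handled, e.g. i32 -> i4:
%1 = arith.trunci %b : vector<16xi32> to vector<16xi4>
```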
It looks like the affine map generated to compute the indices of the
collapsed dimensions used the wrong dim size. For indices `[idx0][idx1]`
we computed the collapsed index as `idx0*size0 + idx1` instead of
`idx0*size1 + idx1`. This led to correctness issues in convolution tests
when enabling this transformation internally.
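As a concrete example (the shape is hypothetical), collapsing indices `[idx0][idx1]` of a 4x8 shape must use the inner dimension size as the stride:
```mlir
// Correct: the stride of idx0 is the inner dimension size (8).
affine_map<(d0, d1) -> (d0 * 8 + d1)>
// Previously generated (wrong): used the outer dimension size (4).
affine_map<(d0, d1) -> (d0 * 4 + d1)>
```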
This PR replaces the generation of `vector.shuffle` with
`vector.interleave` in the i4 conversions in vector narrow type
emulation. The multi dimensional semantics of `vector.interleave` allow
us to enable these conversion emulations also for multi dimensional
vectors.
This PR adds an optional bitwidth parameter to the vector xfer op
flattening transformation so that the flattening doesn't happen if the
trailing dimension of the read/written vector is wider (in bits) than this
bitwidth (i.e., we are already able to fill at least one vector register
with that size).
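For example, assuming a target bitwidth of 128 (the value and names below are illustrative), a transfer like the following would be left unflattened because its trailing dimension already spans 8 x 32 = 256 bits:
```mlir
// Trailing dimension provides 256 bits (> 128), so flattening is skipped.
%v = vector.transfer_read %mem[%c0, %c0], %pad {in_bounds = [true, true]}
       : memref<16x8xf32>, vector<16x8xf32>
```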
Common backends (LLVM, SPIR-V) only support 1D vectors; the LLVM conversion
handles ND vectors (N >= 2) as `array<array<... vector>>` and the SPIR-V
conversion doesn't handle them at all at the moment. Sometimes it's
preferable to treat multidim vectors as linearized 1D vectors. Add a pass
to do this. Only constants and simple elementwise ops are supported for now.
@krzysz00 I've extracted your result type conversion code from
LegalizeToF32 and moved it to a common place.
Also, add a ConversionPattern class operating on traits.
This is part of
66347e516e
The regression in downstream projects is about the transfer_read patterns,
which need more investigation. Add support for transfer_write for
now.
This PR adds patterns to convert a sub-byte vector transpose into a
sequence of instructions that perform the transpose on i8 vector
elements. While this rewrite may not lead to the absolute peak
performance, it should ensure correctness when dealing with sub-byte
transposes.
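A sketch of the rewrite (the shapes and the use of sign extension here are illustrative):
```mlir
// Before: transpose directly on sub-byte (i4) elements.
%t = vector.transpose %v, [1, 0] : vector<8x16xi4> to vector<16x8xi4>

// After: extend to i8, transpose, and truncate back.
%ext = arith.extsi %v : vector<8x16xi4> to vector<8x16xi8>
%tr  = vector.transpose %ext, [1, 0] : vector<8x16xi8> to vector<16x8xi8>
%t   = arith.trunci %tr : vector<16x8xi8> to vector<16x8xi4>
```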
This PR adds new patterns to improve the generated vector code for the emulation of any conversion that has to go through an i4 -> i8 type extension (only signed extensions are supported for now). This will impact any i4 -> i8/i16/i32/i64 signed extensions as well as sitofp i4 -> f8/f16/f32/f64.
The asm code generated for the supported cases is significantly better after this PR for both x86 and aarch64.
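The kinds of conversions affected, for reference (shapes are illustrative):
```mlir
// Signed extensions that internally go through i4 -> i8:
%0 = arith.extsi %a : vector<16xi4> to vector<16xi32>
// ... as well as signed int-to-float conversions:
%1 = arith.sitofp %b : vector<16xi4> to vector<16xf32>
```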
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`
The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.
Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
This commit fixes `Dialect/Vector/vector-rewrite-narrow-types.mlir` when
running with `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`.
```
within split at llvm-project/mlir/test/Dialect/Vector/vector-rewrite-narrow-types.mlir:1 offset :118:8: error: 'arith.trunci' op operand type 'vector<3xi16>' and result type 'vector<3xi16>' are cast incompatible
%1 = vector.bitcast %0 : vector<16xi3> to vector<3xi16>
^
within split at llvm-project/mlir/test/Dialect/Vector/vector-rewrite-narrow-types.mlir:1 offset :118:8: note: see current operation: %48 = "arith.trunci"(%47) : (vector<3xi16>) -> vector<3xi16>
LLVM ERROR: IR failed to verify after pattern application
```
If a rewrite pattern returns "failure", it must not have modified the
IR. This commit fixes
`Dialect/Vector/vector-contract-to-outerproduct-transforms-unsupported.mlir`
when running with `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`.
```
* Pattern (anonymous namespace)::ContractionOpToOuterProductOpLowering : 'vector.contract -> ()' {
Trying to match "(anonymous namespace)::ContractionOpToOuterProductOpLowering"
** Insert : 'vector.transpose'(0x5625b3a8cb30)
** Insert : 'vector.transpose'(0x5625b3a8cbc0)
"(anonymous namespace)::ContractionOpToOuterProductOpLowering" result 0
} -> failure : pattern failed to match
} -> failure : pattern failed to match
LLVM ERROR: pattern returned failure but IR did change
```
Note: `vector-contract-to-outerproduct-transforms-unsupported.mlir` is
merged into `vector-contract-to-outerproduct-matvec-transforms.mlir`.
The `greedy pattern application failed` error is no longer produced.
This error indicates that the greedy pattern rewrite did not
converge; it does not mean that a pattern could not be applied.
Support distribution of `vector.transfer_read` ops when operands are
defined inside of the region of `warp_execute_on_lane_0` (except for the
buffer from which the op is reading).
Such IR was previously not supported. This commit changes the
implementation such that indices and the padding value are also
distributed.
This commit simplifies the implementation considerably: the original
implementation created a new `transfer_read` op and then checked if this
new op is valid. If not, the rewrite pattern failed. This was a bit
hacky. It was also a violation of the rewrite pattern API (detected by
`MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`) because the IR was modified,
but the pattern returned "failure".
This is to avoid confusion when dealing with reduction/combining kinds.
For example, see a recent PR comment:
https://github.com/llvm/llvm-project/pull/75846#discussion_r1430722175.
Previously, the reduction kind names were picked to mostly mirror the
names of the llvm vector reduction intrinsics:
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic. In
isolation, it was not clear whether `<maxf>` had `arith.maxnumf` or
`arith.maximumf` semantics. The new reduction kind names map 1:1 to
arith ops, which makes it easier to tell/look up their semantics.
Because both the vector and the gpu dialect depend on the arith dialect,
it's more natural to align names with those in arith than with the
lowering to llvm intrinsics.
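For example, with the new names the semantics are unambiguous at a glance:
```mlir
// Maps to arith.maximumf (NaN-propagating) semantics:
%0 = vector.reduction <maximumf>, %v : vector<8xf32> into f32
// Maps to arith.maxnumf (NaN-ignoring) semantics:
%1 = vector.reduction <maxnumf>, %w : vector<8xf32> into f32
```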
Issue: https://github.com/llvm/llvm-project/issues/72354
The number of vector elements considered 'small' enough to extract is
parameterized.
This is to avoid going into the specialized reduction lowering when one
or two arith ops can do the job. Targets without dedicated reduction
intrinsics can use that as an emulation path too.
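A sketch of what the extract-based lowering looks like for a tiny reduction (the threshold and shapes here are illustrative):
```mlir
// Before:
%r = vector.reduction <add>, %v : vector<2xf32> into f32

// After: two extracts and a single arith op instead of a reduction intrinsic.
%e0 = vector.extract %v[0] : f32 from vector<2xf32>
%e1 = vector.extract %v[1] : f32 from vector<2xf32>
%r  = arith.addf %e0, %e1 : f32
```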
Depends on https://github.com/llvm/llvm-project/pull/75846.
For vectors with either a leading or a trailing unit dim, this replaces:
elementwise(a, b)
with:
sc_a = shape_cast(a)
sc_b = shape_cast(b)
res = elementwise(sc_a, sc_b)
return shape_cast(res)
The newly inserted shape_cast Ops drop the unit dim before the elementwise
Op and restore it afterwards. Vectors `a` and `b` are required to have
rank > 1.
Example:
```mlir
%mul = arith.mulf %B_row, %A_row : vector<1x[4]xf32>
%cast = vector.shape_cast %mul : vector<1x[4]xf32> to vector<[4]xf32>
```
gets converted to:
```mlir
%B_row_sc = vector.shape_cast %B_row : vector<1x[4]xf32> to vector<[4]xf32>
%A_row_sc = vector.shape_cast %A_row : vector<1x[4]xf32> to vector<[4]xf32>
%mul = arith.mulf %B_row_sc, %A_row_sc : vector<[4]xf32>
%mul_sc = vector.shape_cast %mul : vector<[4]xf32> to vector<1x[4]xf32>
%cast = vector.shape_cast %mul_sc : vector<1x[4]xf32> to vector<[4]xf32>
```
In practice, the bottom two shape_casts will be folded away.
Add a configuration option to allow vector distribution with multiple
elements written by a single lane.
This is so that we can perform vector multi-reduction with multiple
results per workgroup.