Commit Graph

172 Commits

Author SHA1 Message Date
Benjamin Kramer
4d228e1ebd [mlir][vector] Escape variable usage in test
Otherwise the shell might expand this in the command line.
2024-10-17 12:43:32 +02:00
Andrzej Warzyński
3187a4917d [mlir][vector] Add more tests for ConvertVectorToLLVM (8/n) (#111997)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:

* `vector.transfer_read`,
* `vector.transfer_write`.
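
As a rough illustration of the kind of case being added (a sketch only; the function name, shapes and in_bounds setting are assumed, not copied from the test file):

```mlir
// Scalable 1-D transfer_read from a dynamically shaped memref (sketch).
func.func @transfer_read_1d_scalable(%mem: memref<?xf32>, %idx: index) -> vector<[4]xf32> {
  %pad = arith.constant 0.0 : f32
  %0 = vector.transfer_read %mem[%idx], %pad {in_bounds = [true]} : memref<?xf32>, vector<[4]xf32>
  return %0 : vector<[4]xf32>
}
```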

In addition:

* Duplicate tests from "vector-mask-to-llvm.mlir" are removed.
* Tests for xfer_read/xfer_write are moved to a newly created test file,
  "vector-xfer-to-llvm.mlir". This follows an existing pattern among
  VectorToLLVM conversion tests.
* Tests that test both xfer_read and xfer_write have their names updated
  to capture that (e.g. @transfer_read_1d_mask ->
  @transfer_read_write_1d_mask)
* @transfer_write_1d_scalable_mask and @transfer_read_1d_scalable_mask
  are re-written as @transfer_read_write_1d_mask_scalable. This is to
  make it clear that this case is meant to complement
  @transfer_read_write_1d_mask.
* @transfer_write_tensor is updated to also test xfer_read.
2024-10-15 13:15:36 +01:00
Andrzej Warzyński
f7eb271542 [mlir][vector] Add more tests for ConvertVectorToLLVM (7/n) (#111895)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.fma
  * vector.reduce
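
For reference, minimal scalable forms of these ops (a sketch; SSA names are assumed, and the op behind `vector.reduce` is spelled `vector.reduction` in the dialect):

```mlir
// Scalable FMA and reduction (sketch; operands assumed to be defined elsewhere).
%fma = vector.fma %a, %b, %c : vector<[8]xf32>
%red = vector.reduction <add>, %v : vector<[4]xf32> into f32
```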
2024-10-11 14:36:26 +01:00
Andrzej Warzyński
f58e85a972 [mlir][vector] Add more tests for ConvertVectorToLLVM (6/n) (#111121)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * `vector.insert_strided_slice`

With this change, for every test with fixed-width vectors, there should
be a corresponding example with scalable vectors (for
`vector.insert_strided_slice`). In addition:
  * Test function names are updated to more accurately reflect the case
    being exercised (e.g. `@insert_strided_index_slice1` ->
    `@insert_strided_index_slice_index_2d_into_3d`)
  * For consistency, took the liberty of updating some of the function
    names for `vector.extract_strided_slice`
  * `@insert_strided_slice_scalable` is effectively replaced with
    `@insert_strided_slice_f32_2d_into_3d_scalable`
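
For reference, a scalable `vector.insert_strided_slice` of the kind now covered looks roughly like this (names and shapes are assumed; the scalable dim of the slice must match the corresponding dim of the destination):

```mlir
// Insert a scalable 1-D slice into a 2-D vector with a scalable trailing dim (sketch).
%0 = vector.insert_strided_slice %src, %dst
  {offsets = [0, 0], strides = [1]} : vector<[4]xf32> into vector<2x[4]xf32>
```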
2024-10-07 10:05:16 +01:00
Longsheng Mou
a4b0153c4f [mlir][vector] Support for extracting 1-element vectors in VectorExtractOpConversion (#107549)
This patch adds support for converting `vector.extract` ops that extract
1-element vectors into LLVM, fixing a crash in such cases,
e.g. `vector.extract %1[0] : vector<1xf32> from vector<2xf32>`.
Fixes #61372.
2024-09-11 17:10:58 +08:00
Longsheng Mou
a8f3d30312 [mlir] Add dependent TensorDialect to ConvertVectorToLLVM pass (#108045)
This patch registers the tensor dialect as a dependency of the
ConvertVectorToLLVM pass, which fixes a crash when `vector.transfer_write`
is used with a dynamic tensor type. The MaterializeTransferMask pattern
would call `vector::createOrFoldDimOp`, which creates a `tensor.dim`
operation.

Fixes #107805.
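
A hypothetical reproducer sketch (function name, shapes and in_bounds are assumed, not taken from the issue): an out-of-bounds 1-D transfer_write into a dynamically sized tensor, where materializing the transfer mask needs the dynamic dimension size and hence a `tensor.dim` op.

```mlir
// Sketch only: a transfer_write into a dynamic tensor that needs a materialized mask.
func.func @write_dyn_tensor(%vec: vector<4xf32>, %dest: tensor<?xf32>, %idx: index) -> tensor<?xf32> {
  %0 = vector.transfer_write %vec, %dest[%idx] {in_bounds = [false]} : vector<4xf32>, tensor<?xf32>
  return %0 : tensor<?xf32>
}
```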
2024-09-11 17:08:44 +08:00
Andrzej Warzyński
a9c71d3665 [mlir][vector] Add more tests for ConvertVectorToLLVM (5/n) (#106510) 2024-09-02 12:19:00 +01:00
Maciej Gabka
95d2d1cba0 Move stepvector intrinsic out of experimental namespace (#98043)
This patch moves the stepvector intrinsic out of the experimental
namespace.

This intrinsic has existed in LLVM for several years and is widely used.
2024-08-28 12:48:20 +01:00
Hugo Trachino
749ba7f6b2 [mlir][vector] Add more tests for ConvertVectorToLLVM (5/n) (#104784)
This patch disambiguates test names for some of the Vector-To-LLVM
conversion tests.
Covers the following Ops:
  * vector.extractelement
  * vector.extract
  * vector.insertelement
  * vector.insert
  
1. Tests targeting `vector.{insert|extract}` ops no longer have names like
`{insert|extract}_element*`, which were easy to confuse with tests
targeting the `vector.{insert|extract}element` ops (see the short example
after this list).
2. Tests mention the type of the target/source buffer. e.g.
`@extractelement` => `@extractelement_from_vec_1d`
3. Align LIT lines consistently with other tests.
4. Tests with a different type for position have a name updated
accordingly. `@extractelement_index` =>`@extractelement_index_position`
5. Tests with a dynamic value for position have a name updated
accordingly. `@extract_element_with_value_1d`
=>`@extract_scalar_dynamic_position_from_vec_1d`
6. Added the scalable flavour of the tests
`insert_scalar_into_vec_2d_dynamic_position` and
`@extract_scalar_from_vec_2d_dynamic_position`
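
To illustrate the distinction from point 1 (a sketch only; values and shapes are assumed, not taken from the test file):

```mlir
// vector.extractelement takes its (dynamic) position as an SSA operand ...
%e = vector.extractelement %v[%i : i32] : vector<4xf32>
// ... whereas vector.extract spells the position inside the op itself.
%s = vector.extract %v[1] : f32 from vector<4xf32>
```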
2024-08-21 08:55:50 +01:00
Andrzej Warzyński
6fceb3e865 [mlir][vector] Add more tests for ConvertVectorToLLVM (4/n) (#103391)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.insertelement
  * vector.insert

I have also renamed some test functions from `@insert_element{}` to
`@insertelement{}` to make a clearer distinction between
tests for `vector.insertelement` (tested by `@insertelement{}`) and
`vector.insert` (tested by `@insert_element{}`).
2024-08-16 16:48:16 +01:00
Andrzej Warzyński
f0f5afe968 [mlir][vector] Add more tests for ConvertVectorToLLVM (3/n) (#102854)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.extractelement
  * vector.extract

I have also renamed some test functions from `@extract_element{}` to
`@extractelement{}` to make a clearer distinction between
tests for `vector.extractelement` (tested by `@extractelement{}`) and
`vector.extract` (tested by `@extract_element{}`).
2024-08-13 13:03:35 +01:00
Andrzej Warzyński
7e175b307e [mlir][vector] Add more tests for ConvertVectorToLLVM (2/n) (#102203)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.outerproduct
2024-08-09 09:06:57 +01:00
Andrzej Warzyński
22a130220c [mlir][vector] Add more tests for ConvertVectorToLLVM (1/n) (#101936)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.bitcast
  * vector.broadcast

Note, this has uncovered some missing logic in `BroadcastOpLowering`.
This PR fixes the most basic cases where the scalable flags were dropped
and the generated code was incorrect. Also, the conditions in
`vector::isBroadcastableTo` are relaxed to allow cases like this:
```mlir
%0 = vector.broadcast %arg0 : vector<1xf32> to vector<[4]xf32>
```

The `BroadcastOpLowering` pattern is effectively disabled for scalable
vectors in more complex cases where an SCF loop would be required to
loop over the scalable dims, e.g.:
```mlir
 %0 = vector.broadcast %arg0 : vector<[4]x1x2xf32> to vector<[4]x3x2xf32>
```

These cases are marked as "Stretch not at start" in the code. In those
cases, support for scalable vectors is left as a TODO.
2024-08-08 15:57:36 +01:00
Cullen Rhodes
1e7d6d3455 [mlir][vector] Propagate scalability to gather/scatter ptrs vector (#97584)
In convert-vector-to-llvm, the first operand to the masked.gather (and
scatter) intrinsic (the vector of pointers holding all memory addresses
to read) has a fixed vector type.

This may result in intrinsics where the scalable flag has been dropped:
```
  %0 = llvm.intr.masked.gather %1, %2, %3 {alignment = 4 : i32}
    : (!llvm.vec<4 x ptr>, vector<[4]xi1>, vector<[4]xi32>) -> vector<[4]xi32>
```
Fortunately the operand is overloaded on the result type, so we end up
with the correct IR when translating to LLVM IR, but the intermediate
MLIR is still incorrect. This patch fixes it by propagating scalability.
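
With scalability propagated, the same example should presumably use a scalable vector-of-pointers type instead:

```mlir
%0 = llvm.intr.masked.gather %1, %2, %3 {alignment = 4 : i32}
  : (!llvm.vec<? x 4 x ptr>, vector<[4]xi1>, vector<[4]xi32>) -> vector<[4]xi32>
```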
2024-07-09 09:06:25 +01:00
Cullen Rhodes
67b302c52f [mlir][vector] Add vector.step operation (#96776)
This patch adds a new vector.step operation to the Vector dialect. It
produces a linear sequence of index values from 0 to N, where N is the
number of elements in the result vector, and can be used to create
vectors of indices.
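
For example (a minimal sketch):

```mlir
// Fixed-width and scalable step vectors.
%fixed = vector.step : vector<4xindex>
%scalable = vector.step : vector<[4]xindex>
```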

It supports both fixed-width and scalable vectors. For fixed-width vectors
the canonical representation is `arith.constant dense<[0, .., N]>`. A
scalable step cannot be represented as a constant and is lowered to the
`llvm.experimental.stepvector` intrinsic [1].

This op enables scalable vectorization of linalg.index ops, see #96778. It can
also be used in the SparseVectorizer in place of the lower-level stepvector
intrinsic, see [2] (patch to follow).

[1] https://llvm.org/docs/LangRef.html#llvm-experimental-stepvector-intrinsic
[2] acf675b63f/mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp (L385-L388)
2024-07-04 08:57:02 +01:00
Matthias Springer
c6ff2446a4 [mlir][vector] Add vector.from_elements op (#95938)
This commit adds a new operation to the vector dialect:
`vector.from_elements`

The op constructs a new vector from a given list of scalar values. It is
similar to `tensor.from_elements`.
```mlir
%0 = vector.from_elements %a, %b, %c, %a, %a, %a : vector<2x3xf32>
```

Constructing a new vector from elements was tedious before this op
existed: a typical way was to define an `arith.constant ... :
vector<...>`, followed by a chain of `vector.insert`.

Folders/canonicalizations are added that can fold `vector.extract` ops
and convert the `vector.from_elements` op into a `vector.splat` op.

The LLVM lowering generates an `llvm.mlir.undef`, followed by a sequence
of scalar insertions in the form of `llvm.insertelement`. Only 0-D and
1-D vectors are currently supported in the LLVM lowering.
2024-06-19 09:58:37 +02:00
Matthias Springer
13d983e730 [mlir][Transforms][NFC] Dialect Conversion: Resolve insertion point TODO (#95653)
Remove a TODO in the dialect conversion code base when materializing
unresolved conversions:
```
// FIXME: Determine a suitable insertion location when there are multiple
// inputs.
```

The implementation used to select an insertion point as follows:
- If the cast has exactly one operand: right after the definition of the
SSA value.
- Otherwise: right before the cast op.

However, it is not necessary to change the insertion point. Unresolved
materializations (`UnrealizedConversionCastOp`) are built during
`buildUnresolvedArgumentMaterialization` or
`buildUnresolvedTargetMaterialization`. In the former case, the op is
inserted at the beginning of the block. In the latter case, only one
operand is supported in the dialect conversion, and the op is inserted
right after the definition of the SSA value. I.e., the
`UnrealizedConversionCastOp` is already inserted at the right place and
it is not necessary to change the insertion point for the resolved
materialization op.

Note: The IR changes slightly because the
`unrealized_conversion_cast` ops at the beginning of a block are no
longer doubly-inverted (by setting the insertion point to the beginning of
the block when inserting the `unrealized_conversion_cast` and again when
inserting the resolved conversion op). All affected test cases were
fixed by using `CHECK-DAG` instead of `CHECK`.

Also improve the quality of multiple test cases that did not check for
the correct operands.

Note: This commit is in preparation of decoupling the
argument/source/target materialization logic of the type converter from
the dialect conversion (to reduce its complexity and make that
functionality usable from a new dialect conversion driver).
2024-06-17 19:56:40 +02:00
Zhaoshi Zheng
abcbbe7114 [MLIR][VectorToLLVM] Handle scalable dim in createVectorLengthValue() (#93361)
LLVM's Vector Predication Intrinsics require an explicit vector length
parameter:
https://llvm.org/docs/LangRef.html#vector-predication-intrinsics.

For a scalable vector type, this should be calculated as VectorScaleOp
multiplied by the base vector length, e.g. for <[4]xf32> we should return
vscale * 4.
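
Expressed at the vector/arith level (a sketch; the pass itself builds the equivalent LLVM dialect ops):

```mlir
// Vector length for vector<[4]xf32>: vscale * 4.
%vscale = vector.vscale
%c4 = arith.constant 4 : index
%vl = arith.muli %vscale, %c4 : index
```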
2024-06-13 09:06:05 -07:00
Mubashar Ahmad
b87a80d4eb [mlir][vector] Add n-d deinterleave lowering (#94237)
This patch implements the lowering of vector.deinterleave for
n-dimensional vectors. The process involves unrolling the n-D vector
into a series of one-dimensional vectors. The deinterleave operation is
then applied to these vectors.

From:
```
%0, %1 = vector.deinterleave %a : vector<2x8xi8> -> vector<2x4xi8>
```

To:
```
%cst = arith.constant dense<0> : vector<2x4xi32>
%0 = vector.extract %arg0[0] : vector<8xi32> from vector<2x8xi32>
%res1, %res2 = vector.deinterleave %0 : vector<8xi32> -> vector<4xi32>
%1 = vector.insert %res1, %cst [0] : vector<4xi32> into vector<2x4xi32>
%2 = vector.insert %res2, %cst [0] : vector<4xi32> into vector<2x4xi32>
%3 = vector.extract %arg0[1] : vector<8xi32> from vector<2x8xi32>
%res1_0, %res2_1 = vector.deinterleave %3 : vector<8xi32> -> vector<4xi32>
%4 = vector.insert %res1_0, %1 [1] : vector<4xi32> into vector<2x4xi32>
%5 = vector.insert %res2_1, %2 [1] : vector<4xi32> into vector<2x4xi32>
...etc.
```
2024-06-07 10:57:00 +01:00
Han-Chung Wang
0ea1271ee1 [mlir][vector] Add support for unrolling vector.bitcast ops. (#94064)
The revision unrolls vector.bitcast like:

```mlir
%0 = vector.bitcast %arg0 : vector<2x4xi32> to vector<2x2xi64>
```

to

```mlir
%cst = arith.constant dense<0> : vector<2x2xi64>
%0 = vector.extract %arg0[0] : vector<4xi32> from vector<2x4xi32>
%1 = vector.bitcast %0 : vector<4xi32> to vector<2xi64>
%2 = vector.insert %1, %cst [0] : vector<2xi64> into vector<2x2xi64>
%3 = vector.extract %arg0[1] : vector<4xi32> from vector<2x4xi32>
%4 = vector.bitcast %3 : vector<4xi32> to vector<2xi64>
%5 = vector.insert %4, %2 [1] : vector<2xi64> into vector<2x2xi64>
```

Scalable vectors are not supported because of a limitation of
`vector::createUnrollIterator`: the targetRank could mismatch the final
rank during unrolling, and there is no direct way to query the final
rank from the object.
2024-06-03 16:39:52 -07:00
Mubashar Ahmad
bc946f5287 [mlir][vector] Add 1D vector.deinterleave lowering (#93042)
This patch implements the lowering of vector.deinterleave 
for 1D vectors.

For fixed vector types, the operation is lowered to two
LLVM shufflevector operations: one for the even-indexed
elements and the other for the odd-indexed elements. A poison
value is used to satisfy the second operand of each shufflevector.
    
For scalable vectors, the LLVM vector.deinterleave2
intrinsic is used for the lowering; the two results are
then extracted from the struct returned by the intrinsic.
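
A sketch of the fixed-width case (SSA names are assumed; the shuffle masks pick even and odd lanes, with poison as the unused second operand):

```mlir
%poison = llvm.mlir.poison : vector<8xi32>
%even = llvm.shufflevector %v, %poison [0, 2, 4, 6] : vector<8xi32>
%odd  = llvm.shufflevector %v, %poison [1, 3, 5, 7] : vector<8xi32>
```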
2024-05-30 09:42:35 +01:00
Jakub Kuderski
714aee31e1 [mlir][vector] Add result type to interleave assembly format (#93392)
This makes it more obvious what the result type is, especially in less
trivial cases such as 0-D inputs producing 1-D results, or interaction
with scalable vector types. Note that `vector.deinterleave` uses the same
format with an explicit result type.

Also improve examples and clean up surrounding code.
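
With the explicit result type, a scalable example reads:

```mlir
%0 = vector.interleave %a, %b : vector<[4]xf32> -> vector<[8]xf32>
```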
2024-05-27 11:03:36 -04:00
Maciej Gabka
bfc0317153 Move several vector intrinsics out of experimental namespace (#88748)
This patch moves the following intrinsics out of the experimental
namespace:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice

All these intrinsics have existed in LLVM for more than a year and are
widely used, so they should no longer be considered experimental.
2024-04-29 10:16:45 +01:00
Diego Caballero
42a6ad7bad [mlir][Vector] Fix n-D vector.extract/insert lowering to LLVM (#87591)
The lowering of n-D vector.extract/insert ops to LLVM is not supported
but if one of these accidentally reaches the vector-to-llvm conversion
patterns, we end up with a kind of puzzling crash. This PR fixes that
crash and gracefully bails out in those cases.
2024-04-05 15:01:20 -07:00
Benjamin Maxwell
a1a6860314 [mlir][VectorOps] Add unrolling for n-D vector.interleave ops (#80967)
This unrolls n-D vector.interleave ops like:

```mlir
vector.interleave %i, %j : vector<6x3xf32>
```

To a sequence of 1-D operations:
```mlir
%i_0 = vector.extract %i[0] 
%j_0 = vector.extract %j[0] 
%res_0 = vector.interleave %i_0, %j_0 : vector<3xf32>
vector.insert %res_0, %result[0] :
// ... repeated x6
```

The 1-D operations can then be directly lowered to LLVM.

Depends on: #80966
2024-02-20 14:33:33 +00:00
Benjamin Maxwell
79ce2c93ae [mlir][VectorOps] Add conversion of 1-D vector.interleave ops to LLVM (#80966)
The 1-D case directly maps to LLVM intrinsics. The n-D case will be
handled by unrolling to 1-D first (in a later patch).

Depends on: #80965
2024-02-13 10:47:33 +00:00
Andrzej Warzyński
9ddbcee25e [mlir][vector] Extend vector.{insert|extract}_strided_slice (#79052)
Extends `vector.insert_strided_slice` and `vector.extract_strided_slice`
to allow scalable input and output vectors. For scalable sizes, the
corresponding slice size has to match the corresponding dimension in the
output/input vector (insert/extract, respectively).

This is supported:
```mlir
vector.extract_strided_slice %1 {
  offsets = [0, 3, 0],
  sizes = [1, 1, 4],
  strides = [1, 1, 1] } : vector<1x4x[4]xi32> to vector<1x1x[4]xi32>
```

This is not supported:
```mlir
vector.extract_strided_slice %1 {
  offsets = [0, 3, 0],
  sizes = [1, 1, 2],
  strides = [1, 1, 1] } : vector<1x4x[4]xi32> to vector<1x1x[2]xi32>
```
2024-01-25 19:01:28 +00:00
Krzysztof Drewniak
5cfe24eee4 [mlir][Vector] Add nontemporal attribute, mirroring memref (#76752)
Since vector loads and stores from scalar memrefs translate to
llvm.load/store, add the ability to tag said loads and stores as
nontemporal. This mirrors functionality available in memref.load/store.
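
A sketch of the tagged ops, assuming the attribute spelling mirrors `memref.load`/`memref.store` (names and shapes are assumed):

```mlir
// Nontemporal vector load/store (sketch).
%v = vector.load %mem[%i] {nontemporal = true} : memref<64xf32>, vector<4xf32>
vector.store %v, %mem[%i] {nontemporal = true} : memref<64xf32>, vector<4xf32>
```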
2024-01-09 11:05:20 -06:00
Matthias Springer
bb6d5c2200 [mlir][Transforms] GreedyPatternRewriteDriver: Do not CSE constants during iterations (#75897)
The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply
rewrite patterns to ops. It has special handling for constants: they are
CSE'd and sometimes moved to parent regions to allow for additional
CSE'ing. This happens in `OperationFolder`.

To allow for efficient CSE'ing, `OperationFolder` maintains an internal
lookup data structure to find the existing constant ops with the same
value for each `IsolatedFromAbove` region:
```c++
/// A mapping between an insertion region and the constants that have been
/// created within it.
DenseMap<Region *, ConstantMap> foldScopes;
```

Rewrite patterns are allowed to modify operations. In particular, they
may move operations (including constants) from one region to another
one. Such an IR rewrite can make the above lookup data structure
inconsistent.

We encountered such a bug in a downstream project. This bug materialized
in the form of an op that uses the result of a constant op from a
different `IsolatedFromAbove` region (that is not accessible).

This commit changes the behavior of the `GreedyPatternRewriteDriver`
such that `OperationFolder` is used to CSE constants at the beginning of
each iteration (as the worklist is populated), but no longer during an
iteration. `OperationFolder` is no longer used after populating the
worklist, so we do not have to care about inconsistent state in the
`OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver`
now performs the op folding by itself instead of calling
`OperationFolder::tryToFold`.

This changes the order of constant ops in test cases, but not the
region in which they appear. All broken test cases were fixed by turning
`CHECK` into `CHECK-DAG`.

Alternatives considered: The state of `OperationFolder` could be
partially invalidated with every `notifyOperationModified` notification.
That is more fragile than the solution in this commit because incorrect
rewriter API usage can lead to missing notifications and hard-to-debug
`IsolatedFromAbove` violations. (It did not fix the above-mentioned bug in
a downstream project, which could be due to incorrect rewriter API usage
or due to another conceptual problem that I missed.) Moreover, ops are
frequently getting modified during a greedy pattern rewrite, so we would
likely keep invalidating large parts of the state of `OperationFolder`
over and over.

Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant
ops are no longer folded during a greedy pattern rewrite. If you rely on
folding (and rematerialization) of constant ops during a greedy pattern
rewrite, turn the folder into a pattern.
2024-01-05 09:22:18 +01:00
Matthias Springer
c99670ba51 [mlir][vector] LoadOp/StoreOp: Allow 0-D vectors (#76134)
Similar to `vector.transfer_read`/`vector.transfer_write`, allow 0-D
vectors.

This commit fixes
`mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir`
when verifying the IR after each pattern (#74270). That test produces a
temporary 0-D load/store op.
2023-12-22 11:12:58 +09:00
Jakub Kuderski
560564f51c [mlir][vector][gpu] Align minf/maxf reduction kind names with arith (#75901)
This is to avoid confusion when dealing with reduction/combining kinds.
For example, see a recent PR comment:
https://github.com/llvm/llvm-project/pull/75846#discussion_r1430722175.

Previously, they were picked to mostly mirror the names of the llvm
vector reduction intrinsics:
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic. In
isolation, it was not clear if `<maxf>` has `arith.maxnumf` or
`arith.maximumf` semantics. The new reduction kind names map 1:1 to
arith ops, which makes it easier to tell/look up their semantics.
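
For example (a sketch with assumed values), the renamed kinds read:

```mlir
%a = vector.reduction <maxnumf>, %v : vector<4xf32> into f32   // arith.maxnumf semantics
%b = vector.reduction <maximumf>, %v : vector<4xf32> into f32  // arith.maximumf semantics
```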

Because both the vector and the gpu dialect depend on the arith dialect,
it's more natural to align names with those in arith than with the
lowering to llvm intrinsics.

Issue: https://github.com/llvm/llvm-project/issues/72354
2023-12-20 00:14:43 -05:00
Jakub Kuderski
a528cee224 [mlir][vector] Improve makeArithReduction expansion (#75846)
Propagate fast math flags.
Distinguish `minf`/`maxf` and `minimumf`/`maximumf`.

Required for future patterns in
https://github.com/llvm/llvm-project/pull/75727.
2023-12-18 17:47:46 -05:00
Christian Ulmann
ceb4dc4477 [MLIR][VectorToLLVM] Remove typed pointer support (#71075)
This commit removes the support for lowering Vector to LLVM dialect with
typed pointers. Typed pointers have been deprecated for a while now and
it's planned to soon remove them from the LLVM dialect.

Related PSA:
https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502
2023-11-03 11:16:11 +01:00
Benjamin Maxwell
3be3883e6d [mlir][VectorOps] Support string literals in vector.print (#68695)
Printing strings within integration tests is currently quite annoyingly
verbose, and can't be tucked into shared helpers as the types depend on
the length of the string:

```
llvm.mlir.global internal constant @hello_world("Hello, World!\0")

func.func @entry() {
  %0 = llvm.mlir.addressof @hello_world : !llvm.ptr<array<14 x i8>>
  %1 = llvm.mlir.constant(0 : index) : i64
  %2 = llvm.getelementptr %0[%1, %1]
    : (!llvm.ptr<array<14 x i8>>, i64, i64) -> !llvm.ptr<i8>
  llvm.call @printCString(%2) : (!llvm.ptr<i8>) -> ()
  return
}
```

So this patch adds a simple extension to `vector.print` to simplify
this:
```
func.func @entry() {
   // Print a vector of characters ;)
   vector.print str "Hello, World!"
   return
}
```

Most of the logic for this is now shared with `cf.assert` which already
does something similar.

Depends on #68694
2023-10-24 09:34:14 +01:00
Quinn Dawkins
78c49743c7 [MLIR][Vector] Allow non-default memory spaces in gather/scatter lowerings (#67500)
GPU targets can gather on non-default address spaces (e.g. global), so
this removes the check for the default memory space.
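
For example (a sketch with assumed names and shapes; memory space 1 stands in for a GPU global address space):

```mlir
%0 = vector.gather %base[%c0][%idxs], %mask, %pass
  : memref<16xf32, 1>, vector<4xi32>, vector<4xi1>, vector<4xf32> into vector<4xf32>
```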
2023-09-28 19:20:32 -04:00
Cullen Rhodes
9816edc9f3 [mlir][vector] add result type to vector.extract assembly format (#66499)
The vector.extract assembly format currently only contains the source
type, for example:

  %1 = vector.extract %0[1] : vector<3x7x8xf32>

it's not immediately obvious if this is the source or result type. This
patch improves the assembly format to make this clearer, so the above
becomes:

  %1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>
2023-09-28 11:11:16 +01:00
Diego Caballero
98f6289a34 [mlir][Vector] Add support for Value indices to vector.extract/insert
`vector.extract/insert` ops only support constant indices. This PR
extends them so that arbitrary values can be used instead.
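
In the current assembly format, a sketch of dynamic positions (SSA names are assumed):

```mlir
// Positions given as SSA values rather than constants.
%e = vector.extract %v[%i] : f32 from vector<8xf32>
%w = vector.insert %s, %v[%i] : f32 into vector<8xf32>
```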

This work is part of the RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops

Differential Revision: https://reviews.llvm.org/D155034
2023-09-22 00:39:32 +00:00
Nicolas Vasilache
1b8b556443 [mlir][Vector] Add fastmath flags to vector.reduction (#66905)
This revision pipes the fastmath attribute support through the
vector.reduction op. This seemingly simple first step already requires
quite some genuflexions, file and builder reorganization. In the
process, retire the boolean reassoc flag deep in the LLVM dialect
builders and just use the fastmath attribute.

During conversions, templated builders for predicated intrinsics are
partially cleaned up. In the future, to finalize the cleanups, one
should consider adding fastmath to the VPIntrinsic ops.
2023-09-20 16:57:20 +02:00
Benjamin Maxwell
2f11ce5579 [mlir][VectorOps] Extend vector.constant_mask to support 'all true' scalable dims (#66638)
This extends `vector.constant_mask` so that mask dim sizes that
correspond to a scalable dimension are treated as if they're implicitly
multiplied by vscale. Currently this is limited to mask dim sizes of 0
or the size of the dim/vscale. This allows constant masks to represent
all true and all false scalable masks (and some variations):

```
// All true scalable mask
%mask = vector.constant_mask [8] : vector<[8]xi1>

// All false scalable mask
%mask = vector.constant_mask [0] : vector<[8]xi1>

// First two scalable rows
%mask = vector.constant_mask [2,4] : vector<4x[4]xi1>
```
2023-09-20 14:54:42 +01:00
Benjamin Maxwell
665995b918 [mlir][Conversion] Allow lowering to fixed arrays of scalable vectors
This allows lowering vector types like: vector<3x[4]> or vector<3x2x[4]>
to LLVM IR, i.e. vectors where the trailing dim is scalable.
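
For instance (element type assumed), such a type converts to an LLVM array of scalable vectors:

```mlir
// vector<3x[4]xf32> converts to:
!llvm.array<3 x vector<[4]xf32>>
```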

This is contingent on:
https://discourse.llvm.org/t/rfc-enable-arrays-of-scalable-vector-types/72935

More tests will be added in later patches, however, some MLIR fixes are
needed first.

Depends on: D158517

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D158752
2023-09-15 09:33:18 +00:00
Daniil Dudkin
8f5d519458 [mlir][vector] Implement Workaround Lowerings for Masked fm**imum Reductions
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

Within LLVM, there are no masked reduction counterparts for vector reductions such as `fmaximum` and `fminimum`.
More information can be found here: https://github.com/llvm/llvm-project/issues/64940#issuecomment-1690694156.

To address this issue in MLIR, where we need to generate appropriate lowerings for these cases, we employ regular non-masked intrinsics.
However, we modify the input vector using the `arith.select` operation to effectively deactivate undesired elements using a "neutral mask value".
The neutral mask value is the smallest possible value for the `fmaximum` reduction and the largest possible value for the `fminimum` reduction.
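
A sketch of the expansion for a masked `fmaximum` reduction (SSA names are assumed; -inf is used here as the neutral value for f32):

```mlir
// Deactivate masked-off lanes with the neutral value, then reduce unmasked.
%neutral = arith.constant dense<0xFF800000> : vector<4xf32>  // -inf
%masked  = arith.select %mask, %input, %neutral : vector<4xi1>, vector<4xf32>
%result  = vector.reduction <maximumf>, %masked : vector<4xf32> into f32
```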

Depends on D158618

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D158773
2023-09-13 22:49:08 +00:00
Daniil Dudkin
709b27427b [mlir][vector] Bring back maxf/minf reductions
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

In line with the mentioned RFC, this patch  tackles tasks 2.3 and 2.4.
It adds LLVM conversions for the `maxf`/`minf` reductions to the non-NaN-propagating LLVM intrinsics.

Depends on D158618

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D158659
2023-09-13 22:49:07 +00:00
Daniil Dudkin
4a831250b8 [mlir][vector] Rename vector reductions: maxf → maximumf, minf → minimumf
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

Here, we are addressing task 2.1 from the plan, which involves renaming the vector reductions to align with the semantics of the corresponding LLVM intrinsics.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D158618
2023-09-13 22:49:07 +00:00
Daniil Dudkin
8a6e54c9b3 [mlir][arith] Rename operations: maxf → maximumf, minf → minimumf (#65800)
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

This commit addresses Task 1.2 of the mentioned RFC. By renaming these operations, we align their names with LLVM intrinsics that have corresponding semantics.
2023-09-11 22:02:19 -07:00
Benjamin Maxwell
ccef726d09 [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes (in VectorToLLVM)
This is a follow-on to D158753, and allows the lowering of a
transfer read/write of n-D vectors with a single trailing scalable dimension
to primitive vector ops.

The final conversion to LLVM depends on D158517 and D158752, without
these patches type conversion will fail (or an assert is hit in the LLVM
backend) if the final IR contains an array of scalable vectors.

This patch adds `transform.apply_patterns.vector.lower_create_mask`
which allows the lowering of vector.create_mask/constant_mask to be
tested independently of --convert-vector-to-llvm.

Reviewed By: c-rhodes, awarzynski, dcaballe

Differential Revision: https://reviews.llvm.org/D159482
2023-09-11 16:47:51 +00:00
Benjamin Maxwell
f36e909da0 [mlir][VectorOps] Use SCF for vector.print and allow scalable vectors
Reland of the original patch after updating the Python binding tests,
a few CUDA/GPU MLIR tests, and ensuring the assembly format is
round-trippable.

This patch splits the lowering of vector.print into first converting
an n-D print into a loop of scalar prints of the elements, then a second
pass that converts those scalar prints into the runtime calls. The
former is done in VectorToSCF and the latter in VectorToLLVM.

The main reason for this is to allow printing scalable vector types,
which are not possible to fully unroll at compile time, though this
also avoids fully unrolling very large vectors.

To allow VectorToSCF to add the necessary punctuation between vectors
and elements, a "punctuation" attribute has been added to vector.print.
This abstracts calling the runtime functions such as printNewline(),
without leaking the LLVM details into the higher abstraction levels.
For example:

  vector.print punctuation <comma>

lowers to

  llvm.call @printComma() : () -> ()

The output format and runtime functions remain the same, which avoids
the need to alter a large number of tests (aside from the pipelines).

Reviewed By: awarzynski, c-rhodes, aartbik

Differential Revision: https://reviews.llvm.org/D156519
2023-08-11 09:29:54 +00:00
Mehdi Amini
1b272d21c8 Revert "[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors"
This reverts commit 490dae26cb.

Bot is broken, seems like there is a problem of ambiguity in the parser.
2023-08-09 19:37:01 -07:00
Benjamin Maxwell
490dae26cb [mlir][VectorOps] Use SCF for vector.print and allow scalable vectors
Reland of the original patch after updating the Python binding tests and
a few CUDA/GPU MLIR tests.

This patch splits the lowering of vector.print into first converting
an n-D print into a loop of scalar prints of the elements, then a second
pass that converts those scalar prints into the runtime calls. The
former is done in VectorToSCF and the latter in VectorToLLVM.

The main reason for this is to allow printing scalable vector types,
which are not possible to fully unroll at compile time, though this
also avoids fully unrolling very large vectors.

To allow VectorToSCF to add the necessary punctuation between vectors
and elements, a "punctuation" attribute has been added to vector.print.
This abstracts calling the runtime functions such as printNewline(),
without leaking the LLVM details into the higher abstraction levels.
For example:

  vector.print <comma>

lowers to

  llvm.call @printComma() : () -> ()

The output format and runtime functions remain the same, which avoids
the need to alter a large number of tests (aside from the pipelines).

Reviewed By: awarzynski, c-rhodes, aartbik

Differential Revision: https://reviews.llvm.org/D156519
2023-08-09 11:47:18 +00:00
Benjamin Maxwell
b160442dd2 Revert "[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors"
This reverts commit 3875804a07.

This caused some test failures for the MLIR python bindings. Reverting
until those are addressed.
2023-08-09 09:54:05 +00:00
Benjamin Maxwell
3875804a07 [mlir][VectorOps] Use SCF for vector.print and allow scalable vectors
This patch splits the lowering of vector.print into first converting
an n-D print into a loop of scalar prints of the elements, then a second
pass that converts those scalar prints into the runtime calls. The
former is done in VectorToSCF and the latter in VectorToLLVM.

The main reason for this is to allow printing scalable vector types,
which are not possible to fully unroll at compile time, though this
also avoids fully unrolling very large vectors.

To allow VectorToSCF to add the necessary punctuation between vectors
and elements, a "punctuation" attribute has been added to vector.print.
This abstracts calling the runtime functions such as printNewline(),
without leaking the LLVM details into the higher abstraction levels.
For example:

  vector.print <comma>

lowers to

  llvm.call @printComma() : () -> ()

The output format and runtime functions remain the same, which avoids
the need to alter a large number of tests (aside from the pipelines).

Reviewed By: awarzynski, c-rhodes, aartbik

Differential Revision: https://reviews.llvm.org/D156519
2023-08-09 09:38:05 +00:00