clang-p2996

Author	SHA1	Message	Date
Sayan Saha	26722f5b61	[MLIR] Fix incorrect memref::DimOp canonicalization, add tensor::DimOp canonicalization (#84225 ) The current canonicalization of `memref.dim` operating on the result of `memref.reshape` into `memref.load` is incorrect as it doesn't check whether the `index` operand of `memref.dim` dominates the source `memref.reshape` op. It always introduces `memref.load` right after `memref.reshape` to ensure the `memref` is not mutated before the `memref.load` call. As a result, the following error is observed: ``` $> mlir-opt --canonicalize input.mlir func.func @reshape_dim(%arg0: memref<*xf32>, %arg1: memref<?xindex>, %arg2: index) -> index { %c4 = arith.constant 4 : index %reshape = memref.reshape %arg0(%arg1) : (memref<*xf32>, memref<?xindex>) -> memref<*xf32> %0 = arith.muli %arg2, %c4 : index %dim = memref.dim %reshape, %0 : memref<*xf32> return %dim : index } ``` results in: ``` dominator.mlir:22:12: error: operand #1 does not dominate this use %dim = memref.dim %reshape, %0 : memref<*xf32> ^ dominator.mlir:22:12: note: see current operation: %1 = "memref.load"(%arg1, %2) <{nontemporal = false}> : (memref<?xindex>, index) -> index dominator.mlir:21:10: note: operand defined here (op in the same block) %0 = arith.muli %arg2, %c4 : index ``` Properly fixing this issue requires a dominator analysis which is expensive to run within a canonicalization pattern. So, this patch fixes the canonicalization pattern by being more strict/conservative about the legality condition in which we perform this canonicalization. The more general pattern is also added to `tensor.dim`. Since tensors are immutable we don't need to worry about where to introduce the `tensor.extract` call after canonicalization.	2024-03-11 19:37:33 -07:00
James Newling	67ef4ae2c3	[MLIR][Tensor,MemRef] Fold expand_shape and collapse_shape if identity (#80658 ) Before: op verifiers failed if the input and output ranks were the same (i.e. no expansion or collapse). This behavior requires users of these shape ops to verify manually that they are not creating identity versions of these ops every time they build them -- problematic. This PR removes this strict verification, and introduces folders for the identity cases. The PR also removes the special case handling of rank-0 tensors for expand_shape and collapse_shape, as there doesn't seem to be any reason to treat them differently.	2024-03-12 10:11:58 +09:00
Matthias Springer	9efdccb26f	[mlir][memref] `memref.subview`: Verify result strides with rank reductions (#80158 ) This is a follow-up on #79865. Result strides are now also verified if the `memref.subview` op has rank reductions.	2024-02-02 10:17:55 +01:00
Matthias Springer	ce7cc723b9	[mlir][memref] `memref.subview`: Verify result strides The `memref.subview` verifier currently checks result shape, element type, memory space and offset of the result type. However, the strides of the result type are currently not verified. This commit adds verification of result strides for non-rank reducing ops and fixes invalid IR in test cases. Verification of result strides for ops with rank reductions is more complex (and there could be multiple possible result types). That is left for a separate commit. Also refactor the implementation a bit: * If `computeMemRefRankReductionMask` could not compute the dropped dimensions, there must be something wrong with the op. Return `FailureOr` instead of `std::optional`. * `isRankReducedMemRefType` did much more than just checking whether the op has rank reductions or not. Inline the implementation into the verifier and add better comments. * `produceSubViewErrorMsg` does not have to be templatized. * Fix comment and add additional assert to `ExpandStridedMetadata.cpp`, to make sure that the memref.subview verifier is in sync with the memref.subview -> memref.reinterpret_cast lowering. Note: This change is identical to #79865, but with a fixed comment and an additional assert in `ExpandStridedMetadata.cpp`. (I reverted #79865 in #80116, but the implementation was actually correct, just the comment in `ExpandStridedMetadata.cpp` was confusing.)	2024-01-31 09:28:53 +00:00
Matthias Springer	96c907dbce	Revert "[mlir][memref] `memref.subview`: Verify result strides" (#80116 ) Reverts llvm/llvm-project#79865 I think there is a bug in the stride computation in `SubViewOp::inferResultType`. (Was already there before this change.) Reverting this commit for now and updating the original pull request with a fix and more test cases.	2024-01-31 09:35:13 +01:00
Matthias Springer	db49319264	[mlir][memref] `memref.subview`: Verify result strides (#79865 ) The `memref.subview` verifier currently checks result shape, element type, memory space and offset of the result type. However, the strides of the result type are currently not verified. This commit adds verification of result strides for non-rank reducing ops and fixes invalid IR in test cases. Verification of result strides for ops with rank reductions is more complex (and there could be multiple possible result types). That is left for a separate commit. Also refactor the implementation a bit: * If `computeMemRefRankReductionMask` could not compute the dropped dimensions, there must be something wrong with the op. Return `FailureOr` instead of `std::optional`. * `isRankReducedMemRefType` did much more than just checking whether the op has rank reductions or not. Inline the implementation into the verifier and add better comments. * `produceSubViewErrorMsg` does not have to be templatized.	2024-01-31 09:14:48 +01:00
Benjamin Maxwell	fdf73e9495	[mlir][memref] Remove incorrect `memref.transpose` fold (#79809 ) This folded casts into `memref.transpose` without updating the result type of the transpose op, which resulted in IR that failed to verify for statically sized memrefs. i.e. ```mlir %cast = memref.cast %0 : memref<?x4xf32> to memref<?x?xf32> %transpose = memref.transpose %cast : memref<?x?xf32> to memref<?x?xf32> ``` would fold to: ```mlir // Fails verification: %transpose = memref.transpose %cast : memref<?x4xf32> to memref<?x?xf32> ```	2024-01-30 15:58:22 +00:00
Oleksandr "Alex" Zinenko	2798b72ae7	[mlir] introduce debug transform dialect extension (#77595 ) Introduce a new extension for simple print-debugging of the transform dialect scripts. The initial version of this extension consists of two ops that are printing the payload objects associated with transform dialect values. Similar ops were already available in the test extension and several downstream projects, and were extensively used for testing.	2024-01-12 13:24:02 +01:00
Felix Schneider	4619e21c72	[mlir][memref] Transpose: allow affine map layouts in result, extend folder (#76294 ) Currently, the `memref.transpose` verifier forces the result type of the Op to have an explicit `StridedLayoutAttr` via the method `inferTransposeResultType`. This means that the example Op given in the documentation is actually invalid because it uses an `AffineMap` to specify the layout. It also means that we can't "un-transpose" a transposed memref back to the implicit layout form, because the verifier will always enforce the explicit strided layout. This patch makes the following changes: 1. The verifier checks whether the canonicalized strided layout of the result Type is identical to the canonicalized inferred result type layout. This way, it's only important that the two Types have the same strided layout, not necessarily the same representation of it. 2. The folder is extended to support folding away the trivial case of identity permutation and to fold one transposition into another by composing the permutation maps.	2024-01-11 19:54:49 +01:00
Rik Huijzer	672f1a036a	[mlir][memref] Make `LoadOp::verify` error more clear (#75831 ) While debugging https://github.com/llvm/llvm-project/issues/71326, the `LoadOp::verify` code and error were very confusing. This PR improves that. This code was part of the reverted PR https://github.com/llvm/llvm-project/pull/75519. Fixing the `-convert-vector-to-scf` issue is going to take a bit longer and this code was out of scope anyway. Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>	2023-12-18 18:41:05 +01:00
Rik Huijzer	9f5afc3de9	Revert "[mlir][vector] Fix invalid `LoadOp` indices being created (#75519 )" This reverts commit `3a1ae2f46d`.	2023-12-17 12:34:17 +01:00
Rik Huijzer	3a1ae2f46d	[mlir][vector] Fix invalid `LoadOp` indices being created (#75519 ) Fixes https://github.com/llvm/llvm-project/issues/71326. The cause of the issue was that a new `LoadOp` was created which looked something like: ```mlir %arg4 = func.func main(%arg1 : index, %arg2 : index) { %alloca_0 = memref.alloca() : memref<vector<1x32xi1>> %1 = vector.type_cast %alloca_0 : memref<vector<1x32xi1>> to memref<1xvector<32xi1>> %2 = memref.load %1[%arg1, %arg2] : memref<1xvector<32xi1>> return } ``` which crashed inside the `LoadOp::verify`. Note here that `%alloca_0` is 0 dimensional, `%1` has one dimension, but `memref.load` tries to index `%1` with two indices. This is now fixed by using the fact that `unpackOneDim` always unpacks one dim `1bce61e6b0/mlir/lib/Conversion/VectorToSCF/VectorToSCF.cpp (L897-L903)` and so the `loadOp` should just index only one dimension. --------- Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>	2023-12-17 11:42:35 +01:00
Guray Ozen	c65d8c7187	[mlir][memref] extract_strided_metadata for zero-sized memref (#74835 )	2023-12-08 15:55:14 +01:00
Rik Huijzer	68f0bc6f2e	[mlir] Fix a zero stride canonicalizer crash (#74200 ) This PR fixes https://github.com/llvm/llvm-project/issues/73383 and is another shot at the refactoring proposed in https://github.com/llvm/llvm-project/pull/72885. --------- Co-authored-by: Kai Sasaki <lewuathe@gmail.com>	2023-12-06 07:35:18 +01:00
Max191	3a6f02a658	[mlir] Add subbyte emulation support for `memref.store`. (#73174 ) This adds a conversion for narrow type emulation of memref.store ops. The conversion replaces the memref.store with two memref.atomic_rmw ops. Atomics are used to prevent race conditions on same-byte accesses, in the event that two threads are storing into the same byte. Fixes https://github.com/openxla/iree/issues/15370	2023-11-28 11:51:30 -08:00
Max191	b823f8469b	[mlir] Add support for `memref.alloca` sub-byte emulation (#73138 ) Adds a similar case to `memref.alloc` for `memref.alloca` in EmulateNarrowTypes. Fixes https://github.com/openxla/iree/issues/15515	2023-11-27 16:28:22 -08:00
Max191	b29332a318	[mlir] Add narrow type emulation for `memref.reinterpret_cast` (#73144 )	2023-11-27 10:41:14 -08:00
Rik Huijzer	1949fe90bf	[mlir] Verify non-negative `offset` and `size` (#72059 ) In #71153, the `memref.subview` canonicalizer crashes due to a negative `size` being passed as an operand. During `SubViewOp::verify` this negative `size` is not yet detectable since it is dynamic and only available after constant folding, which happens during the canonicalization passes. As discussed in <https://discourse.llvm.org/t/rfc-more-opfoldresult-and-mixed-indices-in-ops-that-deal-with-shaped-values/72510>, the verifier should not be extended as it should "only verify local aspects of an operation". This patch fixes #71153 by not folding in the aforementioned situation. Also, this patch adds a basic offset and size check in the `OffsetSizeAndStrideOpInterface` verifier. Note: only `offset` and `size` are checked because `stride` is allowed to be negative (`54d81e49e3`).	2023-11-16 07:42:37 +01:00
Max191	dae3c44ce6	[mlir] Add `vector.store/maskedstore` of `memref.subview` memref alias folding (#72184 ) Fixes https://github.com/openxla/iree/issues/15575	2023-11-14 14:24:54 -08:00
Quinn Dawkins	48f980c535	[mlir][memref] Add memref alias folding for masked transfers (#71476 ) The contents of a mask on a masked transfer are unaffected by the particular region of memory being read/stored to, so just forward the mask in subview folding patterns.	2023-11-07 08:56:54 -05:00
tyb0807	5aa2c65abd	[mlir][MemRef] Add subview folding pattern for vector.maskedload (#71380 ) This is required for fixing https://github.com/openxla/iree/issues/15031	2023-11-06 20:08:30 +01:00
Théo Degioanni	b142501e92	[mlir][memref] Fix segfault in SROA (#71063 ) Fixes #70902. The out of bounds check in the SROA implementation for MemRef was not actually testing anything because it only operated on a store op which does not trigger the logic by itself. It is now checked for real and the underlying bug is fixed. I checked the LLVM implementation just in case but this should not happen as out-of-bound checks happen in GEP's verifier there.	2023-11-06 13:53:16 +01:00
Matthias Springer	437c62178c	[mlir][memref] Remove redundant `memref.tensor_store` op (#71010 ) `bufferization.materialize_in_destination` should be used instead. Both ops bufferize to a memcpy. This change also conceptually cleans up the memref dialect a bit: the memref dialect no longer contains ops that operate on tensor values.	2023-11-05 12:47:18 +09:00
Matthias Springer	6086c272a3	[mlir][memref] Fix out-of-bounds crash when reifying result dims (#70774 ) Do not crash when the input IR is invalid, i.e., when the index of the dimension operand of a `tensor.dim`/`memref.dim` is out-of-bounds. This fixes #70180.	2023-10-31 17:26:56 +09:00
Oleksandr "Alex" Zinenko	e4384149b5	[mlir] use transform-interpreter in test passes (#70040 ) Update most test passes to use the transform-interpreter pass instead of the test-transform-dialect-interpreter-pass. The new "main" interpreter pass has a named entry point instead of looking up the top-level op with `PossibleTopLevelOpTrait`, which is arguably a more understandable interface. The change is mechanical, rewriting an unnamed sequence into a named one and wrapping the transform IR in to a module when necessary. Add an option to the transform-interpreter pass to target a tagged payload op instead of the root anchor op, which is also useful for repro generation. Only the test in the transform dialect proper and the examples have not been updated yet. These will be updated separately after a more careful consideration of testing coverage of the transform interpreter logic.	2023-10-24 16:12:34 +02:00
Felix Schneider	f32b3e1caa	[mlir][memref] Fix index delinearization for CollapseShapeOp folding (#68833 ) The `resolveSourceIndicesCollapseShape` method is used to compute indices into the source `MemRef` of a `CollapseShapeOp` from the collapsed indices. This method didn't check for dynamic sizes of the source shape which led to a crash. Fix https://github.com/llvm/llvm-project/issues/68483	2023-10-12 07:12:43 +02:00
Kunwar Grover	8f397e04e5	[mlir][memref] Fix emulate narrow types for strided memref offset (#68181 ) This patch fixes strided memref offset calculation for emulating narrow types. As a side effect, this patch also adds support for 1-D subviews with static sizes, static offsets and strides of 1 for testing. The emulate narrow types pass was not tested for strided memrefs before this patch.	2023-10-06 04:52:33 +05:30
qcolombet	932dc9d8c4	[mlir][MemRef] Add a pattern to simplify `extract_strided_metadata(cast)` (#68291 ) `expand-strided-metadata` was missing a pattern to get rid of `memref.cast`. The pattern is straightforward: Produce a new `extract_strided_metadata` with the source of the cast and fold the static information (sizes, strides, offset) along the way.	2023-10-05 14:32:42 +02:00
Ingo Müller	991cb14715	[mlir][memref][transform] Add new alloca_to_global op. (#66511 ) This PR adds a new transform op that replaces `memref.alloca`s with `memref.get_global`s to newly inserted `memref.global`s. This is useful, for example, for allocations that should reside in the shared memory of a GPU, which have to be declared as globals.	2023-09-21 18:17:00 +02:00
Daniil Dudkin	01e80a0f41	[mlir] Add `maxnumf` and `minnumf` to `AtomicRMWKind` (#66442 ) This commit adds the mentioned kinds of `AtomicRMWKind` as well as code generation for them.	2023-09-15 22:41:51 +03:00
Daniil Dudkin	6f4a528698	[mlir][memref] Use dedicated ops in `AtomicRMWOpConverter` (#66437 ) This patch refactors the `AtomicRMWOpConverter` class to use the dedicated operations from Arith dialect instead of using `cmpf` + `select` pattern. Also, a test for `minimumf` kind of `atomic_rmw` has been added.	2023-09-15 00:52:35 +03:00
Daniil Dudkin	c46a04339a	[mlir][arith] Rename `AtomicRMWKind`'s `maxf` → `maximumf`, `minf` → `minimumf` (#66135 ) This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671. This commit renames `maxf` and `minf` enumerators of `AtomicRMWKind` to better reflect the current naming scheme and the goals of the RFC.	2023-09-14 01:09:37 +03:00
Oleksandr "Alex" Zinenko	e55e36de7a	[mlir] alloc-to-alloca conversion for memref (#65335 ) Introduce a simple conversion of a memref.alloc/dealloc pair into an alloca in the same scope. Expose it as a transform op and a pattern. Allocas typically lower to stack allocations as opposed to alloc/dealloc that lower to significantly more expensive malloc/free calls. In addition, this can be combined with allocation hoisting from loops to further improve performance.	2023-09-05 17:58:22 +02:00
Martin Erhart	8037deb7af	[mlir][memref] Add pass to expand realloc operations, simplify lowering to LLVM There are two motivations for this change: 1. It considerably simplifies adding support for the realloc operation to the new buffer deallocation pass by lowering the realloc such that no deallocation operation is inserted and the deallocation pass itself can insert that dealloc 2. The lowering is expressed on a higher level and thus easier to understand, and the lowerings of the memref operations it is composed of don't have to be duplicated in the MemRefToLLVM lowering (also see discussion in https://reviews.llvm.org/D133424) Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D159430	2023-09-05 08:58:40 +00:00
Hanhan Wang	c5dee18b63	[mlir][memref] Add support for erasing dead allocations. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D159135	2023-09-01 13:30:26 -07:00
Andrey Turetskiy	01f4390a51	[MLIR] Fold memref.reinterpret_cast(x) -> x when the type is fully static and does not change. Differential Revision: https://reviews.llvm.org/D149296	2023-08-30 20:50:18 -07:00
Matthias Springer	e3373c6c83	[mlir][memref] Fix crash in SubViewReturnTypeCanonicalizer `SubViewReturnTypeCanonicalizer` is used by `OpWithOffsetSizesAndStridesConstantArgumentFolder`, which folds constant SSA value (dynamic) sizes into static sizes. The previous implementation crashed when a dynamic size was folded into a static `1` dimension, which was then mistaken as a rank reduction. Differential Revision: https://reviews.llvm.org/D158721	2023-08-25 16:01:49 +02:00
Mahesh Ravishankar	0f8bab8d59	[mlir] Revamp implementation of sub-byte load/store emulation. When handling sub-byte emulation, the sizes of the converted `memref`s also need to be updated (this was not done in the current implementation). This adds the additional complexity of having to linearize the `memref`s as well. Consider a `memref<3x3xi4>` where the `i4` elements are packed. This has an overall size of 5 bytes (rounded up to number of bytes). This can only be represented by a `memref<5xi8>`. A `memref<3x2xi8>` would imply an implicit padding of 4 bits at the end of each row. So incorporate linearization into the sub-byte load-store emulation. This patch also updates some of the utility functions to make better use of statically available information using `OpFoldResult` and `makeComposedFoldedAffineApplyOps`. Reviewed By: hanchung, yzhang93 Differential Revision: https://reviews.llvm.org/D158125	2023-08-17 20:27:53 +00:00
Hanhan Wang	f6897c37a2	[mlir][MemRef] Bail out for unsupported cases in FoldMemRefAliasOps pass The pass uses `computeSuffixProduct` method which only allows static shapes. This revision adds an early-exit for dynamic cases to avoid crash. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D157668	2023-08-11 14:52:53 -07:00
Matthias Springer	0bb4d4d32f	[mlir][transform] Add ApplyToLLVMConversionPatternsOp This op populates conversion patterns by querying the ConvertToLLVMPatternInterface. Only dialects that support this interface are supported. Differential Revision: https://reviews.llvm.org/D157487	2023-08-09 13:44:47 +02:00
Uday Bondhugula	b36de52c98	NFC. Move remaining affine/memref test cases into respective dialect dirs Move a bunch of lingering test cases from test/Transforms/ into test/Dialect/Affine and MemRef. Differential Revision: https://reviews.llvm.org/D155855	2023-07-21 22:36:01 +05:30
Hanhan Wang	8fc433f055	[mlir][MemRef] Move narrow type emulation common methods to MemRefUtils. It also unifies the computation of StridedLayoutAttr. If the stride is static known value, we can just use it. Differential Revision: https://reviews.llvm.org/D155017	2023-07-13 14:43:21 -07:00
yzhang93	5a1cdcbd86	[mlir] Narrow bitwidth emulation for MemRef load This patch adds support for narrow bitwidth storage emulation. The goal is to support sub-byte type codegen for LLVM CPU. Specifically, a type converter is added to convert memref of narrow bitwidth (e.g., i4) into supported wider bitwidth (e.g., i8). Another focus of this patch is to populate the pattern for int4 memref.load. memref.store pattern should be added in a separate patch. Reviewed By: hanchung, mravishankar Differential Revision: https://reviews.llvm.org/D151519	2023-06-26 14:18:30 -07:00
Roger Ferrer Ibanez	666a56f1f1	[MLIR][Tests] Update tests so they require assertions These tests check statistics results which require assertions enabled. Differential Revision: https://reviews.llvm.org/D152780	2023-06-13 16:00:44 +00:00
Matthias Springer	cc7f52432b	[mlir][transform] Use separate ops instead of PatternRegistry * Remove `transform::PatternRegistry`. * Add a new op for each currently registered pattern set. * Change names of vector dialect pattern selector ops, so that they are consistent with the remaining code base. * Remove redundant `transform.vector.extract_address_computations` op. Differential Revision: https://reviews.llvm.org/D152249	2023-06-06 11:53:03 +02:00
Guray Ozen	5ec360c589	[mlir] Enable folding memref alias for `vector.load` This work enables the memref alias folding pass for `vector.load` Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D151447	2023-05-25 17:07:20 +02:00
Guray Ozen	46c32afbc5	[mlir] Enable folding memref alias for `ldmatrix` Folding mechanism does not recognize `ldmatrix` op. This work helps pass to recognize the op and fold the memref aliases. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D151412	2023-05-25 13:10:17 +02:00
Théo Degioanni	0bf120a820	[mlir] [sroa] Add support for MemRef. This patch implements SROA interfaces for MemRef, up to a given fixed size. Reviewed By: gysit, Dinistro Differential Revision: https://reviews.llvm.org/D151102	2023-05-24 07:33:28 +00:00
Alex Zinenko	2f3ac28cb2	[mlir] don't hardcode PDL_Operation in Transform dialect extensions Update operations in Transform dialect extensions defined in the Affine, GPU, MemRef and Tensor dialects to use the more generic `TransformHandleTypeInterface` type constraint instead of hardcoding `PDL_Operation`. See https://discourse.llvm.org/t/rfc-type-system-for-the-transform-dialect/65702 for motivation. Remove the dependency on PDLDialect from these extensions. Update tests to use `!transform.any_op` instead of `!pdl.operation`. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D150781	2023-05-17 15:10:12 +00:00
Théo Degioanni	ead8e9d795	[mlir] [mem2reg] Adapt to be pattern-friendly. This revision modifies the mem2reg interfaces and algorithm to be more comfortable to use as a pattern. The motivation behind this is that currently the pattern needs to be applied to the scope op of the region in which allocators should be promoted. However, a more natural way to apply the pattern would be to apply it on the allocator directly. This is not only clearer but easier to parallelize. This revision changes the mem2reg pattern to operate this way. This required restraining the interfaces to only mutate IR using RewriterBase, as the previously used escape hatch is not granular enough to match on the region that is modified only. This has the unfortunate cost of preventing batching allocator promotion and making the block argument adding logic more complex. Because batching no longer made any sense, I made the internal analyzer/promoter decoupling private again. This also adds statistics to the mem2reg infrastructure. Reviewed By: gysit Differential Revision: https://reviews.llvm.org/D150432	2023-05-16 08:35:13 +00:00

184 Commits