This fixes a bug in the tiling implementation of tensor.unpack that
caused an infinite loop when certain unpack ops were tiled and fused as
producers. The tiled implementation of tensor.unpack sometimes needs to
create an additional tensor.extract_slice on the result of the tiled
unpack op, but this slice was being added to the `generatedSlices` of
the tiling result. The `generatedSlices` are used to find the next
producers to fuse, so the same unpack op kept being fused even after it
was already inside the loop. The fix is to add the slice of the source
to `generatedSlices` instead of the slice of the result.
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
Add a pattern that converts a `tensor.expand_shape` op to a more static
form.
The pattern matches a `tensor.cast` feeding a `tensor.expand_shape` when
the cast is foldable and some of the `output_shape` operands of the
`tensor.expand_shape` are constant-foldable. This makes the
`tensor.expand_shape` more static and allows the static information to
be propagated further down the program.
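For illustration, a hedged sketch of the kind of rewrite this enables (shapes and value names are made up):
```mlir
// Before: the cast hides the static shape, so the output_shape is dynamic.
%cast = tensor.cast %src : tensor<10x20xf32> to tensor<?x?xf32>
%expanded = tensor.expand_shape %cast [[0], [1, 2]] output_shape [%d0, %d1, 4]
    : tensor<?x?xf32> into tensor<?x?x4xf32>

// After: the cast is folded away and the output_shape becomes fully static.
%expanded = tensor.expand_shape %src [[0], [1, 2]] output_shape [10, 5, 4]
    : tensor<10x20xf32> into tensor<10x5x4xf32>
```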
This commit simplifies the result type of materialization functions.
Previously: `std::optional<Value>`
Now: `Value`
The previous implementation allowed 3 possible return values:
- Non-null value: The materialization function produced a valid
materialization.
- `std::nullopt`: The materialization function failed, but another
materialization can be attempted.
- `Value()`: The materialization failed and the dialect conversion
should also fail. (Previously, the dialect conversion could roll back in
this case.)
This commit removes the last variant. It is not particularly useful
because the dialect conversion will fail anyway if all other
materialization functions produced `std::nullopt`.
Furthermore, in contrast to type conversions, at least one
materialization callback is expected to succeed. In case of a failing
type conversion, the current dialect conversion can roll back and try a
different pattern. This also used to be the case for materializations,
but that functionality was removed with #107109: failed materializations
can no longer trigger a rollback. (They can just make the entire dialect
conversion fail without rollback.) With this in mind, it is even less
useful to have an additional error state for materialization functions.
This commit is in preparation of merging the 1:1 and 1:N type
converters. Target materializations will have to return multiple values
instead of a single one. With this commit, we can keep the API simple:
`SmallVector<Value>` instead of `std::optional<SmallVector<Value>>`.
Note for LLVM integration: all 1:1 materializations should return
`Value` instead of `std::optional<Value>`. Instead of `std::nullopt`,
return `Value()`.
Restricts the verifier for tensor.pack and tensor.unpack Ops so that the
following is no longer allowed:
```mlir
%c8 = arith.constant 8 : index
%0 = tensor.pack %input inner_dims_pos = [0, 1] inner_tiles = [8, %c8] into %output : tensor<?x?xf32> -> tensor<?x?x8x8xf32>
```
Specifically, in line with other Tensor Ops, require:
* a dynamic result dimension for each dynamic (SSA value) tile size,
* a static result dimension for each static (attribute) tile size.
In the example above, a static dimension (8) is paired with a dynamic
size (%c8).
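For comparison, a sketch of a form that the verifier still accepts, where the dynamic tile size is paired with a dynamic result dimension:
```mlir
%c8 = arith.constant 8 : index
%0 = tensor.pack %input inner_dims_pos = [0, 1] inner_tiles = [8, %c8]
    into %output : tensor<?x?xf32> -> tensor<?x?x8x?xf32>
```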
Note that this is mostly deleting existing code - that's because the
change simplifies the logic in the verifier.
For more context:
* https://discourse.llvm.org/t/tensor-ops-with-dynamic-sizes-which-behaviour-is-more-correct
In the insert_slice bufferization interface implementation, the
destination tensor is not considered read if the full tensor is
overwritten by the slice. This PR adds the same check for
tensor.parallel_insert_slice.
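A hedged sketch of the case in question (op names and shapes are illustrative; `test.produce` stands in for an arbitrary producer):
```mlir
%res = scf.forall (%i) in (1) shared_outs(%out = %dest) -> (tensor<8x8xf32>) {
  %src = "test.produce"() : () -> tensor<8x8xf32>
  scf.forall.in_parallel {
    // The slice covers all of %out, so the destination is fully overwritten
    // and does not need to be considered read.
    tensor.parallel_insert_slice %src into %out[0, 0] [8, 8] [1, 1]
        : tensor<8x8xf32> into tensor<8x8xf32>
  }
}
```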
Adds two new StaticValueUtils:
- `isAllConstantIntValue` checks whether all values in an array of
`OpFoldResult` are equal to a given `int64_t` value.
- `areConstantIntValues` checks whether an array of `OpFoldResult` is
element-wise equal to a given array of `int64_t` values.
fixes https://github.com/llvm/llvm-project/issues/112435
---------
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
I noticed that several assertions in the MLIR codebase have issues with
operator precedence.
The issue with operator precedence in these assertions is due to the way
logical operators are evaluated. The `&&` operator has higher precedence
than the `||` operator, which means the assertion is currently
evaluating incorrectly, like this:
```
assert((resType.getNumDynamicDims() == dynOutDims.size()) ||
       (dynOutDims.empty() && "Either none or all output dynamic dims must be specified!"));
```
We should add parentheses around the entire expression involving
`dynOutDims.empty()` to ensure that the logical conditions are grouped
correctly. Here’s the corrected version:
```
assert(((resType.getNumDynamicDims() == dynOutDims.size()) || dynOutDims.empty()) &&
       "Either none or all output dynamic dims must be specified!");
```
This is more efficient, as it avoids creating a clone that is
immediately removed. Also, only insert a cast on the result when the
destination type actually changed.
Extends the logic to generalise tensor.pack (into e.g. tensor.pad +
tensor.transpose) so that it also works when one of the inner tile sizes
is scalable (i.e. a multiple of `vector.vscale`). For example:
```mlir
%c8 = arith.constant 8 : index
%vscale = vector.vscale
%c8_vscale = arith.muli %vscale, %c8 : index
%0 = tensor.pack %input
padding_value(%pad : f32)
inner_dims_pos = [0, 1]
inner_tiles = [%c8_vscale, 2]
into %output : tensor<5x1xf32> -> tensor<1x1x?x2xf32>
```
is generalised as:
```mlir
%c8 = arith.constant 8 : index
%vscale = vector.vscale
%c8_vscale = arith.muli %vscale, %c8 : index
%0 = affine.apply #map()[%c8_vscale, %c5]
%padded = tensor.pad %arg0 low[0, 0] high[%0, 1] {
^bb0(%arg3: index, %arg4: index):
tensor.yield %arg2 : f32
} : tensor<5x1xf32> to tensor<?x2xf32>
```
At the Tensor level we model scalability using dynamic shapes, so this
change essentially extends the relevant logic to also handle dynamic
shapes.
Resolves #101708
The updated logic now correctly checks whether the `transfer_write`
completely overwrites the `insert_slice` and only then applies the
rewrite for this pattern.
The check currently covers static sizes; for dynamic sizes,
value-bounds analysis is needed (see the `TODO`).
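A sketch of the static case the check accepts (value names and shapes are illustrative):
```mlir
// The vector write covers the whole 4x8 slice being inserted, so the pair
// can be rewritten into a single transfer_write directly on the destination.
%c0 = arith.constant 0 : index
%w = vector.transfer_write %vec, %init[%c0, %c0] {in_bounds = [true, true]}
    : vector<4x8xf32>, tensor<4x8xf32>
%r = tensor.insert_slice %w into %dest[0, 0] [4, 8] [1, 1]
    : tensor<4x8xf32> into tensor<16x16xf32>

// Rewritten to:
%r = vector.transfer_write %vec, %dest[%c0, %c0] {in_bounds = [true, true]}
    : vector<4x8xf32>, tensor<16x16xf32>
```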
There are some spurious libraries which can be removed.
I'm trying to bundle MLIR/LLVM library dependencies for our own
libraries. We're using a CMake function to recursively collect
MLIR/LLVM-related dependencies. However, we identified certain library
dependencies as redundant and safe to remove.
Updates the return type of `getNumDynamicDims` and `getNumScalableDims`
from `int64_t` to `size_t`. This is for consistency with other
helpers/methods that return "size" and to reduce the number of
`static_cast`s in various places.
Refine `createPadHighOp` so that the output tensor is required to be
statically shaped. This is to prevent the current behaviour, which is
incorrect:
> // If `type` has dynamic dimensions the padding width is set to zero.
The actual padding width should be set to `%new_dim - %old_dim`, where
`%new_dim` and `%old_dim` are defined via e.g. the `tensor.dim` Op
applied to the output and input tensors, respectively.
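As a hedged sketch (not the actual implementation), the dynamic case would have to compute the padding width along the lines of:
```mlir
// %input / %output are assumed input and (dynamically shaped) output tensors;
// %pad_value is the assumed padding value.
%c0 = arith.constant 0 : index
%old_dim = tensor.dim %input, %c0 : tensor<?x8xf32>
%new_dim = tensor.dim %output, %c0 : tensor<?x8xf32>
%pad_width = arith.subi %new_dim, %old_dim : index
%padded = tensor.pad %input low[0, 0] high[%pad_width, 0] {
  ^bb0(%i: index, %j: index):
    tensor.yield %pad_value : f32
} : tensor<?x8xf32> to tensor<?x8xf32>
```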
This PR is an attempt to clarify the semantics surrounding dynamic
shapes in preparation for adding support for scalable vectors to the
pack/unpack logic in Tensor/Linalg (dynamic shapes are what we use to
model scalable (*) sizes at the Tensor/MemRef level).
(*) Scalable as in Arm's Scalable Vector Extension (SVE)
Implements two helper hooks for PackOp and UnPackOp, `getAllOuterDims`
and `getTiledOuterDims`, and adds them to RelayoutOp (which both PackOp
and UnPackOp inherit from).
This improves code re-use and also clarifies the meaning of "outer dims"
and "tiled outer dims".
Current implementation of `scf::tileConsumerAndFuseProducerUsingSCF`
looks at operands of tiled/tiled+fused operations to see if they are
produced by `extract_slice` operations to populate the worklist used to
continue fusion. This implicit assumption does not always work. Instead
make the implementations of `getTiledImplementation` return the slices
to use to continue fusion.
This is a breaking change:
- To retain the current behavior of
`scf::tileConsumerAndFuseProducerUsingSCF`, change all out-of-tree
implementations of `TilingInterface::getTiledImplementation` to return
the slices on which to continue fusion. All in-tree implementations have
been adapted to this.
- This change touches parts that required a simplification to the
`ControlFn` in `scf::SCFTileAndFuseOptions`. It now returns a
`std::optional<scf::SCFTileAndFuseOptions::ControlFnResult>` object that
should be `std::nullopt` if fusion is not to be performed.
Signed-off-by: MaheshRavishankar <mahesh.revishankar@gmail.com>
`tensor.pad(tensor.pad)` with the same constant padding value can be
combined into a single pad that pads to the sum of the high and low
padding amounts.
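A sketch of the fold (padding amounts and shapes are illustrative; `%cst` is the shared constant padding value):
```mlir
// Before: two pads with the same constant padding value.
%p0 = tensor.pad %src low[1, 2] high[3, 4] {
  ^bb0(%i: index, %j: index):
    tensor.yield %cst : f32
} : tensor<8x8xf32> to tensor<12x14xf32>
%p1 = tensor.pad %p0 low[1, 0] high[0, 1] {
  ^bb0(%i: index, %j: index):
    tensor.yield %cst : f32
} : tensor<12x14xf32> to tensor<13x15xf32>

// After: a single pad with the summed low/high amounts.
%p = tensor.pad %src low[2, 2] high[3, 5] {
  ^bb0(%i: index, %j: index):
    tensor.yield %cst : f32
} : tensor<8x8xf32> to tensor<13x15xf32>
```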
This patch adds a check for the indices of `tensor.gather` and
`tensor.scatter`: the length of gather_dims/scatter_dims should match
the size of the last dimension of the indices. Fixes #94901.
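As a hedged example based on the op documentation, the trailing dimension of the indices must now match the number of gather_dims:
```mlir
// gather_dims has two entries, so the indices tensor must have a trailing
// dimension of size 2 (here tensor<6x2xindex>).
%out = tensor.gather %source[%indices] gather_dims([0, 1])
    : (tensor<4x5x6xf32>, tensor<6x2xindex>) -> tensor<6x1x1x6xf32>
```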
Just directly create the empty tensor of the appropriate shape instead
of relying on `UnPackOp::createDestinationTensor`, which tries to infer
the destination shape; that isn't possible in general with the set of
parameters it takes.
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
This adds implementations for the two TilingInterface methods required
for fusion to `tensor.pad`: `getIterationDomainTileFromResultTile` and
`generateResultTileValue`, allowing fusion of pad with a tiled consumer.
Add the missing `getIterationDomainTileFromOperandTile` and `getTiledImplementationFromOperandTile` to `tensor.pack` and enable fusing it as a consumer. Note that it currently only supports the perfect-tiling scenario without padding semantics.
Follow-up to #102598 : as discussed, move tensor sharding implementation
into separate tensor extension lib.
@sogartar @yaochengji, could you take a look at this PR?
This change removes dependencies declared as either 'LINK_LIBS' or
'LINK_COMPONENTS' across several MLIR libraries. The removed
dependencies appear
to be incorrect and may have been required in older versions of the
project.
These dependencies cause many high level dialects to have transitive
dependence on the LLVM dialect and the LLVM 'Core' library
('llvm/lib/IR').
Note that if using the 'Ninja' CMake generator, one can inspect the
dependencies (including all transitive libraries) of any given MLIR
target by using the command `ninja -C <build dir> -t browse` and
navigating to the library of interest in a web browser.
Refactored @Max191's PR https://github.com/llvm/llvm-project/pull/94637
to move it to `Tensor`
From the original PR
>This PR adds fusion by expansion patterns to push a tensor.expand_shape
up through a tensor.collapse_shape with non-intersecting reassociations.
Sometimes parallel collapse_shape ops like this can block propagation of
expand_shape ops, so this allows them to pass through each other.
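A hedged sketch of the swap (shapes and reassociations are made up):
```mlir
// Before: the collapse_shape blocks the expand_shape from propagating up.
%c = tensor.collapse_shape %src [[0, 1], [2]]
    : tensor<2x3x4xf32> into tensor<6x4xf32>
%e = tensor.expand_shape %c [[0], [1, 2]] output_shape [6, 2, 2]
    : tensor<6x4xf32> into tensor<6x2x2xf32>

// After: the reassociations do not intersect, so the two ops trade places.
%e = tensor.expand_shape %src [[0], [1], [2, 3]] output_shape [2, 3, 2, 2]
    : tensor<2x3x4xf32> into tensor<2x3x2x2xf32>
%c = tensor.collapse_shape %e [[0, 1], [2], [3]]
    : tensor<2x3x2x2xf32> into tensor<6x2x2xf32>
```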
I'm not sure if I put the code/tests in the right places, so let me know
where those go if they aren't.
cc @MaheshRavishankar @hanhanW
---------
Co-authored-by: Max Dawkins <max.dawkins@gmail.com>
- Replacing `#mesh.sharding` attribute with operation `mesh.sharding`
- extended semantics now allow providing optional `halo_sizes` and
`sharded_dims_sizes`
- internally a sharding is represented as a non-IR class
`mesh::MeshSharding`
What previously was
```mlir
%sharded0 = mesh.shard %arg0 <@mesh0, [[0]]> : tensor<4x8xf32>
%sharded1 = mesh.shard %arg1 <@mesh0, [[0]]> annotate_for_users : tensor<16x8xf32>
```
is now
```mlir
%sharding = mesh.sharding @mesh0, [[0]] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding : tensor<4x8xf32>
%1 = mesh.shard %arg1 to %sharding annotate_for_users : tensor<16x8xf32>
```
and allows additional annotations to control the shard sizes:
```mlir
mesh.mesh @mesh0 (shape = 4)
%sharding0 = mesh.sharding @mesh0, [[0]] halo_sizes = [1, 2] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding0 : tensor<4x8xf32>
%sharding1 = mesh.sharding @mesh0, [[0]] sharded_dims_sizes = [3, 5, 5, 3] : !mesh.sharding
%1 = mesh.shard %arg1 to %sharding1 annotate_for_users : tensor<16x8xf32>
```
- `mesh.shard` op accepts additional optional attribute `force`, useful
for halo updates
- Some initial spmdization support for the new semantics
- Support for `tensor.empty` reacting on `sharded_dims_sizes` and
`halo_sizes` in the sharding
- New collective operation `mesh.update_halo` as a spmdized target for
shardings with `halo_sizes`
@sogartar @yaochengji
`PassManager::run` loads the dependent dialects for each pass into the
current context prior to invoking the individual passes. If the
dependent dialect is already loaded into the context, this should be a
no-op. However, if there are extensions registered in the
`DialectRegistry`, the dependent dialects are unconditionally registered
into the context.
This poses a problem for dynamic pass pipelines, however, because they
will likely be executing while the context is in an immutable state
(because of the parent pass pipeline being run).
To solve this, we'll update the extension registration API on
`DialectRegistry` to require a type ID for each extension that is
registered. Then, instead of unconditionally registering dialects into a
context if extensions are present, we'll check against the extension
type IDs already present in the context's internal `DialectRegistry`.
The context will only be marked as dirty if there are net-new extension
types present in the `DialectRegistry` populated by
`PassManager::getDependentDialects`.
Note: this PR removes the `addExtension` overload that utilizes
`std::function` as the parameter. This is because `std::function` is
copyable and potentially allocates memory for the contained function so
we can't use the function pointer as the unique type ID for the
extension.
Downstream changes required:
- Existing `DialectExtension` subclasses will need a type ID to be
registered for each subclass. More details on how to register a type ID
can be found here:
8b68e06731/mlir/include/mlir/Support/TypeID.h (L30)
- Existing uses of the `std::function` overload of `addExtension` will
need to be refactored into dedicated `DialectExtension` classes with
associated type IDs. The attached `std::function` can either be inlined
into or called directly from `DialectExtension::apply`.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
A concatenation of empty tensors can be replaced by a single empty
tensor of the concatenated shape. Add this pattern to
`populateFoldTensorEmptyPatterns`.
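A minimal sketch of the new fold (shapes are illustrative):
```mlir
// Before:
%e0 = tensor.empty() : tensor<2x4xf32>
%e1 = tensor.empty() : tensor<3x4xf32>
%c = tensor.concat dim(0) %e0, %e1
    : (tensor<2x4xf32>, tensor<3x4xf32>) -> tensor<5x4xf32>

// After:
%c = tensor.empty() : tensor<5x4xf32>
```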
The tensor.parallel_insert_slice op has implicit in-place behavior. In
the "copy-before-write" bufferization mode, the resolveConflict function
will generate bufferize.copy, making the result incorrect. This patch
fixes the issue.
The `getDroppedDims` utility function does not follow the convention of
dropping outermost unit dimensions first when inferring a rank reduction
mask for a slice. This PR updates the implementation to match this
convention.
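For example (a sketch), in a rank-reducing slice where either unit dim could be the dropped one:
```mlir
// The static sizes are [1, 4, 1, 8] and the result has rank 3, so exactly one
// of the two unit dims is dropped. Following the convention, getDroppedDims
// now reports the outermost candidate (dim 0) rather than dim 2.
%s = tensor.extract_slice %t[0, 0, 0, 0] [1, 4, 1, 8] [1, 1, 1, 1]
    : tensor<2x4x3x8xf32> to tensor<4x1x8xf32>
```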
For patterns where there are multiple results apart from dpsInits, this
fails.
E.g.:
```
%13:2 = iree_codegen.ukernel.generic "iree_uk_unpack"
ins(%extracted_slice : tensor<?x1x16x16xf32>) outs(%11 :
tensor<?x?xf32>) ... -> tensor<?x?xf32>, i32
```
The above op has results apart from the dpsInit and hence fails. The PR
assumes that the results are ordered as dpsInits followed by nonDpsInits.
This PR adds transpose + pack/unpack folding support for transpose ops
in the form of `linalg.generic` ops. There were also some bugs with the
permutation composition in the previous patterns, so this PR fixes those
bugs and adds tests for them as well.
This commit adds an API (`tileAndFuseConsumerOfSlice`) to fuse a consumer with a tiled producer within an scf.for/scf.forall loop.
To support this, two new methods are added to the `TilingInterface`:
- `getIterationDomainTileFromOperandTile`
- `getTiledImplementationFromOperandTile`
Consumer operations that implement these methods can be fused with tiled producer operands, in a manner similar to (but essentially the inverse of) the fusion of an untiled producer with a tiled consumer.
Note that this only performs one `tiled producer` -> `consumer` fusion. It can be called repeatedly to fuse multiple consumers. The current implementation is also conservative about when it kicks in (e.g. it requires a single use of the value returned by the inter-tile loops that surround the tiled producer). These restrictions can be relaxed over time.
Signed-off-by: Abhishek Varma <abhvarma@amd.com>
---------
Signed-off-by: Abhishek Varma <abhvarma@amd.com>
Signed-off-by: Abhishek Varma <avarma094@gmail.com>
Co-authored-by: cxy <chenxunyu1993@gmail.com>
These passes have been deprecated for a long time and replaced by
one-shot bufferization. They are also unsafe because they do not check
for read-after-write conflicts.
Relands https://github.com/llvm/llvm-project/pull/93488 which failed on
buildbot. Fixes the failure by updating integration tests to use
one-shot-bufferize instead.
These passes have been deprecated for a long time and replaced by
one-shot bufferization. They are also unsafe because they do not check
for read-after-write conflicts.