Adjust the silenceable failure message as we lower `tensor.unpack` to a
combination of `linalg.transpose` + `tensor.collapse_shape` and
`tensor.extract_slice`.
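For reference, a minimal sketch of the resulting IR (shapes and SSA names are hypothetical, chosen only to illustrate the lowering):
```mlir
// Hypothetical: unpack a tensor<2x4x8x8xf32> (outer dims 2x4, inner tiles 8x8)
// back into a tensor<15x30xf32>.
%transposed = linalg.transpose ins(%source : tensor<2x4x8x8xf32>)
                               outs(%init : tensor<2x8x4x8xf32>)
                               permutation = [0, 2, 1, 3]
%collapsed = tensor.collapse_shape %transposed [[0, 1], [2, 3]]
    : tensor<2x8x4x8xf32> into tensor<16x32xf32>
%unpacked = tensor.extract_slice %collapsed[0, 0] [15, 30] [1, 1]
    : tensor<16x32xf32> to tensor<15x30xf32>
```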
The test was failing due to an outdated transform sequence declaration (a plain transform sequence was used, while a named transform sequence should now be used). The test is now fixed.
The current vectorization of 1D depthwise convolutions in Linalg is
_sub-optimal_ for tensors with a low number of channels, e.g.:
```mlir
linalg.depthwise_conv_1d_nwc_wc
{dilations = dense<1> : vector<1xi64>,
strides = dense<1> : vector<1xi64>}
ins(%input, %filter : tensor<1x8x3xi8>, tensor<1x3xi8>)
outs(%output : tensor<1x8x3xi8>) -> tensor<1x8x3xi8>
```
That's due to the fact that ultimately (i.e. at the LLVM level),
vectorization happens along the trailing dimension (i.e. the channel
dimension). In this case it leads to vectors with 3 elements (or worse,
if there's e.g. only 1 channel). For comparison, a 128-bit wide vector
register can hold 16 x i8.
Instead, this patch adds an option to flatten/collapse the channel
dimension into the width dimension of the input/filter/output using the
`vector.shape_cast` operation:
```mlir
%sc_input = vector.shape_cast %input : vector<1x8x3xi8> to vector<1x24xi8>
%sc_output = vector.shape_cast %output : vector<1x8x3xi8> to vector<1x24xi8>
%b_filter = vector.broadcast %filter : vector<3xi8> to vector<1x8x3xi8>
%sc_filter = vector.shape_cast %b_filter : vector<1x8x3xi8> to vector<1x24xi8>
```
This new vectorization mode is implemented in `depthwiseConv` by
inserting `vector.shape_cast` Ops before and after
`depthwiseConv1dSliceAsMulAcc` is invoked. It can be selected through
e.g. a transform dialect attribute:
```mlir
transform.structured.vectorize_children_and_apply_patterns %conv {flatten_1d_depthwise_conv}
```
A forthcoming patch will implement a strategy to automatically switch
between the two implementations, depending on the shape of the input
tensors.
Co-authored-by: Bradley Smith <bradley.smith@arm.com>
`TileUsingForOp` has an optional Attribute `interchange` which was given
in curly braces like this: `{interchange = [...]}`. The way this was
parsed meant that no `attr-dict` could be attached to the Op.
This patch adds printing / parsing of an `attr-dict` to the Op and
prints/parses the `interchange` Attribute separately from the
discardable Attributes.
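A minimal sketch of the intended syntax after this change (tile sizes, the interchange, and the discardable attribute are made up for illustration; the exact assembly format may differ):
```mlir
// The interchange is printed as its own clause, so a discardable attr-dict
// can now follow it.
%tiled, %loops:2 = transform.structured.tile_using_for %target [8, 16]
    interchange = [1, 0] {my_discardable_attr = true}
    : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)
```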
* Add a Linalg pass to convert 2D convolutions and quantized 2D
convolutions that have the `FHWC` filter channel ordering into a
transpose followed by 2D convolutions that have the `HWCF` channel
ordering (see the sketch after this list).
* Add a lit test to check that the semantics of the transformation are
correct for both quantized and unquantized variants.
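A rough before/after sketch of the unquantized case (shapes and SSA names are hypothetical, chosen only to illustrate the filter re-ordering):
```mlir
// Before: filter in FHWC ordering.
%0 = linalg.conv_2d_nhwc_fhwc {dilations = dense<1> : tensor<2xi64>,
                               strides = dense<1> : tensor<2xi64>}
       ins(%input, %filter : tensor<1x16x16x8xf32>, tensor<4x3x3x8xf32>)
       outs(%out : tensor<1x14x14x4xf32>) -> tensor<1x14x14x4xf32>

// After: transpose the filter to HWCF, then use the HWCF variant.
%filter_hwcf = linalg.transpose ins(%filter : tensor<4x3x3x8xf32>)
                                outs(%empty : tensor<3x3x8x4xf32>)
                                permutation = [1, 2, 3, 0]
%1 = linalg.conv_2d_nhwc_hwcf {dilations = dense<1> : tensor<2xi64>,
                               strides = dense<1> : tensor<2xi64>}
       ins(%input, %filter_hwcf : tensor<1x16x16x8xf32>, tensor<3x3x8x4xf32>)
       outs(%out : tensor<1x14x14x4xf32>) -> tensor<1x14x14x4xf32>
```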
Signed-off-by: Jack Frankland <jack.frankland@arm.com>
`TileUsingForOp` has an optional Attribute `interchange` which was given
in curly braces like this: `{interchange = [...]}`. The way this was
parsed meant that no normal `attr-dict` could be attached to the Op.
This patch adds printing / parsing of an `attr-dict` to the Op and treats
the `interchange` Attribute as part of that dictionary for now.
`bufferization.materialize_in_destination` should be used instead. Both
ops bufferize to a memcpy. This change also conceptually cleans up the
memref dialect a bit: the memref dialect no longer contains ops that
operate on tensor values.
Introduce an operation to specialize linalg.generics, for example,
detecting a linalg.generic that is semantically equivalent to a
linalg.copy and replacing the former with the latter. After code
generation, it is helpful to lower named operations to vendor-optimized
libraries.
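As an illustration (a hedged sketch; shapes and SSA names are made up), a `linalg.generic` that merely forwards its input element-wise can be recognized and rewritten as a `linalg.copy`:
```mlir
// A generic that only yields its input element-wise ...
%0 = linalg.generic
       {indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
                         affine_map<(d0, d1) -> (d0, d1)>],
        iterator_types = ["parallel", "parallel"]}
       ins(%arg0 : tensor<8x16xf32>) outs(%arg1 : tensor<8x16xf32>) {
  ^bb0(%in: f32, %out: f32):
    linalg.yield %in : f32
  } -> tensor<8x16xf32>

// ... is semantically a copy and can be specialized to:
%0 = linalg.copy ins(%arg0 : tensor<8x16xf32>)
                 outs(%arg1 : tensor<8x16xf32>) -> tensor<8x16xf32>
```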
Adds the Img2Col transformation for the fhwc channel ordering in a
Conv2D. Because of how the channel ordering affects the matrix
dimensions in the flattened filter, this results in a slightly different
implementation of the actual "matrix multiplication". Instead of doing a
regular row-column dot product, this arrangement requires a row-row dot
product; otherwise the filter matrix would first need to be transposed.
Adds a lit test to the transform dialect to check that the semantics of
the optimization are correct.
Signed-off-by: Jack Frankland <jack.frankland@arm.com>
Rename and restructure tiling-related transform ops from the structured
extension to be more homogeneous. In particular, all ops now follow a
consistent naming scheme:
- `transform.structured.tile_using_for`;
- `transform.structured.tile_using_forall`;
- `transform.structured.tile_reduction_using_for`;
- `transform.structured.tile_reduction_using_forall`.
This drops the "_op" naming artifact from `tile_to_forall_op` that
shouldn't have been included in the first place, consistently specifies
the name of the control flow op to be produced for loops (instead of
`tile_reduction_using_scf` since `scf.forall` also belongs to `scf`),
and opts for the `using` connector to avoid ambiguity.
The loops produced by tiling are now systematically placed as *trailing*
results of the transform op. While this required changing 3 out of 4 ops
(except for `tile_using_for`), this is the only choice that makes sense
when producing multiple `scf.for` ops that can be associated with a
variadic number of handles. This choice is also most consistent with
*other* transform ops from the structured extension, in particular with
fusion ops, that produce the structured op as the leading result and the
loop as the trailing result.
The implementation doesn't emit any diagnostics as it is shared with the
pattern-based implementation. Check preconditions early and emit
diagnostics from the transform op instead. Without this change, the op
would produce a definite failure and no error message.
This PR renames the vectorization transform ops as follows:
* `structured.masked_vectorize` => `structured.vectorize`. This reflects
the fact that since [recently](https://reviews.llvm.org/D157774) the op
can also handle the unmasked case.
* `structured.vectorize` =>
`structured.vectorize_children_and_apply_patterns`. This reflects the
fact that the op does not just vectorize the given payload op but all
vectorizable children contained in it, and applies patterns before and
after for preparation and clean-up (see the sketch after this list).
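A hedged usage sketch of both ops under their new names (payload handles and vector sizes are made up; syntax details may differ slightly):
```mlir
// Formerly structured.masked_vectorize: vectorizes the targeted op,
// masking where the given vector sizes exceed the static shape.
transform.structured.vectorize %matmul vector_sizes [8, 16, 4] : !transform.any_op

// Formerly structured.vectorize: vectorizes all vectorizable children of the
// target and applies preparation/clean-up patterns around the rewrite.
%vectorized = transform.structured.vectorize_children_and_apply_patterns %func
    : (!transform.any_op) -> !transform.any_op
```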
This rename was discussed first
[here](https://reviews.llvm.org/D157774).
The PR also adapts and cleans up the tablegen description of the
`VectorizeChildrenAndApplyPatternsOp` (formerly `VectorizeOp`).
This commit removes the deallocation capabilities of
one-shot-bufferization. One-shot-bufferization should never deallocate
any memrefs as this should be entirely handled by the
ownership-based-buffer-deallocation pass going forward. This means that
the `allow-return-allocs` pass option now defaults to true and
`create-deallocs` defaults to false; both options, as well as the escape
attribute indicating whether a memref escapes the current region, will
be removed. A new `allow-return-allocs-from-loops` option is added as a
temporary workaround for some bufferization limitations.
This is the first commit in a series with the goal to rework the
BufferDeallocation pass. Currently, this pass heavily relies on copies
to perform correct deallocations, which leads to very slow code and
potentially high memory usage. Additionally, there are unsupported cases
such as returning memrefs which this series of commits aims to add
support for as well.
This first commit removes the deallocation capabilities of
one-shot-bufferization. One-shot-bufferization should never deallocate any
memrefs as this should be entirely handled by the buffer-deallocation pass
going forward. This means the allow-return-allocs pass option will
default to true now, create-deallocs defaults to false and they, as well
as the escape attribute indicating whether a memref escapes the current region,
will be removed.
The documentation w.r.t. these pass option changes is also updated in
this commit.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D156662
Both `TileOp` and `TileToScfForOp` use the tiling interface and the
`tileUsingSCFForOp` method. This duplication was introduced in
44cfea0279
as a way to retire `linalg::tileLinalgOp`. Now there is no more need
for this duplication, and since `TileOp` has the more recent changes,
retire `TileToScfForOp`.
This commit allows omitting the insertion of the memref.dealloc
operation when linalg.structured.bufferize_to_allocation is run, and
makes this the default behavior. This is desirable when the
buffer-deallocation-pipeline is run after bufferization to handle buffer
deallocation.
This patch makes the `transform.structured.pad` op also return a handle
to the copy op that it inserts. This allows continuing the transformation
on that op, such as mapping it to a GPU thread.
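A hedged sketch of what using the new result could look like (attribute values and handle names are made up; the exact result order and syntax may differ):
```mlir
// The additional result is a handle to the copy op inserted by the padding.
%padded, %pad, %copy = transform.structured.pad %matmul {
    padding_values = [0.0 : f32, 0.0 : f32, 0.0 : f32],
    padding_dimensions = [0, 1, 2]
} : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)
// %copy can now be transformed further, e.g. mapped to a GPU thread.
```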
The patch was mainly authored by @springerm as part of the WIP patch
https://reviews.llvm.org/D156371, which also has an example usage of
this change.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D159088
Three different options can be specified (see the sketch after this list):
* `bufferization.copy_tensor` (default)
* `linalg.copy`
* `none` (no copy_back)
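A hedged example of selecting one of these options (the attribute name `copy_back_op` and the surrounding syntax are assumptions made for illustration):
```mlir
// Assumed attribute name: copy_back_op. Here the copy-back is emitted as linalg.copy.
%padded, %pad = transform.structured.pad %matmul {
    padding_values = [0.0 : f32, 0.0 : f32, 0.0 : f32],
    padding_dimensions = [0, 1, 2],
    copy_back_op = "linalg.copy"
} : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
```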
Differential Revision: https://reviews.llvm.org/D156173
Add an option that does not bufferize the targeted op itself, but just materializes a buffer for the destination operands. This is useful for partial bufferization of complex ops such as `scf.forall`, which need special handling (and an analysis of the region).
Differential Revision: https://reviews.llvm.org/D155946
This patch extends the diagnostic output of `FuseIntoContainingOp` when
it fails to find the next producer by also providing the location of the
affected transform op.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D155803
TL;DR the following API functions have been merged
```
void populateFoldUnitExtentDimsViaReshapesPatterns(RewritePatternSet &patterns);
void populateFoldUnitExtentDimsViaSlicesPatterns(RewritePatternSet &patterns);
```
into
```
void populateFoldUnitExtentDimsPatterns(RewritePatternSet &patterns,
ControlDropUnitDims &options);
```
To use the previous functionality use
```
ControlDropUnitDims options;
// By default options.rankReductionStrategy is
// ControlDropUnitDims::RankReductionStrategy::ReassociativeReshape.
populateFoldUnitExtentDimsPatterns(patterns, options);
```
and
```
ControlDropUnitDims options;
options.rankReductionStrategy = ControlDropUnitDims::RankReductionStrategy::ExtractInsertSlice;
populateFoldUnitExtentDimsPatterns(patterns, options);
```
This pass is quite old and needed to be updated based on the current
approach to transformations in Linalg:
- Instead of two patterns, one to just remove loop dimensions that are
unit extent (and using 0 in the indexing maps), and another to drop
the unit extents in the operand shapes, combine them into a single
transformation. This avoids creating an intermediate step with
indexing maps having 0's in the domain expressions.
- Expose the core transformation as a utility function and add a
pattern that calls this transformation.
This is a mostly NFC change, apart from the API change and dropping
the patterns/test that only dropped the loops that are unit extents.
Differential Revision: https://reviews.llvm.org/D155518
This patch adds an interface, named AggregatedOpInterface, that decomposes
complex operations into simpler ones.
For now, make the interface specific to Linalg because although the concept
is general, the way to materialize it needs some maturing.
Use that interface with the softmax operator.
Differential Revision: https://reviews.llvm.org/D154363
Generalize `extractFromI64ArrayAttr` to `extractFromIntegerArrayAttr`, so that arbitrary integer/bool types can be extracted.
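A hedged C++ usage sketch (the header path and exact signature are assumptions based on this description):
```
// Assumed header; the helper generalizes extractFromI64ArrayAttr.
#include "mlir/Dialect/Utils/StaticValueUtils.h"

void example(mlir::ArrayAttr attr) {
  // Extract i64 values, as extractFromI64ArrayAttr used to do.
  auto sizes = mlir::extractFromIntegerArrayAttr<int64_t>(attr);
  // The generalized helper also supports other integer/bool element types.
  auto flags = mlir::extractFromIntegerArrayAttr<bool>(attr);
  (void)sizes;
  (void)flags;
}
```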
Differential Revision: https://reviews.llvm.org/D154974
Add a new option that allows users to specify a memcpy op: "memref.tensor_store", "memref.copy" or "linalg.copy".
Differential Revision: https://reviews.llvm.org/D154968
Return all ops that were generated as part of the bufferization, so that users do not have to match them in the enclosing op.
Differential Revision: https://reviews.llvm.org/D154966
Remove patterns that fold tensor subset ops into vector transfer ops from the vector dialect. These patterns already exist in the tensor dialect.
Differential Revision: https://reviews.llvm.org/D154932
This revision adds a new transformation to map a copy operation to a gpu grid of threads.
It implements a first heuristic that allows trading off coalesced accesses vs predication and occupancy.
Differential Revision: https://reviews.llvm.org/D154836
The TileOp builders did not set `scalable_sizes`, which produces invalid ops. `scalable_sizes` must contain as many booleans as there are sizes.
This change lifts the limitation that only the trailing dimensions/sizes
in dynamic index lists can be scalable. It allows us to extend
`MaskedVectorizeOp` and `TileOp` from the Transform dialect so that the
following is allowed:
%1, %loops:3 = transform.structured.tile %0 [4, [4], [4]]
This is also a follow up for https://reviews.llvm.org/D153372
that will enable the following (middle vector dimension is scalable):
transform.structured.masked_vectorize %0 vector_sizes [2, [4], 8]
To facilitate this change, the hooks for parsing and printing dynamic
index lists are updated accordingly (`printDynamicIndexList` and
`parseDynamicIndexList`, respectively). `MaskedVectorizeOp` and `TileOp`
are updated to include an array attribute of bools that captures
whether the corresponding vector dimension/tile size, respectively, is
scalable or not.
NOTE 1: I am re-landing this after the initial version was reverted. To
fix the regression and in addition to the original patch, this revision
updates the Python bindings for the transform dialect.
NOTE 2: This change is a part of a larger effort to enable scalable
vectorisation in Linalg. See this RFC for more context:
* https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/
This relands 048764f23a with fixes.
Differential Revision: https://reviews.llvm.org/D154336
The `bufferize_to_allocation` transform op now operates on payload ops, not payload values. Only ops can be bufferized, not values.
Also remove the `replacement` result from the transform op.
Differential Revision: https://reviews.llvm.org/D153970
"transform.structured.pad" now returns all `tensor::PadOp` in addition to the padded ops.
Also add a test case that shows how to force an allocation for "tensor.pad" ops with a custom memory space.
Differential Revision: https://reviews.llvm.org/D153555