clang-p2996

Author	SHA1	Message	Date
Matthias Springer	6176d6a93e	[mlir][tensor] Support parallel_insert_slice in MergeConsecutiveInsertExtractSlicePatterns.cpp Differential Revision: https://reviews.llvm.org/D141116	2023-01-06 12:33:45 +01:00
Mehdi Amini	ab32f5b7ef	Apply clang-tidy fixes for readability-simplify-boolean-expr in BufferizableOpInterfaceImpl.cpp (NFC)	2022-12-28 22:42:39 +00:00
Fangrui Song	cbb0981388	[mlir] llvm::Optional::value => operator*/operator-> std::optional::value() has undesired exception checking semantics and is unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The call sites block std::optional migration.	2022-12-17 19:07:38 +00:00
Ramkumar Ramachandra	22426110c5	mlir/tblgen: use std::optional in generation This is part of an effort to migrate from llvm::Optional to std::optional. This patch changes the way mlir-tblgen generates .inc files, and modifies tests and documentation appropriately. It is a "no compromises" patch, and doesn't leave the user with an unpleasant mix of llvm::Optional and std::optional. A non-trivial change has been made to ControlFlowInterfaces to split one constructor into two, relating to a build failure on Windows. See also: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716 Signed-off-by: Ramkumar Ramachandra <r@artagnon.com> Differential Revision: https://reviews.llvm.org/D138934	2022-12-17 11:13:26 +01:00
Matthias Springer	e5dc99e642	[mlir][tensor][bufferize] Improve bufferization of DimOp/RankOp The tensor operands do not bufferize to a memory read. Differential Revision: https://reviews.llvm.org/D140007	2022-12-14 12:47:46 +01:00
Matthias Springer	be630f07de	[mlir][bufferize] Implement BufferizableOpInterface for tensor.empty The op is not bufferizable but should be analyzable (for `EliminateEmptyTensors`, which uses the bufferization infrastructure). Also improve debugging functionality and error messages. Also adds a missing pass to the sparse pipeline. (tensor.empty should be replaced with bufferization.alloc_tensor, but it sometimes used to work without depending on how the tensor.empty is used. Now we always fail explicitly.)	2022-12-12 14:19:38 +01:00
Alexander Belyaev	f6fb0a4f35	[mlir] Make patterns for folding tensor.empty optional. At the moment, they are a part of EmptyOp::getCanonicalizationPatterns. When extract_slice(tensor.empty) is rewritten as a new tensor.empty, it could happen that we end up with two tensor.empty ops, since the original tensor.empty can have two users. After bufferization such cases result in two allocations. Differential Revision: https://reviews.llvm.org/D139308	2022-12-07 23:01:34 +01:00
Matthias Springer	9cdf6b641d	[mlir][tensor] Support parallel_insert_slice in reassociative reshape folder Differential Revision: https://reviews.llvm.org/D139540	2022-12-07 16:25:10 +01:00
Matthias Springer	1403073790	[mlir][tensor] Fold rank-reducing insert_slice with inverse collapse_shape Differential Revision: https://reviews.llvm.org/D139221	2022-12-05 09:17:29 +01:00
Matthias Springer	50a2bb95ab	[mlir][tensor] Fold rank-reducing extract_slice with inverse expand_shape Differential Revision: https://reviews.llvm.org/D139220	2022-12-05 09:17:24 +01:00
Matthias Springer	f92c7506e3	Revert "[mlir][tensor] Fold rank-reducing extract_slice with inverse expand_shape" This reverts commit `a076f57a1a`.	2022-12-02 21:22:20 +01:00
Matthias Springer	c837a94754	Revert "[mlir][tensor] Fold rank-reducing insert_slice with inverse collapse_shape" This reverts commit `1522a3b7b3`.	2022-12-02 21:22:04 +01:00
Matthias Springer	1522a3b7b3	[mlir][tensor] Fold rank-reducing insert_slice with inverse collapse_shape Differential Revision: https://reviews.llvm.org/D139104	2022-12-02 10:42:52 +01:00
Matthias Springer	a076f57a1a	[mlir][tensor] Fold rank-reducing extract_slice with inverse expand_shape Differential Revision: https://reviews.llvm.org/D139103	2022-12-02 10:42:46 +01:00
Matthias Springer	13593dc9dc	[mlir][tensor][bufferize] Fix tensor.insert_slice regression This reverts D132662 (apart from overall cleanups), which introduced a too aggressive optimization for tensor.insert_slice bufferization. Instead, bufferizesToMemoryRead is improved to handle some of these cases. The remaining cases can still bufferize efficiently when running the canonicalizer before the bufferization. Differential Revision: https://reviews.llvm.org/D138745	2022-11-26 19:14:33 +01:00
Lei Zhang	9bb633741a	[mlir][bufferization] Support general Attribute as memory space MemRef has been accepting a general Attribute as memory space for a long time. This commit updates the bufferization side to catch up, which allows downstream users to plug in customized symbolic memory spaces. This also eliminates quite a few `getMemorySpaceAsInt` calls, which are deprecated. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D138330	2022-11-21 09:40:50 -05:00
Matthias Springer	09dfb44193	[mlir][tensor][bufferize] Support memory_space for tensor.pad This change adds memory space support to tensor.pad. (tensor.generate and tensor.from_elements do not support memory spaces yet.) The memory space is inferred from the buffer of the source tensor. Instead of lowering tensor.pad to tensor.generate + tensor.insert_slice, it is now lowered to bufferization.alloc_tensor (with the correct memory space) + linalg.map + tensor.insert_slice. Memory space support for the remaining two tensor ops is left for a later point, as this requires some more design discussions. Differential Revision: https://reviews.llvm.org/D136265	2022-10-27 12:29:57 +02:00
Matthias Springer	c1f0a15c65	[mlir][tensor][bufferize] Lower tensor.generate to linalg.map There is no memref equivalent of tensor.generate. The purpose of this change is to avoid creating scf.parallel loops during bufferization. Differential Revision: https://reviews.llvm.org/D136767	2022-10-27 12:03:13 +02:00
Matthias Springer	2d5edc644d	[mlir][bufferize] Provide default BufferizableOpInterface impl for destination style ops tensor.insert and tensor.insert_slice (as destination style ops) no longer need to implement the entire BufferizableOpInterface. Differential Revision: https://reviews.llvm.org/D136347	2022-10-27 10:52:47 +02:00
Christopher Bate	446981bdb6	[mlir][tensor] ExtractSliceFromReshape: handle collapsing of unit dim edge cases Prior to this change, the "ExtractSliceFromReshape" pattern would transform ``` %collapsed = tensor.collapse_shape %input [[0, 1], [2]] : tensor<1x11x100xf32> into tensor<11x100xf32> %slice = tensor.extract_slice %collapsed [%offt, 0] [%size, 100] [1, 1] : tensor<11x100xf32> to tensor<?x100xf32> ``` into a loop that iterated over the range `%size - %offt`, that pieces together multiple sub-slices of `%input` along the first dimension. This is correct but obviously inefficient. The technical condition is that collapsing at-most-one non-unit dimension of `%src` will not result in a subsequent slice along the corresponding dimension of `%collapsed` mapping across discontinuities in the index space of `%src`. Thus, the definition of a "linearized dimension" (from the perspective of `tensor.collapse_shape`) is updated to reflect this condition. The transform will now generate ``` %slice = tensor.extract_slice %input [0, %offt, 0][1, %size, 100] [1, 1] : tensor<1x11x100xf32> to tensor<1x?x100xf32> %result = tensor.collapse_shape [[0, 1], [2]] : tensor<1x?x100xf32> to tensor<?x100xf32> ``` which can be further canonicalized. Additional tests are added to check this family of edge cases. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D135726	2022-10-22 13:29:34 -06:00
Matthias Springer	6cdd34b973	[mlir][tensor][bufferize] Bufferize inserts into equivalent tensors in-place Inserting a tensor into an equivalent tensor is a no-op after bufferization. No alloc is needed. Differential Revision: https://reviews.llvm.org/D132662	2022-10-06 15:06:33 +09:00
Jakub Kuderski	abc362a107	[mlir][arith] Change dialect name from Arithmetic to Arith Suggested by @lattner in https://discourse.llvm.org/t/rfc-define-precise-arith-semantics/65507/22. Tested with: `ninja check-mlir check-mlir-integration check-mlir-mlir-spirv-cpu-runner check-mlir-mlir-vulkan-runner check-mlir-examples` and `bazel build --config=generic_clang @llvm-project//mlir:all`. Reviewed By: lattner, Mogball, rriddle, jpienaar, mehdi_amini Differential Revision: https://reviews.llvm.org/D134762	2022-09-29 11:23:28 -04:00
Lei Zhang	465ec4e0b4	[mlir] NFC: move mergeOffsetsSizesAndStrides into Affine/Utils So that these utility functions can also be used by ViewLikeInterface ops not in the memref dialect. Reviewed By: mravishankar, christopherbate Differential Revision: https://reviews.llvm.org/D134487	2022-09-23 13:28:11 -04:00
Lei Zhang	bd81524e7f	Reland "[mlir][tensor] Support more cases in MergeConsecutiveExtractSlice" This relands commit `5d4603a02d`. It includes fixes to GCC test failures and simplification to the implementation. Co-authored-by: Mahesh Ravishankar <ravishankarm@google.com> Co-authored-by: Christopher Bate <cbate@nvidia.com>	2022-09-22 17:28:50 -04:00
Matthias Springer	04ff6009fc	[mlir][tensor][bufferize] Implement getBufferType for Expand/CollapseShapeOp This function must be implemented for all ops, where the result memref type is different from the input memref type. Differential Revision: https://reviews.llvm.org/D134331	2022-09-21 18:31:59 +09:00
Mehdi Amini	e0a6df53b4	Revert "[mlir][tensor] Support more cases in MergeConsecutiveExtractSlice" This reverts commit `5d4603a02d`. The Dialect/Tensor/fold-consecutive-insert-extract-slice.mlir test is failing when built with GCC	2022-09-21 04:01:57 +00:00
Lei Zhang	2d3b54feb2	[mlir][tensor] NFC: name various Transforms/ files consistently Use a suffix to make clear what the contents inside each file are. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D134202	2022-09-20 20:17:29 -04:00
Lei Zhang	5d4603a02d	[mlir][tensor] Support more cases in MergeConsecutiveExtractSlice This commit adds utility functions to perform general merging of OffsetSizeAndStrideOpInterface by supporting producer rank reducing and non-unit strides. With it we can extend MergeConsecutiveExtractSlice to support more cases. Co-authored-by: Mahesh Ravishankar <ravishankarm@google.com> Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D134294	2022-09-20 20:16:03 -04:00
Lei Zhang	bb4c53b7ba	[mlir][tensor] Merge consecutive insert_slice/extract_slice ops Consecutive tensor.insert_slice/tensor.extract_slice can be created for the case like tiling convolution and then downsizing 2-D convolutions into 1-D ones. It hinders further transformations. So adding these patterns to clean it up. Given that bufferization is sensitive and have requirements over the IR structure (see https://reviews.llvm.org/D132666), these patterns are put in Transforms/ with separate entry points for explicit collection. Reviewed By: ThomasRaoux, mravishankar Differential Revision: https://reviews.llvm.org/D133871	2022-09-20 19:52:56 -04:00
Christopher Bate	4d27f06f94	[mlir][Tensor] Fix ExtractSliceFromReshape transform edge case The transformation would fail if none of the sliced dimensions were linearized by the producing `tensor.collapse_shape`. This is a trivial edge case but it wasn't correctly tested. Fixes the issue and adds a test. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D134088	2022-09-19 14:02:45 -06:00
Alex Zinenko	46b90a7b5d	[mlir] make remaining memref dialect ops produce strided layouts The three following ops in the memref dialect: transpose, expand_shape, collapse_shape, have been originally designed to operate on memrefs with strided layouts but had to go through the affine map representation as the type did not support anything else. Make these ops produce memref values with StridedLayoutAttr instead now that it is available. Depends On D133938 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D133947	2022-09-16 10:56:48 +02:00
Christopher Bate	f4a478cd01	[mlir][Tensor] Add rewrites to extract slices through `tensor.collape_shape` This change adds a set of utilities to replace the result of a `tensor.collapse_shape -> tensor.extract_slice` chain with the equivalent result formed by aggregating slices of the `tensor.collapse_shape` source. In general, it is not possible to commute `extract_slice` and `collapse_shape` if linearized dimensions are sliced. The i-th dimension of the `tensor.collapse_shape` result is a "linearized sliced dimension" if: 1) Reassociation indices of tensor.collapse_shape in the i'th position is greater than size 1 (multiple dimensions of the input are collapsed) 2) The i-th dimension is sliced by `tensor.extract_slice`. We can work around this by stitching together the result of `tensor.extract_slice` by iterating over any linearized sliced dimensions. This is equivalent to "tiling" the linearized-and-sliced dimensions of the `tensor.collapse_shape` operation in order to manifest the result tile (the result of the `tensor.extract_slice`). The user of the utilities must provide the mechanism to create the tiling (e.g. a loop). In the tests, it is demonstrated how to apply the utilities using either `scf.for` or `scf.foreach_thread`. The below example illustrates the pattern using `scf.for`: ``` %0 = linalg.generic ... -> tensor<3x7x11x10xf32> %1 = tensor.collapse_shape %0 [[0, 1, 2], [3]] : ... to tensor<341x10xf32> %2 = tensor.extract_slice %1 [13, 0] [10, 10] [2, 1] : .... 
tensor<10x10xf32> ``` We can construct %2 by generating the following IR: ``` %dest = linalg.init_tensor() : tensor<10x10xf32> %2 = scf.for %iv = %c0 to %c10 step %c1 iter_args(%arg0) -> tensor<10x10xf32> { // Step 1: Map this output idx (%iv) to a multi-index for the input (%3): %linear_index = affine.apply affine_map<(d0)[]->(d0*2 + 11)>(%iv) %3:3 = arith.delinearize_index %iv into (3, 7, 11) // Step 2: Extract the slice from the input %4 = tensor.extract_slice %0 [%3#0, %3#1, %3#2, 0] [1, 1, 1, 10] [1, 1, 1, 1] : tensor<3x7x11x10xf32> to tensor<1x1x1x10xf32> %5 = tensor.collapse_shape %4 [[0, 1, 2], [3]] : tensor<1x1x1x10xf32> into tensor<1x10xf32> // Step 3: Insert the slice into the destination %6 = tensor.insert_slice %5 into %arg0 [%iv, 0] [1, 10] [1, 1] : tensor<1x10xf32> into tensor<10x10xf32> scf.yield %6 : tensor<10x10xf32> } ``` The pattern was discussed in the RFC here: https://discourse.llvm.org/t/rfc-tensor-extracting-slices-from-tensor-collapse-shape/64034 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D129699	2022-09-08 21:58:21 -06:00
Mehdi Amini	0b1aee38bd	Revert "[mlir][Tensor] Add rewrites to extract slices through `tensor.collape_shape`" This reverts commit `5711957875`. A circular dependency is introduced here from Dialect/Utils/ to the ViewLikeInterface, but it already depends on Dialect/Utils. Also this introduces a dependency from lib/Dialect/Tensor to Linalg, which isn't obviously correct from a layering point of view.	2022-09-02 23:34:52 +00:00
Christopher Bate	5711957875	[mlir][Tensor] Add rewrites to extract slices through `tensor.collape_shape` This change adds a set of utilities to replace the result of a `tensor.collapse_shape -> tensor.extract_slice` chain with the equivalent result formed by aggregating slices of the `tensor.collapse_shape` source. In general, it is not possible to commute `extract_slice` and `collapse_shape` if linearized dimensions are sliced. The i-th dimension of the `tensor.collapse_shape` result is a "linearized sliced dimension" if: 1) Reassociation indices of tensor.collapse_shape in the i'th position is greater than size 1 (multiple dimensions of the input are collapsed) 2) The i-th dimension is sliced by `tensor.extract_slice`. We can work around this by stitching together the result of `tensor.extract_slice` by iterating over any linearized sliced dimensions. This is equivalent to "tiling" the linearized-and-sliced dimensions of the `tensor.collapse_shape` operation in order to manifest the result tile (the result of the `tensor.extract_slice`). The user of the utilities must provide the mechanism to create the tiling (e.g. a loop). In the tests, it is demonstrated how to apply the utilities using either `scf.for` or `scf.foreach_thread`. The below example illustrates the pattern using `scf.for`: ``` %0 = linalg.generic ... -> tensor<3x7x11x10xf32> %1 = tensor.collapse_shape %0 [[0, 1, 2], [3]] : ... to tensor<341x10xf32> %2 = tensor.extract_slice %1 [13, 0] [10, 10] [2, 1] : .... 
tensor<10x10xf32> ``` We can construct %2 by generating the following IR: ``` %dest = linalg.init_tensor() : tensor<10x10xf32> %2 = scf.for %iv = %c0 to %c10 step %c1 iter_args(%arg0) -> tensor<10x10xf32> { // Step 1: Map this output idx (%iv) to a multi-index for the input (%3): %linear_index = affine.apply affine_map<(d0)[]->(d0*2 + 11)>(%iv) %3:3 = arith.delinearize_index %iv into (3, 7, 11) // Step 2: Extract the slice from the input %4 = tensor.extract_slice %0 [%3#0, %3#1, %3#2, 0] [1, 1, 1, 10] [1, 1, 1, 1] : tensor<3x7x11x10xf32> to tensor<1x1x1x10xf32> %5 = tensor.collapse_shape %4 [[0, 1, 2], [3]] : tensor<1x1x1x10xf32> into tensor<1x10xf32> // Step 3: Insert the slice into the destination %6 = tensor.insert_slice %5 into %arg0 [%iv, 0] [1, 10] [1, 1] : tensor<1x10xf32> into tensor<10x10xf32> scf.yield %6 : tensor<10x10xf32> } ``` The pattern was discussed in the RFC here: https://discourse.llvm.org/t/rfc-tensor-extracting-slices-from-tensor-collapse-shape/64034 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D129699	2022-09-02 11:29:04 -06:00
Matthias Springer	4cd7362083	[mlir][SCF] foreach_thread: Capture shared output tensors explicitly This change refines the semantics of scf.foreach_thread. Tensors that are inserted into in the terminator must now be passed to the region explicitly via `shared_outs`. Inside of the body of the op, those tensors are then accessed via block arguments. The body of a scf.foreach_thread is now treated as a repetitive region. I.e., op dominance can no longer be used in conflict detection when using a value that is defined outside of the body. Such uses may now be considered as conflicts (if there is at least one read and one write in the body), effectively privatizing the tensor. Shared outputs are not privatized when they are used via their corresponding block arguments. As part of this change, it was also necessary to update the "tiling to scf.foreach_thread", such that the generated tensor.extract_slice ops use the scf.foreach_thread's block arguments. This is implemented by cloning the TilingInterface op inside the scf.foreach_thread, rewriting all of its outputs with block arguments and then calling the tiling implementation. Afterwards, the cloned op is deleted again. Differential Revision: https://reviews.llvm.org/D133114	2022-09-02 14:54:04 +02:00
Matthias Springer	547942841f	[mlir][interfaces] Drop `dest`/`tileDestOperands` from TilingInterface `getTiledImplementation`/`generateResultTileValue` only computes the tiled operation, but does not insert the result into any tensor. Differential Revision: https://reviews.llvm.org/D133015	2022-09-01 08:53:53 +02:00
Michele Scuttari	67d0d7ac0a	[MLIR] Update pass declarations to new autogenerated files The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure. Reviewed By: mehdi_amini, rriddle Differential Review: https://reviews.llvm.org/D132838	2022-08-31 12:28:45 +02:00
Michele Scuttari	039b969b32	Revert "[MLIR] Update pass declarations to new autogenerated files" This reverts commit `2be8af8f0e`.	2022-08-30 22:21:55 +02:00
Michele Scuttari	2be8af8f0e	[MLIR] Update pass declarations to new autogenerated files The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure. Reviewed By: mehdi_amini, rriddle Differential Review: https://reviews.llvm.org/D132838	2022-08-30 21:56:31 +02:00
Matthias Springer	123c4b0251	[mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.for) Even though iter_arg and init_arg of an scf.for loop may have the same tensor type, their bufferized memref types are not necessarily equal. It is sometimes necessary to insert a cast in case of differing layout maps. Differential Revision: https://reviews.llvm.org/D132860	2022-08-30 16:35:32 +02:00
Matthias Springer	111c919665	[mlir][bufferization] Generalize getBufferType This change generalizes getBufferType. This function can be used to predict the buffer type of any tensor value (not just BlockArguments) without changing any IR. It also subsumes getMemorySpace. This is useful for loop bufferization, where the precise buffer type of an iter_arg cannot be known without examining the loop body. Differential Revision: https://reviews.llvm.org/D132859	2022-08-30 16:26:44 +02:00
Matthias Springer	ba95bf765d	[mlir][tensor] Add getMixedSizes helper This helper function computes the dimensions of a tensor value as OpFoldResults. Differential Revision: https://reviews.llvm.org/D132475	2022-08-25 10:25:41 +02:00
Matthias Springer	c37ed7762e	[tensor][bufferize] Use affine.apply instead of arith.addi in PadOp lowering Affine exprs compose better than arith ops. Differential Revision: https://reviews.llvm.org/D132456	2022-08-23 11:46:11 +02:00
Matthias Springer	9ee12f4778	[mlir][tensor][bufferize] Bufferize tensor.pad tensor.pad is lowered to tensor.generate + tensor.insert_slice during bufferization. For best performance with constant padding values, users should vectorize the IR before bufferizing it. This change also relaxes the restriction that no new ops that bufferize to a memory write should be added during bufferization. Since bufferization has been split into two steps a while ago (tensor copy insertion + bufferization), it is reasonable to allow this now. Differential Revision: https://reviews.llvm.org/D132355	2022-08-22 17:00:33 +02:00
Matthias Springer	1defec8730	[mlir][tensor][bufferize][NFC] Remove duplicate code InsertSliceOp and ParallelInsertSliceOp are very similar and can share some of the bufferization analysis code. Differential Revision: https://reviews.llvm.org/D130465	2022-07-25 12:34:16 +02:00
Matthias Springer	664ffa46bb	[mlir][tensor][bufferize] Fix deallocation of GenerateOp/FromElementsOp Both ops allocate a buffer. There were cases in which the buffer was not deallocated. Differential Revision: https://reviews.llvm.org/D130469	2022-07-25 12:25:06 +02:00
Matthias Springer	5f5f71e737	[mlir][tensor][bufferize] Load dependent dialects Load dialects that will be generated by the extension. (Except for BufferizationDialect and MemrefDialect which are loaded already.) Differential Revision: https://reviews.llvm.org/D130463	2022-07-25 11:36:10 +02:00
Kazu Hirata	c27d815249	[mlir] Use value instead of getValue (NFC)	2022-07-14 00:19:59 -07:00
Jacques Pienaar	136d746ec7	[mlir] Flip accessors to prefixed form (NFC) Another mechanical sweep to keep diff small for flip to _Prefixed.	2022-07-10 21:19:11 -07:00
Matthias Springer	606f7c8f7a	[mlir][bufferization][NFC] Move more unknown type conversion logic into BufferizationOptions The `unknownTypeConversion` bufferization option (enum) is now a type converter function option. Some logic of `getMemRefType` is now handled by that function. This change makes type conversion more controllable. Previously, there were only two options when generating memref types for non-bufferizable ops: Static identity layout or fully dynamic layout. With this change, users of One-Shot Bufferize can provide a function with custom logic. Differential Revision: https://reviews.llvm.org/D129273	2022-07-07 13:36:28 +02:00

123 Commits