clang-p2996

Author	SHA1	Message	Date
Matthias Springer	2ff2e871f5	[mlir][bufferization] Remove remaining dialect conversion-based infra parts (#114155 ) This commit removes the last remaining components of the dialect conversion-based bufferization passes. Note for LLVM integration: If you depend on these components, migrate to One-Shot Bufferize or copy them to your codebase.	2024-11-27 09:54:22 +09:00
Christopher Bate	ced2fc7819	[mlir][bufferization] Fix OneShotBufferize when `defaultMemorySpaceFn` is used (#91524 ) As described in issue llvm/llvm-project#91518, a previous PR llvm/llvm-project#78484 introduced the `defaultMemorySpaceFn` into bufferization options, allowing one to inform OneShotBufferize that it should use a specified function to derive the memory space attribute from the encoding attribute attached to tensor types. However, introducing this feature exposed unhandled edge cases, examples of which are introduced by this change in the new test under `test/Dialect/Bufferization/Transforms/one-shot-bufferize-encodings.mlir`. Fixing the inconsistencies introduced by `defaultMemorySpaceFn` is pretty simple. This change: - Updates the `bufferization.to_memref` and `bufferization.to_tensor` operations to explicitly include operand and destination types, whereas previously they relied on type inference to deduce the tensor types. Since the type inference cannot recover the correct tensor encoding/memory space, the operand and result types must be explicitly included. This is a small assembly format change, but it touches a large number of test files. - Makes minor updates to other bufferization functions to handle the changes in building the above ops. - Updates bufferization of `tensor.from_elements` to handle memory space. Integration/upgrade guide: In downstream projects, if you have tests or MLIR files that explicitly use `bufferization.to_tensor` or `bufferization.to_memref`, then update them to the new assembly format as follows: ``` %1 = bufferization.to_memref %0 : memref<10xf32> %2 = bufferization.to_tensor %1 : memref<10xf32> ``` becomes ``` %1 = bufferization.to_memref %0 : tensor<10xf32> to memref<10xf32> %2 = bufferization.to_tensor %0 : memref<10xf32> to tensor<10xf32> ```	2024-11-26 09:45:57 -07:00
Matthias Springer	cbc7802233	[mlir][bufferization] Remove `finalizing-bufferize` pass (#114154 ) The dialect conversion-based bufferization passes have been migrated to One-Shot Bufferize about two years ago. To clean up the code base, this commit removes the `finalizing-bufferize` pass, one of the few remaining parts of the old infrastructure. Most bufferization passes have already been removed. Note for LLVM integration: If you depend on this pass, migrate to One-Shot Bufferize or copy the pass to your codebase. Depends on #114152.	2024-11-21 10:51:23 +09:00
Matthias Springer	804d3c4ce1	[mlir][IR] Add `Block::isReachable` helper function (#114928 ) Add a new helper function `isReachable` to `Block`. This function traverses all successors of a block to determine if another block is reachable from the current block. This functionality has been reimplemented in multiple places in MLIR. Possibly additional copies in downstream projects. Therefore, moving it to a common place.	2024-11-13 14:58:09 +09:00
Matthias Springer	b0a4e958e8	[mlir][bufferization] Add support for non-unique `func.return` (#114017 ) Multiple `func.return` ops inside of a `func.func` op are now supported during bufferization. This PR extends the code base in 3 places: - When inferring function return types, `memref.cast` ops are folded away only if all `func.return` ops have matching buffer types. (E.g., we don't fold if two `return` ops have operands with different layout maps.) - The alias sets of all `func.return` ops are merged. That's because aliasing is a "may be" property. - The equivalence sets of all `func.return` ops are taken only if they match. If different `func.return` ops have different equivalence sets for their operands, the equivalence information is dropped. That's because equivalence is a "must be" property. This commit is in preparation of removing the deprecated `func-bufferize` pass. That pass can bufferize functions with multiple `return` ops.	2024-11-13 08:51:39 +09:00
Matthias Springer	c271ba7f79	[mlir][bufferization] Add support for recursive function calls (#114003 ) This commit adds support for recursive function calls to One-Shot Bufferize. The analysis does not support recursive function calls. The function body itself can be analyzed, but we cannot make any assumptions about the aliasing relation between function result and function arguments. Similarly, when looking at a `call` op, we do not know whether the operands will bufferize to a memory read/write. In the absence of such information, we have to conservatively assume that they do. This commit is in preparation of removing the deprecated `func-bufferize` pass. That pass can bufferize recursive functions.	2024-11-05 10:18:35 +09:00
Matthias Springer	217700baf7	[mlir][bufferization] Support bufferization of external functions (#113999 ) This commit adds support for bufferizing external functions that have no body. Such functions were previously rejected by One-Shot Bufferize if they returned a tensor value. This commit is in preparation of removing the deprecated `func-bufferize` pass. That pass can bufferize external functions. Also update a few comments.	2024-10-30 21:49:10 +09:00
Andrzej Warzyński	91c11574e8	Revert "[MLIR] Make `OneShotModuleBufferize` use `OpInterface` (#110322 )" (#113124 ) This reverts commit `2026501cf1`. Failing bot: * https://lab.llvm.org/staging/#/builders/125/builds/389	2024-10-22 13:28:44 +01:00
Simon Camphausen	70334081f7	[mlir][bufferization] Expose buffer alignment as a pass option in one-shot-bufferize (#112505 )	2024-10-16 11:49:49 +02:00
Matthias Springer	206fad0e21	[mlir][NFC] Mark type converter in `populate...` functions as `const` (#111250 ) This commit marks the type converter in `populate...` functions as `const`. This is useful for debugging. Patterns already take a `const` type converter. However, some `populate...` functions do not only add new patterns, but also add additional type conversion rules. That makes it difficult to find the place where a type conversion was added in the code base. With this change, all `populate...` functions that only populate pattern now have a `const` type converter. Programmers can then conclude from the function signature that these functions do not register any new type conversion rules. Also some minor cleanups around the 1:N dialect conversion infrastructure, which did not always pass the type converter as a `const` object internally.	2024-10-05 21:32:40 +02:00
Tzung-Han Juang	2026501cf1	[MLIR] Make `OneShotModuleBufferize` use `OpInterface` (#110322 ) Description: This PR replaces a part of `FuncOp` and `CallOp` with `FunctionOpInterface` and `CallOpInterface` in `OneShotModuleBufferize`. Also fix the error from an integration test in the a previous PR attempt. (https://github.com/llvm/llvm-project/pull/107295) The below fixes skip `CallOpInterface` so that the assertions are not triggered. `8d78000762/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp (L254-L259)` `8d78000762/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp (L311-L315)` Related Discord Discussion: [Link](https://discord.com/channels/636084430946959380/642426447167881246/1280556809911799900) --------- Co-authored-by: erick-xanadu <110487834+erick-xanadu@users.noreply.github.com>	2024-10-01 15:58:52 +02:00
Matthias Springer	49df12c01e	[mlir][NFC] Minor cleanup around `ModuleOp` usage (#110498 ) Use `moduleOp.getBody()` instead of `moduleOp.getBodyRegion().front()`.	2024-09-30 21:20:48 +02:00
Matthias Springer	ae7b454f98	Revert "[MLIR] Make `OneShotModuleBufferize` use `OpInterface`" (#109919 ) Reverts llvm/llvm-project#107295 This commit breaks an integration test: ``` build/bin/mlir-opt mlir/test/Integration/Dialect/Complex/CPU/correctness.mlir -one-shot-bufferize="bufferize-function-boundaries" ```	2024-09-25 09:17:49 +02:00
Tzung-Han Juang	f586b1e3f4	[MLIR] Make `OneShotModuleBufferize` use `OpInterface` (#107295 ) Description: `OneShotModuleBufferize` deals with the bufferization of `FuncOp`, `CallOp` and `ReturnOp` but they are hard-coded. Any custom function-like operations will not be handled. The PR replaces a part of `FuncOp` and `CallOp` with `FunctionOpInterface` and `CallOpInterface` in `OneShotModuleBufferize` so that custom function ops and call ops can be bufferized. Related Discord Discussion: [Link](https://discord.com/channels/636084430946959380/642426447167881246/1280556809911799900) --------- Co-authored-by: erick-xanadu <110487834+erick-xanadu@users.noreply.github.com>	2024-09-25 07:27:21 +02:00
Kazu Hirata	8d8bedef0d	[Bufferization] Avoid repeated hash lookups (NFC) (#108925 )	2024-09-17 00:18:23 -07:00
JOE1994	884221eddb	[mlir] Tidy uses of llvm::raw_stream_ostream (NFC) As specified in the docs, 1) raw_string_ostream is always unbuffered and 2) the underlying buffer may be used directly ( `65b13610a5` for further reference ) * Don't call raw_string_ostream::flush(), which is essentially a no-op. * Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.	2024-09-16 23:23:25 -04:00
Henrich Lauko	d1cad2290c	Reland [MLIR] Make resolveCallable customizable in CallOpInterface (#107989 ) Relands #100361 with fixed dependencies.	2024-09-10 15:33:13 +02:00
Matthias Springer	7574042e2a	Revert "[MLIR] Make `resolveCallable` customizable in `CallOpInterface`" (#107984 ) Reverts llvm/llvm-project#100361 This commit caused some linker errors. (Missing `MLIRCallInterfaces` dependency.)	2024-09-10 10:24:05 +02:00
Henrich Lauko	958f59d90f	[MLIR] Make `resolveCallable` customizable in `CallOpInterface` (#100361 ) Allow customization of the `resolveCallable` method in the `CallOpInterface`. This change allows for operations implementing this interface to provide their own logic for resolving callables. - Introduce the `resolveCallable` method, which does not include the optional symbol table parameter. This method replaces the previously existing extra class declaration `resolveCallable`. - Introduce the `resolveCallableInTable` method, which incorporates the symbol table parameter. This method replaces the previous extra class declaration `resolveCallable` that used the optional symbol table parameter.	2024-09-10 10:08:41 +02:00
Menooker	26645ae2ee	[mlir][memref] Fix hoist-static-allocs option of buffer-results-to-out-params when function parameters are returned (#102093 ) buffer-results-to-out-params pass will have a nullptr-referencing error when hoist-static-allocs option is on, when the return value of a function is a parameter of the function. This PR fixes this issue.	2024-09-04 20:36:19 +08:00
Longsheng Mou	7f04a8ad13	[mlir][func][bufferization] Fix cast incompatible when bufferize callOp (#105929 ) Handle caller/callee type mismatch using `castOrReallocMemRefValue` instead of just a `CastOp`. The method inserts a reallocation + copy if it cannot be statically guaranteed that a direct cast would be valid. Fix #105916.	2024-08-27 07:06:00 +08:00
Dennis Filimonov	6de04e6fe8	[mlir][bufferization] Adding the optimize-allocation-liveness pass (#101827 ) Adding a pass that is expected to run after the deallocation pipeline and will move buffer deallocations right after their last user or dependency, thus optimizing the allocation liveness.	2024-08-14 13:22:47 +02:00
Giuseppe Rossini	441b672bbd	[mlir] Fix block merging (#102038 ) With this PR I am trying to address: https://github.com/llvm/llvm-project/issues/63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operations operands in the block will point to the argument we already inserted. This needs to happen to all the arguments we pass to the different successors of the parent block - After merged the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid . The solution I found is to do a more prudent simplification when running that pass. Note-1: I ran all the integration tests (`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed. Note-2: I fixed a bug found by @Dinistro in #97697 . The issue was that, when looking for redundant arguments, I was not considering that the block might have already some arguments. So the index (in the block args list) of the i-th `newArgument` is `i+numOfOldArguments`.	2024-08-07 09:10:01 +01:00
Longsheng Mou	6867324eee	[mlir][bufferization] Improve performance of DropEquivalentBufferResultsPass (#101281 ) Use a DenseMap to minimize the traversal time of callOps, which greatly improves the efficiency of running this pass.	2024-08-02 09:22:20 +08:00
Christian Ulmann	6a5a64c56b	Revert "[mlir] Fix block merging" (#100510 ) Reverts llvm/llvm-project#97697 This commit introduced non-trivial bugs related to type consistency.	2024-07-25 10:42:25 +02:00
Giuseppe Rossini	c63125d453	[mlir] Fix block merging (#97697 ) With this PR I am trying to address: https://github.com/llvm/llvm-project/issues/63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operations operands in the block will point to the argument we already inserted. This needs to happen to all the arguments we pass to the different successors of the parent block - After merged the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid . The solution I found is to do a more prudent simplification when running that pass. Note: this a rework of #96871 . I ran all the integration tests (`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed.	2024-07-17 17:05:40 +01:00
donald chen	662c6fc74c	[mlir] [bufferize] fix bufferize deallocation error in nest symbol table (#98476 ) In nested symbols, the dealloc_helper function generated by the lower deallocations pass was incorrectly positioned, causing calls to fail. This patch fixes this issue.	2024-07-15 12:52:46 +08:00
Nikhil Kalra	0ad6ac8c53	[NFC][MLIR] Fix: `alloca` promotion for `AllocationOpInterface` (#97672 ) The std::optional returned by buildPromotedAlloc was directly dereferenced and assumed to be non-null, even though the documentation for AllocationOpInterface indicates that std::nullopt is a legal value if buffer stack promotion is not supported (and is the default value supplied by the TableGen interface file). This patch removes the direct dereference so that the optional can be null-checked prior to use. Co-authored-by: Nikhil Kalra <nkalra@apple.com>	2024-07-04 08:49:33 +02:00
Mehdi Amini	28a11cc492	Revert "Fix block merging" (#97460 ) Reverts llvm/llvm-project#96871 Bots are broken.	2024-07-02 20:57:16 +02:00
Giuseppe Rossini	6c3897d90e	Fix block merging (#96871 ) With this PR I am trying to address: https://github.com/llvm/llvm-project/issues/63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operations operands in the block will point to the argument we already inserted - After merged the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid . The solution I found is to do a more prudent simplification when running that pass. Note: many tests are still not passing. But I wanted to submit the code before changing all the tests (and probably adding a couple), so that we can agree in principle on the algorithm/design.	2024-07-02 17:12:33 +01:00
Ramkumar Ramachandra	db791b278a	mlir/LogicalResult: move into llvm (#97309 ) This patch is part of a project to move the Presburger library into LLVM.	2024-07-02 10:42:33 +01:00
Matthias Springer	cf9b77a636	[mlir][bufferization] Fix bug in bufferization of elementwise ops (#97209 ) There is an optimization in One-Shot Bufferize wrt. ops that bufferize to elementwise access. A copy can sometimes be avoided. E.g.: ``` %0 = tensor.empty() %1 = tensor.fill ... %2 = linalg.map ins(%1, ...) outs(%1) ``` In the above example, a buffer copy is not needed for %1, even though the same buffer is read/written by two different operands (of the same op). That's because the op bufferizes to elementwise access. ```c++ // Two equivalent operands of the same op are not conflicting if the op // bufferizes to element-wise access. I.e., all loads at a position // happen before all stores to the same position. ``` This optimization cannot be applied when op dominance cannot be used to rule out conflicts. E.g., when the `linalg.map` is inside of a loop. In such a case, the reads/writes happen multiple times and it is not guaranteed that "all loads at a position happen before all stores to the same position." Fixes #90019.	2024-07-01 19:00:21 +02:00
zhicong zhong	1d4ce574a4	[mlir][bufferization] skip empty tensor elimination if they have different element type (#96998 ) In the original implementation, empty tensor elimination adds a `tensor.cast` and eliminates the tensor even if they have different element types (f32, bf16). This commit adds a check for the element type and skips the elimination if they differ.	2024-07-01 09:30:04 +08:00
McCowan Zhang	a159b36724	Bufferization with ControlFlow Asserts (#95868 ) Fixed incorrect bufferization interaction with cf.assert - reordered bufferization condition checking - fixed hasNeitherAllocateNorFreeSideEffect checking bug - implemented memory interface for cf.assert --------- Co-authored-by: McCowan Zhang <mccowan.z@ssi.samsung.com>	2024-06-26 08:00:39 +02:00
Max191	d586372194	[mlir] Add bufferization option for parallel region check (#94645 ) Handling parallel region RaW conflicts should usually be the responsibility of the source program, rather than bufferization analysis. However, to preserve current functionality, checks on parallel regions are put behind a bufferization option in this PR, which is on by default. Default functionality will not change, but this PR enables the option to leave parallelism checks out of the bufferization analysis.	2024-06-11 10:31:06 -04:00
Matthias Springer	13896b6ce9	[mlir][bufferization] Fix handling of indirect function calls (#94896 ) This commit fixes a crash in the ownership-based buffer deallocation pass when indirectly calling a function via SSA value. Such functions must be conservatively assumed to be public. Fixes #94780.	2024-06-10 08:07:24 +02:00
Kunwar Grover	debdbeda15	[mlir] Remove dialect specific bufferization passes (Reland) (#93535 ) These passes have been deprecated for a long time and replaced by one-shot bufferization. These passes are also unsafe because they do not check for read-after-write conflicts. Relands https://github.com/llvm/llvm-project/pull/93488 which failed on buildbot. Fixes the failure by updating integration tests to use one-shot-bufferize instead.	2024-05-28 20:04:27 +01:00
Kunwar Grover	39848d0a98	Revert "[mlir] Remove dialect specific bufferization passes" (#93528 ) Reverts llvm/llvm-project#93488 Buildbot failure: https://lab.llvm.org/buildbot/#/builders/220/builds/39911	2024-05-28 11:21:34 +01:00
Kunwar Grover	2fc5106437	[mlir] Remove dialect specific bufferization passes (#93488 ) These passes have been deprecated for a long time and replaced by one-shot bufferization. These passes are also unsafe because they do not check for read-after-write conflicts.	2024-05-28 11:12:58 +01:00
Jie Fu	1c8c2fdd28	[mlir] Fix -Wdeprecated-declarations in BufferResultsToOutParams.cpp (NFC) /llvm-project/mlir/lib/Dialect/Bufferization/Transforms/BufferResultsToOutParams.cpp:124:26: error: 'cast' is deprecated: Use mlir::cast<U>() instead [-Werror,-Wdeprecated-declarations] 124 \| orig.getType().cast<MemRefType>().hasStaticShape()) { \|	2024-05-08 10:38:34 +08:00
Menooker	0af448b711	[MLIR][Bufferization] BufferResultsToOutParams: Add an option to eliminate AllocOp and avoid Copy (#90011 ) Add an option hoist-static-allocs to remove the unnecessary memref.alloc and memref.copy after this pass, when the memref in ReturnOp is allocated by memref.alloc and is statically shaped. Instead, it replaces the uses of the allocated memref with the memref in the out argument. By default, BufferResultsToOutParams will result in a memcpy operation to copy the originally returned memref to the output argument memref. This is inefficient when the source of memcpy (the returned memref in the original ReturnOp) is from a local AllocOp. The pass can use the output argument memref to replace the locally allocated memref for better performance. hoist-static-allocs avoids dynamic allocation and memory movement. This option will be critical for performance-sensitive applications, which require the BufferResultsToOutParams pass for a caller-owned output buffer calling convention.	2024-05-08 10:14:52 +08:00
Rafael Ubal	a42a2ca19b	Avoid buffer hoisting from parallel loops (#90735 ) This change corrects an invalid behavior in pass `--buffer-loop-hoisting`. The pass is in charge of extracting buffer allocations (e.g., `memref.alloca`) from loop regions (e.g., `scf.for`) when possible. This works OK for loops with sequential execution semantics. However, a buffer allocated in the body of a parallel loop may be concurrently accessed by multiple threads to store its local data. Extracting such buffer from the loop causes all threads to wrongly share the same memory region. In the following example, dimension 1 of the input tensor is reversed. Dimension 0 is traversed with a parallel loop. ``` func.func @f(%input: memref<2x3xf32>) -> memref<2x3xf32> { %c0 = index.constant 0 %c1 = index.constant 1 %c2 = index.constant 2 %c3 = index.constant 3 %output = memref.alloc() : memref<2x3xf32> scf.parallel (%index) = (%c0) to (%c2) step (%c1) { // Create subviews for working input and output slices %input_slice = memref.subview %input[%index, 2][1, 3][1, -1] : memref<2x3xf32> to memref<1x3xf32, strided<[3, -1], offset: ?>> %output_slice = memref.subview %output[%index, 0][1, 3][1, 1] : memref<2x3xf32> to memref<1x3xf32, strided<[3, 1], offset: ?>> // Copy the input slice into this temporary buffer. This intermediate // copy is unnecessary, but is used for illustration purposes. %temp = memref.alloc() : memref<1x3xf32> memref.copy %input_slice, %temp : memref<1x3xf32, strided<[3, -1], offset: ?>> to memref<1x3xf32> // Copy temporary buffer into output slice memref.copy %temp, %output_slice : memref<1x3xf32> to memref<1x3xf32, strided<[3, 1], offset: ?>> scf.reduce } return %output : memref<2x3xf32> } ``` The patch submitted here prevents `%temp = memref.alloc() : memref<1x3xf32>` from being hoisted when the containing op is `scf.parallel` or `scf.forall`. A new op trait called `HasParallelRegion` is introduced and assigned to these two ops to indicate that their regions have parallel execution semantics. @joker-eph @ftynse @nicolasvasilache @sabauma	2024-05-04 08:35:36 +02:00
Matthias Springer	179e174945	[mlir][bufferization][NFC] More documentation for `runOneShotBufferize` (#90445 )	2024-04-29 13:23:37 +02:00
Christian Sigg	a5757c5b65	Switch member calls to `isa/dyn_cast/cast/...` to free function calls. (#89356 ) This change cleans up call sites. Next step is to mark the member functions deprecated. See https://mlir.llvm.org/deprecation and https://discourse.llvm.org/t/preferred-casting-style-going-forward.	2024-04-19 15:58:27 +02:00
Matthias Gehre	c515c78024	[mlir][Bufferization] castOrReallocMemRefValue: Use BufferizationOptions (#89175 ) This allows to configure both the op used for allocation and copy of memrefs. It also changes the default behavior because the default allocation in `BufferizationOptions` creates `memref.alloc` with `alignment = 64` where we used to create `memref.alloca` without any alignment before. Fixes ``` // TODO: Use alloc/memcpy callback from BufferizationOptions if called via // BufferizableOpInterface impl of ToMemrefOp. ```	2024-04-18 15:47:08 +02:00
Jakub Kuderski	971b852546	[mlir][NFC] Simplify type checks with isa predicates (#87183 ) For more context on isa predicates, see: https://github.com/llvm/llvm-project/pull/83753.	2024-04-01 11:40:09 -04:00
Matthias Springer	dbfc38ed6b	[mlir][bufferization] Add `BufferOriginAnalysis` (#86461 ) This commit adds the `BufferOriginAnalysis`, which can be queried to check if two buffer SSA values originate from the same allocation. This new analysis is used in the buffer deallocation pass to fold away or simplify `bufferization.dealloc` ops more aggressively. The `BufferOriginAnalysis` is based on the `BufferViewFlowAnalysis`, which collects buffer SSA value "same buffer" dependencies. E.g., given IR such as: ``` %0 = memref.alloc() %1 = memref.subview %0 %2 = memref.subview %1 ``` The `BufferViewFlowAnalysis` will report the following "reverse" dependencies (`resolveReverse`) for `%2`: {`%2`, `%1`, `%0`}. I.e., all buffer SSA values in the reverse use-def chain that originate from the same allocation as `%2`. The `BufferOriginAnalysis` is built on top of that. It handles only simple cases at the moment and may conservatively return "unknown" around certain IR with branches, memref globals and function arguments. This analysis enables additional simplifications during `-buffer-deallocation-simplification`. In particular, "regular" scf.for loop nests, that yield buffers (or reallocations thereof) in the same order as they appear in the iter_args, are now handled much more efficiently. Such IR patterns are generated by the sparse compiler.	2024-03-25 18:57:53 +09:00
Matthias Springer	a45e58af1b	[mlir][bufferization] Add `BufferViewFlowOpInterface` (#78718 ) This commit adds the `BufferViewFlowOpInterface` to the bufferization dialect. This interface can be implemented by ops that operate on buffers to indicate that a buffer op result and/or region entry block argument may be the same buffer as a buffer operand (or a view thereof). This interface is queried by the `BufferViewFlowAnalysis`. The new interface has two interface methods: * `populateDependencies`: Implementations use the provided callback to declare dependencies between operands and op results/region entry block arguments. E.g., for `%r = arith.select %c, %m1, %m2 : memref<5xf32>`, the interface implementation should declare two dependencies: %m1 -> %r and %m2 -> %r. * `mayBeTerminalBuffer`: An SSA value is a terminal buffer if the buffer view flow analysis stops at the specified value. E.g., because the value is a newly allocated buffer or because no further information is available about the origin of the buffer. Ops that implement the `RegionBranchOpInterface` or `BranchOpInterface` do not have to implement the `BufferViewFlowOpInterface`. The buffer dependencies can be inferred from those two interfaces. This commit makes the `BufferViewFlowAnalysis` more accurate. For unknown ops, it conservatively used to declare all combinations of operands and op results/region entry block arguments as dependencies (false positives). This is no longer the case. While the analysis is still a "maybe" analysis with false positives (e.g., when analyzing ops such as `arith.select` or `scf.if` where the taken branch is not known at compile time), results and region entry block arguments of unknown ops are now marked as terminal buffers. This commit addresses a TODO in `BufferViewFlowAnalysis.cpp`: ``` // TODO: We should have an op interface instead of a hard-coded list of // interfaces/ops. ``` It is no longer needed to hard-code ops.	2024-03-24 12:48:19 +09:00
Matthias Springer	35d3b3430e	[mlir][bufferization] Add "bottom-up from terminators" analysis heuristic (#83964 ) One-Shot Bufferize currently does not support loops where a yielded value bufferizes to a buffer that is different from the buffer of the region iter_arg. In such a case, the bufferization fails with an error such as: ``` Yield operand #0 is not equivalent to the corresponding iter bbArg scf.yield %0 : tensor<5xf32> ``` One common reason for non-equivalent buffers is that an op on the path from the region iter_arg to the terminator bufferizes out-of-place. Ops that are analyzed earlier are more likely to bufferize in-place. This commit adds a new heuristic that gives preference to ops that are reachable on the reverse SSA use-def chain from a region terminator and are within the parent region of the terminator. This is expected to work better than the existing heuristics for loops where an iter_arg is written to multiple times within a loop, but only one write is fed into the terminator. Current users of One-Shot Bufferize are not affected by this change. "Bottom-up" is still the default heuristic. Users can switch to the new heuristic manually. This commit also turns the "fuzzer" pass option into a heuristic, cleaning up the code a bit.	2024-03-21 14:16:02 +09:00
Benjamin Kramer	db60491127	[mlir][bufferization] Check OpFilter before casting to BufferizableOpInterface (#85690 ) This doesn't change functionality, but lets us avoid attaching all the interfaces after `513cdb8222` turned casting without loading into an error.	2024-03-19 10:31:25 +01:00


335 Commits