clang-p2996

Author	SHA1	Message	Date
Matthias Springer	217700baf7	[mlir][bufferization] Support bufferization of external functions (#113999 ) This commit adds support for bufferizing external functions that have no body. Such functions were previously rejected by One-Shot Bufferize if they returned a tensor value. This commit is in preparation of removing the deprecated `func-bufferize` pass. That pass can bufferize external functions. Also update a few comments.	2024-10-30 21:49:10 +09:00
Matthias Springer	ea050ab1a9	[mlir][Transforms][NFC] Dialect conversion: Reformat materialization error message (#114176 ) This commit changes the format of the materialization error message. Previously: `failed to legalize unresolved materialization from ('f64') to 'f32' that remained live after conversion` Now: `failed to legalize unresolved materialization from ('f64') to ('f32') that remained live after conversion` This commit is in preparation of merging the 1:1 and 1:N dialect conversions. At that point, target materializations may create more than one SSA value. I am sending this change as a separate PR to keep the main PR smaller.	2024-10-30 21:36:39 +09:00
Andrzej Warzyński	91c11574e8	Revert "[MLIR] Make `OneShotModuleBufferize` use `OpInterface` (#110322 )" (#113124 ) This reverts commit `2026501cf1`. Failing bot: * https://lab.llvm.org/staging/#/builders/125/builds/389	2024-10-22 13:28:44 +01:00
Tzung-Han Juang	2026501cf1	[MLIR] Make `OneShotModuleBufferize` use `OpInterface` (#110322 ) Description: This PR replaces a part of `FuncOp` and `CallOp` with `FunctionOpInterface` and `CallOpInterface` in `OneShotModuleBufferize`. Also fix the error from an integration test in the previous PR attempt. (https://github.com/llvm/llvm-project/pull/107295) The below fixes skip `CallOpInterface` so that the assertions are not triggered. `8d78000762/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp (L254-L259)` `8d78000762/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp (L311-L315)` Related Discord Discussion: [Link](https://discord.com/channels/636084430946959380/642426447167881246/1280556809911799900) --------- Co-authored-by: erick-xanadu <110487834+erick-xanadu@users.noreply.github.com>	2024-10-01 15:58:52 +02:00
Matthias Springer	ae7b454f98	Revert "[MLIR] Make `OneShotModuleBufferize` use `OpInterface`" (#109919 ) Reverts llvm/llvm-project#107295 This commit breaks an integration test: ``` build/bin/mlir-opt mlir/test/Integration/Dialect/Complex/CPU/correctness.mlir -one-shot-bufferize="bufferize-function-boundaries" ```	2024-09-25 09:17:49 +02:00
Tzung-Han Juang	f586b1e3f4	[MLIR] Make `OneShotModuleBufferize` use `OpInterface` (#107295 ) Description: `OneShotModuleBufferize` deals with the bufferization of `FuncOp`, `CallOp` and `ReturnOp` but they are hard-coded. Any custom function-like operations will not be handled. The PR replaces a part of `FuncOp` and `CallOp` with `FunctionOpInterface` and `CallOpInterface` in `OneShotModuleBufferize` so that custom function ops and call ops can be bufferized. Related Discord Discussion: [Link](https://discord.com/channels/636084430946959380/642426447167881246/1280556809911799900) --------- Co-authored-by: erick-xanadu <110487834+erick-xanadu@users.noreply.github.com>	2024-09-25 07:27:21 +02:00
Matthias Springer	3815f478bb	[mlir][Transforms] Dialect conversion: Make materializations optional (#107109 ) This commit makes source/target/argument materializations (via the `TypeConverter` API) optional. By default (`ConversionConfig::buildMaterializations = true`), the dialect conversion infrastructure tries to legalize all unresolved materializations right after the main transformation process has succeeded. If at least one unresolved materialization fails to resolve, the dialect conversion fails. (With an error message such as `failed to legalize unresolved materialization ...`.) Automatic materializations through the `TypeConverter` API can now be deactivated. In that case, every unresolved materialization will show up as a `builtin.unrealized_conversion_cast` op in the output IR. There used to be a complex and error-prone analysis in the dialect conversion that predicted the future uses of unresolved materializations. Based on that logic, some casts (that were deemed unnecessary) were folded. This analysis was needed because folding happened at a point in time when some IR changes (e.g., op replacements) had not materialized yet. This commit removes that analysis. Any folding of cast ops now happens after all other IR changes have been materialized and the uses can directly be queried from the IR. This simplifies the analysis significantly. And certain helper data structures such as `inverseMapping` are no longer needed for the analysis. The folding itself is done by `reconcileUnrealizedCasts` (which also exists as a standalone pass). After casts have been folded, the remaining casts are materialized through the `TypeConverter`, as usual. This last step can be deactivated in the `ConversionConfig`. `ConversionConfig::buildMaterializations = false` can be used to debug error messages such as `failed to legalize unresolved materialization ...`. (It is also useful in case automatic materializations are not needed.)
The materializations that failed to resolve can then be seen as `builtin.unrealized_conversion_cast` ops in the resulting IR. (This is better than running with `-debug`, because `-debug` shows IR where some IR changes have not been materialized yet.) Note: This is a reupload of #104668, but with correct handling of cyclic unrealized_conversion_casts that may be generated by the dialect conversion.	2024-09-05 19:40:58 +02:00
Matthias Springer	5eda498811	Revert "[mlir][Transforms] Dialect conversion: Make materializations optional" (#106778 ) Reverts llvm/llvm-project#104668 This commit triggers an edge case that can cause circular `unrealized_conversion_cast` ops. https://github.com/llvm/llvm-project/pull/106760 may fix it, but it has other issues. Reverting this PR for now, until I find a solution for that problem.	2024-08-30 12:34:41 -07:00
Longsheng Mou	7f04a8ad13	[mlir][func][bufferization] Fix cast incompatible when bufferize callOp (#105929 ) Handle caller/callee type mismatch using `castOrReallocMemRefValue` instead of just a `CastOp`. The method inserts a reallocation + copy if it cannot be statically guaranteed that a direct cast would be valid. Fix #105916.	2024-08-27 07:06:00 +08:00
Matthias Springer	d7073c5274	[mlir][Transforms] Dialect conversion: Make materializations optional (#104668 ) This commit makes source/target/argument materializations (via the `TypeConverter` API) optional. By default (`ConversionConfig::buildMaterializations = true`), the dialect conversion infrastructure tries to legalize all unresolved materializations right after the main transformation process has succeeded. If at least one unresolved materialization fails to resolve, the dialect conversion fails. (With an error message such as `failed to legalize unresolved materialization ...`.) Automatic materializations through the `TypeConverter` API can now be deactivated. In that case, every unresolved materialization will show up as a `builtin.unrealized_conversion_cast` op in the output IR. There used to be a complex and error-prone analysis in the dialect conversion that predicted the future uses of unresolved materializations. Based on that logic, some casts (that were deemed unnecessary) were folded. This analysis was needed because folding happened at a point in time when some IR changes (e.g., op replacements) had not materialized yet. This commit removes that analysis. Any folding of cast ops now happens after all other IR changes have been materialized and the uses can directly be queried from the IR. This simplifies the analysis significantly. And certain helper data structures such as `inverseMapping` are no longer needed for the analysis. The folding itself is done by `reconcileUnrealizedCasts` (which also exists as a standalone pass). After casts have been folded, the remaining casts are materialized through the `TypeConverter`, as usual. This last step can be deactivated in the `ConversionConfig`. `ConversionConfig::buildMaterializations = false` can be used to debug error messages such as `failed to legalize unresolved materialization ...`. (It is also useful in case automatic materializations are not needed.)
The materializations that failed to resolve can then be seen as `builtin.unrealized_conversion_cast` ops in the resulting IR. (This is better than running with `-debug`, because `-debug` shows IR where some IR changes have not been materialized yet.)	2024-08-23 14:03:10 -07:00
Matthias Springer	2d50029f98	[mlir][Transforms] Dialect conversion: Build unresolved materialization for replaced ops (#101514 ) When inserting an argument/source/target materialization, the dialect conversion framework first inserts a "dummy" `unrealized_conversion_cast` op (during the rewrite process) and then (in the "finalize" phase) replaces these cast ops with the IR generated by the type converter callback. This is the case for all materializations, except when ops are being replaced with values that have a different type. In that case, the dialect conversion currently directly emits a source materialization. This commit changes the implementation, such that a temporary `unrealized_conversion_cast` is also inserted in that case. This commit simplifies the code base: all materializations now happen in `legalizeUnresolvedMaterialization`. This commit makes it possible to decouple source/target/argument materializations from the dialect conversion (to reduce the complexity of the code base). Such materializations can then also be optional. This will be implemented in a follow-up commit. Depends on #101476. --------- Co-authored-by: Jakub Kuderski <jakub@nod-labs.com>	2024-08-15 11:33:37 +02:00
Dennis Filimonov	6de04e6fe8	[mlir][bufferization] Adding the optimize-allocation-liveness pass (#101827 ) Adding a pass that is expected to run after the deallocation pipeline and will move buffer deallocations right after their last user or dependency, thus optimizing the allocation liveness.	2024-08-14 13:22:47 +02:00
Giuseppe Rossini	441b672bbd	[mlir] Fix block merging (#102038 ) With this PR I am trying to address: https://github.com/llvm/llvm-project/issues/63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operation operands in the block will point to the argument we already inserted. This needs to happen to all the arguments we pass to the different successors of the parent block - After merging the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid. The solution I found is to do a more prudent simplification when running that pass. Note-1: I ran all the integration tests (`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed. Note-2: I fixed a bug found by @Dinistro in #97697 . The issue was that, when looking for redundant arguments, I was not considering that the block might already have some arguments. So the index (in the block args list) of the i-th `newArgument` is `i+numOfOldArguments`.	2024-08-07 09:10:01 +01:00
Christian Ulmann	6a5a64c56b	Revert "[mlir] Fix block merging" (#100510 ) Reverts llvm/llvm-project#97697 This commit introduced non-trivial bugs related to type consistency.	2024-07-25 10:42:25 +02:00
Giuseppe Rossini	c63125d453	[mlir] Fix block merging (#97697 ) With this PR I am trying to address: https://github.com/llvm/llvm-project/issues/63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operation operands in the block will point to the argument we already inserted. This needs to happen to all the arguments we pass to the different successors of the parent block - After merging the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid. The solution I found is to do a more prudent simplification when running that pass. Note: this is a rework of #96871 . I ran all the integration tests (`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed.	2024-07-17 17:05:40 +01:00
donald chen	662c6fc74c	[mlir] [bufferize] fix bufferize deallocation error in nest symbol table (#98476 ) In nested symbols, the dealloc_helper function generated by the lower deallocations pass was incorrectly positioned, causing calls to fail. This patch fixes this issue.	2024-07-15 12:52:46 +08:00
Mehdi Amini	28a11cc492	Revert "Fix block merging" (#97460 ) Reverts llvm/llvm-project#96871 Bots are broken.	2024-07-02 20:57:16 +02:00
Giuseppe Rossini	6c3897d90e	Fix block merging (#96871 ) With this PR I am trying to address: https://github.com/llvm/llvm-project/issues/63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operation operands in the block will point to the argument we already inserted - After merging the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid. The solution I found is to do a more prudent simplification when running that pass. Note: many tests are still not passing. But I wanted to submit the code before changing all the tests (and probably adding a couple), so that we can agree in principle on the algorithm/design.	2024-07-02 17:12:33 +01:00
zhicong zhong	1d4ce574a4	[mlir][bufferization] skip empty tensor elimination if they have different element type (#96998 ) In the original implementation, empty tensor elimination adds a `tensor.cast` and eliminates the tensor even if the element types differ (f32, bf16). This commit adds a check for the element type and skips the elimination if the types differ.	2024-07-01 09:30:04 +08:00
McCowan Zhang	a159b36724	Bufferization with ControlFlow Asserts (#95868 ) Fixed incorrect bufferization interaction with cf.assert - reordered bufferization condition checking - fixed hasNeitherAllocateNorFreeSideEffect checking bug - implemented memory interface for cf.assert --------- Co-authored-by: McCowan Zhang <mccowan.z@ssi.samsung.com>	2024-06-26 08:00:39 +02:00
Matthias Springer	13896b6ce9	[mlir][bufferization] Fix handling of indirect function calls (#94896 ) This commit fixes a crash in the ownership-based buffer deallocation pass when indirectly calling a function via SSA value. Such functions must be conservatively assumed to be public. Fixes #94780.	2024-06-10 08:07:24 +02:00
klensy	f0b0c02504	[mlir][test] Fix filecheck annotation typos (#92897 ) Moved fixes for mlir from https://github.com/llvm/llvm-project/pull/91854, plus few additional in second commit. --------- Co-authored-by: klensy <nightouser@gmail.com>	2024-05-24 09:24:59 +02:00
Rafael Ubal	a42a2ca19b	Avoid buffer hoisting from parallel loops (#90735 ) This change corrects an invalid behavior in pass `--buffer-loop-hoisting`. The pass is in charge of extracting buffer allocations (e.g., `memref.alloca`) from loop regions (e.g., `scf.for`) when possible. This works OK for loops with sequential execution semantics. However, a buffer allocated in the body of a parallel loop may be concurrently accessed by multiple threads to store their local data. Extracting such a buffer from the loop causes all threads to wrongly share the same memory region. In the following example, dimension 1 of the input tensor is reversed. Dimension 0 is traversed with a parallel loop. ``` func.func @f(%input: memref<2x3xf32>) -> memref<2x3xf32> { %c0 = index.constant 0 %c1 = index.constant 1 %c2 = index.constant 2 %c3 = index.constant 3 %output = memref.alloc() : memref<2x3xf32> scf.parallel (%index) = (%c0) to (%c2) step (%c1) { // Create subviews for working input and output slices %input_slice = memref.subview %input[%index, 2][1, 3][1, -1] : memref<2x3xf32> to memref<1x3xf32, strided<[3, -1], offset: ?>> %output_slice = memref.subview %output[%index, 0][1, 3][1, 1] : memref<2x3xf32> to memref<1x3xf32, strided<[3, 1], offset: ?>> // Copy the input slice into this temporary buffer. This intermediate // copy is unnecessary, but is used for illustration purposes. %temp = memref.alloc() : memref<1x3xf32> memref.copy %input_slice, %temp : memref<1x3xf32, strided<[3, -1], offset: ?>> to memref<1x3xf32> // Copy temporary buffer into output slice memref.copy %temp, %output_slice : memref<1x3xf32> to memref<1x3xf32, strided<[3, 1], offset: ?>> scf.reduce } return %output : memref<2x3xf32> } ``` The patch submitted here prevents `%temp = memref.alloc() : memref<1x3xf32>` from being hoisted when the containing op is `scf.parallel` or `scf.forall`.
A new op trait called `HasParallelRegion` is introduced and assigned to these two ops to indicate that their regions have parallel execution semantics. @joker-eph @ftynse @nicolasvasilache @sabauma	2024-05-04 08:35:36 +02:00
Gaurav Shukla	97069a8619	[MLIR] Generalize expand_shape to take shape as explicit input (#90040 ) This patch generalizes tensor.expand_shape and memref.expand_shape to consume the output shape as a list of SSA values. This enables us to implement generic reshape operations with dynamic shapes using collapse_shape/expand_shape pairs. The output_shape input to expand_shape follows the static/dynamic representation that's also used in `tensor.extract_slice`. Differential Revision: https://reviews.llvm.org/D140821 --------- Signed-off-by: Gaurav Shukla<gaurav.shukla@amd.com> Signed-off-by: Gaurav Shukla <gaurav.shukla@amd.com> Co-authored-by: Ramiro Leal-Cavazos <ramiroleal050@gmail.com>	2024-04-30 09:28:35 -07:00
Mehdi Amini	8c0341df02	Revert "[MLIR] Generalize expand_shape to take shape as explicit input" (#89540 ) Reverts llvm/llvm-project#69267 this broke some bots.	2024-04-21 14:33:48 +02:00
Gaurav Shukla	e095d978ba	[MLIR] Generalize expand_shape to take shape as explicit input (#69267 ) This patch generalizes tensor.expand_shape and memref.expand_shape to consume the output shape as a list of SSA values. This enables us to implement generic reshape operations with dynamic shapes using collapse_shape/expand_shape pairs. The output_shape input to expand_shape follows the static/dynamic representation that's also used in `tensor.extract_slice`. Differential Revision: https://reviews.llvm.org/D140821 Co-authored-by: Ramiro Leal-Cavazos <ramiroleal050@gmail.com>	2024-04-21 07:37:02 -04:00
Matthias Gehre	c515c78024	[mlir][Bufferization] castOrReallocMemRefValue: Use BufferizationOptions (#89175 ) This allows configuring both the op used for allocation and copy of memrefs. It also changes the default behavior because the default allocation in `BufferizationOptions` creates `memref.alloc` with `alignment = 64` where we used to create `memref.alloca` without any alignment before. Fixes ``` // TODO: Use alloc/memcpy callback from BufferizationOptions if called via // BufferizableOpInterface impl of ToMemrefOp. ```	2024-04-18 15:47:08 +02:00
Kunwar Grover	6f1e23b47d	[MLIR][Bufferization] Choose default memory space in tensor copy insertion (#88500 ) Tensor copy insertion currently uses memory_space = 0 when creating a tensor copy using alloc_tensor. This memory space should instead be the default memory space provided in bufferization options.	2024-04-12 17:56:46 +02:00
Matthias Springer	dbfc38ed6b	[mlir][bufferization] Add `BufferOriginAnalysis` (#86461 ) This commit adds the `BufferOriginAnalysis`, which can be queried to check if two buffer SSA values originate from the same allocation. This new analysis is used in the buffer deallocation pass to fold away or simplify `bufferization.dealloc` ops more aggressively. The `BufferOriginAnalysis` is based on the `BufferViewFlowAnalysis`, which collects buffer SSA value "same buffer" dependencies. E.g., given IR such as: ``` %0 = memref.alloc() %1 = memref.subview %0 %2 = memref.subview %1 ``` The `BufferViewFlowAnalysis` will report the following "reverse" dependencies (`resolveReverse`) for `%2`: {`%2`, `%1`, `%0`}. I.e., all buffer SSA values in the reverse use-def chain that originate from the same allocation as `%2`. The `BufferOriginAnalysis` is built on top of that. It handles only simple cases at the moment and may conservatively return "unknown" around certain IR with branches, memref globals and function arguments. This analysis enables additional simplifications during `-buffer-deallocation-simplification`. In particular, "regular" scf.for loop nests, that yield buffers (or reallocations thereof) in the same order as they appear in the iter_args, are now handled much more efficiently. Such IR patterns are generated by the sparse compiler.	2024-03-25 18:57:53 +09:00
Matthias Springer	35d3b3430e	[mlir][bufferization] Add "bottom-up from terminators" analysis heuristic (#83964 ) One-Shot Bufferize currently does not support loops where a yielded value bufferizes to a buffer that is different from the buffer of the region iter_arg. In such a case, the bufferization fails with an error such as: ``` Yield operand #0 is not equivalent to the corresponding iter bbArg scf.yield %0 : tensor<5xf32> ``` One common reason for non-equivalent buffers is that an op on the path from the region iter_arg to the terminator bufferizes out-of-place. Ops that are analyzed earlier are more likely to bufferize in-place. This commit adds a new heuristic that gives preference to ops that are reachable on the reverse SSA use-def chain from a region terminator and are within the parent region of the terminator. This is expected to work better than the existing heuristics for loops where an iter_arg is written to multiple times within a loop, but only one write is fed into the terminator. Current users of One-Shot Bufferize are not affected by this change. "Bottom-up" is still the default heuristic. Users can switch to the new heuristic manually. This commit also turns the "fuzzer" pass option into a heuristic, cleaning up the code a bit.	2024-03-21 14:16:02 +09:00
Matthias Springer	0940be1581	[mlir][bufferization] Never pass ownership to functions (#80655 ) Even when `private-function-dynamic-ownership` is set, ownership should never be passed to the callee. This can lead to double deallocs (#77096) or use-after-free in the caller because ownership is currently passed regardless of whether there are any further uses of the buffer in the caller or not. Note: This is consistent with the fact that ownership is never passed to nested regions. This commit fixes #77096.	2024-02-05 12:11:49 +01:00
lorenzo chelini	84c8d0377d	[MLIR][Vector] Implement memory effect for print (#80400 ) Add write memory effect for the print operation. The exact memory behavior is implemented in other print-like operations such as `transform::PrintOp` or `gpu::printf`. Providing memory behavior allows using the operation in passes like buffer deallocation instead of emitting an error.	2024-02-02 11:05:35 +01:00
Matthias Springer	fbb62d449c	[mlir][bufferization] Buffer deallocation: Make op preconditions stricter (#75127 ) The buffer deallocation pass checks the IR ("operation preconditions") to make sure that there is no IR that is unsupported. In such a case, the pass signals a failure. The pass now rejects all ops with unknown memory effects. We do not know whether such an op allocates memory or not. Therefore, the buffer deallocation pass does not know whether a deallocation op should be inserted or not. Memory effects are queried from the `MemoryEffectOpInterface` interface. Ops that do not implement this interface but have the `RecursiveMemoryEffects` trait do not have any side effects (apart from the ones that their nested ops may have). Unregistered ops are now rejected by the pass because they do not implement the `MemoryEffectOpInterface` and neither do we know if they have `RecursiveMemoryEffects` or not. All test cases that currently have unregistered ops are updated to use registered ops.	2024-01-21 11:10:09 +01:00
Matthias Springer	b4f24be7ef	[mlir][bufferization] Simplify helper `potentiallyAliasesMemref` (#78690 ) This commit simplifies a helper function in the ownership-based buffer deallocation pass. Fixes a potential double-free (depending on the scheduling of patterns).	2024-01-19 13:22:02 +01:00
Sergio Afonso	8fb685fb7e	[MLIR][LLVM] Add explicit target_cpu attribute to llvm.func (#78287 ) This patch adds the target_cpu attribute to llvm.func MLIR operations and updates the translation to/from LLVM IR to match "target-cpu" function attributes.	2024-01-17 14:55:02 +00:00
Matthias Springer	a43641c9db	[mlir][bufferization] Fix `regionOperatesOnMemrefValues` (#75016 ) `Region::walk([](Block *b) {...})` does not enumerate blocks that are direct children of the region. These blocks must be checked manually.	2023-12-12 08:56:23 +09:00
Nicolas Vasilache	3a223f4414	[mlir][Bufferization] Add support for controlled bufferization of alloc_tensor (#70957 ) This revision adds support to `transform.structured.bufferize_to_allocation` to bufferize `bufferization.alloc_tensor()` ops. This is useful as a means to control the bufferization of `tensor.empty` ops that have been previously `bufferization.empty_tensor_to_alloc_tensor`'ed.	2023-11-02 11:34:10 +01:00
Matthias Springer	4fbbb7ad7c	[mlir][bufferization] Fix ownership computation of unknown ops (#70773 ) No ownership is assumed for memref results of ops that implement none of the relevant interfaces and have no memref operands. This fixes #68948.	2023-11-01 09:26:47 +09:00
Oleksandr "Alex" Zinenko	e4384149b5	[mlir] use transform-interpreter in test passes (#70040 ) Update most test passes to use the transform-interpreter pass instead of the test-transform-dialect-interpreter-pass. The new "main" interpreter pass has a named entry point instead of looking up the top-level op with `PossibleTopLevelOpTrait`, which is arguably a more understandable interface. The change is mechanical, rewriting an unnamed sequence into a named one and wrapping the transform IR into a module when necessary. Add an option to the transform-interpreter pass to target a tagged payload op instead of the root anchor op, which is also useful for repro generation. Only the test in the transform dialect proper and the examples have not been updated yet. These will be updated separately after a more careful consideration of testing coverage of the transform interpreter logic.	2023-10-24 16:12:34 +02:00
Matthias Springer	4d80eff861	[mlir][bufferization] Ownership-based deallocation: Allow manual (de)allocs (#68648 ) Add a new attribute `bufferization.manual_deallocation` that can be attached to allocation and deallocation ops. Buffers that are allocated with this attribute are assigned an ownership of "false". Such buffers can be deallocated manually (e.g., with `memref.dealloc`) if the deallocation op also has the attribute set. Previously, the ownership-based buffer deallocation pass used to reject IR with existing deallocation ops. This is no longer the case if such ops have this new attribute. This change is useful for the sparse compiler, which currently deallocates the sparse tensor buffers by itself.	2023-10-23 09:45:33 +09:00
Matthias Springer	6d88ac11d7	[mlir][bufferization] Transfer `restrict` during empty tensor elimination (#68729 ) Empty tensor elimination is looking for `bufferization.materialize_in_destination` ops with a `tensor.empty` source. It replaces the `tensor.empty` with a `bufferization.to_tensor restrict` of the memref destination. As part of this rewrite, the `restrict` keyword should be removed, so that no second `to_tensor restrict` op will be inserted. Such IR would be invalid. `bufferization.materialize_in_destination` with memref destination and without the `restrict` attribute are ignored by empty tensor elimination. Also relax the verifier of `materialize_in_destination`. The `restrict` keyword is not generally needed because the op does not expose the buffer as a tensor.	2023-10-11 08:51:16 -07:00
Matthias Springer	3d0ca2cfe3	[mlir][bufferization] Allow cyclic function graphs without tensors (#68632 ) Cyclic function call graphs are generally not supported by One-Shot Bufferize. However, they can be allowed when a function does not have tensor arguments or results. This is because it is then no longer necessary that the callee will be bufferized before the caller.	2023-10-09 17:52:04 -07:00
Matthias Springer	8ee38f3b32	[mlir][bufferization] Follow up for #68074 (#68488 ) Address additional comments in #68074. This should have been part of #68074.	2023-10-07 10:07:17 -07:00
Matthias Springer	0fcaca2fea	[mlir][bufferization] `MaterializeInDestinationOp`: Support memref destinations (#68074 ) Extend `bufferization.materialize_in_destination` to support memref destinations. This op can now be used to indicate that a tensor computation should materialize in a given buffer (that may have been allocated by another component/runtime). The op still participates in "empty tensor elimination". Example: ```mlir func.func @test(%out: memref<10xf32>) { %t = tensor.empty() : tensor<10xf32> %c = linalg.generic ... outs(%t: tensor<10xf32>) -> tensor<10xf32> bufferization.materialize_in_destination %c in restrict writable %out : (tensor<10xf32>, memref<10xf32>) -> () return } ``` After "empty tensor elimination", the above IR can bufferize without an allocation: ```mlir func.func @test(%out: memref<10xf32>) { linalg.generic ... outs(%out: memref<10xf32>) return } ``` This change also clarifies the meaning of the `restrict` unit attribute on `bufferization.to_tensor` ops.	2023-10-06 11:57:10 +02:00
long.chen	5979e1dfb1	[mlir] Fix `empty-tensor-elimination` around self-copies (#68129 ) * Fixes #67977, a crash in `empty-tensor-elimination`. * Also improves `linalg.copy` canonicalization. * Also improves indentation in `mlir-linalg-ods-yaml-gen.cpp`.	2023-10-05 12:04:20 +02:00
Matthias Springer	43198b0aa2	[mlir][bufferization] Better analysis around allocs and block arguments (#67923 ) Values that are the result of buffer allocation ops are guaranteed to not be the same allocation as block arguments of containing blocks. This fact can be used to allow for more aggressive simplification of `bufferization.dealloc` ops.	2023-10-02 11:01:12 +02:00
Martin Erhart	6a651c7f44	Revert "[mlir][bufferization] Don't clone on unknown ownership and verify function boundary ABI (#66626 )" This reverts commit `aa9eb47da2`. It introduced a double free in a test case. Reverting to have some time for fixing this and relanding later.	2023-09-28 09:14:46 +00:00
Martin Erhart	aa9eb47da2	[mlir][bufferization] Don't clone on unknown ownership and verify function boundary ABI (#66626 ) Inserting clones requires a lot of assumptions to hold on the input IR, e.g., all writes to a buffer need to dominate all reads. This is not guaranteed by one-shot bufferization and isn't easy to verify, thus it could quickly lead to incorrect results that are hard to debug. This commit changes the mechanism of how an ownership indicator is materialized when there is not already a unique ownership present. Additionally, we don't create copies of returned memrefs anymore when we don't have ownership. Instead, we insert assert operations to make sure we have ownership at runtime, or otherwise report to the user that correctness could not be guaranteed.	2023-09-28 10:45:35 +02:00
Matthias Springer	5109cb28fd	[mlir][bufferization] Make buffer deallocation pipeline op type independent (#67546 ) The buffer deallocation pipeline now works on modules and functions. Also add extra test cases that run the buffer deallocation pipeline on modules and functions. (Test cases that insert a helper function.)	2023-09-27 15:06:25 +02:00
Matthias Springer	913286baed	[mlir][linalg] Add `SubsetInsertionOpInterface` to `linalg.copy` (#67524 ) This commit enables empty tensor elimination on `linalg.copy` ops.	2023-09-27 10:04:37 +02:00


188 Commits