clang-p2996

Author	SHA1	Message	Date
Ramkumar Ramachandra	22426110c5	mlir/tblgen: use std::optional in generation This is part of an effort to migrate from llvm::Optional to std::optional. This patch changes the way mlir-tblgen generates .inc files, and modifies tests and documentation appropriately. It is a "no compromises" patch, and doesn't leave the user with an unpleasant mix of llvm::Optional and std::optional. A non-trivial change has been made to ControlFlowInterfaces to split one constructor into two, relating to a build failure on Windows. See also: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716 Signed-off-by: Ramkumar Ramachandra <r@artagnon.com> Differential Revision: https://reviews.llvm.org/D138934	2022-12-17 11:13:26 +01:00
Lei Zhang	9bb633741a	[mlir][bufferization] Support general Attribute as memory space MemRef has been accepting a general Attribute as memory space for a long time. This commit updates the bufferization side to catch up, which allows downstream users to plug in customized symbolic memory spaces. This also eliminates quite a few `getMemorySpaceAsInt` calls, which are deprecated. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D138330	2022-11-21 09:40:50 -05:00
Guray Ozen	6663f34704	[mlir] Introduce device mapper attribute for `thread_dim_map` and `mapped to dims` `scf.foreach_thread` defines the mapping of its loops to processors via an integer array; see the example below. A lowering can use this mapping. However, expressing the mapping as an integer array is very confusing, especially when there are multiple levels of parallelism. In addition, the op does not verify the integer array. This change introduces a device mapping attribute to make the mapping descriptive and verifiable. Then it makes the GPU transform dialect use it. ``` scf.foreach_thread (%i, %j) in (%c1, %c2) { scf.foreach_thread (%i2, %j2) in (%c1, %c2) {...} { thread_dim_mapping = [0, 1]} } { thread_dim_mapping = [0, 1]} ``` It first introduces a `DeviceMappingInterface`, which is an attribute interface. `scf.foreach_thread` defines its mapping via this interface. A lowering must define its attributes and implement this interface as well. This gives us clear validation. The change also introduces two new attributes (`#gpu.thread<x/y/z>` and `#gpu.block<x/y/z>`). After this change, the above code prints as shown below, which clarifies the loop mappings. The change also implements consumption of these two new attributes by the transform dialect. The transform dialect binds the outermost loops to the thread blocks and the innermost loops to threads. ``` scf.foreach_thread (%i, %j) in (%c1, %c2) { scf.foreach_thread (%i2, %j2) in (%c1, %c2) {...} { thread_dim_mapping = [#gpu.thread<x>, #gpu.thread<y>]} } { thread_dim_mapping = [#gpu.block<x>, #gpu.block<y>]} ``` Reviewed By: ftynse, nicolasvasilache Differential Revision: https://reviews.llvm.org/D137413	2022-11-11 08:44:57 +01:00
Matthias Springer	c9b3638126	[mlir][scf][bufferize] Fix bufferizesToMemoryRead with 0 loop iterations There was a bug in scf.for loop bufferization that could lead to a missing buffer copy (alloc was there, but not the copy). Differential Revision: https://reviews.llvm.org/D135053	2022-10-24 14:34:41 +02:00
Mehdi Amini	23f989a2e3	Apply clang-tidy fixes for readability-simplify-boolean-expr in BufferizableOpInterfaceImpl.cpp (NFC)	2022-10-12 01:16:36 +00:00
Johannes Reifferscheid	eaf20c4fc2	[mlir] Fix a cast that should be a dyn_cast. This fixes a crash for certain IR, see the new test case for an example. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D134424	2022-09-22 13:13:21 +02:00
Johannes Reifferscheid	d1536ee48c	Fix clang-format.	2022-09-08 11:05:12 +02:00
Johannes Reifferscheid	6247988e07	One-shot-bufferize: fix for inconsistent while arg types in before/after. Currently, if the `before` and `after` regions of a while op have tensor args in different indices, this leads to a crash. This moves the pass-through check for args to the handling of the condition block, since that is where the results are produced, so it's also where copies must be made. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D133477	2022-09-08 10:24:11 +02:00
Johannes Reifferscheid	fb9fc79809	One-shot-bufferize: allow non-tensor arguments in scf.while/for. Currently, one-shot-bufferize crashes as soon as there's a mixture of tensor and non-tensor arguments. This seems to happen for no good reason. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D133419	2022-09-07 15:54:25 +02:00
Matthias Springer	4cd7362083	[mlir][SCF] foreach_thread: Capture shared output tensors explicitly This change refines the semantics of scf.foreach_thread. Tensors that are inserted into in the terminator must now be passed to the region explicitly via `shared_outs`. Inside of the body of the op, those tensors are then accessed via block arguments. The body of a scf.foreach_thread is now treated as a repetitive region. I.e., op dominance can no longer be used in conflict detection when using a value that is defined outside of the body. Such uses may now be considered as conflicts (if there is at least one read and one write in the body), effectively privatizing the tensor. Shared outputs are not privatized when they are used via their corresponding block arguments. As part of this change, it was also necessary to update the "tiling to scf.foreach_thread", such that the generated tensor.extract_slice ops use the scf.foreach_thread's block arguments. This is implemented by cloning the TilingInterface op inside the scf.foreach_thread, rewriting all of its outputs with block arguments and then calling the tiling implementation. Afterwards, the cloned op is deleted again. Differential Revision: https://reviews.llvm.org/D133114	2022-09-02 14:54:04 +02:00
Matthias Springer	86974e32a4	[mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.while) This change implements the same functionality as D132860, but for scf.while. Differential Revision: https://reviews.llvm.org/D132927	2022-08-30 16:58:21 +02:00
Matthias Springer	9d6096c56f	[mlir][SCF][bufferize][NFC] Move scf.if buffer type computation to getBufferType A part of the functionality of `bufferize` is extracted into `getBufferType`. Also, bufferized scf.yields inside scf.if are now created with the correct bufferized type from the get-go. Differential Revision: https://reviews.llvm.org/D132862	2022-08-30 16:48:10 +02:00
Matthias Springer	123c4b0251	[mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.for) Even though iter_arg and init_arg of an scf.for loop may have the same tensor type, their bufferized memref types are not necessarily equal. It is sometimes necessary to insert a cast in case of differing layout maps. Differential Revision: https://reviews.llvm.org/D132860	2022-08-30 16:35:32 +02:00
Matthias Springer	111c919665	[mlir][bufferization] Generalize getBufferType This change generalizes getBufferType. This function can be used to predict the buffer type of any tensor value (not just BlockArguments) without changing any IR. It also subsumes getMemorySpace. This is useful for loop bufferization, where the precise buffer type of an iter_arg cannot be known without examining the loop body. Differential Revision: https://reviews.llvm.org/D132859	2022-08-30 16:26:44 +02:00
Kazu Hirata	10bcfeebfa	[mlir] Remove unused using (NFC) Identified with misc-unused-using-decls.	2022-07-17 18:08:48 -07:00
Nicolas Vasilache	7fbf55c927	[mlir][Tensor] Move ParallelInsertSlice to the tensor dialect This is mostly NFC and will allow tensor.parallel_insert_slice to gain rank-reducing semantics by reusing the vast majority of the tensor.insert_slice impl. Depends on D128857 Differential Revision: https://reviews.llvm.org/D128920	2022-07-04 01:53:12 -07:00
Nicolas Vasilache	b994d388ae	[mlir][SCF] Add a ParallelCombiningOpInterface to decouple scf::PerformConcurrently from its contained operations This allows purging references of scf.ForeachThreadOp and scf.PerformConcurrentlyOp from ParallelInsertSliceOp. This will allow moving the op closer to tensor::InsertSliceOp with which it should share much more code. In the future, the decoupling will also allow extending the type of ops that can be used in the parallel combinator as well as semantics related to multiple concurrent inserts to the same result. Differential Revision: https://reviews.llvm.org/D128857	2022-07-01 00:16:02 -07:00
Matthias Springer	76f7e4b7a3	[mlir][SCF][bufferize][NFC] Utilize recently added helper function This should have been part of D128666. Differential Revision: https://reviews.llvm.org/D128885	2022-06-30 09:54:52 +02:00
Matthias Springer	04dac2ca7c	[mlir][SCF][bufferize][NFC] Implement resolveConflicts for ParallelInsertSliceOp This was previously implemented as part of the BufferizableOpInterface of ForEachThreadOp. Moving the implementation to ParallelInsertSliceOp to be consistent with the remaining ops and to have a nice example op that can serve as a blueprint for other ops. Differential Revision: https://reviews.llvm.org/D128666	2022-06-28 12:18:22 +02:00
Matthias Springer	f164814f2f	[mlir][SCF][bufferize] Small simplification and more comments Differential Revision: https://reviews.llvm.org/D128651	2022-06-27 17:04:29 +02:00
Matthias Springer	c0b0b6a00a	[mlir][bufferize] Infer memory space in all bufferization patterns This change updates all remaining bufferization patterns (except for scf.while) and the remaining bufferization infrastructure to infer the memory space whenever possible instead of falling back to "0". (If a default memory space is set in the bufferization options, we still fall back to that value if the memory space could not be inferred.) Differential Revision: https://reviews.llvm.org/D128423	2022-06-27 16:32:52 +02:00
Matthias Springer	45b995cda4	[mlir][bufferize][NFC] Change signature of allocateTensorForShapedValue Add a failure return value and bufferization options argument. This is to keep a subsequent change smaller. Differential Revision: https://reviews.llvm.org/D128278	2022-06-27 16:00:06 +02:00
Nicolas Vasilache	a0f843fdaf	[SCF] Add thread_dim_mapping attribute to scf.foreach_thread An optional thread_dim_mapping index array attribute specifies for each virtual thread dimension, how it remaps 1-1 to a set of concrete processing element resources (e.g. a CUDA grid dimension or a level of concrete nested async parallelism). At this time, the specification is backend-dependent and is not verified by the op, beyond being an index array attribute. It is the responsibility of the lowering to interpret the index array in the context of the concrete target the op is lowered to, or to ignore it when the specification is ill-formed or unsupported for a particular target. Differential Revision: https://reviews.llvm.org/D128633	2022-06-27 04:58:36 -07:00
Matthias Springer	5d50f51c97	[mlir][bufferization][NFC] Add error handling to getBuffer This is in preparation of adding memory space support. Differential Revision: https://reviews.llvm.org/D128277	2022-06-27 13:48:01 +02:00
Matthias Springer	3ff93f838e	[mlir][SCF][bufferize][NFC] Bufferize scf.for terminator separately This allows for better type inference during bufferization and is in preparation of supporting memory spaces. Differential Revision: https://reviews.llvm.org/D128422	2022-06-27 13:26:32 +02:00
Matthias Springer	8e691e1f24	[mlir][SCF][bufferize] Bufferize scf.if/execute_region terminators separately This allows for better type inference during bufferization and is in preparation of supporting memory spaces. Differential Revision: https://reviews.llvm.org/D128581	2022-06-27 13:22:19 +02:00
Matthias Springer	7ebf70d85d	[mlir][SCF][bufferize][NFC] Bufferize parallel_insert_slice separately This allows for better type inference during bufferization and is in preparation of supporting memory spaces. Differential Revision: https://reviews.llvm.org/D128580	2022-06-27 13:16:02 +02:00
Matthias Springer	ba9d886db4	[mlir][bufferization][NFC] Bufferize with PostOrder traversal This is useful because the result type of an op can sometimes be inferred from its body (e.g., `scf.if`). This will be utilized in subsequent changes. Also introduces a new `getBufferType` interface method on BufferizableOpInterface. This method is useful for computing a bufferized block argument type with respect to OpOperand types of the parent op. Differential Revision: https://reviews.llvm.org/D128420	2022-06-27 12:42:41 +02:00
Matthias Springer	3474d10e1a	[mlir][bufferization][NFC] Make `escape` a dialect attribute All bufferizable ops that bufferize to an allocation receive a `bufferization.escape` attribute during TensorCopyInsertion. Differential Revision: https://reviews.llvm.org/D128137	2022-06-23 19:34:47 +02:00
Alex Zinenko	8b68da2c7d	[mlir] move SCF headers to SCF/{IR,Transforms} respectively This aligns the SCF dialect file layout with the majority of the dialects. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D128049	2022-06-20 10:18:01 +02:00
Jacques Pienaar	8df54a6a03	[mlir] Update accessors to prefixed form (NFC) Follow up from flipping dialects to both, flip accessors used to the prefixed variant ahead of flipping from _Both to _Prefixed. This just flips to the accessors introduced in the preceding change which are just prefixed forms of the existing accessor changed from. Mechanical change using helper script https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp and clang-format.	2022-06-18 17:53:22 -07:00
Matthias Springer	b55d55ecd9	[mlir][bufferize][NFC] Remove BufferizationState With the recent refactorings, this class is no longer needed. We can use BufferizationOptions in all places where BufferizationState was used. Differential Revision: https://reviews.llvm.org/D127653	2022-06-17 14:04:11 +02:00
Matthias Springer	b3ebe3beed	[mlir][bufferize] Bufferize after TensorCopyInsertion This change changes the bufferization so that it utilizes the new TensorCopyInsertion pass. One-Shot Bufferize no longer calls the One-Shot Analysis. Instead, it relies on the TensorCopyInsertion pass to make the entire IR fully inplacable. The `bufferize` implementations of all ops are simplified; they no longer have to account for out-of-place bufferization decisions. These were already materialized in the IR in the form of `bufferization.alloc_tensor` ops during the TensorCopyInsertion pass. Differential Revision: https://reviews.llvm.org/D127652	2022-06-17 13:29:52 +02:00
Matthias Springer	d361ecbd0d	[mlir][SCF][bufferize] Implement `resolveConflicts` for SCF ops scf::ForOp and scf::WhileOp must insert buffer copies not only for out-of-place bufferizations, but also to enforce additional invariants wrt. buffer aliasing behavior. This is currently happening in the respective `bufferize` methods. With this change, the tensor copy insertion pass will also enforce these invariants by inserting copies. The `bufferize` methods can then be simplified and made independent of the `AnalysisState` data structure in a subsequent change. Differential Revision: https://reviews.llvm.org/D126822	2022-06-15 09:07:31 +02:00
Nicolas Vasilache	72de7588cc	[mlir][SCF] Add bufferization hook for scf.foreach_thread and terminator. `scf.foreach_thread` results alias with the underlying `scf.foreach_thread.parallel_insert_slice` destination operands and they bufferize to equivalent buffers in the absence of other conflicts. `scf.foreach_thread.parallel_insert_slice` conflict detection is similar to `tensor.insert_slice` conflict detection. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D126769	2022-06-03 07:14:05 +00:00
Lei Zhang	413fbb045d	[mlir][scf] Retain existing attributes in scf.for transforms These attributes can carry useful information, e.g., pipelines might use them to organize and chain patterns. Reviewed By: hanchung Differential Revision: https://reviews.llvm.org/D126320	2022-05-25 10:53:02 -04:00
Matthias Springer	996834e681	[mlir][SCF] Fix scf.while bufferization Before this fix, the bufferization implementation made the incorrect assumption that the values yielded from the "before" region must match with the values yielded from the "after" region. Differential Revision: https://reviews.llvm.org/D125835	2022-05-18 00:35:50 +02:00
Matthias Springer	12e41d9264	[mlir][bufferize] Infer memref types when possible Instead of recomputing memref types from tensor types, try to infer them when possible. This results in more precise layout maps. Differential Revision: https://reviews.llvm.org/D125614	2022-05-16 02:02:08 +02:00
Matthias Springer	248e113e9f	[mlir][bufferize][NFC] Move helper functions to BufferizationOptions Move helper functions for creating allocs/deallocs/memcpys to BufferizationOptions. Differential Revision: https://reviews.llvm.org/D125375	2022-05-11 16:23:22 +02:00
Matthias Springer	a5d09c6372	[mlir][scf] Implement BufferizableOpInterface for scf::WhileOp This follows the same implementation strategy as scf::ForOp and common functionality is extracted into helper functions. This implementation works well in cases where each yielded value (from either body/condition region) is equivalent to the corresponding bbArg of the parent block. In that case, each OpResult of the loop may be aliasing with the corresponding OpOperand of the loop (and with no other OpOperand). In the absence of said equivalence relationship, new buffer copies must be inserted, so that the aliasing OpOperand/OpResult contract of scf::WhileOp is honored. In essence, by yielding a newly allocated buffer, we can enforce the specified may-alias relationship. (Newly allocated buffers cannot alias with any OpOperands of the loop.) Differential Revision: https://reviews.llvm.org/D124929	2022-05-06 17:24:33 +09:00
Matthias Springer	e300682597	[mlir][scf][bufferize] Update verifyAnalysis error message The previous error message was technically incorrect. We do not compare equivalence of YieldOp operands and ForOp operands. Differential Revision: https://reviews.llvm.org/D124934	2022-05-05 16:56:50 +09:00
Matthias Springer	417e1c7d52	[mlir][scf][bufferize][NFC] Split ForOp bufferization into smaller functions This is in preparation of WhileOp bufferization, which reuses these functions. Differential Revision: https://reviews.llvm.org/D124933	2022-05-05 16:55:44 +09:00
Matthias Springer	f178c386f5	[mlir][scf][bufferize][NFC] Simplify verifyAnalysis implementation Differential Revision: https://reviews.llvm.org/D124928	2022-05-05 16:51:10 +09:00
River Riddle	eda6f907d2	[mlir][NFC] Shift a bunch of dialect includes from the .h to the .cpp Now that dialect constructors are generated in the .cpp file, we can drop all of the dependent dialect includes from the .h file. Differential Revision: https://reviews.llvm.org/D124298	2022-04-23 01:09:29 -07:00
Matthias Springer	fa087b4352	[mlir][scf][bufferize][NFC] Lookup buffer using helper function Lookup iter_arg buffers using `lookupBuffer` instead of always creating a new `ToMemrefOp`. Also cast all yielded buffers (if necessary), regardless of whether they are an equivalent buffer or a new allocation. Note: This should have been part of D123369. Differential Revision: https://reviews.llvm.org/D123383	2022-04-12 18:09:30 +09:00
Matthias Springer	d2608adf49	[mlir][bufferize] Do not insert useless casts for newly allocated buffers Differential Revision: https://reviews.llvm.org/D123369	2022-04-08 18:12:02 +09:00
River Riddle	77eee5795e	[mlir] Refactor DialectRegistry delayed interface support into a general DialectExtension mechanism The current dialect registry allows for attaching delayed interfaces, that are added to attrs/dialects/ops/etc. when the owning dialect gets loaded. This is clunky for quite a few reasons, e.g. each interface type has a separate tracking structure, and is also quite limiting. This commit refactors this delayed mutation of dialect constructs into a more general DialectExtension mechanism. This mechanism is essentially a registration callback that is invoked when a set of dialects have been loaded. This allows for attaching interfaces directly on the loaded constructs, and also allows for loading new dependent dialects. The latter of which is extremely useful as it will now enable dependent dialects to only apply in the contexts in which they are necessary. For example, a dialect dependency can now be conditional on if a user actually needs the interface that relies on it. Differential Revision: https://reviews.llvm.org/D120367	2022-03-16 22:15:25 -07:00
Matthias Springer	1e1eeae840	[mlir][bufferize] Allow non-equivalent yields from scf.for loops This removes a restriction wrt. scf.for loops during One-Shot Bufferization. Such IR was previously rejected. It is still rejected by default because the bufferized IR could be slow. But such IR can now be bufferized with `allow-return-allocs`. Differential Revision: https://reviews.llvm.org/D121529	2022-03-16 23:22:06 +09:00
Matthias Springer	39ec46bd83	[mlir][bufferize] Extract buffer hoisting into separate function This improves the modularity of the bufferization. From now on, all ops that do not implement BufferizableOpInterface are considered hoisting barriers. Previously, all ops that do not implement the interface were not considered barriers and such ops had to be marked as barriers explicitly. This was unsafe because we could've hoisted across unknown ops where it was not safe to hoist. As a side effect, this allows for cleaning up AffineBufferizableOpInterfaceImpl. This build unit is no longer needed and can be deleted. Differential Revision: https://reviews.llvm.org/D121519	2022-03-15 21:25:03 +09:00
Matthias Springer	9597b16aa9	[mlir][bufferize][NFC] Split BufferizationState into AnalysisState/BufferizationState Differential Revision: https://reviews.llvm.org/D121361	2022-03-15 17:35:47 +09:00

55 Commits