clang-p2996

Author	SHA1	Message	Date
Matthias Springer	e7790fbed3	[mlir] Add `test-convergence` option to Canonicalizer tests This new option is set to `false` by default. It should be set only in Canonicalizer tests to detect faulty canonicalization patterns, i.e., patterns that prevent the canonicalizer from converging. The canonicalizer should always converge on the small unit tests that we have in `canonicalize.mlir`. Two faulty canonicalization patterns were detected and fixed with this change. Differential Revision: https://reviews.llvm.org/D140873	2023-01-04 12:02:21 +01:00
Mehdi Amini	cffd7b144b	[mlir][scf] Fixes IndexSwitchOp verifier crash Fixes #59460	2022-12-13 09:42:34 +00:00
Matthias Springer	4002eaaa01	[mlir][bufferize] Improve analysis of external functions External functions have no body, so they cannot be analyzed. Assume conservatively that each tensor bbArg may be aliasing with each tensor result. Furthermore, assume that each function arg is read and written-to after bufferization. This default behavior can be controlled with `bufferization.access` (similar to `bufferization.memory_layout`) in test cases. Also fix a bug in the dialect attribute verifier, which did not run for region argument attributes. Differential Revision: https://reviews.llvm.org/D139517	2022-12-09 14:36:33 +01:00
Matthias Springer	c1fef4e88a	[mlir][bufferization] Make `TensorCopyInsertionPass` a test pass TensorCopyInsertion should not have been exposed as a pass. This was a flaw in the original design. It is a preparation step for bufferization and certain transforms (that would otherwise be legal) are illegal between TensorCopyInsertion and actual rewrite to MemRef ops. Therefore, even if broken down as two separate steps internally, they should be exposed as a single pass. This change affects the sparse compiler, which uses `TensorCopyInsertionPass`. A new `SparsificationAndBufferizationPass` is added to replace all passes in the sparse tensor pipeline from `TensorCopyInsertionPass` until the actual bufferization (rewrite to memref/non-tensor). It is generally unsafe to run arbitrary passes in-between, in particular passes that hoist tensor ops out of loops or change SSA use-def chains along tensor ops. Differential Revision: https://reviews.llvm.org/D138915	2022-12-02 15:38:02 +01:00
Amy Wang	e4e64eaade	[MLIR][Transform] Consolidate the transform ops of get_parent_for and loop unroll from affine and scf dialects. This patch consolidates the two transform ops from the affine dialect and the scf dialect to avoid code duplication. This is to address the review comments from https://reviews.llvm.org/D137997. The transform ops directory / file structure for the affine dialect is kept for the purpose of forth-coming transform ops for affine, but get_parent_for and unroll are removed. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D138980	2022-11-30 11:07:44 -05:00
Hanhan Wang	0a1569a400	[mlir][NFC] Remove trailing whitespaces from `.td` and `.mlir` files. This is generated by running ``` sed --in-place 's/[[:space:]]\+$//' mlir/**/*.td sed --in-place 's/[[:space:]]\+$//' mlir/**/*.mlir ``` Reviewed By: rriddle, dcaballe Differential Revision: https://reviews.llvm.org/D138866	2022-11-28 15:26:30 -08:00
Matthias Springer	0d9761d50e	[mlir][SCF] Add tensor.dim(scf.foreach_thread) folding Dim sizes of `scf.foreach_thread` op results match the dim sizes of their respective tied shared_outs operands. Differential Revision: https://reviews.llvm.org/D138484	2022-11-22 11:28:27 +01:00
Lei Zhang	9bb633741a	[mlir][bufferization] Support general Attribute as memory space MemRef has been accepting a general Attribute as memory space for a long time. This commit updates the bufferization side to catch up, which allows downstream users to plug in customized symbolic memory spaces. This also eliminates quite a few `getMemorySpaceAsInt` calls, which are deprecated. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D138330	2022-11-21 09:40:50 -05:00
Nicolas Vasilache	f0a411da77	[mlir][Transform]Significantly cleanup scf.foreach_thread and GPU transform permutation handling Previously, the need for a dense permutation leaked into the thread_dim_mapping specification. This revision allows using a sparse specification of the thread_dim_mapping, and the proper completion / sorting is applied automatically. In the process, the semantics of scf.foreach_thread is tightened to require a matching number of thread dimensions and mappings. The relevant negative test is added. Differential Revision: https://reviews.llvm.org/D137906	2022-11-14 09:19:49 -08:00
Guray Ozen	6663f34704	[mlir] Introduce device mapper attribute for `thread_dim_map` and `mapped to dims` `scf.foreach_thread` defines mapping its loops to processors via an integer array, see an example below. A lowering can use this mapping. However, expressing mapping as an integer array is very confusing, especially when there are multiple levels of parallelism. In addition, the op does not verify the integer array. This change introduces device mapping attribute to make mapping descriptive and verifiable. Then it makes GPU transform dialect use it. ``` scf.foreach_thread (%i, %j) in (%c1, %c2) { scf.foreach_thread (%i2, %j2) in (%c1, %c2) {...} { thread_dim_mapping = [0, 1]} } { thread_dim_mapping = [0, 1]} ``` It first introduces a `DeviceMappingInterface` which is an attribute interface. `scf.foreach_thread` defines its mapping via this interface. A lowering must define its attributes and implement this interface as well. This way gives us a clear validation. The change also introduces two new attributes (`#gpu.thread<x/y/z>` and `#gpu.block<x,y,z>` ). After this change, the above code prints as below, as seen here, this way clarifies the loop mappings. The change also implements consuming of these two new attribute by the transform dialect. Transform dialect binds the outermost loops to the thread blocks and innermost loops to threads. ``` scf.foreach_thread (%i, %j) in (%c1, %c2) { scf.foreach_thread (%i2, %j2) in (%c1, %c2) {...} { thread_dim_mapping = [#gpu.thread<x>, #gpu.thread<y>]} } { thread_dim_mapping = [#gpu.block<x>, #gpu.block<y>]} ``` Reviewed By: ftynse, nicolasvasilache Differential Revision: https://reviews.llvm.org/D137413	2022-11-11 08:44:57 +01:00
River Riddle	38c219b4a8	[mlir] Infer SubElementInterface implementations using the storage KeyTy The KeyTy of attribute/type storage classes provide enough information for automatically implementing the necessary sub element interface methods. This removes the need for derived classes to do it themselves, which is both much nicer and easier to handle certain invariants (e.g. null handling). In cases where explicitly handling for parameter types is necessary, they can provide an implementation of `AttrTypeSubElementHandler` to opt-in to support. This tickles a few things alias wise, which annoyingly messes with tests that hard code specific affine map numbers. Differential Revision: https://reviews.llvm.org/D137374	2022-11-04 18:15:03 -07:00
rkayaith	13bd410962	[mlir][Pass] Include anchor op in -pass-pipeline In D134622 the printed form of a pass manager is changed to include the name of the op that the pass manager is anchored on. This updates the `-pass-pipeline` argument format to include the anchor op as well, so that the printed form of a pipeline can be directly passed to `-pass-pipeline`. In most cases this requires updating `-pass-pipeline='pipeline'` to `-pass-pipeline='builtin.module(pipeline)'`. This also fixes an outdated assert that prevented running a `PassManager` anchored on `'any'`. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D134900	2022-11-03 11:36:12 -04:00
Matthias Springer	c9b3638126	[mlir][scf][bufferize] Fix bufferizesToMemoryRead with 0 loop iterations There was a bug in scf.for loop bufferization that could lead to a missing buffer copy (alloc was there, but not the copy). Differential Revision: https://reviews.llvm.org/D135053	2022-10-24 14:34:41 +02:00
River Riddle	c8496d292e	[mlir] Refactor alias generation to support nested aliases We currently only support one level of aliases, which isn't great in situations where an attribute/type can have multiple duplicated components nested within it(e.g. debuginfo metadata). This commit refactors alias generation to support nested aliases, which requires changing alias grouping to take into account the depth of child aliases, to ensure that attributes/types aren't printed before the aliases they use. The only real user facing change here was that we no longer print 0 as an alias suffix, which would be unnecessarily expensive to keep in the new alias generation method (and isn't that valuable of a behavior to preserve). Differential Revision: https://reviews.llvm.org/D136541	2022-10-23 23:59:55 -07:00
Jeff Niu	07d8fe9391	[mlir][scf] Add an IndexSwitchOp The `scf.index_switch` is a control-flow operation that branches to one of the given regions based on the values of the argument and the cases. The argument is always of type `index`. Example: ```mlir %0 = scf.index_switch %arg0 -> i32 case 2 { %1 = arith.constant 10 : i32 scf.yield %1 : i32 } case 5 { %2 = arith.constant 20 : i32 scf.yield %2 : i32 } default { %3 = arith.constant 30 : i32 scf.yield %3 : i32 } ``` Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D136003	2022-10-21 09:21:10 -07:00
Alex Zinenko	2e9abc0c71	[mlir] drop unnecessary transform.with_pdl_patterns from tests, NFC Many tests wrap the piece of the IR related to the transform dialect into `transform.with_pdl_patterns` without actually using PDL patterns inside. Some of these are leftovers from migration to `structured.match` and some others are cargo cult, both are useless and pollute the tests. Reviewed By: guraypp Differential Revision: https://reviews.llvm.org/D135661	2022-10-11 12:26:11 +00:00
Alex Zinenko	59bb8af4c3	[mlir] switch the transform loop extension to use types Add types to the Loop (SCF) extension of the transform dialect. See https://discourse.llvm.org/t/rfc-type-system-for-the-transform-dialect/65702 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D135587	2022-10-11 09:55:23 +00:00
Alex Zinenko	6fe0309602	[mlir] switch transform dialect ops to use TransformTypeInterface Use the recently introduced TransformTypeInterface instead of hardcoding the PDLOperationType. This will allow the operations to use more specific transform types to express pre/post-conditions in the future. It requires the syntax and Python op construction API to be updated. Dialect extensions will be switched separately. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D135584	2022-10-11 09:55:13 +00:00
Matthias Springer	2e210034da	[mlir][bufferize] Fix repetitive region conflict detection This fixes a bug where a required buffer copy was not inserted. Not only written aliases, but also read aliases should be taken into account when computing common enclosing repetitive regions. Furthermore, for writing ops, it does not matter where the destination tensor is defined, but where the op itself is located. Differential Revision: https://reviews.llvm.org/D135420	2022-10-07 16:39:03 +09:00
Matthias Springer	f4e8f44811	[mlir][bufferize] Fix enclosing repetitive region computation The wrong function overload was called. Differential Revision: https://reviews.llvm.org/D135342	2022-10-07 10:37:04 +09:00
Johannes Reifferscheid	eaf20c4fc2	[mlir] Fix a cast that should be a dyn_cast. This fixes a crash for certain IR, see the new test case for an example. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D134424	2022-09-22 13:13:21 +02:00
Matthias Springer	04ff6009fc	[mlir][tensor][bufferize] Implement getBufferType for Expand/CollapseShapeOp This function must be implemented for all ops, where the result memref type is different from the input memref type. Differential Revision: https://reviews.llvm.org/D134331	2022-09-21 18:31:59 +09:00
Christopher Bate	f5fe92f693	[mlir][SCF] Fix loop pipelining unable to handle ops with regions This change allows the SCF LoopPipelining transform to handle ops with nested regions within the pipelined `scf.for` body. The op and nested regions are treated as a single unit from the transform's perspective. This change also makes explicit the requirement that only ops whose parent Block is the loop body Block are allowed to be scheduled by the caller. Reviewed By: ThomasRaoux, nicolasvasilache Differential Revision: https://reviews.llvm.org/D133965	2022-09-20 21:58:53 -06:00
Peiming Liu	52887071ea	[mlir][scf] Support simple symbolic expression without depending on AffineDialect to simplify trivial loops. Remove dependence of AffineDialect Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D134291	2022-09-20 18:13:05 +00:00
Peiming Liu	d518fc28b6	[mlir][scf] Support simple symbolic expression when simplify loops Reviewed By: aartbik, ThomasRaoux Differential Revision: https://reviews.llvm.org/D134204	2022-09-19 21:50:01 +00:00
Alex Zinenko	f096e72ce6	[mlir] switch bufferization to use strided layout attribute Bufferization already makes the assumption that buffers pass function boundaries in the strided form and uses the corresponding affine map layouts. Switch it to use the recently introduced strided layout instead to avoid unnecessary casts when bufferizing further operations to the memref dialect counterparts that now largely rely on the strided layout attribute. Depends On D133947 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D133951	2022-09-16 10:56:50 +02:00
Alex Zinenko	2791162b01	[mlir] make memref.subview produce strided layout Memref subview operation has been initially designed to work on memrefs with strided layouts only and has never supported anything else. Port it to use the recently added StridedLayoutAttr instead of extracting the strided form implicitly from affine maps. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D133938	2022-09-16 10:56:46 +02:00
Johannes Reifferscheid	6247988e07	One-shot-bufferize: fix for inconsistent while arg types in before/after. Currently, if the `before` and `after` regions of a while op have tensor args in different indices, this leads to a crash. This moves the pass-through check for args to the handling of the condition block, since that is where the results are produced, so it's also where copies must be made. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D133477	2022-09-08 10:24:11 +02:00
Johannes Reifferscheid	fb9fc79809	One-shot-bufferize: allow non-tensor arguments in scf.while/for. Currently, one-shot-bufferize crashes as soon as there's a mixture of tensor and non-tensor arguments. This seems to happen for no good reason. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D133419	2022-09-07 15:54:25 +02:00
Matthias Springer	4cd7362083	[mlir][SCF] foreach_thread: Capture shared output tensors explicitly This change refines the semantics of scf.foreach_thread. Tensors that are inserted into in the terminator must now be passed to the region explicitly via `shared_outs`. Inside of the body of the op, those tensors are then accessed via block arguments. The body of a scf.foreach_thread is now treated as a repetitive region. I.e., op dominance can no longer be used in conflict detection when using a value that is defined outside of the body. Such uses may now be considered as conflicts (if there is at least one read and one write in the body), effectively privatizing the tensor. Shared outputs are not privatized when they are used via their corresponding block arguments. As part of this change, it was also necessary to update the "tiling to scf.foreach_thread", such that the generated tensor.extract_slice ops use the scf.foreach_thread's block arguments. This is implemented by cloning the TilingInterface op inside the scf.foreach_thread, rewriting all of its outputs with block arguments and then calling the tiling implementation. Afterwards, the cloned op is deleted again. Differential Revision: https://reviews.llvm.org/D133114	2022-09-02 14:54:04 +02:00
Hendrik Greving	6a2190dffc	[mlir] Make division unsigned. Uses arith.divui where it is safe to do so. Adjusts the tests for above. Differential Revision: https://reviews.llvm.org/D132701	2022-08-30 09:55:04 -07:00
Alex Zinenko	519847fefc	[mlir] materialize strided memref layout as attribute Introduce a new attribute to represent the strided memref layout. Strided layouts are omnipresent in code generation flows and are the only kind of layouts produced and supported by half of the operations in the memref dialect (view-related, shape-related). However, they are internally represented as affine maps that require a somewhat fragile extraction of the strides from the linear form that also comes with an overhead. Furthermore, textual representation of strided layouts as affine maps is difficult to read: compare `affine_map<(d0, d1, d2)[s0, s1] -> (d0*32 + d1*s0 + s1 + d2)>` with `strides: [32, ?, 1], offset: ?`. While a rudimentary support for parsing a syntactically sugared version of the strided layout has existed in the codebase for a long time, it does not go as far as this commit to make the strided layout a first-class attribute in the IR. This introduces the attribute and updates the tests that use the pre-existing sugared form to use the new attribute instead. Most memrefs created programmatically, e.g., in passes, still use the affine form with further extraction of strides and will be updated separately. Update and clean up the memref type documentation that has gotten stale and has been referring to the details of affine map composition that are long gone. See https://discourse.llvm.org/t/rfc-materialize-strided-memref-layout-as-an-attribute/64211. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D132864	2022-08-30 17:19:58 +02:00
Matthias Springer	86974e32a4	[mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.while) This change implements the same functionality as D132860, but for scf.while. Differential Revision: https://reviews.llvm.org/D132927	2022-08-30 16:58:21 +02:00
Matthias Springer	9d6096c56f	[mlir][SCF][bufferize][NFC] Move scf.if buffer type computation to getBufferType A part of the functionality of `bufferize` is extracted into `getBufferType`. Also, bufferized scf.yields inside scf.if are now created with the correct bufferized type from the get-go. Differential Revision: https://reviews.llvm.org/D132862	2022-08-30 16:48:10 +02:00
Matthias Springer	123c4b0251	[mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.for) Even though iter_arg and init_arg of an scf.for loop may have the same tensor type, their bufferized memref types are not necessarily equal. It is sometimes necessary to insert a cast in case of differing layout maps. Differential Revision: https://reviews.llvm.org/D132860	2022-08-30 16:35:32 +02:00
lewuathe	cdc8d0fcd7	[mlir][affine] Option to unroll cleanup loop if smaller trip count. Add an option (cleanUpUnroll) to unroll the cleanup loop even if the trip count is smaller than the unroll factor. Differential Revision: https://reviews.llvm.org/D129171	2022-08-19 09:35:20 +09:00
Matthias Springer	adc93e0d38	[mlir][SCF] Loop lb/ub are symbols during Affine Min/Max canonicalization This fixes a bug in SCF/AffineCanonicalizationUtils.cpp. Loop lb/ub were previously considered dimensions, which caused a crash when a (non-optimizable) affine.min / affine.max expression was processed (due to multiplication of two dims). Lb/ub are now considered symbols and symbols may be multiplied. (The scope of the analysis is "within the loop body", at which point lb/ub are constants.) Differential Revision: https://reviews.llvm.org/D132021	2022-08-18 11:44:48 +02:00
Jeff Niu	58a47508f0	(Reland) [mlir] Switch segment size attributes to DenseI32ArrayAttr This reland includes changes to the Python bindings. Switch variadic operand and result segment size attributes to use the dense i32 array. Dense integer arrays were introduced primarily to represent index lists. They are a better fit for segment sizes than dense elements attrs. Depends on D131801 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D131803	2022-08-12 19:44:52 -04:00
Alex Zinenko	a60ed95419	[mlir][transform] failure propagation mode in sequence Introduce two different failure propagation modes in the Transform dialect's Sequence operation. These modes specify whether silenceable errors produced by nested ops are immediately propagated, thus stopping the sequence, or suppressed. The latter is useful in end-to-end transform application scenarios where the user cannot correct the transformation, but it is robust to silenceable failures. It can be combined with the "alternatives" operation. There is intentionally no default value to avoid favoring one mode over the other. Downstreams can update their tests using: S='s/sequence \(%.*\) {/sequence \1 failures(propagate) {/' T='s/sequence {/sequence failures(propagate) {/' git grep -l transform.sequence | xargs sed -i -e "$S" git grep -l transform.sequence | xargs sed -i -e "$T" Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D131774	2022-08-12 15:31:22 +00:00
Matthias Springer	bf1b9528ff	[mlir][bufferize] Fix missing copy when bufferizing loops Using a loop init_arg inside of the loop is not supported. This change adds a pre-processing pass that resolves such IR with copies. Differential Revision: https://reviews.llvm.org/D131689	2022-08-12 10:44:55 +02:00
Alex Zinenko	e8e718fa4b	Revert "[mlir] Switch segment size attributes to DenseI32ArrayAttr" This reverts commit `30171e76f0`. Breaks Python tests in MLIR, missing C API and Python changes.	2022-08-12 10:22:47 +02:00
Jeff Niu	30171e76f0	[mlir] Switch segment size attributes to DenseI32ArrayAttr Switch variadic operand and result segment size attributes to use the dense i32 array. Dense integer arrays were introduced primarily to represent index lists. They are a better fit for segment sizes than dense elements attrs. Depends on D131738 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D131702	2022-08-11 20:56:45 -04:00
Nicolas Vasilache	1f77f01c65	[mlir][Linalg] Add a Transform dialect NavigationOp op to match a list of ops or an interface. This operation is a NavigationOp that simplifies the writing of transform IR. Since there is no way of referring to an interface by name, the current implementation uses an EnumAttr and depends on the interfaces it supports. In the future, it would be worthwhile to remove this dependence and generalize. Differential Revision: https://reviews.llvm.org/D130267	2022-07-21 07:11:42 -07:00
Matthias Springer	74902cc96f	[mlir][linalg][NFC] Cleanup: Drop linalg.inplaceable attribute bufferization.writable is used in most cases instead. All remaining test cases are updated. Some code that is no longer needed is deleted. Differential Revision: https://reviews.llvm.org/D129739	2022-07-14 15:50:03 +02:00
Matthias Springer	fc9b37dd53	[mlir][bufferization] Do not canonicalize to_tensor(to_memref(x)) This is a partial revert of D128615. to_memref(to_tensor(x)) can always be folded to x. But to_tensor(to_memref(x)) cannot be folded in the general case because writes to the intermediary memref may go unnoticed. Differential Revision: https://reviews.llvm.org/D129354	2022-07-09 09:16:52 +02:00
Nicolas Vasilache	7fbf55c927	[mlir][Tensor] Move ParallelInsertSlice to the tensor dialect This is mostly NFC and will allow tensor.parallel_insert_slice to gain rank-reducing semantics by reusing the vast majority of the tensor.insert_slice impl. Depends on D128857 Differential Revision: https://reviews.llvm.org/D128920	2022-07-04 01:53:12 -07:00
Matthias Springer	cb47124179	[mlir][bufferize] Improve to_tensor/to_memref folding Differential Revision: https://reviews.llvm.org/D128615	2022-06-27 21:42:39 +02:00
Matthias Springer	c0b0b6a00a	[mlir][bufferize] Infer memory space in all bufferization patterns This change updates all remaining bufferization patterns (except for scf.while) and the remaining bufferization infrastructure to infer the memory space whenever possible instead of falling back to "0". (If a default memory space is set in the bufferization options, we still fall back to that value if the memory space could not be inferred.) Differential Revision: https://reviews.llvm.org/D128423	2022-06-27 16:32:52 +02:00
Nicolas Vasilache	a0f843fdaf	[SCF] Add thread_dim_mapping attribute to scf.foreach_thread An optional thread_dim_mapping index array attribute specifies, for each virtual thread dimension, how it remaps 1-1 to a set of concrete processing element resources (e.g. a CUDA grid dimension or a level of concrete nested async parallelism). At this time, the specification is backend-dependent and is not verified by the op, beyond being an index array attribute. It is the responsibility of the lowering to interpret the index array in the context of the concrete target the op is lowered to, or to ignore it when the specification is ill-formed or unsupported for a particular target. Differential Revision: https://reviews.llvm.org/D128633	2022-06-27 04:58:36 -07:00
Matthias Springer	8e691e1f24	[mlir][SCF][bufferize] Bufferize scf.if/execute_region terminators separately This allows for better type inference during bufferization and is in preparation of supporting memory spaces. Differential Revision: https://reviews.llvm.org/D128581	2022-06-27 13:22:19 +02:00

193 Commits