Commit Graph

1887 Commits

Author SHA1 Message Date
Matthias Springer
8c4bc1e75d [mlir][Transforms] Merge 1:1 and 1:N type converters (#113032)
The 1:N type converter derives from the 1:1 type converter and extends
it with 1:N target materializations. This commit merges the two type
converters and stores 1:N target materializations in the 1:1 type
converter. This is in preparation for merging the 1:1 and 1:N dialect
conversion infrastructures.

1:1 target materializations (producing a single `Value`) will remain
valid. An additional API is added to the type converter to register 1:N
target materializations (producing a `SmallVector<Value>`). Internally,
all target materializations are stored as 1:N materializations.

The 1:N type converter is removed.

Note for LLVM integration: If you are using the `OneToNTypeConverter`,
simply switch all occurrences to `TypeConverter`.
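
For illustration, a minimal sketch of registering a 1:N target
materialization on the merged `TypeConverter` (the callback signature
shown here is an assumption based on the description above):

```cpp
// Hypothetical sketch, assuming `using namespace mlir;`. A real converter
// would build dialect-specific IR instead of a builtin cast op.
TypeConverter converter;
converter.addTargetMaterialization(
    [](OpBuilder &builder, TypeRange resultTypes, ValueRange inputs,
       Location loc) -> SmallVector<Value> {
      auto cast = builder.create<UnrealizedConversionCastOp>(loc, resultTypes,
                                                             inputs);
      return SmallVector<Value>(cast.getResults().begin(),
                                cast.getResults().end());
    });
```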

---------

Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
2024-10-25 11:44:20 -07:00
Andrea Faulds
9f6c632ecd [mlir][mlir-spirv-cpu-runner] Move MLIR pass pipeline to mlir-opt (#113594)
Adds a new mlir-opt test-only pass, -test-spirv-cpu-runner-pipeline,
which runs the set of MLIR passes needed for the mlir-spirv-cpu-runner,
and removes them from the runner. The tests are changed to invoke
mlir-opt with this flag before running the runner. The eventual goal is
to move all host/device code generation steps out of the runner, like
with some of the other runners.

Recommit of 17e9752267. It was reverted
due to a build failure, but the build failure had in fact already been
fixed in e7302319b5.
2024-10-25 07:21:59 -07:00
Matthias Springer
f18c3e4e73 [mlir][Transforms] Dialect Conversion: Simplify materialization fn result type (#113031)
This commit simplifies the result type of materialization functions.

Previously: `std::optional<Value>`
Now: `Value`

The previous implementation allowed 3 possible return values:
- Non-null value: The materialization function produced a valid
materialization.
- `std::nullopt`: The materialization function failed, but another
materialization can be attempted.
- `Value()`: The materialization failed and so should the dialect
conversion. (Previously: Dialect conversion can roll back.)

This commit removes the last variant. It is not particularly useful
because the dialect conversion will fail anyway if all other
materialization functions produced `std::nullopt`.

Furthermore, in contrast to type conversions, at least one
materialization callback is expected to succeed. In case of a failing
type conversion, the current dialect conversion can roll back and try a
different pattern. This also used to be the case for materializations,
but that functionality was removed with #107109: failed materializations
can no longer trigger a rollback. (They can just make the entire dialect
conversion fail without rollback.) With this in mind, it is even less
useful to have an additional error state for materialization functions.

This commit is in preparation for merging the 1:1 and 1:N type
converters. Target materializations will have to return multiple values
instead of a single one. With this commit, we can keep the API simple:
`SmallVector<Value>` instead of `std::optional<SmallVector<Value>>`.

Note for LLVM integration: All 1:1 materializations should return
`Value` instead of `std::optional<Value>`. Instead of `std::nullopt`
return `Value()`.
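
A minimal sketch of an updated 1:1 target materialization under the new
contract (the body is illustrative; returning `Value()` now plays the role
that `std::nullopt` used to play):

```cpp
// Hypothetical sketch, assuming `using namespace mlir;`.
TypeConverter converter;
converter.addTargetMaterialization(
    [](OpBuilder &builder, Type resultType, ValueRange inputs,
       Location loc) -> Value {
      if (inputs.size() != 1)
        return Value(); // Not handled here; let another materialization try.
      return builder
          .create<UnrealizedConversionCastOp>(loc, resultType, inputs)
          .getResult(0);
    });
```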
2024-10-23 07:29:17 -07:00
lorenzo chelini
34d4f660fe [mlir] Fix the emission of prop-dict when operations have no properties (#112851)
When an operation has no properties, no property struct is emitted. To avoid a compilation error, we should also skip emitting `setPropertiesFromParsedAttr`, `parseProperties` and `printProperties` in such cases.
    
Compilation error:
    
```
    error: ‘Properties’ has not been declared
      static ::llvm::LogicalResult setPropertiesFromParsedAttr(Properties &prop, ::mlir::Attribute attr, ::llvm::function_ref<::mlir::InFlightDiagnostic()> emitError);
    
```
2024-10-21 13:43:55 -07:00
Jakub Kuderski
17e9752267 Revert "[mlir][mlir-spirv-cpu-runner] Move MLIR pass pipeline to mlir-opt" (#113176)
Reverts llvm/llvm-project#111575

This caused build failures:
https://lab.llvm.org/buildbot/#/builders/138/builds/5244
2024-10-21 08:10:22 -07:00
Michael Liao
e7302319b5 [mlir] Fix shared build. NFC 2024-10-21 10:55:17 -04:00
Andrea Faulds
f0312d962d [mlir][mlir-spirv-cpu-runner] Move MLIR pass pipeline to mlir-opt (#111575)
Adds a new mlir-opt test-only pass, -test-spirv-cpu-runner-pipeline,
which runs the set of MLIR passes needed for the mlir-spirv-cpu-runner,
and removes them from the runner. The tests are changed to invoke
mlir-opt with this flag before running the runner. The eventual goal is
to move all host/device code generation steps out of the runner, like
with some of the other runners.
2024-10-21 06:55:40 -07:00
donald chen
4b3f251bad [mlir] [dataflow] unify semantics of program point (#110344)
The concept of a 'program point' in the original data flow framework is
ambiguous. It can refer to either an operation or a block itself. This
representation has different interpretations in forward and backward
data-flow analysis. In forward data-flow analysis, the program point of
an operation represents the state after the operation, while in backward
data flow analysis, it represents the state before the operation. When
using forward or backward data-flow analysis, it is crucial to carefully
handle this distinction to ensure correctness.

This patch refactors the definition of program point, unifying the
interpretation of program points in both forward and backward data-flow
analysis.

How to integrate this patch?

For dense forward data-flow analysis and other analysis (except dense
backward data-flow analysis), the program point corresponding to the
original operation can be obtained by `getProgramPointAfter(op)`, and
the program point corresponding to the original block can be obtained by
`getProgramPointBefore(block)`.

For dense backward data-flow analysis, the program point corresponding
to the original operation can be obtained by
`getProgramPointBefore(op)`, and the program point corresponding to the
original block can be obtained by `getProgramPointAfter(block)`.
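
As a rough sketch (inside a hypothetical analysis class, with `op` and
`block` assumed to be in scope):

```cpp
// Forward and non-dense-backward analyses: the old "point of an op/block".
auto *opPoint = getProgramPointAfter(op);
auto *blockPoint = getProgramPointBefore(block);
// Dense backward analyses use the flipped mapping:
auto *opPointBwd = getProgramPointBefore(op);
auto *blockPointBwd = getProgramPointAfter(block);
```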

NOTE: If you need to get the lattice of other data-flow analyses in
dense backward data-flow analysis, you should still use the dense
forward data-flow approach. For example, to get the Executable state of
a block in dense backward data-flow analysis and add the dependency of
the current operation, you should write:

``getOrCreateFor<Executable>(getProgramPointBefore(op),
getProgramPointBefore(block))``

In the case above, we use getProgramPointBefore(op) because the analysis we
rely on is dense backward data-flow, and we use
getProgramPointBefore(block) because the lattice we query is the result
of a non-dense backward data flow computation.

Related discussion:
https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8
Corresponding PSA:
https://discourse.llvm.org/t/psa-program-point-semantics-change/81479
2024-10-11 21:59:05 +08:00
Benoit Jacob
a9ebdbb5ac [MLIR] Vector: turn the ExtractStridedSlice rewrite pattern from #111541 into a canonicalization (#111614)
This is a reasonable canonicalization because `extract` is more
constrained than `extract_strided_slices`, so there is no loss of
semantics here, just lifting an op to a special-case higher/constrained
op. And the additional `shape_cast` is merely adding leading unit dims
to match the original result type.

Context: discussion on #111541. I wasn't sure how this would turn out,
but in the process of writing this PR, I discovered at least 2 bugs in
the pattern introduced in #111541, which shows the value of shared
canonicalization patterns which are exercised on a high number of
testcases.

---------

Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
2024-10-09 09:24:23 -04:00
Benoit Jacob
10054ba4ac [mlir][vector] Add pattern to rewrite contiguous ExtractStridedSlice into Extract (#111541)
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
2024-10-08 11:51:01 -04:00
Matthias Springer
206fad0e21 [mlir][NFC] Mark type converter in populate... functions as const (#111250)
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.

Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate patterns now take
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.

Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
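
A sketch of the resulting convention (the populate function and pattern
names here are hypothetical):

```cpp
// A const type converter signals that this function only adds patterns and
// registers no new type conversion rules.
void populateMyDialectToLLVMConversionPatterns(const TypeConverter &converter,
                                               RewritePatternSet &patterns) {
  // MyOpLowering is a hypothetical OpConversionPattern subclass.
  patterns.add<MyOpLowering>(converter, patterns.getContext());
}
```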
2024-10-05 21:32:40 +02:00
Aman LaChapelle
759a7b5933 [mlir] Add the ability to define dialect-specific location attrs. (#105584)
This patch adds the capability to define dialect-specific location
attrs. This is useful in particular for defining location structure that
doesn't necessarily fit within the core MLIR location hierarchy, but
doesn't make sense to push upstream (i.e. a custom use case).

This patch adds an AttributeTrait, `IsLocation`, which is tagged onto
all the builtin location attrs, as well as the test location attribute.
This is necessary because previously LocationAttr::classof only returned
true if the attribute was one of the builtin location attributes, and
well, the point of this patch is to allow dialects to define their own
location attributes.

There was an alternate implementation I considered wherein LocationAttr
becomes an AttrInterface, but that was discarded because there are
likely to be *many* locations in a single program, and I was concerned
that forcing every MLIR user to pay the cost of the additional
lookup/dispatch was unacceptable. It also would have been a *much* more
invasive change. It would have allowed for more flexibility in terms of
pretty printing, but it's unclear how useful/necessary that flexibility
would be given how much customizability there already is for attribute
definitions.
2024-10-03 10:25:44 -07:00
Billy Zhu
5b21fd298c [MLIR][Pass] Full & deterministic diagnostics (#110311)
Today, when the pass infra schedules a pass/nested-pipeline on a set of
ops, it exits early as soon as it fails on one of the ops. This leads to
non-exhaustive, and more importantly, non-deterministic error reporting
(under async).

This PR removes the early termination behavior so that all ops have a
chance to run through the current pass/nested-pipeline, and all errors
are reported (async diagnostics are already ordered). This guarantees
deterministic & full error reporting. As a result, it's also no longer
necessary to -split-input-file with one error per split when testing
with -verify-diagnostics.
2024-10-01 19:07:52 -07:00
Andrea Faulds
a800ffac41 [mlir][gpu] Disjoint patterns for lowering clustered subgroup reduce (#109158)
Making the existing populateGpuLowerSubgroupReduceToShufflePatterns()
function also cover the new "clustered" subgroup reductions is proving
to be inconvenient, because certain backends may have more specific
lowerings that only cover the non-clustered type, and this creates pass
ordering constraints. This commit removes coverage of clustered
reductions from this function in favour of a new separate function,
which makes controlling the lowering much more straightforward.
2024-09-18 15:55:53 -04:00
Andrea Faulds
fd26f8444a [mlir][gpu] Rename two misspelled pattern population functions (#109015) 2024-09-17 15:26:14 -04:00
MaheshRavishankar
d5f0969c96 [mlir][TilingInterface] Avoid looking at operands for getting slices to continue tile + fuse. (#107882)
Current implementation of `scf::tileConsumerAndFuseProducerUsingSCF`
looks at operands of tiled/tiled+fused operations to see if they are
produced by `extract_slice` operations to populate the worklist used to
continue fusion. This implicit assumption does not always work. Instead
make the implementations of `getTiledImplementation` return the slices
to use to continue fusion.

This is a breaking change

- To continue to get the same behavior of
`scf::tileConsumerAndFuseProducerUsingSCF`, change all out-of-tree
implementation of `TilingInterface::getTiledImplementation` to return
the slices to continue fusion on. All in-tree implementations have been
adapted to this.
- This change touches parts that required a simplification to the
`ControlFn` in `scf::SCFTileAndFuseOptions`. It now returns a
`std::optional<scf::SCFTileAndFuseOptions::ControlFnResult>` object that
should be `std::nullopt` if fusion is not to be performed.

Signed-off-by: MaheshRavishankar <mahesh.revishankar@gmail.com>
2024-09-11 22:15:43 -07:00
Amy Wang
6634d44e5e [MLIR][Transform] Allow stateInitializer and stateExporter for applyTransforms (#101186)
This is discussed in RFC:

https://discourse.llvm.org/t/rfc-making-the-constructor-of-the-transformstate-class-protected/80377
2024-09-09 10:57:13 -04:00
Matthias Springer
3815f478bb [mlir][Transforms] Dialect conversion: Make materializations optional (#107109)
This commit makes source/target/argument materializations (via the
`TypeConverter` API) optional.

By default (`ConversionConfig::buildMaterializations = true`), the
dialect conversion infrastructure tries to legalize all unresolved
materializations right after the main transformation process has
succeeded. If at least one unresolved materialization fails to resolve,
the dialect conversion fails. (With an error message such as `failed to
legalize unresolved materialization ...`.) Automatic materializations
through the `TypeConverter` API can now be deactivated. In that case,
every unresolved materialization will show up as a
`builtin.unrealized_conversion_cast` op in the output IR.

There used to be a complex and error-prone analysis in the dialect
conversion that predicted the future uses of unresolved
materializations. Based on that logic, some casts (that were deemed
unnecessary) were folded. This analysis was needed because folding
happened at a point in time when some IR changes (e.g., op replacements)
had not materialized yet.

This commit removes that analysis. Any folding of cast ops now happens
after all other IR changes have been materialized and the uses can
directly be queried from the IR. This simplifies the analysis
significantly. And certain helper data structures such as
`inverseMapping` are no longer needed for the analysis. The folding
itself is done by `reconcileUnrealizedCasts` (which also exists as a
standalone pass).

After casts have been folded, the remaining casts are materialized
through the `TypeConverter`, as usual. This last step can be deactivated
in the `ConversionConfig`.

`ConversionConfig::buildMaterializations = false` can be used to debug
error messages such as `failed to legalize unresolved materialization
...`. (It is also useful in case automatic materializations are not
needed.) The materializations that failed to resolve can then be seen as
`builtin.unrealized_conversion_cast` ops in the resulting IR. (This is
better than running with `-debug`, because `-debug` shows IR where some
IR changes have not been materialized yet.)
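
A minimal sketch of using this from a conversion pass (assuming the
`ConversionConfig` overload of `applyPartialConversion`; `target` and
`patterns` are set up elsewhere):

```cpp
// Keep unresolved materializations as builtin.unrealized_conversion_cast
// ops so they can be inspected in the output IR.
ConversionConfig config;
config.buildMaterializations = false;
if (failed(applyPartialConversion(getOperation(), target, std::move(patterns),
                                  config)))
  signalPassFailure();
```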

Note: This is a reupload of #104668, but with correct handling of cyclic
unrealized_conversion_casts that may be generated by the dialect
conversion.
2024-09-05 19:40:58 +02:00
SJW
ebf0599314 [MLIR][SCF] Add support for loop pipeline peeling for dynamic loops. (#106436)
Allow speculative execution and predicate results per stage.
2024-09-04 12:24:58 -07:00
donald chen
b6603e1bf1 [mlir] [dataflow] Refactoring the definition of program points in data flow analysis (#105656)
This patch distinguishes between program points and lattice anchors in
data flow analysis, where lattice anchors represent locations where a
lattice can be attached, while program points denote points in program
execution.

Related discussions:
https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8
2024-08-25 19:21:47 +08:00
MaheshRavishankar
4dbaef6d5e [mlir][Linalg] Avoid doing op replacement in linalg::dropUnitDims. (#105749)
It is better to do the replacement in the caller. This avoids the
footgun if the caller needs the original operation. Instead return the
produced operation and replacement values.

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2024-08-23 13:43:33 -07:00
Théo Degioanni
b084111c8e [mlir][mem2reg] Fix Mem2Reg attempting to promote in graph regions (#104910)
Mem2Reg assumes SSA dependencies but did not check for graph regions.
This fixes it.

---------

Co-authored-by: Christian Ulmann <christianulmann@gmail.com>
2024-08-23 15:15:10 +02:00
Ivan Butygin
15e915a44f [mlir][dataflow] Propagate errors from visitOperation (#105448)
Base `DataFlowAnalysis::visit` returns `LogicalResult`, but the
Sparse/Dense/Forward/Backward wrappers' `visitOperation` doesn't.

Sometimes it is necessary to abort the solver early if an unrecoverable
condition is detected inside an analysis.

Update `visitOperation` to return `LogicalResult` and propagate it to
`solver.initializeAndRun()`. Only `visitOperation` is updated for now;
it's possible to update other hooks like `visitNonControlFlowArguments`,
but that's not needed immediately, so let's keep this PR small.

Hijacked `UnderlyingValueAnalysis` test analysis to test it.
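
A rough sketch of an analysis hook after this change (the analysis class,
lattice, and helper names are hypothetical):

```cpp
LogicalResult MyAnalysis::visitOperation(Operation *op,
                                         ArrayRef<const MyLattice *> operands,
                                         ArrayRef<MyLattice *> results) {
  if (isUnsupported(op))
    // Emitting an error returns failure, which aborts
    // solver.initializeAndRun() early.
    return op->emitError("unsupported operation in MyAnalysis");
  // ... propagate lattice state from operands to results ...
  return success();
}
```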
2024-08-22 12:16:03 +03:00
Andrzej Warzyński
42944da5ba [mlir][vector] Group re-order patterns together (#102856)
Group all patterns that re-order vector.transpose and vector.broadcast
Ops (*) under `populateSinkVectorOpsPatterns`. These patterns are
normally used to "sink" redundant Vector Ops, hence grouping together.
Example:

```mlir
%at = vector.transpose %a, [1, 0]: vector<4x2xf32> to vector<2x4xf32>
%bt = vector.transpose %b, [1, 0]: vector<4x2xf32> to vector<2x4xf32>
%r = arith.addf %at, %bt : vector<2x4xf32>
```
would get converted to:
```mlir
%0 = arith.addf %a, %b : vector<4x2xf32>
%r = vector.transpose %0, [1, 0] : vector<4x2xf32> to vector<2x4xf32>
```

This patch also moves all tests for these patterns so that all of them
are:
  * run under one test-flag: `test-vector-sink-patterns`,
  * located in one file: "vector-sink.mlir".

To facilitate this change:
  * `-test-sink-vector-broadcast` is renamed as
    `test-vector-sink-patterns`,
  * "sink-vector-broadcast.mlir" is renamed as "vector-sink.mlir",
  * tests for `ReorderCastOpsOnBroadcast` and
    `ReorderElementwiseOpsOnTranspose` patterns are moved from
    "vector-reduce-to-contract.mlir" to "vector-sink.mlir",
  * `ReorderElementwiseOpsOnTranspose` patterns are removed from
    `populateVectorReductionToContractPatterns` and added to (newly
    created) `populateSinkVectorOpsPatterns`,
  * `ReorderCastOpsOnBroadcast` patterns are removed from
    `populateVectorReductionToContractPatterns` - these are already
    present in `populateSinkVectorOpsPatterns`.

This should allow us better layering and more straightforward testing.
For the latter, the goal is to be able to easily identify which pattern
a particular test is exercising (especially when it's a specific
pattern).

NOTES FOR DOWNSTREAM USERS

In order to preserve the current functionality, please make sure to add
`populateSinkVectorOpsPatterns` wherever you are using
`populateVectorReductionToContractPatterns`.
Also, rename `populateSinkVectorBroadcastPatterns` as
`populateSinkVectorOpsPatterns`.
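
A sketch of a downstream pattern setup following these notes (assuming the
populate functions live in the `vector` namespace):

```cpp
RewritePatternSet patterns(ctx);
vector::populateVectorReductionToContractPatterns(patterns);
// Previously pulled in implicitly; now must be added explicitly:
vector::populateSinkVectorOpsPatterns(patterns);
```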

(*) I didn't notice any other re-order patterns.
2024-08-16 16:53:53 +01:00
Ian Wood
a95ad2da36 [mlir] Add bubbling patterns for non intersecting reshapes (#103401)
Refactored @Max191's PR https://github.com/llvm/llvm-project/pull/94637
to move it to `Tensor`

From the original PR
>This PR adds fusion by expansion patterns to push a tensor.expand_shape
up through a tensor.collapse_shape with non-intersecting reassociations.
Sometimes parallel collapse_shape ops like this can block propagation of
expand_shape ops, so this allows them to pass through each other.

I'm not sure if I put the code/tests in the right places, so let me know
where those go if they aren't.

cc @MaheshRavishankar @hanhanW

---------

Co-authored-by: Max Dawkins <max.dawkins@gmail.com>
2024-08-14 13:58:35 -07:00
Frank Schlimbach
baabcb2898 [mlir][mesh] Shardingcontrol (#102598)
This is a fixed copy of #98145 (necessary after it got reverted).

@sogartar @yaochengji
This PR adds the following to #98145:
- `UpdateHaloOp` accepts a `memref` (instead of a tensor) and does not
return a result, to clarify its in-place semantics
- `UpdateHaloOp` accepts `split_axis` to allow multiple mesh-axes per
tensor/memref-axis (similar to `mesh.sharding`)
- The implementation of `ShardingInterface` for tensor operations
(`tensor.empty` for now) moved from the tensor library to the mesh
interface library. `spmdize` uses features from the `mesh` dialect.
@rengolin agreed that `tensor` should not depend on `mesh`, so this
functionality cannot live in a `tensor` lib. The unfulfilled dependency
caused the issues leading to reverting #98145. Such cases are generally
possible and might lead to re-considering the current structure (like
for tosa ops).
- rebased onto latest main
--------------------------
Replacing `#mesh.sharding` attribute with operation `mesh.sharding`
- extended semantics now allow providing optional `halo_sizes` and
`sharded_dims_sizes`
- internally a sharding is represented as a non-IR class
`mesh::MeshSharding`

What previously was
```mlir
%sharded0 = mesh.shard %arg0 <@mesh0, [[0]]> : tensor<4x8xf32>
%sharded1 = mesh.shard %arg1 <@mesh0, [[0]]> annotate_for_users : tensor<16x8xf32>
```
is now
```mlir
%sharding = mesh.sharding @mesh0, [[0]] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding : tensor<4x8xf32>
%1 = mesh.shard %arg1 to %sharding annotate_for_users : tensor<16x8xf32>
```
and allows additional annotations to control the shard sizes:
```mlir
mesh.mesh @mesh0 (shape = 4)
%sharding0 = mesh.sharding @mesh0, [[0]] halo_sizes = [1, 2] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding0 : tensor<4x8xf32>
%sharding1 = mesh.sharding @mesh0, [[0]] sharded_dims_sizes = [3, 5, 5, 3] : !mesh.sharding
%1 = mesh.shard %arg1 to %sharding1 annotate_for_users : tensor<16x8xf32>
```
- `mesh.shard` op accepts an additional optional attribute `force`, useful
for halo updates
- Some initial spmdization support for the new semantics
- Support for `tensor.empty` reacting to `sharded_dims_sizes` and
`halo_sizes` in the sharding
- New collective operation `mesh.update_halo` as a spmdized target for
shardings with `halo_sizes`

---------

Co-authored-by: frank.schlimbach <fschlimb@smtp.igk.intel.com>
Co-authored-by: Jie Fu <jiefu@tencent.com>
2024-08-12 12:20:58 +01:00
Nikhil Kalra
165c6d1251 [mlir] Add support for parsing nested PassPipelineOptions (#101118)
- Added a default parsing implementation to `PassOptions` to allow
`Option`/`ListOption` to wrap PassOption objects. This is helpful when
creating meta-pipelines (pass pipelines composed of pass pipelines).
- Updated `ListOption` printing to enable round-tripping the output of
`dump-pass-pipeline` back into `mlir-opt` for more complex structures.
2024-08-09 13:54:00 -07:00
Matthias Springer
7359a6b799 [mlir][ODS] Verify type constraints in Types and Attributes (#102326)
When a type/attribute is defined in TableGen, a type constraint can be
used for parameters, but the type constraint verification was missing.

Example:
```
def TestTypeVerification : Test_Type<"TestTypeVerification"> {
  let parameters = (ins AnyTypeOf<[I16, I32]>:$param);
  // ...
}
```

No verification code was generated to ensure that `$param` is I16 or
I32.

When type constraints are present, a new method will be generated for types
and attributes: `verifyInvariantsImpl`. (The naming is similar to op
verifiers.) The user-provided verifier is called `verify` (no change).
There is now a new entry point to type/attribute verification:
`verifyInvariants`. This function calls both `verifyInvariantsImpl` and
`verify`. If neither of those two verifications is present, the
`verifyInvariants` function is not generated.

When a type/attribute is not defined in TableGen, but a verifier is
needed, users can implement the `verifyInvariants` function. (This
function was previously called `verify`.)

Note for LLVM integration: If you have an attribute/type that is not
defined in TableGen (i.e., just C++), you have to rename the
verification function from `verify` to `verifyInvariants`. (Most
attributes/types have no verification, in which case there is nothing to
do.)

Depends on #102657.
2024-08-09 22:04:40 +02:00
Benjamin Maxwell
9b06e25e73 [mlir][vector] Add mask elimination transform (#99314)
This adds a new transform `eliminateVectorMasks()` which aims at
removing scalable `vector.create_masks` that will be all-true at
runtime. It attempts to do this by simply pattern-matching the mask
operands (similar to some canonicalizations); if that does not lead to
an answer (all-true: yes/no), then value bounds analysis will be used
to find the lower bound of the unknown operands. If the lower bound is
greater than or equal to the corresponding mask vector type dimension,
then that dimension of the mask is all-true.

Note that the pattern matching prevents expensive value-bounds analysis
in cases where the mask won't be all true.

For example:
```mlir
%mask = vector.create_mask %dynamicValue, %c2 : vector<8x4xi1>
```
From looking at `%c2` we can tell this is not going to be an all-true
mask, so we don't need to run the value-bounds analysis for
`%dynamicValue` (and can exit the transform early).

Note: Eliminating create_masks here means replacing them with all-true
constants (which will then lead to the masks folding away).
2024-08-09 10:51:49 +01:00
Diego Caballero
2ac2e9a5b6 [mlir][LLVM] Improve lowering of llvm.byval function arguments (#100028)
When a function argument is annotated with the `llvm.byval` attribute,
[LLVM expects](https://llvm.org/docs/LangRef.html#parameter-attributes)
the function argument type to be an `llvm.ptr`. For example:

```
func.func @example(%arg0 : !llvm.ptr {llvm.byval = !llvm.struct<(i32)>}) {
  ...
}
```

Unfortunately, this makes the type conversion context-dependent, which
is something that the type conversion infrastructure (i.e.,
`LLVMTypeConverter` in this particular case) doesn't support. For
example, we may want to convert `MyType` to `llvm.struct<(i32)>` in
general, but to an `llvm.ptr` type only when it's a function argument
passed by value.

To fix this problem, this PR changes the FuncToLLVM conversion logic to
generate an `llvm.ptr` when the function argument has a `llvm.byval`
attribute. An `llvm.load` is inserted into the function to retrieve the
value expected by the argument users.
2024-08-08 19:27:54 -07:00
Renato Golin
3968942f10 Revert "[mlir][mesh] adding shard-size control (#98145)"
This reverts commit fca69838ca.

Also reverts the fixup: "[mlir] Fix -Wunused-variable in MeshOps.cpp (NFC)"

This reverts commit fc737368fe.
2024-08-07 15:12:37 +01:00
Frank Schlimbach
fca69838ca [mlir][mesh] adding shard-size control (#98145)
- Replacing `#mesh.sharding` attribute with operation `mesh.sharding`
- extended semantics now allow providing optional `halo_sizes` and
`sharded_dims_sizes`
- internally a sharding is represented as a non-IR class
`mesh::MeshSharding`

What previously was
```mlir
%sharded0 = mesh.shard %arg0 <@mesh0, [[0]]> : tensor<4x8xf32>
%sharded1 = mesh.shard %arg1 <@mesh0, [[0]]> annotate_for_users : tensor<16x8xf32>
```
is now
```mlir
%sharding = mesh.sharding @mesh0, [[0]] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding : tensor<4x8xf32>
%1 = mesh.shard %arg1 to %sharding annotate_for_users : tensor<16x8xf32>
```
and allows additional annotations to control the shard sizes:
```mlir
mesh.mesh @mesh0 (shape = 4)
%sharding0 = mesh.sharding @mesh0, [[0]] halo_sizes = [1, 2] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding0 : tensor<4x8xf32>
%sharding1 = mesh.sharding @mesh0, [[0]] sharded_dims_sizes = [3, 5, 5, 3] : !mesh.sharding
%1 = mesh.shard %arg1 to %sharding1 annotate_for_users : tensor<16x8xf32>
```
- `mesh.shard` op accepts an additional optional attribute `force`, useful
for halo updates
- Some initial spmdization support for the new semantics
- Support for `tensor.empty` reacting to `sharded_dims_sizes` and
`halo_sizes` in the sharding
- New collective operation `mesh.update_halo` as a spmdized target for
shardings with `halo_sizes`

@sogartar @yaochengji
2024-08-07 13:34:57 +01:00
Nikhil Kalra
84cc1865ef [mlir] Support DialectRegistry extension comparison (#101119)
`PassManager::run` loads the dependent dialects for each pass into the
current context prior to invoking the individual passes. If the
dependent dialect is already loaded into the context, this should be a
no-op. However, if there are extensions registered in the
`DialectRegistry`, the dependent dialects are unconditionally registered
into the context.

This poses a problem for dynamic pass pipelines, however, because they
will likely be executing while the context is in an immutable state
(because of the parent pass pipeline being run).

To solve this, we'll update the extension registration API on
`DialectRegistry` to require a type ID for each extension that is
registered. Then, instead of unconditionally registering dialects into a
context if extensions are present, we'll check against the extension
type IDs already present in the context's internal `DialectRegistry`.
The context will only be marked as dirty if there are net-new extension
types present in the `DialectRegistry` populated by
`PassManager::getDependentDialects`.

Note: this PR removes the `addExtension` overload that utilizes
`std::function` as the parameter. This is because `std::function` is
copyable and potentially allocates memory for the contained function so
we can't use the function pointer as the unique type ID for the
extension.

Downstream changes required:
- Existing `DialectExtension` subclasses will need a type ID to be
registered for each subclass. More details on how to register a type ID
can be found here:
8b68e06731/mlir/include/mlir/Support/TypeID.h (L30)
- Existing uses of the `std::function` overload of `addExtension` will
need to be refactored into dedicated `DialectExtension` classes with
associated type IDs. The attached `std::function` can either be inlined
into or called directly from `DialectExtension::apply`.
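
A hypothetical sketch of that refactoring (dialect and extension names are
illustrative; see the TypeID.h reference above for registering the required
type ID):

```cpp
// Was: registry.addExtension(+[](MLIRContext *ctx, MyDialect *dialect) {...});
// Now: a dedicated extension class with an associated type ID.
struct MyDialectExtension
    : public DialectExtension<MyDialectExtension, MyDialect> {
  void apply(MLIRContext *ctx, MyDialect *dialect) const final {
    // Body of the previous std::function callback goes here.
  }
};

// Registration:
//   registry.addExtensions<MyDialectExtension>();
```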

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2024-08-06 01:32:36 +02:00
Kazu Hirata
5262865aac [mlir] Construct SmallVector with ArrayRef (NFC) (#101896) 2024-08-04 11:43:05 -07:00
MaheshRavishankar
6740d701bd [mlir][Linalg] Deprecate linalg::tileToForallOp and linalg::tileToForallOpUsingTileSizes (#91878)
The implementations of these methods are legacy, and they are removed in
favor of using the `scf::tileUsingSCF` methods as replacements. To get
the latter on par with requirements of the deprecated methods, the
tiling allows one to specify the maximum number of tiles to use instead
of specifying the tile sizes. When tiling to `scf.forall` this
specification is used to generate the `num_threads` version of the
operation.

A slight deviation from the previous implementation is that the deprecated
method always generated the `num_threads` variant of the `scf.forall`
operation. Now this is instead driven by the tiling options specified.
This reduces the indexing math generated when the tile sizes are
specified.

**Moving from `linalg::tileToForallOp` to `scf::tileUsingSCF`**

```
OpBuilder b;
TilingInterface op;
ArrayRef<OpFoldResult> numThreads;
ArrayAttr mapping;
FailureOr<ForallTilingResult> result = linalg::tileToForallOp(b, op, numThreads, mapping);
```

can be replaced by
```
scf::SCFTilingOptions options;
options.setNumThreads(numThreads);
options.setLoopType(scf::SCFTilingOptions::LoopType::ForallOp);
options.setMapping(mapping.getValue()); /*note the difference that setMapping takes an ArrayRef<Attribute> */
FailureOr<scf::SCFTilingResult> result = scf::tileUsingSCF(b, op, options);
```

This generates the `numThreads` version of the `scf.forall` for the
inter-tile loops, i.e.

```
... = scf.forall (%arg0, %arg1) in (%nt0, %nt1) shared_outs(...)
```

**Moving from `linalg::tileToForallOpUsingTileSizes` to
`scf::tileUsingSCF`**

```
OpBuilder b;
TilingInterface op;
ArrayRef<OpFoldResult> tileSizes;
ArrayAttr mapping;
FailureOr<ForallTilingResult> result = linalg::tileToForallOpUsingTileSizes(b, op, tileSizes, mapping);
```

can be replaced by
```
scf::SCFTilingOptions options;
options.setTileSizes(tileSizes);
options.setLoopType(scf::SCFTilingOptions::LoopType::ForallOp);
options.setMapping(mapping.getValue()); /*note the difference that setMapping takes an ArrayRef<Attribute> */
FailureOr<scf::SCFTilingResult> result = scf::tileUsingSCF(b, op, options);
```

Also note that `linalg::tileToForallOpUsingTileSizes` would effectively
call `linalg::tileToForallOp` by computing the `numThreads` from the
`op` and `tileSizes` and generate the `numThreads` version of the
`scf.forall`. That is not the case anymore. Instead, this will directly
generate the `tileSizes` version of the `scf.forall` op:

```
... = scf.forall(%arg0, %arg1) = (%lb0, %lb1) to (%ub0, %ub1) step(%step0, %step1) shared_outs(...)
```

If you actually want to use the `numThreads` version, it is up to the
caller to compute the `numThreads` and call `options.setNumThreads`
instead of `options.setTileSizes`. Note that there is a slight
difference in the num threads version and tile size version. The former
requires an additional `affine.max` on the tile size to ensure
non-negative tile sizes. When lowering to `numThreads` version this
`affine.max` is not needed since by construction the tile sizes are
non-negative. In previous implementations, the `numThreads` version
generated when using the `linalg::tileToForallOpUsingTileSizes` method
would avoid generating the `affine.max` operation. To get the same
state, downstream users will have to additionally normalize the
`scf.forall` operation.

**Changes to `transform.structured.tile_using_forall`**

The transform dialect op that called into `linalg::tileToForallOp` and
`linalg::tileToForallOpUsingTileSizes` has been modified to call
`scf::tileUsingSCF`. The transform dialect op always generates the
`numThreads` version of the `scf.forall` op. So when `tile_sizes` are
specified for the transform dialect op, first the `tile_sizes` version
of the `scf.forall` is generated by the `scf::tileUsingSCF` method which
is then further normalized to get back to the same state. So there is no
functional change to `transform.structured.tile_using_forall`. It always
generates the `numThreads` version of the `scf.forall` op (as it did
before this change).

---------

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2024-07-31 12:32:07 -07:00
Matthias Springer
8fc329421b [mlir][Transforms] Dialect conversion: Add missing "else if" branch (#101148)
This code got lost in #97213 and there was no test for it. Add it back
with an MLIR test.

When a pattern is run without a type converter, we can assume that the
new block argument types of a signature conversion are legal. That's
because they were specified by the user. This won't work for 1->N
conversions due to limitations in the dialect conversion infrastructure,
so the original `FIXME` has to stay in place.
2024-07-30 16:36:47 +02:00
Victor Perez
e8f07cdb57 [MLIR][SCF] Define -scf-rotate-while pass (#99850)
Define SCF dialect patterns rotating `scf.while` loops leveraging
existing `mlir::scf::wrapWhileLoopInZeroTripCheck`. `forceCreateCheck`
is always `false` as the pattern would lead to an infinite recursion
otherwise.

This pattern rotates `scf.while` ops, mutating them from "while" loops to
"do-while" loops. A guard checking the condition for the first iteration
is inserted. Note this guard can be optimized away if the compiler can
prove the loop will be executed at least once.

Using this pattern, the following while loop:

```mlir
scf.while (%arg0 = %init) : (i32) -> i64 {
  %val = .., %arg0 : i64
  %cond = arith.cmpi .., %arg0 : i32
  scf.condition(%cond) %val : i64
} do {
^bb0(%arg1: i64):
  %next = .., %arg1 : i32
  scf.yield %next : i32
}
```

Can be transformed into:

```mlir
%pre_val = .., %init : i64
%pre_cond = arith.cmpi .., %init : i32
scf.if %pre_cond -> i64 {
  %res = scf.while (%arg1 = %pre_val) : (i64) -> i64 {
    // Original after block
    %next = .., %arg1 : i32
    // Original before block
    %val = .., %next : i64
    %cond = arith.cmpi .., %next : i32
    scf.condition(%cond) %val : i64
  } do {
  ^bb0(%arg2: i64):
    scf.yield %arg2 : i32
  }
  scf.yield %res : i64
} else {
  scf.yield %pre_val : i64
}
```

The test pass for `wrapWhileLoopInZeroTripCheck` has been modified to
use the new pattern when `forceCreateCheck=false`.

---------

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
2024-07-30 10:06:01 +02:00
Krzysztof Drewniak
8955e285e1 [mlir] Add property combinators, initial ODS support (#94732)
While we have had a Properties.td that allowed for defining
non-attribute-backed properties, such properties were not plumbed
through the basic autogeneration facilities available to attributes,
forcing those who want to migrate to the new system to write such code
by hand.

## Potentially breaking changes

- The `setFoo()` methods on the `Properties` struct no longer take their
inputs by const reference. Those wishing to pass non-owned values of a
property by reference to constructors and setters should set the
interface type to `const [storageType]&`
- Adapters and operations now define getters and setters for properties
listed in ODS, which may conflict with custom getters.
- Builders now include properties listed in ODS specifications,
potentially conflicting with custom builders with the same type
signature.

## Extensions to the `Property` class

This commit adds several fields to the `Property` class, including:
- `parser`, `optionalParser`, and `printer` (for parsing/printing
properties of a given type in ODS syntax)
- `storageTypeValueOverride`, an extension of `defaultValue` to allow
the storage and interface type defaults to differ
- `baseProperty` (allowing for classes like `DefaultValuedProperty`)

Existing fields have also had their documentation comments updated.

This commit does not add a `PropertyConstraint` analogous to
`AttrConstraint`, but this is a natural evolution of the work here.

This commit also adds the concrete property kinds `I32Property`,
`I64Property`, `UnitProperty` (and special handling for it like for
UnitAttr), and `BoolProperty`.

## Property combinators

`Properties.td` also now includes several ways to combine properties.

One is `ArrayProperty<Property elem>`, which now stores a
variable-length array of some property as
`SmallVector<elem.storageType>` and uses `ArrayRef<elem.storageType>` as
its interface type. It has `IntArrayProperty` subclasses that change its
conversion to attributes to use `DenseI[N]Attr`s instead of an
`ArrayAttr`.

Similarly, `OptionalProperty<Property p>` wraps a property's storage in
`std::optional<>` and adds a `std::nullopt` default value. In the case
where the underlying property can be parsed optionally but doesn't have
its own default value, `OptionalProperty` can piggyback off the optional
parser to produce a cleaner syntax, as opposed to its general form,
which is either `none` or `some<[value]>`.

(Note that `OptionalProperty` can be nested if desired).

## Autogeneration changes

Operations and adaptors now support getters and setters for properties
like those for attributes. Unlike for attributes, there aren't separate
value and attribute forms, since there is no `FooAttr()` available for a
`getFooAttr()` to return.

The largest change is to operation formats. Previously, properties could
only be used in custom directives. Now, they can be used anywhere an
attribute could be used, and have parsers and printers defined in their
tablegen records.

These updates include special `UnitProperty` logic like that used for
`UnitAttr`.

## Misc.

Some attempt has been made to test the new functionality.

This commit takes tentative steps towards updating the documentation to
account for properties. A full update will be in order once any followup
work has been completed and the interfaces have stabilized.

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Co-authored-by: Christian Ulmann <christianulmann@gmail.com>
2024-07-26 09:35:06 -05:00
Matthias Springer
684a5a30e1 [mlir][Transforms] Dialect conversion: fix crash when converting detached region (#100633)
This commit fixes a crash in the dialect conversion when applying a
signature conversion to a block inside of a detached region.

This fixes an issue reported in
4114d5be87 (r1691809730).
2024-07-25 22:14:15 +02:00
weiwei chen
12dba4d484 [mlir] Add metadata to Diagnostic. (#99398)
Add metadata to Diagnostic. 

Motivation: we have a use case where we want to do some filtering in our
customized diagnostic handler based on custom info that is not
`location`, `severity`, or the `diagnostic arguments` that are member
variables of `Diagnostic`. Specifically, we want to add a unique ID to
the `Diagnostic` so that the handler can filter on it in a compiler pass
that emits errors from async tasks under multithreading, where the
diagnostic handling is associated with the task.

This patch adds a `metadata` field to `mlir::Diagnostic` as a general
solution. `metadata` is of type `SmallVector<DiagnosticArgument, 0>` to
save memory and to reuse the existing `DiagnosticArgument` as the
metadata type.
2024-07-25 10:01:46 -04:00
Angel Zhang
f83950ab8d [mlir][spirv] Implement vector unrolling for convert-to-spirv pass (#100138)
### Description
This PR builds on #99872. It implements a minimal version of function
body vector unrolling to convert vector types into 1-D vectors with a size
supported by SPIR-V (2, 3, or 4, depending on the original dimension). The
ops that are currently supported include those with elementwise traits
(e.g. `arith.addi`), `vector.reduction` and `vector.transpose`. This PR
also includes new LIT tests that only check for vector unrolling.

### Future Plans
- Support more ops

---------

Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
2024-07-24 10:41:19 -04:00
quartersdg
690dc4eff1 Add AsmParser::parseDecimalInteger. (#96255)
An attribute parser needs to parse lists of possibly negative integers
separated by `x`. This is foiled by `parseInteger`, which handles hex
formats, and by `parseIntegerInDimensionList`, which does not allow negatives.

---------

Co-authored-by: Jacques Pienaar <jpienaar@google.com>
2024-07-23 20:12:40 -07:00
Hsiangkai Wang
27ee33d136 [mlir][linalg] Decompose winograd operators (#96183)
Convert Linalg winograd_filter_transform, winograd_input_transform, and
winograd_output_transform into nested loops with matrix multiplication
with constant transform matrices.

Support several configurations of Winograd Conv2D, including F(2, 3),
F(4, 3) and F(2, 5). These configurations show that the implementation
can support different kernel sizes (3 and 5) and different output sizes
(2 and 4). Besides symmetric kernel sizes 3x3 and 5x5, this patch also
supports 1x3, 3x1, 1x5, and 5x1 kernels.

The implementation is based on the paper, Fast Algorithms for
Convolutional Neural Networks. (https://arxiv.org/abs/1509.09308)

Reviewers: ftynse, Max191, GeorgeARM, nicolasvasilache, MaheshRavishankar, dcaballe, rengolin

Reviewed By: ftynse, Max191

Pull Request: https://github.com/llvm/llvm-project/pull/96183
2024-07-18 06:04:53 +01:00
Angel Zhang
6867e49fc8 [mlir][spirv] Implement vector type legalization for function signatures (#98337)
### Description
This PR implements a minimal version of function signature conversion to
unroll vectors into 1-D vectors with a size supported by SPIR-V (2, 3, or 4,
depending on the original dimension). This PR also includes new unit
tests that only check for function signature conversion.

### Future Plans
- Check for capabilities that support vectors of size 8 or 16.
- Set up `OneToNTypeConversion` and `DialectConversion` to replace the
current implementation that uses `GreedyPatternRewriteDriver`.
- Introduce other vector unrolling patterns to cancel out the
`vector.insert_strided_slice` and `vector.extract_strided_slice` ops and
fully legalize the vector types in the function body.
- Handle `func::CallOp` and declarations.
- Restructure the code in `SPIRVConversion.cpp`.
- Create test passes for testing sets of patterns in isolation.
- Optimize the way the original shape is split into target shapes, e.g.
`vector<5xi32>` can be split into `vector<4xi32>` and
`vector<1xi32>`.

---------

Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
2024-07-17 13:09:15 -04:00
Hsiangkai Wang
7d246e84a4 [mlir][linalg] Implement Conv2D using Winograd Conv2D algorithm (#96181)
Define high level winograd operators and convert conv_2d_nhwc_fhwc into
winograd operators. According to Winograd Conv2D algorithm, we need
three transform operators for input, filter, and output transformation.

The formula of Winograd Conv2D algorithm is

Y = A^T x [(G x g x G^T) @ (B^T x d x B)] x A

filter transform: G x g x G^T
input transform: B^T x d x B
output transform: A^T x y x A

The implementation is based on the paper, Fast Algorithms for
Convolutional Neural Networks. (https://arxiv.org/abs/1509.09308)

Reviewers: stellaraccident, ftynse, Max191, GeorgeARM, cxy-1993, nicolasvasilache, MaheshRavishankar, dcaballe, rengolin

Reviewed By: ftynse, Max191, stellaraccident

Pull Request: https://github.com/llvm/llvm-project/pull/96181
2024-07-10 07:30:45 +01:00
Han-Chung Wang
04fc471f48 [mlir][linalg] Switch to use OpOperand* in ControlPropagationFn. (#96697)
It's not easy to determine whether we want to propagate pack/unpack ops
because we don't know the (producer, consumer) information. The
revisions switch it to `OpOperand*`, so the control function can capture
the (producer, consumer) pair. E.g.,

```
Operation *producer = opOperand->get().getDefiningOp();
Operation *consumer = opOperand->getOwner();
```
2024-07-08 09:53:09 -07:00
Jeremy Kun
07c157a435 [mlir] load dialect in parser for optional parameters (#96667)
https://github.com/llvm/llvm-project/pull/96242 fixed an issue where the
auto-generated parsers were not loading dialects whose namespaces are
not present in the textual IR. This required the attribute parameter to
be a tablegen def with its dialect information attached.

This fails when using parameter wrapper classes like
`OptionalParameter`. This came up because `RingAttr` uses
`OptionalParameter` for its second and third attributes.
`OptionalParameter` takes as input the C++ type as a string instead of
the tablegen def, and so it doesn't have a dialect member value to
trigger the fix from https://github.com/llvm/llvm-project/pull/96242.
The docs on this topic say the appropriate solution is to overload
`FieldParser` for a particular type.

This PR updates `FieldParser` for generic attributes to load the dialect
on demand. This requires `mlir-tblgen` to emit a `dialectName` static
field on the generated attribute class, and check for it with template
metaprogramming, since not all attribute types go through `mlir-tblgen`.

---------

Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
2024-07-07 09:44:07 -07:00
Ramkumar Ramachandra
db791b278a mlir/LogicalResult: move into llvm (#97309)
This patch is part of a project to move the Presburger library into
LLVM.
2024-07-02 10:42:33 +01:00
Théo Degioanni
69d3793ffb [mlir][sroa] Update name of subelement types in destructurable slots (#97226)
The `elementPtrs` field has changed meaning over time, and the name is now
outdated, which may be confusing. This PR updates it to a name
representative of current usage.
2024-06-30 20:24:56 +02:00
srcarroll
431213c99d [mlir][linalg] Implement patterns for reducing rank of named linalg contraction ops (#95710)
This patch introduces pattern rewrites for reducing the rank of named
linalg contraction ops with unit spatial dim(s) to other named
contraction ops. For example `linalg.batch_matmul` with batch size 1 ->
`linalg.matmul` and `linalg.matmul` with unit LHS spatial dim ->
`linalg.vecmat`, etc. These patterns don't support reducing the rank
along the reduction dimension, as those cases don't convert to other named
contraction ops.
2024-06-24 13:06:31 -05:00