ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since using std::nullopt outside the context of std::optional is
something of an abuse and not intuitive to newcomers, I would like to
move away from this constructor and eventually remove it.
This patch takes care of the mlir side of the migration, starting with
straightforward places where I see ArrayRef or ValueRange nearby.
Note that ValueRange has a constructor that forwards arguments to an
ArrayRef constructor.
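A minimal sketch of the kind of mechanical change involved at call sites (illustrative only; the actual call sites vary):
```
#include "llvm/ADT/ArrayRef.h"
#include <optional>

void takesArrayRef(llvm::ArrayRef<int> xs) { (void)xs.size(); }

void caller() {
  // Before: relies on the ArrayRef(std::nullopt_t) constructor.
  takesArrayRef(std::nullopt);
  // After: spell the empty range explicitly.
  takesArrayRef({});
  // or: takesArrayRef(llvm::ArrayRef<int>());
}
```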
Cf. https://discourse.llvm.org/t/mlir-dead-code-analysis/67568/10
Custom analysis passes will not work properly unless both
DeadCodeAnalysis and SparseConstantPropagation are loaded to the
DataFlowSolver. This is intended behavior, but surprising to many users
as shown in the thread. In lieu of a longer-term fix (which I am not
knowledgeable enough to implement myself, yet), this commit adds a
helper function that loads these two analyses and provides breadcrumbs
pointing to an explanation of the problem. The existing places in
the codebase where these two analyses are loaded in order to run other,
unrelated analyses are updated to use the helper.
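For reference, a minimal sketch of what callers currently do by hand and what the helper wraps (the helper's exact name is not spelled out here):
```
#include "mlir/Analysis/DataFlow/ConstantPropagationAnalysis.h"
#include "mlir/Analysis/DataFlow/DeadCodeAnalysis.h"
#include "mlir/Analysis/DataFlowFramework.h"

using namespace mlir;

// Both analyses must be loaded for sparse analyses to produce useful
// results; forgetting either one is the surprise discussed in the thread.
LogicalResult runWithBaseline(Operation *op) {
  DataFlowSolver solver;
  solver.load<dataflow::DeadCodeAnalysis>();
  solver.load<dataflow::SparseConstantPropagation>();
  // solver.load<MyCustomAnalysis>();  // hypothetical custom analysis
  return solver.initializeAndRun(op);
}
```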
---------
Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Co-authored-by: Oleksandr "Alex" Zinenko <azinenko@amd.com>
…ransferInterface out of LinalgStructuredInterface and use that for
PadTilingInterface
Along the way, a bug was found in the handling of scalar values; fix it
and add a test.
When no offload targets are specified, flang will ignore "target"
constructs, but not "target data" constructs. This patch makes the
behavior consistent across all offload-related operations.
While ignoring "target" may produce semantically incorrect code, it may
still be a useful debugging tool.
…t up to the transform dialect
This revision revisits the padding transformation from first principles
and prepares it to work more generally with TilingInterface.
Compared to transform.structured.pad it has the following differences:
- no support for nofold, copy-back, transpose and hoisting: these have
been carried by the padding op in the very early days of StructuredOps
and have since then been separated out as independent transformations
that compose.
- no conflated static bounding box derivation attempts:
pad_tiling_interface composes more naturally with or without tiling.
- properly derives padding sizes on outputs where multiple dimensions
contribute: this is not supported in transform.structured.pad
- geared towards supporting TilingInterface once the proper control
mechanisms are supported through a PadSizeComputationFunction (supports
LinalgOp by default)
This will gradually replace transform.structured.pad as it is fleshed
out and tested more comprehensively.
In the future this should be moved out of a specific Linalg
implementation file and into a more general "Structured" file.
New gfx950 MFMA allows bf16 operands.
c0cc81cdc0/llvm/include/llvm/IR/IntrinsicsAMDGPU.td (L3434)
When running `amdgpu-to-rocdl`, the current logic always converts bf16
to i16, which fails to compile for newer bf16 MFMAs such as
`v_mfma_f32_16x16x32bf16`.
The backend expects bf16 operands for those newer MFMAs. This patch
fixes that.
CC: @krzysz00 @dhernandez0 @giuseros @antiagainst @kuhar
This patch updates the following ops to use `result` (instead of `res`)
as the name for their result argument:
* `vector.scalable.insert`
* `vector.scalable.extract`
* `vector.insert_strided_slice`
This change ensures naming consistency with other ops in the `vector`
dialect. It addresses part of:
* https://github.com/llvm/llvm-project/issues/131602
The Arith dialect includes patterns that canonicalize a sequence of:
- trunci(shrui(mul(sext(x), sext(y)), c)) -> mulsi_extended(x, y)
- trunci(shrui(mul(zext(x), zext(y)), c)) -> mului_extended(x, y)
These patterns return the high word of an extended multiplication and are
only valid when the shift amount equals the bit width of the original
operands. This check was missing, leading to incorrect canonicalizations
when the shift amount was less than the bit width.
For example, the following code:
```
%x = arith.extui %a: i32 to i33
%y = arith.extui %b: i32 to i33
%m = arith.muli %x, %y: i33
%c1 = arith.constant 1: i33
%sh = arith.shrui %m, %c1 : i33
%hi = arith.trunci %sh: i33 to i32
```
would incorrectly be canonicalized to:
```
_, %hi = arith.mului_extended %a, %b : i32
```
This commit removes the faulty canonicalizations, since they are not
believed to be generally beneficial (cf. the discussion of the alternative
https://github.com/llvm/llvm-project/pull/144787, which fixes the
canonicalizations instead).
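For comparison, a sketch of the only shape where the rewrite is sound, i.e. when the operands are extended to twice their width and the shift amount equals the original bit width:
```
%x = arith.extui %a : i32 to i64
%y = arith.extui %b : i32 to i64
%m = arith.muli %x, %y : i64
%c32 = arith.constant 32 : i64
%sh = arith.shrui %m, %c32 : i64
%hi = arith.trunci %sh : i64 to i32
// may be rewritten as: _, %hi = arith.mului_extended %a, %b : i32
```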
Before this PR, users had to pass the "old" block argument when
replacing the uses of a block argument in a newly converted block. Users
can now pass the actual block argument that should be replaced.
Note for LLVM integration: Make sure to pass the current block argument
instead of the old one.
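A rough sketch of the affected call-site pattern (simplified; the helper and variable names here are placeholders):
```
#include "mlir/Transforms/DialectConversion.h"

using namespace mlir;

static void convertEntryBlock(ConversionPatternRewriter &rewriter,
                              Region &region,
                              TypeConverter::SignatureConversion &conv,
                              Value replacement) {
  Block *newBlock = rewriter.applySignatureConversion(&region.front(), conv);
  // Pass the argument of the newly converted block; previously the old
  // block's argument had to be passed instead.
  rewriter.replaceUsesOfBlockArgument(newBlock->getArgument(0), replacement);
}
```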
This patch makes the following changes:
- Add a `ValueRange typeDynDims` argument to
`linalg::makeComposedPadHighOp`, making it possible to pad a tensor with
dynamic dimensions using `tensor::createPadHighOp`.
- Add a `DenseMap<std::pair<unsigned, unsigned>, OpFoldResult>
sizeToPadTo;` option to `LinalgPaddingOptions`. This option sets the size
to use when padding a given dimension of a given operand, making it
possible to pad operands even when they do not have a constant upper
bounding box (see the sketch after this list). If no value is provided,
the constant upper bound is used by default.
- Add a `use_prescribed_tensor_shapes` option to
`transform.structured.pad`. If set to true, `tensor.dim` is used to
compute the size of the padded dimension instead of the constant upper
bound.
- This patch also changes how `linalg::rewriteAsPaddedOp` computes the
padded shape, using the newly added options in `LinalgPaddingOptions`.
- Finally it adds tests verifying the behavior.
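A rough sketch of how the new `sizeToPadTo` option might be populated programmatically (the key order, i.e. which element is the operand index and which is the dimension, is an assumption here, not the documented API):
```
#include "mlir/Dialect/Linalg/Transforms/Transforms.h"

using namespace mlir;

// `operandIdx`/`dim` identify which operand dimension to pad;
// `prescribedSize` is an OpFoldResult (e.g. the result of a tensor.dim op)
// used instead of the constant upper bound.
static void setPrescribedPadSize(linalg::LinalgPaddingOptions &options,
                                 unsigned operandIdx, unsigned dim,
                                 OpFoldResult prescribedSize) {
  options.sizeToPadTo[{operandIdx, dim}] = prescribedSize;
}
```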
From the description of the gpu.shuffle operation, shuffle up/down rotates
values in the subgroup because it applies a modulo to the shifted value to
calculate the result lane ID. This is inconsistent with the definition of
SPIR-V shuffle up/down and with NVVM's data-movement definitions within a
subgroup.
In NVVM, it says
"If the computed source lane index j is in range, the returned i32 value
will be the value of %a from lane j; otherwise, it will be the value
of %a from the current thread."
It will keep the original value if the result lane ID is out of range.
In SPIR-V OpGroupNonUniformShuffleUp and OpGroupNonUniformShuffleDown,
it says
"The resulting value is undefined if Delta is greater than the current
invocation’s id within the scope or if the identified invocation is not
in scope restricted tangle."
It's an undefined value if the result lane ID is out of range.
In short, neither specification involves circular movement for shuffle
up/down. This patch removes the circular movement from gpu.shuffle
up/down and lowers gpu.shuffle up/down directly to SPIR-V
OpGroupNonUniformShuffleUp and OpGroupNonUniformShuffleDown.
References:
https://docs.nvidia.com/cuda/archive/12.2.1/nvvm-ir-spec/index.html#data-movement
https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpGroupNonUniformShuffleUp
https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpGroupNonUniformShuffleDown
This PR introduces the `vector.to_elements` op, which decomposes a
vector into its scalar elements. This operation is symmetrical to the
existing `vector.from_elements`.
Examples:
```
// Decompose a 0-D vector.
%0 = vector.to_elements %v0 : vector<f32>
// %0 = %v0[0]
// Decompose a 1-D vector.
%0:2 = vector.to_elements %v1 : vector<2xf32>
// %0#0 = %v1[0]
// %0#1 = %v1[1]
// Decompose a 2-D vector.
%0:6 = vector.to_elements %v2 : vector<2x3xf32>
// %0#0 = %v2[0, 0]
// %0#1 = %v2[0, 1]
// %0#2 = %v2[0, 2]
// %0#3 = %v2[1, 0]
// %0#4 = %v2[1, 1]
// %0#5 = %v2[1, 2]
```
This op is aimed at reducing code size when modeling "structured" vector
extractions and simplifying canonicalizations of large sequences of
`vector.extract` and `vector.insert` ops into `vector.shuffle` and other
sophisticated ops that can re-arrange vector elements.
### Description
This patch improves the folding efficiency of `vector.insert` and
`vector.extract` operations by not returning early after successfully
converting dynamic indices to static indices.
This PR also renames the test pass `TestConstantFold` to
`TestSingleFold` and adds comprehensive documentation explaining the
single-pass folding behavior.
### Motivation
Since the `OpBuilder::createOrFold` function only calls `fold` **once**,
the current `fold` methods of `vector.insert` and `vector.extract` may
leave the op in a state that can be folded further. For example,
consider the following un-folded IR:
```
%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%c0 = arith.constant 0 : index
%e2 = vector.extract %v1[%c0] : f32 from vector<128xf32>
```
If we use `createOrFold` to create the `vector.extract` op, then the
result will be:
```
%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%e2 = vector.extract %v1[0] : f32 from vector<128xf32>
```
But this is not the optimal result: `createOrFold` should have returned
`%e1`.
The reason is that `fold` returns immediately after
`extractInsertFoldConstantOp` succeeds, causing the subsequent folding
logic to be skipped.
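A schematic sketch of the structural change to the `fold` method (the helper names below are placeholders, not the exact upstream functions):
```
OpFoldResult ExtractOp::fold(FoldAdaptor adaptor) {
  // Fold dynamic indices (e.g. arith.constant operands) into static ones.
  // This modifies the op in place.
  bool foldedIndices = succeeded(foldDynamicIndicesToStatic(*this, adaptor));

  // Previously the method returned right here when the indices were folded,
  // so the folders below never ran within the same createOrFold call.
  if (OpFoldResult res = foldExtractFromInsert(*this))
    return res;

  // Report the in-place index update only if no stronger fold applied.
  if (foldedIndices)
    return getResult();
  return {};
}
```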
---------
Co-authored-by: Yang Bai <yangb@nvidia.com>
Following the addition of TensorLike and BufferLike type interfaces (see
00eaff3e9c), introduce minimal changes
required to bufferize a custom tensor operation into a custom buffer
operation.
To achieve this, new interface methods are added to the TensorLike type
interface that abstract away the differences between the existing
(tensor -> memref) and custom conversions.
The scope of the changes is intentionally limited (for example,
BufferizableOpInterface is untouched) in order to first understand the
basics and reach consensus design-wise.
---
Notable changes:
* mlir::bufferization::getBufferType() returns BufferLikeType (instead
of BaseMemRefType)
* ToTensorOp / ToBufferOp operate on TensorLikeType / BufferLikeType.
Operation argument "memref" renamed to "buffer"
* ToTensorOp's tensor type inferring builder is dropped (users now need
to provide the tensor type explicitly)
`unresolvedMaterializations` is a mapping from
`UnrealizedConversionCastOp` to `UnresolvedMaterializationRewrite`. This
mapping is needed to find the correct type converter for an unresolved
materialization.
With this commit, `unresolvedMaterializations` is updated immediately
when an op is being erased. This also cleans up the code base a bit:
`SingleEraseRewriter` is now used only during the "cleanup" phase and no
longer needed as a field of `ConversionRewriterImpl`.
This commit is in preparation for the One-Shot Dialect Conversion
refactoring: `allowPatternRollback = false` will in the future trigger
immediate materialization of all IR changes.
Generally, bufferization should be able to create a memref from a tensor
without needing to know more than just an mlir::Type. Thus, change
BufferizationOptions::UnknownTypeConverterFn to accept just a type
(mlir::TensorType for now) instead of an mlir::Value. Additionally, apply
the same rationale to the getMemRefType() helper function.
Both changes are prerequisites to enable custom types support in
one-shot bufferization.
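A sketch of the shape of the signature change (simplified; the exact parameter list and return type in the upstream header may differ):
```
#include "mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h"
#include <functional>

using namespace mlir;
using namespace mlir::bufferization;

// Before (roughly): the callback received a Value and had to reach through
// it to get at the tensor type.
//   std::function<BaseMemRefType(Value value, Attribute memorySpace,
//                                const BufferizationOptions &options)>
// After (roughly): the callback is handed the tensor type directly, so it
// also works for types that are not tied to an SSA value.
using MyUnknownTypeConverterFn = std::function<BaseMemRefType(
    TensorType type, Attribute memorySpace,
    const BufferizationOptions &options)>;
```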
The expansion of `memref.atomic_rmw` into a `memref.generic_atomic_rmw`
for floating-point min/max operations is no longer necessary as those
are now supported by the LLVM dialect and LLVM IR.
Furthermore, combining this expansion with direct lowering of
`generic_atomic_rmw` could lead to invalid LLVM dialect IR with
`cmpxchg` operating on floating-point values, which it does not support.
The traversal order in sharding propagation was hard-coded. This PR
provides options to the pass to select a suitable order:
- forward-only
- backward-only
- forward-backward
- backward-forward
Default is the previous behavior (backward-forward).
By defining `CExpressionInterface`, we move the side-effect detection
logic from `emitc.expression` into the individual operations
implementing the interface, allowing operations to gradually tune their
side-effect reporting.
It also allows checking each operation individually for side effects.
Test cleanup:
1) Separate layout.mlir from ops.mlir for layout-related tests.
2) Remove lane layout for ops working at work-item scope.
3) Remove redundant tests in create_tdesc/update_tdesc/prefetch.
4) Remove "test_" from all test function names.
The new interfaces provide getters and setters for the weight
information about the branches of BranchOpInterface and
RegionBranchOpInterface operations.
These interfaces follow the same design as the LLVM dialect's
BranchWeightOpInterface.
The plan is to produce this information in Flang, e.g. mark code that is
most probably "cold" as such and allow LLVM to order basic blocks
accordingly. An example of such code is the copy loops generated for
array repacking: we can mark them as "cold" on the assumption that the
copy will not happen dynamically. If the copy does happen, its overhead
is probably high enough that we need not care about the small overhead
of jumping to the "cold" code and fetching it.
Add support for load/store with chunk_size, which requires special
consideration for the operand blocking since offsets and masks are
n-D while tensors are (n+1)-D. Supported operations include create_tdesc,
update_tdesc, load, store, and prefetch.
---------
Co-authored-by: Adam Siemieniuk <adam.siemieniuk@intel.com>
At the MLIR level, unsigned integers and signless integers are different
types; when looking up the two types in the type definition cache, they
do not match.
Hence, translating a module that contains both unsigned and signless
integers produces the same type declaration twice (something like
OpTypeInt 32 0), which is not permitted in SPIR-V, and such generated
modules fail validation.
This patch solves the problem by mapping unsigned integer types to
signless integer types before looking them up in the type definition
cache.
---------
Signed-off-by: Davide Grohmann <davide.grohmann@arm.com>
Whereas backward-slice matching only supports limiting traversal by
specifying the desired depth level, this pull request introduces support
for limiting traversal with a nested matcher (and adds forward-slice
matching as well). It also adds support for variadic operators, including
`anyOf` and `allOf`. Rather than simply stopping traversal when an
operation named foo is encountered, one can now define a matcher that
specifies different exit conditions. The variadic support implementation
within mlir-query is very similar to clang-query's.
We have been confusingly, and arguably incorrectly, lowering `m**imumf`
atomic RMW operations in the MemRef dialect to `fm**` atomic RMW
operations in the LLVM dialect, which have different NaN-propagation
semantics: `m**imumf` propagates NaNs from either operand, whereas
`fm**`, which lowers to the `fm**num` intrinsic, returns the non-NaN
operand. This also contradicts the lowering of the `arith.m**imumf` and
`arith.m**numf` operations.
Change the lowering to match the terminology in arith.
Add tests for these lowerings.
Keep a debug message in case of surprising behavior downstream (the code
may be producing more NaNs now).
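To make the NaN-propagation difference concrete, a schematic max-case example (f32 element type, 1-D memref; names are arbitrary):
```
// memref.atomic_rmw with maximumf semantics: if either the stored value or
// %val is NaN, the stored result must be NaN (matching arith.maximumf).
%old = memref.atomic_rmw maximumf %val, %buf[%i] : (f32, memref<8xf32>) -> f32
// Lowering this to an LLVM-dialect fmax atomic (maxnum-style semantics,
// which returns the non-NaN operand) silently changes the NaN behavior,
// hence the change to use the maximum-style lowering instead.
```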