clang-p2996

Author	SHA1	Message	Date
xiaoleis-nv	d03f35f9b6	[MLIR][NVVM] Fix the datatype error for nvvm.mma.sync when the operand is bf16 (#122664 ) The PR fixes the datatype error for `nvvm.mma.sync` when the operand is `bf16`. This operation originally requires the A/B type to be `f16x2` for the `bf16` MMA. However, it violates the NVVM intrinsic [[here](`372044ee09/llvm/include/llvm/IR/IntrinsicsNVVM.td (L119)`)], where the A/B operand type should be `i32`. This is a bug, and there are no tests in MLIR that cover this datatype. ``` // mma bf16 -> s32 @ m16n8k16/m16n8k8 !eq(gft,"m16n8k16:a:bf16") : !listsplat(llvm_i32_ty, 4), !eq(gft,"m16n8k16:b:bf16") : !listsplat(llvm_i32_ty, 2), !eq(gft,"m16n8k8:a:bf16") : !listsplat(llvm_i32_ty, 2), !eq(gft,"m16n8k8:b:bf16") : [llvm_i32_ty], ``` This PR addresses this bug and adds tests to guarantee correctness. Co-authored-by: Xiaolei Shi <xiaoleis@nvidia.com>	2025-01-13 15:03:05 +05:30
Kazu Hirata	4f4e2abb1a	[mlir] Migrate away from PointerUnion::{is,get} (NFC) (#122591 ) Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.	2025-01-11 13:16:43 -08:00
William Moses	38fcf62483	[MLIR] Import LLVM add flag to disable loadalldialects (#122574 ) Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>	2025-01-11 09:11:22 -05:00
Kazu Hirata	35e89897a4	[Dialect] Migrate away from PointerUnion::{is,get} (NFC) (#122568 ) Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T>	2025-01-11 02:06:33 -08:00
Matthias Springer	5d26a6d759	[mlir][Interfaces] `ViewLikeOpInterface`: Remove parser/printer overloads (#122436 ) #115808 adds additional `custom<>` parser/printer variants. The overall list of overloads/variants is getting larger. This commit removes overloads that are not needed, to keep the parser/printer simple.	2025-01-10 17:18:53 +01:00
Guray Ozen	66e41a1a20	[MLIR][NVVM] Declare InferIntRangeInterface for RangeableRegisterOp (#122263 )	2025-01-10 10:32:25 +01:00
Krzysztof Drewniak	0aa831e0ed	[mlir][GPU] Implement ValueBoundsOpInterface for GPU ID operations (#122190 ) The GPU ID operations already implement InferIntRangeInterface, which gives constant lower and upper bounds on those IDs when appropriate metadata is present on the operations or in the surrounding context. This commit uses that existing code to implement the ValueBoundsOpInterface, which is used when analyzing affine operations (unlike the integer range interface, which is used for arithmetic optimization). It also implements the interface for gpu.launch, where we can use it to express the constraint that block/grid sizes are equal to their value from outside the launch op and that the corresponding IDs are bounded above by that size. As a consequence, the test pass for this inference is updated to work on a FunctionOpInterface and not a func.func, creating minor churn in other tests.	2025-01-09 11:42:22 -08:00
Razvan Lupusoru	cbcb7ad32e	[mlir][acc] Introduce MappableType interface (#122146 ) OpenACC data clause operations previously required that the variable operand implemented PointerLikeType interface. This was a reasonable constraint because the dialects currently mixed with `acc` do use pointers to represent variables. However, this forces the "pointer" abstraction to be exposed too early and some cases are not cleanly representable through this approach (more specifically FIR's `fir.box` abstraction). Thus, relax this by allowing a variable to be a type which implements either `PointerLikeType` interface or `MappableType` interface.	2025-01-09 10:27:37 -08:00
Andrea Faulds	7724be9728	[mlir][spirv] Do SPIR-V serialization in -test-vulkan-runner-pipeline (#121494 ) This commit is a further incremental step toward moving the whole mlir-vulkan-runner MLIR pass pipeline into mlir-opt (see #73457). The previous step was b225b3adf7b78387c9fcb97a3ff0e0a1e26eafe2, which moved all device passes prior to SPIR-V serialization into a new mlir-opt test pass, `-test-vulkan-runner-pipeline`. This commit changes how SPIR-V serialization is accomplished for Vulkan runner tests. Until now, this was done by the Vulkan-specific ConvertGpuLaunchFuncToVulkanLaunchFunc pass. With this commit, this responsibility is removed from that pass, and is instead done with the existing generic GpuModuleToBinaryPass. In addition, the SPIR-V serialization step is no longer done inside mlir-vulkan-runner, but rather inside mlir-opt (in the `-test-vulkan-runner-pipeline` pass). Both of these changes represent a greater alignment between mlir-vulkan-runner and the other GPU integration tests. Notably, the IR shapes produced by the mlir-opt pipelines for the Vulkan and SYCL runners are now much more similar, with both using a gpu.binary op for the serialized SPIR-V kernel. In order to enable this, this commit includes these supporting changes: - ConvertToSPIRVPass is enhanced to support producing the IR shape where a spirv.module is nested inside a gpu.module, since this is what GpuModuleToBinaryPass expects. - ConvertGPULaunchFuncToVulkanLaunchFunc is changed to remove its SPIR-V serialization functionality, and instead now extracts the SPIR-V from a gpu.binary operation (as produced by ConvertToSPIRVPass). - `-test-vulkan-runner-pipeline` now attaches SPIR-V target information required by GpuModuleToBinaryPass. 
- The WebGPU pass option, which had been removed from mlir-vulkan-runner in the previous commit in this series, is restored as an option to `-test-vulkan-runner-pipeline` instead, so that the WebGPU pass continues being inserted into the pipeline just before SPIR-V serialization.	2025-01-09 17:58:51 +01:00
Arda Unal	b3ce6dc723	[mlir][licm] Make scf.if recursively speculatable (#122031 ) This change: - makes scf.if recursively speculatable like affine.if is. - also introduces related LICM tests for both scf.if and affine.if	2025-01-08 09:54:18 -08:00
Matthias Springer	4751f47c7a	[mlir][Transforms] Dialect conversion: Turn LLVM_DEPRECATED into comments (#122073 ) Some functions of the deprecated 1:N dialect conversion were marked as `LLVM_DEPRECATED`. This caused compilation warnings because there are still test cases of the 1:N dialect conversion framework. (These test cases will be deleted at the same time when the 1:N driver is deleted.)	2025-01-08 17:10:06 +01:00
William Moses	1c067a513c	[MLIR] Enable import of non self referential alias scopes (#121987 ) Fixes #121965. --------- Co-authored-by: Christian Ulmann <christianulmann@gmail.com> Co-authored-by: Alex Zinenko <git@ozinenko.com>	2025-01-08 13:40:05 +01:00
Jack Frankland	360a03c980	[mlir][tosa] Add acc_type to Tosa-v1.0 Conv Ops (#121466 ) Tosa v1.0 adds accumulator type attributes to the various convolution operations defined in the spec. Update the dialect and any lit tests to include these attributes. Signed-off-by: Tai Ly <tai.ly@arm.com> Co-authored-by: Tai Ly <tai.ly@arm.com>	2025-01-08 12:12:26 +02:00
Longsheng Mou	c1d01b2fc2	[mlir][tosa] Add missing verifier for `tosa.pad` (#120934 ) This PR adds a missing verifier for `tosa.pad`, ensuring that the padding shape matches [2*rank(shape1)] according to V1.0.0 Specification. Fixes #119840.	2025-01-08 10:45:59 +02:00
Guray Ozen	f50f9698ad	[MLIR][GPU] Fix gpu.printf (#121940 )	2025-01-08 08:25:57 +01:00
Michael Jungmair	1fb98b5a7e	[mlir][Transforms] Make LocationSnapshotPass respect OpPrintingFlags (#119373 ) The current implementation of LocationSnapshotPass takes an OpPrintingFlags argument and stores it as member, but does not use it for printing. Properly implement the printing flags, also supporting command line args. --------- Co-authored-by: Mehdi Amini <joker.eph@gmail.com>	2025-01-07 12:14:35 +01:00
William Moses	5656cbca52	[MLIR][CAPI] export LLVMFunctionType param getter and setters (#121888 )	2025-01-07 02:39:44 -05:00
Ian Wood	fe42e63d7b	[mlir][NFC] Refactor `eraseState` to take constant time (#121670 ) Refactors `analysisStates` to use two nested maps. This prevents `eraseState` from having to scan through every analysis state, which can be costly when there are many analysis states and/or `eraseState` is called frequently. Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>	2025-01-06 10:05:14 -08:00
Maksim Levental	0c1cf75300	[mlir] DCE `RegisteredOperationName::parseAssembly` decl (#121730 )	2025-01-06 07:12:59 -05:00
Maksim Levental	9ce8f4b70b	[mlir] DCE `friend Dialect::registerDialect` (#121728 )	2025-01-06 07:12:07 -05:00
Matthias Springer	599c739905	[mlir][GPU] Add NVVM-specific `cf.assert` lowering (#120431 ) This commit adds an NVIDIA-specific lowering of `cf.assert` to `__assertfail`. Note: `getUniqueFormatGlobalName`, `getOrCreateFormatStringConstant` and `getOrDefineFunction` are moved to `GPUOpsLowering.h`, so that they can be reused.	2025-01-06 12:00:11 +01:00
William Moses	b5f21671ef	MLIR: Enable importing inlineasm calls (#121624 )	2025-01-05 11:02:49 -05:00
Matthias Springer	95c5c5d4ba	[mlir][Transforms][NFC] Use `DominanceInfo` to compute materialization insertion point (#120746 ) In the dialect conversion driver, use `DominanceInfo` to compute a suitable insertion point for N:1 source materializations.	2025-01-04 09:23:15 +01:00
Matthias Springer	2d424765f4	[mlir][IR][NFC] `DominanceInfo`: Share same impl for block/op dominance (#115587 ) The `properlyDominates` implementations for blocks and ops are very similar. This commit replaces them with a single implementation that operates on block iterators. That implementation can be used to implement both `properlyDominates` variants. Before: ```c++ template <bool IsPostDom> bool DominanceInfoBase<IsPostDom>::properlyDominatesImpl(Block a, Block b) const; template <bool IsPostDom> bool DominanceInfoBase<IsPostDom>::properlyDominatesImpl( Operation a, Operation b, bool enclosingOpOk) const; ``` After: ```c++ template <bool IsPostDom> bool DominanceInfoBase<IsPostDom>::properlyDominatesImpl( Block aBlock, Block::iterator aIt, Block bBlock, Block::iterator bIt, bool enclosingOk) const; ``` Note: A subsequent commit will add a new public `properlyDominates` overload that accepts block iterators. That functionality can then be used to find a valid insertion point at which a range of values is defined (by utilizing post dominance).	2025-01-04 09:12:03 +01:00
Krzysztof Drewniak	9f5cefebb4	[mlir][Affine] Generalize the linearize(delinearize()) simplifications (#117637 ) The existing canonicalization patterns would only cancel out cases where the entire result list of an affine.delinearize_index was passed to an affine.linearize_index and the basis elements matched exactly (except possibly for the outer bounds). This was correct, but limited, and left open many cases where a delinearize_index would take a series of divisions and modulos only for a subsequent linearize_index to use additions and multiplications to undo all that work. This sort of simplification is reasonably easy to observe at the level of splitting and merging indexes, but difficult to perform once the underlying arithmetic operations have been created. Therefore, this commit generalizes the existing simplification logic. Now, any run of two or more delinearize_index results that appears within the argument list of a linearize_index operation with the same basis (or where they're both at the outermost position and so can be unbounded, or when `linearize_index disjoint` implies a bound not present on the `delinearize_index`) will be reduced to one single delinearize_index output, whose basis element (that is, size or length) is equal to the product of the sizes that were simplified away. That is, we can now simplify %0:2 = affine.delinearize_index %n into (8, 8) : index, index %1 = affine.linearize_index [%x, %0#0, %0#1, %y] by (3, 8, 8, 5) : index to the simpler %1 = affine.linearize_index [%x, %n, %y] by (3, 64, 5) : index This new pattern also works with dynamically-sized basis values. While I'm here, I fixed a bunch of typos in existing tests, and added a new getPaddedBasis() method to make processing potentially-underspecified basis elements simpler in some cases.	2025-01-03 15:12:39 -06:00
Matthias Springer	3ace685105	[mlir][Transforms] Support 1:N mappings in `ConversionValueMapping` (#116524 ) This commit updates the internal `ConversionValueMapping` data structure in the dialect conversion driver to support 1:N replacements. This is the last major commit for adding 1:N support to the dialect conversion driver. Since #116470, the infrastructure already supports 1:N replacements. But the `ConversionValueMapping` still stored 1:1 value mappings. To that end, the driver inserted temporary argument materializations (converting N SSA values into 1 value). This is no longer the case. Argument materializations are now entirely gone. (They will be deleted from the type converter after some time, when we delete the old 1:N dialect conversion driver.) Note for LLVM integration: Replace all occurrences of `addArgumentMaterialization` (except for 1:N dialect conversion passes) with `addSourceMaterialization`. --------- Co-authored-by: Markus Böck <markus.boeck02@gmail.com>	2025-01-03 16:11:56 +01:00
hatoo	cbff02b101	[mlir][emitc] Fix invalid syntax in example of emitc.return (#121112 ) A return type of `emitc.func` must be specified with `->` instead of `:`. I've verified the syntax using `mlir-translate --mlir-to-cpp`.	2025-01-02 18:13:27 +01:00
josel-amd	d622b66a82	Re-introduce Type Conversion on EmitC (#121476 ) This PR reintroduces https://github.com/llvm/llvm-project/pull/118940 with a fix for the build issues on cd9caf3aeed55280537052227f08bb1b41154efd	2025-01-02 14:58:15 +01:00
Marius Brehler	8178e72188	[mlir][func] Fix return op example (#121470 ) Similar to #121112.	2025-01-02 14:02:07 +01:00
Matthias Gehre	df728cf1d7	Revert "[MLIR][SCFToEmitC] Convert types while converting from SCF to EmitC (#118940 )" This reverts commit `450c6b02d2`.	2025-01-02 11:55:35 +01:00
josel-amd	450c6b02d2	[MLIR][SCFToEmitC] Convert types while converting from SCF to EmitC (#118940 ) Switch from rewrite patterns to conversion patterns. This allows type conversions to be performed together with other parts of the IR. For example, this allows converting from index to emitc.size_t types.	2025-01-02 11:36:23 +01:00
Matthias Springer	80ecbaa3c0	[mlir][Transforms] Mark 1:N conversion driver as deprecated (#121102 ) The 1:N conversion driver will be removed soon. Note for LLVM integration: Please migrate your code base to the regular dialect conversion driver.	2024-12-31 13:11:19 +01:00
Maksim Levental	fb365ac86c	[mlir][linalg] DCE unimplemented extra decl (#121272 )	2024-12-30 13:51:11 -06:00
Amir Bishara	d9111f19d2	[mlir][bufferization]-Refactor findValueInReverseUseDefChain to accept opOperand (#121304 ) Edit the `findValueInReverseUseDefChain` method to accept `OpOperand` instead of the `Value` type. This change ensures that the populated `visitedOpOperands` argument is fully accurate and contains the opOperand the reverse chain started from.	2024-12-30 21:18:38 +02:00
Longsheng Mou	79af7bdd4e	[mlir][tosa] Add `AllElementTypesMatch` trait for `tosa.transpose` (#120964 ) This PR adds `AllElementTypesMatch` trait for `tosa.transpose` to ensure output tensor of same type as the input tensor. Fixes #119364.	2024-12-30 23:12:55 +08:00
Maksim Levental	60d20603e4	[mlir][xegpu] DCE decl in TD (#121249 )	2024-12-30 06:21:07 -05:00
Maksim Levental	8487d2460e	[mlir][shape] DCE unimplemented extra decl (#121275 )	2024-12-29 12:13:46 -05:00
Maksim Levental	f1bc3afb6c	[mlir][scf] DCE unimplemented decls in TDs (#121237 ) More dead code in headers...	2024-12-28 14:53:05 -06:00
Maksim Levental	4a6fcc17c6	[mlir][emitc] DCE unimplemented decls (#121253 )	2024-12-28 13:42:16 -05:00
Amir Bishara	7e749d4fb7	[mlir][bufferization]-Add ControlBuildSubsetExtractionFn to TensorEmptyElimination (#120851 ) This PR adds a `ControlBuildSubsetExtractionFn` to the tensor empty elimination util. It controls the building of the subset extraction of the `SubsetInsertionOpInterface`. This control function returns the subset extraction value that will replace the `emptyTensorOp` use consumed by a specific user (which the util expects to eliminate). The default control function preserves today's behavior without any additional changes.	2024-12-28 13:28:09 +02:00
Kunwar Grover	91bbebc7e1	[mlir][scf] Add getPartialResultTilePosition to PartialReductionOpInterface (#120465 ) This PR adds a new interface method to PartialReductionOpInterface which allows it to query the result tile position for the partial result. Previously, tiling the reduction dimension with SplitReductionOuterReduction when the result has transposed parallel dimensions would produce wrong results. Other fixes that were needed to make this PR work: - Instead of ad-hoc logic to decide where to place the new reduction dimensions in the partial result based on the iteration space, the reduction dimensions are always appended to the partial result tensor. - Remove usage of PartialReductionOpInterface in Mesh dialect. The implementation was trying to just get a neutral element, but ended up trying to use PartialReductionOpInterface for it, which is not right. It was also passing the wrong sizes to it.	2024-12-27 16:52:34 +00:00
Kunwar Grover	5ad4213ef4	[mlir][Linalg] Allow PartialReductionOpInterface ops in tile_reduction_using_for (#120118 ) The API used internally expects PartialReductionOpInterface. This patch allows any operation implementing this interface to use this transform op (instead of just LinalgOp).	2024-12-27 13:19:58 +00:00
Maksim Levental	6b53a9546c	[mlir][arith] DCE `getPredicateByName` (#121165 )	2024-12-26 17:38:18 -08:00
Oleksandr "Alex" Zinenko	776ac21c7f	[mlir] minor documentation fix in GPUTransformOps.td (#121157 ) - do not refer to handles as `PDLOperation`, this is an outdated and incorrect vision of what they are based on the type used in the early days; - use backticks around inline code.	2024-12-26 20:18:35 +01:00
srcarroll	8906b7be91	Enable custom alloc-like ops in `promoteBufferResultsToOutParams` (#120288 ) In `buffer-results-to-out-params`, when `hoist-static-allocs` option is enabled the pass was looking for `memref.alloc`s in order to attempt to avoid copies when it can. Which makes it not extensible to external ops that have allocation like properties. This patch simply changes `memref::AllocOp` to `AllocationOpInterface` in the check to enable for any allocation op. Moreover, for function call updates, we enable setting an allocation function callback in `BufferResultsToOutParamsOpts` to allow users to emit their own alloc-like op.	2024-12-26 11:32:51 -06:00
Thirumalai Shaktivel	cbe583b0bd	[Flang] Add translation support for MutexInOutSet and InOutSet (#120715 ) Implementation details: Mutexinoutset and Inoutset are recognized as flag=0x4 and 0x8 respectively; the flags are set in `kmp_depend_info` and passed as an argument to the `__kmpc_omp_task_with_deps` runtime call.	2024-12-26 15:02:09 +05:30
Krzysztof Drewniak	378e179337	[mlir][Properties] Shorten "Property" to "Prop" in most places (#120368 ) Since the property system isn't currently in heavy use, it's probably the right time to fix a choice I made when expanding ODS property support. Specifically, most of the property subclasses, like OptionalProperty or IntProperty, wrote out the word "Property" in full. The corresponding classes in the Attribute hierarchy use the short form "Attr" in those cases, as in OptionalAttr or DefaultValuedAttr. This commit changes all those uses of "Property" to "Prop" in order to prevent excessively verbose tablegen files that needlessly repeat the full name of a core concept that can be abbreviated. So, this commit renames all the FooProperty classes to FooProp, and keeps the existing names as aliases with a Deprecated<> on them to warn people. In addition, this commit updates the documentation around properties to mention the constraint support.	2024-12-23 15:57:34 -06:00
Hongren Zheng	a60050cf19	[mlir][dataflow] Allow re-run all analyses in DataFlowSolver (#120881 ) In downstream (check https://github.com/google/heir/pull/1228, especially [this commit](`fbf0b2733f`); also check https://github.com/google/heir/pull/1154) we often need to re-run the analysis during the transformation pass, as the IR gets changed based on the analysis result and the analysis is continuously invalidated. There are solutions to it like `getOrCreateState` for a newly created `Value` (`AnchorT`), but the caveat is that the new state does not propagate! This is quite unexpected, as a user of the analysis would expect it to propagate. Downstream, we used to use `solver->propagateIfChanged`, but that turned out not to work; see the detailed writeup in https://github.com/google/heir/issues/1153. Just calling `initializeAndRun` repeatedly also does not solve the problem, as `analysisStates` is not cleared and the monotonicity of `AnalysisState` will make the analysis invalid because `join` will not work as expected (the first join is no longer `join(uninitialized, init value)`; instead it becomes `join(higher value, init value)`). To correctly re-run the analysis, either a new `DataFlowSolver` is created, or we can just clear the `analysisStates`.	2024-12-23 12:33:23 -08:00
Matthias Springer	df31fd8a36	[mlir] Fix use-after-return in #117513 (#120968 ) Fix a use-after-return in #117513. Free-standing lambdas should not be defined inside of the `LLVMTypeConverter` constructor because they go out of scope.	2024-12-23 15:13:42 +01:00
Srinivasa Ravi	5f98dd5dd5	[MLIR][NVVM] Update Wgmma.fence Ops to use intrinsics (#120956 ) This PR updates the WgmmaFenceAlignedOp, WgmmaGroupSyncAlignedOp, and WgmmaWaitGroupSyncOp Ops in the NVVM Dialect to lower to the corresponding intrinsics instead of inline-ptx. The existing test under Conversion/NVVMToLLVM is updated to check for the new patterns and separate tests are added under Target/LLVMIR to verify the lowered intrinsics.	2024-12-23 18:56:48 +05:30
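
Note on 9f5cefebb4 (linearize(delinearize()) simplification): the arithmetic identity behind that commit can be checked with a small sketch. This is not MLIR code; `delinearize` and `linearize` below are hypothetical Python stand-ins for the two affine ops, and the sketch assumes every index is in bounds for its basis element.

```python
from itertools import product


def delinearize(index, basis):
    """Split `index` into one digit per basis element (innermost last).

    Hypothetical stand-in for affine.delinearize_index; assumes
    0 <= index < product of the basis sizes.
    """
    digits = []
    for size in reversed(basis):
        digits.append(index % size)
        index //= size
    return list(reversed(digits))


def linearize(indices, basis):
    """Merge digits back into one index (stand-in for affine.linearize_index)."""
    acc = 0
    for idx, size in zip(indices, basis):
        acc = acc * size + idx
    return acc


def check_simplification():
    """Compare the original and simplified forms from the commit message."""
    for x, n, y in product(range(3), range(64), range(5)):
        d0, d1 = delinearize(n, (8, 8))
        before = linearize([x, d0, d1, y], (3, 8, 8, 5))
        after = linearize([x, n, y], (3, 64, 5))
        if before != after:
            return False
    return True
```

The two adjacent basis elements (8, 8) collapse into their product 64, which is why the delinearized pair can be replaced by `%n` directly.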

11030 Commits