clang-p2996

Author	SHA1	Message	Date
Andrzej Warzyński	5586541d22	[mlir][tensor] Make useful Tensor utilities public (#126802 ) 1. Extract the main logic from `foldTensorCastPrecondition` into a dedicated helper hook: `hasFoldableTensorCastOperand`. This allows for reusing the corresponding checks. 2. Rename `getNewOperands` to `getUpdatedOperandsAfterCastOpFolding` for better clarity and documentation of its functionality. 3. These updated hooks will be reused in: * https://github.com/llvm/llvm-project/pull/123902. This PR makes them public. Note: Moving these hooks to `Tensor/Utils` is not feasible because `MLIRTensorUtils` depends on `MLIRTensorDialect` (CMake targets). If these hooks were moved to `Utils`, it would create a dependency of `MLIRTensorDialect` on `MLIRTensorUtils`, leading to a circular dependency.	2025-02-12 23:12:14 +00:00
Razvan Lupusoru	ceb00c0702	[mlir][acc] Clean up TypedValue builders (#126968 ) When MappableType was introduced alongside PointerLikeType, the data clause operation builders were duplicated to accept a `TypedValue` of one of the two type options. However, the underlying builder takes a `Value` and this difference is not relevant for it. The only difference is that `varType` is set differently depending on the type. Having two duplicated builders can lead to clunky building since a `Value` must always be cast to one of the two options. Thus, simply clean this up - the verifier already checks that it is a type that implements one of the two interfaces.	2025-02-12 14:13:45 -08:00
Nikhil Kalra	65ed4fa57e	[mlir] Python: Parse ModuleOp from file path (#126572 ) For extremely large models, it may be inefficient to load the model into memory in Python prior to passing it to the MLIR C APIs for deserialization. This change adds an API to parse a ModuleOp directly from a file path. Re-lands `4e14b8afb4`.	2025-02-12 14:02:41 -08:00
Frank Schlimbach	0fd50ec9a3	[MLIR][mesh] Mesh fixes (#124724 ) A collection of fixes to the mesh dialect - allow constants in sharding propagation/spmdization - fixes to tensor replication (e.g. 0d tensors) - improved canonicalization - sharding propagation incorrectly generated too many ShardOps New operation `mesh.GetShardOp` enables exchanging sharding information (like on function boundaries)	2025-02-12 12:44:48 +01:00
Adam Siemieniuk	0b9b014be7	[mlir][dlti] Query by strings (#126716 ) Adds DLTI utility to query using strings directly as keys.	2025-02-12 09:13:43 +01:00
Hongtao Yu	4a63ff4330	Revert "[mlir] Enable LICM for ops with only read side effects in scf.for" (#126840 ) Reverts llvm/llvm-project#120302	2025-02-11 20:07:21 -08:00
Arda Unal	36d8e7056e	[mlir] Enable LICM for ops with only read side effects in scf.for (#120302 ) Enable ops with only read side effects in scf.for to be hoisted with a scf.if guard that checks against the trip count. This patch takes a step towards a less conservative LICM in MLIR as discussed in the following Discourse thread: [Speculative LICM?](https://discourse.llvm.org/t/speculative-licm/80977) This patch in particular does the following: 1. Relaxes the original constraint for hoisting that only hoists ops without any side effects. This patch also allows the ops with only read side effects to be hoisted into an scf.if guard only if every op in the loop or its nested regions is side-effect free or has only read side effects. This scf.if guard wraps the original scf.for and checks for trip_count > 0. 2. To support this, two new interface methods are added to LoopLikeInterface: _wrapInTripCountCheck_ and _unwrapTripCountCheck_. Implementation starts with wrapping the scf.for loop into scf.if guard using _wrapInTripCountCheck_ and if there is no op hoisted into this guard after we are done processing the worklist, it unwraps the guard by calling _unwrapTripCountCheck_.	2025-02-11 15:48:57 -08:00
Shoaib Meenai	376f65d865	Revert "[mlir] Silence -Wdangling-assignment-gsl in OperationSupport.h (#126140 )" This reverts commit `f6556afce0`. Buildbots are broken.	2025-02-11 15:05:12 -08:00
Shoaib Meenai	f6556afce0	[mlir] Silence -Wdangling-assignment-gsl in OperationSupport.h (#126140 ) This warning is causing lots of build spam when I use a recent Clang as my host compiler. It's a potential false positive, so silence it until https://github.com/llvm/llvm-project/issues/126600 is resolved. Fix variable casing while I'm here.	2025-02-11 14:05:01 -08:00
Andrzej Warzyński	fcbf04e40e	[mlir][vector][nfc] Add clarification on "dim-1" bcast (#125425 ) Adds a small note to VectorOps.td on what "dim-1" broadcast is. Also updates comments to consistently use quotes, i.e. * "dim-1" broadcasting instead of dim-1 broadcasting. This way it is clear that we are referring to "stretching" one of the trailing dims rather than e.g. broadcasting a dim at idx 1.	2025-02-11 21:37:23 +00:00
Tai Ly	20ae283d08	[mlir][tosa] Change the shift of mul to be required (#125297 ) Change the shift operand for the mul operator to be a required operand. Also defined shift to be Tosa_ScalarInt8Tensor, which requires that it is a rank-1 tensor whose shape is [1] (i.e., a tensor containing a single element). Signed-off-by: Tai Ly <tai.ly@arm.com>	2025-02-11 11:02:44 -08:00
Hsiangkai Wang	ab93bd6959	[mlir][tosa] Change ClampOp's min/max attributes (#125197 ) This changes Tosa ClampOp attributes to min_val and max_val which are either integer attributes or float attributes, and adds verify checks that these attribute element types must match element types of input and output Co-authored-by: Tai Ly <tai.ly@arm.com>	2025-02-11 08:02:52 -08:00
Adam Siemieniuk	67f59a642f	[mlir][xegpu] Improve scatter attribute definition (#126540 ) Refactors XeGPU scatter attribute introducing following: - improved docs formatting - default initialized parameters - invariant checks in attribute verifier - removal of additional parsing error The attribute's getters now provide default values simplifying their usage and scattered tensor descriptor handling. Related descriptor verifier is updated to avoid check duplication.	2025-02-11 10:05:23 +01:00
jeanPerier	99e1308c41	[mlir][LLVM] handle argument and result attributes in llvm.call and llvm.invoke (#123177 ) Update llvm.call/llvm.invoke pretty printer/parser and the llvm ir import/export to deal with the argument and result attributes. This patch is made on top of PR 123176 that modified the CallOpInterface and added the argument and result attributes to llvm.call and llvm.invoke without doing anything with them. RFC: https://discourse.llvm.org/t/mlir-rfc-adding-argument-and-result-attributes-to-llvm-call/84107	2025-02-11 09:39:51 +01:00
Uday Bondhugula	001ba42fe0	[MLIR][Affine] Make affine fusion MDG API const correct (#125994 ) Make affine fusion MDG API const correct. NFC changes otherwise.	2025-02-11 05:28:15 +05:30
Thomas Preud'homme	d7fd2a2a3b	[MLIR] Fix LLVMIRTransforms build failure (#125485 ) lib/libMLIRLLVMIRTransforms.a fails to build from scratch with the following error: In file included from llvm/include/llvm/Frontend/OpenMP/OMPConstants.h:19, from llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h:19, from mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h:26, from mlir/include/mlir/Dialect/LLVMIR/NVVMDialect.h:24, from mlir/lib/Dialect/LLVMIR/Transforms/InlinerInterfaceImpl.cpp:17: llvm/include/llvm/Frontend/OpenMP/OMP.h:16:10: fatal error: llvm/Frontend/OpenMP/OMP.h.inc: No such file or directory Use a forward declaration for OpenMPIRBuilder in ModuleTranslation.h to avoid pulling in an OpenMP frontend header that requires generated headers.	2025-02-10 19:37:58 +00:00
Benoit Jacob	ced23aa540	[MLIR][Math] Add fine-grained populate-patterns functions for math function rewrites. (#126103 ) The existing `mlir::populateMathPolynomialApproximationPatterns` is coarse-grained and inflexible: - It populates 2 distinct classes of patterns: (1) polynomial approximations, (2) expansions of operands to f32. - It does not offer knobs to select which math functions to apply the rewrites to. This PR adds finer-grained populate-patterns functions, which take a predicate lambda allowing the caller to control which math functions to apply rewrites to. Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>	2025-02-10 09:52:24 -08:00
Razvan Lupusoru	1c583c19bb	[acc][mlir] Add functionality for categorizing OpenACC variable types (#126167 ) OpenACC specification describes the following type categories: scalar, array, composite, and aggregate (which includes arrays, composites, and others such as Fortran pointer/allocatable). Decision for how to do implicit mapping is dependent on a variable's category. Since acc dialect's only means of distinguishing between types is through the interfaces attached, add API to be able to get the type category. In addition to defining the new API, attempt to provide a base implementation for memref which matches what OpenACC spec describes.	2025-02-10 08:03:38 -08:00
Rolf Morel	f796bc622a	[MLIR][Linalg] Expose linalg.matmul and linalg.contract via Python API (#126377 ) Now that linalg.matmul is in tablegen, "hand write" the Python wrapper that OpDSL used to derive. Similarly, add a Python wrapper for the new linalg.contract op. Required following misc. fixes: 1) make linalg.matmul's parsing and printing consistent w.r.t. whether indexing_maps occurs before or after operands, i.e. per the tests cases it comes _before_. 2) tablegen for linalg.contract did not state it accepted an optional cast attr. 3) In ODS's C++-generating code, expand partial support for `$_builder` access in `Attr::defaultValue` to full support. This enables access to the current `MlirContext` when constructing the default value (as is required when the default value consists of affine maps).	2025-02-10 12:05:13 +00:00
Mehdi Amini	67b7a2590f	Revert "[mlir] Python: Parse ModuleOp from file path" (#126482 ) Reverts llvm/llvm-project#125736 The gcc7 Bot is broken at the moment.	2025-02-10 09:09:58 +01:00
Andrzej Warzynski	b1a267e1b9	[mlir][vector] Remove references to non-existing patterns (nfc) Delete references to: * `VectorLoadToMemrefLoadLowering`, * `VectorStoreToMemrefStoreLowering`. These patterns were removed in #121454.	2025-02-09 13:54:11 +00:00
Durgadoss R	2feced1df0	[MLIR][NVVM] Add tcgen05 wait/fence Ops (#126265 ) PR #126091 adds intrinsics for tcgen05 wait/fence/commit operations. This patch adds NVVM Dialect Ops for them. Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2025-02-08 21:34:40 +05:30
Uday Bondhugula	b850ce41db	[MLIR][Affine] Fix private memref creation bug in affine fusion (#126028 ) Fix private memref creation bug in affine fusion exposed in the case of the same memref being loaded from/stored to in producer nest. Make the private memref replacement sound. Change affine fusion debug string to affine-fusion - more compact. Fixes: https://github.com/llvm/llvm-project/issues/48703	2025-02-08 08:35:10 +05:30
Adam Siemieniuk	8a03658d57	[mlir][xegpu] Tensor descriptor type verifier (#124548 ) Adds XeGPU tensor descriptor type verifier. The type verifier covers general tensor descriptor invariants w.r.t. Xe ISA semantics. Related operation verifiers are updated to account for the new descriptor checks and avoid duplication.	2025-02-07 20:43:05 +01:00
Scott Todd	73f11ac17d	[mlir][tosa] Use explicit namespace for OpTrait. (#126286 ) I'm seeing build errors in a downstream project using torch-mlir that are fixed by this change. See https://github.com/iree-org/iree/pull/19903#discussion_r1946899561 for more context. The build error on MSVC is: ``` C:\home\runner\_work\iree\iree\third_party\llvm-project\mlir\include\mlir/Dialect/Tosa/Utils/ConversionUtils.h(148): error C2872: 'OpTrait': ambiguous symbol C:\home\runner\_work\iree\iree\third_party\llvm-project\mlir\include\mlir/Dialect/Tosa/IR/TosaOps.h(49): note: could be 'mlir::OpTrait' C:\home\runner\_work\iree\iree\third_party\torch-mlir\include\torch-mlir/Dialect/Torch/IR/TorchTraits.h(23): note: or 'mlir::torch::Torch::OpTrait' C:\home\runner\_work\iree\iree\third_party\llvm-project\mlir\include\mlir/Dialect/Tosa/Utils/ConversionUtils.h(148): note: the template instantiation context (the oldest one first) is C:\home\runner\_work\iree\iree\third_party\torch-mlir\lib\Conversion\TorchToTosa\TosaLegalizeCommon.cpp(126): note: see reference to function template instantiation 'TosaOp mlir::tosa::CreateOpAndInfer<mlir::tosa::MulOp,mlir::Value&,mlir::Value&,mlir::Value&>(mlir::PatternRewriter &,mlir::Location,mlir::Type,mlir::Value &,mlir::Value &,mlir::Value &)' being compiled with [ TosaOp=mlir::tosa::MulOp ] C:\home\runner\_work\iree\iree\third_party\torch-mlir\include\torch-mlir/Conversion/TorchToTosa/TosaLegalizeUtils.h(83): note: see reference to function template instantiation 'TosaOp mlir::tosa::CreateOpAndInfer<TosaOp,mlir::Value&,mlir::Value&,mlir::Value&>(mlir::ImplicitLocOpBuilder &,mlir::Type,mlir::Value &,mlir::Value &,mlir::Value &)' being compiled with [ TosaOp=mlir::tosa::MulOp ] C:\home\runner\_work\iree\iree\third_party\torch-mlir\include\torch-mlir/Conversion/TorchToTosa/TosaLegalizeUtils.h(76): note: see reference to function template instantiation 'TosaOp mlir::tosa::CreateOpAndInferShape<TosaOp,mlir::Value&,mlir::Value&,mlir::Value&>(mlir::ImplicitLocOpBuilder &,mlir::Type,mlir::Value &,mlir::Value &,mlir::Value &)' being compiled with [ TosaOp=mlir::tosa::MulOp ] ``` I think the torch-mlir code here is causing the issue, but I'm not sure why builds only started failing now: https://github.com/llvm/torch-mlir/blob/main/include/torch-mlir/Dialect/Torch/IR/TorchTraits.h. Given that `mlir::OpTrait` already exists, torch-mlir should not be creating an ambiguous symbol `mlir::torch::Torch::OpTrait`. So while a better fix would be to the downstream project, being explicit here doesn't seem that unreasonable to me.	2025-02-07 11:04:09 -08:00
TatWai Chong	571a98722f	[mlir][tosa] Change 'shape' of RESHAPE from attribute to input shape … (#125789 ) The shape operand is changed to input shape type since V1.0 Change-Id: I508cc1d67e9b017048b3f29fecf202cb7d707110 Co-authored-by: Won Jeon <won.jeon@arm.com>	2025-02-07 10:24:52 -08:00
Guray Ozen	b284a849d5	[MLIR][NVVM] Add default constructor for `nvvm.barrier` [NFC] (#126225 ) This PR adds a default constructor to `nvvm.barrier`, making it more convenient to build the OP.	2025-02-07 15:42:57 +01:00
Igor Wodiany	1454fc9dbf	[mlir][spirv] Add definition for OpGroupNonUniformBallotBitCount (#126055 ) A new constraint is also added to restrict attributes values for SPIR-V attributes. Ideally this should use `ConfinedAttr` with a custom constraint directly on the operand, however it seems TableGen does not allow using that with SPIR-V attributes. I suspect it is because SPIR-V attributes do not derive from the generic MLIR attribute class - TableGen complains about missing enum field.	2025-02-07 14:20:02 +01:00
Matthias Springer	15e50b1736	[mlir][IR] Clean up type constraints around `ValueSemanticsContainerOf` (#126075 ) * Remove duplicate `TypeOrContainer`. There is an identical class with the same name: `TypeOrValueSemanticsContainer`. * Remove `TypeOrContainerOfAnyRank` and use `TypeOrValueSemanticsContainer` instead. `TypeOrContainerOfAnyRank` is inconsistent with the other classes because it explicitly checks for `VectorType` and `TensorType` instead of utilizing the value semantics type trait. * Remove `SignlessIntegerOrIndexLikeOfAnyRank` etc. and use `SignlessIntegerOrIndexLike` instead. `SignlessIntegerOrIndexLike` etc. already allow 0-d vectors, so there is no difference with `SignlessIntegerOrIndexLikeOfAnyRank`.	2025-02-07 09:58:15 +01:00
Karim Nosseir	7fa57cd430	[MLIR] Add move constructor to BytecodeWriterConfig (#126130 ) The config is currently not movable and because there are constructors the default move won't be generated, which prevents it from being moved. Also, it is not copyable because of the unique_ptr. This PR adds move constructor to allow moving it.	2025-02-06 21:30:55 -08:00
Avik Pal	a15618f18c	[mlir] feat: add `mlirFuncSetResultAttr` (#125972 ) cc @ftynse @wsmoses	2025-02-06 17:33:12 -06:00
Alan Li	f0e1857c84	[MLIR] Support non-atomic RMW option for emulated vector stores (#124887 ) This patch is a followup of the previous one: #115922, It adds an option to turn on emitting non-atomic rmw code sequence instead of atomic rmw.	2025-02-06 13:22:42 -08:00
Md Asghar Ahmad Shahid	f2bca9e385	[MLIR][Linalg] Introduce broadcast/transpose semantic to batch_matmul (#122275 ) Goals: 1. To add syntax and semantic to 'batch_matmul' without changing any of the existing syntax expectations for current usage. batch_matmul is still just batch_matmul. 2. Move the definition of batch_matmul from linalg OpDsl to tablegen ODS infra. Scope of this patch: To expose broadcast and transpose semantics on the 'batch_matmul'. The broadcast and transpose semantic are as follows: By default, 'linalg.batch_matmul' behavior will remain as is. Broadcast and Transpose semantics can be applied by specifying the explicit attribute 'indexing_maps' as shown below. This is a list attribute, so the list must include all the maps if specified. Example Transpose: ``` linalg.batch_matmul indexing_maps = [ affine_map< (d0, d1, d2, d3) -> (d0, d3, d1)>, //transpose affine_map< (d0, d1, d2, d3) -> (d0, d3, d2)>, affine_map< (d0, d1, d2, d3) -> (d0, d1, d2)> ] ins (%arg0, %arg1: memref<2x5x3xf32>,memref<2x5x7xf32>) outs (%arg2: memref<2x3x7xf32>) ``` Example Broadcast: ``` linalg.batch_matmul indexing_maps = [ affine_map< (d0, d1, d2, d3) -> (d3)>, //broadcast affine_map< (d0, d1, d2, d3) -> (d0, d3, d2)>, affine_map< (d0, d1, d2, d3) -> (d0, d1, d2)> ] ins (%arg0, %arg1: memref<5xf32>,memref<2x5x7xf32>) outs (%arg2: memref<2x3x7xf32>) ``` Example Broadcast and transpose: ``` linalg.batch_matmul indexing_maps = [ affine_map< (d0, d1, d2, d3) -> (d1, d3)>, //broadcast affine_map< (d0, d1, d2, d3) -> (d0, d2, d3)>, //transpose affine_map< (d0, d1, d2, d3) -> (d0, d1, d2)> ] ins (%arg0, %arg1: memref<3x5xf32>, memref<2x7x5xf32>) outs (%arg2: memref<2x3x7xf32>) ``` RFCs and related PR: https://discourse.llvm.org/t/rfc-linalg-opdsl-constant-list-attribute-definition/80149 https://discourse.llvm.org/t/rfc-op-explosion-in-linalg/82863 https://discourse.llvm.org/t/rfc-mlir-linalg-operation-tree/83586 https://github.com/llvm/llvm-project/pull/115319	2025-02-06 19:08:50 +00:00
Krzysztof Drewniak	f4e3b8783c	[mlir][LLVM] Switch `undef` for `poison` for uninitialized values (#125629 ) LLVM itself is generally moving away from using `undef` and towards using `poison`, to the point of having a lint that catches new uses of `undef` in tests. In order to not trip the lint on new patterns and to conform to the evolution of LLVM - Rename various ::undef() methods on StructBuilder subclasses to ::poison() - Audit the uses of UndefOp in the MLIR libraries and replace almost all of them with PoisonOp The remaining uses of `undef` are initializing `uninitialized` memrefs, explicit conversions to undef from SPIR-V, and a few cases in AMDGPUToROCDL where usage like %v = insertelement <M x iN> undef, iN %v, i32 0 %arg = bitcast <M x iN> %v to i(M * N) is used to handle "i32" arguments that are really packed vectors of smaller types that won't always be fully initialized.	2025-02-06 12:49:30 -06:00
Krzysztof Drewniak	efd0a7f446	[mlir][ROCDL][~NFC] Migrate to LLVM dialect default builders (#125609 ) There were a bunch of spots in ROCDL.td where we were defining our own llvmBuilder call which could have been generated using the default built-in one on LLVM_IntrOpBase. This commit cleans up such usages in the interests of potentially enabling ROCDL import in the future and of making best practices more obvious. The one breaking change is renaming WaitcntOp to SWaitcntOp, which should have minimal impact.	2025-02-06 11:38:43 -06:00
Igor Wodiany	8609e27a58	[mlir][spirv] Add definition for ImageWriteOp (#124124 ) This Pull Request adds OpImageWrite as defined in section 3.52.10. (Image Instructions). The tests in `mlir/test/Target/SPIRV/image-ops.mlir` are also updated (and extended with the new op), so they now pass validation with `spirv-val` after serialization into SPIR-V. The test was missing `ImageQuery` capability and entry points. For entry points dummy `main` functions were added.	2025-02-06 09:25:08 -05:00
Matthias Springer	8c2b4aa5a0	[mlir][LLVM][NFC] Fix description of `LLVMFixedVectorType` (#126031 )	2025-02-06 10:37:32 +01:00
Andrzej Warzyński	80fd902573	[mlir][tensor] Introduce `TensorRelayoutOpInterface` (#125823 ) The newly introduced `TensorRelayoutOpInterface` is created specifically for `tensor.pack` + `tensor.unpack`. Although the interface is currently empty, it enables us to refactor the logic in `FoldTensorCastProducerOp` within the Tensor dialect as follows: ```cpp // OLD // Reject tensor::PackOp - there's dedicated pattern for that instead. if (!foldTensorCastPrecondition(op) || isa<tensor::PackOp, tensor::UnPackOp>(op)) return failure(); ``` is replaced with: ```cpp // NEW // Reject tensor::PackOp - there's dedicated pattern for that instead. if (!foldTensorCastPrecondition(op) || isa<tensor::RelayoutOpInterface>(op)) return failure(); ``` This will be crucial once `tensor.pack` + `tensor.unpack` are replaced with `linalg.pack` + `linalg.unpack` (i.e. moved to Linalg): * https://github.com/llvm/llvm-project/pull/123902, * https://discourse.llvm.org/t/rfc-move-tensor-pack-and-tensor-unpack-into-linalg/. Note that the interface itself will later be moved to the Linalg dialect. This decoupling ensures that the Tensor dialect does not require an understanding of Linalg ops, thus keeping the dependency lightweight. This PR is effectively a preparatory step for moving PackOp and UnpackOp to Linalg. Once that's completed, most CMake changes from this PR will be effectively reverted.	2025-02-06 09:18:13 +00:00
Bruno Cardoso Lopes	4fb96f203e	[MLIR][LLVM] Implement LLVM dialect support for global aliases (#125295 ) This includes support for module translation, module import and add tests for both. Fix https://github.com/llvm/llvm-project/issues/115390 ClangIR cannot currently lower global aliases to LLVM because of missing support for this.	2025-02-05 18:19:36 -08:00
Ivan Butygin	6e52a12811	[mlir][vector] Create `VectorToLLVMDialectInterface` (#121440 ) Create `VectorToLLVMDialectInterface`, which allows automatic conversion discovery by the generic `--convert-to-llvm` pass. This only covers the final dialect conversion step and not any previous preparation steps. Also, there is currently no way to pass any additional parameters through this conversion interface, but most users use the default parameters anyway.	2025-02-05 23:21:25 +03:00
Nikhil Kalra	4e14b8afb4	[mlir] Python: Parse ModuleOp from file path (#125736 ) For extremely large models, it may be inefficient to load the model into memory in Python prior to passing it to the MLIR C APIs for deserialization. This change adds an API to parse a ModuleOp directly from a file path.	2025-02-05 11:48:37 -08:00
Guray Ozen	dd099e9cc2	[MLIR][NVVM] Fix links in OP definition (#125865 )	2025-02-05 16:18:04 +01:00
Guray Ozen	baf27862dd	[MLIR][NVGPU] Move max threads/blocks size to dialect (NFC) (#124454 ) This PR moves maximum number of threads in a block and block in a grid to nvgpu dialect to avoid replicated code. The limits are defined here: https://docs.nvidia.com/cuda/cuda-c-programming-guide/#features-and-technical-specifications-technical-specifications-per-compute-capability	2025-02-05 12:38:37 +01:00
Jack Frankland	f0b8ff1251	[mlir][tosa] Remove Quantization Attribute (#125479 ) Removed the TOSA quantization attribute used in various MLIR TOSA dialect operations in favour of using builtin attributes. Update any lit tests, conversions and transformations appropriately. Signed-off-by: Tai Ly <tai.ly@arm.com> Co-authored-by: Tai Ly <tai.ly@arm.com>	2025-02-05 11:27:17 +00:00
Durgadoss R	4287c72404	[MLIR][NVVM] Add tcgen05 alloc/dealloc Ops (#125674 ) PR #124961 adds intrinsics for the tcgen05 alloc/dealloc PTX instructions. This patch adds NVVM Ops for the same. Tests are added to verify the lowering to the corresponding intrinsics in tcgen05-alloc.mlir file. PTX ISA link: https://docs.nvidia.com/cuda/parallel-thread-execution/#tcgen05-memory-alloc-manage-instructions Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2025-02-05 16:16:47 +05:30
Paul Carabas	12fff8db4b	[mlir][LLVMIR] Add support for tan intrinsic op (#125748 ) This patch adds support for Tan trig. function intrinsic in LLVM dialect & adds missing import/export tests for Sin	2025-02-04 22:42:39 -06:00
Uday Bondhugula	05a09e6e55	[MLIR][Affine] Extend/generalize MDG to properly add edges between non-affine ops (#125451 ) Drop arbitrary checks and hacks from affine fusion MDG construction and handle all ops using memory read/write effects. This has been a long pending change and it now makes affine fusion more powerful in the presence of non-affine ops and does not limit fusion in parts of the block where it is feasible simply because of non-affine ops elsewhere or intervening non-affine users. Populate memref read and write ops in non-affine region holding ops and non-affine ops at the top level of the Block properly; add the appropriate edges to MDG. Use memory read-write effects and drop assumptions and special handling of ops due to historic reasons. Update MDG to drop unnecessary "unhandled region" hack. This hack is no longer needed with the update to fully and properly construct the MDG. MDG edges now capture dependences between nodes completely. Drop non-affine users check. With the MDG generalization to properly include edges between non-affine nodes/operations, the non-affine users on path check in fusion is no longer needed. Add more test cases to exercise MDG generalization. Drop unnecessary failure when encountering side-effect-free affine.if ops. Improve documentation on MDG.	2025-02-05 09:52:59 +05:30
Soren Lassen	c8ca486573	[MLIR] print/parse resource handle key quoted and escaped (#119746 ) Resource keys have the problem that you can’t parse them from mlir assembly if they have special or non-printable characters, but nothing prevents you from specifying such a key when you create e.g. a DenseResourceElementsAttr, and it works fine in other ways, including bytecode emission and parsing. This PR solves the parsing by quoting and escaping keys with special or non-printable characters in mlir assembly, in the same way as symbols, e.g.: ``` module attributes { fst = dense_resource<resource_fst> : tensor<2xf16>, snd = dense_resource<"resource\09snd"> : tensor<2xf16> } {} {-# dialect_resources: { builtin: { resource_fst: "0x0200000001000200", "resource\09snd": "0x0200000008000900" } } #-} ``` By not quoting keys without special or non-printable characters, the change is effectively backwards compatible. The change is tested by: 1. adding a test with a dense resource handle key with special characters to `dense-resource-elements-attr.mlir` 2. adding special and unprintable characters to some resource keys in the existing lit tests `pretty-resources-print.mlir` and `mlir/test/Bytecode/resources.mlir`	2025-02-04 13:49:15 -07:00
Corbin Robeck	6f35a9e7c5	[MLIR][ROCDL] Add Scale Convert Packed FP8 <-> F32 Support for GFX950 (#125564 ) Add Rocdl support for the following GFX950 instructions: CVT_SCALE_PK_FP8_F32 CVT_SCALE_PK_BF8_F32 CVT_SCALE_SR_FP8_F32 CVT_SCALE_SR_BF8_F32 CVT_SCALE_PK_F32_FP8 CVT_SCALE_PK_F32_BF8 CVT_SCALE_F32_FP8 CVT_SCALE_F32_BF8	2025-02-04 13:21:59 -05:00
Razvan Lupusoru	bd30838422	[flang][acc] Improve acc lowering around fir.box and arrays (#125600 ) The current implementation of OpenACC lowering includes explicit expansion of following cases: - Creation of `acc.bounds` operations for all arrays, including those whose dimensions are captured in the type (eg `!fir.array<100xf32>`) - Expansion of box types by only putting the box's address in the data clause. The address was extracted with a `fir.box_addr` operation and the bounds were filled with `fir.box_dims` operation. However, with the creation of the new type interface `MappableType`, the idea is that specific type-based semantics can now be used. This also really simplifies representation in the IR. Consider the following example: ``` subroutine sub(arr) real :: arr(:) !$acc enter data copyin(arr) end subroutine ``` Before the current PR, the relevant acc dialect IR looked like: ``` func.func @_QPsub(%arg0: !fir.box<!fir.array<?xf32>> {fir.bindc_name = "arr"}) { ... %1:2 = hlfir.declare %arg0 dummy_scope %0 {uniq_name = "_QFsubEarr"} : (!fir.box<!fir.array<?xf32>>, !fir.dscope) -> (!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>) %c1 = arith.constant 1 : index %c0 = arith.constant 0 : index %2:3 = fir.box_dims %1#0, %c0 : (!fir.box<!fir.array<?xf32>>, index) -> (index, index, index) %c0_0 = arith.constant 0 : index %3 = arith.subi %2#1, %c1 : index %4 = acc.bounds lowerbound(%c0_0 : index) upperbound(%3 : index) extent(%2#1 : index) stride(%2#2 : index) startIdx(%c1 : index) {strideInBytes = true} %5 = fir.box_addr %1#0 : (!fir.box<!fir.array<?xf32>>) -> !fir.ref<!fir.array<?xf32>> %6 = acc.copyin varPtr(%5 : !fir.ref<!fir.array<?xf32>>) bounds(%4) -> !fir.ref<!fir.array<?xf32>> {name = "arr", structured = false} acc.enter_data dataOperands(%6 : !fir.ref<!fir.array<?xf32>>) ``` After the current change, it looks like: ``` func.func @_QPsub(%arg0: !fir.box<!fir.array<?xf32>> {fir.bindc_name = "arr"}) { ... %1:2 = hlfir.declare %arg0 dummy_scope %0 {uniq_name = "_QFsubEarr"} : (!fir.box<!fir.array<?xf32>>, !fir.dscope) -> (!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>) %2 = acc.copyin var(%1#0 : !fir.box<!fir.array<?xf32>>) -> !fir.box<!fir.array<?xf32>> {name = "arr", structured = false} acc.enter_data dataOperands(%2 : !fir.box<!fir.array<?xf32>>) ``` Restoring the old behavior can be done with following command line options: `--openacc-unwrap-fir-box=true --openacc-generate-default-bounds=true`	2025-02-04 08:08:16 -08:00
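
Note on 7fa57cd430 (move constructor for BytecodeWriterConfig): the situation that commit describes can be reproduced with a small stand-alone sketch. The classes below are hypothetical stand-ins, not the real `mlir::BytecodeWriterConfig`. The sketch assumes the implicit move is suppressed by a user-declared destructor, which is one standard way this arises; the exact cause in the real class may differ.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical pimpl-style config. The user-declared destructor suppresses
// the implicit move constructor, and the unique_ptr member makes the
// implicit copy constructor deleted, so the type can be neither moved nor copied.
class ConfigBefore {
public:
  ConfigBefore() : impl(std::make_unique<Impl>()) {}
  ~ConfigBefore() {}

private:
  struct Impl { int version = 0; };
  std::unique_ptr<Impl> impl;
};

// Same class with an explicitly defaulted move constructor added.
class ConfigAfter {
public:
  ConfigAfter() : impl(std::make_unique<Impl>()) {}
  ConfigAfter(ConfigAfter &&) = default;
  ~ConfigAfter() {}

  int version() const { return impl->version; }
  void setVersion(int v) { impl->version = v; }

private:
  struct Impl { int version = 0; };
  std::unique_ptr<Impl> impl;
};

static_assert(!std::is_move_constructible<ConfigBefore>::value,
              "no implicit move; copy fallback is deleted");
static_assert(!std::is_copy_constructible<ConfigBefore>::value,
              "unique_ptr member deletes the copy constructor");
static_assert(std::is_move_constructible<ConfigAfter>::value,
              "explicit move constructor makes the type movable");

// Returning a named local by value requires a usable move (or copy) constructor.
ConfigAfter makeConfig(int version) {
  ConfigAfter config;
  config.setVersion(version);
  return config;
}

// Moves a config into a new object and reports the version seen afterwards.
int moveAndReadVersion(int version) {
  ConfigAfter source = makeConfig(version);
  ConfigAfter target = std::move(source);
  assert(target.version() == version);
  return target.version();
}
```

The `static_assert`s carry the point: without the added constructor the type cannot be returned from a factory like `makeConfig` or stored in containers that relocate their elements.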


11181 Commits