11181 Commits

Author SHA1 Message Date
Andrzej Warzyński
5586541d22 [mlir][tensor] Make useful Tensor utilities public (#126802)
1. Extract the main logic from `foldTensorCastPrecondition` into a
   dedicated helper hook: `hasFoldableTensorCastOperand`. This allows
   for reusing the corresponding checks.

2. Rename `getNewOperands` to `getUpdatedOperandsAfterCastOpFolding` for
   better clarity and documentation of its functionality.

3. These updated hooks will be reused in:
   * https://github.com/llvm/llvm-project/pull/123902. This PR makes
     them public.

**Note:** Moving these hooks to `Tensor/Utils` is not feasible because
`MLIRTensorUtils` depends on `MLIRTensorDialect` (CMake targets). If
these hooks were moved to `Utils`, it would create a dependency of
`MLIRTensorDialect` on `MLIRTensorUtils`, leading to a circular
dependency.
2025-02-12 23:12:14 +00:00
Razvan Lupusoru
ceb00c0702 [mlir][acc] Clean up TypedValue builders (#126968)
When MappableType was introduced alongside PointerLikeType, the data
clause operation builders were duplicated to accept a `TypedValue` of
one of the two type options. However, the underlying builder takes a
`Value` and this difference is not relevant for it. The only difference
is that `varType` is set differently depending on the type.

Having two duplicated builders can lead to clunky building since a
`Value` must always be cast to one of the two options. Thus, simply
clean this up - the verifier already checks that it is a type that
implements one of the two interfaces.
2025-02-12 14:13:45 -08:00
Nikhil Kalra
65ed4fa57e [mlir] Python: Parse ModuleOp from file path (#126572)
For extremely large models, it may be inefficient to load the model into
memory in Python prior to passing it to the MLIR C APIs for
deserialization. This change adds an API to parse a ModuleOp directly
from a file path.

Re-lands
[4e14b8a](4e14b8afb4).
2025-02-12 14:02:41 -08:00
Frank Schlimbach
0fd50ec9a3 [MLIR][mesh] Mesh fixes (#124724)
A collection of fixes to the mesh dialect
- allow constants in sharding propagation/spmdization
- fixes to tensor replication (e.g. 0d tensors)
- improved canonicalization
- sharding propagation incorrectly generated too many ShardOps
New operation `mesh.GetShardOp` enables exchanging sharding information
(like on function boundaries)
2025-02-12 12:44:48 +01:00
Adam Siemieniuk
0b9b014be7 [mlir][dlti] Query by strings (#126716)
Adds DLTI utility to query using strings directly as keys.
2025-02-12 09:13:43 +01:00
Hongtao Yu
4a63ff4330 Revert "[mlir] Enable LICM for ops with only read side effects in scf.for" (#126840)
Reverts llvm/llvm-project#120302
2025-02-11 20:07:21 -08:00
Arda Unal
36d8e7056e [mlir] Enable LICM for ops with only read side effects in scf.for (#120302)
Enable ops with only read side effects in scf.for to be hoisted with a
scf.if guard that checks against the trip count

This patch takes a step towards a less conservative LICM in MLIR as
discussed in the following discourse thread:

[Speculative LICM?](https://discourse.llvm.org/t/speculative-licm/80977)

This patch in particular does the following:

1. Relaxes the original constraint for hoisting that only hoists ops
without any side effects. This patch also allows the ops with only read
side effects to be hoisted into an scf.if guard only if every op in the
loop or its nested regions is side-effect free or has only read side
effects. This scf.if guard wraps the original scf.for and checks for
**trip_count > 0**.
2. To support this, two new interface methods are added to
**LoopLikeInterface**: _wrapInTripCountCheck_ and
_unwrapTripCountCheck_. Implementation starts with wrapping the scf.for
loop into scf.if guard using _wrapInTripCountCheck_ and if there is no
op hoisted into this guard after we are done processing the
worklist, it unwraps the guard by calling _unwrapTripCountCheck_.
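
The effect of the trip-count guard can be sketched in plain C++, outside MLIR. This is only an illustrative model, not the pass's implementation; `sumBefore` and `sumAfter` are hypothetical names. The point is that a loop-invariant read may be hoisted only into a region that runs under the same condition as the loop body, otherwise a loop with zero iterations would start performing a read it never performed before.

```cpp
#include <cassert>
#include <cstdint>

// Original loop: the loop-invariant read of *scale happens on every
// iteration, and never happens when tripCount == 0.
int64_t sumBefore(const int64_t *scale, const int64_t *data, int64_t tripCount) {
  // Precondition: pointers only need to be valid if the loop actually runs.
  assert(tripCount <= 0 || (scale != nullptr && data != nullptr));
  int64_t acc = 0;
  for (int64_t i = 0; i < tripCount; ++i)
    acc += *scale * data[i];
  return acc;
}

// After hoisting: the read is moved out of the loop, but into a guard that
// checks trip_count > 0, so it is still skipped when the loop would not run.
int64_t sumAfter(const int64_t *scale, const int64_t *data, int64_t tripCount) {
  assert(tripCount <= 0 || (scale != nullptr && data != nullptr));
  int64_t acc = 0;
  if (tripCount > 0) {
    const int64_t s = *scale; // hoisted read, executed once
    for (int64_t i = 0; i < tripCount; ++i)
      acc += s * data[i];
  }
  return acc;
}
```

Both functions return the same value for every trip count, and neither dereferences `scale` when the trip count is zero, which is the property the guard is meant to preserve.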
2025-02-11 15:48:57 -08:00
Shoaib Meenai
376f65d865 Revert "[mlir] Silence -Wdangling-assignment-gsl in OperationSupport.h (#126140)"
This reverts commit f6556afce0.

Buildbots are broken.
2025-02-11 15:05:12 -08:00
Shoaib Meenai
f6556afce0 [mlir] Silence -Wdangling-assignment-gsl in OperationSupport.h (#126140)
This warning is causing lots of build spam when I use a recent Clang as
my host compiler. It's a potential false positive, so silence it until
https://github.com/llvm/llvm-project/issues/126600 is resolved.
Fix variable casing while I'm here.
2025-02-11 14:05:01 -08:00
Andrzej Warzyński
fcbf04e40e [mlir][vector][nfc] Add clarification on "dim-1" bcast (#125425)
Adds a small note to VectorOps.td on what "dim-1" broadcast is. Also
updates comments to consistently use quotes, i.e.

* "dim-1" broadcasting instead of dim-1 broadcasting.

This way it is clear that we are referring to "stretching" one of the
trailing dims rather than e.g. broadcasting a dim at idx 1.
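
As a rough illustration of the "stretching" reading, here is a plain C++ sketch (not MLIR code; `stretchUnitDim` is a hypothetical helper) that broadcasts a `[rows, 1]` value to `[rows, cols]` by repeating the single element of each row. It assumes the unit dimension is the trailing one, which is just one instance of the case the note describes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// "dim-1" broadcast sketch: the source has a trailing dimension of size 1
// (shape [rows, 1]) and that unit dimension is stretched to `cols`.
// This is unrelated to "the dimension at index 1" of an arbitrary shape.
std::vector<std::vector<float>>
stretchUnitDim(const std::vector<std::vector<float>> &src, std::size_t cols) {
  std::vector<std::vector<float>> result;
  result.reserve(src.size());
  for (const auto &row : src) {
    assert(row.size() == 1 && "expected a unit trailing dimension");
    result.emplace_back(cols, row[0]); // repeat the single element
  }
  return result;
}
```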
2025-02-11 21:37:23 +00:00
Tai Ly
20ae283d08 [mlir][tosa] Change the shift of mul to be required (#125297)
Change the shift operand for the mul operator to be a required operand.

Also defined shift to be Tosa_ScalarInt8Tensor, which requires that it is
a rank-1 tensor whose shape is [1] (i.e., a tensor containing a single element).

Signed-off-by: Tai Ly <tai.ly@arm.com>
2025-02-11 11:02:44 -08:00
Hsiangkai Wang
ab93bd6959 [mlir][tosa] Change ClampOp's min/max attributes (#125197)
This changes Tosa ClampOp attributes to min_val and max_val which are
either integer attributes or float attributes, and adds verify checks
that these attribute element types must match element types of input and
output

Co-authored-by: Tai Ly <tai.ly@arm.com>
2025-02-11 08:02:52 -08:00
Adam Siemieniuk
67f59a642f [mlir][xegpu] Improve scatter attribute definition (#126540)
Refactors the XeGPU scatter attribute, introducing the following:
  - improved docs formatting
  - default initialized parameters
  - invariant checks in attribute verifier
  - removal of additional parsing error
 
The attribute's getters now provide default values simplifying their
usage and scattered tensor descriptor handling.
Related descriptor verifier is updated to avoid check duplication.
2025-02-11 10:05:23 +01:00
jeanPerier
99e1308c41 [mlir][LLVM] handle argument and result attributes in llvm.call and llvm.invoke (#123177)
Update llvm.call/llvm.invoke pretty printer/parser and the llvm ir import/export
to deal with the argument and result attributes.

This patch is made on top of PR 123176 that modified the
CallOpInterface and added the argument and result attributes to
llvm.call and llvm.invoke without doing anything with them.

RFC: https://discourse.llvm.org/t/mlir-rfc-adding-argument-and-result-attributes-to-llvm-call/84107
2025-02-11 09:39:51 +01:00
Uday Bondhugula
001ba42fe0 [MLIR][Affine] Make affine fusion MDG API const correct (#125994)
Make affine fusion MDG API const correct. NFC changes otherwise.
2025-02-11 05:28:15 +05:30
Thomas Preud'homme
d7fd2a2a3b [MLIR] Fix LLVMIRTransforms build failure (#125485)
lib/libMLIRLLVMIRTransforms.a fails to build from scratch with the
following error:
In file included from llvm/include/llvm/Frontend/OpenMP/OMPConstants.h:19,
                 from llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h:19,
                 from mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h:26,
                 from mlir/include/mlir/Dialect/LLVMIR/NVVMDialect.h:24,
                 from mlir/lib/Dialect/LLVMIR/Transforms/InlinerInterfaceImpl.cpp:17:
llvm/include/llvm/Frontend/OpenMP/OMP.h:16:10:
fatal error: llvm/Frontend/OpenMP/OMP.h.inc: No such file or directory

Use a forward declaration for OpenMPIRBuilder in ModuleTranslation.h to
avoid pulling in OpenMP frontend headers that require generated headers.
2025-02-10 19:37:58 +00:00
Benoit Jacob
ced23aa540 [MLIR][Math] Add fine-grained populate-patterns functions for math function rewrites. (#126103)
The existing `mlir::populateMathPolynomialApproximationPatterns` is
coarse-grained and inflexible:
- It populates 2 distinct classes of patterns: (1) polynomial
approximations, (2) expansions of operands to f32.
- It does not offer knobs to select which math functions to apply the
rewrites to.

This PR adds finer-grained populate-patterns functions, which take a
predicate lambda allowing the caller to control which math functions to
apply rewrites to.
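
The general shape of a predicate-controlled populate function might look like the sketch below. Everything in it is hypothetical (the `PatternList` stand-in, the function name, and the list of candidate functions); it does not reproduce the signatures added by this PR and only illustrates how a caller-supplied lambda can select which rewrites get registered.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a rewrite-pattern set: just a list of op names.
using PatternList = std::vector<std::string>;

// Hypothetical model of a fine-grained populate function: the caller's
// predicate decides, per math function, whether its rewrite is registered.
void populateApproximations(
    PatternList &patterns,
    const std::function<bool(const std::string &)> &predicate) {
  assert(predicate && "a predicate must be provided");
  static const char *const candidates[] = {"tanh", "exp", "log", "erf"};
  for (const char *name : candidates)
    if (predicate(name))
      patterns.push_back(name);
}
```

A caller that only wants `exp` and `log` rewritten would pass a lambda returning true for those two names and false otherwise.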

Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
2025-02-10 09:52:24 -08:00
Razvan Lupusoru
1c583c19bb [acc][mlir] Add functionality for categorizing OpenACC variable types (#126167)
OpenACC specification describes the following type categories: scalar,
array, composite, and aggregate (which includes arrays, composites, and
others such as Fortran pointer/allocatable).

Decision for how to do implicit mapping is dependent on a variable's
category. Since acc dialect's only means of distinguishing between types
is through the interfaces attached, add API to be able to get the type
category.

In addition to defining the new API, attempt to provide a base
implementation for memref which matches what OpenACC spec describes.
2025-02-10 08:03:38 -08:00
Rolf Morel
f796bc622a [MLIR][Linalg] Expose linalg.matmul and linalg.contract via Python API (#126377)
Now that linalg.matmul is in tablegen, "hand write" the Python wrapper
that OpDSL used to derive. Similarly, add a Python wrapper for the new
linalg.contract op.

Required following misc. fixes:
1) make linalg.matmul's parsing and printing consistent w.r.t. whether
indexing_maps occurs before or after operands, i.e. per the test cases
it comes _before_.
2) tablegen for linalg.contract did not state it accepted an optional
cast attr.
3) In ODS's C++-generating code, expand partial support for `$_builder`
access in `Attr::defaultValue` to full support. This enables access to
the current `MlirContext` when constructing the default value (as is
required when the default value consists of affine maps).
2025-02-10 12:05:13 +00:00
Mehdi Amini
67b7a2590f Revert "[mlir] Python: Parse ModuleOp from file path" (#126482)
Reverts llvm/llvm-project#125736

The gcc7 Bot is broken at the moment.
2025-02-10 09:09:58 +01:00
Andrzej Warzynski
b1a267e1b9 [mlir][vector] Remove references to non-existing patterns (nfc)
Delete references to:
  * `VectorLoadToMemrefLoadLowering`,
  * `VectorStoreToMemrefStoreLowering`.

These patterns were removed in #121454.
2025-02-09 13:54:11 +00:00
Durgadoss R
2feced1df0 [MLIR][NVVM] Add tcgen05 wait/fence Ops (#126265)
PR #126091 adds intrinsics for tcgen05
wait/fence/commit operations. This patch
adds NVVM Dialect Ops for them.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-02-08 21:34:40 +05:30
Uday Bondhugula
b850ce41db [MLIR][Affine] Fix private memref creation bug in affine fusion (#126028)
Fix private memref creation bug in affine fusion exposed in the case of
the same memref being loaded from/stored to in producer nest. Make the
private memref replacement sound.

Change affine fusion debug string to affine-fusion - more compact.

Fixes: https://github.com/llvm/llvm-project/issues/48703
2025-02-08 08:35:10 +05:30
Adam Siemieniuk
8a03658d57 [mlir][xegpu] Tensor descriptor type verifier (#124548)
Adds XeGPU tensor descriptor type verifier.

The type verifier covers general tensor descriptor invariants w.r.t. Xe
ISA semantics.
Related operation verifiers are updated to account for the new
descriptor checks and avoid duplication.
2025-02-07 20:43:05 +01:00
Scott Todd
73f11ac17d [mlir][tosa] Use explicit namespace for OpTrait. (#126286)
I'm seeing build errors in a downstream project using torch-mlir that
are fixed by this change. See
https://github.com/iree-org/iree/pull/19903#discussion_r1946899561 for
more context. The build error on MSVC is:
```
C:\home\runner\_work\iree\iree\third_party\llvm-project\mlir\include\mlir/Dialect/Tosa/Utils/ConversionUtils.h(148): error C2872: 'OpTrait': ambiguous symbol
C:\home\runner\_work\iree\iree\third_party\llvm-project\mlir\include\mlir/Dialect/Tosa/IR/TosaOps.h(49): note: could be 'mlir::OpTrait'
C:\home\runner\_work\iree\iree\third_party\torch-mlir\include\torch-mlir/Dialect/Torch/IR/TorchTraits.h(23): note: or       'mlir::torch::Torch::OpTrait'
C:\home\runner\_work\iree\iree\third_party\llvm-project\mlir\include\mlir/Dialect/Tosa/Utils/ConversionUtils.h(148): note: the template instantiation context (the oldest one first) is
C:\home\runner\_work\iree\iree\third_party\torch-mlir\lib\Conversion\TorchToTosa\TosaLegalizeCommon.cpp(126): note: see reference to function template instantiation 'TosaOp mlir::tosa::CreateOpAndInfer<mlir::tosa::MulOp,mlir::Value&,mlir::Value&,mlir::Value&>(mlir::PatternRewriter &,mlir::Location,mlir::Type,mlir::Value &,mlir::Value &,mlir::Value &)' being compiled
        with
        [
            TosaOp=mlir::tosa::MulOp
        ]
C:\home\runner\_work\iree\iree\third_party\torch-mlir\include\torch-mlir/Conversion/TorchToTosa/TosaLegalizeUtils.h(83): note: see reference to function template instantiation 'TosaOp mlir::tosa::CreateOpAndInfer<TosaOp,mlir::Value&,mlir::Value&,mlir::Value&>(mlir::ImplicitLocOpBuilder &,mlir::Type,mlir::Value &,mlir::Value &,mlir::Value &)' being compiled
        with
        [
            TosaOp=mlir::tosa::MulOp
        ]
C:\home\runner\_work\iree\iree\third_party\torch-mlir\include\torch-mlir/Conversion/TorchToTosa/TosaLegalizeUtils.h(76): note: see reference to function template instantiation 'TosaOp mlir::tosa::CreateOpAndInferShape<TosaOp,mlir::Value&,mlir::Value&,mlir::Value&>(mlir::ImplicitLocOpBuilder &,mlir::Type,mlir::Value &,mlir::Value &,mlir::Value &)' being compiled
        with
        [
            TosaOp=mlir::tosa::MulOp
        ]
```

I think the torch-mlir code here is causing the issue, but I'm not sure
why builds only started failing now:
https://github.com/llvm/torch-mlir/blob/main/include/torch-mlir/Dialect/Torch/IR/TorchTraits.h.
Given that `mlir::OpTrait` already exists, torch-mlir should not be
creating an ambiguous symbol `mlir::torch::Torch::OpTrait`. So while a
better fix would be to the downstream project, being explicit here
doesn't seem that unreasonable to me.
2025-02-07 11:04:09 -08:00
TatWai Chong
571a98722f [mlir][tosa] Change 'shape' of RESHAPE from attribute to input shape … (#125789)
The shape operand is changed to input shape type since V1.0

Change-Id: I508cc1d67e9b017048b3f29fecf202cb7d707110

Co-authored-by: Won Jeon <won.jeon@arm.com>
2025-02-07 10:24:52 -08:00
Guray Ozen
b284a849d5 [MLIR][NVVM] Add default constructor for nvvm.barrier [NFC] (#126225)
This PR adds a default constructor to `nvvm.barrier`, making it more
convenient to build the OP.
2025-02-07 15:42:57 +01:00
Igor Wodiany
1454fc9dbf [mlir][spirv] Add definition for OpGroupNonUniformBallotBitCount (#126055)
A new constraint is also added to restrict attributes values for SPIR-V
attributes. Ideally this should use `ConfinedAttr` with a custom
constraint directly on the operand, however it seems TableGen does not
allow using that with SPIR-V attributes. I suspect it is because SPIR-V
attributes do not derive from the generic MLIR attribute class -
TableGen complains about missing enum field.
2025-02-07 14:20:02 +01:00
Matthias Springer
15e50b1736 [mlir][IR] Clean up type constraints around ValueSemanticsContainerOf (#126075)
* Remove duplicate `TypeOrContainer`. There is an identical class with
the same name: `TypeOrValueSemanticsContainer`.
* Remove `TypeOrContainerOfAnyRank` and use
`TypeOrValueSemanticsContainer` instead. `TypeOrContainerOfAnyRank` is
inconsistent with the other classes because it explicitly checks for
`VectorType` and `TensorType` instead of utilizing the value semantics
type trait.
* Remove `SignlessIntegerOrIndexLikeOfAnyRank` etc. and use
`SignlessIntegerOrIndexLike` instead. `SignlessIntegerOrIndexLike` etc.
already allow 0-d vectors, so there is no difference with
`SignlessIntegerOrIndexLikeOfAnyRank`.
2025-02-07 09:58:15 +01:00
Karim Nosseir
7fa57cd430 [MLIR] Add move constructor to BytecodeWriterConfig (#126130)
The config is currently not movable and because there are constructors
the default move won't be generated, which prevents it from being moved.
Also, it is not copyable because of the unique_ptr. This PR adds move
constructor to allow moving it.
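
A minimal sketch of the situation, using a hypothetical `WriterConfig` class rather than the real `BytecodeWriterConfig`: the `unique_ptr` member makes the class non-copyable, and in this sketch a user-declared destructor (an assumption about the mechanism, chosen because it is a standard C++ rule) suppresses the implicit move operations, so they are declared explicitly.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical pimpl-style config, loosely modelled on the case described.
class WriterConfig {
  struct Impl {
    std::string producer;
  };
  std::unique_ptr<Impl> impl; // makes the class non-copyable

public:
  explicit WriterConfig(std::string producer)
      : impl(std::make_unique<Impl>(Impl{std::move(producer)})) {}
  ~WriterConfig() = default; // user-declared: no implicit move constructor

  // Explicitly restore movability.
  WriterConfig(WriterConfig &&) = default;
  WriterConfig &operator=(WriterConfig &&) = default;

  const std::string &producer() const {
    assert(impl && "config was moved from");
    return impl->producer;
  }
};

static_assert(std::is_move_constructible<WriterConfig>::value, "movable");
static_assert(!std::is_copy_constructible<WriterConfig>::value, "not copyable");
```

Removing the two defaulted move members from this sketch makes `WriterConfig b(std::move(a));` fail to compile, since overload resolution then falls back to the deleted copy constructor.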
2025-02-06 21:30:55 -08:00
Avik Pal
a15618f18c [mlir] feat: add mlirFuncSetResultAttr (#125972)
cc @ftynse @wsmoses
2025-02-06 17:33:12 -06:00
Alan Li
f0e1857c84 [MLIR] Support non-atomic RMW option for emulated vector stores (#124887)
This patch is a follow-up to the previous one, #115922. It adds an option
to turn on emitting non-atomic rmw code sequence instead of atomic rmw.
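
To make the distinction concrete, here is a plain C++ sketch of the two code shapes for storing a 4-bit element into a byte. It is an analogy only; the function names are hypothetical and the code emitted by the MLIR patterns differs.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Store a 4-bit value into nibble `idx` (0 = low, 1 = high) of a byte using a
// plain read-modify-write. Only safe if no other thread touches the same byte.
void storeNibbleNonAtomic(uint8_t &byte, unsigned idx, uint8_t value) {
  assert(idx < 2 && value < 16);
  const unsigned shift = idx * 4;
  const uint8_t mask = static_cast<uint8_t>(0xFu << shift);
  byte = static_cast<uint8_t>((byte & ~mask) | (value << shift));
}

// Same update as an atomic read-modify-write: retry until no concurrent
// writer changed the byte between the load and the exchange.
void storeNibbleAtomic(std::atomic<uint8_t> &byte, unsigned idx, uint8_t value) {
  assert(idx < 2 && value < 16);
  const unsigned shift = idx * 4;
  const uint8_t mask = static_cast<uint8_t>(0xFu << shift);
  uint8_t expected = byte.load();
  uint8_t desired;
  do {
    desired = static_cast<uint8_t>((expected & ~mask) | (value << shift));
  } while (!byte.compare_exchange_weak(expected, desired));
}
```

The non-atomic form is cheaper but is only correct when the caller can guarantee that neighbouring sub-byte elements in the same byte are not written concurrently.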
2025-02-06 13:22:42 -08:00
Md Asghar Ahmad Shahid
f2bca9e385 [MLIR][Linalg] Introduce broadcast/transpose semantic to batch_matmul (#122275)
Goals:
1. To add syntax and semantic to 'batch_matmul' without changing any of
the existing syntax expectations for current usage. batch_matmul is
still just batch_matmul.

2. Move the definition of batch_matmul from linalg OpDsl to tablegen ODS
infra.

Scope of this patch:
To expose broadcast and transpose semantics on the 'batch_matmul'.

The broadcast and transpose semantic are as follows:

By default, 'linalg.batch_matmul' behavior will remain as is. Broadcast
and Transpose semantics can be applied by specifying the explicit
attribute 'indexing_maps' as shown below. This is a list attribute, so
the list must include all the maps if specified.

    Example Transpose:
    ```
    linalg.batch_matmul indexing_maps = [
                          affine_map< (d0, d1, d2, d3) -> (d0, d3, d1)>, //transpose
                          affine_map< (d0, d1, d2, d3) -> (d0, d3, d2)>,
                          affine_map< (d0, d1, d2, d3) -> (d0, d1, d2)>
                        ]
                        ins (%arg0, %arg1: memref<2x5x3xf32>,memref<2x5x7xf32>)
                        outs (%arg2: memref<2x3x7xf32>)
    ```

    Example Broadcast:
    ```
    linalg.batch_matmul indexing_maps = [
                          affine_map< (d0, d1, d2, d3) -> (d3)>, //broadcast
                          affine_map< (d0, d1, d2, d3) -> (d0, d3, d2)>,
                          affine_map< (d0, d1, d2, d3) -> (d0, d1, d2)>
                        ]
                        ins (%arg0, %arg1: memref<5xf32>,memref<2x5x7xf32>)
                        outs (%arg2: memref<2x3x7xf32>)
    ```

    Example Broadcast and transpose:
    ```
    linalg.batch_matmul indexing_maps = [
                          affine_map< (d0, d1, d2, d3) -> (d1, d3)>, //broadcast
                          affine_map< (d0, d1, d2, d3) -> (d0, d2, d3)>, //transpose
                          affine_map< (d0, d1, d2, d3) -> (d0, d1, d2)>
                        ]
                        ins (%arg0, %arg1: memref<3x5xf32>, memref<2x7x5xf32>)
                        outs (%arg2: memref<2x3x7xf32>)
    ```
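
One way to read the indexing maps is as subscripts in a four-deep loop nest over (d0, d1, d2, d3) = (batch, m, n, k). The plain C++ sketch below spells this out for the first ("transpose") example above; the function name and the dense row-major buffer layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Loop-nest reading of the "transpose" example, with operands indexed
// exactly as the three maps say:
//   A -> (d0, d3, d1),  B -> (d0, d3, d2),  C -> (d0, d1, d2).
// All buffers are dense row-major; C is accumulated into.
void batchMatmulTransposedA(const std::vector<float> &a, // shape [B, K, M]
                            const std::vector<float> &b, // shape [B, K, N]
                            std::vector<float> &c,       // shape [B, M, N]
                            std::size_t B, std::size_t M, std::size_t N,
                            std::size_t K) {
  assert(a.size() == B * K * M && b.size() == B * K * N &&
         c.size() == B * M * N);
  for (std::size_t d0 = 0; d0 < B; ++d0)
    for (std::size_t d1 = 0; d1 < M; ++d1)
      for (std::size_t d2 = 0; d2 < N; ++d2)
        for (std::size_t d3 = 0; d3 < K; ++d3)
          c[(d0 * M + d1) * N + d2] +=
              a[(d0 * K + d3) * M + d1] * b[(d0 * K + d3) * N + d2];
}
```

With the shapes from the example (B = 2, M = 3, N = 7, K = 5), `a` corresponds to `memref<2x5x3xf32>`, `b` to `memref<2x5x7xf32>` and `c` to `memref<2x3x7xf32>`.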

RFCs and related PR:

https://discourse.llvm.org/t/rfc-linalg-opdsl-constant-list-attribute-definition/80149
https://discourse.llvm.org/t/rfc-op-explosion-in-linalg/82863
https://discourse.llvm.org/t/rfc-mlir-linalg-operation-tree/83586
https://github.com/llvm/llvm-project/pull/115319
2025-02-06 19:08:50 +00:00
Krzysztof Drewniak
f4e3b8783c [mlir][LLVM] Switch undef for poison for uninitialized values (#125629)
LLVM itself is generally moving away from using `undef` and towards
using `poison`, to the point of having a lint that catches new uses of
`undef` in tests.

In order to not trip the lint on new patterns and to conform to the
evolution of LLVM
- Rename various ::undef() methods on StructBuilder subclasses to
::poison()
- Audit the uses of UndefOp in the MLIR libraries and replace almost all
of them with PoisonOp

The remaining uses of `undef` are initializing `uninitialized` memrefs,
explicit conversions to undef from SPIR-V, and a few cases in
AMDGPUToROCDL where usage like

    %v = insertelement <M x iN> undef, iN %v, i32 0
    %arg = bitcast <M x iN> %v to i(M * N)

is used to handle "i32" arguments that are really packed vectors of
smaller types that won't always be fully initialized.
2025-02-06 12:49:30 -06:00
Krzysztof Drewniak
efd0a7f446 [mlir][ROCDL][~NFC] Migrate to LLVM dialect default builders (#125609)
There were a bunch of spots in ROCDL.td where we were defining our own
llvmBuilder call which could have been generated using the default
built-in one on LLVM_IntrOpBase.

This commit cleans up such usages in the interests of potentially
enabling ROCDL import in the future and of making best practices more
obvious.

The one breaking change is renaming WaitcntOp to SWaitcntOp, which
should have minimal impact.
2025-02-06 11:38:43 -06:00
Igor Wodiany
8609e27a58 [mlir][spirv] Add definition for ImageWriteOp (#124124)
This Pull Request adds OpImageWrite as defined in section 3.52.10.
(Image Instructions). The tests in
`mlir/test/Target/SPIRV/image-ops.mlir` are also updated (and extended
with the new op), so they now pass validation with `spirv-val` after
serialization into SPIR-V. The test was missing `ImageQuery` capability
and entry points. For entry points dummy `main` functions were added.
2025-02-06 09:25:08 -05:00
Matthias Springer
8c2b4aa5a0 [mlir][LLVM][NFC] Fix description of LLVMFixedVectorType (#126031)
2025-02-06 10:37:32 +01:00
Andrzej Warzyński
80fd902573 [mlir][tensor] Introduce TensorRelayoutOpInterface (#125823)
The newly introduced `TensorRelayoutOpInterface` is created specifically
for `tensor.pack` + `tensor.unpack`. Although the interface is
currently empty, it enables us to refactor the logic in
`FoldTensorCastProducerOp` within the Tensor dialect as follows:

```cpp
// OLD
// Reject tensor::PackOp - there's dedicated pattern for that instead.
if (!foldTensorCastPrecondition(op) ||
    isa<tensor::PackOp, tensor::UnPackOp>(*op))
  return failure();
```

is replaced with:

```cpp
// NEW
// Reject tensor::PackOp - there's dedicated pattern for that instead.
if (!foldTensorCastPrecondition(op) ||
    isa<tensor::RelayoutOpInterface>(*op))
  return failure();
```

This will be crucial once `tensor.pack` + `tensor.unpack` are replaced
with `linalg.pack` + `linalg.unpack` (i.e. moved to Linalg):
  * https://github.com/llvm/llvm-project/pull/123902,
  * https://discourse.llvm.org/t/rfc-move-tensor-pack-and-tensor-unpack-into-linalg/.

Note that the interface itself will later be moved to the Linalg
dialect. This decoupling ensures that the Tensor dialect does not
require an understanding of Linalg ops, thus keeping the dependency
lightweight.
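
The decoupling idea can be modelled in standalone C++ with an empty marker base class. The types below are hypothetical simplifications and not MLIR's real classes; the sketch only shows why a check against an interface keeps working when the concrete ops move elsewhere.

```cpp
#include <cassert>

// Hypothetical, heavily simplified op hierarchy (not MLIR's real classes).
struct Operation {
  virtual ~Operation() = default;
};

// Empty marker interface: it carries no methods, only identity.
struct RelayoutOpInterface {
  virtual ~RelayoutOpInterface() = default;
};

struct PackOp : Operation, RelayoutOpInterface {};
struct UnPackOp : Operation, RelayoutOpInterface {};
struct ExtractSliceOp : Operation {};

// The generic fold asks "is this a relayout op?" instead of naming PackOp and
// UnPackOp, so those ops can move to another dialect without changes here.
bool isRejectedByGenericFold(const Operation &op) {
  return dynamic_cast<const RelayoutOpInterface *>(&op) != nullptr;
}
```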

This PR is effectively a preparatory step for moving PackOp and UnpackOp
to Linalg. Once that's completed, most CMake changes from this PR will
be effectively reverted.
2025-02-06 09:18:13 +00:00
Bruno Cardoso Lopes
4fb96f203e [MLIR][LLVM] Implement LLVM dialect support for global aliases (#125295)
This includes support for module translation and module import, and adds tests for both.

Fix https://github.com/llvm/llvm-project/issues/115390
ClangIR cannot currently lower global aliases to LLVM because of missing support for this.
2025-02-05 18:19:36 -08:00
Ivan Butygin
6e52a12811 [mlir][vector] Create VectorToLLVMDialectInterface (#121440)
Create `VectorToLLVMDialectInterface` which allows automatic conversion
discovery by generic `--convert-to-llvm` pass. This only covers final
dialect conversion step and not any previous preparation steps. Also,
currently there is no way to pass any additional parameters through this
conversion interface, but most users use the default parameters anyway.
2025-02-05 23:21:25 +03:00
Nikhil Kalra
4e14b8afb4 [mlir] Python: Parse ModuleOp from file path (#125736)
For extremely large models, it may be inefficient to load the model into
memory in Python prior to passing it to the MLIR C APIs for
deserialization. This change adds an API to parse a ModuleOp directly
from a file path.
2025-02-05 11:48:37 -08:00
Guray Ozen
dd099e9cc2 [MLIR][NVVM] Fix links in OP definition (#125865)
2025-02-05 16:18:04 +01:00
Guray Ozen
baf27862dd [MLIR][NVGPU] Move max threads/blocks size to dialect (NFC) (#124454)
This PR moves the maximum number of threads in a block and blocks in a grid
to the nvgpu dialect to avoid replicated code.

The limits are defined here:

https://docs.nvidia.com/cuda/cuda-c-programming-guide/#features-and-technical-specifications-technical-specifications-per-compute-capability
2025-02-05 12:38:37 +01:00
Jack Frankland
f0b8ff1251 [mlir][tosa] Remove Quantization Attribute (#125479)
Removed the TOSA quantization attribute used in various MLIR TOSA
dialect operations in favour of using builtin attributes.

Update any lit tests, conversions and transformations appropriately.

Signed-off-by: Tai Ly <tai.ly@arm.com>
Co-authored-by: Tai Ly <tai.ly@arm.com>
2025-02-05 11:27:17 +00:00
Durgadoss R
4287c72404 [MLIR][NVVM] Add tcgen05 alloc/dealloc Ops (#125674)
PR #124961 adds intrinsics for the tcgen05
alloc/dealloc PTX instructions. This patch
adds NVVM Ops for the same.

Tests are added to verify the lowering to
the corresponding intrinsics in tcgen05-alloc.mlir file.

PTX ISA link:
https://docs.nvidia.com/cuda/parallel-thread-execution/#tcgen05-memory-alloc-manage-instructions

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-02-05 16:16:47 +05:30
Paul Carabas
12fff8db4b [mlir][LLVMIR] Add support for tan intrinsic op (#125748)
This patch adds support for Tan trig. function intrinsic in LLVM dialect
& adds missing import/export tests for Sin
2025-02-04 22:42:39 -06:00
Uday Bondhugula
05a09e6e55 [MLIR][Affine] Extend/generalize MDG to properly add edges between non-affine ops (#125451)
Drop arbitrary checks and hacks from affine fusion MDG construction and
handle all ops using memory read/write effects. This has been a long
pending change and it now makes affine fusion more powerful in the
presence of non-affine ops and does not limit fusion in parts of the
block where it is feasible simply because of non-affine ops elsewhere or
intervening non-affine users.

Populate memref read and write ops in non-affine region holding ops and
non-affine ops at the top level of the Block properly; add the
appropriate edges to MDG. Use memory read-write effects and drop
assumptions and special handling of ops due to historic reasons.

Update MDG to drop unnecessary "unhandled region" hack. This hack is no
longer needed with the update to fully and properly construct the MDG.

MDG edges now capture dependences between nodes completely. Drop
non-affine users check. With the MDG generalization to properly include
edges between non-affine nodes/operations, the non-affine users on path check
in fusion is no longer needed. Add more test cases to exercise MDG
generalization.

Drop unnecessary failure when encountering side-effect-free affine.if
ops.

Improve documentation on MDG.
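
As a rough mental model of effect-based edge construction, the standalone C++ sketch below summarizes every node by the memrefs it reads and writes and adds an edge whenever two nodes touch a common memref and at least one access is a write. The `Node` type and `buildEdges` function are hypothetical and far simpler than the actual MDG code.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node: any op (affine or not) summarized by its memory effects.
struct Node {
  std::set<std::string> reads;  // memrefs read
  std::set<std::string> writes; // memrefs written
};

static bool intersects(const std::set<std::string> &a,
                       const std::set<std::string> &b) {
  for (const auto &x : a)
    if (b.count(x))
      return true;
  return false;
}

// Add an edge (src, dst) in program order whenever two nodes access a common
// memref and at least one of the accesses is a write.
std::vector<std::pair<std::size_t, std::size_t>>
buildEdges(const std::vector<Node> &nodes) {
  std::vector<std::pair<std::size_t, std::size_t>> edges;
  for (std::size_t i = 0; i < nodes.size(); ++i)
    for (std::size_t j = i + 1; j < nodes.size(); ++j) {
      const Node &src = nodes[i];
      const Node &dst = nodes[j];
      if (intersects(src.writes, dst.reads) ||
          intersects(src.reads, dst.writes) ||
          intersects(src.writes, dst.writes))
        edges.emplace_back(i, j);
    }
  return edges;
}
```

Because the rule looks only at read and write sets, it treats affine and non-affine nodes uniformly, which is the spirit of the generalization described in this commit.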
2025-02-05 09:52:59 +05:30
Soren Lassen
c8ca486573 [MLIR] print/parse resource handle key quoted and escaped (#119746)
resource keys have the problem that you can’t parse them from mlir
assembly if they have special or non-printable characters, but nothing
prevents you from specifying such a key when you create e.g. a
DenseResourceElementsAttr, and it works fine in other ways, including
bytecode emission and parsing

this PR solves the parsing by quoting and escaping keys with special or
non-printable characters in mlir assembly, in the same way as symbols,
e.g.:
```
module attributes {
  fst = dense_resource<resource_fst> : tensor<2xf16>,
  snd = dense_resource<"resource\09snd"> : tensor<2xf16>
} {}

{-#
  dialect_resources: {
    builtin: {
      resource_fst: "0x0200000001000200",
      "resource\09snd": "0x0200000008000900"
    }
  }
#-}
```

by not quoting keys without special or non-printable characters, the
change is effectively backwards compatible
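
The quoting rule can be sketched in standalone C++ as follows. This is an assumption-laden illustration: the set of characters allowed in a bare key and the choice to hex-escape `"` and `\` are guesses made for the sketch, and `printResourceKey` is a hypothetical name, not the function used in MLIR.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Assumption for this sketch: a key may be printed bare when it is non-empty
// and consists only of letters, digits, '_', '.', '$' and '-'. The exact
// character set used by MLIR may differ.
static bool isBareKey(const std::string &key) {
  if (key.empty())
    return false;
  for (unsigned char c : key)
    if (!std::isalnum(c) && c != '_' && c != '.' && c != '$' && c != '-')
      return false;
  return true;
}

// Print a key bare when possible; otherwise quote it and write every
// non-printable character, '"' and '\' as a backslash plus two hex digits.
std::string printResourceKey(const std::string &key) {
  if (isBareKey(key))
    return key;
  static const char hex[] = "0123456789ABCDEF";
  std::string out = "\"";
  for (unsigned char c : key) {
    if (std::isprint(c) && c != '"' && c != '\\') {
      out += static_cast<char>(c);
    } else {
      out += '\\';
      out += hex[c >> 4];
      out += hex[c & 0xF];
    }
  }
  out += '"';
  return out;
}
```

Under these assumptions, `resource_fst` is printed unchanged, while a key containing a tab character is printed as `"resource\09snd"`, matching the form shown in the example above.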

the change is tested by:
1. adding a test with a dense resource handle key with special
characters to `dense-resource-elements-attr.mlir`
2. adding special and unprintable characters to some resource keys in
the existing lit tests `pretty-resources-print.mlir` and
`mlir/test/Bytecode/resources.mlir`
2025-02-04 13:49:15 -07:00
Corbin Robeck
6f35a9e7c5 [MLIR][ROCDL] Add Scale Convert Packed FP8 <-> F32 Support for GFX950 (#125564)
Add Rocdl support for the following GFX950 instructions:

CVT_SCALE_PK_FP8_F32
CVT_SCALE_PK_BF8_F32
CVT_SCALE_SR_FP8_F32
CVT_SCALE_SR_BF8_F32
CVT_SCALE_PK_F32_FP8
CVT_SCALE_PK_F32_BF8
CVT_SCALE_F32_FP8
CVT_SCALE_F32_BF8
2025-02-04 13:21:59 -05:00
Razvan Lupusoru
bd30838422 [flang][acc] Improve acc lowering around fir.box and arrays (#125600)
The current implementation of OpenACC lowering includes explicit
expansion of the following cases:
- Creation of `acc.bounds` operations for all arrays, including those
whose dimensions are captured in the type (eg `!fir.array<100xf32>`)
- Expansion of box types by only putting the box's address in the data
clause. The address was extracted with a `fir.box_addr` operation and
the bounds were filled with `fir.box_dims` operation.

However, with the creation of the new type interface `MappableType`, the
idea is that specific type-based semantics can now be used. This also
really simplifies representation in the IR. Consider the following
example:
```
subroutine sub(arr)
  real :: arr(:)
  !$acc enter data copyin(arr)
end subroutine
```

Before the current PR, the relevant acc dialect IR looked like:
```
func.func @_QPsub(%arg0: !fir.box<!fir.array<?xf32>> {fir.bindc_name =
"arr"}) {
  ...
  %1:2 = hlfir.declare %arg0 dummy_scope %0 {uniq_name = "_QFsubEarr"} :
(!fir.box<!fir.array<?xf32>>, !fir.dscope) ->
(!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>)
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %2:3 = fir.box_dims %1#0, %c0 : (!fir.box<!fir.array<?xf32>>, index)
-> (index, index, index)
  %c0_0 = arith.constant 0 : index
  %3 = arith.subi %2#1, %c1 : index
  %4 = acc.bounds lowerbound(%c0_0 : index) upperbound(%3 : index)
extent(%2#1 : index) stride(%2#2 : index) startIdx(%c1 : index)
{strideInBytes = true}
  %5 = fir.box_addr %1#0 : (!fir.box<!fir.array<?xf32>>) ->
!fir.ref<!fir.array<?xf32>>
  %6 = acc.copyin varPtr(%5 : !fir.ref<!fir.array<?xf32>>) bounds(%4) ->
!fir.ref<!fir.array<?xf32>> {name = "arr", structured = false}
  acc.enter_data dataOperands(%6 : !fir.ref<!fir.array<?xf32>>)
```

After the current change, it looks like:
```
func.func @_QPsub(%arg0: !fir.box<!fir.array<?xf32>> {fir.bindc_name =
"arr"}) {
  ...
  %1:2 = hlfir.declare %arg0 dummy_scope %0 {uniq_name = "_QFsubEarr"} :
(!fir.box<!fir.array<?xf32>>, !fir.dscope) ->
(!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>)
  %2 = acc.copyin var(%1#0 : !fir.box<!fir.array<?xf32>>) ->
!fir.box<!fir.array<?xf32>> {name = "arr", structured = false}
  acc.enter_data dataOperands(%2 : !fir.box<!fir.array<?xf32>>)
```

Restoring the old behavior can be done with the following command line
options:
`--openacc-unwrap-fir-box=true --openacc-generate-default-bounds=true`
2025-02-04 08:08:16 -08:00