Commit Graph

814 Commits

Author SHA1 Message Date
Billy Zhu
6f6336858e [MLIR][LLVM] Add DebugNameTableKind to DICompileUnit (#87974)
Add the DebugNameTableKind field to DICompileUnit, along with its
importer & exporter.
2024-04-09 06:18:07 -07:00
Jie Fu
2abd71ec51 [mlir] Fix -Wunused-variable in DebugImporter.cpp (NFC)
llvm-project/mlir/lib/Target/LLVMIR/DebugImporter.cpp:377:10:
error: unused variable '[_, inserted]' [-Werror,-Wunused-variable]
    auto [_, inserted] = dependentCache.try_emplace(
         ^
1 error generated.
2024-04-08 18:22:06 +08:00
Billy Zhu
81a7b6454e [MLIR][LLVM] Recursion importer handle repeated self-references (#87295)
Followup to this discussion:
https://github.com/llvm/llvm-project/pull/80251#discussion_r1535599920.

The previous debug importer was correct but inefficient. For cases with
mutual recursion that contain more than one back-edge, each back-edge
would result in a new translated instance. This is because the previous
implementation never caches any translated result with unbounded
self-references. This means all translation inside a recursive context
is performed from scratch, which will incur repeated run-time cost as
well as repeated attribute sub-trees in the translated IR (differing
only in their `recId`s).

This PR refactors the importer to handle caching inside a recursive
context.
- In the presence of unbound self-refs, the translation result is cached
in a separate cache that keeps track of the set of dependent unbound
self-refs.
- A dependent cache entry is valid only when all the unbound self-refs
are in scope. Whenever a cached entry goes out of scope, it will be
removed the next time it is looked up.
2024-04-08 01:09:54 -07:00
Fabian Mora
a2c4b7c8e2 [mlir] Add convertInstruction and getSupportedInstructions to LLVMImportInterface (#86799)
This patch adds the `convertInstruction` and `getSupportedInstructions`
to `LLVMImportInterface`, allowing any non-LLVM dialect to specify how
to import LLVM IR instructions and overriding the default import of LLVM instructions.
2024-04-07 08:46:21 +02:00
Jan Leyonberg
9708d09003 [MLIR][OpenMP] Skip host omp ops when compiling for the target device (#85239)
This patch separates the lowering dispatch for host and target devices.
For the target device, if the current operation is not a top-level
operation (e.g. omp.target) or is inside a target device code region it
will be ignored, since it belongs to the host code.


This is an alternative approach to #84611, the new test in this PR was
taken from there.
2024-04-05 09:25:28 -04:00
Tom Eccles
cc34ad91f0 [MLIR][OpenMP] Add cleanup region to omp.declare_reduction (#87377)
Currently, by-ref reductions will allocate the per-thread reduction
variable in the initialization region. Adding a cleanup region allows
that allocation to be undone. This will allow flang to support reduction
of arrays stored on the heap.

This conflation of allocation and initialization in the initialization
should be fixed in the future to better match the OpenMP standard, but
that is beyond the scope of this patch.
2024-04-04 11:19:42 +01:00
Tom Eccles
099ecdf1ec [mlir][OpenMP] map argument to reduction initialization region (#86979)
The argument to the initialization region of reduction declarations was
never mapped. This meant that if this argument was accessed inside the
initialization region, that mlir operation would be translated to an
llvm operation with a null argument (failing verification).

Adding the mapping ensures that the right LLVM value can be found when
inlining and converting the initialization region.

We have to separately establish and clean up these mappings for each use
of the reduction declaration because repeated usage of the same
declaration will inline it using a different concrete value for the
block argument.

This argument was never used previously because for most cases the
initialized value depends only upon the type of the reduction, not on
the original variable. It is needed now so that we can read the array
extents for the local copy from the mold.

Flang support for reductions on assumed shape arrays patch 2/3
2024-04-04 10:55:42 +01:00
Tom Eccles
5334b31e7c [mlir][OpenMP][NFC] Use SmallVectorImpl for function arguments (#86978) 2024-04-04 10:46:45 +01:00
Victor Perez
77cbc9bf60 [MLIR][LLVM] Add llvm.experimental.constrained.fptrunc operation (#86260)
Add operation mapping to the LLVM
`llvm.experimental.constrained.fptrunc.*` intrinsic.

The new operation implements the new
`LLVM::FPExceptionBehaviorOpInterface` and
`LLVM::RoundingModeOpInterface` interfaces.

---------

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
2024-03-26 11:02:50 +01:00
Tobias Gysi
aeeb7d566c [MLIR][LLVM] Make subprogram flags optional (#86433)
This revision makes the subprogramFlags field in the DISubprogrammAttr
optional. This is necessary since the DISubprogram attached to a
declaration may have none of the subprogram flags set.
2024-03-25 12:19:22 +01:00
agozillon
8612fa0d84 [MLIR][OpenMP] Refactor bounds offsetting and fix to apply to all directives (#84349)
This PR refactors bounds offsetting by combining the two differing
implementations (one applying to initial derived type member map
implementation for descriptors and the other for regular arrays,
effectively allocatable array vs regular array in fortran) now that it's
a little simpler to do.

The PR also moves the utilization of createAlteredByCaptureMap into
genMapInfoOp, where it will be correctly applied to all MapInfoData,
appropriately offsetting and altering Pointer data set in the kernel
argument structure on the host. This primarily means bounds offsets will
now correctly apply to enter/exit/update map clauses as opposed to just
the Target directive that is currently the case. A few fortran runtime
tests have been added to verify this new behavior.

This PR depends on: https://github.com/llvm/llvm-project/pull/84328 and
is an extraction of the larger derived type member map PR stack (so a
requirement for it to land).
2024-03-22 15:32:39 +01:00
Tobias Gysi
adda597388 [MLIR] Add index bitwidth to the DataLayout (#85927)
When importing from LLVM IR the data layout of all pointer types
contains an index bitwidth that should be used for index computations.
This revision adds a getter to the DataLayout that provides access to
the already stored bitwidth. The function returns an optional since only
pointer-like types have an index bitwidth. Querying the bitwidth of a
non-pointer type returns std::nullopt.

The new function works for the built-in Index type and, using a type
interface, for the LLVMPointerType.
2024-03-21 09:07:57 +01:00
Christian Ulmann
4095a326c0 [MLIR][LLVM] Add extraData field to the DIDerivedType attribute (#85935)
This commit extends the DIDerivedTypeAttr with the `extraData` field.
For now, the type of it is limited to be a `DINodeAttr`, as extending
the debug metadata handling to support arbitrary metadata nodes does not
seem to be necessary so far.
2024-03-20 16:08:38 +01:00
Sergio Afonso
d84252e064 [MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393)
This patch proposes the renaming of certain OpenMP dialect operations with the
goal of improving readability and following a uniform naming convention for
MLIR operations and associated classes. In particular, the following operations
are renamed:

- `omp.map_info` -> `omp.map.info`
- `omp.target_update_data` -> `omp.target_update`
- `omp.ordered_region` -> `omp.ordered.region`
- `omp.cancellationpoint` -> `omp.cancellation_point`
- `omp.bounds` -> `omp.map.bounds`
- `omp.reduction.declare` -> `omp.declare_reduction`

Also, the following MLIR operation classes have been renamed:

- `omp::TaskLoopOp` -> `omp::TaskloopOp`
- `omp::TaskGroupOp` -> `omp::TaskgroupOp`
- `omp::DataBoundsOp` -> `omp::MapBoundsOp`
- `omp::DataOp` -> `omp::TargetDataOp`
- `omp::EnterDataOp` -> `omp::TargetEnterDataOp`
- `omp::ExitDataOp` -> `omp::TargetExitDataOp`
- `omp::UpdateDataOp` -> `omp::TargetUpdateOp`
- `omp::ReductionDeclareOp` -> `omp::DeclareReductionOp`
- `omp::WsLoopOp` -> `omp::WsloopOp`
2024-03-20 11:19:38 +00:00
Tom Eccles
27534d69e2 [mlir][LLVM] erase call mappings in forgetMapping() (#84955)
It looks like the mappings for call instructions were forgotten here.
This fixes a bug in OpenMP when in-lining a region containing call
operations multiple times.

OpenMP array reductions 4/6
Previous PR: https://github.com/llvm/llvm-project/pull/84954
Next PR: https://github.com/llvm/llvm-project/pull/84957
2024-03-20 10:03:26 +00:00
Billy Zhu
c242302492 [MLIR][LLVM] DI Recursive Type fix for recursion via scope of composites (#85850)
Fixes this bug for the previous recursive DI type PR:
https://github.com/llvm/llvm-project/pull/80251#issuecomment-2007254788.

Drawing inspiration from how clang uses DIBuilder to build forward
decls, this PR changes how placeholders are created & updated. Instead
of requiring each recursive DIType to do in-place mutation, we simply
ask for a temporary node as the placeholder, and run RAUW at the end
when the concrete node is translated.

This has the side effect of simplifying what's needed to add recursion
support for a type. Now only one additional method needs to be created
for exporting. Concretely, for this PR, `translateImpl` for
DICompositeType is back to the state it was before the previous PR, and
the only net addition for DICompositeType is `translateTemporaryImpl`.

---------

Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
2024-03-19 22:49:50 -07:00
Daniil Kovalev
924a1dceb5 [Dwarf] Support __ptrauth qualifier in metadata nodes (#83862)
Reland #82363 after fixing build failure
https://lab.llvm.org/buildbot/#/builders/5/builds/41428.

Memory sanitizer detects usage of `RawData` union member which is not
filled directly. Instead, the code relies on filling `Data` union
member, which is a struct consisting of signing schema parameters.

According to https://en.cppreference.com/w/cpp/language/union, this is
UB:
"It is undefined behavior to read from the member of the union that
wasn't most recently written".

Instead of relying on compiler allowing us to do dirty things, do not
use union and only store `RawData`. Particular ptrauth parameters are
obtained on demand via bit operations.

Original PR description below.

Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR
with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type
which has the qualifier applied, and the following parameters
representing the signing schema:

- `ptrAuthKey` (integer)
- `ptrAuthIsAddressDiscriminated` (boolean)
- `ptrAuthExtraDiscriminator` (integer)
- `ptrAuthIsaPointer` (boolean)
- `ptrAuthAuthenticatesNullValues` (boolean)

Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-03-19 09:13:17 +03:00
Bixia Zheng
0ead2bd91b [MLIR][LLVM] Suppress unused variable warning. (#85467) 2024-03-15 17:18:02 -07:00
Jie Fu
83afcbf5f3 [mlir] Fix -Wunused-variable in DebugTranslation.cpp (NFC)
llvm-project/mlir/lib/Target/LLVMIR/DebugTranslation.cpp:226:10:
error: unused variable '[iter, inserted]' [-Werror,-Wunused-variable]
    auto [iter, inserted] =
         ^
1 error generated.
2024-03-16 07:22:34 +08:00
Billy Zhu
1e8dad3bef [MLIR][LLVM] Support Recursive DITypes (#80251)
Following the discussion from [this
thread](https://discourse.llvm.org/t/handling-cyclic-dependencies-in-debug-info/67526/11),
this PR adds support for recursive DITypes.

This PR adds:
1. DIRecursiveTypeAttrInterface: An interface that DITypeAttrs can
implement to indicate that it supports recursion. See full description
in code.
2. Importer & exporter support (The only DITypeAttr that implements the
interface is DICompositeTypeAttr, so the exporter is only implemented
for composites too. There will be two methods that each llvm DI type
that supports mutation needs to implement since there's nothing
general).

---------

Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
2024-03-15 09:58:25 -07:00
Tom Eccles
f46f5a01f4 [flang][OpenMP][OMPIRBuilder][mlir] Optionally pass reduction vars by ref (#84304)
Previously reduction variables were always passed by value into and out
of the initialization and combiner regions of the OpenMP reduction
declare operation.

This worked well for reductions of primitive types (and might perform
better than passing by reference). But passing by reference will be
useful for array and derived type reductions (e.g. to move allocation
inside of the init region).

Passing reductions by reference requires different LLVM-IR generation
when lowering from MLIR because some of the loads/stores/allocations
will now be moved inside of the init and combiner regions. This
alternate code generation is requested using a new attribute to
omp.wsloop and omp.parallel.

Existing lowerings from mlir are unaffected (these will continue to use
the by-value argument passing.

Flang will continue to pass by-value argument passing for trivial types
unless a (hidden) command line argument is supplied. Non-trivial types
will always use the by-ref lowering.

Array reductions are not ready yet (but are coming very soon). In the
meantime, this is tested by forcing existing reductions to use by-ref.

Commit series for by-ref OpenMP reductions 3/3

---------

Co-authored-by: Mats Petersson <mats.petersson@arm.com>
2024-03-13 14:51:09 +00:00
Kareem Ergawy
5c54f72901 [MLIR][OpenMP] Extend omp.private materialization support: firstprivate (#82164)
Extends current support for delayed privatization during translation to
LLVM IR. This adds support for one-block `firstprivate` `omp.private`
ops.
2024-03-04 12:28:30 +01:00
Daniil Kovalev
bf08d02868 Revert "[Dwarf] Support __ptrauth qualifier in metadata nodes" (#83672)
Reverts llvm/llvm-project#82363

See a build failure related to an issue discovered by memory sanitizer
(use of uninitialized value):
https://lab.llvm.org/buildbot/#/builders/37/builds/31965
2024-03-02 14:48:46 +03:00
Daniil Kovalev
8f65e7b917 [Dwarf] Support __ptrauth qualifier in metadata nodes (#82363)
Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR
with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type
which has the qualifier applied, and the following parameters
representing the signing schema:

- `ptrAuthKey` (integer)
- `ptrAuthIsAddressDiscriminated` (boolean)
- `ptrAuthExtraDiscriminator` (integer)
- `ptrAuthIsaPointer` (boolean)
- `ptrAuthAuthenticatesNullValues` (boolean)

Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-03-01 19:48:08 +03:00
Leandro Lupori
64422cf826 [llvm][mlir][OMPIRBuilder] Translate omp.single's copyprivate (#80488)
Use the new copyprivate list from omp.single to emit calls to
__kmpc_copyprivate, during the creation of the single operation
in OMPIRBuilder.

This is patch 4 of 4, to add support for COPYPRIVATE in Flang.
Original PR: https://github.com/llvm/llvm-project/pull/73128
2024-02-28 13:33:42 -03:00
Tobias Gysi
e39e30e952 [mlir][llvm] Fix access group translation (#83257)
This commit fixes the translation of access group metadata to LLVM IR.
Previously, it did not use a temporary metadata node to model the
placeholder of the self-referencing access group nodes. This is
dangerous since, the translation may produce a metadata list with a null
entry that is later on changed changed with a self reference. At the
same time, for example the debug info translation may create the same
uniqued node, which after setting the self-reference the suddenly
references the access group metadata. The commit avoids such breakages.
2024-02-28 14:45:18 +01:00
Kareem Ergawy
9d56be010c [MLIR][OpenMP] Support basic materialization for omp.private ops (#81715)
Adds basic support for materializing delayed privatization. So far, the
restrictions on the implementation are:
- Only `private` clauses are supported (`firstprivate` support will be
  added in a later PR).
2024-02-28 05:00:07 +01:00
Krzysztof Drewniak
563f414e04 [mlir][AMDGPU] Set uniform-work-group-size=true by default (#79077)
GPU kernels generated via typical MLIR mechanisms make the assumption
that all workgroups are of uniform size, and so, as in OpenMP, it is
appropriate to set the "uniform-work-group-size"="true" attribute on
these functions by default. This commit makes that choice.

In the event it is needed, this commit adds
`rocdl.uniform_work_group_size` as an attribute to be set on LLVM
functions that can be used to override the default.

In addition, add proper failure messages to translation
2024-02-27 12:35:48 -06:00
Jie Fu
ef7417fbc5 [mlir] Fix -Wunused-but-set-variable in ModuleTranslation.cpp (NFC)
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp:1050:11:
error: variable 'numConstantsHit' set but not used [-Werror,-Wunused-but-set-variable]
      int numConstantsHit = 0;
          ^
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp:1051:11:
error: variable 'numConstantsErased' set but not used [-Werror,-Wunused-but-set-variable]
      int numConstantsErased = 0;
          ^
2 errors generated.
2024-02-27 10:37:17 +08:00
Xiang Li
c11627c2f4 [MLIR][LLVM] Fix memory explosion when converting global variable bodies in ModuleTranslation (#82708)
There is memory explosion when converting the body or initializer region
of a large global variable, e.g. a constant array.

For example, when translating a constant array of 100000 strings:

llvm.mlir.global internal constant @cats_strings() {addr_space = 0 :
i32, alignment = 16 : i64} : !llvm.array<100000 x ptr<i8>> {
    %0 = llvm.mlir.undef : !llvm.array<100000 x ptr<i8>>
    %1 = llvm.mlir.addressof @om_1 : !llvm.ptr<array<1 x i8>>
%2 = llvm.getelementptr %1[0, 0] : (!llvm.ptr<array<1 x i8>>) ->
!llvm.ptr<i8>
    %3 = llvm.insertvalue %2, %0[0] : !llvm.array<100000 x ptr<i8>>
    %4 = llvm.mlir.addressof @om_2 : !llvm.ptr<array<1 x i8>>
%5 = llvm.getelementptr %4[0, 0] : (!llvm.ptr<array<1 x i8>>) ->
!llvm.ptr<i8>
    %6 = llvm.insertvalue %5, %3[1] : !llvm.array<100000 x ptr<i8>>
    %7 = llvm.mlir.addressof @om_3 : !llvm.ptr<array<1 x i8>>
%8 = llvm.getelementptr %7[0, 0] : (!llvm.ptr<array<1 x i8>>) ->
!llvm.ptr<i8>
    %9 = llvm.insertvalue %8, %6[2] : !llvm.array<100000 x ptr<i8>>
    %10 = llvm.mlir.addressof @om_4 : !llvm.ptr<array<1 x i8>>
%11 = llvm.getelementptr %10[0, 0] : (!llvm.ptr<array<1 x i8>>) ->
!llvm.ptr<i8>
    %12 = llvm.insertvalue %11, %9[3] : !llvm.array<100000 x ptr<i8>>

    ... (ignore the remaining part)
}

where @om_1, @om_2, ... are string global constants.

Each time an operation is converted to LLVM, a new constant is created.
When it comes to llvm.insertvalue, a new constant array of 100000
elements is created and the old constant array (input) is not destroyed.
This causes memory explosion. We observed that, on a system with 128 GB
memory, the translation of 100000 elements got killed due to using up
all the memory. On a system with 64 GB, 65536 elements was enough to
cause the translation killed.

There is a previous patch (https://reviews.llvm.org/D148487) which fix
this issue but was reverted for
https://github.com/llvm/llvm-project/issues/62802

The old patch checks generated constants and destroyed them if there is
no use. But the check of use for the constant is too early, which cause
the constant be removed before use.


This new patch added a map was added a map to save expected use count
for a constant. Then decrease when reach each use.
And only erase the constant when the use count reach to zero

With new patch, the repro in
https://github.com/llvm/llvm-project/issues/62802 finished correctly.
2024-02-26 21:15:12 -05:00
Tobias Gysi
335d34d9ea [MLIR][LLVM] Fix debug intrinsic import (#82637)
This revision handles the case that the translation of a scope fails due
to cyclic metadata. This mainly affects the import of debug intrinsics
that indirectly take such a scope as metadata argument (e.g. via local
variable or label metadata). This commit ensures we drop intrinsics with
such a dependency on cyclic metadata.
2024-02-23 10:30:19 +01:00
Joseph Huber
cc374d8056 [OpenMP] Remove register_requires global constructor (#80460)
Summary:
Currently, OpenMP handles the `omp requires` clause by emitting a global
constructor into the runtime for every translation unit that requires
it. However, this is not a great solution because it prevents us from
having a defined order in which the runtime is accessed and used.

This patch changes the approach to no longer use global constructors,
but to instead group the flag with the other offloading entires that we
already handle. This has the effect of still registering each flag per
requires TU, but now we have a single constructor that handles
everything.

This function removes support for the old `__tgt_register_requires` and
replaces it with a warning message. We just had a recent release, and
the OpenMP policy for the past four releases since we switched to LLVM
is that we do not provide strict backwards compatibility between major
LLVM releases now that the library is versioned. This means that a user
will need to recompile if they have an old binary that relied on
`register_requires` having the old behavior. It is important that we
actively deprecate this, as otherwise it would not solve the problem of
having no defined init and shutdown order for `libomptarget`. The
problem of `libomptarget` not having a define init and shutdown order
cascades into a lot of other issues so I have a strong incentive to be
rid of it.

It is worth noting that the current `__tgt_offload_entry` only has space
for a 32-bit integer here. I am planning to overhaul these at some point
as well.
2024-02-21 11:33:32 -06:00
Mehdi Amini
45c226d452 [MLIR] Add ODS support for generating helpers for dialect (discardable) attributes (#77024)
This is a new ODS feature that allows dialects to define a list of
key/value pair representing an attribute type and a name.
This will generate helper classes on the dialect to be able to
manage discardable attributes on operations in a type safe way.

For example the `test` dialect can define:

```
  let discardableAttrs = (ins
     "mlir::IntegerAttr":$discardable_attr_key,
  );
```

And the following will be generated in the TestDialect class:

```
   /// Helper to manage the discardable attribute `discardable_attr_key`.
    class DiscardableAttrKeyAttrHelper {
      ::mlir::StringAttr name;
    public:
      static constexpr ::llvm::StringLiteral getNameStr() {
        return "test.discardable_attr_key";
      }
      constexpr ::mlir::StringAttr getName() {
        return name;
      }

      DiscardableAttrKeyAttrHelper(::mlir::MLIRContext *ctx)
        : name(::mlir::StringAttr::get(ctx, getNameStr())) {}

     mlir::IntegerAttr getAttr(::mlir::Operation *op) {
       return op->getAttrOfType<mlir::IntegerAttr>(name);
     }
     void setAttr(::mlir::Operation *op, mlir::IntegerAttr val) {
       op->setAttr(name, val);
     }
     bool isAttrPresent(::mlir::Operation *op) {
       return op->hasAttrOfType<mlir::IntegerAttr>(name);
     }
     void removeAttr(::mlir::Operation *op) {
       assert(op->hasAttrOfType<mlir::IntegerAttr>(name));
       op->removeAttr(name);
     }
   };
   DiscardableAttrKeyAttrHelper getDiscardableAttrKeyAttrHelper() {
     return discardableAttrKeyAttrName;
   }
```

User code having an instance of the TestDialect can then manipulate this
attribute on operation using:

```
  auto helper = testDialect.getDiscardableAttrKeyAttrHelper();

  helper.setAttr(op, value);
  helper.isAttrPresent(op);
  ...
```
2024-02-19 23:30:03 -08:00
Kazu Hirata
f5cc961240 [mlir] Fix a warning
This patch fixes:

  mlir/lib/Target/LLVMIR/AttrKindDetail.h:65:1: error: unused function
  'getAttrNameToKindMapping' [-Werror,-Wunused-function]
2024-02-13 11:41:32 -08:00
David Truby
be9f8ffd81 [mlir][flang][openmp] Rework wsloop reduction operations (#80019)
This patch reworks the way that wsloop reduction operations function to
better match the expected semantics from the OpenMP specification,
following the rework of parallel reductions.

The new semantics create a private reduction variable as a block
argument which should be used normally for all operations on that
variable in the region; this private variable is then combined with the
others into the shared variable. This way no special omp.reduction
operations are needed inside the region. These block arguments follow
the loop control block arguments.

---------

Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
2024-02-13 19:13:54 +00:00
Rishi Surendran
fa6850a998 [mlir][nvvm]Add support for grid_constant attribute on LLVM function arguments (#78228)
Add support for attribute nvvm.grid_constant on LLVM function arguments.
The attribute can be attached only to arguments of type llvm.ptr that
have llvm.byval attribute.
Generate LLVM metadata for functions with nvvm.grid_constant arguments.
The metadata node is a list of integers, where each integer n denotes
that the nth parameter has the
grid_constant annotation (numbering from 1). The generated metadata node
will be handled by NVVM compiler. See
https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html#supported-properties
for documentation on grid_constant property.

This patch also adds convertParameterAttr to
LLVMTranslationDialectInterface for supporting the translation of
derived dialect attributes on function parameters 
2024-02-12 13:16:59 -08:00
David Truby
9ecf4d20bb [mlir][flang][openmp] Rework parallel reduction operations (#79308)
This patch reworks the way that parallel reduction operations function
to better match the expected semantics from the OpenMP specification.
Previously specific omp.reduction operations were used inside the
region, meaning that the reduction only applied when the correct
operation was used, whereas the specification states that any change to
the variable inside the region should be taken into account for the
reduction.

The new semantics create a private reduction variable as a block
argument which should be used normally for all operations on that
variable in the region; this private variable is then combined with the
others into the shared variable. This way no special omp.reduction
operations are needed inside the region.

This patch only makes the change for the `parallel` operation, the
change for the `wsloop` operation will be in a separate patch.

---------

Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
2024-02-12 17:19:49 +00:00
Kolya Panchenko
9f6c00565a [MLIR][VCIX] Support VCIX intrinsics in LLVMIR dialect (#75875)
The changeset extends LLVMIR intrinsics with VCIX intrinsics.
The VCIX intrinsics allow MLIR users to interact with RISC-V
co-processors that are compatible with `XSfvcp` extension

Source:
https://www.sifive.com/document-file/sifive-vector-coprocessor-interface-vcix-software
2024-02-07 15:23:28 -05:00
agozillon
95fe47ca7e [Flang][OpenMP] Initial mapping of Fortran pointers and allocatables for target devices (#71766)
This patch seeks to add an initial lowering for pointers and allocatable variables 
captured by implicit and explicit map in Flang OpenMP for Target operations that 
take map clauses e.g. Target, Target Update. Target Exit/Enter etc.

Currently this is done by treating the type that lowers to a descriptor 
(allocatable/pointer/assumed shape) as a map of a record type (e.g. a structure) as that's
effectively what descriptor types lower to in LLVM-IR and what they're represented as
in the Fortran runtime (written in C/C++). The descriptor effectively lowers to a structure
containing scalar and array elements that represent various aspects of the underlying
data being mapped (lower bound, upper bound, extent being the main ones of interest
in most cases) and a pointer to the allocated data. In this current iteration of the mapping
we map the structure in it's entirety and then attach the underlying data pointer and map
the data to the device, this allows most of the required data to be resident on the device
for use. Currently we do not support the addendum (another block of pointer data), but
it shouldn't be too difficult to extend this to support it.

The MapInfoOp generation for descriptor types is primarily handled in an optimization
pass, where it expands BoxType (descriptor types) map captures into two maps, one for
the structure (scalar elements) and the other for the pointer data (base address) and
links them in a Parent <-> Child relationship. The later lowering processes will then treat
them as a conjoined structure with a pointer member map.
2024-02-05 18:45:07 +01:00
Sergio Afonso
92bbf615f5 [Flang][MLIR][OpenMP] Use function-attached target attributes for OpenMP lowering (#78291)
This patch removes the omp.target module attribute, since the
information it held on the target CPU and features is available through
the fir.target_cpu and fir.target_features module attributes. Target
outlining during the MLIR to LLVM IR translation stage is updated, so
that these attributes, at that point available as llvm.func attributes,
are passed along to the newly created function.
2024-02-02 13:16:36 +00:00
Sander de Smalen
d313614b60 [AArch64] Replace LLVM IR function attributes for PSTATE.ZA. (#79166)
Since https://github.com/ARM-software/acle/pull/276 the ACLE
defines attributes to better describe the use of a given SME state.

Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.

For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0

We have now done the same for ZA, such that we add:
* aarch64_new_za       (previously `aarch64_pstate_za_new`)
* aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za (more specific variation of
`aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared,
aarch64_pstate_za_preserved`)

This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA"
(sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
2024-02-01 13:37:37 +00:00
Alex Bradbury
748c295908 [MLIR][LLVM] Add fast-math related function attribute support (#79812)
Adds unsafe-fp-math, no-infs-fp-math, no-nans-fp-math,
approx-func-fp-math, and no-signed-zeros-fp-math function attributes.

This allows code generators using the LLVMIR dialect to match the
codegen of Clang.
2024-01-30 14:03:51 +00:00
Kunwar Grover
9261ab708e [mlir][Target] Teach dense_resource conversion to LLVMIR Target (#78958)
This patch adds support for translating dense_resource attributes to
LLVMIR Target.
The support added is similar to how DenseElementsAttr is handled, except
we
don't need to handle splats.

Another possible way of doing this is adding iteration on
dense_resource, but that is
non-trivial as DenseResourceAttr is not meant to be something you should
directly
access. It has subclasses which you are supposed to use to iterate on
it.
2024-01-23 13:30:34 -08:00
Saiyedul Islam
d2398cca6f Restore: [mlir][ROCDL] Stop setting amdgpu-implicitarg-num-bytes (#79129)
This patch restores PR#78498
2024-01-23 18:48:39 +05:30
Saiyedul Islam
082f87c9d4 [AMDGPU] Change default AMDHSA Code Object version to 5 (#79038)
Also update LIT tests and docs.
For more details, see
https://llvm.org/docs/AMDGPUUsage.html#code-object-v5-metadata

Corresponding llvm-objdump AMDGPU lit tests are updated
in a follow-up PR.
2024-01-23 17:08:18 +05:30
Kareem Ergawy
5dbb30d950 [MLIR][OpenMP] Better error reporting for unsupported nowait (#78551)
Provides some context for failing to generate LLVM IR for `target
enter|exit|update` directives when `nowait` is provided. This is
directly helpful for flang users since they would get this error message
if they tried to use `nowait`. Before that we had a very generic
message.

This is a follow-up to https://github.com/llvm/llvm-project/pull/78269,
please only review the latest commit (the one with the same commit
message as the PR title).
2024-01-19 16:47:24 +01:00
Tobias Gysi
9dd0eb9c9c [mlir][llvm] Drop unreachable basic block during import (#78467)
This revision updates the LLVM IR import to support unreachable basic
blocks. An unreachable block may dominate itself and a value defined
inside the block may thus be used before its definition. The import does
not support such dependencies. We thus delete the unreachable basic
blocks before the import. This is possible since MLIR does not have
basic block labels that can be reached using an indirect call and
unreachable blocks can indeed be deleted safely.

Additionally, add a small poison constant import test.
2024-01-19 11:10:57 +01:00
Krzysztof Drewniak
aac23b08e3 [mlir][ROCDL] Stop setting amdgpu-implicitarg-num-bytes (#78498)
Clang stopped doing this late 2021 back in 33315ef321, and no other
frontent does this, so stop doing it.
2024-01-18 09:43:46 -06:00
Sergio Afonso
2747193058 [Flang][MLIR][OpenMP] Remove the early outlining interface (#78450)
After the removal of the OpenMP early outlining MLIR pass in #67319, the
`EarlyOutliningInterface` stopped doing any useful work. It used to be
necessary to tie the name of the function from which a target region was
outlined to that new function, so it would be used when translating to
LLVM IR in place of the outlined function's name.

This is not necessary anymore, so this patch removes all references to
this interface and uses of the `omp.outline_parent_name` discardable
attribute in tests.
2024-01-18 15:33:43 +00:00
Sergio Afonso
8fb685fb7e [MLIR][LLVM] Add explicit target_cpu attribute to llvm.func (#78287)
This patch adds the target_cpu attribute to llvm.func MLIR operations
and updates the translation to/from LLVM IR to match "target-cpu"
function attributes.
2024-01-17 14:55:02 +00:00