Commit Graph

893 Commits

Author SHA1 Message Date
Peter Klausler
7463b46a34 [flang][runtime] Fix use of empty optional in BOZ input (#120789)
Slava reported a valgrind result showing the use of uninitialized data
due to an unconditional dereference of an optional in BOZ formatted
input editing; fix.
2025-01-08 13:12:25 -08:00
Leandro Lupori
5130a4ea12 [flang][OpenMP] Handle pointers and allocatables in clone init (#121824)
InitializeClone(), implemented in #120295, was not handling top
level pointers and allocatables correctly.
Pointers and unallocated variables must be skipped.

This caused some regressions in the Fujitsu testsuite:
https://linaro.atlassian.net/browse/LLVM-1488
2025-01-07 14:00:39 -03:00
Valentin Clement (バレンタイン クレメン)
6dcd2b035d [flang][cuda] Convert cuf.sync_descriptor to runtime call (#121524)
Convert the op to a new entry point in the runtime
`CUFSyncGlobalDescriptor`
2025-01-02 17:02:59 -08:00
Brooks Davis
7326e903d7 flang: fix backtrace build on FreeBSD (#120297)
FreeBSD's libexecinfo defines backtrace with a size_t for the size
argument and return type. This almost certainly doesn't make sense, but
what's done is done so cast the output to allow compilation. Otherwise
we get:

.../flang/runtime/stop.cpp:165:13: error: non-constant-expression cannot
be narrowed from type 'size_t' (aka 'unsigned long') to 'int' in
initializer list [-Wc++11-narrowing]
  165 |   int nptrs{backtrace(buffer, MAX_CALL_STACK)};
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2025-01-02 12:06:29 -05:00
vdonaldson
df12983610 [flang] build fix (#121032)
Place floating point environment calls under '#ifdef __USE_GNU'.
2024-12-24 03:19:29 -05:00
Valentin Clement (バレンタイン クレメン)
4cb2a519db Revert "Reland '[flang] Allow to pass an async id to allocate the descriptor (#118713)' and #118733" (#121029)
This still cause issue for device runtime build.
2024-12-23 21:27:34 -08:00
khaki3
7d166fa384 [flang][cuda] Correct the number of blocks when setting the grid to * (#121000)
We set the `gridX` argument of `_FortranACUFLaunchKernel` to `-1` when
`*` is passed to the grid parameter. We store it in one of `dim3`
members. However, `dim3` members are unsigned, so positive-value checks
we use later, such as `gridDim.x > 0`, are invalid. This PR utilizes the
original gird-size arguments to compute the number of blocks.
2024-12-23 17:14:38 -08:00
Valentin Clement (バレンタイン クレメン)
5b74fb75d9 Reland '[flang] Allow to pass an async id to allocate the descriptor (#118713)' and #118733 (#120997)
Device runtime build have been fixed. Attempt to re-land these patches
that have been approved before.

https://github.com/llvm/llvm-project/pull/118713
https://github.com/llvm/llvm-project/pull/118733
2024-12-23 12:13:56 -08:00
vdonaldson
dcb7f44cd6 [flang] Modifications to ieee_support_halting (#120976)
The F23 standard requires that a call to intrinsic module procedure
ieee_support_halting be foldable to a constant at compile time in some
contexts. See for example F23 Clause 10.1.11 [Specification expression]
list item (13), Clause 1.1.12 [Constant expression] list item (11), and
references to specification and constant expressions elsewhere, such as
constraints C1012, C853, and C704.

Some Arm processors allow a user to control processor behavior when an
arithmetic exception is signaled, and some Arm processors do not have
this capability. An Arm executable will run on either type of processor,
so it is effectively unknown at compile time whether or not this support
will be available at runtime. This is in conflict with the standard
requirement.

This patch addresses this conflict by implementing ieee_support_halting
calls on Arm processors to check if this capability is present at
runtime. A call to ieee_support_halting in a constant context, such as
in the specification part of a program unit, will generate a compile
time "cannot be computed as a constant value" error. The expectation is
that such calls are unlikely to appear in production code.

Code generation for other processors will continue to generate a compile
time constant result for ieee_support_halting calls.
2024-12-23 11:07:20 -05:00
vdonaldson
c28a7c1efd [flang] Modifications to ieee_support_halting (#120747)
The F23 standard requires that a call to intrinsic module procedure
ieee_support_halting be foldable to a constant at compile time in some
contexts. See for example F23 Clause 10.1.11 [Specification expression]
list item (13), Clause 1.1.12 [Constant expression] list item (11), and
references to specification and constant expressions elsewhere, such as
constraints C1012, C853, and C704.

Some Arm processors allow a user to control processor behavior when an
arithmetic exception is signaled, and some Arm processors do not have
this capability. An Arm executable will run on either type of processor,
so it is effectively unknown at compile time whether or not this support
will be available at runtime. This in conflict with the standard
requirement.

This patch addresses this conflict by implementing ieee_support_halting
calls on Arm processors to check if this capability is present at
runtime. A call to ieee_support_halting in a constant context, such as
in the specification part of a program unit, will generate a compile
time "cannot be computed as a constant value" error. The expectation is
that such calls are unlikely to appear in production code.

Code generation for other processors will continue to generate a compile
time constant result for ieee_support_halting calls.
2024-12-23 09:30:45 -05:00
Valentin Clement (バレンタイン クレメン)
415cfaf339 [flang][cuda][NFC] Fix type in CUFFreeDescriptor (#120799) 2024-12-20 14:43:12 -08:00
Valentin Clement (バレンタイン クレメン)
e650ac1654 [flang][cuda][NFC] Fix typo in CUFAllocDescriptor (#120797)
Missing `r` in the function name.
2024-12-20 13:57:47 -08:00
Leandro Lupori
1fcb6a9754 [flang][OpenMP] Initialize allocatable members of derived types (#120295)
Allocatable members of privatized derived types must be allocated,
with the same bounds as the original object, whenever that member
is also allocated in it, but Flang was not performing such
initialization.

The `Initialize` runtime function can't perform this task unless
its signature is changed to receive an additional parameter, the
original object, that is needed to find out which allocatable
members, with their bounds, must also be allocated in the clone.
As `Initialize` is used not only for privatization, sometimes this
other object won't even exist, so this new parameter would need
to be optional.
Because of this, it seemed better to add a new runtime function:
`InitializeClone`.
To avoid unnecessary calls, lowering inserts a call to it only for
privatized items that are derived types with allocatable members.

Fixes https://github.com/llvm/llvm-project/issues/114888
Fixes https://github.com/llvm/llvm-project/issues/114889
2024-12-19 17:26:50 -03:00
Peter Klausler
9f3a611480 [flang] Don't needlessly instantiate distinct UNSIGNED cases for FINDLOC (#120471)
The FINDLOC runtime doesn't need to distinguish between INTEGER and
UNSIGNED data, so use the code for INTEGER also for UNSIGNED.
2024-12-18 11:37:26 -08:00
Peter Klausler
fc97d2e68b [flang] Add UNSIGNED (#113504)
Implement the UNSIGNED extension type and operations under control of a
language feature flag (-funsigned).

This is nearly identical to the UNSIGNED feature that has been available
in Sun Fortran for years, and now implemented in GNU Fortran for
gfortran 15, and proposed for ISO standardization in J3/24-116.txt.

See the new documentation for details; but in short, this is C's
unsigned type, with guaranteed modular arithmetic for +, -, and *, and
the related transformational intrinsic functions SUM & al.
2024-12-18 07:02:37 -08:00
执着
e8baa792e7 Backtrace support for flang (#118179)
Fixed build failures in old PRs due to missing files
2024-12-10 10:31:48 +00:00
Valentin Clement (バレンタイン クレメン)
16c2a1016e Revert "[flang] Allow to pass an async id to allocate the descriptor (#118713)" (#119109)
This reverts commit 7d1c661381.

This commit breaks some device runtime builds. Need time to investigate.
2024-12-07 19:55:12 -08:00
jeanPerier
d6ec7c82f3 [flang][CUF] fix missing header after #112188 (#118993)
Otherwise, builds with `-DFLANG_CUF_RUNTIME` hits:

```
runtime/CUDA/descriptor.cpp:44:24: error: invalid use of incomplete type 'const class Fortran::runtime::Descriptor'
   44 |   std::size_t count{src->SizeInBytes()};
```
2024-12-06 17:22:47 +01:00
Michael Kruse
c91ba04328 [Flang][NFC] Split runtime headers in preparation for cross-compilation. (#112188)
Split some headers into headers for public and private declarations in
preparation for #110217. Moving the runtime-private headers in
runtime-private include directory will occur in #110298.

* Do not use `sizeof(Descriptor)` in the compiler. The size of the
descriptor is target-dependent while `sizeof(Descriptor)` is the size of
the Descriptor for the host platform which might be too small when
cross-compiling to a different platform. Another problem is that the
emitted assembly ((cross-)compiling to the same target) is not identical
between Flang's running on different systems. Moving the declaration of
`class Descriptor` out of the included header will also reduce the
amount of #included sources.

* Do not use `sizeof(ArrayConstructorVector)` and
`alignof(ArrayConstructorVector)` in the compiler. Same reason as with
`Descriptor`.

* Compute the descriptor's extra flags without instantiating a
Descriptor. `Fortran::runtime::Descriptor` is defined in the runtime
source, but not the compiler source.

* Move `InquiryKeywordHashDecode` into runtime-private header. The
function is defined in the runtime sources and trying to call it in the
compiler would lead to a link-error.

* Move allocator-kind magic numbers into common header. They are the
only declarations out of `allocator-registry.h` in the compiler as well.
 
This does not make Flang cross-compile ready yet, the main goal is to
avoid transitive header dependencies from Flang to clang-rt. There are
more assumptions that host platform is the same as the target platform.
2024-12-06 15:29:00 +01:00
Valentin Clement (バレンタイン クレメン)
83ccaad473 [flang][cuda] Use async id for device stream allocation (#118733)
When stream is specified use cudaMallocAsync with the specified stream
2024-12-05 08:57:10 -08:00
Michael Kruse
0cda970ecc [Flang][NFC] Split common headers to reduce dependencies. (#110244)
Fortran.h and target.h are defining symbols where some are used by both, the Fortran runtime (Flang-RT) and Fortran compiler (Flang), and others are used by Flang only. With the upcoming refactoring of the Fortran runtime into its own subproject (#110217), move the declarations that are used by both into new headers to minimize the amount of code that will need to be shared by Flang-RT and Flang.

Details:

 * `Fortran.h`: Flang-RT  only uses some enum definitions out of this file, but not `AsFortran` which is defined in `Fortran.cpp`. Moving the enums into `Fortran-consts.h` allows keeping `Fortran.cpp` within Flang.

 * `target.h`: Contains some floating-point definitions that is used by the non-GTest unittests in `fp-testing.h`. Flang-RT also uses some non-GTest as well. Moving those definitions avoids the dependence on the entire FortranEvaluate library.
2024-12-05 11:29:32 +01:00
Valentin Clement (バレンタイン クレメン)
7d1c661381 [flang] Allow to pass an async id to allocate the descriptor (#118713)
This is a patch in preparation for the support stream ordered memory
allocator in CUDA Fortran.

This patch adds an asynchronous id to the AllocatableAllocate runtime
function and to Descriptor::Allocate so it can be passed down to the
registered allocator. It is up to the allocator to use this value or
not.

A follow up patch will implement that asynchronous allocator for CUDA
Fortran.
2024-12-04 18:24:40 -08:00
vdonaldson
6003be7ef1 [flang] IEEE_GET_UNDERFLOW_MODE, IEEE_SET_UNDERFLOW_MODE (#118551)
Implement IEEE_GET_UNDERFLOW_MODE and IEEE_SET_UNDERFLOW_MODE. Update
IEEE_SUPPORT_UNDERFLOW_CONTROL to enable support for indvidual REAL
kinds.
2024-12-04 16:21:11 -05:00
Peter Klausler
9b64811e27 [flang][runtime] Skip unused truncated list-directed character input (#118320)
When reading non-delimited list-directed character input, read the whole
field even if it doesn't fit into the variable.

Fixes https://github.com/llvm/llvm-project/issues/118277.
2024-12-02 12:26:27 -08:00
Tom Eccles
1858a4ebf0 Revert "[flang]Add new intrinsic function backtrace and complete the TODO of abort" (#117990)
Reverts llvm/llvm-project#117603 due to failed buildbot
https://lab.llvm.org/buildbot/#/builders/152/builds/710

The important bit of the log was
```
FAILED: CMakeFiles/FortranRuntime.dir/stop.cpp.o 
ccache /usr/bin/g++ -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -DRT_USE_LIBCUDACXX=1 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/../include -I/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/build -I/home/buildbot/worker/third-party/nv/cccl/libcudacxx/include -fvisibility-inlines-hidden -Werror=date-time -fno-lifetime-dse -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -Wimplicit-fallthrough -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-lto -O3 -DNDEBUG   -U_GLIBCXX_ASSERTIONS -U_LIBCPP_ENABLE_ASSERTIONS -UNDEBUG -std=c++17  -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -MD -MT CMakeFiles/FortranRuntime.dir/stop.cpp.o -MF CMakeFiles/FortranRuntime.dir/stop.cpp.o.d -o CMakeFiles/FortranRuntime.dir/stop.cpp.o -c /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/stop.cpp
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/stop.cpp:19:10: fatal error: llvm/Config/config.h: No such file or directory
   19 | #include "llvm/Config/config.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
```

CC @dty2
2024-11-28 10:43:07 +00:00
执着
159e6012be [flang]Add new intrinsic function backtrace and complete the TODO of abort (#117603)
Hey guys, I found that Flang's built-in ABORT function is incomplete
when I was using it. Compared with gfortran's ABORT (which can both
abort and print out a backtrace), flang's ABORT implementation lacks the
function of printing out a backtrace. This feature is essential for
debugging and understanding the call stack at the failure point.

To solve this problem, I completed the "// TODO:" of the abort function,
and then implemented an additional built-in function BACKTRACE for
flang. After a brief reading of the relevant source code, I used
backtrace and backtrace_symbols in "execinfo.h" to quickly implement
this. But since I used the above two functions directly, my
implementation is slightly different from gfortran's implementation (in
the output, the function call stack before main is additionally output,
and the function line number is missing). In addition, since I used the
above two functions, I did not need to add -g to embed debug information
into the ELF file, but needed -rdynamic to ensure that the symbols are
added to the dynamic symbol table (so that the function name will be
printed out).

Here is a comparison of the output between gfortran 's backtrace and my
implementation:
gfortran's implemention output:
```
#0  0x557eb71f4184 in testfun2_
        at /home/hunter/plct/fortran/test.f90:5
#1  0x557eb71f4165 in testfun1_
        at /home/hunter/plct/fortran/test.f90:13
#2  0x557eb71f4192 in test_backtrace
        at /home/hunter/plct/fortran/test.f90:17
#3  0x557eb71f41ce in main
        at /home/hunter/plct/fortran/test.f90:18
```
my impelmention output:
```
Backtrace:
#0 ./test(_FortranABacktrace+0x32) [0x574f07efcf92]
#1 ./test(testfun2_+0x14) [0x574f07efc7b4]
#2 ./test(testfun1_+0xd) [0x574f07efc7cd]
#3 ./test(_QQmain+0x9) [0x574f07efc7e9]
#4 ./test(main+0x12) [0x574f07efc802]
#5 /usr/lib/libc.so.6(+0x25e08) [0x76954694fe08]
#6 /usr/lib/libc.so.6(__libc_start_main+0x8c) [0x76954694fecc]
#7 ./test(_start+0x25) [0x574f07efc6c5]
```
test program is:
```
function testfun2() result(err)
  implicit none
  integer :: err
  err = 1
  call backtrace
end function testfun2

subroutine testfun1()
  implicit none
  integer :: err
  integer :: testfun2

  err = testfun2()
end subroutine testfun1

program test_backtrace
  call testfun1()
end program test_backtrace
```
I am well aware of the importance of line numbers, so I am now working
on implementing line numbers (by parsing DWARF information) and
supporting cross-platform (Windows) support.
2024-11-28 10:37:22 +00:00
Valentin Clement (バレンタイン クレメン)
eb5cda480d [flang][cuda] cuf.allocate: Carry over stream to the runtime call (#117631)
- Update the runtime entry points to accept a stream information
- Update the conversion of `cuf.allocate` to pass correctly the stream
information when present.

Note that the stream is not currently used in the runtime. This will be
done in a separate patch as a design/solution needs to be down together
with the allocators.
2024-11-25 20:46:24 -08:00
Valentin Clement (バレンタイン クレメン)
5802367ddb [flang][cuda] Add support for allocate with source (#117388)
Add support for allocate statement with CUDA device variable and a
source.
2024-11-22 16:55:26 -08:00
Peter Klausler
3f594741cf [flang] Fix implementation of Kahan summation (#116897)
In the runtime's implementation of floating-point SUM, the
implementation of Kahan's algorithm for increased precision is
incorrect. The running correction factor should be subtracted from each
new data item, not added to it. This fix ensures that the sum of 100M
random default real values between 0. and 1. is close to 5.E7.

See https://en.wikipedia.org/wiki/Kahan_summation_algorithm.
2024-11-21 10:47:21 -08:00
Valentin Clement
42be165dde Reland '[flang][cuda] Specialize entry point for scalar to desc data transfer' 2024-11-15 19:13:55 -08:00
Valentin Clement (バレンタイン クレメン)
70b9440c88 Revert "[flang][cuda] Specialize entry point for scalar to desc data transfer" (#116458)
Reverts llvm/llvm-project#116457
2024-11-15 17:44:48 -08:00
Valentin Clement (バレンタイン クレメン)
43cb424a54 [flang][cuda] Specialize entry point for scalar to desc data transfer (#116457)
The runtime Assign function is not meant to initialize an array from a
scalar. For that we need to use DoAssignFromSource. Update the data
transfer from scalar to descriptor to use a new entry point that use
this function underneath.
2024-11-15 17:41:23 -08:00
Peter Klausler
376713ff50 [flang] Accept CLASS(*) array in EOSHIFT (#116114)
The intrinsic processing code wasn't allowing the ARRAY= argument to the
EOSHIFT intrinsic function to be CLASS(*). That case seems to conform to
the standard, although only one compiler could actually handle it, so
allow for it.

Fixes https://github.com/llvm/llvm-project/issues/115923.
2024-11-14 14:58:19 -08:00
vdonaldson
92604cf378 [flang] IEEE_REM (#115936)
Implement the IEEE 60559:2020 remainder function.
2024-11-13 13:02:20 -05:00
Valentin Clement (バレンタイン クレメン)
6b21cf8cca [flang][cuda] Compute grid x when calling a kernel with <<<*, block>>> (#115538)
`-1, 1, 1` is passed when calling a kernel with the `<<<*, block>>>`
syntax. Query the device to compute the grid.x value.
2024-11-08 14:34:26 -08:00
Peter Klausler
07e053fb95 [flang][runtime] Fix finalization case in assignment (#113611)
There were two bugs in derived type array assignment processing that
caused finalization to fail to occur for a test case. The first bug was
an off-by-one error in address overlap testing that caused a false
positive result for the test, whose left-hand side's allocatable's
descriptor was immediately adjacent in memory to the right-hand side's
array's data.
The second bug was that in such overlap cases (even when legitimate)
finalization would fail due to the LHS's descriptor having been copied
to a temporary for deferred deallocation and then nullified.

This patch corrects the overlap analysis for this test, and also
properly finalizes the LHS when overlap does exist. Some nearby dead
code was removed to avoid future confusion.

Fixes https://github.com/llvm/llvm-project/issues/113375.
2024-11-05 13:17:56 -08:00
Valentin Clement (バレンタイン クレメン)
db69d6939a [flang][cuda] Support data transfer from descriptor to a pointer (#115023)
Data transfer from a variable with a descriptor to a pointer. We create
a descriptor for the pointer so we can use the flang runtime to perform
the transfer. The Assign function handles all corner cases. We add a new
entry points `CUFDataTransferDescDescNoRealloc` to avoid reallocation
since the variable on the LHS is not an allocatable.
2024-11-05 11:59:08 -08:00
Valentin Clement (バレンタイン クレメン)
652db7e4ff [flang][cuda] Support data transfer from pointer to a descriptor (#114892)
When source is a pointer to an array or a scalar, embox it and use the
`CUFDataTransferDescDesc` or `CUFDataTransferGlobalDescDesc` entry
points. The runtime is already able to deal with all the corner cases
like non contiguous arrays and so on so we exploit this.

Memset might still be used for simple case where we want to initialize
to 0 for example. This will come in a follow up patch.
2024-11-05 08:56:19 -08:00
Valentin Clement (バレンタイン クレメン)
9d09c6fd9c [flang][cuda] Update device descriptor on data transfer (#114838)
When the destination of the data transfer is a global we might need to
sync the descriptor after the data transfer is done. This is the case
when the data transfer is from host/device to device as reallocation
might have happened and the descriptor on the device needs to take the
new values written on the host.

A new entry point is added `CUFDataTransferGlobalDescDesc` with the sync
when needed.
2024-11-04 13:22:06 -08:00
Valentin Clement (バレンタイン クレメン)
c949500d51 [flang][cuda] Fix not declared terminator (#114866) 2024-11-04 12:38:02 -08:00
Valentin Clement (バレンタイン クレメン)
51f7e98d59 [flang][cuda] Crash if mode is not handled (#114842) 2024-11-04 11:47:19 -08:00
Valentin Clement (バレンタイン クレメン)
32473864cb [flang][cuda] Data transfer with descriptor (#114598)
Reopen PR #114302 as it was automatically closed. 

Review in #114302
2024-11-01 12:35:48 -07:00
Valentin Clement (バレンタイン クレメン)
7792dbe29a Reland '[flang][runtime] Allow different memmov function in assign' (#114587)
Reland #114301
2024-11-01 11:26:39 -07:00
Valentin Clement (バレンタイン クレメン)
c5a254cdd7 Revert "[flang][runtime][NFC] Allow different memmove function in assign" (#114581)
Reverts llvm/llvm-project#114301
2024-11-01 10:40:10 -07:00
Valentin Clement (バレンタイン クレメン)
b278fe3297 [flang][runtime][NFC] Allow different memmove function in assign (#114301)
- Add a parameter to the `Assign` function to be able to use a different
`memmove` function. This is preparatory work to be able to use the
`Assign` function between host and device data.
- Expose the `Assign` function so it can be used from different files. 

- The new `memmoveFct` is not used in `BlankPadCharacterAssignment` yet
since it is not clear if there is a need. It will be updated in case it
is needed.
2024-11-01 10:34:03 -07:00
Valentin Clement (バレンタイン クレメン)
e4e9fea71e [flang][cuda] Pass descriptor by reference for CUFMemsetDescriptor (#114338) 2024-10-31 09:02:59 -07:00
Renaud Kauffmann
bfe486fe76 Passing descriptors by reference to CUDA runtime calls (#114288)
Passing a descriptor as a `const Descriptor &` or a `const Descriptor *`
generates a FIR signature where the box is passed by value.
This is an issue, as it requires a load of the box to be passed. But
since, ultimately, all boxes are passed by reference a temporary is
generated in LLVM and the reference to the temporary is passed.

The boxes addresses are registered with the CUDA runtime but the
temporaries are not, thus preventing the runtime to properly map a host
side address to its device side counterpart.

To address this issue, this PR changes the signatures to the transfer
functions to pass a descriptor as a `Descriptor *`, which will in turn
generate a FIR signature with that takes a box reference as an argument.
2024-10-30 13:24:47 -07:00
Valentin Clement (バレンタイン クレメン)
0b700f2333 [flang][cuda] Add entry point to launch global function with cluster_dims (#113958) 2024-10-29 10:01:49 -07:00
Renaud Kauffmann
70d61f6de7 [flang][cuda] Adding runtime call to CUFRegisterVariable (#113952) 2024-10-28 13:34:37 -07:00
Valentin Clement (バレンタイン クレメン)
4e40b71c51 [flang][cuda] Add specialized gpu.launch_func conversion (#113493) 2024-10-23 15:28:51 -07:00