clang-p2996

Author	SHA1	Message	Date
Peter Klausler	7463b46a34	[flang][runtime] Fix use of empty optional in BOZ input (#120789 ) Slava reported a valgrind result showing the use of uninitialized data due to an unconditional dereference of an optional in BOZ formatted input editing; fix.	2025-01-08 13:12:25 -08:00
Leandro Lupori	5130a4ea12	[flang][OpenMP] Handle pointers and allocatables in clone init (#121824 ) InitializeClone(), implemented in #120295, was not handling top level pointers and allocatables correctly. Pointers and unallocated variables must be skipped. This caused some regressions in the Fujitsu testsuite: https://linaro.atlassian.net/browse/LLVM-1488	2025-01-07 14:00:39 -03:00
Valentin Clement (バレンタインクレメン)	6dcd2b035d	[flang][cuda] Convert cuf.sync_descriptor to runtime call (#121524 ) Convert the op to a new entry point in the runtime `CUFSyncGlobalDescriptor`	2025-01-02 17:02:59 -08:00
Brooks Davis	7326e903d7	flang: fix backtrace build on FreeBSD (#120297 ) FreeBSD's libexecinfo defines backtrace with a size_t for the size argument and return type. This almost certainly doesn't make sense, but what's done is done so cast the output to allow compilation. Otherwise we get: .../flang/runtime/stop.cpp:165:13: error: non-constant-expression cannot be narrowed from type 'size_t' (aka 'unsigned long') to 'int' in initializer list [-Wc++11-narrowing] 165 \| int nptrs{backtrace(buffer, MAX_CALL_STACK)}; \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~	2025-01-02 12:06:29 -05:00
vdonaldson	df12983610	[flang] build fix (#121032 ) Place floating point environment calls under '#ifdef __USE_GNU'.	2024-12-24 03:19:29 -05:00
Valentin Clement (バレンタインクレメン)	4cb2a519db	Revert "Reland '[flang] Allow to pass an async id to allocate the descriptor (#118713 )' and #118733 " (#121029 ) This still causes issues for the device runtime build.	2024-12-23 21:27:34 -08:00
khaki3	7d166fa384	[flang][cuda] Correct the number of blocks when setting the grid to `*` (#121000 ) We set the `gridX` argument of `_FortranACUFLaunchKernel` to `-1` when `*` is passed to the grid parameter. We store it in one of the `dim3` members. However, `dim3` members are unsigned, so the positive-value checks we use later, such as `gridDim.x > 0`, are invalid. This PR uses the original grid-size arguments to compute the number of blocks.	2024-12-23 17:14:38 -08:00
Valentin Clement (バレンタインクレメン)	5b74fb75d9	Reland '[flang] Allow to pass an async id to allocate the descriptor (#118713 )' and #118733 (#120997 ) The device runtime build has been fixed. Attempt to re-land these patches, which were approved before. https://github.com/llvm/llvm-project/pull/118713 https://github.com/llvm/llvm-project/pull/118733	2024-12-23 12:13:56 -08:00
vdonaldson	dcb7f44cd6	[flang] Modifications to ieee_support_halting (#120976 ) The F23 standard requires that a call to intrinsic module procedure ieee_support_halting be foldable to a constant at compile time in some contexts. See for example F23 Clause 10.1.11 [Specification expression] list item (13), Clause 10.1.12 [Constant expression] list item (11), and references to specification and constant expressions elsewhere, such as constraints C1012, C853, and C704. Some Arm processors allow a user to control processor behavior when an arithmetic exception is signaled, and some Arm processors do not have this capability. An Arm executable will run on either type of processor, so it is effectively unknown at compile time whether or not this support will be available at runtime. This is in conflict with the standard requirement. This patch addresses this conflict by implementing ieee_support_halting calls on Arm processors to check if this capability is present at runtime. A call to ieee_support_halting in a constant context, such as in the specification part of a program unit, will generate a compile time "cannot be computed as a constant value" error. The expectation is that such calls are unlikely to appear in production code. Code generation for other processors will continue to generate a compile time constant result for ieee_support_halting calls.	2024-12-23 11:07:20 -05:00
vdonaldson	c28a7c1efd	[flang] Modifications to ieee_support_halting (#120747 ) The F23 standard requires that a call to intrinsic module procedure ieee_support_halting be foldable to a constant at compile time in some contexts. See for example F23 Clause 10.1.11 [Specification expression] list item (13), Clause 10.1.12 [Constant expression] list item (11), and references to specification and constant expressions elsewhere, such as constraints C1012, C853, and C704. Some Arm processors allow a user to control processor behavior when an arithmetic exception is signaled, and some Arm processors do not have this capability. An Arm executable will run on either type of processor, so it is effectively unknown at compile time whether or not this support will be available at runtime. This is in conflict with the standard requirement. This patch addresses this conflict by implementing ieee_support_halting calls on Arm processors to check if this capability is present at runtime. A call to ieee_support_halting in a constant context, such as in the specification part of a program unit, will generate a compile time "cannot be computed as a constant value" error. The expectation is that such calls are unlikely to appear in production code. Code generation for other processors will continue to generate a compile time constant result for ieee_support_halting calls.	2024-12-23 09:30:45 -05:00
Valentin Clement (バレンタインクレメン)	415cfaf339	[flang][cuda][NFC] Fix type in CUFFreeDescriptor (#120799 )	2024-12-20 14:43:12 -08:00
Valentin Clement (バレンタインクレメン)	e650ac1654	[flang][cuda][NFC] Fix typo in CUFAllocDescriptor (#120797 ) Missing `r` in the function name.	2024-12-20 13:57:47 -08:00
Leandro Lupori	1fcb6a9754	[flang][OpenMP] Initialize allocatable members of derived types (#120295 ) Allocatable members of privatized derived types must be allocated, with the same bounds as the original object, whenever that member is also allocated in it, but Flang was not performing such initialization. The `Initialize` runtime function can't perform this task unless its signature is changed to receive an additional parameter, the original object, that is needed to find out which allocatable members, with their bounds, must also be allocated in the clone. As `Initialize` is used not only for privatization, sometimes this other object won't even exist, so this new parameter would need to be optional. Because of this, it seemed better to add a new runtime function: `InitializeClone`. To avoid unnecessary calls, lowering inserts a call to it only for privatized items that are derived types with allocatable members. Fixes https://github.com/llvm/llvm-project/issues/114888 Fixes https://github.com/llvm/llvm-project/issues/114889	2024-12-19 17:26:50 -03:00
Peter Klausler	9f3a611480	[flang] Don't needlessly instantiate distinct UNSIGNED cases for FINDLOC (#120471 ) The FINDLOC runtime doesn't need to distinguish between INTEGER and UNSIGNED data, so use the code for INTEGER also for UNSIGNED.	2024-12-18 11:37:26 -08:00
Peter Klausler	fc97d2e68b	[flang] Add UNSIGNED (#113504 ) Implement the UNSIGNED extension type and operations under control of a language feature flag (-funsigned). This is nearly identical to the UNSIGNED feature that has been available in Sun Fortran for years, and now implemented in GNU Fortran for gfortran 15, and proposed for ISO standardization in J3/24-116.txt. See the new documentation for details; but in short, this is C's unsigned type, with guaranteed modular arithmetic for +, -, and *, and the related transformational intrinsic functions SUM & al.	2024-12-18 07:02:37 -08:00
执着	e8baa792e7	Backtrace support for flang (#118179 ) Fixed build failures in old PRs due to missing files	2024-12-10 10:31:48 +00:00
Valentin Clement (バレンタインクレメン)	16c2a1016e	Revert "[flang] Allow to pass an async id to allocate the descriptor (#118713 )" (#119109 ) This reverts commit `7d1c661381`. This commit breaks some device runtime builds. Need time to investigate.	2024-12-07 19:55:12 -08:00
jeanPerier	d6ec7c82f3	[flang][CUF] fix missing header after #112188 (#118993 ) Otherwise, builds with `-DFLANG_CUF_RUNTIME` hit: ``` runtime/CUDA/descriptor.cpp:44:24: error: invalid use of incomplete type 'const class Fortran::runtime::Descriptor' 44 \| std::size_t count{src->SizeInBytes()}; ```	2024-12-06 17:22:47 +01:00
Michael Kruse	c91ba04328	[Flang][NFC] Split runtime headers in preparation for cross-compilation. (#112188 ) Split some headers into headers for public and private declarations in preparation for #110217. Moving the runtime-private headers in runtime-private include directory will occur in #110298. * Do not use `sizeof(Descriptor)` in the compiler. The size of the descriptor is target-dependent while `sizeof(Descriptor)` is the size of the Descriptor for the host platform which might be too small when cross-compiling to a different platform. Another problem is that the emitted assembly ((cross-)compiling to the same target) is not identical between Flang's running on different systems. Moving the declaration of `class Descriptor` out of the included header will also reduce the amount of #included sources. * Do not use `sizeof(ArrayConstructorVector)` and `alignof(ArrayConstructorVector)` in the compiler. Same reason as with `Descriptor`. * Compute the descriptor's extra flags without instantiating a Descriptor. `Fortran::runtime::Descriptor` is defined in the runtime source, but not the compiler source. * Move `InquiryKeywordHashDecode` into runtime-private header. The function is defined in the runtime sources and trying to call it in the compiler would lead to a link-error. * Move allocator-kind magic numbers into common header. They are the only declarations out of `allocator-registry.h` in the compiler as well. This does not make Flang cross-compile ready yet, the main goal is to avoid transitive header dependencies from Flang to clang-rt. There are more assumptions that host platform is the same as the target platform.	2024-12-06 15:29:00 +01:00
Valentin Clement (バレンタインクレメン)	83ccaad473	[flang][cuda] Use async id for device stream allocation (#118733 ) When stream is specified use cudaMallocAsync with the specified stream	2024-12-05 08:57:10 -08:00
Michael Kruse	0cda970ecc	[Flang][NFC] Split common headers to reduce dependencies. (#110244 ) Fortran.h and target.h define symbols, some of which are used by both the Fortran runtime (Flang-RT) and the Fortran compiler (Flang), while others are used by Flang only. With the upcoming refactoring of the Fortran runtime into its own subproject (#110217), move the declarations that are used by both into new headers to minimize the amount of code that will need to be shared by Flang-RT and Flang. Details: * `Fortran.h`: Flang-RT only uses some enum definitions out of this file, but not `AsFortran`, which is defined in `Fortran.cpp`. Moving the enums into `Fortran-consts.h` allows keeping `Fortran.cpp` within Flang. * `target.h`: Contains some floating-point definitions that are used by the non-GTest unittests in `fp-testing.h`. Flang-RT also uses some non-GTest unittests. Moving those definitions avoids the dependence on the entire FortranEvaluate library.	2024-12-05 11:29:32 +01:00
Valentin Clement (バレンタインクレメン)	7d1c661381	[flang] Allow to pass an async id to allocate the descriptor (#118713 ) This is a patch in preparation for supporting the stream-ordered memory allocator in CUDA Fortran. This patch adds an asynchronous id to the AllocatableAllocate runtime function and to Descriptor::Allocate so it can be passed down to the registered allocator. It is up to the allocator to use this value or not. A follow-up patch will implement that asynchronous allocator for CUDA Fortran.	2024-12-04 18:24:40 -08:00
vdonaldson	6003be7ef1	[flang] IEEE_GET_UNDERFLOW_MODE, IEEE_SET_UNDERFLOW_MODE (#118551 ) Implement IEEE_GET_UNDERFLOW_MODE and IEEE_SET_UNDERFLOW_MODE. Update IEEE_SUPPORT_UNDERFLOW_CONTROL to enable support for individual REAL kinds.	2024-12-04 16:21:11 -05:00
Peter Klausler	9b64811e27	[flang][runtime] Skip unused truncated list-directed character input (#118320 ) When reading non-delimited list-directed character input, read the whole field even if it doesn't fit into the variable. Fixes https://github.com/llvm/llvm-project/issues/118277.	2024-12-02 12:26:27 -08:00
Tom Eccles	1858a4ebf0	Revert "[flang]Add new intrinsic function backtrace and complete the TODO of abort" (#117990 ) Reverts llvm/llvm-project#117603 due to failed buildbot https://lab.llvm.org/buildbot/#/builders/152/builds/710 The important bit of the log was ``` FAILED: CMakeFiles/FortranRuntime.dir/stop.cpp.o ccache /usr/bin/g++ -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -DRT_USE_LIBCUDACXX=1 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/../include -I/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/build -I/home/buildbot/worker/third-party/nv/cccl/libcudacxx/include -fvisibility-inlines-hidden -Werror=date-time -fno-lifetime-dse -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -Wimplicit-fallthrough -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-lto -O3 -DNDEBUG -U_GLIBCXX_ASSERTIONS -U_LIBCPP_ENABLE_ASSERTIONS -UNDEBUG -std=c++17 -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -MD -MT CMakeFiles/FortranRuntime.dir/stop.cpp.o -MF CMakeFiles/FortranRuntime.dir/stop.cpp.o.d -o CMakeFiles/FortranRuntime.dir/stop.cpp.o -c /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/stop.cpp /home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-gcc/llvm-project/flang/runtime/stop.cpp:19:10: fatal error: llvm/Config/config.h: No such file or directory 19 \| #include "llvm/Config/config.h" \| ^~~~~~~~~~~~~~~~~~~~~~ compilation terminated. ``` CC @dty2	2024-11-28 10:43:07 +00:00
执着	159e6012be	[flang]Add new intrinsic function backtrace and complete the TODO of abort (#117603 ) Hey guys, I found that Flang's built-in ABORT function is incomplete when I was using it. Compared with gfortran's ABORT (which can both abort and print out a backtrace), flang's ABORT implementation lacks the function of printing out a backtrace. This feature is essential for debugging and understanding the call stack at the failure point. To solve this problem, I completed the "// TODO:" of the abort function, and then implemented an additional built-in function BACKTRACE for flang. After a brief reading of the relevant source code, I used backtrace and backtrace_symbols in "execinfo.h" to quickly implement this. But since I used the above two functions directly, my implementation is slightly different from gfortran's implementation (in the output, the function call stack before main is additionally output, and the function line number is missing). In addition, since I used the above two functions, I did not need to add -g to embed debug information into the ELF file, but needed -rdynamic to ensure that the symbols are added to the dynamic symbol table (so that the function name will be printed out). 
Here is a comparison of the output between gfortran's backtrace and my implementation: gfortran's implementation output: ``` #0 0x557eb71f4184 in testfun2_ at /home/hunter/plct/fortran/test.f90:5 #1 0x557eb71f4165 in testfun1_ at /home/hunter/plct/fortran/test.f90:13 #2 0x557eb71f4192 in test_backtrace at /home/hunter/plct/fortran/test.f90:17 #3 0x557eb71f41ce in main at /home/hunter/plct/fortran/test.f90:18 ``` my implementation output: ``` Backtrace: #0 ./test(_FortranABacktrace+0x32) [0x574f07efcf92] #1 ./test(testfun2_+0x14) [0x574f07efc7b4] #2 ./test(testfun1_+0xd) [0x574f07efc7cd] #3 ./test(_QQmain+0x9) [0x574f07efc7e9] #4 ./test(main+0x12) [0x574f07efc802] #5 /usr/lib/libc.so.6(+0x25e08) [0x76954694fe08] #6 /usr/lib/libc.so.6(__libc_start_main+0x8c) [0x76954694fecc] #7 ./test(_start+0x25) [0x574f07efc6c5] ``` test program is: ``` function testfun2() result(err) implicit none integer :: err err = 1 call backtrace end function testfun2 subroutine testfun1() implicit none integer :: err integer :: testfun2 err = testfun2() end subroutine testfun1 program test_backtrace call testfun1() end program test_backtrace ``` I am well aware of the importance of line numbers, so I am now working on implementing line numbers (by parsing DWARF information) and on cross-platform (Windows) support.	2024-11-28 10:37:22 +00:00
Valentin Clement (バレンタインクレメン)	eb5cda480d	[flang][cuda] cuf.allocate: Carry over stream to the runtime call (#117631 ) - Update the runtime entry points to accept stream information - Update the conversion of `cuf.allocate` to correctly pass the stream information when present. Note that the stream is not currently used in the runtime. This will be done in a separate patch, as a design/solution needs to be done together with the allocators.	2024-11-25 20:46:24 -08:00
Valentin Clement (バレンタインクレメン)	5802367ddb	[flang][cuda] Add support for allocate with source (#117388 ) Add support for allocate statement with CUDA device variable and a source.	2024-11-22 16:55:26 -08:00
Peter Klausler	3f594741cf	[flang] Fix implementation of Kahan summation (#116897 ) In the runtime's implementation of floating-point SUM, the implementation of Kahan's algorithm for increased precision is incorrect. The running correction factor should be subtracted from each new data item, not added to it. This fix ensures that the sum of 100M random default real values between 0. and 1. is close to 5.E7. See https://en.wikipedia.org/wiki/Kahan_summation_algorithm.	2024-11-21 10:47:21 -08:00
Valentin Clement	42be165dde	Reland '[flang][cuda] Specialize entry point for scalar to desc data transfer'	2024-11-15 19:13:55 -08:00
Valentin Clement (バレンタインクレメン)	70b9440c88	Revert "[flang][cuda] Specialize entry point for scalar to desc data transfer" (#116458 ) Reverts llvm/llvm-project#116457	2024-11-15 17:44:48 -08:00
Valentin Clement (バレンタインクレメン)	43cb424a54	[flang][cuda] Specialize entry point for scalar to desc data transfer (#116457 ) The runtime Assign function is not meant to initialize an array from a scalar. For that we need to use DoAssignFromSource. Update the data transfer from scalar to descriptor to use a new entry point that uses this function underneath.	2024-11-15 17:41:23 -08:00
Peter Klausler	376713ff50	[flang] Accept CLASS() array in EOSHIFT (#116114 ) The intrinsic processing code wasn't allowing the ARRAY= argument to the EOSHIFT intrinsic function to be CLASS(). That case seems to conform to the standard, although only one compiler could actually handle it, so allow for it. Fixes https://github.com/llvm/llvm-project/issues/115923.	2024-11-14 14:58:19 -08:00
vdonaldson	92604cf378	[flang] IEEE_REM (#115936 ) Implement the IEEE 60559:2020 remainder function.	2024-11-13 13:02:20 -05:00
Valentin Clement (バレンタインクレメン)	6b21cf8cca	[flang][cuda] Compute grid x when calling a kernel with <<<*, block>>> (#115538 ) `-1, 1, 1` is passed when calling a kernel with the `<<<*, block>>>` syntax. Query the device to compute the grid.x value.	2024-11-08 14:34:26 -08:00
Peter Klausler	07e053fb95	[flang][runtime] Fix finalization case in assignment (#113611 ) There were two bugs in derived type array assignment processing that caused finalization to fail to occur for a test case. The first bug was an off-by-one error in address overlap testing that caused a false positive result for the test, whose left-hand side's allocatable's descriptor was immediately adjacent in memory to the right-hand side's array's data. The second bug was that in such overlap cases (even when legitimate) finalization would fail due to the LHS's descriptor having been copied to a temporary for deferred deallocation and then nullified. This patch corrects the overlap analysis for this test, and also properly finalizes the LHS when overlap does exist. Some nearby dead code was removed to avoid future confusion. Fixes https://github.com/llvm/llvm-project/issues/113375.	2024-11-05 13:17:56 -08:00
Valentin Clement (バレンタインクレメン)	db69d6939a	[flang][cuda] Support data transfer from descriptor to a pointer (#115023 ) Data transfer from a variable with a descriptor to a pointer. We create a descriptor for the pointer so we can use the flang runtime to perform the transfer. The Assign function handles all corner cases. We add a new entry point `CUFDataTransferDescDescNoRealloc` to avoid reallocation since the variable on the LHS is not an allocatable.	2024-11-05 11:59:08 -08:00
Valentin Clement (バレンタインクレメン)	652db7e4ff	[flang][cuda] Support data transfer from pointer to a descriptor (#114892 ) When source is a pointer to an array or a scalar, embox it and use the `CUFDataTransferDescDesc` or `CUFDataTransferGlobalDescDesc` entry points. The runtime is already able to deal with all the corner cases like non contiguous arrays and so on so we exploit this. Memset might still be used for simple case where we want to initialize to 0 for example. This will come in a follow up patch.	2024-11-05 08:56:19 -08:00
Valentin Clement (バレンタインクレメン)	9d09c6fd9c	[flang][cuda] Update device descriptor on data transfer (#114838 ) When the destination of the data transfer is a global we might need to sync the descriptor after the data transfer is done. This is the case when the data transfer is from host/device to device as reallocation might have happened and the descriptor on the device needs to take the new values written on the host. A new entry point is added `CUFDataTransferGlobalDescDesc` with the sync when needed.	2024-11-04 13:22:06 -08:00
Valentin Clement (バレンタインクレメン)	c949500d51	[flang][cuda] Fix not declared terminator (#114866 )	2024-11-04 12:38:02 -08:00
Valentin Clement (バレンタインクレメン)	51f7e98d59	[flang][cuda] Crash if mode is not handled (#114842 )	2024-11-04 11:47:19 -08:00
Valentin Clement (バレンタインクレメン)	32473864cb	[flang][cuda] Data transfer with descriptor (#114598 ) Reopen PR #114302 as it was automatically closed. Review in #114302	2024-11-01 12:35:48 -07:00
Valentin Clement (バレンタインクレメン)	7792dbe29a	Reland '[flang][runtime] Allow different memmove function in assign' (#114587 ) Reland #114301	2024-11-01 11:26:39 -07:00
Valentin Clement (バレンタインクレメン)	c5a254cdd7	Revert "[flang][runtime][NFC] Allow different memmove function in assign" (#114581 ) Reverts llvm/llvm-project#114301	2024-11-01 10:40:10 -07:00
Valentin Clement (バレンタインクレメン)	b278fe3297	[flang][runtime][NFC] Allow different memmove function in assign (#114301 ) - Add a parameter to the `Assign` function to be able to use a different `memmove` function. This is preparatory work to be able to use the `Assign` function between host and device data. - Expose the `Assign` function so it can be used from different files. - The new `memmoveFct` is not used in `BlankPadCharacterAssignment` yet since it is not clear if there is a need. It will be updated in case it is needed.	2024-11-01 10:34:03 -07:00
Valentin Clement (バレンタインクレメン)	e4e9fea71e	[flang][cuda] Pass descriptor by reference for CUFMemsetDescriptor (#114338 )	2024-10-31 09:02:59 -07:00
Renaud Kauffmann	bfe486fe76	Passing descriptors by reference to CUDA runtime calls (#114288 ) Passing a descriptor as a `const Descriptor &` or a `const Descriptor *` generates a FIR signature where the box is passed by value. This is an issue, as it requires a load of the box to be passed. But since, ultimately, all boxes are passed by reference, a temporary is generated in LLVM and the reference to the temporary is passed. The box addresses are registered with the CUDA runtime but the temporaries are not, thus preventing the runtime from properly mapping a host-side address to its device-side counterpart. To address this issue, this PR changes the signatures of the transfer functions to pass a descriptor as a `Descriptor *`, which will in turn generate a FIR signature that takes a box reference as an argument.	2024-10-30 13:24:47 -07:00
Valentin Clement (バレンタインクレメン)	0b700f2333	[flang][cuda] Add entry point to launch global function with cluster_dims (#113958 )	2024-10-29 10:01:49 -07:00
Renaud Kauffmann	70d61f6de7	[flang][cuda] Adding runtime call to CUFRegisterVariable (#113952 )	2024-10-28 13:34:37 -07:00
Valentin Clement (バレンタインクレメン)	4e40b71c51	[flang][cuda] Add specialized gpu.launch_func conversion (#113493 )	2024-10-23 15:28:51 -07:00
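
The Kahan summation fix in commit 3f594741cf (#116897) above says the running correction must be subtracted from each new data item, not added to it. The following is a minimal sketch of that rule, not the runtime's actual code; the function name `KahanSum` and the use of `std::vector<double>` are assumptions made only for illustration.

```cpp
#include <vector>

// Hypothetical helper (not part of the Flang runtime): compensated summation.
// `correction` holds the low-order bits lost by the previous addition.
double KahanSum(const std::vector<double> &values) {
  double sum = 0.0;
  double correction = 0.0;
  for (double x : values) {
    double next = x - correction;        // subtract the running correction
    double newSum = sum + next;          // low-order bits of `next` may be lost
    correction = (newSum - sum) - next;  // recover what was lost (with sign)
    sum = newSum;
  }
  return sum;
}
```

With the sign flipped (`x + correction`), the lost bits would be counted against the total twice instead of being restored, which is the kind of error the commit message describes. Note that the compensation relies on strict floating-point semantics, so it should not be compiled with options such as `-ffast-math`.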

893 Commits