clang-p2996

Author	SHA1	Message	Date
Joseph Huber	a1be5d69df	[libc] Implement more input functions on the GPU (#66288) Summary: This patch implements the `fgets`, `getc`, `fgetc`, and `getchar` functions on the GPU. Their implementations are straightforward enough. One thing worth noting is that the implementation of `fgets` will be extremely slow due to the high latency to read a single char. A faster solution would be to make a new RPC call to call `fgets` (due to the special rule that newline or null breaks the stream). But this is left out because performance isn't the primary concern here.	2023-09-14 15:39:29 -05:00
Joseph Huber	4792ae5cd5	[libc] Fix building the RPC server with LIBC_NAMESPACE (#65642) A recent patch required the implementation to define `LIBC_NAMESPACE`. For GPU offloading we provide a static library whose internal implementation relies on the `libc` headers. This is a separate library that is constructed during the "bootstrap" phase. This patch moves the definition of the `LIBC_NAMESPACE` CMake variable up so it's available during bootstrapping and adds it to the definition of the RPC server.	2023-09-07 12:47:36 -05:00
Joseph Huber	701e6f7630	[libc][fix] Fix buffer overrun in initialization of GPU return value Summary: The HSA API explicitly states that the size is a count of uint32_t's not a byte count. This was erroneously being used as a simple memcpy, causing some weird behaviour. Fix this by correctly passing `1` to initialize a single integer to zero.	2023-09-02 17:59:01 -05:00
Joseph Huber	07102a1194	[libc] Implement the 'abort' function on the GPU This function implements the `abort` function on the GPU. The implementation here closely mirrors the `exit` call where we first synchronize with the RPC server to make sure it's listening and then we exit on the GPU. I was unsure if this should be a simple `__builtin_assert` on the GPU. I elected to go with an RPC approach to make this a more "true" `abort` call. That is, it should invoke some signal handlers and exit with the proper code according to the implemented C library on the server. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D159210	2023-08-31 08:40:15 -05:00
Joseph Huber	7fd9f0f4e0	[libc] Remove `MAX_LANE_SIZE` definition from the RPC server This `MAX_LANE_SIZE` was a hack from the days when we used a single instance of the server and had some GPU state handle it. Now that we have everything templated this really shouldn't be used. This patch removes its use and replaces it with template arguments. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D158633	2023-08-23 12:09:30 -05:00
Joseph Huber	334bbc0d67	[libc] Add support for the 'fread' function on the GPU This patch adds support for `fread` on the GPU via the RPC mechanism. Here we simply pass the size of the read to the server and then copy it back to the client via the RPC channel. This should allow us to do the basic operations on files now. This will obviously be slow for large sizes due to the number of RPC calls involved; this could be optimized further by having a special RPC call that can initiate a memcpy between the two pointers. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D155121	2023-07-26 13:51:35 -05:00
Joseph Huber	a42c1f8d97	[libc][Obvious] Fix use of `fwrite` in the RPC server Summary: The RPC server used the size field which meant we didn't get the correct return value for partial reads. We fix that here.	2023-07-26 11:13:38 -05:00
Joseph Huber	c381a94753	[libc] Remove test RPC opcodes from the exported header This patch does the noisy work of removing the test opcodes from the exported interface to an interface that is only visible in `libc`. The benefit of this is that we both test the exported RPC registration more directly, and we do not need to give this interface to users. I have decided to export any opcode that is not a "core" libc feature as having its MSB set in the opcode. We can think of these as non-libc "extensions". Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D154848	2023-07-21 15:36:36 -05:00
Joseph Huber	d3aabeb7b5	[libc] Treat the locks array as a bitfield Currently we keep an internal buffer of device memory that is used to indicate ownership of a port. Since we only use this as a single bit we can simply turn this into a bitfield. I did this manually rather than having a separate type as we need very special handling of the masks used to interact with the locks. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D155511	2023-07-21 10:49:11 -05:00
Joseph Huber	cf269417b2	[libc] Add an override option for specifying the loader implementation There are some cases when testing we want to override the logic for not building tests if the loader is not present. This allows users to specify an external binary that fulfils the same duties which will force the tests to be built even without meeting the dependencies. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D155837	2023-07-20 08:44:58 -05:00
Jon Chesterfield	095e69404a	[libc][amdgpu] Accept deadstripped clock_freq global If the clock_freq symbol isn't used, and is removed, we don't need to abort the loader. Can instead just not set it. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D155832	2023-07-20 14:23:08 +01:00
Jon Chesterfield	d483824fc8	[libc][amdgpu] Tolerate different install directories for hsa.h HSA headers might be under a hsa/ directory or might not. This scheme matches the one used by the openmp amdgpu plugin. Reviewed By: jhuber6, jplehr Differential Revision: https://reviews.llvm.org/D155812	2023-07-20 13:43:17 +01:00
Joseph Huber	e537c83975	[libc] Add basic support for calling host functions from the GPU This patch adds the `rpc_host_call` function as a GPU extension. This is exported from the `libc` project to use the RPC interface to call a function pointer via RPC and copy the arguments by-value. The interface can only support a single void pointer argument much like pthreads. The function call here is the bare-bones version of what's required for OpenMP reverse offloading. Full support will require interfacing with the mapping table, nowait support, etc. I decided to test this interface in `libomptarget` as that will be the primary consumer and it would be more difficult to make a test in `libc` due to the testing infrastructure not really having a concept of the "host" as it runs directly on the GPU as if it were a CPU target. Reviewed By: jplehr Differential Revision: https://reviews.llvm.org/D155003	2023-07-19 10:11:46 -05:00
Joseph Huber	979fb95021	Revert "[libc] Treat the locks array as a bitfield" Summary: This caused test failures on the gfx90a buildbot. This works on my gfx1030 and the Nvidia buildbots, so we'll need to investigate what is going wrong here. For now revert it to get the bots green. This reverts commit `05abcc5792`.	2023-07-19 09:27:08 -05:00
Joseph Huber	05abcc5792	[libc] Treat the locks array as a bitfield Currently we keep an internal buffer of device memory that is used to indicate ownership of a port. Since we only use this as a single bit we can simply turn this into a bitfield. I did this manually rather than having a separate type as we need very special handling of the masks used to interact with the locks. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D155511	2023-07-18 11:34:21 -05:00
Joseph Huber	a608076726	[libc][Obvious] Check if the state hasn't already been destroyed on shutdown This ensures that if someone calls the `rpc_shutdown` method multiple times it will not segfault and gracefully continue. This was causing problems in the OpenMP usage. This could point to other issues, but for now this is a safe fix. Differential Revision: https://reviews.llvm.org/D155005	2023-07-11 14:35:38 -05:00
Joseph Huber	c850ea1498	[libc] Support fopen / fclose on the GPU This patch adds the necessary support for the fopen and fclose functions to work on the GPU via RPC. I added a new test that enables testing this with the minimal features we have on the GPU. I will update it once we have `fread` and `fwrite` to actually check the outputted strings. For now I just relied on checking manually via the output temp file. Reviewed By: JonChesterfield, sivachandra Differential Revision: https://reviews.llvm.org/D154519	2023-07-05 18:31:58 -05:00
Joseph Huber	5db39796bf	[libc] Support timing information in libc tests This patch adds the necessary support to provide timing information in `libc` tests. This is useful for determining which tests take what amount of time. We also can use this as a test basis for providing more fine-grained timing when implementing things on the GPU. The main difficulty with this is the fact that the AMDGPU fixed frequency clock operates at an unknown frequency. We need to read this on a per-card basis from the driver and then copy it in. NVPTX on the other hand has a fixed clock at a resolution of 1ns. I have also increased the resolution of the print-outs as the majority of these are below a millisecond for me. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D154446	2023-07-05 14:27:08 -05:00
Joseph Huber	df52a22b1b	[libc] Make the RPC server target always available This patch makes sure that we always build the RPC server. The purpose of this is to begin integrating this server implementation into `libomptarget`. That requires that we build this server ahead of time when using a `LLVM_ENABLE_PROJECTS` build. Make a few tweaks to ensure that the GCC compiler which may be used for this build doesn't complain. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D154105	2023-06-30 11:30:57 -05:00
Joseph Huber	62f57bc9b0	[libc] Add other RPC callback methods to the RPC server This patch adds the other two methods to the server so the external users can use the interface through the obfuscated interface. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D154224	2023-06-30 11:29:37 -05:00
Joseph Huber	667c10353e	[libc] Fix the implementation of exit on the GPU The RPC calls all have delays associated with them. Currently the `exit` function does an async send and immediately exits the GPU. This can have the effect that the RPC server never sees the exit call and we continue. This patch changes that to first sync with the server before continuing to perform its exit. There is still a hazard here, where the kernel can complete before the RPC call reads back its response, but this is simply multi-threaded hazards. This change ensures that the server will always exit some time after the GPU exits. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D154112	2023-06-29 13:22:23 -05:00
Joseph Huber	31c154881c	[libc] Allow the RPC client to be initialized via a H2D memcpy The RPC client must be initialized to set a pointer to the underlying buffer. This is currently done with the `reset` method which may not be ideal for the use-case. We want runtimes to be able to initialize this without needing to call a kernel. Recent changes allowed the `Client` type to be trivially copyable. That means we can create a client on the server side and then copy it over. To that end we take the existing externally visible symbol and initialize it to the client's pointer. Therefore we can look up the symbol and copy it over once loaded. No test currently, I tested with a demo OpenMP application but couldn't think of how to put that in-tree. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D153633	2023-06-26 10:41:32 -05:00
Joseph Huber	e0b487bfc0	[libc] Rename and install the RPC server interface This patch prepares the RPC interface to be installed. We place this in the existing `llvm-gpu-none` directory as it will also give us access to the generated `libc` headers for the opcodes. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D153040	2023-06-21 11:26:24 -05:00
Joseph Huber	4272d09196	[libc][NFC] Cleanup the RPC server implementation prior to installing This does some simple cleanup prior to landing the patch to install these. Differential Revision: https://reviews.llvm.org/D153439	2023-06-21 11:14:20 -05:00
Joseph Huber	964a535bfa	[libc] Remove flexible array and replace with a template Currently the implementation of the RPC interface requires a flexible struct. This caused problems when compiling the RPC server with GCC as would be required if trying to export the RPC server interface. This required that we either move to the `x[1]` workaround or make it a template parameter. While just using `x[1]` would be much less noisy, this is technically undefined behavior. For this reason I elected to use templates. The downside to using templates is that the server code must now be able to handle multiple different types at runtime. I was unable to find a good solution that didn't rely on type erasure so I simply branch off of the given value. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D153304	2023-06-20 15:22:37 -05:00
Joseph Huber	490958b9ea	[libc][obvious] Actually return the value from `malloc` for NVPTX Switching to this interface we neglected to actually write the output from the malloc call to the RPC buffer. Fix this so the tests pass again. Differential Revision: https://reviews.llvm.org/D153069	2023-06-15 15:13:11 -05:00
Joseph Huber	dcdfc963d7	[libc] Export GPU extensions to `libc` for external use The GPU port of the LLVM C library needs to export a few extensions to the interface such that users can interface with it. This patch adds the necessary logic to define a GPU extension. Currently, this only exports a `rpc_reset_client` function. This allows us to use the server in D147054 to set up the RPC interface outside of `libc`. Depends on https://reviews.llvm.org/D147054 Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152283	2023-06-15 11:02:24 -05:00
Joseph Huber	719d77ed28	[libc] Begin implementing a library for the RPC server This patch begins providing a generic static library that wraps around the raw `rpc.h` interface. As discussed in the corresponding RFC, https://discourse.llvm.org/t/rfc-libc-exporting-the-rpc-interface-for-the-gpu-libc/71030, we want to begin exporting RPC services to external users. In order to do this we decided to not expose the `rpc.h` header by wrapping around its functionality. This is done with a C-interface as we make heavy use of callbacks and allows us to provide a predictable interface. Reviewed By: JonChesterfield, sivachandra Differential Revision: https://reviews.llvm.org/D147054	2023-06-15 11:02:23 -05:00
Joseph Huber	168fa31816	[libc] Fix some tests on NVPTX due to insufficient stack size A few of these tests were disabled due to failing on NVPTX. After looking into it the vast majority of these cases were due to insufficient stack memory. This can be worked around by increasing the stack size in the loader or by reducing the memory usage in the case of large string constants. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D152583	2023-06-09 16:42:14 -05:00
Joseph Huber	e6a350df10	[libc] Replace the `PRINT_TO_STDERR` opcode for RPC printing. A previous patch added general support for printing via the RPC interface. We should consolidate this functionality and get rid of the old opcode that was used for simple testing. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D152211	2023-06-05 19:28:30 -05:00
Joseph Huber	a59e1712fa	[libc][obvious] Fix conditional when CUDA is not found If CUDA is not found this string will expand into nothing. We need to surround it with a string otherwise it will cause build failures. Differential Revision: https://reviews.llvm.org/D152209	2023-06-05 18:51:23 -05:00
Joseph Huber	e6c401b5e8	[libc] Add initial support for 'puts' and 'fputs' to the GPU This patch adds the initial support required to support basic printing in `stdio.h` via `puts` and `fputs`. This is done using the existing LLVM C library `File` API. In this sense we can think of the RPC interface as our system call to dump the character string to the file. We carry a `uintptr_t` reference as our native "file descriptor" as it will be used as an opaque reference to the host's version once functions like `fopen` are supported. For some unknown reason the declaration of the `StdIn` variable causes both the AMDGPU and NVPTX backends to crash if I use the `READ` flag. This is not used currently as we only support output now, but it needs to be fixed. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D151282	2023-06-05 17:56:55 -05:00
Joseph Huber	a621308881	[libc] Implement basic `malloc` and `free` support on the GPU This patch adds support for the `malloc` and `free` functions. These currently aren't implemented in-tree so we first add the interface files. This patch provides the most basic support for a true `malloc` and `free` by using the RPC interface. This is functional, but in the future we will want to implement a more intelligent system and primarily use the RPC interface more as a `brk()` or `sbrk()` interface only called when absolutely necessary. We will need to design an intelligent allocator in the future. The semantics of these memory allocations will need to be checked. I am somewhat iffy on the details. I've heard that HSA can allocate asynchronously which seems to work with my tests at least. CUDA uses an implicit synchronization scheme so we need to use an explicitly separate stream from the one launching the kernel or the default stream. I will need to test the NVPTX case. I would appreciate if anyone more experienced with the implementation details here could chime in for the HSA and CUDA cases. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D151735	2023-06-05 17:56:53 -05:00
Joseph Huber	e826762a08	[libc] More efficiently send bytes via `send_n` and `recv_n` Currently we have the `send_n` and `recv_n` routines to stream data, such as a string to print, to the other side. The first operation is to send the size so the other side knows the number of bytes to receive. However, this wasted 56 bytes that could've been sent. This meant that small values, like the arguments to a function to call on the host for example, needed to perform an extra send. This patch sends the first 56 bytes in the first packet and continues if necessary. Depends on D150992 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D151041	2023-05-23 10:59:47 -05:00
Joseph Huber	182e5acb11	[libc] Check the RPC server once again after the kernel exits We support asynchronous sends, that means that the kernel can issue a send, then exit the kernel as we do with the `EXIT` syscall. Because of the condition it's therefore possible for the kernel to exit and break from the loop before we check the server again. This can potentially cause us to ignore an `EXIT` call from the GPU. Reviewed By: JonChesterfield, lntue Differential Revision: https://reviews.llvm.org/D150456	2023-05-12 12:49:19 -05:00
Joseph Huber	d21e507cfc	[libc] Implement a generic streaming interface in the RPC Currently we provide the `send_n` and `recv_n` functions. These were somewhat divergent and not tested on the GPU. This patch changes the support to be more common. We do this by making the CPU provide an array at least as large as the lane size while the GPU can rely on the private memory address of its stack variables. This allows us to send data back and forth generically. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D150379	2023-05-11 11:55:41 -05:00
Joseph Huber	30093d6be2	[libc][obvious] Fix undefined variable after name change I forgot that we still used these variables in the loaders. Differential Revision: https://reviews.llvm.org/D150362	2023-05-11 09:00:08 -05:00
Joseph Huber	4a2e50e4f4	[libc][NFC] Clean up some code in the RPC implementation. Small cleanup of the server code and fixes a constant name not following the naming convention. Differential Revision: https://reviews.llvm.org/D150361	2023-05-11 08:22:55 -05:00
Jon Chesterfield	bbeae142bf	[libc][rpc] Allocate a single block of shared memory instead of three Allows moving the pointer swap between server and client into reset. Single allocation simplifies whatever allocates the client/server, currently the libc loaders. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D150337	2023-05-11 03:04:56 +01:00
Joseph Huber	c8c19e1c31	[libc] Fix RPC interface when sending and receiving arbitrary packets The interface exported by the RPC library allows users to simply send and receive fixed sized packets without worrying about the data motion underneath. However, this was broken in the current implementation. We can think of the send and receive implementations in terms of waiting for ownership of the buffer, using the buffer, and posting ownership to the other side. Our implementation of `recv` was incorrect in the following scenarios. recv -> send // we still own the buffer and should give away ownership recv -> close // The other side is not waiting for data, this will result in multiple openings of the same port This patch attempts to fix this with an admittedly hacky fix where we track if the previous implementation was a recv and post conditionally. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D150327	2023-05-10 18:51:38 -05:00
Jon Chesterfield	f497611f43	[libc][rpc] Allocate locks array within process Replaces the globals currently used. Worth changing to a bitmap before allowing runtime number of ports >> 64. One bit per port is likely to be cheap enough that sizing for the worst case is always fine, otherwise in the future we can change to dynamically allocating it. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D150309	2023-05-11 00:41:51 +01:00
Joseph Huber	aea866c12c	[libc] Support concurrent RPC port access on the GPU Previously we used a single port to implement the RPC. This was sufficient for single threaded tests but can potentially cause deadlocks when using multiple threads. The reason for this is that GPUs make no forward progress guarantees. Therefore one group of threads waiting on another group of threads can spin forever because there is no guarantee that the other threads will continue executing. The typical workaround for this is to allocate enough memory that a sufficiently large number of work groups can make progress. As long as this number is somewhat close to the amount of total concurrency we can obtain reliable execution around a shared resource. This patch enables using multiple ports by widening the arrays to a predetermined size and indexes into them. Empty ports are currently obtained via a trivial linear scan. This should be improved in the future for performance reasons. Portions of D148191 were applied to achieve parallel support. Depends on D149581 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D149598	2023-05-05 10:12:19 -05:00
Joseph Huber	901266dad3	[libc] Change GPU startup and loader to use multiple kernels The GPU has a different execution model to standard `_start` implementations. On the GPU, all threads are active at the start of a kernel. In order to correctly initialize and call the constructors we want single threaded semantics. Previously, this was done using a makeshift global barrier with atomics. However, it should be easier to simply put the portions of the code that must be single threaded in separate kernels and then call those with only one thread. Generally, mixing global state between kernel launches makes optimizations more difficult, similarly to calling a function outside of the TU, but for testing it is better to be correct. Depends on D149527 D148943 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D149581	2023-05-04 19:31:41 -05:00
Joseph Huber	507edb52f9	[libc] Enable multiple threads to use RPC on the GPU The execution model of the GPU expects that groups of threads will execute in lock-step in SIMD fashion. It's both important for performance and correctness that we treat this as the smallest possible granularity for an RPC operation. Thus, we map multiple threads to a single larger buffer and ship that across the wire. This patch makes the necessary changes to support executing the RPC on the GPU with multiple threads. This requires some workarounds to mimic the model when handling the protocol from the CPU. I'm not completely happy with some of the workarounds required, but I think it should work. Uses some of the implementation details from D148191. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D148943	2023-05-04 19:31:41 -05:00
Joseph Huber	2e1c0ec629	[libc] Support global constructors and destructors on NVPTX This patch adds the necessary hacks to support global constructors and destructors. This is an incredibly hacky process caused by the primary fact that Nvidia does not provide any binary tools and very little linker support. We first had to emit references to these functions and their priority in D149451. Then we dig them out of the module once it's loaded to manually create the list that the linker should have made for us. This patch also contains a few Nvidia specific hacks, but it passes the test, albeit with a stack size warning from `ptxas` for the callback. But this should be fine given the resource usage of a common test. This also adds a dependency on LLVM to the NVPTX loader, which hopefully doesn't cause problems with our CUDA buildbot. Depends on D149451 Reviewed By: tra Differential Revision: https://reviews.llvm.org/D149527	2023-05-04 07:13:00 -05:00
Joseph Huber	91528d2058	[libc] Fix printing on the GPU when given a `cpp::string_ref` The implementation of the test printing currently expects a null terminated C-string. However, the `write_to_stderr` interface uses a string view, which doesn't need to be null terminated. This patch changes the printing interface to directly use `fwrite` instead rather than relying on a null terminator. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D149493	2023-04-28 21:32:01 -05:00
Joseph Huber	0bd564a259	[libc] Add a test to directly stimulate the RPC interface Currently, the RPC interface with the loader is only tested if the other tests fail. This test adds a direct test that runs a simple integer increment over the RPC handshake 10000 times. Depends on https://reviews.llvm.org/D148288 Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D148342	2023-04-19 20:02:32 -05:00
Joseph Huber	d0ff5e4030	[libc] Update RPC interface for system utilities on the GPU This patch reworks the RPC interface to allow more generic memory operations using the shared buffer. This patch decomposes the entire RPC interface into opening a port and calling `send` or `recv` on it. The `send` function sends a single packet of the length of the buffer. The `recv` function is paired with the `send` call to then use the data. So, any arbitrary combination of sending packets is possible. The only restriction is that the client initiates the exchange with a `send` while the server consumes it with a `recv`. The operation of this is driven by two independent state machines that track the buffer ownership during loads / stores. We keep track of two so that we can transition between a send state and a recv state without an extra wait. State transitions are observed via bit toggling, e.g. This interface supports an efficient `send -> ack -> send -> ack -> send` interface and allows for the last send to be ignored without checking the ack. A following patch will add some more comprehensive testing to this interface. I informally made an RPC call that simply incremented an integer and it took roughly 10 microseconds to complete an RPC call. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D148288	2023-04-19 20:02:31 -05:00
Joseph Huber	bc11bb3e26	[libc] Add the '--threads' and '--blocks' option to the GPU loaders We will want to test the GPU `libc` with multiple threads in the future. This patch adds the `--threads` and `--blocks` option to set the `x` dimension of the kernel. Using CUDA terminology instead of OpenCL for familiarity. Depends on D148288 D148342 Reviewed By: jdoerfert, sivachandra, tra Differential Revision: https://reviews.llvm.org/D148485	2023-04-19 08:01:58 -05:00
Joseph Huber	dfc162ad3f	[libc] Free the GPU memory allocated in the device loaders Summary: This part was ignored and we just hoped that shutting down the runtime freed these correctly. But it's best to be specific and free the memory we've allocated.	2023-04-03 11:55:32 -05:00
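
Several commits above describe tracking port ownership with a single bit per port ("Treat the locks array as a bitfield") and finding an empty port by scanning. The following is a minimal C++ sketch of that idea, assuming plain host-side `std::atomic`. The names `PortLocks`, `try_lock`, `unlock`, and `acquire_any`, as well as the port count of 64, are hypothetical and are not taken from the libc RPC sources, which additionally need the special handling of lane masks that the commit message mentions and that this sketch ignores.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sizes chosen for illustration only.
constexpr uint32_t NUM_PORTS = 64;
constexpr uint32_t BITS_PER_WORD = 32;

// One bit per port: a set bit means the port is currently owned.
struct PortLocks {
  std::atomic<uint32_t> words[NUM_PORTS / BITS_PER_WORD] = {};

  // Atomically set the bit for `port`. Succeeds only if it was clear before.
  bool try_lock(uint32_t port) {
    uint32_t mask = 1u << (port % BITS_PER_WORD);
    uint32_t old = words[port / BITS_PER_WORD].fetch_or(
        mask, std::memory_order_acquire);
    return (old & mask) == 0;
  }

  // Clear the bit for `port`, releasing ownership.
  void unlock(uint32_t port) {
    uint32_t mask = 1u << (port % BITS_PER_WORD);
    words[port / BITS_PER_WORD].fetch_and(~mask, std::memory_order_release);
  }

  // Linear scan for the first free port. Returns -1 if every port is taken.
  int acquire_any() {
    for (uint32_t port = 0; port < NUM_PORTS; ++port)
      if (try_lock(port))
        return static_cast<int>(port);
    return -1;
  }
};
```

A caller would take a port with `acquire_any()`, use it, and hand it back with `unlock()`. Using `fetch_or` keeps the lock attempt to a single atomic operation per word, at the cost of touching a word shared by 32 ports.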

59 Commits