We provide the `send_n` and `recv_n` utilities as a generic way to stream data between the two sides of the process. This was previously tested and worked as expected with a string of constant size. However, when the size was allowed to diverge between the threads in a warp or wavefront, the call could deadlock. This did not occur on NVPTX because of the explicit warp sync. On AMD, however, one work item in the wavefront could continue executing and reach the next `recv` call before the other threads, which violated the RPC invariants and caused the deadlock.

This patch makes the loop condition a thread ballot, so every thread in the warp or wavefront keeps executing the loop until all of them can exit. This acts as a more explicit wavefront sync.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D150992
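The ballot-controlled loop described above can be modelled on the host as a rough sketch. This is not the code from the patch: `CHUNK`, `ballot`, and `send_n_rounds` are hypothetical names chosen for illustration, and the wavefront is emulated by a plain vector holding one size per lane.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed number of bytes moved per packet (illustrative value only).
constexpr uint64_t CHUNK = 8;

// Host-side stand-in for a hardware ballot: bit i of the result is set when
// lane i still has data left at offset idx.
uint64_t ballot(const std::vector<uint64_t> &sizes, uint64_t idx) {
  assert(sizes.size() <= 64 && "the mask only has room for 64 lanes");
  uint64_t mask = 0;
  for (std::size_t lane = 0; lane < sizes.size(); ++lane)
    if (idx < sizes[lane])
      mask |= uint64_t(1) << lane;
  return mask;
}

// Counts the loop iterations that every lane takes part in. The loop exits
// only when no lane reports remaining data, so a lane with a short buffer
// cannot run ahead to the next call while its neighbours are still sending.
uint64_t send_n_rounds(const std::vector<uint64_t> &sizes) {
  uint64_t rounds = 0;
  for (uint64_t idx = 0; ballot(sizes, idx) != 0; idx += CHUNK)
    ++rounds;
  return rounds;
}
```

With per-lane sizes of 3, 20, 8, and 0 bytes, `send_n_rounds` returns 3: the lane with 20 bytes needs three 8-byte packets, and the other lanes stay in the loop with it rather than exiting after zero or one iteration.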
56 lines
926 B
CMake
add_custom_target(libc-startup-tests)
add_dependencies(libc-integration-tests libc-startup-tests)

add_integration_test(
  startup_args_test
  SUITE libc-startup-tests
  SRCS
    args_test.cpp
  ARGS
    1 2 3
  ENV
    FRANCE=Paris
    GERMANY=Berlin
)

add_integration_test(
  startup_rpc_test
  SUITE libc-startup-tests
  SRCS
    rpc_test.cpp
  DEPENDS
    libc.src.__support.RPC.rpc_client
    libc.src.__support.GPU.utils
  LOADER_ARGS
    --blocks-x 2
    --blocks-y 2
    --blocks-z 2
    --threads-x 4
    --threads-y 4
    --threads-z 4
)

add_integration_test(
  init_fini_array_test
  SUITE libc-startup-tests
  SRCS
    init_fini_array_test.cpp
)

add_integration_test(
  startup_rpc_interface_test
  SUITE libc-startup-tests
  SRCS
    rpc_interface_test.cpp
)

add_integration_test(
  startup_rpc_stream_test
  SUITE libc-startup-tests
  SRCS
    rpc_stream_test.cpp
  LOADER_ARGS
    --threads 32
    --blocks 8
)