clang-p2996

Files

Joseph Huber 2d8106cb5a [Clang] Add width handling for <gpuintrin.h> shuffle helper (#125896)

Summary:
The CUDA implementation has long supported the `width` argument on its
shuffle instructions, which makes it more difficult to replace those
uses with this helper. This patch implements `width` for AMDGPU and
NVPTX so that the helper is equivalent to `__shfl_sync` in CUDA, which
will ease porting.

Fortunately, the extra width handling is optimized out correctly when
a known width is passed in.
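
To illustrate the `width` semantics the summary refers to, here is a minimal host-side sketch in C++ that models how a width-limited indexed shuffle picks its source lane, following the behavior documented for CUDA's `__shfl_sync`. It is not the code from this patch: the function name `shuffle_idx`, the constant `kLanes`, and the fixed 32-lane warp are assumptions made for illustration (AMDGPU wavefronts may have 64 lanes).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed warp size for this sketch; real hardware may differ.
constexpr uint32_t kLanes = 32;

// Hypothetical host-side model of a width-limited indexed shuffle.
// The warp is split into segments of `width` lanes, and every lane reads
// the value held by lane (idx % width) of its own segment.
std::array<uint32_t, kLanes> shuffle_idx(const std::array<uint32_t, kLanes> &vals,
                                         uint32_t idx, uint32_t width) {
  // As in CUDA, width is expected to be a power of two no larger than the warp.
  assert(width != 0 && width <= kLanes && (width & (width - 1)) == 0);

  std::array<uint32_t, kLanes> out{};
  for (uint32_t lane = 0; lane < kLanes; ++lane) {
    uint32_t base = lane & ~(width - 1); // first lane of this lane's segment
    uint32_t src = base + (idx % width); // source lane within that segment
    out[lane] = vals[src];
  }
  return out;
}
```

In this model, when `width` equals the full warp size the segment base is always 0 and the function reduces to an ordinary indexed shuffle, which matches the summary's point that the extra work disappears for known widths.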

2025-02-05 12:38:48 -06:00

allocator.cpp

…

allocator.h

…

CMakeLists.txt

[libc] Switch to using the generic <gpuintrin.h> implementations (#121810)

2025-01-07 13:08:39 -06:00

utils.h

[Clang] Add width handling for <gpuintrin.h> shuffle helper (#125896)

2025-02-05 12:38:48 -06:00