clang-p2996/mlir/test/Integration/GPU/CUDA
Durgadoss R 13d6233e77 [MLIR][NVGPU] Fix nvgpu_arrive syntax in matmulBuilder.py (#113713)
This patch updates the syntax of the nvgpu_arrive Op
in matmulBuilder.py, which fixes the compilation
error in this test.

For the warp-specialized matmul_kernel implementation,
removing the WaitGroupSyncOp (after the mma-main-loop)
fixes the observed hang.

With these two fixes, the test compiles and
executes successfully on an sm90a machine.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2024-10-26 11:15:50 +05:30