clang-p2996

Files

Guray Ozen 4319e1916d [mlir][nvgpu] Introduce Multicast Capability to nvgpu.tma.async.load (#76935)

This PR extends the `nvgpu.tma.async.load` Op with multicast support.
The lower-level `nvvm.cp.async.bulk.tensor.shared.cluster.global` NVVM
Op already had this capability; this PR lowers the multicast mask
information to that NVVM Op.

2024-01-05 10:48:55 +01:00

CMakeLists.txt

[mlir][transform] Add NVGPU to NVVM conversion via transform.apply_conversion_patterns

2023-08-09 14:09:57 +00:00

NVGPUTransformOps.cpp

[mlir][nvgpu] Introduce Multicast Capability to nvgpu.tma.async.load (#76935)

2024-01-05 10:48:55 +01:00