clang-p2996/mlir/lib/Dialect/NVGPU/IR
Guray Ozen 8dd0d95c7c [mlir][nvgpu] Add nvgpu.tma.async.store (#77811)
This PR adds the `nvgpu.tma.async.store` Op for asynchronous stores using the
Tensor Memory Accelerator (TMA) unit.

It also implements lowering of the Op to the NVVM dialect. The Op currently
performs asynchronous stores of a tile memory region from shared to
global memory for a single CTA.
2024-01-15 11:44:51 +01:00