clang-p2996

Files

Christopher Bate cafb6284d1 [mlir][VectorToGPU] Update memref stride preconditions on nvgpu.mma.sync path

This change removes the requirement that the row stride be statically known when
converting `vector.transfer_read` and `vector.transfer_write` to distributed
SIMT operations in the `nvgpu` lowering path. It also adds a check to verify
that the last dimension of the source memref is statically known to have stride
1, since this is assumed in the conversion logic. No other change should be
required since the generated `vector.load` operations are never created across
dimensions other than the last. The routines for checking preconditions on
`vector.transfer_read` and `vector.transfer_write` are moved into the nvgpu utilities.
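
The stride precondition described above can be illustrated with a small standalone sketch. This is not the code from the commit and does not use the MLIR API: the `Strides` alias and the `hasStaticUnitInnerStride` helper are hypothetical names, and a dynamic stride is modeled as `std::nullopt`. Rejecting an empty stride list is also an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a memref's stride list, outermost dimension first.
// std::nullopt marks a stride that is only known at runtime (a dynamic stride).
using Strides = std::vector<std::optional<int64_t>>;

// Returns true when the innermost (last) stride is statically known to be 1.
// Outer strides, such as the row stride, may be dynamic and are not inspected.
// Assumption for this sketch: an empty stride list is conservatively rejected.
bool hasStaticUnitInnerStride(const Strides &strides) {
  if (strides.empty())
    return false;
  const std::optional<int64_t> &inner = strides.back();
  return inner.has_value() && *inner == 1;
}
```

Under this model, a 2-D source with a dynamic row stride and a unit inner stride, `Strides{std::nullopt, 1}`, passes the check, while a source whose inner stride is dynamic or not equal to 1 is rejected.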

The change is NFC with respect to the GPU dialect lowering path.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D155753

2023-09-14 13:51:42 -06:00

CMakeLists.txt

…

MMAUtils.cpp

[mlir][VectorToGPU] Update memref stride preconditions on nvgpu.mma.sync path

2023-09-14 13:51:42 -06:00