clang-p2996

Files

Zhuoran Yin ea03bdee70 [MLIR][AMDGPU] Adding Vector transfer_read to load rewrite pattern (#131803)

This PR adds a rewrite pattern that lowers a Vector transfer_read to a
load. A vector transfer read op is lowered to a combination of
`vector.load`, `arith.select` and `vector.broadcast` if:
 - The transfer op is masked.
 - The memref is in buffer address space.
 - The other conditions inherited from `TransferReadToVectorLoadLowering`
   hold.
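
As a rough illustration of what the `vector.load` + `arith.select` +
`vector.broadcast` combination computes, here is a scalar C++ model. It
is only a sketch: `maskedReadViaSelect` is a hypothetical name, not code
from the PR, and the model assumes all N elements at `base` are readable,
whereas the real lowering relies on the buffer load's bounds check for
lanes that fall outside the buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model of the lowered masked read (not the PR's code).
// Assumption: base points at N readable elements.
template <typename T, std::size_t N>
std::array<T, N> maskedReadViaSelect(const T *base,
                                     const std::array<bool, N> &mask,
                                     T padding) {
  assert(base != nullptr && "model requires N readable elements at base");

  // Models vector.load: an unmasked load of all N lanes.
  std::array<T, N> loaded{};
  for (std::size_t i = 0; i < N; ++i)
    loaded[i] = base[i];

  // Models vector.broadcast: splat the padding value across all lanes.
  std::array<T, N> fill{};
  fill.fill(padding);

  // Models arith.select: keep the loaded lane where the mask is set,
  // otherwise take the padding lane.
  std::array<T, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    result[i] = mask[i] ? loaded[i] : fill[i];
  return result;
}
```

The point of this shape is that the load itself is unconditional, so it
can stay a single wide load rather than one conditional load per lane.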

The motivation for this PR is the lack of masked load support in the
amdgpu backend. `llvm.intr.masked.load` lowers to a series of
conditional scalar loads (see the `scalarize-masked-mem-intrin` pass).
This PR makes it possible to lower a masked transfer_read to a buffer
load with bounds checking, which allows a more optimized global load
access pattern than the existing lowering of `llvm.intr.masked.load`
on vectors.

2025-03-21 08:42:04 -04:00

CMakeLists.txt

[MLIR][AMDGPU] Adding Vector transfer_read to load rewrite pattern (#131803)

2025-03-21 08:42:04 -04:00

EmulateAtomics.cpp

[mlir][AMDGPU] Enable emulating vector buffer_atomic_fadd for bf16 on gfx942 (#129029)

2025-03-13 14:30:45 -05:00

ResolveStridedMetadata.cpp

[mlir][AMDGPU] Plumb address space 7 through MLIR, add address_space attr. (#125594)

2025-02-26 16:02:39 -06:00

TransferReadToLoad.cpp

[MLIR][AMDGPU] Adding Vector transfer_read to load rewrite pattern (#131803)

2025-03-21 08:42:04 -04:00