This transform looks for suitable vector transfers from global memory to shared memory and converts them to async device copies.

Differential Revision: https://reviews.llvm.org/D155569