clang-p2996/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp at 1485fd295b2ab7fde3bf8d3ce33912c8a5fe1105


aartbik 1485fd295b [mlir] [VectorOps] Improve scatter/gather CPU performance

Replaced the linearized address with the proper LLVM way of
defining a vector of base + indices in SIMD style. This yields
much better code. Some prototype results from microbenchmarking
sparse matrix x vector with 50% sparsity (about 2-3x faster):

         LINEARIZED     IMPROVED
GFLOPS  sdot  saxpy     sdot saxpy
16x16    1.6   1.4       4.4  2.1
32x32    1.7   1.6       5.8  5.9
64x64    1.7   1.7       6.4  6.4
128x128  1.7   1.7       5.9  5.9
256x256  1.6   1.6       6.1  6.0
512x512  1.4   1.4       4.9  4.7
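
To make the "vector of base + indices" idea concrete, here is a minimal scalar sketch of what a masked gather computes per lane, and how a sparse dot product can be built on it. It is a reference model only, not the lowering code from this commit; the names `maskedGather` and `sparseDot`, the fixed lane count `N`, and the `baseSize` bounds check are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar reference model of a masked gather.
// Lane i reads base[indices[i]] when mask[i] is set; otherwise it keeps
// passThru[i]. Each lane's address is base + indices[i], which is the
// "vector of base + indices" form described in the commit message.
template <std::size_t N>
std::array<float, N> maskedGather(const float *base, std::size_t baseSize,
                                  const std::array<int, N> &indices,
                                  const std::array<bool, N> &mask,
                                  const std::array<float, N> &passThru) {
  std::array<float, N> result = passThru;
  for (std::size_t i = 0; i < N; ++i) {
    if (!mask[i])
      continue; // masked-off lanes never touch memory
    assert(indices[i] >= 0 &&
           static_cast<std::size_t>(indices[i]) < baseSize);
    result[i] = base[indices[i]];
  }
  return result;
}

// Hypothetical sparse dot product over one vector of N stored entries:
// values[i] is a stored matrix entry and cols[i] is its column, so it is
// multiplied with x[cols[i]]. Masked-off lanes contribute nothing.
template <std::size_t N>
float sparseDot(const std::array<float, N> &values,
                const std::array<int, N> &cols,
                const std::array<bool, N> &mask,
                const float *x, std::size_t xSize) {
  const std::array<float, N> zeros{}; // pass-through for inactive lanes
  const std::array<float, N> gathered =
      maskedGather<N>(x, xSize, cols, mask, zeros);
  float sum = 0.0f;
  for (std::size_t i = 0; i < N; ++i)
    if (mask[i])
      sum += values[i] * gathered[i];
  return sum;
}
```

In LLVM IR, the same per-lane addressing is typically expressed as a `getelementptr` with a vector of indices, producing a vector of pointers that is passed to the `llvm.masked.gather` intrinsic; the loop above only models the resulting values.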

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D84368

2020-07-22 23:47:36 -07:00

54 KiB
