clang-p2996

Files

Jakub Kuderski 72003adf6b [mlir][gpu] Allow subgroup reductions over 1-d vector types (#76015)

Each vector element is reduced independently, which is a form of
multi-reduction.

The plan is to allow for gradual lowering of multi-reduction that
results in fewer `gpu.shuffle` ops at the end:
1d `vector.multi_reduction` --> 1d `gpu.subgroup_reduce` --> smaller 1d
`gpu.subgroup_reduce` --> packed `gpu.shuffle` over i32

For example, we can perform 2 independent f16 reductions with a series of
`gpu.shuffle` ops over i32, halving the final number of `gpu.shuffle` ops.
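
The packing idea can be illustrated outside MLIR. The following is only a rough Python simulation, not the lowering from this commit: the function names are hypothetical, the XOR butterfly exchange pattern is an assumption (the commit message does not say which shuffle mode is used), and the subgroup is modelled as a plain list of lanes. Each lane holds two f16 values that are bitcast into one 32-bit word, so a single simulated shuffle per step moves both values.

```python
import struct


def pack_f16x2(a, b):
    """Bitcast two f16 values into one 32-bit integer (hypothetical helper)."""
    return struct.unpack("<I", struct.pack("<2e", a, b))[0]


def unpack_f16x2(word):
    """Bitcast one 32-bit integer back into two f16 values."""
    return struct.unpack("<2e", struct.pack("<I", word))


def subgroup_reduce_add_packed(lanes):
    """Simulate two independent f16 add-reductions across a subgroup.

    `lanes` is a list of (a, b) pairs, one per lane; its length must be a
    power of two. Returns (per-lane results, number of shuffle steps).
    Assumption: lanes exchange data in an XOR butterfly pattern.
    """
    n = len(lanes)
    assert n > 0 and n & (n - 1) == 0, "subgroup size must be a power of two"

    # Round the inputs to f16 precision by a pack/unpack round trip.
    values = [unpack_f16x2(pack_f16x2(a, b)) for a, b in lanes]
    shuffles = 0
    offset = 1
    while offset < n:
        # Each lane packs its two f16 values into a single i32 word.
        words = [pack_f16x2(a, b) for a, b in values]
        # One i32 shuffle per lane: read the word held by lane ^ offset.
        received = [words[lane ^ offset] for lane in range(n)]
        shuffles += 1
        # Unpack the received word and reduce each element independently.
        updated = []
        for (a, b), word in zip(values, received):
            other_a, other_b = unpack_f16x2(word)
            updated.append(unpack_f16x2(pack_f16x2(a + other_a, b + other_b)))
        values = updated
        offset *= 2
    return values, shuffles


lanes = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0)]
results, shuffle_steps = subgroup_reduce_add_packed(lanes)
```

With 4 lanes this simulation takes 2 shuffle steps for both reductions together, whereas shuffling each f16 value separately would take 2 steps per reduction, or 4 in total.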

2023-12-21 11:55:43 -05:00

builtins-opencl.mlir

…

builtins-vulkan.mlir

…

entry-point.mlir

…

gpu-to-spirv.mlir

…

load-store.mlir

…

module-opencl.mlir