clang-p2996/llvm/test/Transforms/ScalarizeMaskedMemIntrin/X86
Krzysztof Drewniak 70995a1a33 [ScalarizeMaskedMemIntr] Optimize splat non-constant masks (#104537)
In cases (like the ones added in the tests) where the condition of a
masked load or store is a splat but not a constant (that is, where a masked
operation is being used to implement patterns like "load if the current
lane is in-bounds, otherwise return 0"), optimize the 'scalarized' code
to branch once on the splat's scalar condition and perform an aligned
vector load/store when that condition is true.
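As a sketch of the intent (this example is illustrative, not taken from the patch; the function and value names are made up), the transform turns a masked load whose mask is a non-constant splat of `%cond` into a single branch around a plain vector load, instead of one branch per lane:

```llvm
; Before: the mask is a splat of the scalar %cond, so all lanes agree.
define <4 x i32> @splat_mask_load(ptr %p, i1 %cond) {
  %head = insertelement <4 x i1> poison, i1 %cond, i64 0
  %mask = shufflevector <4 x i1> %head, <4 x i1> poison, <4 x i32> zeroinitializer
  %v = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr %p, i32 4, <4 x i1> %mask, <4 x i32> zeroinitializer)
  ret <4 x i32> %v
}

; After: one branch on %cond; the taken path is an ordinary aligned
; vector load, the other path yields the passthrough value.
define <4 x i32> @splat_mask_load.opt(ptr %p, i1 %cond) {
entry:
  br i1 %cond, label %cond.load, label %exit
cond.load:
  %v = load <4 x i32>, ptr %p, align 4
  br label %exit
exit:
  %res = phi <4 x i32> [ %v, %cond.load ], [ zeroinitializer, %entry ]
  ret <4 x i32> %res
}

declare <4 x i32> @llvm.masked.load.v4i32.p0(ptr, i32 immarg, <4 x i1>, <4 x i32>)
```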

Additionally, while here, take a few steps to preserve aliasing
information and value names when nothing is scalarized.

As motivation, some LLVM IR frontends will generate masked loads/stores
for cases that map to this kind of predicated operation (where either
the whole vector is loaded/stored or it isn't) in order to take
advantage of hardware primitives. On AMDGPU, however, where we don't
have a masked load or store instruction, this pass would scalarize a
load or store that was intended to be - and can be - vectorized, while
also introducing expensive branches.

Fixes #104520

Pre-commit tests at #104527
2024-08-16 16:24:25 -05:00