clang-p2996/llvm/test/CodeGen/X86/vector-unsigned-cmp.ll
Simon Pilgrim 5b89aaab00 [X86] Fold concat(PCMP*(),PCMP*()) -> CMPPS(concat,concat) on AVX1 targets (#95915)
This is a more restricted solution to #82242 (vs the more general #82290 + #84360): if we're concat'ing PCMPEQ/GT nodes to 256-bits on an AVX1 target, we determine whether the integer values are in bounds to allow them to be converted to FP for a (legal) float comparison.

By performing this inside combineConcatVectorOps and working on PCMPEQ/GT nodes and not ICMP, we delay the fold until after more lowering has occurred, which avoids many of the issues where we were getting 'stuck' with CMPPS or unnecessary 256-bit nodes, and can more easily determine if either of the new concats() will be free.

Additionally, this patch requires BOTH comparison operands to be in range. While not technically required, this helps avoid the remaining regressions. It doesn't require one of the operands to be constant, as that constraint didn't seem necessary.
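
To illustrate why an "in bounds" check matters, here is a minimal standalone sketch. It is not the code from this patch. It assumes that "in bounds" means "exactly representable as a 32-bit float", which holds for every integer with magnitude up to 2^24 because float has a 24-bit significand. The names fitsInFloatExactly, intGreater and floatGreater are hypothetical helpers made up for this example.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): true if v converts to float
// without rounding. float has a 24-bit significand, so every integer with
// |v| <= 2^24 is represented exactly.
bool fitsInFloatExactly(int32_t v) {
  constexpr int32_t kLimit = 1 << 24;
  return v >= -kLimit && v <= kLimit;
}

// Reference integer comparison (what PCMPGT computes per lane).
bool intGreater(int32_t a, int32_t b) { return a > b; }

// The same comparison carried out after converting both operands to float
// (what a float compare would compute per lane).
bool floatGreater(int32_t a, int32_t b) {
  return static_cast<float>(a) > static_cast<float>(b);
}
```

When both operands pass fitsInFloatExactly, intGreater and floatGreater always agree. Outside that range they can differ: 16777217 (2^24 + 1) rounds to 16777216.0f, so comparing it against 16777216 as floats reports "not greater" even though the integer comparison reports "greater".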

I've reused some of the code from #82290, and we may be able to add additional functionality (more CondCode patterns, v4i64/v4f64 handling, 'bitcastable' integers etc.) in future patches.

Fixes #82242
2024-06-28 15:24:06 +01:00

26 KiB