[X86][AVX512] Use comx for compare (#113567)
AVX10.2 COMEF ISA support was added to LLVM, but the scenarios shown
below are not yet optimized correctly.
Summary
Input
```
define i1 @oeq(float %x, float %y) {
%1 = fcmp oeq float %x, %y
ret i1 %1
}

define i1 @une(float %x, float %y) {
%1 = fcmp une float %x, %y
ret i1 %1
}

define i1 @ogt(float %x, float %y) {
%1 = fcmp ogt float %x, %y
ret i1 %1
}
// Default code generation prior to AVX10.2
oeq: # @oeq
cmpeqss xmm0, xmm1
movd eax, xmm0
and eax, 1
ret
une: # @une
cmpneqss xmm0, xmm1
movd eax, xmm0
and eax, 1
ret
ogt: # @ogt
ucomiss xmm0, xmm1
seta al
ret
```
This patch removes `cmpeqss` and `cmpneqss` from the generated code. See
the unit tests for the complete transform.
This continues the work added in PR
https://github.com/llvm/llvm-project/pull/113098.
Previously, legalization and combine expanded the `setcc oeq:ch` node
into an `and` of `setcc eq` and `setcc o`. Following suggestions from the
community, the new internal transform is:
```
Optimized type-legalized selection DAG: %bb.0 'hoeq:'
SelectionDAG has 11 nodes:
t0: ch,glue = EntryToken
t2: f16,ch = CopyFromReg t0, Register:f16 %0
t4: f16,ch = CopyFromReg t0, Register:f16 %1
t14: i8 = setcc t2, t4, setoeq:ch
t10: ch,glue = CopyToReg t0, Register:i8 $al, t14
t11: ch = X86ISD::RET_GLUE t10, TargetConstant:i32<0>, Register:i8 $al, t10:1
Optimized legalized selection DAG: %bb.0 'hoeq:'
SelectionDAG has 12 nodes:
t0: ch,glue = EntryToken
t2: f16,ch = CopyFromReg t0, Register:f16 %0
t4: f16,ch = CopyFromReg t0, Register:f16 %1
t15: i32 = X86ISD::UCOMX t2, t4
t17: i8 = X86ISD::SETCC TargetConstant:i8<4>, t15
t10: ch,glue = CopyToReg t0, Register:i8 $al, t17
t11: ch = X86ISD::RET_GLUE t10, TargetConstant:i32<0>, Register:i8 $al, t10:1
```
The earlier transform is described here:
https://github.com/llvm/llvm-project/pull/113098#discussion_r1810307663
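For readers unfamiliar with the earlier expansion, the following is a minimal C++ sketch of its semantics, not LLVM code. The helper names (`fcmp_oeq`, `setcc_o`, `setcc_eq`, `fcmp_oeq_expanded`, `check_samples`) are hypothetical, and modeling `setcc eq` as a NaN-agnostic equality that answers true on unordered inputs is an assumption made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Reference semantics of `fcmp oeq`: ordered (no NaN operand) and equal.
// C++ `==` on floats already behaves this way.
bool fcmp_oeq(float x, float y) { return x == y; }

// Models `setcc o`: true when neither operand is NaN.
bool setcc_o(float x, float y) { return !std::isunordered(x, y); }

// Models `setcc eq` as a NaN-agnostic equality.
// Assumption for this sketch: it answers true on unordered inputs,
// which is why it cannot stand in for `oeq` on its own.
bool setcc_eq(float x, float y) { return std::isunordered(x, y) || x == y; }

// The expanded form: `and` of `setcc o` and `setcc eq`.
bool fcmp_oeq_expanded(float x, float y) {
  return setcc_o(x, y) && setcc_eq(x, y);
}

// Checks that the expansion agrees with `oeq` on a few samples, NaN included.
void check_samples() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  const float samples[] = {0.0f, -0.0f, 1.0f, -2.5f, nan};
  for (float x : samples)
    for (float y : samples)
      assert(fcmp_oeq(x, y) == fcmp_oeq_expanded(x, y));
}
```

In the new DAG above, a single `X86ISD::UCOMX` node followed by one `X86ISD::SETCC` takes the place of this two-part check.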
---------
Co-authored-by: mattarde <mattarde@intel.com>