This is very likely the cause of a stage 2 failure in
Transforms/LoopVectorize/check-prof-info.ll. Revert until I can
investigate this.
This reverts commit 3d199d086e.
Support replacement of operands not only in the immediate
instruction, but also instructions it uses.
To the most part, this extension is straightforward, but there are
two bits worth highlighting:
First, we can now no longer assume that if the Op is a vector, the
instruction also returns a vector. If Op is a vector and the
instruction returns a scalar, we should consider it as a cross-lane
operation.
Second, for the x ^ x special case, we can no longer assume that
the operand is RepOp, as we might have a replacement higher up the
instruction chain.
There is one optimization regression, but it is in a fuzzer-generated
test case.
Fixes https://github.com/llvm/llvm-project/issues/63104.
With the semantics change from D154051, it is no longer valid to
fold gep inbounds undef to poison (unless we know the index is
non-zero). Fold it to undef instead.
Differential Revision: https://reviews.llvm.org/D154215
Strengthen the fold for icmps of non-overlapping storage, by
working on the difference of offsets, rather than considering
both offsets independently. In particular, this allows handling
comparisons of pointers to the end of equal-sized allocations.
Proofs: https://alive2.llvm.org/ce/z/Po2nL4
Differential Revision: https://reviews.llvm.org/D153752
D143505 fixed/simplified folding of operations with SNaN operands. In
doing so it introduced a crash when handling scalable vector types,
wherein the scalable-vector ConstantVector was cast to a ConstantFP.
Since we know by that point in the code that if we've found a NaN, we're
dealing with a scalable-vector splat (as there are no other kinds of
scalable-vector constant for which that holds), we can grab the splatted
value and re-use the existing code, which will automatically splat the
new NaN back to a scalable vector for us.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D153566
`select i1 non-const, i1 true, i1 false` has been optimized to
`non-const`. There is no reason that we can not optimize `select i1
ConstExpr, i1 true, i1 false` to `ConstExpr`.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D151631
The only value that can produce -0 is exactly -0, no rounding is involved. If the
denormal mode has flushed denormal inputs, a negative value could produce -0.
The constrained intrinsics do not track the denormal mode, and this is just
generally broken in the current set of FP predicates. The move to computeKnownFPClass
will address some of these issues.
This reverts commit 0012b94a4e.
Reverting due to test failures introduced by 73925ef8b0
Updated floating-point-compare.ll to keep the assume declaration.
This is a follow-up to b71edfaa4e
since I forgot the lit.local.cfg files in that one.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: barannikov88, kwk
Differential Revision: https://reviews.llvm.org/D150762
ugt/uge/sgt/sge test for the extension to isImpliedCondOperands
case from issue 62441 and 61393:
(X >> Z) <=(u) Y ==> X <=(to u) Y and (X > Y +_{nuw} 1) ==> X != Y
Differential Revision: https://reviews.llvm.org/D149512
This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.
There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flushing zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.
Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.
Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.
I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.
If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.
This could also be of use for strictfp handling, where code may be
changing the denormal mode.
Alternative name could be "unknown".
Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
With this patch an undefined mask in a shufflevector will be printed as poison.
This change is done to support the new shufflevector semantics
for undefined mask elements.
Differential Revision: https://reviews.llvm.org/D149210
Use simplifySelectWithICmpEq to handle the implied equalities from the icmp-and,
then both of ICMP_NE and ICMP_EQ will be addressed including vector type.
(X & Y) == -1 ? X : -1 --> -1 (commuted 2 ways)
(X & Y) != -1 ? -1 : X --> -1 (commuted 2 ways)
This is a supplement to the icmp-or scenario on D148986.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D149229