`select i1 non-const, i1 true, i1 false` has been optimized to
`non-const`. There is no reason that we can not optimize `select i1
ConstExpr, i1 true, i1 false` to `ConstExpr`.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D151631
Unlike every other analysis and transform, simplifyInstruction
permitted operating on instructions which are not inserted
into a function. This created an edge case no other code needs
to really worry about, and limited transforms in cases that
can make use of the context function. Only the inliner and a handful
of other utilities were making use of this, so just fix up these
edge cases. Results in some IR ordering differences since
cloned blocks are inserted eagerly now. Plus some additional
simplifications trigger (e.g. some add 0s now folded out that
previously didn't).
This reverts commit 0012b94a4e.
Reverting due to test failures introduced by 73925ef8b0
Updated floating-point-compare.ll to keep the assume declaration.
This is generally handled already in early CSE.
If a specialized pipeline is used, however, its possible for `i1`
operand with known-zero denominator to slip through. Generally the
known-zero denominator is caught and poison is returned, but if it is
indirect enough (known zero through a phi node) we can miss this case
in `InstructionSimplify` and then miss handling `i1`. This is because
`i1` is current handled with the following check:
`if(Known.countMinLeadingZeros() == Known.getBitWidth() - 1)`
which only works on the assumption we don't know the denominator to be
zero. If we know the denominator to be zero, this check fails:
https://github.com/llvm/llvm-project/issues/62607
This patch simply adds an explicit `if(Known.isZero) return poison;`
which fixes the issue.
Alive2 Link for tests:
https://alive2.llvm.org/ce/z/VTw54n
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D150142
Following the change in shufflevector semantics,
poison will be used to represent undefined elements in shufflevector masks.
Differential Revision: https://reviews.llvm.org/D149256
Use simplifySelectWithICmpEq to handle the implied equalities from the icmp-and,
then both of ICMP_NE and ICMP_EQ will be addressed including vector type.
(X & Y) == -1 ? X : -1 --> -1 (commuted 2 ways)
(X & Y) != -1 ? -1 : X --> -1 (commuted 2 ways)
This is a supplement to the icmp-or scenario on D148986.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D149229
Use simplifySelectWithICmpEq to handle the implied equalities from the icmp-or,
then both of ICMP_NE and ICMP_EQ will be addressed
(X | Y) == 0 ? X : 0 --> 0 (commuted 2 ways)
(X | Y) != 0 ? 0 : X --> 0 (commuted 2 ways)
Fixes https://github.com/llvm/llvm-project/issues/62263
Reviewed By: nikic, RKSimon
Differential Revision: https://reviews.llvm.org/D148986
Relative to the previous attempt, this includes a bailout for phi
nodes, whose arguments might refer to a previous cycle iteration.
We did not hit this before by a fortunate deficiency of the
ConstantFoldInstOperands() API, which doesn't handle phi nodes,
unlike ConstantFoldInstruction().
-----
Instead of hardcoding a few instruction kinds, use the generic
interface now that we have it.
The primary effect of this is that intrinsics are now supported.
It's worth noting that this is still limited in that it does not
support vectors, so we can't remove e.g. existing fshl special
cases.
Instead of hardcoding a few instruction kinds, use the generic
interface now that we have it.
The primary effect of this is that intrinsics are now supported.
It's worth noting that this is still limited in that it does not
support vectors, so we can't remove e.g. existing fshl special
cases.
The threading folds are the same for div/rem and the isDivZero()
fold only differes in the return value.
This should be NFC, but as this slightly shuffles around the
order of the folds it might not be exactly the same.
This is a special case where we know the division result is zero.
For the unsigned case this is handled by generic icmp code, for
the signed case we do need to special case it.
There is no getNullValue in ConstantFP. Due to inheritance, we're calling
Constant::getNullValue which handles any type including FP.
Since we already know we want an FP constant we can use ConstantFP::getZero
which might be faster and is a more readable name for an FP zero.
Previous logic only applied for `ConstantInt` which misses all vector
cases. New code works for splat/non-splat vectors as well. No change
to the underlying simplifications.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D147275
When threading operations over phis, we need to adjust the context
instruction to the terminator of the incoming block. This was
handled when threading icmps, but not when threading binops.
Fixes https://github.com/llvm/llvm-project/issues/61312.
Address the dominating condition, the urem fold is benefit from the analytics improvements.
Fix https://github.com/llvm/llvm-project/issues/60546
NOTE: delete the calls in simplifyBinaryIntrinsic and foldICmpWithDominatingICmp
is used to reduce compile time.
Reviewed By: nikic, arsenm, erikdesjardins
Differential Revision: https://reviews.llvm.org/D144248
This is a generalization of a suggestion from issue #60799
that allows removing a redundant guard of an input value
via icmp+select. It should also solve issue #60801.
This only comes into play for a select with an equality
condition where we are trying to substitute a constant into
the false arm of a select. (A 'true' select arm substitution
allows "refinement", so it is not on this code path.)
The constant must be the same in the compare and the select,
and it must be a "binop absorber" (X op C = C). That query
currently includes 'or', 'and', and 'mul', so there are tests
for all of those opcodes.
We then use "impliesPoison" on the false arm binop and the
original "Op" to be replaced to ensure that the select is not
actually blocking poison from leaking. That could be
potentially expensive as we recursively test each operand, but
it is currently limited to a depth of 2. That's enough to catch
our motivating cases, but probably nothing more complicated
(although that seems unlikely).
I don't know how to generalize a proof for Alive2 for this, but
here's a positive and negative test example to help illustrate
the subtle logic differences of poison/undef propagation:
https://alive2.llvm.org/ce/z/Sz5K-c
Differential Revision: https://reviews.llvm.org/D144493
As described in PR54002, the "icmp of load from global" special
case in CaptureTracking is incorrect, as is the InstSimplify code
that makes use of it.
This moves the special case from CaptureTracking into InstSimplify
to limit its scope to the buggy transform only, and not affect
other code using CaptureTracking as well.
InstCombine is supposed to be a superset of InstSimplify, but
failed to invoke load simplification.
Unfortunately, this causes a minor compile-time regression, which
will be mitigated in a future commit.
define i1 @compare_vscales() {
%vscale = call i64 @llvm.vscale.i64()
%vscalex2 = shl nuw nsw i64 %vscale, 1
%vscalex4 = shl nuw nsw i64 %vscale, 2
%cmp = icmp ult i64 %vscalex2, %vscalex4
ret i1 %cmp
}
This IR is currently emitted by LLVM. This icmp is redundant as this snippet
can be simplified to true or false as both operands originate from the same
@llvm.vscale.i64() call.
Differential Revision: https://reviews.llvm.org/D142542
We can only fold insertvalue undef, (extractvalue x, n) to x
if x is not poison, otherwise we might be replacing undef with
poison (https://alive2.llvm.org/ce/z/fnw3c8). The insertvalue
poison case is always fine.
I didn't go to particularly large effort to preserve cases where
folding with undef is still legal (mainly when there is a chain of
multiple inserts that end up covering the whole aggregate),
because this shouldn't really occur in practice: We should always
be generating the insertvalue poison form when constructing
aggregates nowadays.
Differential Revision: https://reviews.llvm.org/D144106