Commit Graph

246 Commits

Author SHA1 Message Date
Nikita Popov
0df3200931 [ValueTracking] Fix KnownBits conflict for poison-only vector
If all the demanded elements are poison, return unknown instead of
conflict to avoid downstream assertions.

Fixes https://github.com/llvm/llvm-project/issues/75505.
2023-12-21 09:23:47 +01:00
bipmis
64987c648f [ValueTracking] isNonZero sub of ptr2int's with recursive GEP (#68680)
When the arguments of a sub are ptr2int, computeKnownBits() cannot
determine anything useful about them.
In the scalar case, a sub of two ptr2int is generally converted to a sub
of indexes.
However, in a loop with a recursive GEP/PHI where the arguments to the
sub are of type ptr2int, we can return KnownNonZero if it is possible to
determine that the sub of this GEP and another pointer with the same
base is non-zero.
This helps subsequent passes to optimize the loop further.
2023-12-20 14:11:58 +00:00
Nikita Popov
337504683e [ValueTracking] Use isKnownNonEqual() in isNonZeroSub()
(x - y) != 0 is true iff x != y, so use the isKnownNonEqual()
helper, which knows some additional tricks.
2023-12-18 12:26:40 +01:00
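The identity used by 337504683e can be checked exhaustively at a small bit width. The following is a minimal sketch over 8-bit wrapping integers; the helper names are hypothetical and are not part of LLVM.

```python
BITS = 8
MASK = (1 << BITS) - 1

def sub_is_nonzero(x: int, y: int) -> bool:
    """Return True when the wrapping BITS-wide difference x - y is non-zero."""
    return ((x - y) & MASK) != 0

def identity_holds() -> bool:
    """Check (x - y) != 0 <=> x != y for every pair of 8-bit values."""
    return all(
        sub_is_nonzero(x, y) == (x != y)
        for x in range(1 << BITS)
        for y in range(1 << BITS)
    )
```

Because subtraction wraps modulo 2^BITS, the difference is zero only when both operands are the same value, which is why a non-equality query can stand in for the non-zero query.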
Nikita Popov
7c1d8c74e8 [ValueTracking] Add test for non-zero sub via known non equal (NFC) 2023-12-18 12:26:40 +01:00
bipmis
6df6320374 [ValueTracking] isNonEqual Pointers with a recursive GEP (#70459)
Handles canonical icmp eq(ptr1, ptr2) -> where ptr1/ptr2 is a recursive
GEP.
This can help in scenarios where InstCombineCompares folds icmp eq(sub(ptr2int,
ptr2int), 0) -> icmp eq(ptr1, ptr2)
and
icmp eq(phi(sub(ptr2int, ptr2int), ...)) -> phi i1 (icmp eq(sub(ptr2int,
ptr2int), 0), ....)
2023-12-15 10:02:57 +00:00
Nikita Popov
cf47af493b [InstCombine] Generalize folds for inversion of icmp operands (#74317)
We have a bunch of folds that basically perform X pred Y to ~Y pred ~X
for various special cases where this saves an instruction.

Generalize these folds to use isFreeToInvert(). We have to make sure
that we consume an instruction in either of the inversions, otherwise
we're just going to swap the icmp back and forth.

Fixes https://github.com/llvm/llvm-project/issues/74302.
2023-12-08 11:25:41 +01:00
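The equivalence that cf47af493b relies on, X pred Y versus ~Y pred ~X, can be sanity-checked by brute force. This is a sketch at 8 bits with hypothetical helper names; it says nothing about when the fold is profitable, which is what isFreeToInvert() decides in the actual patch.

```python
import operator

BITS = 8
MASK = (1 << BITS) - 1

def to_signed(v: int) -> int:
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def bit_not(v: int) -> int:
    """Bitwise NOT truncated to BITS bits."""
    return ~v & MASK

PREDICATES = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}

def inversion_preserves(pred, signed: bool) -> bool:
    """Check X pred Y == ~Y pred ~X for all 8-bit X and Y."""
    view = to_signed if signed else int
    return all(
        pred(view(x), view(y)) == pred(view(bit_not(y)), view(bit_not(x)))
        for x in range(1 << BITS)
        for y in range(1 << BITS)
    )
```

Bitwise NOT reverses both the signed and the unsigned ordering, so inverting both operands and swapping them leaves every predicate unchanged.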
Allen
ab3fdbdfbe [ValueTracking] Support srem/urem for isKnownNonNullFromDominatingCondition (#74021)
Similar to div, a rem also proves that its second operand is non-zero,
since a zero divisor is UB.

Fix https://github.com/llvm/llvm-project/issues/71782
2023-12-01 16:20:38 +08:00
Craig Topper
03d4a9d94d [InstCombine] Set disjoint flag when turning Add into Or. (#72702)
The disjoint flag was recently added to IR in #72583
2023-11-27 12:54:11 -08:00
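The Add to Or transform in 03d4a9d94d is valid when the operands share no set bits, which is the property the disjoint flag records. A small sketch of that arithmetic fact at 8 bits, with hypothetical names:

```python
BITS = 8
MASK = (1 << BITS) - 1

def add_equals_or_when_disjoint() -> bool:
    """Check a + b == a | b for every 8-bit pair with no common set bits."""
    return all(
        ((a + b) & MASK) == (a | b)
        for a in range(1 << BITS)
        for b in range(1 << BITS)
        if (a & b) == 0
    )

# Counterexample when the operands overlap: 1 + 1 is 2, but 1 | 1 is 1.
OVERLAP_DIFFERS = ((1 + 1) & MASK) != (1 | 1)
```

With no overlapping bits there are no carries, so addition and bitwise OR produce the same result.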
Noah Goldstein
f112e4693a [InstCombine] Don't transform sub X, ~Y -> add X, -Y unless Y is actually negatable
This combine was previously adding an instruction in some cases (see the
tests).

Closes #72767
2023-11-19 12:15:03 -06:00
Dhruv Chawla
076581fd95 [ValueTracking] Implement sdiv/udiv support for isKnownNonNullFromDominatingCondition (#67282)
The second operand of a sdiv/udiv has to be non-null, as division by
zero is UB.

Proofs: https://alive2.llvm.org/ce/z/WttZbb

Fixes https://github.com/llvm/llvm-project/issues/64240.
2023-10-20 09:24:33 +05:30
Noah Goldstein
2dd52b4527 [InstCombine] Improve logic for adding flags to shift instructions.
Instead of relying on constant operands, use known bits to do the
computation.

Proofs: https://alive2.llvm.org/ce/z/M-aBnw

Differential Revision: https://reviews.llvm.org/D157532
2023-10-12 16:05:19 -05:00
Noah Goldstein
444383e0d0 [ValueTracking] Do more thorough non-zero check in isKnownToBePowerOfTwo when OrZero is not set.
We can cover more cases by directly checking if the result is
known-nonzero for common patterns when they are missing `OrZero`.

This patch adds `isKnownNonZero` checks for `shl`, `lshr`, `and`, and `mul`.

Differential Revision: https://reviews.llvm.org/D157309
2023-10-12 16:05:19 -05:00
Noah Goldstein
dfda65c892 [ValueTracking] Add support for non-splat vecs in cmpExcludesZero
Just a small QOL change.
2023-10-12 16:05:19 -05:00
Noah Goldstein
9427fce677 [ValueTracking] Add tests for cmpExcludesZero for non-splat vecs; NFC 2023-10-12 16:05:19 -05:00
Noah Goldstein
50ece4cba9 [ValueTracking] Add better support for ConstantRange(And)
The fairly common power of two pattern `X & -X` can be capped at the
highest power of 2 (signbit set).
2023-10-12 14:12:26 -05:00
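The bound described in 50ece4cba9 can be illustrated by enumerating `X & -X` at a small width. This sketch uses 8 bits and hypothetical names; it only demonstrates the value range, not the ConstantRange code.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN_BIT = 1 << (BITS - 1)

def lowest_set_bit(x: int) -> int:
    """Compute X & -X with wrapping negation at BITS bits."""
    return x & (-x & MASK)

# Every value that X & -X can take for an 8-bit X.
RESULTS = {lowest_set_bit(x) for x in range(1 << BITS)}
```

Every result is zero or a single set bit, and the largest one is the sign bit (128 at this width).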
Noah Goldstein
0f8b40a82e [ValueTracking] Add better support for ConstantRange(Shl)
1) If LHS is constant:
    - The low bits of the LHS is set, the lower bound is non-zero
    - The upper bound can be capped at popcount(LHS) high bits
2) If RHS is constant:
    - The upper bound can be capped at (Width - RHS) high bits
2023-10-12 14:12:26 -05:00
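The Shl bounds listed in 0f8b40a82e can be checked by enumeration. The sketch below assumes 8-bit values, shift amounts below the bit width, and reads "the low bits of the LHS" as the lowest bit; all names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1

def shl(x: int, s: int) -> int:
    """Left shift truncated to BITS bits."""
    return (x << s) & MASK

def high_bits(n: int) -> int:
    """Value with the top n bits set."""
    return (MASK << (BITS - n)) & MASK

def popcount(x: int) -> int:
    return bin(x).count("1")

def constant_lhs_bounds_hold() -> bool:
    for c in range(1 << BITS):
        results = [shl(c, s) for s in range(BITS)]
        # Upper bound: at most popcount(c) bits survive, packed at the top.
        if max(results) > high_bits(popcount(c)):
            return False
        # Lower bound: an odd constant never shifts to zero.
        if (c & 1) and min(results) == 0:
            return False
    return True

def constant_rhs_bound_holds() -> bool:
    # The low s bits of x << s are zero, so only (BITS - s) high bits can be set.
    return all(
        max(shl(x, s) for x in range(1 << BITS)) <= high_bits(BITS - s)
        for s in range(BITS)
    )
```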
Noah Goldstein
457308a46a [ValueTracking] Add more tests for constant ranges; NFC 2023-10-12 14:12:26 -05:00
Dhruv Chawla
3e992d81af [InferAlignment] Enable InferAlignment pass by default
This gives an improvement of 0.6%:
https://llvm-compile-time-tracker.com/compare.php?from=7d35fe6d08e2b9b786e1c8454cd2391463832167&to=0456c8e8a42be06b62ad4c3e3cf34b21f2633d1e&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D158600
2023-09-20 12:08:52 +05:30
Noah Goldstein
dd48a9b056 [ValueTracking] Handle conflicts when computing knownbits of PHI nodes in deadcode; PR65022
Bug was introduced in: https://reviews.llvm.org/D157807

The prior logic assumed that the information from the knownbits
returned from analyzing the `icmp` and its operands in the context
basicblock would be consistent.

This is not necessarily the case if we are analyzing deadcode.

For example with `(icmp sgt (select cond, 0, 1), -1)`. If we take
knownbits for the `select` using knownbits from the operator, we will
know the signbit is zero. If we are analyzing a not-taken edge based
on the `icmp` (deadcode), we will also "know" that the `select` must
be negative (signbit is one). This will result in a conflict in
knownbits.

The fix is to just give up on analyzing the phi-node if it is deadcode.
We 1) don't want to waste time continuing to analyze it and 2) will be
removing it (and its dependencies) later.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D158960
2023-09-01 02:11:50 -05:00
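The conflict described in dd48a9b056 can be reproduced with a toy known-bits model. This is a sketch only: the pair-of-masks representation and the helper names are assumptions for illustration, not LLVM's KnownBits class.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN_BIT = 1 << (BITS - 1)

def known_bits(values):
    """Return (known_zero, known_one) masks shared by every value."""
    known_zero, known_one = MASK, MASK
    for v in values:
        known_one &= v
        known_zero &= ~v & MASK
    return known_zero, known_one

def merge(a, b):
    """Combine two facts about the same value by uniting their known bits."""
    return a[0] | b[0], a[1] | b[1]

def has_conflict(bits) -> bool:
    """A bit that is known zero and known one at once is a conflict."""
    known_zero, known_one = bits
    return (known_zero & known_one) != 0

# select cond, 0, 1 can only produce 0 or 1, so the sign bit is known zero.
from_select = known_bits([0, 1])
# On the edge where (icmp sgt X, -1) is false, X must be negative.
from_dead_edge = known_bits(range(SIGN_BIT, 1 << BITS))
```

Merging the two facts marks the sign bit as both zero and one, which can only happen when the edge is never taken.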
Noah Goldstein
846ff921ff [ValueTracking] Compute sdiv as non-zero if abs(num) u>= abs(denum)
Just covering an additional case.

Proof: https://alive2.llvm.org/ce/z/MJz9fT

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D157302
2023-08-24 19:43:10 -05:00
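The rule in 846ff921ff follows from sdiv rounding toward zero. A brute-force sketch over signed 8-bit operands, skipping the inputs that are UB (division by zero and INT_MIN / -1); names are hypothetical.

```python
INT_MIN, INT_MAX = -128, 127

def sdiv(n: int, d: int) -> int:
    """Signed division rounding toward zero."""
    q = abs(n) // abs(d)
    return q if (n < 0) == (d < 0) else -q

def nonzero_iff_abs_ge() -> bool:
    """Check sdiv(n, d) != 0 exactly when abs(n) >= abs(d)."""
    for n in range(INT_MIN, INT_MAX + 1):
        for d in range(INT_MIN, INT_MAX + 1):
            if d == 0 or (n == INT_MIN and d == -1):
                continue  # undefined behaviour, not considered
            if (sdiv(n, d) != 0) != (abs(n) >= abs(d)):
                return False
    return True
```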
Noah Goldstein
d0b4ed9a2c [ValueTracking] Add tests for knowing sdiv is non-zero; NFC
Differential Revision: https://reviews.llvm.org/D157301
2023-08-24 19:43:10 -05:00
Noah Goldstein
7c9fe735d4 [ValueTracking] Strengthen analysis in computeKnownBits of phi
Use the comparison based analysis to strengthen the standard
knownbits analysis rather than choosing either/or.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D157807
2023-08-22 10:59:03 -05:00
Noah Goldstein
39e9862e6b [ValueTracking] Use predicates for incoming phi-edges to deduce non-zero
This is basically a copy and paste of the same logic we do in
`computeKnownBits` but adapts it for just `isKnownNonZero`.

Differential Revision: https://reviews.llvm.org/D157801
2023-08-22 10:59:02 -05:00
Noah Goldstein
61df774ab7 [ValueTracking] Improve analysis of knownbits from incoming phi edges.
Just fill in missing cases (TODO) for `ugt`, `uge`, `sgt`, `sge`,
`slt`, and `sle`. These are all in the same spirit as `ult`/`uge`, but
each of the other conditions has different constraints.

Proofs: https://alive2.llvm.org/ce/z/gnj4o-

Differential Revision: https://reviews.llvm.org/D157800
2023-08-22 10:59:02 -05:00
Noah Goldstein
fb92c0700b [ValueTracking] Add tests for deducing non-zero based for incoming phi-edges; NFC
Differential Revision: https://reviews.llvm.org/D157798
2023-08-22 10:59:02 -05:00
Noah Goldstein
35d916e11b [ValueTracking] Add tests for getting knownbits from phi-edges; NFC
Differential Revision: https://reviews.llvm.org/D157797
2023-08-22 10:59:02 -05:00
Nikita Popov
69bd66b3ce [Tests] Remove some and/or constant expressions in tests (NFC)
In preparation for their removal in D158081.
2023-08-21 12:05:32 +02:00
Dhruv Chawla
d53b3df570 [InstCombine] Remove unneeded isa<PHINode> check in foldOpIntoPhi
This check is redundant as it is covered by the call to
isPotentiallyReachable.

Depends on D155726.

Differential Revision: https://reviews.llvm.org/D155718
2023-08-16 21:09:08 +05:30
Noah Goldstein
fbb40df2bf [ValueTracking] In isKnownToBeAPowerOfTwo an i1 value is always true if OrZero is set
This is trivially true.

Differential Revision: https://reviews.llvm.org/D157310
2023-08-09 14:42:55 -05:00
Noah Goldstein
dff3d8a279 [ValueTracking] Add support for mul in isKnownToBeAPowerOfTwo
pow2 * pow2 is a power of 2 or zero.

Proof: https://alive2.llvm.org/ce/z/FNiiXd

Differential Revision: https://reviews.llvm.org/D157308
2023-08-09 14:42:55 -05:00
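The claim in dff3d8a279 that pow2 * pow2 is a power of 2 or zero can be seen by enumerating the wrapping products. A sketch at 8 bits with hypothetical names:

```python
BITS = 8
MASK = (1 << BITS) - 1

def is_pow2_or_zero(x: int) -> bool:
    return (x & (x - 1)) == 0

POWERS = [1 << i for i in range(BITS)]
# Wrapping products of every pair of 8-bit powers of two.
PRODUCTS = {(a * b) & MASK for a in POWERS for b in POWERS}
```

The zero case appears when the product overflows the width, for example 16 * 16 at 8 bits.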
Noah Goldstein
4f818daca6 [ValueTracking] Add support for fshl/fshr in isKnownToBeAPowerOfTwo
If the funnel shifts are rotates (op0 == op1) then the number of 1s/0s
doesn't change, so we can just look through op0/op1.

Proofs: https://alive2.llvm.org/ce/z/Pja5yu

Differential Revision: https://reviews.llvm.org/D157307
2023-08-09 14:42:55 -05:00
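The rotate case from 4f818daca6 can be modelled directly. This sketch implements an 8-bit funnel shift left following the fshl definition (concatenate, shift, take the high half); the function names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1

def popcount(x: int) -> int:
    return bin(x).count("1")

def fshl(a: int, b: int, s: int) -> int:
    """Funnel shift left: high BITS of (a:b) << (s mod BITS)."""
    s %= BITS
    wide = (a << BITS) | b
    return ((wide << s) >> BITS) & MASK

def rotate_preserves_popcount() -> bool:
    """With both operands equal, fshl is a rotate and keeps the bit count."""
    return all(
        popcount(fshl(x, x, s)) == popcount(x)
        for x in range(1 << BITS)
        for s in range(BITS)
    )
```

Since a power of two has exactly one set bit, a rotate of a power of two is again a power of two.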
Noah Goldstein
6f4d660d7f [ValueTracking] Add support for bswap and bitreverse in isKnownToBeAPowerOfTwo
Neither of these intrinsics changes the number of 1s/0s, so we can
look directly through them.

Proofs: https://alive2.llvm.org/ce/z/gnZuwC

Differential Revision: https://reviews.llvm.org/D157306
2023-08-09 14:42:55 -05:00
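The same bit-count argument applies to 6f4d660d7f. A sketch with 16-bit models of bswap and bitreverse; the implementations here are illustrative stand-ins for the intrinsics.

```python
BITS = 16

def popcount(x: int) -> int:
    return bin(x).count("1")

def bswap16(x: int) -> int:
    """Swap the two bytes of a 16-bit value."""
    return ((x & 0xFF) << 8) | (x >> 8)

def bitreverse16(x: int) -> int:
    """Reverse the order of the 16 bits."""
    return int(format(x, "016b")[::-1], 2)

def both_preserve_popcount() -> bool:
    return all(
        popcount(bswap16(x)) == popcount(x) == popcount(bitreverse16(x))
        for x in range(1 << BITS)
    )
```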
Noah Goldstein
bab8058d3b [ValueTracking] If OrZero is set, look through trunc in isKnownToBePowerOfTwo
Just more coverage.

Proof: https://alive2.llvm.org/ce/z/H37tVX

Differential Revision: https://reviews.llvm.org/D157304
2023-08-09 14:42:54 -05:00
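Why bab8058d3b requires OrZero for trunc can be shown with a 16-bit to 8-bit truncation. A minimal sketch with hypothetical names:

```python
def is_pow2_or_zero(x: int) -> bool:
    return (x & (x - 1)) == 0

def trunc(x: int, bits: int) -> int:
    """Keep only the low `bits` bits of x."""
    return x & ((1 << bits) - 1)

# Truncate every 16-bit power of two down to 8 bits.
TRUNCATED = [trunc(1 << i, 8) for i in range(16)]
```

The single set bit either survives the truncation or is dropped entirely, so the result is a power of two or zero, but not necessarily non-zero.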
Noah Goldstein
0690817bcc [ValueTracking] Add tests for more isKnownToBeAPowerOfTwo cases; NFC
Differential Revision: https://reviews.llvm.org/D157303
2023-08-09 14:42:54 -05:00
Nikita Popov
d01aec4c76 [InstCombine] Set dead phi inputs to poison in more cases
Set phi inputs to poison whenever we find a dead edge (either
during initial worklist population or the main InstCombine run),
instead of only doing this for successors of dead blocks.

This means that the phi operand is set to poison even for
critical edges without an intermediate block.

There are quite a few test changes, because the pattern is fairly
common in vectorizer output, for cases where we know the vectorized
loop will be entered.
2023-08-01 11:53:47 +02:00
Nikita Popov
41895843b5 [InstCombine] Only perform one iteration
InstCombine is a worklist-driven algorithm, which works roughly
as follows:

* All instructions are initially pushed to the worklist.
  The initial order is in RPO program order.
* All newly inserted instructions get added to the worklist.
* When an instruction is folded, its users get added back to the
  worklist.
* When the use-count of an instruction decreases, it gets added
  back to the worklist.
* Plus a few other heuristics on when we should revisit
  instructions.

On top of the worklist algorithm, InstCombine layers an additional
fix-point iteration: If any fold was performed in the previous
iteration, then InstCombine will re-populate the worklist from
scratch and fold the entire function again. This continues until
a fix-point is reached.

In the vast majority of cases, InstCombine will reach a fix-point
within a single iteration. However, a second iteration is performed
to verify that this is indeed the fixpoint. We can see this in the
statistics for llvm-test-suite:

    "instcombine.NumOneIteration": 411380,
    "instcombine.NumTwoIterations": 117921,
    "instcombine.NumThreeIterations": 236,
    "instcombine.NumFourOrMoreIterations": 2,

The way to read these numbers is that in 411380 cases, InstCombine
performs no folds. In 117921 cases it performs a fold and reaches
the fix-point within one iteration (the second iteration verifies
the fixpoint). In the remaining 238 cases, more than one iteration
is needed to reach the fixpoint.

In other words, only in 0.04% of cases are additional iterations
needed to reach a fixpoint. Conversely, in 22.3% of cases InstCombine
performs a completely useless extra iteration to verify the fix point.

This patch removes the fixpoint iteration from InstCombine, and always
performs only a single iteration. This results in a major compile-time
improvement of around 4% at negligible codegen impact.

This explicitly does accept that we will not reach a fixpoint in all
cases. However, this is mitigated by two factors: First, the data
suggests that this happens very rarely in practice. Second,
InstCombine runs many times during the optimization pipeline
(8 times even without LTO), so there are many chances to recover
such cases.

In order to prevent accidental optimization regressions in the
future, this implements a verify-fixpoint option, which is enabled
by default when instcombine is specified in -passes and disabled
when InstCombinePass() is constructed from C++. This means that
test cases need to explicitly use the no-verify-fixpoint option
if they fail to reach a fixed point (for a well-understood reason
we cannot / do not want to avoid).

Differential Revision: https://reviews.llvm.org/D154579
2023-07-31 10:56:49 +02:00
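The percentages quoted in 41895843b5 can be recomputed from the four statistics in the message. The sketch below only redoes that arithmetic; the variable names are hypothetical.

```python
STATS = {
    "instcombine.NumOneIteration": 411380,
    "instcombine.NumTwoIterations": 117921,
    "instcombine.NumThreeIterations": 236,
    "instcombine.NumFourOrMoreIterations": 2,
}

TOTAL_RUNS = sum(STATS.values())

# Runs that genuinely needed more than one folding iteration.
NEEDED_MORE = (
    STATS["instcombine.NumThreeIterations"]
    + STATS["instcombine.NumFourOrMoreIterations"]
)
PCT_NEEDED_MORE = 100 * NEEDED_MORE / TOTAL_RUNS

# Runs whose second iteration only confirmed the fixpoint.
PCT_VERIFY_ONLY = 100 * STATS["instcombine.NumTwoIterations"] / TOTAL_RUNS
```

With 529539 runs in total, the 238 runs needing extra iterations come to roughly 0.04%, and the 117921 verify-only runs come to roughly 22.3%, matching the figures in the message.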
Nikita Popov
2e0af16c93 [ValueTracking] Support add+icmp assumes for KnownBits
Support the canonical range check pattern for KnownBits assumptions.
This is the same as the generic ConstantRange handling, just shifted
by an offset.
2023-07-05 16:15:47 +02:00
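The range check pattern handled by 2e0af16c93 has the shape `(x + offset) u< limit`. The sketch below enumerates the 8-bit values that satisfy such a check and derives the bits they share; the mask-pair representation and names are assumptions for illustration, not LLVM's implementation.

```python
BITS = 8
MASK = (1 << BITS) - 1

def values_satisfying(offset: int, limit: int):
    """All 8-bit x for which (x + offset) u< limit, with wrapping add."""
    return [x for x in range(1 << BITS) if ((x + offset) & MASK) < limit]

def known_bits(values):
    """Return (known_zero, known_one) masks shared by every value."""
    known_zero, known_one = MASK, MASK
    for v in values:
        known_one &= v
        known_zero &= ~v & MASK
    return known_zero, known_one

# Example: assume((x + -8) u< 8) restricts x to the range [8, 16).
IN_RANGE = values_satisfying(-8 & MASK, 8)
KNOWN_ZERO, KNOWN_ONE = known_bits(IN_RANGE)
```

In this example the top four bits are known zero and bit 3 is known one, which is the kind of information the assume can contribute.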
Nikita Popov
dfb369399d [ValueTracking] Directly use KnownBits shift functions
Make ValueTracking directly call the KnownBits shift helpers, which
provides more precise results.

Unfortunately, ValueTracking has a special case where sometimes we
determine non-zero shift amounts using isKnownNonZero(). I have my
doubts about the usefulness of that special-case (it is only tested
in a single unit test), but I've reproduced the special-case via an
extra parameter to the KnownBits methods.

Differential Revision: https://reviews.llvm.org/D151816
2023-06-01 09:46:16 +02:00
Nikita Popov
a1dec5dacb [ValueTracking] Avoid optimizing away condition in test (NFC)
This is not what we're interested in testing, and it allows more
powerful optimizations to essentially optimize away the entire
function.
2023-05-26 16:38:37 +02:00
Nikita Popov
8f12057e8e [ValueTracking] Avoid UB in test (NFC)
Don't use br undef, as it is UB.
2023-05-26 15:55:20 +02:00
Noah Goldstein
2622b2f409 [ValueTracking] Use select condition to help determine if select is non-zero
In `select c, x, y` the condition `c` dominates the resulting `x` or
`y` chosen by the `select`. This adds logic to `isKnownNonZero` to try
and use the `icmp` for the `c` condition to see if it implies the
select `x` or `y` are known non-zero.

For example in:
    ```
    %c = icmp ugt i8 %x, %C
    %r = select i1 %c, i8 %x, i8 %y
    ```
    The true arm of select `%x` is non-zero (when "returned" by the
    `select`) because `%c` being true implies `%x` is non-zero.

Alive2 Links (with `x {pred} C`):
    - EQ  iff `C != 0`:
        - https://alive2.llvm.org/ce/z/umLabn
    - NE  iff `C == 0`:
        - https://alive2.llvm.org/ce/z/DQvy8Y
    - UGT [always]:
        - https://alive2.llvm.org/ce/z/HBkjgQ
    - UGE iff `C != 0`:
        - https://alive2.llvm.org/ce/z/LDNifB
    - SGT iff `C s>= 0`:
        - https://alive2.llvm.org/ce/z/QzWDj3
    - SGE iff `C s> 0`:
        - https://alive2.llvm.org/ce/z/rR4g3D
    - SLT iff `C s<= 0`:
        - https://alive2.llvm.org/ce/z/uysayx
    - SLE iff `C s< 0`:
        - https://alive2.llvm.org/ce/z/2jYc7e

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D147900
2023-05-23 13:52:40 -05:00
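The per-predicate conditions listed in 2622b2f409 can be cross-checked by enumeration: for each constant `C`, `x pred C` implies `x != 0` exactly when the stated condition on `C` holds. A sketch at 8 bits with hypothetical names:

```python
BITS = 8

def s(v: int) -> int:
    """Reinterpret an unsigned 8-bit value as signed."""
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

# predicate name: (x pred C, condition on C from the commit message)
CASES = {
    "eq":  (lambda x, c: x == c,       lambda c: c != 0),
    "ne":  (lambda x, c: x != c,       lambda c: c == 0),
    "ugt": (lambda x, c: x > c,        lambda c: True),
    "uge": (lambda x, c: x >= c,       lambda c: c != 0),
    "sgt": (lambda x, c: s(x) > s(c),  lambda c: s(c) >= 0),
    "sge": (lambda x, c: s(x) >= s(c), lambda c: s(c) > 0),
    "slt": (lambda x, c: s(x) < s(c),  lambda c: s(c) <= 0),
    "sle": (lambda x, c: s(x) <= s(c), lambda c: s(c) < 0),
}

def condition_is_exact(name: str) -> bool:
    """Check that the listed condition matches 'pred implies non-zero'."""
    pred, cond = CASES[name]
    for c in range(1 << BITS):
        implies_nonzero = all(x != 0 for x in range(1 << BITS) if pred(x, c))
        if implies_nonzero != cond(c):
            return False
    return True
```

Each condition is equivalent to saying that zero itself fails the predicate against `C`.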
Noah Goldstein
530bbc8f69 [ValueTracking] Add tests for using condition in select for non-zero analysis; NFC
Differential Revision: https://reviews.llvm.org/D147899
2023-05-23 13:52:40 -05:00
Noah Goldstein
8a60814ed5 [ValueTracking] Use KnownBits functions for computeKnownBits of saturating add/sub functions
The knownbits implementation covers all the cases previously handled
by `uadd.sat`/`usub.sat` as well as some additional ones. We previously
were not handling the `ssub.sat`/`sadd.sat` cases at all.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D150103
2023-05-23 13:52:40 -05:00
Noah Goldstein
1e963b4081 [ValueTracking] Add tests for knownbits of saturating add/sub functions; NFC
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D150101
2023-05-23 13:52:39 -05:00
Noah Goldstein
4fd3401e76 [KnownBits] Improve implementation of KnownBits::abs
`abs` preserves the lowest set bit, so if we know the lowest set bit,
set it in the output.

As well, implement the case where the operand is known negative.

Reviewed By: foad, RKSimon

Differential Revision: https://reviews.llvm.org/D150100
2023-05-23 13:52:39 -05:00
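The observation in 4fd3401e76 that `abs` preserves the lowest set bit can be verified at a small width. A sketch over 8-bit values with wrapping abs (so abs of INT_MIN stays INT_MIN); names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1

def to_signed(v: int) -> int:
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def wrapping_abs(v: int) -> int:
    """abs on the signed view of v, truncated back to BITS bits."""
    return abs(to_signed(v)) & MASK

def lowest_set_bit(v: int) -> int:
    return v & -v

def abs_preserves_lowest_set_bit() -> bool:
    return all(
        lowest_set_bit(wrapping_abs(v)) == lowest_set_bit(v)
        for v in range(1 << BITS)
    )
```

Negation leaves the lowest set bit and everything below it unchanged, and abs is either the identity or a negation.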
Noah Goldstein
261e5d0951 [ValueTracking] Add tests for knownbits of abs; NFC
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D150099
2023-05-16 18:58:13 -05:00
Noah Goldstein
124547eae8 [ValueTracking] Use KnownBits::sdiv for sdiv opcode in computeKnownBits
We now have an implementation of `KnownBits::sdiv` so we can implement
this case.

Differential Revision: https://reviews.llvm.org/D150096
2023-05-16 18:58:12 -05:00
Noah Goldstein
99795afb28 [ValueTracking] Pass exact flag to KnownBits::udiv in computeKnownBits
This information was previously missing but we can use it for
determining the low-bits.

Differential Revision: https://reviews.llvm.org/D150095
2023-05-16 18:58:12 -05:00
Noah Goldstein
7d05ab99ed [KnownBits] Improve KnownBits::udiv
We can more precisely determine the upper bits doing `MaxNum /
MinDenum` as opposed to only using the MSB.

As well, if the `exact` flag is set, we can sometimes determine some
of the low-bits.

Differential Revision: https://reviews.llvm.org/D150094
2023-05-16 18:58:12 -05:00
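The `MaxNum / MinDenum` idea from 7d05ab99ed can be sketched as a bound on the quotient's leading zeros. This is an illustration with hypothetical helper names and a non-zero minimum denominator assumed; it does not cover the `exact` flag handling.

```python
BITS = 8

def leading_zeros(x: int) -> int:
    return BITS - x.bit_length()

def known_leading_zeros(max_num: int, min_denom: int) -> int:
    """Leading zeros guaranteed for any n // d with n <= max_num, d >= min_denom."""
    return leading_zeros(max_num // min_denom)

def bound_is_sound(max_num: int, min_denom: int) -> bool:
    """Brute-force check of the bound over all admissible operands."""
    guaranteed = known_leading_zeros(max_num, min_denom)
    return all(
        leading_zeros(n // d) >= guaranteed
        for n in range(max_num + 1)
        for d in range(min_denom, 1 << BITS)
    )
```

For example, with a numerator of at most 100 and a denominator of at least 7, the quotient is at most 14, so the top four bits of an 8-bit result are known zero.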
Noah Goldstein
53a079c8f7 [ValueTracking] Add tests for knownbits of sdiv and udiv; NFC
Differential Revision: https://reviews.llvm.org/D150092
2023-05-16 18:58:12 -05:00