1946 Commits

Author SHA1 Message Date
Florian Hahn
b14be1e7c0 [SCEV] Use object size for globals to sharpen ranges.
The highest address the object can start is ObjSize bytes before the
end (unsigned max value). If this value is not a multiple of the
alignment, the last possible start value is the next lowest multiple
of the alignment. Note: The computations cannot overflow,
because if they would there's no possible start address for the
object.
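
As a rough illustration of the bound described above, the following sketch computes the last possible start address of an object. The helper name `maxStartAddress` and the fixed 64-bit address width are assumptions made for this example and are not taken from the patch itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): the highest address at which an
// object of ObjSize bytes with the given alignment can start, assuming a
// 64-bit address space.
inline uint64_t maxStartAddress(uint64_t ObjSize, uint64_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  // The object has to end by the unsigned max value, so it can start at most
  // ObjSize bytes before it.
  uint64_t Last = UINT64_MAX - ObjSize;
  // If that address is not a multiple of the alignment, step down to the
  // next lower multiple.
  return Last - (Last % Align);
}
```

For example, a 16-byte object with 8-byte alignment can start no higher than 0xFFFFFFFFFFFFFFE8 under these assumptions.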

At the moment, this is limited to GlobalVariables, because I could not
find an API similar to getObjectSize to also get the alignment of the
object. With such an API, this can be generalized to general addresses.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D149483
2023-04-29 21:33:30 +01:00
Florian Hahn
4c2d29f2fc [SCEV] Skip instrs with non-scevable types in visitAndClearUsers.
No SCEVs are formed for instructions with non-scevable types, so no
other SCEV expressions can depend on them. Skip those instructions and
their users when invalidating SCEV expressions.

Depends on D144847.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D144848
2023-04-28 15:37:35 +01:00
Nikita Popov
3ddd1ffb72 [SCEV] Don't invalidate past dependency-breaking instructions
When invalidating a value, we walk all users of that value and
invalidate them as well. This can be very expensive for large use
graphs.

However, we only need to invalidate a user U of instruction I if
SCEV(U) can depend on SCEV(I). This is not the case if U is an
instruction that always produces a SCEVUnknown, such as a load.
If the load pointer operand is invalidated, there is no need to
invalidate the load result, which is completely unrelated from a
SCEV perspective.
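
The idea can be sketched on a toy use graph. This is only a simplified model, not the ScalarEvolution implementation: the `Inst` struct, the `AlwaysUnknown` flag and `collectInvalidated` are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Toy instruction: AlwaysUnknown marks instructions (such as loads) whose
// SCEV never depends on the SCEVs of their operands.
struct Inst {
  bool AlwaysUnknown = false;
  std::vector<std::size_t> Users; // indices of instructions using this one
};

// Walk the users of Root and collect everything that must be invalidated,
// without walking into or past dependency-breaking users.
inline std::set<std::size_t>
collectInvalidated(const std::vector<Inst> &Insts, std::size_t Root) {
  std::set<std::size_t> Visited;
  Visited.insert(Root);
  std::vector<std::size_t> Worklist(1, Root);
  while (!Worklist.empty()) {
    std::size_t I = Worklist.back();
    Worklist.pop_back();
    for (std::size_t U : Insts[I].Users) {
      if (Insts[U].AlwaysUnknown)
        continue; // SCEV(U) cannot depend on SCEV(I): stop here
      if (Visited.insert(U).second)
        Worklist.push_back(U);
    }
  }
  return Visited;
}
```

In this model, a load that uses an invalidated pointer is skipped together with everything reachable only through it, which is what keeps the walk cheap on large use graphs.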

Differential Revision: https://reviews.llvm.org/D149323
2023-04-28 14:42:08 +02:00
Nikita Popov
103fc0f629 [SCEV] Replace IsAvailableOnEntry with block disposition
As far as I understand, the IsAvailableOnEntry() function basically
implements the same functionality as the properlyDominates() block
disposition. The primary difference (apart from a weaker
implementation) seems to be in this comment at the top:

    // Checks if the SCEV S is available at BB.  S is considered available at BB
    // if S can be materialized at BB without introducing a fault.

However, I don't really understand why there would be such a
requirement. It's my understanding that SCEV explicitly does not
care about trapping udiv instructions itself, and it's the job of
SCEVExpander's isSafeToExpand() to make sure these don't get
expanded if they may trap.

Differential Revision: https://reviews.llvm.org/D149344
2023-04-28 11:02:03 +02:00
Nikita Popov
fa0014a68b [SCEV] Drop LCSSA check in createNodeFromSelectLikePHI()
SCEV expressions no longer try to preserve LCSSA form. SCEV
construction will try to look through LCSSA phi nodes. As such,
we also no longer need to limit this special-case fold.
2023-04-27 15:18:07 +02:00
Nikita Popov
079c525f20 [SCEV] Try simplifying phi before createNodeFromSelectLikePHI()
Sometimes a phi can both be trivial and match the
createNodeFromSelectLikePHI() fold. In that case it is generally
more profitable to look through the phi node.
2023-04-27 15:07:19 +02:00
Nikita Popov
c27a96607c [SCEV] Remove LCSSA special case in getSCEVAtScope() (NFCI)
We no longer try to preserve LCSSA form in SCEV representation:
Nowadays, we look through LCSSA PHI nodes directly during SCEV
construction. As such, this separate special case in
getSCEVAtScope() is no longer needed.
2023-04-27 12:53:03 +02:00
Nikita Popov
19732a3eaa [SCEV] Check correct binary operator for nowrap flags
We should be checking the current BO here, not the nested one. If
the current BO has nowrap flags (and is UB on poison), then we'll
fetch both operand SCEVs of that BO. We'll check the nested BO
on the next iteration of the do/while loop.
2023-04-27 11:25:40 +02:00
Nikita Popov
3690e1f8a7 [SCEV] Check MatchBinaryOp opcode instead of original opcode
These are not necessarily the same (e.g. or can become add) and
this is what we're switching over in the first place.
2023-04-27 11:13:35 +02:00
Nikita Popov
4fcb006fb6 [SCEV] Fix getOperandsToCreate() for and/or
We can create expressions either for constant operand or i1
and/or. The implementation was inverting the latter check.
2023-04-27 10:50:57 +02:00
Philip Reames
09d879d060 [SCEV] Common code for computing trip count in a fixed type [NFC-ish]
This is a follow on to D147117 and D147355. In both cases, we were adding special cases to compute zext(BTC+1) instead of zext(BTC)+1 when the BTC+1 computation was known not to overflow.

Differential Revision: https://reviews.llvm.org/D148661
2023-04-25 12:04:42 -07:00
Max Kazantsev
ab07cbe437 [SCEV] Support sub in and negative constants willNotOverflow
This lifts two TODOs from this function, allowing us to prove
no-overflow whether it happens through max int (up) or through
min int (down) for both add and sub.
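
A minimal 8-bit model of such a check is shown below. It is a sketch under stated assumptions: `BinOp` and `willNotOverflowSigned8` are hypothetical names, and the check simply evaluates the operation in a wider type, which is not necessarily how the patch proves the property.

```cpp
#include <cstdint>

enum class BinOp { Add, Sub };

// Hypothetical 8-bit model: perform the operation in a wider type and check
// that the result still fits in i8. This covers overflow through max int
// (up) as well as through min int (down).
inline bool willNotOverflowSigned8(BinOp Op, int8_t LHS, int8_t RHS) {
  int Wide = Op == BinOp::Add ? int(LHS) + int(RHS) : int(LHS) - int(RHS);
  return Wide >= INT8_MIN && Wide <= INT8_MAX;
}
```

With this model, `100 + 28` overflows upwards, while `-100 + (-29)` and `-100 - 29` both overflow downwards.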

Differential Revision: https://reviews.llvm.org/D148618
Reviewed By: dmakogon
2023-04-25 16:40:37 +07:00
Joshua Cao
a4e420ea64 Revert "[SCEV] Precise trip multiples"
This reverts commit 027a4c8b96.
2023-04-24 01:41:53 -07:00
Joshua Cao
027a4c8b96 [SCEV] Precise trip multiples
We currently have getMinTrailingZeros(), from which we can get a SCEV's
multiple by computing 1 << MinTrailingZeroes. However, this only gets us
multiples that are a power of 2. This patch introduces a way to get max
constant multiples that are not just a power of 2. The logic is similar
to that of getMinTrailingZeros. getMinTrailingZeros is replaced by
computing the max constant multiple, and counting the number of trailing
bits.

This is applied in two places:

1) Computing unsigned constant ranges. For example, if we have i8
   {10,+,10}<nuw>, we know the max constant it can be is 250.

2) Computing trip multiples as shown in SCEV output. This is useful if
   for example, we are unrolling a loop by a factor of 5, and we know
   the trip multiple is 5, then we don't need a loop epilog.

If the code sees that a SCEV does not have <nuw>, it will fall back to
finding the max multiple that is a power of 2. Multiples that are a
power of 2 will still be a multiple even after the SCEV overflows.
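
The arithmetic behind the i8 example and the power-of-2 fallback can be checked with a small sketch. The helper names below are hypothetical and the 8-bit width is fixed only to match the example above.

```cpp
#include <cstdint>

// Largest 8-bit unsigned value that is a multiple of Multiple (non-zero).
inline unsigned maxUnsignedMultiple8(unsigned Multiple) {
  return (255u / Multiple) * Multiple;
}

// Largest power of two dividing Multiple, i.e. 1 << (trailing zero count).
inline unsigned powerOfTwoPart(unsigned Multiple) {
  return Multiple & (~Multiple + 1u);
}

// 8-bit addition with wrap-around.
inline unsigned wrapAdd8(unsigned A, unsigned B) { return (A + B) & 0xFFu; }
```

For a multiple of 10 the largest 8-bit value is 250. Adding another 10 wraps to 4, which is no longer a multiple of 10 but is still a multiple of `powerOfTwoPart(10) == 2`.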

Differential Revision: https://reviews.llvm.org/D141823
2023-04-24 00:21:59 -07:00
Nikita Popov
4cdb91f9e7 [SCEV] Clarify inference in isAddRecNeverPoison()
The justification in isAddRecNeverPoison() no longer applies, as
it dates back to a time where LLVM had an unconditional forward
progress guarantee. However, we also no longer need it, because we
can exploit branch on poison UB instead.

For a single exit loop (without abnormal exits) we know that all
instructions dominating the exit will be executed, so if any of
them trigger UB on poison that means that addrec is not poison.

This is slightly stronger than the previous code, because a) we
don't need the exit to also be the latch and b) we don't need the
value to be used in the exit branch in particular, any UB-producing
instruction is fine.

I don't expect much practical impact from this change, this is
mainly to clarify the reasoning behind this logic.

Differential Revision: https://reviews.llvm.org/D148633
2023-04-21 15:31:00 +02:00
Dmitry Makogon
e08f9894ec [SCEV] Preserve NSW for AddRec multiplied by -1 if it cannot be signed minimum
This preserves NSW flag for AddRecs multiplied by -1 if we can prove
via constant ranges that the AddRec cannot be signed minimum.

An explanation:
Let M be signed minimum. If AddRec's range contains M, then M * (-1) will
stay M and (M + 1) * (-1) will be signed maximum, so we get a signed overflow.
In all other cases, if an AddRec didn't signed overflow,
then AddRec * (-1) wouldn't either.
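
The boundary case can be checked with an 8-bit model. This is only an illustration of the arithmetic; `negationFitsInI8` is a hypothetical helper and not part of the patch.

```cpp
#include <cstdint>

// Hypothetical 8-bit model: multiply by -1 in a wider type and check whether
// the result is still representable as a signed 8-bit value.
inline bool negationFitsInI8(int8_t X) {
  int Wide = -int(X);
  return Wide >= INT8_MIN && Wide <= INT8_MAX;
}
```

Only the signed minimum (-128 for i8) fails this check, since its negation, 128, is one past the signed maximum.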

Differential Revision: https://reviews.llvm.org/D148084
2023-04-14 19:36:56 +07:00
Joshua Cao
921b8f40e8 [SCEV][NFC] GetMinTrailingZeros switch case and naming cleanup
* combine zext and sext into the one switch case
* combine vscale and udiv into one switch case
* renames according to LLVM style
2023-04-10 22:56:29 -07:00
Joshua Cao
898a9ca5e9 [SCEV] Strengthen huge constant trip multiples.
SCEV determines that loops with trip count >=2^32 have a trip multiple
of 1 to guard against huge multiples. This patch strengthens this to
instead find the greatest power of 2 divisor that is less than the
threshold.
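
A sketch of that computation on plain 64-bit integers follows. The helper name `clampTripMultiple` is an assumption made for this example; the 2^32 threshold is the one mentioned above.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: trip counts that fit in 32 bits are their own
// multiple. For huge trip counts, fall back to the greatest power-of-2
// divisor that is still below 2^32.
inline uint32_t clampTripMultiple(uint64_t TripCount) {
  if (TripCount <= UINT32_MAX)
    return static_cast<uint32_t>(TripCount);
  // TripCount is non-zero here, so this loop terminates.
  unsigned TrailingZeros = 0;
  while ((TripCount & 1) == 0) {
    TripCount >>= 1;
    ++TrailingZeros;
  }
  return uint32_t(1) << std::min(31u, TrailingZeros);
}
```

Under these assumptions, a trip count of 2^32 + 8 yields a multiple of 8 rather than 1.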

Differential Revision: https://reviews.llvm.org/D147868
2023-04-10 20:00:46 -07:00
Joshua Cao
569f7e547d [SCEV][NFC] Convert check to assert getSmallConstantTripMultiple()
2023-04-10 19:59:01 -07:00
Joshua Cao
585742cbfc [SCEV] When computing trip count, only zext if necessary
This patch improves on https://reviews.llvm.org/D110587. To summarize
the patch, given backedge-taken count BC, trip count TC is `BC + 1`.
However, we don't know whether BC + 1 might overflow. So the patch modifies TC
computation to `1 + zext(BC)`.

This patch only adds the zext if necessary by looking at the constant
range. If we can determine that BC cannot be the max value for its
bitwidth, then we know adding 1 will not overflow, and the zext is not
needed. We apply loop guards before computing TC to get more data.
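
The overflow concern can be modelled with an 8-bit backedge-taken count. The helpers below are hypothetical and only mirror the reasoning in this message; they are not the code from the patch.

```cpp
#include <cstdint>

// Trip count computed in the narrow type: wraps to 0 when BC is 255.
inline uint8_t tripCountNarrow(uint8_t BC) {
  return static_cast<uint8_t>(BC + 1);
}

// Trip count computed as 1 + zext(BC): never wraps.
inline uint32_t tripCountWithZext(uint8_t BC) { return 1u + uint32_t(BC); }

// The zext is only needed if BC may be the max value of its bit width.
// RangeMax stands for the upper bound of BC's constant range.
inline bool canSkipZext(uint8_t RangeMax) { return RangeMax != UINT8_MAX; }
```

When the range of BC excludes 255, the narrow and the widened computations agree, which is why the zext can be dropped in that case.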

The primary motivation is to support my work on more precise trip
multiples in https://reviews.llvm.org/D141823. For example:

```
void foo(); // defined elsewhere

void test(unsigned n) {
  __builtin_assume(n % 6 == 0);
  for (unsigned i = 0; i < n; ++i)
    foo();
}
```

Prior to this patch, we had `TC = 1 + zext(-1 + 6 * ((6 umax %n) /u
6))<nuw>`. SCEV range computation is able to determine that the BC
cannot be the max value, so the zext is not needed. The result is `TC
-> (6 * ((6 umax %n) /u 6))<nuw>`. From here, we would be able to
determine that %n is a multiple of 6.

There was one change in LoopCacheAnalysis/LoopInterchange required.
Before this patch, if a loop has BC = false, it would compute `TC -> 1 +
zext(false) -> 1`, which was fine. After this patch, it computes `TC -> 1
+ false = true`. CacheAnalysis would then sign extend the `true`, which
was not the intended behavior. I modified CacheAnalysis such that
it would only zero extend trip counts.

This patch is not NFC, but also does not change any SCEV outputs. I
would like to get this patch out first to make work with trip multiples
easier.

Differential Revision: https://reviews.llvm.org/D147117
2023-04-10 19:40:52 -07:00
Max Kazantsev
5b96b13fdf [SCEV] Improve AddRecs' range computation in Expensive Range Sharpening mode
Apply loop guards to AddRec's start in range computation for
non-self-wrapping AddRecs.

According to CT measurements, this has a wide negative compile time impact,
so we hold it in expensive range sharpening mode where it's not so critical.
However, we need to find a way to share benefits of this mode with default mode.

Patch by Aleksandr Popov!

Differential Revision: https://reviews.llvm.org/D147557
Reviewed By: mkazantsev
2023-04-10 16:37:10 +07:00
Joshua Cao
24170fb8cd [SCEV][NFC] Fix Do not use 'else' after 'return'
Follow LLVM coding standards and make clangd emit fewer warnings.
2023-04-08 15:56:08 -07:00
Philip Reames
6afcc54ac7 [SCEV] Infer no-self-wrap via constant ranges
Without this, pointer IVs in loops with small constant trip counts couldn't be proven no-self-wrap. This came up in a new LSR transform, but may benefit other SCEV consumers as well.

Differential Revision: https://reviews.llvm.org/D146596
2023-03-22 12:06:28 -07:00
Alon Kom
8e5aa969d0 [SCEV] Preserve divisibility and min/max information in applyLoopGuards
applyLoopGuards doesn't always preserve information when there are multiple assumes.

This patch tries to deal with multiple assumes regarding a SCEV's divisibility and min/max values, and rewrite it into a SCEV that still preserves all of the information.
For example, let the trip count of the loop be TC. Consider the 3 following assumes:

1. __builtin_assume(TC % 8 == 0);
2. __builtin_assume(TC > 0);
3. __builtin_assume(TC < 100);

Before this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 99), 1)

Looking at this SCEV, it doesn't preserve the divisibility by 8 information.

After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)

By aligning up 1 to 8, and aligning down 99 to 96, the new SCEV still preserves all of the original assumes.
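
The bound adjustment can be sketched with plain integers. `alignUp`, `alignDown` and `guardedTripCount` are hypothetical helpers written for this example; the constants 1, 99 and 8 come from the three assumes above.

```cpp
#include <algorithm>
#include <cstdint>

// Round V down / up to a multiple of D (D must be non-zero).
inline uint64_t alignDown(uint64_t V, uint64_t D) { return (V / D) * D; }
inline uint64_t alignUp(uint64_t V, uint64_t D) {
  return ((V + D - 1) / D) * D;
}

// Models max(min(8 * (TC / 8), 96), 8) from the example above.
inline uint64_t guardedTripCount(uint64_t TC) {
  const uint64_t D = 8;
  uint64_t Lo = alignUp(1, D);    // TC > 0 gives TC >= 1, aligned up to 8
  uint64_t Hi = alignDown(99, D); // TC < 100 gives TC <= 99, aligned down to 96
  return std::max(std::min(alignDown(TC, D), Hi), Lo);
}
```

Because both clamping bounds are themselves multiples of 8, every value this expression can produce remains divisible by 8.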

Differential Revision: https://reviews.llvm.org/D144947
2023-03-20 12:04:05 +02:00
Nikita Popov
a5242483e4 [SCEV] Recognize vscale intrinsics
Now that SCEV has a dedicated vscale node type, we should also map
vscale intrinsics to it. To make sure this does not regress ranges
(which were KnownBits based previously), add support for vscale to
getRangeRef() as well.

Differential Revision: https://reviews.llvm.org/D146226
2023-03-17 10:07:39 +01:00
Bjorn Pettersson
951a980dc7 [Analysis] Make order of analysis executions more stable
When debugging and using debug-pass-manager (e.g. in regression tests)
we prefer a consistent order in which analysis passes are executed.
But when for example doing

  return MyClass(AM.getResult<LoopAnalysis>(F),
                 AM.getResult<DominatorTreeAnalysis>(F));

then the order in which LoopAnalysis and DominatorTreeAnalysis are
executed isn't guaranteed, and might for example depend on which
compiler is used when building LLVM.

I've not scanned the full source tree, but this fixes some occurrences
of the above pattern found in lib/Analysis.
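
A self-contained sketch of the general remedy is to give each call its own statement, so the order is fixed by the language rather than by the compiler. The functions and types below are stand-ins invented for this example and do not use the real pass manager API.

```cpp
#include <string>
#include <vector>

// Records the order in which the stand-in analyses run.
std::vector<std::string> ExecutionLog;

int runLoopAnalysis() {
  ExecutionLog.push_back("LoopAnalysis");
  return 1;
}

int runDominatorTreeAnalysis() {
  ExecutionLog.push_back("DominatorTreeAnalysis");
  return 2;
}

struct MyClass {
  int LI;
  int DT;
  MyClass(int LI, int DT) : LI(LI), DT(DT) {}
};

// Each call is a separate statement, so LoopAnalysis always runs first.
// Writing MyClass(runLoopAnalysis(), runDominatorTreeAnalysis()) instead
// would leave the evaluation order of the two arguments unspecified.
MyClass buildStable() {
  int LI = runLoopAnalysis();
  int DT = runDominatorTreeAnalysis();
  return MyClass(LI, DT);
}
```

After calling `buildStable()`, the log lists LoopAnalysis before DominatorTreeAnalysis regardless of the compiler used.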

This problem was discussed briefly in review for D146206.
2023-03-17 09:33:16 +01:00
Florian Hahn
484c622760 [SCEV] Do not strengthen nuw/nsw flags during get[Zero,Sign]ExtendedExpr.
Modifying AddRecs when constructing other expressions can lead to
surprising changes. It also seems like it is not really beneficial in
most cases.

At the moment, there's a single regression, but we still might be able
to improve the flags at AddRec construction.

Might help with the issue discussed in D143409.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D144051
2023-03-15 20:57:34 +00:00
Nikita Popov
2f3dc5fa8b [SCEV] Rename ControlsExit -> ControlsOnlyExit (NFC)
As suggested in https://reviews.llvm.org/D145510#4192162.
2023-03-14 11:04:54 +01:00
Nikita Popov
660403940c [SCEV] Fix finite loop non-strict predicate simplification (PR60944)
There are a number of issues with the current code for converting
ule -> ult (etc) predicates for comparisons controlling finite loops:

* It sets nowrap flags, which may only hold for that particular
  comparison, not globally. (PR60944)
* It doesn't check that the RHS is invariant. (I'm not sure this
  can cause practical issues independently of the previous point.)
* It runs before simplifications that may be more profitable. (PR54191)

This patch moves the handling for this into computeExitLimitFromICmp(),
because it is somewhat tightly coupled with assumptions in that code,
and addresses the aforementioned issues.

Fixes https://github.com/llvm/llvm-project/issues/60944.
Fixes https://github.com/llvm/llvm-project/issues/54191.

Differential Revision: https://reviews.llvm.org/D145510
2023-03-14 10:55:02 +01:00
Kazu Hirata
11efd1cb04 [Analysis] Use *{Set,Map}::contains (NFC)
2023-03-14 00:32:40 -07:00
Dmitry Makogon
b60758374b [SCEV] Apply loop guards against min/max for its arguments
This replaces several rewriting rules in ScalarEvolution::applyLoopGuards
that are applied to min/max expressions with the equivalent ones but
applied to its arguments.
So previously given we had a loop guard min(a, b) >= c,
the min expression would get rewritten as max(c, min(a, b)).
With such approach, we were unable to apply the rewrite if min operands
were zext for example (min(zext(a), zext(b))), however it's equivalent
to the expression zext(min(a, b)) for which we could apply the rewrite.

Now we'd rewrite the min operands also with these expressions:
a -> max(c, a) and
b -> max(c, b).
and this would allow us to apply the loop guard in this and similar cases:
min(zext(a), zext(b)) would get rewritten as min(zext(max(c, a)), zext(max(c, b)))
instead of just being skipped.

The list of added rules (omitting predicates signedness for simplicity):
1. Guard:     min(a, b) >= c
   Old rule:  min(a, b) -> max(c, min(a, b))
   New rules: a -> max(a, c) and b -> max(b, c)
2. Guard:     min(a, b) > c
   Old rule:  min(a, b) -> max(c + 1, min(a, b))
   New rules: a -> max(a, c + 1) and b -> max(b, c + 1)
3. Guard:     max(a, b) <= c
   Old rule:  max(a, b) -> min(c, max(a, b))
   New rules: a -> min(a, c) and b -> min(b, c)
4. Guard:     max(a, b) < c
   Old rule:  max(a, b) -> min(c - 1, max(a, b))
   New rules: a -> min(a, c - 1) and b -> min(b, c - 1)
The old rewrites still hold.
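
The four new rules can be sanity-checked by brute force over small non-negative integers. This is a sketch only: `loopGuardRulesHold` is a hypothetical helper, and plain `int` values stand in for SCEV expressions.

```cpp
#include <algorithm>

// For every (A, B, C) below Limit, check that whenever a guard holds,
// rewriting the operands as in the new rules leaves the min/max unchanged.
inline bool loopGuardRulesHold(int Limit) {
  for (int A = 0; A < Limit; ++A)
    for (int B = 0; B < Limit; ++B)
      for (int C = 0; C < Limit; ++C) {
        int Min = std::min(A, B), Max = std::max(A, B);
        // 1. Guard min(a, b) >= c: a -> max(a, c), b -> max(b, c)
        if (Min >= C && std::min(std::max(A, C), std::max(B, C)) != Min)
          return false;
        // 2. Guard min(a, b) > c: a -> max(a, c + 1), b -> max(b, c + 1)
        if (Min > C &&
            std::min(std::max(A, C + 1), std::max(B, C + 1)) != Min)
          return false;
        // 3. Guard max(a, b) <= c: a -> min(a, c), b -> min(b, c)
        if (Max <= C && std::max(std::min(A, C), std::min(B, C)) != Max)
          return false;
        // 4. Guard max(a, b) < c: a -> min(a, c - 1), b -> min(b, c - 1)
        if (Max < C &&
            std::max(std::min(A, C - 1), std::min(B, C - 1)) != Max)
          return false;
      }
  return true;
}
```

Calling `loopGuardRulesHold(16)` exercises all combinations of values in the range 0 to 15.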

Differential Revision: https://reviews.llvm.org/D145230
2023-03-14 00:06:08 +07:00
Dmitry Makogon
bcda7db5e5 [SCEV] Rename variables in applyLoopGuards (NFC)
2023-03-14 00:06:07 +07:00
Florian Hahn
7019624ee1 [SCEV] Strengthen nowrap flags via ranges for ARs on construction.
At the moment, proveNoWrapViaConstantRanges is only used when creating
SCEV[Zero,Sign]ExtendExprs. We can get significant improvements by
strengthening flags after creating the AddRec.

I'll also share a follow-up patch that removes the code to strengthen
flags when creating SCEV[Zero,Sign]ExtendExprs. Modifying AddRecs while
creating those can lead to surprising changes.

Compile-time looks neutral:
https://llvm-compile-time-tracker.com/compare.php?from=94676cf8a13c511a9acfc24ed53c98964a87bde3&to=aced434e8b103109104882776824c4136c90030d&stat=instructions:u

Reviewed By: mkazantsev, nikic

Differential Revision: https://reviews.llvm.org/D144050
2023-03-07 17:10:34 +01:00
Nikita Popov
ffe8f47d72 [IR] Add operator<< overload for CmpInst::Predicate (NFC)
I regularly try and fail to use this while debugging.
2023-03-07 15:10:56 +01:00
Dmitry Makogon
30496bf645 [SCEV] Use fallthoughs in predicate switch when collecting rewrites for loop guard (NFC)
2023-03-07 15:59:50 +07:00
Nikita Popov
3228a501c4 [SCEV] Fix control flow warning (NFC)
2023-03-03 15:03:51 +01:00
Nikita Popov
e00c73c856 [SCEV] Extract a helper to create a SCEV with new operands (NFC)
2023-03-03 14:50:25 +01:00
Nikita Popov
2df4a3b4ac [SCEV] Remove an unnecessary switch (NFC)
Just the scevUnconditionallyPropagatesPoisonFromOperands() check
is sufficient. Also rename the flag to be more in line with the
more general predicate.
2023-03-03 14:37:40 +01:00
Dmitry Makogon
94b35eef4e [ScalarEvolution] Factor out RewriteMap utilities in applyLoopGuards (NFC)
This factors out two utilities used with RewriteMap in applyLoopGuards:
 - AddRewrite, which puts a rewrite rule in the map and if needed registers
   the rewrite in the list of rewritten expressions,
 - GetMaybeRewritten, which checks whether an expression has already been
   rewritten, and if so, returns the rewrite. Otherwise, returns the given
   expression.

This may be needed when adding new rewrite rules as not to copy-paste this
code.
2023-03-03 19:22:28 +07:00
Paul Walker
62d11b2cca Revert "Revert "[SCEV] Add SCEVType to represent vscale.""
Relanding after fixing Polly related build error.

This reverts commit 7b26dcae9e.
2023-03-02 13:14:07 +00:00
Paul Walker
7b26dcae9e Revert "[SCEV] Add SCEVType to represent vscale."
This reverts commit 7912f5cc92.
2023-03-02 11:59:50 +00:00
Paul Walker
7912f5cc92 [SCEV] Add SCEVType to represent vscale.
This is part of an effort to remove ConstantExpr based
representations of `vscale` so that its LangRef definition can
be relaxed to accommodate a less strict definition of constant.

Differential Revision: https://reviews.llvm.org/D144891
2023-03-02 11:11:36 +00:00
Florian Hahn
2f3c748c45 [SCEV] Hoist common cleanup code to function. (NFC)
This allows for easier updating of common code in follow-on patches.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D144847
2023-02-27 20:38:39 +01:00
Nikita Popov
0805d9d5aa [SCEV] Make scalable size representation more explicit
Represent scalable type sizes using C * vscale, where vscale is
the vscale constant expression. This exposes a bit more information
to SCEV, because the vscale multiplier is explicitly modeled in SCEV
(rather than part of the sizeof expression).

This is mainly intended as an alternative to D143642.

Differential Revision: https://reviews.llvm.org/D144624
2023-02-27 10:57:53 +01:00
Simon Pilgrim
aa6dc8ec47 [ScalarEvolution] Fix unused variable warnings. NFC.
Replace dyn_cast<> with isa<> as we don't actually need the variable
2023-02-25 21:27:57 +00:00
Leonard Chan
608ee703e5 [SCEV] Ensure SCEV does not replace aliases with their aliasees
Passes in general shouldn't replace an alias with the aliasee (see
https://reviews.llvm.org/D66606). This can lead to situations where a
linkonce_odr symbol (which could be interposable if lowered to weak
linkage) can be replaced with a local aliasee which won't be
interposable. SCEV does this when the function is invoked by
FunctionPass Manager -> Loop Pass Manager -> Induction Variable Users in
the codegen pipeline. This was found in hwasan instrumented code where a
linkonce_odr alias was replaced with its private aliasee.

This fixes the bug described at
https://github.com/llvm/llvm-project/issues/60668.

Differential Revision: https://reviews.llvm.org/D144035
2023-02-23 19:59:25 +00:00
komalon1
02e08d06aa Revert "[SCEV] Preserve divisibility and min/max information in applyLoopGuards"
This reverts commit 219ba2fb7b.
2023-02-23 14:44:03 +02:00
Alon Kom
219ba2fb7b [SCEV] Preserve divisibility and min/max information in applyLoopGuards
applyLoopGuards doesn't always preserve information when there are multiple assumes.

This patch tries to deal with multiple assumes regarding a SCEV's divisibility and min/max values, and rewrite it into a SCEV that still preserves all of the information.
For example, let the trip count of the loop be TC. Consider the 3 following assumes:

1. __builtin_assume(TC % 8 == 0);
2. __builtin_assume(TC > 0);
3. __builtin_assume(TC < 100);

Before this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 99), 1)

Looking at this SCEV, it doesn't preserve the divisibility by 8 information.

After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)

By aligning up 1 to 8, and aligning down 99 to 96, the new SCEV still preserves all of the original assumes.

Differential Revision: https://reviews.llvm.org/D141850
2023-02-23 11:11:20 +02:00
Nikita Popov
7753ae8da2 [SCEV] Remove unused alignof/offsetof print special cases (NFC)
These shouldn't really reach SCEV without being folded away first,
and we don't have any tests that hit these cases.

The sizeof case does occur with scalable types.
2023-02-22 17:00:11 +01:00
Max Kazantsev
0cbb8ec030 Revert "[AssumptionCache] caches @llvm.experimental.guard's"
This reverts commit f9599bbc7a.

For some reason it caused a huge compile time regression in our downstream
workloads. Not sure whether the source of it is in upstream code or not.
Temporarily reverting until investigated.

Differential Revision: https://reviews.llvm.org/D142330
2023-02-20 18:38:07 +07:00