Recent improvements in symbolic exit count computation revealed some problems with
SCEV's ability to find invariant predicate during first iterations. Ultimately it is based on its
ability to prove some facts for value on the last iteration. This last value, when it includes
`umin` as part of exit count, isn't always simplified enough. The motivating example is following:
https://github.com/llvm/llvm-project/issues/59615
Could not prove:
```
Pred = 36, LHS = (-1 + (-1 * (2147483645 umin (-1 + %var)<nsw>))<nsw> + %var), RHS = %var
FoundPred = 36, FoundLHS = {1,+,1}<nuw><nsw><%bb3>, FoundRHS = %var
```
Can prove:
```
Pred = 36, LHS = (-1 + (-1 * (-1 + %var)<nsw>)<nsw> + %var), RHS = %var
FoundPred = 36, FoundLHS = {1,+,1}<nuw><nsw><%bb3>, FoundRHS = %var
```
Here ` (2147483645 umin (-1 + %var)<nsw>)` is exit count composed of two parts from
two different exits: `2147483645 ` and `(-1 + %var)<nsw>`. When it was only one (latter)
analyzeable exit, for it everything was easily provable. Unfortunately, in general case `umin`
in one of `add`'s operands doesn't guarantee that the whole sum reduces, especially in presence
of negative steps and lack of `nuw`. I don't think there is a generic legal way to somehow play
around this `umin`.
So the ad-hoc solution is following: if we failed to find an equivalent predicate that is invariant
during first `MaxIter` iterations, and `MaxIter = umin(a, b, c...)`, try to find solution for at least one
of `a`, `b`, `c`... Because they all are `uge` than `MaxIter`, whatever is true during `a (b, c)` iterations
is also true during `MaxIter` iterations.
Differential Revision: https://reviews.llvm.org/D140456
Reviewed By: nikic
Use FoldID to cache SignExtendExprs that get folded to a different
SCEV.
Depends on D137505.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D137849
When creating SCEV expressions for ZExt, there's quite a bit of
reasoning done and in many places the reasoning in turn will try to
create new SCEVs for other ZExts.
This can have a huge compile-time impact. The attached test from #58402
takes an excessive amount of compile time; without the patch, the test
doesn't complete in 1500+ seconds, but with the patch it completes in 1
second.
To speed up this case, cache created ZExt expressions for given (SCEV, Ty) pairs.
Caching just ZExts is relatively straight-forward, but it might make
sense to extend it to other expressions in the future.
This has a slight positive impact on CTMark:
* O3: -0.03%
* ReleaseThinLTO: -0.03%
* ReleaseLTO-g: 0.00%
https://llvm-compile-time-tracker.com/compare.php?from=bf9de7464946c65f488fe86ea61bfdecb8c654c1&to=5ac0108553992fb3d58bc27b1518e8cf06658a32&stat=instructions:u
The patch also improves compile-time for some internal real-world workloads
where time spent in SCEV goes from ~300 seconds to ~3 seconds.
There are a few cases where computing & caching the result earlier may
return more pessimistic results, but the compile-time savings seem to
outweigh that.
Fixes#58402.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D137505
Old logic: when loop symbolic max exit count matched *exact* block exit count,
assume that all subsequent blocks will do 1 iteration less.
New logic: when loop symbolic max exit count matched *symbolic max* block exit count,
assume that all subsequent blocks will do 1 iteration less.
The new logic is still legal and is more permissive in situations when exact
block exit count is not known.
Differential Revision: https://reviews.llvm.org/D139692
Reviewed By: nikic
LLVM *loves* to convert `add` of operands with no common bits
into an `or`. But SCEV really doesn't deal with `or` that well,
so try extra hard to recognize this `or` as an `add`.
I believe, previously this wasn't being done because of the recursive
of this, but now that the `createSCEV()` is not recursive,
this should be fine. Unless this is *too* costly compile-time wise...
https://alive2.llvm.org/ce/z/EfapCo
These limitations are too strict, and their only purpose is to avoid code
size explosion. These restrictions seem obsolete, and the size problem
is solved in other places through cheap expansion limits.
The motivation is that the old code cannot deal with comparisons against
induction variant's increment.
Differential Revision: https://reviews.llvm.org/D138412
Reviewed By: lebedev.ri, reames
At the moment, getRangeRef may overflow the stack for very deeply nested
expressions.
This patch introduces a new getRangeRefIter function, which first builds
a worklist of N-ary expressions and phi nodes, followed by their
operands iteratively.
getRangeRef has been extended to also take a Depth argument and it
switches to use getRangeRefIter once the depth reaches a certain
threshold.
This ensures compile-time is not impacted in general. Note that
the iterative algorithm may lead to a slightly different evaluation
order, which could result in slightly worse ranges for cyclic phis.
https://llvm-compile-time-tracker.com/compare.php?from=23c3eb7cdf3478c9db86f6cb5115821a8f0f5f40&to=e0e09fa338e77e53242bfc846e1484350ad79773&stat=instructionsFixes#49579.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D130728
Additional SCEV verification highlighted a case where the cached loop
dispositions where incorrect after simplifying a phi node in IndVars.
Fix it by invalidating the phi before replacing it.
Fixes#58750
Additional SCEV verification highlighted a case where the cached loop
dispositions where incorrect after simplifying a condition in IndVars
and moving the user in LoopDeletion. Fix it by invalidating ICmp and all
its users.
Fixes#58515.
Moving an instruction can invalidate the cached block dispositions of
the corresponding SCEV. Invalidate the cached dispositions.
Also fixes a copy-paste error in forgetBlockAndLoopDispositions where
the start expression S was removed from BlockDispositions in the loop
but not the current values. This was also exposed by the new test case.
Fixes#58439.
Move LCSSA fixup from ::expandCodeForImpl to ::expand(). This has
the advantage that we directly preserve LCSSA nodes here instead of
relying on doing so in rememberInstruction. It also ensures that we
don't add the non-LCSSA-safe value to InsertedExpressions.
Alternative to D132704.
Fixes#57000.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D134739
Instruction being hoisted could have nuw/nsw flags inferred from the old
context, and we cannot simply move it to the new location keeping them
because we are going to introduce new uses to them that didn't exist before.
Example in https://github.com/llvm/llvm-project/issues/57187 shows how
this can produce branch by poison from initially well-defined program.
This patch forcefully recomputes poison-generating flag in the new context.
Differential Revision: https://reviews.llvm.org/D132022
Reviewed By: fhahn, nikic
This fixes https://github.com/llvm/llvm-project/issues/57336. It was exposed by a recent SCEV change, but appears to have been a long standing issue.
Note that the whole insert into the loop instead of a split exit edge is slightly contrived to begin with; it's there solely because IndVarSimplify preserves the CFG.
Differential Revision: https://reviews.llvm.org/D132571
Initial implementation had too weak requirements to positive/negative
range crossings. Not crossing zero with nuw is not enough for two reasons:
- If ArLHS has negative step, it may turn from positive to negative
without crossing 0 boundary from left to right (and crossing right to
left doesn't count for unsigned);
- If ArLHS crosses SINT_MAX boundary, it still turns from positive to
negative;
In fact we require that ArLHS always stays non-negative or negative,
which an be enforced by the following set of preconditions:
- both nuw and nsw;
- positive step (looks liftable);
Because of positive step, boundary crossing is only possible from left
part to the right part. And because of no-wrap flags, it is guaranteed
to never happen.
This reverts commit 354fa0b480.
Returning as is. The patch was reverted due to a miscompile, but
this patch is not causing it. This patch made it possible to infer
some nuw flags in code guarded by `false` condition, and then someone
else to managed to propagate the flag from dead code outside.
Returning the patch to be able to reproduce the issue.
This reverts commit 34ae308c73.
Our internal testing found a miscompile. Not sure if it's caused by
this patch or it revealed something else. Reverting while investigating.