foldAllocaCmp() needs to fold all comparisons of an alloca at the
same time, to ensure that there is a consistent view of the alloca
address. Currently, it folds "all" comparisons by limiting to the
case where there is only one. This patch switches the algorithm to
instead actually collect and fold all comparisons.
Something we need to be careful about here is that there may be
comparisons where both sides of the icmp are based on the alloca.
Such comparisons are comparing offsets of the alloca, and as such
can be ignored here, but shouldn't be folded to false.
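As a concrete illustration of that caveat (a hypothetical C++ analogue,
not the actual test case): a comparison whose operands are both derived
from the alloca compares offsets within the object and can legitimately
be true, so folding it to false would miscompile.

  #include <cassert>

  int main() {
    char Buf[4]; // stack object, analogous to the alloca
    // Both operands are based on Buf: this compares offsets within the
    // object, is true, and must not be folded to false.
    assert(&Buf[1] == &Buf[0] + 1);
    return 0;
  }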
Differential Revision: https://reviews.llvm.org/D144492
There is no getNullValue in ConstantFP. Due to inheritance, we're calling
Constant::getNullValue, which handles any type, including FP.
Since we already know we want an FP constant, we can use ConstantFP::getZero,
which might be faster and is a more readable name for an FP zero.
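A minimal sketch of the substitution (the surrounding helper is
hypothetical; Ty is already known to be a floating-point type at the
call site):

  #include "llvm/IR/Constants.h"
  using namespace llvm;

  static Constant *makeFPZero(Type *Ty) {
    // Before: dispatches on the type kind inside Constant.
    //   return Constant::getNullValue(Ty);
    // After: FP-specific, skips the generic dispatch and reads better.
    return ConstantFP::getZero(Ty);
  }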
Many uses of getIntPtrType() were using that type to calculate the
needed type for GEP offset arguments. However, some time ago,
DataLayout was extended to support pointers where the size of the
pointer is not equal to the size of the values used to index it.
Much code was already migrated to, for example, use getIndexSizeInBits
instead of getPointerSizeInBits, but some rewrites still used
getIntPtrType() to get the type for GEP offsets.
This commit changes uses of getIntPtrType() to getIndexType() where
they are involved in a GEP-related calculation.
In at least one case (bounds check insertion) this resolves a compiler
crash that the new test added here would previously trigger.
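A sketch of the migration pattern (the helper is illustrative, not a
specific call site from this commit; both calls are real DataLayout
APIs):

  #include "llvm/IR/DataLayout.h"
  using namespace llvm;

  static Type *gepOffsetType(const DataLayout &DL, Type *PtrTy) {
    // Before: sized like the pointer itself.
    //   return DL.getIntPtrType(PtrTy);
    // After: sized like the values used to index the pointer, which
    // can differ from the pointer size on some targets.
    return DL.getIndexType(PtrTy);
  }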
This commit does not impact
- C library-related rewrites (e.g. memcpy()), which operate under
the assumption that intptr_t == size_t. While all the mechanisms for
breaking this assumption now exist, doing so is outside the scope of
this commit.
- Code generation and below. Note that the use of getIntPtrType() in
CodeGenPrepare will be changed in a future commit.
- Usage of getIntPtrType() in any backend
Depends on D143435
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D143437
We can fold equality comparisons of non-inbounds geps to offset
comparison (https://alive2.llvm.org/ce/z/x2Zp8b). The inbounds
requirement is only necessary for relational comparisons.
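A standalone check of the underlying identity (plain C++, not LLVM
code): adding a common base is injective modulo 2^n, so equality of the
sums matches equality of the offsets even when the additions wrap,
which is why inbounds is not needed here.

  #include <cassert>
  #include <cstdint>

  int main() {
    uint64_t Base = UINT64_MAX - 1; // the additions below wrap
    for (uint64_t A = 0; A < 16; ++A)
      for (uint64_t B = 0; B < 16; ++B)
        assert(((Base + A) == (Base + B)) == (A == B));
    return 0;
  }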
We currently already canonicalize icmp eq (%x & Pow2), Pow2 to
icmp ne (%x & Pow2), 0. This patch generalizes the fold based on
known bits.
In particular, this allows us to handle comparisons against
!range !{i64 0, i64 2} loads, which addresses an optimization
regression in Rust caused by 8df376db72.
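For the special case called out above, a standalone check of the
underlying fact (plain C++, not LLVM code): when a value is known to be
0 or 1, comparing it for equality with 1 is the same predicate as
comparing it for inequality with 0.

  #include <cassert>
  #include <cstdint>

  int main() {
    for (uint64_t X = 0; X <= 1; ++X) // the !range !{i64 0, i64 2} values
      assert((X == 1) == (X != 0));
    return 0;
  }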
Differential Revision: https://reviews.llvm.org/D146149
Take the dominating condition into account; the urem fold benefits
from the resulting analysis improvements.
Fixes https://github.com/llvm/llvm-project/issues/60546
NOTE: the calls in simplifyBinaryIntrinsic and foldICmpWithDominatingICmp
are deleted to reduce compile time.
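One arithmetic fact this kind of dominating-condition reasoning can
exploit (a standalone plain-C++ check, not taken from the patch): on a
path where a branch has established X u< N, X urem N is simply X.

  #include <cassert>
  #include <cstdint>

  int main() {
    const uint64_t N = 8;
    for (uint64_t X = 0; X < N; ++X) // simulates a dominating X u< N
      assert(X % N == X);
    return 0;
  }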
Reviewed By: nikic, arsenm, erikdesjardins
Differential Revision: https://reviews.llvm.org/D144248
This addresses the compile-time regression reported on D144369.
If we don't fold constant operands early, then we might end up
walking very large use lists of constants here. Explicitly exclude
constants, and also limit the number of inspected users to avoid
degenerate cases like this.
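A sketch of the guard described above (names are hypothetical, not the
actual InstCombine code):

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/Value.h"
  using namespace llvm;

  static bool hasAtMostNUsers(const Value *V, unsigned MaxUsers) {
    // Constants have module-wide, potentially huge shared use lists;
    // never walk them here.
    if (isa<Constant>(V))
      return false;
    unsigned NumInspected = 0;
    for (const User *U : V->users()) {
      (void)U;
      if (++NumInspected > MaxUsers)
        return false; // bail out on degenerate cases
    }
    return true;
  }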
This entire transform shouldn't be part of InstCombine in the
first place though.
foldAllocaCmp() checks whether the alloca is not captured (ignoring
the icmp). Replace the manual implementation of escape analysis
with CaptureTracking.
The primary practical difference is that CaptureTracking handles
nocapture arguments, while foldAllocaCmp() was using a hardcoded
list.
This is basically just the CaptureTracking refactoring from D120371
without the other changes.
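A simplified sketch of the approach (hedged; the real tracker carries
more logic):

  #include "llvm/Analysis/CaptureTracking.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  namespace {
  struct CmpCaptureTracker : public CaptureTracker {
    bool Captured = false;
    void tooManyUses() override { Captured = true; }
    bool captured(const Use *U) override {
      // The comparisons are exactly the uses being folded away, so
      // they do not count as captures here.
      if (isa<ICmpInst>(U->getUser()))
        return false;
      Captured = true;
      return true; // stop: the alloca escapes through this use
    }
  };
  } // namespace

Such a tracker would be driven by the existing
PointerMayBeCaptured(V, &Tracker) entry point.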
If we have both an nsw and nuw flag, we would see the nsw flag
first and only handle signed comparisons.
This patch ignores the nsw flag if the comparison isn't signed.
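A minimal sketch of the corrected dispatch (hypothetical helper, not
the patch verbatim):

  #include "llvm/IR/Instructions.h"
  #include "llvm/IR/Operator.h"
  using namespace llvm;

  static bool hasUsableWrapFlag(const OverflowingBinaryOperator *Op,
                                const ICmpInst &Cmp) {
    // Pick the flag matching the predicate's signedness instead of
    // preferring nsw whenever both flags are present.
    return Cmp.isSigned() ? Op->hasNoSignedWrap()
                          : Op->hasNoUnsignedWrap();
  }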
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D143766
The results of intrinsics like ctpop, cttz, and ctlz are limited to the
range 0 to bitwidth. So if the truncate's destination type can hold the
source bit width, we can ignore the truncate and use the truncate's
source for the combine.
Alive2 proofs:
https://alive2.llvm.org/ce/z/9D_-qP
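A standalone check of the range fact (plain C++20, not LLVM code):
ctpop of an i64 is at most 64, so a truncation whose destination can
hold 64 loses nothing.

  #include <bit>
  #include <cassert>
  #include <cstdint>

  int main() {
    uint64_t X = ~0ULL;
    int Pop = std::popcount(X); // 64 here, the maximum possible
    assert(Pop <= 64 && static_cast<uint8_t>(Pop) == Pop);
    return 0;
  }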
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D143368
This is the most basic patch to handle fixing issue #57666.
D133919 proposes to handle much more than this in a single patch,
but I've used 10 regression tests just to make sure this part is
doing what I expected and nothing more, and it already shows even
more potential TODO items.
The more general proofs from D133919 are correct, but I want to
enable this in smaller steps to reduce risk:
https://alive2.llvm.org/ce/z/RrVEyX
Differential Revision: https://reviews.llvm.org/D142847
The first attempt at landing this caused a build failure:
https://lab.llvm.org/buildbot/#/builders/183/builds/10447
but after investigation it appears to be unrelated. The same
test/build passed later with the original commit here:
https://lab.llvm.org/buildbot/#/builders/183/builds/10448
1. Add checks for whether X and/or Y is odd. Odd values are unnecessary
to the icmp: isZero(Odd * N) == isZero(N). (See the standalone check
after this list.)
2. If neither X nor Y is known odd, then if X * Y cannot overflow AND
X and/or Y is known non-zero, the non-zero values are unnecessary to
the icmp.
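A standalone check of identity 1 (plain C++, not LLVM code): odd values
are invertible modulo 2^n, so multiplying by one cannot turn a non-zero
value into zero, even with wraparound.

  #include <cassert>
  #include <cstdint>

  int main() {
    const uint8_t Odd = 0xAB; // odd, hence invertible mod 2^8
    for (unsigned N = 0; N < 256; ++N) {
      uint8_t Prod = static_cast<uint8_t>(Odd * N);
      uint8_t Val = static_cast<uint8_t>(N);
      assert((Prod == 0) == (Val == 0)); // isZero(Odd * N) == isZero(N)
    }
    return 0;
  }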
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D140850
Test if 2 values have different or same signbits:
(X u>> BitWidth - 1) == zext (Y s> -1) --> (X ^ Y) < 0
(X u>> BitWidth - 1) != zext (Y s> -1) --> (X ^ Y) > -1
https://alive2.llvm.org/ce/z/qMwMhj
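A standalone check of both folds (plain C++, 32-bit case, not LLVM
code): the equality holds exactly when the sign bits differ, i.e. when
(X ^ Y) is negative.

  #include <cassert>
  #include <cstdint>

  int main() {
    int32_t Vals[] = {INT32_MIN, -7, 0, 42, INT32_MAX};
    for (int32_t X : Vals)
      for (int32_t Y : Vals) {
        uint32_t Lhs = static_cast<uint32_t>(X) >> 31; // X u>> BitWidth-1
        uint32_t Rhs = (Y > -1) ? 1u : 0u;             // zext (Y s> -1)
        assert((Lhs == Rhs) == ((X ^ Y) < 0));
        assert((Lhs != Rhs) == ((X ^ Y) > -1));
      }
    return 0;
  }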
As noted in #60242, these patterns regressed between the
14.0 and 15.0 releases - probably due to a change in
canonicalization of related patterns.
The related patterns for testing if 2 values are both
pos/neg appear to be handled already.
Complexity canonicalization guarantees that a binop and cast
are op0/op1 respectively. Adjusted generic test names to
show that this pattern is still useful.
This code handles (icmp eq/ne (1 << Y), C) if C is a power of 2.
This case is also handled by the more general foldICmpShlConstConst
which is called before we reach foldICmpShlOne.
The code tried to do this for (icmp sle (1 << Y), 0), but that is
canonicalized to sgt before we get there.
Simplify the code by removing the unreachable SGE and SLE handling.
Also remove the (1 << Y) >=u 2147483648 and (1 << Y) <u 2147483648
handling since those are canonicalized to (1 << Y) <s 0 and
(1 << Y) >=s 0 before we get there.
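A standalone check of the predicate equivalences relied on here (plain
C++, 32-bit case, not LLVM code):

  #include <cassert>
  #include <cstdint>

  int main() {
    for (uint32_t Y = 0; Y < 32; ++Y) {
      uint32_t Shl = 1u << Y;
      // u>= 2147483648 is the same predicate as s< 0, and
      // u< 2147483648 the same as s>= 0.
      assert((Shl >= 2147483648u) == (static_cast<int32_t>(Shl) < 0));
      assert((Shl < 2147483648u) == (static_cast<int32_t>(Shl) >= 0));
    }
    return 0;
  }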
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D141753
While demanded-bits constant shrinking appears to prevent this in
practice right now, it is in principle possible for C2 to have
set bits that are known to be unneeded (zeroable). See: D140858
`+` can overflow here; `|` gives the right logic.
Differential Revision: https://reviews.llvm.org/D141089
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*)). (See the sketch below.)
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler gets confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase.
3. ADL doesn't seem to work the same for deduction guides and functions, so in some places the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &), which acts as a no-op, is not supported (a constructor cannot achieve that).
Per reviewers' comments, some useless makeArrayRef calls have been removed in the process.
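A small sketch of the new spelling and of point 1 (the function is
illustrative, not from the change itself):

  #include "llvm/ADT/ArrayRef.h"
  #include <cstddef>
  #include <cstdint>
  using namespace llvm;

  void demo(const uint8_t *Ptr, const uint8_t (&Buf)[16]) {
    // Deduction guides replace the makeArrayRef() helpers:
    ArrayRef A(Buf); // deduces ArrayRef<uint8_t>
    // Point 1: a literal 0 length would be ambiguous with the
    // (begin, end) pointer-pair constructor, hence the cast.
    ArrayRef B(Ptr, (size_t)0);
    (void)A;
    (void)B;
  }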
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
EmitGEPOffset() supports vector GEPs nowadays, so we don't need
any further code changes.
compare_gep_with_base_vector1 shows a weakness in folding the
resulting comparison if an index splat has to be performed.
If we go through the generic EmitGEPOffset code, the resulting
expression can be (and is) reduced in the same way this code did
manually. There are no changes in lit tests or llvm-test-suite.
This fold predates the time when we started adding nsw to the adds
created by EmitGEPOffset, so it was likely needed back then.
This might not actually be NFC due to worklist order changes etc.
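For reference, the generic path now used looks roughly like this
(EmitGEPOffset is the real helper; the wrapper is illustrative):

  #include "llvm/Analysis/Utils/Local.h"
  #include "llvm/IR/IRBuilder.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Let the generic helper emit the byte-offset arithmetic for a GEP,
  // including any index splats needed for vector GEPs, instead of
  // computing it manually.
  static Value *buildGEPOffset(IRBuilder<> &Builder, const DataLayout &DL,
                               GetElementPtrInst *GEP) {
    return EmitGEPOffset(&Builder, DL, GEP);
  }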