foldAllocaCmp() needs to fold all comparisons of an alloca at the
same time, to ensure that there is a consistent view of the alloca
address. Currently, it folds "all" comparisons by limiting to the
case where there is only one. This patch switches the algorithm to
instead actually collect and fold all comparisons.
Something we need to be careful about here is that there may be
comparisons where both sides of the icmp are based on the alloca.
Such comparisons are comparing offsets of the alloca, and as such
can be ignored here, but shouldn't be folded to false.
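As a concrete illustration of that caveat (a hypothetical C++ analogue,
not the actual test case): a comparison whose operands are both derived
from the alloca compares offsets within the object and can legitimately
be true, so folding it to false would miscompile.

  #include <cassert>

  int main() {
    char Buf[4]; // stack object, analogous to the alloca
    // Both operands are based on Buf: this compares offsets within the
    // object, is true, and must not be folded to false.
    assert(&Buf[1] == &Buf[0] + 1);
    return 0;
  }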
Differential Revision: https://reviews.llvm.org/D144492
There is no getNullValue in ConstantFP. Due to inheritance, we're calling
Constant::getNullValue, which handles any type, including FP.
Since we already know we want an FP constant, we can use ConstantFP::getZero,
which might be faster and is a more readable name for an FP zero.
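A minimal sketch of the substitution (the surrounding helper is
hypothetical; Ty is already known to be a floating-point type at the
call site):

  #include "llvm/IR/Constants.h"
  using namespace llvm;

  static Constant *makeFPZero(Type *Ty) {
    // Before: dispatches on the type kind inside Constant.
    //   return Constant::getNullValue(Ty);
    // After: FP-specific, skips the generic dispatch and reads better.
    return ConstantFP::getZero(Ty);
  }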
Many uses of getIntPtrType() were using that type to calculate the
needed type for GEP offset arguments. However, some time ago,
DataLayout was extended to support pointers where the size of the
pointer is not equal to the size of the values used to index it.
Much code was already migrated to, for example, use getIndexSizeInBits
instead of getPointerSizeInBits, but some rewrites still used
getIntPtrType() to get the type for GEP offsets.
This commit changes uses of getIntPtrType() to getIndexType() where
they are involved in a GEP-related calculation.
In at least one case (bounds check insertion) this resolves a compiler
crash that the new test added here would previously trigger.
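A sketch of the migration pattern (the helper is illustrative, not a
specific call site from this commit; both calls are real DataLayout
APIs):

  #include "llvm/IR/DataLayout.h"
  using namespace llvm;

  static Type *gepOffsetType(const DataLayout &DL, Type *PtrTy) {
    // Before: sized like the pointer itself.
    //   return DL.getIntPtrType(PtrTy);
    // After: sized like the values used to index the pointer, which
    // can differ from the pointer size on some targets.
    return DL.getIndexType(PtrTy);
  }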
This commit does not impact
- C library-related rewrites (e.g. memcpy()), which operate under
the assumption that intptr_t == size_t. While all the mechanisms for
breaking this assumption now exist, doing so is outside the scope of
this commit.
- Code generation and below. Note that the use of getIntPtrType() in
CodeGenPrepare will be changed in a future commit.
- Usage of getIntPtrType() in any backend
Depends on D143435
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D143437
We can fold equality comparisons of non-inbounds geps to offset
comparison (https://alive2.llvm.org/ce/z/x2Zp8b). The inbounds
requirement is only necessary for relational comparisons.
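A standalone check of the underlying identity (plain C++, not LLVM
code): adding a common base is injective modulo 2^n, so equality of the
sums matches equality of the offsets even when the additions wrap,
which is why inbounds is not needed here.

  #include <cassert>
  #include <cstdint>

  int main() {
    uint64_t Base = UINT64_MAX - 1; // the additions below wrap
    for (uint64_t A = 0; A < 16; ++A)
      for (uint64_t B = 0; B < 16; ++B)
        assert(((Base + A) == (Base + B)) == (A == B));
    return 0;
  }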
We currently already canonicalize icmp eq (%x & Pow2), Pow2 to
icmp ne (%x & Pow2), 0. This patch generalizes the fold based on
known bits.
In particular, this allows us to handle comparisons against
!range !{i64 0, i64 2} loads, which addresses an optimization
regression in Rust caused by 8df376db72.
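For the special case called out above, a standalone check of the
underlying fact (plain C++, not LLVM code): when a value is known to be
0 or 1, comparing it for equality with 1 is the same predicate as
comparing it for inequality with 0.

  #include <cassert>
  #include <cstdint>

  int main() {
    for (uint64_t X = 0; X <= 1; ++X) // the !range !{i64 0, i64 2} values
      assert((X == 1) == (X != 0));
    return 0;
  }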
Differential Revision: https://reviews.llvm.org/D146149
Take the dominating condition into account; the urem fold benefits
from the resulting analysis improvements.
Fixes https://github.com/llvm/llvm-project/issues/60546
NOTE: the calls in simplifyBinaryIntrinsic and foldICmpWithDominatingICmp
are deleted to reduce compile time.
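One arithmetic fact this kind of dominating-condition reasoning can
exploit (a standalone plain-C++ check, not taken from the patch): on a
path where a branch has established X u< N, X urem N is simply X.

  #include <cassert>
  #include <cstdint>

  int main() {
    const uint64_t N = 8;
    for (uint64_t X = 0; X < N; ++X) // simulates a dominating X u< N
      assert(X % N == X);
    return 0;
  }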
Reviewed By: nikic, arsenm, erikdesjardins
Differential Revision: https://reviews.llvm.org/D144248
This addresses the compile-time regression reported on D144369.
If we don't fold constant operands early, then we might end up
walking very large use lists of constants here. Explicitly exclude
constants, and also limit the number of inspected users to avoid
degenerate cases like this.
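A sketch of the guard described above (names are hypothetical, not the
actual InstCombine code):

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/Value.h"
  using namespace llvm;

  static bool hasAtMostNUsers(const Value *V, unsigned MaxUsers) {
    // Constants have module-wide, potentially huge shared use lists;
    // never walk them here.
    if (isa<Constant>(V))
      return false;
    unsigned NumInspected = 0;
    for (const User *U : V->users()) {
      (void)U;
      if (++NumInspected > MaxUsers)
        return false; // bail out on degenerate cases
    }
    return true;
  }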
This entire transform shouldn't be part of InstCombine in the
first place though.
foldAllocaCmp() checks whether the alloca is not captured (ignoring
the icmp). Replace the manual implementation of escape analysis
with CaptureTracking.
The primary practical difference is that CaptureTracking handles
nocapture arguments, while foldAllocaCmp() was using a hardcoded
list.
This is basically just the CaptureTracking refactoring from D120371
without the other changes.
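A simplified sketch of the approach (hedged; the real tracker carries
more logic):

  #include "llvm/Analysis/CaptureTracking.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  namespace {
  struct CmpCaptureTracker : public CaptureTracker {
    bool Captured = false;
    void tooManyUses() override { Captured = true; }
    bool captured(const Use *U) override {
      // The comparisons are exactly the uses being folded away, so
      // they do not count as captures here.
      if (isa<ICmpInst>(U->getUser()))
        return false;
      Captured = true;
      return true; // stop: the alloca escapes through this use
    }
  };
  } // namespace

Such a tracker would be driven by the existing
PointerMayBeCaptured(V, &Tracker) entry point.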
If we have both an nsw and nuw flag, we would see the nsw flag
first and only handle signed comparisons.
This patch ignores the nsw flag if the comparison isn't signed.
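A minimal sketch of the corrected dispatch (hypothetical helper, not
the patch verbatim):

  #include "llvm/IR/Instructions.h"
  #include "llvm/IR/Operator.h"
  using namespace llvm;

  static bool hasUsableWrapFlag(const OverflowingBinaryOperator *Op,
                                const ICmpInst &Cmp) {
    // Pick the flag matching the predicate's signedness instead of
    // preferring nsw whenever both flags are present.
    return Cmp.isSigned() ? Op->hasNoSignedWrap()
                          : Op->hasNoUnsignedWrap();
  }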
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D143766
The results of intrinsics like ctpop, cttz, and ctlz are limited to the
range 0 to bitwidth. So if the truncate's destination type can hold the
source bit width, we can ignore the truncate and use the truncate's
source for the combine.
Alive2 proofs:
https://alive2.llvm.org/ce/z/9D_-qP
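A standalone check of the range fact (plain C++20, not LLVM code):
ctpop of an i64 is at most 64, so a truncation whose destination can
hold 64 loses nothing.

  #include <bit>
  #include <cassert>
  #include <cstdint>

  int main() {
    uint64_t X = ~0ULL;
    int Pop = std::popcount(X); // 64 here, the maximum possible
    assert(Pop <= 64 && static_cast<uint8_t>(Pop) == Pop);
    return 0;
  }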
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D143368
This is the most basic patch to handle fixing issue #57666.
D133919 proposes to handle much more than this in a single patch,
but I've used 10 regression tests just to make sure this part is
doing what I expected and nothing more, and it already shows even
more potential TODO items.
The more general proofs from D133919 are correct, but I want to
enable this in smaller steps to reduce risk:
https://alive2.llvm.org/ce/z/RrVEyX
Differential Revision: https://reviews.llvm.org/D142847
The first attempt at landing this caused a build failure:
https://lab.llvm.org/buildbot/#/builders/183/builds/10447
but after investigation it appears to be unrelated. The same
test/build passed later with the original commit here:
https://lab.llvm.org/buildbot/#/builders/183/builds/10448
1. Add checks for whether X and/or Y is odd. Odd values are unnecessary
to the icmp: isZero(Odd * N) == isZero(N). (See the standalone check
after this list.)
2. If neither X nor Y is known odd, then if X * Y cannot overflow AND
X and/or Y is known non-zero, the non-zero values are unnecessary to
the icmp.
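A standalone check of identity 1 (plain C++, not LLVM code): odd values
are invertible modulo 2^n, so multiplying by one cannot turn a non-zero
value into zero, even with wraparound.

  #include <cassert>
  #include <cstdint>

  int main() {
    const uint8_t Odd = 0xAB; // odd, hence invertible mod 2^8
    for (unsigned N = 0; N < 256; ++N) {
      uint8_t Prod = static_cast<uint8_t>(Odd * N);
      uint8_t Val = static_cast<uint8_t>(N);
      assert((Prod == 0) == (Val == 0)); // isZero(Odd * N) == isZero(N)
    }
    return 0;
  }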
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D140850
Test if 2 values have different or same signbits:
(X u>> BitWidth - 1) == zext (Y s> -1) --> (X ^ Y) < 0
(X u>> BitWidth - 1) != zext (Y s> -1) --> (X ^ Y) > -1
https://alive2.llvm.org/ce/z/qMwMhj
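A standalone check of both folds (plain C++, 32-bit case, not LLVM
code): the equality holds exactly when the sign bits differ, i.e. when
(X ^ Y) is negative.

  #include <cassert>
  #include <cstdint>

  int main() {
    int32_t Vals[] = {INT32_MIN, -7, 0, 42, INT32_MAX};
    for (int32_t X : Vals)
      for (int32_t Y : Vals) {
        uint32_t Lhs = static_cast<uint32_t>(X) >> 31; // X u>> BitWidth-1
        uint32_t Rhs = (Y > -1) ? 1u : 0u;             // zext (Y s> -1)
        assert((Lhs == Rhs) == ((X ^ Y) < 0));
        assert((Lhs != Rhs) == ((X ^ Y) > -1));
      }
    return 0;
  }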
As noted in #60242, these patterns regressed between the
14.0 and 15.0 releases - probably due to a change in
canonicalization of related patterns.
The related patterns for testing if 2 values are both
pos/neg appear to be handled already.
Complexity canonicalization guarantees that a binop and cast
are op0/op1 respectively. Adjusted generic test names to
show that this pattern is still useful.
This code handles (icmp eq/ne (1 << Y), C) if C is a power of 2.
This case is also handled by the more general foldICmpShlConstConst
which is called before we reach foldICmpShlOne.
The code tried to do this for (icmp sle (1 << Y), 0), but that is
canonicalized to sgt before we get there.
Simplify the code by removing the unreachable SGE and SLE handling.
Also remove the (1 << Y) >=u 2147483648 and (1 << Y) <u 2147483648
handling since those are canonicalized to (1 << Y) <s 0 and
(1 << Y) >=s 0 before we get there.
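A standalone check of the predicate equivalences relied on here (plain
C++, 32-bit case, not LLVM code):

  #include <cassert>
  #include <cstdint>

  int main() {
    for (uint32_t Y = 0; Y < 32; ++Y) {
      uint32_t Shl = 1u << Y;
      // u>= 2147483648 is the same predicate as s< 0, and
      // u< 2147483648 the same as s>= 0.
      assert((Shl >= 2147483648u) == (static_cast<int32_t>(Shl) < 0));
      assert((Shl < 2147483648u) == (static_cast<int32_t>(Shl) >= 0));
    }
    return 0;
  }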
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D141753
While demanded-bits constant shrinking appears to prevent this in
practice right now, it is in principle possible for C2 to have
set bits that are known to be unneeded (zeroable). See: D140858
`+` can overflow here; `|` gives the right logic.
Differential Revision: https://reviews.llvm.org/D141089
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*)). (See the sketch below.)
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler gets confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase.
3. ADL doesn't seem to work the same for deduction guides and functions, so in some places the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &), which acts as a no-op, is not supported (a constructor cannot achieve that).
Per reviewers' comments, some useless makeArrayRef calls have been removed in the process.
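A small sketch of the new spelling and of point 1 (the function is
illustrative, not from the change itself):

  #include "llvm/ADT/ArrayRef.h"
  #include <cstddef>
  #include <cstdint>
  using namespace llvm;

  void demo(const uint8_t *Ptr, const uint8_t (&Buf)[16]) {
    // Deduction guides replace the makeArrayRef() helpers:
    ArrayRef A(Buf); // deduces ArrayRef<uint8_t>
    // Point 1: a literal 0 length would be ambiguous with the
    // (begin, end) pointer-pair constructor, hence the cast.
    ArrayRef B(Ptr, (size_t)0);
    (void)A;
    (void)B;
  }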
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
EmitGEPOffset() supports vector GEPs nowadays, so we don't need
any further code changes.
compare_gep_with_base_vector1 shows a weakness in folding the
resulting comparison if an index splat has to be performed.
If we go through the generic EmitGEPOffset code, the resulting
expression can be (and is) reduced in the same way this code did
manually. There are no changes in lit tests or llvm-test-suite.
This fold predates the time when we started adding nsw to the adds
created by EmitGEPOffset, so it was likely needed back then.
This might not actually be NFC due to worklist order changes etc.
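For reference, the generic path now used looks roughly like this
(EmitGEPOffset is the real helper; the wrapper is illustrative):

  #include "llvm/Analysis/Utils/Local.h"
  #include "llvm/IR/IRBuilder.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Let the generic helper emit the byte-offset arithmetic for a GEP,
  // including any index splats needed for vector GEPs, instead of
  // computing it manually.
  static Value *buildGEPOffset(IRBuilder<> &Builder, const DataLayout &DL,
                               GetElementPtrInst *GEP) {
    return EmitGEPOffset(&Builder, DL, GEP);
  }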