In InstCombine, convert AArch64 SVE LD1/ST1 intrinsics to
llvm.masked.load/llvm.masked.store, and further to plain load/store
when the predicate operand is a ptrue-all pattern.
This allows existing IR optimizations such as dead-load removal to
occur.
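A minimal sketch of the intent (element type, alignment, and the i32*
pointer style are chosen for illustration; the actual patch covers the
full set of LD1/ST1 intrinsics):
```
; SVE ld1 whose governing predicate is ptrue-all (31 == SV_ALL):
define <vscale x 4 x i32> @ld1_all(i32* %addr) {
  %pg = call <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
  %v = call <vscale x 4 x i32> @llvm.aarch64.sve.ld1.nxv4i32(<vscale x 4 x i1> %pg, i32* %addr)
  ret <vscale x 4 x i32> %v
}
; With an all-active predicate, the target intrinsic can become a plain
; load (via llvm.masked.load) that generic IR passes understand:
;   %v = load <vscale x 4 x i32>, <vscale x 4 x i32>* %cast
declare <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32)
declare <vscale x 4 x i32> @llvm.aarch64.sve.ld1.nxv4i32(<vscale x 4 x i1>, i32*)
```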
Differential Revision: https://reviews.llvm.org/D113489
For the scalar/splat case, this fold is subsumed by
foldLogOpOfMaskedICmps(). However, the conjugated fold for "or"
also supports splats with undef. Make both code paths consistent
by using m_ZeroInt() for the "and" implementation as well.
https://alive2.llvm.org/ce/z/tN63cu
https://alive2.llvm.org/ce/z/ufB_Ue
Tests for:
```
(a | ~(b & c)) & ~(a & (b ^ c)) --> ~(a | b) | (a ^ b ^ c)
(~(a & b) | c) & ~(a & (b & c)) --> ~(a & b)
(~(a & b) | c) & ~(b & (a & c)) --> ~(a & b)
(~a | b | c) & ~(a & b & c) --> ~a | (b ^ c)
(~a | b | c) & ~(a & b) --> (c & ~b) | ~a
```
When folding and/or of icmps, look through add of a constant and
adjust the icmp range instead. Effectively, this decomposes
X + C1 < C2 style range checks back into a normal range. This allows
us to fold comparisons involving two range checks or one range check
and some other condition. We had a fold for a really specific case
of this (an or of a range check and an eq, and only on one side!),
while this handles it in full generality.
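For example (a hand-picked illustration, not a test from the patch),
two decomposed range checks over adjacent ranges can merge into one:
```
define i1 @src(i8 %x) {
  %a1 = add i8 %x, -5
  %c1 = icmp ult i8 %a1, 10   ; x in [5, 15)
  %a2 = add i8 %x, -15
  %c2 = icmp ult i8 %a2, 10   ; x in [15, 25)
  %r = or i1 %c1, %c2         ; x in [5, 25)
  ret i1 %r
}

define i1 @tgt(i8 %x) {
  %a = add i8 %x, -5
  %r = icmp ult i8 %a, 20
  ret i1 %r
}
```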
Differential Revision: https://reviews.llvm.org/D113510
Since there is just a single use check for the LHS in the
~(A | B) & C | ... transforms, but multiple RHS checks inside (with
more coming), I am removing the m_OneUse checks for the LHS and adding
new checks for the RHS. This is acceptable as long as there is a net
benefit.
In addition, the checks for (~(A | B) & C) | (~(A | C) & B) --> (B ^ C) & ~A
were overly restrictive; it should be good without any
additional checks.
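As an illustration of that fold (a sketch, not one of the committed
tests):
```
define i8 @src(i8 %a, i8 %b, i8 %c) {
  %or1  = or i8 %a, %b
  %not1 = xor i8 %or1, -1
  %and1 = and i8 %not1, %c    ; ~(A | B) & C
  %or2  = or i8 %a, %c
  %not2 = xor i8 %or2, -1
  %and2 = and i8 %not2, %b    ; ~(A | C) & B
  %r    = or i8 %and1, %and2
  ret i8 %r
}

define i8 @tgt(i8 %a, i8 %b, i8 %c) {
  %xor  = xor i8 %b, %c
  %nota = xor i8 %a, -1
  %r    = and i8 %xor, %nota  ; (B ^ C) & ~A
  ret i8 %r
}
```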
Differential Revision: https://reviews.llvm.org/D113141
A problem was noted in the post-commit review for
c36b7e21bd / D113035:
If the source type is not an integer or integer vector,
then we could crash when calling ComputeNumSignBits().
Previously, InstCombine detected a pair of llvm.stacksave/stackrestore
instructions that are adjacent modulo debug instructions in order to
eliminate the llvm.stackrestore. This misses situations where
intervening instructions (e.g. loads) prevent the llvm.stacksave and
llvm.stackrestore from becoming adjacent. This commit extends the logic
and allows for eliminating the llvm.stackrestore when the range of
instructions between them does not include any alloca or side-effect
causing instructions.
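A minimal sketch of the newly handled shape (hypothetical function;
only the intervening load matters):
```
define i32 @f(i32* %p) {
  %ss = call i8* @llvm.stacksave()
  %v = load i32, i32* %p       ; intervening, but no alloca or side effects
  call void @llvm.stackrestore(i8* %ss)
  ret i32 %v                   ; the stackrestore can now be eliminated
}
declare i8* @llvm.stacksave()
declare void @llvm.stackrestore(i8*)
```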
Signed-off-by: Itay Bookstein <itay.bookstein@nextsilicon.com>
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D113105
Op0 - umax(X, Op0) --> 0 - usub.sat(X, Op0)
I'm not sure if this is really an improvement in IR because
we probably have better recognition/analysis for min/max,
but this lines up with the fold we do for the icmp+select
idiom and removes another diff from D98152.
This is similar to the previous fold in the code that was
added with:
83c2fb9f66baa6a85130
https://alive2.llvm.org/ce/z/5MrVB9
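A sketch of the fold in IR (illustrative types):
```
define i8 @src(i8 %op0, i8 %x) {
  %m = call i8 @llvm.umax.i8(i8 %x, i8 %op0)
  %r = sub i8 %op0, %m
  ret i8 %r
}

define i8 @tgt(i8 %op0, i8 %x) {
  %s = call i8 @llvm.usub.sat.i8(i8 %x, i8 %op0)
  %r = sub i8 0, %s
  ret i8 %r
}

declare i8 @llvm.umax.i8(i8, i8)
declare i8 @llvm.usub.sat.i8(i8, i8)
```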
(Cond & C) | (~bitcast(Cond) & D) --> bitcast (select Cond, (bc C), (bc D))
This is part of fixing:
https://llvm.org/PR34047
That report shows a case where a bitcast is sitting between the select condition
candidate and its 'not' value due to current cast canonicalization rules.
There's a bitcast type restriction that might be violated in existing matching,
but I still need to investigate if that is possible.
Alive2 shows we can only do this transform safely when the bitcast is from
narrow to wide vector elements (otherwise poison could leak into elements
that were safe in the original code):
https://alive2.llvm.org/ce/z/Hf66qh
Differential Revision: https://reviews.llvm.org/D113035
InstCombine converts range tests of the form (X > C1 && X < C2) or
(X < C1 || X > C2) into checks of the form (X + C3 < C4) or
(X + C3 > C4). It is possible to express all range tests in either
of these forms (with different choices of constants), but currently
neither of them is considered canonical. We may have equivalent
range tests using either ult or ugt.
This proposes to canonicalize all range tests to use ult. An
alternative would be to canonicalize to either ult or ugt depending
on the specific constants involved -- e.g. in practice we currently
generate ult for && style ranges and ugt for || style ranges when
going through the insertRangeTest() helper. In fact, the "clamp-like"
fold was relying on this, which is why I had to tweak it to not
assume whether inversion is needed based on just the predicate.
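For example (constants chosen for illustration), a ugt-style range
test becomes the equivalent ult form:
```
define i1 @src(i8 %x) {
  %a = add i8 %x, 10
  %r = icmp ugt i8 %a, 19    ; x in [10, 246)
  ret i1 %r
}

define i1 @tgt(i8 %x) {
  %a = add i8 %x, -10
  %r = icmp ult i8 %a, -20   ; -20 == 236; same range [10, 246)
  ret i1 %r
}
```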
Proof: https://alive2.llvm.org/ce/z/_SP_rQ
Differential Revision: https://reviews.llvm.org/D113366
As described in https://bugs.llvm.org/show_bug.cgi?id=52429 this
fold is incorrect, because inbounds only guarantees that the
pointers don't wrap in the unsigned space: it is possible that
the sign boundary is crossed by an object.
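A hypothetical counterexample (imagining an 8-bit address space for
brevity): let an object span [0x7E, 0x86), with %p at 0x7E (signed +126).
```
define i1 @cmp(i8* %p) {
  %q = getelementptr inbounds i8, i8* %p, i8 4
  ; %q = 0x82 (signed -126): %p u< %q holds, but %p s< %q does not,
  ; so deriving this signed compare's result from inbounds is unsound.
  %r = icmp slt i8* %p, %q
  ret i1 %r
}
```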
I'm dropping the fold entirely rather than adjusting it, because
computePointerICmp() fully subsumes it (just with correct predicate
handling).
Differential Revision: https://reviews.llvm.org/D113343
umax(X, Op1) - Op1 --> usub.sat(X, Op1)
https://alive2.llvm.org/ce/z/HpcGiJ
This happens in 2 or more steps with an icmp-select idiom
instead of an intrinsic. This is another step towards
canonicalization of the min/max intrinsics. See:
D98152
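A sketch of the fold, with the icmp+select spelling it previously
required shown for contrast (illustrative types):
```
define i8 @src(i8 %x, i8 %op1) {
  %m = call i8 @llvm.umax.i8(i8 %x, i8 %op1)
  %r = sub i8 %m, %op1
  ret i8 %r
}

; equivalent icmp+select idiom for the same computation:
;   %c = icmp ugt i8 %x, %op1
;   %m = select i1 %c, i8 %x, i8 %op1
;   %r = sub i8 %m, %op1

define i8 @tgt(i8 %x, i8 %op1) {
  %r = call i8 @llvm.usub.sat.i8(i8 %x, i8 %op1)
  ret i8 %r
}

declare i8 @llvm.umax.i8(i8, i8)
declare i8 @llvm.usub.sat.i8(i8, i8)
```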
There is a combine in instcombine to transform a saturated add/sub into
a saddsat/ssubsat, currently handling inputs which are both sign
extended (https://alive2.llvm.org/ce/z/68qpTn). This can generalize to,
for example, an ashr of at least the bitwidth (https://alive2.llvm.org/ce/z/4TFyX-
and https://alive2.llvm.org/ce/z/qDWzFs, for example), which means it
generalizes further to "the number of sign bits" needing to be enough
to truncate to the size of the saturate (an example using `or`, for
instance: https://alive2.llvm.org/ce/z/EI_h_A).
So this patch makes use of ComputeNumSignBits (with the newly added
ComputeMinSignedBits) in matchSAddSubSat to generalize the fold to any
inputs with enough sign bits known, truncating the inputs to the new
size of the saturate.
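A sketch of the already-handled sext form (illustrative widths; the
patch extends this to any inputs with enough known sign bits):
```
define i8 @src(i8 %a, i8 %b) {
  %sa  = sext i8 %a to i16
  %sb  = sext i8 %b to i16
  %add = add i16 %sa, %sb                              ; exact in i16
  %lo  = call i16 @llvm.smin.i16(i16 %add, i16 127)    ; clamp high
  %hi  = call i16 @llvm.smax.i16(i16 %lo, i16 -128)    ; clamp low
  %r   = trunc i16 %hi to i8
  ret i8 %r
}

define i8 @tgt(i8 %a, i8 %b) {
  %r = call i8 @llvm.sadd.sat.i8(i8 %a, i8 %b)
  ret i8 %r
}

declare i16 @llvm.smin.i16(i16, i16)
declare i16 @llvm.smax.i16(i16, i16)
declare i8 @llvm.sadd.sat.i8(i8, i8)
```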
Differential Revision: https://reviews.llvm.org/D112298
The added test has poison lanes due to the vector shuffle. This can
cause an infinite loop of combines in instcombine where it folds
xor(ashr, -1) -> select (icmp slt 0), -1, 0 -> sext (icmp slt 0) -> xor(ashr, -1).
We usually prevent this by checking that the xor constant is not -1,
but with vectors some of the lanes may be -1 and some may be poison. So
this changes the detection from "!C1->isAllOnesValue()" to
"!match(C1, m_AllOnes())", which also recognizes all-ones constants
that contain poison lanes.
Fixes PR52397
In D71220, a pattern was added to replace a shuffle's insertelement
operand if the inserted scalar is not demanded. The pattern was added
only for the case where the shuffle's mask size is equal to the
insertelement's vector size. However, that condition is not required
because the pattern does not change the shuffle's vector size.
This patch extends the pattern to also cover cases where the shuffle's
mask size is not equal to the insertelement's vector size.
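For example (a sketch; mask length 3 vs. vector length 4), the insert
can be bypassed because lane 0 is never read:
```
define <3 x i8> @src(<4 x i8> %v, i8 %s) {
  %ins = insertelement <4 x i8> %v, i8 %s, i32 0
  %r = shufflevector <4 x i8> %ins, <4 x i8> undef, <3 x i32> <i32 1, i32 2, i32 3>
  ret <3 x i8> %r
}

define <3 x i8> @tgt(<4 x i8> %v, i8 %s) {
  %r = shufflevector <4 x i8> %v, <4 x i8> undef, <3 x i32> <i32 1, i32 2, i32 3>
  ret <3 x i8> %r
}
```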
Differential Revision: https://reviews.llvm.org/D112318
The test diffs are logically equivalent, and so this is
generally NFC, but this makes the code match the code
comment.
It should also be more efficient. If we choose the 'not'
operand (rather than the 'not' instruction) as the select
condition, then we don't have to invert the select
condition/operands as a subsequent transform.
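A sketch of the difference (illustrative types):
```
; Using the 'not' instruction as the condition needs a follow-up
; transform to invert the select:
;   %notc = xor i1 %c, true
;   %r = select i1 %notc, i8 %a, i8 %b
; Using the 'not' operand directly yields the canonical form at once:
;   %r = select i1 %c, i8 %b, i8 %a
```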
Similar to 54e969cffd (and with cosmetic updates to hopefully
make that easier to read), this fold has been around since early
in LLVM history.
Intermediate folds have been added subsequently, so extra uses
are required to exercise this code.
The test example actually shows an unintended consequence with
extra uses - we end up with an extra instruction compared to what
we started with. But this at least makes scalar/vector consistent.
General proof:
https://alive2.llvm.org/ce/z/tmuBza
This fold was added long ago (part of fixing PR4216),
and it matched scalars only. Intermediate folds have
been added subsequently, so extra uses are required
to exercise this code.
General proof:
https://alive2.llvm.org/ce/z/G6BBhB
One of the specific tests:
https://alive2.llvm.org/ce/z/t0JhEB
The extra uses are needed to prevent intermediate folds.
Without that, there would be no coverage currently.
The vector tests show an artificial limitation in the code.