clang-p2996

Author	SHA1	Message	Date
Sanjay Patel	1378e7d8b8	[InstSimplify] add no-wrap parameters to simplifyMul and add more tests; NFC This gives mul the same capabilities as add/sub. A potential improvement with nsw was noted in: `1720ec6da0`	2023-01-18 13:29:30 -05:00
Sanjay Patel	072b03c471	[InstCombine] fold pow(X,Y) / X -> pow(X, Y-1) This is one of the patterns suggested in issue #34943.	2023-01-13 17:13:46 -05:00
Sanjay Patel	61af2ab681	[InstCombine] fold pow(X,Y) * X -> pow(X, Y+1) (with fast-math) This is one of the patterns suggested in issue #34943.	2023-01-13 17:13:46 -05:00
Sanjay Patel	914576c1f0	[InstCombine] fold pow(X,Y) * pow(Z,Y) -> pow(X*Z, Y) (with fast-math) This is one of the patterns suggested in issue #34943.	2023-01-13 13:26:10 -05:00
Sanjay Patel	f0faea5714	[InstSimplify] fold exact divide to poison if it is known to not divide evenly This is related to the discussion in D140665. I was looking over the demanded bits implementation in IR and noticed that we just bail out of a potential fold if a udiv is exact: `82be8a1d2b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp (L799)` Also, see tests added with `7f0c11509e`. Then, I saw that we could lose a fold to poison if we zap the exact with that transform, so this patch tries to catch that as a preliminary step. Alive2 proofs: https://alive2.llvm.org/ce/z/zCjKM7 https://alive2.llvm.org/ce/z/-tz_RK (trailing zeros must be "less-than") https://alive2.llvm.org/ce/z/c9CMsJ (general proof and specific example) Differential Revision: https://reviews.llvm.org/D140733	2022-12-29 10:26:50 -05:00
Chenbing Zheng	1f84e72b7b	[InstCombine] Fold (X << Z) / (X * Y) -> (1 << Z) / Y Alive2: https://alive2.llvm.org/ce/z/CBJLeP	2022-12-29 17:30:49 +08:00
Chenbing Zheng	bff1f8c79b	[InstCombine] complete (X << Z) / (Y << Z) --> X / Y Add one more situation for this fold. For unsigned div, 'nsw' on both shifts + 'nuw' on the dividend. Alive2: https://alive2.llvm.org/ce/z/sELF76 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D139997	2022-12-23 11:56:52 +08:00
Matt Arsenault	e661185fb3	InstCombine: Fold fdiv nnan x, 0 -> copysign(inf, x) https://alive2.llvm.org/ce/z/gLBFKB	2022-11-07 22:00:15 -08:00
Sanjay Patel	f03b069c5b	[InstCombine] fold mul with decremented "shl -1" factor (2nd try) This is a corrected version of: `bc886e9b58` I made a copy-paste error that created an "add" instead of the intended "sub" on that attempt. The regression tests showed the bug, but I overlooked that. As I said in a comment on issue #58717, the bug reports resulting from the botched patch confirm that the pattern does occur in many real-world applications, so hopefully eliminating the multiply results in better code. I added one more regression test in this version of the patch, and here's an Alive2 proof to show that exact example: https://alive2.llvm.org/ce/z/dge7VC Original commit message: This is a sibling to: `6064e92b0a` ...but we canonicalize the shl+add to shl+xor, so the pattern is different than I expected: https://alive2.llvm.org/ce/z/8CX16e I have not found any patterns that are safe to propagate no-wrap, so that is not included here. Differential Revision: https://reviews.llvm.org/D137157	2022-11-02 09:30:01 -04:00
Florian Mayer	e1de7ac20f	Revert "[InstCombine] fold mul with decremented "shl -1" factor" This reverts commit `bc886e9b58`. Broke MSAN bootstrap buildbots with Assertion `RangeAfterCopy % ExtraScale == 0 && "Extra instruction requires immediate to be aligned"' failed.	2022-10-31 17:39:05 -07:00
Sanjay Patel	bc886e9b58	[InstCombine] fold mul with decremented "shl -1" factor This is a sibling to: `6064e92b0a` ...but we canonicalize the shl+add to shl+xor, so the pattern is different than I expected: https://alive2.llvm.org/ce/z/8CX16e I have not found any patterns that are safe to propagate no-wrap, so that is not included here.	2022-10-31 09:06:55 -04:00
zhongyunde	f58311796c	[InstCombine] refactor the SimplifyUsingDistributiveLaws NFC Precommit for D136015 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D137019	2022-10-30 21:04:06 +08:00
Sanjay Patel	6064e92b0a	[InstCombine] fold mul with incremented "shl 1" factor X * ((1 << Z) + 1) --> (X << Z) + X https://alive2.llvm.org/ce/z/P-7WK9 It's possible that we could do better with propagating no-wrap, but this carries over the existing logic and appears to be correct. The naming differences on the existing folds are a result of using getName() to set the final value via Builder. That makes it easier to transfer no-wrap rather than the gymnastics required from the raw create instruction APIs.	2022-10-29 12:50:19 -04:00
Sanjay Patel	50000ec2cb	[InstCombine] create helper function for mul patterns with 1<<X; NFC There are at least 2 other potential patterns that could go here.	2022-10-29 12:50:19 -04:00
Sanjay Patel	d344146857	[InstCombine] reduce code duplication in visitMul(); NFC	2022-10-29 09:26:02 -04:00
Sanjay Patel	44b7da89d7	[InstCombine] fmul nnan X, 0.0 --> copysign(0.0, X) https://alive2.llvm.org/ce/z/ybgM5F Differential Revision: https://reviews.llvm.org/D136166	2022-10-18 11:34:02 -04:00
Sanjay Patel	e5ee0b06d6	[InstCombine] try to determine "exact" for sdiv If the divisor is a power-of-2 or negative-power-of-2 and the dividend is known to have >= trailing zeros than the divisor, the division is exact: https://alive2.llvm.org/ce/z/UGBksM (general proof) https://alive2.llvm.org/ce/z/D4yPS- (examples based on regression tests) This isn't the most direct optimization (we could create ashr in these examples instead of relying on existing folds for exact divides), but it's possible that there's a more general constraint than just a pow2 divisor, so this might be extended in the future. This should solve issue #58348. Differential Revision: https://reviews.llvm.org/D135970	2022-10-16 10:59:56 -04:00
Sanjay Patel	340ae45be0	[InstCombine] use isKnownNonNegative() for readability; NFCI This should be functionally equivalent - both calls are thin wrappers around computeKnownBits(). We'll probably want to use known-bits directly in follow-up patches because that could determine "exact" for example (see issue #58348).	2022-10-16 10:59:56 -04:00
Sanjay Patel	7b9482df3d	[InstCombine] fold sdiv with common shl amount in operands (X << Z) / (Y << Z) --> X / Y https://alive2.llvm.org/ce/z/CLKzqT This requires a surprising "nuw" constraint because we have to guard against immediate UB via signed-div overflow with -1 divisor. This extends `008a89037a` and is another transform derived from issue #58137.	2022-10-12 11:32:15 -04:00
Sanjay Patel	008a89037a	[InstCombine] fold udiv with common shl amount in operands (X << Z) / (Y << Z) --> X / Y https://alive2.llvm.org/ce/z/E5eaxU This fixes the motivating example from issue #58137, but it is not the most general transform. We should probably also convert left-shift in the divisor to right-shift in the dividend for that, but that exposes another missed canonicalization for shifts and adds.	2022-10-12 11:12:26 -04:00
Sanjay Patel	fe97f95036	[InstCombine] propagate "exact" through folds of div These folds were added recently with: `6b869be810` `8da2fa856f` ...but they didn't account for the "exact" attribute, and that can be safely propagated: https://alive2.llvm.org/ce/z/F_WhnR https://alive2.llvm.org/ce/z/ft9Cgr	2022-10-12 09:25:05 -04:00
Sanjay Patel	d117ee25b8	[InstCombine] add helper function for div+shl folds; NFC There are at least 2 similar patterns that could be added here, and the existing fold can be improved because it fails to propagate "exact".	2022-10-12 09:25:04 -04:00
Sanjay Patel	7ec604a317	[InstCombine] try harder to cancel out mul/div ((Op1 * X) / Y) / Op1 --> X / Y https://alive2.llvm.org/ce/z/JYxWjA InstSimplify handles the more basic mul+div pattern with shared operand, but we don't seem to have any reassociation folds to handle cases where the common op is further away. This is a generalization of `9cff4711ac` and another transform derived from issue #58137.	2022-10-11 09:51:51 -04:00
Sanjay Patel	9cff4711ac	[InstCombine] fold udiv with common factor ((X *nuw Y) >> Z) / X --> Y >> Z https://alive2.llvm.org/ce/z/x3kKnq This is similar to `6b869be810` / `8da2fa856f`, but I have not found a signed equivalent, so it's just an unsigned match for now.	2022-10-10 08:12:06 -04:00
Sanjay Patel	eccb9a77c6	[InstCombine] fold exact sdiv to ashr (2nd try) The 1st attempt failed to update the test checks as expected. Original commit message: sdiv exact X, (1<<ShAmt) --> ashr exact X, ShAmt (if shl is non-negative) https://alive2.llvm.org/ce/z/kB6VF7 It would probably be better to use ValueTracking to replace this and the existing transform above it, but the analysis does not account for the no-wrap properly, and it's not immediately clear to me how to fix it.	2022-10-08 10:09:44 -04:00
Sanjay Patel	68d4dbc2c1	Revert "[InstCombine] fold exact sdiv to ashr" This reverts commit `fe15290e0c`. The test checks were not updated as expected.	2022-10-08 10:02:03 -04:00
Sanjay Patel	fe15290e0c	[InstCombine] fold exact sdiv to ashr sdiv exact X, (1<<ShAmt) --> ashr exact X, ShAmt (if shl is non-negative) https://alive2.llvm.org/ce/z/kB6VF7 It would probably be better to use ValueTracking to replace this and the existing transform above it, but the analysis does not account for the no-wrap properly, and it's not immediately clear to me how to fix it.	2022-10-08 09:23:46 -04:00
Sanjay Patel	bdfefac9a4	[InstCombine] refactor sdiv by (negative) power-of-2 folds; NFCI It's probably better to try harder on this kind of pattern by using ValueTracking.	2022-10-07 11:35:17 -04:00
Sanjay Patel	8da2fa856f	[InstCombine] fold sdiv with hidden common factor (X * Y) s/ (X << Z) --> Y s/ (1 << Z) https://alive2.llvm.org/ce/z/yRSddG issue #58137	2022-10-06 13:11:50 -04:00
Sanjay Patel	6b869be810	[InstCombine] fold udiv with hidden common factor (X * Y) u/ (X << Z) --> Y u>> Z https://alive2.llvm.org/ce/z/4G9D_W	2022-10-06 11:35:27 -04:00
Sanjay Patel	2e87333bfe	[InstCombine] convert mul by negative-pow2 to negate and shift This is an unusual canonicalization because we create an extra instruction, but it's likely better for analysis and codegen (similar reasoning as D133399). InstCombine::Negator may create this kind of multiply from negate and shift, but this should not conflict because of the narrow negation. I don't know how to create a fully general proof for this kind of transform in Alive2, but here's an example with bitwidths similar to one of the regression tests: https://alive2.llvm.org/ce/z/J3jTjR Differential Revision: https://reviews.llvm.org/D133667	2022-10-02 12:22:25 -04:00
zhongyunde	23a5de4294	[InstCombine] Distributive or+mul with const operand We already support the transform: `(X+C1)*CI -> X*CI+C1*CI` Here the case is a little special as the form of `(X+C1)*CI` is transformed into `(X|C1)*CI`, so we should also support the transform: `(X|C1)*CI -> X*CI+C1*CI` Fixes https://github.com/llvm/llvm-project/issues/57278 Reviewed By: bcl5980, spatel, RKSimon Differential Revision: https://reviews.llvm.org/D132658	2022-08-30 20:36:52 +08:00
zhongyunde	84d6966e4d	[InstCombine] Propagate the nuw for combine of add+mul As the commit of D132658, make the 'nuw' change separately. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D132777	2022-08-28 23:01:11 +08:00
Zain Jaffal	f61f99a105	[instcombine] Optimise for zero initialisation of product given fast flags are enabled Currently, clang ignores the 0 initialisation in finite math. For example: ``` double f_prod = 0; double arr[1000]; for (size_t i = 0; i < 1000; i++) { f_prod *= arr[i]; } ``` Clang will ignore that `f_prod` is set to zero and it will generate assembly to iterate over the loop. Reviewed By: fhahn, spatel Differential Revision: https://reviews.llvm.org/D131672	2022-08-17 11:12:15 +01:00
Sanjay Patel	f95a6aea1b	[InstCombine] avoid splitting a constant expression with div/rem fold Follow-up to `d4940c0f3d` to further limit the transform to avoid an unintended pattern/fold of a constant expression.	2022-07-30 09:45:25 -04:00
Sanjay Patel	d4940c0f3d	[InstCombine] fix miscompile from urem/udiv transform with constant expression The isa<Constant> check could misfire on an instruction with 2 constant operands. This bug was introduced with `bb789381fc` (D36988). See issue #56810 for a C source example that exposed the bug.	2022-07-29 17:14:30 -04:00
Nikita Popov	5eaeeed8cb	[InstCombine] Avoid ConstantExpr::getFNeg() calls (NFCI) Instead call the constant folding API, which can fail. For now, this should be NFC, as we still allow the creation of fneg constant expressions.	2022-07-29 16:01:46 +02:00
Nikita Popov	fc18a88231	[InstCombine] Avoid creating float binop ConstantExprs Replace ConstantExpr:getFAdd etc with call to ConstantFoldBinaryOpOperands(). I'm using the constant folding API rather than IRBuilder here to ensure that this does actually constant fold. These transforms don't use m_ImmConstant(), so this would not otherwise be guaranteed (and apparently, they can't use m_ImmConstant because they want to handle scalable vector splats). There is an opportunity here to further migrate these to the ConstantFoldFPInstOperands() API, which would respect the denormal mode. I've held off on doing so here, because some of this code explicitly checks for denormal results, and I don't want to touch it in a mostly NFC change.	2022-07-08 16:36:04 +02:00
Simon Moll	b8c2781ff6	[NFC] format InstructionSimplify & lowerCaseFunctionNames Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889	2022-06-09 16:10:08 +02:00
Sanjay Patel	3f33d67d8a	[InstCombine] fold mul with masked low bit operand to trunc+select https://alive2.llvm.org/ce/z/o7rQ5q This shows an extra instruction in some cases, but that is caused by an existing canonicalization of trunc -> and+icmp. Codegen should be better for any target where a multiply is more costly than the most simple ALU op. This ends up producing the requested x86 asm from issue #55618, but it's not the same IR. We are missing a canonicalization from the negate+mask pattern to the trunc+select created here.	2022-06-05 20:07:18 -04:00
Sanjay Patel	8689463bfb	[InstCombine] make pattern matching more consistent; NFC We could go either way on this and several similar matches. Just matching as a binop is possibly slightly more efficient; we don't need to re-confirm the opcode of the instruction.	2022-06-02 16:01:23 -04:00
zhongyunde	3e6ba89055	[InstCombine] Fold a mul with bool value into and Fixes https://github.com/llvm/llvm-project/issues/55599 X * Y --> X & Y, iff X, Y can be only {0, 1}. https://alive2.llvm.org/ce/z/_RsTKF Reviewed By: spatel, nikic Differential Revision: https://reviews.llvm.org/D126040	2022-05-30 21:05:00 +08:00
Sanjay Patel	b5b6aa4d53	[InstCombine] fold multiply by signbit-splat to cmp+select (ashr i32 X, 31) * C --> (X < 0) ? -C : 0 https://alive2.llvm.org/ce/z/G8u9SS With a constant operand, this is an improvement in IR and codegen (where it can be converted to a mask op). Without a constant operand, we would have to negate the operand, so that is probably better left to the backend. This is similar but not the same optimization that is requested in #55618.	2022-05-27 11:54:19 -04:00
Sanjay Patel	5a6e085757	[InstCombine] reduce code duplication; NFC	2022-05-27 11:54:19 -04:00
Sanjay Patel	c4c750058f	[InstCombine] fold mul of signbit directly to X < 0 ? Y : 0 This is effectively NFC (intentionally no test diffs) because we already have the related fold that converts the 'and' pattern to select. So this is just an efficiency improvement.	2022-05-26 16:19:15 -04:00
Sanjay Patel	e8c20d995b	[IR] add and use pattern match specialization for sqrt intrinsic; NFC This was included in D126190 originally, but it's independent and a useful change for readability.	2022-05-23 14:16:30 -04:00
Sanjay Patel	be7f09f7b2	[IR] create and use helper functions that test the signbit; NFCI	2022-05-16 11:26:23 -04:00
Sanjay Patel	2fa8fc3d0a	[InstCombine] freeze operand in div+mul fold As discussed in issue #37809, this transform is not safe if the input is an undefined value. This is similar to recent changes for urem and sdiv: `d428f09b2c` `99ef341ce9` There is no difference in codegen on the basic examples, but this could lead to regressions. We may need to improve freeze analysis or lowering if that happens. Presumably, in real cases that are similar to the tests where a subsequent transform removes the rem, we will also be able to remove the freeze by seeing that the parameter has 'noundef'.	2022-05-12 13:49:29 -04:00
Sanjay Patel	99ef341ce9	[InstCombine] freeze operand in sdiv expansion As discussed in issue #37809, this transform is not safe if the input is an undefined value. This is similar to a recent change for urem: `d428f09b2c` There is no difference in codegen on the basic examples, but this could lead to regressions. We may need to improve freeze analysis or lowering if that happens. Presumably, in real cases that are similar to the tests where a subsequent transform removes the select, we will also be able to remove the freeze by seeing that the parameter has 'noundef'.	2022-05-11 14:01:28 -04:00
Sanjay Patel	d428f09b2c	[InstCombine] freeze operand in urem expansion As discussed in issue #37809, this transform is not safe if the input is an undefined value. There is no difference in codegen on the basic examples, but this could lead to regressions. We may need to improve freeze analysis or lowering if that happens.	2022-05-11 12:47:26 -04:00
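
Several of the integer folds listed above are stated as algebraic identities backed by Alive2 proofs. The following is a minimal brute-force sketch, not LLVM or Alive2 code, that checks four of them exhaustively at a small bit width. The function names, the 6-bit width, and the way preconditions are modelled (skipping inputs where a shift or multiply would wrap, as a stand-in for `nuw`) are assumptions made for illustration; the exact preconditions used by each patch are not spelled out in the log. The `nsw`-based variant from `bff1f8c79b` is not covered.

```python
WIDTH = 6                     # small width so exhaustive search stays fast
MASK = (1 << WIDTH) - 1       # all-ones value for WIDTH-bit integers


def to_signed(v):
    """Interpret a WIDTH-bit unsigned value as two's complement."""
    return v - (1 << WIDTH) if v >> (WIDTH - 1) else v


def check_mul_incremented_shl():
    # X * ((1 << Z) + 1) --> (X << Z) + X, arithmetic modulo 2**WIDTH
    for x in range(1 << WIDTH):
        for z in range(WIDTH):
            before = (x * (((1 << z) + 1) & MASK)) & MASK
            after = (((x << z) & MASK) + x) & MASK
            if before != after:
                return False
    return True


def check_udiv_common_shl(require_nuw=True):
    # (X << Z) u/ (Y << Z) --> X u/ Y
    # require_nuw=True skips inputs where either shift would wrap
    # (an assumed model of the no-unsigned-wrap precondition).
    for x in range(1 << WIDTH):
        for y in range(1, 1 << WIDTH):      # Y == 0 would make X u/ Y undefined
            for z in range(WIDTH):
                if require_nuw and ((x << z) > MASK or (y << z) > MASK):
                    continue
                num = (x << z) & MASK
                den = (y << z) & MASK
                if den == 0:
                    continue                # division by zero is undefined
                if num // den != x // y:
                    return False
    return True


def check_udiv_hidden_factor():
    # (X * Y) u/ (X << Z) --> Y u>> Z, assuming neither mul nor shl wraps
    for x in range(1, 1 << WIDTH):          # X == 0 would divide by zero
        for y in range(1 << WIDTH):
            for z in range(WIDTH):
                if x * y > MASK or (x << z) > MASK:
                    continue
                if (x * y) // (x << z) != y >> z:
                    return False
    return True


def check_mul_signbit_splat():
    # (ashr X, WIDTH-1) * C --> (X < 0) ? -C : 0
    for x in range(1 << WIDTH):
        # Python's >> on a negative int is an arithmetic shift
        splat = (to_signed(x) >> (WIDTH - 1)) & MASK
        for c in range(1 << WIDTH):
            before = (splat * c) & MASK
            after = (-c) & MASK if to_signed(x) < 0 else 0
            if before != after:
                return False
    return True


if __name__ == "__main__":
    print("mul incremented shl:", check_mul_incremented_shl())
    print("udiv common shl (no wrap):", check_udiv_common_shl(True))
    print("udiv common shl (wrap allowed):", check_udiv_common_shl(False))
    print("udiv hidden factor:", check_udiv_hidden_factor())
    print("mul signbit splat:", check_mul_signbit_splat())
```

Calling `check_udiv_common_shl(require_nuw=False)` fails because a wrapping shift can discard high bits of the dividend (at 6 bits, X = 32, Y = 1, Z = 1 gives 0 u/ 2 instead of 32 u/ 1). This illustrates why folds of this kind carry no-wrap constraints. A check at one small width is only a sanity test and does not replace the general proofs linked in the commit messages.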

396 Commits