clang-p2996

Author	SHA1	Message	Date
Roman Lebedev	6ff633ddc4	[NFC][InstCombine] Revisit tests in shift-amount-reassociation-with-truncation-shl.ll llvm-svn: 367196	2019-07-28 21:31:58 +00:00
Sanjay Patel	99c57c6daf	[InstCombine] fold fsub+fneg with fdiv/fmul between The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367194	2019-07-28 17:10:06 +00:00
Roman Lebedev	d5bc4b09f1	[NFC][InstCombine] Shift amount reassociation: can have trunc between shl's https://rise4fun.com/Alive/OQbM Not so simple for lshr/ashr, so those maybe later. https://bugs.llvm.org/show_bug.cgi?id=42391 llvm-svn: 367189	2019-07-28 13:13:46 +00:00
Hideto Ueno	e7bea9b73a	[Attributor] Deduce "align" attribute Summary: Deduce "align" attribute in attributor. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64152 llvm-svn: 367187	2019-07-28 07:04:01 +00:00
Hideto Ueno	cc0a4cdc89	[FunctionAttrs] Annotate "willreturn" for intrinsics Summary: In D62801, new function attribute `willreturn` was introduced. In short, a function with `willreturn` is guaranteed to come back to the call site(more precise definition is in LangRef). In this patch, willreturn is annotated for LLVM intrinsics. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: jvesely, nhaehnle, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64904 llvm-svn: 367184	2019-07-28 06:09:56 +00:00
Sanjay Patel	02b9e45a7e	[InstSimplify] remove quadratic time looping (PR42771) The test case from: https://bugs.llvm.org/show_bug.cgi?id=42771 ...shows a ~30x slowdown caused by the awkward loop iteration (rL207302) that is seemingly done just to avoid invalidating the instruction iterator. We can instead delay instruction deletion until we reach the end of the block (or we could delay until we reach the end of all blocks). There's a test diff here for a degenerate case with llvm.assume that is not meaningful in itself, but serves to verify this change in logic. This change probably doesn't result in much overall compile-time improvement because we call '-instsimplify' as a standalone pass only once in the standard -O2 opt pipeline currently. Differential Revision: https://reviews.llvm.org/D65336 llvm-svn: 367173	2019-07-27 14:05:51 +00:00
Sanjay Patel	d20a0fe203	[InstCombine] add tests for fsub with negated operand; NFC llvm-svn: 367156	2019-07-26 21:12:22 +00:00
Wei Mi	55a68a2400	[JumpThreading] Stop searching predecessor when the current bb is in a unreachable loop. updatePredecessorProfileMetadata in jumpthreading tries to find the first dominating predecessor block for a PHI value by searching upwards the predecessor block chain. But jumpthreading may see some temporary IR state which contains unreachable bb not being cleaned up. If an unreachable loop happens to be on the predecessor block chain, keeping chasing the predecessor block will run into an infinite loop. The patch fixes it. Differential Revision: https://reviews.llvm.org/D65310 llvm-svn: 367154	2019-07-26 20:59:22 +00:00
Sanjay Patel	a9ab31558c	[InstCombine] canonicalize negated operand of fdiv This is a transform that we use with fmul, so use it for fdiv too for consistency. llvm-svn: 367146	2019-07-26 19:56:59 +00:00
Sanjay Patel	487e957775	[InstCombine] add tests for fdiv with negated operand; NFC llvm-svn: 367145	2019-07-26 19:44:53 +00:00
Sanjay Patel	c229cfeb7a	[InstCombine] remove flop from lerp patterns (Y * (1.0 - Z)) + (X * Z) --> Y - (Y * Z) + (X * Z) --> Y + Z * (X - Y) This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=42716 Factoring eliminates an instruction, so that should be a good canonicalization. The potential conversion to FMA would be handled by the backend based on target capabilities. Differential Revision: https://reviews.llvm.org/D65305 llvm-svn: 367101	2019-07-26 11:19:18 +00:00
Sanjay Patel	8f15d40555	[InstCombine] add tests for lerp patterns (PR42716); NFC llvm-svn: 367069	2019-07-25 22:25:21 +00:00
Florian Hahn	c74808b914	[PredicateInfo] Replace pointer comparisons with deterministic compares. Currently there are a few pointer comparisons in ValueDFS_Compare, which can cause non-deterministic ordering when materializing values. There are 2 cases this patch fixes: 1. Order defs before uses used to compare pointers, which guarantees defs before uses, but causes non-deterministic ordering between 2 uses or 2 defs, depending on the allocation order. By converting the pointers to booleans, we can circumvent that problem. 2. comparePHIRelated was comparing the basic block pointers of edges, which also results in a non-deterministic order and is also not really meaningful for ordering. By ordering by their destination DFS numbers we guarantee a deterministic order. For the example below, we can end up with 2 different uselist orderings, when running `opt -mem2reg -ipsccp` hundreds of times. Because the non-determinism is caused by allocation ordering, we cannot reproduce it with ipsccp alone. declare i32 @hoge() local_unnamed_addr #0 define dso_local i32 @ham(i8* %arg, i8* %arg1) #0 { bb: %tmp = alloca i32 %tmp2 = alloca i32, align 4 br label %bb19 bb4: ; preds = %bb20 br label %bb6 bb6: ; preds = %bb4 %tmp7 = call i32 @hoge() store i32 %tmp7, i32* %tmp %tmp8 = load i32, i32* %tmp %tmp9 = icmp eq i32 %tmp8, 912730082 %tmp10 = load i32, i32* %tmp br i1 %tmp9, label %bb11, label %bb16 bb11: ; preds = %bb6 unreachable bb13: ; preds = %bb20 br label %bb14 bb14: ; preds = %bb13 %tmp15 = load i32, i32* %tmp br label %bb16 bb16: ; preds = %bb14, %bb6 %tmp17 = phi i32 [ %tmp10, %bb6 ], [ 0, %bb14 ] br label %bb19 bb18: ; preds = %bb20 unreachable bb19: ; preds = %bb16, %bb br label %bb20 bb20: ; preds = %bb19 indirectbr i8* null, [label %bb4, label %bb13, label %bb18] } Reviewers: davide, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D64866 llvm-svn: 367049	2019-07-25 20:48:13 +00:00
Roman Lebedev	aa205957ff	[NFC][DivRemPairs] Tests with rem in expanded form (PR42673) As discussed in https://bugs.llvm.org/show_bug.cgi?id=42673 there is a TTI hook hasDivRemOp() that matters here. While -div-rem-pairs will decompose 'rem' if that hook returns false, nothing does the opposite transform. We can't to this in InstCombine, because it does not currently access TTI, and i'm not sure we should change that. We can't really do that in DAGCombine since it also currently does not access TTI. Therefore only DivRemPairs is left. https://bugs.llvm.org/show_bug.cgi?id=42673 llvm-svn: 367046	2019-07-25 20:26:34 +00:00
Serguei Katkov	cde00c02e1	[Loop Peeling] Fix idom detection algorithm. We'd like to determine the idom of exit block after peeling one iteration. Let Exit is exit block. Let ExitingSet - is a set of predecessors of Exit block. They are exiting blocks. Let Latch' and ExitingSet' are copies after a peeling. We'd like to find an idom'(Exit) - idom of Exit after peeling. It is an evident that idom'(Exit) will be the nearest common dominator of ExitingSet and ExitingSet'. idom(Exit) is a nearest common dominator of ExitingSet. idom(Exit)' is a nearest common dominator of ExitingSet'. Taking into account that we have a single Latch, Latch' will dominate Header and idom(Exit). So the idom'(Exit) is nearest common dominator of idom(Exit)' and Latch'. All these basic blocks are in the same loop, so what we find is (nearest common dominator of idom(Exit) and Latch)'. Reviewers: reames, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D65292 llvm-svn: 367044	2019-07-25 19:31:50 +00:00
Sanjay Patel	b456310902	[SimplifyCFG] avoid crashing after simplifying a switch (PR42737) Later code in TryToSimplifyUncondBranchFromEmptyBlock() assumes that we have cleaned up unreachable blocks, but that was not happening with this switch transform. llvm-svn: 367037	2019-07-25 17:01:12 +00:00
Vlad Tsyrklevich	5d5a58317c	Revert "[InstCombine] try to narrow a truncated load" This reverts commit `bc4a63fd3c`, this is a speculative revert to fix a number of sanitizer bots (like sanitizer-x86_64-linux-bootstrap-ubsan) that have started to see stage2 compiler crashes, presumably due to a miscompile. llvm-svn: 367029	2019-07-25 15:37:57 +00:00
Florian Hahn	c0d0e3bda8	[PredicateInfo] Use SmallVector instead of SmallPtrSet. We do not need the SmallPtrSet to avoid adding duplicates to OpsToRename, because we already keep a ValueInfo mapping. If we see an op for the first time, Infos will be empty and we can also add it to OpsToRename. We process operands by visiting BBs depth-first and then iterate over all instructions & users, so the order should be deterministic. Therefore we can skip one round of sorting, which we purely needed for guaranteeing a deterministic order when iterating over the SmallPtrSet. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D64816 llvm-svn: 367028	2019-07-25 15:35:10 +00:00
Sanjay Patel	bc4a63fd3c	[InstCombine] try to narrow a truncated load trunc (load X) --> load (bitcast X to narrow type) We have this transform in DAGCombiner::ReduceLoadWidth(), but the truncated load pattern can interfere with other instcombine transforms, so I'd like to allow the fold sooner. Example: https://bugs.llvm.org/show_bug.cgi?id=16739 ...in that report, we have bitcasts bracketing these ops, so those could get eliminated too. We've generally ruled out widening of loads early in IR ( LoadCombine - http://lists.llvm.org/pipermail/llvm-dev/2016-September/105291.html ), but that reasoning may not apply to narrowing if we can preserve information such as the dereferenceable range. Differential Revision: https://reviews.llvm.org/D64432 llvm-svn: 367011	2019-07-25 12:14:27 +00:00
Craig Topper	e9abc8177a	[InstCombine] Teach foldOrOfICmps to allow icmp eq MIN_INT/MAX to be part of a range comparision. Similar for foldAndOfICmps We can treat icmp eq X, MIN_UINT as icmp ule X, MIN_UINT and allow it to merge with icmp ugt X, C. Similar for the other constants. We can do simliar for icmp ne X, (U)INT_MIN/MAX in foldAndOfICmps. And we already handled UINT_MIN there. Fixes PR42691. Differential Revision: https://reviews.llvm.org/D65017 llvm-svn: 366945	2019-07-24 20:57:29 +00:00
David Bolvansky	db913d9618	[InstCombine] Adjusted pow-exp tests for Windows [NFC] Summary: https://bugs.llvm.org/show_bug.cgi?id=42740 Reviewers: efriedma, hans Reviewed By: hans Subscribers: spatel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65220 llvm-svn: 366925	2019-07-24 17:01:20 +00:00
Matt Arsenault	0b7f226311	AMDGPU: Fix test after r366913 llvm-svn: 366916	2019-07-24 16:05:55 +00:00
Sanjay Patel	3624074426	[InstCombine] add tests for load narrowing; NFC Baseline results for D64432. llvm-svn: 366901	2019-07-24 12:44:21 +00:00
Petr Hosek	8b161bacf4	[SafeStack] Insert the deref before remaining elements This is a follow up to D64971. While we need to insert the deref after the offset, it needs to come before the remaining elements in the original expression since the deref needs to happen before the LLVM fragment if present. Differential Revision: https://reviews.llvm.org/D65172 llvm-svn: 366865	2019-07-24 00:16:23 +00:00
Philip Reames	ea5c94b497	[IndVars] Fix a subtle bug in optimizeLoopExits The original code failed to account for the fact that one exit can have a pointer exit count without all of them having pointer exit counts. This could cause two separate bugs: 1) We might exit the loop early, and leave optimizations undone. This is what triggered the assertion failure in the reported test case. 2) We might optimize one exit, then exit without indicating a change. This could result in an analysis invalidaton bug if no other transform is done by the rest of indvars. Note that the pointer exit counts are a really fragile concept. They show up only when we have a pointer IV w/o a datalayout to provide their size. It's really questionable to me whether the complexity implied is worth it. llvm-svn: 366829	2019-07-23 17:45:11 +00:00
Roman Lebedev	402bf28ecc	[NFC][InstCombine] Fixup commutative/negative tests with icmp preds in @llvm.umul.with.overflow tests llvm-svn: 366802	2019-07-23 12:42:57 +00:00
Roman Lebedev	4153f17181	[InstSimplify][NFC] Tests for skipping 'div-by-0' checks before inverted @llvm.umul.with.overflow It would be already handled by the non-inverted case if we were hoisting the `not` in InstCombine, but we don't (granted, we don't sink it in this case either), so this is a separate case. llvm-svn: 366801	2019-07-23 12:42:49 +00:00
Roman Lebedev	87fdcb8749	[NFC][PhaseOredering][SimplifyCFG] Add more runlines to umul.with.overflow tests This way it will be more obvious that the problem is both in cost threshold and in hardcoded benefit check, plus will show how the instsimplify cleans this all in the end. llvm-svn: 366800	2019-07-23 12:42:41 +00:00
Hideto Ueno	19c07afe17	[Attributor] Deduce "dereferenceable" attribute Summary: Deduce dereferenceable attribute in Attributor. These will be added in a later patch. * dereferenceable(_or_null)_globally (D61652) * Deduction based on load instruction (similar to D64258) Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64876 llvm-svn: 366788	2019-07-23 08:16:17 +00:00
Hideto Ueno	2d654df763	[AMDGPU][NFC] Simplify test file for amdgcn intrinsics Summary: Remove unchecked attribute in the call site and use FileCheck String Substitution for `convergent` check. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64901 llvm-svn: 366781	2019-07-23 06:48:47 +00:00
Stefan Stipanovic	6058b86373	Fixing build error from commit `95cbc3d` [Attributor] Liveness analysis. Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored. Right now we are only looking at noreturn calls. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D64162 llvm-svn: 366769	2019-07-22 23:58:23 +00:00
Stefan Stipanovic	5a9ba27c71	Revert "Fixing build error from commit 9285295." This reverts commit `95cbc3da88`. llvm-svn: 366759	2019-07-22 22:55:05 +00:00
Stefan Stipanovic	95cbc3da88	Fixing build error from commit `9285295`. [Attributor] Liveness analysis. Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored. Right now we are only looking at noreturn calls. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential revision: https://reviews.llvm.org/D64162 llvm-svn: 366753	2019-07-22 22:10:59 +00:00
Roman Lebedev	0689427280	[InstSimplify][NFC] Tests for skipping 'div-by-0' checks before @llvm.umul.with.overflow These may remain after @llvm.umul.with.overflow was canonicalized from the code that was originally doing the check via division. llvm-svn: 366751	2019-07-22 22:09:11 +00:00
Roman Lebedev	1693b80bd5	[SimplifyCFG][NFC] Test that we fail to flatten CFG in JPEG "sign" value extend pattern This comes up in JPEG decoding, see e.g. Figure F.12 – Extending the sign bit of a decoded value in V of ITU T.81 (JPEG specification). llvm-svn: 366750	2019-07-22 22:09:02 +00:00
Roman Lebedev	fca23d74c9	[SimplifyCFG][NFC] Test that we fail to flatten CFG after forming @llvm.umul.with.overflow Even if we formed @llvm.umul.with.overflow, we are still stuck with that guard against div-by-zero, which is no longer needed, because we didn't flatten the CFG. llvm-svn: 366749	2019-07-22 22:08:55 +00:00
Roman Lebedev	77d37037f0	[InstCombine][NFC] Tests for canonicalization of unsigned multiply overflow check llvm-svn: 366748	2019-07-22 22:08:45 +00:00
Roman Lebedev	6b248fca33	[NFC][PhaseOrdering] Add tests showcasing the problems of unsigned multiply overflow check While we can form the @llvm.mul.with.overflow easily, we are still left with that check that was guarding against div-by-0. And in the second case we won't even flatten the CFG. llvm-svn: 366747	2019-07-22 22:08:35 +00:00
Roman Lebedev	d5a52aeab6	[IndVarSimplify][NFC] Autogenerate check lines in loop_evaluate_1.ll Being affected by upcoming patch. llvm-svn: 366746	2019-07-22 22:08:27 +00:00
Eric Christopher	77dc6d2479	Temporarily Revert "[Attributor] Liveness analysis." as it's breaking the build. This reverts commit `9285295f75`. llvm-svn: 366737	2019-07-22 21:04:23 +00:00
Stefan Stipanovic	9285295f75	[Attributor] Liveness analysis. Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored. Right now we are only looking at noreturn calls. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential revision: https://reviews.llvm.org/D64162 llvm-svn: 366736	2019-07-22 20:54:30 +00:00
Stefan Stipanovic	69ebb02001	[Attributor] NoAlias on return values. Porting function return value attribute noalias to attributor. This will be followed with a patch for callsite and function argumets. Reviewers: jdoerfert Subscribers: lebedev.ri, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D63067 llvm-svn: 366728	2019-07-22 19:36:27 +00:00
Petr Hosek	f6cd6ffbc9	[SafeStack] Insert the deref after the offset While debugging code that uses SafeStack, we've noticed that LLVM produces an invalid DWARF. Concretely, in the following example: int main(int argc, char* argv[]) { std::string value = ""; printf("%s\n", value.c_str()); return 0; } DWARF would describe the value variable as being located at: DW_OP_breg14 R14+0, DW_OP_deref, DW_OP_constu 0x20, DW_OP_minus The assembly to get this variable is: leaq -32(%r14), %rbx The order of operations in the DWARF symbols is incorrect in this case. Specifically, the deref is incorrect; this appears to be incorrectly re-inserted in repalceOneDbgValueForAlloca. With this change which inserts the deref after the offset instead of before it, LLVM produces correct DWARF: DW_OP_breg14 R14-32 Differential Revision: https://reviews.llvm.org/D64971 llvm-svn: 366726	2019-07-22 18:52:42 +00:00
Peter Collingbourne	ef5cfc2dae	WholeProgramDevirt: Teach the pass to respect the global's alignment. The bytes inserted before an overaligned global need to be padded according to the alignment set on the original global in order for the initializer to meet the global's alignment requirements. The previous implementation that padded to the pointer width happened to be correct for vtables on most platforms but may do the wrong thing if the vtable has a larger alignment. This issue is visible with a prototype implementation of HWASAN for globals, which will overalign all globals including vtables to 16 bytes. There is also no padding requirement for the bytes inserted after the global because they are never read from nor are they significant for alignment purposes, so stop inserting padding there. Differential Revision: https://reviews.llvm.org/D65031 llvm-svn: 366725	2019-07-22 18:50:45 +00:00
Peter Collingbourne	c3b8661df5	LowerTypeTests: Teach the pass to respect global alignments. We were previously ignoring alignment entirely when combining globals together in this pass. There are two main things that we need to do here: add additional padding before each global to meet the alignment requirements, and set the combined global's alignment to the maximum of all of the original globals' alignments. Since we now need to calculate layout as we go anyway, use the calculated layout to produce GlobalLayout instead of using StructLayout. Differential Revision: https://reviews.llvm.org/D65033 llvm-svn: 366722	2019-07-22 18:47:03 +00:00
Serguei Katkov	c6c31da867	[Loop Peeling] Fix the handling of branch weights of peeled off branches. Current algorithm to update branch weights of latch block and its copies is based on the assumption that number of peeling iterations is approximately equal to trip count. However it is not correct. According to profitability check in one case we can decide to peel in case it helps to reduce the number of phi nodes. In this case the number of peeled iteration can be less then estimated trip count. This patch introduces another way to set the branch weights to peeled of branches. Let F is a weight of the edge from latch to header. Let E is a weight of the edge from latch to exit. F/(F+E) is a probability to go to loop and E/(F+E) is a probability to go to exit. Then, Estimated TripCount = F / E. For I-th (counting from 0) peeled off iteration we set the the weights for the peeled latch as (TC - I, 1). It gives us reasonable distribution, The probability to go to exit 1/(TC-I) increases. At the same time the estimated trip count of remaining loop reduces by I. As a result after peeling off N iteration the weights will be (F - N * E, E) and trip count of loop becomes F / E - N or TC - N. The idea is taken from the review of the patch D63918 proposed by Philip. Reviewers: reames, mkuper, iajbar, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D64235 llvm-svn: 366665	2019-07-22 05:15:34 +00:00
Craig Topper	ee5dc7e7ad	[InstCombine] Add foldAndOfICmps test cases inspired by PR42691. icmp ne %x, INT_MIN can be treated similarly to icmp sgt %x, INT_MIN. icmp ne %x, INT_MAX can be treated similarly to icmp slt %x, INT_MAX. icmp ne %x, UINT_MAX can be treated similarly to icmp ult %x, UINT_MAX. We already treat icmp ne %x, 0 similarly to icmp ugt %x, 0 llvm-svn: 366662	2019-07-22 02:43:43 +00:00
Roman Lebedev	8a431874e9	[NFC][InstCombine] Add a few extra srem-by-power-of-two tests - extra uses llvm-svn: 366652	2019-07-21 09:05:49 +00:00
Roman Lebedev	a2dd672c5f	[NFC][InstCombine] Autogenerate a few tests llvm-svn: 366643	2019-07-20 21:34:00 +00:00
Roman Lebedev	056640f8b3	[NFC][InstCombine] Add srem-by-signbit tests - still can fold to bittest https://rise4fun.com/Alive/IIeS llvm-svn: 366642	2019-07-20 21:33:50 +00:00

1 2 3 4 5 ...

13062 Commits