clang-p2996

Author	SHA1	Message	Date
Evgeny Stupachenko	204ade4102	Add early exit on reassociation of 0 expression. Summary: Before the patch, an attempt to reassociate ((v * 16) * 0) * 1 fell into an infinite loop. Reviewers: pankajchawla Differential Revision: http://reviews.llvm.org/D41467 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 326861	2018-03-07 02:17:08 +00:00
Eugene Zelenko	e2fc88a2fe	[Transforms] Add missing header for InstructionCombining.cpp, in order to export LLVMInitializeInstCombine as extern "C". Fixes PR35947. Patch by Brenton Bostick. Differential revision: https://reviews.llvm.org/D44140 llvm-svn: 326843	2018-03-06 23:06:13 +00:00
Sebastian Pop	bf6e1c26cf	DA: remove uses of GEP, only ask SCEV It's been quite some time the Dependence Analysis (DA) is broken, as it uses the GEP representation to "identify" multi-dimensional arrays. It even wrongly detects multi-dimensional arrays in single nested loops: from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6 ;; for (long int i = 0; i < 50; i++) { ;; A[i][3*i - 6] = i; ;; *B++ = A[i][i]; DA used to detect two subscripts, which makes no sense in the LLVM IR or in C/C++ semantics, as there are no guarantees as in Fortran of subscripts not overlapping into a next array dimension: maximum nesting levels = 1 SrcPtrSCEV = %A DstPtrSCEV = %A using GEPs subscript 0 src = {0,+,1}<nuw><nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} subscript 1 src = {-6,+,3}<nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} Separable = {} Coupled = {1} With the current patch, DA will correctly work on only one dimension: maximum nesting levels = 1 SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body> DstSCEV = {%A,+,404}<%for.body> subscript 0 src = {(-2424 + %A)<nsw>,+,1212}<%for.body> dst = {%A,+,404}<%for.body> class = 1 loops = {1} Separable = {0} Coupled = {} This change removes all uses of GEP from DA, and we now only rely on the SCEV representation. The patch does not turn on -da-delinearize by default, and so the DA analysis will be more conservative in the case of multi-dimensional memory accesses in nested loops. I disabled some interchange tests, as the DA is not able to disambiguate the dependence anymore. To make DA stronger, we may need to compute a bound on the number of iterations based on the access functions and array dimensions. The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to avoid checking for snippets of LLVM IR: this form of checking is very hard to maintain. Instead, we now check for output of the pass that are more meaningful than dozens of lines of LLVM IR. Some tests now require -debug messages and thus only enabled with asserts. Patch written by Sebastian Pop and Aditya Kumar. Differential Revision: https://reviews.llvm.org/D35430 llvm-svn: 326837	2018-03-06 21:55:59 +00:00
Sanjay Patel	1f2f5d18d3	[InstCombine] simplify min/max canonicalization; NFCI llvm-svn: 326828	2018-03-06 19:01:18 +00:00
Sanjay Patel	7ed0bc26ac	[ValueTracking] move helpers for SelectPatterns from InstCombine to ValueTracking Most of the folds based on SelectPatternResult belong in InstSimplify rather than InstCombine, so the helper code should be available to other passes/analysis. llvm-svn: 326812	2018-03-06 16:57:55 +00:00
Florian Hahn	517dc51c48	[CallSiteSplitting] Do not crash when BB's terminator changes. Change doCallSiteSplitting to iterate until we reach the terminator instruction. tryToSplitCallSite can replace BB's terminator in case BB is a successor of itself. Then IE will be invalidated and we also have to check the current terminator. Reviewers: junbuml, davidxl, davide, fhahn Reviewed By: fhahn, junbuml Differential Revision: https://reviews.llvm.org/D43824 llvm-svn: 326793	2018-03-06 14:00:58 +00:00
Florian Hahn	f0a25f7253	[CloneFunction] Support BB == PredBB in DuplicateInstructionsInSplit. In case PredBB == BB and StopAt == BB's terminator, StopAt != &*BI will fail, because BB's terminator instruction gets replaced. By using BB.getTerminator() we get the current terminator which we can use to compare. Reviewers: sanjoy, anna, reames Reviewed By: anna Differential Revision: https://reviews.llvm.org/D43822 llvm-svn: 326779	2018-03-06 13:12:32 +00:00
Xin Tong	8fd561f572	[MergeICmp] Simplify how BCECmpBlock instructions are blacklisted llvm-svn: 326761	2018-03-06 02:24:02 +00:00
Xin Tong	98af9efca5	[MergeICmp] Fix printing. NFC llvm-svn: 326760	2018-03-06 02:04:57 +00:00
Daniel Neilson	82daad31fe	[RewriteStatepoints] Fix stale parse points Summary: RewriteStatepointsForGC collects parse points for further processing. During the collection if a callsite is found in an unreachable block (DominatorTree::isReachableFromEntry()) then all unreachable blocks are removed by removeUnreachableBlocks(). Some of the removed blocks could have been reachable according to DominatorTree::isReachableFromEntry(). In this case the collected parse points became stale and resulted in a crash when accessed. The fix is to unconditionally canonicalize the IR to removeUnreachableBlocks and then collect the parse points. The added test crashes with the old version and passes with this patch. Patch by Yevgeny Rouban! Reviewed by: Anna Differential Revision: https://reviews.llvm.org/D43929 llvm-svn: 326748	2018-03-05 22:27:30 +00:00
Daniel Neilson	bdda115e19	[InstCombine] Don't blow up in foldICmpWithCastAndCast on vector icmp instructions. Summary: Presently, InstCombiner::foldICmpWithCastAndCast() implicitly assumes that it is only invoked with icmp instructions of integer type. If that assumption is broken, and it is called with an icmp of vector type, then it fails (asserts/crashes). This patch addresses the deficiency. It allows it to simplify icmp (ptrtoint x), (ptrtoint/c) of vector type into a compare of the inputs, much as is done when the type is integer. Reviewers: apilipenko, fedor.sergeev, mkazantsev, anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44063 llvm-svn: 326730	2018-03-05 18:05:51 +00:00
Craig Topper	8452faceae	[InstCombine] Add constant vector support to getMinimumFPType for visitFPTrunc. This patch teaches getMinimumFPType to support shrinking a vector of ConstantFPs. This should improve our ability to combine vector fptrunc with fp binops. Differential Revision: https://reviews.llvm.org/D43774 llvm-svn: 326729	2018-03-05 18:04:12 +00:00
Florian Hahn	0b7c6422fb	[IPSCCP] Add getCompare which returns either true, false, undef or null. getCompare returns true, false or undef constants if the comparison can be evaluated, or nullptr if it cannot. This is in line with what ConstantExpr::getCompare returns. It also allows us to use ConstantExpr::getCompare for comparing constants. Reviewers: davide, mssimpso, dberlin, anna Reviewed By: davide Differential Revision: https://reviews.llvm.org/D43761 llvm-svn: 326720	2018-03-05 17:33:50 +00:00
Sanjay Patel	53ffabdfcb	[CVP] fix formatting; NFC llvm-svn: 326711	2018-03-05 16:08:34 +00:00
Xin Tong	8345c0e3a5	[MergeICmp] We can discard initial blocks that do other work Summary: We can discard initial blocks that do other work We do not need to limit ourselves to just the first block in the chain. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44029 llvm-svn: 326698	2018-03-05 13:54:47 +00:00
Clement Courbet	34be1b0288	[MergeICmps][NFC] Improve logging. llvm-svn: 326683	2018-03-05 08:21:47 +00:00
Fedor Indutny	364b9c2adb	[CallSiteSplitting] fix use after-free Iterating through predecessors of `TailBB` while removing their terminators leads to use after-free, because the predecessor list is changing on each removal. llvm-svn: 326668	2018-03-03 22:34:38 +00:00
Fedor Indutny	f9e09c1dd0	[CallSiteSplitting] properly split musttail calls Summary: `musttail` calls can't be naively split. The split blocks must include not only the call instruction itself, but also (optional) `bitcast` and `return` instructions that follow it. Clone `bitcast` and `ret`, place them into the split blocks, and remove the tail block when done. Reviewers: junbuml, mcrosier, davidxl, davide, fhahn Reviewed By: fhahn Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D43729 llvm-svn: 326666	2018-03-03 21:40:14 +00:00
Sanjay Patel	1a8d5c3d1f	[InstCombine] (~X) - (~Y) --> Y - X llvm-svn: 326660	2018-03-03 17:53:25 +00:00
Chandler Carruth	a4619d9944	[ThinLTO] Revert r325320: Import global variables This caused some links to fail with ThinLTO due to missing symbols as well as causing some binaries to have failures at runtime. We're working with the author to get a test case, but want to get the tree green again. Further, it appears to introduce a data race. While the test usage of threads was disabled in r325361 & r325362, that isn't an acceptable fix. I've reverted both of these as well. This code needs to be thread safe. Test cases for this are already on the original commit thread. llvm-svn: 326638	2018-03-02 23:40:08 +00:00
Vedant Kumar	7fc591f8bb	[AggressiveInstCombine] Use use_empty() instead of !getNumUses(), NFC use_empty() runs in O(1), whereas getNumUses() runs in O(# uses). llvm-svn: 326635	2018-03-02 23:22:49 +00:00
Sanjay Patel	e29375d04c	[InstCombine] rearrange visitFMul; NFCI Put the simplest non-FMF folds first, so it's easier to see what's left to fix/group/add with the FMF folds. llvm-svn: 326632	2018-03-02 23:06:45 +00:00
Vedant Kumar	f69baf64eb	[Utils] Salvage debug info in block simplification In stage2 -O3 builds of llc, this results in small but measurable increases in the number of variables with locations, and in the number of unique source variables overall. (According to llvm-dwarfdump --statistics, there are 123 additional variables with locations, which is just a 0.006% improvement). The size of the .debug_loc section of the llc dsym increases by 0.004%. llvm-svn: 326629	2018-03-02 22:46:48 +00:00
Vedant Kumar	334fa57456	[Utils] Salvage debug info in recursive inst deletion In stage2 -O3 builds of llc, this results in a 0.3% increase in the number of variables with locations, and a 0.2% increase in the number of unique source variables overall. The size of the .debug_loc section of the llc dsym increases by 0.5%. llvm-svn: 326621	2018-03-02 21:36:35 +00:00
Craig Topper	c7461e1aad	[InstCombine] Rewrite the binary op shrinking in visitFPTrunc to avoid creating overly small ConstantFPs that we'll just need to extend again. Instead of returning the smaller FP constant we now return the minimal Type the constant can fit into. We also return the Type of the input to any fp extends. The legality checks are then done on just the size of these Types. If we find something profitable we then emit FPTruncs in front of the smaller binop and assume those FPTruncs will be constant folded or combined with any ConstantFPs or fpextends. Differential Revision: https://reviews.llvm.org/D44038 llvm-svn: 326617	2018-03-02 21:25:18 +00:00
Sanjay Patel	2fd0acf05a	[InstCombine] partly fix FMF for fmul+log2 fold The code was checking that all of the instructions in the sequence are 'fast', but that's not necessary. The final multiply is all that we need to check (tests adjusted). The fmul doesn't need to be fully 'fast' either, but that can be another patch. llvm-svn: 326608	2018-03-02 20:32:46 +00:00
Yaxun Liu	3c42f1c3c9	LoopUnroll: respect pragma unroll when AllowRemainder is disabled Currently when AllowRemainder is disabled, pragma unroll count is not respected even though there is no remainder. This bug causes a loop to be fully unrolled in many cases even though the user specifies an unroll count. Especially it affects OpenCL/CUDA since in many cases a loop contains convergent instructions and currently AllowRemainder is disabled for such loops. Differential Revision: https://reviews.llvm.org/D43826 llvm-svn: 326585	2018-03-02 16:22:32 +00:00
Florian Hahn	515acd64fd	[LV][CFG] Add irreducible CFG detection for outer loops This patch adds support for detecting outer loops with irreducible control flow in LV. Current detection uses SCCs and only works for innermost loops. This patch adds a utility function that works on any CFG, given its RPO traversal and its LoopInfoBase. This function is a generalization of isIrreducibleCFG from lib/CodeGen/ShrinkWrap.cpp. The code in lib/CodeGen/ShrinkWrap.cpp is also updated to use the new generic utility function. Patch by Diego Caballero <diego.caballero@intel.com> Differential Revision: https://reviews.llvm.org/D40874 llvm-svn: 326568	2018-03-02 12:24:25 +00:00
Fedor Indutny	1571b1271e	[ArgumentPromotion] don't break musttail invariant PR36543 Summary: Do not break musttail invariant by promoting arguments of musttail callee or caller. Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv, fhahn, rnk Reviewed By: rnk Subscribers: rnk, llvm-commits Differential Revision: https://reviews.llvm.org/D43926 llvm-svn: 326521	2018-03-02 00:59:27 +00:00
Sanjay Patel	d0cdb2f861	[InstCombine] allow fmul fold with less than 'fast' This is a retry of r326502 with updates to the reassociate test file that I missed the first time. @test15_reassoc in the supposed -reassociate test file (except that it tests 2 other passes too...) shows that there's no clear responsibility for reassociation transforms. Instcombine now gets that case, but only because the constant values are identical. Otherwise, it would still miss that pattern. Reassociate doesn't get that case because it hasn't been updated to use less than 'fast' FMF. llvm-svn: 326513	2018-03-02 00:14:51 +00:00
Sanjay Patel	eb5d046890	revert r326502: [InstCombine] allow fmul fold with less than 'fast' I forgot that I added tests for 'reassoc' to -reassociate, but surprisingly that file calls -instcombine too, so it is affected. I'll update that file and try again. llvm-svn: 326510	2018-03-01 23:39:24 +00:00
Sanjay Patel	7373ae5c9a	[InstCombine] allow fmul fold with less than 'fast' llvm-svn: 326502	2018-03-01 22:53:47 +00:00
Craig Topper	2915bc0046	[SimplifyLibCalls] Update an obviously copy and pasted header comment to match this file. NFC llvm-svn: 326475	2018-03-01 20:05:09 +00:00
Sanjay Patel	f3b1af7aa4	[InstCombine] simplify code for (X*Y) * X => (X*X) * Y ; NFCI llvm-svn: 326444	2018-03-01 15:50:26 +00:00
Benjamin Kramer	d1cf7ff5ab	[SCCP] Fix unused variable warning in release builds. llvm-svn: 326429	2018-03-01 11:31:44 +00:00
Reid Kleckner	3762a089d7	[IPSCCP] do not break musttail invariant (PR36485) Do not replace results of `musttail` calls with a constant if the call itself can't be removed. Do not zap returns of `musttail` callees, if the call site can't be removed and replaced with a constant. Do not zap returns of `musttail`-calling blocks, this breaks invariant too. Patch by Fedor Indutny Differential Revision: https://reviews.llvm.org/D43695 llvm-svn: 326404	2018-03-01 01:19:18 +00:00
Reid Kleckner	cb9611ca67	[DAE] don't remove args of musttail target/caller `musttail` requires identical signatures of caller and callee. Removing arguments breaks `musttail` semantics. PR36441 Patch by Fedor Indutny Differential Revision: https://reviews.llvm.org/D43708 llvm-svn: 326394	2018-03-01 00:09:35 +00:00
Sanjay Patel	eaf5a120ed	[InstCombine] simplify code for X * -1.0 --> -X; NFC I've added random FMF to one of the tests to show those are propagated. llvm-svn: 326377	2018-02-28 22:30:04 +00:00
Jonas Devlieghere	9ca064552a	[GlobalOpt] don't change CC of musttail calle(e|r) When the function has musttail call - its cc is fixed to be equal to the cc of the musttail callee. In such case (and in the case of the musttail callee), GlobalOpt should not change the cc to fastcc as it will break the invariant. This fixes PR36546 Patch by: Fedor Indutny (indutny) Differential revision: https://reviews.llvm.org/D43859 llvm-svn: 326376	2018-02-28 22:28:44 +00:00
Craig Topper	b95298b041	[InstCombine] Split the FP constant code out of lookThroughFPExtensions and use nullptr as a sentinel Currently this code's control flow very much assumes that there are no meaningful checks after determining that it's a ConstantFP. So whenever it wants to stop it just does "return V". But V is also the variable name it uses when it wants to return a new value. So 'return V' appears multiple times with different meanings. This patch just moves all the code into a helper function and returns nullptr when it wants to stop. I've split this from D43774 while I try to figure out how to best handle the vector case there. But this change by itself at least seemed like a readability improvement. Differential Revision: https://reviews.llvm.org/D43833 llvm-svn: 326361	2018-02-28 20:14:34 +00:00
Vedant Kumar	9a041a7522	[InstrProfiling] Emit the runtime hook when no counters are lowered The API verification tool tapi has difficulty processing frameworks which enable code coverage, but which have no code. The profile lowering pass does not emit the runtime hook in this case because no counters are lowered. While the hook is not needed for program correctness (the profile runtime doesn't have to be linked in), it's needed to allow tapi to validate the exported symbol set of instrumented binaries. It was not possible to add a workaround in tapi for empty binaries due to an architectural issue: tapi generates its expected symbol set before it inspects a binary. Changing that model has a higher cost than simply forcing llvm to always emit the runtime hook. rdar://36076904 Differential Revision: https://reviews.llvm.org/D43794 llvm-svn: 326350	2018-02-28 19:00:08 +00:00
Sanjay Patel	b3f4f62698	[InstCombine] move invariant call out of loop; NFC We really shouldn't need a 2-loop here at all, but that's another cleanup. llvm-svn: 326330	2018-02-28 16:50:51 +00:00
Sanjay Patel	8fdd87f929	[InstCombine] move constant check into foldBinOpIntoSelectOrPhi; NFCI Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly vague. The constant check is redundant in some cases, but it allows removing duplication for most of the calls. llvm-svn: 326329	2018-02-28 16:36:24 +00:00
Xin Tong	256869d8bc	Fix typo. NFC llvm-svn: 326319	2018-02-28 12:09:53 +00:00
Xin Tong	8ba674e43b	[MergeICmp] Fix a bug in MergeICmp that can lead to a block being processed more than once. Summary: Fix a bug in MergeICmp that can lead to a BCECmp block being processed more than once and eventually lead to a broken LLVM module. The problem is that if the non-constant value is not produced by the last block, the producer will be processed once when its parent block is processed and a second time when the last block is processed. We end up having 2 identical BCECmpBlocks in the merge queue, which eventually leads to a broken LLVM module. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43825 llvm-svn: 326318	2018-02-28 12:08:00 +00:00
David Green	7c35de124a	[Dominators] Remove verifyDomTree and add some verifying for Post Dom Trees Removes verifyDomTree, using assert(verify()) everywhere instead, and changes verify a little to always run IsSameAsFreshTree first in order to print good output when we find errors. Also adds verifyAnalysis for PostDomTrees, which will allow checking of PostDomTrees in the same way we check DomTrees and MachineDomTrees. Differential Revision: https://reviews.llvm.org/D41298 llvm-svn: 326315	2018-02-28 11:00:08 +00:00
Florian Hahn	1807c516c7	[NewGVN] Update phi-of-ops def block when updating existing ValuePHI. In case we update a ValuePHI node created earlier, we could update it based on a different OpPHI which could be in a different block. We need to update the TempToBlock mapping reflecting the new block, otherwise we would end up placing the new phi node in a wrong block. This problem is exposed by the test case in https://bugs.llvm.org/show_bug.cgi?id=36504. This patch fixes a slightly simpler problem than in the bug report. In the bug's reproducer, the additional problem is that we are re-using a ValuePHI node with too few incoming values for the new OpPHI. If this patch makes sense, I will follow it up with a patch that creates a new PHI node if the existing PHI node has a different number of incoming values. Reviewers: davide, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43770 llvm-svn: 326181	2018-02-27 09:34:51 +00:00
Sanjay Patel	31a90468e1	[InstCombine] allow fdiv folds with less than fully 'fast' ops Note: gcc appears to allow this fold with -freciprocal-math alone, but clang/llvm require more than that with this patch. The wording in the definitions seems fuzzy enough that it could go either way, but we'll err on the conservative side of FMF interpretation. This patch also changes the newly created fmul to have FMF propagated by the last fdiv rather than intersecting the FMF of the fdivs. This matches the behavior of other folds near here. The new fmul is only used to produce an intermediate op for the final fdiv result, so it shouldn't be any stricter than that result. The previous behavior could result in dropping FMF via other folds in instcombine or CSE. Differential Revision: https://reviews.llvm.org/D43398 llvm-svn: 326098	2018-02-26 16:02:45 +00:00
Renato Golin	9d1b2acaaa	[LV] Move isLegalMasked* functions from Legality to CostModel All SIMD architectures can emulate masked load/store/gather/scatter through element-wise condition check, scalar load/store, and insert/extract. Therefore, bailing out of vectorization as legality failure, when they return false, is incorrect. We should proceed to cost model and determine profitability. This patch is to address the vectorizer's architectural limitation described above. As such, I tried to keep the cost model and vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning should be done separately. Please see http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for RFC and the discussions. Closes D43208. Patch by: Hideki Saito <hideki.saito@intel.com> llvm-svn: 326079	2018-02-26 11:06:36 +00:00
Florian Hahn	a1822cbabc	[LoopInterchange] Loops with empty dependency matrix are safe. The dependency matrix is only empty if no conflicting load/store instructions have been found. In that case, it is safe to interchange. For the LLVM test-suite, after this change around 1900 loops are interchanged, whereas it is 15 before this change. On cortex-a57, this gives an improvement of -0.57% on the geomean execution time of SPEC2006, SPEC2000 and the test-suite. There are a few small perf regressions, but I think we can improve on those by making the cost model better. Reviewers: karthikthecool, mcrosier Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D43236 llvm-svn: 326077	2018-02-26 10:45:25 +00:00
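
Note on 1a8d5c3d1f, (~X) - (~Y) --> Y - X: the fold follows from the two's complement identity ~X = -X - 1, so (~X) - (~Y) = (-X - 1) - (-Y - 1) = Y - X. The sketch below checks this exhaustively for 8-bit values. It is only an illustration of the arithmetic, not the InstCombine code; the function names (`sub_of_nots`, `folded`, `fold_holds_for_all_u8`) and the choice of 8-bit width are assumptions made here for the example.

```cpp
#include <cstdint>

// Hypothetical helper: the unfolded expression (~X) - (~Y),
// evaluated in wrapping 8-bit arithmetic.
constexpr std::uint8_t sub_of_nots(std::uint8_t x, std::uint8_t y) {
    return static_cast<std::uint8_t>(static_cast<std::uint8_t>(~x) -
                                     static_cast<std::uint8_t>(~y));
}

// Hypothetical helper: the folded expression Y - X, also wrapping at 8 bits.
constexpr std::uint8_t folded(std::uint8_t x, std::uint8_t y) {
    return static_cast<std::uint8_t>(y - x);
}

// Compares both forms over all 256 * 256 pairs of 8-bit inputs.
bool fold_holds_for_all_u8() {
    for (unsigned x = 0; x < 256; ++x) {
        for (unsigned y = 0; y < 256; ++y) {
            const auto a = static_cast<std::uint8_t>(x);
            const auto b = static_cast<std::uint8_t>(y);
            if (sub_of_nots(a, b) != folded(a, b)) {
                return false;
            }
        }
    }
    return true;
}
```

Calling `fold_holds_for_all_u8()` from a `main` is enough to run the check. The same reasoning applies at any integer bit width, since both sides wrap modulo 2^n.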


20610 Commits