clang-p2996

Author	SHA1	Message	Date
Reid Kleckner	fb502d2f5e	[IR] Make paramHasAttr to use arg indices instead of attr indices This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern. Previously we were testing return value attributes with index 0, so I introduced hasReturnAttr() for that use case. llvm-svn: 300367	2017-04-14 20:19:02 +00:00
Sanjay Patel	7cfe41659c	[InstCombine] (X != C1 && X != C2) --> (X \| (C1 ^ C2)) != C2 ...when C1 differs from C2 by one bit and C1 <u C2: http://rise4fun.com/Alive/Vuo And move related folds to a helper function. This reduces code duplication and will make it easier to remove the scalar-only restriction as a follow-up step. llvm-svn: 300364	2017-04-14 19:23:50 +00:00
Craig Topper	fb71b7d3e0	[InstCombine] Support folding a subtract with a constant LHS into a phi node We currently only support folding a subtract into a select but not a PHI. This fixes that. I had to fix an assumption in FoldOpIntoPhi that assumed the PHI node was always in operand 0. Now we pass it in like we do for FoldOpIntoSelect. But we still require some dancing to find the Constant when we create the BinOp or ConstantExpr. This is based code is similar to what we do for selects. Since I touched all call sites, this also renames FoldOpIntoPhi to foldOpIntoPhi to match coding standards. Differential Revision: https://reviews.llvm.org/D31686 llvm-svn: 300363	2017-04-14 19:20:12 +00:00
Craig Topper	c22c7b1459	[InstCombine] Refactor SimplifyUsingDistributiveLaws to more explicitly skip code when LHS/RHS aren't BinaryOperators Currently this code always makes 2 or 3 calls to tryFactorization regardless of whether the LHS/RHS are BinaryOperators. We make 3 calls when both operands are BinaryOperators with the same opcode. Or surprisingly, when neither are BinaryOperators. This is because getBinOpsForFactorization returns Instruction::BinaryOpsEnd when the operand is not a BinaryOperator. If both LHS and RHS are not BinaryOperators then they both have an Opcode of Instruction::BinaryOpsEnd. When this happens we rely on tryFactorization to early out due to A/B/C/D being null. Similar behavior occurs for the other calls, we rely on getBinOpsForFactorization having made A/B or C/D null to get tryFactorization to early out. We also rely on these null checks to check the result of getIdentityValue and early out for it. This patches refactors this to pull these checks up to SimplifyUsingDistributiveLaws so we don't rely on BinaryOpsEnd as a sentinel or this A/B/C/D null behavior. I think this makes this code easier to reason about. Should also give a tiny performance improvement for cases where the LHS or RHS isn't a BinaryOperator. Differential Revision: https://reviews.llvm.org/D31913 llvm-svn: 300353	2017-04-14 17:55:41 +00:00
Davide Italiano	91239088a1	[FunctionImport] assert(false) -> llvm_unreachable(). NFCI. llvm-svn: 300344	2017-04-14 17:22:02 +00:00
Sanjoy Das	e3a15e832c	Tighten the API for ScalarEvolutionNormalization llvm-svn: 300331	2017-04-14 15:49:59 +00:00
Sanjoy Das	ac9f3ea0b4	Remove NormalizeAutodetect; NFC It is cleaner to have a callback based system where the logic of whether an add recurrence is normalized or not lives on IVUsers. This is one step in a multi-step cleanup. llvm-svn: 300330	2017-04-14 15:49:53 +00:00
Gil Rapaport	334f8fbe47	[LV] Remove implicit single basic block assumption This patch is part of D28975's breakdown - no change in output intended. LV's code currently assumes the vectorized loop is a single basic block up until predicateInstructions() is called. This patch removes two manifestations of this assumption (loop phi incoming values, dominator tree update) by replacing the use of vectorLoopBody with the vectorized loop's latch/header. Differential Revision: https://reviews.llvm.org/D32040 llvm-svn: 300310	2017-04-14 07:30:23 +00:00
Craig Topper	c9a4fc0750	[InstCombine] Use APInt::setSignBit and APInt::isNegative(). NFC llvm-svn: 300305	2017-04-14 05:09:04 +00:00
Xinliang David Li	9a71766751	Fix test failure on windows: pass module to getInstrProfXXName calls llvm-svn: 300302	2017-04-14 03:03:24 +00:00
Daniel Berlin	2f72b19b05	NewGVN: Don't propagate over phi backedges where undef causes us to have >1 value, unless we can prove the phi node is cycle free. Fixes PR 32607. llvm-svn: 300299	2017-04-14 02:53:37 +00:00
Xinliang David Li	57dea2d359	[Profile] PE binary coverage bug fix PR/32584 Differential Revision: https://reviews.llvm.org/D32023 llvm-svn: 300277	2017-04-13 23:37:12 +00:00
Reid Kleckner	f021fab2af	[IR] Make getParamAttributes take argument numbers, not ArgNo+1 Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1, Kind) everywhere. The fact that the AttributeList index for an argument is ArgNo+1 should be a hidden implementation detail. NFC llvm-svn: 300272	2017-04-13 23:12:13 +00:00
Craig Topper	e7563f8dda	[InstCombine] Use APInt::getBitsSetFrom instead of inverting the result of getLowBitsSet. NFC llvm-svn: 300265	2017-04-13 21:49:48 +00:00
Davide Italiano	af36d02430	[LCSSA] Efficiently compute blocks dominating at least one exit. For LCSSA purposes, loop BBs not dominating any of the exits aren't interesting, as none of the values defined in these blocks can be used outside the loop. The way the code computed this information was by comparing each BB of the loop with each of the exit blocks and ask the dominator tree about their dominance relation. This is slow. A more efficient way, implemented here, is that of starting from the exit blocks and walking the dom upwards until we hit an header. By transitivity, all the blocks we encounter in our path dominate an exit. For the testcase provided in PR31851, this reduces compile time on `opt -O2` by ~25%, going from 1m47s to 1m22s. Thanks to Dan/MichaelZ for discussions/suggesting the approach/review. Differential Revision: https://reviews.llvm.org/D31843 llvm-svn: 300255	2017-04-13 20:36:59 +00:00
Richard Smith	6c2615177b	Revert accidentally-committed files in r300252. llvm-svn: 300253	2017-04-13 20:31:21 +00:00
Richard Smith	55bd375b69	Remove all allocation and divisions from GreatestCommonDivisor Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations. Remove all memory allocation from within the GCD loop by tweaking our `lshr` implementation so it can operate in-place. Differential Revision: https://reviews.llvm.org/D31968 llvm-svn: 300252	2017-04-13 20:29:59 +00:00
Reid Kleckner	257cb4e099	[InstCombine] Fix !prof metadata preservation for invokes Summary: Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes. Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229). Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32041 llvm-svn: 300251	2017-04-13 20:26:38 +00:00
Davide Italiano	0b30227f75	[LCSSA] Assert that we always have a valid loop. We could otherwise add BBs not belonging to a loop in `formLCSSA` and later crash when trying to iterate the loop blocks. llvm-svn: 300244	2017-04-13 20:05:37 +00:00
Davide Italiano	549078d1ab	[LCSSA] Remove spurious whitespaces. NFCI. llvm-svn: 300243	2017-04-13 20:02:27 +00:00
Davide Italiano	5129951296	[LCSSA] Use `auto` when the type is obvious. NFCI. llvm-svn: 300242	2017-04-13 20:01:30 +00:00
Dehao Chen	2c7ca9b5df	SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name Summary: For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and its inlined instance. This patch adds callee_name to the key of the callsite sample map. And added helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote correct targets and inline it before annotation, and ensures all indirect call targets to be annotated correctly. Reviewers: davidxl, dnovillo Reviewed By: davidxl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D31950 llvm-svn: 300240	2017-04-13 19:52:10 +00:00
Anna Thomas	dcdb325fee	[LV] Fix the vector code generation for first order recurrence Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop. Also fix the code gen when we just unroll, but don't vectorize. Fixes PR32396. Reviewers: mssimpso, mkuper, anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31979 llvm-svn: 300238	2017-04-13 18:59:25 +00:00
Sanjay Patel	445d03bf00	[InstCombine] fold X == 0 \|\| X == -1 to one compare (PR32524) This is effectively a retry of: https://reviews.llvm.org/rL299851 but now we have tests and an assert to make sure the bug that was exposed with that attempt will not happen again. I'll fix the code duplication and missing sibling fold next, but I want to make this change as small as possible to reduce risk since I messed it up last time. This should fix: https://bugs.llvm.org/show_bug.cgi?id=32524 llvm-svn: 300236	2017-04-13 18:47:06 +00:00
Reid Kleckner	aea2a28098	[DAE] Simplify call site replacement code with CallSite NFC llvm-svn: 300235	2017-04-13 18:42:03 +00:00
Reid Kleckner	c3fae796fd	[InstCombine] Simplify attribute code with new AttributeList::get NFC llvm-svn: 300230	2017-04-13 18:11:03 +00:00
Reid Kleckner	3a1150352d	[ArgPromotion] Don't drop !prof metadata on promoted calls Noticed by inspection while doing attribute work. DAE, InstCombineCalls, and ArgPromotion have a fair amount of duplicated code for hacking on call sites, and you can find bugs by comparing them. Add a test case for this. llvm-svn: 300229	2017-04-13 18:10:30 +00:00
Sanjay Patel	9745d24a66	[InstCombine] use similar ops for related folds; NFCI It's less efficient to produce 'ule' than 'ult' since we know we're going to canonicalize to 'ult', but we shouldn't have duplicated code for these folds. As a trade-off, this was a pretty terrible way to make a '2'. :) if (LHSC == SubOne(RHSC)) AddC = ConstantExpr::getSub(AddOne(RHSC), LHSC); The next steps are to share the code to fix PR32524 and add the missing 'and' fold that was left out when PR14708 was fixed: https://bugs.llvm.org/show_bug.cgi?id=14708 llvm-svn: 300222	2017-04-13 17:36:24 +00:00
Sanjay Patel	a8ebb46e0e	[InstCombine] fix assert to not always be true llvm-svn: 300202	2017-04-13 16:05:01 +00:00
Geoff Berry	85a530fb59	Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline." This reverts commit r296872 now that PR32153 has been fixed. llvm-svn: 300200	2017-04-13 15:36:25 +00:00
Ayal Zaks	cd712b6c49	[LV] Refactor ILV to provide vectorizeInstruction(); NFC Refactoring InnerLoopVectorizer's vectorizeBlockInLoop() to provide vectorizeInstruction(). Aligning DeadInstructions with its only user. Facilitates driving the transformation by VPlan - follows https://reviews.llvm.org/D28975 and its tentative breakdown. Differential Revision: https://reviews.llvm.org/D31997 llvm-svn: 300183	2017-04-13 09:07:23 +00:00
Reid Kleckner	7f72033e1c	[IR] Take func, ret, and arg attrs separately in AttributeList::get This seems like a much more natural API, based on Derek Schuff's comments on r300015. It further hides the implementation detail of AttributeList that function attributes come last and appear at index ~0U, which is easy for the user to screw up. git diff says it saves code as well: 97 insertions(+), 137 deletions(-) This also makes it easier to change the implementation, which I want to do next. llvm-svn: 300153	2017-04-13 00:58:09 +00:00
Craig Topper	c75f94bfa5	[InstCombine] Teach SimplifyMultipleUseDemandedBits to handle And/Or/Xor known bits using the LHS/RHS known bits it already acquired without recursing back into computeKnownBits. This replicates the known bits and constant creation code from the single use case for these instructions and adds it here. The computeKnownBits and constant creation code for other instructions is now in the default case of the opcode switch. llvm-svn: 300094	2017-04-12 19:32:47 +00:00
Craig Topper	cf3641fd57	[InstCombine] Remove unreachable code for turning an And where all demanded bits on both sides are known to be zero into a constant 0. We already handled a superset check that included the known ones too and folded to a constant that may include ones. But it can also handle the case of no ones. llvm-svn: 300093	2017-04-12 19:08:03 +00:00
Sanjay Patel	6e41018942	[InstCombine] fix wrong undef handling when converting select to shuffle As discussed in: https://bugs.llvm.org/show_bug.cgi?id=32486 ...the canonicalization of vector select to shufflevector does not hold up when undef elements are present in the condition vector. Try to make the undef handling clear in the code and the LangRef. Differential Revision: https://reviews.llvm.org/D31980 llvm-svn: 300092	2017-04-12 18:39:53 +00:00
Craig Topper	f35a7f7b49	[InstCombine] In SimplifyMultipleUseDemandedBits, use a switch instead of cascaded ifs on opcode. NFC llvm-svn: 300085	2017-04-12 18:25:25 +00:00
Craig Topper	9a51c7f343	[InstCombine] Teach SimplifyDemandedInstructionBits that even if we reach an instruction that has multiple uses, if we know all the bits for the demanded bits for this context we can go ahead and create a constant. Currently if we reach an instruction with multiples uses we know we can't do any optimizations to that instruction itself since we only have the demanded bits for one of the users. But if we know all of the bits are zero/one for that one user we can still go ahead and create a constant to give to that user. This might then reduce the instruction to having a single use and allow additional optimizations on the other path. This picks up an additional case that r300075 didn't catch. Differential Revision: https://reviews.llvm.org/D31552 llvm-svn: 300084	2017-04-12 18:17:46 +00:00
Craig Topper	b0076fe8b4	[InstCombine] Move portion of SimplifyDemandedUseBits that deals with instructions with multiple uses out to a separate method. NFCI llvm-svn: 300082	2017-04-12 18:05:21 +00:00
Craig Topper	845033a6c9	Teach SimplifyDemandedUseBits that adding or subtractings 0s from every bit below the highest demanded bit can be simplified If we are adding/subtractings 0s below the highest demanded bit we can just use the other operand and remove the operation. My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove add and a subsequent iteration to be ran. With this change we get bypass the add in the first iteration and prevent the second iteration from changing anything. Differential Revision: https://reviews.llvm.org/D31120 llvm-svn: 300075	2017-04-12 16:49:59 +00:00
Sanjay Patel	33439f982b	[InstCombine] morph an existing instruction instead of creating a new one One potential way to make InstCombine (very slightly?) faster is to recycle instructions when possible instead of creating new ones. It's not explicitly stated AFAIK, but we don't consider this an "InstSimplify". We could, however, make a new layer to house transforms like this if that makes InstCombine more manageable (just throwing out an idea; not sure how much opportunity is actually here). Differential Revision: https://reviews.llvm.org/D31863 llvm-svn: 300067	2017-04-12 15:11:33 +00:00
Jonas Paulsson	22776892c9	[SLPVectorizer] Pass the right type argument to getCmpSelInstrCost() In getEntryCost(), make the scalar type for a compare instruction that of the operands, not i1. This is needed in order to call getCmpSelInstrCost() for a compare in a sensible way, the same way as the LoopVectorizer does. New test: test/Transforms/SLPVectorizer/SystemZ/SLP-cmp-cost-query.ll Review: Matthew Simpson https://reviews.llvm.org/D31601 llvm-svn: 300061	2017-04-12 13:29:25 +00:00
Jonas Paulsson	592dbea779	[LoopVectorizer] Improve handling of branches during cost estimation. The cost for a branch after vectorization is very different depending on if the vectorizer will if-convert the block (branch is eliminated), or if scalarized and predicated blocks will be produced (branch duplicated before each block). There is also the case of remaining scalar branches, such as the back-edge branch. This patch handles these cases differently with TTI based cost estimates. Review: Matthew Simpson https://reviews.llvm.org/D31175 llvm-svn: 300058	2017-04-12 13:13:15 +00:00
Jonas Paulsson	da74ed42da	[LoopVectorizer, TTI] New method supportsEfficientVectorElementLoadStore() Since SystemZ supports vector element load/store instructions, there is no need for extracts/inserts if a vector load/store gets scalarized. This patch lets Target specify that it supports such instructions by means of a new TTI hook that defaults to false. The use for this is in the LoopVectorizer getScalarizationOverhead() method, which will with this patch produce a smaller sum for a vector load/store on SystemZ. New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll Review: Adam Nemet https://reviews.llvm.org/D30680 llvm-svn: 300056	2017-04-12 12:41:37 +00:00
Jonas Paulsson	fccc7d66c3	[SystemZ] TargetTransformInfo cost functions implemented. getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052	2017-04-12 11:49:08 +00:00
Bjorn Pettersson	4af0593ecc	[LoadCombine] Avoid analysing dead basic blocks Summary: Dead basic blocks may be forming a loop, for which SSA form is fulfilled, but with a circular def-use chain. LoadCombine could enter an infinite loop when analysing such dead code. This patch solves the problem by simply avoiding to analyse all basic blocks that aren't forward reachable, from function entry, in LoadCombine. Fixes https://bugs.llvm.org/show_bug.cgi?id=27065 Reviewers: mehdi_amini, chandlerc, grosser, Bigcheese, davide Reviewed By: davide Subscribers: dberlin, zzheng, bjope, grandinj, Ka-Ka, materi, jholewinski, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31032 llvm-svn: 300034	2017-04-12 08:07:55 +00:00
Chandler Carruth	927d8e610a	[IR] Redesign the case iterator in SwitchInst to actually be an iterator and to expose a handle to represent the actual case rather than having the iterator return a reference to itself. All of this allows the iterator to be used with common STL facilities, standard algorithms, etc. Doing this exposed some missing facilities in the iterator facade that I've fixed and required some work to the actual iterator to fully support the necessary API. Differential Revision: https://reviews.llvm.org/D31548 llvm-svn: 300032	2017-04-12 07:27:28 +00:00
Craig Topper	b5194eeebf	[InstCombine][IR] Add a commutable BinOp matcher. Use it to reduce some code. NFC llvm-svn: 300030	2017-04-12 05:49:28 +00:00
Bob Haarman	4075ccc717	ThinLTOBitcodeWriter: keep comdats together, rename if leader is renamed Summary: COFF requires that every comdat contain a symbol with the same name as the comdat. ThinLTOBitcodeWriter renames symbols, which may cause this requirement to be violated. This change avoids such violations by renaming comdats if their leaders are renamed. It also keeps comdats together when splitting modules. Reviewers: pcc, mehdi_amini, tejohnson Reviewed By: pcc Subscribers: rnk, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31963 llvm-svn: 300019	2017-04-12 01:43:07 +00:00
Reid Kleckner	c2cb560045	[IR] Add AttributeSet to hide AttributeSetNode* again, NFC Summary: For now, it just wraps AttributeSetNode*. Eventually, it will hold AvailableAttrs as an inline bitset, and adding and removing enum attributes will be super cheap. This sinks AttributeSetNode back down to lib/IR/AttributeImpl.h. Reviewers: pete, chandlerc Subscribers: llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D31940 llvm-svn: 300014	2017-04-12 00:38:00 +00:00
Evgeniy Stepanov	90fd87303c	[asan] Give global metadata private linkage. Internal linkage preserves names like "__asan_global_foo" which may account to 2% of unstripped binary size. llvm-svn: 299995	2017-04-11 22:28:13 +00:00

1 2 3 4 5 ...

17668 Commits