clang-p2996

Author	SHA1	Message	Date
Chuang-Yu Cheng	68f7f1cf00	Teaching SimplifyCFG to recognize the Or-Mask trick that InstCombine uses to reduce the number of comparisons. Specifically, InstCombine can turn: (i == 5334 \|\| i == 5335) into: ((i \| 1) == 5335) SimplifyCFG was already able to detect the pattern: (i == 5334 \|\| i == 5335) to: ((i & -2) == 5334) This patch supersedes D21315 and resolves PR27555 (https://llvm.org/bugs/show_bug.cgi?id=27555). Thanks to David and Chandler for the suggestions! Author: Thomas Jablin (tjablin) Reviewers: majnemer chandlerc halfdan cycheng http://reviews.llvm.org/D21397 llvm-svn: 273639	2016-06-24 01:59:00 +00:00
Anna Thomas	31a0b2088f	InstCombine rule to fold trunc when value available Summary: This instcombine rule folds away trunc operations that have value available from a prior load or store. This kind of code can be generated as a result of GVN widening the load or from source code as well. Reviewers: reames, majnemer, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21246 llvm-svn: 273608	2016-06-23 20:22:22 +00:00
Artur Pilipenko	80771b9ad9	Upgrade other old memset/memcpy signatures in tests causing buildbot failures with rL273568. llvm-svn: 273580	2016-06-23 16:34:52 +00:00
Michael Zolotukhin	2d3592d481	[LoopUnrollAnalyzer] Fix a bug in UnrolledInstAnalyzer::visitLoad. When simplifying a load we need to make sure that the type of the simplified value matches the type of the instruction we're processing. In theory, we can handle casts here as we deal with constant data, but since it's not implemented at the moment, we at least need to bail out. This fixes PR28262. llvm-svn: 273562	2016-06-23 14:31:31 +00:00
Hal Finkel	a1271036c5	Allow DeadStoreElimination to track combinations of partial later wrties DeadStoreElimination can currently remove a small store rendered unnecessary by a later larger one, but could not remove a larger store rendered unnecessary by a series of later smaller ones. This adds that capability. It works by keeping a map, which is used as an effective interval map, for each store later overwritten only partially, and filling in that interval map as more such stores are discovered. No additional walking or aliasing queries are used. In the map forms an interval covering the the entire earlier store, then it is dead and can be removed. The map is used as an interval map by storing a mapping between the ending offset and the beginning offset of each interval. I discovered this problem when investigating a performance issue with code like this on PowerPC: #include <complex> using namespace std; complex<float> bar(complex<float> C); complex<float> foo(complex<float> C) { return bar(C)C; } which produces this: define void @_Z4testSt7complexIfE(%"struct.std::complex" noalias nocapture sret %agg.result, i64 %c.coerce) { entry: %ref.tmp = alloca i64, align 8 %tmpcast = bitcast i64* %ref.tmp to %"struct.std::complex"* %c.sroa.0.0.extract.shift = lshr i64 %c.coerce, 32 %c.sroa.0.0.extract.trunc = trunc i64 %c.sroa.0.0.extract.shift to i32 %0 = bitcast i32 %c.sroa.0.0.extract.trunc to float %c.sroa.2.0.extract.trunc = trunc i64 %c.coerce to i32 %1 = bitcast i32 %c.sroa.2.0.extract.trunc to float call void @_Z3barSt7complexIfE(%"struct.std::complex"* nonnull sret %tmpcast, i64 %c.coerce) %2 = bitcast %"struct.std::complex"* %agg.result to i64* %3 = load i64, i64* %ref.tmp, align 8 store i64 %3, i64* %2, align 4 ; <--- *** THIS SHOULD NOT BE HERE ** %_M_value.realp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 0 %4 = lshr i64 %3, 32 %5 = trunc i64 %4 to i32 %6 = bitcast i32 %5 to float %_M_value.imagp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 1 %7 = trunc i64 %3 to i32 %8 = bitcast i32 %7 to float %mul_ad.i.i = fmul fast float %6, %1 %mul_bc.i.i = fmul fast float %8, %0 %mul_i.i.i = fadd fast float %mul_ad.i.i, %mul_bc.i.i %mul_ac.i.i = fmul fast float %6, %0 %mul_bd.i.i = fmul fast float %8, %1 %mul_r.i.i = fsub fast float %mul_ac.i.i, %mul_bd.i.i store float %mul_r.i.i, float* %_M_value.realp.i.i, align 4 store float %mul_i.i.i, float* %_M_value.imagp.i.i, align 4 ret void } the problem here is not just that the i64 store is unnecessary, but also that it blocks further backend optimizations of the other uses of that i64 value in the backend. In the future, we might want to add a special case for handling smaller accesses (e.g. using a bit vector) if the map mechanism turns out to be noticeably inefficient. A sorted vector is also a possible replacement for the map for small numbers of tracked intervals. Differential Revision: http://reviews.llvm.org/D18586 llvm-svn: 273559	2016-06-23 13:46:39 +00:00
David Majnemer	d1fbf48566	[SCCP] Don't assume all Constants are ConstantInt This fixes PR28269. llvm-svn: 273521	2016-06-23 00:14:29 +00:00
Sanjay Patel	a06d989552	[ValueTracking] improve ComputeNumSignBits for vector constants This is similar to the computeKnownBits improvement in rL268479. There's probably more we can do for vector logic instructions, but this should let us see non-splat constant masking ops that can become vector selects instead of and/andn/or sequences. Differential Revision: http://reviews.llvm.org/D21610 llvm-svn: 273459	2016-06-22 19:20:59 +00:00
Artur Pilipenko	1cec4fdddf	Upgrade old memset/memcpy signatures (without isVolatile argument) in tests We no longer have corresponding code in autoupgrade and the vast majority of the tests were fixed long time ago. Fix the remaining few. One of the verifier test cases is marked as XFAIL because it was passing only because the signature was incorrect. llvm-svn: 273428	2016-06-22 15:16:06 +00:00
Sanjay Patel	c6cacd6067	[InstSimplify] add ashr tests including vector types llvm-svn: 273421	2016-06-22 14:18:04 +00:00
Simon Pilgrim	bc35f9f702	[SLPVectorizer][X86] Added ceil/floor/nearbyint/rint/trunc vectorization tests llvm-svn: 273420	2016-06-22 14:07:46 +00:00
Sanjay Patel	21579bb39a	[InstSimplify] regenerate checks llvm-svn: 273419	2016-06-22 14:00:16 +00:00
Haicheng Wu	a783bac50b	[Kryo] Enable loop prefetcher. Differential Revision: http://reviews.llvm.org/D21535 llvm-svn: 273329	2016-06-21 22:47:56 +00:00
Easwaran Raman	8bceb9d210	Fix PR28219: Use profile summary from reader and not compute it Differentiaal revision: http://reviews.llvm.org/D21546 llvm-svn: 273301	2016-06-21 19:29:49 +00:00
Elena Demikhovsky	a266cf0518	reverted the prev commit due to assertion failure llvm-svn: 273258	2016-06-21 12:10:11 +00:00
Elena Demikhovsky	9823c995bc	Fixed consecutive memory access detection in Loop Vectorizer. It did not handle correctly cases without GEP. The following loop wasn't vectorized: for (int i=0; i<len; i++) to++ = from++; I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1. Differential revision: http://reviews.llvm.org/D20789 llvm-svn: 273257	2016-06-21 11:32:01 +00:00
Simon Pilgrim	356e823b51	[X86][SSE] Add cost model for BSWAP of vectors The BSWAP of vector types is quite efficiently implemented using vector shuffles on SSE/AVX targets, we should reflect the typical cost of this to encourage vectorization. Differential Revision: http://reviews.llvm.org/D21521 llvm-svn: 273217	2016-06-20 23:08:21 +00:00
Sanjay Patel	9ad8fb68f7	[InstSimplify] analyze (optionally casted) icmps to eliminate obviously false logic (PR27869) By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question raised by PR27869: https://llvm.org/bugs/show_bug.cgi?id=27869 ...where InstCombine turns an icmp+zext into a shift causing us to miss the fold. Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp. Differential Revision: http://reviews.llvm.org/D21512 llvm-svn: 273200	2016-06-20 20:59:59 +00:00
Dehao Chen	071bb9d7af	Pass AssumptionCacheTracker from SampleProfileLoader to Inliner Summary: Inliner needs ACT when calling InlineFunction. Instead of nullptr, we need to pass it in from SampleProfileLoader Reviewers: davidxl Subscribers: eraman, vsk, danielcdh, llvm-commits Differential Revision: http://reviews.llvm.org/D21205 llvm-svn: 273199	2016-06-20 20:53:40 +00:00
Matt Arsenault	802ebcb4bb	InstCombine: Don't strip convergent from intrinsic callsites Specific instances of intrinsic calls may want to be convergent, such as certain register reads but the intrinsic declaration is not. llvm-svn: 273188	2016-06-20 19:04:44 +00:00
Sanjay Patel	445d7ecf89	[InstCombine] consolidate some icmp+logic tests and improve checks llvm-svn: 273186	2016-06-20 18:40:37 +00:00
Sanjay Patel	14dcb042bc	[InstCombine] update to use FileCheck with autogenerated exact checking llvm-svn: 273180	2016-06-20 18:23:40 +00:00
Sanjay Patel	06918ad79e	[InstCombine] update to use FileCheck with autogenerated exact checking llvm-svn: 273173	2016-06-20 17:56:13 +00:00
Sanjay Patel	a038240660	[InstCombine] regenerate checks llvm-svn: 273170	2016-06-20 17:48:48 +00:00
David Majnemer	c5601df9fd	Reapply "[LoopIdiom] Don't remove dead operands manually" This reverts commit r273160, reapplying r273132. RecursivelyDeleteTriviallyDeadInstructions cannot be called on a parentless Instruction. llvm-svn: 273162	2016-06-20 16:03:25 +00:00
Cong Liu	1c28b6d733	Revert "[LoopIdiom] Don't remove dead operands manually" This reverts commit r273132. Breaks multiple test under /llvm/test:Transforms (e.g. llvm/test:Transforms/LoopIdiom/basic.ll.test) under asan. llvm-svn: 273160	2016-06-20 15:22:15 +00:00
Patrik Hagglund	7205215591	Fix for PR27940 After a store has been eliminated, when making sure that the instruction iterator points to a valid instruction, dbg intrinsics are now ignored as a new instruction. Patch by Henric Karlsson. Reviewed by Daniel Berlin. Differential Revision: http://reviews.llvm.org/D21076 llvm-svn: 273141	2016-06-20 09:10:10 +00:00
David Majnemer	a705843f23	[LoopIdiom] Don't remove dead operands manually Removing dead instructions requires remembering which operands have already been removed. RecursivelyDeleteTriviallyDeadInstructions has this logic, don't partially reimplement it in LoopIdiomRecognize. This fixes PR28196. llvm-svn: 273132	2016-06-20 02:33:29 +00:00
Sanjay Patel	a4b052c7d1	[InstSimplify] add tests for PR27689; regenerate checks llvm-svn: 273128	2016-06-19 21:40:12 +00:00
David Majnemer	3119599475	[LoadCombine] Combine Loads formed from GEPS with negative indexes Change the underlying offset and comparisons to use int64_t instead of uint64_t. Patch by River Riddle! Differential Revision: http://reviews.llvm.org/D21499 llvm-svn: 273105	2016-06-19 06:14:56 +00:00
Matt Arsenault	a466a7cf62	Add looping testcase that broke in r272987 llvm-svn: 273081	2016-06-18 05:15:58 +00:00
Sanjoy Das	e8fd9561cb	[SCEV] Fix incorrect trip count computation The way we elide max expressions when computing trip counts is incorrect -- it breaks cases like this: ``` static int wrapping_add(int a, int b) { return (int)((unsigned)a + (unsigned)b); } void test() { volatile int end_buf = 2147483548; // INT_MIN - 100 int end = end_buf; unsigned counter = 0; for (int start = wrapping_add(end, 200); start < end; start++) counter++; print(counter); } ``` Note: the `NoWrap` variable that was being tested has little to do with the values flowing into the max expression; it is a property of the induction variable. test/Transforms/LoopUnroll/nsw-tripcount.ll was added to solely test functionality I'm reverting in this change, so I've deleted the test fully. llvm-svn: 273079	2016-06-18 04:38:31 +00:00
Matt Arsenault	8fd5978811	Revert "Revert "Revert "InstCombine: Reduce trunc (shl x, K) width.""" This seems to be causing an infinite loop / crash in instcombine on some bots. llvm-svn: 273069	2016-06-17 23:36:38 +00:00
Adam Nemet	a9f09c6245	[LAA] Enable symbolic stride speculation for all LAA clients This is a functional change for LLE and LDist. The other clients (LV, LVerLICM) already had this explicitly enabled. The temporary boolean parameter to LAA is removed that allowed turning off speculation of symbolic strides. This makes LAA's caching interface LAA::getInfo only take the loop as the parameter. This makes the interface more friendly to the new Pass Manager. The flag -enable-mem-access-versioning is moved from LV to a LAA which now allows turning off speculation globally. llvm-svn: 273064	2016-06-17 22:35:41 +00:00
Matt Arsenault	d76efc14b9	Revert "Revert "InstCombine: Reduce trunc (shl x, K) width."" Reapply r272987. Condition should be in terms of the destination type, and the flags should not be copied. llvm-svn: 273045	2016-06-17 20:33:53 +00:00
Davide Italiano	b49aa5c0c4	[PM] Port MergedLoadStoreMotion to the new pass manager, take two. This is indeed a much cleaner approach (thanks to Daniel Berlin for pointing out), and also David/Sean for review. Differential Revision: http://reviews.llvm.org/D21454 llvm-svn: 273032	2016-06-17 19:10:09 +00:00
James Y Knight	148a6469dc	Support expanding partial-word cmpxchg to full-word cmpxchg in AtomicExpandPass. Many CPUs only have the ability to do a 4-byte cmpxchg (or ll/sc), not 1 or 2-byte. For those, you need to mask and shift the 1 or 2 byte values appropriately to use the 4-byte instruction. This change adds support for cmpxchg-based instruction sets (only SPARC, in LLVM). The support can be extended for LL/SC-based PPC and MIPS in the future, supplanting the ISel expansions those architectures currently use. Tests added for the IR transform and SPARCv9. Differential Revision: http://reviews.llvm.org/D21029 llvm-svn: 273025	2016-06-17 18:11:48 +00:00
Sanjay Patel	216d8cf720	[InstCombine] allow more than one use for vector bitcast folding with selects The motivating example for this transform is similar to D20774 where bitcasts interfere with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to produce min and max ops: define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) { %cmp = fcmp olt <4 x float> %a, %b %bc1 = bitcast <4 x float> %a to <4 x i32> %bc2 = bitcast <4 x float> %b to <4 x i32> %sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2 %sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1 %bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>* store <4 x i32> %sel1, <4 x i32>* %bc3 %bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>* store <4 x i32> %sel2, <4 x i32>* %bc4 ret void } With this patch, we move the selects up to use the input args which allows getting rid of all of the bitcasts: define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) { %cmp = fcmp olt <4 x float> %a, %b %sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b %sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16 store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16 ret void } The asm for x86 SSE then improves from: movaps %xmm0, %xmm2 cmpltps %xmm1, %xmm2 movaps %xmm2, %xmm3 andnps %xmm1, %xmm3 movaps %xmm2, %xmm4 andnps %xmm0, %xmm4 andps %xmm2, %xmm0 orps %xmm3, %xmm0 andps %xmm1, %xmm2 orps %xmm4, %xmm2 movaps %xmm0, (%rdi) movaps %xmm2, (%rsi) To: movaps %xmm0, %xmm2 minps %xmm1, %xmm2 maxps %xmm0, %xmm1 movaps %xmm2, (%rdi) movaps %xmm1, (%rsi) The TODO comments show that we're limiting this transform only to vectors and only to bitcasts because we need to improve other transforms or risk creating worse codegen. Differential Revision: http://reviews.llvm.org/D21190 llvm-svn: 273011	2016-06-17 16:46:50 +00:00
Adam Nemet	e7709d92ba	[LLE] Don't hard-code the name of the preheader in test Turns out a didn't get this right because symbolic stride versioning changes the name. Relax the matching. llvm-svn: 272992	2016-06-17 09:13:15 +00:00
Matt Arsenault	ce56f7bbaa	Revert "InstCombine: Reduce trunc (shl x, K) width." This reverts commit r272987. This might be causing crashes on some bots. llvm-svn: 272990	2016-06-17 06:28:53 +00:00
Matt Arsenault	028fd50642	InstCombine: Reduce trunc (shl x, K) width. llvm-svn: 272987	2016-06-17 04:43:22 +00:00
Evgeniy Stepanov	45fa0fd758	[safestack] Sink unsafe address computation to each use. This is a fix for PR27844. When replacing uses of unsafe allocas, emit the new location immediately after each use. Without this, the pointer stays live from the function entry to the last use, while it's usually cheaper to recalculate. llvm-svn: 272969	2016-06-16 22:34:04 +00:00
Evgeniy Stepanov	72d961a1da	[safestack] Fixup llvm.dbg.value when rewriting unsafe allocas. When moving unsafe allocas to the unsafe stack, dbg.declare intrinsics are updated to refer to the new location. This change does the same to dbg.value intrinsics. llvm-svn: 272968	2016-06-16 22:34:00 +00:00
Sanjoy Das	07c6521aed	[EarlyCSE] Fold invariant loads Redundant invariant loads can be CSE'ed with very little extra effort over what early-cse already tracks, so it looks reasonable to make early-cse handle this case. llvm-svn: 272954	2016-06-16 20:47:57 +00:00
Justin Lebar	b0bd07aff7	Fix strip-dead-debug-info test if path contains "bar". This test checks that the string 'bar' (no quotes) doesn't exist in the output after running opt. But opt embeds the absolute path to the filename, and on my machine, the filename contains the string 'jlebar', causing the test to fail. This patch changes the test to look for the string '"bar"' instead. llvm-svn: 272941	2016-06-16 19:39:55 +00:00
Davide Italiano	41315f7873	[PM] Revert the port of MergeLoadStoreMotion to the new pass manager. Daniel Berlin expressed some real concerns about the port and proposed and alternative approach. I'll revert this for now while working on a new patch, which I hope to put up for review shortly. Sorry for the churn. llvm-svn: 272925	2016-06-16 17:40:53 +00:00
Adam Nemet	776346848a	[LLE] New test to check that no versioning for symbolic strides occurs. NFC This is currently only performed in the Vectorizer. I will change this as symbolic stride collection is moved to LAA. This test will track when the actual functional change occurs. llvm-svn: 272918	2016-06-16 16:45:29 +00:00
Igor Laevsky	87f0d0e185	Revert r272891 "[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo" It was causing failures in Profile-i386 and Profile-x86_64 tests. llvm-svn: 272912	2016-06-16 16:25:53 +00:00
Igor Laevsky	c9179fd2c2	[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo We should update results of the BranchProbabilityInfo after removing block in JumpThreading. Otherwise we will get dangling pointer inside BranchProbabilityInfo cache. Differential Revision: http://reviews.llvm.org/D20957 llvm-svn: 272891	2016-06-16 13:28:25 +00:00
Patrik Hagglund	0acaefaf9d	PR27938: Don't remove valid DebugLoc in Scalarizer Added checks to make sure the Scalarizer::transferMetadata() don't remove valid debug locations from instructions. This is important as the verifier pass require that e.g. inlinable callsites have a valid debug location. https://llvm.org/bugs/show_bug.cgi?id=27938 Patch by Karl-Johan Karlsson Reviewers: dblaikie Differential Revision: http://reviews.llvm.org/D20807 llvm-svn: 272884	2016-06-16 10:48:54 +00:00
Chuang-Yu Cheng	dbe00d51b4	SimplifyCFG is able to detect the pattern: (i == 5334 \|\| i == 5335) to: ((i & -2) == 5334) This transformation has some incorrect side conditions. Specifically, the transformation is only applied when the right-hand side constant (5334 in the example) is a power of two not equal and not equal to the negated mask. These side conditions were added in r258904 to fix PR26323. The correct side condition is that: ((Constant & Mask) == Constant)[(5334 & -2) == 5334]. It's a little bit hard to see why these transformations are correct and what the side conditions ought to be. Here is a CVC3 program to verify them for 64-bit values: ONE : BITVECTOR(64) = BVZEROEXTEND(0bin1, 63); x : BITVECTOR(64); y : BITVECTOR(64); z : BITVECTOR(64); mask : BITVECTOR(64) = BVSHL(ONE, z); QUERY( (y & ~mask = y) => ((x & ~mask = y) <=> (x = y OR x = (y \| mask))) ); Please note that each pattern must be a dual implication (<--> or iff). One directional implication can create spurious matches. If the implication is only one-way, an unsatisfiable condition on the left side can imply a satisfiable condition on the right side. Dual implication ensures that satisfiable conditions are transformed to other satisfiable conditions and unsatisfiable conditions are transformed to other unsatisfiable conditions. Here is a concrete example of a unsatisfiable condition on the left implying a satisfiable condition on the right: mask = (1 << z) (x & ~mask) == y --> (x == y \|\| x == (y \| mask)) Substituting y = 3, z = 0 yields: (x & -2) == 3 --> (x == 3 \|\| x == 2) The version of this code before r258904 had no side-conditions and incorrectly justified itself in comments through one-directional implication. Thanks to Chandler for the suggestion! Author: Thomas Jablin (tjablin) Reviewers: chandlerc majnemer hfinkel cycheng http://reviews.llvm.org/D21417 llvm-svn: 272873	2016-06-16 04:44:25 +00:00

1 2 3 4 5 ...

6915 Commits