Commit Graph

1862 Commits

Author SHA1 Message Date
Owen Anderson
6270515918 Revert r113439, which relaxed the requirement that loops containing calls cannot be unrolled. After some discussion,
there seems to be a better way to achieve the same effect.

llvm-svn: 113528
2010-09-09 20:02:23 +00:00
Owen Anderson
8084dbaf8e Relax the "don't unroll loops containing calls" rule. Instead, when a loop contains a call, lower the
unrolling threshold to the optimize-for-size threshold.  Basically, for loops containing calls, unrolling
can still be profitable as long as the loop is REALLY small.

llvm-svn: 113439
2010-09-08 23:10:07 +00:00
Owen Anderson
3fe002dfb5 Generalize instcombine's support for combining multiple bit checks into a single test. Patch by Dirk Steinke!
llvm-svn: 113423
2010-09-08 22:16:17 +00:00
Chris Lattner
6e27b3e004 Fix a serious performance regression introduced by r108687 on linux:
turning (fptrunc (sqrt (fpext x))) -> (sqrtf x)  is great, but we have
to delete the original sqrt as well.  Not doing so causes us to do 
two sqrt's when building with -fmath-errno (the default on linux).

llvm-svn: 113260
2010-09-07 20:01:38 +00:00
Chris Lattner
29570bd695 rename test.
llvm-svn: 113257
2010-09-07 19:57:06 +00:00
Chris Lattner
be9019090e fix PR8067, an over-aggressive assertion in LICM.
llvm-svn: 113146
2010-09-06 05:11:24 +00:00
Chris Lattner
b01c24a945 Teach loop rotate to hoist trivially invariant instructions
in the duplicated block instead of duplicating them.  

Duplicating them into the end of the loop and the preheader 
means that we got a phi node in the header of the loop, 
which prevented LICM from hoisting them.  GVN would
usually come around later and merge the duplicated 
instructions so we'd get reasonable output... except that
anything dependent on the shoulda-been-hoisted value can't
be hoisted.  In PR5319 (which this fixes), a memory value
didn't get promoted.

llvm-svn: 113134
2010-09-06 01:10:22 +00:00
Chris Lattner
72d283c826 fix PR8063, a crash in globalopt in the malloc analysis code.
llvm-svn: 113109
2010-09-05 17:20:46 +00:00
Dan Gohman
487e250109 Fix LoopSimplify to notify ScalarEvolution when splitting a loop backedge
into an inner loop, as the new loop iteration may differ substantially.
This fixes PR8078.

llvm-svn: 113057
2010-09-04 02:42:48 +00:00
Chris Lattner
50506787d1 fix a bug in my licm rewrite when a load from the promoted memory
location is being re-stored to the memory location.  We would get
a dangling pointer from the SSAUpdate data structure and miss a 
use.  This fixes PR8068

llvm-svn: 113042
2010-09-04 00:12:30 +00:00
Owen Anderson
c91c1a205a Propagate non-local comparisons. Fixes PR1757.
llvm-svn: 113025
2010-09-03 22:47:08 +00:00
Owen Anderson
c725462245 Add support for simplifying a load from a computed value to a load from a global when it
is provable that they're equivalent.  This fixes PR4855.

llvm-svn: 112994
2010-09-03 19:08:37 +00:00
Owen Anderson
064cb4c807 Add a test for PR4413, which was apparently fixed at some point in the past.
llvm-svn: 112987
2010-09-03 18:33:08 +00:00
Owen Anderson
50d8c8888c Add PR number to test.
llvm-svn: 112971
2010-09-03 16:58:25 +00:00
Chris Lattner
65fb25a257 more test cleanup
llvm-svn: 112892
2010-09-02 22:38:56 +00:00
Chris Lattner
bb451461ec remove some noise from tests.
llvm-svn: 112889
2010-09-02 22:35:33 +00:00
Chris Lattner
affc0e42f0 fix more AST updating bugs, correcting miscompilation in PR8041
llvm-svn: 112878
2010-09-02 22:19:10 +00:00
Owen Anderson
67dee4dcac Fix typo. I accidentally edited the wrong file before my last commit.
llvm-svn: 112851
2010-09-02 19:52:06 +00:00
Owen Anderson
a8c896b704 Fix a bug in LazyValueInfo that CorrelatedValuePropagation exposed: In the LVI lattice, undef and the full set ConstantRange should not
be treated as equivalent.

llvm-svn: 112843
2010-09-02 18:23:58 +00:00
Duncan Sands
8dda07428a Print the number of uses of a function in the .ll since it can be informative
and there seems to be no reason not to.

llvm-svn: 112812
2010-09-02 08:52:23 +00:00
Chris Lattner
8af45a889d deepen my MMX/SRoA hack to avoid hurting non-x86 codegen.
llvm-svn: 112763
2010-09-01 23:09:27 +00:00
Dan Gohman
0ad7d9c24e Fix loop unswitching's assumption that a code path which either
infinite loops or exits will eventually exit. This fixes PR5373.

llvm-svn: 112745
2010-09-01 21:46:45 +00:00
Bill Wendling
6456efaffd The output of opt -stats must be sent to stderr. Patch by NAKAMURA Takumi!
llvm-svn: 112724
2010-09-01 18:32:56 +00:00
Chris Lattner
34e5361eb5 add a gross hack to work around a problem that Argiris reported
on llvmdev: SRoA is introducing MMX datatypes like <1 x i64>,
which then cause random problems because the X86 backend is
producing mmx stuff without inserting proper emms calls.

In the short term, force off MMX datatypes.  In the long term,
the X86 backend should not select generic vector types to MMX
registers.  This is being worked on, but won't be done in time
for 2.8.  rdar://8380055

llvm-svn: 112696
2010-09-01 05:14:33 +00:00
Chris Lattner
b9ed4f252f filecheckize
llvm-svn: 112695
2010-09-01 05:10:14 +00:00
Chris Lattner
030f02021b licm is wasting time hoisting constant foldable operations,
instead of hoisting them, just fold them away.  This occurs in the
testcase for PR8041, for example.

llvm-svn: 112669
2010-08-31 23:00:16 +00:00
Owen Anderson
a5e6b3eca4 Merge 2010-08-31-InfiniteRecursion.ll into crash.ll.
llvm-svn: 112635
2010-08-31 20:27:17 +00:00
Owen Anderson
799a08ae48 Add a test for the duplicated-conditional situation illutrated by PR5652.
llvm-svn: 112621
2010-08-31 18:49:12 +00:00
Chris Lattner
e2295f1c80 merge two tests.
llvm-svn: 112617
2010-08-31 18:44:03 +00:00
Owen Anderson
3931c85956 Manually reduce this testcase.
llvm-svn: 112615
2010-08-31 18:16:29 +00:00
Chris Lattner
fbcd165b59 merge two tests and convert to filecheck.
llvm-svn: 112613
2010-08-31 18:05:08 +00:00
Owen Anderson
ada0623725 Add a micro-test for the transforms I added to JumpThreading.
I have not been able to find a way to test each in isolation, for a few reasons:
1) The ability to look-through non-i1 BinaryOperator's requires the ability to look through non-constant
   ICmps in order for it to ever trigger.
2) The ability to do LVI-powered PHI value determination only matters in cases that ProcessBranchOnPHI
   can't handle.  Since it already handles all the cases without other instructions in the def-use chain
   between the PHI and the branch, it requires the ability to look through ICmps and/or BinaryOperators
   as well.

llvm-svn: 112611
2010-08-31 17:59:07 +00:00
Owen Anderson
064b139c8d Rename test directory to reflect new pass name.
llvm-svn: 112592
2010-08-31 07:50:31 +00:00
Owen Anderson
48d58ad64c Rename ValuePropagation to a more descriptive CorrelatedValuePropagation.
llvm-svn: 112591
2010-08-31 07:48:34 +00:00
Owen Anderson
3997a07fb9 More Chris-inspired JumpThreading fixes: use ConstantExpr to correctly constant-fold undef, and be more careful with its return value.
This actually exposed an infinite recursion bug in ComputeValueKnownInPredecessors which theoretically already existed (in JumpThreading's
handling of and/or of i1's), but never manifested before.  This patch adds a tracking set to prevent this case.

llvm-svn: 112589
2010-08-31 07:36:34 +00:00
Owen Anderson
376597c13e Remove r111665, which implemented store-narrowing in InstCombine. Chris discovered a miscompilation in it, and it's not easily
fixable at the optimizer level. I'll investigate reimplementing it in DAGCombine.

llvm-svn: 112575
2010-08-31 04:41:06 +00:00
Owen Anderson
70b17c50e2 Combine these two tests, and make sure there's a newline at the end of the file.
llvm-svn: 112554
2010-08-30 23:37:41 +00:00
Duncan Sands
68c30907cc Correct bogus module triple specifications.
llvm-svn: 112469
2010-08-30 10:48:29 +00:00
Chris Lattner
263f804699 LICM does get dead instructions input to it. Instead of sinking them
out of loops, just delete them.

llvm-svn: 112451
2010-08-29 18:22:25 +00:00
Chris Lattner
504e5100d3 remove the ABCD and SSI passes. They don't have any clients that
I'm aware of, aren't maintained, and LVI will be replacing their value.
nlewycky approved this on irc.

llvm-svn: 112355
2010-08-28 03:51:24 +00:00
Chris Lattner
d0214f3efe handle the constant case of vector insertion. For something
like this:

struct S { float A, B, C, D; };

struct S g;
struct S bar() { 
  struct S A = g;
  ++A.B;
  A.A = 42;
  return A;
}

we now generate:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	pshufd	$16, %xmm2, %xmm2
	movss	LCPI1_1(%rip), %xmm0
	pshufd	$16, %xmm0, %xmm0
	unpcklps	%xmm2, %xmm0
	ret

instead of:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	movd	%xmm2, %eax
	shlq	$32, %rax
	addq	$1109917696, %rax       ## imm = 0x42280000
	movd	%rax, %xmm0
	ret

llvm-svn: 112345
2010-08-28 01:50:57 +00:00
Chris Lattner
dd6601048e optimize bitcasts from large integers to vector into vector
element insertion from the pieces that feed into the vector.
This handles a pattern that occurs frequently due to code
generated for the x86-64 abi.  We now compile something like
this:

struct S { float A, B, C, D; };
struct S g;
struct S bar() { 
  struct S A = g;
  ++A.A;
  ++A.C;
  return A;
}

into all nice vector operations:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	12(%rax), %xmm3
	pshufd	$16, %xmm2, %xmm2
	unpcklps	%xmm2, %xmm0
	addss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	pshufd	$16, %xmm3, %xmm2
	unpcklps	%xmm2, %xmm1
	ret

instead of icky integer operations:

_bar:                                   ## @bar
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	movd	%xmm0, %ecx
	movl	4(%rax), %edx
	movl	12(%rax), %esi
	shlq	$32, %rdx
	addq	%rcx, %rdx
	movd	%rdx, %xmm0
	addss	8(%rax), %xmm1
	movd	%xmm1, %eax
	shlq	$32, %rsi
	addq	%rax, %rsi
	movd	%rsi, %xmm1
	ret

This resolves rdar://8360454

llvm-svn: 112343
2010-08-28 01:20:38 +00:00
Owen Anderson
cf7f941121 Add a prototype of a new peephole optimizing pass that uses LazyValue info to simplify PHIs and select's.
This pass addresses the missed optimizations from PR2581 and PR4420.

llvm-svn: 112325
2010-08-27 23:31:36 +00:00
Chris Lattner
954e9557e3 tidy up test.
llvm-svn: 112321
2010-08-27 23:15:21 +00:00
Chris Lattner
6c1395f62a Enhance the shift propagator to handle the case when you have:
A = shl x, 42
...
B = lshr ..., 38

which can be transformed into:
A = shl x, 4
...

iff we can prove that the would-be-shifted-in bits
are already zero.  This eliminates two shifts in the testcase
and allows eliminate of the whole i128 chain in the real example.

llvm-svn: 112314
2010-08-27 22:53:44 +00:00
Chris Lattner
18d7fc8fc6 Implement a pretty general logical shift propagation
framework, which is good at ripping through bitfield
operations.  This generalize a bunch of the existing
xforms that instcombine does, such as 
  (x << c) >> c -> and
to handle intermediate logical nodes.  This is useful for
ripping up the "promote to large integer" code produced by
SRoA.

llvm-svn: 112304
2010-08-27 22:24:38 +00:00
Chris Lattner
606b76eba6 merge and filecheckize test
llvm-svn: 112289
2010-08-27 20:44:45 +00:00
Chris Lattner
c665156e9f merge two tests
llvm-svn: 112288
2010-08-27 20:42:10 +00:00
Chris Lattner
7398434675 teach the truncation optimization that an entire chain of
computation can be truncated if it is fed by a sext/zext that doesn't
have to be exactly equal to the truncation result type.

llvm-svn: 112285
2010-08-27 20:32:06 +00:00
Chris Lattner
90cd746e63 Add an instcombine to clean up a common pattern produced
by the SRoA "promote to large integer" code, eliminating
some type conversions like this:

   %94 = zext i16 %93 to i32                       ; <i32> [#uses=2]
   %96 = lshr i32 %94, 8                           ; <i32> [#uses=1]
   %101 = trunc i32 %96 to i8                      ; <i8> [#uses=1]

This also unblocks other xforms from happening, now clang is able to compile:

struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	pshufd	$1, %xmm0, %xmm2
	addss	%xmm0, %xmm2
	movdqa	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	pshufd	$1, %xmm1, %xmm0
	addss	%xmm3, %xmm0
	ret

on x86-64, instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movapd	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	movd	%xmm1, %rax
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm3, %xmm0
	ret

This seems pretty close to optimal to me, at least without
using horizontal adds.  This also triggers in lots of other
code, including SPEC.

llvm-svn: 112278
2010-08-27 18:31:05 +00:00