3100 Commits

Author SHA1 Message Date
Chris Lattner
6bf2cd5735 Add support for alloca, implementing ctor-list-opt.ll:CTOR6
llvm-svn: 23452
2005-09-26 17:07:09 +00:00
Chris Lattner
46d9ff081d Add a debug printout, fix a crash on kc++
llvm-svn: 23450
2005-09-26 07:34:35 +00:00
Chris Lattner
46af55e0e4 Implement loads/stores through GEP's of globals. This implements
ctor-list-opt.ll:CTOR5.

llvm-svn: 23449
2005-09-26 06:52:44 +00:00
Chris Lattner
61ff32cd70 Replace TraverseGEPInitializer with ConstantFoldLoadThroughGEPConstantExpr
llvm-svn: 23447
2005-09-26 05:34:07 +00:00
Chris Lattner
02ae21e1e0 Eliminate GetGEPGlobalInitializer in favor of the more powerful
ConstantFoldLoadThroughGEPConstantExpr function in the utils lib.

llvm-svn: 23446
2005-09-26 05:28:52 +00:00
Chris Lattner
0b011ec8e2 Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils
as ConstantFoldLoadThroughGEPConstantExpr.

llvm-svn: 23445
2005-09-26 05:28:06 +00:00
Chris Lattner
c13c7b9376 Move the ConstantFoldLoadThroughGEPConstantExpr function out of the InstCombine
pass.

llvm-svn: 23444
2005-09-26 05:27:10 +00:00
Chris Lattner
b009663e27 add a comment
llvm-svn: 23442
2005-09-26 05:16:34 +00:00
Chris Lattner
4b05c322d5 Add support for getelementptr, load, and correctly reject volatile stores.
llvm-svn: 23441
2005-09-26 05:15:37 +00:00
Chris Lattner
3e9ea5ffec Add support for br/brcond/switch and phi
llvm-svn: 23439
2005-09-26 04:57:38 +00:00
Chris Lattner
99e23fa74c Add a simple interpreter to this code, allowing us to statically evaluate
global ctors that are simple enough.  This implements ctor-list-opt.ll:CTOR2.

llvm-svn: 23437
2005-09-26 04:44:35 +00:00
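As a rough illustration of what "statically evaluating" a simple constructor (99e23fa74c) can mean, here is a toy sketch. The `Op`, `Inst`, and `evaluateCtor` names and the two-slot global table are invented for this example and are not LLVM's data structures. The only idea carried over is that the evaluator commits its results when every instruction is one it understands, and gives up otherwise.

```cpp
#include <cstddef>

// Hypothetical mini instruction set; none of these names come from LLVM.
enum class Op { StoreConst, AddConst, CallExternal, Ret };

struct Inst {
  Op op;
  int global;  // index of the global the instruction touches
  int value;   // constant operand
};

constexpr std::size_t kNumGlobals = 2;

// Tries to run a constructor body ahead of time. On success the final values
// are written to `globals` and true is returned. On anything the evaluator
// does not understand, `globals` is left untouched and false is returned.
template <std::size_t N>
constexpr bool evaluateCtor(const Inst (&body)[N], int (&globals)[kNumGlobals]) {
  int scratch[kNumGlobals] = {};
  for (std::size_t g = 0; g < kNumGlobals; ++g) scratch[g] = globals[g];

  for (std::size_t i = 0; i < N; ++i) {
    const Inst &inst = body[i];
    if (inst.op == Op::Ret) {
      // Whole body understood: commit the computed initializers.
      for (std::size_t g = 0; g < kNumGlobals; ++g) globals[g] = scratch[g];
      return true;
    }
    if (inst.op == Op::CallExternal) return false;  // opaque: give up
    if (inst.global < 0 ||
        static_cast<std::size_t>(inst.global) >= kNumGlobals) {
      return false;  // malformed operand: give up
    }
    if (inst.op == Op::StoreConst) {
      scratch[inst.global] = inst.value;
    } else {  // Op::AddConst
      scratch[inst.global] += inst.value;
    }
  }
  return false;  // ran off the end without a return
}

// A constructor the toy evaluator can fold completely.
constexpr int evaluateSimpleCtor() {
  int globals[kNumGlobals] = {0, 0};
  const Inst ctor[] = {
      {Op::StoreConst, 0, 40}, {Op::AddConst, 0, 2}, {Op::Ret, 0, 0}};
  return evaluateCtor(ctor, globals) ? globals[0] : -1;
}

// A constructor with an opaque call: evaluation fails, globals stay as-is.
constexpr int evaluateOpaqueCtor() {
  int globals[kNumGlobals] = {7, 0};
  const Inst ctor[] = {
      {Op::StoreConst, 0, 40}, {Op::CallExternal, 0, 0}, {Op::Ret, 0, 0}};
  return evaluateCtor(ctor, globals) ? -1 : globals[0];
}
```

Working on a scratch copy and committing only at the return is what makes the "give up" path safe: a partially evaluated constructor never leaks into the initializers.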
Chris Lattner
696beefabb factor some code into an InstallGlobalCtors method, add comments. No functionality change.
llvm-svn: 23435
2005-09-26 02:31:18 +00:00
Chris Lattner
838bdc1836 Make the global opt optimizer work on modules with a null terminator, by
accepting the null even with a non-65535 init prio

llvm-svn: 23434
2005-09-26 02:19:27 +00:00
Chris Lattner
41b6a5a693 Factor this code out into a few methods.
Implement the start of global ctor optimization.  It is currently smart
enough to remove the global ctor for cases like this:

struct foo {
  foo() {}
} x;

... saving a bit of startup time for the program.

llvm-svn: 23433
2005-09-26 01:43:45 +00:00
Chris Lattner
f487768062 Fix some logic I broke that caused a regression on
SimplifyLibCalls/2005-05-20-sprintf-crash.ll

llvm-svn: 23430
2005-09-25 07:06:48 +00:00
Chris Lattner
0b3557f54a Move MaskedValueIsZero up.
Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll

llvm-svn: 23428
2005-09-24 23:43:33 +00:00
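The message for 0b3557f54a does not list which sign-extension idioms are matched. The xor/subtract form below is one widely used way to sign-extend a narrow value and is shown only as an example of the kind of pattern involved; `signExtend8` is a name made up for this sketch.

```cpp
#include <cstdint>

// Sign-extends the low 8 bits of x without a cast to a narrow type.
// Flipping bit 7 and then subtracting 128 maps 0x00..0x7F to 0..127 and
// 0x80..0xFF to -128..-1.
constexpr std::int32_t signExtend8(std::uint32_t x) {
  return static_cast<std::int32_t>((x & 0xFFu) ^ 0x80u) - 0x80;
}
```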
Chris Lattner
175463a165 Simplify this code a bit by relying on recursive simplification. Support
sprintf("%s", P)'s that have uses.

s/hasNUses(0)/use_empty()/

llvm-svn: 23425
2005-09-24 22:17:06 +00:00
Chris Lattner
499e33646e remove some debugging code
llvm-svn: 23411
2005-09-23 18:49:09 +00:00
Chris Lattner
c59a371d45 Fold two consecutive branches that share a common destination between them.
This implements SimplifyCFG/branch-fold.ll, and is useful on ?:/min/max heavy
code

llvm-svn: 23410
2005-09-23 18:47:20 +00:00
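At the source level, the branch folding in c59a371d45 corresponds roughly to the rewrite below. This is a sketch with made-up function names, assuming the second condition is cheap and has no side effects, so evaluating it unconditionally is safe.

```cpp
// Before: two conditional branches in a row that jump to the same place.
constexpr int clampCheckTwoBranches(int x, int lo, int hi) {
  if (x < lo) return -1;  // branch 1 -> "out of range"
  if (x > hi) return -1;  // branch 2 -> same destination
  return x;
}

// After: one branch on the combined condition.
constexpr int clampCheckOneBranch(int x, int lo, int hi) {
  const bool outOfRange = (x < lo) | (x > hi);  // non-short-circuit "or"
  return outOfRange ? -1 : x;
}
```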
Chris Lattner
3a978bf66d simplify some logic further
llvm-svn: 23408
2005-09-23 07:23:18 +00:00
Chris Lattner
cc14ebc17b pull a bunch of logic out of SimplifyCFG into a helper fn
llvm-svn: 23407
2005-09-23 06:39:30 +00:00
Chris Lattner
6c70106053 Start threading across blocks with code in them, so long as the code does
not define a value that is used outside of its block.  This catches many
more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc.

This implements branch-phi-thread.ll:test3

llvm-svn: 23397
2005-09-20 01:48:40 +00:00
Chris Lattner
f0bd8d0107 Implement merging of blocks with the same condition if the block has multiple
predecessors.  This implements branch-phi-thread.ll:test1

llvm-svn: 23395
2005-09-20 00:43:16 +00:00
Chris Lattner
049cb4482f Reject a case we don't handle yet
llvm-svn: 23393
2005-09-19 23:57:04 +00:00
Chris Lattner
a160924d57 remove debugging code :-/
llvm-svn: 23392
2005-09-19 23:50:15 +00:00
Chris Lattner
748f903046 Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading
control across branches with determined outcomes.  More generality to follow.
This triggers a couple thousand times in specint.

llvm-svn: 23391
2005-09-19 23:49:37 +00:00
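The "threading control across branches with determined outcomes" in 748f903046 can be pictured with the source-level sketch below. The function names are invented for illustration. On the path where `x > 0`, the value of `flag` is already known when the second test is reached, so that path can jump directly to its final destination.

```cpp
// Before: `flag` merges a constant with a computed value (a PHI node in SSA
// form) and is then branched on again.
constexpr int beforeThreading(int x, int y) {
  bool flag = false;
  if (x > 0) {
    flag = true;  // the outcome of the test below is determined on this path
  } else {
    flag = (y > 0);
  }
  if (flag) return 1;
  return 2;
}

// After: the path with the known outcome skips the second test entirely.
constexpr int afterThreading(int x, int y) {
  if (x > 0) return 1;  // threaded straight to the "true" destination
  if (y > 0) return 1;
  return 2;
}
```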
Chris Lattner
b4b2530a1a Refactor this code a bit and make it more general. This now compiles:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) { b.j += x; }

To:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        slwi r3, r3, 6
        add r3, r4, r3
        rlwimi r3, r4, 0, 26, 14
        stw r3, 0(r2)
        blr


instead of:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 26, 21, 31
        add r3, r5, r3
        rlwimi r4, r3, 6, 15, 25
        stw r4, 0(r2)
        blr

by eliminating an 'and'.

I'm pretty sure this is as small as we can go :)

llvm-svn: 23386
2005-09-18 07:22:02 +00:00
Chris Lattner
797dee7705 Compile
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) {
  b.j += x;
}

to:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        and %ECX, 131008
        mov %EDX, DWORD PTR [%ESP + 4]
        shl %EDX, 6
        add %EDX, %ECX
        and %EDX, 131008
        and %EAX, -131009
        or %EDX, %EAX
        mov DWORD PTR [b], %EDX
        ret

instead of:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        shr %ECX, 6
        and %ECX, 2047
        add %ECX, DWORD PTR [%ESP + 4]
        shl %ECX, 6
        and %ECX, 131008
        and %EAX, -131009
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23385
2005-09-18 06:30:59 +00:00
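The two x86 sequences in 797dee7705 compute the same word. The sketch below spells both out on a plain 32-bit integer, assuming the layout implied by the constants in the listing (`j` occupies bits 6 through 16, mask 131008); the function names are made up for this example.

```cpp
#include <cstdint>

constexpr std::uint32_t kJShift = 6;
constexpr std::uint32_t kJMask = 131008u;  // 2047 << 6

// Old sequence: shift j down, add, shift back up, mask, merge.
constexpr std::uint32_t addToJExtract(std::uint32_t word, std::uint32_t x) {
  return (((((word >> kJShift) & 2047u) + x) << kJShift) & kJMask) |
         (word & ~kJMask);
}

// New sequence: shift the addend up instead and add in place.
constexpr std::uint32_t addToJInPlace(std::uint32_t word, std::uint32_t x) {
  return (((word & kJMask) + (x << kJShift)) & kJMask) | (word & ~kJMask);
}
```

Both forms agree because shifting left distributes over addition modulo 2^32, so moving the shift onto `x` removes the shift-right of the loaded word.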
Chris Lattner
01f56c68e9 Generalize this transform, using MaskedValueIsZero, allowing us to compile:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) { b.k += x; }

To:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        add DWORD PTR [b], %EAX
        ret

instead of:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        mov %ECX, DWORD PTR [b]
        add %EAX, %ECX
        and %EAX, -131072
        and %ECX, 131071
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23384
2005-09-18 06:02:59 +00:00
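Why the masks in 01f56c68e9 can be dropped: `x << 17` is known to have its low 17 bits clear, so adding it to the loaded word cannot change those bits, and there are no bits above `k` that need protecting from a carry. A small sketch with hypothetical names, using the constants from the listing:

```cpp
#include <cstdint>

constexpr std::uint32_t kLowMask = 131071u;  // bits 0..16, everything below k

// Old sequence: add, keep only k from the sum, then merge the untouched low bits.
constexpr std::uint32_t addToKMasked(std::uint32_t word, std::uint32_t x) {
  return (((x << 17) + word) & ~kLowMask) | (word & kLowMask);
}

// New sequence: a single add, since the low 17 bits of (x << 17) are zero.
constexpr std::uint32_t addToKDirect(std::uint32_t word, std::uint32_t x) {
  return word + (x << 17);
}
```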
Chris Lattner
4ebc8ab4e0 fix typo
llvm-svn: 23383
2005-09-18 05:25:20 +00:00
Chris Lattner
e5b23a6d67 Remove unintentionally committed code
llvm-svn: 23382
2005-09-18 05:12:51 +00:00
Chris Lattner
27cb9dbd35 implement shift.ll:test25. This compiles:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) {
  b.k += x;
}

to:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r3, 0(r2)
        rlwinm r4, r3, 0, 0, 14
        add r4, r4, r3
        rlwimi r4, r3, 0, 15, 31
        stw r4, 0(r2)
        blr

instead of:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        srwi r5, r4, 17
        add r3, r5, r3
        slwi r3, r3, 17
        rlwimi r3, r4, 0, 15, 31
        stw r3, 0(r2)
        blr

llvm-svn: 23381
2005-09-18 05:12:10 +00:00
Chris Lattner
af517574ce Implement add.ll:test29. Codegening:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus1 (unsigned int x) {
  b.i += x;
}

as:
_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        add r3, r4, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

instead of:

_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 0, 26, 31
        add r3, r5, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

llvm-svn: 23379
2005-09-18 04:24:45 +00:00
Chris Lattner
027eaf01cf remove debug output
llvm-svn: 23377
2005-09-18 03:50:25 +00:00
Chris Lattner
1521298993 Implement or.ll:test21. This teaches instcombine to be able to turn this:
struct {
   unsigned int bit0:1;
   unsigned int ubyte:31;
} sdata;

void foo() {
  sdata.ubyte++;
}

into this:

foo:
        add DWORD PTR [sdata], 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [sdata]
        mov %ECX, %EAX
        add %ECX, 2
        and %ECX, -2
        and %EAX, 1
        or %EAX, %ECX
        mov DWORD PTR [sdata], %EAX
        ret

llvm-svn: 23376
2005-09-18 03:42:07 +00:00
Chris Lattner
a393e4d4b3 Fix the regression last night compiling povray
llvm-svn: 23348
2005-09-14 17:32:56 +00:00
Chris Lattner
2a8932960d Add a simple xform to simplify array accesses with casts in the way.
This is useful for 178.galgel where resolution of dope vectors (by the
optimizer) causes the scales to become apparent.

llvm-svn: 23328
2005-09-13 18:36:04 +00:00
Chris Lattner
fd018c8dfe Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI.
This fixes up a dot-product loop in galgel, speeding it up from 18.47s to
16.13s.

llvm-svn: 23327
2005-09-13 02:09:55 +00:00
Chris Lattner
567b81f0d2 Add a helper function, allowing us to simplify some code a bit, changing
indentation, no functionality change

llvm-svn: 23325
2005-09-13 00:40:14 +00:00
Chris Lattner
219175c84d Implement a simple xform to turn code like this:
if () { store A -> P; } else { store B -> P; }

into a PHI node with one store, in the most trivial case.  This implements
load.ll:test10.

llvm-svn: 23324
2005-09-12 23:23:25 +00:00
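In source terms, the store merging in 219175c84d looks roughly like the following. The function names are invented, and the ternary stands in for the PHI node that the transform creates.

```cpp
// Before: each arm of the branch stores to the same address.
constexpr void storeInEachArm(bool cond, int a, int b, int *p) {
  if (cond) {
    *p = a;
  } else {
    *p = b;
  }
}

// After: the arms only select a value; a single store follows the merge point.
constexpr void storeMergedValue(bool cond, int a, int b, int *p) {
  const int merged = cond ? a : b;  // plays the role of the PHI node
  *p = merged;
}

// Small drivers so both forms can be compared on a local slot.
constexpr int runEachArm(bool cond, int a, int b) {
  int slot = 0;
  storeInEachArm(cond, a, b, &slot);
  return slot;
}

constexpr int runMerged(bool cond, int a, int b) {
  int slot = 0;
  storeMergedValue(cond, a, b, &slot);
  return slot;
}
```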
Chris Lattner
e0bfdf1485 Another load-peephole optimization: do gcse when two loads are next to
each other.  This implements InstCombine/load.ll:test9

llvm-svn: 23322
2005-09-12 22:21:03 +00:00
Chris Lattner
b990f7d8ed Implement a trivial form of store->load forwarding where the store and the
load are exactly consecutive.  This is picked up by other passes, but this
triggers thousands of times in fortran programs that use static locals
(and is thus a compile-time speedup).

llvm-svn: 23320
2005-09-12 22:00:15 +00:00
Chris Lattner
8048b85e8f Fix a regression from last night, which caused this pass to create invalid
code for IV uses outside of loops that are not dominated by the latch block.
We should only convert these uses to use the post-inc value if they ARE
dominated by the latch block.

Also use a new LoopInfo method to simplify some code.

This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll

llvm-svn: 23318
2005-09-12 17:11:27 +00:00
Chris Lattner
a67648396a Uses of IV's outside of the loop should use the post-incremented version
of the IV, not the preincremented version.  This helps many loops (e.g. in sixtrack)
which used to generate code like this (this is the code from the
dont-hoist-simple-loop-constants.ll testcase):

_test:
        li r2, 0                 **** IV starts at 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2            **** Copy for loop exit
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2           **** IV+2
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2       ****  IV+2
        stw r2, 0(r4)
        blr

And now generate code like this:

_test:
        li r2, 1               *** IV starts at 1
LBB_test_1:     ; no_exit.2
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmpwi cr0, r2, 701     *** IV.postinc + 0
        blt cr0, LBB_test_1
LBB_test_2:     ; loopexit.2.loopexit
        stw r2, 0(r4)          *** IV.postinc + 0
        blr

llvm-svn: 23313
2005-09-12 06:04:47 +00:00
Chris Lattner
530fe6ab30 implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll.
We used to emit this code for it:

_test:
        li r2, 1     ;; Value tying up a register for the whole loop
        li r5, 0
LBB_test_1:     ; no_exit.2
        or r6, r5, r5
        li r5, 0
        stw r5, 0(r3)
        addi r5, r6, 1
        addi r3, r3, 4
        add r7, r2, r5  ;; should be addi r7, r5, 1
        cmpwi cr0, r7, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r6, 2
        stw r2, 0(r4)
        blr

now we emit this:

_test:
        li r2, 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2   ;; whoa, fold those adds!
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2
        stw r2, 0(r4)
        blr

more improvement coming.

llvm-svn: 23306
2005-09-10 01:18:45 +00:00
Chris Lattner
b5e381a8cf Fix a problem that Dan Berlin noticed, where reassociation would not succeed
in building maximal expressions before simplifying them.  In particular, in
cases like this:

X-(A+B+X)

the code would consider A+B+X to be a maximal expression (not understanding
that the single use '-' would be turned into a + later), simplify it (a noop)
then later get simplified again.

Each of these simplify steps is where the cost of reassociation comes from,
so this patch should speed up the already fast pass a bit.

Thanks to Dan for noticing this!

llvm-svn: 23214
2005-09-02 07:07:58 +00:00
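The algebra behind the `X-(A+B+X)` case in b5e381a8cf can be checked directly. Once the subtraction is viewed as adding a negation, the whole thing is one expression, `X + -A + -B + -X`, in which `X` cancels. The sketch below uses unsigned arithmetic so that wraparound is well defined; the function names are made up.

```cpp
#include <cstdint>

// The expression as written in the commit message.
constexpr std::uint32_t originalExpr(std::uint32_t x, std::uint32_t a,
                                     std::uint32_t b) {
  return x - (a + b + x);
}

// What remains after treating it as X + -A + -B + -X and cancelling X.
constexpr std::uint32_t reassociatedExpr(std::uint32_t a, std::uint32_t b) {
  return (0u - a) + (0u - b);
}
```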
Chris Lattner
9fe263aa75 Avoid creating garbage instructions, just move the old add instruction
to where we need it when converting -(A+B+C) -> -A + -B + -C.

llvm-svn: 23213
2005-09-02 06:38:04 +00:00
Chris Lattner
d1325da091 add some assertions and fix problems where reassociate could access the
Ops vector out of range

llvm-svn: 23211
2005-09-02 05:23:22 +00:00
Chris Lattner
8ca5b2a6d2 Fix Regression/Transforms/Reassociate/2005-08-24-Crash.ll
llvm-svn: 23019
2005-08-24 17:55:32 +00:00
Chris Lattner
4201cd1bbc Transform floor((double)FLT) -> (double)floorf(FLT), implementing
Regression/Transforms/SimplifyLibCalls/floor.ll.  This triggers 19 times in
177.mesa.

llvm-svn: 23017
2005-08-24 17:22:17 +00:00