init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC. While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function. To get around this, llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function. Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately, this breaks Go, which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!). Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC. Patch mostly by Sanjoy Das.
llvm-svn: 139140
to be unreliable on platforms which require memcpy calls, and it is
complicating broader legalize cleanups. It is hoped that these cleanups
will make memcpy byval easier to implement in the future.
llvm-svn: 138977
I don't really like the patterns, but I'm having trouble coming up with a
better way to handle them.
I plan on making other targets use the same legalization that
ARM-without-memory-barriers is using. It's not especially efficient, but
if anyone cares, it's not that hard to fix for a given target if there's
some better lowering.
llvm-svn: 138621
types (with power-of-two types such as 8, 16, 32 ... 512).
Fix a bug in the integer promotion of bitcast nodes. Enable integer expanding
only if the target of the conversion is an integer (when the type action is
scalarize).
Add handling to the legalization of vector load/store in cases where the saved
vector is integer-promoted.
llvm-svn: 132985
by non-CMP expressions. The executable test case (129821) would test
this as well, if we had an "-O0 -disable-arm-fast-isel" LLVM-GCC
tester. Alas, the ARM assembly would be very difficult to check with
FileCheck.
The thumb2-cbnz.ll test is affected; it generates larger code (tst.w
vs. cmp #0), but I believe the new version is correct.
rdar://problem/9298790
llvm-svn: 131261
manually and pass all (now) 4 arguments to the mul libcall. Add a new
ExpandLibCall for just this (copied gratuitously from type legalization).
Fixes rdar://9292577
llvm-svn: 129842
default implementation for x86, going through the stack in a similar
fashion to how the codegen implements BUILD_VECTOR. Eventually this
will get matched to VINSERTF128 if AVX is available.
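As a rough illustration of "going through the stack", the sketch below
spills a vector to a temporary slot, overwrites part of it, and reloads
the whole thing. It assumes the operation being lowered inserts a smaller
subvector into a wider vector, which is what the VINSERTF128 mention
suggests. The helper name insertSubvectorViaStack and the use of
std::array are hypothetical and are not taken from the LLVM sources.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Hypothetical model of a stack-based lowering: store the wide vector to a
// temporary slot, overwrite the chosen elements, then load the result back.
// Precondition (not checked here): idx + M <= N.
template <typename T, std::size_t N, std::size_t M>
std::array<T, N> insertSubvectorViaStack(const std::array<T, N> &vec,
                                         const std::array<T, M> &sub,
                                         std::size_t idx) {
  static_assert(M <= N, "subvector must not be wider than the vector");
  T slot[N];                                          // the "stack slot"
  std::memcpy(slot, vec.data(), sizeof(slot));        // store whole vector
  std::memcpy(slot + idx, sub.data(), sizeof(T) * M); // store subvector
  std::array<T, N> out;
  std::memcpy(out.data(), slot, sizeof(slot));        // reload whole vector
  return out;
}
```

A single insert instruction would replace the store/store/load sequence
once such a pattern is matched directly.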
llvm-svn: 124307
with an invalid type then split the result and perform the overflow check
normally.
Fixes the 32-bit parts of rdar://8622122 and rdar://8774702.
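For readers unfamiliar with the idea, here is a minimal sketch of
"split the result and perform the overflow check". It assumes the
truncated text above refers to a multiply-with-overflow whose product is
computed at twice the operand width and then split into high and low
halves. The names MulWithOverflow, umulo32 and smulo32 are hypothetical;
this models the arithmetic only, not the legalizer code.

```cpp
#include <cstdint>

// Hypothetical result type: low 32 bits of the product plus overflow flag.
struct MulWithOverflow {
  uint32_t lowBits;
  bool overflow;
};

// Unsigned: the product fits in 32 bits iff the high half is zero.
inline MulWithOverflow umulo32(uint32_t a, uint32_t b) {
  uint64_t wide = static_cast<uint64_t>(a) * b;
  uint32_t lo = static_cast<uint32_t>(wide);
  uint32_t hi = static_cast<uint32_t>(wide >> 32);
  return {lo, hi != 0};
}

// Signed: the product fits iff the high half equals the sign extension of
// the low half (all ones if the low half's sign bit is set, else zero).
inline MulWithOverflow smulo32(int32_t a, int32_t b) {
  uint64_t wide = static_cast<uint64_t>(static_cast<int64_t>(a) * b);
  uint32_t lo = static_cast<uint32_t>(wide);
  uint32_t hi = static_cast<uint32_t>(wide >> 32);
  uint32_t signFill = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  return {lo, hi != signFill};
}
```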
llvm-svn: 123864
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
In a silly microbenchmark on a 65 nm Core 2 this is 1.5x faster than the old
code in 32-bit mode and about 2x faster in 64-bit mode. It's also a lot shorter,
especially when counting 64-bit population on a 32-bit target.
I hope this is fast enough to replace Kernighan-style counting loops even when
the input is rather sparse.
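For reference, the parallel bit count from the linked page can be written
in C++ roughly as below, next to a Kernighan-style loop for comparison.
This is a sketch of the technique, not the code the legalizer emits, and
the function names popcount32, popcount64 and popcountLoop are
hypothetical.

```cpp
#include <cstdint>

// Parallel (SWAR) population count: sum bits in 2-bit, 4-bit and 8-bit
// fields, then add the byte sums together with one multiply.
inline uint32_t popcount32(uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);                 // 2-bit sums
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); // 4-bit sums
  v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 // 8-bit sums
  return (v * 0x01010101u) >> 24;                   // total in top byte
}

inline uint32_t popcount64(uint64_t v) {
  v = v - ((v >> 1) & 0x5555555555555555ull);
  v = (v & 0x3333333333333333ull) + ((v >> 2) & 0x3333333333333333ull);
  v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0Full;
  return static_cast<uint32_t>((v * 0x0101010101010101ull) >> 56);
}

// Kernighan-style loop: clears the lowest set bit each iteration, so it
// runs once per set bit and is cheap only for sparse inputs.
inline uint32_t popcountLoop(uint64_t v) {
  uint32_t n = 0;
  for (; v != 0; v &= v - 1)
    ++n;
  return n;
}
```

The parallel version does a fixed amount of work regardless of the input,
whereas the loop's cost grows with the number of set bits.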
llvm-svn: 123547