16013 Commits

Author SHA1 Message Date
Hal Finkel
cd9569c19e Update IR when merging slots in stack coloring
The way that stack coloring updated MMOs when merging stack slots, while
correct, is suboptimal, and is incompatible with the use of AA during
instruction scheduling. The solution, which involves the use of const_cast (and
more importantly, updating the IR from within an MI-level pass), obviously
requires some explanation:

When the stack coloring pass was originally committed, the code in
ScheduleDAGInstrs::buildSchedGraph tracked possible alias sets by using
GetUnderlyingObject, and all load/store and store/store memory control
dependencies were added between SUs at the object level (where only one
object, that returned by GetUnderlyingObject, was used to identify the object
associated with each MMO). When stack coloring merged stack slots, it would
replace MMOs derived from the remapped alloca with the alloca with which the
remapped alloca was being replaced. Because ScheduleDAGInstrs only used single
objects, and tracked alias sets at the object level, this was a fine solution.

In r169744, (Andy and) I updated the code in ScheduleDAGInstrs to use
GetUnderlyingObjects, and track alias sets using, potentially, multiple
underlying objects for each MMO. This was done, primarily, to provide the
ability to look through PHIs, and provide better scheduling for
induction-variable-dependent loads and stores inside loops. At this point, the
MMO-updating code in stack coloring became suboptimal, because it would clear
the MMOs for (i.e. completely pessimize) all instructions for which r169744
might help in scheduling. Updating the IR directly is the simplest fix for this
(and the one with, by far, the least compile-time impact), but others are
possible (we could give each MMO a small vector of potential values, or make
use of a remapping table, constructed from MFI, inside ScheduleDAGInstrs).

Unfortunately, replacing all MMO values derived from the remapped alloca with
the base replacement alloca fundamentally breaks our ability to use AA during
instruction scheduling (which is critical to performance on some targets). The
reason is that the original MMO might have had an offset (either constant or
dynamic) from the base remapped alloca, and that offset is not present in the
updated MMO. One possible way around this would be to use
GetPointerBaseWithConstantOffset, and update not only the MMO's value, but also
its offset based on the original offset. Unfortunately, this solution would
only handle constant offsets, and for safety (because AA is not completely
restricted to deducing relationships with constant offsets), we would need to
clear all MMOs without constant offsets over the entire function. This would be
an even worse pessimization than the current single-object restriction. Any
other solution would involve passing around a vector of remapped allocas, and
teaching AA to use it, introducing additional complexity and overhead into AA.

Instead, when remapping an alloca, we replace all IR uses of that alloca as
well (optionally inserting a bitcast as necessary). This is even more efficient
than the old MMO-updating code in the stack coloring pass (because it removes
the need to call GetUnderlyingObject on all MMO values), removes the
single-object pessimization in the default configuration, and enables the
correct use of AA during instruction scheduling (all without any additional
overhead).

LLVM now no longer miscompiles itself on x86_64 when using -enable-misched
-enable-aa-sched-mi -misched-bottomup=0 -misched-topdown=0 -misched=shuffle!
Fixed PR18497.

Because the alloca replacement is now done at the IR level, unless the MMO
directly refers to the remapped alloca, the change cannot be seen at the MI
level. As a result, there is no good way to fix test/CodeGen/X86/pr14090.ll.

llvm-svn: 199658
2014-01-20 14:03:16 +00:00
Hal Finkel
a228a8187b Track multiple stores per object when using AA in ScheduleDAGInstrs
When using AA to break false chain dependencies, we need to track multiple
stores per object in ScheduleDAGInstrs. Historically, we tracked potential alias
chains at the object level, and so all loads of an object would retain
dependencies on any store to that object. With AA, however, this is not
sufficient: non-overlapping stores and loads to the same object all need to be
tested for dependencies separately; we cannot test all loads to an object
against only the last store (see PR18497 for an explicit example).

To mitigate any unwelcome compile-time impact when not using AA, only one store
is kept in the list per object when not using AA.
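
A minimal sketch of the bookkeeping described above, not the code from the
patch. StoreTracker, recordStore and candidates are hypothetical names, and
objects and stores are reduced to plain integers:

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for the per-object store lists kept while building
// the scheduling graph. Objects and stores are plain ints in this sketch.
struct StoreTracker {
  bool useAA;
  std::map<int, std::vector<int>> storesByObject;

  explicit StoreTracker(bool aa) : useAA(aa) {}

  void recordStore(int object, int store) {
    std::vector<int> &stores = storesByObject[object];
    // Without AA only the most recent store matters, so drop older ones
    // to keep the list (and compile time) small.
    if (!useAA)
      stores.clear();
    stores.push_back(store);
  }

  // Stores that a later load of `object` has to be tested against.
  const std::vector<int> &candidates(int object) const {
    auto it = storesByObject.find(object);
    assert(it != storesByObject.end() && "no store recorded for object");
    return it->second;
  }
};
```

With AA enabled every earlier store to the object stays a candidate; without
AA the list never grows beyond one entry.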

This, along with a stack coloring change to come shortly, will provide a test
case, fix PR18497 (and allow LLVM to compile itself using -enable-aa-sched-mi
on x86-64).

llvm-svn: 199657
2014-01-20 14:03:02 +00:00
Chandler Carruth
b587ab679f Fix a DenseMap iterator invalidation bug causing lots of crashes when
type units were enabled. The crux of the issue is that the
addDwarfTypeUnitType routine can end up being indirectly recursive. In
this case, the reference into the dense map (TU) became invalid by the
time we popped all the way back and used it to add the DIE type
signature.

Instead, use early return in the case where we can bypass the recursive
step and the creation of a type unit. Then use the pointer to the new type unit
to set up the DIE type signature in the case where we have to.
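
A minimal sketch of the hazard, not the code from the patch. It uses a
std::vector as a stand-in for a container whose storage moves when it grows
(DenseMap rehashing has the same effect on references); UnitTable, Unit and
addUnit are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Unit {
  int signature = 0;
};

// Hypothetical table whose storage may move whenever it grows.
struct UnitTable {
  std::vector<Unit> units;

  // Adds a unit; creating it may recursively add the units it depends on.
  std::size_t addUnit(int depth) {
    units.emplace_back();
    const std::size_t index = units.size() - 1;

    // Hazard: a `Unit &u = units[index];` taken here would dangle as soon
    // as the recursive call below grows the vector and moves its storage.
    int childSignature = 0;
    if (depth > 0) {
      const std::size_t child = addUnit(depth - 1);
      childSignature = units[child].signature;
    }

    // Safe: look the slot up again once the recursion has finished.
    assert(index < units.size());
    units[index].signature = childSignature + 1;
    return index;
  }
};
```

The sketch sidesteps the problem by re-indexing after the recursion, whereas
the change described here returns early and uses the pointer to the new type
unit; both avoid holding a reference into the table across the recursive step.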

I tried really hard to reduce a testcase for this, but it's really
annoying. You have to get this to be mid-recursion when the densemap
grows. Even if we got a test case for this today, it'd be very unlikely
to continue exercising this pattern.

llvm-svn: 199630
2014-01-20 08:07:07 +00:00
Adrian Prantl
ef129fbb41 Debug info (LTO): Move the creation of accessibility flags to
getOrCreateSubprogramDIE to avoid attributes being added twice when DIEs
are merged.

rdar://problem/15842330.

llvm-svn: 199536
2014-01-18 02:12:00 +00:00
Rafael Espindola
0b694814a8 Add an emitRawComment function and use it to simplify some uses of EmitRawText.
llvm-svn: 199397
2014-01-16 16:28:37 +00:00
Tim Northover
3657cb0350 ReMat: fix overly cavalier attitude to sub-register indices
There are two attempted optimisations in reMaterializeTrivialDef, trying to
avoid promoting the size of a register too much when rematerializing.
Unfortunately, both appear to be flawed. First, we see if the original register
would have worked, but this is inadequate. Consider:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0 = COPY v1:Q1 (v1, v2 are QQ)
    ...
    uses of v2

In this case even though v2 *could* be used directly as the output of
SOMETHING, this would set the wrong bits of the QQ register involved. The
correct rematerialization must be:

    v2:Q0_Q1 = SOMETHING (v2 promoted to QQQ)
    ...
    uses of v2:Q1_Q2

For the second optimisation, if the correct remat is "v2:idx = SOMETHING" then
we can't necessarily expect v2 itself to be valid for SOMETHING, but we do try
to hunt for a class between v1 and v2 that works. Unfortunately, this is also
wrong:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0_Q1 = COPY v1 (v1 is QQ, v2 is QQQ)
    ...
    uses of v2 as a QQQ

The canonical rematerialization here is "v2:Q0_Q1 = SOMETHING". However current
logic would decide that v2 could be a QQ (no interest is taken in later uses).

This patch, therefore, always accepts the widened register class without trying
to be clever. Generally there is no penalty to this (e.g. in the common GR32 <
GR64 case, expanding the width doesn't matter because it's not like you were
going to do anything else with the high bits of a GR32 register). It can
increase register pressure in cases like the ARM VFP regs though (multiple
non-overlapping but equivalent subregisters). This situation can be
spotted by the fact that both source and destination in the
not-quite-coalesced pair have a sub-register index and
rematerialisation is skipped in that situation.

Unfortunately, no in-tree targets actually expose this as far as I can tell
(there are so few isAsCheapAsAMove instructions for it to trigger on) so I've
been unable to produce a test. It was exposed in our ARM64 SPEC tests though,
and I will be adding a test there that we should be able to contribute
soon(TM).

rdar://problem/15775279

llvm-svn: 199376
2014-01-16 12:29:55 +00:00
Rafael Espindola
74c3e63193 Use a slightly smaller hack.
llvm-svn: 199363
2014-01-16 07:36:00 +00:00
Andrea Di Biagio
d7c03ec348 [DAGCombiner] Fix a wrong check in method SimplifyVBinOp.
This fixes a regression introduced by r199135.

Revision 199135 tried to simplify part of the logic in method
DAGCombiner::SimplifyVBinOp introducing calls to method BuildVectorSDNode::isConstant().

However, that revision wrongly changed the check performed by method
SimplifyVBinOp to identify dag nodes that can be folded.
Before revision 199135, that method only tried to simplify vector binary operations
if both operands were build_vector of Constant/ConstantFP/Undef only.

After revision 199135, method SimplifyVBinOp also tried to
simplify vector binary operations with only one constant operand.

This fixes the problem by restoring the old behavior of SimplifyVBinOp.

llvm-svn: 199328
2014-01-15 19:51:32 +00:00
David Majnemer
dee105772c WinCOFF: Transform IR expressions featuring __ImageBase into image relative relocations
MSVC on x64 requires that we create image relative symbol
references to refer to RTTI data. Seeing as how there is no way to
explicitly make reference to a given relocation type in LLVM IR, pattern
match expressions of the form &foo - &__ImageBase.

Differential Revision: http://llvm-reviews.chandlerc.com/D2523

llvm-svn: 199312
2014-01-15 09:16:42 +00:00
Eric Christopher
1ad8457570 Make sure we emit a relocation to the debug_ranges section in the
presence of CU ranges.

llvm-svn: 199276
2014-01-15 00:04:29 +00:00
Eric Christopher
39cde8cc90 Enable use of ranges for translation units in the presence of
-ffunction-sections and update comments and TODOs about other
places that we should enable this.

llvm-svn: 199263
2014-01-14 22:44:17 +00:00
Nico Rieck
7157bb765e Decouple dllexport/dllimport from linkage
Representing dllexport/dllimport as distinct linkage types prevents using
these attributes on templates and inline functions.

Instead of introducing further mixed linkage types to include linkonce and
weak ODR, the old import/export linkage types are replaced with a new
separate visibility-like specifier:

  define available_externally dllimport void @f() {}
  @Var = dllexport global i32 1, align 4

Linkage for dllexported globals and functions is now equal to their linkage
without dllexport. Imported globals and functions must be either
declarations with external linkage, or definitions with
AvailableExternallyLinkage.

llvm-svn: 199218
2014-01-14 15:22:47 +00:00
Patrik Hagglund
682a10d4cc Fix valgrind warning for gcc builds.
Sorry, I don't understand why the warning is generated (a gcc
bug?). Anyhow, the change should improve readability. No functionality
change intended.

llvm-svn: 199214
2014-01-14 14:09:00 +00:00
Nico Rieck
9d2e0df049 Revert "Decouple dllexport/dllimport from linkage"
Revert this for now until I fix an issue in Clang with it.

This reverts commit r199204.

llvm-svn: 199207
2014-01-14 12:38:32 +00:00
Nico Rieck
e43aaf7967 Decouple dllexport/dllimport from linkage
Representing dllexport/dllimport as distinct linkage types prevents using
these attributes on templates and inline functions.

Instead of introducing further mixed linkage types to include linkonce and
weak ODR, the old import/export linkage types are replaced with a new
separate visibility-like specifier:

  define available_externally dllimport void @f() {}
  @Var = dllexport global i32 1, align 4

Linkage for dllexported globals and functions is now equal to their linkage
without dllexport. Imported globals and functions must be either
declarations with external linkage, or definitions with
AvailableExternallyLinkage.

llvm-svn: 199204
2014-01-14 11:55:03 +00:00
Jakob Stoklund Olesen
b6b35a4955 Always let value types influence register classes.
When creating a virtual register for a def, the value type should be
used to pick the register class. If we only use the register class
constraint on the instruction, we might pick a too large register class.

Some registers can store values of different sizes. For example, the x86
xmm registers can hold f32, f64, and 128-bit vectors. The three
different value sizes are represented by register classes with identical
register sets: FR32, FR64, and VR128. These register classes have
different spill slot sizes, so it is important to use the right one.

The register class constraint on an instruction doesn't necessarily care
about the size of the value it's defining. The value type determines
that.

This fixes a problem where InstrEmitter was picking 32-bit register
classes for 64-bit values on SPARC.

llvm-svn: 199187
2014-01-14 06:18:38 +00:00
Rafael Espindola
4a1a360634 Make getTargetStreamer return a possibly null pointer.
This will allow it to be called from target independent parts of the main
streamer that don't know if there is a registered target streamer or not. This
in turn will allow targets to perform extra actions at specified points in the
interface: add extra flags for some labels, extra work during finalization, etc.

llvm-svn: 199174
2014-01-14 01:21:46 +00:00
Juergen Ributzka
6840282c99 [DAG] Refactor ReassociateOps - no functional change intended.
llvm-svn: 199146
2014-01-13 21:49:25 +00:00
Juergen Ributzka
7384405f23 [DAG] Teach DAG to also reassociate vector operations
This commit teaches DAG to reassociate vector ops, which in turn enables
constant folding of vector op chains that appear later on during custom lowering
and DAG combine.

Reviewed by Andrea Di Biagio

llvm-svn: 199135
2014-01-13 20:51:35 +00:00
Andrew Trick
7daf6a45f4 Hide the pre-RA-sched= option.
This is a very confusing option for a feature that will go away.

-enable-misched is exposed instead to help triage issues with the new
scheduler.

llvm-svn: 199133
2014-01-13 20:08:27 +00:00
Chandler Carruth
73523021d0 [PM] Split DominatorTree into a concrete analysis result object which
can be used by both the new pass manager and the old.

This removes it from any of the virtual mess of the pass interfaces and
lets it derive cleanly from the DominatorTreeBase<> template. In turn,
tons of boilerplate interface can be nuked and it turns into a very
straightforward extension of the base DominatorTree interface.

The old analysis pass is now a simple wrapper. The names and style of
this split should match the split between CallGraph and
CallGraphWrapperPass. All of the users of DominatorTree have been
updated to match using many of the same tricks as with CallGraph. The
goal is that the common type remains the resulting DominatorTree rather
than the pass. This will make subsequent work toward the new pass
manager significantly easier.

Also in numerous places things became cleaner because I switched from
re-running the pass (!!! mid way through some other pass's run!!!) to
directly recomputing the domtree.

llvm-svn: 199104
2014-01-13 13:07:17 +00:00
Chandler Carruth
e509db410a [PM] Pull the generic graph algorithms and data structures for dominator
trees into the Support library.

These are all expressed in terms of the generic GraphTraits and CFG,
with no reliance on any concrete IR types. Putting them in support
clarifies that and makes the fact that the static analyzer in Clang uses
them much more sane. When moving the Dominators.h file into the IR
library I claimed that this was the right home for it but not something
I planned to work on. Oops.

So why am I doing this? It happens to be one step toward breaking the
requirement that IR verification can only be performed from inside of
a pass context, which completely blocks the implementation of
verification for the new pass manager infrastructure. Fixing it will
also allow removing the concept of the "preverify" step (WTF???) and
allow the verifier to cleanly flag functions which fail verification in
a way that precludes even computing dominance information. Currently,
that results in a fatal error even when you ask the verifier to not
fatally error. It's awesome like that.

The yak shaving will continue...

llvm-svn: 199095
2014-01-13 10:52:56 +00:00
Tim Northover
7fdd4857f7 Revert "ReMat: fix overly cavalier attitude to sub-register indices"
Very sorry, this was a premature patch that I still need to investigate and
finish off (for some reason beyond me at the moment it doesn't actually fix the
issue in all cases).

This reverts commit r199091.

llvm-svn: 199093
2014-01-13 10:49:11 +00:00
Tim Northover
59f8d4b4ee ReMat: fix overly cavalier attitude to sub-register indices
There are two attempted optimisations in reMaterializeTrivialDef, trying to
avoid promoting the size of a register too much when rematerializing.
Unfortunately, both appear to be flawed. First, we see if the original register
would have worked, but this is inadequate. Consider:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0 = COPY v1:Q1 (v1, v2 are QQ)
    ...
    uses of v2

In this case even though v2 *could* be used directly as the output of
SOMETHING, this would set the wrong bits of the QQ register involved. The
correct rematerialization must be:

    v2:Q0_Q1 = SOMETHING (v2 promoted to QQQ)
    ...
    uses of v2:Q1_Q2

For the second optimisation, if the correct remat is "v2:idx = SOMETHING" then
we can't necessarily expect v2 itself to be valid for SOMETHING, but we do try
to hunt for a class between v1 and v2 that works. Unfortunately, this is also
wrong:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0_Q1 = COPY v1 (v1 is QQ, v2 is QQQ)
    ...
    uses of v2 as a QQQ

The canonical rematerialization here is "v2:Q0_Q1 = SOMETHING". However current
logic would decide that v2 could be a QQ (no interest is taken in later uses).

This patch, therefore, always accepts the widened register class without trying
to be clever. Generally there is no penalty to this (e.g. in the common GR32 <
GR64 case, expanding the width doesn't matter because it's not like you were
going to do anything else with the high bits of a GR32 register). It can
increase register pressure in cases like the ARM VFP regs though (multiple
non-overlapping but equivalent subregisters). Hopefully this situation is rare
enough that it won't matter.

Unfortunately, no in-tree targets actually expose this as far as I can tell
(there are so few isAsCheapAsAMove instructions for it to trigger on) so I've
been unable to produce a test. It was exposed in our ARM64 SPEC tests though,
and I will be adding a test there that we should be able to contribute
soon(TM).

llvm-svn: 199091
2014-01-13 10:47:01 +00:00
Chandler Carruth
5ad5f15cff [cleanup] Move the Dominators.h and Verifier.h headers into the IR
directory. These passes are already defined in the IR library, and it
doesn't make any sense to have the headers in Analysis.

Long term, I think there is going to be a much better way to divide
these matters. The dominators code should be fully separated into the
abstract graph algorithm and have that put in Support where it becomes
obvious that even Clang's CFGBlocks can use it. Then the verifier can
manually construct dominance information from the Support-driven
interface while the Analysis library can provide a pass which
caches, reconstructs, and supports a nice update API.

But those are very long term, and so I don't want to leave the really
confusing structure until that day arrives.

llvm-svn: 199082
2014-01-13 09:26:24 +00:00
Jakob Stoklund Olesen
1995b9fead Handle bundled terminators in isBlockOnlyReachableByFallthrough.
Targets like SPARC and MIPS have delay slots and normally bundle the
delay slot instruction with the corresponding terminator.

Teach isBlockOnlyReachableByFallthrough to find any MBB operands on
bundled terminators so SPARC doesn't need to specialize this function.

llvm-svn: 199061
2014-01-12 19:24:08 +00:00
Nico Rieck
b5262d6d8f Fix non-deterministic SDNodeOrder-dependent codegen
Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code
generation.

llvm-svn: 199050
2014-01-12 14:09:17 +00:00
Chandler Carruth
9d805139bd [PM] Simplify the interface exposed for IR printing passes.
Nothing was using the ability of the pass to delete the raw_ostream it
printed to, and nothing was trying to pass it a pointer to the
raw_ostream. Also, the function variant had a different order of
arguments from all of the others which was just really confusing. Now
the interface accepts a reference, doesn't offer to delete it, and uses
a consistent order. The implementations of the printing passes haven't
been updated with this simplification; this is just the API switch.

llvm-svn: 199044
2014-01-12 11:30:46 +00:00
Chandler Carruth
b8ddc7043c [PM] Rename the IR printing pass header to a more generic and correct
name to match the source file which I got earlier. Update the include
sites. Also modernize the comments in the header to use the more
recommended doxygen style.

llvm-svn: 199041
2014-01-12 11:10:32 +00:00
Alp Toker
798060e006 Fix 'ned' typo in doc comment
Patch by Jasper Neumann!

llvm-svn: 199007
2014-01-11 14:01:43 +00:00
Eric Christopher
942f22c439 Revert r198979 - accidental commit.
llvm-svn: 198981
2014-01-11 00:28:12 +00:00
Eric Christopher
ceec7b02fa Reformat.
llvm-svn: 198980
2014-01-11 00:23:18 +00:00
Eric Christopher
67cde9ac07 Update function name and add some helpful comments.
llvm-svn: 198979
2014-01-11 00:23:16 +00:00
David Blaikie
15ed5ebfc5 Revert "Revert r198851, "Prototype of skeleton type units for fission""
This reverts commit r198865 which reverts r198851.

ASan identified a use-of-uninitialized of the DwarfTypeUnit::Ty variable
in skeleton type units.

llvm-svn: 198908
2014-01-10 01:38:41 +00:00
NAKAMURA Takumi
c5bf572993 Revert r198851, "Prototype of skeleton type units for fission"
It caused undefined behavior. DwarfTypeUnit::Ty might not be initialized properly, I guess.

llvm-svn: 198865
2014-01-09 13:08:00 +00:00
Richard Sandiford
15cfc1c33c Handle masked rotate amounts
At the moment we expect rotates to have the form:

   (or (shl X, Y), (shr X, Z))

where Y == bitsize(X) - Z or Z == bitsize(X) - Y.  This form means that
the (or ...) is undefined for Y == 0 or Z == 0.  This undefinedness can
be avoided by using Y == (C * bitsize(X) - Z) & (bitsize(X) - 1) or
Z == (C * bitsize(X) - Y) & (bitsize(X) - 1) for any integer C
(including 0, the most natural choice).
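
A minimal sketch of the masked form for a 32-bit value, using C = 0. This is
source-level C++ rather than DAG nodes, and rotl32 is a hypothetical name:

```cpp
#include <cstdint>

// Rotate-left of a 32-bit value with both shift amounts masked into [0, 31].
// The unmasked form (x << y) | (x >> (32 - y)) shifts right by 32 when
// y == 0, which is undefined for a 32-bit operand.
constexpr std::uint32_t rotl32(std::uint32_t x, std::uint32_t y) {
  // Left amount:  y & (bitsize - 1)
  // Right amount: (C * bitsize - y) & (bitsize - 1), here with C = 0.
  return (x << (y & 31u)) | (x >> ((0u - y) & 31u));
}
```

For y == 0 both masked amounts are 0, so the result is simply x and no shift
by the full bit width ever occurs.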

llvm-svn: 198861
2014-01-09 10:56:42 +00:00
Richard Sandiford
0f264db3c6 Match the InstCombine form of rotates by X+C
InstCombine converts (sub 32, (add X, C)) into (sub 32-C, X),
so a rotate left of a 32-bit Y by X+C could appear as either:

   (or (shl Y, (add X, C)), (shr Y, (sub 32, (add X, C))))

without InstCombine or:

   (or (shl Y, (add X, C)), (shr Y, (sub 32-C, X)))

with it.

We already matched the first form.  This patch handles the second too.
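
A small sketch of why the two forms describe the same right-shift amount,
written as plain unsigned arithmetic. The function names are hypothetical:

```cpp
#include <cstdint>

// Right-shift amount as written without InstCombine: 32 - (x + c).
constexpr std::uint32_t shrAmountOriginal(std::uint32_t x, std::uint32_t c) {
  return 32u - (x + c);
}

// The same amount after the InstCombine rewrite: (32 - c) - x.
// Unsigned arithmetic wraps modulo 2^32, so the two always agree.
constexpr std::uint32_t shrAmountCombined(std::uint32_t x, std::uint32_t c) {
  return (32u - c) - x;
}
```

Since both expressions are equal for every x and c, a matcher has to accept
either spelling to recognise the rotate.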

llvm-svn: 198860
2014-01-09 10:49:40 +00:00
David Blaikie
a588365df6 Prototype of skeleton type units for fission
llvm-svn: 198851
2014-01-09 05:08:28 +00:00
David Blaikie
38fe6342f6 DwarfDebug: Refactor out common skeleton construction code to be reused for type unit skeletons.
llvm-svn: 198846
2014-01-09 04:28:46 +00:00
David Blaikie
b334e94492 Reformatting for r198842
llvm-svn: 198843
2014-01-09 03:24:13 +00:00
David Blaikie
f645f963ff DwarfUnit: Rename "Node" to "CUNode" and propagate it through DwarfTypeUnit as well.
Since we'll now also need the split dwarf file name along with the
language in DwarfTypeUnits, just use the whole DICompileUnit rather than
explicitly handling each field needed.

llvm-svn: 198842
2014-01-09 03:23:41 +00:00
David Blaikie
7480ae6e19 Revert "DwarfUnit: Move the DICompileUnit Node to the DwarfCompileUnit only"
This reverts commit r198830.

Decided to go a different way with this...

llvm-svn: 198841
2014-01-09 03:03:27 +00:00
Chandler Carruth
d48cdbf0c3 Put the functionality for printing a value to a raw_ostream as an
operand into the Value interface just like the core print method is.
That gives a more consistent organization to the IR printing interfaces
-- they are all attached to the IR objects themselves. Also, update all
the users.

This removes the 'Writer.h' header which contained only a single function
declaration.

llvm-svn: 198836
2014-01-09 02:29:41 +00:00
David Blaikie
08badfd2ba DwarfUnit: Move the DICompileUnit Node to the DwarfCompileUnit only
It's unused in DwarfTypeUnit, as is expected.

llvm-svn: 198830
2014-01-09 01:20:14 +00:00
Andrew Trick
32e1be7bd0 llvm.experimental.stackmap: fix encoding of large constants.
In the stackmap format we advertise the constant field as signed.
However, we were determining whether to promote to a 64-bit constant
pool based on an unsigned comparison.

This fix allows -1 to be encoded as a small constant.
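
A minimal sketch of the difference between the two checks. The exact form of
the old comparison is not spelled out above, so the high-bits test below is an
assumption, and both function names are hypothetical:

```cpp
#include <cstdint>

// Assumed unsigned-style check: any bit set above the low 32 counts as
// "large". Negative values have all high bits set, so -1 looks large.
constexpr bool needsConstantPoolUnsigned(std::int64_t v) {
  return (static_cast<std::uint64_t>(v) >> 32) != 0;
}

// Signed check: a value is small if it fits in a signed 32-bit field.
constexpr bool needsConstantPoolSigned(std::int64_t v) {
  return v < INT32_MIN || v > INT32_MAX;
}
```

Under the signed check -1 stays a small constant, while values outside the
signed 32-bit range still go to the 64-bit constant pool.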

llvm-svn: 198816
2014-01-09 00:22:31 +00:00
Hal Finkel
2150e3a743 Conservatively handle multiple MMOs in MIsNeedChainEdge
MIsNeedChainEdge, which is used by -enable-aa-sched-mi (AA in misched), had an
llvm_unreachable when -enable-aa-sched-mi is enabled and we reach an
instruction with multiple MMOs. Instead, return a conservative answer. This
allows testing -enable-aa-sched-mi on x86.

Also, this moves the check above the isUnsafeMemoryObject checks.
isUnsafeMemoryObject is currently correct only for instructions with one MMO
(as noted in the comment in isUnsafeMemoryObject):

  // We purposefully do no check for hasOneMemOperand() here
  // in hope to trigger an assert downstream in order to
  // finish implementation.

The problem with this is that, had the candidate edge passed the
"!MIa->mayStore() && !MIb->mayStore()" check, the hoped-for assert would never
happen (which could, in theory, lead to incorrect behavior if one of these
secondary MMOs was volatile, for example).

llvm-svn: 198795
2014-01-08 21:52:02 +00:00
Andrea Di Biagio
23df4e4a2d Teach the DAGCombiner how to fold 'vselect' dag nodes according
to the following two rules:
  1) fold (vselect (build_vector AllOnes), A, B) -> A
  2) fold (vselect (build_vector AllZeros), A, B) -> B
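
A minimal sketch of the two rules on plain vectors rather than DAG nodes.
Lanes, vselect and foldVSelect are hypothetical names, and the mask is modelled
as one boolean per lane:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Lanes = std::vector<int>;

// Reference semantics: lane i takes a[i] when mask[i] is set, else b[i].
Lanes vselect(const std::vector<bool> &mask, const Lanes &a, const Lanes &b) {
  assert(mask.size() == a.size() && a.size() == b.size());
  Lanes out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = mask[i] ? a[i] : b[i];
  return out;
}

// Folded form: an all-ones mask yields A, an all-zeros mask yields B;
// any other mask falls back to the lane-wise select.
Lanes foldVSelect(const std::vector<bool> &mask, const Lanes &a,
                  const Lanes &b) {
  assert(mask.size() == a.size() && a.size() == b.size());
  if (std::all_of(mask.begin(), mask.end(), [](bool m) { return m; }))
    return a;
  if (std::none_of(mask.begin(), mask.end(), [](bool m) { return m; }))
    return b;
  return vselect(mask, a, b);
}
```

Both folds return exactly what the lane-wise select would compute, which is
what makes them safe to apply.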

llvm-svn: 198777
2014-01-08 18:33:04 +00:00
Richard Sandiford
95c864d9bd [DAGCombiner] Factor duplicated rotate code into a separate function
No functional change intended.

llvm-svn: 198768
2014-01-08 15:40:47 +00:00
Rafael Espindola
894843cb4e Move the llvm mangler to lib/IR.
This makes it available to tools that don't link with target (like llvm-ar).

llvm-svn: 198708
2014-01-07 21:19:40 +00:00
Benjamin Kramer
8a68ab3710 Emit arange padding with a single directive.
llvm-svn: 198700
2014-01-07 19:28:14 +00:00