15456 Commits

Author SHA1 Message Date
Juergen Ributzka
f043a65327 Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too."
This reverts commit r191130.

llvm-svn: 191138
2013-09-21 15:09:46 +00:00
Juergen Ributzka
e9a80fc912 SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is too wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask for the given target. This mask usually has
the same size as the VSELECT return type (except for Intel KNL). Now the type
legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to successfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

llvm-svn: 191130
2013-09-21 04:55:18 +00:00
Eric Christopher
9cd26af8b6 Move emission of the debug string table to early in the debug
info finalization to greatly reduce the number of fixups that the
assembler has to handle in order to improve compile time.

llvm-svn: 191119
2013-09-20 23:22:52 +00:00
Eric Christopher
9c58f317da Migrate addGlobalName to the .cpp file as an intermediate step
to further work.

llvm-svn: 191113
2013-09-20 22:20:55 +00:00
Andrew Trick
978674b2bc Allow subtarget selection of the default MachineScheduler and document the interface.
The global registry is used to allow command line override of the
scheduler selection, but does not work well as the normal selection
API. For example, the same LLVM process should be able to target
multiple targets or subtargets.

llvm-svn: 191071
2013-09-20 05:14:41 +00:00
David Blaikie
efd0bcb70f DebugInfo: GDBIndexEntry*String conversion functions now return const char* for easy llvm::formating
This was previously invoking UB by passing a user-defined type to
format. Thanks to Jordan Rose for pointing this out.

llvm-svn: 191060
2013-09-20 00:33:15 +00:00
David Blaikie
9d117ab7ef Add braces to suppress Clang's dangling-else warning.
These violations were introduced in r191049
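
As a minimal sketch of what such a fix looks like (the function below is hypothetical and is not the code touched by this commit): when an inner `if` has no `else` of its own, bracing the outer `if` makes it explicit which `if` the `else` belongs to, which is what Clang's dangling-else warning asks for.

```cpp
#include <cassert>

// Hypothetical example. Without the braces around the outer 'if' body, the
// 'else' would bind to the inner 'if', and Clang would warn about it.
inline int classify(bool outer, bool inner) {
  if (outer) {
    if (inner)
      return 1;
  } else {
    return 2;  // runs only when 'outer' is false
  }
  return 0;    // 'outer' is true and 'inner' is false
}
```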

llvm-svn: 191059
2013-09-20 00:33:11 +00:00
Richard Mitton
21101b3231 Added support for generating DWARF .debug_aranges sections automatically.
llvm-svn: 191052
2013-09-19 23:21:01 +00:00
Andrew Trick
665d3ec3d3 Rename ConvergingScheduler to GenericScheduler.
This was an experimental scheduler a year ago. It's now used by
several subtargets, both in-order and out-of-order, and it
is about to be enabled by default for x86 and armv7. It will be the
new GenericScheduler for subtargets that don't provide their own
SchedulingStrategy.

llvm-svn: 191051
2013-09-19 23:10:59 +00:00
David Blaikie
404d3047c0 DebugInfo: llvm-dwarfdump support for gnu_pubnames section
llvm-svn: 191050
2013-09-19 23:01:29 +00:00
Kai Nacke
d09bb4614b PR16726: extend rol/ror matching
C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.

This commit extends the DAGCombiner so that the pattern

(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))

is folded into

([az]ext (rotl x, y))

The matching is restricted to aext and zext because in these cases the upper
bits are either undefined or known. Test case is included.

This fixes PR16726.
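
To show where the extensions in such a pattern come from, here is a minimal source-level sketch. The helper name `rotl16` is hypothetical, and nothing here claims that a particular compiler version turns it into a rotate instruction; it only illustrates that the operands of the shifts are promoted to `int` before the operation.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 16-bit value left by y bits (y is reduced modulo 16).
// 'x' is promoted to int before each shift, so the shifts and the OR are
// performed on the wider type and the result is truncated back to 16 bits.
inline std::uint16_t rotl16(std::uint16_t x, unsigned y) {
  y &= 15u;
  if (y == 0)
    return x;  // avoid a right shift by the full 16 bits
  return static_cast<std::uint16_t>((x << y) | (x >> (16u - y)));
}
```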

llvm-svn: 191049
2013-09-19 23:00:28 +00:00
Kai Nacke
2d967b2751 Revert PR16726: extend rol/ror matching
There is a buildbot failure. Need to investigate this.

llvm-svn: 191048
2013-09-19 22:53:36 +00:00
Kai Nacke
4eaf6444fa PR16726: extend rol/ror matching
C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.

This commit extends the DAGCombiner so that the pattern

(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))

is folded into

([az]ext (rotl x, y))

The matching is restricted to aext and zext because in these cases the upper
bits are either undefined or known. Test case is included.

This fixes PR16726.

llvm-svn: 191045
2013-09-19 22:36:39 +00:00
David Blaikie
d0a869d0bf DebugInfo: Improve IR annotation comments for GNU pubthings.
llvm-svn: 191043
2013-09-19 22:19:37 +00:00
David Blaikie
8dec407649 Unshift the GDB index/GNU pubnames constants modified in r191025
Based on code review feedback from Eric Christopher, unshifting these
constants as they can appear in the gdb_index itself, shifted a further
24 bits. This means that keeping them preshifted is a bit inflexible, so
let's not do that.

Given the motivation, wrap up some nicer enums, more type safety, and
some utility functions.

llvm-svn: 191035
2013-09-19 20:40:26 +00:00
David Blaikie
b20db58a4d DebugInfo: Simplify gnu_pubnames index computation.
Names open to bikeshedding. Could switch back to the constants being
unshifted, but this way seems a bit easier to work with.

llvm-svn: 191025
2013-09-19 18:39:59 +00:00
David Blaikie
70a3320244 Remove unnecessary conditional operators performing bool->bool conversion.
llvm-svn: 191020
2013-09-19 17:33:35 +00:00
David Blaikie
0f5ad28a9d Fix a typo and simplify a boolean expression.
llvm-svn: 191018
2013-09-19 17:27:48 +00:00
Benjamin Kramer
d443e4a080 DAGCombiner: Don't fold vector muls with constants that look like a splat of a power of 2 but differ in bit width.
PR17283.

llvm-svn: 191000
2013-09-19 13:28:20 +00:00
Adrian Prantl
262bcf4584 Debug info: Get rid of the VLA indirection hack in FastISel.
Use the DIVariable::isIndirect() flag set by the frontend instead of
guessing whether to set the machine location's indirection bit.
Paired commit with CFE.

llvm-svn: 190961
2013-09-18 22:08:59 +00:00
Arnold Schwaighofer
cae8735a54 Costmodel: Add support for horizontal vector reductions
Upcoming SLP vectorization improvements will want to be able to estimate costs
of horizontal reductions. Add infrastructure to support this.

We model reductions as a series of (shufflevector,add) tuples ultimately
followed by an extractelement. For example, for an add-reduction of <4 x float>
we could generate the following sequence:

 (v0, v1, v2, v3)
   \   \  /  /
     \  \  /
       +  +

 (v0+v2, v1+v3, undef, undef)
    \      /
 ((v0+v2) + (v1+v3), undef, undef, undef)

 %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef,
                           <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
 %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
 %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef,
                          <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
 %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
 %r = extractelement <4 x float> %bin.rdx8, i32 0

This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)"
that will allow clients to ask for the cost of such a reduction (as backends
might generate more efficient code than the cost of the individual instructions
summed up). This interface is exercised by the CostModel analysis pass which
looks for reduction patterns like the one above - starting at extractelements -
and if it sees a matching sequence will call the cost model interface.

We will also support a second form of pairwise reduction that is well supported
on common architectures (haddps, vpadd, faddp).

 (v0, v1, v2, v3)
  \   /    \  /
 (v0+v1, v2+v3, undef, undef)
    \     /
 ((v0+v1)+(v2+v3), undef, undef, undef)

  %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef,
        <4 x i32> <i32 0, i32 2 , i32 undef, i32 undef>
  %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef,
        <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
  %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
  %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
        <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
  %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
        <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
  %r = extractelement <4 x float> %bin.rdx.1, i32 0
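
The two reduction shapes can be sketched on plain arrays as follows. This is an illustration only, not the cost model code: the names `reduceSplitting` and `reducePairwise` are hypothetical, and the undef lanes are simply filled with 0.0f because their values are never read.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// First form: add the upper half of the vector onto the lower half, then
// repeat. Corresponds to the shuffle masks <2, 3, undef, undef> and
// <1, undef, undef, undef> followed by extracting lane 0.
inline float reduceSplitting(const Vec4 &v) {
  Vec4 step1 = {v[0] + v[2], v[1] + v[3], 0.0f, 0.0f};  // lanes 2, 3 unused
  return step1[0] + step1[1];
}

// Second (pairwise) form: add adjacent lanes at each level. Corresponds to
// the shuffle masks <0, 2, ...> and <1, 3, ...>, then <0, ...> and <1, ...>.
inline float reducePairwise(const Vec4 &v) {
  Vec4 step1 = {v[0] + v[1], v[2] + v[3], 0.0f, 0.0f};  // lanes 2, 3 unused
  return step1[0] + step1[1];
}
```

Both forms compute the same sum for exactly representable inputs; they differ in which lanes are added first, and therefore in which shuffles a backend has to emit.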

llvm-svn: 190876
2013-09-17 18:06:50 +00:00
Serge Pavlov
8ec39992c1 Added documentation to getMemsetStores.
llvm-svn: 190866
2013-09-17 16:24:42 +00:00
Quentin Colombet
d30a9585b8 [SelectionDAG] Teach the vector scalarizer about TRUNCATE.
When a truncate node defines a legal vector type but uses an illegal
vector type, the legalization process was splitting the vector until
<1 x vector> type, but then it was failing to scalarize the node because
it did not know how to handle TRUNCATE.

<rdar://problem/14989896>

llvm-svn: 190830
2013-09-17 00:26:56 +00:00
Adrian Prantl
db3e26d193 Debug info: Fix PR16736 and rdar://problem/14990587.
A DBG_VALUE is register-indirect iff the first operand is a register
_and_ the second operand is an immediate.
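
The rule above can be written as a small predicate. This is a sketch on a toy operand type; `Operand` and `isRegisterIndirect` are hypothetical stand-ins and not the MachineInstr API.

```cpp
#include <cassert>

// Toy operand type, for illustration only.
struct Operand {
  enum Kind { Register, Immediate, Other } kind;
};

// Register-indirect iff the first operand is a register AND the second
// operand is an immediate.
inline bool isRegisterIndirect(const Operand &first, const Operand &second) {
  return first.kind == Operand::Register &&
         second.kind == Operand::Immediate;
}
```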

llvm-svn: 190821
2013-09-16 23:29:03 +00:00
Jakub Staszak
ec2ffa92d8 Use reference instead of copy.
llvm-svn: 190813
2013-09-16 22:03:38 +00:00
Peter Collingbourne
3fa50f9b05 Implement function prefix data as an IR feature.
Previous discussion:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/063909.html

Differential Revision: http://llvm-reviews.chandlerc.com/D1191

llvm-svn: 190773
2013-09-16 01:08:15 +00:00
Benjamin Kramer
7d6052687e Replace some unnecessary vector copies with references.
llvm-svn: 190770
2013-09-15 22:04:42 +00:00
Hal Finkel
31658834e6 Prevent assert in CombinerGlobalAA with null values
DAGCombiner::isAlias can be called with SrcValue1 or SrcValue2 null, and we
can't use AA in this case (if we try, then the casting code in AA will assert).

llvm-svn: 190763
2013-09-15 02:19:49 +00:00
Quentin Colombet
cf71c6320b [Peephole] Rewrite copies to avoid cross register banks copies.
By definition copies across register banks are not coalescable. Still, it may be
possible to get rid of such a copy when the value is available in another
register of the same register file.
Consider the following example, where uppercase and lowercase letters denote
different register files:
b = copy A <-- cross-bank copy
...
C = copy b <-- cross-bank copy

This could have been optimized this way:
b = copy A  <-- cross-bank copy
...
C = copy A <-- same-bank copy

Note: b and C's definitions may be in different basic blocks.

This patch adds a peephole optimization that looks through a chain of copies
leading to a cross-bank copy and reuses a source that is on the same register
file if available.

This solution could also be used to get rid of some copies (e.g., A could have
been used instead of C). However, we do not do so because:
- It may over-constrain the coloring of the source register for coalescing.
- The register allocator may not be able to find a nice split point for the
  longer live-range, leading to more spills.
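
The look-through step described above can be sketched on a toy register table. Everything in this sketch is an assumption made for illustration (the `RegInfo` and `RegTable` types, the single-character bank tags, and the function name); it is not the peephole pass itself.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy model: each register belongs to a bank and may be defined by a copy
// from another register.
struct RegInfo {
  char bank;            // bank tag, e.g. 'U' or 'l'
  std::string copySrc;  // empty if the register is not defined by a copy
};

using RegTable = std::map<std::string, RegInfo>;

// Walk the chain of copies starting at 'src' and return the first register
// that lives in 'dstBank'. Return 'src' unchanged if none is found.
inline std::string findSameBankSource(const RegTable &regs,
                                      const std::string &src, char dstBank) {
  std::string cur = src;
  // Bound the walk so a malformed (cyclic) table cannot loop forever.
  for (std::size_t steps = 0; steps <= regs.size(); ++steps) {
    auto it = regs.find(cur);
    if (it == regs.end())
      break;
    if (it->second.bank == dstBank)
      return cur;
    if (it->second.copySrc.empty())
      break;
    cur = it->second.copySrc;
  }
  return src;
}
```

With `A` in bank 'U' and `b = copy A` in bank 'l', asking for a bank-'U' source of `b` yields `A`, which matches the rewritten `C = copy A` in the example.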

<rdar://problem/14742333>

llvm-svn: 190713
2013-09-13 18:26:31 +00:00
Eric Christopher
dd1a01203d Add initial support for handling gnu style pubnames accepted by some
versions of gold. This support is designed to allow gold to produce
gdb_index sections similar to the accelerator tables and consumable
by gdb.

llvm-svn: 190649
2013-09-13 00:35:05 +00:00
Eric Christopher
8b3737fbb0 Reformat and hoist section grabbing to top level.
llvm-svn: 190648
2013-09-13 00:34:58 +00:00
Joey Gouly
0e76fa7df5 Add an instruction deprecation feature to TableGen.
The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.

The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
  ComplexDeprecationPredicate<"MCR">

would mean you would have to define the following function:
  bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
                             std::string &Info)

This function returns 'false' if the instruction is not deprecated, and 'true'
if it is deprecated, storing the warning message in 'Info'.
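
A self-contained sketch of a predicate with that shape is shown below. The `MCInst` and `MCSubtargetInfo` structs here are stand-ins with hypothetical fields so that the example compiles on its own, and the opcode constant and warning text are invented for illustration. Only the function signature is taken from this commit message.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the real classes; the fields are hypothetical.
struct MCInst { unsigned Opcode; };
struct MCSubtargetInfo { bool DeprecatesMCR; };

// Hypothetical opcode value, for illustration only.
const unsigned HypotheticalMCROpcode = 42;

// Returns true and fills in Info when the instruction is deprecated on the
// given subtarget; returns false and leaves Info untouched otherwise.
bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
                           std::string &Info) {
  if (STI.DeprecatesMCR && MI.Opcode == HypotheticalMCROpcode) {
    Info = "instruction is deprecated on this subtarget";
    return true;
  }
  return false;
}
```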

The MCTargetAsmParser constructor was changed to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.

llvm-svn: 190598
2013-09-12 10:28:05 +00:00
Hal Finkel
6f1ff8e1a8 Fix crash in AggressiveAntiDepBreaker with empty CriticalPathSet
If no register classes are added to CriticalPathRCs, then the CriticalPathSet
bitmask will be empty. In that case, ExcludeRegs must remain NULL or else this
line will cause a segfault:

  } else if ((ExcludeRegs != NULL) && ExcludeRegs->test(AntiDepReg)) {

I have no in-tree test case.
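
The idea can be sketched with a standard container in place of the real bit vector. The names `RegSet`, `pickExcludeRegs`, and `isExcluded` are hypothetical, and treating "empty" as "zero-length" is an assumption of this sketch rather than a statement about the actual patch.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a register bit vector.
using RegSet = std::vector<bool>;

// Hand out a pointer to the set only if it has been populated; otherwise the
// pointer stays null, so the set is never indexed.
inline const RegSet *pickExcludeRegs(const RegSet &criticalPathSet) {
  return criticalPathSet.empty() ? nullptr : &criticalPathSet;
}

// Mirrors the guarded test in the quoted line: the null check comes first.
inline bool isExcluded(const RegSet *excludeRegs, unsigned reg) {
  return excludeRegs != nullptr && reg < excludeRegs->size() &&
         (*excludeRegs)[reg];
}
```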

llvm-svn: 190584
2013-09-12 04:22:31 +00:00
Matt Arsenault
bc08ddba58 Remove pointless assertion after r190376
llvm-svn: 190565
2013-09-12 01:07:49 +00:00
Manman Ren
5b2f4b0540 Debug info: add more comments.
llvm-svn: 190544
2013-09-11 19:40:28 +00:00
Hal Finkel
8f2e700522 Add getUnrollingPreferences to TTI
Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 190542
2013-09-11 19:25:43 +00:00
Benjamin Kramer
079b96e6f7 Revert "Give internal classes hidden visibility."
It works with clang, but GCC has different rules so we can't make all of those
hidden. This reverts commit r190534.

llvm-svn: 190536
2013-09-11 18:05:11 +00:00
Benjamin Kramer
6a44af3629 Give internal classes hidden visibility.
Worth 100k on a linux/x86_64 Release+Asserts clang.

llvm-svn: 190534
2013-09-11 17:42:27 +00:00
Bill Wendling
62a2d14ac5 Simplify the checking of function attributes by using the simple methods.
llvm-svn: 190499
2013-09-11 08:35:09 +00:00
Eli Friedman
8f06d55697 Rename variables for consistency.
No functional change.

llvm-svn: 190466
2013-09-11 00:41:02 +00:00
Eli Friedman
78bffa5767 Fix unused variables.
llvm-svn: 190448
2013-09-10 23:18:14 +00:00
Eric Christopher
13b99d2aba Hoist section call out of loop.
llvm-svn: 190440
2013-09-10 21:49:37 +00:00
Manman Ren
2312ed35d2 Debug Info: create scope children DIEs when the scope DIE is not null.
We try to create the scope children DIEs after we create the scope DIE. But
to avoid emitting empty lexical block DIE, we first check whether a scope
DIE is going to be null, then create the scope children if it is not null.
From the number of children, we decide whether to actually create the scope DIE.

This patch also removes an early exit which checks for a special condition.
It also removes deletion of un-used children DIEs that are generated
because we used to generate children DIEs before the scope DIE.

Deletion of un-used children DIEs may cause problems because we sometimes keep
created DIEs in a member variable of a CU.

llvm-svn: 190421
2013-09-10 18:40:41 +00:00
Manman Ren
34b3dcc3b5 Debug Info: define a DIRef template.
Specialize the constructors for DIRef<DIScope> and DIRef<DIType> to make sure
the Value is indeed a scope ref and a type ref.

Use DIScopeRef for DIScope::getContext and DIType::getContext and use DITypeRef
for getContainingType and getClassType.

DIScope::generateRef now returns a DIScopeRef instead of a "Value *" for
readability and type safety.

llvm-svn: 190418
2013-09-10 18:30:07 +00:00
Matt Arsenault
d232222f34 Don't use getSetCCResultType for creating a vselect
The vselect mask isn't a setcc.

This breaks in the case when the result of getSetCCResultType
is larger than the vector operands

e.g. %tmp = select i1 %cmp <2 x i8> %a, <2 x i8> %b
when getSetCCResultType returns <2 x i32>, the assertion
that the (MaskTy.getSizeInBits() == Op1.getValueType().getSizeInBits())
is hit.

No test since I don't think I can hit this with any of the current
targets. The R600/SI implementation would break, since it returns a
vector of i1 for this, but it doesn't reach ExpandSELECT for other
reasons.

llvm-svn: 190376
2013-09-10 00:41:56 +00:00
Andrew Trick
6c88b35090 Enable -misched-cyclicpath by default.
llvm-svn: 190367
2013-09-09 23:31:14 +00:00
Manman Ren
de897a369a Debug Info: move DIScope::getContext back from DwarfDebug.
This partially reverts r190330. DIScope::getContext now returns DIScopeRef
instead of DIScope. We construct a DIScopeRef from DIScope when we are
dealing with subprogram, lexical block or name space.

llvm-svn: 190362
2013-09-09 22:35:23 +00:00
Andrew Trick
e1f7bf2c02 mi-sched: smooth out the cyclicpath heuristic.
Arnold's idea.

I generally try to avoid stateful heuristics because they can make
debugging harder. However, we need a way to prevent the latency
priority from dominating, and it somewhat makes sense to schedule
aggressively for latency only within an issue group.

Swift in particular likes this, and it doesn't hurt anyone else:
| Benchmarks/MiBench/consumer-lame              |  10.39% |
| Benchmarks/Misc/himenobmtxpa                  |   9.63% |

llvm-svn: 190360
2013-09-09 22:28:08 +00:00
Jack Carter
170a5f2983 white spaces and long lines
llvm-svn: 190358
2013-09-09 22:02:08 +00:00
Eric Christopher
ba506db498 Always add global names. We're adding them in the rest of the code
as well as types.

No functional change as they're not emitted unless the option
is true anyhow.

llvm-svn: 190346
2013-09-09 20:03:20 +00:00