Commit Graph

1797 Commits

Author SHA1 Message Date
Weiming Zhao
fe26fd27b4 PR 18466: Fix ARM Pseudo Expansion
When expanding neon pseudo stores, it may miss the implicit uses of sub
regs, which may cause post RA scheduler reorder instructions that
breakes anti dependency.

For example:
  VST1d64QPseudo %R0<kill>, 16, %Q9_Q10, pred:14, pred:%noreg
  will be expanded to
    VST1d64Q %R0<kill>, 16, %D18, pred:14, pred:%noreg;

An instruction that defines %D20 may be scheduled before the store by
mistake.

This patches adds implicit uses for such case. For the example above, it
emits:
  VST1d64Q %R0<kill>, 8, %D18, pred:14, pred:%noreg, %Q9_Q10<imp-use>

llvm-svn: 199282
2014-01-15 01:32:12 +00:00
Tim Northover
463a5f24d1 ARM: correctly determine final tBX_LR in Thumb1 functions
The changes caused by folding an sp-adjustment into a "pop" previously
disrupted the forward search for the final real instruction in a
terminating block. This switches to a backward search (skipping debug
instrs).

This fixes PR18399.

Patch by Zhaoshi.

llvm-svn: 199266
2014-01-14 22:53:28 +00:00
Tim Northover
56cc5c92db ARM: add constraint that RdLo != Rn != RdHi for v5 MLA insts.
llvm-svn: 199212
2014-01-14 13:05:47 +00:00
Tim Northover
7d074a5ad6 ARM: add test for r199108. Oops.
rdar://problem/15800156

llvm-svn: 199109
2014-01-13 14:20:25 +00:00
Benjamin Kramer
c10563d14e Fix broken CHECK lines.
llvm-svn: 199016
2014-01-11 21:06:00 +00:00
Artyom Skrobov
4d91d944ae Must not produce Tag_CPU_arch_profile for pre-ARMv7 cores (e.g. cortex-m0)
llvm-svn: 198945
2014-01-10 16:42:55 +00:00
Saleem Abdulrasool
87ccd367b6 ARM IAS: improve .eabi_attribute handling
Parse tag names as well as expressions.  The former is part of the
specification, the latter is for improved compatibility with the GNU assembler.
Fix attribute value handling to be comformant to the specification.

llvm-svn: 198662
2014-01-07 02:28:42 +00:00
Tim Northover
d6a729bb85 ARM MachO: sort out isTargetDarwin/isTargetIOS/... checks.
The ARM backend has been using most of the MachO related subtarget
checks almost interchangeably, and since the only target it's had to
run on has been IOS (which is all three of MachO, Darwin and IOS) it's
worked out OK so far.

But we'd like to support embedded targets under the "*-*-none-macho"
triple, which means everything starts falling apart and inconsistent
behaviours emerge.

This patch should pick a reasonably sensible set of behaviours for the
new triple (and any others that come along, with luck). Some choices
were debatable (notably FP == r7 or r11), but we can revisit those
later when deficiencies become apparent.

llvm-svn: 198617
2014-01-06 14:28:05 +00:00
Tim Northover
7649ebacd6 ARM: keep special non-AEABIness of "-darwin-eabi" triples for now
Longer term, we want to move users to "*-*-*-macho" for embedded work, but for
now people are relying on the last thing we told them, which is unfortunately
"*-*-darwin-eabi".

rdar://problem/15703934

llvm-svn: 198602
2014-01-06 12:00:44 +00:00
Rafael Espindola
d89b16dcb8 Make the ARM ABI selectable via SubtargetFeature.
This patch makes it possible to select the ABI with -mattr. It will be used to
forward clang's -target-abi option to llvm's CodeGen.

llvm-svn: 198304
2014-01-02 13:40:08 +00:00
Bill Wendling
8ea7582546 Un-XFAILify some tests which are now passing.
llvm-svn: 198184
2013-12-29 23:09:14 +00:00
Andrew Trick
3ca67d6404 New machine model for cortex-a9. Schedule for resources and latency.
Schedule more conservatively to account for stalls on floating point
resources and latency. Use the AGU resource to model latency stalls
since it's shared between FP and LD/ST instructions. This might not be
completely accurate but should work well in practice.

llvm-svn: 198125
2013-12-28 21:57:05 +00:00
Josh Magee
58fa493955 Unbreak ARM buildbots after r197653 by forcing the target triple on this test.
llvm-svn: 197709
2013-12-19 18:14:42 +00:00
Rafael Espindola
357d013e54 Add a triple so that this passes on OS X.
I am surprised I am the first one to notice this.

llvm-svn: 197689
2013-12-19 16:06:33 +00:00
Josh Magee
22b8ba2d67 [stackprotector] Use analysis from the StackProtector pass for stack layout in PEI a nd LocalStackSlot passes.
This changes the MachineFrameInfo API to use the new SSPLayoutKind information
produced by the StackProtector pass (instead of a boolean flag) and updates a
few pass dependencies (to preserve the SSP analysis).

The stack layout follows the same approach used prior to this change - i.e.,
only LargeArray stack objects will be placed near the canary and everything
else will be laid out normally.  After this change, structures containing large
arrays will also be placed near the canary - a case previously missed by the
old implementation.

Out of tree targets will need to update their usage of
MachineFrameInfo::CreateStackObject to remove the MayNeedSP argument. 

The next patch will implement the rules for sspstrong and sspreq.  The end goal
is to support ssp-strong stack layout rules.

WIP.

Differential Revision: http://llvm-reviews.chandlerc.com/D2158

llvm-svn: 197653
2013-12-19 03:17:11 +00:00
Weiming Zhao
63871d255f [aarch32] fix bug 18268: Incorrect condition of vsel
Given vsel_cc, op1, op2, since vsel has no LE/LT, to generate vsel for
such selection, it needs to inverse cc and swap op1 and op2. To inverse
cc, both L/G and E bits should be flipped.

llvm-svn: 197615
2013-12-18 22:25:17 +00:00
Tim Northover
c0b42f718e ARM: force soft-float ABI for tests depending on it.
This should fix the ARM bots.

llvm-svn: 197555
2013-12-18 09:58:06 +00:00
Tim Northover
44594ad7e2 ARM: set default float ABI based on triple.
Clang sets the float-abi target option manually, but no longer
annotates each function with its ABI. This can lead to confusing
mistmatch between "clang -emit-llvm | llc" and normal clang
invocations.

Besides which, gnueabihf actually *is* hard-float. Defaulting to soft
was just perverse.

llvm-svn: 197554
2013-12-18 09:27:33 +00:00
Quentin Colombet
b4c44d239c Add warning capabilities in LLVM.
This reapplies r197438 and fixes the link-time circular dependency between
IR and Support. The fix consists in moving the diagnostic support into IR.

The patch adds a new LLVMContext::diagnose that can be used to communicate to
the front-end, if any, that something of interest happened.
The diagnostics are supported by a new abstraction, the DiagnosticInfo class.
The base class contains the following information:
- The kind of the report: What this is about.
- The severity of the report: How bad this is.

This patch also adds 2 classes:
- DiagnosticInfoInlineAsm: For inline asm reporting. Basically, this diagnostic
will be used to switch to the new diagnostic API for LLVMContext::emitError.
- DiagnosticStackSize: For stack size reporting. Comes as a replacement of the
hard coded warning in PEI.

This patch also features dynamic diagnostic identifiers. In other words plugins
can use this infrastructure for their own diagnostics (for more details, see
getNextAvailablePluginDiagnosticKind).

This patch introduces a new DiagnosticHandlerTy and a new DiagnosticContext in
the LLVMContext that should be set by the front-end to be able to map these
diagnostics in its own system.

http://llvm-reviews.chandlerc.com/D2376
<rdar://problem/15515174>

llvm-svn: 197508
2013-12-17 17:47:22 +00:00
Quentin Colombet
382b135d92 Revert r197438 and r197447 until we figure out how to avoid circular dependency at link time
llvm-svn: 197451
2013-12-17 01:19:59 +00:00
Quentin Colombet
66673f4075 Add warning capabilities in LLVM.
The patch adds a new LLVMContext::diagnose that can be used to communicate to
the front-end, if any, that something of interest happened.
The diagnostics are supported by a new abstraction, the DiagnosticInfo class.
The base class contains the following information:
- The kind of the report: What this is about.
- The severity of the report: How bad this is.

This patch also adds 2 classes:
- DiagnosticInfoInlineAsm: For inline asm reporting. Basically, this diagnostic
will be used to switch to the new diagnostic API for LLVMContext::emitError.
- DiagnosticStackSize: For stack size reporting. Comes as a replacement of the
hard coded warning in PEI.

This patch also features dynamic diagnostic identifiers. In other words plugins
can use this infrastructure for their own diagnostics (for more details, see
getNextAvailablePluginDiagnosticKind).

This patch introduces a new DiagnosticHandlerTy and a new DiagnosticContext in
the LLVMContext that should be set by the front-end to be able to map these
diagnostics in its own system.

http://llvm-reviews.chandlerc.com/D2376
<rdar://problem/15515174>

llvm-svn: 197438
2013-12-16 23:22:51 +00:00
Joerg Sonnenberger
8fe41b7319 Recognize EABIHF as environment and use it for RTAPI + VFP.
llvm-svn: 197405
2013-12-16 18:51:28 +00:00
Joerg Sonnenberger
002a14765e Enabling thumb2 mode used to force support for armv6t2. Replace this
with a temporary assertion and adjust the various test cases.

llvm-svn: 197224
2013-12-13 11:16:00 +00:00
Tim Northover
76fc8a4c40 ARM: constrain register-class in fast-isel
The tests were no longer using fast-isel at all (MachO needs an "ios" rather
than "darwin" triple at the moment and Linux needs ARM mode). Once that was
corrected, the verifier complained about a t2ADDri created for the alloca.

llvm-svn: 197046
2013-12-11 16:04:57 +00:00
Tim Northover
a4173715f7 ARM: fix folding of stack-adjustment (yet again).
When trying to eliminate an "sub sp, sp, #N" instruction by folding
it into an existing push/pop using dummy registers, we need to account
for the fact that this might affect precisely how "fp" gets set in the
prologue.

We were attempting this, but assuming that *whenever* we performed a
fold it would make a difference. This is false, for example, in:
    push {r4, r7, lr}
    add fp, sp, #4
    vpush {d8}
    sub sp, sp, #8

we can fold the "sub" into the "vpush", forming "vpush {d7, d8}".
However, in that case the "add fp" instruction mustn't change, which
we were getting wrong before.

Should fix PR18160.

llvm-svn: 196725
2013-12-08 15:56:50 +00:00
Weiming Zhao
43d8e6cb3b Bug 18149: [AArch32] VSel instructions has no ARMCC field
The current peephole optimizing for compare inst assumes an instr that
uses CPSR has an MO for ARM Cond code.However, for VSEL instructions
(vseqeq, vselgt, vselgt, vselvs), there is no such operand nor do
they support the modification of Cond Code.

llvm-svn: 196588
2013-12-06 17:56:48 +00:00
Andrew Trick
880e573d98 MI-Sched: handle latency of in-order operations with the new machine model.
The per-operand machine model allows the target to define "unbuffered"
processor resources. This change is a quick, cheap way to model stalls
caused by the latency of operations that use such resources. This only
applies when the processor's micro-op buffer size is non-zero
(Out-of-Order). We can't precisely model in-order stalls during
out-of-order execution, but this is an easy and effective
heuristic. It benefits cortex-a9 scheduling when using the new
machine model, which is not yet on by default.

MI-Sched for armv7 was evaluated on Swift (and only not enabled because
of a performance bug related to predication). However, we never
evaluated Cortex-A9 performance on MI-Sched in its current form. This
change adds MI-Sched functionality to reach performance goals on
A9. The only remaining change is to allow MI-Sched to run as a PostRA
pass.

I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7:
-mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false

For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results:
(min run time over 2 runs, filtering tiny changes)

Speedups:
| Benchmarks/BenchmarkGame/recursive         |  52.39% |
| Benchmarks/VersaBench/beamformer           |  20.80% |
| Benchmarks/Misc/pi                         |  19.97% |
| Benchmarks/Misc/mandel-2                   |  19.95% |
| SPEC/CFP2000/188.ammp                      |  18.72% |
| Benchmarks/McCat/08-main/main              |  18.58% |
| Benchmarks/Misc-C++/Large/sphereflake      |  18.46% |
| Benchmarks/Olden/power                     |  17.11% |
| Benchmarks/Misc-C++/mandel-text            |  16.47% |
| Benchmarks/Misc/oourafft                   |  15.94% |
| Benchmarks/Misc/flops-7                    |  14.99% |
| Benchmarks/FreeBench/distray               |  14.26% |
| SPEC/CFP2006/470.lbm                       |  14.00% |
| mediabench/mpeg2/mpeg2dec/mpeg2decode      |  12.28% |
| Benchmarks/SmallPT/smallpt                 |  10.36% |
| Benchmarks/Misc-C++/Large/ray              |   8.97% |
| Benchmarks/Misc/fp-convert                 |   8.75% |
| Benchmarks/Olden/perimeter                 |   7.10% |
| Benchmarks/Bullet/bullet                   |   7.03% |
| Benchmarks/Misc/mandel                     |   6.75% |
| Benchmarks/Olden/voronoi                   |   6.26% |
| Benchmarks/Misc/flops-8                    |   5.77% |
| Benchmarks/Misc/matmul_f64_4x4             |   5.19% |
| Benchmarks/MiBench/security-rijndael       |   5.15% |
| Benchmarks/Misc/flops-6                    |   5.10% |
| Benchmarks/Olden/tsp                       |   4.46% |
| Benchmarks/MiBench/consumer-lame           |   4.28% |
| Benchmarks/Misc/flops-5                    |   4.27% |
| Benchmarks/mafft/pairlocalalign            |   4.19% |
| Benchmarks/Misc/himenobmtxpa               |   4.07% |
| Benchmarks/Misc/lowercase                  |   4.06% |
| SPEC/CFP2006/433.milc                      |   3.99% |
| Benchmarks/tramp3d-v4                      |   3.79% |
| Benchmarks/FreeBench/pifft                 |   3.66% |
| Benchmarks/Ptrdist/ks                      |   3.21% |
| Benchmarks/Adobe-C++/loop_unroll           |   3.12% |
| SPEC/CINT2000/175.vpr                      |   3.12% |
| Benchmarks/nbench                          |   2.98% |
| SPEC/CFP2000/183.equake                    |   2.91% |
| Benchmarks/Misc/perlin                     |   2.85% |
| Benchmarks/Misc/flops-1                    |   2.82% |
| Benchmarks/Misc-C++-EH/spirit              |   2.80% |
| Benchmarks/Misc/flops-2                    |   2.77% |
| Benchmarks/NPB-serial/is                   |   2.42% |
| Benchmarks/ASC_Sequoia/CrystalMk           |   2.33% |
| Benchmarks/BenchmarkGame/n-body            |   2.28% |
| Benchmarks/SciMark2-C/scimark2             |   2.27% |
| Benchmarks/Olden/bh                        |   2.03% |
| skidmarks10/skidmarks                      |   1.81% |
| Benchmarks/Misc/flops                      |   1.72% |

Slowdowns:
| Benchmarks/llubenchmark/llu                | -14.14% |
| Benchmarks/Polybench/stencils/seidel-2d    |  -5.67% |
| Benchmarks/Adobe-C++/functionobjects       |  -5.25% |
| Benchmarks/Misc-C++/oopack_v1p8            |  -5.00% |
| Benchmarks/Shootout/hash                   |  -2.35% |
| Benchmarks/Prolangs-C++/ocean              |  -2.01% |
| Benchmarks/Polybench/medley/floyd-warshall |  -1.98% |
| Polybench/linear-algebra/kernels/3mm       |  -1.95% |
| Benchmarks/McCat/09-vor/vor                |  -1.68% |

llvm-svn: 196516
2013-12-05 17:55:58 +00:00
Tim Northover
e4def5e228 ARM: fix yet another stack-folding bug
We were trying to fold the stack adjustment into the wrong instruction in the
situation where the entire basic-block was epilogue code. Really, it can only
ever be valid to do the folding precisely where the "add sp, ..." would be
placed so there's no need for a separate iterator to track that.

Should fix PR18136.

llvm-svn: 196493
2013-12-05 11:02:02 +00:00
David Peixotto
8ad70b3542 Add support for parsing ARM symbol variants on ELF targets
ARM symbol variants are written with parens instead of @ like this:

  .word __GLOBAL_I_a(target1)

This commit adds support for parsing these symbol variants in
expressions. We introduce a new flag to MCAsmInfo that indicates the
parser should use parens to parse the symbol variant. The expression
parser is modified to look for symbol variants using parens instead
of @ when the corresponding MCAsmInfo flag is true.

The MCAsmInfo parens flag is enabled only for ARM on ELF.

By adding this flag to MCAsmInfo, we are able to get rid of
redundant ARM-specific symbol variants and use the generic variants
instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new
UseParensForSymbolVariant attribute in MCAsmInfo to correctly print
the symbol variants for arm.

To achive this we need to keep a handle to the MCAsmInfo in the
MCSymbolRefExpr class that we can check when printing the symbol
variant.

Updated Tests:
  Changed case of symbol variant to match the generic kind.
  test/CodeGen/ARM/tls-models.ll
  test/CodeGen/ARM/tls1.ll
  test/CodeGen/ARM/tls2.ll
  test/CodeGen/Thumb2/tls1.ll
  test/CodeGen/Thumb2/tls2.ll

PR18080

llvm-svn: 196424
2013-12-04 22:43:20 +00:00
James Molloy
8a25992f39 Addrspacecasts are no-ops on ARM.
Testcase added.

llvm-svn: 196269
2013-12-03 11:23:11 +00:00
Tim Northover
dee8604caf ARM: decide whether to use movw/movt based on "minsize" attribute.
llvm-svn: 196102
2013-12-02 14:46:26 +00:00
Tim Northover
72360d201c ARM: add pseudo-instructions for lit-pool global materialisation
These are used by MachO only at the moment, and (much like the existing
MOVW/MOVT set) work around the fact that the labels used in the actual
instructions often contain PC-dependent components, which means that repeatedly
materialising the same global can't be CSEed.

With small modifications, it could be adapted to how ELF finds the address of
_GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there.

llvm-svn: 196090
2013-12-02 10:35:41 +00:00
Tim Northover
45479dcf49 ARM: fix bug in -Oz stack adjustment folding
Previously, we clobbered callee-saved registers when folding an "add
sp, #N" into a "pop {rD, ...}" instruction. This change checks whether
a register we're going to add to the "pop" could actually be live
outside the function before doing so and should fix the issue.

This should fix PR18081.

llvm-svn: 196046
2013-12-01 14:16:24 +00:00
Tim Northover
fa36dfeeca Darwin-ARM: use movw/movt for static relocations
llvm-svn: 195759
2013-11-26 12:45:05 +00:00
Amara Emerson
34df448f7c [ARM] Enable FeatureMP for Cortex-A5 by default.
Patch by Oliver Stannard.

llvm-svn: 195640
2013-11-25 13:17:15 +00:00
Manman Ren
d664bd7725 Debug Info: update testing cases to specify the debug info version number.
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.

Make tests more robust by removing hard-coded metadata numbers in CHECK lines.

llvm-svn: 195535
2013-11-23 01:16:29 +00:00
Manman Ren
409558f81e Debug Info: update testing cases to specify the debug info version number.
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.

llvm-svn: 195504
2013-11-22 21:49:45 +00:00
Tim Northover
74e3637a0c ARM: use CHECK-LABEL on a test.
llvm-svn: 195457
2013-11-22 13:25:07 +00:00
Richard Barton
c31078cded Add support for Cortex-A12.
Patch by Oliver Stannard!

llvm-svn: 195448
2013-11-22 11:53:16 +00:00
Artyom Skrobov
97af2ec096 [ARM] add the overlooked tests for Cortex-A7 build attributes
llvm-svn: 195365
2013-11-21 16:22:39 +00:00
NAKAMURA Takumi
4d3457e628 [PR17978] Mark two ARM/fast-isel tests as XFAIL:vg_leak due to GV.
llvm-svn: 195010
2013-11-18 13:50:19 +00:00
Bob Wilson
9f3e6b25ee Avoid illegal integer promotion in fastisel
Stop folding constant adds into GEP when the type size doesn't match.
Otherwise, the adds' operands are effectively being promoted, changing the
conditions of an overflow.  Results are different when:

    sext(a) + sext(b) != sext(a + b)

Problem originally found on x86-64, but also fixed issues with ARM and PPC,
which used similar code.

<rdar://problem/15292280>

Patch by Duncan Exon Smith!

llvm-svn: 194840
2013-11-15 19:09:27 +00:00
Tim Northover
28adfbb0d1 ARM: produce friendly error for invalid inline asm
We used to perform an invalid operation on an MVT and crash, which wasn't much
fun.

Patch by Oliver Stannard.

llvm-svn: 194714
2013-11-14 17:15:39 +00:00
Rafael Espindola
4929301af4 Error if we see an alias to a declaration.
In ELF and COFF an alias is just another offset in a section. There is no way
to represent an alias to something in another file.

In MachO, the spec has the N_INDR type which should allow for exactly that, but
is not currently implemented. Given that it is specified but not implemented,
we error in codegen to avoid miscompiling but don't reject aliases to
declarations in the verifier to leave the option open of implementing it.

In the past we have used alias to declarations as a way of implementing
weakref, which is why it exists in some old tests which this patch updates.

llvm-svn: 194705
2013-11-14 13:58:06 +00:00
Weiming Zhao
0da5cc0765 Enable generating legacy IT block for AArch32
By default, the behavior of IT block generation will be determinated
dynamically base on the arch (armv8 vs armv7). This patch adds backend
options: -arm-restrict-it and -arm-no-restrict-it.  The former one
restricts the generation of IT blocks (the same behavior as thumbv8) for
both arches. The later one allows the generation of legacy IT block (the
same behavior as ARMv7 Thumb2) for both arches.

Clang will support -mrestrict-it and -mno-restrict-it, which is
compatible with GCC.

llvm-svn: 194592
2013-11-13 18:29:49 +00:00
Bradley Smith
9aa8ac9f23 [ARM] Add support for FP_HP_extension build attribute
llvm-svn: 194470
2013-11-12 10:38:05 +00:00
Quentin Colombet
b06a0ed4b0 [VirtRegMap] Fix for PR17825. Do not ignore noreturn definitions when setting
isPhysRegUsed if the unwind information is required.
Indeed, the runtime may need a correct stack to be able to unwind the call.

llvm-svn: 194271
2013-11-08 18:14:17 +00:00
Tim Northover
93bcc66e73 ARM: fold prologue/epilogue sp updates into push/pop for code size
ARM prologues usually look like:
    push {r7, lr}
    sub sp, sp, #4

If code size is extremely important, this can be optimised to the single
instruction:
    push {r6, r7, lr}

where we don't actually care about the contents of r6, but pushing it subtracts
4 from sp as a side effect.

This should implement such a conversion, predicated on the "minsize" function
attribute (-Oz) since I've yet to find any code it actually makes faster.

llvm-svn: 194264
2013-11-08 17:18:07 +00:00
Bob Wilson
e7dde0c061 Enable optimization of sin / cos pair into call to __sincos_stret for iOS7+.
rdar://12856873
Patch by Evan Cheng, with a fix for rdar://13209539 by Tilmann Scheller

llvm-svn: 193942
2013-11-03 06:14:38 +00:00
Bradley Smith
2521975a42 [ARM] Add Virtualization subtarget feature and more build attributes in this area
Add a Virtualization ARM subtarget feature along with adding proper build
attribute emission for Tag_Virtualization_use (encodes Virtualization and
TrustZone) and Tag_MPextension_use.

Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to
something that is more maintainable. This changes the focus of this
testcase away from testing CPU defaults (which is tested elsewhere), onto
specifically testing that attributes are encoded correctly.

llvm-svn: 193859
2013-11-01 13:27:35 +00:00