Commit Graph

13449 Commits

Author SHA1 Message Date
Alex Lorenz
4be56e9370 MIR Serialization: Serialize the jump table pseudo source values.
llvm-svn: 244813
2015-08-12 21:11:08 +00:00
Alex Lorenz
d858f874fa MIR Serialization: Serialize the GOT pseudo source values.
llvm-svn: 244809
2015-08-12 21:00:22 +00:00
Alex Lorenz
46e9558ac6 MIR Serialization: Serialize the stack pseudo source values.
llvm-svn: 244806
2015-08-12 20:44:16 +00:00
Alex Lorenz
91097a3ffa MIR Serialization: Serialize the constant pool pseudo source values.
llvm-svn: 244803
2015-08-12 20:33:26 +00:00
JF Bastien
71d29acecd WebAssembly: floating-point comparisons
Summary:
D11924 implemented part of the floating-point comparisons, this patch implements the rest:
 * Tell ISelLowering that all booleans are either 0 or 1.
 * Expand the eq/ne/lt/le/gt/ge floating-point comparisons to the canonical ones (similar to what Mips32r6InstrInfo.td does).
 * Add tests for ord/uno.
 * Add tests for ueq/one/ult/ule/ugt/uge.
 * Fix existing comparison tests to remove the (res & 1) code, which setBooleanContents stops from generating.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11970

llvm-svn: 244779
2015-08-12 17:53:29 +00:00
Simon Pilgrim
a5737a44da Cleaned up test. NFCI.
llvm-svn: 244765
2015-08-12 17:00:50 +00:00
John Brawn
75fc09ddba Redo "Make global aliases have symbol size equal to their type"
r242520 was reverted in r244313 as the expected behaviour of the alias
attribute in C is that the alias has the same size as the aliasee. However
we can re-introduce adding the size on the alias when the aliasee does not,
from a source code or object perspective, exist as a discrete entity. This
happens when the aliasee is not a symbol, or when that symbol is private.

Differential Revision: http://reviews.llvm.org/D11943

llvm-svn: 244752
2015-08-12 15:05:39 +00:00
John Brawn
0bef27d836 [GlobalMerge] Only emit aliases for internal linkage variables for non-Mach-O
On Mach-O emitting aliases for the variables that make up a MergedGlobals
variable can cause problems when linking with dead stripping enabled so don't
do that, except for external variables where we must emit an alias.

llvm-svn: 244748
2015-08-12 13:36:48 +00:00
Michael Kuperstein
fe0d9bb6eb [X86] Disable mul -> shl + lea combine when compiling for minsize
Differential Revision: http://reviews.llvm.org/D11904

llvm-svn: 244740
2015-08-12 11:27:26 +00:00
Michael Kuperstein
bc7f99a3ab [X86] Allow x86 call frame optimization to fold more loads into pushes
This abstracts away the test for "when can we fold across a MachineInstruction"
into the the MI interface, and changes call-frame optimization use the same test
the peephole optimizer users.

Differential Revision: http://reviews.llvm.org/D11945

llvm-svn: 244729
2015-08-12 10:14:58 +00:00
Matt Arsenault
c574686529 AMDGPU: Fix assert on dbg_value instructions
llvm-svn: 244728
2015-08-12 09:04:44 +00:00
Simon Pilgrim
8c049d5c03 [InstCombine] Move SSE/AVX vector blend folding to instcombiner
As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely).

InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask).

I also moved all the relevant combine tests into InstCombine/blend_x86.ll

Differential Revision: http://reviews.llvm.org/D11934

llvm-svn: 244723
2015-08-12 08:08:56 +00:00
Sanjay Patel
260b6d36f4 [x86] enable machine combiner reassociations for 256-bit vector FP mul/add
llvm-svn: 244705
2015-08-12 00:29:10 +00:00
Mark Heffernan
438ffe5eac Use 32-bit divides instead of 64-bit divides where possible.
For NVPTX, try to use 32-bit division instead of 64-bit division when the dividend and divisor
fit in 32 bits. This speeds up some internal benchmarks significantly. The underlying reason
is that many index computations are carried out in 64-bits but never actually exceed the
capacity of a 32-bit word.

llvm-svn: 244684
2015-08-11 22:16:34 +00:00
JF Bastien
da06bce8b5 WebAssembly: implement comparison.
Some of the FP comparisons (ueq, one, ult, ule, ugt, uge) are currently broken, I'll fix them in a follow-up.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11924

llvm-svn: 244665
2015-08-11 21:02:46 +00:00
Sanjay Patel
2c6a01570d [x86] enable machine combiner reassociations for 128-bit vector single/double multiplies
llvm-svn: 244657
2015-08-11 20:19:23 +00:00
Sanjay Patel
82d91ddb4f fix minsize detection: minsize attribute implies optimizing for size
Also, add a test for optsize because this was not part of any existing regression test.

llvm-svn: 244651
2015-08-11 19:39:36 +00:00
Jingyue Wu
99eb4685ef SelectionDAG: Prefer to combine multiplication with less uses for fma
Summary:
For example:

  s6 = s0*s5;
  s2 = s6*s6 + s6;
  ...
  s4 = s6*s3;

We notice that it is possible for s2 is folded to fma (s0, s5, fmul (s6 s6)).
This only happens when Aggressive is true, otherwise hasOneUse() check
already prevents from folding the multiplication with more uses.

Test Plan: test/CodeGen/NVPTX/fma-assoc.ll

Patch by Xuetian Weng

Reviewers: hfinkel, apazos, jingyue, ohsallen, arsenm

Subscribers: arsenm, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11855

llvm-svn: 244649
2015-08-11 19:21:46 +00:00
Sanjay Patel
070df89928 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244631
2015-08-11 17:04:31 +00:00
Sanjay Patel
caddda56aa add missing tests for powi expansion with size optimizations
The minsize test will be fixed in the next commit.

llvm-svn: 244630
2015-08-11 16:58:49 +00:00
Sanjay Patel
c3e8349a3e fixed to use FileCheck
llvm-svn: 244627
2015-08-11 16:51:31 +00:00
Sanjay Patel
605b6adf31 fixed to test attribute, rather than CPU
llvm-svn: 244625
2015-08-11 16:43:18 +00:00
John Brawn
863bfdbfb4 [GlobalMerge] Use private linkage for MergedGlobals variables
Other objects can never reference the MergedGlobals symbol so external linkage
is never needed. Using private instead of internal linkage means the object is
more similar to what it looks like when global merging is not enabled, with
the only difference being that the merged variables are addressed indirectly
relative to the start of the section they are in.

Also add aliases for merged variables with internal linkage, as this also makes
the object be more like what it is when they are not merged.

Differential Revision: http://reviews.llvm.org/D11942

llvm-svn: 244615
2015-08-11 15:48:04 +00:00
Sanjay Patel
c454f07eb1 delete FIXME comment; it's fixed
llvm-svn: 244605
2015-08-11 14:35:29 +00:00
Sanjay Patel
74ca312666 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244604
2015-08-11 14:31:14 +00:00
Sanjay Patel
52c2691829 add missing test for machine combiner when optimizing for size
The minsize test will be fixed in the next commit.

llvm-svn: 244603
2015-08-11 14:29:45 +00:00
Michael Kuperstein
243c073a2e [X86] Allow merging of immediates within a basic block for code size savings
First step in preventing immediates that occur more than once within a single
basic block from being pulled into their users, in order to prevent unnecessary
large instruction encoding .Currently enabled only when optimizing for size.

Patch by: zia.ansari@intel.com
Differential Revision: http://reviews.llvm.org/D11363

llvm-svn: 244601
2015-08-11 14:10:58 +00:00
James Molloy
b7b2a1e9b4 [AArch64] Match fminnum/fmaxnum for vector fminnm/fmaxnm instead of an intrinsic.
Lower Intrinsic::aarch64_neon_fmin/fmax to fminnum/fmannum and match that instead. Minimal functional change:

  - Extra tests added because coverage of scalar fminnm/fmaxnm instructions was nonexistant.
  - f16 test updated because now we actually generate scalar fminnm/fmaxnm we no longer need to bail out to a libcall!

llvm-svn: 244595
2015-08-11 12:06:37 +00:00
Michael Kuperstein
7337ee23d8 [X86] When optimizing for minsize, use POP for small post-call stack clean-up
When optimizing for size, replace "addl $4, %esp" and "addl $8, %esp"
following a call by one or two pops, respectively. We don't try to do it in
general, but only when the stack adjustment immediately follows a call - which
is the most common case.

That allows taking a short-cut when trying to find a free register to pop into,
instead of a full-blown liveness check. If the adjustment immediately follows a
call, then every register the call clobbers but doesn't define should be dead at
that point, and can be used.

Differential Revision: http://reviews.llvm.org/D11749

llvm-svn: 244578
2015-08-11 08:48:48 +00:00
Michael Kuperstein
82814f63c0 Allow PeepholeOptimizer to fold a few more cases
The condition for clearing the folding candidate list was clamped together
with the "uninteresting instruction" condition. This is too conservative,
e.g. we don't need to clear the list when encountering an IMPLICIT_DEF.

Differential Revision: http://reviews.llvm.org/D11591

llvm-svn: 244577
2015-08-11 08:19:43 +00:00
JF Bastien
ef172fc9f0 WebAssembly: add basic floating-point tests
Summary: I somehow forgot to add these when I added the basic floating-point opcodes. Also remove ceil/floor/trunc/nearestint for now, and add them only when properly tested.

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11927

llvm-svn: 244562
2015-08-11 02:45:15 +00:00
JF Bastien
e73ce68225 WebAssembly: simply assert on SNaN and NaNs with payloads
Summary: convertToHexString doesn't represent them correctly at this point in time. This is a follow-up to sunfish's suggestion in D11914.

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11925

llvm-svn: 244551
2015-08-11 00:49:20 +00:00
Alex Lorenz
c483808785 MIR Serialization: Serialize UsedPhysRegMask from the machine register info.
This commit serializes the UsedPhysRegMask register mask from the machine
register information class. The mask is serialized as an inverted
'calleeSavedRegisters' mask to keep the output minimal.

This commit also allows the MIR parser to infer this mask from the register
mask operands if the machine function doesn't specify it.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 244548
2015-08-11 00:32:49 +00:00
Alex Lorenz
c5d35ba009 MIR Parser: Report an error when a stack object is redefined.
llvm-svn: 244536
2015-08-10 23:50:41 +00:00
Alex Lorenz
1d9a303142 MIR Parser: Report an error when a fixed stack object is redefined.
llvm-svn: 244534
2015-08-10 23:45:02 +00:00
Alex Lorenz
b97c9ef4d0 MIR Serialization: Serialize the liveout register mask machine operands.
llvm-svn: 244529
2015-08-10 23:24:42 +00:00
Sanjay Patel
d967a878fa fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244528
2015-08-10 23:07:26 +00:00
JF Bastien
4a6422562d WebAssembly: print immediates
Summary:
For now output using C99's hexadecimal floating-point representation.

This patch also cleans up how machine operands are printed: instead of special-casing per type of machine instruction, the code now handles operands generically.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11914

llvm-svn: 244520
2015-08-10 22:36:48 +00:00
Alex Lorenz
e5101e2016 MachineVerifier: Handle the optional def operand in a PATCHPOINT instruction.
The PATCHPOINT instructions have a single optional defined register operand,
but the machine verifier can't verify the optional defined register operands.
This commit makes sure that the machine verifier won't report an error when a
PATCHPOINT instruction doesn't have its optional defined register operand.
This change will allow us to enable the machine verifier for the code
generation tests for the patchpoint intrinsics.

Reviewers: Juergen Ributzka
llvm-svn: 244513
2015-08-10 21:47:36 +00:00
Alex Lorenz
2f43dd5a12 StackMap: FastISel: Add an appropriate number of immediate operands to the
frame setup instruction.

This commit ensures that the stack map lowering code in FastISel adds an
appropriate number of immediate operands to the frame setup instruction.

The previous code added just one immediate operand, which was fine for a target
like AArch64, but on X86 the ADJCALLSTACKDOWN64 instruction needs two explicit
operands. This caused the machine verifier to report an error when the old code
added just one.

Reviewers: Juergen Ributzka

Differential Revision: http://reviews.llvm.org/D11853

llvm-svn: 244508
2015-08-10 21:27:03 +00:00
JF Bastien
fa9746dc8d x86: Emit LAHF/SAHF instead of PUSHF/POPF
NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (priviledged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS, this commit now generated LAHF/SAHF instead, for all of x86 (not just NaCl) because it leads to an overall performance gain over PUSHF/POPF.

As with the previous patch this code generation is pretty bad because it occurs very later, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire.

I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]], the results on an Intel Haswell E5-2690 CPU at 2.9GHz are:

| Time per call (ms)  | Runtime (ms) | Benchmark                      |
| 0.000012514         |      6257    | sete.i386                      |
| 0.000012810         |      6405    | sete.i386-fast                 |
| 0.000010456         |      5228    | sete.x86-64                    |
| 0.000010496         |      5248    | sete.x86-64-fast               |
| 0.000012906         |      6453    | lahf-sahf.i386                 |
| 0.000013236         |      6618    | lahf-sahf.i386-fast            |
| 0.000010580         |      5290    | lahf-sahf.x86-64               |
| 0.000010304         |      5152    | lahf-sahf.x86-64-fast          |
| 0.000028056         |     14028    | pushf-popf.i386                |
| 0.000027160         |     13580    | pushf-popf.i386-fast           |
| 0.000023810         |     11905    | pushf-popf.x86-64              |
| 0.000026468         |     13234    | pushf-popf.x86-64-fast         |

Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seems to be worth teaching LLVM about individual flags, at least not for this purpose.

Reviewers: rnk, jvoung, t.p.northover

Subscribers: llvm-commits

Differential revision: http://reviews.llvm.org/D6629

llvm-svn: 244503
2015-08-10 20:59:36 +00:00
Sanjay Patel
d09391c8cd fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244499
2015-08-10 20:45:44 +00:00
Sanjay Patel
178f8cba51 [x86, SSE]]add missing tests for load folding with partial register update
The minsize case is wrong; that will be fixed in the next commit.

llvm-svn: 244498
2015-08-10 20:34:34 +00:00
Simon Pilgrim
a3a72b41de [InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombiner
As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations.

Differential Revision: http://reviews.llvm.org/D11886

llvm-svn: 244495
2015-08-10 20:21:15 +00:00
Jonathan Roelofs
f45295c366 Fix a few more cases of 'CHECK[^:]*$'. NFCI
llvm-svn: 244491
2015-08-10 19:56:39 +00:00
James Y Knight
3994be87de [Sparc] Implement i64 load/store support for 32-bit sparc.
The LDD/STD instructions can load/store a 64bit quantity from/to
memory to/from a consecutive even/odd pair of (32-bit) registers. They
are part of SparcV8, and also present in SparcV9. (Although deprecated
there, as you can store 64bits in one register).

As recommended on llvmdev in the thread "How to enable use of 64bit
load/store for 32bit architecture" from Apr 2015, I've modeled the
64-bit load/store operations as working on a v2i32 type, rather than
making i64 a legal type, but with few legal operations. The latter
does not (currently) work, as there is much code in llvm which assumes
that if i64 is legal, operations like "add" will actually work on it.

The same assumption does not hold for v2i32 -- for vector types, it is
workable to support only load/store, and expand everything else.

This patch:
- Adds a new register class, IntPair, for even/odd pairs of registers.

- Modifies the list of reserved registers, the stack spilling code,
  and register copying code to support the IntPair register class.

- Adds support in AsmParser. (note that in asm text, you write the
  name of the first register of the pair only. So the parser has to
  morph the single register into the equivalent paired register).

- Adds the new instructions themselves (LDD/STD/LDDA/STDA).

- Hooks up the instructions and registers as a vector type v2i32. Adds
  custom legalizer to transform i64 load/stores into v2i32 load/stores
  and bitcasts, so that the new instructions can actually be
  generated, and marks all operations other than load/store on v2i32
  as needing to be expanded.

- Copies the unfortunate SelectInlineAsm hack from ARMISelDAGToDAG.
  This hack undoes the transformation of i64 operands into two
  arbitrarily-allocated separate i32 registers in
  SelectionDAGBuilder. and instead passes them in a single
  IntPair. (Arbitrarily allocated registers are not useful, asm code
  expects to be receiving a pair, which can be passed to ldd/std.)

Also adds a bunch of test cases covering all the bugs I've added along
the way.

Differential Revision: http://reviews.llvm.org/D8713

llvm-svn: 244484
2015-08-10 19:11:39 +00:00
Jonathan Roelofs
49e46ce8e2 Fix a bunch of trivial cases of 'CHECK[^:]*$' in the tests. NFCI
I looked into adding a warning / error for this to FileCheck, but there doesn't
seem to be a good way to avoid it triggering on the instances of it in RUN lines.

llvm-svn: 244481
2015-08-10 19:01:27 +00:00
Sanjay Patel
10294b59de fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244464
2015-08-10 17:15:17 +00:00
Sanjay Patel
0f12d71b49 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244463
2015-08-10 17:00:44 +00:00
Sanjay Patel
68b0325a9e fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244460
2015-08-10 16:47:47 +00:00