Commit Graph

13608 Commits

Author SHA1 Message Date
Hal Finkel
2d55698ed7 [PowerPC/MIR Serialization] Target flags serialization support
Add support for MIR serialization of PowerPC-specific operand target flags
(based on the generic infrastructure added in r244185 and r245383).

I won't even pretend that this is good test coverage, but this includes the
regression test associated with r246372. Adding an MIR test for that fix is far
superior to adding an IR-level test because particular instruction-scheduling
decisions are necessary in order to expose the bug, and using an MIR test we
can start the pipeline post-scheduling.

llvm-svn: 246373
2015-08-30 07:50:35 +00:00
James Molloy
a184adffab [ARM] Fix up buildbots after r246360
I have no idea how I missed this in my internal testing. Just no idea. Sorry for the bot-armageddon.

llvm-svn: 246361
2015-08-29 11:50:08 +00:00
James Molloy
45ee9898ec [ARM] Hoist fabs/fneg above a conversion to float.
This is especially visible in softfp mode, for example in the implementation of libm fabs/fneg functions. If we have:

%1 = vmovdrr r0, r1
%2 = fabs %1

then move the fabs before the vmovdrr:

%1 = and r1, #0x7FFFFFFF
%2 = vmovdrr r0, r1

This is never a lose, and could be a serious win because the vmovdrr may be followed by a vmovrrd, which would enable us to remove the conversion into FPRs completely.

We already do this for f32, but not for f64. Tests are added for both.

llvm-svn: 246360
2015-08-29 10:49:11 +00:00
Matt Arsenault
e4d0c142e8 AMDGPU: Add sdst operand to VOP2b instructions
The VOP3 encoding of these allows any SGPR pair for the i1
output, but this was forced before to always use vcc.
This doesn't yet try to use this, but does add the operand
to the definitions so the main change is adding vcc to the
output of the VOP2 encoding.

llvm-svn: 246358
2015-08-29 07:16:50 +00:00
Matt Arsenault
5c004a7c61 AMDGPU: Fix dropping mem operands when moving to VALU
Without a memory operand, mayLoad or mayStore instructions
are treated as hasUnorderedMemRef, which results in much worse
scheduling.

We really should have a verifier check that any
non-side effecting mayLoad or mayStore has a memory operand.
There are a few instructions (interp and images) which I'm
not sure what / where to add these.

llvm-svn: 246356
2015-08-29 06:48:46 +00:00
Duncan P. N. Exon Smith
0bd73bb58b DI: Update tests before adding !dbg subprogram attachments
I'm working on adding !dbg attachments to functions (PR23367), which
we'll use to determine the canonical subprogram for a function (instead
of the `subprograms:` array in the compile units).  This updates a few
old tests in preparation.

Transforms/Mem2Reg/ConvertDebugInfo2.ll had an old-style grep+count
based test that would start to fail because I've added an extra line
with `!dbg`.  Instead, explicitly `CHECK` for what I think the test
actually cares about.

All three testcases have subprograms with a valid `function:` reference
-- which means my upgrade script will add a `!dbg` attachment -- but
that aren't referenced from any compile unit.  I suspect these testcases
were handreduced over-zealously (or have bitrotted?).  Add a reference
from the compile unit so that upcoming Verifier checks won't fail here.

llvm-svn: 246351
2015-08-28 23:32:00 +00:00
Duncan P. N. Exon Smith
814b8e91c7 DI: Require subprogram definitions to be distinct
As a follow-up to r246098, require `DISubprogram` definitions
(`isDefinition: true`) to be 'distinct'.  Specifically, add an assembler
check, a verifier check, and bitcode upgrading logic to combat testcase
bitrot after the `DIBuilder` change.

While working on the testcases, I realized that
test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore.  Its
purpose was to check for a corner case in PR22792 where two subprogram
definitions match exactly and share the same metadata node.  The new
verifier check, requiring that subprogram definitions are 'distinct',
precludes that possibility.

I updated almost all the IR with the following script:

    git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/'

Likely some variant of would work for out-of-tree testcases.

llvm-svn: 246327
2015-08-28 20:26:49 +00:00
Matt Arsenault
d9c830154f Make MergeConsecutiveStores look at other stores on same chain
When combiner AA is enabled, look at stores on the same chain.
Non-aliasing stores are moved to the same chain so the existing
code fails because it expects to find an adajcent store on a consecutive
chain.

Because of how DAGCombiner tries these store combines,
MergeConsecutiveStores doesn't see the correct set of stores on the chain
when it visits the other stores. Each store individually has its chain
fixed before trying to merge consecutive stores, and then tries to merge
stores from that point before the other stores have been processed to
have their chains fixed. To fix this, attempt to use FindBetterChain
on any possibly neighboring stores in visitSTORE.

Suppose you have 4 32-bit stores that should be merged into 1 vector
store. One store would be visited first, fixing the chain. What happens is
because not all of the store chains have yet been fixed, 2 of the stores
are merged. The other 2 stores later have their chains fixed,
but because the other stores were already merged, they have different
memory types and merging the two different sized stores is not
supported and would be more difficult to handle.

llvm-svn: 246307
2015-08-28 17:31:28 +00:00
David Majnemer
9c51053dd5 Test case for r246304.
llvm-svn: 246306
2015-08-28 17:19:54 +00:00
Sanjay Patel
7c912898a5 [x86] enable machine combiner reassociations for scalar 'and' insts
llvm-svn: 246300
2015-08-28 14:09:48 +00:00
Joseph Tremoulet
ec18285b91 [WinEH] Update coloring to handle nested cases cleanly
Summary:
Change the coloring algorithm in WinEHPrepare to visit a funclet's exits
in its parents' contexts and so properly classify the continuations of
nested funclets.

Also change the placement of cloned blocks to be deterministic and to
maintain the relative order of each funclet's blocks.

Add a lit test showing various patterns that require cloning, the last
several of which don't have CHECKs yet because they require cloning
entire funclets which is NYI.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12353

llvm-svn: 246245
2015-08-28 01:12:35 +00:00
Quentin Colombet
fa4ecb4b9a [AArch64][CollectLOH] Fix a regression that prevented us to detect chains of
more than 2 instructions.

I introduced this regression a while back and did not noticed it because I
somehow forgot to push the initial test cases for the pass!

Fix that as well!

llvm-svn: 246239
2015-08-27 23:47:10 +00:00
Reid Kleckner
0e2882345d [WinEH] Add some support for code generating catchpad
We can now run 32-bit programs with empty catch bodies.  The next step
is to change PEI so that we get funclet prologues and epilogues.

llvm-svn: 246235
2015-08-27 23:27:47 +00:00
Ahmed Bougacha
87166905c8 [CodeGen] Check FoldConstantArithmetic result before using it.
Fixes PR24602: r245689 introduced an unguarded use of
SelectionDAG::FoldConstantArithmetic, which returns 0 when it fails
because of opaque (hoisted) constants.

llvm-svn: 246217
2015-08-27 21:46:04 +00:00
Matt Arsenault
053ae1f5e3 AMDGPU/SI: Add test for folding constants into operands
Patch by Axel Davy

llvm-svn: 246167
2015-08-27 17:41:27 +00:00
Jonathan Roelofs
054026dba2 Fix a case of CHECK[^:]*$.
http://reviews.llvm.org/D11917

llvm-svn: 246163
2015-08-27 17:03:14 +00:00
Cong Hou
08cb4fc688 Fixed a bug that edge weights are not assigned correctly when lowering switch statement.
This is a one-line-change patch that moves the update to UnhandledWeights to the correct position: it should be updated for all clusters instead of just range clusters.

Differential Revision: http://reviews.llvm.org/D12391

llvm-svn: 246129
2015-08-27 00:37:40 +00:00
Bjarke Hammersholt Roune
6c64738e87 [NVPTX] Let NVPTX backend detect integer min and max patterns.
Summary:
Let NVPTX backend detect integer min and max patterns during isel and emit intrinsics that enable hardware support.


Reviewers: jholewinski, meheff, jingyue

Subscribers: arsenm, llvm-commits, meheff, jingyue, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D12377

llvm-svn: 246107
2015-08-26 23:22:02 +00:00
Cong Hou
03127700d5 Assign weights to edges to jump table / bit test header when lowering switch statement.
Currently, when lowering switch statement and a new basic block is built for jump table / bit test header, the edge to this new block is not assigned with a correct weight. This patch collects the edge weight from all its successors and assign this sum of weights to the edge (and also the other fall-through edge). Test cases are adjusted accordingly.

Differential Revision: http://reviews.llvm.org/D12166#fae6eca7

llvm-svn: 246104
2015-08-26 23:15:32 +00:00
JF Bastien
45479f627a WebAssembly: handle private/internal globals.
Things of note:
 - Other linkage types aren't handled yet. We'll figure it out with dynamic linking.
 - Special LLVM globals are either ignored, or error out for now.
 - TLS isn't supported yet (WebAssembly will have threads later).
 - There currently isn't a syntax for alignment, I left it in a comment so it's easy to hook up.
 - Undef is convereted to whatever the type's appropriate null value is.
 - assert versus report_fatal_error: follow what other AsmPrinters do, and assert only on what should have been caught elsewhere.

llvm-svn: 246092
2015-08-26 22:09:54 +00:00
Matt Arsenault
5e7f95e567 AMDGPU: Don't reprocess instructions when splitting i64 bcnt
llvm-svn: 246079
2015-08-26 20:48:04 +00:00
Matt Arsenault
445833cc91 AMDGPU: Fix not moving users of s_bfe_i64 to VALU
This wouldn't propagate to users of the original BFE
and would hit a verifier error.

llvm-svn: 246078
2015-08-26 20:47:58 +00:00
Matthias Braun
4e7ded834f SelectionDAGBuilder: Fix SPDescriptor not resetting GuardReg
This was causing problems when some functions use a GuardReg and some
don't as can happen when mixing SelectionDAG and FastISel generated
functions.

llvm-svn: 246075
2015-08-26 20:46:52 +00:00
Matthias Braun
4816b18d86 FastISel: Avoid adding a successor block twice for degenerate IR.
This fixes http://llvm.org/PR24581

Differential Revision: http://reviews.llvm.org/D12350

llvm-svn: 246074
2015-08-26 20:46:49 +00:00
Matt Arsenault
19c5488015 AMDGPU: Produce error on dynamic_stackalloc
llvm-svn: 246048
2015-08-26 18:37:13 +00:00
James Y Knight
3602286937 [SPARC] Fix stupid oversight in stack realignment support.
If you're going to realign %sp to get object alignment properly (which
the code does), and stack offsets and alignments are calculated going
down from %fp (which they are), then the total stack size had better
be a multiple of the alignment. LLVM did indeed ensure that.

And then, after aligning, the sparc frame code added 96 (for sparcv8)
to the frame size, making any requested alignment of 64-bytes or
higher *guaranteed* to be misaligned. The test case added with r245668
even tests this exact scenario, and asserted the incorrect behavior,
which I somehow failed to notice. D'oh.

This change fixes the frame lowering code to align the stack size
*after* adding the spill area, instead.

Differential Revision: http://reviews.llvm.org/D12349

llvm-svn: 246042
2015-08-26 17:57:51 +00:00
Charles Davis
119525914c Make variable argument intrinsics behave correctly in a Win64 CC function.
Summary:
This change makes the variable argument intrinsics, `llvm.va_start` and
`llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows
inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch
I have to add support for GCC's `__builtin_ms_va_list` constructs.

Reviewers: nadav, asl, eugenis

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1622

llvm-svn: 245990
2015-08-25 23:27:41 +00:00
JF Bastien
b6091dfe0f WebAssembly: emit (func (param t) (result t)) s-expressions
Summary: Match spec format: https://github.com/WebAssembly/spec/blob/master/ml-proto/test/fac.wasm

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D12307

llvm-svn: 245986
2015-08-25 22:58:05 +00:00
JF Bastien
289287060b WebAssembly: comment out .globl when printing textual assembly
Do the same for .weak (not implemented for now, but may as well to it). Update comment string to two semicolons.

llvm-svn: 245982
2015-08-25 22:23:15 +00:00
Cong Hou
cd59591396 Remove the final bit test during lowering switch statement if all cases in bit test cover a contiguous range.
When lowering switch statement, if bit tests are used then LLVM will always generates a jump to the default statement in the last bit test. However, this is not necessary when all cases in bit tests cover a contiguous range. This is because when generating the bit tests header MBB, there is a range check that guarantees cases in bit tests won't go outside of [low, high], where low and high are minimum and maximum case values in the bit tests. This patch checks if this is the case and then doesn't emit jump to default statement and hence saves a bit test and a branch.

Differential Revision: http://reviews.llvm.org/D12249

llvm-svn: 245976
2015-08-25 21:34:38 +00:00
Sanjay Patel
1015661edf fix CHECK-LABEL and wrong label
llvm-svn: 245958
2015-08-25 18:12:40 +00:00
Sanjay Patel
deb8f826a5 make fast unaligned memory accesses implicit with SSE4.2 or SSE4a
This is a follow-on from the discussion in http://reviews.llvm.org/D12154.

This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has
generally fast unaligned memory ops.

A motivating use case for this change is a clang invocation that doesn't explicitly set
the CPU, but does target a feature that we know only exists on a CPU that supports fast
unaligned memops. For example:
$ clang -O1 foo.c -mavx

This resolves a difference in lowering noted in PR24449:
https://llvm.org/bugs/show_bug.cgi?id=24449

Before this patch, we used different store types depending on whether the example can be
lowered as a memset or not.

Differential Revision: http://reviews.llvm.org/D12288

llvm-svn: 245950
2015-08-25 16:29:21 +00:00
Michael Kuperstein
6e3fee07f7 [X86] Remove references to _ftol2
As of r245924, _ftol2 is no longer used for fptoui on MS platforms.
Remove the dead code associated with it.

llvm-svn: 245925
2015-08-25 07:58:33 +00:00
Michael Kuperstein
8515893be8 [X86] Fix fptoui conversions
This fixes two issues in x86 fptoui lowering.
1) Makes conversions from f80 go through the right path on AVX-512.
2) Implements an inline sequence for fptoui i64 instead of a library
call. This improves performance by 6X on SSE3+ and 3X otherwise.
Incidentally, it also removes the use of ftol2 for fptoui, which was
wrong to begin with, as ftol2 converts to a signed i64, producing
wrong results for values >= 2^63.

Patch by: mitch.l.bodart@intel.com
Differential Revision: http://reviews.llvm.org/D11316

llvm-svn: 245924
2015-08-25 07:42:09 +00:00
Steve King
5cdbd20cc3 Pass function attributes instead of boolean in isIntDivCheap().
llvm-svn: 245921
2015-08-25 02:31:21 +00:00
Hal Finkel
0f2ddcb83f [PowerPC] PPCVSXFMAMutate should ignore trivial-copy addends
We might end up with a trivial copy as the addend, and if so, we should ignore
the corresponding FMA instruction. The trivial copy can be coalesced away later,
so there's nothing to do here. We should not, however, assert. Fixes PR24544.

llvm-svn: 245907
2015-08-24 23:48:28 +00:00
JF Bastien
af111db8af WebAssembly: Implement call
Summary: Support function calls.

Reviewers: sunfish, sunfishcode

Subscribers: sunfishcode, jfb, llvm-commits

Differential revision: http://reviews.llvm.org/D12219

llvm-svn: 245887
2015-08-24 22:16:48 +00:00
JF Bastien
19c2e6634d Revert two bad commits.
Summary: I forgot to squash git commits before doing an svn dcommit of D12219. Reverting, and re-submitting.

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D12298

llvm-svn: 245886
2015-08-24 22:07:33 +00:00
JF Bastien
744ad106c3 Missing print.
llvm-svn: 245883
2015-08-24 22:00:04 +00:00
JF Bastien
d8a9d66d50 call
llvm-svn: 245882
2015-08-24 21:59:51 +00:00
Simon Pilgrim
342f60384e [X86][SSE] Added tests for zero-extension vector shuffles that don't extend starting from the 0'th lane.
llvm-svn: 245878
2015-08-24 21:28:13 +00:00
Dan Gohman
69c4c76396 [WebAssembly] CodeGen support for __builtin_wasm_page_size()
llvm-svn: 245872
2015-08-24 21:03:24 +00:00
Bill Schmidt
32fd189de2 [PPC64LE] Fix PR24546 - Swap optimization and debug values
This patch fixes PR24546, which demonstrates a segfault during the VSX
swap removal pass.  The problem is that debug value instructions were
not excluded from the list of instructions to be analyzed for webs of
related computation.  I've added the test case from the PR as a crash
test in test/CodeGen/PowerPC.

llvm-svn: 245862
2015-08-24 19:27:27 +00:00
Dan Gohman
7b63484b99 [WebAssembly] Skeleton FastISel support
llvm-svn: 245860
2015-08-24 18:44:37 +00:00
Dan Gohman
896e53fae8 [WebAssembly] Implement floating point rounding operators.
llvm-svn: 245859
2015-08-24 18:23:13 +00:00
Dan Gohman
08fc966d3c [WebAssembly] Implement the is_zero_undef forms of cttz and ctlz
llvm-svn: 245851
2015-08-24 16:39:37 +00:00
Oliver Stannard
284f2bffc9 Add DAG optimisation for FP16_TO_FP
The FP16_TO_FP node only uses the bottom 16 bits of its input, so the
following pattern can be optimised by removing the AND:

  (FP16_TO_FP (AND op, 0xffff)) -> (FP16_TO_FP op)

This is a common pattern for ARM targets when functions have __fp16
arguments, as they are passed as floats (so that they get passed in the
correct registers), but then bitcast and truncated to ignore the top 16
bits.

llvm-svn: 245832
2015-08-24 09:47:45 +00:00
Scott Douglass
bdef60462d [ARM] Use AEABI helpers for i64 div and rem
Differential Revision: http://reviews.llvm.org/D12232

llvm-svn: 245830
2015-08-24 09:17:18 +00:00
Sanjay Patel
453b4df973 remove FIXME; fixed by r245733
llvm-svn: 245819
2015-08-23 20:43:25 +00:00
Simon Pilgrim
2a7049abe0 [DAGCombiner] Fold CONCAT_VECTORS of bitcasted EXTRACT_SUBVECTOR
Minor generalization of D12125 - peek through any bitcast to the original vector that we're extracting from.

llvm-svn: 245814
2015-08-23 15:22:14 +00:00