Commit Graph

1783 Commits

Author SHA1 Message Date
Hiroshi Inoue
95f24dca98 [SelectionDAG] set dereferenceable flag when expanding memcpy/memmove
When SelectionDAG expands memcpy (or memmove) call into a sequence of load and store instructions, it disregards dereferenceable flag even the source pointer is known to be dereferenceable.
This results in an assertion failure if SelectionDAG commonizes a load instruction generated for memcpy with another load instruction for the source pointer.
This patch makes SelectionDAG to set the dereferenceable flag for the load instructions properly to avoid the assertion failure.

Differential Revision: https://reviews.llvm.org/D34467

llvm-svn: 306209
2017-06-24 15:17:38 +00:00
Nirav Dave
18c10c53d0 Update constants in complex-return test to prevent reduction to smaller constants
llvm-svn: 306192
2017-06-24 01:29:24 +00:00
Lei Huang
84dbbfdeb9 [PowerPC] define target hook isReallyTriviallyReMaterializable()
Define target hook isReallyTriviallyReMaterializable() to explicitly specify
PowerPC instructions that are trivially rematerializable.  This will allow
the MachineLICM pass to accurately identify PPC instructions that should always
be hoisted.

Differential Revision: https://reviews.llvm.org/D34255

llvm-svn: 305932
2017-06-21 17:17:56 +00:00
Matthias Braun
7a482e2302 RegisterScavenging: Followup to r305625
This does some improvements/cleanup to the recently introduced
scavengeRegisterBackwards() functionality:

- Rewrite findSurvivorBackwards algorithm to use the existing
  LiveRegUnit::accumulateBackward() code. This also avoids the Available
  and Candidates bitset and just need 1 LiveRegUnit instance
  (= 1 bitset).
- Pick registers in allocation order instead of register number order.

llvm-svn: 305817
2017-06-20 18:43:14 +00:00
Sanjay Patel
a351a61cf2 [CGP, PowerPC] try to constant fold before creating loads for memcmp expansion
This is the last step needed to avoid regressions for x86 before we flip the switch to allow 
expansion of the smallest set of memcpy() via CGP. The DAG version checks for constant strings, 
so we need to do that here too.

FWIW, the 2 constant test is not handled by LibCallSimplifier::optimizeMemCmp() because that 
code is limited to 8-bit constant arrays. LibCallSimplifier will also fail to optimize some 1 
constant tests because its alignment requirements are too strict (shouldn't require alignment 
for a constant operand).

Differential Revision: https://reviews.llvm.org/D34071

llvm-svn: 305734
2017-06-19 19:48:35 +00:00
Matthias Braun
537d039104 RegScavenging: Add scavengeRegisterBackwards()
Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse
to place spills as the very first instruciton of a basic block and thus
artifically increase pressure (test in
test/CodeGen/PowerPC/scavenging.mir:spill_at_begin)

This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.

This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.

Differential Revision: http://reviews.llvm.org/D21885

llvm-svn: 305625
2017-06-17 02:08:18 +00:00
Matthias Braun
35530d7129 Revert "RegScavenging: Add scavengeRegisterBackwards()"
Revert because of reports of some PPC input starting to spill when it
was predicted that it wouldn't and no spillslot was reserved.

This reverts commit r305516.

llvm-svn: 305566
2017-06-16 17:48:08 +00:00
Matthias Braun
a42c537912 RegScavenging: Add scavengeRegisterBackwards()
Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64
problems reported in the stage2 build last time, which I cannot
reproduce right now.

This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.

This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.

Differential Revision: http://reviews.llvm.org/D21885

llvm-svn: 305516
2017-06-15 22:14:55 +00:00
Lei Huang
b4733ca8c5 [MachineLICM] Hoist TOC-based address instructions
Add condition for MachineLICM to safely hoist instructions that utilize
non constant registers that are reserved.

On PPC, global variable access is done through the table of contents (TOC)
which is always in register X2.  The ABI reserves this register in any
functions that have calls or access global variables.

A call through a function pointer involves saving, changing and restoring
this register around the call and thus MachineLICM does not consider it to
be invariant. We can however guarantee the register is preserved across the
call and thus is invariant.

Differential Revision: https://reviews.llvm.org/D33562

llvm-svn: 305490
2017-06-15 18:29:59 +00:00
Hiroshi Inoue
7a08bb1458 [PowerPC] fix potential verification errors on CFENCE8
This patch fixes a potential verification error (64-bit register operands for cmpw) with -verify-machineinstrs.

Differential Revision: https://reviews.llvm.org/D34208

llvm-svn: 305479
2017-06-15 16:51:28 +00:00
Nemanja Ivanovic
7855185bbb Revert r304907 as it is causing some failures that I cannot reproduce.
Reverting this until a test case can be provided to aid the investigation.

llvm-svn: 305372
2017-06-14 07:05:42 +00:00
Tony Jiang
1a8eec141a [PowerPC] Match vec_revb builtins to P9 instructions.
Power9 has instructions that will reverse the bytes within an element for all
sizes (half-word, word, double-word and quad-word). These can be used for the
vec_revb builtins in altivec.h. However, we implement these to match vector
shuffle nodes as that will cover both the builtins and vector shuffles that
occur in the SDAG through other means.

Differential Revision: https://reviews.llvm.org/D33690

llvm-svn: 305214
2017-06-12 18:24:36 +00:00
Tony Jiang
30a49d1a3d [Power9] Added support for the modsw, moduw, modsd, modud hardware instructions.
Note that if we need the result of both the divide and the modulo then we
compute the modulo based on the result of the divide and not using the new
hardware instruction.

Commit on behalf of STEFAN PINTILIE.
Differential Revision: https://reviews.llvm.org/D33940

llvm-svn: 305210
2017-06-12 17:58:42 +00:00
Sanjay Patel
dd96270472 [PowerPC] add memcmp test with one constant operand and equality cmp; NFC
llvm-svn: 305131
2017-06-09 23:15:14 +00:00
Sanjay Patel
5e370850d4 [CGP] don't expand a memcmp with nobuiltin attribute
This matches the behavior used in the SDAG when expanding memcmp.

For reference, we're intentionally treating the earlier fortified call transforms differently after:
https://bugs.llvm.org/show_bug.cgi?id=23093
https://reviews.llvm.org/rL233776

One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers:
https://reviews.llvm.org/D19781
https://reviews.llvm.org/D19801

Differential Revision: https://reviews.llvm.org/D34043

llvm-svn: 305007
2017-06-08 19:47:25 +00:00
Guozhi Wei
f31c56df2a [PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64
In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64.

This patch fixed PR32442.

Differential Revision: https://reviews.llvm.org/D31407

llvm-svn: 305001
2017-06-08 18:27:24 +00:00
Zaara Syeda
79acbbe513 [Power9] Exploit vector integer extend instructions
This patch adds build vector patterns to exploit the vector integer
extend instructions:
vextsb2w - Vector Extend Sign Byte To Word
vextsb2d - Vector Extend Sign Byte To Doubleword
vextsh2w - Vector Extend Sign Halfword To Word
vextsh2d - Vector Extend Sign Halfword To Doubleword
vextsw2d - Vector Extend Sign Word To Doubleword

Differential Revision: https://reviews.llvm.org/D33510

llvm-svn: 304992
2017-06-08 17:14:36 +00:00
Sanjay Patel
0edcd1d717 [PowerPC] add memcmp test with nobuiltin attr; NFC
In SDAG, we don't expand libcalls with a nobuiltin attribute.
It's not clear if that's correct from the existing code comment:
"Don't do the check if marked as nobuiltin for some reason."

...adding a test here either way to show that there is currently
a different behavior implemented in the CGP-based expansion.

llvm-svn: 304991
2017-06-08 17:09:18 +00:00
Sanjay Patel
e7c5041c2a [CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion
The test diff for PowerPC shows we can better optimize if this case is one block.

For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed 
cheap and SDAG can't optimize across blocks. 

Instead of this:

_cmp_eq8:
  movq  (%rdi), %rax
  cmpq  (%rsi), %rax
  je  LBB23_1
## BB#2:                                ## %res_block
  movl  $1, %ecx
  jmp LBB23_3
LBB23_1:
  xorl  %ecx, %ecx
LBB23_3:                                ## %endblock
  xorl  %eax, %eax
  testl %ecx, %ecx
  sete  %al
  retq

We get this:

cmp_eq8:   
  movq  (%rdi), %rcx
  xorl  %eax, %eax
  cmpq  (%rsi), %rcx
  sete  %al
  retq

And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). 
If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable 
CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we 
can lower the IR produced here optimally.

Differential Revision: https://reviews.llvm.org/D34005

llvm-svn: 304987
2017-06-08 16:53:18 +00:00
Sanjay Patel
8ce1e3b759 [CGP] avoid zext/trunc of a memcmp expansion compare
This could be viewed as another shortcoming of the DAGCombiner:
when both operands of a compare are zexted from the same source
type, we should be able to compare the original types.

The effect on PowerPC perf is likely unnoticeable, but there's a
visible regression for x86 if we feed the suboptimal IR for memcmp
expansion to the DAG:

_cmp_eq4_zexted_to_i64:
  movl  (%rdi), %ecx
  movl  (%rsi), %edx
  xorl  %eax, %eax
  cmpq  %rdx, %rcx
  sete  %al

_cmp_eq4_better:
  movl  (%rdi), %ecx
  xorl  %eax, %eax
  cmpl  (%rsi), %ecx
  sete  %al

llvm-svn: 304923
2017-06-07 16:16:45 +00:00
Nemanja Ivanovic
d8623f0825 [PowerPC] Eliminate integer compare instructions - vol. 5
Adds handling for i64 SETNE comparison (both sign and zero extended).

Differential Revision: https://reviews.llvm.org/D33720

llvm-svn: 304907
2017-06-07 13:18:06 +00:00
Nemanja Ivanovic
bb67f847d6 [PowerPC] Eliminate integer compare instructions - vol. 3
Adds handling for i32 SETNE comparison (both sign and zero extended).

Differential Revision: https://reviews.llvm.org/D33718

llvm-svn: 304901
2017-06-07 12:23:41 +00:00
Sanjay Patel
f57015d4cc [CGP / PowerPC] use direct compares if there's only one load per block in memcmp() expansion
I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the 
special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle.

I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time 
limitation in the DAG. We might have to just avoid those special-cases here in CGP. But 
regardless of that, I think this is a win for the more general cases.

http://rise4fun.com/Alive/cbQ

Differential Revision: https://reviews.llvm.org/D33963

llvm-svn: 304849
2017-06-07 00:17:08 +00:00
Sanjay Patel
7a52296f1f [PowerPC] auto-generate full checks and increase test coverage
3 of the tests were testing exactly the same thing: memcmp(x, y, 16) != 0.
I changed that to test 4, 7, and 16 bytes, so we can see how those differ.

llvm-svn: 304838
2017-06-06 22:06:07 +00:00
Matthias Braun
0021d46a1c RegisterScavenging: Add ScavengerTest pass
This pass allows to run the register scavenging independently of
PrologEpilogInserter to allow targeted testing.

Also adds some basic register scavenging tests.

llvm-svn: 304606
2017-06-02 23:01:42 +00:00
Sean Fertile
457ddd311a [PowerPC] Correctly specify the cache line size for Power 7, 8 and 9.
Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for
newer ppc processors.

Commiting on behalf of Stefan Pintilie.
Differential Revision: https://reviews.llvm.org/D33656

llvm-svn: 304317
2017-05-31 18:20:17 +00:00
Zaara Syeda
3a7578c658 [PPC] Inline expansion of memcmp
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.

Differential Revision: https://reviews.llvm.org/D28637

llvm-svn: 304313
2017-05-31 17:12:38 +00:00
Tony Jiang
60c247de18 [PowerPC] Fix a performance bug for PPC::XXPERMDI.
There are some VectorShuffle Nodes in SDAG which can be selected to XXPERMDI
Instruction, this patch recognizes them and does the selection to improve
the PPC performance.

Differential Revision: https://reviews.llvm.org/D33404

llvm-svn: 304298
2017-05-31 13:09:57 +00:00
Nemanja Ivanovic
accab033c9 [PowerPC] Eliminate integer compare instructions - vol. 3
This patch builds upon https://reviews.llvm.org/rL302810 to add
handling for the 64-bit SETEQ patterns.

Differential Revision: https://reviews.llvm.org/D33369

llvm-svn: 304286
2017-05-31 08:04:07 +00:00
Nemanja Ivanovic
e597bd8230 [PowerPC] Eliminate integer compare instructions - vol. 2
This patch builds upon https://reviews.llvm.org/rL302810 to add
handling for bitwise logical operations in general purpose registers.
The idea is to keep the values in GPRs as long as possible - only
extracting them to a condition register bit when no further operations
are to be done.

Differential Revision: https://reviews.llvm.org/D31851

llvm-svn: 304282
2017-05-31 05:40:25 +00:00
Tim Shen
0bd0aa8f07 [AntiDepBreaker] Revert r299124 and add a test.
Summary:
AntiDepBreaker intends to add all live-outs, including the implicit
CSRs, in StartBlock. r299124 was done without understanding that
intention.

Now with the live-ins propagated correctly (D32464), we can revert this change.

Reviewers: MatzeB, qcolombet

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D33697

llvm-svn: 304251
2017-05-30 22:26:52 +00:00
Matthias Braun
eec1f3672a LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI
Re-commit r303938 and r303954 with a fix for addLiveIns(): the internal
addPristines() function must be called on an empty set or it may
accidentally reset saved registers.

- addLiveOutsNoPristines() needs to add callee saved registers that are
  actually saved and restored somewhere to the set (they are not
  pristine).
- Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines().

This fixes the problem from D32156.

Differential Revision: https://reviews.llvm.org/D32464

llvm-svn: 304001
2017-05-26 16:23:08 +00:00
Matthias Braun
c93c063993 Revert "LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI"
Tentatively revert this to see if it fixes the buildbot stage2
breakages.

This reverts commit r303938.
This reverts commit r303954.

llvm-svn: 303960
2017-05-26 02:25:20 +00:00
Matthias Braun
9e6826de77 Test for r303938
llvm-svn: 303954
2017-05-26 01:29:25 +00:00
Tim Shen
a2b85da879 [PPC] Fix atomics lowering in DAG lowering.
I forgot to forward the chain, causing some missing instruction
dependencies. The test crashes the compiler without this patch.

Inspired by the test case, D33519 also tries to remove the extra sync.

Differential Revision: https://reviews.llvm.org/D33573

llvm-svn: 303931
2017-05-25 22:58:35 +00:00
Tony Jiang
0a429f040e [PowerPC] Fix a performance bug for PPC::XXSLDWI.
There are some VectorShuffle Nodes in SDAG which can be selected to XXSLDWI
instruction, this patch recognizes them and does the selection to improve the
PPC performance.

llvm-svn: 303822
2017-05-24 23:48:29 +00:00
Zaara Syeda
932978315b P9: D-form vector load/store. Differential Revision: https://reviews.llvm.org/D33248
llvm-svn: 303780
2017-05-24 17:50:37 +00:00
Hiroshi Inoue
37e63b1b21 Summary
PPC backend eliminates compare instructions by using record-form instructions in PPCInstrInfo::optimizeCompareInstr, which is called from peephole optimization pass.
This patch improves this optimization to eliminate more compare instructions in two types of common case.


- comparison against a constant 1 or -1

The record-form instructions set CR bit based on signed comparison against 0. So, the current implementation does not exploit the record-form instruction for comparison against a non-zero constant.
This patch enables record-form optimization for constant of 1 or -1 if possible; it changes the condition "greater than -1" into "greater than or equal to 0" and "less than 1" into "less than or equal to 0".
With this patch, compare can be eliminated in the following code sequence, as an example.

uint64_t a, b;
if ((a | b) & 0x8000000000000000ull) { ... }
else { ... }


- andi for 32-bit comparison on PPC64

Since record-form instructions execute 64-bit signed comparison and so we have limitation in eliminating 32-bit comparison, i.e. with cmplwi, using the record-form. The original implementation already has such checks but andi. is not recognized as an instruction which executes implicit zero extension and hence safe to convert into record-form if used for equality check.

%1 = and i32 %a, 10
%2 = icmp ne i32 %1, 0
br i1 %2, label %foo, label %bar

In this simple example, LLVM generates andi. + cmplwi + beq on PPC64.
This patch make it possible to eliminate the cmplwi for this case.
I added andi. for optimization targets if it is safe to do so.

Differential Revision: https://reviews.llvm.org/D30081

llvm-svn: 303500
2017-05-21 06:00:05 +00:00
Kyle Butt
f6c61ef64d CodeGen: Power: Add lowering for shifts of v1i128.
When legalizing vector operations on vNi128, they will be split to v1i128
because that is a legal type on ppc64, but then the compiler will crash in
selection dag because it fails to select for these operations. This patch fixes
shift operations. Logical shift right and left shift can be performed in the
vector unit, but algebraic shift right requires being split.

Differential Revision: https://reviews.llvm.org/D32774

llvm-svn: 303307
2017-05-17 21:54:41 +00:00
Krzysztof Parzyszek
2b0533126e [PPC] Properly update register save area offsets
The variables MinGPR/MinG8R were not updated properly when resetting the
offsets, which in the included testcase lead to saving the CR register
in the same location as R30.

This fixes another issue reported in PR26519.

Differential Revision: https://reviews.llvm.org/D33017

llvm-svn: 303257
2017-05-17 13:25:09 +00:00
Tim Shen
0fbbef43e0 [PPC] Add -ppc-asm-full-reg-names to atomic-2.ll. NFC.
Differential Revisions: https://reviews.llvm.org/D32763

llvm-svn: 303209
2017-05-16 20:58:55 +00:00
Tim Shen
3bef27cc6f [PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync.
Summary:
This fixes pr32392.

The lowering pipeline is:
llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in
expandPostRAPseudo.

The reason why expandPostRAPseudo is chosen is because previous passes
are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne-
7, .+4 (some branch pass(s)).

Differential Revision: https://reviews.llvm.org/D32763

llvm-svn: 303205
2017-05-16 20:18:06 +00:00
Nirav Dave
da8f221273 Elide stores which are overwritten without being observed.
Summary:
In SelectionDAG, when a store is immediately chained to another store
to the same address, elide the first store as it has no observable
effects. This is causes small improvements dealing with intrinsics
lowered to stores.

Test notes:

* Many testcases overwrite store addresses multiple times and needed
  minor changes, mainly making stores volatile to prevent the
  optimization from optimizing the test away.

* Many X86 test cases optimized out instructions associated with
  associated with va_start.

* Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has
  dependencies to check and can probably be removed and potentially
  replaced with another test.

Reviewers: rnk, john.brawn

Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33206

llvm-svn: 303198
2017-05-16 19:43:56 +00:00
Kyle Butt
7d531daece CodeGen: BlockPlacement: Increase tail duplication size for O3.
At O3 we are more willing to increase size if we believe it will improve
performance. The current threshold for tail-duplication of 2 instructions is
conservative, and can be relaxed at O3.

Benchmark results:
llvm test-suite:
6% improvement in aha, due to duplication of loop latch
3% improvement in hexxagon

2% slowdown in lpbench. Seems related, but couldn't completely diagnose.

Internal google benchmark:
Produces 4% improvement on internal google protocol buffer serialization
benchmarks.

Differential-Revision: https://reviews.llvm.org/D32324
llvm-svn: 303084
2017-05-15 17:30:47 +00:00
Guozhi Wei
22e7da9597 [PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0
According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0.

This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified.

Differential Revision: https://reviews.llvm.org/D32880

llvm-svn: 302834
2017-05-11 22:17:35 +00:00
Nemanja Ivanovic
96c3d626a2 [PowerPC] Eliminate integer compare instructions - vol. 1
This patch is the first in a series of patches to provide code gen for
doing compares in GPRs when the compare result is required in a GPR.

It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64
extensions. This first patch handles equality comparison on i32 operands with
the result sign or zero extended.

Differential Revision: https://reviews.llvm.org/D31847

llvm-svn: 302810
2017-05-11 16:54:23 +00:00
Serge Pavlov
d526b13e61 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to  CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties are implemented using special instructions
  rather than calls getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527
2017-05-09 13:35:13 +00:00
Krzysztof Parzyszek
038a0546db [PPC] When restoring R30 (PIC base pointer), mark it as <def>
This happened on the PPC32/SVR4 path and was discovered when building
FreeBSD on PPC32. It was a typo-class error in the frame lowering code.

This fixes PR26519.

llvm-svn: 302183
2017-05-04 19:14:54 +00:00
Tim Shen
e59d06fe78 [PowerPC, DAGCombiner] Fold a << (b % (sizeof(a) * 8)) back to a single instruction
Summary:
This is the corresponding llvm change to D28037 to ensure no performance
regression.

Reviewers: bogner, kbarton, hfinkel, iteratee, echristo

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28329

llvm-svn: 301990
2017-05-03 00:07:02 +00:00
Nemanja Ivanovic
b89c27f515 [PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE
Fixes PR30730.
This is a re-commit of a pulled commit. The commit was pulled because some
software projects contained uses of Altivec vectors that violated alignment
requirements. Known issues have now been fixed.

Committing on behalf of Lei Huang.

Differential Revision: https://reviews.llvm.org/D26861

llvm-svn: 301892
2017-05-02 01:47:34 +00:00