Commit Graph

1507 Commits

Author SHA1 Message Date
Kai Luo
e2b93185b8 [PowerPC] Only make copies of registers on stack in variadic function when va_start is called
On PPC64, for a variadic function, if va_start is not called, it won't
access any variadic argument on stack, thus we can save stores of
registers used to pass arguments.

Differential Revision: https://reviews.llvm.org/D82361
2020-07-09 07:18:17 +00:00
Nemanja Ivanovic
1b1539712e [PowerPC] Do not RAUW combined nodes in VECTOR_SHUFFLE legalization
When legalizing shuffles, we make an attempt to combine it into
a PPC specific canonical form that avoids a need for a swap. If the
combine is successful, we RAUW the node and the custom legalization
replaces the now dead node instead of the one it should replace.
Remove that erroneous call to RAUW.
2020-07-06 22:09:28 -05:00
Amy Kwan
c13e3e2c2e [PowerPC][Power10] Exploit the xxsplti32dx instruction when lowering VECTOR_SHUFFLE.
This patch aims to exploit the xxsplti32dx XT, IX, IMM32 instruction when lowering VECTOR_SHUFFLEs.
We implement lowerToXXSPLTI32DX when lowering vector shuffles to check if:
- Element size is 4 bytes
- The RHS is a constant vector (and constant splat of 4-bytes)
- The shuffle mask is a suitable mask for the XXSPLTI32DX instruction where it is one of the 32 masks:
<0, 4-7, 2, 4-7>
<4-7, 1, 4-7, 3>

Differential Revision: https://reviews.llvm.org/D83245
2020-07-06 20:28:38 -05:00
jasonliu
6d3ae365bd [XCOFF][AIX] Give symbol an internal name when desired symbol name contains invalid character(s)
Summary:

When a desired symbol name contains invalid character that the
system assembler could not process, we need to emit .rename
directive in assembly path in order for that desired symbol name
to appear in the symbol table.

Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L

Differential Revision: https://reviews.llvm.org/D82481
2020-07-06 15:49:15 +00:00
Esme-Yi
0607c8df7f [PowerPC] Legalize SREM/UREM directly on P9.
Summary: As Bugzilla-35090 reported, the rationale for using custom lowering SREM/UREM should no longer be true. At the IR level, the div-rem-pairs pass performs the transformation where the remainder is computed from the result of the division when both a required. We should now be able to lower these directly on P9. And the pass also fixed the problem that divide is in a different block than the remainder. This is a patch to remove redundant code and make SREM/UREM legal directly on P9.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D82145
2020-07-06 11:47:31 +00:00
Guillaume Chatelet
87e2751cf0 [Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D83082
2020-07-03 08:06:43 +00:00
Kai Luo
03828e38c3 [PowerPC] Implement probing for dynamic stack allocation
This patch is part of supporting `-fstack-clash-protection`. Mainly do
such things compared to existing `lowerDynamicAlloc`

- Added a new pseudo instruction PPC::PREPARE_PROBED_ALLOC to get
  actual frame pointer and final stack pointer.
- Synthesize a loop to probe by blocks.
- Use DYNAREAOFFSET to get MaxCallFrameSize which is calculated in
  prologepilog.

Differential Revision: https://reviews.llvm.org/D81358
2020-07-03 05:36:40 +00:00
Nemanja Ivanovic
a701dc5510 [PowerPC] Remove undefs from splat input when changing shuffle mask
As of 1fed131660, we have code that
changes shuffle masks so that we can put the shuffle in a canonical
form that can be matched to a single instruction. However, it
does not properly account for undef elements in the BUILD_VECTOR
that is the RHS splat so we can end up with undefs where they
shouldn't be. This patch converts the splat input with undefs to
one without.
2020-07-02 12:26:56 -05:00
Guillaume Chatelet
8dbafd24d6 [Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82977
2020-07-02 11:28:02 +00:00
Anil Mahmud
c5b4f03b53 [PowerPC] Exploit xxspltiw and xxspltidp instructions
Exploits the VSX Vector Splat Immediate Word and
VSX Vector Splat Immediate Double Precision instructions:

  xxspltiw XT,IMM32
  xxspltidp XT,IMM32

Differential Revision: https://reviews.llvm.org/D82911
2020-07-01 19:18:29 -05:00
Stefan Pintilie
b294e00fb0 [PowerPC] Fix for PC Relative call protocol
The situation where the caller uses a TOC and the callee does not
but is marked as clobbers the TOC (st_other=1) was not being compiled
correctly if both functions where in the same object file.

The call site where we had `callee` was missing a nop after the call.
This is because it was assumed that since the two functions where in
the same DSO they would be sharing a TOC. This is not the case if the
callee uses PC Relative because in that case it may clobber the TOC.
This patch makes sure that we add the cnop correctly so that the
linker has a place to restore the TOC.

Reviewers: sfertile, NeHuang, saghir

Differential Revision: https://reviews.llvm.org/D81126
2020-07-01 07:08:41 -05:00
Guillaume Chatelet
28de229bc6 [Alignment][NFC] Migrate MachineFrameInfo::CreateStackObject to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82894
2020-07-01 07:28:11 +00:00
Guillaume Chatelet
a976ea3209 [Alignment][NFC] Migrate PPC, X86 and XCore backends to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82779
2020-06-30 08:08:45 +00:00
Nemanja Ivanovic
d2533d96e1 [PowerPC] Fix crash for shuffle canonicalization with elt 0 from RHS
Commit 1fed131660 assumed that shuffle vector canonicalization will
always ensure that the shuffle mask will be ordered so that element
zero comes from the LHS vector. However there is code out there for
which this is not the case. This patch simply removes that unsafe
assumption and makes the code work regardless of the source of the
first element.
2020-06-29 12:26:08 -05:00
Nemanja Ivanovic
57ad8f4730 [PowerPC] Don't combine SCALAR_TO_VECTOR without VSX
Most of the patterns for PPCISD::SCALAR_TO_VECTOR_PERMUTED require
VSX. So don't emit them if the subtarget doesn't have VSX.
This resolves the issue reported on
https://reviews.llvm.org/rG1fed131660b2c5d3ea7007e273a7a5da80699445
2020-06-29 09:48:57 -05:00
Zarko Todorovski
e504a23b63 [NFC][PPC][AIX] Add stack frame layout diagram to PPCISelLowering.cpp
Summary:
This NFC patch adds a diagram of the AIX ABI stack frame layout.

Based on https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/assembler/idalangref_runtime_process.html

Reviewers: sfertile, cebowleratibm, hubert.reinterpretcast, Xiangling_L

Reviewed By: sfertile

Subscribers: wuzish, nemanjai, hiraditya, kbarton, llvm-commits

Tags: #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D82408
2020-06-25 09:41:42 -04:00
Nemanja Ivanovic
1fed131660 [PowerPC] Canonicalize shuffles to match more single-instruction masks on LE
We currently miss a number of opportunities to emit single-instruction
VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although
this in itself is not a huge performance opportunity since loading the permute
vector for a VPERM can always be pulled out of loops, producing such merge
instructions is useful to downstream optimizations.
Since VPERM is essentially opaque to all subsequent optimizations, we want to
avoid it as much as possible. Other permute instructions have semantics that can
be reasoned about much more easily in later optimizations.

This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector
  (since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the
  corresponding elements from the other vector (to allow for merges)
- Adds debugging messages for when a shuffle is matched to a VPERM so that
  anyone interested in improving this further can get the info for their code

Differential revision: https://reviews.llvm.org/D77448
2020-06-18 21:54:22 -05:00
Esme-Yi
ad6024e29f [PowerPC] Custom lower rotl v1i128 to vector_shuffle.
Summary: A bug is reported in bugzilla-45628, where the swap_with_shift case can’t be matched to a single HW instruction xxswapd as expected.
In fact the case matches the idiom of rotate. We have MatchRotate to handle an ‘or’ of two operands and generate a rot[lr] if the case matches the idiom of rotate. While PPC doesn’t support ROTL v1i128. We can custom lower ROTL v1i128 to the vector_shuffle. The vector_shuffle will be matched to a single HW instruction during the phase of instruction selection.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81076
2020-06-18 01:32:23 +00:00
Qiu Chaofan
13edcd696e [PowerPC] Support constrained rounding operations
This patch adds handling of constrained FP intrinsics about round,
truncate and extend for PowerPC target, with necessary tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D64193
2020-06-14 23:43:31 +08:00
Qiu Chaofan
7315d221a2 [PowerPC] Exploit vnmsubfp instruction
On PowerPC, we have vnmsubfp Altivec instruction for fnmsub operation on
v4f32 type. Default pattern for this instruction never works since we
don't have legal fneg for v4f32 when VSX disabled.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80617
2020-06-14 23:19:17 +08:00
Guillaume Chatelet
1778564f91 [Alignment][NFC] Migrate the rest of backends
Summary: This is a followup on D81196

Reviewers: courbet

Subscribers: arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81278
2020-06-08 07:17:20 +00:00
Qiu Chaofan
7a001a2d92 [PowerPC] Require nsz flag for c-a*b to FNMSUB
On PowerPC, FNMSUB (both VSX and non-VSX version) means -(a*b-c). But
the backend used to generate these instructions regardless whether nsz
flag exists or not. If a*b-c==0, such transformation changes sign of
zero.

This patch introduces PPC specific FNMSUB ISD opcode, which may help
improving combined FMA code sequence.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D76585
2020-06-04 16:41:27 +08:00
QingShan Zhang
a462561cee [NFC][PowerPC] Remove unused node PPCISD::VMADDFP and PPCISD::VNMSUBFP
These two nodes were added by 69caef2b78 in 2005
and they are not used by PowerPC backend anymore. And the ISD::FMA is a prefer
way for VMADDFP if we really want to create that node. For VNMSUBFP, we will
also add a more generic node FNMSUB in D76585 if we really want it.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D80429
2020-06-03 06:36:30 +00:00
Li Rong Yi
3101601b54 [PowerPC] Exploit vabsd on P9
Summary: Exploit vabsd* for for absolute difference of vectors on P9,
for example:
void foo (char *restrict p, char *restrict q, char *restrict t)
{
  for (int i = 0; i < 16; i++)
     t[i] = abs (p[i] - q[i]);
}
this case should be matched to the HW instruction vabsdub.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80271
2020-06-01 02:30:27 +00:00
Zequan Wu
80e107ccd0 Add NoMerge MIFlag to avoid MIR branch folding
Let the codegen recognized the nomerge attribute and disable branch folding when the attribute is given

Differential Revision: https://reviews.llvm.org/D79537
2020-05-29 12:31:06 -07:00
Lei Huang
2368bf52cd [PowerPC] Add support for -mcpu=pwr10 in both clang and llvm
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.

Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc

Reviewed By: stefanp, nemanjai, amyk, #powerpc

Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo

Tags: #clang, #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D80020
2020-05-27 13:14:25 -05:00
Lei Huang
559845f8fe Revert "[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm"
This reverts commit 7eb666b155.
2020-05-27 09:40:21 -05:00
Lei Huang
7eb666b155 [PowerPC] Add support for -mcpu=pwr10 in both clang and llvm
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.

Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc

Reviewed By: stefanp, nemanjai, amyk, #powerpc

Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo

Tags: #clang, #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D80020
2020-05-26 13:48:22 -05:00
Nemanja Ivanovic
099a875f28 [PowerPC] Unaligned FP default should apply to scalars only
As reported in PR45186, we could be in a situation where we don't
want to handle unaligned memory accesses for FP scalars but still
have VSX (which allows unaligned access for vectors). Change the
default to only apply to scalars.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45186
2020-05-26 10:19:06 -05:00
Nemanja Ivanovic
793cc518b9 [PowerPC] Prevent legalization loop from promoting SELECT_CC from v4i32 to v4i32
As reported in https://bugs.llvm.org/show_bug.cgi?id=45709 we can hit an
infinite loop in legalization since we set the legalization action for
ISD::SELECT_CC for all fixed length vector types to Promote. Without some
different legalization action for the type being promoted to, the legalizer
simply loops. Since we don't have patterns to match the node, the right
legalization action should be Expand.

Differential revision: https://reviews.llvm.org/D79854
2020-05-25 20:09:07 -05:00
Amy Kwan
b631f86ac5 [TLI][PowerPC] Introduce TLI query to check if MULH is cheaper than MUL + SHIFT
This patch introduces a TargetLowering query, isMulhCheaperThanMulShift.

Currently in DAG Combine, it will transform mulhs/mulhu into a
wider multiply and a shift if the wide multiply is legal.

This TLI function is implemented on 64-bit PowerPC, as it is more desirable to
have multiply-high over multiply + shift for words and doublewords. Having
multiply-high can also aid in further transformations that can be done.

Differential Revision: https://reviews.llvm.org/D78271
2020-05-23 16:47:12 -05:00
Nemanja Ivanovic
1a493b0fa5 [PowerPC] Add missing handling for half precision
The fix for PR39865 took care of some of the handling for half precision
but it missed a number of issues that still exist. This patch fixes the
remaining issues that cause crashes in the PPC back end.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45776

Differential revision: https://reviews.llvm.org/D79283
2020-05-22 07:50:11 -05:00
Sean Fertile
ce4ebc14a8 [PowerPC] Remove support for SplitCSR.
SplitCSR was only suppored for functions with CXX_FAST_TLS calling
convention. Clang only emits that calling convention for Darwin which is
no longer supported by the PowerPC backend. Another IR producer could
use the calling convention, but considering the calling convention is
meant to be an optimization and the codegen for SplitCSR can be
attrocious on Power (see the modifed lit test) it is best to remove it
and codegen CXX_FAST_TLS same as the C calling convention.

Differential Revision: https://reviews.llvm.org/D79018
2020-05-14 10:32:17 -04:00
Qiu Chaofan
e9753822b5 [PowerPC] Respect SDNodeFlags in lowering SELECT_CC
Legalizer should respect both command-line options or SDNode-level
fast-math flags.

Also, this patch propagates other flags during custom simplifying.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D79074
2020-05-13 14:05:47 +08:00
Qiu Chaofan
e8d2ff22f0 [PowerPC] Add fma/fsqrt/fmax strict-fp intrinsics
This patch adds strict-fp intrinsics support for fma, fsqrt, fmaxnum and
fminnum on PowerPC.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D72749
2020-05-12 13:44:09 +08:00
Kang Zhang
dcc5ff3bc2 [PowerPC] Use PredictableSelectIsExpensive to enable select to branch in CGP
Summary:
This patch will set the variable PredictableSelectIsExpensive to do the
select to if based on BranchProbability in CodeGenPrepare.

When the BranchProbability more than MinPercentageForPredictableBranch,
PPC will convert SELECT to branch.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D71883
2020-05-11 15:02:09 +00:00
Craig Topper
d1119980e5 [SelectionDAG] Use Align/MaybeAlign for ConstantPoolSDNode.
This patch stores the alignment for ConstantPoolSDNode as an
Align and updates the getConstantPool interface to take a MaybeAlign.

Removing getAlignment() will be done as a follow up.

Differential Revision: https://reviews.llvm.org/D79436
2020-05-08 16:04:11 -07:00
Sean Fertile
2a3cf5e583 [PowerPC][AIX] Pass ByVal formal args that span registers and stack.
Implement passing of ByVal formal arguments when the argument is passed
partly in the argument registers, with the remainder of the argument
passed on the stack.

Differential Revision: https://reviews.llvm.org/D78515
2020-04-28 14:57:14 -04:00
Craig Topper
a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Stefan Pintilie
1354a03e74 [PowerPC][Future] Implement PC Relative Tail Calls
Tail Calls were initially disabled for PC Relative code because it was not safe
to make certain assumptions about the tail calls (namely that all compiled
functions no longer used the TOC pointer in R2). However, once all of the
TOC pointer references have been removed it is safe to tail call everything
that was tail called prior to the PC relative additions as well as a number of
new cases.
For example, it is now possible to tail call indirect functions as there is no
need to save and restore the TOC pointer for indirect functions if the caller
is marked as may clobber R2 (st_other=1). For the same reason it is now also
possible to tail call functions that are external.

Differential Revision: https://reviews.llvm.org/D77788
2020-04-27 12:55:08 -05:00
Victor Huang
e20b07b021 [PowerPC][Future] Add missing changes for PC Realtive addressing
1. Use Subtarget.isUsingPCRelativeCalls() in LowerConstantPool to
check if using PCRelative addressing.

2. Change MO_GOT_FLAG = 32 to MO_GOT_FLAG = 8 in PPC.h to use
consecutive bits.

Differential Revision: https://reviews.llvm.org/D78406
2020-04-23 10:26:43 -05:00
Victor Huang
a60ca4b4e9 [PowerPC][Future] Initial support for PCRel addressing to get block address
Add initial support for PCRelative addressing to get block address
instead of using TOC.

Differential Revision: https://reviews.llvm.org/D76294
2020-04-22 15:01:29 -05:00
Victor Huang
02141a17ae [PowerPC][Future] Remove redundant r2 save and restore for indirect call
Currently an indirect call produces the following sequence on PCRelative mode:

extern void function( );
extern void (*ptrfunc) ( );

void g() {
    ptrfunc=function;
}

void f() {
    (*ptrfunc) ( );
}

Producing

paddi 3, 0, .LC0@PCREL, 1
ld 3, 0(3)
std 2, 24(1)
ld 12, 0(3)
mtctr 12
bctrl
ld 2, 24(1)

Though the caller does not use or preserve r2, it is still saved and restored
across a function call. This patch is added to remove these redundant save and
restores for indirect calls.

Differential Revision: https://reviews.llvm.org/D77749
2020-04-22 12:05:51 -05:00
Victor Huang
43abef06f4 [PowerPC][Future] Initial support for PCRel addressing for jump tables.
Add initial support for PC Relative addressing to get jump table base
address instead of using TOC.

Differential Revision: https://reviews.llvm.org/D75931
2020-04-22 10:45:01 -05:00
Craig Topper
d22989c34e [CallSite removal][Target] Replace CallSite with CallBase. NFC
In some cases just delete an unneeded include.
2020-04-21 23:29:36 -07:00
Stefan Pintilie
a92ee77d85 [PowerPC][Future] Add offsets to PC Relative relocations.
This is an optimization that applies to global addresses and
allows for the following transformation:
Convert this:

paddi r3, 0, symbol@PCREL, 1
ld r4, 8(r3)

To this:

pld r4, symbol@PCREL+8(0), 1

An instruction is saved and the linker can do the addition when
the symbol is resolved.

Differential Revision: https://reviews.llvm.org/D76160
2020-04-21 11:08:19 -05:00
Christopher Tetreault
a9b137f9ff [SVE] Remove calls to getBitWidth from PowerPC
Reviewers: efriedma, sdesmalen, hfinkel, david-arm, fpetrogalli

Reviewed By: efriedma, fpetrogalli

Subscribers: wuzish, nemanjai, tschuett, hiraditya, kbarton, rkruppe, psnobl, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77900
2020-04-20 14:18:37 -07:00
Nemanja Ivanovic
64b31d96df [PowerPC] Do not attempt to reuse load for 64-bit FP_TO_UINT without FPCVT
We call the function that attempts to reuse the conversion without checking
whether the target matches the constraints that the callee expects. This patch
adds the check prior to the call.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=43976

Differential revision: https://reviews.llvm.org/D77564
2020-04-20 13:00:06 -05:00
Sean Fertile
d52bb6d099 [PowerPC][AIX] ByVal formal argument support: passing on the stack.
Adds support for passing a ByVal formal argument completely on the stack
(ie after all argument registers are exhausted).

Differential Revision: https://reviews.llvm.org/D78263
2020-04-20 12:04:59 -04:00
Stefan Pintilie
b771c4a842 [PowerPC][Future] More support for PCRel addressing for global values
Add initial support for PC Relative addressing for global values that
require GOT indirect addressing. This patch adds PCRelative support for
global addresses that may not be known at link time and may require
access through the GOT.

Differential Revision: https://reviews.llvm.org/D76064
2020-04-17 11:06:13 -05:00