Commit Graph

4296 Commits

Author SHA1 Message Date
Rafael Espindola
092b619e55 Use the correct func begin symbol in all places in ppc.
I missed an occurrence of the old symbol in my previous patch.

llvm-svn: 231398
2015-03-05 19:47:50 +00:00
Rafael Espindola
86bd6a1202 Use the generic Lfunc_begin label on ppc.
This removes yet another custom label to mark the start of a function.

llvm-svn: 231390
2015-03-05 18:55:50 +00:00
Kit Barton
e48b1e1c4f While reviewing the changes to Clang to add builtin support for the vsld, vsrd, and vsrad instructions, it was pointed out that the builtins are generating the LLVM opcodes (shl, lshr, and ashr) not calls to the intrinsics. This patch changes the implementation of the vsld, vsrd, and vsrad instructions from from intrinsics to VXForm_1 instructions and makes them legal with P8 Altivec. It also removes the definition of the int_ppc_altivec_vsld, int_ppc_altivec_vsrd, and int_ppc_altivec_vsrad intrinsics.
llvm-svn: 231378
2015-03-05 16:24:38 +00:00
Nemanja Ivanovic
e8effe1edb Add LLVM support for PPC cryptography builtins
Review: http://reviews.llvm.org/D7955

llvm-svn: 231285
2015-03-04 20:44:33 +00:00
Mehdi Amini
46a43556db Make DataLayout Non-Optional in the Module
Summary:
DataLayout keeps the string used for its creation.

As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().

Get rid of DataLayoutPass: the DataLayout is in the Module

The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.

Make DataLayout Non-Optional in the Module

Module->getDataLayout() will never returns nullptr anymore.

Reviewers: echristo

Subscribers: resistor, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D7992

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
2015-03-04 18:43:29 +00:00
Nemanja Ivanovic
d384cd9907 Test commit. Removed an unnecessary space
llvm-svn: 231257
2015-03-04 17:09:12 +00:00
Bill Schmidt
d90aff2c4f [PowerPC] Remove unnecessary and incomplete commentary
This "itinerary class map" in PPCSchedule.td is incomplete and
redundant with the actual code.  As it provides no value, we've
decided to remove it.

No functional change.

llvm-svn: 231246
2015-03-04 14:56:05 +00:00
Kit Barton
0cfa7b7ad0 Add the following 64-bit vector integer arithmetic instructions added in POWER8:
vaddudm
vsubudm
vmulesw
vmulosw
vmuleuw
vmulouw
vmuluwm
vmaxsd
vmaxud
vminsd
vminud
vcmpequd
vcmpequd.
vcmpgtsd
vcmpgtsd.
vcmpgtud
vcmpgtud.
vrld
vsld
vsrd
vsrad

Phabricator review: http://reviews.llvm.org/D7959

llvm-svn: 231115
2015-03-03 19:55:45 +00:00
Benjamin Kramer
7149aabf8b Make some non-constant static variables non-static or fully const.
Otherwise we have to emit thread-safe initialization for them. NFC.

llvm-svn: 230894
2015-03-01 18:09:56 +00:00
Bill Schmidt
e3959eb54e [PowerPC] Fix PR22711 - Misaligned .toc section
Straightforward patch to emit an alignment directive when emitting a
TOC entry.  The test case was generated from the test in PR22711 that
demonstrated a misaligned .toc section.  The object code is run
through llvm-readobj to verify that the correct alignment has been
applied to the .toc section.

Thanks to Ulrich Weigand for running down where the fix was needed.

llvm-svn: 230801
2015-02-27 22:14:10 +00:00
Hal Finkel
5c3cacf5c0 [PowerPC] Use vector types for memcpy and friends (sometimes)
When using Altivec, we can use vector loads and stores for aligned memcpy and
friends. Starting with the P7 and VXS, we have reasonable unaligned vector
stores. Starting with the P8, we have fast unaligned loads too.

For QPX, we use vector loads are stores, but only for aligned memory accesses.

llvm-svn: 230788
2015-02-27 19:58:28 +00:00
Eric Christopher
11e4df73c8 getRegForInlineAsmConstraint wants to use TargetRegisterInfo for
a lookup, pass that in rather than use a naked call to getSubtargetImpl.
This involved passing down and around either a TargetMachine or
TargetRegisterInfo. Update all callers/definitions around the targets
and SelectionDAG.

llvm-svn: 230699
2015-02-26 22:38:43 +00:00
Eric Christopher
23a3a7c871 Remove an argument-less call to getSubtargetImpl from TargetLoweringBase.
This required plumbing a TargetRegisterInfo through computeRegisterProperties
and into findRepresentativeClass which uses it for register class
iteration. This required passing a subtarget into a few target specific
initializations of TargetLowering.

llvm-svn: 230583
2015-02-26 00:00:24 +00:00
Hal Finkel
cf59921670 [PowerPC] Make LDtocL and friends invariant loads
LDtocL, and other loads that roughly correspond to the TOC_ENTRY SDAG node,
represent loads from the TOC, which is invariant. As a result, these loads can
be hoisted out of loops, etc. In order to do this, we need to generate
GOT-style MMOs for TOC_ENTRY, which requires treating it as a legitimate memory
intrinsic node type. Once this is done, the MMO transfer is automatically
handled for TableGen-driven instruction selection, and for nodes generated
directly in PPCISelDAGToDAG, we need to transfer the MMOs manually.

Also, we were not transferring MMOs associated with pre-increment loads, so do
that too.

Lastly, this fixes an exposed bug where R30 was not added as a defined operand of
UpdateGBR.

This problem was highlighted by an example (used to generate the test case)
posted to llvmdev by Francois Pichet.

llvm-svn: 230553
2015-02-25 21:36:59 +00:00
Hal Finkel
0746211811 [PowerPC] Cleanup unused target-specific SDAG nodes
We had somehow accumulated a few target-specific SDAG nodes dealing with PPC64
TOC access that were referenced only in TableGen patterns. The associated
(pseudo-)instructions are used, but are being generated directly. NFC.

llvm-svn: 230518
2015-02-25 18:06:45 +00:00
Aaron Ballman
5561ed448b Silencing a "result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)" warning in MSVC; NFC.
llvm-svn: 230489
2015-02-25 13:05:24 +00:00
Aaron Ballman
70c27ded97 Silencing a -Wsign-compare warning triggered in MSVC; NFC.
llvm-svn: 230488
2015-02-25 13:02:23 +00:00
Hal Finkel
c93a9a2cb4 [PowerPC] Add support for the QPX vector instruction set
This adds support for the QPX vector instruction set, which is used by the
enhanced A2 cores on the IBM BG/Q supercomputers. QPX vectors are 256 bytes
wide, holding 4 double-precision floating-point values. Boolean values, modeled
here as <4 x i1> are actually also represented as floating-point values
(essentially  { -1, 1 } for { false, true }). QPX shares many features with
Altivec and VSX, but is distinct from both of them. One major difference is
that, instead of adding completely-separate vector registers, QPX vector
registers are extensions of the scalar floating-point registers (lane 0 is the
corresponding scalar floating-point value). The operations supported on QPX
vectors mirrors that supported on the scalar floating-point values (with some
additional ones for permutations and logical/comparison operations).

I've been maintaining this support out-of-tree, as part of the bgclang project,
for several years. This is not the entire bgclang patch set, but is most of the
subset that can be cleanly integrated into LLVM proper at this time. Adding
this to the LLVM backend is part of my efforts to rebase bgclang to the current
LLVM trunk, but is independently useful (especially for codes that use LLVM as
a JIT in library form).

The assembler/disassembler test coverage is complete. The CodeGen test coverage
is not, but I've included some tests, and more will be added as follow-up work.

llvm-svn: 230413
2015-02-25 01:06:45 +00:00
Tim Northover
3b6b7ca2bc CodeGen: convert CCState interface to using ArrayRefs
Everyone except R600 was manually passing the length of a static array
at each callsite, calculated in a variety of interesting ways. Far
easier to let ArrayRef handle that.

There should be no functional change, but out of tree targets may have
to tweak their calls as with these examples.

llvm-svn: 230118
2015-02-21 02:11:17 +00:00
Eric Christopher
db51e2a52a Fix an asan use-after-free bug introduced by the asm printer
changes to remove non-Function based subtargets out of the asm
printer. For module level emission we'll need to construct up
an MCSubtargetInfo so that we can encode instructions for
emission.

llvm-svn: 230050
2015-02-20 19:54:07 +00:00
Eric Christopher
9cda4b7ed9 Remove a use of the Subtarget in the darwin ppc asm printer.
EmitFunctionStubs is called from doFinalization and so can't
depend on the Subtarget existing. It's also irrelevant as
we know we're darwin since we're in the darwin asm printer.

llvm-svn: 230039
2015-02-20 18:53:42 +00:00
Eric Christopher
1df0c519fc Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

llvm-svn: 230037
2015-02-20 18:44:15 +00:00
Kit Barton
263edb99ab I incorrectly marked the VORC instruction as isCommutable when I added it.
This fix removes the VORC instruction definition from the isCommutable block.

Phabricator review: http://reviews.llvm.org/D7772

llvm-svn: 230020
2015-02-20 15:54:58 +00:00
Eric Christopher
155290edf9 Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

llvm-svn: 229998
2015-02-20 08:24:34 +00:00
Eric Christopher
1947a9e2e2 Make the TargetMachine::getSubtarget that takes a Function argument
take a reference to match the getSubtargetImpl that takes a Function
argument.

llvm-svn: 229994
2015-02-20 07:32:59 +00:00
Hal Finkel
e5aaf3f2cd [PowerPC] Loop Data Prefetching for the BG/Q
The IBM BG/Q supercomputer's A2 cores have a hardware prefetching unit, the
L1P, but it does not prefetch directly into the A2's L1 cache. Instead, it
prefetches into its own L1P buffer, and the latency to access that buffer is
significantly higher than that to the L1 cache (although smaller than the
latency to the L2 cache). As a result, especially when multiple hardware
threads are not actively busy, explicitly prefetching data into the L1 cache is
advantageous.

I've been using this pass out-of-tree for data prefetching on the BG/Q for well
over a year, and it has worked quite well. It is enabled by default only for
the BG/Q, but can be enabled for other cores as well via a command-line option.

Eventually, we might want to add some TTI interfaces and move this into
Transforms/Scalar (there is nothing particularly target dependent about it,
although only machines like the BG/Q will benefit from its simplistic
strategy).

llvm-svn: 229966
2015-02-20 05:08:21 +00:00
Kit Barton
298beb5e86 This patch adds the VSX logical instructions introduced in the Power ISA 2.07. It also removes the added complexity that favors VMX versions of the three instructions.
Phabricator review: http://reviews.llvm.org/D7616

Commiting on Nemanja's behalf.

llvm-svn: 229694
2015-02-18 16:21:46 +00:00
Eric Christopher
5c0e009d3a Make the PowerPC AsmPrinter independent of global subtarget
initialization. Initialize the subtarget once per function and
migrate EmitStartOfAsmFile to either use attributes on the
TargetMachine or get information from all of the various
subtargets.

llvm-svn: 229475
2015-02-17 07:21:21 +00:00
Eric Christopher
75dc3904a5 Add a FIXME to move IsLittleEndian to the target machine.
llvm-svn: 229472
2015-02-17 06:45:17 +00:00
Eric Christopher
fee6aaf683 Move ABI handling and 64-bitness to the PowerPC target machine.
This required changing how the computation of the ABI is handled
and how some of the checks for ABI/target are done.

llvm-svn: 229471
2015-02-17 06:45:15 +00:00
Hal Finkel
5cedafb8cd [PowerPC] Support non-direct-sub/superclass VSX copies
Our register allocation has become better recently, it seems, and is now
starting to generate cross-block copies into inflated register classes. These
copies are not transformed into subregister insertions/extractions by the
PPCVSXCopy class, and so need to be handled directly by
PPCInstrInfo::copyPhysReg. The code to do this was *almost* there, but not
quite (it was unnecessarily restricting itself to only the direct
sub/super-register-class case (not copying between, for example, something in
VRRC and the lower-half of VSRC which are super-registers of F8RC).

Triggering this behavior manually is difficult; I'm including two
bugpoint-reduced test cases from the test suite.

llvm-svn: 229457
2015-02-16 23:46:30 +00:00
Andrew Trick
05938a5481 AArch64: Safely handle the incoming sret call argument.
This adds a safe interface to the machine independent InputArg struct
for accessing the index of the original (IR-level) argument. When a
non-native return type is lowered, we generate the hidden
machine-level sret argument on-the-fly. Before this fix, we were
representing this argument as OrigArgIndex == 0, which is an outright
lie. In particular this crashed in the AArch64 backend where we
actually try to access the type of the original argument.

Now we use a sentinel value for machine arguments that have no
original argument index. AArch64, ARM, Mips, and PPC now check for this
case before accessing the original argument.

Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering

llvm-svn: 229413
2015-02-16 18:10:47 +00:00
Aaron Ballman
f9a1897c72 Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.
llvm-svn: 229340
2015-02-15 22:54:22 +00:00
Chandler Carruth
003ed332bf Remove a variable only used in an assert and sink its initializer into
the assert. Fixes -Wunused-variable on non-asserts builds.

llvm-svn: 229250
2015-02-14 09:14:44 +00:00
Duncan P. N. Exon Smith
5bedaf934f PowerPC: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229224
2015-02-14 02:54:07 +00:00
Eric Christopher
fcd3d87ad8 The base pointer save offset can be computed at initialization time,
do so and fix up the calls.

llvm-svn: 229169
2015-02-13 22:48:53 +00:00
Eric Christopher
a10d58dba8 Move the target machine variable so that it's initialized early
enough we can use it to initialize frame lowering.

llvm-svn: 229168
2015-02-13 22:48:51 +00:00
Eric Christopher
e8dbfe1cf8 Stash the TargetMachine on the subtarget so we can access it later.
Clean up a subtarget function that has it passed in while we're at it.

llvm-svn: 229164
2015-02-13 22:23:04 +00:00
Eric Christopher
a4ae213193 PPC LinkageSize can be computed at initialization time, do so.
llvm-svn: 229163
2015-02-13 22:22:57 +00:00
Chandler Carruth
30d69c2e36 [PM] Remove the old 'PassManager.h' header file at the top level of
LLVM's include tree and the use of using declarations to hide the
'legacy' namespace for the old pass manager.

This undoes the primary modules-hostile change I made to keep
out-of-tree targets building. I sent an email inquiring about whether
this would be reasonable to do at this phase and people seemed fine with
it, so making it a reality. This should allow us to start bootstrapping
with modules to a certain extent along with making it easier to mix and
match headers in general.

The updates to any code for users of LLVM are very mechanical. Switch
from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h".
Qualify the types which now produce compile errors with "legacy::". The
most common ones are "PassManager", "PassManagerBase", and
"FunctionPassManager".

llvm-svn: 229094
2015-02-13 10:01:29 +00:00
Chandler Carruth
71f308adb7 Re-sort #include lines using my handy dandy ./utils/sort_includes.py
script. This is in preparation for changes to lots of include lines.

llvm-svn: 229088
2015-02-13 09:09:03 +00:00
Eric Christopher
dc3a8a4a66 PPCFrameLowering's FramePointerOffset can be computed at initialization
time. Do so.

llvm-svn: 228998
2015-02-13 00:39:38 +00:00
Eric Christopher
736d39e189 The TOC save offset can be computed at compile time, do so and
propagate changes.

llvm-svn: 228997
2015-02-13 00:39:36 +00:00
Eric Christopher
f71609b5dd The return save offset can be computed at initialization time - do
so and save the value.

llvm-svn: 228996
2015-02-13 00:39:27 +00:00
Olivier Sallenave
05e69157b6 Change max interleave factor to 12 for POWER7 and POWER8.
llvm-svn: 228973
2015-02-12 22:57:58 +00:00
Benjamin Kramer
5f6a907288 MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros
Update all callers.

llvm-svn: 228930
2015-02-12 15:35:40 +00:00
Hal Finkel
7a0516ea66 [PowerPC] Mark jumps as expensive (using using CR bits)
On PowerPC, which has a full set of logical operations on (its multiple sets
of) condition-register bits, it is not profitable to break of complex
conditions feeding a jump into multiple jumps. We can turn off this feature of
CGP/SDAGBuilder by marking jumps as "expensive".

P7 test-suite speedups (no regressions):
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2
	-0.626647% +/- 0.323583%
MultiSource/Benchmarks/Olden/power/power
	-18.2821% +/- 8.06481%

llvm-svn: 228895
2015-02-12 01:02:52 +00:00
Bill Schmidt
67f36bd0d8 Fix up r228725, missed change in PPCSubtarget definition
llvm-svn: 228728
2015-02-10 19:31:55 +00:00
Bill Schmidt
82f1c775a0 [PowerPC] Fix reverted patch r227976 to avoid register assignment issues
See full discussion in http://reviews.llvm.org/D7491.

We now hide the add-immediate and call instructions together in a
separate pseudo-op, which is tagged to define GPR3 and clobber the
call-killed registers.  The PPCTLSDynamicCall pass prior to RA now
expands this op into the two separate addi and call ops, with explicit
definitions of GPR3 on both instructions, and explicit clobbers on the
call instruction.  The pass is now marked as requiring and preserving
the LiveIntervals and SlotIndexes analyses, and fixes these up after
the replacement sequences are introduced.

Self-hosting has been verified on LE P8 and BE P7 with various
optimization levels, etc.  It has also been verified with the
--no-tls-optimize flag workaround removed.

llvm-svn: 228725
2015-02-10 19:09:05 +00:00
Hal Finkel
57c6ac5e41 [PowerPC] Support the (old) cntlz instruction alias
Some old assembly code uses the cntlz alias for cntlzw, binutils supports this,
and we should too. Fixes PR22519.

llvm-svn: 228719
2015-02-10 18:45:02 +00:00