Commit Graph

1983 Commits

Author SHA1 Message Date
Craig Topper
e7ca6f5456 [DAGCombiner] When combining zero_extend of a truncate, only mask before extending for vectors.
Masking first, prevents the extend from being combine with loads. Its also interfering with some vXi1 extraction code.

Differential Revision: https://reviews.llvm.org/D42679

llvm-svn: 326500
2018-03-01 22:32:25 +00:00
Sebastian Pop
c33af715d7 [AArch64] generate vuzp instead of mov
when a BUILD_VECTOR is created out of a sequence of EXTRACT_VECTOR_ELT with a
specific pattern sequence, either <0, 2, 4, ...> or <1, 3, 5, ...>, replace the
BUILD_VECTOR with either vuzp1 or vuzp2.

With this patch LLVM generates the following code for the first function fun1 in the testcase:
adrp x8, .LCPI0_0
ldr  q0, [x8, :lo12:.LCPI0_0]
tbl  v0.16b, { v0.16b }, v0.16b
ext  v1.16b, v0.16b, v0.16b, #8
uzp1 v0.8b, v0.8b, v1.8b
str  d0, [x8]
ret

Without this patch LLVM currently generates this code:
adrp    x8, .LCPI0_0
ldr     q0, [x8, :lo12:.LCPI0_0]
tbl     v0.16b, { v0.16b }, v0.16b
mov     v1.16b, v0.16b
mov     v1.b[1], v0.b[2]
mov     v1.b[2], v0.b[4]
mov     v1.b[3], v0.b[6]
mov     v1.b[4], v0.b[8]
mov     v1.b[5], v0.b[10]
mov     v1.b[6], v0.b[12]
mov     v1.b[7], v0.b[14]
str     d1, [x8]
ret

llvm-svn: 326443
2018-03-01 15:47:39 +00:00
Roman Tereshin
2b94972eb9 [GlobalISel][AArch64] Adding -disable-gisel-legality-check CL option
Currently it's impossible to test InstructionSelect pass with MIR which
is considered illegal by the Legalizer in Assert builds. In early stages
of porting an existing backend from SelectionDAG ISel to GlobalISel,
however, we would have very basic CallLowering, Legalizer, and
RegBankSelect implementations, but rather functional Instruction Select
with quite a few patterns selectable due to the semi-automatic porting
process borrowing them from SelectionDAG ISel.

As we are trying to define legality as a property of being selectable by
the instruction selector, it would be nice to be able to easily check
what the selector can do in its current state w/o the legality check
provided by the Legalizer getting in the way.

It also seems beneficial to have a regression testing set up that would
not allow the selector to silently regress in its support of the MIR not
supported yet by the previous passes in the GlobalISel pipeline.

This commit adds -disable-gisel-legality-check command line option to
llc that disables those legality checks in RegBankSelect and
InstructionSelect passes.

It also adds quite a few MIR test cases for AArch64's Instruction
Selector. Every one of them would fail on the legality check at the
moment, but will select just fine if the check is disabled. Every test
MachineFunction is intended to exercise a specific selection rule and
that rule only, encoded in the MachineFunction's name by the rule's
number, ID, and index of its GIM_Try opcode in TableGen'erated
MatchTable (-optimize-match-table=false).

Reviewers: ab, dsanders, qcolombet, rovka

Reviewed By: bogner

Subscribers: kristof.beyls, volkan, aditya_nandakumar, aemerson,
rengolin, t.p.northover, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D42886

llvm-svn: 326396
2018-03-01 00:27:48 +00:00
Chih-Hung Hsieh
9f9e4681ac [TLS] use emulated TLS if the target supports only this mode
Emulated TLS is enabled by llc flag -emulated-tls,
which is passed by clang driver.
When llc is called explicitly or from other drivers like LTO,
missing -emulated-tls flag would generate wrong TLS code for targets
that supports only this mode.
Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether
emulated TLS code should be generated.
Unit tests are modified to run with and without the -emulated-tls flag.

Differential Revision: https://reviews.llvm.org/D42999

llvm-svn: 326341
2018-02-28 17:48:55 +00:00
Geoff Berry
a2b9011290 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Re-enable commit r323991 now that r325931 has been committed to make
MachineOperand::isRenamable() check more conservative w.r.t. code
changes and opt-in on a per-target basis.

llvm-svn: 326208
2018-02-27 16:59:10 +00:00
Evandro Menezes
717941e132 [AArch64] Harden test cases
NFC

llvm-svn: 326147
2018-02-26 23:19:25 +00:00
Francis Visoiu Mistrih
e4fae4d5b6 [CodeGen] Don't omit any redundant information in -debug output
In r322867, we introduced IsStandalone when printing MIR in -debug
output. The default behaviour for that was:

1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any
redundant information.

2) When -debug-printing a MF entirely, don't print any redundant
information.

3) When printing MIR, don't print any redundant information.

I'd like to change 2) to:

2) When -debug-printing a MF entirely, don't omit any redundant information.

Differential Revision: https://reviews.llvm.org/D43337

llvm-svn: 326094
2018-02-26 15:23:42 +00:00
Evandro Menezes
1afffac05b [PATCH] [AArch64] Add new target feature to fuse conditional select
This feature enables the fusion of the comparison and the conditional select
instructions together.

Differential revision: https://reviews.llvm.org/D42392

llvm-svn: 325939
2018-02-23 19:27:43 +00:00
Evandro Menezes
c0571bd065 [AArch64] Improve macro fusion test case
Improve a vector in the test case for the fusion of address generation and
loads or stores.  Otherwise, NFC.

llvm-svn: 325839
2018-02-22 23:32:06 +00:00
Sander de Smalen
a86f3cfb49 Revert "[DebugInfo][FastISel] Fix dropping dbg.value()"
This patch reverts r325440 and r325438 because it triggers an
assertion in SelectionDAGBuilder.cpp. Also having debug enabled
may unintentionally affect code-gen. The patch is reverted until
we find a better solution.

llvm-svn: 325825
2018-02-22 19:53:59 +00:00
Evandro Menezes
72f3983633 [AArch64] Refactor instructions using SIMD immediates
Get rid of icky goto loops and make the code easier to maintain.  Otherwise,
NFC.

Restore r324903 and fix PR36369.

Differentail revision: https://reviews.llvm.org/D43364

llvm-svn: 325621
2018-02-20 20:31:45 +00:00
Amara Emerson
db211892ed [AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first.
This is a follow on commit to r[x] where we fix the other direction of copy.
For this case, after converting the source from gpr32 -> fpr32, we use a
subregister copy, which is essentially what EXTRACT_SUBREG does in SDAG land.

https://reviews.llvm.org/D43444

llvm-svn: 325550
2018-02-20 05:11:57 +00:00
Amara Emerson
7e9f348b2d [AArch64][GlobalISel] Fix an assert fail/miscompile when fp16 types are copied
to gpr register banks.

PR36345.

rdar://36478867

Differential Revision: https://reviews.llvm.org/D43310

llvm-svn: 325463
2018-02-18 17:10:49 +00:00
Amara Emerson
bc03baef77 [AArch64][GlobalISel] Support G_INSERT/G_EXTRACT of types < s32 bits.
These are needed for operations on fp16 types in a later patch.

llvm-svn: 325462
2018-02-18 17:03:02 +00:00
Haicheng Wu
aed6e52b3c [AArch64] Coalesce Copy Zero during instruction selection
Add special case for copy of zero to avoid a double copy.

Differential Revision: https://reviews.llvm.org/D36104

llvm-svn: 325459
2018-02-18 13:51:33 +00:00
Sander de Smalen
d01cb72f7e Made test dbg_value_fastisel.ll specific to AArch64 fast-isel.
Some buildbots failed on this test (rL325438) because they don't
build all targets. I set the triple to aarch64 and moved the test
to test/CodeGen/AArch64/fast-isel-dbg-value.ll.

llvm-svn: 325440
2018-02-17 17:43:24 +00:00
Martin Storsjo
a63a5b993e [AArch64] Implement dynamic stack probing for windows
This makes sure that alloca() function calls properly probe the
stack as needed.

Differential Revision: https://reviews.llvm.org/D42356

llvm-svn: 325433
2018-02-17 14:26:32 +00:00
Quentin Colombet
48abac82b8 Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r323991.

This commit breaks target that don't model all the register constraints
in TableGen. So far the workaround was to set the
hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the
cases.
For instance, when mutating an instruction (like in the lowering of
COPYs) the isRenamable flag is not properly updated. The same problem
will happen when attaching machine operand from one instruction to
another.

Geoff Berry is working on a fix in https://reviews.llvm.org/D43042.

llvm-svn: 325421
2018-02-17 03:05:33 +00:00
Evandro Menezes
10ae20d80c [AArch64] Fix BITCAST lowering crash
The data type is assumed to be a vector, but sometimes it is not, leading
to an assertion.

Add simple test-case to verify this.

Differential revision: https://reviews.llvm.org/D42599

llvm-svn: 325378
2018-02-16 20:00:57 +00:00
Volkan Keles
9283763865 GlobalISel: IRTranslate llvm.fmuladd.* intrinsic
Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, bogner

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D43090

llvm-svn: 324971
2018-02-13 00:47:46 +00:00
Abderrazek Zaafrani
e72d99261f [AArch64] Fixes for ARMv8.2-A FP16 scalar intrinsic - llvm portion
https://reviews.llvm.org/D42993

llvm-svn: 324912
2018-02-12 17:35:42 +00:00
Oliver Stannard
02f08c9d1f [AArch64] Improve v8.1-A code-gen for atomic load-and
Armv8.1-A added an atomic load-clear instruction (which performs bitwise
and with the complement of it's operand), but not a load-and
instruction. Our current code-generation for atomic load-and always
inserts an MVN instruction to invert its argument, even if it could be
folded into a constant or another instruction.

This adds lowering early in selection DAG to convert a load-and
operation into an xor with -1 and a load-clear, allowing the normal DAG
optimisations to work on it.

To do this, I've had to add a new ISD opcode, ATOMIC_LOAD_CLR. I don't
see any easy way to do this with an AArch64-specific ISD node, because
the code-generation for atomic operations assumes the SDNodes are of
type AtomicSDNode.

I've left the old tablegen patterns in because they are still needed for
global isel.

Differential revision: https://reviews.llvm.org/D42478

llvm-svn: 324908
2018-02-12 17:03:11 +00:00
Oliver Stannard
4269917304 [AArch64] Improve v8.1-A code-gen for atomic load-subtract
Armv8.1-A added an atomic load-add instruction, but not a load-subtract
instruction. Our current code-generation for atomic load-subtract always
inserts a NEG instruction to negate it's argument, even if it could be
folded into a constant or another instruction.

This adds lowering early in selection DAG to convert a load-subtract
operation into a subtract and a load-add, allowing the normal DAG
optimisations to work on it.

I've left the old tablegen patterns in because they are still needed for
global isel.

Some of the tests in this patch are copied from D35375 by Chad Rosier (which
was abandoned).

Differential revision: https://reviews.llvm.org/D42477

llvm-svn: 324892
2018-02-12 14:22:03 +00:00
Sanjay Patel
eb8c408e50 [TargetLowering] try to create -1 constant operand for math ops via demanded bits
This reverses instcombine's demanded bits' transform which always tries to clear bits in constants.

As noted in PR35792 and shown in the test diffs:
https://bugs.llvm.org/show_bug.cgi?id=35792
...we can do better in codegen by trying to form -1. The x86 sub test shows a missed opportunity. 

I did investigate changing instcombine's behavior, but it would be more work to change 
canonicalization in IR. Clearing bits / shrinking constants can allow killing instructions, 
so we'd have to figure out how to not regress those cases.

Differential Revision: https://reviews.llvm.org/D42986

llvm-svn: 324839
2018-02-11 14:38:23 +00:00
Jonas Paulsson
7850601fa3 [AArch64] Return true in enableMultipleCopyHints().
Enable multiple COPY hints to eliminate more COPYs during register allocation.

Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.

Review: Martin Storsjö
llvm-svn: 324720
2018-02-09 09:22:20 +00:00
Aditya Nandakumar
b14fd2608c [GISel]: Verify COPIES involving generic registers.
Add verification for copies involving generic registers if they are
compatible - ie if it is a generic copy, then the types are the
same, and if a COPY b/w generic and target virtual register, then
the sizes should be the same. Only checks if there are no sub registers
involved for now.

https://reviews.llvm.org/D37775

llvm-svn: 324696
2018-02-09 01:27:23 +00:00
Sjoerd Meijer
5ea465ded7 [AArch64] Don't materialize 0 with "fmov h0, .." when FullFP16 is not supported
We were generating "fmov h0, wzr" instructions when FullFP16 is not enabled.
I've not added any tests, because the problem was visible in:
test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll,
which I had to change: I don't think Cyclone has FullFP16 enabled
by default, so it shouldn't be using this v8.2a instruction.

I've also removed these rdar tags, please shout if there are any objections.

Differential Revision: https://reviews.llvm.org/D43020

llvm-svn: 324581
2018-02-08 08:39:05 +00:00
Eugene Leviant
25347ea895 [LegalizeDAG] Truncate condition operand of ISD::SELECT
Differential revision: https://reviews.llvm.org/D42737

llvm-svn: 324447
2018-02-07 05:38:29 +00:00
Craig Topper
6d9e090a64 [Mips][AMDGPU] Update test cases to not use vector lt/gt compares that can be simplified to an equality/inequality or to always true/false.
For example 'ugt X, 0' can be simplified to 'ne X, 0'. Or 'uge X, 0' is always true.

We already simplify this for scalars in SimplifySetCC, but we don't currently for vectors in SimplifySetCC. D42948 proposes to change that.

llvm-svn: 324436
2018-02-07 00:51:37 +00:00
Sanjay Patel
ffe72034a6 [AArch64] add test to show sub-optimal isel; NFC
llvm-svn: 324404
2018-02-06 21:25:02 +00:00
Amara Emerson
572f6cecf1 [AArch64][GlobalISel] Fix old use of % sigil in test.
My rebase had missed the new $ sigil we're using.

llvm-svn: 324051
2018-02-02 02:14:42 +00:00
Amara Emerson
58aea52bc4 [GlobalISel] Constrain the dest reg of IMPLICT_DEF.
This fixes a crash where the user is a COPY, which deliberately does not
constrain its source operands, resulting in a vreg without a reg class escaping
selection.

Differential Revision: https://reviews.llvm.org/D42697

llvm-svn: 324047
2018-02-02 01:44:43 +00:00
Amara Emerson
cbc02c71a4 [GlobalISel] Fix assert failure when legalizing non-power-2 loads.
Until we support extending loads properly we're going to fall back for these.
We already handle stores in the same way, so this is just being consistent.

llvm-svn: 324001
2018-02-01 20:47:03 +00:00
Geoff Berry
94503c7bc3 [MachineCopyPropagation] Extend pass to do COPY source forwarding
Summary:
This change extends MachineCopyPropagation to do COPY source forwarding
and adds an additional run of the pass to the default pass pipeline just
after register allocation.

This version of this patch uses the newly added
MachineOperand::isRenamable bit to avoid forwarding registers is such a
way as to violate constraints that aren't captured in the
Machine IR (e.g. ABI or ISA constraints).

This change is a continuation of the work started in D30751.

Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar

Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits

Differential Revision: https://reviews.llvm.org/D41835

llvm-svn: 323991
2018-02-01 18:54:01 +00:00
Sanjay Patel
702c19cc3e [AArch64] add tests with sqrt estimate and ieee denorms; NFC
As noted in D42323, we're not checking for denorms as we should.

llvm-svn: 323985
2018-02-01 17:57:45 +00:00
Sanjay Patel
f42381fd7e [AArch64] auto-generate complete checks; NFC
llvm-svn: 323984
2018-02-01 17:44:50 +00:00
Puyan Lotfi
43e94b15ea Followup on Proposal to move MIR physical register namespace to '$' sigil.
Discussed here:

http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html

In preparation for adding support for named vregs we are changing the sigil for
physical registers in MIR to '$' from '%'. This will prevent name clashes of
named physical register with named vregs.

llvm-svn: 323922
2018-01-31 22:04:26 +00:00
Geoff Berry
82203c4149 [MachineOutliner] Freeze registers in new functions
Summary:
Call MRI.freezeReservedRegs() on functions created during outlining so
that calls to isReserved() by the verifier called after this pass won't
assert.

Reviewers: MatzeB, qcolombet, paquette

Subscribers: mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D42749

llvm-svn: 323905
2018-01-31 20:15:16 +00:00
Chih-Hung Hsieh
60d1e79ffb [Analysis] Disable calls to *_finite and other glibc-only functions on Android.
Since r322087, glibc's finite lib calls are generated when possible.
However, they are not supported on Android. This change also
disables other functions not available on Android.

Differential Revision: http://reviews.llvm.org/D42668

llvm-svn: 323898
2018-01-31 19:12:50 +00:00
Petar Jovanovic
540f4cd10a [DWARF] Allow duplication of tails with CFI instructions
This commit came as a result for revert of patch r317579 (originally
committed as r317100). The patch made CFI instructions duplicable, because
their existence in the epilogue block was affecting the Tail duplication
pass. However, duplicating blocks with CFI instructions was an issue for
compact unwind info on Darwin, which is why the patch was reverted.

This patch allows duplicating tails with CFI instructions, though they are
not duplicable, by copying them 'manually'.


Patch by Djordje Kovacevic.

Differential Revision: https://reviews.llvm.org/D40979

llvm-svn: 323883
2018-01-31 15:57:57 +00:00
Florian Hahn
c68428b5dc [MachineCombiner] Add check for optimal pattern order.
In D41587, @mssimpso discovered that the order of some patterns for
AArch64 was sub-optimal. I thought a bit about how we could avoid that
case in the future. I do not think there is a need for evaluating all
patterns for now. But this patch adds an extra (expensive) check, that
evaluates the latencies of all patterns, and ensures that the latency
saved decreases for subsequent patterns.

This catches the sub-optimal order fixed in D41587, but I am not
entirely happy with the check, as it only applies to sub-optimal
patterns seen while building with EXPENSIVE_CHECKS on. It did not
discover any other sub-optimal pattern ordering.

Reviewers: Gerolf, spatel, mssimpso

Reviewed By: Gerolf, mssimpso

Differential Revision: https://reviews.llvm.org/D41766

llvm-svn: 323873
2018-01-31 13:54:30 +00:00
Evandro Menezes
b7d1729787 [AArch64] Expand testing of zero cycle zeroing
Make sure that r321824 doesn't change zeroing.

Differential revision: https://reviews.llvm.org/D42089

llvm-svn: 323816
2018-01-30 21:14:11 +00:00
Martin Storsjo
cc981d285d [GlobalISel] Bail out on calls to dllimported functions
Differential Revision: https://reviews.llvm.org/D42568

llvm-svn: 323811
2018-01-30 19:50:58 +00:00
Martin Storsjo
708498a164 [AArch64] Properly handle dllimport of variables when using fast-isel
Differential Revision: https://reviews.llvm.org/D42567

llvm-svn: 323810
2018-01-30 19:50:51 +00:00
Evandro Menezes
f1d01645a7 [AArch64] Add new target feature to fuse address generation with load or store
This feature enables the fusion of the address generation and a
corresponding load or store together.

Differential revision: https://reviews.llvm.org/D42393

llvm-svn: 323782
2018-01-30 16:28:01 +00:00
Evandro Menezes
16d7d81e5d [AArch64] Update test cases for Exynos M3
Update any test case relevant for Exynos M3.

llvm-svn: 323775
2018-01-30 15:40:27 +00:00
Evandro Menezes
9f9daa1f14 [AArch64] Add pipeline model for Exynos M3
Add the scheduling and cost model for Exynos M3.

Differential revision: https://reviews.llvm.org/D42387

llvm-svn: 323773
2018-01-30 15:40:16 +00:00
Quentin Colombet
72f6d59841 [RAFast] Don't dereference MBB::end
When RAFast sees liveins in on a basic block, it uses that information
to initialize the availability of the registers. The called
method uses an instruction as one of its argument and in the liveins
case, RAFast was dereferencing MBB::begin which can be MBB::end for
empty basic block.

Change the API of definePhysReg to use MachineBasicBlock::iterator
instead of MachineInstr so that we don't dereference an
invalid iterator while making the call.

rdar://problem/36952401

llvm-svn: 323710
2018-01-29 23:42:37 +00:00
Jun Bum Lim
fc7d56d949 Revert "AArch64: Omit callframe setup/destroy when not necessary"
This reverts commit r322917 due to multiple performance regressions in spec2006
and spec2017. XFAILed llvm/test/CodeGen/AArch64/big-callframe.ll which initially
motivated this change.

llvm-svn: 323683
2018-01-29 19:56:42 +00:00
Oliver Stannard
a9d2e004d2 [AArch64] Generate the CASP instruction for 128-bit cmpxchg
The Large System Extension added an atomic compare-and-swap instruction
that operates on a pair of 64-bit registers, which we can use to
implement a 128-bit cmpxchg.

Because i128 is not a legal type for AArch64 we have to do all of the
instruction selection in C++, and the instruction requires even/odd
register pairs, so we have to wrap it in REG_SEQUENCE and EXTRACT_SUBREG
nodes. This is very similar to what we do for 64-bit cmpxchg in the ARM
backend.

Differential revision: https://reviews.llvm.org/D42104

llvm-svn: 323634
2018-01-29 09:18:37 +00:00