clang-p2996

Author	SHA1	Message	Date
Sjoerd Meijer	d31a8c0595	Recommit: [ARM] f16 constant pool fix This recommits r325754; the modified and failing test case actually didn't need any modifications. llvm-svn: 325765	2018-02-22 10:43:57 +00:00
Sjoerd Meijer	9a25247f80	Revert r325754 and r325755 (f16 literal pool) because buildbots were unhappy. llvm-svn: 325756	2018-02-22 08:41:55 +00:00
Sjoerd Meijer	f98e32cf53	Added a test that I forgot to svn add in my previous commit r325754. llvm-svn: 325755	2018-02-22 08:20:50 +00:00
Sjoerd Meijer	7d5909eb0f	[ARM] f16 constant pool fix This is a follow up of r325012, that allowed half types in constant pools. Proper alignment was enforced when a big basic block was split up, but not when a CPE was placed before/after a block; the successor block had the wrong alignment. Differential Revision: https://reviews.llvm.org/D43580 llvm-svn: 325754	2018-02-22 08:16:05 +00:00
Sjoerd Meijer	4d5c40492a	[ARM] Lower BR_CC for f16 This case wasn't handled yet. Differential Revision: https://reviews.llvm.org/D43508 llvm-svn: 325616	2018-02-20 19:28:05 +00:00
Francis Visoiu Mistrih	7f0f8bb4bd	[CodeGen] Fix tests breaking after r325505 llvm-svn: 325512	2018-02-19 15:51:17 +00:00
Sjoerd Meijer	c9bde5404a	[ARM] Add LLVM tests for the vcvtr builtins Follow up of Clang commit r325351; this adds the LLVM tests, which were also missing. Differential Revision: https://reviews.llvm.org/D43395 llvm-svn: 325443	2018-02-17 19:59:29 +00:00
Quentin Colombet	48abac82b8	Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" This reverts commit r323991. This commit breaks target that don't model all the register constraints in TableGen. So far the workaround was to set the hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the cases. For instance, when mutating an instruction (like in the lowering of COPYs) the isRenamable flag is not properly updated. The same problem will happen when attaching machine operand from one instruction to another. Geoff Berry is working on a fix in https://reviews.llvm.org/D43042. llvm-svn: 325421	2018-02-17 03:05:33 +00:00
Jonas Paulsson	995ba6e42c	[ARM] Return true in enableMultipleCopyHints(). Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Eli Friedman llvm-svn: 325327	2018-02-16 09:51:01 +00:00
Mikhail Maltsev	0a7e107e77	[LegalizeDAG] Fix legalization of SETCC Summary: Currently when expanding a SETCC node into a SELECT_CC, LLVM uses an incorrect type for determining BooleanContent of the result. This patch fixes the issue. Fixes PR36079. Reviewers: rogfer01, javed.absar, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43282 llvm-svn: 325325	2018-02-16 09:35:16 +00:00
Roger Ferrer Ibanez	d41059a9f6	[ARM] Materialise some boolean values to avoid a branch This patch combines some cases of ARMISD::CMOV for integers that arise in comparisons of the form a != b ? x : 0 a == b ? 0 : x and that currently (e.g. in Thumb1) are emitted as branches. Differential Revision: https://reviews.llvm.org/D34515 llvm-svn: 325323	2018-02-16 09:23:59 +00:00
Pablo Barrio	fa6f1c0130	[ARM] Fix redirect in inline assembly test Summary: Fix silly mistake in a test Reviewers: gkistanova, apilipenko Subscribers: javed.absar, eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D43342 llvm-svn: 325283	2018-02-15 19:17:55 +00:00
Pablo Barrio	e28cb8399a	[ARM] Allow 64- and 128-bit types with 't' inline asm constraint Summary: In LLVM, 't' selects a floating-point/SIMD register and only supports 32-bit values. This is appropriately documented in the LLVM Language Reference Manual. However, this behaviour diverges from that of GCC, where 't' selects the s0-s31 registers and its qX and dX variants depending on additional operand modifiers (q/P). For example, the following C code: #include <arm_neon.h> float32x4_t a, b, x; asm("vadd.f32 %0, %1, %2" : "=t" (x) : "t" (a), "t" (b)) results in the following assembly if compiled with GCC: vadd.f32 s0, s0, s1 whereas LLVM will show "error: couldn't allocate output register for constraint 't'", since a, b, x are 128-bit variables, not 32-bit. This patch extends the use of 't' to mean that of GCC, thus allowing selection of the lower Q vector regs and their D/S variants. For example, the earlier code will now compile as: vadd.f32 q0, q0, q1 This behaviour still differs from that of GCC but I think it is actually more correct, since LLVM picks up the right register type based on the datatype of x, while GCC would need an extra operand modifier to achieve the same result, as follows: asm("vadd.f32 %q0, %q1, %q2" : "=t" (x) : "t" (a), "t" (b)) Since this is only an extension of functionality, existing code should not be affected by this change. Note that operand modifiers q/P are already supported by LLVM, so this patch should suffice to support inline assembly with constraint 't' originally built for GCC. Reviewers: grosbach, rengolin Reviewed By: rengolin Subscribers: rogfer01, efriedma, olista01, aemerson, javed.absar, eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42962 llvm-svn: 325244	2018-02-15 14:44:22 +00:00
Sjoerd Meijer	9430c8cd1c	[ARM] f16 vcmp fixes This adds f16 VCMP match rules and fixes the test cases. Differential Revision: https://reviews.llvm.org/D43291 llvm-svn: 325228	2018-02-15 10:33:07 +00:00
Sjoerd Meijer	3b4294edd2	[ARM] f16 stack spill/reloads This adds support for handling f16 stack spills/reloads. Differential Revision: https://reviews.llvm.org/D43280 llvm-svn: 325130	2018-02-14 15:09:09 +00:00
Francis Visoiu Mistrih	f6ed795d0c	[CodeGen] Print bundled instructions using the MIR syntax in -debug output Old syntax: BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2 * %r0 = SOME_OP %r2 * %r1 = ANOTHER_OP internal %r0 New syntax: BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2 { %r0 = SOME_OP %r2 %r1 = ANOTHER_OP internal %r0 } llvm-svn: 325032	2018-02-13 18:08:26 +00:00
Sjoerd Meijer	f4a7fa7bbe	[ARM] Allow half types in ConstantPool Change ARMConstantIslandPass to: - accept f16 literals as litpool entries, - if the litpool needs to be inserted in the middle of a big block, then we need to 4-byte align the next instruction in ARM mode. Differential Revision: https://reviews.llvm.org/D42784 llvm-svn: 325012	2018-02-13 15:34:09 +00:00
Sjoerd Meijer	101ee43072	[Thumb] Handle addressing mode AddrMode5FP16 This addressing mode wasn't checked, so we were running in an assert. Differential Revision: https://reviews.llvm.org/D43179 llvm-svn: 324996	2018-02-13 10:29:03 +00:00
Martin Storsjo	9ca8b57186	[GlobalMerge] Allow merging of dllexported variables If merging them, the dllexport attribute needs to be brought along to the new GlobalAlias. Differential Revision: https://reviews.llvm.org/D43192 llvm-svn: 324937	2018-02-12 21:14:21 +00:00
David Green	6d9f8c9817	[CodeGen] Add a -trap-unreachable option for debugging Add a common -trap-unreachable option, similar to the target specific hexagon equivalent, which has been replaced. This turns unreachable instructions into traps, which is useful for debugging. Differential Revision: https://reviews.llvm.org/D42965 llvm-svn: 324880	2018-02-12 11:06:27 +00:00
Sanjay Patel	3659e8c95c	[ARM] preserve test intent by removing undef D43141 proposes to correct undef folding in the DAG, and this test would not survive that change. llvm-svn: 324814	2018-02-10 15:14:00 +00:00
Rafael Espindola	c052fa0bd3	Emit smaller exception tables for non-SJLJ mode. * Use uleb128 for code offsets in the LSDA call site table. * Omit the TTBase offset if the type table is empty. This change can reduce the size of the DWARF/Itanium LSDA by about half. Patch by Ryan Prichard! llvm-svn: 324750	2018-02-09 17:13:37 +00:00
Rafael Espindola	d09b416943	Use assembler expressions to lay out the EH LSDA. Rely on the assembler to finalize the layout of the DWARF/Itanium exception-handling LSDA. Rather than calculate the exact size of each thing in the LSDA, use assembler directives: To emit the offset to the TTBase label: .uleb128 .Lttbase0-.Lttbaseref0 .Lttbaseref0: To emit the size of the call site table: .uleb128 .Lcst_end0-.Lcst_begin0 .Lcst_begin0: ... call site table entries ... .Lcst_end0: To align the type info table: ... action table ... .balign 4 .long _ZTIi .long _ZTIl .Lttbase0: Using assembler directives simplifies the compiler and allows switching the encoding of offsets in the call site table from udata4 to uleb128 for a large code size savings. (This commit does not change the encoding.) The combination of the uleb128 followed by a balign creates an unfortunate dependency cycle that the assembler must sometimes resolve either by padding an LEB or by inserting zero padding before the type table. See PR35809 or GNU as bug 4029. Patch by Ryan Prichard! llvm-svn: 324749	2018-02-09 17:00:25 +00:00
Francis Visoiu Mistrih	39ec2e95ae	[CodeGen] Unify the syntax of MBB successors in MIR and -debug output Instead of: Successors according to CFG: %bb.6(0x12492492 / 0x80000000 = 14.29%) print: successors: %bb.6(0x12492492); %bb.6(14.29%) llvm-svn: 324685	2018-02-09 00:10:31 +00:00
Francis Visoiu Mistrih	da89d1812a	[CodeGen] Print MachineBasicBlock labels using MIR syntax in -debug output Instead of: %bb.1: derived from LLVM BB %for.body print: bb.1.for.body: Also use MIR syntax for MBB attributes like "align", "landing-pad", etc. llvm-svn: 324563	2018-02-08 05:02:00 +00:00
Sjoerd Meijer	8c0739347c	[ARM] FP16 mov imm pattern This is a follow up of r324321, adding a match pattern for mov with a FP16 immediate (also fixing operand vfp_f16imm that wasn't even compiling). Differential Revision: https://reviews.llvm.org/D42973 llvm-svn: 324456	2018-02-07 08:37:17 +00:00
Eli Friedman	cd07a3e2f9	Place undefined globals in .bss instead of .data Following up on the discussion from http://lists.llvm.org/pipermail/llvm-dev/2017-April/112305.html, undef values are now placed in the .bss as well as null values. This prevents undef global values taking up potentially huge amounts of space in the .data section. The following two lines now both generate equivalent .bss data: @vals1 = internal unnamed_addr global [20000000 x i32] zeroinitializer, align 4 @vals2 = internal unnamed_addr global [20000000 x i32] undef, align 4 ; previously unaccounted for This is primarily motivated by the corresponding issue in the Rust compiler (https://github.com/rust-lang/rust/issues/41315). Differential Revision: https://reviews.llvm.org/D41705 Patch by varkor! llvm-svn: 324424	2018-02-06 23:22:14 +00:00
Eli Friedman	98f8bba283	[LivePhysRegs] Fix handling of return instructions. See D42509 for the original version of this. Basically, there are two significant changes to behavior here: - addLiveOuts always adds all pristine registers (even if a block has no successors). - addLiveOuts and addLiveOutsNoPristines always add all callee-saved registers for return blocks (including conditional return blocks). I cleaned up the functions a bit to make it clear these properties hold. Differential Revision: https://reviews.llvm.org/D42655 llvm-svn: 324422	2018-02-06 23:00:17 +00:00
Sjoerd Meijer	d2718ba95e	[ARM] f16 conversions This is a follow up of r324321, adding f16 <-> f32 and f16 <-> f64 conversion match patterns. Differential Revision: https://reviews.llvm.org/D42954 llvm-svn: 324360	2018-02-06 16:28:43 +00:00
Sjoerd Meijer	89ea2648bb	[ARM] Armv8.2-A FP16 code generation (part 3/3) This adds most of the FP16 codegen support, but these areas need further work: - FP16 literals and immediates are not properly supported yet (e.g. literal pool needs work), - Instructions that are generated from intrinsics (e.g. vabs) haven't been added. This will be addressed in follow-up patches. Differential Revision: https://reviews.llvm.org/D42849 llvm-svn: 324321	2018-02-06 08:43:56 +00:00
Sjoerd Meijer	986d64ad73	[ARM] fixed some tabs/whitespaces in test. NFC. llvm-svn: 324074	2018-02-02 11:51:06 +00:00
Matthias Braun	ca0abaebfb	SplitKit: Fix liveness recomputation in some remat cases. Example situation: ``` BB0: %0 = ... use %0 ; ... condjump BB1 jmp BB2 BB1: %0 = ... ; rematerialized def from above (from earlier split step) jmp BB2 BB2: ; ... use %0 ``` %0 will have a live interval with 3 value numbers (for the BB0, BB1 and BB2 parts). Now SplitKit tries and succeeds in rematerializing the value number in BB2 (This only works because it is a secondary split so SplitKit can trace this back to a single original def). We need to recompute all live ranges affected by a value number that we rematerialize. The case that we missed before is that when the value that is rematerialized is at a join (Phi VNI) then we also have to recompute liveness for the predecessor VNIs. rdar://35699130 Differential Revision: https://reviews.llvm.org/D42667 llvm-svn: 324039	2018-02-02 00:08:19 +00:00
Geoff Berry	94503c7bc3	[MachineCopyPropagation] Extend pass to do COPY source forwarding Summary: This change extends MachineCopyPropagation to do COPY source forwarding and adds an additional run of the pass to the default pass pipeline just after register allocation. This version of this patch uses the newly added MachineOperand::isRenamable bit to avoid forwarding registers is such a way as to violate constraints that aren't captured in the Machine IR (e.g. ABI or ISA constraints). This change is a continuation of the work started in D30751. Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits Differential Revision: https://reviews.llvm.org/D41835 llvm-svn: 323991	2018-02-01 18:54:01 +00:00
Sjoerd Meijer	9d9a86535e	[ARM] FullFP16 LowerReturn Fix Commit r323512 introduced an optimisation in LowerReturn for half-precision return values. A missing check caused a crash when the return value is "undef" (i.e. a node that has no operands). Differential Revision: https://reviews.llvm.org/D42743 llvm-svn: 323968	2018-02-01 13:48:40 +00:00
Evgeniy Stepanov	7746899f48	Revert "[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations" Miscompiles code. Testcase pending. This reverts commit r323869. llvm-svn: 323929	2018-01-31 22:55:19 +00:00
Puyan Lotfi	43e94b15ea	Followup on Proposal to move MIR physical register namespace to '$' sigil. Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs. llvm-svn: 323922	2018-01-31 22:04:26 +00:00
Pablo Barrio	2e442a7831	[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations Summary: Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before. Patch by Marten Svanfeldt. Reviewers: fhahn, pbarrio Reviewed By: pbarrio Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42574 llvm-svn: 323869	2018-01-31 13:20:10 +00:00
Sjoerd Meijer	98d5359ea2	[ARM] Armv8.2-A FP16 code generation (part 2/3) Half-precision arguments and return values are passed as if it were an int or float for ARM. This results in truncates and bitcasts to/from i16 and f16 values, which are legalized very early to stack stores/loads. When FullFP16 is enabled, we want to avoid codegen for these bitcasts as it is unnecessary and inefficient. Differential Revision: https://reviews.llvm.org/D42580 llvm-svn: 323861	2018-01-31 10:18:29 +00:00
Diana Picus	f72e865372	[ARM GlobalISel] Add inst selector tests for G_SITOFP and G_UITOFP These are handled by the TableGen'erated code. llvm-svn: 323732	2018-01-30 09:15:27 +00:00
Diana Picus	2a5b962030	[ARM GlobalISel] Map G_SITOFP and G_UITOFP Straightforward mapping (integer operand to GPR, floating point operand to FPR). llvm-svn: 323731	2018-01-30 09:15:23 +00:00
Diana Picus	517531e5a5	[ARM GlobalISel] Legalize G_SITOFP and G_UITOFP Legal if we have hardware support, libcall otherwise. Also add supporting code to the legalizer helper for libcalls. llvm-svn: 323730	2018-01-30 09:15:17 +00:00
Diana Picus	f5ad62d921	[ARM GlobalISel] Add inst selector tests for G_FPTOSI and G_FPTOUI The work is done by the TableGen'erated code. llvm-svn: 323728	2018-01-30 07:55:02 +00:00
Diana Picus	a2da03022c	[ARM GlobalISel] Map G_FPTOSI and G_FPTOUI Straightforward mapping (integer operand goes to GPR, floating point operand goes to FPR). llvm-svn: 323727	2018-01-30 07:54:58 +00:00
Diana Picus	4ed0ee7b5f	[ARM GlobalISel] Legalize G_FPTOSI and G_FPTOUI Legal if we have hardware support for floating point, libcalls otherwise. Also add the necessary support for libcalls in the legalizer helper. llvm-svn: 323726	2018-01-30 07:54:52 +00:00
Daniel Sanders	08464524c3	[ARM][GISel] PR35965 Constrain RegClasses of nested instructions built from Dst Pattern Summary: Apparently, we missed on constraining register classes of VReg-operands of all the instructions built from a destination pattern but the root (top-level) one. The issue exposed itself while selecting G_FPTOSI for armv7: the corresponding pattern generates VTOSIZS wrapped into COPY_TO_REGCLASS, so top-level COPY_TO_REGCLASS gets properly constrained, while nested VTOSIZS (or rather its destination virtual register to be exact) does not. Fixing this by issuing GIR_ConstrainSelectedInstOperands for every nested GIR_BuildMI. https://bugs.llvm.org/show_bug.cgi?id=35965 rdar://problem/36886530 Patch by Roman Tereshin Reviewers: dsanders, qcolombet, rovka, bogner, aditya_nandakumar, volkan Reviewed By: dsanders, qcolombet, rovka Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42565 llvm-svn: 323692	2018-01-29 21:09:12 +00:00
Momchil Velikov	d2cc6fd90b	[ARM] Accept a subset of Thumb GPR register class when emitting an SP-relative load instruction The function `Thumb1InstrInfo::loadRegFromStackSlot` accepts only the `tGPR` register class. The function serves to emit a `tLDRspi` instruction and certainly any subset of the `tGPR` register class is a valid destination of the load. Differential revision: https://reviews.llvm.org/D42535 llvm-svn: 323514	2018-01-26 10:20:58 +00:00
Sjoerd Meijer	011de9c0ca	[ARM] Armv8.2-A FP16 code generation (part 1/3) This is the groundwork for Armv8.2-A FP16 code generation . Clang passes and returns _Float16 values as floats, together with the required bitconverts and truncs etc. to implement correct AAPCS behaviour, see D42318. We will implement half-precision argument passing/returning lowering in the ARM backend soon, but for now this means that this: _Float16 sub(_Float16 a, _Float16 b) { return a + b; } gets lowered to this: define float @sub(float %a.coerce, float %b.coerce) { entry: %0 = bitcast float %a.coerce to i32 %tmp.0.extract.trunc = trunc i32 %0 to i16 %1 = bitcast i16 %tmp.0.extract.trunc to half <SNIP> %add = fadd half %1, %3 <SNIP> } When FullFP16 is not supported, we don't make f16 a legal type, and we get legalization for "free", i.e. nothing changes and everything works as before. And also f16 argument passing/returning is handled. When FullFP16 is supported, we do make f16 a legal type, and have 2 places that we need to patch up: f16 argument passing and returning, which involves minor tweaks to avoid unnecessary code generation for some bitcasts. As a "demonstrator" that this works for the different FP16, FullFP16, softfp modes, etc., I've added match rules to the VSUB instruction description showing that we can codegen this instruction from IR, but more importantly, also to some conversion instructions. These conversions were causing issue before in the FP16 and FullFP16 cases. I've also added match rules to the VLDRH and VSTRH desriptions, so that we can actually compile the entire half-precision sub code example above. This showed that these loads and stores had the wrong addressing mode specified: AddrMode5 instead of AddrMode5FP16, which turned out not be implemented at all, so that has also been added. This is the minimal patch that shows all the different moving parts. 
In patch 2/3 I will add some efficient lowering of bitcasts, and in 3/3 I will add the remaining Armv8.2-A FP16 instruction descriptions. Thanks to Sam Parker and Oliver Stannard for their help and reviews! Differential Revision: https://reviews.llvm.org/D38315 llvm-svn: 323512	2018-01-26 09:26:40 +00:00
Weiming Zhao	665784f170	[ARM] Expand long shifts for Thumb1 to __aeabi_ calls Summary: For long shifts, the inlined version takes about 20 instructions on Thumb1. To avoid the code bloat, expand to __aeabi_ calls if target is Thumb1. Reviewers: samparker Reviewed By: samparker Subscribers: samparker, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42401 llvm-svn: 323354	2018-01-24 18:00:57 +00:00
Martin Storsjo	4ed94a06ac	[ARM] Call __chkstk for dynamic stack allocation in all windows environments This matches what MSVC does for alloca() function calls on ARM. Even if MSVC doesn't support VLAs at the language level, it does support the alloca function. On the clang level, both the _alloca() (when emulating MSVC, which is what the alloca() function expands to) and __builtin_alloca() builtin functions, and VLAs, map to the same LLVM IR "alloca" function - so within LLVM they're not distinguishable from each other. Differential Revision: https://reviews.llvm.org/D42292 llvm-svn: 323308	2018-01-24 06:40:11 +00:00
Martin Storsjo	e8248f2e10	[GlobalMerge] Don't merge dllexport globals Merging such globals loses the dllexport attribute. Add a test to check that normal globals still are merged. Differential Revision: https://reviews.llvm.org/D42127 llvm-svn: 323307	2018-01-24 06:40:04 +00:00


3332 Commits