clang-p2996

Author	SHA1	Message	Date
Tim Northover	14e7f73a0f	GlobalISel: clear pending phis after MachineFunction translated Test is just reordering the existing functions (it would trigger for any function after one with a phi). llvm-svn: 277841	2016-08-05 17:50:36 +00:00
Simon Pilgrim	69b6a70834	[X86][SSE] Add initial support for 2 input target shuffle combining. At the moment only the INSERTPS matching can actually use 2 inputs but the plumbing is now in place. llvm-svn: 277839	2016-08-05 17:36:14 +00:00
Tim Northover	97d0cb3165	GlobalISel: IRTranslate PHI instructions llvm-svn: 277835	2016-08-05 17:16:40 +00:00
Ulrich Weigand	c3b495a649	[PowerPC] Wrong fast-isel codegen for VSX floating-point loads There were two locations where fast-isel would generate a LFD instruction with a target register class VSFRC instead of F8RC when VSX was enabled. This can cause invalid registers to be used in certain cases, like: lfd 36, ... instead of using a VSX load instruction. The wrong register number gets silently truncated, causing invalid code to be generated. The first place is PPCFastISel::PPCEmitLoad, which had multiple problems: 1.) The IsVSSRC and IsVSFRC flags are not initialized correctly, since they are computed from resultReg, which is still zero at this point in many cases. Fixed by changing the helper routines to operate on a register class instead of a register and passing in UseRC. 2.) Even with this fixed, Is64VSXLoad is still wrong due to a typo: bool Is32VSXLoad = IsVSSRC && Opc == PPC::LFS; bool Is64VSXLoad = IsVSSRC && Opc == PPC::LFD; The second line needs to use IsVSFRC (like PPCEmitStore does). 3.) Once both the above are fixed, we're now generating a VSX instruction -- but an incorrect one, since generation of an indexed instruction with null index is wrong. Fixed by copying the code handling the same issue in PPCEmitStore. The second place is PPCFastISel::PPCMaterializeFP, where we would emit an LFD to load a constant from the literal pool, and use the wrong result register class. Fixed by hardcoding a F8RC class even on systems supporting VSX. Fixes: https://llvm.org/bugs/show_bug.cgi?id=28630 Differential Revision: https://reviews.llvm.org/D22632 llvm-svn: 277823	2016-08-05 15:22:05 +00:00
Strahinja Petrovic	30e0ce8e9f	[PowerPC] fix passing long double arguments to functions (soft-float) This patch fixes passing long double type arguments to functions in soft-float mode. If fewer than 4 argument registers are free (a long double is mapped to 4 GPRs in soft-float mode), the long double argument must be passed on the stack. Differential Revision: https://reviews.llvm.org/D20114. llvm-svn: 277804	2016-08-05 08:47:26 +00:00
Tim Northover	61c16142b4	GlobalISel: extend add widening to SUB, MUL, OR, AND and XOR. These are the operations that are trivially identical. Division is omitted for now because you need to use the correct sign/zero extension. llvm-svn: 277775	2016-08-04 21:39:49 +00:00
Tim Northover	1cfa919b3d	GlobalISel: add support for G_MUL llvm-svn: 277774	2016-08-04 21:39:44 +00:00
Tim Northover	9656f1476c	GlobalISel: implement narrowing for G_ADD. llvm-svn: 277769	2016-08-04 20:54:13 +00:00
Yaxun Liu	86c052238a	[OpenCL] Add missing tests for getOCLTypeName Adding missing tests for OCL type names for half, float, double, char, short, long, and unknown. Patch by Aaron En Ye Shi. Differential Revision: https://reviews.llvm.org/D22964 llvm-svn: 277759	2016-08-04 19:45:00 +00:00
Tim Northover	2f32e7f0ac	AArch64: don't assume all i128s are BUILD_PAIRs It leads to a crash when they're not. I'm sure I've made this mistake before, at least once. llvm-svn: 277755	2016-08-04 19:32:28 +00:00
Tim Northover	06db18fbf8	GlobalISel: also add G_TRUNC to IRTranslator. llvm-svn: 277749	2016-08-04 18:35:17 +00:00
Tim Northover	323358184e	GlobalISel: add code to widen scalar G_ADD llvm-svn: 277747	2016-08-04 18:35:11 +00:00
Derek Schuff	732636d901	[WebAssembly] Check return value of getRegForValue in FastISel Previously, FastISel for WebAssembly wasn't checking the return value of `getRegForValue` in certain cases, which would generate instructions referencing NoReg. This patch fixes this behavior. Patch by Dominic Chen Differential Revision: https://reviews.llvm.org/D23100 llvm-svn: 277742	2016-08-04 18:01:52 +00:00
Krzysztof Parzyszek	04c0796e37	[Hexagon] Validate register class when doing bit simplification llvm-svn: 277740	2016-08-04 17:56:19 +00:00
Daniel Sanders	5dcbac57c5	[mips] Set Personality and LSDA encoding for FreeBSD Reviewers: seanbruno, sdardis Subscribers: tberghammer, danalbert, srhines, dsanders, sdardis, llvm-commits, seanbruno Differential Revision: https://reviews.llvm.org/D23113 llvm-svn: 277732	2016-08-04 15:36:03 +00:00
Krzysztof Parzyszek	7773c58458	[Hexagon] Clear kill flags from modified registers in peephole optimizer llvm-svn: 277727	2016-08-04 14:17:16 +00:00
Nikolai Bozhenov	f679530ba1	[X86] Heuristic to selectively build Newton-Raphson SQRT estimation On modern Intel processors hardware SQRT in many cases is faster than RSQRT followed by Newton-Raphson refinement. The patch introduces a simple heuristic to choose between hardware SQRT instruction and Newton-Raphson software estimation. The patch treats scalars and vectors differently. The heuristic is that for scalars the compiler should optimize for latency while for vectors it should optimize for throughput. It is based on the assumption that throughput bound code is likely to be vectorized. Basically, the patch disables scalar NR for big cores and disables NR completely for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores. Secondly, vector SQRT has been greatly improved in Skylake and has better throughput compared to NR. Differential Revision: https://reviews.llvm.org/D21379 llvm-svn: 277725	2016-08-04 12:47:28 +00:00
Simon Pilgrim	8ae6dad49b	[X86][SSE] Don't decide when to scalarize CTTZ/CTLZ for performance at lowering - this is what cost models are for Improved CTTZ/CTLZ costings will be added shortly llvm-svn: 277713	2016-08-04 10:14:39 +00:00
Simon Dardis	57f4ae4625	[mips] Enable tail calls by default Enable tail calls by default for (micro)MIPS(64). microMIPS is slightly more tricky than doing it for MIPS(R6) or microMIPSR6. microMIPS has two instruction encodings: 16bit and 32bit along with some restrictions on the size of the instruction that can fill the delay slot. For safe tail calls for microMIPS, the delay slot filler attempts to find a correct size instruction for the delay slot of TAILCALL pseudos. Reviewers: dsanders, vkalintris Subscribers: jfb, dsanders, sdardis, llvm-commits Differential Revision: https://reviews.llvm.org/D21138 llvm-svn: 277708	2016-08-04 09:17:07 +00:00
Dean Michael Berris	7e9abea2ae	[XRay] Align entry and return sleds to 2 byte boundaries This should ensure that we can atomically write two bytes (on top of the retq and the one past it) and have those two bytes not straddle cache lines. We also move the label past the alignment instruction so that we can refer to the actual first instruction, as opposed to potential padding before the aligned instruction. Update the tests to allow us to reflect the new order of assembly. Reviewers: rSerge, echristo, majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23101 llvm-svn: 277701	2016-08-04 07:37:28 +00:00
Matt Arsenault	b0e32f1ba1	AMDGPU: Fix a slow test by using basic regalloc This just tests that the register limit isn't exceeded, so the register allocation doesn't need to be great. The critically slow part is all in greedy RA, so switch to basic. llvm-svn: 277700	2016-08-04 07:04:54 +00:00
Matthias Braun	1873998b16	RenameIndependentSubregs: Fix liveness query in rewriteOperands() rewriteOperands() always performed liveness queries at the base index rather than the RegSlot/Base as appropriate for the machine operand. This could lead to illegal rewriting in some cases. llvm-svn: 277661	2016-08-03 22:37:47 +00:00
Guozhi Wei	9584d18d48	[PPC] Handling CallInst in PPCBoolRetToInt This patch fixes pr25548. Current implementation of PPCBoolRetToInt doesn't handle CallInst correctly, so it failed to do the intended optimization when there is a CallInst with parameters. This patch fixed that. llvm-svn: 277655	2016-08-03 21:43:51 +00:00
Bruno Cardoso Lopes	3fcf832cce	Revert "[ARM] Constant Materialize: imms with specific value can be encoded into mov.w" This reverts commit r277610 / d619aa8878c3dafcc0d29a46517f63ff3209fdd4. This makes subtarget-no-movt.ll fail in http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/26892 llvm-svn: 277654	2016-08-03 21:26:21 +00:00
Elliot Colp	6af6f64f87	I can't reproduce this buildbot failure locally, so temporarily remove this test while I investigate. http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/27427 llvm-svn: 277636	2016-08-03 19:39:20 +00:00
Simon Pilgrim	898f030f70	[X86][SSE] Enable target shuffle combining to combine multiple shuffle inputs. We currently only support combining target shuffles that consist of a single source input (plus elements known to be undef/zero). This patch generalizes the recursive combining of the target shuffle to collect all the inputs, merging any duplicates along the way, into a full set of src ops and its shuffle mask. We uncover a number of cases where we have failed to combine a unary shuffle because the input has been duplicated and separated during lowering. This will allow us to combine to 2-input shuffles in a future patch. Differential Revision: https://reviews.llvm.org/D22859 llvm-svn: 277631	2016-08-03 19:08:24 +00:00
Krzysztof Parzyszek	23ee12e173	[Hexagon] Generate COPY/REG_SEQUENCE more aggressively for vectors llvm-svn: 277626	2016-08-03 18:35:48 +00:00
Ehsan Amiri	a538b0f023	Adding -verify-machineinstrs option to PowerPC tests Currently we have a number of tests that fail with -verify-machineinstrs. To detect these cases earlier we add the option to the testcases, with the exception of tests that will currently fail with this option. PR 27456 keeps track of these failures. No code review, as discussed with Hal Finkel. llvm-svn: 277624	2016-08-03 18:17:35 +00:00
Weiming Zhao	57dc4cf0e1	[ARM] Constant Materialize: imms with specific value can be encoded into mov.w Summary: Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes. Reviewers: john.brawn, jmolloy Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D23090 llvm-svn: 277610	2016-08-03 17:05:23 +00:00
Elliot Colp	82b1468a4d	Disable shrinking of SNaN constants When expanding FP constants, we attempt to shrink doubles to floats and perform an extending load. However, on SystemZ, and possibly on other targets (I've only confirmed the problem on SystemZ), the FP extending load instruction may convert SNaN into QNaN, or may cause an exception. So in the general case, we would still like to shrink FP constants, but SNaNs should be left as doubles. Differential Revision: https://reviews.llvm.org/D22685 llvm-svn: 277602	2016-08-03 15:09:21 +00:00
Krzysztof Parzyszek	ed4e7827bb	[Hexagon] Do not check alignment for unsized types in isLegalAddressingMode When the same base address is used to load two different data types, LSR would assume a memory type of "void". This type is not sized and has no alignment information. Checking for it causes a crash. llvm-svn: 277601	2016-08-03 15:06:18 +00:00
Dean Michael Berris	0b8f6c8777	[XRay] Make the xray_instr_map section specification more correct Summary: We also add a test to show what currently happens when we create a section per function and emit an xray_instr_map. This illustrates the relationship (or lack thereof) between the per-function section and the xray_instr_map section. We also change the code generation slightly so that we don't always create group sections, but rather only do so if the function the table is associated with is in a group. Also in this change: - Remove the "merge" flag on the xray_instr_map section. - Test that we're generating the right table for comdat and non-comdat functions. Reviewers: echristo, majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23104 llvm-svn: 277580	2016-08-03 07:21:55 +00:00
Derek Schuff	39bf39f35c	[WebAssembly] Initial SIMD128 support. Kicks off the implementation of wasm SIMD128 support (spec: https://github.com/stoklund/portable-simd/blob/master/portable-simd.md), adding support for add, sub, mul for i8x16, i16x8, i32x4, and f32x4. The spec is WIP, and might change in the near future. Patch by João Porto Differential Revision: https://reviews.llvm.org/D22686 llvm-svn: 277543	2016-08-02 23:16:09 +00:00
Tim Northover	765777ce67	ARM: only form SMMLS when SUBE flags unused. In this particular example we wouldn't want the smmls anyway (the value is actually unused), but in general smmls does not provide the required flags register so if that SUBE result is used we can't replace it. llvm-svn: 277541	2016-08-02 23:12:36 +00:00
Matt Arsenault	979902b3ff	AMDGPU: fdiv -1, x -> rcp -x llvm-svn: 277535	2016-08-02 22:25:04 +00:00
Krzysztof Parzyszek	824d347d2d	[Hexagon] Recognize vcombine in copy propagation llvm-svn: 277528	2016-08-02 21:49:20 +00:00
Artem Belevich	db4bc667af	[NVPTX] remove unnecessary named metadata update that happens to break debug info. Also added test case to verify IR changes done by NVPTXGenericToNVVM pass. Differential Revision: https://reviews.llvm.org/D22837 llvm-svn: 277520	2016-08-02 20:58:24 +00:00
Tim Northover	1021d89398	AArch64: properly calculate cmpxchg status in FastISel. We were relying on the misleadingly-named $status result to actually be the status. Actually it's just a scratch register that may or may not be valid (and is the inverse of the real status anyway). Success can be determined by comparing the value loaded against the one we wanted to see for "cmpxchg strong" loops like this. Should fix PR28819. llvm-svn: 277513	2016-08-02 20:22:36 +00:00
Nicolai Haehnle	8a482b33fe	AMDGPU: Stay in WQM for non-intrinsic stores Summary: Two types of stores are possible in pixel shaders: stores to memory that are explicitly requested at the API level, and stores that are an implementation detail of register spilling or lowering of arrays. For the first kind of store, we must ensure that helper pixels have no effect and hence WQM must be disabled. The second kind of store must always be executed, because the written value may be loaded again in a way that is relevant for helper pixels as well -- and there are no externally visible effects anyway. This is a candidate for the 3.9 release branch. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D22675 llvm-svn: 277504	2016-08-02 19:31:14 +00:00
Nicolai Haehnle	bef0e90cf1	AMDGPU: Track physical registers in SIWholeQuadMode Summary: There are cases where uniform branch conditions are computed in VGPRs, and we didn't correctly mark those as WQM. The stray change in basic-branch.ll is because invoking the LiveIntervals analysis leads to the detection of a dead register that would otherwise not be seen at -O0. This is a candidate for the 3.9 branch, as it fixes a possible hang. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D22673 llvm-svn: 277500	2016-08-02 19:17:37 +00:00
Ahmed Bougacha	91bdeb1cc2	[AArch64][GlobalISel] Replace test REQUIRES with lit.local.cfg. NFC. I forgot the REQUIRES once (see r277486). Let's prevent it from happening again. llvm-svn: 277499	2016-08-02 19:04:29 +00:00
Ahmed Bougacha	8a31ed2432	[AArch64] Remove useless 'import re' from CodeGen lit.local.cfg. NFC. llvm-svn: 277498	2016-08-02 19:04:25 +00:00
Krzysztof Parzyszek	962932c2e2	[Hexagon] Prefer _io over _rr for 64-bit store with constant offset Identify patterns where the address is aligned to an 8-byte boundary, but the base address and the constant offset are both proper multiples of 4. In such cases, extract Base+4 into a separate instruction, and use S2_storerd_io, instead of using S4_storerd_rr. llvm-svn: 277497	2016-08-02 18:50:05 +00:00
Ahmed Bougacha	0d020190dd	[AArch64][GlobalISel] Add REQUIRES: global-isel to verifier tests. I thought the directory had a lit.local.cfg, but it doesn't. I'll add one, but for now, add the REQUIRES line. While there, move the triple into the IR and add a datalayout. llvm-svn: 277486	2016-08-02 17:19:35 +00:00
Ahmed Bougacha	bfaddd999a	[GlobalISel] Set the Selected MF property. None of GlobalISel requires the property, but this lets us use the verifier instead of rolling our own "all instructions selected" check. llvm-svn: 277484	2016-08-02 16:49:25 +00:00
Ahmed Bougacha	b14e944cdb	[GlobalISel] Verify Selected MF property. After instruction selection, there should be no pre-isel generic instructions remaining, nor should generic virtual registers be used. Verify that. llvm-svn: 277483	2016-08-02 16:49:22 +00:00
Ahmed Bougacha	b109d51865	[GlobalISel] Add Selected MachineFunction property. Selected: the InstructionSelect pass ran and all pre-isel generic instructions have been eliminated; i.e., all instructions are now target-specific or non-pre-isel generic instructions (e.g., COPY). Since only pre-isel generic instructions can have generic virtual register operands, this also means that all generic virtual registers have been constrained to virtual registers (assigned to register classes) and that all sizes attached to them have been eliminated. This lets us enforce certain invariants across passes. This property is GlobalISel-specific, but is always available. llvm-svn: 277482	2016-08-02 16:49:19 +00:00
Ahmed Bougacha	4628e37e7f	[GlobalISel] Set and require RegBankSelected MF property. The InstructionSelect pass assumes that RegBankSelect ran; set the property on all tests (thereby verifying the test inputs) and require it in the pass. llvm-svn: 277477	2016-08-02 16:17:18 +00:00
Ahmed Bougacha	3681c772cf	[GlobalISel] Verify RegBankSelected MF property. RegBankSelected functions shouldn't have any generic virtual register not assigned to a bank. Verify that. llvm-svn: 277476	2016-08-02 16:17:15 +00:00
Ahmed Bougacha	2471265508	[GlobalISel] Add RegBankSelected MachineFunction property. RegBankSelected: the RegBankSelect pass ran and all generic virtual registers have been assigned to a register bank. This lets us enforce certain invariants across passes. This property is GlobalISel-specific, but is always available. llvm-svn: 277475	2016-08-02 16:17:10 +00:00
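
Commit f679530ba1 above chooses between a hardware SQRT and an RSQRT estimate followed by Newton-Raphson refinement. As a rough illustration of what that refinement computes (not the code the patch emits), the sketch below takes a reciprocal-square-root estimate and applies the standard Newton-Raphson step. The function name `nr_sqrt` and its parameters are hypothetical, and the starting estimate is passed in by hand where hardware would supply it through an RSQRT instruction.

```python
import math


def nr_sqrt(x, estimate, steps=1):
    """Approximate sqrt(x) from an estimate of 1/sqrt(x).

    `estimate` stands in for the low-precision result of a hardware
    reciprocal-square-root instruction (an assumption of this sketch).
    """
    y = estimate
    for _ in range(steps):
        # Newton-Raphson step for f(y) = 1/y^2 - x.
        y = y * (1.5 - 0.5 * x * y * y)
    # sqrt(x) = x * (1/sqrt(x))
    return x * y


if __name__ == "__main__":
    for steps in (0, 1, 2):
        approx = nr_sqrt(2.0, 0.7, steps)
        print(steps, approx, abs(approx - math.sqrt(2.0)))
```

Each step costs several dependent multiplies, which is consistent with the commit's reasoning that the sequence can lose to a scalar hardware SQRT on latency.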
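
Commit 57dc4cf0e1 above (later reverted by 3fcf832cce) relies on Thumb2 being able to encode immediates built by splatting one byte across a 32-bit word. The sketch below classifies a value against the three byte-splat forms of the Thumb modified-immediate encoding (0x00XY00XY, 0xXY00XY00, 0xXYXYXYXY). It is an illustration of the patterns only, not the patch's implementation, and the function name `thumb2_splat_form` is hypothetical.

```python
def thumb2_splat_form(value):
    """Return the name of the byte-splat pattern a 32-bit value matches.

    Returns None when the value is not one of the three splat forms.
    Rotated 8-bit immediates, the other encodable family, are ignored here.
    """
    value &= 0xFFFFFFFF
    byte0 = value & 0xFF          # lowest byte
    byte1 = (value >> 8) & 0xFF   # second-lowest byte

    if byte0 != 0 and value == byte0 * 0x01010101:
        return "XYXYXYXY"
    if byte0 != 0 and value == byte0 * 0x00010001:
        return "00XY00XY"
    if byte1 != 0 and value == byte1 * 0x01000100:
        return "XY00XY00"
    return None


if __name__ == "__main__":
    for v in (0x00AB00AB, 0xAB00AB00, 0xABABABAB, 0x12345678):
        print(hex(v), thumb2_splat_form(v))
```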
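
Commit 82b1468a4d above keeps signaling NaNs as doubles while still shrinking other FP constants to floats. A minimal sketch of that decision, working on raw IEEE 754 bit patterns so that no SNaN ever passes through a floating-point register, might look like the following. The helper names are hypothetical, and the exact-round-trip test is an assumption of this sketch rather than the check LLVM performs.

```python
import math
import struct

EXP_MASK = 0x7FF0000000000000   # binary64 exponent field
FRAC_MASK = 0x000FFFFFFFFFFFFF  # binary64 fraction field
QUIET_BIT = 0x0008000000000000  # top fraction bit: set means quiet NaN


def is_signaling_nan(bits):
    """True if the 64-bit pattern encodes a signaling NaN."""
    is_nan = (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0
    return is_nan and (bits & QUIET_BIT) == 0


def can_shrink_to_float(bits):
    """Decide whether a double constant may be stored as a float."""
    if is_signaling_nan(bits):
        return False  # an extending load could quieten it or trap
    value = struct.unpack("<d", struct.pack("<Q", bits))[0]
    if math.isnan(value):
        # Quiet NaN: shrink only if the payload fits in 23 fraction bits.
        return (bits & 0x1FFFFFFF) == 0
    try:
        narrowed = struct.unpack("<f", struct.pack("<f", value))[0]
    except OverflowError:
        return False  # magnitude out of float range
    return narrowed == value  # require an exact round trip
```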


16872 Commits