clang-p2996

Author	SHA1	Message	Date
Mandeep Singh Grang	98bc25a0f2	[RISCV] Use init_array instead of ctors for RISCV target, by default Summary: LLVM defaults to the newer .init_array/.fini_array scheme for static constructors rather than the less desirable .ctors/.dtors (the UseCtors flag defaults to false). This wasn't being respected in the RISC-V backend because it fails to call TargetLoweringObjectFileELF::InitializeELF with the appropriate flag for UseInitArray. This patch fixes this by implementing RISCVELFTargetObjectFile and overriding its Initialize method to call InitializeELF(TM.Options.UseInitArray). Reviewers: asb, apazos Reviewed By: asb Subscribers: mgorny, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, llvm-commits Differential Revision: https://reviews.llvm.org/D44750 llvm-svn: 328433	2018-03-24 18:37:19 +00:00
Simon Pilgrim	913345f8f5	[X86][AES] Ensure we're testing both non-VEX/VEX variants of AES instructions on AVX targets Add skylake server tests as well llvm-svn: 328424	2018-03-24 15:05:12 +00:00
Simon Pilgrim	91fe24b8cf	[X86][SSE] Ensure we're testing both non-VEX/VEX variants of SSE instructions on AVX targets And ensure we don't use later instruction sets in SSE schedule tests llvm-svn: 328423	2018-03-24 14:51:52 +00:00
Simon Pilgrim	f7d0f7e6db	[X86][AVX1] Ensure we don't use later instruction sets in AVX1 schedule tests llvm-svn: 328421	2018-03-24 13:47:48 +00:00
Simon Pilgrim	d2016f95fb	[X86][AVX2] Ensure we don't use later instruction sets in AVX2 schedule tests llvm-svn: 328420	2018-03-24 13:47:01 +00:00
Craig Topper	2c0a62ab9a	[X86] Add a DAG combine to simplify PMULDQ/PMULUDQ nodes These nodes only use the lower 32 bits of their inputs so we can use SimplifyDemandedBits to simplify them. Differential Revision: https://reviews.llvm.org/D44375 llvm-svn: 328405	2018-03-24 01:52:01 +00:00
Reid Kleckner	e27b410661	[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32 Both GCC and MSVC only look at the low byte of a boolean when it is passed. llvm-svn: 328386	2018-03-23 23:38:53 +00:00
Krzysztof Parzyszek	bcf0a96f9e	[Hexagon] Boost profit for word-mask immediates, reduce for others This avoids unnecessary splitting due to uninteresting immediates. llvm-svn: 328364	2018-03-23 20:11:00 +00:00
Krzysztof Parzyszek	e247526cc9	[Hexagon] Fold offset in base+immediate loads/stores Optimize Ry = add(Rx,#n); memw(Ry+#0) = Rz => memw(Rx,#n) = Rz. Patch by Jyotsna Verma. llvm-svn: 328355	2018-03-23 19:30:34 +00:00
Tony Tye	88441a3d1e	[AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue. Differential Revision: https://reviews.llvm.org/D44697 llvm-svn: 328351	2018-03-23 18:58:47 +00:00
Tony Tye	7a893d4e34	[AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU - Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target. - Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS. Differential Revision: https://reviews.llvm.org/D43736 llvm-svn: 328349	2018-03-23 18:45:18 +00:00
Krzysztof Parzyszek	5f7ba9a74c	[Hexagon] Always generate mux out of predicated transfers if possible HexagonGenMux would collapse pairs of predicated transfers if it assumed that the predicated .new forms cannot be created. Turns out that generating mux is preferable in almost all cases. Introduce an option -hexagon-gen-mux-threshold that controls the minimum distance between the instruction defining the predicate and the later of the two transfers. If the distance is closer than the threshold, mux will not be generated. Set the threshold to 0 by default. llvm-svn: 328346	2018-03-23 18:43:09 +00:00
Krzysztof Parzyszek	80f10e4fe5	[Hexagon] Avoid early if-conversion for one sided branches Patch by Anand Kodnani. llvm-svn: 328344	2018-03-23 18:00:18 +00:00
Ana Pazos	41573804f2	[ARM] Fix "Constant pool entry out of range!" in Thumb1 mode This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode. In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode, adjustBBOffsetsAfter() is not calculating postOffset correctly by properly accounting for the padding that is required for the constant pool that immediately follows the jump table branch instruction. Reviewers: t.p.northover, eli.friedman Reviewed By: t.p.northover Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44709 llvm-svn: 328341	2018-03-23 17:53:27 +00:00
Krzysztof Parzyszek	570c6440cd	[Hexagon] Two fixes in early if-conversion - Fix checking for vector predicate registers. - Avoid speculating llvm.lifetime.end intrinsic. Patch by Harsha Jagasia and Brendon Cahoon. llvm-svn: 328339	2018-03-23 17:46:09 +00:00
Simon Pilgrim	e5c0a041ff	[X86][Btver2] Cleanup MOVMSK instructions to use JFPA function unit Add missing non-VEX and (V)PMOVMSKB instructions to the pattern llvm-svn: 328338	2018-03-23 17:38:59 +00:00
Zaara Syeda	6535993625	Re-commit: [MachineLICM] Add functions to MachineLICM to hoist invariant stores This patch adds functions to allow MachineLICM to hoist invariant stores. Currently, MachineLICM does not hoist any store instructions, however when storing the same value to a constant spot on the stack, the store instruction should be considered invariant and be hoisted. The function isInvariantStore iterates each operand of the store instruction and checks that each register operand satisfies isCallerPreservedPhysReg. The store may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore. This patch also adds the PowerPC changes needed to consider the stack register as caller preserved. Differential Revision: https://reviews.llvm.org/D40196 llvm-svn: 328326	2018-03-23 15:28:15 +00:00
John Brawn	e3b44f9de6	[AArch64] Don't reduce the width of loads if it prevents combining a shift Loads and stores can only shift the offset register by the size of the value being loaded, but currently the DAGCombiner will reduce the width of the load if it's followed by a trunc making it impossible to later combine the shift. Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and make it prevent the width reduction if this is what would happen, though do allow it if reducing the load width will let us eliminate a later sign or zero extend. Differential Revision: https://reviews.llvm.org/D44794 llvm-svn: 328321	2018-03-23 14:47:07 +00:00
Simon Pilgrim	8619962c73	[X86][Btver2] Cleanup SSE42 PCMPISTR/PCMPESTR string instructions to correctly use JFPU1 scheduler pipe followed by JLAGU/JSAGU/JFPA/JVALU function units Fixes throughput to match Agner/Fam16h-SoG as well. llvm-svn: 328318	2018-03-23 14:27:26 +00:00
Christof Douma	4a025cc79d	[ARM] Support float literals under XO When targeting execute-only and fp-armv8, float constants in a compare resulted in instruction selection failures. This is now fixed by using vmov.f32 where possible, otherwise the floating point constant is lowered into an integer constant that is moved into a floating point register. This patch also restores using fpcmp with immediate 0 under fp-armv8. Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443 llvm-svn: 328313	2018-03-23 13:02:03 +00:00
Amara Emerson	f542355942	[GlobalISel] Fix legalizer combine to not use illegal input G_EXTRACT. This was being masked because GISel is enabled by default for -O0 and the abort was disabled. Modified test to explicitly enable abort. llvm-svn: 328311	2018-03-23 12:48:57 +00:00
Simon Pilgrim	2755893834	[X86][SandyBridge] Fix missing comma that was causing string concatenation of 2 instregex entries Found while updating D44687 llvm-svn: 328308	2018-03-23 11:56:38 +00:00
Martin Storsjo	db75aa96d3	Revert "[DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst))" This reverts commit r328252. This change broke building a number of projects when targeting ARM and AArch64, see PR36873. llvm-svn: 328297	2018-03-23 08:36:47 +00:00
Craig Topper	4787b7f434	[X86] Correct the latencies of SNB integer vector multiplies based on Agner's data. Add missing MMX multiplies. llvm-svn: 328295	2018-03-23 06:41:43 +00:00
Craig Topper	7580a7997d	[X86] Change VPSADBW itinerary to SSE_INTALU_ITINS_P to match the SSE version. llvm-svn: 328293	2018-03-23 06:41:40 +00:00
Craig Topper	7f142b8bf1	[X86] Merge VMOVMSKBrr and MOVMSKBrr in the SNB scheduler model. The VMOVMSKBrr was in a separate InstRW with a lower latency, but I assume they should be the same and the higher latency matches Agner's table so I'm going with that. llvm-svn: 328291	2018-03-23 06:41:38 +00:00
Craig Topper	fae4173b47	[X86] Add VEXTRB/W/D/Q to Zen scheduler model. The SSE versions were present, but not the VEX version. llvm-svn: 328290	2018-03-23 06:41:36 +00:00
Michael Zolotukhin	fab7a676c2	State that CFG is preserved in 'Falkor HW Prefetch Fix Late Phase'. That removes some redundant recomputations from the passes pipeline. llvm-svn: 328272	2018-03-22 23:44:40 +00:00
Michael Zolotukhin	3520331f93	Reapply "[test] Add tests for llc passes pipelines." with a fix for bots with expensive checks on. llvm-svn: 328267	2018-03-22 23:02:48 +00:00
Craig Topper	adb173314d	[X86] Correct the VROUND regular expressions in Znver1 scheduler model to account for r328254 llvm-svn: 328260	2018-03-22 22:17:11 +00:00
Craig Topper	40d3b32e12	[X86] Rename VROUNDYPS* and VROUNDYPD* instructions to VROUNDPSY* and VROUNDPDY*. Fix itinerary mistake on all memory forms of VROUNDPD This makes the Y position consistent with other instructions. This should have been NFC, but while refactoring the multiclass I noticed that VROUNDPD memory forms were using the register itinerary. llvm-svn: 328254	2018-03-22 21:55:20 +00:00
Guozhi Wei	17ff975eb1	[DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst)) In our real world application, we found the following optimization is missed in DAGCombiner (zext (and/or/xor (shl/shr (load x), cst), cst)) -> (and/or/xor (shl/shr (zextload x), (zext cst)), (zext cst)) If the user of original zext is an add, it may enable further lea optimization on x86. This patch adds a new function CombineZExtLogicopShiftLoad to do this optimization. Differential Revision: https://reviews.llvm.org/D44402 llvm-svn: 328252	2018-03-22 21:47:25 +00:00
Craig Topper	58afb4ea58	[X86][SkylakeClient] Fix a bunch of instructions that were incorrectly assigned Port015 instead of Port01. The VEC ADD and VEC MUL units aren't present on port 5 on SkylakeClient. llvm-svn: 328241	2018-03-22 21:10:07 +00:00
Jun Bum Lim	2ecb7ba4c6	[CodeGen] Add a new pass for PostRA sink Summary: This pass sinks COPY instructions into a successor block, if the COPY is not used in the current block and the COPY is live-in to a single successor (i.e., doesn't require the COPY to be duplicated). This avoids executing the copies on paths where their results aren't needed. This also exposes additional opportunities for dead copy elimination and shrink wrapping. These copies were either not handled by or are inserted after the MachineSink pass. As an example of the former case, the MachineSink pass cannot sink COPY instructions with allocatable source registers; for AArch64 these types of copy instructions are frequently used to move function parameters (PhyReg) into virtual registers in the entry block. For the machine IR below, this pass will sink %w19 in the entry into its successor (%bb.1) because %w19 is only live-in in %bb.1. ``` %bb.0: %wzr = SUBSWri %w1, 1 %w19 = COPY %w0 Bcc 11, %bb.2 %bb.1: Live Ins: %w19 BL @fun %w0 = ADDWrr %w0, %w19 RET %w0 %bb.2: %w0 = COPY %wzr RET %w0 ``` As we sink %w19 (CSR in AArch64) into %bb.1, the shrink-wrapping pass will be able to see %bb.0 as a candidate. With this change I observed 12% more shrink-wrapping candidates and 13% more dead copies deleted in spec2000/2006/2017 on AArch64. Reviewers: qcolombet, MatzeB, thegameg, mcrosier, gberry, hfinkel, john.brawn, twoh, RKSimon, sebpop, kparzysz Reviewed By: sebpop Subscribers: evandro, sebpop, sfertile, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41463 llvm-svn: 328237	2018-03-22 20:06:47 +00:00
Nirav Dave	8c5f47ac40	[DAG, X86] Fix ISel-time node insertion ids As in SystemZ backend, correctly propagate node ids when inserting new unselected nodes into the DAG during instruction selection for X86 target. Fixes PR36865. Reviewers: jyknight, craig.topper Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D44797 llvm-svn: 328233	2018-03-22 19:32:07 +00:00
Craig Topper	4a3be6e578	[X86] Correct the scheduling data for some of the 32 and 64 bit multiplies to as best as I understand how they are implemented. llvm-svn: 328231	2018-03-22 19:22:51 +00:00
Aditya Nandakumar	b3297ef051	[GISel]: Fix incorrect IRTranslation while translating null pointer types https://reviews.llvm.org/D44762 Currently IRTranslator produces %vreg17<def>(p0) = G_CONSTANT 0; instead we should build %vreg16(s64) = G_CONSTANT 0 %vreg17(p0) = G_INTTOPTR %vreg16 reviewed by @aemerson. llvm-svn: 328218	2018-03-22 17:31:38 +00:00
Jonas Devlieghere	7e69dd02bb	Revert "[test] Add tests for llc passes pipelines." This reverts r328159 because the two AArch64 tests fail on GreenDragon: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/11030/ llvm-svn: 328188	2018-03-22 10:34:06 +00:00
Michael Zolotukhin	7e6fa1d6ae	[test] Add tests for llc passes pipelines. This is basically an extension of existing test test/CodeGen/X86/O0-pipeline.ll introduced in r302608. llvm-svn: 328159	2018-03-21 22:17:13 +00:00
Artem Belevich	30512869ff	[NVPTX] Make tensor shape part of WMMA intrinsic's name. This is needed for the upcoming implementation of the new 8x32x16 and 32x8x16 variants of WMMA instructions introduced in CUDA 9.1. Differential Revision: https://reviews.llvm.org/D44719 llvm-svn: 328158	2018-03-21 21:55:02 +00:00
Lei Huang	efd6f1c8e2	[POWER9][NFC] update testcase check statements llvm-svn: 328147	2018-03-21 20:59:45 +00:00
Sanjay Patel	e235942a1e	[InstSimplify] fp_binop X, NaN --> NaN We propagate the existing NaN value when possible. Differential Revision: https://reviews.llvm.org/D44521 llvm-svn: 328140	2018-03-21 19:31:53 +00:00
Krzysztof Parzyszek	c715a5d2b8	[Hexagon] Eliminate subregisters from PHI nodes before pipelining The pipeliner needs to remove instructions from the SlotIndexes structure when they are deleted. Otherwise, the SlotIndexes map has stale data, and an assert will occur when adding new instructions. This patch also changes the pipeliner to make the back-edge of a loop carried dependence 1 cycle. The 1 cycle latency is added to the anti-dependence that represents the back-edge. This change eliminates a couple of hacks added to the pipeliner to handle the latency of the back-edge. It is needed to correctly pipeline the test case for the sub-register elimination pass. llvm-svn: 328113	2018-03-21 16:39:11 +00:00
Alex Bradbury	65d6ea5e68	[RISCV] Codegen support for RV32F floating point comparison operations This patch also includes extensive tests targeted at select and br+fcmp IR inputs. A sequence of br+fcmp required support for FPR32 registers to be added to RISCVInstrInfo::storeRegToStackSlot and RISCVInstrInfo::loadRegFromStackSlot. llvm-svn: 328104	2018-03-21 15:11:02 +00:00
Alex Bradbury	77d5927a1c	[RISCV] Add tests missed from r327979 llvm-svn: 328102	2018-03-21 14:50:27 +00:00
Craig Topper	137a4dd84d	[X86] Fix the SchedRW for XOP vpcom register form instructions to not be marked as loads. llvm-svn: 328071	2018-03-21 03:41:33 +00:00
Craig Topper	d25f1acf67	[X86] Change PMULLD to 10 cycles on Skylake per Agner's tables and llvm-exegesis. Also restrict to port 0 and 1 for SkylakeClient. It looks like the scheduler models don't account for client not having a full vector ALU on port 5 like server. Fixes PR36808. llvm-svn: 328061	2018-03-20 23:39:48 +00:00
Derek Schuff	39b5367cba	[WebAssembly] Strip threadlocal attribute from globals in single thread mode The default thread model for wasm is single, and in this mode thread-local global variables can be lowered identically to non-thread-local variables. Differential Revision: https://reviews.llvm.org/D44703 llvm-svn: 328049	2018-03-20 22:01:32 +00:00
Martin Storsjo	07589fc496	[X86] Don't use the MSVC stack protector names on mingw Mingw uses the same stack protector functions as GCC provides on other platforms as well. Patch by Valentin Churavy! Differential Revision: https://reviews.llvm.org/D27296 llvm-svn: 328039	2018-03-20 20:37:51 +00:00
Abderrazek Zaafrani	4c60c222e4	[AArch64] Add vmulxh_lane fp16 vector intrinsic https://reviews.llvm.org/D44591 llvm-svn: 328035	2018-03-20 20:25:40 +00:00
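
The InstSimplify entry above (e235942a1e, "fp_binop X, NaN --> NaN") relies on the IEEE-754 rule that an arithmetic operation with a NaN operand yields NaN. The following is a minimal sketch of that rule in Python, not LLVM's implementation: the names `BINOPS` and `fold_with_nan` are hypothetical, the mapping from LLVM opcode names to Python operators is an assumption made for illustration, and the sketch only checks that the result is a NaN. It does not model which NaN value or payload LLVM propagates.

```python
import math
import operator

# Hypothetical table for illustration only; it is not part of LLVM.
# Each LLVM-style opcode name is paired with a Python float operation
# that is assumed to behave the same way for NaN operands.
BINOPS = {
    "fadd": operator.add,
    "fsub": operator.sub,
    "fmul": operator.mul,
    "fdiv": operator.truediv,
    "frem": math.fmod,
}


def fold_with_nan(x):
    """Evaluate op(x, NaN) and op(NaN, x) for every op in BINOPS.

    x should be a finite, non-zero float: Python raises on division
    by zero instead of following IEEE-754, so zero is avoided here.
    """
    nan = float("nan")
    results = {}
    for name, op in BINOPS.items():
        results[name] = (op(x, nan), op(nan, x))
    return results


if __name__ == "__main__":
    for name, (lhs, rhs) in fold_with_nan(1.5).items():
        # Both operand orders are expected to produce a NaN.
        print(name, math.isnan(lhs), math.isnan(rhs))
```

Because every result is a NaN regardless of the other operand, a simplifier can replace the whole operation with the NaN operand without evaluating it, which is the fold the commit message describes.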


23931 Commits