clang-p2996

Author	SHA1	Message	Date
Simon Pilgrim	350c22c587	[X86][SNB] Fix scheduling of MMX integer multiply instructions. The entries were being bound to the wrong class. llvm-svn: 331388	2018-05-02 19:26:14 +00:00
Simon Pilgrim	6732f6ea51	[X86] Split WriteShuffle/WriteVarShuffle + WriteBlend/WriteVarBlend into XMM and YMM/ZMM scheduler classes llvm-svn: 331386	2018-05-02 18:48:23 +00:00
Farhana Aleen	07e612340f	[AMDGPU] A trivial fix for a buildbot failure caused by "commit 224a839fcbbead221f872cd32a1dd0c308d37299". Author: FarhanaAleen llvm-svn: 331383	2018-05-02 18:16:39 +00:00
Simon Pilgrim	819f218f07	[X86] Cleanup WriteFShuffle/WriteFVarShuffle (+256 variants) scheduler classes with more common default values llvm-svn: 331380	2018-05-02 17:58:50 +00:00
Farhana Aleen	150cb6d91a	Revert "[AMDGPU] performAddCombine should run after DAG is legalized." This reverts commit 6b97d2995566b4dddd6bf0d75579ff44501d4494. llvm-svn: 331371	2018-05-02 16:48:52 +00:00
Farhana Aleen	2f4100f56e	[AMDGPU] performAddCombine should run after DAG is legalized. Summary: performAddCombine should run after DAG is legalized; Otherwise generic optimization in the DAGCombiner can optimize an addcarry+trunc into an addcarry instruction with illegal types. Author: FarhanaAleen Reviewed By: rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D46337 llvm-svn: 331368	2018-05-02 16:24:10 +00:00
Simon Pilgrim	e93fd5f1e4	[X86] Cleanup WriteFAdd/WriteFCmp scheduler classes with more common default values Intel models were targeting x87 instead of packed sse. Also fixes XOP's VFRCZ to use WriteFAdd/WriteFAddY. llvm-svn: 331340	2018-05-02 09:18:49 +00:00
Farhana Aleen	e2dfe8a853	[AMDGPU] Support horizontal vectorization. Author: FarhanaAleen Reviewed By: rampitec, arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D46213 llvm-svn: 331313	2018-05-01 21:41:12 +00:00
Eli Friedman	763d161eda	[AArch64] Add more tests for 64-bit immediate lowering. This adds some more tests, and adds some notes to tests which are using a suboptimal lowering. The constants with suboptimal lowerings seem to be relatively rare in practice, but it might be a fun project to work on improvements. llvm-svn: 331304	2018-05-01 20:00:14 +00:00
Vedant Kumar	e23173b677	[DAGCombiner] Fix SDLoc in a (zext (zextload x)) combine (4/N) The logic for this combine is almost identical to the logic for a (sext (sextload x)) combine. This commit factors out the logic so it can be shared by both combines, and corrects the SDLoc assigned in the zext version of the combine. Prior to this patch, for the given test case, we would apply the location associated with the udiv instruction to instructions which perform the load. Part of: llvm.org/PR37262 llvm-svn: 331303	2018-05-01 19:51:15 +00:00
Vedant Kumar	d7117ed0f9	[DAGCombiner] Fix SDLoc in a (sext (sextload x)) combine (3/N) Prior to this patch, for the given test case, we would apply the location associated with the sdiv instruction to instructions which perform the load. Part of: llvm.org/PR37262. Differential Revision: https://reviews.llvm.org/D46222 llvm-svn: 331302	2018-05-01 19:51:15 +00:00
Vedant Kumar	cc7b2a55c2	[DAGCombiner] Change the SDLoc on split extloads (2/N) In DAGCombiner, we try to simplify this pattern: ([s\|z]ext (load ...)) Conceptually, a new extload which is created while splitting the load should have the same debug location as the load. Making this change affects the IROrder of the new load, causing some test case churn. In practice, the new location is never different from the location of the [s\|z]ext, at least not during check-llvm or a stage2 build. Part of: llvm.org/PR37262 Differential Revision: https://reviews.llvm.org/D46156 llvm-svn: 331301	2018-05-01 19:29:15 +00:00
Vedant Kumar	ee4bfcaa5a	[DAGCombiner] Set the right SDLoc on a newly-created zextload (1/N) Setting the right SDLoc on a newly-created zextload fixes a line table bug which resulted in non-linear stepping behavior. Several backend tests contained CHECK lines which relied on the IROrder inherited from the wrong SDLoc. This patch breaks that dependence where feasible and regenerates test cases where not. In some cases, changing a node's IROrder may alter register allocation and spill behavior. This can affect performance. I have chosen not to prevent this by applying a "known good" IROrder to SDLocs, as this may hide a more general bug in the scheduler, or cause regressions on other test inputs. rdar://33755881, Part of: llvm.org/PR37262 Differential Revision: https://reviews.llvm.org/D45995 llvm-svn: 331300	2018-05-01 19:26:15 +00:00
Konstantin Zhuravlyov	1501af4846	AMDGPU: Remove remnants of gfx901 (it was deprecated some time ago) llvm-svn: 331298	2018-05-01 18:47:48 +00:00
Simon Pilgrim	21caf0124f	[X86] Split WriteFMul/WriteFDiv into XMM and YMM/ZMM scheduler classes llvm-svn: 331293	2018-05-01 18:22:53 +00:00
Simon Pilgrim	5269167f5b	[X86] Split WriteFAdd into XMM and YMM/ZMM scheduler classes Removes more WriteFAdd InstRW overrides llvm-svn: 331276	2018-05-01 16:13:42 +00:00
Sanjay Patel	5727011fd5	[DAG] add test to show FMF mismatch between IR and DAG; NFC D45710 proposes to change this, but we have no test coverage for the first step in this process. llvm-svn: 331271	2018-05-01 15:43:36 +00:00
Simon Pilgrim	dd8eae128b	[X86] Split WriteFShuffle into XMM and YMM/ZMM scheduler classes Removes more WriteFShuffle InstRW overrides llvm-svn: 331264	2018-05-01 14:25:01 +00:00
Simon Dardis	3d562fb975	Reland r331175: "[mips] Fix the predicates of jump and branch and link instructions" The previous version of this patch restricted the 'jal' instruction to MIPS and microMIPSr3. microMIPS32r6 does not have this instruction and instead uses jal as an alias for balc. Original commit message: > Reviewers: smaksimovic, atanasyan, abeserminji > > Differential Revision: https://reviews.llvm.org/D46114 > llvm-svn: 331259	2018-05-01 13:06:49 +00:00
Simon Pilgrim	57f2b185ac	[X86] Split WriteVecLogic into XMM and YMM/ZMM scheduler classes This removes all the WriteVecLogic InstRW overrides. llvm-svn: 331258	2018-05-01 12:39:17 +00:00
Andrea Di Biagio	d4c58400c5	[X86] Correct spill slot size. This patch fixes a bug introduced by revision 330778 (originally reviewed at: https://reviews.llvm.org/D44782), where function isFrameLoadOpcode returned the wrong number of bytes read for opcodes VMOVSSrm and VMOVSDrm. This corrects that mistake, and extends the regression test to catch cases where the dead stores should be removed. Patch by Jeremy Morse. Differential Revision: https://reviews.llvm.org/D46256 llvm-svn: 331252	2018-05-01 10:29:38 +00:00
Gabor Buella	c8ded04e85	[X86] movdiri and movdir64b instructions Reviewers: spatel, craig.topper, RKSimon Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D45983 llvm-svn: 331248	2018-05-01 10:01:16 +00:00
Krzysztof Parzyszek	1cf329c933	[LivePhysRegs] Remove registers clobbered by regmasks from the live set Dead defs were being removed from the live set (in stepForward), but registers clobbered by regmasks weren't (more specifically, they were actually removed by removeRegsInMask, but then they were added back in). llvm-svn: 331219	2018-04-30 19:38:47 +00:00
Matt Arsenault	0084adc516	AMDGPU: Add Vega12 and Vega20 Changes by Matt Arsenault Konstantin Zhuravlyov llvm-svn: 331215	2018-04-30 19:08:16 +00:00
Roman Tereshin	46f838f370	[MIR] Reset unique MBB numbering in MachineFunction::reset() No need to waste space nor number MBBs differently if MF gets recreated. Reviewers: qcolombet, stoklund, t.p.northover, bogner, javed.absar Reviewed By: qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46078 llvm-svn: 331213	2018-04-30 18:58:57 +00:00
Sanjay Patel	1babf5ff32	[DAGCombiner] rename function attribute for disabling ftrunc transform This is the matching name change for the Clang patch at: D46236 rL331209 Differential Revision: https://reviews.llvm.org/D46237 llvm-svn: 331210	2018-04-30 18:20:33 +00:00
Ulrich Weigand	c3ec80fea1	[SystemZ] Handle SADDO et.al. and ADD/SUBCARRY This provides an optimized implementation of SADDO/SSUBO/UADDO/USUBO as well as ADDCARRY/SUBCARRY on top of the new CC implementation. In particular, multi-word arithmetic now uses UADDO/ADDCARRY instead of the old ADDC/ADDE logic, which means we no longer need to use "glue" links for those instructions. This also allows making full use of the memory-based instructions like ALSI, which couldn't be recognized due to limitations in the DAG matcher previously. Also, the llvm.sadd.with.overflow et.al. intrinsics now expand to directly using the ADD instructions and checking for a CC 3 result. llvm-svn: 331203	2018-04-30 17:54:28 +00:00
Ulrich Weigand	b32f3656d2	[SystemZ] Do not use glue to represent condition code dependencies Currently, an instruction setting the condition code is linked to the instruction using the condition code via a "glue" link in the SelectionDAG. This has a number of drawbacks; in particular, it means the same CC cannot be used by multiple users. It also makes it more difficult to efficiently implement SADDO et. al. This patch changes the back-end to represent CC dependencies as normal values during SelectionDAG matching, along the lines of how this is handled in the X86 back-end already. In addition to the core mechanics of updating all relevant patterns, this requires a number of additional changes: - We now need to be able to spill/restore a CC value into a GPR if necessary. This means providing a copyPhysReg implementation for moves involving CC, and defining getCrossCopyRegClass. - Since we still prefer to avoid such spills, we provide an override for IsProfitableToFold to avoid creating a merged LOAD / ICMP if this would result in multiple users of the CC. - combineCCMask no longer requires a single CC user, and no longer needs to be careful about preventing invalid glue/chain cycles. - emitSelect needs to be more careful in marking CC live-in to the basic block it generates. Also, we can now optimize the case of multiple subsequent selects with the same condition just like X86 does. llvm-svn: 331202	2018-04-30 17:52:32 +00:00
Daniel Sanders	2de9d4ad5d	Fix infinite loop after r331115 There are two separate fixes here: * The lowering code for non-extending loads should report UnableToLegalize instead of emitting the same instruction. * The target should not be requesting lowering of non-extending loads. llvm-svn: 331201	2018-04-30 17:20:01 +00:00
Ulrich Weigand	fb56686cd3	[SystemZ] Improve handling of Select pseudo-instructions If we have LOCR instructions, select them directly from SelectionDAG instead of first going through a pseudo instruction and then using the custom inserter to emit the LOCR. Provide Select pseudo-instructions for VR32/VR64 if we have vector instructions, to avoid having to go through the first 16 FPRs unnecessarily. If we do not have LOCFHR, prefer using LOCR followed by a move over a conditional branch. llvm-svn: 331191	2018-04-30 15:49:27 +00:00
Simon Dardis	5a512d63c9	Revert "[mips] Fix the predicates of jump and branch and link instructions" That commit broke one of the LLD builders, reverting while I investigate. This patch reverts r331175. llvm-svn: 331178	2018-04-30 14:03:35 +00:00
Simon Dardis	cc95a9c557	[mips] Fix the predicates of jump and branch and link instructions Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D46114 llvm-svn: 331175	2018-04-30 13:37:42 +00:00
Simon Dardis	57c2095d1b	[mips] Fix microMIPS loads and stores. Previously these instructions were unselectable and instead were generated through the instruction mapping tables. Reviewers: atanasyan, smaksimovic, abeserminji Differential Revision: https://reviews.llvm.org/D46055 llvm-svn: 331165	2018-04-30 09:44:44 +00:00
Daniel Sanders	5eb9f581b6	[globalisel][legalizerinfo] Introduce dedicated extending loads and add lowerings for them Summary: Previously, an extending load was represented as (G_EXT (G_LOAD x)). This had a few drawbacks: G_LOAD had to be legal for all sizes you could extend from, even if registers didn't naturally hold those sizes. * All sizes you could extend from had to be allocatable just in case the extend went missing (e.g. by optimization). * At minimum, G_EXT and G_TRUNC had to be legal for these sizes. As we improve optimization of extends and truncates, this legality requirement would spread without considerable care w.r.t when certain combines were permitted. The SelectionDAG importer required some ugly and fragile pattern rewriting to translate patterns into this style. This patch begins changing the representation to: * (G_[SZ]EXTLOAD x) * (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits() which resolves these issues by allowing targets to work entirely in their native register sizes, and by having a more direct translation from SelectionDAG patterns. This patch introduces the new generic instructions and new variation on G_LOAD and adds lowering for them to convert back to the existing representations. Depends on D45466 Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, aemerson, javed.absar Reviewed By: aemerson Subscribers: aemerson, kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D45540 llvm-svn: 331115	2018-04-28 18:14:50 +00:00
Jessica Paquette	0b6724917a	[MachineOutliner] Add defs to calls + don't track liveness on outlined functions This commit makes it so that if you outline a def of some register, then the call instruction created by the outliner actually reflects that the register is defined by the call. It also makes it so that outlined functions don't have the TracksLiveness property. Outlined calls shouldn't break liveness assumptions that someone might make. This also un-XFAILs the noredzone test, and updates the calls test. llvm-svn: 331095	2018-04-27 23:36:35 +00:00
Heejin Ahn	d20d0648ed	[DAGCombiner] Fix a case of 1 in non-splat vector pow2 divisor Summary: D42479 (rL329525) enabled SDIV combine for pow2 non-splat vector dividers. But when there is a 1 in a vector, the instruction sequence to be generated involves shifting a value by the number of its bit widths, which is undefined (`c64f4dbfe3/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (L6000-L6006)`). Especially, in architectures that do not support vector instructions, each element in a vector will be computed separately using scalar operations, and then the resulting value will be undef for '1' values in a vector. (All 1's vector is fine; only vectors mixed with 1 and others will be affected.) Reviewers: RKSimon, jgravelle-google Subscribers: jfb, dschuff, sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D46161 llvm-svn: 331092	2018-04-27 22:23:11 +00:00
Craig Topper	d656410293	[X86] Make the STTNI flag intrinsics use the flags from pcmpestrm/pcmpistrm if the mask intrinsics are also used in the same basic block. Summary: Previously the flag intrinsics always used the index instructions even if a mask instruction also exists. To fix this I've created a single ISD node type that returns index, mask, and flags. The SelectionDAG CSE process will merge all flavors of intrinsics with the same inputs to a single node. Then during isel we just have to look at which results are used to know what instruction to generate. If both mask and index are used we'll need to emit two instructions. But for all other cases we can emit a single instruction. Since I had to do manual isel anyway, I've removed the pseudo instructions and custom inserter code that was working around tablegen limitations with multiple implicit defs. I've also renamed the recently added sse42.ll test case to sttni.ll since it focuses on that subset of the sse4.2 instructions. Reviewers: chandlerc, RKSimon, spatel Reviewed By: chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46202 llvm-svn: 331091	2018-04-27 22:15:33 +00:00
Jun Bum Lim	9e3e14b5f9	[PostRASink] extend the live-in check for all aliased registers Extend the live-in check for all aliased registers so that we can allow sinking Copy instructions when only implicit def is in successor's live-in. llvm-svn: 331072	2018-04-27 19:59:20 +00:00
Daniel Sanders	27fe8a5011	[globalisel][legalizerinfo] Add support for legalization based on the MachineMemOperand Summary: Currently only the memory size is supported but others can be added as needed. narrowScalar for G_LOAD and G_STORE now correctly update the MachineMemOperand and will refuse to legalize atomics since those need more careful expansions to maintain atomicity. Reviewers: ab, aditya_nandakumar, bogner, rtereshin, aemerson, javed.absar Reviewed By: aemerson Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D45466 llvm-svn: 331071	2018-04-27 19:48:53 +00:00
Mark Searles	a6322924e6	[AMDGPU][Waitcnt] Update a few tests to use default waitcnt pass (si-insert-waitcnts) rather than old pass (si-insert-waits); this is a small step towards the overall goal of removing the old waitcnt pass, which is no longer maintained. Differential Revision: https://reviews.llvm.org/D46154 llvm-svn: 331062	2018-04-27 17:59:15 +00:00
Simon Pilgrim	b2aa89c909	[X86][AVX] Split WriteFLogic into XMM and YMM/ZMM scheduler classes This removes all the AND/ANDN/OR/XOR PS/PD InstRW overrides. llvm-svn: 331051	2018-04-27 15:50:33 +00:00
Simon Dardis	e3c3c5a7a7	[mips] Analyze and provide selection patterns microMIPSR6 branches These branches were previously unanalyzable and unselectable. Add them and recognize how to generate their inverses. Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D46113 llvm-svn: 331050	2018-04-27 15:49:49 +00:00
Francis Visoiu Mistrih	c855e92ca9	[AArch64] Place the first ldp at the end when ReverseCSRRestoreSeq is true Put the first ldp at the end, so that the load-store optimizer can run and merge the ldp and the add into a post-index ldp. This didn't work in case no frame was needed and resulted in code size regressions. llvm-svn: 331044	2018-04-27 15:30:54 +00:00
Oliver Stannard	76088a5929	[AArch64] Codegen for v8.2A dot product intrinsics This adds IR intrinsics for the AArch64 dot-product instructions introduced in v8.2-A. Differential revision: https://reviews.llvm.org/D46107 llvm-svn: 331036	2018-04-27 13:45:32 +00:00
Aleksandar Beserminji	3546c1603a	[mips] Fix how the compiler fuses instructions into fmadd/fmsub This patch makes the compiler not fuse fmul and fadd/fsub into fmadd/fmsub by default. Instead, the -fp-contract=fast option can be used when such behavior is desired. Differential Revision: https://reviews.llvm.org/D46057 llvm-svn: 331033	2018-04-27 13:30:27 +00:00
Oliver Stannard	f3632143da	[ARM] Codegen for v8.2A dot product intrinsics This adds IR intrinsics for the ARM dot-product instructions introduced in v8.2-A. Differential revision: https://reviews.llvm.org/D46106 llvm-svn: 331032	2018-04-27 12:50:40 +00:00
Alex Bradbury	f5800a2aa0	[RISCV] Add remat.ll test case This test case demonstrates suboptimal codegen due to the fact that simple constants aren't recognised as rematerialisable. llvm-svn: 331028	2018-04-27 11:50:30 +00:00
David Green	c4cccea4c9	[ARM] Enable misched for R52. Back when the R52 schedule was added in rL286949, there was no way to enable machine schedules in ARM for specific cores. Since then a target feature has been added. This enables the feature for R52, removing the need to manually specify compiler flags. llvm-svn: 331027	2018-04-27 11:29:49 +00:00
Eli Friedman	da018e5687	[MachineOutliner] Don't outline from functions with a section marking. The program might have unusual expectations for functions; for example, the Linux kernel's build system warns if it finds references from .text to .init.data. I'm not sure this is something we actually want to make any guarantees about (there isn't any explicit rule that would disallow outlining in this case), but we might want to be conservative anyway. Differential Revision: https://reviews.llvm.org/D46091 llvm-svn: 331007	2018-04-27 00:21:34 +00:00
Chandler Carruth	16429acacb	[x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case. llvm-svn: 330997	2018-04-26 21:46:01 +00:00

24332 Commits