clang-p2996

Author	SHA1	Message	Date
Matt Arsenault	212570abcf	GlobalISel: Implement bitcast action for G_EXTRACT_VECTOR_ELEMENT For AMDGPU, vectors with elements < 32 bits should be indexed in 32-bit elements and the desired bits extracted from there. For elements > 64-bits, these should be reduced to 64/32 elements to enable the normal dynamic indexing paths. In the dynamic index cases, this produces shorter code most of the time. This does immediately regress the constant index cases, but this should be fixed once we have the most basic of shift combines. The element size > 64 case is pretty much ported from the existing DAG implementation for extract element promote. The increasing element size case is new.	2020-08-02 10:42:07 -04:00
Simon Pilgrim	b8ffbf0e02	[DAG] TargetLowering::expandMUL_LOHI - pass SDLoc as const& Try to be more consistent with the SDLoc param in the TargetLowering methods. This also exposes an issue where we were passing a SDNode as a SDLoc, relying on the implicit SDLoc(SDNode) constructor.	2020-08-02 15:31:36 +01:00
Simon Pilgrim	d14a22da5e	[DAG] TargetLowering::LowerAsmOutputForConstraint - pass SDLoc as const& Try to be more consistent with the SDLoc param in the TargetLowering methods.	2020-08-02 15:12:02 +01:00
Kazu Hirata	60434989e5	Use llvm::is_contained where appropriate (NFC) Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D85083	2020-08-01 21:51:06 -07:00
Evgeny Leviant	e73f5d86f1	[MachineVerifier] Refactor calcRegsPassed. NFC Patch improves performance of verify-machineinstrs pass up to 10x. Differential revision: https://reviews.llvm.org/D84105	2020-08-01 12:58:52 +03:00
Sriraman Tallam	ca6b6d40ff	Rename basic block sections options to be consistent. D68049 created options for basic block sections: -fbasic-block-sections=, -funique-basic-block-section-names. Rename options in llc and lld (--lto-) to be consistent. Specifically, + Rename basicblock-sections to basic-block-sections + Rename unique-bb-section-names to unique-basic-block-section-names Differential Revision: https://reviews.llvm.org/D84462	2020-07-31 11:50:55 -07:00
Aditya Nandakumar	2144a3bdbb	[GISel] Add combiners for G_INTTOPTR and G_PTRTOINT https://reviews.llvm.org/D84909 Patch adds two new GICombinerRules, one for G_INTTOPTR and one for G_PTRTOINT. The G_INTTOPTR elides ptr2int(int2ptr(x)) to a copy of x, if the cast is within the same address space. The G_PTRTOINT elides int2ptr(ptr2int(x)) to a copy of x. Patch additionally adds new combiner tests for the AArch64 target to test these new combiner rules. Patch by mkitzan	2020-07-31 10:13:36 -07:00
Matt Arsenault	57bd64ff84	Support addrspacecast initializers with isNoopAddrSpaceCast Moves isNoopAddrSpaceCast to the TargetMachine. It logically belongs with the DataLayout.	2020-07-31 10:42:43 -04:00
Vitaly Buka	b0eb40ca39	[NFC] Remove unused GetUnderlyingObject parameter Depends on D84617. Differential Revision: https://reviews.llvm.org/D84621	2020-07-31 02:10:03 -07:00
Vitaly Buka	89051ebace	[NFC] GetUnderlyingObject -> getUnderlyingObject I am going to touch them in the next patch anyway	2020-07-30 21:08:24 -07:00
Eli Friedman	7e88efa7c5	[LegalizeTypes][SVE] Support widen/split legalization for SPLAT_VECTOR Just the obvious implementation that rewrites the result type. Also fix warning from EXTRACT_SUBVECTOR legalization that triggers on the test. Differential Revision: https://reviews.llvm.org/D84706	2020-07-30 16:17:45 -07:00
Jon Roelofs	afae6d97fa	[SelectionDAG] Fix lowering of vector geps This fixes an assertion failure that was being triggered in SelectionDAG::getZeroExtendInReg(), where it was trying to extend the <2xi32> to i64 (which should have been <2xi64>). Fixes: rdar://66016901 Differential Revision: https://reviews.llvm.org/D84884	2020-07-30 14:56:53 -06:00
Brendon Cahoon	7b114446c3	Align store conditional address In cases where the alignment of the datatype is smaller than expected by the instruction, the address is aligned. The aligned address is used for the load, but wasn't used for the store conditional, which resulted in a run-time alignment exception.	2020-07-30 10:42:00 -05:00
jasonliu	04dc9691eb	[XCOFF][AIX] Enable -ffunction-sections Summary: This patch implements -ffunction-sections on AIX. This patch focuses on assembly generation. Follow-on patch needs to handle: 1. -ffunction-sections implication for jump table. 2. Object file generation path and associated testing. Differential Revision: https://reviews.llvm.org/D83875	2020-07-30 13:30:01 +00:00
Sam Tebbs	276ed5f7e4	[DAGCombiner] Fold sext_inreg of a masked load into a sign extended masked load This patch adds a DAG combine fold for a sext(masked_load) into a sign extended masked load. Differential Revision: https://reviews.llvm.org/D84332	2020-07-30 10:34:02 +01:00
Kang Zhang	0037a5f894	[PHIElimination] Fix the killed flag for LowerPHINode() Summary: In the phi-node-elimination pass, we set the killed flag incorrectly. When we eliminate the PHI node, we replace the PHI with a copy for the incoming value. Before this patch, we set the incoming value as killed (PHICopy) and removed the killed flag from the last use of the incoming value (OldKill). This is correct only if the new PHICopy is after the OldKill. Reviewed By: bjope Differential Revision: https://reviews.llvm.org/D80886	2020-07-30 08:18:50 +00:00
Matt Arsenault	7d0b32c268	GlobalISel: Use result of find rather than rechecking map	2020-07-29 21:26:20 -04:00
Matt Arsenault	66c572af55	GlobalISel: Handle assorted no-op intrinsics SelectionDAGBuilder just drops these, so do the same.	2020-07-29 21:26:20 -04:00
Matt Arsenault	0da582d9b6	GlobalISel: Handle llvm.roundeven I still think it's highly questionable that we have two intrinsics with identical behavior and only vary by the name of the libcall used if it happens to be lowered that way, but try to reduce the feature delta between SDAG and GlobalISel for recently added intrinsics. I'm not sure which opcode should be considered the canonical one, but lower roundeven back to round.	2020-07-29 20:01:12 -04:00
Philip Reames	755f91f12c	[Statepoint] Enable cross block relocates w/vreg lowering This change is mechanical, it just removes the restriction and updates tests. The key building blocks were submitted in `31342eb` and `8fe2abc`. Note that this (and preceding changes) entirely subsumes D83965. I did include a couple of its tests. From the codegen changes, an interesting observation: this doesn't actually reduce spilling, it just lets the register allocator do its job. That results in a slightly different overall result which has both pros and cons over the eager spill lowering. (i.e. We'll have some perf tuning to do once this is stable.)	2020-07-29 13:32:51 -07:00
Amara Emerson	0c0e36061a	[GlobalISel] Add G_INTRINSIC_LRINT and translate from llvm.lrint Differential Revision: https://reviews.llvm.org/D84551	2020-07-29 11:51:04 -07:00
Philip Reames	8fe2abc190	[Statepoint] Consolidate relocation type tracking [NFC] Change the way we track how a particular pointer was relocated at a statepoint in selection dag. Previously, we used an optional<location> for the spill lowering, and a block local Register for the newly introduced vreg lowering. Combine all three lowerings (norelocate, spill, and vreg) into a single helper class, and keep a single copy of the information. This is submitted separately as it really does make the code more readable on its own, but the indirect motivation is to move vreg tracking from StatepointLowering to FunctionLoweringInfo. This is the last piece needed to support cross block relocations with vregs; that will follow in a separate (non-NFC) patch.	2020-07-29 11:45:31 -07:00
Amara Emerson	d8ba622209	[AArch64][GlobalISel] Selection support for vector DUP[X]lane instructions. In future, we'd like to use the perfect-shuffle mechanism to deal with these shuffle permutations. For now, this improves performance by avoiding the super-expensive const-pool load + tbl instruction. Differential Revision: https://reviews.llvm.org/D84866	2020-07-29 11:41:37 -07:00
Matt Arsenault	0b7de7966f	GlobalISel: Implement lower for G_EXTRACT_VECTOR_ELT Use the basic store to stack and reload.	2020-07-29 14:16:28 -04:00
Matt Arsenault	90b76dac57	GlobalISel: Remove unreachable condition Fixes bug 46882	2020-07-29 13:42:22 -04:00
Simon Pilgrim	fdc902774e	[DAG][AMDGPU][X86] Add SimplifyMultipleUseDemandedBits handling for SIGN/ZERO_EXTEND + SIGN/ZERO_EXTEND_VECTOR_INREG Peek through multiple use ops like we already do for ANY_EXTEND/ANY_EXTEND_VECTOR_INREG Differential Revision: https://reviews.llvm.org/D84863	2020-07-29 18:10:59 +01:00
Philip Reames	31342eb63e	[Statepoint] When using the tied def lowering, unconditionally use vregs [almost NFC] This builds on `3da1a96` on the path towards supporting invokes and cross block relocations. The actual change attempts to be NFC, but does fail in one corner-case explained below. The change itself is fairly mechanical. Rather than remember SDValues - which are inherently block local - immediately produce a virtual register copy and remember that. Once this lands, we'll update the FunctionLoweringInfo::StatepointSpillMap map to allow register based lowerings, delete VirtRegs from StatepointLowering, and drop the restriction against cross block relocations. I deliberately separate the semantic part into its own change for ease of understanding and fault isolation. The corner-case which isn't quite NFC is that the old implementation implicitly CSEd gc.relocates of the same SDValue regardless of type. The new implementation still only relocates once, but it produces distinct vregs for the bitcast and its source, whereas SelectionDAG's generic CSE was able to remove the bitcast in the old implementation. Note that the final assembly doesn't change (at least in the test), as our MI level optimizations catch the duplication. I assert that this is an uninteresting corner-case. It's functionally correct, and if we find a case where this influences performance, we should really be canonicalizing types to i8* at the IR level. Differential Revision: https://reviews.llvm.org/D84692	2020-07-29 09:23:52 -07:00
Kang Zhang	a4ade9ed21	[MachineVerifier] Handle the PHI node for verifyLiveVariables() Summary: When verifying LiveVariables, the MachineVerifier pass calculates the LiveVariables itself and compares the result with the result the livevars pass gave. If they are different, verifyLiveVariables() reports an error. But when we calculate the LiveVariables in MachineVerifier, we don't consider the PHI node, while livevars does. This patch fixes that bug. Reviewed By: bjope Differential Revision: https://reviews.llvm.org/D80274	2020-07-29 15:43:47 +00:00
Simon Wallis	6a05c6bfc8	[MachineCopyPropagation] BackwardPropagatableCopy: add check for hasOverlappingMultipleDef In MachineCopyPropagation::BackwardPropagatableCopy(), a check is added for multiple destination registers. The copy propagation is avoided if the copied destination register is the same register as another destination on the same instruction. A new test is added. This used to fail on ARM like this: error: unpredictable instruction, RdHi and RdLo must be different umull r9, r9, lr, r0 Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D82638	2020-07-29 16:21:01 +01:00
David Sherwood	2078771759	[SVE][CodeGen] Add simple integer add tests for SVE tuple types I have added tests to: CodeGen/AArch64/sve-intrinsics-int-arith.ll for doing simple integer add operations on tuple types. Since these tests introduced new warnings due to incorrect use of getVectorNumElements() I have also fixed up these warnings in the same patch. These fixes are: 1. In narrowExtractedVectorBinOp I have changed the code to bail out early for scalable vector types, since we've not yet hit a case that proves the optimisations are profitable for scalable vectors. 2. In DAGTypeLegalizer::WidenVecRes_CONCAT_VECTORS I have replaced calls to getVectorNumElements with getVectorMinNumElements in cases that work with scalable vectors. For the other cases I have added asserts that the vector is not scalable because we should not be using shuffle vectors and build vectors in such cases. Differential revision: https://reviews.llvm.org/D84016	2020-07-29 13:32:10 +01:00
David Sherwood	5d84eafc6b	[CodeGen] Remove calls to getVectorNumElements in DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR In DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR I have replaced calls to getVectorNumElements with getVectorMinNumElements, since this code path works for both fixed and scalable vector types. For scalable vectors the index will be multiplied by VSCALE. Fixes warnings in this test: sve-sext-zext.ll Differential revision: https://reviews.llvm.org/D83198	2020-07-29 13:05:39 +01:00
Daniel Sanders	abf1ed70d6	[globalisel][cse] Merge debug locations when CSE'ing Reviewed By: aditya_nandakumar Differential Revision: https://reviews.llvm.org/D78388	2020-07-28 14:25:26 -07:00
Matt Arsenault	e87356b498	GlobalISel: Don't assert on operations with no type indices Fix not marking G_FENCE as legal on AMDGPU This was apparently defaulting to legal using the "legacy" rules, whatever those are.	2020-07-28 16:49:55 -04:00
Mircea Trofin	1e027b77f0	[llvm][NFC] refactor setBlockFrequency for clarity. The refactoring encapsulates frequency calculation in MachineBlockFrequencyInfo, and renames the API to clarify its motivation. It should clarify frequencies may not be reset 'freely' by users of the analysis, as the API serves as a partial update to avoid a full analysis recomputation. Differential Revision: https://reviews.llvm.org/D84427	2020-07-28 13:04:11 -07:00
Simon Pilgrim	b4b6e77454	[DAG] isSplatValue - add support for TRUNCATE/SIGN_EXTEND/ZERO_EXTEND These are just pass-throughs to the source operand - we can't assume that ANY_EXTEND(splat) will still be a splat though.	2020-07-28 19:56:11 +01:00
Matt Arsenault	97b5fb78d1	GlobalISel: Translate llvm.convert.{to\|from}.fp16 intrinsics I think these were added as a workaround for SelectionDAG lacking half legalization support in the past. I think they should probably be removed from the IR, but clang does still have a target control to emit these instead of the native half fpext/fptrunc.	2020-07-28 11:46:05 -04:00
Matt Arsenault	5f802be4e5	GlobalISel: Don't fail translate on intrinsics with metadata	2020-07-27 19:00:25 -04:00
Sridhar Gopinath	4b5412b5db	Fix the move constructor of MMI to move MachineFunctions map The move constructor of MachineModuleInfo currently does not copy the MachineFunctions map. This commit fixes this issue. Patch by Sridhar Gopinath. Thanks! Differential Revision: https://reviews.llvm.org/D84274	2020-07-27 14:10:05 -07:00
Kazu Hirata	902cbcd59e	Use llvm::is_contained where appropriate (NFC) Summary: This patch replaces std::find with llvm::is_contained where appropriate. Reviewers: efriedma, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, jvesely, nhaehnle, hiraditya, rogfer01, kerbowa, llvm-commits, vkmr Tags: #llvm Differential Revision: https://reviews.llvm.org/D84489	2020-07-27 10:20:44 -07:00
Nadav Rotem	df880b7730	[StackProtector] Speed up RequiresStackProtector Speed up the method RequiresStackProtector by checking the intrinsic value of the call. The original code calls getName() that returns an allocating std::string on each check. This change removes about 96072 std::string instances when compiling sqlite3.c; The function was discovered with a Facebook-internal performance tool. Differential Revision: https://reviews.llvm.org/D84620	2020-07-27 10:07:47 -07:00
Amy Kwan	7c182663a8	Revert "Re-apply:" Emit DW_OP_implicit_value for Floating point constants"" This patch reverts commit `59a76d957a26` as it has caused failure on the big endian PowerPC buildbots (as well as the SystemZ buildbots).	2020-07-27 09:44:13 -05:00
David Sherwood	14bc85e0eb	[SVE] Don't use LocalStackAllocation for SVE objects I have introduced a new TargetFrameLowering query function: isStackIdSafeForLocalArea that queries whether or not it is safe for objects of a given stack id to be bundled into the local area. The default behaviour is to always bundle regardless of the stack id, however for AArch64 this is overridden so that it's only safe for fixed-size stack objects. There is future work here to extend this algorithm for multiple local areas so that SVE stack objects can be bundled together and accessed from their own virtual base-pointer. Differential Revision: https://reviews.llvm.org/D83859	2020-07-27 08:22:01 +01:00
QingShan Zhang	a6e9f5264c	[Scheduling] Improve group algorithm for store cluster Store Addr and Store Addr+8 are a clusterable pair. They have memory(ctrl) dependency on different loads. The current implementation puts these two stores into different groups and fails to cluster them. Reviewed By: evandro Differential Revision: https://reviews.llvm.org/D84139	2020-07-27 02:02:40 +00:00
Matt Arsenault	f6176f8a5f	GlobalISel: Handle G_PTR_ADD in narrowScalar	2020-07-26 10:08:17 -04:00
Matt Arsenault	3e8bb7a000	GlobalISel: Handle fewerElementsVector for G_PTR_ADD	2020-07-26 10:08:09 -04:00
Matt Arsenault	61ced4b87a	GlobalISel: Handle 'n' inline asm constraint	2020-07-26 09:30:41 -04:00
Changpeng Fang	9162b70e51	DAGCombiner: Don't simplify the token factor if the node's number of operands already exceeds TokenFactorInlineLimit Summary: In parallelizeChainedStores, a TokenFactor was created with the size greater than 3000. We found that DAGCombiner::visitTokenFactor will consume a huge amount of time on such nodes. Since the number of operands already exceeds TokenFactorInlineLimit, we propose to give up simplification with the consideration of compile time. Reviewers: @spatel, @arsenm Differential Revision: https://reviews.llvm.org/D84204	2020-07-25 21:20:59 -07:00
Eric Christopher	18975762c1	Fold StatepointBB into checks as it's only used from an NDEBUG or ASSERT context fixing an unused variable warning.	2020-07-25 18:36:53 -07:00
Philip Reames	55dae9c20c	[Statepoints] Style cleanup after `3da1a963` [NFC] Just fixing a few minor stylistic issues.	2020-07-25 16:40:39 -07:00
Philip Reames	3da1a9634e	[Statepoints] Support lowering gc relocations to virtual registers (Disabled under flag for the moment) This is part of a larger project wherein we are finally integrating lowering of gc live operands with the register allocator. Today, we force spill all operands in SelectionDAG. The code to do so is distinctly non-optimal. The approach this patch is working towards is to instead lower the relocations directly into the MI form, and let the register allocator pick which ones get spilled and which stack slots they get spilled to. In terms of performance, the later part is actually more important as it avoids redundant shuffling of values between stack slots. This particular change adds ISEL support to produce the variadic def STATEPOINT form required by the above. In particular, the first N are lowered to variadic tied def/use pairs. So new statepoint looks like this: reloc1,reloc2,... = STATEPOINT ..., base1, derived1<tied-def0>, base2, derived2<tied-def1>, ... N is limited by the maximal number of tied registers machine instruction can have (15 at the moment). The current patch is restricted to handling relocations within a single basic block. Cross block relocations (e.g. invokes) are handled via the legacy mechanism. This restriction will be relaxed in future patches. Patch By: dantrushin Differential Revision: https://reviews.llvm.org/D81648	2020-07-25 14:26:05 -07:00
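
The two "Use llvm::is_contained where appropriate (NFC)" commits above describe replacing std::find membership checks with llvm::is_contained. A minimal sketch of that kind of rewrite follows. It is an illustration only, not code from either commit: `sketch::is_contained` is a hypothetical local stand-in written so the example builds without LLVM headers, and `hasOpcodeOld` / `hasOpcodeNew` are made-up names.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

namespace sketch {
// Hypothetical stand-in for llvm::is_contained (assumption: written here
// only so the example is self-contained without LLVM headers).
// Returns true if any element of range R compares equal to E.
template <typename Range, typename Element>
bool is_contained(const Range &R, const Element &E) {
  return std::find(std::begin(R), std::end(R), E) != std::end(R);
}
} // namespace sketch

// Before: the membership test spells out the iterator pair and the
// comparison against end().
bool hasOpcodeOld(const std::vector<int> &Opcodes, int Opc) {
  return std::find(Opcodes.begin(), Opcodes.end(), Opc) != Opcodes.end();
}

// After: same behavior (hence "NFC"), but the intent is stated directly.
bool hasOpcodeNew(const std::vector<int> &Opcodes, int Opc) {
  return sketch::is_contained(Opcodes, Opc);
}
```

Both functions return the same result for any input; the rewrite only changes how the check is expressed.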

29071 Commits