clang-p2996

Author	SHA1	Message	Date
Jay Foad	2dcf051259	[CodeGen] Store call frame size in MachineBasicBlock Record the call frame size on entry to each basic block. This is usually zero except when a basic block has been split in the middle of a call sequence. This simplifies PEI::replaceFrameIndices which previously had to visit basic blocks in a specific order and had special handling for unreachable blocks. More importantly it paves the way for an equally simple implementation of a backwards version of replaceFrameIndices, which is required to fully convert PrologEpilogInserter to backwards register scavenging, which is preferred because it does not rely on accurate kill flags. Differential Revision: https://reviews.llvm.org/D156113	2023-07-27 10:32:00 +01:00
Vitaly Buka	a496c8be6e	Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting" And dependent commits. Details in D150388. This reverts commit `825b7f0ca5`. This reverts commit `7a98f084c4`. This reverts commit `b4a62b1fa5`. This reverts commit `b7836d8562`. No conflicts in the code, few tests had conflicts in autogenerated CHECKs: llvm/test/CodeGen/Thumb2/mve-float32regloops.ll llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll Reviewed By: alexfh Differential Revision: https://reviews.llvm.org/D156381	2023-07-26 22:13:32 -07:00
Sameer Sahasrabuddhe	b14e30f10d	[LLVM] refactor GenericSSAContext and its specializations Fix the GenericSSAContext template so that it actually declares all the necessary typenames and the methods that must be implemented by its specializations SSAContext and MachineSSAContext. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D156288	2023-07-27 09:54:50 +05:30
Pranav Kant	6f305e0658	[DAGCombiner] Limit graph traversal to cap compile times The hasPredecessorHelper method, which DAGCombiner uses to combine loads/stores into pre-indexed and post-indexed forms, is a major source of slowdown when compiling a large function with MSan enabled on Arm. This patch caps the DFS graph traversal for this method at 8192, which cuts compile time by 50% (4m -> 2m) at the cost of fewer nodes combined overall. Summary of pre-indexed DAG nodes created and compile time for the pathological case under different MaxDepth limits: 1. MaxDepth = 0 (unlimited): 1800 nodes, 4m. 2. MaxDepth = 32k: 560 nodes, 2m31s. 3. MaxDepth = 8k: 139 nodes, 2m. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D154885	2023-07-26 17:29:38 +00:00
DianQK	30f2170a78	Revert "[DebugInfo] Fix potential CU mismatch for attachRangesOrLowHighPC" This reverts commit `d20e4a1d68`. After committing `2ee4d0386c`, we don't support subprogram definitions nested within `DICompositeType` when doing LTO builds. For a detailed discussion, see https://reviews.llvm.org/D152095.	2023-07-26 19:58:00 +08:00
Zhongyunde	05aae0839f	Reland [AArch64][NFC] Call the API getVScaleRange directly Use the maximum 64 for BitWidth of getVScaleRange to avoid returning an empty range. The previous change caused a Buildbot failure because of the self-assignment MinSVEVectorSize = MinSVEVectorSize: error: explicitly assigning value of variable of type 'unsigned int' to itself [-Werror,-Wself-assign] Reviewed By: sdesmalen, nikic, dmgreen Differential Revision: https://reviews.llvm.org/D155708	2023-07-26 18:55:31 +08:00
Jay Foad	6fcad9cf93	[DAGCombiner] Simplify foldAndOrOfSETCC. NFC. Pull out repeated hasOneUse checks. Simplify some conditions. Reduce indentation. Differential Revision: https://reviews.llvm.org/D156220	2023-07-26 10:22:55 +01:00
Zhongyunde	ebaac2b2d6	Revert "[AArch64][NFC] Call the API getVScaleRange directly" This reverts commit `67005c8e6f`.	2023-07-26 16:44:14 +08:00
Zhongyunde	67005c8e6f	[AArch64][NFC] Call the API getVScaleRange directly Use the maximum 64 for BitWidth of getVScaleRange to avoid returning an empty range. Reviewed By: sdesmalen, nikic, dmgreen Differential Revision: https://reviews.llvm.org/D155708	2023-07-26 15:54:04 +08:00
esmeyi	e83b8a5e71	[XCOFF] Enable available_externally linkage for functions. Summary: D80642 added support for emitting AvailableExternally linkage on AIX. However, the assertion "Trying to get csect representation of this symbol but none was set." fired when a function was declared as available_externally, because no csect was generated for the function. This patch fixes it. Reviewed By: hubert.reinterpretcast, shchenz Differential Revision: https://reviews.llvm.org/D156213 Signed-off-by: Esme Yi <esme.yi@ibm.com>	2023-07-25 22:47:11 -04:00
Qi Hu	ddd7d35c6c	[RegAlloc] Fix assertion failure caused by inline assembly When inline assembly code requests more registers than available, the MachineInstr::emitError function in the RegAllocFast pass emits an error but doesn't stop the pass, and then the compiler crashes later with an assertion failure. This commit, mimicking the RegAllocGreedy pass, assigns a random physical register, and therefore avoids the crash after producing the diagnostic. This problem has been observed for both rustc and clang, while it doesn't occur in gcc.	2023-07-25 19:21:03 -04:00
Craig Topper	1f5a1b8952	[DAGCombiner] Minor improvements to foldAndOrOfSETCC. NFC Reduce the scope of some variables. Replace an if with an assertion. Reviewed By: kmitropoulou Differential Revision: https://reviews.llvm.org/D156140	2023-07-25 00:20:06 -07:00
Matt Arsenault	0d797b71eb	RegisterCoalescer: Fix empty subrange verifier error In this example an implicit def had live-out undef subrange defs. After coalescing with the def from a previous block, the undef-defed lanes are no longer live out of the block in the new interval. An empty subrange was tentatively created for these lanes, but it must be deleted.	2023-07-24 12:18:34 -04:00
Matt Arsenault	2a53b6c06b	RegisterCoalescer: Fix verifier error on redef of subregister for live out implicit_defs A live out implicit_def wasn't deleted, but the subranges weren't correctly updated. The main range was correct but the def corresponding to the initial main range def instruction was missing from the lanes redefined in another block. The written lanes are not quite the same as the valid lanes in the case of an implicit_def. Fixes verifier error in blender. There is an additional verifier error in some of the testcase variants where an empty subrange remains.	2023-07-24 12:18:34 -04:00
WANG Rui	595d5f36f4	[DAGCombine] Canonicalize operands for visitANDLike During the construction of SelectionDAG, there are no explicit canonicalization rules to adjust the order of operands for AND nodes. This may prevent the optimization in DAGCombiner::visitANDLike from being triggered. This patch canonicalizes the operands before matching, which can be observed to improve optimization on the RISC-V target architecture. Canonicalize: `and(x, add) -> and(add, x)` Signed-off-by: WANG Rui <wangrui@loongson.cn> Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D154760	2023-07-24 16:52:04 +08:00
Antonio Frighetto	2dea969d83	[clang][CodeGen] Introduce `-frecord-command-line` for MachO Allow clang driver command-line recording when targeting MachO object files as well. Reviewed-by: sgraenitz Differential Revision: https://reviews.llvm.org/D155716	2023-07-24 09:24:59 +02:00
David Green	6edc9a7662	[AArch64][GISel] Additional FPExt vector lowering Similar to D155311, this adds lowering for more vector cases for FPExt Differential Revision: https://reviews.llvm.org/D155601	2023-07-23 16:58:13 +01:00
Amaury Séchet	88452508f3	[DAG] Improve carry reconstruction in combineCarryDiamond. The gain is usually sufficient to go the extra mile and reconstruct a carry in some cases. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D154533	2023-07-22 22:49:48 +00:00
Craig Topper	a815f03f9b	[LegalizeTypes] Use report_fatal_error instead of llvm_unreachable in the default case of some type legalization handlers. These can be triggered in various ways when intrinsics are used incorrectly or a target doesn't correctly mark something as unsupported. Using a fatal error prevents strange behavior like infinite loops. We already do this for some of the vector type legalization handlers.	2023-07-22 11:05:24 -07:00
Daniel Hoekwater	0315fca912	[AArch64] Move branch relaxation after bbsection assignment Because branch relaxation needs to factor in whether branches target a block in the same section or a different one, it needs to run after the Basic Block Sections / Machine Function Splitting passes. Because jump table compression relies on block offsets remaining fixed after the table is compressed, we must also move the JT compression pass. The only tests affected are ones enforcing just the ordering and a few that have basic block ids changed because RenumberBlocks hasn't run yet. Differential Revision: https://reviews.llvm.org/D153829	2023-07-21 20:24:52 +00:00
Simon Pilgrim	ae60706da0	[DAG] SimplifyDemandedBits - call ComputeKnownBits for constant non-uniform ISD::SRL shift amounts We only attempted to determine KnownBits for uniform constant shift amounts, but ComputeKnownBits is able to handle some non-uniform cases as well that we can use as a fallback.	2023-07-21 14:52:57 +01:00
Jingu Kang	351b4c17dd	Revert "[MachineLICM] Handle Subloops" This reverts commit `50dd383d08`.	2023-07-20 17:12:25 +01:00
Jingu Kang	50dd383d08	[MachineLICM] Handle Subloops Following discussion on https://reviews.llvm.org/D154205, make the MachineLICM pass handle subloops while visiting the outermost loop's blocks only once. Differential Revision: https://reviews.llvm.org/D154205	2023-07-20 16:39:13 +01:00
Danila Malyutin	e1aa4e7b38	[Statepoint] Use correct RegisterClass for spilling Copy propagation might have changed the register class of the register Differential Revision: https://reviews.llvm.org/D155792	2023-07-20 16:00:00 +03:00
Simon Pilgrim	7567b72f4d	[DAG] ShrinkDemandedConstant - early-out for empty DemandedBits/Elts Leave this to constant folding in SimplifyDemandedBits Fixes #63975	2023-07-20 12:18:10 +01:00
Simon Pilgrim	697f60598e	[DAG] hoistLogicOpWithSameOpcodeHands - ensure SIGN_EXTEND_INREG nodes have the same extension value type Fix bug in the check for matching SIGN_EXTEND_INREG types	2023-07-20 10:44:46 +01:00
David Green	0c41c59dee	[DAG][AArch64] Fix truncated vscale constant types It appears that vscale values truncated to i1 cause mismatches in the constant types when created in getNode. https://godbolt.org/z/TaaTo86ne. Differential Revision: https://reviews.llvm.org/D155626	2023-07-20 09:12:05 +01:00
Fangrui Song	b215a2c885	.debug_gnu_pub{names,types}: Stabilize iteration order StringMap iteration order is not guaranteed to be deterministic (https://llvm.org/docs/ProgrammersManual.html#llvm-adt-stringmap-h). Sort by DIE offset (which looks like a pre-order traversal order).	2023-07-19 23:30:30 -07:00
Fangrui Song	60a2b99125	AccelTable: Use MapVector to stabilize iteration order Entries with the same DJB hash in the hash lookup table/name table are ordered by the iteration order of `Entries` (a StringMap). Change `Entries` to a MapVector to stabilize the order and simplify future changes like D142862 that change the StringMap hash function.	2023-07-19 19:50:36 -07:00
Fangrui Song	30e753dd07	[FSAFDO] Switch to xxh3_64bits Following recent changes switching from xxh64 to xxh3 for better hashing performance. This particular instance may or may not have a noticeable performance difference, but this change moves us toward removing xxHash64.	2023-07-19 14:23:28 -07:00
Momchil Velikov	4c95f79cce	[CodeGenPrepare] Refactor optimizeSelectInst (NFC) Refactor to use BasicBlockUtils functions and make life easier for a subsequent patch for updating the dominator tree. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D154053	2023-07-19 18:56:44 +01:00
Igor Kirillov	c15557d64e	[CodeGen] Extend ComplexDeinterleaving pass to recognise patterns using integer types AArch64 introduced CMLA and CADD instructions as part of SVE2. This change allows such instructions to be generated when this architecture feature is available. Differential Revision: https://reviews.llvm.org/D153808	2023-07-19 11:01:19 +00:00
Simon Pilgrim	98b0f1360d	[DAG] hoistLogicOpWithSameOpcodeHands - add support for SIGN_EXTEND_INREG nodes. This can reuse the existing *_EXTEND node handling (with special handling for the valuetype arg)	2023-07-19 11:56:32 +01:00
Simon Pilgrim	2167ae93c9	[DAG] hoistLogicOpWithSameOpcodeHands - add support for _EXTEND_VECTOR_INREG nodes. This can reuse the existing _EXTEND node handling.	2023-07-19 10:50:23 +01:00
Jingu Kang	62ed3ff4bb	Revert "[MachineLICM] Handle Subloops" This reverts commit `33e60484d7`.	2023-07-19 10:30:50 +01:00
Han Shen	f7f744a522	[CodeGen] Separate MachineFunctionSplitter logic for different profile types. In D152577 @xur has a post-submit comment regarding an awkward usage of MFS for AutoFDO: instead of just using -fsplit-machine-function, the user needs to add "-mllvm -mfs-psi-cutoff=0" to choose the right logic for AutoFDO. The compiler should choose the right default values for such a case. This CL separates MFS logic for different profile types. Reviewed By: xur, wenlei Differential Revision: https://reviews.llvm.org/D155253	2023-07-18 11:21:35 -07:00
David Green	74c0bdff7d	[AArch64][GISel] Additional FPTrunc vector lowering I was attempting to add llvm.reduce.fminimum/fmaximum support for GlobalISel. In the process I noticed that llvm.reduce.fmin/fmax was missing, and could do with being added first. That led on to adding additional vector support for minnum/maxnum, which in turn led to needing to handle fptrunc and fpext for some of the fp16 types. So this patch extends the vector handling for fptrunc, adding support for f16 types which are clamped to 4 elements, and scalarizing the rest. I went round in circles a little with how smaller than legal vectors should be handled, but this seems simple and seems to work, if not always optimally yet. Differential Revision: https://reviews.llvm.org/D155311	2023-07-18 18:52:19 +01:00
Simon Pilgrim	d7eb9240c0	[DAG] SimplifyDemandedBits - attempt to use SimplifyMultipleUseDemandedBits for bitcasts from larger element types Attempt to avoid multi-use ops if the bitcast doesn't need anything from them.	2023-07-18 18:38:03 +01:00
Simon Pilgrim	3ad4f92f83	[DAG] More aggressively (extract_vector_elt (build_vector x, y), c) iff element is zero constant We currently don't extract vector elements from multi-use build vectors unless TLI.aggressivelyPreferBuildVectorSources accepts them, which seems a little extreme for constant build vectors (especially as under some cases ComputeKnownBits will indirectly extract the data for us). This is causing a few regressions in some upcoming SimplifyDemandedBits work I'm looking at, all of which just need to know that the element is zero, so I've tweaked the fold to accept zero elements as well, which will typically fold very easily. Differential Revision: https://reviews.llvm.org/D155582	2023-07-18 17:31:34 +01:00
Matt Arsenault	3f8ef57bed	MachineSink: Fix sinking VGPR def out of a divergent loop This fixes sinking a VGPR def out of a loop past the reconvergence point at the SI_END_CF. There was a prior fix which introduced blockPrologueInterferes (D121277) to fix the same basic problem for the post RA sink. This also had the special case isIgnorableUse case which was incorrect, because in some contexts the exec use is not ignorable. I'm thinking about a new way to represent this which will avoid needing hasIgnorableUse and isBasicBlockPrologue, which would function more like the exception handling. Fixes: SWDEV-407790 https://reviews.llvm.org/D155343	2023-07-18 06:15:50 -04:00
Sameer Sahasrabuddhe	ef7d53731b	[llvm] minor cleanup in GenericSSAContext - update comments to reflect actual state - use (implicitly inline) constexpr for a const static member	2023-07-18 12:01:11 +05:30
Konstantina Mitropoulou	4c42ab1199	[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns CMP(A,C)||CMP(B,C) => CMP(MIN/MAX(A,B), C) CMP(A,C)&&CMP(B,C) => CMP(MIN/MAX(A,B), C) This first patch handles integer types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D153502	2023-07-17 17:13:47 -07:00
Matt Arsenault	825b7f0ca5	InlineSpiller: Fix copy identification bugs in isCopyOfBundle Noticed by inspection of `b7836d8562`. This was checking if the first instruction was a copy, not the current MI. It should fully respect the isCopyInstr result. Hopefully this fixes a reported regression which we can extract a test from.	2023-07-17 20:05:56 -04:00
Matt Arsenault	296e24cd2e	DAG: Constant fold frexp nodes Special casing the nonfinite exponent value everywhere is kind of annoying.	2023-07-17 17:34:29 -04:00
Simon Pilgrim	4f95821f58	[DAG] SelectionDAG::getNode() - consistently use N1 for first operand. NFCI. This has been annoying me for years - rename Operand to N1 so it matches all the other getNode() calls, and simplifies my debug watch windows!	2023-07-17 17:17:40 +01:00
Simon Pilgrim	e9caa37e9c	[DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits Inspired by some of the cases from D145468 Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free. A future patch will propose the equivalent shl narrowing combine. Differential Revision: https://reviews.llvm.org/D146121	2023-07-17 15:50:09 +01:00
Jay Foad	a1a9c53ae7	[GlobalISel] Fix infinite loop in reassociation combine Don't reassociate (C1+C2)+Y -> C1+(C2+Y). Fixes https://github.com/llvm/llvm-project/issues/63849 Differential Revision: https://reviews.llvm.org/D155284	2023-07-16 14:15:24 +01:00
Matt Arsenault	c4ccd6e3d2	MachineSink: Remove unnecessary empty block check	2023-07-14 18:46:18 -04:00
Matt Arsenault	6d3027e3d1	MachineSink: Move helper function and use more const	2023-07-14 18:46:18 -04:00
Weining Lu	ef33d6cbfc	[XRay] Add initial support for loongarch64 Only support patching FunctionEntry/FunctionExit/FunctionTailExit for now. Reviewed By: MaskRay, xen0n Co-Authored-By: zhanglimin <zhanglimin@loongson.cn> Differential Revision: https://reviews.llvm.org/D140727	2023-07-14 09:27:13 +08:00
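
The foldAndOrOfSETCC entry above (4c42ab1199) rewrites two integer comparisons against a common operand into a single comparison of a min or max. As a rough sketch of why that is sound, the following C++ checks the signed less-than case, where the OR form pairs with min and the AND form pairs with max. All function names are hypothetical and nothing here is taken from the LLVM sources.

```cpp
#include <algorithm>
#include <cassert>

// Original form: two comparisons against the same right-hand side c.
constexpr bool orOfLess(int a, int b, int c)  { return (a < c) || (b < c); }
constexpr bool andOfLess(int a, int b, int c) { return (a < c) && (b < c); }

// Folded form: one min/max followed by a single comparison.
// (a < c) || (b < c) holds exactly when the smaller of a, b is below c.
constexpr bool foldedOr(int a, int b, int c)  { return std::min(a, b) < c; }
// (a < c) && (b < c) holds exactly when the larger of a, b is below c.
constexpr bool foldedAnd(int a, int b, int c) { return std::max(a, b) < c; }

// Hypothetical helper: exhaustively compare both forms for all
// a, b, c in [-limit, limit]. Returns true if they always agree.
inline bool foldsAgreeOnRange(int limit) {
  for (int a = -limit; a <= limit; ++a)
    for (int b = -limit; b <= limit; ++b)
      for (int c = -limit; c <= limit; ++c)
        if (orOfLess(a, b, c) != foldedOr(a, b, c) ||
            andOfLess(a, b, c) != foldedAnd(a, b, c))
          return false;
  return true;
}
```

Calling `foldsAgreeOnRange(8)` compares the two forms on every triple in a small range. For greater-than comparisons the roles of min and max swap, which is presumably why the commit message writes MIN/MAX for both patterns.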

34376 Commits