clang-p2996

Author	SHA1	Message	Date
Jon Roelofs	83e6d2edfc	Revert "[ARM] Always lower direct calls as direct when the outliner is enabled (#66434 )" This reverts commit `003bcad9a8`. ARM folks say it regresses some of their benchmarks: https://github.com/llvm/llvm-project/pull/66434#issuecomment-1722424162	2023-09-18 09:45:46 -07:00
Craig Topper	8f04d81ede	[SelectionDAG][RISCV] Mask constants to narrow size in TargetLowering::expandUnalignedStore. If the SRL for Hi constant folds, but we don't remove those bits from the Lo, we can end up with strange constant folding through DAGCombine later. I've only seen this with constants being lowered to constant pools during lowering on RISC-V.	2023-09-18 09:10:19 -07:00
Craig Topper	f71a9e8bb7	[SelectionDAG][RISCV][PowerPC][X86] Use TargetConstant for immediates for ISD::PREFETCH. (#66601 ) The intrinsic uses ImmArg so TargetConstant would be consistent with how other intrinsics are handled. This hides the constants from type legalization so we can remove the promotion support. isel patterns are updated accordingly.	2023-09-18 08:58:50 -07:00
Simon Pilgrim	b2ffc867ad	[DAG] getNode() - begin generalizing the (zext (trunc (assertzext x))) -> (assertzext x) fold. We'll need to generalize this fold to check for any zero upperbits to address some of the D155472 regressions, but this exposes a number of issues. For now, just use the general MaskedValueIsZero test instead of the assertzext.	2023-09-18 15:32:31 +01:00
Jay Foad	d8d0588f66	[TwoAddressInstruction] Update LiveIntervals after INSERT_SUBREG with undef read (#66211 ) Update LiveIntervals after rewriting: %reg = INSERT_SUBREG undef %reg, %subreg, subidx to: undef %reg:subidx = COPY %subreg D113044 implemented this for the non-undef case.	2023-09-18 14:51:58 +01:00
Sergei Barannikov	caaf61eb6e	[SDag] Fold saddo[_carry] with bitwise-not argument to ssubo[_carry] (#66571 ) Fold `(saddo (not a), 1)` to `(ssubo 0, a)` and `(saddo_carry (not a), b, c)` to `(ssubo_carry b, a, !c)`. Proof: https://alive2.llvm.org/ce/z/Lj49YM This is the same as https://reviews.llvm.org/D46505 and https://reviews.llvm.org/D59208, but for signed opcodes.	2023-09-18 14:45:41 +03:00
Yingwei Zheng	e042ff7eef	[SDAG][RISCV] Avoid expanding is-power-of-2 pattern on riscv32/64 with zbb This patch adjusts the legality check for riscv to use `cpop/cpopw` since `isOperationLegal(ISD::CTPOP, MVT::i32)` returns false on rv64gc_zbb. Clang vs gcc: https://godbolt.org/z/rc3s4hjPh Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156390	2023-09-17 02:56:09 +08:00
Yingwei Zheng	b423e1f05d	[SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a constant rhs This patch avoids creating (sub x0, rhs) when lowering atomic_load_sub with a constant rhs. Comparison with GCC: https://godbolt.org/z/c5zPdP7j4 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158673	2023-09-16 17:09:41 +08:00
Philip Reames	09a5aac514	[TLI] Add extend as explicit parameter to shouldRemoveExtendFromGSIndex [nfc] Note: Reviewed as part of a stack of changes in PR# 66405.	2023-09-15 14:48:02 -07:00
Guozhi Wei	cbdccb30c2	[RA] Split a virtual register in cold blocks if it is not assigned preferred physical register If a virtual register is not assigned preferred physical register, it means some COPY instructions will be changed to real register move instructions. In this case we can try to split the virtual register in colder blocks, if success, the original COPY instructions can be deleted, and the new COPY instructions in colder blocks will be generated as register move instructions. It results in fewer dynamic register move instructions executed. The new test case split-reg-with-hint.ll gives an example, the hot path contains 24 instructions without this patch, now it is only 4 instructions with this patch. Differential Revision: https://reviews.llvm.org/D156491	2023-09-15 19:52:50 +00:00
Stephen Tozer	9811ffe7d0	[DebugInfo] Process single-location debug values in variadic form when producing DWARF Revision `c383f4d655` enabled using variadic-form debug values to represent single-location, non-stack-value debug values, and a further patch made all DBG_INSTR_REFs use variadic form. Not all code paths were updated correctly to handle the new syntax however, with entry values still expecting an expression that begins exactly DW_OP_LLVM_entry_value, 1. A function already exists to select non-variadic-like expressions; this patch adds an extra function to cheaply simplify such cases to non-variadic form, which we use prior to any entry-value processing to put DBG_INSTR_REFs and DBG_VALUEs down the same code path. We also use it for a few DIExpression functions that check for whether the first element(s) of a DIExpression match a particular pattern, so that they will return the same result for DIExpression(DW_OP_LLVM_arg, 0, <ops>) as for DIExpression(<ops>). Differential Revision: https://reviews.llvm.org/D158185	2023-09-15 19:07:44 +01:00
Jon Roelofs	003bcad9a8	[ARM] Always lower direct calls as direct when the outliner is enabled (#66434 ) The indirect lowering hinders the outliner's ability to see that sequences are in fact common, since the sequence similarity is rendered opaque by the register callee. The size savings from making them indirect seems to be dwarfed by the outliner's savings from de-duplication. rdar://115178034 rdar://115459865	2023-09-15 10:04:56 -07:00
Benjamin Kramer	3454cf67bd	Revert "[MachineLICM] Handle Subloops" This reverts commit `5ec9699c4d`. It accesses MI after it has been hoisted.	2023-09-15 13:20:31 +02:00
Martin Storsjö	7a91bbbb00	[GlobalISel] Check for unsupported Windows features on invoke (#65864 ) This matches what is done on calls, since `cc981d285d` (extended for another case in `5a751e747d`). Apply both those cases on invoke just like is done for call. Also update the preexisting comment which was left without update in `5a751e747d`. This fixes github issue #61941.	2023-09-15 11:14:40 +03:00
Orlando Cazalet-Hyams	7afc7db7fc	[Assignment Tracking] Trim assignments for untagged out of bounds stores (#66095 ) Fixes #65004 by trimming assignments from out of bounds stores (out of bounds of either the base variable or the backing alloca). If there's no overlap at all or the out of bounds access starts at a negative offset from the alloca, the assignment is simply skipped.	2023-09-15 09:10:53 +01:00
Arthur Eubanks	1feb00a28c	[X86] Introduce a large data threshold for the medium code model Currently clang's medium code model treats all data as large, putting them in a large data section and using more expensive instruction sequences to access them. Follow gcc's -mlarge-data-threshold, which allows putting data under a certain size in a normal data section as opposed to a large data section. This allows using cheaper code sequences to access some portion of data in the binary (which will be implemented in LLVM in a future patch). And under the medium code model, only put data above the large data threshold into large data sections, not all data. Reviewed By: MaskRay, rnk Differential Revision: https://reviews.llvm.org/D149288	2023-09-14 15:09:25 -07:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295 ) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Matthias Braun	b0c8c45423	Avoid BlockFrequency overflow problems (#66280) Multiplying raw block frequency with an integer carries a high risk of overflow. - Add `BlockFrequency::mul`, which returns a std::optional with the product or `nullopt` to indicate an overflow. - Fix two instances where overflow was likely.	2023-09-14 11:11:27 -07:00
Jingu Kang	5ec9699c4d	[MachineLICM] Handle Subloops Following discussion on https://reviews.llvm.org/D154205, make MachineLICM pass handle subloops with only visiting outermost loop's blocks once. Differential Revision: https://reviews.llvm.org/D154205	2023-09-14 18:07:31 +01:00
Nick Desaulniers	86735a4353	reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66264 ) reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) This reverts commit `ee643b706b`. Fix up build failures in targets I missed in #66003 Kept as 3 commits for reviewers to see better what's changed. Will squash when merging. - reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) - fix all the targets I missed in #66003 - fix off by one found by llvm/test/CodeGen/SystemZ/inline-asm-addr.ll	2023-09-13 13:31:24 -07:00
Philip Reames	61757fbd04	[DAG] Remove pointless peephole from refineUniformBase [nfc] No need to special case add 0, N. SelectionDAG::getNode contains the canonicalization and simplification for this case, so no need to duplicate it here.	2023-09-13 10:16:11 -07:00
Reid Kleckner	ee643b706b	Revert "[InlineAsm] wrap ConstraintCode in enum class NFC (#66003 )" This reverts commit `2ca4d13612`. Also revert the followup, "[InlineAsm] fix botched merge conflict resolution" This reverts commit `8b9bf3a9f7`. There were SystemZ and Mips build errors, too many to fix forward.	2023-09-13 09:58:02 -07:00
Philip Reames	2f005df066	[DAG][X86] Fold mgather/mscatter/etc with splat index (#65980 ) A splat index means the operation is reading from (writing to) the same memory location. Generally, zero is the cheapest value to splat. As such, we'd prefer to add the splatted value to the base, and use a constant zero as the index operand.	2023-09-13 09:26:30 -07:00
Nick Desaulniers	2ca4d13612	[InlineAsm] wrap ConstraintCode in enum class NFC (#66003 ) Similar to commit `2fad6e6985` ("[InlineAsm] wrap Kind in enum class NFC") Fix the TODOs added in commit `93bd428742` ("[InlineAsm] refactor InlineAsm class NFC (#65649)")	2023-09-13 08:48:09 -07:00
Simon Pilgrim	e6b85c3027	[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case (REAPPLIED) Followup to D59363 which failed to handle the icmp(X,undef) -> isTrueWhenEqual case - similar to llvm::ConstantFoldCompareInstruction As discussed on the review, this is affecting some previously reduced test cases, but will also prevent reductions from relying on this inconsistent behaviour in the future. Reapplied after reversion at `e1e3c75c7d` with a tweak to the pseudo-probe-peep.ll test Differential Revision: https://reviews.llvm.org/D158068	2023-09-13 12:33:39 +01:00
Simon Pilgrim	e1e3c75c7d	Revert rG6c56cf71ee82ec3a28e0dfc2b751bd10c16929da "[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case" Need to address a missed test change	2023-09-13 11:27:47 +01:00
Simon Pilgrim	6c56cf71ee	[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case Followup to D59363 which failed to handle the icmp(X,undef) -> isTrueWhenEqual case - similar to llvm::ConstantFoldCompareInstruction As discussed on the review, this is affecting some previously reduced test cases, but will also prevent reductions from relying on this inconsistent behaviour in the future. Differential Revision: https://reviews.llvm.org/D158068	2023-09-13 11:01:58 +01:00
XinWang10	8ebe1d1cc1	Revert "[NFC]Add assert to avoid possibly deref nullptr (#65564 )" (#66187 ) This reverts commit `99fb65fa7a` because it won't benefit.	2023-09-13 17:14:29 +08:00
Tobias Stadler	721b3d0a02	[GlobalISel] GISelKnownBits: forward unused depth parameter Actually pass along the depth parameter of getKnownBits to computeKnownBitsImpl. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D159321	2023-09-13 00:35:26 +02:00
Felipe de Azevedo Piovezan	bc5dac1743	[AsmPrinter][DwarfDebug] Skip vars with fragments in different location kinds The AsmPrinter currently assumes that a Debug Variable will have all of its fragments with the same "kind" of location (i.e. all in the stack or all in entry values). This is not enforced by the verifier, so it needs to be handled properly. Until we do so, we conservatively drop one of the fragments. Differential Revision: https://reviews.llvm.org/D159468	2023-09-12 11:09:29 -04:00
Allen	eaf23b2480	[GIsel][AArch64] Legalize <2 x i16> for G_INSERT_VECTOR_ELT (#65830) Widen the vector elements to 64 bits to make sure it is legal, instead of clamping the number of elements. Depends on D153394. Fixes https://github.com/llvm/llvm-project/issues/63826	2023-09-12 21:15:01 +08:00
Yingwei Zheng	4793c2c3de	[DAGCombiner][RISCV] Prefer to sext i32 non-negative values (#65984 ) By default, `DAGCombiner` folds `sext x` to `zext x` when `x` is non-negative. It will generate redundant `zext` inst seq on riscv64 (typically `slli (srli x, 32), 32`). godbolt: https://godbolt.org/z/osf6adP1o This patch applies the transform iff `zext` is cheaper than `sext`.	2023-09-12 19:02:35 +08:00
Valery Pykhtin	af2dcc3052	[AMDGPU] Handle inUndef flag in LiveVariables::recomputeForSingleDefVirtReg A register's use with isUndef flags shouldn't be considered as a point where the register is live. LiveVariables::runOnInstr ignores such uses. This was found when I tried to replace calls to SIOptimizeVGPRLiveRange::updateLiveRangeInThenRegion SIOptimizeVGPRLiveRange::updateLiveRangeInElseRegion with LiveVariables::recomputeForSingleDefVirtReg. In the testcase below %2 use is undef in the last REG_SEQUENCE. CodeGen/AMDGPU/si-opt-vgpr-liverange-bug-deadlanes.mir failed: Function Live Ins: $vgpr0 in %0 bb.0: successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%) liveins: $vgpr0 %0:vgpr_32 = COPY killed $vgpr0 %1:vgpr_32 = V_MOV_B32_e32 0, implicit $exec %2:vgpr_32 = BUFFER_LOAD_DWORD_OFFEN killed %0:vgpr_32, undef %5:sgpr_128, 0, 0, 0, 0, implicit $exec :: (dereferenceable invariant load (s32)) %3:sreg_64 = V_CMP_NE_U32_e64 0, %2:vgpr_32, implicit $exec %7:sreg_64 = SI_IF killed %3:sreg_64, %bb.2, implicit-def dead $exec, implicit-def dead $scc, implicit $exec S_BRANCH %bb.1 bb.1: ; predecessors: %bb.0 successors: %bb.2(0x80000000); %bb.2(100.00%) %8:vreg_128 = REG_SEQUENCE killed %1:vgpr_32, %subreg.sub0, %1:vgpr_32, %subreg.sub1, %1:vgpr_32, %subreg.sub2, undef %4.sub3:vreg_128, %subreg.sub3 bb.2: ; predecessors: %bb.0, %bb.1 successors: %bb.3(0x40000000), %bb.4(0x40000000); %bb.3(50.00%), %bb.4(50.00%) %9:vreg_128 = PHI undef %10:vreg_128, %bb.0, %8:vreg_128, %bb.1 %14:vgpr_32 = PHI %2:vgpr_32, %bb.0, undef %15:vgpr_32, %bb.1 %11:sreg_64 = SI_ELSE killed %7:sreg_64, %bb.4, implicit-def dead $exec, implicit-def dead $scc, implicit $exec S_BRANCH %bb.3 bb.3: ; predecessors: %bb.2 successors: %bb.4(0x80000000); %bb.4(100.00%) %12:vreg_128 = REG_SEQUENCE killed %14:vgpr_32, %subreg.sub0, %14:vgpr_32, %subreg.sub1, %14:vgpr_32, %subreg.sub2, undef %6:vgpr_32, %subreg.sub3 bb.4: ; predecessors: %bb.2, %bb.3 %13:vreg_128 = PHI %9:vreg_128, %bb.2, %12:vreg_128, %bb.3 SI_END_CF killed %11:sreg_64, implicit-def dead $exec, implicit-def dead $scc, implicit $exec dead %4:vreg_128 = REG_SEQUENCE killed %13.sub2:vreg_128, %subreg.sub0, %13.sub2:vreg_128, %subreg.sub1, %13.sub2:vreg_128, %subreg.sub2, undef %2:vgpr_32, %subreg.sub3 S_ENDPGM 0 * Bad machine code: LiveVariables: Block should not be in AliveBlocks * - function: _amdgpu_ps_main - basic block: %bb.1 (0x55e17ebd7100) Virtual register %2 is not needed live through the block. * Bad machine code: LiveVariables: Block should not be in AliveBlocks * - function: _amdgpu_ps_main - basic block: %bb.2 (0x55e17ebd7200) Virtual register %2 is not needed live through the block. * Bad machine code: LiveVariables: Block should not be in AliveBlocks * - function: _amdgpu_ps_main - basic block: %bb.3 (0x55e17ebd7300) Virtual register %2 is not needed live through the block. Differential Revision: https://reviews.llvm.org/D158167	2023-09-12 10:59:43 +02:00
Matt Arsenault	cd4b906e18	RegisterCoalescer: Don't delete IMPLICIT_DEF if it's live into the same block Live out implicit_defs need to be kept, but the check for this only checked if the block parent was the same. This doesn't work if the parent blocks are the same but the value is live. Fixes verifier error "Instruction ending live segment doesn't read the register", which would appear at the coalesced non-implicit_def def. Fixes #38788 https://reviews.llvm.org/D158882	2023-09-12 09:28:33 +03:00
Matt Arsenault	de5585078e	RegisterCoalescer: Correctly set valid lanes when keeping live out implicit defs This fixes some verifier errors when live out implicit defs are coalesced with identity copies. Fixes some reduced testcases from issue #38788 but doesn't solve the original failure. I was surprised this seems to obviate the special casing in analyzeValue that's been there since the subregister liveness support went in. https://reviews.llvm.org/D158850	2023-09-12 09:28:33 +03:00
liqin.weng	1eec357494	[VP] IR expansion for maxnum/minnum Add basic handling for VP ops that can expand to non-predicate ops Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D159494	2023-09-12 10:15:52 +08:00
Jeremy Morse	e54277fa10	[NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder This patch adds a two-argument SetInsertPoint method to IRBuilder that takes a block/iterator instead of an instruction, and updates many call sites to use it. The motivating reason for doing this is given here [0], we'd like to pass around more information about the position of debug-info in the iterator object. That necessitates passing iterators around most of the time. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152468	2023-09-11 20:01:19 +01:00
Kazu Hirata	232ab04812	[AsmPrinter] Fix an unused variable warning This patch fixes: llvm/lib/CodeGen/AsmPrinter/DwarfFile.cpp:109:10: error: unused variable 'Ret' [-Werror,-Wunused-variable]	2023-09-11 11:24:50 -07:00
Vitaly Buka	f106b3f135	Revert "[PHIElimination] Handle subranges in LiveInterval updates" Leaks memory. This reverts commit `3bff611068`.	2023-09-11 11:09:26 -07:00
Scott Linder	4333146195	[NFC][AsmPrinter] Use std::visit in constructVariableDIEImpl This potentially has a slightly positive performance impact, as std::visit can be implemented as a `switch`-like jump rather than a series of `if`s. More importantly, the reader can be confident there is no overlap between the cases. Differential Revision: https://reviews.llvm.org/D158678	2023-09-11 17:32:00 +00:00
Scott Linder	35e621f9ae	[NFC][AsmPrinter] Expose std::variant-ness of DbgVariable Differential Revision: https://reviews.llvm.org/D158677	2023-09-11 17:31:59 +00:00
Scott Linder	414ceffc9e	[NFC][AsmPrinter] Remove dead multi-MMI handling from DwarfFile::addScopeVariable Differential Revision: https://reviews.llvm.org/D158676	2023-09-11 17:31:59 +00:00
Scott Linder	58c108cde7	[NFC][AsmPrinter] Refactor DbgVariable as a std::variant Only a subset of the fields of DbgVariable are meaningful at any time, and some fields are re-used for multiple purposes (for example FrameIndexExprs is used with a throw-away frame-index of 0 to hold a single DIExpression without needing to add another member). The exact invariants must be reverse-engineered by inspecting the actual use of the class, its imprecise/outdated doc-comment, and some asserts. Refactor DbgVariable into a sum type by inheriting from std::variant. This makes the active fields for any given state explicit and removes the need to re-use fields in disparate contexts. As a bonus, it seems to reduce the size on my x86_64 linux box from 144 bytes to 96 bytes. There is some potential cost to `std::get` as it must check the active alternative even when context or an assert obviates it. To try to help ensure the compiler can optimize out the checks the patch also adds a helper `get` method which uses the noexcept `std::get_if`. Some of the extra cost would also be avoided more cleanly with a refactor that exposes the alternative types in the public interface, which will come in another patch. Differential Revision: https://reviews.llvm.org/D158675	2023-09-11 17:31:59 +00:00
Nick Desaulniers	93bd428742	[InlineAsm] refactor InlineAsm class NFC (#65649) I would like to steal one of these bits to denote whether a kind may be spilled by the register allocator or not, but I'm afraid to touch any of this code using bitwise operands. Make flags a first class type using bitfields, rather than launder data around via `unsigned`.	2023-09-11 09:27:37 -07:00
liqin.weng	3723ede3cf	[VP] IR expansion for zext/sext/trunc/fptosi/fptosi/sitofp/uitofp/fptrunc/fpext Add basic handling for VP ops that can expand to Cast intrinsics Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D159491	2023-09-11 21:14:38 +08:00
liqin.weng	28e74e6180	[VP] IR expansion for abs/smax/smin/umax/umin Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D159495	2023-09-11 21:14:37 +08:00
Jeremy Morse	6942c64e81	[NFC][RemoveDIs] Prefer iterator-insertion over instructions Continuing the patch series to get rid of debug intrinsics [0], instruction insertion needs to be done with iterators rather than instruction pointers, so that we can communicate information in the iterator class. This patch adds an iterator-taking insertBefore method and converts various call sites to take iterators. These are all sites where such debug-info needs to be preserved so that a stage2 clang can be built identically; it's likely that many more will need to be changed in the future. At this stage, this is just changing the spelling of a few operations, which will eventually become significant once the debug-info bearing iterator is used. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152537	2023-09-11 11:48:45 +01:00
Carl Ritson	3bff611068	[PHIElimination] Handle subranges in LiveInterval updates Add handling for subrange updates in LiveInterval preservation. This requires extending MachineBasicBlock::SplitCriticalEdge to also update subrange intervals. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D158144	2023-09-11 17:15:09 +09:00
Jay Foad	fd453e2381	[TwoAddressInstruction] Use member functions instead of static helpers This just avoids explicitly passing around common pointers like MRI and TII. NFC.	2023-09-10 18:33:57 +01:00
Amara Emerson	eaab3245d4	[GlobalISel] Add constant folding support for G_FMA/G_FMAD in the combiner. (#65659 )	2023-09-09 16:32:02 +08:00
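
The fold described in commit caaf61eb6e above, `(saddo (not a), 1)` to `(ssubo 0, a)` and `(saddo_carry (not a), b, c)` to `(ssubo_carry b, a, !c)`, can be sanity-checked by brute force at a small bit width. The sketch below is not LLVM code: the helper names are hypothetical, and it assumes each node returns the wrapped two's complement result together with a signed-overflow flag, with the carry or borrow treated as an extra 0 or 1.

```python
BITS = 8  # small width so every input can be checked exhaustively
INT_MIN = -(1 << (BITS - 1))
INT_MAX = (1 << (BITS - 1)) - 1


def wrap(value):
    """Reduce an unbounded integer to BITS-bit two's complement."""
    return (value - INT_MIN) % (1 << BITS) + INT_MIN


def with_overflow(exact):
    """Return (wrapped result, signed-overflow flag) for an exact result."""
    return wrap(exact), not (INT_MIN <= exact <= INT_MAX)


# Assumed models of the four nodes named in the commit message.
def saddo(a, b):
    return with_overflow(a + b)


def ssubo(a, b):
    return with_overflow(a - b)


def saddo_carry(a, b, carry):
    return with_overflow(a + b + int(carry))


def ssubo_carry(a, b, borrow):
    return with_overflow(a - b - int(borrow))


def folds_hold():
    """Check both rewrites for every BITS-bit input."""
    values = range(INT_MIN, INT_MAX + 1)
    simple = all(saddo(~a, 1) == ssubo(0, a) for a in values)
    carry = all(
        saddo_carry(~a, b, c) == ssubo_carry(b, a, not c)
        for a in values
        for b in values
        for c in (False, True)
    )
    return simple and carry
```

Under this model both sides compute the same exact value before wrapping (`~a + 1` equals `0 - a`, and `~a + b + c` equals `b - a - (1 - c)`), so the result and the overflow flag agree. The Alive2 link in the commit message remains the authoritative proof.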

34645 Commits