clang-p2996

Author	SHA1	Message	Date
Youngsuk Kim	eed067e9fb	[llvm] Remove no-op ptr-to-ptr bitcasts (NFC) Opaque ptr cleanup effort (NFC).	2023-11-13 14:33:41 -06:00
Youngsuk Kim	876236023c	[llvm] Remove no-op ptr-to-ptr bitcasts (NFC) (#72133 ) Opaque ptr cleanup effort (NFC).	2023-11-13 13:05:27 -05:00
Valery Pykhtin	f054947c0d	[SimplifyCFG] Prevent merging cbranch to cbranch if the branch probability from the first to second is too low. (#69375 ) AMDGPU target has faced the situation which can be illustrated with the following testcase: define void @dont_merge_cbranches(i32 %V) { %divergent_cond = icmp ne i32 %V, 0 %uniform_cond = call i1 @uniform_result(i1 %divergent_cond) br i1 %uniform_cond, label %bb2, label %exit, !prof !0 bb2: br i1 %divergent_cond, label %bb3, label %exit bb3: call void @bar( ) br label %exit exit: ret void } !0 = !{!"branch_weights", i32 1, i32 100000} SimplifyCFG merges branches on %uniform_cond and %divergent_cond, which is undesirable because the first branch to bb2 is taken extremely rarely and the second branch is expensive. The merged branch becomes as expensive as the second. This patch prevents such merging if the branch to the second branch is unlikely to happen.	2023-11-13 15:37:55 +01:00
Kazu Hirata	22b0f7ba6e	[Transforms] Include llvm/ADT/SmallSet.h (NFC) This patch adds #include "llvm/ADT/SmallSet.h" to a couple of files that are relying on transitive includes of SmallSet.h. It in turn unblocks the removal of unnecessary includes of llvm/ADT/SmallSet.h in several other files.	2023-11-11 12:25:39 -08:00
Nikita Popov	192e7d3d52	[IRBuilder] Add IsNonNeg param to CreateZExt() (NFC)	2023-11-10 12:00:34 +01:00
Chuanqi Xu	b7b5907b56	[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014 ) Close https://github.com/llvm/llvm-project/issues/56980. This patch tries to introduce a light-weight optimization attribute for coroutines which are guaranteed to only be destroyed after it reached the final suspend. The rationale behind the patch is simple. See the example: ```C++ A foo() { dtor d; co_await something(); dtor d1; co_await something(); dtor d2; co_return 43; } ``` Generally the generated .destroy function may be: ```C++ void foo.destroy(foo.Frame frame) { switch(frame->suspend_index()) { case 1: frame->d.~dtor(); break; case 2: frame->d.~dtor(); frame->d1.~dtor(); break; case 3: frame->d.~dtor(); frame->d1.~dtor(); frame->d2.~dtor(); break; default: // coroutine completed or haven't started break; } frame->promise.~promise_type(); delete frame; } ``` Since the compiler need to be ready for all the cases that the coroutine may be destroyed in a valid state. However, from the user's perspective, we can understand that certain coroutine types may only be destroyed after it reached to the final suspend point. And we need a method to teach the compiler about this. Then this is the patch. After the compiler recognized that the coroutines can only be destroyed after complete, it can optimize the above example to: ```C++ void foo.destroy(foo.Frame frame) { frame->promise.~promise_type(); delete frame; } ``` I spent a lot of time experimenting and experiencing this in the downstream. The numbers are really good. In a real-world coroutine-heavy workload, the size of the build dir (including .o files) reduces 14%. And the size of final libraries (excluding the .o files) reduces 8% in Debug mode and 1% in Release mode.	2023-11-09 14:42:07 +08:00
Allen	7ec86f4d68	[SimplifyCFG] Fix the compile crash for invalid upper bound value (#71351 ) Fix the crash for the last land PR70542. Note: For '%add = add nuw i32 %x, 1', we can only infer the LowerBound is 1, but the UpperBound is wrapped to 0 in computeConstantRange. so we can't assume the UpperBound is valid bound when its value is 0. Fix https://github.com/llvm/llvm-project/issues/71329. Reviewed By: zmodem, nikic	2023-11-09 12:33:24 +08:00
Jeremy Morse	f1b0a54451	Reapply `7d77bbef4a`, adding new debug-info classes This reverts commit `957efa4ce4`. Original commit message below -- in this follow up, I've shifted un-necessary inclusions of DebugProgramInstruction.h into being forward declarations (fixes clang-compile time I hope), and a memory leak in the DebugInfoTest.cpp IR unittests. I also tracked a compile-time regression in D154080, more explanation there, but the result of which is hiding some of the changes behind the EXPERIMENTAL_DEBUGINFO_ITERATORS compile-time flag. This is tested by the "new-debug-iterators" buildbot. [DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info This patch adds a variety of classes needed to record variable location debug-info without using the existing intrinsic approach, see the rationale at [0]. The two added files and corresponding unit tests are the majority of the plumbing required for this, but at this point isn't accessible from the rest of LLVM as we need to stage it into the repo gently. An overview is that classes are added for recording variable information attached to Real (TM) instructions, in the form of DPValues and DPMarker objects. The metadata-uses of DPValues is plumbed into the metadata hierachy, and a field added to class Instruction, which are all stimulated in the unit tests. The next few patches in this series add utilities to convert to/from this new debug-info format and add instruction/block utilities to have debug-info automatically updated in the background when various operations occur. This patch was reviewed in Phab in D153990 and D154080, I've squashed them together into this commit as there are dependencies between the two patches, and there's little profit in landing them separately. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939	2023-11-08 16:42:35 +00:00
Markos Horro	9d2903c8e5	[IndVars] Add check of loop invariant for trunc instructions (#71072 ) The same idea as in `34d380e1f6`, but considering truncation instructions. Improvement for #59633.	2023-11-08 11:16:23 +00:00
Vladislav Dzhidzhoev	6beddd668a	Revert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" This caused assert: llvm/llvm/lib/CodeGen/AsmPrinter/DwarfFile.cpp:110: void llvm::DwarfFile::addScopeVariable(LexicalScope , DbgVariable ): Assertion `Ret.second' failed. See comments https://reviews.llvm.org/D144006#4656350. This reverts commit `3b449bd46a`.	2023-11-08 00:29:24 +01:00
Paulo Matos	7b9d73c2f9	[NFC] Remove Type::getInt8PtrTy (#71029 ) Replace this with PointerType::getUnqual(). Followup to the opaque pointer transition. Fixes an in-code TODO item.	2023-11-07 17:26:26 +01:00
Philip Reames	551c280cfd	[indvars] Always fallback to truncation if AddRec widening fails (#70967 ) The current code structure results in cases where if a) we can't clone the IV user (because it's not in our whitelist) or b) can't prove the SCEV expressions are identical, we'd sometimes leave both the original unwidened IV and the partially widened IV in code. Instead, just truncate the wide IV to the use - same as what we'd do if we couldn't find an addrec to start with. Noticed this while playing with changing how we produce addrecs. The current structure results in a very tight interlock between SCEV's internal capabilities and indvars code.	2023-11-07 07:49:39 -08:00
Hans Wennborg	05ed92127c	Revert "Reland [SimplifyCFG] Delete the unnecessary range check for small mask operation (#70542 )" This caused https://github.com/llvm/llvm-project/issues/71329 > Fix the compile crash when the default result has no result for > https://github.com/llvm/llvm-project/pull/65835 > > Fixes https://github.com/llvm/llvm-project/issues/65120 > Reviewed By: zmodem, nikic This reverts commit `7c4180a36a`.	2023-11-07 10:53:22 +01:00
Simon Pilgrim	3ca4fe80d4	[Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC. startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)	2023-11-06 16:50:18 +00:00
Nikita Popov	be3cef0b2a	[LibCallsShrinkWrap] Avoid use of ConstantExpr::getFPExtend() (NFC) Use the constant folding API instead.	2023-11-06 15:38:42 +01:00
Nikita Popov	a682a9cfd0	Revert "Port Swift's merge function pass to llvm: merging functions that differ in constants (#68235 )" This reverts commit `19b5495b65`. PR landed without approval, with severe quality issues.	2023-11-03 21:15:46 +01:00
Philip Reames	5adf6ab7ff	Revert "[IndVars] Generate zext nneg when locally obvious" This reverts commit `a6c8e27b3a`. It appears likely to have caused https://lab.llvm.org/buildbot/#/builders/57/builds/30988.	2023-11-03 11:19:14 -07:00
Manman Ren	19b5495b65	Port Swift's merge function pass to llvm: merging functions that differ in constants (#68235 ) See RFC for details: https://discourse.llvm.org/t/rfc-for-moving-swift-s-merge-function-pass-to-llvm/73778 We will need to refactor extension to FunctionComparator/FunctionHash to StructuralHash. This patch adds a new pass which is ported from Swift, and will need to discuss on how to migrate Swift’s pass over after we land this in llvm. Create this PR to get some early review on the patch. --------- Co-authored-by: Manman Ren <mren@meta.com>	2023-11-03 11:13:58 -07:00
Philip Reames	7c93452e17	[indvars] Restructure getExtendedOperandRecurrence [nfc] As suggested during review of https://github.com/llvm/llvm-project/pull/70990.	2023-11-03 10:50:57 -07:00
Philip Reames	1ffea97ffd	[indvars] Support known positive extends in getExtendedOperandRecurrence (#70990 ) IndVars has the existing notion of a narrow definition which is known to be positive and thus both sign and zero extension kinds are actually the same operations. There's existing logic for forming a SCEV based on the extension kind and the no-wrap flags. This change extends that logic to form the opposite extension kind for a positive def if doing so is allowed by the flags. Note that we already do something analogous for the getWideRecurrence case as well.	2023-11-03 10:21:30 -07:00
Philip Reames	a6c8e27b3a	[IndVars] Generate zext nneg when locally obvious zext nneg was recently added to the IR in #67982. This patch teaches SimplifyIndVars to prefer zext nneg over both sext and plain zext, when a local SCEV query indicates the source is non-negative. The choice to prefer zext nneg over sext looks slightly aggressive here, but probably isn't so much in practice. For cases where we'd "remember" the range fact, instcombine would convert the sext into a zext nneg anyways. The only cases where this produces a different result overall are when SCEV knows a non-local fact, and it doesn't get materialized into the IR. Those are exactly the cases where using zext nneg are most useful. We do run the risk of e.g. a missing combine - since we haven't updated most of them yet - but that seems like a manageable risk. Note that there are much deeper algorithmic changes we could make to this code to exploit zext nneg, but this seemed like a reasonable and low risk starting point.	2023-11-03 09:20:59 -07:00
Allen	7c4180a36a	Reland [SimplifyCFG] Delete the unnecessary range check for small mask operation (#70542 ) Fix the compile crash when the default result has no result for https://github.com/llvm/llvm-project/pull/65835 Fixes https://github.com/llvm/llvm-project/issues/65120 Reviewed By: zmodem, nikic	2023-11-03 09:12:29 +08:00
spupyrev	cebc837937	[CodeLayout] Pre-process execution counts before layout (#70501 ) BOLT fails to process binaries in non-LBR mode, as some blocks marked as having a zero execution count. Adjusting code layout to process such blocks without assertions. This is NFC for all other use cases.	2023-11-02 12:08:33 -07:00
Jeremy Morse	957efa4ce4	Revert "[DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info" And some intervening fixups. There are two remaining problems: * A memory leak via https://lab.llvm.org/buildbot/#/builders/236/builds/7120/steps/10/logs/stdio * A performance slowdown with -g where I'm not completely sure what the cause is These might be fairly straightforward to fix, but it's the end of the day here, so I figure I'll clear the buildbots til tomorrow. This reverts commit `7d77bbef4a`. This reverts commit `9026f35afe`. This reverts commit `d97b2b389a`.	2023-11-02 17:41:36 +00:00
Vladislav Dzhidzhoev	3b449bd46a	[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7) RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544 Similar to imported declarations, the patch tracks function-local types in DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with the aforementioned metadata change and provided a support of function-local types scoped within a lexical block. The patch assumes that DICompileUnit's 'enums field' no longer tracks local types and DwarfDebug would assert if any locally-scoped types get placed there. Reviewed By: jmmartinez Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com> Differential Revision: https://reviews.llvm.org/D144006	2023-11-02 17:44:52 +01:00
Jeremy Morse	7d77bbef4a	[DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info This patch adds a variety of classes needed to record variable location debug-info without using the existing intrinsic approach, see the rationale at [0]. The two added files and corresponding unit tests are the majority of the plumbing required for this, but at this point isn't accessible from the rest of LLVM as we need to stage it into the repo gently. An overview is that classes are added for recording variable information attached to Real (TM) instructions, in the form of DPValues and DPMarker objects. The metadata-uses of DPValues is plumbed into the metadata hierachy, and a field added to class Instruction, which are all stimulated in the unit tests. The next few patches in this series add utilities to convert to/from this new debug-info format and add instruction/block utilities to have debug-info automatically updated in the background when various operations occur. This patch was reviewed in Phab in D153990 and D154080, I've squashed them together into this commit as there are dependencies between the two patches, and there's little profit in landing them separately. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939	2023-11-02 12:44:53 +00:00
David Sherwood	07f0e75b53	[LoopVectorize] Fix bug with code to hoist runtime checks (#70937 ) There was a silly mistake in the expandBounds function that was using the wrong type when calling expandCodeFor and always assuming the stride is 64 bits. I've added the following test to defend this fix: Transforms/LoopVectorize/ARM/mve-hoist-runtime-checks.ll	2023-11-02 10:02:50 +00:00
Jacob Lambert	2b898afdef	[llvm] Add comment and assert for CloneModule edge case (#67734 ) CloneModule is not currently designed to handle un-materialized Modules, for example one created via a lazy initializer like getLazyBitcodeModule(). In this case we get a somewhat cryptic segmentation fault without a clear path forward. In this patch, we add a comment to inform CloneModule users of this shortcoming, and an assert to test for empty function bodies before the segmentation fault is triggered.	2023-11-01 10:05:11 -07:00
Nikita Popov	b87110e298	[SimplifyCFG] Avoid use of ConstantExpr::getIntegerCast() (NFC) We're working on a ConstantInt here, so constant folding will always succeed. Just avoid using the ConstantExpr API.	2023-11-01 11:55:11 +01:00
Nikita Popov	6b8ed78719	[IR] Add writable attribute This adds a writable attribute, which in conjunction with dereferenceable(N) states that a spurious store of N bytes is introduced on function entry. This implies that this many bytes are writable without trapping or introducing data races. See https://llvm.org/docs/Atomics.html#optimization-outside-atomic for why the second point is important. This attribute can be added to sret arguments. I believe Rust will also be able to use it for by-value (moved) arguments. Rust likely won't be able to use it for &mut arguments (tree borrows does not appear to allow spurious stores). In this patch the new attribute is only used by LICM scalar promotion. However, the actual motivation for this is to fix a correctness issue in call slot optimization, which needs this attribute to avoid optimization regressions. Followup to the discussion on D157499. Differential Revision: https://reviews.llvm.org/D158081	2023-11-01 10:46:31 +01:00
Philip Reames	a78f5c0649	[IndVars] Use IRBuilder in eliminateTrunc [nfc-ish] (#70836 ) Mostly a cleanup so that we don't need to manually emit instructions, and can eagerly constant fold where relevant.	2023-10-31 14:37:57 -07:00
Aleksandr Popov	483e92468e	[NFC] Extract LoopConstrainer from IRCE to reuse it outside the pass (#70508 ) Co-authored-by: Aleksandr Popov <apopov@azul.com>	2023-10-31 18:16:59 +01:00
Philip Reames	f8742b8d6a	[SCEV] Teach SCEVExpander to use zext nneg when possible (#70815 ) zext nneg was recently added to the IR in #67982. Teaching SCEVExpander to emit nneg when possible is valuable since SCEV may have proved non-trivial facts about loop bounds which would otherwise be lost when materializing the value.	2023-10-31 09:33:07 -07:00
Aleksandr Popov	e8d5db206c	[LoopPeeling] Fix weights updating of peeled off branches (#70094 ) In https://reviews.llvm.org/D64235 a new algorithm has been introduced for updating the branch weights of latch blocks and their copies. It increases the probability of going to the exit block for each next peel iteration, calculating weights by (F - I * E, E), where: - F is a weight of the edge from latch to header. - E is a weight of the edge from latch to exit. - I is a number of peeling iteration. E.g: Let's say the latch branch weights are (100,300) and the estimated trip count is 4. If we peel off all 4 iterations the weights of the copied branches will be: 0: (100,300) 1: (100,200) 2: (100,100) 3: (100,1) https://godbolt.org/z/93KnoEsT6 So we make the original loop almost unreachable from the 3rd peeled copy according to the profile data. But that's only true if the profiling data is accurate. Underestimated trip count can lead to a performance issues with the register allocator, which may decide to spill intervals inside the loop assuming it's unreachable. Since we don't know how accurate the profiling data is, it seems better to set neutral 1/1 weights on the last peeled latch branch. After this change, the weights in the example above will look like this: 0: (100,300) 1: (100,200) 2: (100,100) 3: (100,100) Co-authored-by: Aleksandr Popov <apopov@azul.com>	2023-10-31 14:02:42 +01:00
Craig Topper	b1c59b516c	[SCCP] Infer nneg on zext when forming from non-negative sext. (#70730 ) Builds on #67982 which recently introduced the nneg flag on a zext instruction.	2023-10-30 15:07:22 -07:00
XChy	fc6bdb8549	[SimplifyCFG] Reland transform for redirecting phis between unmergeable BB and SuccBB (#68473 ) Reland #67275 with #68953 resolved.	2023-10-28 17:10:20 +08:00
Fangrui Song	8e247b8f47	Replace TypeSize::{getFixed,getScalable} with canonical TypeSize::{Fixed,Scalable}. NFC	2023-10-27 00:30:41 -07:00
spupyrev	f61179f812	[CodeLayout] Changed option names cds to cdsort (#69668 ) Renaming cds-> cdsort for consistency. This is NFC unless somebody uses older names	2023-10-26 18:10:30 -07:00
Allen	851338b126	Revert "[SimplifyCFG] Delete the unnecessary range check for small mask operation (#70324 ) This reverts commit `5e07481d42`.	2023-10-26 20:39:24 +08:00
zhongyunde 00443407	5e07481d42	[SimplifyCFG] Delete the unnecessary range check for small mask operation When the small mask value is less than 64, we can eliminate the check for the upper limit of the range by enlarging the lookup table size to the maximum index value. (Then the final table size grows to the next pow2 value) ``` bool f(unsigned x) { switch (x % 8) { case 0: return 1; case 1: return 0; case 2: return 0; case 3: return 1; case 4: return 1; case 5: return 0; case 6: return 1; // This would remove the range check: case 7: return 0; } return 0; } ``` Use WouldFitInRegister instead of fitsInLegalInteger to support more result types besides bool. Fixes https://github.com/llvm/llvm-project/issues/65120 Reviewed By: zmodem, nikic, RKSimon	2023-10-26 19:01:22 +08:00
Youngsuk Kim	4c60c0cb4e	[LowerMemIntrinsics] Remove no-op ptr-to-ptr bitcasts (NFC) Remove ptr-to-ptr bitcasts, which are unnecessary with opaque pointers enabled. Opaque pointer clean-up effort. NFC.	2023-10-25 16:23:58 -05:00
Alina Sbirlea	d0584e248d	[CodeLayout] Update to resolve Wdangling warning. Change `cc2fbc648d` introduced -Wdangling warning, use temporaries to resolve. llvm/lib/Transforms/Utils/CodeLayout.cpp:764:27: error: temporary whose address is used as value of local variable '[minDensity, maxDensity]' will be destroyed at the end of the full-expression [-Werror,-Wdangling] 764 | std::minmax(ChainPred->density(), ChainSucc->density()); llvm/lib/Transforms/Utils/CodeLayout.cpp:764:49: error: temporary whose address is used as value of local variable '[minDensity, maxDensity]' will be destroyed at the end of the full-expression [-Werror,-Wdangling] 764 | std::minmax(ChainPred->density(), ChainSucc->density());	2023-10-25 11:31:48 -07:00
spupyrev	cc2fbc648d	[CodeLayout] Faster basic block reordering, ext-tsp (#68617 ) Aggressive inlining might produce huge functions with >10K of basic blocks. Since BFI treats _all_ blocks and jumps as "hot" having non-negative (but perhaps small) weight, the current implementation can be slow, taking minutes to produce an layout. This change introduces a few modifications that significantly (up to 50x on some instances) speeds up the computation. Some notable changes: - reduced the maximum chain size to 512 (from the prior 4096); - introduced MaxMergeDensityRatio param to avoid merging chains with very different densities; - dropped a couple of params that seem unnecessary. Looking at some "offline" metrics (e.g., the number of created fall-throughs), there shouldn't be problems; in fact, I do see some metrics go up. But it might be hard/impossible to measure perf difference for such small changes. I did test the performance clang-14 binary and do not record a perf or i-cache-related differences. My 5 benchmarks, with ext-tsp runtime (the lower the better) and "tsp-score" (the higher the better). 
Before: - benchmark 1: num functions: 13,047 reordering running time is 2.4 seconds score: 125503458 (128.3102%) - benchmark 2: num functions: 16,438 reordering running time is 3.4 seconds score: 12613997277 (129.7495%) - benchmark 3: num functions: 12,359 reordering running time is 1.9 seconds score: 1315881613 (105.8991%) - benchmark 4: num functions: 96,588 reordering running time is 7.3 seconds score: 89513906284 (100.3413%) - benchmark 5: num functions: 1 reordering running time is 372 seconds score: 21292505965077 (99.9979%) - benchmark 6: num functions: 71,155 reordering running time is 314 seconds score: 29795381626270671437824 (102.7519%) After: - benchmark 1: reordering running time is 2.2 seconds score: 125510418 (128.3130%) - benchmark 2: reordering running time is 2.6 seconds score: 12614502162 (129.7525%) - benchmark 3: reordering running time is 1.6 seconds score: 1315938168 (105.9024%) - benchmark 4: reordering running time is 4.9 seconds score: 89518095837 (100.3454%) - benchmark 5: reordering running time is 4.8 seconds score: 21292295939119 (99.9971%) - benchmark 6: reordering running time is 104 seconds score: 29796710925310302879744 (102.7565%)	2023-10-25 07:52:26 -07:00
Kazu Hirata	f9306f6de3	[ADT] Rename llvm::erase_value to llvm::erase (NFC) (#70156 ) C++20 comes with std::erase to erase a value from std::vector. This patch renames llvm::erase_value to llvm::erase for consistency with C++20. We could make llvm::erase more similar to std::erase by having it return the number of elements removed, but I'm not doing that for now because nobody seems to care about that in our code base. Since there are only 50 occurrences of erase_value in our code base, this patch replaces all of them with llvm::erase and deprecates llvm::erase_value.	2023-10-24 23:03:13 -07:00
Ruiling, Song	ac24238002	[LowerSwitch] Don't let pass manager handle the dependency (#68662 ) Some passes have the limitation that they only support simple terminators: branch/unreachable/return. Right now, they ask the pass manager to add the LowerSwitch pass to eliminate `switch`. Let's manage this kind of pass dependency ourselves. Also add the assertion in the related passes.	2023-10-25 09:24:36 +08:00
Benjamin Kramer	eb67b34740	[IPSCCP] Don't crash on ptrtoint	2023-10-24 14:14:39 +02:00
Carlos Alberto Enciso	f3b20cb16a	[IPSCCP] Variable not visible at Og. (#66745 ) https://bugs.llvm.org/show_bug.cgi?id=51559 https://github.com/llvm/llvm-project/issues/50901 IPSCCP pass removes the global variable and does not create a constant expression for the initializer value.	2023-10-24 06:22:18 +01:00
Sam Clegg	e01c7d54b4	[LowerGlobalDtors] Skip __cxa_atexit call completely when arg0 is unused (#68758 ) In emscripten we have a build mode (the default actually) where the runtime never exits and therefore `__cxa_atexit` is a dummy/stub function that does nothing. In this case we would like to be able completely DCE any otherwise-unused global dtor functions. Fixes: https://github.com/emscripten-core/emscripten/issues/19993	2023-10-23 10:08:08 -07:00
Fangrui Song	a24418375a	[CodeLayout] cache-directed sort: limit max chain size (#69039 ) When linking an executable with a slightly larger executable, ld.lld --call-graph-profile-sort=cdsort can be very slow (see #68638). ``` 4.6% 20.7Mi .text.hot 3.5% 15.9Mi .text 3.4% 15.2Mi .text.unknown ``` Add cl option `cdsort-max-chain-size`, which is similar to `ext-tsp-max-chain-size`, and set it to 128, to improve performance. In `ld.lld @response.txt --threads=4 --call-graph-profile-sort=cdsort --time-trace` builds, the "Total Sort sections" time is measured as follows: * -mllvm -cdsort-max-chain-size=64: 1.321813 * -mllvm -cdsort-max-chain-size=128: 2.030425 * -mllvm -cdsort-max-chain-size=256: 2.927684 * -mllvm -cdsort-max-chain-size=512: 5.493106 * unlimited: 9 minutes The remaining part takes 6.8s.	2023-10-22 16:50:03 -07:00
Kazu Hirata	9c5a5a421d	[llvm] Stop including llvm/ADT/iterator_range.h (NFC) Identified with misc-include-cleaner.	2023-10-22 15:41:18 -07:00
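
The branch-weight arithmetic described in commit e8d5db206c ([LoopPeeling] Fix weights updating of peeled off branches) can be reproduced with a small sketch. This is not LLVM's code: `peeledLatchWeights` and its parameters are hypothetical names chosen for illustration. It assumes the pairs in that commit's example are printed as (exit weight, header weight), which is the reading under which F = 300 and E = 100 match the stated trip count of 4. The `Floor` parameter models the only difference between the two listings: a floor of 1 gives the old weights, and a floor equal to E gives the new ones.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not an LLVM API): computes the (exit, header) weight
// pair for each peeled copy of the latch branch.
//   HeaderWeight: F, weight of the edge from latch to header.
//   ExitWeight:   E, weight of the edge from latch to exit.
//   PeelCount:    number of peeled iterations.
//   Floor:        smallest header weight allowed for a peeled copy.
std::vector<std::pair<int64_t, int64_t>>
peeledLatchWeights(int64_t HeaderWeight, int64_t ExitWeight, int PeelCount,
                   int64_t Floor) {
  assert(PeelCount >= 0 && "peel count must not be negative");
  std::vector<std::pair<int64_t, int64_t>> Result;
  for (int I = 0; I < PeelCount; ++I) {
    // F - I * E, as given in the commit message.
    int64_t Remaining = HeaderWeight - I * ExitWeight;
    // Clamp so the header edge never drops below the chosen floor.
    Result.emplace_back(ExitWeight, std::max(Remaining, Floor));
  }
  return Result;
}
```

Under these assumptions, `peeledLatchWeights(300, 100, 4, 1)` yields the weights listed before the change, ending in (100,1), and `peeledLatchWeights(300, 100, 4, 100)` yields the weights listed after it, ending in the neutral (100,100).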

7098 Commits