clang-p2996

Author	SHA1	Message	Date
Congzhe	b0662a7a7d	[CodeMoverUtils] Enhance CodeMoverUtils to sink an entire BB (#87857) When moving an entire basic block after `InsertPoint`, we currently check for each instruction whether its users are dominated by `InsertPoint`. This can be improved: even if a user is not dominated by `InsertPoint`, the move is safe as long as the user appears as a subsequent instruction in the same BB. This patch is similar to commit `751be2a064`, which enhanced hoisting an entire BB; this patch enhances sinking an entire BB. Please refer to the added functionality in test case `llvm/unittests/Transforms/Utils/CodeMoverUtilsTest.cpp` that was not supported without this patch.	2024-04-10 00:28:21 -04:00
Florian Hahn	c836983671	[VPlan] Remove unused first mask op from VPBlendRecipe. (#87770) VPBlendRecipe does not use the first mask operand. Removing it allows VPlan-based DCE to remove unused mask computations. This also fixes #87410, where unused Not VPInstructions are considered having only their first lane demanded, but some of their operands providing a vector value due to other users. Fixes https://github.com/llvm/llvm-project/issues/87410 PR: https://github.com/llvm/llvm-project/pull/87770	2024-04-09 11:14:05 +01:00
Florian Hahn	15d11a4de9	[VPlan] Track IsOrdered in VPReductionRecipe, remove use of ILV (NFCI). Rather than using ILV.useOrderedReductions during ::execute, store the information at recipe construction. Another step towards making the recipes' ::execute independent of legacy ILV.	2024-04-07 20:33:22 +01:00
Florian Hahn	e701c1a653	[VPlan] Use recipe's debug loc for VPWidenMemoryInstructionRecipe (NFCI) Now that VPRecipeBase manages debug locations for recipes, use it in VPWidenMemoryInstructionRecipe.	2024-04-01 12:07:30 +01:00
Florian Hahn	8a614c1d31	[VPlan] Rename getVPValueOrAddLiveIn -> getOrAddLiveIn (NFCI). The helper now only deals with live-ins, clarify the name.	2024-03-28 21:02:15 +00:00
Stephen Tozer	75dfa58ea9	[RemoveDIs][NFC] Rename DPMarker->DbgMarker (#85931) Another trivial rename patch, the last big one for now, which renamed DPMarkers to DbgMarkers. This required the field `DbgMarker` in `Instruction` to be renamed to `DebugMarker` to avoid a clash, but otherwise was a simple string substitution of `s/DPMarker/DbgMarker` and a manual renaming of `DPM` to `DM` in the few places where that acronym was used for debug markers.	2024-03-20 16:00:10 +00:00
Stephen Tozer	ffd08c7759	[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216) This is the major rename patch that prior patches have built towards. The DPValue class is being renamed to DbgVariableRecord, which reflects the updated terminology for the "final" implementation of the RemoveDI feature. This is a pure string substitution + clang-format patch. The only manual component of this patch was determining where to perform these string substitutions: `DPValue` and `DPV` are almost exclusively used for DbgRecords, except for: - llvm/lib/target, where 'DP' is used to mean double-precision, and so appears as part of .td files and in variable names. NB: There is a single existing use of `DPValue` here that refers to debug info, which I've manually updated. - llvm/tools/gold, where 'LDPV' is used as a prefix for symbol visibility enums. Outside of these places, I've applied several basic string substitutions, with the intent that they only affect DbgRecord-related identifiers; I've checked them as I went through to verify this, with reasonable confidence that there are no unintended changes that slipped through the cracks. The substitutions applied are all case-sensitive, and are applied in the order shown: ``` DPValue -> DbgVariableRecord DPVal -> DbgVarRec DPV -> DVR ``` Following the previous rename patches, it should be the case that there are no instances of any of these strings that are meant to refer to the general case of DbgRecords, or anything other than the DPValue class. The idea behind this patch is therefore that pure string substitution is correct in all cases as long as these assumptions hold.	2024-03-19 20:07:07 +00:00
Stephen Tozer	360da83858	[RemoveDI][NFC] Rename DPValue->DbgRecord in comments and varnames (#84939) This patch continues the ongoing rename work, replacing DPValue with DbgRecord in comments and the names of variables, both members and fn-local. This is the most labour-intensive part of the rename, as it is where the most decisions have to be made about whether a given comment or variable is referring to DPValues (equivalent to debug variable intrinsics) or DbgRecords (a catch-all for all debug intrinsics); these decisions are not individually difficult, but comprise a fairly large amount of text to review. This patch still largely performs basic string substitutions followed by clang-format; there are almost* no places where, for example, a comment has been expanded or modified to reflect the semantic difference between DPValues and DbgRecords. I don't believe such a change is generally necessary in LLVM, but it may be useful in the docs, and so I'll be submitting docs changes as a separate patch. *In a few places, `dbg.values` was replaced with `debug intrinsics`.	2024-03-13 16:39:35 +00:00
Nikita Popov	20b15e645c	[Tests] Drop inrange attribute from some tests (NFC) These don't actually test anything related to inrange, so drop the attribute.	2024-03-13 11:49:16 +01:00
Stephen Tozer	15f3f446c5	[RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords (#84793) As part of the effort to rename the DbgRecord classes, this patch renames the widely-used functions that operate on DbgRecords but refer to DbgValues or DPValues in their names to refer to DbgRecords instead; all such functions are defined in one of `BasicBlock.h`, `Instruction.h`, and `DebugProgramInstruction.h`. This patch explicitly does not change the names of any comments or variables, except for where they use the exact name of one of the renamed functions. The reason for this is reviewability; this patch can be trivially examined to determine that the only changes are direct string substitutions and any results from clang-format responding to the changed line lengths. Future patches will cover renaming variables and comments, and then renaming the classes themselves.	2024-03-12 14:53:13 +00:00
calebwat	22cf983387	[VPlan] Use opaque pointers in VPlan unit test IR (#69947) Updates the unit tests for VPlan to use opaque pointers in strings containing LLVM IR. This is to match the similar adjustments being made for lit tests to use opaque pointers.	2024-02-21 11:38:26 -08:00
Florian Hahn	9923d29cfa	[VPlan] Merge main VPlan verifier with HCFG verifier. Unify VPlan verifiers in verifyVPlanIsValid. This adds verification for various properties on blocks to the verifier used for VPlans generated by the inner loop vectorizer. It also adds def-use checks for the verifier used in the VPlan native path. This drops the separate flag to enable HCFG verification. Instead, all VPlans are verified once they have been created, if assertions are enabled. This also removes VPWidenPHIRecipe from VPHeaderPHIRecipe; it is used to model any phi node in the native path.	2024-02-20 16:43:57 +00:00
Jeremy Morse	66d4fe97d8	[DebugInfo][RemoveDIs] Final final test-maintenance patch (#80988) This should be the final portion of shaping-up the test suite to be ready for turning on non-intrinsic debug-info: * Pin CostModel tests that expect to see intrinsics in their -debug output to not use RemoveDIs. This is a spurious test output difference. * Add 'tail' to a bunch of intrinsics in UpdateTestChecks. We're canonicalising intrinsics to be printed with "tail" in RemoveDI conversion as dbg.values usually pick that up while being optimised. This is another spurious output difference. * The "DebugInfoDrop" pass used in the debugify unit-tests happens to operate inside the pass manager, thus it sees non-intrinsic debug-info. Update it to correctly drop it.	2024-02-07 14:31:52 +00:00
Florian Hahn	ec402a2e53	[VPlan] Implement cloning of VPlans. (#73158) This patch implements cloning for VPlans and recipes. Cloning is used in the epilogue vectorization path, to clone the VPlan for the main vector loop. This means we won't re-use a VPlan when executing the VPlan for the epilogue vector loop, which in turn will enable us to perform optimizations based on UF & VF.	2024-01-27 13:30:52 +00:00
Alexandros Lamprineas	92289db82f	[VFABI] Move the Vector ABI demangling utility to LLVMCore. (#77513) This fixes #71892, allowing us to check mangled names in the IR verifier.	2024-01-17 09:55:30 +00:00
Davide Italiano	b6f922fbf5	Revert "[CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions (#75385)" This reverts commit `fc6faa1113`.	2024-01-16 17:01:01 -08:00
Vladislav Dzhidzhoev	fc6faa1113	[CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions (#75385) - [DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7) - [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash reported in Chromium (https://reviews.llvm.org/D144006#4651955). The first commit is added for convenience, as it has already been accepted. If DISubprogram was not cloned (e.g. we are cloning a function that has other functions inlined into it, and subprograms of the inlined functions are not supposed to be cloned), it doesn't make sense to clone its DILocalVariables as well. Otherwise we get duplicated DILocalVariables not tracked in their subprogram's retainedNodes, which crash LTO with Chromium. This is meant to be committed along with https://reviews.llvm.org/D144006.	2024-01-11 17:08:12 +01:00
Wenju He	108989b717	[IR] Disallow ZeroInit for spirv.Image (#73887) According to spirv spec, OpConstantNull's result type can't be image type. So we can't generate zeroinitializer for spirv.Image.	2023-12-19 13:54:25 +08:00
Jessica Del	32f9983c06	[AMDGPU] - Add address space for strided buffers (#74471) This is an experimental address space for strided buffers. These buffers can have structs as elements and a stride > 1. These pointers allow the indexed access in units of stride, i.e., they point at `buffer[index * stride]`. Thus, we can use the `idxen` modifier for buffer loads. We assign address space 9 to 192-bit buffer pointers which contain a 128-bit descriptor, a 32-bit offset and a 32-bit index. Essentially, they are fat buffer pointers with an additional 32-bit index.	2023-12-15 15:49:25 +01:00
Kazu Hirata	5c9d82de6b	[llvm] Use StringRef::{starts,ends}_with (NFC) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-13 22:46:02 -08:00
Florian Hahn	99aa5311ee	[VPlan] Add missing output of live-ins to VPlan dot printing. Split off live-in printing to VPlan::printLiveIns and use it to print Live-ins when printing in the DOT format.	2023-12-04 13:41:28 +00:00
Mircea Trofin	284da049f5	[coro][pgo] Don't promote pgo counters in the suspend basic block (#71263) If a suspend happens in the resume part (this can happen in the case of chained coroutines), and that's part of a loop, the pre-split CFG has the suspend block as an exit of that loop. PGO Counter Promotion will then try to commit the temporary counter to the global in that "exit" block (it also does that in the other loop exit BBs, which also includes the "destroy" case). This interferes with symmetric transfer. We don't need to commit the counter in the suspend case - it's not a loop exit from the perspective of the behavior of the program. The regular loop exit, together with the "destroy" case, completely cover any updates that may need to happen to the global counter.	2023-11-30 11:58:26 -08:00
Fangrui Song	dd3184c30f	[unittest,examples] Replace uses of IRBuilder::getInt8PtrTy with getPtrTy. NFC	2023-11-27 08:29:13 -08:00
Florian Hahn	34c2dcd5ac	[VPlan] Move initial skeleton construction to createInitialVPlan. (NFC) This patch moves creating the middle VPBBs and an initial empty vector loop region for the top-level loop to createInitialVPlan. This consolidates code to create the initial VPlan skeleton and enables adding other bits outside the main region during initial VPlan construction. In particular, D150398 will add the exit check & branch to the middle block. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158333	2023-11-12 13:00:44 +00:00
Jeremy Morse	f1b0a54451	Reapply `7d77bbef4a`, adding new debug-info classes This reverts commit `957efa4ce4`. Original commit message below -- in this follow up, I've shifted unnecessary inclusions of DebugProgramInstruction.h into being forward declarations (fixes clang-compile time I hope), and a memory leak in the DebugInfoTest.cpp IR unittests. I also tracked a compile-time regression in D154080, more explanation there, but the result of which is hiding some of the changes behind the EXPERIMENTAL_DEBUGINFO_ITERATORS compile-time flag. This is tested by the "new-debug-iterators" buildbot. [DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info This patch adds a variety of classes needed to record variable location debug-info without using the existing intrinsic approach, see the rationale at [0]. The two added files and corresponding unit tests are the majority of the plumbing required for this, but at this point isn't accessible from the rest of LLVM as we need to stage it into the repo gently. An overview is that classes are added for recording variable information attached to Real (TM) instructions, in the form of DPValues and DPMarker objects. The metadata-uses of DPValues is plumbed into the metadata hierarchy, and a field added to class Instruction, which are all stimulated in the unit tests. The next few patches in this series add utilities to convert to/from this new debug-info format and add instruction/block utilities to have debug-info automatically updated in the background when various operations occur. This patch was reviewed in Phab in D153990 and D154080, I've squashed them together into this commit as there are dependencies between the two patches, and there's little profit in landing them separately. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939	2023-11-08 16:42:35 +00:00
Vladislav Dzhidzhoev	6beddd668a	Revert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" This caused assert: llvm/llvm/lib/CodeGen/AsmPrinter/DwarfFile.cpp:110: void llvm::DwarfFile::addScopeVariable(LexicalScope , DbgVariable ): Assertion `Ret.second' failed. See comments https://reviews.llvm.org/D144006#4656350. This reverts commit `3b449bd46a`.	2023-11-08 00:29:24 +01:00
Jeremy Morse	957efa4ce4	Revert "[DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info" And some intervening fixups. There are two remaining problems: * A memory leak via https://lab.llvm.org/buildbot/#/builders/236/builds/7120/steps/10/logs/stdio * A performance slowdown with -g where I'm not completely sure what the cause is These might be fairly straightforward to fix, but it's the end of the day here, so I figure I'll clear the buildbots til tomorrow. This reverts commit `7d77bbef4a`. This reverts commit `9026f35afe`. This reverts commit `d97b2b389a`.	2023-11-02 17:41:36 +00:00
Vladislav Dzhidzhoev	3b449bd46a	[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7) RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544 Similar to imported declarations, the patch tracks function-local types in DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with the aforementioned metadata change and provided a support of function-local types scoped within a lexical block. The patch assumes that DICompileUnit's 'enums field' no longer tracks local types and DwarfDebug would assert if any locally-scoped types get placed there. Reviewed By: jmmartinez Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com> Differential Revision: https://reviews.llvm.org/D144006	2023-11-02 17:44:52 +01:00
Jeremy Morse	7d77bbef4a	[DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info This patch adds a variety of classes needed to record variable location debug-info without using the existing intrinsic approach, see the rationale at [0]. The two added files and corresponding unit tests are the majority of the plumbing required for this, but at this point isn't accessible from the rest of LLVM as we need to stage it into the repo gently. An overview is that classes are added for recording variable information attached to Real (TM) instructions, in the form of DPValues and DPMarker objects. The metadata-uses of DPValues is plumbed into the metadata hierarchy, and a field added to class Instruction, which are all stimulated in the unit tests. The next few patches in this series add utilities to convert to/from this new debug-info format and add instruction/block utilities to have debug-info automatically updated in the background when various operations occur. This patch was reviewed in Phab in D153990 and D154080, I've squashed them together into this commit as there are dependencies between the two patches, and there's little profit in landing them separately. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939	2023-11-02 12:44:53 +00:00
Benjamin Kramer	eb67b34740	[IPSCCP] Don't crash on ptrtoint	2023-10-24 14:14:39 +02:00
Benjamin Kramer	8f33995351	[IPSCCP] Fix a mistake in `b796eac3f2` so the test actually passes	2023-10-24 10:48:46 +02:00
Benjamin Kramer	b796eac3f2	[IPSCCP] Silence sign compare warnings in test	2023-10-24 10:38:09 +02:00
Carlos Alberto Enciso	f3b20cb16a	[IPSCCP] Variable not visible at Og. (#66745) https://bugs.llvm.org/show_bug.cgi?id=51559 https://github.com/llvm/llvm-project/issues/50901 IPSCCP pass removes the global variable and does not create a constant expression for the initializer value.	2023-10-24 06:22:18 +01:00
Fangrui Song	a24418375a	[CodeLayout] cache-directed sort: limit max chain size (#69039) When linking an executable with a slightly larger executable, ld.lld --call-graph-profile-sort=cdsort can be very slow (see #68638). ``` 4.6% 20.7Mi .text.hot 3.5% 15.9Mi .text 3.4% 15.2Mi .text.unknown ``` Add cl option `cdsort-max-chain-size`, which is similar to `ext-tsp-max-chain-size`, and set it to 128, to improve performance. In `ld.lld @response.txt --threads=4 --call-graph-profile-sort=cdsort --time-trace` builds, the "Total Sort sections" time is measured as follows: * -mllvm -cdsort-max-chain-size=64: 1.321813 * -mllvm -cdsort-max-chain-size=128: 2.030425 * -mllvm -cdsort-max-chain-size=256: 2.927684 * -mllvm -cdsort-max-chain-size=512: 5.493106 * unlimited: 9 minutes The rest takes 6.8s.	2023-10-22 16:50:03 -07:00
Dominik Adamski	eee8dd9088	[CodeExtractor] Allow to use 0 addr space for aggregate arg (#66998) The user of CodeExtractor should be able to specify that the aggregate argument should be passed as a pointer in zero address space. CodeExtractor is used to generate outlined functions required by OpenMP runtime. The arguments of the outlined functions for OpenMP GPU code are in 0 address space. 0 address space does not need to be the default address space for GPU device. That's why there is a need to allow the user of CodeExtractor to specify, that the allocated aggregate parameter is passed as pointer in zero address space.	2023-10-18 20:12:31 +02:00
Matthias Braun	5181156b37	Use BlockFrequency type in more places (NFC) (#68266) The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it more consistently in various APIs and disable implicit conversion to make usage more consistent and explicit. - Use `BlockFrequency Freq` parameter for `setBlockFreq`, `getProfileCountFromFreq` and `setBlockFreqAndScale` functions. - Return `BlockFrequency` in `getEntryFreq()` functions. - While at it, change some `const BlockFrequency& Freq` parameters to plain `BlockFrequency Freq`. - Mark `BlockFrequency(uint64_t)` constructor as explicit. - Add missing `BlockFrequency::operator!=`. - Remove `uint64_t BlockFrequency::getMaxFrequency()`. - Add `BlockFrequency BlockFrequency::max()` function.	2023-10-05 11:40:17 -07:00
Kazu Hirata	3b34c117db	[llvm] Remove unused using decls (NFC) Identified with misc-unused-using-decls.	2023-10-03 23:21:50 -07:00
Hans Wennborg	eee1f7cef8	Revert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" This caused asserts: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2331: virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed. See comment on the code review for reproducer. > RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544 > > Similar to imported declarations, the patch tracks function-local types in > DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with > the aforementioned metadata change and provided a support of function-local > types scoped within a lexical block. > > The patch assumes that DICompileUnit's 'enums field' no longer tracks local > types and DwarfDebug would assert if any locally-scoped types get placed there. > > Reviewed By: jmmartinez > > Differential Revision: https://reviews.llvm.org/D144006 This reverts commit `f8aab289b5`.	2023-09-29 14:23:31 +02:00
Fangrui Song	e705b37a77	[CodeLayout] Add unittest for cache-directed sort The function reordering algorithm added by https://reviews.llvm.org/D152834 and used by BOLT (https://reviews.llvm.org/D153039) is untested. Add some tests at the appropriate layer. Depends on D159526 Differential Revision: https://reviews.llvm.org/D159527	2023-09-27 10:52:12 -07:00
Vladislav Dzhidzhoev	f8aab289b5	[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7) RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544 Similar to imported declarations, the patch tracks function-local types in DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with the aforementioned metadata change and provided a support of function-local types scoped within a lexical block. The patch assumes that DICompileUnit's 'enums field' no longer tracks local types and DwarfDebug would assert if any locally-scoped types get placed there. Reviewed By: jmmartinez Differential Revision: https://reviews.llvm.org/D144006	2023-09-26 23:07:29 +04:00
Florian Hahn	4b0df112da	[VPlan] Fix invalid IR in unit test input, run verifier. Some tests were passing invalid IR to the VPlan construction logic. Fix the invalid IR and run the verifier on the input to avoid issues in the future.	2023-09-22 21:12:09 +01:00
Nikita Popov	2d8d622c73	[SCEV] Require that addrec operands dominate the loop SCEVExpander currently has special handling for the case where the start or the step of an addrec do not dominate the loop header, which is not used by any lit test. Initially I thought that this is entirely dead code, because addrec operands are required to be loop invariant. However, SCEV currently allows creating an addrec with operands that are loop invariant but defined after the loop. This doesn't seem like a useful case to allow, and we don't appear to be using this outside a single easy to adjust unit test.	2023-09-22 09:02:54 +02:00
Bjorn Pettersson	4d5906e0bf	[llvm][unittests] Remove unneeded header includes	2023-09-12 18:47:44 +02:00
Mel Chen	26aed5b9a8	[VPlan][LoopUtils] Remove unused parameter TTI This patch removes the member TTI from VPReductionRecipe, as the generation of reduction operations no longer requires TTI. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D158148	2023-09-04 05:30:37 -07:00
Mel Chen	463e7cb892	[LV][VPlan] Refactor VPReductionRecipe to use reference for member RdxDesc This commit refactors the implementation of VPReductionRecipe to use reference instead of pointer for member RdxDesc. Because the member RdxDesc in VPReductionRecipe should not be a nullptr, using a reference will provide clearer semantics. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D158058	2023-08-16 19:37:49 -07:00
Bjorn Pettersson	e53b28c833	[llvm] Drop some bitcasts and references related to typed pointers Differential Revision: https://reviews.llvm.org/D157551	2023-08-10 15:07:07 +02:00
Alexandros Lamprineas	d1b376fd7b	[FuncSpec] Rework the discardment logic for unprofitable specializations. Currently we make an arbitrary comparison between codesize and latency in order to decide whether to keep a specialization or not. Sometimes the latency savings are biased in favor of loops because of imprecise block frequencies, therefore this metric contains a lot of noise. This patch tries to address the problem as follows: * Reject specializations whose codesize savings are less than X% of the original function size. * Reject specializations whose latency savings are less than Y% of the original function size. * Reject specializations whose inlining bonus is less than Z% of the original function size. I am not saying this is super precise, but at least X, Y and Z are configurable, allowing us to tweak the cost model. Moreover, it lets us prioritize codesize over latency, which is a less noisy metric. I am also increasing the minimum size a function should have to be considered a candidate for specialization. Initially the cost of a function was calculated as CodeMetrics::NumInsts * InlineConstants::getInstrCost() which later in D150464 was altered into CodeMetrics::NumInsts since the metric is supposed to model TargetTransformInfo::TCK_CodeSize. However, we omitted adjusting MinFunctionSize in that commit. Differential Revision: https://reviews.llvm.org/D157123	2023-08-09 10:28:46 +01:00
Florian Hahn	93c5bae00e	[VPlan] Use printOperands for VPInstruction. Use the printOperands for printing VPInstruction's operands to be more in line with other recipes and ensure consistent printing after D15719. Also removes some stray spaces in print output.	2023-08-08 11:31:21 +01:00
Alexandros Lamprineas	c2d19002ae	[FuncSpec] Estimate dead blocks more accurately. Currently we only consider basic blocks with a unique predecessor when estimating the size of dead code. However, we could expand this to consider blocks with a back-edge, or blocks preceded by dead blocks. Differential Revision: https://reviews.llvm.org/D156903	2023-08-07 11:04:23 +01:00
Alexandros Lamprineas	5bfefff1c4	Reland [FuncSpec] Split the specialization bonus into CodeSize and Latency. Currently we use a combined metric TargetTransformInfo::TCK_SizeAndLatency when estimating the specialization bonus. This is suboptimal, and in some cases erroneous. For example we shouldn't be weighting the codesize decrease attributed to constant propagation by the block frequency of the dead code. Instead only the latency savings should be weighted by block frequency. The total codesize savings from all the specialization arguments should be deducted from the specialization cost. Differential Revision: https://reviews.llvm.org/D155103	2023-08-02 12:41:13 +01:00
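
The ordered, case-sensitive substitutions listed in commit `ffd08c7759` (`DPValue -> DbgVariableRecord`, `DPVal -> DbgVarRec`, `DPV -> DVR`) can be illustrated with a small Python sketch. This is not the tooling used for that patch, which the commit message does not show; the `rename` helper and the sample strings are hypothetical, and only the three pairs and their order are taken from the commit message. The sketch shows why the longest pattern has to be applied first.

```python
# Substitution pairs and their order as listed in commit ffd08c7759.
# Everything else in this sketch (function name, sample inputs) is hypothetical.
SUBSTITUTIONS = [
    ("DPValue", "DbgVariableRecord"),
    ("DPVal", "DbgVarRec"),
    ("DPV", "DVR"),
]


def rename(text, substitutions=SUBSTITUTIONS):
    """Apply each case-sensitive substitution in turn, in list order."""
    for old, new in substitutions:
        text = text.replace(old, new)
    return text


# Longest pattern first: each identifier maps to its intended new name.
in_order = rename("DPValue *DPV = getDPVal();")

# Shortest pattern first: "DPV" matches inside "DPValue" and mangles it.
out_of_order = rename("DPValue", list(reversed(SUBSTITUTIONS)))

# A blind substitution also rewrites unrelated identifiers containing "DPV",
# such as the 'LDPV' prefix the commit says was left alone in llvm/tools/gold.
unrelated = rename("LDPV_DEFAULT")
```

Here `in_order` becomes `DbgVariableRecord *DVR = getDbgVarRec();`, while `out_of_order` becomes `DVRalue`. The `unrelated` case turns into `LDVR_DEFAULT`, which is the kind of unintended match the commit avoided by excluding specific directories before substituting.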


612 Commits