clang-p2996

Author	SHA1	Message	Date
Amir Ayupov	fd38366e45	[BOLT][NFC] Clean includes, add license headers (#87200 )	2024-03-31 19:29:45 -07:00
Maksim Panchenko	6b1cf00400	[BOLT] Add support for Linux kernel static keys jump table (#86090) Runtime code modification used by static keys is the most ubiquitous self-modifying feature of the Linux kernel. The idea is to eliminate the condition check and associated conditional jump on a hot path if that condition (based on a boolean value of a static key) does not change often. Whenever the condition changes, the kernel runtime modifies all code paths associated with that key, flipping the code between nop and (unconditional) jump.	2024-03-21 14:05:21 -07:00
Maksim Panchenko	bba790db47	[BOLT] Refactor instruction creation interface. NFCI (#85292 ) Refactor MCPlusBuilder's create{Instruction}() functions that used to return bool. We almost never check the return value as we rely on llvm_unreachable() to detect unimplemented functionality. There were a couple of cases that checked the return value, but they would hit the unreachable condition first (at least in debug builds) before the return value gets checked.	2024-03-14 13:17:17 -07:00
Mehdi Amini	4a4fb930a5	Use the new ThreadPoolInterface base class instead of the concrete implementation (NFC) (#84056 )	2024-03-05 12:37:11 -08:00
Elvina Yakubova	b98e6a5ced	[BOLT][AArch64] Skip BBs only instead of functions (#81989) After commit `846eb76761` we noticed that the size of the fdata file decreased a lot. A better and more precise approach is to skip only the basic blocks with exclusive instructions instead of the whole function.	2024-02-27 19:19:47 +03:00
Amir Ayupov	52cf07116b	[BOLT][NFC] Log through JournalingStreams (#81524) Make core BOLT functionality more friendly to being used as a library instead of in our standalone driver llvm-bolt. To accomplish this, we augment BinaryContext with journaling streams that are to be used by most BOLT code whenever something needs to be logged to the screen. Users of the library can decide if logs should be printed to a file, no file or to the screen, as before. To illustrate this, this patch adds a new option `--log-file` that allows the user to redirect BOLT logging to a file on disk or completely hide it by using `--log-file=/dev/null`. Future BOLT code should now use `BinaryContext::outs()` for printing important messages instead of `llvm::outs()`. A new test log.test enforces this by verifying that no strings are printed to the screen once the `--log-file` option is used. In previous patches we also added a new BOLTError class to report common and fatal errors, so code shouldn't call exit(1) now. To easily handle problems as before (by quitting with exit(1)), callers can now use `BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code needs to deal with BOLT errors. To test this, we have fatal.s that checks we are correctly quitting and printing a fatal error to the screen. Because this is a significant change by itself, not all code was yet ported. Code from Profiler libs (DataAggregator and friends) still prints errors directly to the screen. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC	2024-02-12 14:53:53 -08:00
Amir Ayupov	13d60ce2f2	[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523 ) As part of the effort to refactor old error handling code that would directly call exit(1), in this patch continue the migration on libCore, libRewrite and libPasses to use the new BOLTError class whenever a failure occurs. Test Plan: NFC Co-authored-by: Rafael Auler <rafaelauler@fb.com>	2024-02-12 14:51:15 -08:00
Amir Ayupov	fa7dd4919a	[BOLT][NFC] Add BOLTError and return it from passes (1/2) (#81522 ) As part of the effort to refactor old error handling code that would directly call exit(1), in this patch we add a new class BOLTError and auxiliary functions `createFatalBOLTError()` and `createNonFatalBOLTError()` that allow BOLT code to bubble up the problem to the caller by using the Error class as a return type (or Expected). Also changes passes to use these. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC	2024-02-12 14:39:59 -08:00
Amir Ayupov	a5f3d1a803	[BOLT][NFC] Return Error from BinaryFunctionPass::runOnFunctions (#81521 ) As part of the effort to refactor old error handling code that would directly call exit(1), in this patch we change the interface to `BinaryFunctionPass` to return an Error on `runOnFunctions()`. This gives passes the ability to report a serious problem to the caller (RewriteInstance class), so the caller may decide how to best handle the exceptional situation. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC	2024-02-12 14:36:12 -08:00
Maksim Panchenko	7fe97f0420	[BOLT] Always run CheckLargeFunctions in non-relocation mode (#80922 ) We run CheckLargeFunctions pass in non-relocation mode to prevent the emission of functions that later could not be written to the output due to their large size. The main reason behind the pass is to prevent the emission of metadata for such functions since this metadata becomes incorrect if the function is left unmodified. Currently, the pass is enabled in non-relocation mode only when debug info output is also enabled. As we emit increasingly more kinds of metadata, e.g. for the Linux kernel, it becomes more challenging to track metadata that needs to be fixed. Hence, I'm enabling the pass to always run in non-relocation mode.	2024-02-08 14:21:49 -08:00
Maksim Panchenko	8ea7f1d20a	[BOLT][NFCI] Keep instruction annotations (#80382 ) We used to delete most instruction annotations before code emission. It was done to release memory taken by annotations and to reduce overall memory consumption. However, since the implementation of annotations has moved to using existing instruction operands, the memory overhead associated with them has reduced drastically. I measured that savings are less than 0.5% on large binaries and processing time is just slightly reduced if we keep them. Additionally, I plan to use annotations in pre-emission passes for the Linux kernel rewriter.	2024-02-06 19:59:53 -08:00
Amir Ayupov	3c64b24ed3	[BOLT] Add extra staleness logging (#80225 ) Report two extra metrics: - # of stale functions with matching block count, - # of stale blocks with matching instruction count.	2024-02-01 07:16:40 -08:00
spupyrev	9058503d26	[BOLT] Deprecate hfsort+ in favor of cdsort (#72408) A new function sorting algorithm (cdsort) in LLVM is an optimized version of BOLT's hfsort+. In order to avoid code duplication and simplify maintenance, this change gets rid of hfsort+. Perf-wise this is likely a neutral change, though differences on individual benchmarks are possible, since the generated function layout has changed. I tested cdsort vs hfsort+ on a number of open-source and prod binaries built in different modes and recorded an average neutral perf difference, perhaps with more "green" counters.	2024-01-26 06:51:55 -08:00
Amir Ayupov	e9309b27d7	[BOLT] Report input staleness (#79496 ) It's beneficial to have uniform reporting in both `infer-stale-profile` on and off cases, primarily for logging purposes. Without this change, BOLT would report "input" staleness in `infer-stale-profile=0` case (without matching), and "output" staleness in `infer-stale-profile=1` case (after matching). This change makes BOLT report "input" staleness in both cases. "Output" staleness information is printed separately with "BOLT-INFO: inferred profile..."	2024-01-25 14:15:13 -08:00
spupyrev	0daf303e79	[BOLT] Fix double conversion in CacheMetrics (#75253 ) The change (i) fixes an issue with double-int conversion in CacheMetrics and (ii) removes command-line options for computing metrics (which aren't modified anyway). This change might break some tests verifying the exact output of CacheMetrics.	2024-01-12 10:27:12 -08:00
ShatianWang	1577483413	[BOLT] Don't split likely fallthrough in CDSplit (#76164 ) This diff speeds up CDSplit by not considering any hot-warm splitting point that could break a fall-through branch from a basic block to its most likely successor. Co-authored-by: spupyrev <spupyrev@fb.com>	2023-12-21 16:17:10 -05:00
Kazu Hirata	ad8fd5b185	[BOLT] Use StringRef::{starts,ends}_with (NFC) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-13 23:34:49 -08:00
Kazu Hirata	1cc5431285	[BOLT] Fix warnings This patch fixes: bolt/lib/Core/BinaryFunctionProfile.cpp:222:10: error: variable 'BBMergeSI' set but not used [-Werror,-Wunused-but-set-variable] bolt/lib/Passes/VeneerElimination.cpp:67:12: error: variable 'VeneerCallers' set but not used [-Werror,-Wunused-but-set-variable]	2023-12-11 12:55:29 -08:00
Amir Ayupov	b039ccc684	[BOLT] Provide backwards compatibility for YAML profile with std::hash (#74253 ) Provide backwards compatibility for YAML profile that uses `std::hash`: xxh3 hash is the default for newly produced profile (sets `std-hash: false`), whereas the profile that doesn't specify `std-hash` will be treated as `std-hash: true`, preserving old behavior.	2023-12-11 12:27:32 -08:00
sinan	fdb13cf531	[BOLT] Fix local out-of-range stub issue in LongJmp (#73918) If a local stub is out-of-range, at LongJmp we will try to find another local stub first. However, the original implementation does not work as expected and it leads to an infinite loop between replaceTargetWithStub and fixBranches. After this patch, we first convert the target of BB back to the target of the local stub, and then look up other valid local stubs and so on.	2023-12-11 10:38:28 +08:00
Ho Cheung	fa5486e487	[BOLT] [Passes] Fix two compile warnings in BOLT (#73086 ) Fix build issue on Windows. issue:#73085 @maksfb PTAL thank you	2023-12-06 11:19:07 -08:00
ShatianWang	296088bdf3	[BOLT][NFC] Remove unused code for CDSplit (#74136 ) This diff removes JumpInfo related code that is no longer needed by CDSplit from SplitFunctions.cpp.	2023-12-01 15:21:30 -05:00
ShatianWang	4483cf2d8b	[BOLT] CDSplit main logic part 2/2 (#74032 ) This diff implements the main splitting logic of CDSplit. CDSplit processes functions in a binary in parallel. For each function BF, it assumes that all other functions are hot-cold split. For each possible hot-warm split point of BF, it computes its corresponding SplitScore, and chooses the split point with the best SplitScore. The SplitScore of each split point is computed in the following way: each call edge or jump edge has an edge score that is proportional to its execution count, and inversely proportional to its distance. The SplitScore of a split point is a sum of edge scores over a fixed set of edges whose distance can change due to hot-warm splitting BF. This set contains all cover calls in the form of X->Y or Y->X given function order [... X ... BF ... Y ...]; we refer to the sum of edge scores over the set of cover calls as CoverCallScore. This set also contains all jump edges (branches) within BF as well as all call edges originated from BF; we refer to the sum of edge scores over this set of edges as LocalScore. CDSplit finds the split index maximizing CoverCallScore + LocalScore.	2023-11-30 23:17:11 -05:00
ShatianWang	56bbf8135e	[BOLT] CDSplit main logic part 1/2 (#73895 ) This diff defines and initializes auxiliary variables used by CDSplit and implements two important helper functions. The first helper function approximates the block level size increase if a function is hot-warm split at a given split index (X86 specific). The second helper function finds all calls in the form of X->Y or Y->X for each BF given function order [... X ... BF ... Y ...]. These calls are referred to as "cover calls". Their distance will decrease if BF's hot fragment size is further reduced by hot-warm splitting. NFC.	2023-11-30 20:55:36 -05:00
ShatianWang	c43d0432ef	[BOLT] Create .text.warm for 3-way splitting (#73863 ) This commit explicitly adds a warm code section, .text.warm, when -split-functions -split-strategy=cdsplit is used. This replaces the previous approach of using .text.cold.0 as warm and .text.cold.1 as cold in 3-way function splitting. NFC.	2023-11-29 22:42:36 -05:00
ShatianWang	076bd22f57	[BOLT] Add structure of CDSplit to SplitFunctions (#73430 ) This commit establishes the general structure of the CDSplit strategy in SplitFunctions without incorporating the exact splitting logic. With -split-functions -split-strategy=cdsplit, the SplitFunctions pass will run twice: the first time is before function reordering and functions are hot-cold split; the second time is after function reordering and functions are hot-warm-cold split based on the fixed function ordering. Currently, all functions are hot-warm split after the entry block in the second splitting pass. Subsequent commits will introduce the precise splitting logic. NFC.	2023-11-29 15:43:21 -05:00
llongint	f3e54f2f97	[BOLT][NFC] Extract a function for dump MCInst (#67225 ) In GDB debugging, obtaining the assembly representation of MCInst is more intuitive.	2023-11-21 20:30:44 +08:00
Maksim Panchenko	f633f325a1	[BOLT] Fix NOP instruction emission on x86 (#72186 ) Use MCAsmBackend::writeNopData() interface to emit NOP instructions on x86. There are multiple forms of NOP instruction on x86 with different sizes. Currently, LLVM's assembly/disassembly does not support all forms correctly which can lead to a breakage of input code semantics, e.g. if the program relies on NOP instructions for reserving a patch space. Add "--keep-nops" option to preserve NOP instructions.	2023-11-13 18:12:39 -08:00
Maksim Panchenko	ec4a03c658	[BOLT] Enhance LowerAnnotations pass. NFCI. (#71847 ) After #70147, all primary annotation types are stored directly in the instruction and hence there's no need for the temporary storage we've used previously for repopulating preserved annotations.	2023-11-12 19:34:42 -08:00
Vladislav Khmelevsky	6206817380	[BOLT][AArch64] Fix ADR relaxation (#71835) Currently we have an optimization that if the ADR points to the same function we might skip its relaxation. But it doesn't take into account that BF might be split; in such a situation we still need to relax it. And just in case also relax if the initial BF size is >= 1MB. Fixes #71822	2023-11-10 11:48:03 +04:00
Vladislav Khmelevsky	abec50cb93	[BOLT][AArch64] Fix strict usage during ADR Relax (#71377 ) Currently strict mode is used to expand number of optimized functions, not to shrink it. Revert the option usage in the pass, so passing strict option would relax adr instruction even if there are no nops around it. Also add check for nop after adr instruction.	2023-11-10 11:46:36 +04:00
Vladislav Khmelevsky	c6c04a83a7	[BOLT] Run EliminateUnreachableBlocks in parallel (#71299 ) The wall time for this pass decreased on my laptop from ~80 sec to 5 sec processing the clang.	2023-11-10 00:46:04 +04:00
spaette	1a2f83366b	[BOLT] Fix typos (#68121 ) Closes https://github.com/llvm/llvm-project/issues/63097 Before merging please make sure the change to bolt/include/bolt/Passes/StokeInfo.h is correct. bolt/include/bolt/Passes/StokeInfo.h ```diff // This Pass solves the two major problems to use the Stoke program without - // proting its code: + // probing its code: ``` I'm still not happy about the awkward wording in this comment. bolt/include/bolt/Passes/FixRelaxationPass.h ``` $ ed -s bolt/include/bolt/Passes/FixRelaxationPass.h <<<'9,12p' // This file declares the FixRelaxations class, which locates instructions with // wrong targets and fixes them. Such problems usually occures when linker // relaxes (changes) instructions, but doesn't fix relocations types properly // for them. $ ``` bolt/docs/doxygen.cfg.in bolt/include/bolt/Core/BinaryContext.h bolt/include/bolt/Core/BinaryFunction.h bolt/include/bolt/Core/BinarySection.h bolt/include/bolt/Core/DebugData.h bolt/include/bolt/Core/DynoStats.h bolt/include/bolt/Core/Exceptions.h bolt/include/bolt/Core/MCPlusBuilder.h bolt/include/bolt/Core/Relocation.h bolt/include/bolt/Passes/FixRelaxationPass.h bolt/include/bolt/Passes/InstrumentationSummary.h bolt/include/bolt/Passes/ReorderAlgorithm.h bolt/include/bolt/Passes/StackReachingUses.h bolt/include/bolt/Passes/StokeInfo.h bolt/include/bolt/Passes/TailDuplication.h bolt/include/bolt/Profile/DataAggregator.h bolt/include/bolt/Profile/DataReader.h bolt/lib/Core/BinaryContext.cpp bolt/lib/Core/BinarySection.cpp bolt/lib/Core/DebugData.cpp bolt/lib/Core/DynoStats.cpp bolt/lib/Core/Relocation.cpp bolt/lib/Passes/Instrumentation.cpp bolt/lib/Passes/JTFootprintReduction.cpp bolt/lib/Passes/ReorderData.cpp bolt/lib/Passes/RetpolineInsertion.cpp bolt/lib/Passes/ShrinkWrapping.cpp bolt/lib/Passes/TailDuplication.cpp bolt/lib/Rewrite/BoltDiff.cpp bolt/lib/Rewrite/DWARFRewriter.cpp bolt/lib/Rewrite/RewriteInstance.cpp bolt/lib/Utils/CommandLineOpts.cpp bolt/runtime/instr.cpp 
bolt/test/AArch64/got-ld64-relaxation.test bolt/test/AArch64/unmarked-data.test bolt/test/X86/Inputs/dwarf5-cu-no-debug-addr-helper.s bolt/test/X86/Inputs/linenumber.cpp bolt/test/X86/double-jump.test bolt/test/X86/dwarf5-call-pc-function-null-check.test bolt/test/X86/dwarf5-split-dwarf4-monolithic.test bolt/test/X86/dynrelocs.s bolt/test/X86/fallthrough-to-noop.test bolt/test/X86/tail-duplication-cache.s bolt/test/runtime/X86/instrumentation-ind-calls.s	2023-11-09 11:29:46 -08:00
Vladislav Khmelevsky	485075c095	[BOLT][AArch64] Don't change layout in PatchEntries (#71278) Due to the LongJmp pass that is executed before PatchEntries we can't ignore the function here since it would change the pre-calculated output layout. The test reloc-26 relied on the wrong behavior and was rewritten as a unittest. This is also an attempt to fix #70771	2023-11-08 11:38:46 +04:00
Maksim Panchenko	0df154671b	[BOLT] Use Label annotation instead of EHLabel pseudo. NFCI. (#70179 ) When we need to attach EH label to an instruction, we can now use Label annotation instead of EHLabel pseudo instruction.	2023-11-06 14:43:14 -08:00
maksfb	e28c393bd1	[BOLT] Reduce the number of emitted symbols. NFCI. (#70175 ) We emit a symbol before an instruction for a number of reasons, e.g. for tracking LocSyms, debug line, or if the instruction has a label annotation. Currently, we may emit multiple symbols per instruction. Reuse the same label instead of creating and emitting new ones when possible. I'm planning to refactor EH labels as well in a separate diff. Change getLabel() to return a pointer instead of std::optional<> since an empty label should be treated identically to no label.	2023-11-06 11:41:47 -08:00
maksfb	7f031d1c7c	[BOLT] Fix address mapping for ICP code (#70136 ) When we create new code for indirect code promotion optimization, we should mark it as originating from the indirect jump instruction for BOLT address translation (BAT) to map it to the original instruction.	2023-11-06 11:25:49 -08:00
spupyrev	287fcd38a1	[BOLT] Rename cds to cdsort (#69966 ) Unify naming for the layout algorithms by renaming "cds" to "cdsort". This is NFC unless someone is already using the new algorithm (which is unlikely).	2023-11-02 12:46:36 -07:00
Kazu Hirata	f9306f6de3	[ADT] Rename llvm::erase_value to llvm::erase (NFC) (#70156 ) C++20 comes with std::erase to erase a value from std::vector. This patch renames llvm::erase_value to llvm::erase for consistency with C++20. We could make llvm::erase more similar to std::erase by having it return the number of elements removed, but I'm not doing that for now because nobody seems to care about that in our code base. Since there are only 50 occurrences of erase_value in our code base, this patch replaces all of them with llvm::erase and deprecates llvm::erase_value.	2023-10-24 23:03:13 -07:00
Kazu Hirata	e1a584305e	[BOLT] Use llvm::is_contained (NFC)	2023-10-19 23:21:58 -07:00
Vladislav Khmelevsky	b7944f7c04	[BOLT] Return proper minimal alignment from BF (#67707) Currently the minimal alignment of a function is hardcoded to 2 bytes. Add 2 more cases: 1. In case BF is data in code, return the alignment of CI as the minimal alignment 2. For aarch64 and riscv platforms return the minimal value of 4 (added test for aarch64) Otherwise fall back to returning 2 as before.	2023-10-12 09:33:08 +04:00
Job Noorman	43e9eae6e8	[BOLT] Preserve label annotations for injected functions (#68713 ) Needed for instrumentation on RISC-V.	2023-10-11 07:26:20 +00:00
qijitao	bae41ff57e	[BOLT] Fix long jump negative offset issue. (#67132 ) In instruction encoding, the relative offset address of the PC is signed, that is, the number of positive offset bits and the number of negative offset bits is asymmetric. Therefore, the maximum and minimum values are used to replace Mask to determine the boundary. Co-authored-by: qijitao <qijitao@hisilicon.com>	2023-10-08 01:06:10 +04:00
Job Noorman	ff5e2babcb	[BOLT] Improve handling of relocations targeting specific instructions (#66395) On RISC-V, there are certain relocations that target a specific instruction instead of a more abstract location like a function or basic block. Take the following example that loads a value from symbol `foo`: ``` nop 1: auipc t0, %pcrel_hi(foo) ld t0, %pcrel_lo(1b)(t0) ``` This results in two relocations: - auipc: `R_RISCV_PCREL_HI20` referencing `foo`; - ld: `R_RISCV_PCREL_LO12_I` referencing the local label `1` which points to the auipc instruction. It is of utmost importance that the `R_RISCV_PCREL_LO12_I` keeps referring to the auipc instruction; if not, the program will fail to assemble. However, BOLT currently does not guarantee this. BOLT currently assumes that all local symbols are jump targets and always starts a new basic block at symbol locations. The example above results in a CFG that looks like this: ``` .BB0: nop .BB1: auipc t0, %pcrel_hi(foo) ld t0, %pcrel_lo(.BB1)(t0) ``` While this currently works (i.e., the `R_RISCV_PCREL_LO12_I` relocation points to the correct instruction), it has two downsides: - Too many basic blocks are created (the example above is logically only one yet two are created); - If instructions are inserted in `.BB1` (e.g., by instrumentation), things will break since the label will not point to the auipc anymore. This patch proposes to fix this issue by teaching BOLT to track labels that should always point to a specific instruction. 
This is implemented as follows: - Add a new annotation type (`kLabel`) that allows us to annotate instructions with an `MCSymbol *`; - Whenever we encounter a relocation type that is used to refer to a specific instruction (`Relocation::isInstructionReference`), we register it without a symbol; - During disassembly, whenever we encounter an instruction with such a relocation, create a symbol for its target and store it in an offset to symbol map (to ensure multiple relocations referencing the same instruction use the same label); - After disassembly, iterate this map to attach labels to instructions via the new annotation type; - During emission, emit these labels right before the instruction. I believe the use of annotations works quite well for this use case as it allows us to reliably track instruction labels. If we were to store them as offsets in basic blocks, it would be error prone to keep them updated whenever instructions are inserted or removed. I have chosen to add labels as first-class annotations (as opposed to a generic one) because the documentation of `MCAnnotation` suggests that generic annotations are to be used for optional metadata that can be discarded without affecting correctness. As this is not the case for labels, a first-class annotation seemed more appropriate.	2023-10-06 06:46:16 +00:00
Job Noorman	7fa33773e3	[BOLT][RISCV] Handle long tail calls (#67098) Long tail calls use the following instruction sequence on RISC-V: ``` 1: auipc xi, %pcrel_hi(sym) jalr zero, %pcrel_lo(1b)(xi) ``` Since the second instruction in isolation looks like an indirect branch, this confused BOLT and most functions containing a long tail call got marked with "unknown control flow" and didn't get optimized as a consequence. This patch fixes this by detecting the long tail call sequence in `analyzeIndirectBranch`. `FixRISCVCallsPass` also had to be updated to expand long tail calls to `PseudoTAIL` instead of `PseudoCALL`. Besides this, this patch also fixes a minor issue with compressed tail calls (`c.jr`) not being detected. Note that I had to change `BinaryFunction::postProcessIndirectBranches` slightly: the documentation of `MCPlusBuilder::analyzeIndirectBranch` mentions that the [`Begin`, `End`) range contains the instructions immediately preceding `Instruction`. However, in `postProcessIndirectBranches`, all the instructions in the BB were passed in the range. This made it difficult to find the preceding instruction so I made sure only the preceding instructions are passed.	2023-10-05 08:55:30 +00:00
Vladislav Khmelevsky	f99bd29610	[BOLT][NFC] Run ADRRelaxationPass in parallel (#67831 ) To do this: 1. Protect BC.Ctx with mutex 2. Don't call exit from thread, please check the reason comment near PassFailed variable definition. The other option would be call _Exit instead of exit, but I think we shall call destructors properly.	2023-09-30 13:47:41 +04:00
Vladislav Khmelevsky	08086c1529	[BOLT][AArch64] Fix CI alignment Fix alignment calculation for CI. Differential Revision: https://reviews.llvm.org/D159548	2023-09-28 12:55:57 +04:00
Vladislav Khmelevsky	846eb76761	[BOLT][AArch64] Fix instrumentation deadloop According to the ARMv8-a architecture reference manual B2.10.5, software must avoid having any explicit memory accesses between an exclusive load and the associated store instruction. Otherwise the exclusive monitor might clear the exclusivity without an application-related cause, which may result in a deadloop. Disable instrumentation for such functions, since between the exclusive load and store there might be branches and we would insert an instrumentation snippet which contains loads and stores. A better solution would be to analyze with BFS, finding the exact BBs between the load and store and not instrumenting them. Or even better, to recognize such sequences and replace them with a more complex one, e.g. loading the value non-exclusively, and for the branch where the exclusive store is made, making the exclusive load and store sequentially, but for now just disable instrumentation for such functions completely. Differential Revision: https://reviews.llvm.org/D159520	2023-09-22 00:58:01 +04:00
Fangrui Song	6b8d04c23d	[CodeLayout] Refactor std::vector uses, namespace, and EdgeCountT. NFC * Place types and functions in the llvm::codelayout namespace * Change EdgeCountT from pair<pair<uint64_t, uint64_t>, uint64_t> to a struct and utilize structured bindings. It is not conventional to use the "T" suffix for structure types. * Remove a redundant copy in ChainT::merge. * Change {ExtTSPImpl,CDSortImpl}::run to use return value instead of an output parameter * Rename applyCDSLayout to computeCacheDirectedLayout: (a) avoid rare abbreviation "CDS" (cache-directed sort) (b) "compute" is more conventional for the specific use case * Change the parameter types from std::vector to ArrayRef so that SmallVector arguments can be used. * Similarly, rename applyExtTspLayout to computeExtTspLayout. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D159526	2023-09-21 13:13:03 -07:00
Job Noorman	dc925be68b	[BOLT][RISCV] Carry-over annotations when fixing calls (#66763 ) `FixRISCVCallsPass` changes all different forms of calls to `PseudoCALL` instructions. However, the original call's annotations were lost in the process. This patch fixes this by moving all annotations from the old to the new call. `MCPlusBuilder::moveAnnotations` had to be made public for this.	2023-09-21 06:37:47 +00:00


237 Commits