clang-p2996

Author	SHA1	Message	Date
Nathan Sidwell	4308c7422d	[BOLT][NFC] Refactor relocation arch selection (#87829 ) Convert the relocation routines to switch on architecture and have an explicit unreachable default.	2024-04-08 09:01:28 -04:00
Amir Ayupov	fd38366e45	[BOLT][NFC] Clean includes, add license headers (#87200 )	2024-03-31 19:29:45 -07:00
Amir Ayupov	d12e45ad16	[BOLT][NFC] Split out DomTree construction from BF::calculateLoopInfo (#87181 )	2024-03-31 06:24:19 -07:00
Amir Ayupov	c0febca3a6	[BOLT][NFC] Refactor BC::createBinaryContext for #81346 (#87172 )	2024-03-30 20:43:23 -07:00
Maksim Panchenko	7de82ca369	[BOLT] Don't terminate on trap instruction for Linux kernel (#87021 ) Under normal circumstances, we terminate basic blocks on a trap instruction. However, Linux kernel may resume execution after hitting a trap (ud2 on x86). Thus, we introduce "--terminal-trap" option that will specify if the trap instruction should terminate the control flow. The option is on by default except for the Linux kernel mode when it's off.	2024-03-29 16:41:15 -07:00
Amir Ayupov	d8fe2e4bb0	[BOLT] Fix enumeration of secondary entry points Make them start with 1 instead of 0 (reserved for primary entry point). Test Plan: ``` bin/llvm-lit -a tools/bolt/test/X86/yaml-secondary-entry-discriminator.s ``` Reviewers: rafaelauler, ayermolo, maksfb, dcci Reviewed By: maksfb Pull Request: https://github.com/llvm/llvm-project/pull/86848	2024-03-27 15:23:49 -07:00
Alexander Yermolovich	f3cfe016c5	[BOLT][DWARF] Add support for cross-cu references for debug-names (#86015 ) The DW_AT_abstract_origin can be a cross-cu reference as a by-product of LTO. On IR level for absolute references an address is stored, vs a DIE for relative references. Added a map to keep track of cross-cu referenced DIEs to use when we add an Entry.	2024-03-22 13:48:49 -07:00
Maksim Panchenko	6b1cf00400	[BOLT] Add support for Linux kernel static keys jump table (#86090 ) Runtime code modification used by static keys is the most ubiquitous self-modifying feature of the Linux kernel. The idea is to eliminate the condition check and associated conditional jump on a hot path if that condition (based on a boolean value of a static key) does not change often. Whenever the condition changes, the kernel runtime modifies all code paths associated with that key, flipping the code between nop and (unconditional) jump.	2024-03-21 14:05:21 -07:00
Alexander Yermolovich	4841858862	[BOLT][DWARF] Add support to debug_names for DW_AT_abstract_origin/DW_AT_specification (#85485 ) According to the DWARF spec, a DIE that has DW_AT_specification or DW_AT_abstract_origin can be part of .debug_names if the DIE that attribute points to has DW_AT_name or DW_AT_linkage_name.	2024-03-18 15:28:01 -07:00
Alexander Yermolovich	a4610c7182	[BOLT][DWARF] Add support for DW_IDX_parent (#85285 ) This adds support for DW_IDX_parent. If a DIE has a parent, then DW_IDX_parent in its Entry will point to the Entry for that parent DIE. Otherwise it will have DW_FORM_flag_present in the abbrev, which takes zero space in the Entry. This came from https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151	2024-03-15 13:52:45 -07:00
Alexander Yermolovich	6d4aa9d70e	[BOLT][DWARF] Fix foreign TU index with local TUs (#84594 ) The foreign TU list immediately follows the local TU list and they both use the same index, so that if there are N local TU entries, the index for the first foreign TU is N. Changed so that the size of local TU is accounted for when setting foreign TU index.	2024-03-11 12:20:25 -07:00
Mehdi Amini	716042a63f	Rename llvm::ThreadPool -> llvm::DefaultThreadPool (NFC) (#83702 ) The base class llvm::ThreadPoolInterface will be renamed llvm::ThreadPool in a subsequent commit. This is a breaking change: clients that used to create a ThreadPool must now create a DefaultThreadPool instead.	2024-03-05 18:00:46 -08:00
Mehdi Amini	4a4fb930a5	Use the new ThreadPoolInterface base class instead of the concrete implementation (NFC) (#84056 )	2024-03-05 12:37:11 -08:00
sinan	71c2a132b2	[BOLT] support AArch64 JUMP26 createRelocation (#83531 ) Add R_AARCH64_JUMP26 implementation for createRelocation, which could significantly reduce the number of failed scan-refs cases if we perform bolt on a selective range of functions.	2024-03-04 17:11:47 +08:00
Maksim Panchenko	d7d564b2fc	[BOLT] Add BinaryFunction::registerBranch(). NFC (#83337 ) Add an external interface to register a branch in a function that is in disassembled state. This allows making custom modifications to the disassembler. E.g., a pre-CFG pass can add an instruction and register a branch that will later be used during the CFG construction.	2024-02-28 20:04:28 -08:00
Maksim Panchenko	3f2a9e5910	[BOLT] Sort TakenBranches immediately before use. NFCI (#83333 ) Move code that sorts TakenBranches right before the branches are used. We can populate TakenBranches in pre-CFG post-processing and hence have to postpone the sorting to a later point in the processing pipeline. Will add such a pass later. For now it's NFC.	2024-02-28 19:51:44 -08:00
Maksim Panchenko	7c206c7812	[BOLT] Refactor interface for instruction labels. NFCI (#83209 ) To avoid accidentally setting the label twice for the same instruction, which can lead to a "lost" label, introduce getOrSetInstLabel() function. Rename existing functions to getInstLabel()/setInstLabel() to make it explicit that they operate on instruction labels. Add an assertion in setInstLabel() that the instruction did not have a prior label set.	2024-02-27 18:44:28 -08:00
Alexander Yermolovich	6de5fcc746	[BOLT][DWARF] Add support for .debug_names (#81062 ) DWARF5 spec supports the .debug_names acceleration table. This is the formalized version of a combination of gdb-index/pubnames/types. Added implementation of it to BOLT. It supports both monolithic and split dwarf, with and without Type Units. It does not include parent indices; those will come in a follow-up PR. Unlike LLVM output, this will put all the CUs and TUs into one Module.	2024-02-26 14:00:31 -08:00
Alexander Yermolovich	004c1972b4	[BOLT][DWARF][NFC] Expose DebugStrOffsetsWriter::clear (#82548 ) Refactored code that clears data structures in DebugStrOffsetsWriter into a clear() function and made initialize() public. This is for https://github.com/llvm/llvm-project/pull/81062.	2024-02-21 16:48:02 -08:00
Alexander Yermolovich	640e781dc8	[BOLT][DWARF][NFC] Use SkeletonCU in place of IsDWO check (#82540 ) Changed isDWO to a function that checks Skeleton CU that is passed in. This is for preparation for https://github.com/llvm/llvm-project/pull/81062.	2024-02-21 16:18:18 -08:00
Maksim Panchenko	5daf2001a1	[BOLT] Fix memory leak in BinarySection (#82520 ) The change in #80950 exposed a memory leak in BinarySection. Let BinarySection manage memory passed via updateContents() unless a valid SectionID is set indicating that the contents are managed by JITLink.	2024-02-21 11:54:34 -08:00
Alexander Yermolovich	c9e8e91aca	[BOLT][DWARF] Fix out of order rangelists/loclists (#81645 ) GCC can generate rangelists/loclists that are out of order. Fixed so that we don't assert, and instead generate partially optimized list. Through most code paths we do sort rnglists/loclists, but not for loclist for a path where BOLT does not modify a function. Although it's nice to have lists sorted, this implementation shouldn't rely on it. This also fixes an issue if we partially capture a list we would write out *end_of_list in helper function. So tools won't see the rest of the addresses being written out.	2024-02-14 11:23:57 -08:00
Amir Ayupov	52cf07116b	[BOLT][NFC] Log through JournalingStreams (#81524 ) Make core BOLT functionality more friendly to being used as a library instead of in our standalone driver llvm-bolt. To accomplish this, we augment BinaryContext with journaling streams that are to be used by most BOLT code whenever something needs to be logged to the screen. Users of the library can decide if logs should be printed to a file, no file or to the screen, as before. To illustrate this, this patch adds a new option `--log-file` that allows the user to redirect BOLT logging to a file on disk or completely hide it by using `--log-file=/dev/null`. Future BOLT code should now use `BinaryContext::outs()` for printing important messages instead of `llvm::outs()`. A new test log.test enforces this by verifying that no strings are printed to screen once the `--log-file` option is used. In previous patches we also added a new BOLTError class to report common and fatal errors, so code shouldn't call exit(1) now. To easily handle problems as before (by quitting with exit(1)), callers can now use `BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code needs to deal with BOLT errors. To test this, we have fatal.s that checks we are correctly quitting and printing a fatal error to the screen. Because this is a significant change by itself, not all code was yet ported. Code from Profiler libs (DataAggregator and friends) still prints errors directly to screen. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC	2024-02-12 14:53:53 -08:00
Amir Ayupov	13d60ce2f2	[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523 ) As part of the effort to refactor old error handling code that would directly call exit(1), in this patch continue the migration on libCore, libRewrite and libPasses to use the new BOLTError class whenever a failure occurs. Test Plan: NFC Co-authored-by: Rafael Auler <rafaelauler@fb.com>	2024-02-12 14:51:15 -08:00
Amir Ayupov	fa7dd4919a	[BOLT][NFC] Add BOLTError and return it from passes (1/2) (#81522 ) As part of the effort to refactor old error handling code that would directly call exit(1), in this patch we add a new class BOLTError and auxiliary functions `createFatalBOLTError()` and `createNonFatalBOLTError()` that allow BOLT code to bubble up the problem to the caller by using the Error class as a return type (or Expected). Also changes passes to use these. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC	2024-02-12 14:39:59 -08:00
Maksim Panchenko	8075f0db16	[BOLT] Use new contents when emitting sections with relocations (#80782 ) We can use BinarySection::updateContents() to change section contents. However, if we also add relocations for new contents, then the original data (i.e. not updated) is going to be used. Fix that. A follow-up diff will use the update interface and will include a test case.	2024-02-06 14:38:21 -08:00
Alexander Yermolovich	7d272722fb	[BOLT][DWARF] Add option to specify DW_AT_comp_dir (#79395 ) Added a --comp-dir-override option that overrides DW_AT_comp_dir in the unit die. This allows for llvm-bolt to be invoked from any directory and still find .dwo files.	2024-01-25 15:00:52 -08:00
Amir Ayupov	9fec33aadc	Revert "[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653 )" This reverts commit `82bc33ea3f`. Accidentally pushed unrelated changes.	2024-01-18 19:59:09 -08:00
Amir Ayupov	82bc33ea3f	[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653 ) Fix the bug where merge-fdata unconditionally outputs boltedcollection line, regardless of whether input files have it set. Test Plan: Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this fix.	2024-01-18 19:44:16 -08:00
Alexander Yermolovich	ad4cead67c	[BOLT][DWARF][NFC] Initialize CloneUnitCtxMap with current partition size (#75876 ) We would always allocate the maximum amount for the vector containing DWARFUnitInfo. In real use cases what ends up happening is we allocate a giant vector when processing one CU, or for the thin-lto case multiple CUs. This led to a lot of memory overhead, and a 2x BOLT processing slowdown for at least one service built with monolithic DWARF. For binaries built with LTO with clang, all CUs that have cross references will share an abbrev table and will be processed in one batch. The rest of the CUs are processed in batches of --cu-processing-batch-size, which defaults to 1. For theoretical cases where cross-cu references are present but the CUs do not share an abbrev table, the size of CloneUnitCtxMap will increase as each CU is being processed.	2023-12-20 16:12:52 -08:00
Alexander Yermolovich	bf2b035e58	[BOLT][DWARF] Fix handling .debug_str_offsets for type units (#75522 ) There was an assumption that TUs and CUs share .debug_str_offsets contribution. For ThinLTO builds it is not the case. Changed so that we parse contributions for TUs also, and did some refactoring so that we don't re-parse contributions that were not modified.	2023-12-14 17:27:21 -08:00
Kazu Hirata	ad8fd5b185	[BOLT] Use StringRef::{starts,ends}_with (NFC) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-13 23:34:49 -08:00
Alexander Yermolovich	fb9a851224	[BOLT][DWARF] Fix handling of debug_str_offsets (#75100 ) We were not setting size field of .debug_str_offsets correctly. Fixed it, and added a test.	2023-12-11 15:56:32 -08:00
Kazu Hirata	1cc5431285	[BOLT] Fix warnings This patch fixes: bolt/lib/Core/BinaryFunctionProfile.cpp:222:10: error: variable 'BBMergeSI' set but not used [-Werror,-Wunused-but-set-variable] bolt/lib/Passes/VeneerElimination.cpp:67:12: error: variable 'VeneerCallers' set but not used [-Werror,-Wunused-but-set-variable]	2023-12-11 12:55:29 -08:00
Amir Ayupov	b039ccc684	[BOLT] Provide backwards compatibility for YAML profile with std::hash (#74253 ) Provide backwards compatibility for YAML profile that uses `std::hash`: xxh3 hash is the default for newly produced profile (sets `std-hash: false`), whereas the profile that doesn't specify `std-hash` will be treated as `std-hash: true`, preserving old behavior.	2023-12-11 12:27:32 -08:00
Nathan Sidwell	9596676e65	[BOLT] Determine address size from binary (#74870 ) Query the executable for address size.	2023-12-09 14:39:57 -05:00
ShatianWang	56bbf8135e	[BOLT] CDSplit main logic part 1/2 (#73895 ) This diff defines and initializes auxiliary variables used by CDSplit and implements two important helper functions. The first helper function approximates the block level size increase if a function is hot-warm split at a given split index (X86 specific). The second helper function finds all calls in the form of X->Y or Y->X for each BF given function order [... X ... BF ... Y ...]. These calls are referred to as "cover calls". Their distance will decrease if BF's hot fragment size is further reduced by hot-warm splitting. NFC.	2023-11-30 20:55:36 -05:00
Maksim Panchenko	4f3081296f	[BOLT][NFC] Fix comment (#73983 ) Fix off-by-one error in comment.	2023-11-30 14:31:38 -08:00
ShatianWang	c43d0432ef	[BOLT] Create .text.warm for 3-way splitting (#73863 ) This commit explicitly adds a warm code section, .text.warm, when -split-functions -split-strategy=cdsplit is used. This replaces the previous approach of using .text.cold.0 as warm and .text.cold.1 as cold in 3-way function splitting. NFC.	2023-11-29 22:42:36 -05:00
Maksim Panchenko	4bcbbe1f70	[BOLT] Refactor fixBranches() (#73752 ) Simplify code in fixBranches(). Mostly NFC, except the x86-specific check for code fragments now takes into account presence of more than two fragments. Should only matter when we split code into multiple fragments and can run fixBranches() more than once. Also, don't replace a branch target with the same one, as such operation may allocate memory for extra MCSymbolRefExpr.	2023-11-29 16:24:16 -08:00
Alexander Yermolovich	00dbea7c73	[BOLT][DWARF][NFC] Added const to variable (#73731 ) Nit followup to 72729.	2023-11-28 17:30:28 -08:00
Alexander Yermolovich	b47b3bee7b	[BOLT][DWARF] Fix handling of DWARF5 DWP (#72729 ) Fixed handling of DWP as input. Before BOLT crashed. Now it will write out correct CU, and all the TUs. Potential future improvement is to scan all the TUs used in this CU, and only include those.	2023-11-28 15:54:14 -08:00
spupyrev	e7dd596c68	[BOLT] Use deterministic xxh3 for computing BF/BB hashes (#72542 ) std::hash and ADT/Hashing::hash_value are non-deterministic functions whose results might vary across implementation/process/execution. Using xxh3 instead for computing hashes of BinaryFunctions and BinaryBasicBlock for stale profile matching. (A possible alternative is to use ADT/StableHashing.h based on FNV hashing but xxh3 seems to be more popular in LLVM) This is to address https://github.com/llvm/llvm-project/issues/65241.	2023-11-27 14:45:46 -08:00
Maksim Panchenko	f4834255d3	[BOLT] Reset output addresses for deleted blocks (#73429 ) This is a follow-up to #73076. We need to reset output addresses for deleted blocks, otherwise the address translation may mistakenly attribute input address of a deleted block to a non-zero address. While working on a test case, I've discovered that DWARF output ranges were already broken for deleted basic blocks: #73428. I will provide a test case for this PR with a DWARF address range fix PR.	2023-11-25 23:23:47 -08:00
Maksim Panchenko	365114292a	[BOLT][NFC] Refactor function state check (#73420 ) Remove redundant check in updateOutputValues().	2023-11-25 21:09:54 -08:00
ShatianWang	d333c0e062	[BOLT] Extend calculateEmittedSize() for block size calculation (#73076 ) This commit modifies BinaryContext::calculateEmittedSize() to update the BinaryBasicBlock::OutputAddressRange of each basic block in the function in place. BinaryBasicBlock::getOutputSize() now gives the emitted size of the basic block.	2023-11-23 15:28:31 -05:00
llongint	f3e54f2f97	[BOLT][NFC] Extract a function for dump MCInst (#67225 ) In GDB debugging, obtaining the assembly representation of MCInst is more intuitive.	2023-11-21 20:30:44 +08:00
Maksim Panchenko	84602066a6	[BOLT] Fix C++ exceptions when LPStart is specified (#72737 ) Whenever LPStartEncoding was different from DW_EH_PE_omit, we used to miscalculate LPStart. As a result, landing pads were assigned wrong addresses. Fix that.	2023-11-20 20:55:38 -08:00
Maksim Panchenko	f653f6d57a	[BOLT][NFC] Delete unused declarations (#72596 )	2023-11-16 23:36:19 -08:00
JohnLee1243	ae51ec84bb	[Bolt] Solving pie support issue (#65494 ) PIE is supported by default since clang 14. This causes a parsing error when using perf2bolt, because the base address is not computed correctly. Fix the method of getting the base address. If SegInfo.Alignment is not equal to the page size, alignDown(SegInfo.FileOffset, SegInfo.Alignment) may not equal FileOffset, so SegInfo.FileOffset and FileOffset should both be aligned down by SegInfo.Alignment first and then compared for equality. The .text segment's offset from the base address in the VAS is aligned by page size, so MMapAddress's offset from the base address is alignDown(SegInfo.Address, pagesize) instead of alignDown(SegInfo.Address, SegInfo.Alignment). The base address calculation is changed accordingly. Co-authored-by: Li Zhuohang <lizhuohang3@huawei.com>	2023-11-16 15:05:06 +08:00
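
The base-address rule described in the last entry above (#65494) can be sketched roughly as follows. This is a minimal standalone illustration built only from the wording of that commit message, not the code in BOLT itself: the `SegmentInfo` struct, the `computeBaseAddress` name, and its parameters are hypothetical stand-ins, and the segment alignment and page size are assumed to be non-zero.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the segment record named in the commit message.
struct SegmentInfo {
  uint64_t Address;    // virtual address of the segment as recorded in the file
  uint64_t FileOffset; // file offset of the segment
  uint64_t Alignment;  // segment alignment (assumed non-zero)
};

// Round Value down to the nearest multiple of Align (Align must be non-zero).
constexpr uint64_t alignDown(uint64_t Value, uint64_t Align) {
  return Value - Value % Align;
}

// Sketch of the rule from the commit message:
//  1. A mapping matches the segment when both file offsets, aligned down by
//     the segment alignment, are equal.
//  2. The base address is the mapping address minus the segment address
//     aligned down by the page size (not by the segment alignment).
// Returns std::nullopt when the mapping does not match the segment.
std::optional<uint64_t> computeBaseAddress(uint64_t MMapAddress,
                                           uint64_t FileOffset,
                                           const SegmentInfo &Seg,
                                           uint64_t PageSize) {
  if (alignDown(Seg.FileOffset, Seg.Alignment) !=
      alignDown(FileOffset, Seg.Alignment))
    return std::nullopt;
  return MMapAddress - alignDown(Seg.Address, PageSize);
}
```

For a segment aligned to 0x200000 on a system with 0x1000-byte pages, aligning the segment address by the segment alignment instead of the page size would shift the computed base by the difference between the two roundings, which is the discrepancy the commit message describes.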

351 Commits