When the linker relaxes a GOT load, it changes an ADRP+LDR instruction pair
into ADRP+ADD, e.g. `adrp x0, :got:sym; ldr x0, [x0, :got_lo12:sym]` becomes
`adrp x0, sym; add x0, x0, :lo12:sym`. It is relatively straightforward to
detect and symbolize the second instruction in the disassembler. However, it
is not always possible to properly symbolize the ADRP instruction without
looking at the second instruction. Hence, we have the FixRelaxationPass that
adjusts the operand of ADRP by looking at the corresponding ADD.
This PR tries to properly symbolize ADRP earlier in the pipeline, i.e.
in AArch64MCSymbolizer. This change makes it easier to adjust the
instruction once we add AArch64 support in `scanExternalRefs()`.
Additionally, we get the benefit of seeing properly symbolized operands when
examining the function state prior to running FixRelaxationPass.
To disambiguate an ADRP operand that has a GOT relocation against it, we
look at the operand's contents/value. If it contains the address of a page
that is valid for GOT, we assume that the operand wasn't modified by the
linker and leave it to FixRelaxationPass to make the proper adjustment. If
the page referenced by the ADRP cannot point into GOT, that's an indication
that the linker has modified the operand, and we substitute the operand with
a non-GOT reference to the symbol.
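
To make the heuristic concrete, here is a minimal sketch of the page check,
assuming hypothetical GOTStart/GOTEnd bounds and a pre-decoded ADRP page
immediate (not BOLT's actual interface):

```cpp
#include <cstdint>

// Hypothetical sketch of the page check; GOTStart/GOTEnd and the pre-decoded
// ADRP immediate are assumptions, not BOLT's actual interface.
bool operandMayPointToGOT(uint64_t InstAddress, int64_t PageImm,
                          uint64_t GOTStart, uint64_t GOTEnd) {
  // ADRP materializes the 4KiB page (PC & ~0xfff) + imm * 4096.
  const uint64_t TargetPage =
      (InstAddress & ~uint64_t(0xfff)) + (uint64_t(PageImm) << 12);
  // The operand is "valid for GOT" if its page overlaps [GOTStart, GOTEnd).
  return TargetPage >= (GOTStart & ~uint64_t(0xfff)) &&
         TargetPage <= ((GOTEnd - 1) & ~uint64_t(0xfff));
}
```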
When an indirect branch instruction is decoded, the analyzeIndirectBranch
method is asked whether the instruction matches a well-known code pattern.
On AArch64, the only special pattern detected is a jump table, emitted as a
branch to the sum of a constant base address and a variable offset.
Therefore, if `Inst.getOpcode()` is one of `AArch64::BRA*`, Inst cannot
belong to such a jump table pattern, and we return early.
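
A hedged sketch of that early check (the opcode list is illustrative; the
real logic lives in analyzeIndirectBranch):

```cpp
#include "MCTargetDesc/AArch64MCTargetDesc.h" // AArch64 opcode enum
#include "llvm/MC/MCInst.h"

// Hedged sketch, not the exact BOLT code: authenticated branches (BRA*) can
// never be part of the jump-table dispatch pattern, so the analysis can
// return early for them.
static bool isAuthenticatedIndirectBranch(const llvm::MCInst &Inst) {
  switch (Inst.getOpcode()) {
  case llvm::AArch64::BRAA:
  case llvm::AArch64::BRAAZ:
  case llvm::AArch64::BRAB:
  case llvm::AArch64::BRABZ:
    return true; // branch with pointer authentication: not a jump table
  default:
    return false;
  }
}
```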
Instead of filtering and modifying relocations in readRelocations(),
preserve the relocation info and use it in the symbolizing disassembler.
This change mostly affects AArch64, where we need to look at the original
linker relocations in order to properly symbolize instruction operands.
These tests have been failing since:
commit 1cfca53b9f
Author: Arthur Eubanks <aeubanks@google.com>
Date: Wed Mar 12 16:20:13 2025 -0700
This patch works around the failures by removing some FileCheck
directives. Hopefully, BOLT folks can chime in and commit a proper
fix.
We used to filter out relocations corresponding to NOP+ADR instruction
pairs that were the result of the linker's "relaxation" optimization.
However, these relocations will be useful for reversing that optimization.
Keep the relocations, but ignore them while symbolizing ADR instruction
operands.
`perf2bolt` and `llvm-boltdiff` are no longer separate targets but symlinks
to `llvm-bolt`, created when `llvm-bolt` itself is installed. As a result,
when you try to build either of them directly, ninja reports that no such
targets exist.
Add AArch64MCSymbolizer that symbolizes `MCInst` operands during
disassembly. The symbolization was previously done in
`BinaryFunction::disassemble()`, but it is also required by
`scanExternalRefs()` for "lite" mode functionality. Hence, similar to
x86, I've implemented the symbolizer interface that uses
`BinaryFunction` relocations to properly create instruction operands. I
expect the result of the disassembly to be identical after the change.
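
Sketched below is the general shape of such a symbolizer (the relocation
lookup and operand rewrite are assumptions for illustration, not the exact
implementation):

```cpp
#include "bolt/Core/BinaryFunction.h"
#include "llvm/MC/MCDisassembler/MCSymbolizer.h"
#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCInst.h"

// Hedged sketch: the relocation lookup and operand rewrite below are
// illustrative assumptions, not the exact BOLT implementation.
class AArch64MCSymbolizer : public llvm::MCSymbolizer {
  llvm::bolt::BinaryFunction &Function;

public:
  AArch64MCSymbolizer(llvm::MCContext &Ctx, llvm::bolt::BinaryFunction &BF)
      : llvm::MCSymbolizer(Ctx, /*RelInfo=*/nullptr), Function(BF) {}

  bool tryAddingSymbolicOperand(llvm::MCInst &Inst, llvm::raw_ostream &CStream,
                                int64_t Value, uint64_t Address, bool IsBranch,
                                uint64_t Offset, uint64_t OpSize,
                                uint64_t InstSize) override {
    // If the function has a relocation against this instruction, emit a
    // symbolic operand instead of the raw immediate.
    const llvm::bolt::Relocation *Rel =
        Function.getRelocationAt(Address + Offset - Function.getAddress());
    if (!Rel || !Rel->Symbol)
      return false;
    Inst.addOperand(llvm::MCOperand::createExpr(
        llvm::MCSymbolRefExpr::create(Rel->Symbol, Ctx)));
    return true;
  }

  void tryAddingPcLoadReferenceComment(llvm::raw_ostream &CStream,
                                       int64_t Value,
                                       uint64_t Address) override {}
};
```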
The AArch64 disassembler was not calling `tryAddingSymbolicOperand()` for
`MOV` instructions. Fix that. Additionally, the disassembler marks `ldr`
instructions as branches by setting the `IsBranch` parameter to true. Ignore
the parameter and rely on the `MCPlusBuilder` interface instead.
I've modified the `--check-encoding` flag to verify the symbolization of
operands of instructions that have relocations against them.
Add BinaryContext::createInstructionPatch() interface for patching parts
of the original binary with new instruction sequences. Refactor
PatchEntries pass to use the new interface.
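
A hypothetical usage sketch (the exact signature and the createTrap helper
are assumptions):

```cpp
#include "bolt/Core/BinaryContext.h"

// Hypothetical usage sketch: createInstructionPatch() is assumed to take the
// patch address and the replacement instruction sequence.
void patchWithTrap(llvm::bolt::BinaryContext &BC, uint64_t PatchAddress) {
  llvm::bolt::InstructionListType Seq;
  llvm::MCInst Trap;
  BC.MIB->createTrap(Trap); // build a single trap instruction via MCPlusBuilder
  Seq.push_back(Trap);
  // Registers a patch that the rewriter emits over the original bytes.
  BC.createInstructionPatch(PatchAddress, Seq);
}
```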
In analyzeInstructionForFuncReference(), use MCPlusBuilder interface
while scanning symbolic operands of MCInst. Should be NFC on x86, but
will make the function work on other architectures. Note that it's
currently unused on non-x86 as its functionality is exclusive to safe
ICF that runs on x86 only.
Add two additional profile quality stats for CG (call graph) and CFG
(control flow graph) flow conservations besides the CFG discontinuity
stats introduced in #109683. The two new stats quantify how different
"in-flow" is from "out-flow" in the following cases where they should be
equal. The smaller the reported stats, the better the flow conservations
are.
CG flow conservation: for each function that is not a program entry, the
number of times the function is called according to CG ("in-flow")
should be equal to the number of times the transition from an entry
basic block of the function to another basic block within the function
is recorded ("out-flow").
CFG flow conservation: for each basic block that is not a function entry
or exit, the number of times the transition into this basic block from
another basic block within the function is recorded ("in-flow") should
be equal to the number of times the transition from this basic block to
another basic block within the function is recorded ("out-flow").
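
For the CFG case, a minimal sketch of the conservation measure might look as
follows; the exact statistic the pass reports is an assumption here:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// A minimal sketch of the CFG flow-conservation measure; the exact reported
// statistic in the pass is an assumption.
struct Edge { uint64_t Count; }; // profiled transition count along a CFG edge

double flowConservationGap(const std::vector<Edge> &InEdges,
                           const std::vector<Edge> &OutEdges) {
  uint64_t InFlow = 0, OutFlow = 0;
  for (const Edge &E : InEdges)
    InFlow += E.Count;  // recorded transitions into the block
  for (const Edge &E : OutEdges)
    OutFlow += E.Count; // recorded transitions out of the block
  const uint64_t Max = std::max(InFlow, OutFlow);
  if (Max == 0)
    return 0.0; // no profile data: trivially conserved
  // Relative difference: 0 means perfectly conserved flow.
  return double(Max - std::min(InFlow, OutFlow)) / double(Max);
}
```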
Use `-v=1` for more detailed bucketed stats, and use `-v=2` to dump
functions / basic blocks with poor flow conservation.
A BOLT-instrumented binary today has a readable (R), writeable (W), and
executable (X) segment, which the Android system won't load due to its WX
attribute. The RWX segment was produced because BOLT performs two-step
linking: first for everything in the updated or rewritten input binary, and
next for the runtime library. Each link lays out sections in the order of RX
sections, followed by RO sections, followed by RW sections. So we could end
up with the RW section `.bolt.instr.counters` surrounded by a number of RO
and RX sections; a new text segment was then formed to include all RX
sections, which pulled in the RW section in the middle, hence the RWX
segment. One way to fix this is to separate the RW `.bolt.instr.counters`
section into its own segment by a) assigning regular page-aligned starting
addresses to `.bolt.instr.counters` and the section that follows it, and
b) creating two extra program headers accordingly.
When processing BOLTed binaries with a BAT section, we used to
indiscriminately use `BAT->getFallthroughsInTrace` to record
fall-throughs, even if the function is not covered by BAT.
Fix that by using the non-BAT, CFG-based `getFallthroughsInTrace` if the
function is not in BAT.
Test Plan: updated bolt-address-translation-yaml.test
BOLT used to mark multi-entry functions as non-simple in non-relocation
mode, with the reasoning that we can't move them due to potentially
undetected references. However, this doesn't apply in aggregation mode, as
BOLT doesn't perform optimizations there.
Relax this constraint in the case of an aggregation job.
Test Plan: added entry-point-fallthru.s
... which is caused by a seemingly recent change in BOLT's basic block
calculation, where function calls now appear to end basic blocks. I
don't have a pointer to the commit that caused this change; I'll be
looking for it later. For now, I'm trying to get the regression tests
passing again.
Sometimes we need to know the size of a symbol in addition to its address,
so we can start using the existing `BOLTLinker::lookupSymbolInfo()`
(which returns symbol address and size) and remove
`BOLTLinker::lookupSymbol()` (which only returns the symbol address). For
both, we need to check the return value, as it is wrapped in
`std::optional<>`, which makes the difference between them even smaller.
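
For illustration, migrating a call site might look like this (a sketch; the
surrounding code is hypothetical):

```cpp
#include "bolt/Core/Linker.h"
#include "llvm/ADT/StringRef.h"
#include <cstdint>
#include <optional>

// Sketch of migrating a lookupSymbol() call site to the unified interface.
std::optional<uint64_t> getSymbolAddress(const llvm::bolt::BOLTLinker &Linker,
                                         llvm::StringRef Name) {
  // lookupSymbolInfo() returns both address and size; callers that only need
  // the address can simply ignore the size field.
  if (auto Info = Linker.lookupSymbolInfo(Name))
    return Info->Address;
  return std::nullopt; // symbol not found: caller must handle this
}
```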
BOLT currently links and initializes all LLVM targets. This
substantially increases the binary size, and the link time if LTO is used.
Instead, only link the targets specified by BOLT_TARGETS_TO_BUILD. We
also have to initialize only those targets, so generate a
TargetConfig.def file with the necessary information. The way the
initialization is done mirrors what llvm-exegesis does.
This reduces llvm-bolt size from 137MB to 78MB for me.
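
The pattern is the usual .def-file X-macro; a hedged sketch (the macro name
and the generated file's contents are assumptions):

```cpp
#include "llvm/Support/TargetSelect.h"

// A generated TargetConfig.def would contain one line per configured
// backend, e.g.:
//   BOLT_TARGET(AArch64)
//   BOLT_TARGET(X86)
#define BOLT_TARGET(target)                                                    \
  LLVMInitialize##target##TargetInfo();                                        \
  LLVMInitialize##target##TargetMC();                                          \
  LLVMInitialize##target##AsmParser();                                         \
  LLVMInitialize##target##Disassembler();                                      \
  LLVMInitialize##target##Target();                                            \
  LLVMInitialize##target##AsmPrinter();

// Expands to initialization calls for exactly the configured targets.
void initializeConfiguredTargets() {
#include "TargetConfig.def"
}
#undef BOLT_TARGET
```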
Traces are triplets of branch source, target, and fall-through end (next
branch).
Traces simplify the differentiation of fall-throughs into local- and
external-origin ones, which improves performance over a profile with
undifferentiated fall-throughs by eliminating profile discontinuity in
call-to-continuation fall-throughs. This makes it possible to avoid
converting return profile into call-to-continuation profile, which may
introduce statistical biases.
The existing format makes provisions for local- (F) and external-origin (f)
fall-throughs, but the profile producer needs to know function
boundaries. BOLT has that information readily available, so providing
the origin branch of a fall-through is a functional replacement for the
fall-through kind (f or F). This also has the effect of combining
branches and fall-throughs into a single record.
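
For illustration, a trace record could look like the following (the `T`
mnemonic and the trailing count field are assumptions; only the
source/target/fall-through-end triplet is described above):

```
# T <branch_source> <branch_target> <fallthrough_end> <count>
T 400a10 400b20 400b3c 17
```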
As traces subsume other pre-aggregated profile kinds, BOLT may drop
support for them soon. Users of pre-aggregated profile format are
advised to migrate to the trace format.
Test Plan: Updated callcont-fallthru.s
`addPendingRelocation` is now the only way to add a pending relocation;
`addRelocation` can no longer be used for this.
Update the only user (`BinaryContextTester`).
Use LLVM's getMainExecutable() helper instead of rolling our own. This
will result in standard behavior across platforms, such as making sure
that symlinks are always resolved.
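
For reference, the idiom looks like this (a usage sketch, not BOLT's exact
call site):

```cpp
#include "llvm/Support/FileSystem.h"
#include <string>

// The standard LLVM idiom: the address of any symbol in the main executable
// anchors the lookup, and symlinks are resolved for us.
std::string getBinaryPath(const char *Argv0) {
  static int Anchor; // its address identifies the running executable
  return llvm::sys::fs::getMainExecutable(Argv0, (void *)&Anchor);
}
```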
When printing disassembly of a function with constant islands, include
the island info in the dump.
At the moment, only print islands in pre-CFG state. Include islands that
are interleaved with instructions.
A few tests generate a statically-linked position-independent executable
with `-nostdlib -Wl,--unresolved-symbols=ignore-all -pie` (`%clang`) and
test PLT handling. (--unresolved-symbols=ignore-all suppresses undefined
symbol errors and serves as a convenience hack.)
This relies on an unguaranteed linker behavior: a statically-linked PIE
does not necessarily generate PLT entries.
While current lld generates a PLT entry, it will change to suppress the
PLT entry to simplify internal handling and improve consistency.
(The behavior is inconsistent in GNU ld: some ports generate a .dynsym entry
while others don't; most seem to generate a PLT entry, but some ports use a
weird `R_*_NONE` relocation.)
When a function has an indirect branch with unknown control flow, we
preserve nops in order to keep all instruction offsets (from the start
of the function) the same in case the indirect branch is used by a
PC-relative jump table. However, when we know the control flow of the
function, we should be able to safely remove nops.
The code for jump table detection on AArch64 asserts liberally whenever
the input instruction sequence does not match the expected pattern. As a
result, BOLT fails to process binaries with such sequences instead of
ignoring functions with unknown control flow.
Remove asserts in analyzeIndirectBranchFragment() and mark indirect
jumps as instructions with unknown control flow instead.
Remove the options to generate autofdo data (unused) and `use-event-pc`
(not beneficial).
Cuts down perf2bolt time for an 11GB perf.data by 40s (11:10->10:30).
This fixes the following tests when clang is compiled with
`-DENABLE_LINKER_BUILD_ID=ON`:
BOLT :: AArch64/check-init-not-moved.s
BOLT :: X86/dwarf5-dwarf4-types-backward-forward-cross-reference.test
BOLT :: X86/dwarf5-locexpr-referrence.test
This fixes a number of issues introduced in #97130 when
LLVM_LIBDIR_SUFFIX is a non-empty string. Make sure that the libdir is
always referenced as `lib${LLVM_LIBDIR_SUFFIX}`, not as just `lib` or
`${CMAKE_INSTALL_LIBDIR}${LLVM_LIBDIR_SUFFIX}`.
This is the standard libdir convention for all LLVM subprojects. Using
`${CMAKE_INSTALL_LIBDIR}${LLVM_LIBDIR_SUFFIX}` would result in a
duplicate suffix.