With the --force-patch option, every original function entry point is
overwritten with a trampoline to the new version of the function to
prevent execution of the original code.
If the function size is too small for the trampoline code, we are forced
to bail out on rewriting the function. That presented a problem on
AArch64 due to the LongJmp pass, which assumed the presence of the new
copy of the function. If the new copy was not emitted, it could have led
to a relocation overflow.
Run the PatchEntries pass before LongJmp and make the latter aware of the
functions that are not going to be emitted. This makes --force-patch
behavior on AArch64 consistent with other architectures.
Besides virtual function pointers, a vtable can contain other kinds of
entries, such as those for RTTI data, that also require relocations. We
need to skip the special handling of relocations for non-virtual-function
pointers in a relative vtable.
Co-authored-by: Maksim Panchenko <maks@meta.com>
See https://discourse.llvm.org/t/rfc-keep-globalvalue-guids-stable/84801
for context.
This is a non-functional change which just changes the interface of
GlobalValue, in preparation for future functional changes. This part
touches a fair few users, so it is split out for ease of review. Future
changes to the GlobalValue implementation can then be focused purely on
that class.
This does the following:
* Rename GlobalValue::getGUID(StringRef) to
getGUIDAssumingExternalLinkage. This is simply making explicit at the
callsite what is currently implicit.
* Where possible, migrate users to directly calling getGUID on a
GlobalValue instance.
* Otherwise, where possible, have them call the newly renamed
getGUIDAssumingExternalLinkage, to make the assumption explicit.
There are a few cases where neither of the above are possible, as the
caller saves and reconstructs the necessary information to compute the
GUID themselves. We want to migrate these callers eventually, but for
this first step we leave them be.
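To illustrate, here is a minimal sketch of the two call patterns after the
rename, assuming a `GlobalValue` or a bare name is in hand; the wrapper
functions are purely illustrative.
```cpp
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/GlobalValue.h"
using namespace llvm;

// Preferred form: ask the GlobalValue itself, so its actual linkage is
// taken into account.
GlobalValue::GUID guidOfValue(const GlobalValue &GV) { return GV.getGUID(); }

// Name-only form: the renamed helper makes the external-linkage assumption
// explicit at the call site.
GlobalValue::GUID guidOfName(StringRef Name) {
  return GlobalValue::getGUIDAssumingExternalLinkage(Name);
}
```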
Despite our attempt (build-docs.sh)
to build the documentation with SVG,
it still uses PNG (see https://llvm.org/doxygen/classllvm_1_1StringRef.html),
which renders terribly on any high-DPI display.
SVG leads to a smaller installation and works fine
in all browsers (that has been true for _a while_,
see https://caniuse.com/svg), so this patch just unconditionally builds all
dot graphs as SVG in all subprojects and removes the option.
DataAggregator supports reading different kinds of profile data:
- perf data: branch records or IP samples,
- pre-aggregated branch data.
Make profile quality reporting uniform across all kinds of input:
- out-of-range and mismatching samples,
- samples in cold code in BAT mode (profiled BOLTed binary).
Test Plan: NFCI
Improve profile quality reporting by 1) fixing a format issue for small
binaries, 2) adding new stats for exception handling usage, 3) excluding
selected blocks when computing the CFG flow conservation score.
More specifically for 3), we are excluding blocks that satisfy at least
one of the following characteristics: a) is a landing pad, b) has at
least one landing pad with non-zero execution counts, c) ends with a
recursive call. The reason for a) and b) is that the thrower -->
landing pad edges are not explicitly represented in the CFG. The reason
for c) is that the call-continuation fallthrough edge count is not
important in the case of recursive calls.
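A self-contained sketch of the exclusion rule for 3); the block model below
is illustrative and does not mirror BOLT's BinaryBasicBlock interface.
```cpp
#include <cstdint>
#include <vector>

// Just enough per-block state to express conditions a)-c).
struct BlockInfoSketch {
  bool IsLandingPad = false;              // a)
  std::vector<uint64_t> LandingPadCounts; // execution counts of its landing pads, b)
  bool EndsWithRecursiveCall = false;     // c)
};

// A block is left out of the flow conservation score if any of a)-c) holds.
bool excludeFromFlowConservation(const BlockInfoSketch &BB) {
  if (BB.IsLandingPad)
    return true;
  for (uint64_t Count : BB.LandingPadCounts)
    if (Count > 0)
      return true;
  return BB.EndsWithRecursiveCall;
}
```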
Modified test `bolt/test/X86/profile-quality-reporting.test`.
Added test `bolt/test/X86/profile-quality-reporting-small-binary.s`.
Add an advanced-user flag so we are able to rewrite binaries when we fail
to identify a suitable location to put new code. The user can then supply a
custom location via --custom-allocation-vma. This happens most notably when
the binary has segments mapped to very high addresses.
We will increase the use of raw relocation types and eliminate fixup
kinds that correspond to relocation types. The getFixupKindInfo
functions will return an rvalue instead. Let's update the return type
from a const reference to a value type.
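A hedged before/after sketch of the interface shape; the backend class and
table below are illustrative rather than an actual LLVM target.
```cpp
// Illustrative stand-in for llvm::MCFixupKindInfo.
struct FixupKindInfoSketch {
  const char *Name;
  unsigned TargetOffset;
  unsigned TargetSize;
  unsigned Flags;
};

class AsmBackendSketch {
public:
  // Before: `const FixupKindInfoSketch &getFixupKindInfo(unsigned Kind) const;`
  // handed out a reference into a backend-owned static table.
  // After: returning by value means raw relocation types no longer need a
  // permanent table entry.
  FixupKindInfoSketch getFixupKindInfo(unsigned Kind) const {
    static const FixupKindInfoSketch Table[] = {
        {"FK_NONE", 0, 0, 0},
        {"FK_Data_4", 0, 32, 0},
        {"FK_Data_8", 0, 64, 0},
    };
    return Kind < 3 ? Table[Kind] : FixupKindInfoSketch{"raw", 0, 0, 0};
  }
};
```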
Patch functions are used to fix instructions in the original code, i.e.,
they are not functions in a traditional sense, but rather pieces of
emitted code that are embedded into real functions.
We used to emit FDEs for all functions, including patch functions.
However, FDEs for patches are not only unnecessary, but they can lead to
problems with libraries and runtimes that consume FDEs, e.g., the C++
exception handling runtime.
Note that we use named patches to fix function entry points and in that
case they behave more like regular functions. Thus we issue FDEs for
those.
a) Due to the different capabilities of the implemented functions,
rename the createCmpJE function.
b) Refactor the convertIndirectCallToLoad function to override the
interface.
Patch by WangJee, originally posted in #136129
This patch adds code generation for RISCV64 instrumentation. The work
involved includes the following three points:
a) Implements support for instrumenting direct function calls and jumps
on RISC-V. This relies on atomic instructions (used to increment
counters), which are only available on RISC-V when the A extension is
used.
b) Implements support for instrumenting indirect function calls
by implementing the createInstrumentedIndCallHandlerEntryBB and
createInstrumentedIndCallHandlerExitBB interfaces. In this process, we
need to accurately record the target address and IndCallID to ensure
the correct recording of the indirect call counters.
c) Implements the RISCV64 BOLT runtime library, with some system call
interfaces implemented through inline assembly. It computes the
difference between the runtime address of the .text section and its
static address in the section header table, which in turn can be used
to look up the indirect call description.
However, the community code currently has problems with relocations in
some scenarios; these have nothing to do with instrumentation. We may
continue to submit patches to fix the related bugs.
Some functions have their size set to zero in the input binary's symbol
table, such as those compiled from assembly. When figuring out function
sizes, we may create a label symbol if it doesn't point to any constant
island. However, before the function size is known, a marker symbol
cannot be correctly associated with a function, so all such checks
would fail and we could end up adding a code label pointing to a
constant island as a secondary entry point and later mistakenly
marking the function as not simple.
Querying the global marker symbol array has a big throughput overhead.
Instead, we can run an extra check when post-processing entry points
to identify label symbols that actually point to constant islands.
To handle relative vtables, which are enabled with the clang option
`-fexperimental-relative-c++-abi-vtables`, we look for PC-relative
relocations whose fixup locations fall in vtable address ranges.
For such relocations, the actual target is the virtual function itself,
and the addend records the distance between the vtable slot for the
target virtual function and the first virtual function slot in the
vtable, matching the generated code that calls the virtual function. So
we can skip the "function + offset" handling logic and directly
save such relocations for later fixup once the new layout is known.
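A self-contained sketch of that shortcut; the types and helper here are
illustrative, not BOLT's actual relocation-processing code.
```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

struct RelativeVTableReloc {
  uint64_t FixupAddress;  // location inside the relative vtable
  std::string TargetFunc; // the virtual function itself
  int64_t Addend;         // distance from the first virtual function slot
};

struct AddressRange {
  uint64_t Begin, End;
  bool contains(uint64_t Addr) const { return Addr >= Begin && Addr < End; }
};

// If the fixup falls inside a vtable's range, skip the usual
// "function + offset" target resolution and save the relocation as-is for
// fixup once the new layout is known.
bool saveIfRelativeVTableReloc(const AddressRange &VTable,
                               uint64_t FixupAddress, std::string TargetFunc,
                               int64_t Addend,
                               std::vector<RelativeVTableReloc> &Saved) {
  if (!VTable.contains(FixupAddress))
    return false; // fall through to regular relocation handling
  Saved.push_back({FixupAddress, std::move(TargetFunc), Addend});
  return true;
}
```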
We used to report PLT traces as invalid (mismatching disassembled
function contents) because PLT functions are marked as pseudo and
ignored, and thus have no CFG. However, such traces do not mismatch
the function contents. Accept them without attaching the profile.
Test Plan: updated callcont-fallthru.s
... to clarify ownership, aligning with other parameters. Using
`std::unique_ptr` encourages users to manage the result of
`createMCInstPrinter` with a unique_ptr instead of a raw pointer,
reducing the risk of memory leaks.
* llvm-mc: fix a leak and update llvm/test/tools/llvm-mc/disassembler-options.test
* #121078 copied the llvm-mc code to CodeGenTargetMachineImpl and made
the same mistake. Fixed by 2b8cc651dc
Using unique_ptr requires #include MCInstPrinter.h in a few translation
units.
* Delete a createAsmStreamer overload I deprecated in 2024
* SystemZMCTargetDesc.cpp: rename to `createSystemZAsmStreamer` to fix
an overload conflict.
Pull Request: https://github.com/llvm/llvm-project/pull/135128
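A minimal sketch of the intended call-site pattern, assuming the usual MC
objects have already been created; real tools differ in how they obtain them.
```cpp
#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCInstPrinter.h"
#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/MC/TargetRegistry.h"
#include "llvm/TargetParser/Triple.h"
#include <memory>
using namespace llvm;

// Hold the printer in a unique_ptr so ownership (and any later transfer to a
// streamer) is explicit instead of relying on a raw pointer.
std::unique_ptr<MCInstPrinter> makeInstPrinter(const Target &T,
                                               const Triple &TT,
                                               const MCAsmInfo &MAI,
                                               const MCInstrInfo &MII,
                                               const MCRegisterInfo &MRI) {
  return std::unique_ptr<MCInstPrinter>(
      T.createMCInstPrinter(TT, MAI.getAssemblerDialect(), MAI, MII, MRI));
}
```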
Scanning functions without CFG information, as well as detecting
authentication oracles, requires introducing more classes related to
register state analysis. To make the future code easier to understand,
rename several classes beforehand.
To detect authentication oracles, one has to query the properties of
*output* operands of authentication instructions *after* the instruction
is executed - this requires adding another analysis that iterates over
the instructions in reverse order, and a corresponding state class.
As the main distinguishing feature of the existing `State` class is that
it stores the properties of an instruction's source register operands
before the instruction is executed, rename it to `SrcState`, and rename
`PacRetAnalysis` to `SrcSafetyAnalysis`.
Apply minor adjustments to the debug output along the way.
In addition to authenticated pointers, consider the contents of a
register safe if it was
* written by a PC-relative address computation
* updated by an arithmetic instruction whose input address is safe
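A hedged sketch of the extended rule; the enum and helper below are
illustrative, not the scanner's actual SrcSafetyAnalysis interface.
```cpp
// Kinds of writes that can define a register, as far as this rule is concerned.
enum class RegWriteKind {
  AuthenticatedPointer, // result of an authentication instruction
  PCRelativeAddress,    // e.g. an ADR/ADRP-style address computation
  ArithmeticOnInput,    // arithmetic deriving the value from an input register
  Other,
};

// The register is safe after an authenticated pointer or a PC-relative
// address computation, and after arithmetic only if its input was already safe.
bool isSafeAfterWrite(RegWriteKind Kind, bool InputWasSafe) {
  switch (Kind) {
  case RegWriteKind::AuthenticatedPointer:
  case RegWriteKind::PCRelativeAddress:
    return true;
  case RegWriteKind::ArithmeticOnInput:
    return InputWasSafe;
  case RegWriteKind::Other:
    return false;
  }
  return false;
}
```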
TLS relocations may not have a valid BOLT symbol associated with them.
While symbolizing the operand, we were checking for the symbol value,
and since there was no symbol, the check resulted in a crash.
Handle the TLS case while performing operand symbolization on AArch64.
When a pending relocation is created, it is also marked as optional or
not. It can be optional when the relocation is added as part of an
optimization (i.e., `scanExternalRefs`).
When BOLT tries to `flushPendingRelocations`, it safely skips any
optional relocations that cannot be encoded due to being out of
range. A prerequisite for that is the use of the `-force-patch`
flag. Otherwise, BOLT bails out with a relevant message.
Background:
BOLT, as part of scanExternalRefs, identifies external references from
calls and creates some pending relocations for them. When flushed, these
update references to point to the optimized functions.
This optimization can be disabled using `--no-scan`.
BOLT can assert if any of these pending relocations cannot be encoded.
This patch does not disable this optimization but instead applies it
selectively, given that a pending relocation is optional and `-force-patch`
was enabled.
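A self-contained sketch of the flush-time behavior described above; the types
and helpers are illustrative, not BOLT's actual flushPendingRelocations().
```cpp
#include <cstdint>
#include <vector>

struct PendingRelocSketch {
  int64_t Displacement = 0; // value that must fit the instruction's encoding
  bool Optional = false;    // added by an optimization such as scanExternalRefs
};

// Stand-in for the range check done by the real encoder (e.g. a 26-bit
// branch immediate scaled by 4 gives a +/-128 MiB range on AArch64).
bool fitsEncoding(const PendingRelocSketch &R) {
  return R.Displacement >= -(128 << 20) && R.Displacement < (128 << 20);
}

// Flush the pending relocations: optional ones that cannot be encoded are
// skipped when -force-patch is in effect; otherwise we bail out.
bool flushPendingSketch(const std::vector<PendingRelocSketch> &Relocs,
                        bool ForcePatch) {
  for (const PendingRelocSketch &R : Relocs) {
    if (fitsEncoding(R))
      continue; // apply the relocation (elided here)
    if (R.Optional && ForcePatch)
      continue; // safe to skip: only an optimization opportunity is lost
    return false; // bail out with a relevant message
  }
  return true;
}
```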
This patch fixes the following two issues with the createCmpJE for
AArch64:
1. Avoids overwriting the value of the input register RegNo by using XZR
as the destination register:
subs xzr, RegNo, #Imm
which is equivalent to a simple
cmp RegNo, #Imm
2. The condition operand of the Bcc instruction must be EQ instead of
#Imm.
This patch also adds a new createCmpJNE function and unit tests for
both createCmpJE and createCmpJNE on X86 and AArch64.
In 96e5ee2, I inadvertently broke the way non-trivial symbol references
got updated from non-optimized code. The breakage was a consequence of
`getTargetSymbol(MCExpr *)` not returning a symbol when the parameter
was a binary expression. Fix `getTargetSymbol()` to cover such cases.
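A hedged sketch of the idea using LLVM's MC expression classes; BOLT's actual
`getTargetSymbol()` lives in MCPlusBuilder and differs in detail.
```cpp
#include "llvm/MC/MCExpr.h"
#include "llvm/Support/Casting.h"
using namespace llvm;

// Return the symbol referenced by an expression, looking through binary
// expressions such as "Sym + Const" instead of giving up on them.
static const MCSymbol *getTargetSymbolSketch(const MCExpr *Expr) {
  if (const auto *SymExpr = dyn_cast<MCSymbolRefExpr>(Expr))
    return &SymExpr->getSymbol();
  if (const auto *BinExpr = dyn_cast<MCBinaryExpr>(Expr))
    return getTargetSymbolSketch(BinExpr->getLHS());
  return nullptr;
}
```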
In lite mode, we only emit code for a subset of functions while
preserving the original code in .bolt.org.text. This requires updating
code references in non-emitted functions to ensure that:
* Non-optimized versions of the optimized code never execute.
* Function pointer comparison semantics is preserved.
On x86-64, we can update code references in-place using "pending
relocations" added in scanExternalRefs(). However, on AArch64, this is
not always possible due to address range limitations and linker address
"relaxation".
There are two types of code-to-code references: control transfer (e.g.,
calls and branches) and function pointer materialization.
AArch64-specific control transfer instructions are covered by #116964.
For function pointer materialization, simply changing the immediate
field of an instruction is not always sufficient. In some cases, we need
to modify a pair of instructions, such as undoing linker relaxation and
converting NOP+ADR into an ADRP+ADD sequence.
To achieve this, we use the instruction patch mechanism instead of
pending relocations. Instruction patches are emitted via the regular MC
layer, just like regular functions. However, they have a fixed address
and do not have an associated symbol table entry. This allows us to make
more complex changes to the code, ensuring that function pointers are
correctly updated. Such a mechanism should also be portable to RISC-V
and other architectures.
To summarize, for AArch64, we extend the scanExternalRefs() process to
undo linker relaxation and use instruction patches to partially
overwrite unoptimized code.
In preparation for implementing support for detection of non-protected
call instructions, refine the definition of state which is computed for
each register by data-flow analysis.
Explicitly marking the registers which are known to be trusted at
function entry is crucial for finding non-protected calls. In addition,
it fixes less-common false negatives for pac-ret, such as `ret x1` in
the `f_nonx30_ret_non_auted` test case.
In preparation for adding more gadget kinds to detect, streamline
issue reporting.
Rename classes representing issue reports. In particular, rename
`Annotation` base class to `Report`, as it has nothing to do with
"annotations" in `MCPlus` terms anymore. Remove references to "return
instructions" from variable names and report messages, use generic
terms instead. Rename NonPacProtectedRetAnalysis to PAuthGadgetScanner.
Remove `GeneralDiagnostic` as a separate class, make `GenericReport`
(former `GenDiag`) store `std::string Text` directly. Remove unused
`operator=` and `operator==` methods, as `Report`s are created on the
heap and referenced via `shared_ptr`s.
Introduce the `GadgetKind` class - currently, it only wraps a `const char *`
description to display to the user. This description is intended to be
a per-gadget-kind constant (or a few hard-coded constants), so there is no
need to store it in a `std::string` field in each report instance. To handle
both free-form `GenericReport`s and statically-allocated messages
without unnecessary overhead, move printing of the report header to the
base class (and take the message argument as a `StringRef`).
Follow the X86 and Mips renaming.
> "Relocation modifier" suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation.
> "Relocation specifier" is clear, aligns with Arm and IBM AIX's documentation, and fits the assembler's role seamlessly.
In addition, rename *MCExpr::getKind, which confusingly shadows the base class getKind.
This patch converts `Relocations` from a struct to a class, and
introduces the `Optional` field. Patch #116964 will use it.
Some optimizations, like `scanExternalRefs`, create relocations that
patch the old code. Under certain circumstances these may be skipped
without correctness implications.
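A hedged sketch of the shape of the change; apart from `Optional`, the
members shown are illustrative rather than a copy of BOLT's Relocation
definition.
```cpp
#include <cstdint>

namespace llvm { class MCSymbol; }

class RelocationSketch {
public:
  uint64_t Offset = 0;
  llvm::MCSymbol *Symbol = nullptr;
  uint32_t Type = 0;
  uint64_t Addend = 0;

  // Optional relocations patch old code on behalf of an optimization (e.g.
  // scanExternalRefs) and may be skipped if they cannot be encoded.
  bool isOptional() const { return Optional; }
  void setOptional() { Optional = true; }

private:
  bool Optional = false;
};
```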
Factor out the code for mapping from physical registers to consecutive
array indexes.
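A self-contained sketch of the factored-out mapping; the class name and
interface are illustrative.
```cpp
#include <cstdint>
#include <vector>

// Map sparse physical register numbers to consecutive array indexes so that
// per-register analysis state can live in a dense vector.
class TrackedRegistersSketch {
  std::vector<int> IndexOf;    // physical register -> dense index, -1 if untracked
  std::vector<uint16_t> RegAt; // dense index -> physical register
public:
  explicit TrackedRegistersSketch(unsigned NumPhysRegs)
      : IndexOf(NumPhysRegs, -1) {}

  unsigned getOrCreateIndex(uint16_t Reg) {
    if (IndexOf[Reg] < 0) {
      IndexOf[Reg] = static_cast<int>(RegAt.size());
      RegAt.push_back(Reg);
    }
    return static_cast<unsigned>(IndexOf[Reg]);
  }

  unsigned numTracked() const { return static_cast<unsigned>(RegAt.size()); }
  uint16_t regAtIndex(unsigned Idx) const { return RegAt[Idx]; }
};
```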
Introduce helper functions to print instructions and registers to
prevent mixing of analysis logic and implementation details of debug
output.
Remove the debug printing from `Gadget::generateReport`, as it doesn't
seem to add important information beyond what is already printed in the
report itself.
Create entry points for addresses referenced by dynamic relocations and
allow getNewFunctionOrDataAddress to map addresses inside functions. By
adding addresses referenced by dynamic relocations as entry points, this
patch fixes an issue where BOLT fails on code using computed gotos.
This also fixes a mapping issue with the bugfix from this PR:
https://github.com/llvm/llvm-project/pull/117766.