clang-p2996

Author	SHA1	Message	Date
Amir Ayupov	fd38366e45	[BOLT][NFC] Clean includes, add license headers (#87200 )	2024-03-31 19:29:45 -07:00
Maksim Panchenko	7de82ca369	[BOLT] Don't terminate on trap instruction for Linux kernel (#87021 ) Under normal circumstances, we terminate basic blocks on a trap instruction. However, Linux kernel may resume execution after hitting a trap (ud2 on x86). Thus, we introduce "--terminal-trap" option that will specify if the trap instruction should terminate the control flow. The option is on by default except for the Linux kernel mode when it's off.	2024-03-29 16:41:15 -07:00
Maksim Panchenko	6b1cf00400	[BOLT] Add support for Linux kernel static keys jump table (#86090 ) Runtime code modification used by static keys is the most ubiquitous self-modifying feature of the Linux kernel. The idea is to eliminate the condition check and associated conditional jump on a hot path if that condition (based on a boolean value of a static key) does not change often. Whenever the condition changes, the kernel runtime modifies all code paths associated with that key, flipping the code between nop and (unconditional) jump.	2024-03-21 14:05:21 -07:00
Maksim Panchenko	49b8a99a0f	[BOLT] Add createCondBranch() and createLongUncondBranch() (#85315 ) Add MCPlusBuilder interface for creating two new branch types.	2024-03-14 15:28:22 -07:00
Maksim Panchenko	bba790db47	[BOLT] Refactor instruction creation interface. NFCI (#85292 ) Refactor MCPlusBuilder's create{Instruction}() functions that used to return bool. We almost never check the return value as we rely on llvm_unreachable() to detect unimplemented functionality. There were a couple of cases that checked the return value, but they would hit the unreachable condition first (at least in debug builds) before the return value gets checked.	2024-03-14 13:17:17 -07:00
Maksim Panchenko	59ab86bb2f	[BOLT] Clear operands when creating new instructions. NFCI (#85191 ) Reset operand list whenever we create a new instruction via a parameter passed by reference. Most functions were already doing this, but there are several places missing the reset. Potentially, if we do not clear the list it could lead to invalid instruction operands. But the existing code is unaffected.	2024-03-14 11:00:08 -07:00
sinan	71c2a132b2	[BOLT] support AArch64 JUMP26 createRelocation (#83531 ) Add R_AARCH64_JUMP26 implementation for createRelocation, which could significantly reduce the number of failed scan-refs cases if we perform bolt on a selective range of functions.	2024-03-04 17:11:47 +08:00
Elvina Yakubova	b98e6a5ced	[BOLT][AArch64] Skip BBs only instead of functions (#81989 ) After commit `846eb76761` we noticed that the size of the fdata file decreased a lot. A better and more precise approach is to skip only the basic blocks with exclusive instructions instead of the whole function.	2024-02-27 19:19:47 +03:00
Amir Ayupov	52cf07116b	[BOLT][NFC] Log through JournalingStreams (#81524 ) Make core BOLT functionality more friendly to being used as a library instead of in our standalone driver llvm-bolt. To accomplish this, we augment BinaryContext with journaling streams that are to be used by most BOLT code whenever something needs to be logged to the screen. Users of the library can decide if logs should be printed to a file, no file or to the screen, as before. To illustrate this, this patch adds a new option `--log-file` that allows the user to redirect BOLT logging to a file on disk or completely hide it by using `--log-file=/dev/null`. Future BOLT code should now use `BinaryContext::outs()` for printing important messages instead of `llvm::outs()`. A new test log.test enforces this by verifying that no strings are printed to screen once the `--log-file` option is used. In previous patches we also added a new BOLTError class to report common and fatal errors, so code shouldn't call exit(1) now. To easily handle problems as before (by quitting with exit(1)), callers can now use `BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code needs to deal with BOLT errors. To test this, we have fatal.s that checks we are correctly quitting and printing a fatal error to the screen. Because this is a significant change by itself, not all code was yet ported. Code from Profiler libs (DataAggregator and friends) still prints errors directly to screen. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC	2024-02-12 14:53:53 -08:00
Amir Ayupov	13d60ce2f2	[BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523 ) As part of the effort to refactor old error handling code that would directly call exit(1), in this patch continue the migration on libCore, libRewrite and libPasses to use the new BOLTError class whenever a failure occurs. Test Plan: NFC Co-authored-by: Rafael Auler <rafaelauler@fb.com>	2024-02-12 14:51:15 -08:00
Maksim Panchenko	082fe9a5dd	[BOLT] Remove duplicate expression (#80380 ) Reported by the cppcheck static analyzer in #80111. Fixes #80111.	2024-02-01 19:05:11 -08:00
eleviant	f20af7372f	[bolt] Support arm64 FP register spills (#73021 ) At the moment llvm-bolt fails when analyzing jump tables on aarch64 in case FP register spill/reload is used.	2023-12-05 20:32:58 +01:00
Maksim Panchenko	0df154671b	[BOLT] Use Label annotation instead of EHLabel pseudo. NFCI. (#70179 ) When we need to attach EH label to an instruction, we can now use Label annotation instead of EHLabel pseudo instruction.	2023-11-06 14:43:14 -08:00
Vladislav Khmelevsky	888742a121	[BOLT][AArch64] Handle .plt.got section (#71216 ) It seems that currently this section is only created by the mold linker if 2 conditions are met: 1. The PLT function was called directly. 2. The indirect access to PLT function was found (e.g. through ADRP relocation). Although mold created symbol for every plt entry I've removed them in yaml file to check that .plt.got was truly disassembled by bolt.	2023-11-04 00:47:24 +04:00
Job Noorman	b6b492880f	[BOLT][RISCV] Set minimum function alignment to 2 for RVC (#69837 ) In #67707, the minimum function alignment on RISC-V was set to 4. When RVC (compressed instructions) is enabled, the minimum alignment can be reduced to 2. This patch implements this by delegating the choice of minimum alignment to a new `MCPlusBuilder::getMinFunctionAlignment` function. This way, the target-dependent code in `BinaryFunction` is minimized.	2023-10-23 08:09:11 +00:00
Job Noorman	3ab536fb99	[BOLT][RISCV] Implement getCalleeSavedRegs (#69161 ) The main reason for implementing this now is to ensure the `assume=abi.test` test passes on RISC-V. Since it uses `--indirect-call-promotion=all`, it requires some support for register analysis on the target. Further testing and implementation of register/frame analysis on RISC-V will come later.	2023-10-16 08:52:56 +00:00
Job Noorman	d8de38b401	[BOLT][RISCV] Handle EH_LABEL operands (#68998 ) Fixes the `runtime/exceptions-no-pie.cpp` test on RISC-V.	2023-10-16 08:29:28 +00:00
Job Noorman	5c0931727e	[BOLT][RISCV] Implement MCPlusBuilder::equals (#68989 ) This enables ICF for RISC-V. No tests are added by this commit as `bolt-icf.test` covers this case (only on a RISC-V host though).	2023-10-16 07:13:07 +00:00
Job Noorman	8fb83bf5f1	[BOLT][NFC] Add MCSubtargetInfo to MCPlusBuilder (#68223 ) On RISC-V, it's helpful to have access to `MCSubtargetInfo` while generating instructions in `MCPlusBuilder`. For example, a return instruction might be generated differently based on if the target supports compressed instructions (`c.jr ra`) or not (`jalr ra`).	2023-10-06 06:39:58 +00:00
Job Noorman	7fa33773e3	[BOLT][RISCV] Handle long tail calls (#67098 ) Long tail calls use the following instruction sequence on RISC-V: ``` 1: auipc xi, %pcrel_hi(sym) jalr zero, %pcrel_lo(1b)(xi) ``` Since the second instruction in isolation looks like an indirect branch, this confused BOLT and most functions containing a long tail call got marked with "unknown control flow" and didn't get optimized as a consequence. This patch fixes this by detecting long tail call sequence in `analyzeIndirectBranch`. `FixRISCVCallsPass` also had to be updated to expand long tail calls to `PseudoTAIL` instead of `PseudoCALL`. Besides this, this patch also fixes a minor issue with compressed tail calls (`c.jr`) not being detected. Note that I had to change `BinaryFunction::postProcessIndirectBranches` slightly: the documentation of `MCPlusBuilder::analyzeIndirectBranch` mentions that the [`Begin`, `End`) range contains the instructions immediately preceding `Instruction`. However, in `postProcessIndirectBranches`, all the instructions in the BB were passed in the range. This made it difficult to find the preceding instruction so I made sure only the preceding instructions are passed.	2023-10-05 08:55:30 +00:00
Job Noorman	c7d6d62252	[BOLT][RISCV] Implement TLS le/ie relocations (#67112 ) Handle the following relocations related to TLS local-exec and initial-exec: - R_RISCV_TLS_GOT_HI20 - R_RISCV_TPREL_HI20 - R_RISCV_TPREL_ADD - R_RISCV_TPREL_LO12_I - R_RISCV_TPREL_LO12_S In addition, GNU ld has a quirk where after TLS le relaxation, two unofficial relocation types may be emitted: - R_RISCV_TPREL_I - R_RISCV_TPREL_S Since they are unofficial (defined in the reserved range of relocation types), LLVM does not define them. Hence, I've defined them locally in BOLT in a private namespace.	2023-10-05 08:53:51 +00:00
Rafael Auler	853e126ce3	[BOLT] Support input binaries that use R_X86_GOTPC64 In large code model, the address of GOT is calculated by the static linker via R_X86_GOTPC64 reloc applied against a MOVABSQ instruction. In the final binary, it can be disassembled as a regular immediate, but because such immediate is the result of PC-relative pointer arithmetic, we need to parse this relocation and update this calculation whenever we move code, otherwise we break the code trying to read GOT. A test case showing how GOT is accessed was provided. Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D158911	2023-10-02 23:12:44 -07:00
Job Noorman	9555736ac6	[BOLT][RISCV] Implement LO/HI relocations (#67444 ) Implement the following relocations used by the medlow code model and non-PIE binaries: - R_RISCV_HI20 - R_RISCV_LO12_I - R_RISCV_LO12_S	2023-09-26 15:54:11 +00:00
Kepontry	2d902d0f88	[BOLT] Implement '--assume-abi' option for AArch64 This patch implements the `getCalleeSavedRegs` function for AArch64, addressing the issue where the "not implemented" error occurs when both the `--assume-abi` option and options related to the RegAnalysis Pass (e.g., `--indirect-call-promotion=all`) are enabled.	2023-09-25 21:55:29 +08:00
Vladislav Khmelevsky	846eb76761	[BOLT][AArch64] Fix instrumentation deadloop According to ARMv8-a architecture reference manual B2.10.5, software must avoid having any explicit memory accesses between exclusive load and associated store instruction. Otherwise exclusive monitor might clear the exclusivity without application-related cause which may result in the deadloop. Disable instrumentation for such functions, since between exclusive load and store there might be branches and we would insert instrumentation snippet which contains loads and stores. The better solution would be to analyze with BFS finding the exact BBs between load and store and not instrumenting them. Or even better to recognize such sequences and replace them with a more complex one, e.g. loading value non exclusively, and for the branch where exclusive store is made make exclusive load and store sequentially, but for now just disable instrumentation for such functions completely. Differential Revision: https://reviews.llvm.org/D159520	2023-09-22 00:58:01 +04:00
Job Noorman	c5ba61978c	[BOLT][RISCV] Add support for linker relaxation Calls on RISC-V are typically compiled to `auipc`/`jalr` pairs to allow a maximum target range (32-bit pc-relative). In order to optimize calls to near targets, linker relaxation may replace those pairs with, for example, single `jal` instructions. To allow BOLT to freely reassign function addresses in relaxed binaries, this patch proposes the following approach: - Expand all relaxed calls back to `auipc`/`jalr`; - Rely on JITLink to relax those back to shorter forms where possible. This is implemented by detecting all possible call instructions and replacing them with `PseudoCALL` (or `PseudoTAIL`) instructions. The RISC-V backend then expands those and adds the necessary relocations for relaxation. Since BOLT generally ignores pseudo instruction, this patch makes `MCPlusBuilder::isPseudo` virtual so that `RISCVMCPlusBuilder` can override it to exclude `PseudoCALL` and `PseudoTAIL`. To ensure JITLink knows about the correct section addresses while relaxing, reassignment of addresses has been moved to a post-allocation pass. Note that this is probably the time it had to be done in the first place since in `notifyResolved` (where it was done before), all symbols are supposed to be resolved already. Depends on D159082 Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D159089	2023-09-15 11:57:28 +02:00
Job Noorman	1b78742e77	[BOLT][RISCV] Implement R_RISCV_PCREL_LO12_S (#65204 ) Relocation used for store instructions.	2023-09-09 08:22:37 +00:00
Job Noorman	eafe4ee2e8	[BOLT] Rename isLoad/isStore to mayLoad/mayStore As discussed in D159266, for some instructions it's impossible to know statically if they will load/store (e.g., predicated instructions). Therefore, mayLoad/mayStore are more appropriate names.	2023-09-01 09:36:05 +02:00
Elvina Yakubova	70405a0bf7	[BOLT][Instrumentation] Add support for MacOS counters This commit adds support for generation of getter counters for AArch64 MacOS. Continuation of work D151899 Reviewed By: rafauleir, yota9 Differential Revision: https://reviews.llvm.org/D151901	2023-08-24 19:34:57 +03:00
Elvina Yakubova	6e4c230525	[BOLT][Instrumentation] Initial instrumentation support for AArch64 This commit adds code generation for AArch64 instrumentation, including direct and indirect calls support. Reviewed By: rafauler, yota9 Differential Revision: https://reviews.llvm.org/D151899	2023-08-24 19:34:57 +03:00
Denis Revunov	28fd2ca142	[BOLT] Fix trap value for non-X86 The trap value used by BOLT was assumed to be single-byte instruction. It made some functions unaligned on AArch64(e.g exceptions-instrumentation test) and caused emission failures. Fix that by changing fill value to StringRef. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D158191	2023-08-24 01:29:41 +03:00
zhoujiapeng	62020a3a7e	[BOLT] Implement createRelocation for AArch64 The implementation is based on the X86 version, with the same code of symbol and addend extraction. The differences include the support for RelType `R_AARCH64_CALL26` and the deletion of 8-bit relocation. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D156018	2023-08-23 00:53:32 +08:00
zhoujiapeng	9fee2ac044	[BOLT][NFC] Split createRelocation in X86 and share the second part This commit splits the createRelocation function for the X86 architecture into two parts, retaining the first half and moving the second half to a new function called extractFixupExpr. The purpose of this change is to make extractFixupExpr a shared function between AArch64 and X86 architectures, increasing code reusability and maintainability. Child revision: https://reviews.llvm.org/D156018 Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D157217	2023-08-23 00:29:25 +08:00
Job Noorman	b6556dc9fe	[BOLT][RISCV] Fix implementation of getTargetSymbol - Correctly handle OpNum == 0 (auto select operand) - Implement MCExpr overload Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D153343	2023-06-21 10:21:00 +02:00
Job Noorman	41b8aed499	[BOLT][RISCV] Implement branch reversal Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D153344	2023-06-21 10:21:00 +02:00
Job Noorman	5e67ae151e	[BOLT][RISCV] Implement return/unconditional branch creation Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D153342	2023-06-21 10:21:00 +02:00
Job Noorman	f873029386	[BOLT] Add minimal RISC-V 64-bit support Just enough features are implemented to process a simple "hello world" executable and produce something that still runs (including libc calls). This was mainly a matter of implementing support for various relocations. Currently, the following are handled: - R_RISCV_JAL - R_RISCV_CALL - R_RISCV_CALL_PLT - R_RISCV_BRANCH - R_RISCV_RVC_BRANCH - R_RISCV_RVC_JUMP - R_RISCV_GOT_HI20 - R_RISCV_PCREL_HI20 - R_RISCV_PCREL_LO12_I - R_RISCV_RELAX - R_RISCV_NONE Executables linked with linker relaxation will probably fail to be processed. BOLT relocates .text to a high address while leaving .plt at its original (low) address. This causes PC-relative PLT calls that were relaxed to a JAL to not fit their offset in an I-immediate anymore. This is something that will be addressed in a later patch. Changes to the BOLT core are relatively minor. Two things were tricky to implement and needed slightly larger changes. I'll explain those below. The R_RISCV_CALL(_PLT) relocation is put on the first instruction of an AUIPC/JALR pair, the second does not get any relocation (unlike other PCREL pairs). This causes issues with the combinations of the way BOLT processes binaries and the RISC-V MC-layer handles relocations: - BOLT reassembles instructions one by one and since the JALR doesn't have a relocation, it simply gets copied without modification; - Even though the MC-layer handles R_RISCV_CALL properly (adjusts both the AUIPC and the JALR), it assumes the immediates of both instructions are 0 (to be able to or-in a new value). This will most likely not be the case for the JALR that got copied over. To handle this difficulty without resorting to RISC-V-specific hacks in the BOLT core, a new binary pass was added that searches for AUIPC/JALR pairs and zeroes-out the immediate of the JALR. A second difficulty was supporting ABS symbols. As far as I can tell, ABS symbols were not handled at all, causing __global_pointer$ to break. RewriteInstance::analyzeRelocation was updated to handle these generically. Tests are provided for all supported relocations. Note that in order to test the correct handling of PLT entries, an ELF file produced by GCC had to be used. While I tried to strip the YAML representation, it's still quite large. Any suggestions on how to improve this would be appreciated. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D145687	2023-06-16 12:19:36 +02:00
Maksim Panchenko	5c4d306a10	[BOLT][NFC] Change signature of MCPlusBuilder::isUnsupportedBranch() Make MCPlusBuilder::isUnsupportedBranch() take MCInst, not opcode. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D152765	2023-06-13 12:20:36 -07:00
Maksim Panchenko	43f56a2f27	[BOLT] Fix handling of code references from unmodified code In lite mode (default for X86), BOLT optimizes and relocates functions with profile. The rest of the code is preserved, but if it references relocated code such references have to be updated. The update is handled by scanExternalRefs() function. Note that we cannot solely rely on relocations written by the linker, as not all code references are exposed to the linker. Additionally, the linker can modify certain instructions and relocations will no longer match the code. With this change, start using symbolic disassembler for scanning code for references in scanExternalRefs(). Unlike the previous approach, the symbolizer properly detects and creates references for instructions with multiple/ambiguous symbolic operands and handles cases where a relocation doesn't match any operand. See test cases for examples. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D152631	2023-06-12 10:46:51 -07:00
Shengchen Kan	3f1e9468f6	[X86][MC][bolt] Share code between encoding optimization and assembler relaxation, NFCI PUSH[16\|32\|64]i[8\|32] are not arithmetic instructions, so I renamed the functions. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D151028	2023-05-21 09:31:50 +08:00
Shengchen Kan	89ca4eb002	[X86][NFC] Correct the instruction names for PUSH16i, PUSH32i Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D151012	2023-05-20 17:33:42 +08:00
Amir Ayupov	b6f07d3ae8	[BOLT][NFC] Add MCPlusBuilder defOperands/useOperands helpers Make intent more explicit with the use of new helper methods. Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D150810	2023-05-17 21:52:33 -07:00
spupyrev	3e3a926be8	[BOLT][NFC] Add hash computation for basic blocks Extending yaml profile format with block hashes, which are used for stale profile matching. To avoid duplication of the code, created a new class with a collection of utilities for computing hashes. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D144306	2023-05-02 14:03:47 -07:00
Nathan Sidwell	f84ac48f1e	[BOLT] Add BOLT_TARGETS_TO_BUILD Adds BOLT_TARGETS_TO_BUILD, which defaults to the intersection of X86;AArch64 and LLVM_TARGETS_TO_BUILD, but allows configuration to alter that -- for instance omitting one of those two targets even if llvm supports both. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D148847	2023-04-21 13:07:04 -04:00
Job Noorman	df3f1e2f31	[BOLT][NFC] Fix UB due to left shift of negative value The following test fails when enabling UBSan due to a left shift of a negative value: > runtime error: left shift of negative value -2 BOLT :: AArch64/ext-island-ref.s This patch fixes this by using a multiplication instead of a shift. Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D148218	2023-04-13 14:29:19 +02:00
Amir Ayupov	edda85771a	[BOLT][NFC] Move addRelocation{X86,AArch64} into MCPlusBuilder The two methods don't belong in BinaryFunction methods. Move the dispatch tables into target-specific MCPlusBuilder methods. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D131813	2023-03-14 17:34:25 -07:00
Amir Ayupov	4e99891e70	[BOLT][NFC] Provide default impl for MIB methods that are only overridden on X86 Simplifies D145687 Reviewed By: #bolt, rafauler Differential Revision: https://reviews.llvm.org/D145972	2023-03-14 17:19:24 -07:00
Amir Ayupov	223ec28da4	[BOLT][NFC] Return instruction list from createInstrIncMemory Leverage move semantics for `std::vector`. This also makes it consistent with `createInstrumentationSnippet`. Reviewed By: Elvina Differential Revision: https://reviews.llvm.org/D145465	2023-03-13 12:56:39 -07:00
Maksim Panchenko	fb28196a64	[BOLT] Fix intermittent crash with instrumentation When createInstrumentedIndirectCall() was invoked for tail calls, we attached annotation instruction twice to the new call instruction. First in createDirectCall(), and then again while copying over the metadata operands. As a result, the annotations were not properly stripped for such calls before the call to freeAnnotations() in LowerAnnotations pass. That led to use-after-free while restoring the offsets with setOffset() call. Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D144806	2023-02-27 14:11:10 -08:00
Shengchen Kan	471c0e000a	[BOLT][X86][NFC] Simplify the code of X86MCPlusBuilder::getAliasSized Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D144551	2023-02-23 10:41:28 +08:00

113 Commits