Under normal circumstances, we terminate basic blocks on a trap
instruction. However, Linux kernel may resume execution after hitting a
trap (ud2 on x86). Thus, we introduce "--terminal-trap" option that will
specify if the trap instruction should terminate the control flow. The
option is on by default except for the Linux kernel mode when it's off.
Runtime code modification used by static keys is the most ubiquitous
self-modifying feature of the Linux kernel. The idea is to to eliminate
the condition check and associated conditional jump on a hot path if
that condition (based on a boolean value of a static key) does not
change often. Whenever they condition changes, the kernel runtime
modifies all code paths associated with that key flipping the code
between nop and (unconditional) jump.
Refactor MCPlusBuilder's create{Instruction}() functions that used to
return bool. We almost never check the return value as we rely on
llvm_unreachable() to detect unimplemented functionality. There were a
couple of cases that checked the return value, but they would hit the
unreachable condition first (at least in debug builds) before the return
value gets checked.
Reset operand list whenever we create a new instruction via a parameter
passed by reference. Most functions were already doing this, but there
are several places missing the reset. Potentially, if we don not clear
the list it could lead to invalid instruction operands. But the existing
code is unaffected.
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are print to screen once the `--log-file` option is
used.
In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.
Because this is a significant change by itself, not all code was
yet ported. Code from Profiler libs (DataAggregator and friends)
still print errors directly to screen.
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
Test Plan: NFC
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch continue the migration
on libCore, libRewrite and libPasses to use the new BOLTError
class whenever a failure occurs.
Test Plan: NFC
Co-authored-by: Rafael Auler <rafaelauler@fb.com>
On RISC-V, it's helpful to have access to `MCSubtargetInfo` while
generating instructions in `MCPlusBuilder`. For example, a return
instruction might be generated differently based on if the target
supports compressed instructions (`c.jr ra`) or not (`jalr ra`).
In large code model, the address of GOT is calculated by the
static linker via R_X86_GOTPC64 reloc applied against a MOVABSQ
instruction. In the final binary, it can be disassembled as a regular
immediate, but because such immediate is the result of PC-relative
pointer arithmetic, we need to parse this relocation and update this
calculation whenever we move code, otherwise we break the code trying
to read GOT.
A test case showing how GOT is accessed was provided.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D158911
As discussed in D159266, for some instructions it's impossible to know
statically if they will load/store (e.g., predicated instructions).
Therefore, mayLoad/mayStore are more appropriate names.
This commit adds code generation for AArch64 instrumentation,
including direct and indirect calls support.
Reviewed By: rafauler, yota9
Differential Revision: https://reviews.llvm.org/D151899
The trap value used by BOLT was assumed to be single-byte instruction.
It made some functions unaligned on AArch64(e.g exceptions-instrumentation test)
and caused emission failures. Fix that by changing fill value to StringRef.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D158191
This commit splits the createRelocation function for the X86 architecture
into two parts, retaining the first half and moving the second half to a
new function called extractFixupExpr. The purpose of this change is to make
extractFixupExpr a shared function between AArch64 and X86 architectures,
increasing code reusability and maintainability.
Child revision: https://reviews.llvm.org/D156018
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D157217
In lite mode (default for X86), BOLT optimizes and relocates functions
with profile. The rest of the code is preserved, but if it references
relocated code such references have to be updated. The update is handled
by scanExternalRefs() function. Note that we cannot solely rely on
relocations written by the linker, as not all code references are
exposed to the linker. Additionally, the linker can modify certain
instructions and relocations will no longer match the code.
With this change, start using symbolic disassembler for scanning code
for references in scanExternalRefs(). Unlike the previous approach, the
symbolizer properly detects and creates references for instructions with
multiple/ambiguous symbolic operands and handles cases where a
relocation doesn't match any operand. See test cases for examples.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D152631
PUSH[16|32|64]i[8|32] are not arithmetic instructions, so I renamed the
functions.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D151028
Extending yaml profile format with block hashes, which are used for stale
profile matching. To avoid duplication of the code, created a new class with a
collection of utilities for computing hashes.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D144306
The two methods don't belong in BinaryFunction methods.
Move the dispatch tables into target-specific MCPlusBuilder methods.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D131813
Leverage move semantics for `std::vector`.
This also makes it consistent with `createInstrumentationSnippet`.
Reviewed By: Elvina
Differential Revision: https://reviews.llvm.org/D145465
When createInstrumentedIndirectCall() was invoked for tail calls, we
attached annotation instruction twice to the new call instruction.
First in createDirectCall(), and then again while copying over the
metadata operands.
As a result, the annotations were not properly stripped for such calls
before the call to freeAnnotations() in LowerAnnotations pass. That lead
to use-after-free while restoring the offsets with setOffset() call.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D144806
Simplify `MCPlusBuilder::evaluateX86MemoryOperand`: make it return a struct
with memory operand analysis struct `X86MemOperand`.
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D144310
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.
In the std::optional world, we are not guranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.
Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:
using NoneType = std::nullopt_t;
inline constexpr std::nullopt_t None = std::nullopt;
to ease the migration from llvm::Optional to std::optional.
Differential Revision: https://reviews.llvm.org/D138376
This does *not* link with libLLVM, but with static archives instead. Not
super-great, but at least the build works, which is probably better than
failing.
Related to #57551
Differential Revision: https://reviews.llvm.org/D134434
A simple sed doing these substitutions:
- `${LLVM_BINARY_DIR}/(\$\{CMAKE_CFG_INTDIR}/)?lib(${LLVM_LIBDIR_SUFFIX})?\>` -> `${LLVM_LIBRARY_DIR}`
- `${LLVM_BINARY_DIR}/(\$\{CMAKE_CFG_INTDIR}/)?bin\>` -> `${LLVM_TOOLS_BINARY_DIR}`
where `\>` means "word boundary".
The only manual modifications were reverting changes in
- `compiler-rt/cmake/Modules/CompilerRTUtils.cmake
- `runtimes/CMakeLists.txt`
because these were "entry points" where we wanted to tread carefully not not introduce a "loop" which would end with an undefined variable being expanded to nothing.
This hopefully increases readability overall, and also decreases the usages of `LLVM_LIBDIR_SUFFIX`, preparing us for D130586.
Reviewed By: sebastian-ne
Differential Revision: https://reviews.llvm.org/D132316
Add -experimental-shrink-wrapping flag to control when we
want to move callee-saved registers even when addresses of the stack
frame are captured and used in pointer arithmetic, making it more
challenging to do alias analysis to prove that we do not access
optimized stack positions. This alias analysis is not yet implemented,
hence, it is experimental. In practice, though, no compiler would emit
code to do pointer arithmetic to access a saved callee-saved register
unless there is a memory bug or we are failing to identify a
callee-saved reg, so I'm not sure how useful it would be to formally
prove that.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126115
Refactor isStackAccess() to reflect updates by D126116. Now we only
handle simple stack accesses and delegate the rest of the cases to
getMemDataSize.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126112
Generate INSTRINFO_OPERAND_TYPE table in X86GenInstrInfo.inc.
This diff adds support for instructions that were previously reported as having
memory access size 0. It replaces the heuristic of looking at instruction
register width to determine memory access width by instead checking the memory
operand type using tablegen-provided tables.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D126116
The linker can convert instructions with GOTPCRELX relocations into a
form that uses an absolute addressing with an immediate. BOLT needs to
recognize such conversions and symbolize the immediates.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D126747
Summary:
While disassembling instructions, we need to replace certain immediate
operands with symbols. This symbolizing process relies on reading
relocations against instructions. However, some X86 instructions can
have multiple immediate operands and up to two relocations against
them. Thus, correctly matching a relocation to an operand is not
always possible without knowing the operand offset within the
instruction.
Luckily, LLVM provides an interface for passing the required info from
the disassembler via a virtual MCSymbolizer class. Creating a
target-specific version allows a precise matching of relocations to
operands.
This diff adds X86MCSymbolizer class that performs X86-specific
symbolizing (currently limited to non-branch instructions).
Reviewers: yota9, Amir, ayermolo, rafauler, zr33
Differential Revision: https://reviews.llvm.org/D120928
Fix a bug where shrink-wrapping would use wrong stack offsets
because the stack was being aligned with an AND instruction, hence,
making its true offsets only available during runtime (we can't
statically determine where are the stack elements and we must give up
on this case).
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126110
Address warnings in Release build without assertions.
Tip @tschuett for reporting the issue #55404.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D125475
We don't actually depend on entire X86/AArch64 components that pull in CodeGen,
SelectionDAG etc., just the Desc part with opcode and other definitions.
Note that it doesn't decouple BOLT from these components - we still pull in X86
and AArch64 from top-level llvm-bolt dependencies as we use assembler and
disassembler. It's difficult to reduce these as this requires non-trivial
changes to X86/AArch64 components themselves (e.g. moving out AsmPrinter).
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124206
Add !isTailCall in isUnconditionalBranch check in order to sync the x86
and aarch64 and fix the fixDoubleJumps pass on aarch64.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D122929