Before we always used DW_RLE_startx_length. This is not very efficient and leads
to bigger .debug_addr section. Changed it to use
DW_RLE_base_addressx/DW_RLE_offset_pair.
clang-16 build in debug mode
llvm-bolt ran on it with --update-debug-sections
| section | before | after | diff | % decrease |
| .debug_rnglists | 32732292 | 31986051 | -746241 | 2.3% |
| .debug_addr | 14415808 | 14184128 | -231680 | 1.6% |
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D140439
Use llvm::reverse instead of `for (auto I = rbegin(), E = rend(); I != E; ++I)`
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D140516
Linker might relax adrp + ldr got address loading to adrp + add for
local non-preemptible symbols (e.g. hidden/protected symbols in
executable). As usually linker doesn't change relocations properly after
relaxation, so we have to handle such cases by ourselves. To do that
during relocations reading we change LD64 reloc to ADD if instruction
mismatch found and introduce FixRelaxationPass that searches for ADRP+ADD
pairs and after performing some checks we're replacing ADRP target symbol
to already fixed ADDs one.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D138097
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
component into a new LLVM Component called "TargetParser". This
potentially enables using tablegen to maintain this information, as
is shown in https://reviews.llvm.org/D137517. This cannot currently
be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
information in the TargetParser:
- `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
the current Host machine for info about it, primarily to support
getting the host triple, but also for `-mcpu=native` support in e.g.
Clang. This is fairly tightly intertwined with the information in
`X86TargetParser.h`, so keeping them in the same component makes
sense.
- `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
the target triple parser and representation. This is very intertwined
with the Arm target parser, because the arm architecture version
appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.
And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM
Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.
If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.
Differential Revision: https://reviews.llvm.org/D137838
With ThinLTO mutliple CUs can share the same .debug_str_offsets contribution. We
were creating a new one for each CU. This lead to a binary size increase.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D139214
Summary:
Changed contribution data structure to 64 bit. I added the 32bit and 64bit
accessors to make it explicit where we use 32bit and where we use 64bit. Also to
make sure sure we catch all the cases where this data structure is used.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This patch makes code less readable but it will clean itself after all functions are converted.
Differential Revision: https://reviews.llvm.org/D138665
This has the following advantages:
- std::shared_timed_mutex is macOS 10.12+ only. llvm::sys::RWMutex
automatically switches to a different implementation internally
when targeting older macOS versions.
- bolt only needs std::shared_mutex, not std::shared_timed_mutex.
llvm::sys::RWMutex automatically uses std::shared_mutex internally
where available.
std::shared_mutex and RWMutex have the same API, so no code changes
other than types and includes are needed.
Differential Revision: https://reviews.llvm.org/D138423
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.
In the std::optional world, we are not guranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.
Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:
using NoneType = std::nullopt_t;
inline constexpr std::nullopt_t None = std::nullopt;
to ease the migration from llvm::Optional to std::optional.
Differential Revision: https://reviews.llvm.org/D138376
matchLinkerVeneer() returns 3 if `Instruction` and the last
two instructions in `[Instructions.begin, Instructions.end())`
match the pattern
ADRP x16, imm
ADD x16, x16, imm
BR x16
BinaryContext.cpp used to use
--Count;
for (auto It = std::prev(Instructions.end()); Count != 0;
It = std::prev(It), --Count) {
...use It...
}
to walk these instructions. The first `--Count` skips the
instruction that's in `Instruction` instead of in `Instructions`.
The loop then walks over `Instructions`.
However, on the last iteration, this calls `std::prev()` on an
iterator that points at the container's begin(), which can blow
up.
Instead, use rbegin(), which sidesteps this issue.
Fixes test/AArch64/veneer-gold.s on a macOS host.
With this, check-bolt passes on macOS.
Differential Revision: https://reviews.llvm.org/D138313
If NewName twine has reference to the old name, then after
Section.Name = NewName.str(); this reference is invalidated,
so we cannot use NewName.str() anymore.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D137616
We should always move jump tables when requested. Previously,
we were not moving jump tables of non-simple functions in relocation
mode. That caused a bug detailed in the attached test case: in PIC
jump tables, we force jump tables to be moved, but if they are not
moved because the function is not simple, we could incorrectly update
original entries in .rodata, corrupting it under special circumstances
(see testcase).
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D137357
Always use non-symbolizing disassembler for instruction encoding
validation as symbols will be treated as undefined/zeros be the encoder
and causing byte sequence mismatches.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D136118
Simplify the logic of handling sections in BOLT. This change brings more
direct and predictable mapping of BinarySection instances to sections in
the input and output files.
* Only sections from the input binary will have a non-null SectionRef.
When a new section is created as a copy of the input section,
its SectionRef is reset to null.
* RewriteInstance::getOutputSectionName() is removed as the section name
in the output file is now defined by BinarySection::getOutputName().
* Querying BinaryContext for sections by name uses their original name.
E.g., getUniqueSectionByName(".rodata") will return the original
section even if the new .rodata section was created.
* Input file sections (with relocations applied) are emitted via MC with
".bolt.org" prefix. However, their name in the output binary is
unchanged unless a new section with the same name is created.
* New sections are emitted internally with ".bolt.new" prefix if there's
a name conflict with an input file section. Their original name is
preserved in the output file.
* Section header string table is properly populated with section names
that are actually used. Previously we used to include discarded
section names as well.
* Fix the problem when dynamic relocations were propagated to a new
section with a name that matched a section in the input binary.
E.g., the new .rodata with jump tables had dynamic relocations from
the original .rodata.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135494
This adds a round of checks to memory references, looking for
incorrect references to jump table objects. Fix them by replacing the
jump table reference with another object reference + offset.
This solves bugs related to regular data references in code
accidentally being bound to a jump table, and this reference being
updated to a new (incorrect) location because we moved this jump
table.
Fixes#55004
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D134098
Put code that creates references to symbol+addend behind MCPlusBuilder.
Will use this later in validate memory references pass.
Reviewed By: #bolt, maksfb, yota9
Differential Revision: https://reviews.llvm.org/D134097
While the order of new sections in the output binary was deterministic
in the past (i.e. there was no run-to-run variation), it wasn't always
rational as we used size to define the precedence of allocatable
sections within "code" or "data" groups (probably unintentionally).
Fix that by defining stricter section-ordering rules.
Other than the order of sections, this should be NFC.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135235
To properly set the "_end" symbol, we need to track the last allocatable
address. Simply emitting "_end" at the end of some section is not
sufficient since the order of section allocation is unknown during the
emission step.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135121
This does *not* link with libLLVM, but with static archives instead. Not
super-great, but at least the build works, which is probably better than
failing.
Related to #57551
Differential Revision: https://reviews.llvm.org/D134434
I'm planning to deprecate and eventually remove llvm::empty.
Note that no use of llvm::empty requires the ability of llvm::empty to
determine the emptiness from begin/end only.
When we derive EFMM from SectionMemoryManager, it brings into EFMM extra
functionality, such as the registry of exception handling sections,
page permission management, etc. Such functionality is of no use to
llvm-bolt and can even be detrimental (see
https://github.com/llvm/llvm-project/issues/56726).
Change the base class of ExecutableFileMemoryManager to MemoryManager,
avoid registering EH sections, and skip memory finalization.
Fixes#56726
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D133994
In non-relocation mode, every function is emitted in its own section. If
a function is empty, RuntimeDyld will still allocate 1-byte section
for the function and initialize it with zero. As a result, we will
overwrite the first byte of the original function contents with zero.
Such scenario can happen when the input function had only NOP
instructions which BOLT removes by default. Even though such functions
likely cause undefined behavior, it's better to preserve their contents.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D133978
For functions with references to internal offsets from data, verify externally
referenced blocks against the set of jump table targets. Mark the function
as non-simple if there are any unclaimed data to code references.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D132495
In non-pie binaries BOLT unconditionally converted type encoding
from indirect to absptr, which broke std exceptions since pointers
to their typeinfo were only assigned at runtime in .data section.
In this patch we preserve original encoding so that indirect
remains indirect and can be resolved at runtime, and absolute remains absolute.
Reviewed By: rafauler, maksfb
Differential Revision: https://reviews.llvm.org/D132484
For exception handling, LSDA call sites have to be emitted for each
fragment individually. With this patch, call sites and respective LSDA
symbols are generated and associated with each fragment of their
function, such that they can be used by the emitter.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132052
To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132051
This changes `FunctionFragment` from being used as a temporary proxy
object to access basic block ranges to a heap-allocated object that can
store fragment-specific information.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132050
A const-qualified reference to function layout allows accessing
non-const qualified basic blocks on a const-qualified function. This
patch adds or removes const-qualifiers where necessary to indicate where
basic blocks are used in a non-const manner.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132049
To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132051
This changes `FunctionFragment` from being used as a temporary proxy
object to access basic block ranges to a heap-allocated object that can
store fragment-specific information.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132050