With ThinLTO, multiple CUs can share the same .debug_str_offsets contribution. We
were creating a new one for each CU. This led to a binary size increase.
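The idea of reusing one contribution can be sketched as a small cache keyed by the input contribution's base offset. This is only an illustration under stated assumptions: `StrOffsetsCache` and `getOrCreate` are hypothetical names and are not the code in this patch.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical sketch: map the base offset of a contribution in the input
// .debug_str_offsets to its offset in the output, so that CUs sharing a
// contribution get one copy instead of one copy per CU.
class StrOffsetsCache {
  std::unordered_map<std::uint64_t, std::uint64_t> OldToNew;
  std::uint64_t NextOffset = 0;

public:
  // Returns the output offset for the contribution starting at OldBase.
  // Output space is reserved only the first time OldBase is seen.
  std::uint64_t getOrCreate(std::uint64_t OldBase, std::uint64_t Size) {
    assert(Size > 0 && "contribution must not be empty");
    auto It = OldToNew.find(OldBase);
    if (It != OldToNew.end())
      return It->second;
    const std::uint64_t NewBase = NextOffset;
    NextOffset += Size;
    OldToNew.emplace(OldBase, NewBase);
    return NewBase;
  }

  // Total bytes reserved in the output section so far.
  std::uint64_t totalSize() const { return NextOffset; }
};
```

Two CUs that pass the same `OldBase` receive the same output offset, and `totalSize()` grows only once for them.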
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D139214
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This patch makes the code less readable, but that will be cleaned up once all functions are converted.
Differential Revision: https://reviews.llvm.org/D138665
This has the following advantages:
- std::shared_timed_mutex is macOS 10.12+ only. llvm::sys::RWMutex
automatically switches to a different implementation internally
when targeting older macOS versions.
- bolt only needs std::shared_mutex, not std::shared_timed_mutex.
llvm::sys::RWMutex automatically uses std::shared_mutex internally
where available.
std::shared_mutex and RWMutex have the same API, so no code changes
other than types and includes are needed.
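A minimal sketch of why only the type and the include change: the standard lock guards are written against the mutex's lock/unlock interface, so swapping the mutex type leaves the locking code untouched. The example uses std::shared_mutex so that it builds without LLVM headers; `MutexTy` and `NameRegistry` are hypothetical names, and in BOLT the alias would name llvm::sys::RWMutex instead.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical alias: the only line that would change when switching
// between mutex implementations with the same API.
using MutexTy = std::shared_mutex;

class NameRegistry {
  mutable MutexTy Mutex;
  std::map<std::string, int> Ids;

public:
  // Writers take the exclusive lock.
  void set(const std::string &Name, int Id) {
    assert(Id >= 0 && "negative ids are reserved for 'not found'");
    std::unique_lock<MutexTy> Lock(Mutex);
    Ids[Name] = Id;
  }

  // Readers take the shared lock and may run concurrently.
  // Returns -1 when the name is unknown.
  int get(const std::string &Name) const {
    std::shared_lock<MutexTy> Lock(Mutex);
    auto It = Ids.find(Name);
    return It == Ids.end() ? -1 : It->second;
  }
};
```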
Differential Revision: https://reviews.llvm.org/D138423
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.
In the std::optional world, we are not guaranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.
Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:
using NoneType = std::nullopt_t;
inline constexpr std::nullopt_t None = std::nullopt;
to ease the migration from llvm::Optional to std::optional.
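A small sketch of how the two declarations above would behave with std::optional. They are placed in a hypothetical `sketch` namespace so the example builds without LLVM headers; `findOffset` and `example` are illustrative helpers, not part of LLVM.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>

namespace sketch {
// The two declarations quoted in the commit message.
using NoneType = std::nullopt_t;
inline constexpr std::nullopt_t None = std::nullopt;
} // namespace sketch

// With the alias in place, NoneType is literally std::nullopt_t.
static_assert(std::is_same_v<sketch::NoneType, std::nullopt_t>);

// Hypothetical helper: code that returns None keeps compiling when the
// return type is std::optional.
inline std::optional<int> findOffset(bool Found) {
  if (!Found)
    return sketch::None; // equivalent to returning std::nullopt
  return 42;
}

// Hypothetical usage: assignment from None resets the optional.
inline void example() {
  std::optional<int> O = sketch::None;
  assert(!O.has_value());
  O = 5;
  assert(O.has_value());
  O = sketch::None;
  assert(!O.has_value());
}
```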
Differential Revision: https://reviews.llvm.org/D138376
This patch adds huge pages support (-hugify) for PIE/no-PIE
binaries. It also restores support for kernels < 5.10,
where the dynamic loader has a problem with the alignment of
page addresses.
Differential Revision: https://reviews.llvm.org/D129107
Some distributions install libraries under lib64. LLVM supports this
through LLVM_LIBDIR_SUFFIX; have bolt do the same.
Differential Revision: https://reviews.llvm.org/D137039
Always use the non-symbolizing disassembler for instruction encoding
validation, as symbols will be treated as undefined/zeros by the encoder,
causing byte sequence mismatches.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D136118
Simplify the logic of handling sections in BOLT. This change brings more
direct and predictable mapping of BinarySection instances to sections in
the input and output files.
* Only sections from the input binary will have a non-null SectionRef.
When a new section is created as a copy of the input section,
its SectionRef is reset to null.
* RewriteInstance::getOutputSectionName() is removed as the section name
in the output file is now defined by BinarySection::getOutputName().
* Querying BinaryContext for sections by name uses their original name.
E.g., getUniqueSectionByName(".rodata") will return the original
section even if the new .rodata section was created.
* Input file sections (with relocations applied) are emitted via MC with
".bolt.org" prefix. However, their name in the output binary is
unchanged unless a new section with the same name is created.
* New sections are emitted internally with ".bolt.new" prefix if there's
a name conflict with an input file section. Their original name is
preserved in the output file.
* Section header string table is properly populated with section names
that are actually used. Previously we used to include discarded
section names as well.
* Fix the problem when dynamic relocations were propagated to a new
section with a name that matched a section in the input binary.
E.g., the new .rodata with jump tables had dynamic relocations from
the original .rodata.
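The naming rules in the list above can be sketched as two small helpers. This is an illustration only: the function names are hypothetical and are not BOLT's API, and the sketch assumes the prefix is prepended directly to the section name. It covers only the internal (MC-level) names stated above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: internal name used when emitting an input file
// section (with relocations applied) via MC.
inline std::string internalNameForInputSection(const std::string &Name) {
  assert(!Name.empty() && "section name must not be empty");
  return ".bolt.org" + Name;
}

// Hypothetical helper: internal name used when emitting a new section.
// The ".bolt.new" prefix is added only if an input section already has
// this name; the name written to the output file stays Name either way.
inline std::string
internalNameForNewSection(const std::set<std::string> &InputNames,
                          const std::string &Name) {
  assert(!Name.empty() && "section name must not be empty");
  return InputNames.count(Name) ? ".bolt.new" + Name : Name;
}
```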
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135494
This adds a round of checks to memory references, looking for
incorrect references to jump table objects. Fix them by replacing the
jump table reference with another object reference + offset.
This solves bugs related to regular data references in code
accidentally being bound to a jump table, and this reference being
updated to a new (incorrect) location because we moved this jump
table.
Fixes #55004
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D134098
Put the code that creates references to symbol+addend behind MCPlusBuilder.
This will be used later in the validate memory references pass.
Reviewed By: #bolt, maksfb, yota9
Differential Revision: https://reviews.llvm.org/D134097
While the order of new sections in the output binary was deterministic
in the past (i.e. there was no run-to-run variation), it wasn't always
rational as we used size to define the precedence of allocatable
sections within "code" or "data" groups (probably unintentionally).
Fix that by defining stricter section-ordering rules.
Other than the order of sections, this should be NFC.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135235
I went over the output of the following mess of a command:
`(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)`
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Reviewed By: Amir, maksfb
Differential Revision: https://reviews.llvm.org/D130824
In perf2bolt and `-aggregate-only` BOLT mode, the output profile file is written
in fdata format by default. Provide a knob `-profile-format=[fdata,yaml]` to
control the format.
Note that `-w` option still dumps in YAML format.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D133995
After BOLT's merge to LLVM, there are two (almost identical) versions of the
code layout algorithm. The diff unifies the implementations by keeping the one
in LLVM.
There are mild changes in the resulting block orders. I tested the changes
extensively both on the clang binary and on prod services, and did not see
statistically significant differences on average.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D129895
I'm planning to deprecate and eventually remove llvm::empty.
Note that no use of llvm::empty requires the ability of llvm::empty to
determine the emptiness from begin/end only.
When we derive EFMM from SectionMemoryManager, it brings into EFMM extra
functionality, such as the registry of exception handling sections,
page permission management, etc. Such functionality is of no use to
llvm-bolt and can even be detrimental (see
https://github.com/llvm/llvm-project/issues/56726).
Change the base class of ExecutableFileMemoryManager to MemoryManager,
avoid registering EH sections, and skip memory finalization.
Fixes #56726
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D133994
In non-relocation mode, every function is emitted in its own section. If
a function is empty, RuntimeDyld will still allocate 1-byte section
for the function and initialize it with zero. As a result, we will
overwrite the first byte of the original function contents with zero.
Such a scenario can happen when the input function had only NOP
instructions which BOLT removes by default. Even though such functions
likely cause undefined behavior, it's better to preserve their contents.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D133978
For functions with references to internal offsets from data, verify externally
referenced blocks against the set of jump table targets. Mark the function
as non-simple if there are any unclaimed data-to-code references.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D132495
In non-PIE binaries, BOLT unconditionally converted type encoding
from indirect to absptr, which broke std exceptions since pointers
to their typeinfo were only assigned at runtime in .data section.
In this patch we preserve original encoding so that indirect
remains indirect and can be resolved at runtime, and absolute remains absolute.
Reviewed By: rafauler, maksfb
Differential Revision: https://reviews.llvm.org/D132484
Without this patch, I get warnings like:
bolt/include/bolt/Core/BinaryContext.h:108:19: error:
'iterator<std::bidirectional_iterator_tag,
llvm::bolt::BinarySection>' is deprecated
[-Werror,-Wdeprecated-declarations]
This patch fixes those warnings by defining iterator_category,
value_type, etc.
This patch intentionally leaves duplicate types like FilterIterator::T
and FilterIterator::PointerT intact to avoid mixing the fix and the
cleanup.
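A minimal sketch of the kind of fix described: the five member types are spelled out in the iterator class instead of being inherited from std::iterator, which is deprecated in C++17. `EvenIterator` is a hypothetical filtering iterator used only for illustration; it is not one of the BOLT iterators touched by this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

// Hypothetical iterator over a range of ints that skips odd values.
class EvenIterator {
  const int *Cur = nullptr;
  const int *End = nullptr;

  void skipOdd() {
    while (Cur != End && (*Cur % 2) != 0)
      ++Cur;
  }

public:
  // Member types that std::iterator used to provide.
  using iterator_category = std::forward_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = const int *;
  using reference = const int &;

  EvenIterator() = default;
  EvenIterator(const int *B, const int *E) : Cur(B), End(E) { skipOdd(); }

  reference operator*() const {
    assert(Cur != End && "dereferencing the end iterator");
    return *Cur;
  }
  EvenIterator &operator++() {
    assert(Cur != End && "incrementing the end iterator");
    ++Cur;
    skipOdd();
    return *this;
  }
  EvenIterator operator++(int) {
    EvenIterator Tmp = *this;
    ++*this;
    return Tmp;
  }
  bool operator==(const EvenIterator &O) const { return Cur == O.Cur; }
  bool operator!=(const EvenIterator &O) const { return Cur != O.Cur; }
};

// std::iterator_traits picks up the member types directly.
static_assert(
    std::is_same_v<std::iterator_traits<EvenIterator>::iterator_category,
                   std::forward_iterator_tag>);
```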
Differential Revision: https://reviews.llvm.org/D133650
For exception handling, LSDA call sites have to be emitted for each
fragment individually. With this patch, call sites and respective LSDA
symbols are generated and associated with each fragment of their
function, such that they can be used by the emitter.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132052
To enable split strategies that require view of the entire CFG (e.g. to
estimate cost of path from entry block), with this patch, all blocks of
a function are passed to `SplitStrategy::fragment`. Because this might
move non-outlineable blocks into a split fragment, these blocks are
moved back into the main fragment after fragmenting. This also gives
strategies the option to specify whether empty fragments should be
kept or removed.
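The flow described above can be sketched as follows. All names here (`Block`, `fragmentAll`) and the callback-based signature are assumptions made for illustration; they do not reflect the actual signature of `SplitStrategy::fragment`.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a basic block.
struct Block {
  bool CanOutline = true;
  unsigned Fragment = 0; // 0 denotes the main fragment
};

// Hypothetical driver: the strategy sees every block of the function, and
// afterwards any block that must not be outlined is moved back into the
// main fragment.
inline void
fragmentAll(std::vector<Block> &Blocks,
            const std::function<void(std::vector<Block> &)> &Strategy) {
  assert(Strategy && "a strategy is required");
  Strategy(Blocks);
  for (Block &B : Blocks)
    if (!B.CanOutline)
      B.Fragment = 0;
}
```

A strategy that naively assigns every block to fragment 1 still leaves non-outlineable blocks in fragment 0 after `fragmentAll` returns.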
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132423
ICP has two modes: jump table promotion and indirect call promotion.
The selection is based on whether an instruction has a jump table or not.
An instruction with unknown control flow doesn't have a jump table and will
fall under indirect call promotion policy which might be incorrect/unsafe
(if an instruction is not a tail call, i.e. has local jump targets).
Prevent ICP for functions containing instructions with unknown control flow.
Follow-up to https://reviews.llvm.org/D128870.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132882
This introduces an abstract base class for splitting strategies to
document the interface a strategy needs to implement, and also to avoid
code bloat of the `splitFunction` method.
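A rough sketch of what such an interface could look like. Everything below is an assumption made for illustration: the member names `keepEmpty` and `fragment`, their signatures, and the `SketchBlock` and `SplitCold` types are hypothetical and may differ from the classes introduced by this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a basic block.
struct SketchBlock {
  bool IsCold = false;
  unsigned FragmentNum = 0; // 0 denotes the main fragment
};

// Abstract base class documenting what a strategy has to provide.
struct SplitStrategy {
  virtual ~SplitStrategy() = default;
  // Whether fragments without blocks should be kept.
  virtual bool keepEmpty() const = 0;
  // Assign a fragment number to each block.
  virtual void fragment(std::vector<SketchBlock> &Blocks) = 0;
};

// Hypothetical concrete strategy: cold blocks go into fragment 1.
struct SplitCold final : SplitStrategy {
  bool keepEmpty() const override { return false; }
  void fragment(std::vector<SketchBlock> &Blocks) override {
    for (SketchBlock &B : Blocks)
      B.FragmentNum = B.IsCold ? 1 : 0;
  }
};

// The driver depends only on the interface, so per-strategy logic stays
// out of it. Returns the number of blocks placed outside the main fragment.
inline std::size_t splitFunction(std::vector<SketchBlock> &Blocks,
                                 SplitStrategy &Strategy) {
  assert(!Blocks.empty() && "function has no blocks");
  Strategy.fragment(Blocks);
  std::size_t NumSplit = 0;
  for (const SketchBlock &B : Blocks)
    if (B.FragmentNum != 0)
      ++NumSplit;
  return NumSplit;
}
```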
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132054
We were trying to process .debug_addr for a CU that doesn't have it. This resulted
in an assertion failure. The example came from GCC, which also doesn't use
DW_OP_addrx in DW_FORM_exprloc.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132422
To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132051
This changes `FunctionFragment` from being used as a temporary proxy
object to access basic block ranges to a heap-allocated object that can
store fragment-specific information.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132050
A const-qualified reference to function layout allows accessing
non-const qualified basic blocks on a const-qualified function. This
patch adds or removes const-qualifiers where necessary to indicate where
basic blocks are used in a non-const manner.
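The underlying C++ behavior can be shown with a small sketch: constness of a container of pointers does not extend to the pointees, so a const accessor has to return a pointer-to-const explicitly. `Layout`, `BasicBlock`, and the accessor names are hypothetical and are not BOLT's classes.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

struct BasicBlock {
  int ExecCount = 0;
};

// Hypothetical layout that stores pointers to blocks it does not own.
class Layout {
  std::vector<BasicBlock *> Blocks;

public:
  void add(BasicBlock *BB) {
    assert(BB && "null block");
    Blocks.push_back(BB);
  }

  // Leaky: compiles even though the method is const, because only the
  // vector is const, not the blocks it points to.
  BasicBlock *leakyFront() const { return Blocks.front(); }

  // Const-correct pair: const access yields a pointer-to-const block.
  const BasicBlock *front() const { return Blocks.front(); }
  BasicBlock *front() { return Blocks.front(); }
};

// Through a const Layout, front() only exposes read-only blocks.
static_assert(
    std::is_same_v<decltype(std::declval<const Layout &>().front()),
                   const BasicBlock *>);
```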
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132049