Current Bionic processes relocations in this order:
* DT_ANDROID_REL[A]
* DT_RELR
* DT_REL[A]
* DT_JMPREL
If an IRELATIVE relocation is in DT_ANDROID_REL[A], it would read
unrelocated (incorrect) global variables associated with RELR when
--pack-dyn-relocs=android+relr is enabled. Work around this by placing
IRELATIVE in .rel[a].plt (DT_JMPREL).
Link: https://r.android.com/3014185
`relaIplt` was added so that IRELATIVE relocations are placed at the end
of .rela.dyn (since https://reviews.llvm.org/D65651) or .rela.plt
(--pack-dyn-relocs=android[+relr]). Unfortunately, handling `relaIplt`
requires special cases all over the code base. We can extend
partitionRels/computeRels to partition both RELATIVE and IRELATIVE
relocations, rendering `relaIplt` unneeded.
The change allows IRELATIVE relocations in the DT_ANDROID_REL[A] table
(untested?!), which may be processed before other types of relocations.
This seems acceptable for Bionic's DEFINE_IFUNC_FOR use cases.
In addition, this change simplies changing .rel[a].dyn to a compact
relocation format (CREL).
SHF_INFO_LINK is removed from .rel[a].dyn with IRELATIVE relocations.
(See https://reviews.llvm.org/D89828).
When adding fixups for RISCV_TLSDESC_ADD_LO and RISCV_TLSDESC_LOAD_LO,
the local label added for RISCV TLSDESC relocations have STT_TLS set,
which is incorrect. Instead, these labels should have `STT_NOTYPE`.
This patch stops adding such fixups and avoid setting the STT_TLS on
these symbols. Failing to do so can cause LLD to emit an error `has an
STT_TLS symbol but doesn't have an SHF_TLS section`. We additionally,
adjust how LLD services these relocations to avoid errors with
incompatible relocation and symbol types.
Reviewers: topperc, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/85817
With the new SystemZ port we noticed that -pie executables generated
from files containing R_390_TLS_IEENT relocations will have unnecessary
relocations in their GOT:
9e8d8: R_390_TLS_TPOFF *ABS*+0x18
This is caused by the config->isPic conditon in addTpOffsetGotEntry:
static void addTpOffsetGotEntry(Symbol &sym) {
in.got->addEntry(sym);
uint64_t off = sym.getGotOffset();
if (!sym.isPreemptible && !config->isPic) {
in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
return;
}
It is correct that we need to retain a TPOFF relocation if the target
symbol is preemptible or if we're building a shared library. But when
building a -pie executable, those values are fixed at link time and
there's no need for any remaining dynamic relocation.
Note that the equivalent MIPS-specific code in MipsGotSection::build
checks for config->shared instead of config->isPic; we should use the
same check here. (Note also that on many other platforms we're not even
using addTpOffsetGotEntry in this case as an IE->LE relaxation is
applied before; we don't have this type of relaxation on SystemZ.)
This patch adds full support for linking SystemZ (ELF s390x) object
files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.
In addition to new platform code and the obvious changes, there were a
few additional changes to common code:
- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
to use a platform-specific name since nothing in the definition of these
relocs is actually platform-specific; it is well possible that other
platforms will need the same.
- A couple of tweaks to TLS relocation handling, as the particular
semantics of the s390x versions differ slightly. See comments in the
code.
This was tested by building and testing >1500 Fedora packages, with only
a handful of failures; as these also have issues when building with LLD
on other architectures, they seem unrelated.
Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
Support
R_RISCV_TLSDESC_HI20/R_RISCV_TLSDESC_LOAD_LO12/R_RISCV_TLSDESC_ADD_LO12/R_RISCV_TLSDESC_CALL.
LOAD_LO12/ADD_LO12/CALL relocations reference a label at the HI20
location, which requires special handling. We save the value of HI20 to
be reused. Two interleaved TLSDESC code sequences, which compilers do
not generate, are unsupported.
For -no-pie/-pie links, TLSDESC to initial-exec or local-exec
optimizations are eligible. Implement the relevant hooks
(R_RELAX_TLS_GD_TO_LE, R_RELAX_TLS_GD_TO_IE): the first two instructions
are converted to NOP while the latter two are converted to a GOT load or
a lui+addi.
The first two instructions, which would be converted to NOP, are removed
instead in the presence of relaxation. Relaxation is eligible as long as
the R_RISCV_TLSDESC_HI20 relocation has a pairing R_RISCV_RELAX,
regardless of whether the following instructions have a R_RISCV_RELAX.
In addition, for the TLSDESC to LE optimization (`lui a0,<hi20>; addi a0,a0,<lo12>`),
`lui` can be removed (i.e. use the short form) if hi20 is 0.
```
// TLSDESC to LE/IE optimization
.Ltlsdesc_hi2:
auipc a4, %tlsdesc_hi(c) # if relax: remove; otherwise, NOP
load a5, %tlsdesc_load_lo(.Ltlsdesc_hi2)(a4) # if relax: remove; otherwise, NOP
addi a0, a4, %tlsdesc_add_lo(.Ltlsdesc_hi2) # if LE && !hi20 {if relax: remove; otherwise, NOP}
jalr t0, 0(a5), %tlsdesc_call(.Ltlsdesc_hi2)
add a0, a0, tp
```
The implementation carefully ensures that an instruction unrelated to
the current TLSDESC code sequence, if immediately follows a removable
instruction (HI20 or LOAD_LO12 OR (LE-specific) ADD_LO12), is not
converted to NOP.
* `riscv64-tlsdesc.s` is inspired by `i386-tlsdesc-gd.s` (https://reviews.llvm.org/D112582).
* `riscv64-tlsdesc-relax.s` tests linker relaxation.
* `riscv-tlsdesc-gd-mixed.s` is inspired by `x86-64-tlsdesc-gd-mixed.s` (https://reviews.llvm.org/D116900).
Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373
Reviewed By: ilovepi
Pull Request: https://github.com/llvm/llvm-project/pull/79239
Based on https://reviews.llvm.org/D45375 . Introduce a new InputFile
kind `InternalKind`, use it for
* `ctx.internalFile`: for linker-defined symbols and some synthesized
`Undefined`
* `createInternalFile`: for symbol assignments and --defsym
I picked "internal" instead of "synthetic" to avoid confusion with
SyntheticSection.
Currently a symbol's file is one of: nullptr, ObjKind, SharedKind,
BitcodeKind, BinaryKind. Now it's non-null (I plan to add an
`assert(file)` to Symbol::Symbol and change `toString(const InputFile
*)`
separately).
Debugging and error reporting gets improved. The immediate user-facing
difference is more descriptive "File" column in the --cref output. This
patch may unlock further simplification.
Currently each symbol assignment gets its own
`createInternalFile(cmd->location)`. Two symbol assignments in a linker
script do not share the same file. Making the file the same would be
nice, but would require non trivial code.
Makes it clearer what the issue is when hand-written assembly doesn't
follow medium code model assumptions in a medium code model build.
Alternative to #71248 by only hinting on an overflow.
The relocations that map to R_ARM_PCA are equivalent to R_PC. They are
PC-relative and safe to use in shared libraries, but have a different
relocation code as they are evaluated differently. Now that LLVM may
generate these relocations in object files, they may occur in
shared libraries or position-independent executables.
Complement #72610 (non-SHF_ALLOC sections). GCC-generated
.gcc_exception_table has the SHF_ALLOC flag and may contain
R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128 relocations.
Copy relocations and canonical PLT entries are for symbols defined in a
DSO. Currently we create them even for a `Defined`, possibly leading to
an output that won't work at run-time (e.g. R_X86_64_JUMP_SLOT
referencing a null symbol).
```
% cat a.s
.globl _start, main
.type main, @function
_start: main: ret
.rodata
.quad main
% clang -fuse-ld=lld -pie -nostdlib a.s
% readelf -Wr a.out
Relocation section '.rela.plt' at offset 0x290 contains 1 entry:
Offset Info Type Symbol's Value Symbol's Name + Addend
00000000000033b8 0000000000000007 R_X86_64_JUMP_SLOT 12b0
```
Report an error instead for the default `-z text` mode. GNU ld reports
an error in `-z text` mode as well.
The two fields are similar.
`versionId` is the Verdef index in the output file. It is set for
`--exclude-id=`, version script patterns, and `sym@ver` symbols.
`verdefIndex` is the Verdef index of a Sharedfile (SharedSymbol or a
copy-relocated Defined), the default value -1 is also used to indicate
that the symbol has not been matched by a version script pattern
(https://reviews.llvm.org/D65716).
It seems confusing to have two fields. Merge them so that we can
allocate one bit for #70130 (suppress --no-allow-shlib-undefined
error in the presence of a DSO definition).
The two fields are similar.
`versionId` is the Verdef index in the output file. It is set for
version script patterns and `sym@ver` symbols.
`verdefIndex` is the Verdef index of a SharedSymbol. The default value
-1 is also used to indicate that the symbol has not been matched by a
version script pattern (https://reviews.llvm.org/D65716).
It seems confusing to have two fields. Merge them so that we can
allocate one bit for #70130 (suppress --no-allow-shlib-undefined
error in the presence of a DSO definition).
The undefined symbol message suggests the source line when line number
information is available (see https://reviews.llvm.org/D31481).
When the undefined symbol is from a global variable, we won't get the
line information.
```
extern int undef;
namespace ns {
int *var[] = {
&undef
};
// DW_TAG_variable(DW_AT_decl_file/DW_AT_decl_line) is available while
// line number information is unavailable.
}
ld.lld: error: undefined symbol: undef
>>> referenced by undef-debug2.cc
>>> undef-debug2.o:(ns::var)
```
This patch utilizes `getEnclosingSymbol` to locate `var` and find
DW_TAG_variable for `var`:
```
ld.lld: error: undefined symbol: undef
>>> referenced by undef-debug2.cc:3 (/tmp/c/undef-debug2.cc:3)
>>> undef-debug2.o:(ns::var)
```
When an input section is matched by /DISCARD/ in a linker script, GNU ld
reports errors for relocations referencing symbols defined in the section:
`.aaa' referenced in section `.bbb' of a.o: defined in discarded section `.aaa' of a.o
Implement the error by demoting eligible symbols to `Undefined` and changing
STB_WEAK to STB_GLOBAL. As a side benefit, in relocatable links, relocations
referencing symbols defined relative to /DISCARD/ discarded sections no longer
set symbol/type to zeros.
It's arguable whether a weak reference to a discarded symbol should lead to
errors. GNU ld reports an error and our demoting approach reports an error as
well.
Close#58891
Co-authored-by: Bevin Hansson <bevin.hansson@ericsson.com>
For each R_X86_64_(REX_)GOTPCRELX relocation, check that the offset to the symbol is representable with 2^32 signed offset. If not, add a GOT entry for it and set its expr to R_GOT_PC so that we emit the GOT load instead of the relaxed lea. Do this in finalizeAddressDependentContent() where we iteratively attempt this (e.g. RISCV uses this for relaxation, ARM uses this to insert thunks).
Decided not to do the opposite of inserting GOT entries initially and removing them when relaxable because removing GOT entries isn't simple.
One drawback of this approach is that if we see any GOTPCRELX relocation, we'll create an empty .got even if it's not required in the end.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D157020
As per the ABI at
https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst,
this patch interprets the SHT_AARCH64_MEMTAG_GLOBALS_STATIC section,
which contains R_NONE relocations to tagged globals, and emits a
SHT_AARCH64_MEMTAG_GLOBALS_DYNAMIC section, with the correct
DT_AARCH64_MEMTAG_GLOBALS and DT_AARCH64_MEMTAG_GLOBALSSZ dynamic
entries. This section describes, in a uleb-encoded stream, global memory
ranges that should be tagged with MTE.
We are also out of bits to spare in the LLD Symbol class. As a result,
I've reused the 'needsTocRestore' bit, which is a PPC64 only feature.
Now, it's also used for 'isTagged' on AArch64.
An entry in SHT_AARCH64_MEMTAG_GLOBALS_STATIC is practically a guarantee
from an objfile that all references to the linked symbol are through the
GOT, and meet correct alignment requirements. As a result, we go through
all symbols and make sure that, for all symbols $SYM, all object files
that reference $SYM also have a SHT_AARCH64_MEMTAG_GLOBALS_STATIC entry
for $SYM. If this isn't the case, we demote the symbol to being
untagged. Symbols that are imported from other DSOs should always be
fine, as they're GOT-referenced (and thus the GOT entry either has the
correct tag or not, depending on whether it's tagged in the defining DSO
or not).
Additionally hand-tested by building {libc, libm, libc++, libm, and libnetd}
on Android with some experimental MTE globals support in the
linker/libc.
Reviewed By: MaskRay, peter.smith
Differential Revision: https://reviews.llvm.org/D152921
This adds support for the LoongArch ELF psABI v2.00 [1] relocation
model to LLD. The deprecated stack-machine-based psABI v1 relocs are not
supported.
The code is tested by successfully bootstrapping a Gentoo/LoongArch
stage3, complete with common GNU userland tools and both the LLVM and
GNU toolchains (GNU toolchain is present only for building glibc,
LLVM+Clang+LLD are used for the rest). Large programs like QEMU are
tested to work as well.
[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
Reviewed By: MaskRay, SixWeining
Differential Revision: https://reviews.llvm.org/D138135
That patch adds a check for threadIndex being used with only threads
created by ThreadPoolExecutor. This helps catch two types of errors:
1. If a thread is created not by ThreadPoolExecutor its index may clash
with the index of another thread. Using threadIndex, in that case, may
lead to a data race.
2. Index of the main thread(threadIndex == 0) currently clashes with
the index of thread0 in ThreadPoolExecutor threads. That may lead
to a data race if main thread and thread0 are executed concurrently.
This patch allows execution tasks on the main thread only in case
parallel::strategy.ThreadsRequested == 1. In all other cases,
assertions check that threadIndex != UINT_MAX(i.e. that task
is executed on a thread created by ThreadPoolExecutor).
Differential Revision: https://reviews.llvm.org/D148916
This patch allows to specify that some part of tasks should be
done in sequential order. It makes it possible to not use
condition operator for separating sequential tasks:
TaskGroup tg;
for () {
if(condition) ==> tg.spawn([](){fn();}, condition)
fn();
else
tg.spawn([](){fn();});
}
It also prevents execution on main thread. Which allows adding
checks for getThreadIndex() function discussed in D142318.
The patch also replaces std::stack with std::deque in the
ThreadPoolExecutor to have natural execution order in case
(parallel::strategy.ThreadsRequested == 1).
Differential Revision: https://reviews.llvm.org/D148728
D73518 mentioned non-STT_SECTION symbol names. This patch extends the code to
handle STT_SECTION symbols, where we report the section name.
This change helps at least the following cases with very little code.
* Whether a out-of-range relocation is due to code or data.
* For a relocation in .debug_info, which referenced `.debug_*` section (due to DWARF32 limitation) causes the problem.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D145199
Fix https://github.com/llvm/llvm-project/issues/60392
```
// a.cc
void raise() { throw 42; }
bool foo() {
try { raise(); } catch (int) { return true; }
return false;
}
int main() { foo(); }
```
```
clang++ --target=x86_64-linux-gnu -fno-pic -mcmodel=large -no-pie -fuse-ld=lld -z notext a.cc -o a && ./a
clang++ --target=aarch64-linux-gnu -fno-pic -no-pie -fuse-ld=lld -Wl,--dynamic-linker=/usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1 -Wl,-rpath=/usr/aarch64-linux-gnu/lib -z notext a.cc -o a && ./a
```
Both commands fail because we produce a dynamic relocation for
R_X86_64_64/R_AARCH64_ABS64 in .eh_frame which will be adjusted to a wrong
offset by `SectionBase::getOffset` after D122459.
Since GNU ld uses a canonical PLT entry instead of a dynamic relocation for
.eh_frame, we follow suit as well to avoid the issue.
Mips has an ABI issue (https://github.com/llvm/llvm-project/issues/5837) and we
don't implement GNU ld's DW_EH_PE_absptr conversion. mips64-eh-abs-reloc.s wants
a dynamic relocation, so keep the original behavior for EM_MIPS.
Differential Revision: https://reviews.llvm.org/D143136
to prepare for changing `relocations` from a SmallVector to a pointer.
Also change the `isec` parameter in `addAddendOnlyRelocIfNonPreemptible` to `GotSection &`.
Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr
indirection. We can move other global variables into ctx without
indirection concern. In the long term we may consider passing Ctx
as a parameter to various functions and eliminate global state as
much as possible and then remove `Ctx::reset`.
`config` has 1000+ uses so we try to avoid changing `config->foo`. Define a
wrapper with LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr
indirection.
My x86-64 lld executable is 11+KiB smaller.
Symbol::replace intends to overwrite a few fields (mostly Elf{32,64}_Sym
fields), but the implementation copies all fields then restores some old fields.
This is error-prone and wasteful. Add Symbol::overwrite to copy just the
needed fields and add other overwrite member functions to copy the extra
fields.
https://reviews.llvm.org/D133003#3806508 can reproduce a non-determinism with
--threads=4. Making the config serial fixes non-determinism (by running the link
many times and compare output).
On Unix platforms, this wrapper function is inline, so it should
expand to the same direct access to the thread local variable. On
Windows, it's a non-inline function within Parallel.cpp, allowing
making the thread_local variable static.
Windows Native TLS doesn't support direct access to thread local
variables in a different DLL, and GCC/binutils on Windows occasionally
has problems with non-static thread local variables too.
This fixes mingw dylib builds with native TLS after
e6aebff674.
At the same time, move the whole thread local variable within
#if LLVM_ENABLE_THREADS
to fix builds without threading support.
Differential Revision: https://reviews.llvm.org/D133759
* Change `Symbol::flags` to a `std::atomic<uint16_t>`
* Add `llvm::parallel::threadIndex` as a thread-local non-negative integer
* Add `relocsVec` to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex
* Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output.
MIPS and PPC64 use global states for relocation scanning. Keep serial scanning.
Speed-up with mimalloc and --threads=8 on an Intel Skylake machine:
* clang (Release): 1.27x as fast
* clang (Debug): 1.06x as fast
* chrome (default): 1.05x as fast
* scylladb (default): 1.04x as fast
Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64):
* clang (Release): 1.31x as fast
* scylladb (default): 1.06x as fast
Reviewed By: andrewng
Differential Revision: https://reviews.llvm.org/D133003