This handles two initial relocation types R_X86_64_GOTPC32_TLSDESC and
R_X86_64_TLSDESC_CALL, as well as the GD->LE and GD->IE relaxations.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D62513
llvm-svn: 361911
This is implemented by creating Undefined (instead of Defined) for such
local STT_SECTION symbols. It allows us to catch errors when there are
relocations to such discarded sections (e.g. in PR41693, ld.bfd and gold
error but we don't). Updated comdat-discarded-error.s checks we emit
friendly error message.
For relocatable-eh-frame.s, ld.lld -r a.o a.o will now error
"STT_SECTION symbol should be defined" because the section .eh_frame
refers to is now an Undefined instead of a Defined.
So I have to change `error()` to `warn()` to retain the output.
rLLD361144 inadvertently enabled the error for --gdb-index
(in LLDDwarfObj<ELFT>::findAux()).
Relocations from .debug_info (not in comdat) to .text.* (in comdat) for
DW_AT_low_pc are common. If an .text.* was discarded, rLLD361144 would error,
which was unexpected. (Note, if we don't error as this patch does,
InputSection::relocateNonAlloc() will resolve such relocations).
llvm-svn: 361830
This is implemented by creating Undefined (instead of Defined) for such
local STT_SECTION symbols. It allows us to catch errors when there are
relocations to such discarded sections (e.g. in PR41693, ld.bfd and gold
error but we don't). Updated comdat-discarded-error.s checks we emit
friendly error message.
For relocatable-eh-frame.s, ld.lld -r a.o a.o will now error
"STT_SECTION symbol should be defined" because the section .eh_frame
refers to is now an Undefined instead of a Defined.
So I have to change `error()` to `warn()` to retain the output.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D61583
llvm-svn: 361792
For R_TLS:
1) Delete Sym.isTls() . The assembler ensures the symbol is STT_TLS.
If not (the input is broken), we would crash (dereferencing null Out::TlsPhdr).
2) Change Sym.isUndefWeak() to Sym.isUndefined(), otherwise with --noinhibit-exec
we would still evaluate the symbol and crash.
3) Return A if the symbol is undefined. This is PR40570.
The case is probably unrealistic but returning A matches R_ABS and the
behavior of several dynamic loaders.
R_NEG_TLS is obsoleted Sun TLS we don't fully support, but
R_RELAX_TLS_GD_TO_LE_NEG is still used by GD->LE relaxation (subl $var@tpoff,%eax).
They should add the addend. Unfortunately I can't test it as compilers don't seem to generate non-zero implicit addends.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D62098
llvm-svn: 361146
This reverts D53906.
D53906 increased p_align of PT_TLS on ARM/AArch64 to 32/64 to make the
static TLS layout compatible with Android Bionic's ELF TLS. However,
this may cause glibc ARM/AArch64 programs to crash (see PR41527).
The faulty PT_TLS in the executable satisfies p_vaddr%p_align != 0. The
remainder is normally 0 but may be non-zero with the hack in place. The
problem is that we increase PT_TLS's p_align after OutputSections'
addresses are fixed (assignAddress()). It is possible that
p_vaddr%old_p_align = 0 while p_vaddr%new_p_align != 0.
For a thread local variable defined in the executable, lld computed TLS
offset (local exec) is different from glibc computed TLS offset from
another module (initial exec/generic dynamic). Note: PR41527 said the
bug affects initial exec but actually generic dynamic is affected as
well.
(glibc is correct in that it compute offsets that satisfy
`offset%p_align == p_vaddr%p_align`, which is a basic ELF requirement.
This hack appears to work on FreeBSD rtld, musl<=1.1.22, and Bionic, but
that is just because they (and lld) incorrectly compute offsets that
satisfy `offset%p_align = 0` instead.)
Android developers are fine to revert this patch, carry this patch in
their tree before figuring out a long-term solution (e.g. a dummy .tdata
with sh_addralign=64 sh_size={0,1} in crtbegin*.o files. The overhead is
now insignificant after D62059).
Reviewed By: rprichard, srhines
Differential Revision: https://reviews.llvm.org/D62055
llvm-svn: 361090
After D62059, we don't align p_memsz of PT_TLS to p_align. The
getRelocTargetVA formula should align it instead.
It becomes clear that R_NEG_TLS and R_TLS are opposite from each other.
In i386-tls-le-align.s, I put ret after call ___tls_get_addr@plt as
otherwise ld.bfd would reject the relaxation:
TLS transition from R_386_TLS_GD to R_386_TLS_LE_32 against `a' at 0x3 in section `.text' failed
llvm-svn: 361088
As Ryan Prichard pointed out, after D62059, the TP offset is incorrect.
Add x86-64-tls-le-align.s to check this. Better formulae for both
variants should take p_vaddr%p_align into account (offset%p_align =
p_vaddr%p_align is a basic ELF requirement), but I can't find a way to
test the behavior.
llvm-svn: 361084
On Elf*_Rel targets, for a relocation to a section symbol, an R_ABS is
added which will be used by relocateOne() to compute the implicit
addend.
Addends of R_*_NONE should be ignored, so don't emit an R_ABS.
This fixes crashes on X86 and ARM because their relocateOne() do not
handle R_*_NONE.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D62052
llvm-svn: 361036
The change broke some scenarios where debug information is still
needed, although MarkLive cannot see it, including the
Chromium/Android build. Reverting to unbreak that build.
llvm-svn: 360955
This is based on D54720 by Sean Fertile.
When accessing a global symbol which is not defined in the translation unit,
compilers will generate instructions that load the address from the toc entry.
If the symbol is defined, non-preemptable, and addressable with a 32-bit
signed offset from the toc pointer, the address can be computed
directly. e.g.
addis 3, 2, .LC0@toc@ha # R_PPC64_TOC16_HA
ld 3, .LC0@toc@l(3) # R_PPC64_TOC16_LO_DS, load the address from a .toc entry
ld/lwa 3, 0(3) # load the value from the address
.section .toc,"aw",@progbits
.LC0: .tc var[TC],var
can be relaxed to
addis 3,2,var@toc@ha # this may be relaxed to a nop,
addi 3,3,var@toc@l # then this becomes addi 3,2,var@toc
ld/lwa 3, 0(3) # load the value from the address
We can delete the test ppc64-got-indirect.s as its purpose is covered by
newly added ppc64-toc-relax.s and ppc64-toc-relax-constants.s
Reviewed By: ruiu, sfertile
Differential Revision: https://reviews.llvm.org/D60958
llvm-svn: 360112
Summary:
We use `uint32_t SectionBase::Alignment` and `uint32_t
PhdrEntry::p_align` despite alignments being 64 bits in ELF64.
Fix the std::max template arguments accordingly.
The currently 160-byte InputSection will become 168 bytes if we make SectionBase::Alignment uint64_t.
Differential Revision: https://reviews.llvm.org/D61171
llvm-svn: 359268
Summary:
Fixes PR35242. A simplified reproduce:
thread_local int i; int f() { return i; }
% {g++,clang++} -fPIC -shared -ftls-model=local-dynamic -fuse-ld=lld a.cc
ld.lld: error: can't create dynamic relocation R_X86_64_DTPOFF32 against symbol: i in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output
In isStaticLinkTimeConstant(), Syn.IsPreemptible is true, so it is not
seen as a constant. The error is then issued in processRelocAux().
A symbol of the local-dynamic TLS model cannot be preempted but it can
preempt symbols of the global-dynamic TLS model in other DSOs.
So it makes some sense that the variable is not static.
This patch fixes the linking error by changing getRelExpr() on
R_386_TLS_LDO_32 and R_X86_64_DTPOFF{32,64} from R_ABS to R_DTPREL.
R_PPC64_DTPREL_* and R_MIPS_TLS_DTPREL_* need similar fixes, but they are not handled in this patch.
As a bonus, we use `if (Expr == R_ABS && !Config->Shared)` to find
ld-to-le opportunities. R_ABS is overloaded here for such STT_TLS symbols.
A dedicated R_DTPREL is clearer.
Differential Revision: https://reviews.llvm.org/D60945
llvm-svn: 358870
Summary:
This relocation type is used by R_386_TLS_GD. Its formula is the same as
R_GOTPLT (e.g R_X86_64_GOT{32,64} R_386_TLS_GOTIE). Rename it to be clearer.
Differential Revision: https://reviews.llvm.org/D60941
llvm-svn: 358868
Patch by Robert O'Callahan.
Rust projects tend to link in all object files from all dependent
libraries and rely on --gc-sections to strip unused code and data.
Unfortunately --gc-sections doesn't currently strip any debuginfo
associated with GC'ed sections, so lld links in the full debuginfo from
all dependencies even if almost all that code has been discarded. See
https://github.com/rust-lang/rust/issues/56068 for some details.
Properly stripping debuginfo for discarded sections would be difficult,
but a simple approach that helps significantly is to mark debuginfo
sections as live only if their associated object file has at least one
live code/data section. This patch does that. In a (contrived but not
totally artificial) Rust testcase linked above, it reduces the final
binary size from 46MB to 5.1MB.
Differential Revision: https://reviews.llvm.org/D54747
llvm-svn: 358069
Summary:
This should address remaining issues discussed in PR36555.
Currently R_GOT*_FROM_END are exclusively used by x86 and x86_64 to
express relocations types relative to the GOT base. We have
_GLOBAL_OFFSET_TABLE_ (GOT base) = start(.got.plt) but end(.got) !=
start(.got.plt)
This can have problems when _GLOBAL_OFFSET_TABLE_ is used as a symbol, e.g.
glibc dl_machine_dynamic assumes _GLOBAL_OFFSET_TABLE_ is start(.got.plt),
which is not true.
extern const ElfW(Addr) _GLOBAL_OFFSET_TABLE_[] attribute_hidden;
return _GLOBAL_OFFSET_TABLE_[0]; // R_X86_64_GOTPC32
In this patch, we
* Change all GOT*_FROM_END to GOTPLT* to fix the problem.
* Add HasGotPltOffRel to denote whether .got.plt should be kept even if
the section is empty.
* Simplify GotSection::empty and GotPltSection::empty by setting
HasGotOffRel and HasGotPltOffRel according to GlobalOffsetTable early.
The change of R_386_GOTPC makes X86::writePltHeader simpler as we don't
have to compute the offset start(.got.plt) - Ebx (it is constant 0).
We still diverge from ld.bfd (at least in most cases) and gold in that
.got.plt and .got are not adjacent, but the advantage doing that is
unclear.
Reviewers: ruiu, sivachandra, espindola
Subscribers: emaste, mehdi_amini, arichardson, dexonsmith, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59594
llvm-svn: 356968
This shaves another word off SectionBase and makes it possible to clone a
section using the implicit copy constructor.
This basically reverts r311056, which removed the mutex in order to
make the code easier to understand. On balance I think it's probably more
straightforward to have a mutex here than to have an unusual copy constructor
in SectionBase.
Differential Revision: https://reviews.llvm.org/D59269
llvm-svn: 355966
This patch removes the precompiled binary from inputs,
replacing it with a YAML. And teaches LLD to report a
section name in case of such error.
Differential revision: https://reviews.llvm.org/D58670
llvm-svn: 354959
Previously, we showed the following message for an unknown relocation:
foo.o: unrecognized reloc 256
This patch improves it so that the error message includes a symbol name:
foo.o: unknown relocation (256) against symbol bar
llvm-svn: 354040
Non-GOT non-PLT relocations to non-preemptible ifuncs result in the
creation of a canonical PLT, which now takes the identity of the IFUNC
in the symbol table. This (a) ensures address consistency inside and
outside the module, and (b) fixes a bug where some of these relocations
end up pointing to the resolver.
Fixes (at least) PR40474 and PR40501.
Differential Revision: https://reviews.llvm.org/D57371
llvm-svn: 353981
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
The section and offset can be very helpful in diagnosing certian errors.
For example on a relocation overflow or misalignment diagnostic:
test.c:(function foo): relocation R_PPC64_ADDR16_DS out of range: ...
The function foo can have many R_PPC64_ADDR16_DS relocations. Adding the offset
and section will identify exactly which relocation is causing the failure.
Differential Revision: https://reviews.llvm.org/D56453
llvm-svn: 350828
ARM and AArch64 use TLS variant 1, where the first two words after the
thread pointer are reserved for the TCB, followed by the executable's TLS
segment. Both the thread pointer and the TLS segment are aligned to at
least the TLS segment's alignment.
Android/Bionic historically has not supported ELF TLS, and it has
allocated memory after the thread pointer for several Bionic TLS slots
(currently 9 but soon only 8). At least one of these allocations
(TLS_SLOT_STACK_GUARD == 5) is widespread throughout Android/AArch64
binaries and can't be changed.
To reconcile this disagreement about TLS memory layout, set the minimum
alignment for executable TLS segments to 8 words on ARM/AArch64, which
reserves at least 8 words of memory after the TP (2 for the ABI-specified
TCB and 6 for alignment padding). For simplicity, and because lld doesn't
know when it's targeting Android, increase the alignment regardless of
operating system.
Differential Revision: https://reviews.llvm.org/D53906
llvm-svn: 350681
In the ABI for the 64-bit Arm architecture the section on weak references
states:
During linking, the symbol value of an undefined weak reference is:
- Zero if the relocation type is absolute
- The address of the place if the relocation type is pc-relative.
The relocations associated with an ADRP are relative so we should resolve
the undefined weak reference to the place instead of 0. This matches GNU
ld.bfd behaviour.
fixes pr34928
Differential Revision: https://reviews.llvm.org/D55599
llvm-svn: 349024
Previously, we have a hash table containing strings and their offsets
to manage mergeable strings. Technically we can live without that, because
we can do binary search on a vector of mergeable strings to find a mergeable
strings.
We did have both the hash table and the binary search because we thought
that that is faster.
We recently observed that lld tend to consume more memory than gold when
building an output with debug info. A few percent of memory is consumed by
the hash table. So, we needed to reevaluate whether or not having the extra
hash table is a good CPU/memory tradeoff. I run a few benchmarks with and
without the hash table.
I got a mixed result for the benchmark. We observed a regression for some
programs by removing the hash table (that's what we expected), but we also
observed that performance imrpovements for some programs. This is perhaps
due to reduced memory usage.
Differential Revision: https://reviews.llvm.org/D55234
llvm-svn: 348401
This is https://bugs.llvm.org/show_bug.cgi?id=38074.
The issue is that when calling a function, LLD generates a
.got entry that points to the IFUNC resolver function when
instead, it should use the PLT entries properly for
handling the IFUNC.
So we should create a got entry that points to PLT entry,
which itself loads the value from
.got.plt, relocated with R_*_IRELATIVE to make things work.
Patch do that.
Differential revision: https://reviews.llvm.org/D54314
llvm-svn: 347650
The R_AARCH64_ADR_PREL_PG_HI21 relocation type is given the R_PAGE_PC
RelExpr. This can be transformed to R_PLT_PAGE_PC via toPlt().
Unfortunately the resolution is identical to R_PAGE_PC so instead of
getting the address of the PLT entry we get the address of the symbol
which may not be correct in the case of static ifuncs. The fix is to
handle the cases separately and use getPltVA() + A with R_PLT_PAGE_PC.
Differential Revision: https://reviews.llvm.org/D54474
llvm-svn: 346863
This is https://bugs.llvm.org/show_bug.cgi?id=39493.
We crashed previously because did not handle /DISCARD/ properly
when -r was used. I think it is uncommon to use scripts with -r, though I see
nothing wrong to handle the /DISCARD/ so that we will not crash at least.
Differential revision: https://reviews.llvm.org/D53864
llvm-svn: 345819
Summary:
There are really three different kinds of TLS layouts:
* A fixed TLS-to-TP offset. On architectures like PowerPC, MIPS, and
RISC-V, the thread pointer points to a fixed offset from the start
of the executable's TLS segment. The offset is 0x7000 for PowerPC
and MIPS, which allows a signed 16-bit offset to reach 0x1000 of
per-thread implementation data and 0xf000 of the application's TLS
segment. The size and layout of the TCB isn't relevant to the static
linker and might not be known.
* A fixed TCB size. This is the format documented as "variant 1" in
Ulrich Drepper's TLS spec. The thread pointer points to a 2-word TCB
followed by the executable's TLS segment. The first word is always
the DTV pointer. Used on ARM. The thread pointer must be aligned to
the TLS segment's alignment, possibly creating alignment padding.
* Variant 2. This format predates variant 1 and is also documented in
Drepper's TLS spec. It allocates the executable's TLS segment before
the thread pointer, apparently for backwards-compatibility. It's
used on x86 and SPARC.
Factor out an lld::elf::getTlsTpOffset() function for use in a
follow-up patch for Android. The TcbSize/TlsTpOffset fields are only used
in getTlsTpOffset, so replace them with a switch on Config->EMachine.
Reviewers: espindola, ruiu, PkmX, jrtc27
Reviewed By: ruiu, PkmX, jrtc27
Subscribers: jyknight, emaste, sdardis, nemanjai, javed.absar, arichardson, kristof.beyls, kbarton, fedor.sergeev, atanasyan, PkmX, jsji, llvm-commits
Differential Revision: https://reviews.llvm.org/D53905
llvm-svn: 345775
Recommitting https://reviews.llvm.org/rL344544 after fixing undefined behavior
from left-shifting a negative value. Original commit message:
This support is slightly different then the X86_64 implementation in that calls
to __morestack don't need to get rewritten to calls to __moresatck_non_split
when a split-stack caller calls a non-split-stack callee. Instead the size of
the stack frame requested by the caller is adjusted prior to the call to
__morestack. The size the stack-frame will be adjusted by is tune-able through a
new --split-stack-adjust-size option.
llvm-svn: 344622
This reverts commit https://reviews.llvm.org/rL344544, which causes failures on
a undefined behaviour sanitizer bot -->
lld/ELF/Arch/PPC64.cpp:849:35: runtime error: left shift of negative value -1
llvm-svn: 344551
This support is slightly different then the X86_64 implementation in that calls
to __morestack don't need to get rewritten to calls to __moresatck_non_split
when a split-stack caller calls a non-split-stack callee. Instead the size of
the stack frame requested by the caller is adjusted prior to the call to
__morestack. The size the stack-frame will be adjusted by is tune-able through a
new --split-stack-adjust-size option.
Differential Revision: https://reviews.llvm.org/D52099
llvm-svn: 344544
Previously, we cast a pointer to Elf{32,64}_Chdr like this
auto *Hdr = reinterpret_cast<const ELF64_Chdr>(Ptr);
and read from its members like this
read32(&Hdr->ch_size);
I was thinking that this does not violate alignment requirement,
since &Hdr->ch_size doesn't really access memory, but seems like
it is a violation in terms of C++ spec (?)
In this patch, I use a different struct that allows unaligned access.
llvm-svn: 344083
Previously, we uncompress all compressed sections before doing anything.
That works, and that is conceptually simple, but that could results in
a waste of CPU time and memory if uncompressed sections are then
discarded or just copied to the output buffer.
In particular, if .debug_gnu_pub{names,types} are compressed and if no
-gdb-index option is given, we wasted CPU and memory because we
uncompress them into newly allocated bufers and then memcpy the buffers
to the output buffer. That temporary buffer was redundant.
This patch changes how to uncompress sections. Now, compressed sections
are uncompressed lazily. To do that, `Data` member of `InputSectionBase`
is now hidden from outside, and `data()` accessor automatically expands
an compressed buffer if necessary.
If no one calls `data()`, then `writeTo()` directly uncompresses
compressed data into the output buffer. That eliminates the redundant
memory allocation and redundant memcpy.
This patch significantly reduces memory consumption (20 GiB max RSS to
15 Gib) for an executable whose .debug_gnu_pub{names,types} are in total
5 GiB in an uncompressed form.
Differential Revision: https://reviews.llvm.org/D52917
llvm-svn: 343979
The GOT is referenced through the symbol _GLOBAL_OFFSET_TABLE_ .
The relocation added calculates the offset into the global offset table for
the entry of a symbol. In order to get the correct TargetVA I needed to
create an new relocation expression, HEXAGON_GOT. It does
Sym.getGotVA() - In.GotPlt->getVA().
Differential Revision: https://reviews.llvm.org/D52744
llvm-svn: 343784
Summary: The convenience wrapper in STLExtras is available since rL342102.
Reviewers: ruiu, espindola
Subscribers: emaste, arichardson, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D52569
llvm-svn: 343146
Previously, if you invoke lld's `main` more than once in the same process,
the second invocation could fail or produce a wrong result due to a stale
pointer values of the previous run.
Differential Revision: https://reviews.llvm.org/D52506
llvm-svn: 343009
The PPC64 elf V2 abi defines 2 entry points for a function. There are a few
places we need to calculate the offset from the global entry to the local entry
and how this is done is not straight forward. This patch adds a helper function
mostly for documentation purposes, explaining how the 2 entry points differ and
why we choose one over the other, as well as documenting how the offsets are
encoded into a functions st_other field.
Differential Revision: https://reviews.llvm.org/D52231
llvm-svn: 342603
section will not have an input file. Don't crash under those circumstances.
Neither clang nor llvm-mc generates R_X86_64_PC32 relocations due to
https://reviews.llvm.org/D43383, which makes it hard to write a test case.
However, gcc does generate such relocations. I want to get a fix in now,
but will figure out a way to actually exercise this code path as soon
as I can.
llvm-svn: 341408