Commit Graph

629 Commits

Author SHA1 Message Date
Daniil Kovalev
cca9115b1c [lld][AArch64][ELF][PAC] Support AUTH relocations and AUTH ELF marking (#72714)
This patch adds lld support for:

- Dynamic R_AARCH64_AUTH_* relocations (without including RELR compressed AUTH
relocations) as described here:
https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#auth-variant-dynamic-relocations

- .note.AARCH64-PAUTH-ABI-tag section as defined here
https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#elf-marking

Depends on #72713 and #85231

---------

Co-authored-by: Peter Collingbourne <peter@pcc.me.uk>
Co-authored-by: Fangrui Song <i@maskray.me>
2024-04-04 12:38:09 +03:00
Fangrui Song
c335accb07 [ELF] --pack-dyn-relocs=android+relr: place IRELATIVE in .rela.plt (#86751)
Current Bionic processes relocations in this order:

* DT_ANDROID_REL[A]
* DT_RELR
* DT_REL[A]
* DT_JMPREL

If an IRELATIVE relocation is in DT_ANDROID_REL[A], it would read
unrelocated (incorrect) global variables associated with RELR when
--pack-dyn-relocs=android+relr is enabled. Work around this by placing
IRELATIVE in .rel[a].plt (DT_JMPREL).

Link: https://r.android.com/3014185
2024-03-27 09:47:16 -07:00
Fangrui Song
18a49f03aa [ELF] Merge relaIplt into relaDyn
`relaIplt` was added so that IRELATIVE relocations are placed at the end
of .rela.dyn (since https://reviews.llvm.org/D65651) or .rela.plt
(--pack-dyn-relocs=android[+relr]). Unfortunately, handling `relaIplt`
requires special cases all over the code base. We can extend
partitionRels/computeRels to partition both RELATIVE and IRELATIVE
relocations, rendering `relaIplt` unneeded.

The change allows IRELATIVE relocations in the DT_ANDROID_REL[A] table
(untested?!), which may be processed before other types of relocations.
This seems acceptable for Bionic's DEFINE_IFUNC_FOR use cases.

In addition, this change simplies changing .rel[a].dyn to a compact
relocation format (CREL).

SHF_INFO_LINK is removed from .rel[a].dyn with IRELATIVE relocations.
(See https://reviews.llvm.org/D89828).
2024-03-24 14:07:09 -07:00
Paul Kirth
dfe4ca9b7f [RISCV][lld] Set the type of TLSDESC relocation's referenced local symbol to STT_NOTYPE
When adding fixups for RISCV_TLSDESC_ADD_LO and RISCV_TLSDESC_LOAD_LO,
the local label added for RISCV TLSDESC relocations have STT_TLS set,
which is incorrect. Instead, these labels should have `STT_NOTYPE`.

This patch stops adding such fixups and avoid setting the STT_TLS on
these symbols. Failing to do so can cause LLD to emit an error `has an
STT_TLS symbol but doesn't have an SHF_TLS section`. We additionally,
adjust how LLD services these relocations to avoid errors with
incompatible relocation and symbol types.

Reviewers: topperc, MaskRay

Reviewed By: MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/85817
2024-03-22 12:27:41 -07:00
Ulrich Weigand
6f907733e6 [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739)
With the new SystemZ port we noticed that -pie executables generated
from files containing R_390_TLS_IEENT relocations will have unnecessary
relocations in their GOT:

                        9e8d8: R_390_TLS_TPOFF  *ABS*+0x18

This is caused by the config->isPic conditon in addTpOffsetGotEntry:

 static void addTpOffsetGotEntry(Symbol &sym) {
   in.got->addEntry(sym);
   uint64_t off = sym.getGotOffset();
   if (!sym.isPreemptible && !config->isPic) {
     in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
     return;
   }

It is correct that we need to retain a TPOFF relocation if the target
symbol is preemptible or if we're building a shared library. But when
building a -pie executable, those values are fixed at link time and
there's no need for any remaining dynamic relocation.

Note that the equivalent MIPS-specific code in MipsGotSection::build
checks for config->shared instead of config->isPic; we should use the
same check here. (Note also that on many other platforms we're not even
using addTpOffsetGotEntry in this case as an IE->LE relaxation is
applied before; we don't have this type of relaxation on SystemZ.)
2024-02-14 18:26:38 +01:00
Ulrich Weigand
fe3406e349 [lld] Add target support for SystemZ (s390x) (#75643)
This patch adds full support for linking SystemZ (ELF s390x) object
files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.

In addition to new platform code and the obvious changes, there were a
few additional changes to common code:

- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
to use a platform-specific name since nothing in the definition of these
relocs is actually platform-specific; it is well possible that other
platforms will need the same.

- A couple of tweaks to TLS relocation handling, as the particular
semantics of the s390x versions differ slightly. See comments in the
code.

This was tested by building and testing >1500 Fedora packages, with only
a handful of failures; as these also have issues when building with LLD
on other architectures, they seem unrelated.

Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
2024-02-13 11:29:21 +01:00
Fangrui Song
1117fdd7c1 [ELF] Implement R_RISCV_TLSDESC for RISC-V
Support
R_RISCV_TLSDESC_HI20/R_RISCV_TLSDESC_LOAD_LO12/R_RISCV_TLSDESC_ADD_LO12/R_RISCV_TLSDESC_CALL.
LOAD_LO12/ADD_LO12/CALL relocations reference a label at the HI20
location, which requires special handling. We save the value of HI20 to
be reused. Two interleaved TLSDESC code sequences, which compilers do
not generate, are unsupported.

For -no-pie/-pie links, TLSDESC to initial-exec or local-exec
optimizations are eligible. Implement the relevant hooks
(R_RELAX_TLS_GD_TO_LE, R_RELAX_TLS_GD_TO_IE): the first two instructions
are converted to NOP while the latter two are converted to a GOT load or
a lui+addi.

The first two instructions, which would be converted to NOP, are removed
instead in the presence of relaxation. Relaxation is eligible as long as
the R_RISCV_TLSDESC_HI20 relocation has a pairing R_RISCV_RELAX,
regardless of whether the following instructions have a R_RISCV_RELAX.
In addition, for the TLSDESC to LE optimization (`lui a0,<hi20>; addi a0,a0,<lo12>`),
`lui` can be removed (i.e. use the short form) if hi20 is 0.

```
// TLSDESC to LE/IE optimization
.Ltlsdesc_hi2:
  auipc a4, %tlsdesc_hi(c)                      # if relax: remove; otherwise, NOP
  load  a5, %tlsdesc_load_lo(.Ltlsdesc_hi2)(a4) # if relax: remove; otherwise, NOP
  addi  a0, a4, %tlsdesc_add_lo(.Ltlsdesc_hi2)  # if LE && !hi20 {if relax: remove; otherwise, NOP}
  jalr  t0, 0(a5), %tlsdesc_call(.Ltlsdesc_hi2)
  add   a0, a0, tp
```

The implementation carefully ensures that an instruction unrelated to
the current TLSDESC code sequence, if immediately follows a removable
instruction (HI20 or LOAD_LO12 OR (LE-specific) ADD_LO12), is not
converted to NOP.

* `riscv64-tlsdesc.s` is inspired by `i386-tlsdesc-gd.s` (https://reviews.llvm.org/D112582).
* `riscv64-tlsdesc-relax.s` tests linker relaxation.
* `riscv-tlsdesc-gd-mixed.s` is inspired by `x86-64-tlsdesc-gd-mixed.s` (https://reviews.llvm.org/D116900).

Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373

Reviewed By: ilovepi

Pull Request: https://github.com/llvm/llvm-project/pull/79239
2024-01-25 13:42:31 -08:00
Fangrui Song
849951f875 [ELF] Fix terminology: TLS optimizations instead of TLS relaxation. NFC 2024-01-25 10:17:36 -08:00
Fangrui Song
43b13341fb [ELF] Add internal InputFile (#78944)
Based on https://reviews.llvm.org/D45375 . Introduce a new InputFile
kind `InternalKind`, use it for

* `ctx.internalFile`: for linker-defined symbols and some synthesized
`Undefined`
* `createInternalFile`: for symbol assignments and --defsym

I picked "internal" instead of "synthetic" to avoid confusion with
SyntheticSection.

Currently a symbol's file is one of: nullptr, ObjKind, SharedKind,
BitcodeKind, BinaryKind. Now it's non-null (I plan to add an
`assert(file)` to Symbol::Symbol and change `toString(const InputFile
*)`
separately).

Debugging and error reporting gets improved. The immediate user-facing
difference is more descriptive "File" column in the --cref output. This
patch may unlock further simplification.

Currently each symbol assignment gets its own
`createInternalFile(cmd->location)`. Two symbol assignments in a linker
script do not share the same file. Making the file the same would be
nice, but would require non trivial code.
2024-01-22 09:09:46 -08:00
Arthur Eubanks
1d4c0092a8 [lld/ELF] Hint if R_X86_64_PC32 overflows and references a SHF_X86_64_LARGE section (#73045)
Makes it clearer what the issue is when hand-written assembly doesn't
follow medium code model assumptions in a medium code model build.

Alternative to #71248 by only hinting on an overflow.
2024-01-17 15:38:59 -08:00
Eleanor Bonnici
d21fb06a6e [lld][ELF] Allow Arm PC-relative relocations in PIC links (#77304)
The relocations that map to R_ARM_PCA are equivalent to R_PC. They are
PC-relative and safe to use in shared libraries, but have a different
relocation code as they are evaluated differently. Now that LLVM may
generate these relocations in object files, they may occur in
shared libraries or position-independent executables.
2024-01-11 16:49:14 +00:00
Mitch Phillips
b399c84073 [NFC] [lld] [MTE] Rename MemtagDescriptors to MemtagGlobalDescriptors (#77300)
Requested in https://github.com/llvm/llvm-project/pull/77078, I agree
that we may as well be unambiguous.
2024-01-09 10:06:21 +01:00
Fangrui Song
3fa17954de [ELF] Support R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128 in SHF_ALLOC sections (#77261)
Complement #72610 (non-SHF_ALLOC sections). GCC-generated
.gcc_exception_table has the SHF_ALLOC flag and may contain
R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128 relocations.
2024-01-08 20:24:00 -08:00
Fangrui Song
42e4967140 [ELF] Don't create copy relocation/canonical PLT entry for a defined symbol (#75095)
Copy relocations and canonical PLT entries are for symbols defined in a
DSO. Currently we create them even for a `Defined`, possibly leading to
an output that won't work at run-time (e.g. R_X86_64_JUMP_SLOT
referencing a null symbol).
```
% cat a.s
.globl _start, main
.type main, @function
_start: main: ret

.rodata
.quad main
% clang -fuse-ld=lld -pie -nostdlib a.s
% readelf -Wr a.out

Relocation section '.rela.plt' at offset 0x290 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
00000000000033b8  0000000000000007 R_X86_64_JUMP_SLOT                        12b0
```

Report an error instead for the default `-z text` mode. GNU ld reports
an error in `-z text` mode as well.
2023-12-12 10:14:36 -08:00
Fangrui Song
255ea48608 [ELF] Merge verdefIndex into versionId. NFC (#72208)
The two fields are similar.

`versionId` is the Verdef index in the output file. It is set for
`--exclude-id=`, version script patterns, and `sym@ver` symbols.

`verdefIndex` is the Verdef index of a Sharedfile (SharedSymbol or a
copy-relocated Defined), the default value -1 is also used to indicate
that the symbol has not been matched by a version script pattern
(https://reviews.llvm.org/D65716).

It seems confusing to have two fields. Merge them so that we can
allocate one bit for #70130 (suppress --no-allow-shlib-undefined
error in the presence of a DSO definition).
2023-11-16 01:03:52 -08:00
Fangrui Song
e84575449f Revert "[ELF] Merge verdefIndex into versionId. NFC" #72208 (#72484)
Reverts llvm/llvm-project#72208

If a unversioned Defined preempts a versioned DSO definition, the
version ID will not be reset.
2023-11-15 23:14:07 -08:00
Fangrui Song
667ea2ca40 [ELF] Merge verdefIndex into versionId. NFC (#72208)
The two fields are similar.

`versionId` is the Verdef index in the output file. It is set for
version script patterns and `sym@ver` symbols.

`verdefIndex` is the Verdef index of a SharedSymbol. The default value
-1 is also used to indicate that the symbol has not been matched by a
version script pattern (https://reviews.llvm.org/D65716).

It seems confusing to have two fields. Merge them so that we can
allocate one bit for #70130 (suppress --no-allow-shlib-undefined
error in the presence of a DSO definition).
2023-11-14 10:20:21 -08:00
Fangrui Song
b169e7fedd [ELF] Improve undefined symbol message w/ DW_TAG_variable of the enclosing symbol but w/o line number information (#70854)
The undefined symbol message suggests the source line when line number
information is available (see https://reviews.llvm.org/D31481).
When the undefined symbol is from a global variable, we won't get the
line information.
```
extern int undef;
namespace ns {
int *var[] = {
  &undef
};
// DW_TAG_variable(DW_AT_decl_file/DW_AT_decl_line) is available while
// line number information is unavailable.
}

ld.lld: error: undefined symbol: undef
>>> referenced by undef-debug2.cc
>>>               undef-debug2.o:(ns::var)
```

This patch utilizes `getEnclosingSymbol` to locate `var` and find
DW_TAG_variable for `var`:
```
ld.lld: error: undefined symbol: undef
>>> referenced by undef-debug2.cc:3 (/tmp/c/undef-debug2.cc:3)
>>>               undef-debug2.o:(ns::var)
```
2023-11-03 13:53:36 -07:00
Fangrui Song
1981b1b6b9 [ELF] Demote symbols in /DISCARD/ discarded sections to Undefined (#69295)
When an input section is matched by /DISCARD/ in a linker script, GNU ld
reports errors for relocations referencing symbols defined in the section:

    `.aaa' referenced in section `.bbb' of a.o: defined in discarded section `.aaa' of a.o

Implement the error by demoting eligible symbols to `Undefined` and changing
STB_WEAK to STB_GLOBAL. As a side benefit, in relocatable links, relocations
referencing symbols defined relative to /DISCARD/ discarded sections no longer
set symbol/type to zeros.

It's arguable whether a weak reference to a discarded symbol should lead to
errors. GNU ld reports an error and our demoting approach reports an error as
well.

Close #58891

Co-authored-by: Bevin Hansson <bevin.hansson@ericsson.com>
2023-10-17 14:10:52 -07:00
Arthur Eubanks
9d6ec280fc [lld/ELF] Don't relax R_X86_64_(REX_)GOTPCRELX when offset is too far
For each R_X86_64_(REX_)GOTPCRELX relocation, check that the offset to the symbol is representable with 2^32 signed offset. If not, add a GOT entry for it and set its expr to R_GOT_PC so that we emit the GOT load instead of the relaxed lea. Do this in finalizeAddressDependentContent() where we iteratively attempt this (e.g. RISCV uses this for relaxation, ARM uses this to insert thunks).

Decided not to do the opposite of inserting GOT entries initially and removing them when relaxable because removing GOT entries isn't simple.

One drawback of this approach is that if we see any GOTPCRELX relocation, we'll create an empty .got even if it's not required in the end.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D157020
2023-10-04 13:03:56 -07:00
Mitch Phillips
ca35a19aca [lld] Synthesize metadata for MTE globals
As per the ABI at
https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst,
this patch interprets the SHT_AARCH64_MEMTAG_GLOBALS_STATIC section,
which contains R_NONE relocations to tagged globals, and emits a
SHT_AARCH64_MEMTAG_GLOBALS_DYNAMIC section, with the correct
DT_AARCH64_MEMTAG_GLOBALS and DT_AARCH64_MEMTAG_GLOBALSSZ dynamic
entries. This section describes, in a uleb-encoded stream, global memory
ranges that should be tagged with MTE.

We are also out of bits to spare in the LLD Symbol class. As a result,
I've reused the 'needsTocRestore' bit, which is a PPC64 only feature.
Now, it's also used for 'isTagged' on AArch64.

An entry in SHT_AARCH64_MEMTAG_GLOBALS_STATIC is practically a guarantee
from an objfile that all references to the linked symbol are through the
GOT, and meet correct alignment requirements. As a result, we go through
all symbols and make sure that, for all symbols $SYM, all object files
that reference $SYM also have a SHT_AARCH64_MEMTAG_GLOBALS_STATIC entry
for $SYM. If this isn't the case, we demote the symbol to being
untagged. Symbols that are imported from other DSOs should always be
fine, as they're GOT-referenced (and thus the GOT entry either has the
correct tag or not, depending on whether it's tagged in the defining DSO
or not).

Additionally hand-tested by building {libc, libm, libc++, libm, and libnetd}
on Android with some experimental MTE globals support in the
linker/libc.

Reviewed By: MaskRay, peter.smith

Differential Revision: https://reviews.llvm.org/D152921
2023-07-31 17:07:42 +02:00
WANG Xuerui
6084ee7420 [lld][ELF] Support LoongArch
This adds support for the LoongArch ELF psABI v2.00 [1] relocation
model to LLD. The deprecated stack-machine-based psABI v1 relocs are not
supported.

The code is tested by successfully bootstrapping a Gentoo/LoongArch
stage3, complete with common GNU userland tools and both the LLVM and
GNU toolchains (GNU toolchain is present only for building glibc,
LLVM+Clang+LLD are used for the rest). Large programs like QEMU are
tested to work as well.

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html

Reviewed By: MaskRay, SixWeining

Differential Revision: https://reviews.llvm.org/D138135
2023-07-25 17:06:07 +08:00
Fangrui Song
8d85c96e0e [lld] StringRef::{starts,ends}with => {starts,ends}_with. NFC
The latter form is now preferred to be similar to C++20 starts_with.
This replacement also removes one function call when startswith is not inlined.
2023-06-05 14:36:19 -07:00
Alexey Lapshin
85c2768ce9 [Support][Parallel] Initialize threadIndex and add assertion checking its usage.
That patch adds a check for threadIndex being used with only threads
created by ThreadPoolExecutor. This helps catch two types of errors:

1. If a thread is created not by ThreadPoolExecutor its index may clash
   with the index of another thread. Using threadIndex, in that case, may
   lead to a data race.

2. Index of the main thread(threadIndex == 0) currently clashes with
   the index of thread0 in ThreadPoolExecutor threads. That may lead
   to a data race if main thread and thread0 are executed concurrently.

This patch allows execution tasks on the main thread only in case
parallel::strategy.ThreadsRequested == 1. In all other cases,
assertions check that threadIndex != UINT_MAX(i.e. that task
is executed on a thread created by ThreadPoolExecutor).

Differential Revision: https://reviews.llvm.org/D148916
2023-05-02 18:44:15 +02:00
Alexey Lapshin
fea8c07356 [Support][Parallel] Add sequential mode to TaskGroup::spawn().
This patch allows to specify that some part of tasks should be
done in sequential order. It makes it possible to not use
condition operator for separating sequential tasks:

TaskGroup tg;
for () {
  if(condition)      ==>   tg.spawn([](){fn();}, condition)
    fn();
  else
    tg.spawn([](){fn();});
}

It also prevents execution on main thread. Which allows adding
checks for getThreadIndex() function discussed in D142318.

The patch also replaces std::stack with std::deque in the
ThreadPoolExecutor to have natural execution order in case
(parallel::strategy.ThreadsRequested == 1).

Differential Revision: https://reviews.llvm.org/D148728
2023-04-26 13:52:26 +02:00
Fangrui Song
b30b1f173c [ELF] Add single quotes around out of range errors
to match the convention we use for other diagnostics.
2023-03-03 12:48:16 -08:00
Fangrui Song
ffa1118330 [ELF] Mention section name for STT_SECTION in reportRangeError()
D73518 mentioned non-STT_SECTION symbol names. This patch extends the code to
handle STT_SECTION symbols, where we report the section name.
This change helps at least the following cases with very little code.

* Whether a out-of-range relocation is due to code or data.
* For a relocation in .debug_info, which referenced `.debug_*` section (due to DWARF32 limitation) causes the problem.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D145199
2023-03-03 12:35:05 -08:00
Fangrui Song
08c915fa76 [ELF] -z notext: avoid dynamic relocations in .eh_frame
Fix https://github.com/llvm/llvm-project/issues/60392

```
// a.cc
void raise() { throw 42; }
bool foo() {
  try { raise(); } catch (int) { return true; }
  return false;
}
int main() { foo(); }
```

```
clang++ --target=x86_64-linux-gnu -fno-pic -mcmodel=large -no-pie -fuse-ld=lld -z notext a.cc -o a && ./a
clang++ --target=aarch64-linux-gnu -fno-pic -no-pie -fuse-ld=lld -Wl,--dynamic-linker=/usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1 -Wl,-rpath=/usr/aarch64-linux-gnu/lib -z notext a.cc -o a && ./a
```
Both commands fail because we produce a dynamic relocation for
R_X86_64_64/R_AARCH64_ABS64 in .eh_frame which will be adjusted to a wrong
offset by `SectionBase::getOffset` after D122459.

Since GNU ld uses a canonical PLT entry instead of a dynamic relocation for
.eh_frame, we follow suit as well to avoid the issue.

Mips has an ABI issue (https://github.com/llvm/llvm-project/issues/5837) and we
don't implement GNU ld's DW_EH_PE_absptr conversion. mips64-eh-abs-reloc.s wants
a dynamic relocation, so keep the original behavior for EM_MIPS.

Differential Revision: https://reviews.llvm.org/D143136
2023-02-03 10:27:33 -08:00
Shivam Gupta
e8e0e5f3ee [NFC] Small indentation fix in lld/ELF/Relocations.cpp 2023-01-22 18:40:58 +05:30
Guillaume Chatelet
08e2a76381 [lld][NFC] rename ELF alignment into addralign 2022-12-01 16:20:12 +00:00
Fangrui Song
4191fda69c [ELF] Change most llvm::Optional to std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 19:19:15 -08:00
Fangrui Song
6bca3ad379 [ELF] Simplify postScanRelocations with in.got 2022-11-21 06:04:18 +00:00
Fangrui Song
c3c9e45312 [ELF] Add InputSectionBase::{addRelocs,relocs} and GotSection::addConstant to add/access relocations
to prepare for changing `relocations` from a SmallVector to a pointer.

Also change the `isec` parameter in `addAddendOnlyRelocIfNonPreemptible` to `GotSection &`.
2022-11-21 04:12:03 +00:00
Fangrui Song
2bf5d86422 [ELF] Change rawData to content() and data() to contentMaybeDecompress()
Clarify data() which may trigger decompression and make it feasible to refactor
the member variable rawData.
2022-11-20 22:43:22 +00:00
Fangrui Song
4eda362539 [ELF] Inline computeAddend. NFC 2022-10-17 13:09:39 -07:00
Fangrui Song
d8af31eced [ELF] Move ELFT-agnostic relocation code to processAux 2022-10-17 11:57:17 -07:00
Fangrui Song
874fc6bd78 [ELF] Move ELFT-agnostic relocation code to processAux. NFC 2022-10-17 11:44:28 -07:00
Fangrui Song
dc884f0f43 [ELF] Remove RelocationScanner::target. NFC 2022-10-16 12:39:37 -07:00
Fangrui Song
e983109876 [ELF] Move R_TPREL/R_TPREL_NEG check into handleTlsRelocation 2022-10-16 12:19:58 -07:00
Fangrui Song
2b153088be [ELF] Set DF_STATIC_TLS for AArch64/PPC32/PPC64 2022-10-16 12:08:08 -07:00
Fangrui Song
9c626d4a0d [ELF] Remove symtab indirection. NFC
Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr indirection.
2022-10-01 14:46:49 -07:00
Fangrui Song
34fa860048 [ELF] Remove ctx indirection. NFC
Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr
indirection. We can move other global variables into ctx without
indirection concern. In the long term we may consider passing Ctx
as a parameter to various functions and eliminate global state as
much as possible and then remove `Ctx::reset`.
2022-10-01 12:06:33 -07:00
Fangrui Song
a623a4c8b4 [ELF] Remove elf::config indirection. NFC
`config` has 1000+ uses so we try to avoid changing `config->foo`. Define a
wrapper with LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr
indirection.

My x86-64 lld executable is 11+KiB smaller.
2022-10-01 11:39:45 -07:00
Fangrui Song
ab11ed5249 [ELF] Reset verdefIndex for Defined preempting SharedSymbol
to avoid spurious "attempt to reassign symbol '...'" warning after
7a58dd1046
2022-09-29 21:26:53 -07:00
Fangrui Song
e3ecc6a912 [ELF] Make symAux[0] a sentinel
And default auxIdx to 0.
2022-09-29 00:50:19 -07:00
Fangrui Song
7a58dd1046 [ELF] Refactor Symbol initialization and overwriting
Symbol::replace intends to overwrite a few fields (mostly Elf{32,64}_Sym
fields), but the implementation copies all fields then restores some old fields.
This is error-prone and wasteful. Add Symbol::overwrite to copy just the
needed fields and add other overwrite member functions to copy the extra
fields.
2022-09-28 13:11:31 -07:00
Fangrui Song
62e7c5b4e2 Revert "[ELF] --pack-dyn-relocs=android: scan relocation serially after D133003"
This reverts commit bce6416775.

The workaround is unneeded after 7dac9f4e48.
2022-09-28 07:06:49 +00:00
Fangrui Song
bce6416775 [ELF] --pack-dyn-relocs=android: scan relocation serially after D133003
https://reviews.llvm.org/D133003#3806508 can reproduce a non-determinism with
--threads=4. Making the config serial fixes non-determinism (by running the link
many times and compare output).
2022-09-21 11:43:13 -07:00
Martin Storsjö
e280940bfb [Support] Access threadIndex via a wrapper function
On Unix platforms, this wrapper function is inline, so it should
expand to the same direct access to the thread local variable. On
Windows, it's a non-inline function within Parallel.cpp, allowing
making the thread_local variable static.

Windows Native TLS doesn't support direct access to thread local
variables in a different DLL, and GCC/binutils on Windows occasionally
has problems with non-static thread local variables too.

This fixes mingw dylib builds with native TLS after
e6aebff674.

At the same time, move the whole thread local variable within
    #if LLVM_ENABLE_THREADS
to fix builds without threading support.

Differential Revision: https://reviews.llvm.org/D133759
2022-09-14 09:19:27 +03:00
Fangrui Song
e6aebff674 [ELF] Parallelize relocation scanning
* Change `Symbol::flags` to a `std::atomic<uint16_t>`
* Add `llvm::parallel::threadIndex` as a thread-local non-negative integer
* Add `relocsVec` to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex
* Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output.

MIPS and PPC64 use global states for relocation scanning. Keep serial scanning.

Speed-up with mimalloc and --threads=8 on an Intel Skylake machine:

* clang (Release): 1.27x as fast
* clang (Debug): 1.06x as fast
* chrome (default): 1.05x as fast
* scylladb (default): 1.04x as fast

Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64):

* clang (Release): 1.31x as fast
* scylladb (default): 1.06x as fast

Reviewed By: andrewng

Differential Revision: https://reviews.llvm.org/D133003
2022-09-12 12:56:35 -07:00