Commit Graph

485 Commits

Author SHA1 Message Date
Fangrui Song
843c2b2303 [ELF] Error for undefined foo@v1
If an object file has an undefined foo@v1, we emit a dynamic symbol foo.
This is incorrect if at runtime a shared object provides the non-default version foo@v1
(the undefined foo may bind to foo@@v2, for example).

GNU ld issues an error for this case, even if foo@v1 is undefined weak
(https://sourceware.org/bugzilla/show_bug.cgi?id=3351). This behavior makes
sense because to represent an undefined foo@v1, we have to construct a Verneed
entry. However, without knowing the defining filename, we cannot construct a
Verneed entry (Verneed::vn_file is unavailable).

This patch implements the error.

Depends on D92258

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D92260
2020-12-01 08:59:54 -08:00
Fangrui Song
50564ca075 [ELF] Rename adjustRelaxExpr to adjustTlsExpr and delete the unused data parameter. NFC
Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D91995
2020-11-25 09:00:55 -08:00
Fangrui Song
572d18397c [ELF] Add TargetInfo::adjustGotPcExpr for R_GOT_PC relaxations. NFC
With this change, `TargetInfo::adjustRelaxExpr` is only related to TLS
relaxations and a subsequent clean-up can delete the `data` parameter.

Differential Revision: https://reviews.llvm.org/D92079
2020-11-25 08:43:26 -08:00
serge-sans-paille
1e70ec10eb [lld] Provide a hook to customize undefined symbols error handling
This is a follow up to https://reviews.llvm.org/D87758, implementing the missing
symbol part, as done by binutils.

Differential Revision: https://reviews.llvm.org/D89687
2020-11-09 13:28:48 +01:00
Fangrui Song
9267caebfa [ELF] Don't error on R_PPC64_REL24/R_PPC64_REL24_NOTOC referencing __tls_get_addr for missing R_PPC64_TLSGD/R_PPC64_TLSLD
This partially reverts D85994.

In glibc, elf/dl-sym.c calls the raw `__tls_get_addr` by specifying the
tls_index parameter. Such a call does not have a pairing R_PPC64_TLSGD/R_PPC64_TLSLD.
This is legitimate. Since we cannot distinguish the benign case from cases due
to toolchain issues, we have to be permissive.

Acked by Stefan Pintilie
2020-10-23 10:38:07 -07:00
Stefan Pintilie
c6561ccfd9 [PowerPC][LLD] Support for PC Relative TLS for Local Dynamic
Add support to LLD for PC Relative Thread Local Storage for Local Dynamic.
This patch adds support for two relocations: R_PPC64_GOT_TLSLD_PCREL34 and
R_PPC64_DTPREL34.

The Local Dynamic code is:
```
pla r3, x@got@tlsld@pcrel        R_PPC64_GOT_TLSLD_PCREL34
bl __tls_get_addr@notoc(x@tlsld) R_PPC64_TLSLD
                                 R_PPC64_REL24_NOTOC
...
paddi r9, r3, x@dtprel           R_PPC64_DTPREL34
```

After relaxation to Local Exec:
```
paddi r3, r13, 0x1000
nop
...
paddi r9, r3, x@dtprel          R_PPC64_DTPREL34
```

Reviewed By: NeHuang, sfertile

Differential Revision: https://reviews.llvm.org/D87504
2020-10-23 08:23:56 -05:00
Fangrui Song
88f2fe5cad Raland D87318 [LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic
Add Thread Local Storage support for the 34 bit relocation R_PPC64_GOT_TLSGD_PCREL34 used in General Dynamic.

The compiler will produce code that looks like:
```
pla r3, x@got@tlsgd@pcrel            R_PPC64_GOT_TLSGD_PCREL34
bl __tls_get_addr@notoc(x@tlsgd)     R_PPC64_TLSGD
                                     R_PPC64_REL24_NOTOC
```
LLD should be able to correctly compute the relocation for  R_PPC64_GOT_TLSGD_PCREL34 as well as do the following two relaxations where possible:
General Dynamic to Local Exec:
```
paddi r3, r13, x@tprel
nop
```
and General Dynamic to Initial Exec:
```
pld r3, x@got@tprel@pcrel
add r3, r3, r13
```
Note:
This patch adds support for the PC Relative (no TOC) version of General Dynamic on top of the existing support for the TOC version of General Dynamic.
The ABI does not provide any way to tell by looking only at the relocation `R_PPC64_TLSGD` when it is being used in a TOC instruction sequence or and when it is being used in a no TOC sequence. The TOC sequence should always be 4 byte aligned. This patch adds one to the offset of the relocation when it is being used in a no TOC sequence. In this way LLD can tell by looking at the alignment of the offset of `R_PPC64_TLSGD` whether or not it is being used as part of a TOC or no TOC sequence.

Reviewed By: NeHuang, sfertile, MaskRay

Differential Revision: https://reviews.llvm.org/D87318
2020-10-01 12:36:33 -07:00
Stefan Pintilie
5f3e565f59 Revert "[LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic"
This reverts commit 79122868f9.
2020-10-01 13:28:35 -05:00
Stefan Pintilie
79122868f9 [LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic
Add Thread Local Storage support for the 34 bit relocation R_PPC64_GOT_TLSGD_PCREL34 used in General Dynamic.

The compiler will produce code that looks like:
```
pla r3, x@got@tlsgd@pcrel            R_PPC64_GOT_TLSGD_PCREL34
bl __tls_get_addr@notoc(x@tlsgd)     R_PPC64_TLSGD
                                     R_PPC64_REL24_NOTOC
```
LLD should be able to correctly compute the relocation for  R_PPC64_GOT_TLSGD_PCREL34 as well as do the following two relaxations where possible:
General Dynamic to Local Exec:
```
paddi r3, r13, x@tprel
nop
```
and General Dynamic to Initial Exec:
```
pld r3, x@got@tprel@pcrel
add r3, r3, r13
```
Note:
This patch adds support for the PC Relative (no TOC) version of General Dynamic on top of the existing support for the TOC version of General Dynamic.
The ABI does not provide any way to tell by looking only at the relocation `R_PPC64_TLSGD` when it is being used in a TOC instruction sequence or and when it is being used in a no TOC sequence. The TOC sequence should always be 4 byte aligned. This patch adds one to the offset of the relocation when it is being used in a no TOC sequence. In this way LLD can tell by looking at the alignment of the offset of `R_PPC64_TLSGD` whether or not it is being used as part of a TOC or no TOC sequence.

Reviewed By: NeHuang, sfertile, MaskRay

Differential Revision: https://reviews.llvm.org/D87318
2020-10-01 13:00:37 -05:00
Georgii Rymar
4845531fa8 [lib/Object] - Refine interface of ELFFile<ELFT>. NFCI.
`ELFFile<ELFT>` has many methods that take pointers,
though they assume that arguments are never null and
hence could take references instead.

This patch performs such clean-up.

Differential revision: https://reviews.llvm.org/D87385
2020-09-15 11:38:31 +03:00
Fangrui Song
94921e9f8a [ELF] Define a reportRangeError() overload for thunks and tidy up recent PPC64 thunk range errors
Prefer `errorOrWarn` to `fatal` for recoverable errors and graceful degradation
when --noinhibit-exec is specified.

Mention the destination symbol, otherwise the diagnostic is not really actionable.
Two errors are not tested but the patch does not intend to add the coverage.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D87486
2020-09-14 09:55:59 -07:00
Stefan Pintilie
02e02f5398 [LLD][PowerPC] Add check in LLD to produce an error for missing TLSGD/TLSLD
The function `__tls_get_addr` is used to get the address of an object that is Thread Local Storage.
It needs to have two relocations on it.
One relocation is for the function call itself and it is either R_PPC64_REL24 or R_PPC64_REL24_NOTOC.
The other is R_PPC64_TLSGD or R_PPC64_TLSLD for the symbol that is having its address computed.

In the early days of the transition from the ELFv1 ABI that is used for big endian PowerPC Linux distributions to the ELFv2 ABI that is used for little endian PowerPC Linux distributions, there was some ambiguity in the specification of the relocations for TLS. The GNU linker has implemented support for correct handling of calls to __tls_get_addr with a missing relocation. Unfortunately, we didn't notice that the IBM XL compiler did not handle TLS according to the updated ABI until we tried linking XL compiled libraries with LLD. As a result, there is a lot of code out there in various libraries compiled with XL that have this problem.

This patch adds a new error check in LLD that makes sure calls to `__tls_get_addr` are not missing the TLSGD/TLSLD relocation.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D85994
2020-08-21 12:56:12 -05:00
Nemanja Ivanovic
cddb0dbcef [LLD][PowerPC] Implement GOT to PC-Rel relaxation
This patch implements the handling for the R_PPC64_PCREL_OPT relocation as well
as the GOT relocation for the associated R_PPC64_GOT_PCREL34 relocation.

On Power10 targets with PC-Relative addressing, the linker can relax
GOT-relative accesses to PC-Relative under some conditions. Since the sequence
consists of a prefixed load, followed by a non-prefixed access (load or store),
the linker needs to replace the first instruction (as the replacement
instruction will be prefixed). The compiler communicates to the linker that
this optimization is safe by placing the two aforementioned relocations on the
GOT load (of the address).
The linker then does two things:

- Convert the load from the got into a PC-Relative add to compute the address
  relative to the PC
- Find the instruction referred to by the second relocation (R_PPC64_PCREL_OPT)
  and replace the first with the PC-Relative version of it

It is important to synchronize the mapping from legacy memory instructions to
their PC-Relative form. Hence, this patch adds a file to be included by both
the compiler and the linker so they're always in agreement.

Differential revision: https://reviews.llvm.org/D84360
2020-08-17 09:36:09 -05:00
Fangrui Song
169ec2d6b0 [ELF] Rename canRelax to toExecRelax. NFC
In the absence of TLS relaxation (rewrite of code sequences),
there is still an applicable optimization:

[gd]: General Dynamic: resolve DTPMOD to 1 and/or resolve DTPOFF statically

All the other relaxations are only performed when transiting to
executable (`!config->shared`).
Since [gd] is handled differently, we can fold `!config->shared` into canRelax
and simplify its use sites. Rename the variable to reflect to new semantics.

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D83243
2020-07-08 10:27:31 -07:00
Fangrui Song
c1a5f73a4a [ELF][ARM] Represent R_ARM_LDO32 as R_DTPREL instead of R_ABS
Follow-up to D82899. Note, we need to disable R_DTPREL relaxation
because ARM psABI does not define TLS relaxation.

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D83138
2020-07-06 09:47:53 -07:00
Fangrui Song
07837b8f49 [ELF] Use namespace qualifiers (lld:: or elf::) instead of namespace lld { namespace elf {
Similar to D74882. This reverts much code from commit
bd8cfe65f5 (D68323) and fixes some
problems before D68323.

Sorry for the churn but D68323 was a mistake. Namespace qualifiers avoid
bugs where the definition does not match the declaration from the
header. See
https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions (D74515)

Differential Revision: https://reviews.llvm.org/D79982
2020-05-15 08:49:53 -07:00
Sid Manning
0e6536fd97 [Hexagon] Add R_HEX_GD_PLT_B22/32_PCREL relocations
Extended versions of GD_PLT_B22_PCREL. These surface when -mlong-calls
is used.

Differential Revision: https://reviews.llvm.org/D79191
2020-05-05 11:47:51 -05:00
Fangrui Song
b257d3c8a8 [ELF][PPC64] Suppress toc-indirect to toc-relative relaxation if R_PPC64_TOC16_LO is seen
The current implementation assumes that R_PPC64_TOC16_HA is always followed
by R_PPC64_TOC16_LO_DS. This can break with R_PPC64_TOC16_LO:

  // Load the address of the TOC entry, instead of the value stored at that address
  addis 3, 2, .LC0@tloc@ha  # R_PPC64_TOC16_HA
  addi  3, 3, .LC0@tloc@l   # R_PPC64_TOC16_LO
  blr

which is used by boringssl's util/fipstools/delocate/delocate.go
https://github.com/google/boringssl/blob/master/crypto/fipsmodule/FIPS.md has some documentation.
In short, this tool converts an assembly file to avoid any potential relocations.
The distance to an input .toc is not a constant after linking, so it cannot use an `addis;ld` pair.
Instead, it jumps to a stub which loads the TOC entry address with `addis;addi`.

This patch checks the presence of R_PPC64_TOC16_LO and suppresses
toc-indirect to toc-relative relaxation if R_PPC64_TOC16_LO is seen.
This approach is conservative and loses some relaxation opportunities but is easy to implement.

  addis 3, 2, .LC0@toc@ha  # no relaxation
  addi  3, 3, .LC0@toc@l   # no relaxation
  li    9, 0
  addis 4, 2, .LC0@toc@ha  # can relax but suppressed
  ld    4, .LC0@toc@l(4)   # can relax but suppressed

Also note that interleaved R_PPC64_TOC16_HA and R_PPC64_TOC16_LO_DS is
possible and this patch accounts for that.

  addis 3, 2, .LC1@toc@ha  # can relax
  addis 4, 2, .LC2@toc@ha  # can relax
  ld    3, .LC1@toc@l(3)   # can relax
  ld    4, .LC2@toc@l(4)   # can relax

Reviewed By: #powerpc, sfertile

Differential Revision: https://reviews.llvm.org/D78431
2020-04-30 09:16:51 -07:00
Sid Manning
c484b3e334 [Hexagon] Fix issue with non-preemptible STT_TLS symbols
A PC-relative relocation referencing a non-preemptible absolute symbol
(due to STT_TLS) is not representable in -pie/-shared mode.

Differential Revision: https://reviews.llvm.org/D77021
2020-04-03 08:55:23 -05:00
Igor Kudrin
b0b1f451ae [LLD][ELF] Follow the common pattern in a message about an undefined vtable symbol.
In most cases, LLD prints its multiline diagnostic messages starting
additional lines with ">>> ". That greatly helps external tools to parse
the output, simplifying combining several lines of the log back into one
message. The patch fixes the only message I found that does not follow
the common pattern.

Differential Revision: https://reviews.llvm.org/D77132
2020-04-02 11:39:03 +07:00
Nico Weber
20eb719f99 lld: Reduce number of references to undefined printed from 10 to 3.
As of a while ago, lld groups all undefined references to a single
symbol in a single diagnostic. Back then, I made it so that we
print up to 10 references to each undefined symbol.

Having used this for a while, I never wished there were more
references, but I sometimes found that this can print a lot of
output. lld prints up to 10 diagnostics by default, and if
each has 10 references (which I've seen in practice), and each
undefined symbol produces 2 (possibly very long) lines of output,
that's over 200 lines of error output.

Let's try it with just 3 references for a while and see how
that feels in practice.

Differential Revision: https://reviews.llvm.org/D77017
2020-03-30 14:31:32 -04:00
Sid Manning
5a5a075c5b [LLD][ELF][Hexagon] Support GDPLT transforms
Hexagon ABI specifies that call x@gdplt is transformed to call __tls_get_addr.

Example:
     call x@gdplt
is changed to
     call __tls_get_addr

When x is an external tls variable.

Differential Revision: https://reviews.llvm.org/D74443
2020-03-13 11:02:11 -05:00
Fangrui Song
315f8a55f5 [ELF][PPC32] Don't report "relocation refers to a discarded section" for .got2
Similar to D63182 [ELF][PPC64] Don't report "relocation refers to a discarded section" for .toc

Reviewed By: Bdragon28

Differential Revision: https://reviews.llvm.org/D75419
2020-03-01 19:54:40 -08:00
Fangrui Song
00925aadb3 [ELF][PPC32] Fix canonical PLTs when the order does not match the PLT order
Reviewed By: Bdragon28

Differential Revision: https://reviews.llvm.org/D75394
2020-02-28 22:23:14 -08:00
Fangrui Song
4a4ce14eb2 [ELF] Mention symbol name in reportRangeError()
For an out-of-range relocation referencing a non-local symbol, report the symbol name and the object file that defines the symbol. As an example:
```
t.o:(function func: .text.func+0x3): relocation R_X86_64_32S out of range: -281474974609120 is not in [-2147483648, 2147483647]
```
=>
```
t.o:(function func: .text.func+0x3): relocation R_X86_64_32S out of range: -281474974609120 is not in [-2147483648, 2147483647]; references func
>>> defined in t1.o
```

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D73518
2020-01-29 09:38:25 -08:00
Benjamin Kramer
adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Fangrui Song
70389be7a0 [ELF][PPC32] Support range extension thunks with addends
* Generalize the code added in D70637 and D70937. We should eventually remove the EM_MIPS special case.
* Handle R_PPC_LOCAL24PC the same way as R_PPC_REL24.

Reviewed By: Bdragon28

Differential Revision: https://reviews.llvm.org/D73424
2020-01-25 22:32:42 -08:00
Fangrui Song
837e8a9c0c [ELF][PPC32] Support canonical PLT
-fno-pie produces a pair of non-GOT-non-PLT relocations R_PPC_ADDR16_{HA,LO} (R_ABS) referencing external
functions.

```
lis 3, func@ha
la 3, func@l(3)
```

In a -no-pie/-pie link, if func is not defined in the executable, a canonical PLT entry (st_value>0, st_shndx=0) will be needed.
References to func in shared objects will be resolved to this address.
-fno-pie -pie should fail with "can't create dynamic relocation ... against ...", so we just need to think about -no-pie.

On x86, the PLT entry passes the JMP_SLOT offset to the rtld PLT resolver.
On x86-64: the PLT entry passes the JUMP_SLOT index to the rtld PLT resolver.
On ARM/AArch64: the PLT entry passes &.got.plt[n]. The PLT header passes &.got.plt[fixed-index]. The rtld PLT resolver can compute the JUMP_SLOT index from the two addresses.

For these targets, the canonical PLT entry can just reuse the regular PLT entry (in PltSection).

On PPC32: PltSection (.glink) consists of `b PLTresolve` instructions and `PLTresolve`. The rtld PLT resolver depends on r11 having been set up to the .plt (GotPltSection) entry.
On PPC64 ELFv2: PltSection (.glink) consists of `__glink_PLTresolve` and `bl __glink_PLTresolve`. The rtld PLT resolver depends on r12 having been set up to the .plt (GotPltSection) entry.

We cannot reuse a `b PLTresolve`/`bl __glink_PLTresolve` in PltSection as a canonical PLT entry. PPC64 ELFv2 avoids the problem by using TOC for any external reference, even in non-pic code, so the canonical PLT entry scenario should not happen in the first place.
For PPC32, we have to create a PLT call stub as the canonical PLT entry. The code sequence sets up r11.

Reviewed By: Bdragon28

Differential Revision: https://reviews.llvm.org/D73399
2020-01-25 17:56:37 -08:00
Fangrui Song
6ab89c3c5d [ELF] Allow R_PLT_PC (R_PC) to a hidden undefined weak symbol
This essentially reverts b841e119d7.

Such code construct can be used in the following way:

  // glibc/stdlib/exit.c
  // clang -fuse-ld=lld => succeeded
  // clang -fuse-ld=lld -fpie -pie => relocation R_PLT_PC cannot refer to absolute symbol
  __attribute__((weak, visibility("hidden"))) extern void __call_tls_dtors();
  void __run_exit_handlers() {
    if (__call_tls_dtors)
        __call_tls_dtors();
  }

Since we allow R_PLT_PC in -no-pie mode, it makes sense to allow it in
-pie mode as well.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D72943
2020-01-17 13:06:42 -08:00
Peter Smith
01ad4c8384 [LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS.
In D71281 a fix was put in to round up the size of a ThunkSection to the
nearest 4KiB when performing errata patching. This fixed a problem with a
very large instrumented program that had thunks and patches mutually
trigger each other. Unfortunately it triggers an assertion failure in an
AArch64 allyesconfig build of the kernel. There is a specific assertion
preventing an InputSectionDescription being larger than 4KiB. This will
always trigger if there is at least one Thunk needed in that
InputSectionDescription, which is possible for an allyesconfig build.

Abstractly the problem case is:
.text : {
          *(.text) ;
          ...
          . = ALIGN(SZ_4K);
          __idmap_text_start = .;
          *(.idmap.text)
          __idmap_text_end = .;
          ...
        }
The assertion checks that __idmap_text_end - __idmap_start is < 4 KiB.
Note that there is more than one InputSectionDescription in the
OutputSection so we can't just restrict the fix to OutputSections smaller
than 4 KiB.

The fix presented here limits the D71281 to InputSectionDescriptions that
meet the following conditions:
1.) The OutputSection is bigger than the thunkSectionSpacing so adding
thunks will affect the addresses of following code.
2.) The InputSectionDescription is larger than 4 KiB. This will prevent
any assertion failures that an InputSectionDescription is < 4 KiB
in size.

We do this at ThunkSection creation time as at this point we know that
the addresses are stable and up to date prior to adding the thunks as
assignAddresses() will have been called immediately prior to thunk
generation.

The fix reverts the two tests affected by D71281 to their original state
as they no longer need the 4KiB size roundup. I've added simpler tests to
check for D71281 when the OutputSection size is larger than the ThunkSection
spacing.

Fixes https://github.com/ClangBuiltLinux/linux/issues/812

Differential Revision: https://reviews.llvm.org/D72344
2020-01-17 10:47:21 +00:00
Fangrui Song
bec1b55c64 [ELF] Delete the RelExpr member R_HINT. NFC
R_HINT is ignored like R_NONE. There are no strong reasons to keep
R_HINT. The largest RelExpr member R_RISCV_PC_INDIRECT is 60 now.

Differential Revision: https://reviews.llvm.org/D71822
2020-01-14 10:56:53 -08:00
Sid Manning
0fa8f701cc [ELF][Hexagon] Add support for IE relocations
Differential Revision: https://reviews.llvm.org/D71143
2020-01-09 09:45:24 -06:00
Fangrui Song
b841e119d7 [ELF] Delete an unused special rule from isStaticLinkTimeConstant. NFC
Weak undefined symbols are preemptible after D71794.

  if (sym.isPreemptible)
    return false;
  if (!config->isPic)
    return true;
  // isPic means includeInDynsym is true after D71794.

  ...

  // We can delete this if because it can never be true.
  if (sym.isUndefWeak)
    return true;

Differential Revision: https://reviews.llvm.org/D71795
2020-01-08 09:41:59 -08:00
Fangrui Song
085898d469 [ELF] Drop const qualifier to fix -Wrange-loop-analysis. NFC
```
lld/ELF/Relocations.cpp:1622:56: warning: loop variable 'ts' of type 'const std::pair<ThunkSection *, uint32_t>' (aka 'const pair<lld::elf::ThunkSection *, unsigned int>') creates a copy from type 'const std::pair<ThunkSection *, uint32_t>' [-Wrange-loop-analysis]
        for (const std::pair<ThunkSection *, uint32_t> ts : isd->thunkSections)
```

Drop const qualifier to fix -Wrange-loop-analysis.
We can make -Wrange-loop-analysis warnings (DiagnoseForRangeConstVariableCopies) on `const A` more
permissive on more types (e.g. POD -> trivially copyable), unfortunately it will not make std::pair
good, because `constexpr pair& operator=(const pair& p);` is unfortunately user-defined.

Reviewed By: Mordante

Differential Revision: https://reviews.llvm.org/D72211
2020-01-04 12:24:39 -08:00
Fangrui Song
261b7b4a6b [ELF] Don't suggest an alternative spelling for a symbol in a discarded section
For undef-not-suggest.test, we currently make redundant alternative
spelling suggestions:

```
ld.lld: error: relocation refers to a discarded section: .text.foo
>>> defined in a.o
>>> section group signature: foo
>>> prevailing definition is in a.o
>>> referenced by a.o:(.rodata+0x0)
>>> did you mean:
>>> defined in: a.o

ld.lld: error: relocation refers to a symbol in a discarded section: foo
>>> defined in a.o
>>> section group signature: foo
>>> prevailing definition is in a.o
>>> referenced by a.o:(.rodata+0x8)
>>> did you mean: for
>>> defined in: a.o
```

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D71735
2019-12-23 09:10:29 -08:00
Fangrui Song
2539cd22e9 [ELF] Delete a redundant R_HINT check from isStaticLinkTimeConstant(). NFC
scanReloc() returns when it sees an R_HINT.
2019-12-22 16:58:22 -08:00
Fangrui Song
891a8655ab [ELF] Add IpltSection
PltSection is used by both PLT and IPLT. The PLT section may have a
header while the IPLT section does not. Split off IpltSection from
PltSection to be clearer.

Unlike other targets, PPC64 cannot use the same code sequence for PLT
and IPLT. This helps make a future PPC64 patch (D71509) more isolated.

On EM_386 and EM_X86_64, when PLT is empty while IPLT is not, currently
we are inconsistent whether the PLT header is conceptually attached to
in.plt or in.iplt .  Consistently attach the header to in.plt can make
the -z retpolineplt logic simpler. It also makes `jmp` point to an
aesthetically better place for non-retpolineplt cases.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D71519
2019-12-17 00:06:04 -08:00
Fangrui Song
98afa2c1f1 [ELF] De-template PltSection::addEntry. NFC 2019-12-16 11:03:20 -08:00
Fangrui Song
c8f0d3e130 [ELF][PPC64] Support long branch thunks with addends
Fixes PPC64 part of PR40438

  // clang -target ppc64le -c a.cc
  // .text.unlikely may be placed in a separate output section (via -z keep-text-section-prefix)
  // The distance between bar in .text.unlikely and foo in .text may be larger than 32MiB.
  static void foo() {}
  __attribute__((section(".text.unlikely"))) static int bar() { foo(); return 0; }
  __attribute__((used)) static int dummy = bar();

This patch makes such thunks with addends work for PPC64.

AArch64: .text -> `__AArch64ADRPThunk_ (adrp x16, ...; add x16, x16, ...; br x16)` -> target
PPC64: .text -> `__long_branch_ (addis 12, 2, ...; ld 12, ...(12); mtctr 12; bctr)` -> target

AArch64 can leverage ADRP to jump to the target directly, but PPC64
needs to load an address from .branch_lt . Before Power ISA v3.0, the
PC-relative ADDPCIS was not available. .branch_lt was invented to work
around the limitation.

Symbol::ppc64BranchltIndex is replaced by
PPC64LongBranchTargetSection::entry_index which take addends into
consideration.

The tests are rewritten: ppc64-long-branch.s tests -no-pie and
ppc64-long-branch-pi.s tests -pie and -shared.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D70937
2019-12-05 10:17:45 -08:00
Fangrui Song
944f109ad7 [ELF][PPC64] Don't copy ppc64BranchltIndex in replaceWithDefined
replaceWithDefined is used by canonical PLT and copy relocations, which
imply that the symbol is preemptable. ppc64BranchltIndex is only used by
non-preemptable cases, and it can only be the default value in
replaceWithDefined.
2019-12-05 09:33:30 -08:00
Fangrui Song
bf535ac4a2 [ELF][AArch64] Support R_AARCH64_{CALL26,JUMP26} range extension thunks with addends
Fixes AArch64 part of PR40438

The current range extension thunk framework does not handle a relocation
relative to a STT_SECTION symbol with a non-zero addend, which may be
used by jumps/calls to local functions on some RELA targets (AArch64,
powerpc ELFv1, powerpc64 ELFv2, etc).  See PR40438 and the following
code for examples:

  // clang -target $target a.cc
  // .text.cold may be placed in a separate output section.
  // The distance between bar in .text.cold and foo in .text may be larger than 128MiB.
  static void foo() {}
  __attribute__((section(".text.cold"))) static int bar() { foo(); return
  0; }
  __attribute__((used)) static int dummy = bar();

This patch makes such thunks with addends work for AArch64. The target
independent part can be reused by PPC in the future.

On REL targets (ARM, MIPS), jumps/calls are not represented as
STT_SECTION + non-zero addend (see
MCELFObjectTargetWriter::needsRelocateWithSymbol), so they don't need
this feature, but we need to make sure this patch does not affect them.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D70637
2019-12-02 10:07:24 -08:00
Fangrui Song
3d9b1128d6 [ELF][ARM] Add getPCBias()
ThunkCreator::getThunk and ThunkCreator::normalizeExistingThunk
currently assume that the implicit addends are -8 for ARM and -4 for
Thumb. In D70637, ThunkCreator::getThunk will need to take care of the
relocation addend explicitly.

Add the utility function getPCBias() as a prerequisite so that the getThunk change in D70637
can be more general.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D70690
2019-11-27 09:09:46 -08:00
Fangrui Song
54a366f515 [ELF] Add a corrector for case mismatch problems
Reviewed By: grimar, peter.smith

Differential Revision: https://reviews.llvm.org/D70506
2019-11-26 09:11:56 -08:00
Fangrui Song
a2fc964417 [ELF] Replace SymbolTable::forEachSymbol with iterator_range symbols()
D62381 introduced forEachSymbol(). It seems that many call sites cannot
be parallelized because the body shared some states. Replace
forEachSymbol with iterator_range<filter_iterator<...>> symbols() to
simplify code and improve debuggability (std::function calls take some
frames).

It also allows us to use early return to simplify code added in D69650.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D70505
2019-11-26 09:09:32 -08:00
Fangrui Song
5b47efa20e [ELF] Fix stack-use-after-scope after D69592 and 69650 2019-11-08 11:21:32 -08:00
Fangrui Song
59d3fbc227 [ELF] Suggest extern "C" when the definition is mangled while an undefined reference is not
The definition may be mangled while an undefined reference is not.
This may come up when (1) the reference is from a C file or (2) the definition
misses an extern "C".

(2) is more common. Suggest an arbitrary mangled name that matches the
undefined reference, if such a definition exists.

  ld.lld: error: undefined symbol: foo
  >>> referenced by a.o:(.text+0x1)
  >>> did you mean to declare foo(int) as extern "C"?
  >>> defined in: a1.o

Reviewed By: dblaikie, ruiu

Differential Revision: https://reviews.llvm.org/D69650
2019-11-08 09:46:45 -08:00
Fangrui Song
70e62a4fa6 [ELF] Suggest extern "C" when an undefined reference is mangled while the definition is not
When missing an extern "C" declaration, an undefined reference may be
mangled while the definition is not. Suggest the missing
extern "C" and the base name.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D69592
2019-11-08 09:42:50 -08:00
Nico Weber
5976a3f5aa Fix a few typos in lld/ELF to cycle bots 2019-10-28 21:41:47 -04:00
Fangrui Song
bd8cfe65f5 [ELF] Wrap things in namespace lld { namespace elf {, NFC
This makes it clear `ELF/**/*.cpp` files define things in the `lld::elf`
namespace and simplifies `elf::foo` to `foo`.

Reviewed By: atanasyan, grimar, ruiu

Differential Revision: https://reviews.llvm.org/D68323

llvm-svn: 373885
2019-10-07 08:31:18 +00:00
Fangrui Song
e47bbd28f8 [ELF] Make MergeInputSection merging aware of output sections
Fixes PR38748

mergeSections() calls getOutputSectionName() to get output section
names. Two MergeInputSections may be merged even if they are made
different by SECTIONS commands.

This patch moves mergeSections() after processSectionCommands() and
addOrphanSections() to fix the issue. The new pass is renamed to
OutputSection::finalizeInputSections().

processSectionCommands() and addorphanSections() are changed to add
sections to InputSectionDescription::sectionBases.

finalizeInputSections() merges MergeInputSections and migrates
`sectionBases` to `sections`.

For the -r case, we drop an optimization that tries keeping sh_entsize
non-zero. This is for the simplicity of addOrphanSections(). The
updated merge-entsize2.s reflects the change.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D67504

llvm-svn: 372734
2019-09-24 11:48:31 +00:00