762 Commits

Author SHA1 Message Date
Fangrui Song
ee19eb3037 [ELF] Change some upper-case utohexstr to lower-case to improve consistency
The convention is to use lower-case addresses.
2024-11-29 18:37:47 -08:00
Fangrui Song
360718fb90 [test] Improve symbol-location.s to check --defsym
2024-11-24 11:22:19 -08:00
Fangrui Song
a2359a865a [ELF] Fix PROVIDE_HIDDEN -shared regression with bitcode file references
The inaccurate #111945 condition fixes a PROVIDE regression (#111478)
but introduces another regression: in a DSO link, if a symbol referenced
only by bitcode files is defined as PROVIDE_HIDDEN, lld would not set
the visibility correctly, leading to an assertion failure in
DynamicReloc::getSymIndex (https://reviews.llvm.org/D123985).
This is because `(sym->isUsedInRegularObj || sym->exportDynamic)` is
initially false (bitcode undef does not set `isUsedInRegularObj`) then
true (in `addSymbol`, after LTO compilation).

Fix this by making the condition accurate: use a map to track defined
symbols.

Reviewers: smithp35

Reviewed By: smithp35

Pull Request: https://github.com/llvm/llvm-project/pull/112386
2024-10-15 09:20:10 -07:00
Fangrui Song
1c6688ae34 [ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined
Case: `PROVIDE(f1 = bar);` when both `f1` and `bar` are in separate
sections that would be discarded by GC.

Due to `demoteDefined`, `shouldAddProvideSym(f1)` may initially return
false (when Defined) and then return true (after being demoted to Undefined).

```
addScriptReferencedSymbolsToSymTable
  shouldAddProvideSym(f1): false
  // the RHS (bar) is not added to `referencedSymbols` and may be GCed
declareSymbols
  shouldAddProvideSym(f1): false
markLive
demoteSymbolsAndComputeIsPreemptible
  // demoted f1 to Undefined
processSymbolAssignments
  addSymbol
    shouldAddProvideSym(f1): true
```

The inconsistency can cause `cmd->expression()` in `addSymbol` to be
evaluated, leading to `symbol not found: bar` errors (since `bar` in the
RHS is not in `referencedSymbols` and is GCed) (#111478).

Fix this by adding a `sym->isUsedInRegularObj` condition, making
`shouldAddProvideSym(f1)` values consistent. In addition, we need a
`sym->exportDynamic` condition to keep provide-shared.s working.

Fixes: ebb326a51f

Pull Request: https://github.com/llvm/llvm-project/pull/111945
2024-10-11 08:47:07 -07:00
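Illustration for 1c6688ae34: a minimal Python model of the inconsistency, not lld's C++ code. The `Symbol` fields, the `demote` helper, and the assumption that demotion clears the used-in-regular-object flag are simplifications introduced here for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    defined: bool = False
    common: bool = False
    used_in_regular_obj: bool = False
    export_dynamic: bool = False


def demote(sym):
    # Assumption of this model: demotion turns a Defined symbol into an
    # Undefined one and clears the "used in regular object" flag.
    sym.defined = False
    sym.used_in_regular_obj = False


def should_add_provide_sym_old(sym):
    # Condition before the fix: any non-defined, non-common symbol qualifies.
    return sym is not None and not sym.defined and not sym.common


def should_add_provide_sym_new(sym):
    # Condition after the fix: additionally require one of the two flags.
    return should_add_provide_sym_old(sym) and (
        sym.used_in_regular_obj or sym.export_dynamic
    )


def answers(predicate):
    # f1 is Defined in a section that GC discards, so it is demoted later.
    f1 = Symbol("f1", defined=True, used_in_regular_obj=True)
    before = predicate(f1)   # asked before markLive
    demote(f1)
    after = predicate(f1)    # asked again in processSymbolAssignments
    return before, after
```

In this model `answers(should_add_provide_sym_old)` gives two different answers for the same symbol, while `answers(should_add_provide_sym_new)` gives the same answer both times.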
Fangrui Song
b6448a03d8 [ELF] Change "no PT_TLS" error to use errorOrWarn
so that --noinhibit-exec downgrades the error to a warning, which helps
debugging when `PHDRS` is specified without `PT_TLS`. Also update the
message to make it accurate: STT_TLS may exist in the absence of PT_TLS.

In addition, invoking `exitLld(1)` (through `fatal`) is problematic
(#66974): when one thread calls `exitLld(1)`, triggering `llvm_shutdown`,
another thread may be in `relocateAlloc`, accessing `sec.relocs()` that
has been destroyed (tampered?), leading to a spurious
`llvm_unreachable("invalid expression")`.
2024-08-12 11:56:29 -07:00
Fangrui Song
dc21cb5cc7 [ELF,test] Test STT_TLS and relocation without PT_TLS
2024-08-12 11:25:46 -07:00
Daniel Thornburgh
7e8a9020b1 [LLD] Add CLASS syntax to SECTIONS (#95323)
This allows the input section matching algorithm to be separated from
output section descriptions. This allows a group of sections to be
assigned to multiple output sections, providing an explicit version of
--enable-non-contiguous-regions's spilling that doesn't require altering
global linker script matching behavior with a flag. It also makes the
linker script language more expressive even if spilling is not intended,
since input section matching can be done in a different order than
sections are placed in an output section.

The implementation reuses the backend mechanism provided by
--enable-non-contiguous-regions, so it has roughly similar semantics and
limitations. In particular, sections cannot be spilled into or out of
INSERT, OVERWRITE_SECTIONS, or /DISCARD/. The former two aren't
intrinsic, so it may be possible to relax those restrictions later.
2024-08-05 13:06:45 -07:00
Fangrui Song
0af07c0787 [ELF] Support relocatable files using CREL with explicit addends
... using the temporary section type code 0x40000020
(`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the
code and break compatibility (Clang and lld of different versions are
not guaranteed to cooperate, unlike other features). CREL with implicit
addends is not supported.

---

Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and
update users to check `crels`.

(The decoding performance is critical and error checking is difficult.
Follow `skipLeb` and `R_*LEB128` handling, do not use
`llvm::decodeULEB128`, which compiles to a lot of code.)

A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass
`/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and
convert CREL to RELA (`relas` instead of `crels` will be used). Since
allocating a buffer increases memory usage, the conversion is only
performed when absolutely necessary.

---

Non-alloc SHT_CREL sections may be created in -r and --emit-relocs
links. SHT_CREL and SHT_RELA components need reencoding since
r_offset/r_symidx/r_type/r_addend may change. (r_type may change because
relocations referencing a symbol in a discarded section are converted to
`R_*_NONE`).

* SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`)
* SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section.
* SHT_REL components: print an error for now.

SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and
not yet supported.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: https://github.com/llvm/llvm-project/pull/98115
2024-08-01 10:22:03 -07:00
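Illustration for 0af07c0787: the message mentions hand-written LEB128 decoding. As a reference point, a minimal ULEB128 decoder looks roughly like the Python sketch below. It is a generic illustration of the encoding, not the code lld uses, and like the approach described above it performs no error checking (truncated input raises `IndexError`).

```python
def decode_uleb128(data, pos=0):
    """Decode one ULEB128 value from `data` starting at `pos`.

    Returns (value, position just past the encoded value).
    """
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        # The low 7 bits carry payload, least significant group first.
        result |= (byte & 0x7F) << shift
        # A clear high bit marks the last byte of the value.
        if byte & 0x80 == 0:
            return result, pos
        shift += 7
```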
Fangrui Song
a7e8bddfc1 [ELF] Respect --sysroot for INCLUDE
If an included script is under the sysroot directory, when it opens an
absolute path file (`INPUT` or `GROUP`), add sysroot before the absolute
path. When the included script ends, the `isUnderSysroot` state is
restored.
2024-07-28 11:43:27 -07:00
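Illustration for a7e8bddfc1: a small Python model of the save/restore behavior described above. The `ScriptReader` class, its tuple-based script representation, and the path check are hypothetical; only the idea of prefixing absolute paths while inside a script under the sysroot comes from the message.

```python
import posixpath


def is_under_sysroot(path, sysroot):
    if not sysroot:
        return False
    root = posixpath.normpath(sysroot).rstrip("/")
    return posixpath.normpath(path).startswith(root + "/")


class ScriptReader:
    def __init__(self, sysroot, files):
        self.sysroot = sysroot
        self.files = files            # path -> list of (command, argument)
        self.under_sysroot = False
        self.opened = []              # resolved INPUT/GROUP paths, in order

    def resolve(self, path):
        # Absolute paths named by a script under the sysroot get the prefix.
        if self.under_sysroot and path.startswith("/"):
            return self.sysroot.rstrip("/") + path
        return path

    def read(self, path):
        saved = self.under_sysroot
        self.under_sysroot = is_under_sysroot(path, self.sysroot)
        try:
            for command, arg in self.files[path]:
                if command == "INCLUDE":
                    self.read(arg)
                elif command in ("INPUT", "GROUP"):
                    self.opened.append(self.resolve(arg))
        finally:
            # When the included script ends, restore the includer's state.
            self.under_sysroot = saved
```

For example, with sysroot `/sys`, an `INPUT(/lib/a.so)` inside `/sys/usr/lib/libc.lds` resolves to `/sys/lib/a.so`, while the same command in a top-level script outside the sysroot is left unchanged.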
Fangrui Song
30fa011413 [ELF,test] Improve --sysroot and GROUP tests
3i.t (INCLUDE "%t.dir/3a.t") describes a behavior difference from GNU
ld, which will be fixed by the next change.
2024-07-28 11:40:14 -07:00
Fangrui Song
a4921f10e0 [ELF] Output section phdr: support quoted names
2024-07-27 17:40:51 -07:00
Fangrui Song
9c16a4a2dc [ELF] INSERT [AFTER|BEFORE]: support quoted names
2024-07-27 17:34:37 -07:00
Fangrui Song
8f72b0cb08 [ELF] Fix INCLUDE cycle detection
Fix #93947: the cycle detection mechanism added by
https://reviews.llvm.org/D37524 also disallowed including a file twice,
which is an unnecessary limitation.

Now that we have an include stack (#100493), supporting multiple inclusion
is trivial. Note: a filename can be referenced with many different
paths, e.g. a.lds, ./a.lds, ././a.lds. We don't attempt to detect the
cycle at the earliest point.
2024-07-27 17:25:13 -07:00
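Illustration for 8f72b0cb08: a Python sketch of stack-based cycle detection, in which a file is rejected only while it is still active on the include stack, so including the same file twice in sequence is fine. The `expand` function, its line-based script format, and the error text are hypothetical. Names are compared textually, which mirrors the note above about `a.lds` versus `./a.lds`.

```python
class IncludeCycleError(Exception):
    pass


def expand(name, files, stack=None):
    """Return the lines of `name` with INCLUDE directives expanded.

    `files` maps a script name to its list of lines.
    """
    stack = [] if stack is None else stack
    if name in stack:
        # The file is still being processed further up: a real cycle.
        raise IncludeCycleError(" -> ".join(stack + [name]))
    stack.append(name)
    out = []
    for line in files[name]:
        if line.startswith("INCLUDE "):
            included = line.split(None, 1)[1].strip('"')
            out.extend(expand(included, files, stack))
        else:
            out.append(line)
    # Finished with this file; it may legitimately be included again.
    stack.pop()
    return out
```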
Fangrui Song
aad2238f78 [ELF] Improve INCLUDE cycle tests
And demonstrate the incorrect diagnostic when a linker script is
included multiple times (#93947).
2024-07-27 17:14:53 -07:00
Fangrui Song
4ad3deeefc [ELF] PHDRS: test EOF without ;
2024-07-27 16:56:27 -07:00
Fangrui Song
dbd65a07f2 [ELF] OUTPUT_ARCH: report unclosed error
2024-07-27 16:52:47 -07:00
Fangrui Song
0d8bc10acb [ELF] Memory region: support quoted names
2024-07-27 16:39:15 -07:00
Fangrui Song
e689515491 [ELF] OVERLAY: support quoted output section names
2024-07-27 16:33:18 -07:00
Fangrui Song
74ef53a01a [ELF] REGION_ALIAS: support quoted names
2024-07-27 16:29:43 -07:00
Fangrui Song
30ec2bf58d [ELF] PROVIDE: allow quoted names to be discarded
Extend commit ebb326a51f (for #74771) to
support quoted names, e.g. `PROVIDE("f1" = f2 + f3);`.
2024-07-27 16:19:57 -07:00
Fangrui Song
9328c20cc8 [ELF] Track line number precisely
`getLineNumber` is both imprecise (when `INCLUDE` is used) and
inefficient (see https://reviews.llvm.org/D104137). Track line number
precisely now that we have `struct Buffer` abstraction from #100493.
2024-07-27 14:46:41 -07:00
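Illustration for 9328c20cc8: one way to track line numbers precisely is to keep a counter per buffer and bump it as text is consumed, rather than recounting newlines from the start of the file for each diagnostic. The Python `Buffer` class below is a hypothetical stand-in for the `struct Buffer` abstraction mentioned above, not a transcription of it.

```python
class Buffer:
    def __init__(self, filename, text):
        self.filename = filename
        self.text = text
        self.pos = 0
        self.line = 1  # 1-based line number of the current position

    def advance(self, n):
        # Count only the newlines in the span being consumed, so the total
        # cost over the whole file stays linear.
        chunk = self.text[self.pos:self.pos + n]
        self.line += chunk.count("\n")
        self.pos += len(chunk)

    def location(self):
        return f"{self.filename}:{self.line}"
```

Because each included script would get its own `Buffer`, the reported line is relative to the file being read at that moment.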
Fangrui Song
2a89356d64 [ELF] Add till and rewrite while (... consume("}"))
After #100493, the idiom `while (!errorCount() && !consume("}"))` could
lead to inaccurate diagnostics or dead loops. Introduce till to change
the code pattern.
2024-07-26 17:13:37 -07:00
Fangrui Song
6cf1ea99c6 [ELF,test] Improve unclosed tests
2024-07-26 16:51:42 -07:00
Fangrui Song
4f5ad22b95 [ELF,test] Improve PHDRS tests
2024-07-26 15:55:01 -07:00
Fangrui Song
1978c21d96 [ELF] ScriptLexer: generate tokens lazily
The current tokenize-whole-file approach has a few limitations.

* Lack of state information: `maybeSplitExpr` is needed to parse
  expressions. It's infeasible to add new states to behave more like GNU
  ld.
* `readInclude` may insert tokens in the middle, leading to a time
  complexity issue with N-nested `INCLUDE`.
* line/column information for diagnostics is inaccurate, especially
  after an `INCLUDE`.
* `getLineNumber` cannot be made more efficient without significant code
  complexity and memory consumption. https://reviews.llvm.org/D104137

The patch switches to a traditional lexer that generates tokens lazily.

* `atEOF` behavior is modified: we need to call `peek` to determine EOF.
* `peek` and `next` cannot call `setError` upon `atEOF`.
* Since `consume` no longer reports an error upon `atEOF`, the idiom `while (!errorCount() && !consume(")"))`
  would cause a dead loop. Use `while (peek() != ")" && !atEOF()) { ... } expect(")")` instead.
* An include stack is introduced to handle `readInclude`. This can be
  utilized to address #93947 properly.
* `tokens` and `pos` are removed.
* `commandString` is reimplemented. Since it is used in -Map output,
  `\n` needs to be replaced with space.

Pull Request: https://github.com/llvm/llvm-project/pull/100493
2024-07-26 14:26:38 -07:00
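Illustration for 1978c21d96: a compact Python sketch of a lazy lexer with an include stack and the `peek`/`at_eof` loop idiom described above. The token regex, class name, and error text are assumptions made for the example. It has no handling for malformed input such as an unterminated quote, and `include` must only be called when no lookahead token is cached.

```python
import re

# A token is a quoted string, a single punctuation character, or a word.
_TOKEN = re.compile(r'\s*("[^"]*"|[(){};,]|[^\s(){};,"]+)')


class LazyLexer:
    def __init__(self, text):
        self.stack = [[text, 0]]  # include stack of [text, position]
        self.cur = None           # cached lookahead token, if any
        self.errors = []

    def include(self, text):
        # Lexing continues in the new buffer and later resumes the includer.
        self.stack.append([text, 0])

    def _lex(self):
        while self.stack:
            buf = self.stack[-1]
            m = _TOKEN.match(buf[0], buf[1])
            if m:
                buf[1] = m.end()
                return m.group(1)
            self.stack.pop()      # buffer exhausted
        return ""                 # empty token means EOF

    def peek(self):
        if self.cur is None:
            self.cur = self._lex()
        return self.cur

    def next(self):
        tok = self.peek()
        self.cur = None
        return tok

    def at_eof(self):
        return self.peek() == ""

    def consume(self, tok):
        # Unlike the eager design, this never reports an error at EOF.
        if tok and self.peek() == tok:
            self.next()
            return True
        return False

    def expect(self, tok):
        if not self.consume(tok):
            self.errors.append(f"{tok} expected, but got {self.peek() or 'EOF'}")


def read_paren_list(lex):
    lex.expect("(")
    items = []
    # Terminates on EOF instead of spinning on a consume() that keeps failing.
    while lex.peek() != ")" and not lex.at_eof():
        items.append(lex.next())
    lex.expect(")")
    return items
```

With unclosed input such as `(a b`, `read_paren_list` still returns and records a single error instead of looping forever.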
Fangrui Song
8644a2aa0f [ELF,test] Improve negative linker script tests
2024-07-25 17:11:52 -07:00
Fangrui Song
28045ceab0 [ELF] Support (TYPE=<value>) beside output section address
Support `preinit_array . (TYPE=SHT_PREINIT_ARRAY) : { QUAD(16) }`

Follow-up to https://reviews.llvm.org/D118840

peek2() could be eliminated by a future change.
2024-07-20 14:13:02 -07:00
Fangrui Song
0778f5c1f1 [ELF] Support NOCROSSREFS and NOCROSSREFS_TO
Implement the two commands described by
https://sourceware.org/binutils/docs/ld/Miscellaneous-Commands.html

After `outputSections` is available, check each output section described
by at least one `NOCROSSREFS`/`NOCROSSREFS_TO` command. For each checked
output section, scan relocations from its input sections.
This step is slow, therefore utilize `parallelForEach(isd->sections, ...)`.

To support non-SHF_ALLOC sections, `InputSectionBase::relocations`
(empty) cannot be used. In addition, we may explore eliminating this
member to speed up relocation scanning.

Some parse code is adapted from #95714.

Close #41825

Pull Request: https://github.com/llvm/llvm-project/pull/98773
2024-07-17 10:45:59 -07:00
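Illustration for 0778f5c1f1: a Python sketch of the check itself, following the semantics in the GNU ld documentation linked above (`NOCROSSREFS` forbids references between any of the listed sections; `NOCROSSREFS_TO` forbids references to the first listed section from the others). The data representation and function name are hypothetical, and the real implementation scans relocations in parallel rather than taking a prepared list.

```python
def find_cross_refs(commands, relocations):
    """Return the (from_section, to_section) pairs that violate a command.

    commands:    list of (section_names, to_first); to_first is True for
                 NOCROSSREFS_TO and False for NOCROSSREFS.
    relocations: iterable of (from_section, to_section) output section names.
    """
    violations = []
    for src, dst in relocations:
        if src == dst:
            continue  # references within one output section are allowed
        for names, to_first in commands:
            if src not in names or dst not in names:
                continue
            if to_first and dst != names[0]:
                continue  # NOCROSSREFS_TO only protects the first section
            violations.append((src, dst))
            break
    return violations
```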
Fangrui Song
fdd3196553 [ELF] Make start/stop symbols retain associated discardable output sections
An empty output section specified in the `SECTIONS` command (e.g.
`empty : { *(empty) }`) may be discarded. Due to phase ordering, we
might define `__start_empty`/`__stop_empty` symbols with incorrect
section indexes (usually benign, but could go out of bounds and cause
`readelf -s` to print `BAD`).

```
finalizeSections
  addStartStopSymbols     // __start_empty is defined
  // __start_empty is added to .symtab
  sortSections
    adjustOutputSections  // `empty` is discarded
writeSections
  // __start_empty is Defined with an invalid section index
```

Loaders use `st_value` members of the start/stop symbols and expect no
"undefined symbol" linker error, but do not particularly care whether
the symbols are defined or undefined. Let's retain the associated empty
output section so that start/stop symbols will have correct section
indexes.

The approach allows us to remove `LinkerScript::isDiscarded`
(https://reviews.llvm.org/D114179). Also delete the
`findSection(".text")` special case from https://reviews.llvm.org/D46200,
which is unnecessary even before this patch (`elfHeader` would be fine
even with very large executables).

Note: we should be careful not to unnecessarily retain .ARM.exidx, which
would create an empty PT_ARM_EXIDX. ~40 tests would need to be updated.

---

An alternative is to discard the empty output section and keep the
start/stop symbols undefined. This approach needs more code and requires
`LinkerScript::isDiscarded` before we discard empty sections in
`adjustOutputSections`.

Pull Request: https://github.com/llvm/llvm-project/pull/96343
2024-07-02 10:58:24 -07:00
Fangrui Song
ee4c12f87d [ELF] Postpone more linker script errors
Since `assignAddresses` is executed more than once, error reporting
during `assignAddresses` would be duplicated. Generalize #66854 to cover
more errors.

Note: address-related errors exposed in one invocation might not be
errors in another invocation.

Pull Request: https://github.com/llvm/llvm-project/pull/96361
2024-06-24 10:15:28 -07:00
Fangrui Song
fbcb92ca01 [ELF] Test non-alloc orphan that does not have the RF_NOT_ADDR_SET rank flag
2024-06-06 13:22:47 -07:00
Fangrui Song
9ad0175ea0 [ELF] Keep non-alloc orphan sections at the end
https://reviews.llvm.org/D85867 changed the way we assign file offsets
(alloc sections first, then non-alloc sections).

It also removed a non-alloc special case from `findOrphanPos`.
Looking at the memory-nonalloc-no-warn.test change, which would be
needed by #93761, it makes sense to restore the previous behavior: when
placing non-alloc orphan sections, keep these sections at the end so
that the section index order matches the file offset order.

This change is cosmetic. In sections-nonalloc.s, GNU ld places the
orphan `other3` in the middle and the orphan .symtab/.shstrtab/.strtab
at the end.

Pull Request: https://github.com/llvm/llvm-project/pull/94519
2024-06-06 12:13:19 -07:00
Fangrui Song
7b346357db [ELF] Orphan placement: prefer the last similar section when its rank <= orphan's rank
`findOrphanPos` finds the most similar output section (that has input
sections). In the event of proximity ties, we select the first section.

However, when an orphan section's rank is equal to or larger than the
most similar section's, it makes sense to prioritize the last similar
section. This new behavior matches GNU ld better.

```
// orphan placement for .bss (SHF_ALLOC|SHF_WRITE, SHT_NOBITS)

WA SHT_PROGBITS
(old behavior) <= here
A
WA SHT_PROGBITS
AX
WA (.data)
(new behavior) <= here
```

When the orphan section's rank is less, the current behavior
prioritizing the first section still makes sense.
```
// orphan with a smaller rank, e.g. .rodata

<= here
WA
AX
WA
```

Close #92987

Pull Request: https://github.com/llvm/llvm-project/pull/94099
2024-06-04 09:14:54 -07:00
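Illustration for 7b346357db: a Python sketch of the tie-breaking rule. Ranks are modeled as plain integers and proximity as the number of leading bits two ranks share. Both are assumptions of this sketch, and only the anchor selection is shown, not the final insertion position.

```python
RANK_BITS = 32


def rank_proximity(a, b):
    # More shared leading bits means more similar (RANK_BITS: identical).
    return RANK_BITS - (a ^ b).bit_length()


def find_anchor(section_ranks, orphan_rank):
    """Return the index of the output section to anchor the orphan at."""
    best = 0
    anchor = None
    for i, rank in enumerate(section_ranks):
        p = rank_proximity(rank, orphan_rank)
        # A strictly better match always wins. On a tie, move to the later
        # section only when its rank does not exceed the orphan's rank.
        if p > best or (p == best and rank <= orphan_rank):
            best = p
            anchor = i
    return anchor
```

With ranks `[6, 2, 6, 4, 6]` and an orphan of rank 7, the three rank-6 sections tie and the last one is chosen. With an orphan of rank 5 against `[6, 2, 6]`, the tie is resolved in favor of the first.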
Fangrui Song
7b6a89f346 [ELF] Detect convergence of output section addresses
Some linker scripts don't converge. https://reviews.llvm.org/D66279
("[ELF] Make LinkerScript::assignAddresses iterative") detected
convergence of symbol assignments.

This patch detects convergence of output section addresses. While input
sections might also have convergence issues, they are less common as
expressions that could cause convergence issues typically involve output
sections and symbol assignments.

GNU ld has an error `non constant or forward reference address expression for section` that
correctly rejects
```
SECTIONS {
  .text ADDR(.data)+0x1000 : { *(.text) }
  .data : { *(.data) }
}
```

but not the following variant:
```
SECTIONS {
  .text foo : { *(.text) }
  .data : { *(.data) }
  foo = ADDR(.data)+0x1000;
}
```

Our approach consistently rejects both cases.

Link: https://discourse.llvm.org/t/lld-and-layout-convergence/79232

Pull Request: https://github.com/llvm/llvm-project/pull/93888
2024-05-31 09:31:15 -07:00
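Illustration for 7b6a89f346: a toy Python model of why the second script never settles and how comparing section addresses between passes detects it. The `layout` function hard-codes that script with an assumed `.text` size. The pass limit and error text are also assumptions of the sketch.

```python
def layout(foo, text_size=0x100):
    """One address-assignment pass for:
         .text foo : {...}   .data : {...}   foo = ADDR(.data)+0x1000;
    """
    text = foo
    data = text + text_size
    new_foo = data + 0x1000
    return {".text": text, ".data": data}, new_foo


def assign_until_stable(step, foo=0, max_passes=5):
    """Repeat `step` until output section addresses stop changing."""
    previous = None
    for n in range(max_passes):
        addresses, foo = step(foo)
        if addresses == previous:
            return addresses, n + 1
        previous = addresses
    raise RuntimeError("output section addresses did not converge")
```

Each pass of `layout` moves `.text` up by `text_size + 0x1000`, so `assign_until_stable(layout)` raises, whereas a step function that returns the same addresses twice in a row stops after the second pass.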
Fangrui Song
167cad531d [ELF] Improve ADDR tests
Merge some test files.
The "undefined section" error (`checkIfExists`) was previously untested.
2024-05-30 13:43:15 -07:00
Fangrui Song
747d670bae [ELF] Make .interp/SHT_NOTE not special
Follow-up to a previous simplification
2473b1af08.

The xor difference between a SHT_NOTE and a read-only SHT_PROGBITS
(previously >=NOT_SPECIAL) should be smaller than RF_EXEC. Otherwise,
for the following section layout, `findOrphanPos` would place .text
before note.

```
// simplified from linkerscript/custom-section-type.s
non orphans:
progbits 0x8060c00 NOT_SPECIAL
note     0x8040003

orphan:
.text    0x8061000 NOT_SPECIAL
```

rw-text.lds in orphan.s (added by
73e07e9244) demonstrates a similar case.
The new behavior is more similar to GNU ld.

#93763 fixed BOLT's brittle reliance on the previous .interp behavior.
2024-05-30 11:18:03 -07:00
Fangrui Song
73e07e9244 [ELF] Add RW then text test
Currently, lld assigns RF_NOT_SPECIAL so that orphan .interp and
SHT_NOTE are always before other sections. GNU ld doesn't do so. The
next change will remove RF_NOT_SPECIAL.
2024-05-30 11:12:55 -07:00
Fangrui Song
270d95bfed [ELF] Improve orphan placement tests
Merge orphan-align.test (which introduced `shouldSkip`) into orphan.s.
2024-05-30 10:59:22 -07:00
Igor Kudrin
34b14cc4f8 [lld][ELF] Suppress --orphan-handling=error/warn without SECTIONS (#93630)
Without a linker script, `--orphan-handling=error` or `=warn` reports
all input sections, including even well-known sections like `.text`,
`.bss`, `.dynamic`, or `.symtab`. However, in this case, no sections
should be considered orphans because they all are placed with the same
default rules. This patch suppresses errors/warnings for placing orphan
sections if no linker script with the `SECTIONS` command is provided.

The proposed behavior matches GNU gold. GNU ld in the same scenario only
reports sections that are not in its default linker script, thus, it
avoids complaining about `.text` and similar.
2024-05-29 14:53:29 -07:00
Fangrui Song
e53c53559b [ELF,test] Improve --compress-debug-sections/--compress-sections tests
Make sections larger so that compressed content will be smaller than
uncompressed content. Add a few dedicated tests where compressed content
is larger.
2024-05-22 15:40:24 -07:00
Daniel Thornburgh
66466ff151 Reland: [LLD] Implement --enable-non-contiguous-regions (#90007)
When enabled, input sections that would otherwise overflow a memory
region are instead spilled to the next matching output section.

This feature parallels the one in GNU LD, but there are some differences
from its documented behavior:

- /DISCARD/ only matches previously-unmatched sections (i.e., the flag
does not affect it).

- If a section fails to fit at any of its matches, the link fails
instead of discarding the section.

- The flag --enable-non-contiguous-regions-warnings is not implemented,
as it exists to warn about such occurrences.

The implementation places stubs at possible spill locations, and
replaces them with the original input section when effecting spills.
Spilling decisions occur after address assignment. Sections are spilled
in reverse order of assignment, with each spill naively decreasing the
size of the affected memory regions. This continues until the memory
regions are brought back under size. Spilling anything causes another
pass of address assignment, and this continues to fixed point.

Spilling after rather than during assignment allows the algorithm to
consider the size effects of unspillable input sections that appear
later in the assignment. Otherwise, such sections (e.g. thunks) may
force an overflow, even if spilling something earlier could have avoided
it.

A few notable feature interactions occur:

- Stubs affect alignment, ONLY_IF_RO, etc, broadly as if a copy of the
input section were actually placed there.

- SHF_MERGE synthetic sections use the spill list of their first
contained input section (the one that gives the section its name).

- ICF occurs oblivious to spill sections; spill lists for merged-away
sections become inert and are removed after assignment.

- SHF_LINK_ORDER and .ARM.exidx are ordered according to the final
section ordering, after all spilling has completed.

- INSERT BEFORE/AFTER and OVERWRITE_SECTIONS are explicitly disallowed.
2024-05-13 11:06:54 -07:00
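Illustration for 66466ff151: a heavily simplified Python model of the spill loop described above (assign, spill in reverse order of assignment while a region is over capacity, repeat to a fixed point). It ignores stubs, alignment, thunks, and every feature interaction listed in the message. The function name and data layout are hypothetical.

```python
def assign_with_spills(capacity, region_of, inputs):
    """capacity:  memory region name -> size limit
    region_of: output section -> memory region, in script order
    inputs:    list of (name, size, [matching output sections, in order])
    Returns a dict mapping each input section to its output section.
    """
    choice = {name: 0 for name, _, _ in inputs}  # index into each match list
    while True:
        # "Address assignment": total up region usage in placement order.
        used = {region: 0 for region in capacity}
        assigned = []
        for osec in region_of:
            for name, size, matches in inputs:
                if matches[choice[name]] == osec:
                    used[region_of[osec]] += size
                    assigned.append((name, size, matches))
        # Spill in reverse order of assignment, naively shrinking the region.
        spilled = False
        for name, size, matches in reversed(assigned):
            region = region_of[matches[choice[name]]]
            if used[region] > capacity[region] and choice[name] + 1 < len(matches):
                choice[name] += 1
                used[region] -= size
                spilled = True
        if not spilled:
            over = [r for r in capacity if used[r] > capacity[r]]
            if over:
                # Nothing left to spill: the link fails instead of discarding.
                raise RuntimeError("region overflow: " + ", ".join(over))
            return {name: matches[choice[name]] for name, _, matches in inputs}
```

Each spill advances a section to a later match and never moves it back, so the loop reaches a fixed point after a bounded number of passes.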
Daniel Thornburgh
81f34afa5c Revert "[LLD] Implement --enable-non-contiguous-regions" (#92005)
Reverts llvm/llvm-project#90007

Broke in merging I think.
2024-05-13 10:38:40 -07:00
Daniel Thornburgh
673114447b [LLD] Implement --enable-non-contiguous-regions (#90007)
When enabled, input sections that would otherwise overflow a memory
region are instead spilled to the next matching output section.

This feature parallels the one in GNU LD, but there are some differences
from its documented behavior:

- /DISCARD/ only matches previously-unmatched sections (i.e., the flag
does not affect it).

- If a section fails to fit at any of its matches, the link fails
instead of discarding the section.

- The flag --enable-non-contiguous-regions-warnings is not implemented,
as it exists to warn about such occurrences.

The implementation places stubs at possible spill locations, and
replaces them with the original input section when effecting spills.
Spilling decisions occur after address assignment. Sections are spilled
in reverse order of assignment, with each spill naively decreasing the
size of the affected memory regions. This continues until the memory
regions are brought back under size. Spilling anything causes another
pass of address assignment, and this continues to fixed point.

Spilling after rather than during assignment allows the algorithm to
consider the size effects of unspillable input sections that appear
later in the assignment. Otherwise, such sections (e.g. thunks) may
force an overflow, even if spilling something earlier could have avoided
it.

A few notable feature interactions occur:

- Stubs affect alignment, ONLY_IF_RO, etc, broadly as if a copy of the
input section were actually placed there.

- SHF_MERGE synthetic sections use the spill list of their first
contained input section (the one that gives the section its name).

- ICF occurs oblivious to spill sections; spill lists for merged-away
sections become inert and are removed after assignment.

- SHF_LINK_ORDER and .ARM.exidx are ordered according to the final
section ordering, after all spilling has completed.

- INSERT BEFORE/AFTER and OVERWRITE_SECTIONS are explicitly disallowed.
2024-05-13 10:30:50 -07:00
Fangrui Song
b1f04d57f5 [ELF,test] Fix typo in check prefixes
2024-05-12 21:15:36 -07:00
Fangrui Song
f02a27df2f [ELF] Add --default-script/-dT
GNU ld added --default-script (alias: -dT) in 2007. The option specifies
a default script that is processed if --script/-T is not specified. -dT
can be used to override GNU ld's internal linker script, but only when
the application does not specify -T.
In addition, dynamorio's CMakeLists.txt may use -dT.

The implementation is simple and the feature can be useful to dabble
with different section layouts.

Pull Request: https://github.com/llvm/llvm-project/pull/89327
2024-04-19 09:09:41 -07:00
Fangrui Song
dcc45faa30 [ELF] PROVIDE: fix spurious "symbol not found"
When archive member extraction involving ENTRY happens after
`addScriptReferencedSymbolsToSymTable`,
`addScriptReferencedSymbolsToSymTable` may fail to define some PROVIDE
symbols used by ENTRY. This is an edge case that regressed after #84512.
(The interaction with PROVIDE and ENTRY-in-archive was not considered
before).

While here, also ensure that --undefined-glob extracted object files
are parsed before `addScriptReferencedSymbolsToSymTable`.

Fixes: ebb326a51f

Pull Request: https://github.com/llvm/llvm-project/pull/87530
2024-04-04 09:38:01 -07:00
Parth Arora
ebb326a51f [ELF] Fix unnecessary inclusion of unreferenced provide symbols
Previously, the linker unnecessarily included a PROVIDE symbol that was
referenced only by another, unused PROVIDE symbol. For example, with the
linker script below, even when the 'not_used_sym' PROVIDE symbol was not
included, the linker still included the 'foo' PROVIDE symbol because it
was referenced by 'not_used_sym'.

PROVIDE(not_used_sym = foo);
PROVIDE(foo = 0x1000);

This commit fixes the behavior by using a DFS-like algorithm to find
all the symbols referenced in the provide expressions of included provide
symbols.

This commit also fixes the issue of an unused section not being
garbage-collected if a symbol of the section is referenced by an unused
PROVIDE symbol.

Closes #74771
Closes #84730

Co-authored-by: Fangrui Song <i@maskray.me>
2024-03-25 16:11:21 -07:00
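Illustration for ebb326a51f: a Python sketch of the DFS-like traversal. `provide_map` (PROVIDE name to the symbols its expression references) and `referenced` (symbols the link actually needs) are hypothetical inputs. The real implementation works on lld's symbol table rather than on plain lists.

```python
def provides_to_add(provide_map, referenced):
    """Return the PROVIDE symbols reachable from the referenced symbols."""
    added = []
    seen = set()
    # Start only from PROVIDE symbols that something actually references.
    stack = [s for s in referenced if s in provide_map]
    while stack:
        sym = stack.pop()
        if sym in seen:
            continue
        seen.add(sym)
        added.append(sym)
        # Follow the right-hand side: PROVIDE symbols used by an included
        # PROVIDE expression must be included as well.
        stack.extend(s for s in provide_map[sym] if s in provide_map)
    return added
```

For the script in the message, nothing is added when `not_used_sym` is unreferenced, and both symbols are added once it is referenced.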
Fangrui Song
e115c00565 [ELF] Reject certain unknown section types (#85173)
Unknown section types may require special linking rules, and
rejecting such sections for older linkers may be desired. For example,
if we introduce a new section type to replace a control structure (e.g.
relocations), it would be nice for older linkers to reject the new
section type. GNU ld allows certain unknown section types:

* [SHT_LOUSER,SHT_HIUSER] and non-SHF_ALLOC
* [SHT_LOOS,SHT_HIOS] and non-SHF_OS_NONCONFORMING

but reports errors and stops linking for others (unless
--no-warn-mismatch is specified). Port its behavior. For convenience, we
additionally allow all [SHT_LOPROC,SHT_HIPROC] types so that we don't
have to hard code all known types for each processor.

Close https://github.com/llvm/llvm-project/issues/84812
2024-03-15 09:50:23 -07:00
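Illustration for e115c00565: the acceptance rule for section types the linker does not recognize, written as a Python predicate. The constants are the standard ELF values. The function name is hypothetical, and the `--no-warn-mismatch` escape hatch is not modeled.

```python
SHT_LOOS, SHT_HIOS = 0x60000000, 0x6FFFFFFF
SHT_LOPROC, SHT_HIPROC = 0x70000000, 0x7FFFFFFF
SHT_LOUSER, SHT_HIUSER = 0x80000000, 0xFFFFFFFF
SHF_ALLOC = 0x2
SHF_OS_NONCONFORMING = 0x100


def accepts_unknown_type(sh_type, sh_flags):
    """Decide whether a section of an unrecognized type may be linked."""
    if SHT_LOPROC <= sh_type <= SHT_HIPROC:
        # Allowed for convenience: no per-processor list of known types.
        return True
    if SHT_LOUSER <= sh_type <= SHT_HIUSER:
        return not (sh_flags & SHF_ALLOC)
    if SHT_LOOS <= sh_type <= SHT_HIOS:
        return not (sh_flags & SHF_OS_NONCONFORMING)
    # Anything else, e.g. an unknown type in the generic range, is rejected.
    return False
```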
Fangrui Song
8fe3e70e81 [ELF] Eliminate symbols demoted due to /DISCARD/ discarded sections (#85167)
#69295 demoted Defined symbols relative to discarded sections.
If such a symbol is unreferenced, the desired behavior is to
eliminate it from .symtab just like --gc-sections discarded
definitions.
Linux kernel's CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y configuration expects
that the unreferenced `unused` is not emitted to .symtab
(https://github.com/ClangBuiltLinux/linux/issues/2006).

For relocations referencing demoted symbols, the symbol index is restored
to 0, as in older lld (`R_X86_64_64 0` in `discard-section.s`).

Fix #85048
2024-03-14 09:51:27 -07:00
Fangrui Song
f1ca2a0967 [ELF] Add --compress-sections to compress matched non-SHF_ALLOC sections
--compress-sections <section-glob>=[none|zlib|zstd] is similar to
--compress-debug-sections but applies to broader sections without the
SHF_ALLOC flag. lld will report an error if a SHF_ALLOC section is
matched. An interesting use case is to compress `.strtab`/`.symtab`,
which consume a significant portion of the file size (15.1% for a
release build of Clang).

An older revision is available at https://reviews.llvm.org/D154641 .
This patch focuses on non-allocated sections for safety. Moving
`maybeCompress`, as D154641 does, would not handle STT_SECTION symbols
for `-r --compress-debug-sections=zlib` (see
`relocatable-section-symbol.s` from #66804).

Since different output sections may use different compression
algorithms, we need CompressedData::type to generalize
config->compressDebugSections.

GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27452

Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674

Pull Request: https://github.com/llvm/llvm-project/pull/84855
2024-03-12 10:56:14 -07:00
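Illustration for f1ca2a0967: a Python sketch of parsing the option value and applying it to sections. Python's `fnmatch` stands in for the linker's glob matcher, which may differ in details. Letting the last matching rule win is an assumption of this sketch, as are the function names and error texts.

```python
import fnmatch

SHF_ALLOC = 0x2
ALGORITHMS = ("none", "zlib", "zstd")


def parse_compress_sections(arg):
    """Split '<section-glob>=<algorithm>' into (glob, algorithm)."""
    glob, sep, algorithm = arg.rpartition("=")
    if not sep or not glob:
        raise ValueError(f"expected <section-glob>=[none|zlib|zstd]: {arg}")
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unsupported compression type: {algorithm}")
    return glob, algorithm


def compression_for(name, flags, rules):
    """Return the algorithm chosen for a section, or None if unmatched."""
    chosen = None
    for glob, algorithm in rules:
        if fnmatch.fnmatchcase(name, glob):
            if flags & SHF_ALLOC:
                # Matching an allocated section is reported as an error.
                raise ValueError(f"{name}: SHF_ALLOC section matched by {glob}")
            chosen = algorithm  # assumption: a later rule overrides
    return chosen
```

For instance, `parse_compress_sections(".strtab=zstd")` yields the rule `(".strtab", "zstd")`, which `compression_for` then applies to a non-allocated `.strtab`.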