Remove the global variable `symtab` and add a member variable
(`std::unique_ptr<SymbolTable>`) to `Ctx` instead.
This is one step toward eliminating global states.
Pull Request: https://github.com/llvm/llvm-project/pull/109612
Ctx was introduced in March 2022 as a more suitable place for such
singletons.
We now use default-initialization for `LinkerScript` and should pay
attention to non-class types (e.g. `dot` is initialized by commit
503907dc50).
... using the temporary section type code 0x40000020
(`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the
code and break compatibility (Clang and lld of different versions are
not guaranteed to cooperate, unlike other features). CREL with implicit
addends are not supported.
---
Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and
update users to check `crels`.
(The decoding performance is critical and error checking is difficult.
Follow `skipLeb` and `R_*LEB128` handling, do not use
`llvm::decodeULEB128`, whichs compiles to a lot of code.)
A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass
`/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and
convert CREL to RELA (`relas` instead of `crels` will be used). Since
allocating a buffer increases, the conversion is only performed when
absolutely necessary.
---
Non-alloc SHT_CREL sections may be created in -r and --emit-relocs
links. SHT_CREL and SHT_RELA components need reencoding since
r_offset/r_symidx/r_type/r_addend may change. (r_type may change because
relocations referencing a symbol in a discarded section are converted to
`R_*_NONE`).
* SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`)
* SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section.
* SHT_REL components: print an error for now.
SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and
unsupported yet.
Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600
Pull Request: https://github.com/llvm/llvm-project/pull/98115
This commit provides linker support for Cortex-M Security Extensions (CMSE).
The specification for this feature can be found in ARM v8-M Security Extensions:
Requirements on Development Tools.
The linker synthesizes a security gateway veneer in a special section;
`.gnu.sgstubs`, when it finds non-local symbols `__acle_se_<entry>` and `<entry>`,
defined relative to the same text section and having the same address. The
address of `<entry>` is retargeted to the starting address of the
linker-synthesized security gateway veneer in section `.gnu.sgstubs`.
In summary, the linker translates input:
```
.text
entry:
__acle_se_entry:
[entry_code]
```
into:
```
.section .gnu.sgstubs
entry:
SG
B.W __acle_se_entry
.text
__acle_se_entry:
[entry_code]
```
If addresses of `__acle_se_<entry>` and `<entry>` are not equal, the linker
considers that `<entry>` already defines a secure gateway veneer so does not
synthesize one.
If `--out-implib=<out.lib>` is specified, the linker writes the list of secure
gateway veneers into a CMSE import library `<out.lib>`. The CMSE import library
will have 3 sections: `.symtab`, `.strtab`, `.shstrtab`. For every secure gateway
veneer <entry> at address `<addr>`, `.symtab` contains a `SHN_ABS` symbol `<entry>` with
value `<addr>`.
If `--in-implib=<in.lib>` is specified, the linker reads the existing CMSE import
library `<in.lib>` and preserves the entry function addresses in the resulting
executable and new import library.
Reviewed By: MaskRay, peter.smith
Differential Revision: https://reviews.llvm.org/D139092
This reverts commit c4fea39056.
I am reverting this for now until I figure out how to fix
the build bot errors and warnings.
Errors:
llvm-project/lld/ELF/Arch/ARM.cpp:1300:29: error: expected primary-expression before ‘>’ token
osec->writeHeaderTo<ELFT>(++sHdrs);
Warnings:
llvm-project/lld/ELF/Arch/ARM.cpp:1306:31: warning: left operand of comma operator has no effect [-Wunused-value]
This commit provides linker support for Cortex-M Security Extensions (CMSE).
The specification for this feature can be found in ARM v8-M Security Extensions:
Requirements on Development Tools.
The linker synthesizes a security gateway veneer in a special section;
`.gnu.sgstubs`, when it finds non-local symbols `__acle_se_<entry>` and `<entry>`,
defined relative to the same text section and having the same address. The
address of `<entry>` is retargeted to the starting address of the
linker-synthesized security gateway veneer in section `.gnu.sgstubs`.
In summary, the linker translates input:
```
.text
entry:
__acle_se_entry:
[entry_code]
```
into:
```
.section .gnu.sgstubs
entry:
SG
B.W __acle_se_entry
.text
__acle_se_entry:
[entry_code]
```
If addresses of `__acle_se_<entry>` and `<entry>` are not equal, the linker
considers that `<entry>` already defines a secure gateway veneer so does not
synthesize one.
If `--out-implib=<out.lib>` is specified, the linker writes the list of secure
gateway veneers into a CMSE import library `<out.lib>`. The CMSE import library
will have 3 sections: `.symtab`, `.strtab`, `.shstrtab`. For every secure gateway
veneer <entry> at address `<addr>`, `.symtab` contains a `SHN_ABS` symbol `<entry>` with
value `<addr>`.
If `--in-implib=<in.lib>` is specified, the linker reads the existing CMSE import
library `<in.lib>` and preserves the entry function addresses in the resulting
executable and new import library.
Reviewed By: MaskRay, peter.smith
Differential Revision: https://reviews.llvm.org/D139092
Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr
indirection. We can move other global variables into ctx without
indirection concern. In the long term we may consider passing Ctx
as a parameter to various functions and eliminate global state as
much as possible and then remove `Ctx::reset`.
This change renames this method match its original name and the name
used in the wasm linker.
Back in d8f8abbd4a the ELF SymbolTable
method `getSymbols()` was replaced with `forEachSymbol`.
Then in a2fc964417 `forEachSymbol` was
replaced with a `llvm::iterator_range`.
Then in e9262edf0d we came full circle
and the `llvm::iterator_range` was replaced with a `symbols()` accessor
that was identical the original `getSymbols()`.
`getSymbols` also matches the name used elsewhere in the ELF linker as
well as in both COFF and wasm backend (e.g. `InputFiles.h` and
`SyntheticSections.h`)
Differential Revision: https://reviews.llvm.org/D130787
This simplifies code, removes a read32 (for id==0 check), and makes it feasible
to combine some operations in EhInputSection::split and EhFrameSection::addRecords.
Mostly NFC, but fixes "Relocation not in any piece" assertion failure in an
erroneous case when a relocation offset precedes all CIE/FDE pices.
inputSections temporarily contains EhInputSection objects mainly for
combineEhSections. Place EhInputSection objects into a new vector
ehInputSections instead of inputSections.
Symbol.h depends on InputFiles.h. This change moves us toward dropping the
weird dependency.
The call sites will become slightly uglier (`cast<SharedFile>(s->file)`), but
the compromise is acceptable.
In many call sites we know uncompression cannot happen (non-SHF_ALLOC, or the
data (even if compressed) must have been uncompressed by a previous pass).
Prefer rawData in these cases. data() increases code size and prevents
optimization on rawData.
Previously an InputSectionBase is dead (`partition==0`) by default.
SyntheticSection calls markLive and BssSection overrides that with markDead.
It is more natural to make InputSectionBase live by default and let
--gc-sections mark InputSectionBase dead.
When linking a Release build of clang:
* --no-gc-sections:, the removed `inputSections` loop decreases markLive time from 4ms to 1ms.
* --gc-sections: the extra `inputSections` loop increases markLive time from 0.181296s to 0.188526s.
This is as of we lose the removing one `inputSections` loop optimization (4374824ccf).
I believe the loss can be mitigated if we refactor markLive.
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.
See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html
Differential Revision: https://reviews.llvm.org/D108850
Older Go cmd/link used SHT_PROGBITS for .init_array .
Work around the lack of https://golang.org/cl/373734 for a while.
It does not generate .fini_array or .preinit_array
For `InputSection` `.foo`, its `InputBaseSection::{areRelocsRela,firstRelocation,numRelocation}` basically
encode the information of `.rel[a].foo`. However, one uint32_t (the relocation section index)
suffices. See the implementation of `relsOrRelas`.
This change decreases sizeof(InputSection) from 184 to 176 on 64-bit Linux.
The maximum resident set size linking a large application (1.2G output) decreases by 0.39%.
Differential Revision: https://reviews.llvm.org/D112513
clang may place dynamic initializations for explicitly specialized class
template static data members in comdat.
Such in-comdat SHT_INIT_ARRAY was an abuse but we have to work around it for a while.
This generalizes D70146 (SHT_NOTE) to more reserved sections and makes our rules
more consistent. Now SHF_GROUP is more similar to SHF_LINK_ORDER.
For SHT_INIT_ARRAY/SHT_FINI_ARRAY, the rule will be closer to PE/COFF link.exe.
Previously sanitizers use llvm.global_ctors to make module_ctor a GC
root, which is considered an abuse.
https://groups.google.com/g/generic-abi/c/TpleUEkNoQI
We can squeak through on compatibility issues because compilers otherwise don't
use SHF_GROUP special sections.
Change the default to facilitate GC for metadata section usage, so that they
don't need SHF_LINK_ORDER or SHF_GROUP just to drop the unhelpful rule (if they
want to be unconditionally retained, use SHF_GNU_RETAIN
(`__attribute__((retain))`) or linker script `KEEP`).
The dropped SHF_GROUP special case makes the behavior of -z start-stop-gc and -z
nostart-stop-gc closer to GNU ld>=2.37 (https://sourceware.org/PR27451).
However, we default to -z start-stop-gc (which actually matches more closely to
GNU ld before 2015-10 https://sourceware.org/PR19167), which is different from
modern GNU ld (which has the unhelpful rule to work around glibc). As a
compensation, we special case `__libc_` sections as a workaround for glibc<2.34
(https://sourceware.org/PR27492).
Since -z start-stop-gc as the default actually matches the traditional GNU ld
behavior, there isn't much to be aware of. There was a systemd usage which has
been fixed by https://github.com/systemd/systemd/pull/19144
For one metadata section usage, each text section references a metadata section.
The metadata sections have a C identifier name to allow the runtime to collect them via `__start_/__stop_` symbols.
Since `__start_`/`__stop_` references are always present from live sections, the
C identifier name sections appear like GC roots, which means they cannot be
discarded by `ld --gc-sections`.
To make such sections GCable, either SHF_LINK_ORDER or a section group is needed.
SHF_LINK_ORDER is not suitable for the references can be inlined into other functions
(See D97430:
Function A (in the section .text.A) references its `__sancov_guard` section.
Function B inlines A (so now .text.B references `__sancov_guard` - this is invalid with the semantics of SHF_LINK_ORDER).
In the linking stage,
if `.text.A` gets discarded, and `__sancov_guard` is retained via the reference from `.text.B`,
the output will be invalid because `__sancov_guard` references the discarded `.text.A`.
LLD errors "sh_link points to discarded section".
)
A section group have size overhead, and is cumbersome when there is just one metadata section.
Add `-z start-stop-gc` to drop the "__start_/__stop_ references retain
non-SHF_LINK_ORDER non-SHF_GROUP C identifier name sections" rule.
We reserve the rights to switch the default in the future.
Reviewed By: phosek, jrtc27
Differential Revision: https://reviews.llvm.org/D96914
The special root semantics for identifier-named sections is meant
specifically for the metadata sections. In the context of group
semantics, where group members are always retained or discarded as a
unit, it's natural not to have this semantics apply to a section in a
group, otherwise we would never discard the group defeating the purpose
of using the group in the first place.
This change modifies the GC behavior so that __start_/__stop_ references
don't retain C identifier named sections in section groups which allows
for these groups to be collected. This matches the behavior of BFD ld.
The only kind of existing case that might break is interdependent
metadata sections that are all in a group together, but that group
doesn't contain any other sections referenced by anything except
implicit inclusion in a `__start_` and/or `__stop_`-referenced
identifier-named section, but such cases should be unlikely.
Differential Revision: https://reviews.llvm.org/D96753