Ctx was introduced in March 2022 as a more suitable place for such
singletons.
We now use default-initialization for `LinkerScript` and should pay
attention to non-class types (e.g. `dot` is initialized by commit
503907dc50).
Ctx was introduced in March 2022 as a more suitable place for such
singletons. ctx's hidden visibility optimizes generated instructions.
This change fixes a pitfall: certain ElfSym members (e.g.
globalOffsetTable, tlsModuleBase) were not zeroed and might be stale
when lld::elf::link was invoked the second time.
Ctx was introduced in March 2022 as a more suitable place for such
singletons. ctx's hidden visibility optimizes generated instructions.
bufferStart and tlsPhdr, which are not OutputSection, can now be moved
outside of `Out`.
GNU ld since 2.41 supports this option, which is mildly useful. It omits
the section header table and non-ALLOC sections (including
.symtab/.strtab (--strip-all)).
This option is simple to implement and might be used by LLDB to test
program headers parsing without the section header table (#100900).
-z sectionheader, which is the default, is also added.
Pull Request: https://github.com/llvm/llvm-project/pull/101286
This reverts commit 740161a9b9.
I moved the `ISD` dependencies into the CodeGen portion of the handling,
it's a little awkward but it's the easiest solution I can think of for
now.
Summary:
The LTO pass and LLD linker have logic in them that forces extraction
and prevent internalization of needed runtime calls. However, these
currently take all RTLibcalls into account, even if the target does not
support them. The target opts-out of a libcall if it sets its name to
nullptr. This patch pulls this logic out into a class in the header so
that LTO / lld can use it to determine if a symbol actually needs to be
kept.
This is important for targets like AMDGPU that want to be able to use
`lld` to perform the final link step, but does not want the overhead of
uncalled functions. (This adds like a second to the link time trivially)
Parsing the new input file's symbols might invalidate LTO codegen, but
the semantics of deplibs require them to be parsed. Accordingly, report
an error unless the file had already been added to the link.
Fixes#56070
Fix#98294.
When you specify --wrap=foo, sometimes foo is undefined in any context.
If you declare __real_foo as weak, GNU ld will not attempt to find the
strong symbol foo, instead, it generates a weak undefined symbol.
This pull request imitates this behavior by copying the binding
attribute from __real_foo to foo.
With `LLVM_APPEND_VC_REV=off`, the new version message after #97323
looks like:
```
% /tmp/out/custom2/bin/ld.lld --version
LLD 19.0.0, compatible with GNU linkers
```
A trailing comma after the version string might cause issues with
version detection tools that don't strip it, as seen in the Linux
kernel's scripts/ld-version.sh script.
Pull Request: https://github.com/llvm/llvm-project/pull/97942
Added LLVM repository and LLVM revision information for
`lld::getLLDVersion()`
Before this change:
```
hongyuchy@hongyuchy:~/llvm-project/.build_lld_version$ bin/ld.lld --version
LLD 19.0.0 (compatible with GNU linkers)
```
After this change with LLVM_APPEND_VC_REV=on
```
hongyuchy@hongyuchy:~/llvm-project/.build_lld_version$ bin/ld.lld --version
LLD 19.0.0 (https://github.com/yugier/llvm-project.git4134b33c6a), compatible with GNU linkers
```
with LLVM_APPEND_VC_REV=off
```
hongyuchy@hongyuchy:~/llvm-project/.build_lld_version$ bin/ld.lld --version
LLD 19.0.0, compatible with GNU linkers
```
Summary:
The LTO pass currently supporting emitting LTO via the
`--plugin-opt=emit-llvm` option. However, there is a very similar option
called `--lto-emit-asm`. This patch just makes the usage more
consistent and more obvious that emitting LLVM-IR is supported.
The first object file whose EI_OSABI is not ELFOSABI_NONE is selected.
This is useful for some OSes to identify themselves. This achieves
similar effects to BFD emulations `ld.lld -m *_fbsd` but is more
lightweight.
Pull Request: https://github.com/llvm/llvm-project/pull/97144
In GNU ld, -r forces -Bstatic and has precedence over -Bdynamic: -lfoo
probes libfoo.a but not libfoo.so, even if -Bdynamic is in effect. Our
behavior currently matches gold and probes libfoo.so. Since we don't
have strong opinion on the exact behavior, let's just follow GNU ld and
also unify the reason we report the "attempted static link of dynamic
object " error.
Close#94958
GNU ld's relocatable linking behaviors:
* Sections with the `SHF_GROUP` flag are handled like sections matched
by the `--unique=pattern` option. They are processed like orphan
sections and ignored by input section descriptions.
* Section groups' (usually named `.group`) content is updated as the
section indexes are updated. Section groups can be discarded with
`/DISCARD/ : { *(.group) }`.
`-r --force-group-allocation` discards section groups and allows
sections with the `SHF_GROUP` flag to be matched like normal sections.
If two section group members are placed into the same output section,
their relocation sections (if present) are combined as well.
This behavior can be useful when -r output is used as a pseudo shared
object (e.g., FreeBSD's amd64 kernel modules, CHERIoT compartments).
This patch implements --force-group-allocation:
* Input SHT_GROUP sections are discarded.
* Input sections do not get the SHF_GROUP flag, so `addInputSec`
will combine relocation sections if their relocated section group
members are combined.
The default behavior is:
* Input SHT_GROUP sections are retained.
* Input SHF_GROUP sections can be matched (unlike GNU ld)
* Input SHF_GROUP sections keep the SHF_GROUP flag, so `addInputSec`
will create different OutputDesc copies.
GNU ld provides the `FORCE_GROUP_ALLOCATION` command, which is not
implemented.
Pull Request: https://github.com/llvm/llvm-project/pull/94704
This adds the -z gcs and -z gcs-report options, which behave similarly
to -z shtk and -z cet-report, except that -z gcs accepts a parameter:
* -z gcs=implicit is the default behaviour, where the GCS bit is
inferred from the input objects.
* -z gcs=never clears the GCS bit, ignoring the input objects.
* -z gcs=always sets the GCS bit, ignoring the input objects.
This is so that there's a means of explicitly disabling GCS even when
all input objects have the GCS bit set.
When enabled, input sections that would otherwise overflow a memory
region are instead spilled to the next matching output section.
This feature parallels the one in GNU LD, but there are some differences
from its documented behavior:
- /DISCARD/ only matches previously-unmatched sections (i.e., the flag
does not affect it).
- If a section fails to fit at any of its matches, the link fails
instead of discarding the section.
- The flag --enable-non-contiguous-regions-warnings is not implemented,
as it exists to warn about such occurrences.
The implementation places stubs at possible spill locations, and
replaces them with the original input section when effecting spills.
Spilling decisions occur after address assignment. Sections are spilled
in reverse order of assignment, with each spill naively decreasing the
size of the affected memory regions. This continues until the memory
regions are brought back under size. Spilling anything causes another
pass of address assignment, and this continues to fixed point.
Spilling after rather than during assignment allows the algorithm to
consider the size effects of unspillable input sections that appear
later in the assignment. Otherwise, such sections (e.g. thunks) may
force an overflow, even if spilling something earlier could have avoided
it.
A few notable feature interactions occur:
- Stubs affect alignment, ONLY_IF_RO, etc, broadly as if a copy of the
input section were actually placed there.
- SHF_MERGE synthetic sections use the spill list of their first
contained input section (the one that gives the section its name).
- ICF occurs oblivious to spill sections; spill lists for merged-away
sections become inert and are removed after assignment.
- SHF_LINK_ORDER and .ARM.exidx are ordered according to the final
section ordering, after all spilling has completed.
- INSERT BEFORE/AFTER and OVERWRITE_SECTIONS are explicitly disallowed.
When enabled, input sections that would otherwise overflow a memory
region are instead spilled to the next matching output section.
This feature parallels the one in GNU LD, but there are some differences
from its documented behavior:
- /DISCARD/ only matches previously-unmatched sections (i.e., the flag
does not affect it).
- If a section fails to fit at any of its matches, the link fails
instead of discarding the section.
- The flag --enable-non-contiguous-regions-warnings is not implemented,
as it exists to warn about such occurrences.
The implementation places stubs at possible spill locations, and
replaces them with the original input section when effecting spills.
Spilling decisions occur after address assignment. Sections are spilled
in reverse order of assignment, with each spill naively decreasing the
size of the affected memory regions. This continues until the memory
regions are brought back under size. Spilling anything causes another
pass of address assignment, and this continues to fixed point.
Spilling after rather than during assignment allows the algorithm to
consider the size effects of unspillable input sections that appear
later in the assignment. Otherwise, such sections (e.g. thunks) may
force an overflow, even if spilling something earlier could have avoided
it.
A few notable feature interactions occur:
- Stubs affect alignment, ONLY_IF_RO, etc, broadly as if a copy of the
input section were actually placed there.
- SHF_MERGE synthetic sections use the spill list of their first
contained input section (the one that gives the section its name).
- ICF occurs oblivious to spill sections; spill lists for merged-away
sections become inert and are removed after assignment.
- SHF_LINK_ORDER and .ARM.exidx are ordered according to the final
section ordering, after all spilling has completed.
- INSERT BEFORE/AFTER and OVERWRITE_SECTIONS are explicitly disallowed.
zstd excels at scaling from low-ratio-very-fast to
high-ratio-pretty-slow. Some users prioritize speed and prefer disk read
speed, while others focus on achieving the highest compression ratio
possible, similar to traditional high-ratio codecs like LZMA.
Add an optional `level` to `--compress-sections` (#84855) to cater to
these diverse needs. While we initially aimed for a one-size-fits-all
approach, this no longer seems to work.
(https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html)
When --compress-debug-sections is used together, make
--compress-sections take precedence since --compress-sections is usually
more specific.
Remove the level distinction between -O/-O1 and -O2 for
--compress-debug-sections=zlib for a more consistent user experience.
Pull Request: https://github.com/llvm/llvm-project/pull/90567
GNU ld added --default-script (alias: -dT) in 2007. The option specifies
a default script that is processed if --script/-T is not specified. -dT
can be used to override GNU ld's internal linker script, but only when
the application does not specify -T.
In addition, dynamorio's CMakeLists.txt may use -dT.
The implementation is simple and the feature can be useful to dabble
with different section layouts.
Pull Request: https://github.com/llvm/llvm-project/pull/89327
`clang -g -gpubnames` (with optional -gsplit-dwarf) creates the
`.debug_names` section ("per-CU" index). By default lld concatenates
input `.debug_names` sections into an output `.debug_names` section.
LLDB can consume the concatenated section but the lookup performance is
not good.
This patch adds --debug-names to create a per-module index by combining
the per-CU indexes into a single index that covers the entire load
module. The produced `.debug_names` is a replacement for `.gdb_index`.
Type units (-fdebug-types-section) are not handled yet.
Co-authored-by: Fangrui Song <i@maskray.me>
---------
Co-authored-by: Fangrui Song <i@maskray.me>
When archive member extraction involving ENTRY happens after
`addScriptReferencedSymbolsToSymTable`,
`addScriptReferencedSymbolsToSymTable` may fail to define some PROVIDE
symbols used by ENTRY. This is an edge case that regressed after #84512.
(The interaction with PROVIDE and ENTRY-in-archive was not considered
before).
While here, also ensure that --undefined-glob extracted object files
are parsed before `addScriptReferencedSymbolsToSymTable`.
Fixes: ebb326a51f
Pull Request: https://github.com/llvm/llvm-project/pull/87530
The CloudABI (removed from Clang Driver) change from
https://reviews.llvm.org/D29982 does not make sense. GNU ld and gold
don't create dynamic sections for a non-PIC static link when
--export-dynamic is specified.
Creating dynamic sections is harmful in this scenario because we would
consider undefined weak symbols preemptible and generate GLOB_DAT
relocations, breaking the expectation that non-PIC static links only
contain IRELATIVE relocations.
In addition, there are other options that export symbols
(--export-dynamic-symbol, --dynamic-list, etc). It does not make sense
to special case --export-dynamic.
Previously, linker was unnecessarily including a PROVIDE symbol which
was referenced by another unused PROVIDE symbol. For example, if a
linker script contained the below code and 'not_used_sym' provide symbol
is not included, then linker was still unnecessarily including 'foo' PROVIDE
symbol because it was referenced by 'not_used_sym'. This commit fixes
this behavior.
PROVIDE(not_used_sym = foo)
PROVIDE(foo = 0x1000)
This commit fixes this behavior by using dfs-like algorithm to find
all the symbols referenced in provide expressions of included provide
symbols.
This commit also fixes the issue of unused section not being garbage-collected
if a symbol of the section is referenced by an unused PROVIDE symbol.
Closes#74771Closes#84730
Co-authored-by: Fangrui Song <i@maskray.me>
Fixes: 36146d2b6b
When `doParseFile template defintion` in InputFiles.cpp is optimized
out, we will get a link failure. Actually, we can move the file parsing
loop from Driver.too to InputFiles.cpp and merge it with
parseArmCMSEImportLib.
Unknown section sections may require special linking rules, and
rejecting such sections for older linkers may be desired. For example,
if we introduce a new section type to replace a control structure (e.g.
relocations), it would be nice for older linkers to reject the new
section type. GNU ld allows certain unknown section types:
* [SHT_LOUSER,SHT_HIUSER] and non-SHF_ALLOC
* [SHT_LOOS,SHT_HIOS] and non-SHF_OS_NONCONFORMING
but reports errors and stops linking for others (unless
--no-warn-mismatch is specified). Port its behavior. For convenience, we
additionally allow all [SHT_LOPROC,SHT_HIPROC] types so that we don't
have to hard code all known types for each processor.
Close https://github.com/llvm/llvm-project/issues/84812
--compress-sections <section-glib>=[none|zlib|zstd] is similar to
--compress-debug-sections but applies to broader sections without the
SHF_ALLOC flag. lld will report an error if a SHF_ALLOC section is
matched. An interesting use case is to compress `.strtab`/`.symtab`,
which consume a significant portion of the file size (15.1% for a
release build of Clang).
An older revision is available at https://reviews.llvm.org/D154641 .
This patch focuses on non-allocated sections for safety. Moving
`maybeCompress` as D154641 does not handle STT_SECTION symbols for
`-r --compress-debug-sections=zlib` (see `relocatable-section-symbol.s`
from #66804).
Since different output sections may use different compression
algorithms, we need CompressedData::type to generalize
config->compressDebugSections.
GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27452
Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674
Pull Request: https://github.com/llvm/llvm-project/pull/84855
https://reviews.llvm.org/D150510 places .lrodata before .rodata to
minimize the number of permission transitions in the memory image.
However, this layout is less ideal for -fno-pic code (which is still
important).
Small code model -fno-pic code has R_X86_64_32S relocations with a range
of `[0,2**31)` (if we ignore the negative area). Placing `.lrodata`
earlier exerts relocation pressure on such code. Non-x86 64-bit
architectures generally have a similar `[0,2**31)` limitation if they
don't use PC-relative relocations.
If we place .lrodata later, we will need one extra PT_LOAD. Two layouts
are appealing:
* .bss/.lbss/.lrodata/.ldata (GNU ld)
* .bss/.ldata/.lbss/.lrodata
The GNU ld layout has the nice property that there is only one BSS
(except .tbss/.relro_padding). Add -z lrodata-after-bss to support
this layout.
Since a read-only PT_LOAD segment (for large data sections) may appear
after RW PT_LOAD segments. The placement of `_etext` has to be adjusted.
Pull Request: https://github.com/llvm/llvm-project/pull/81224
This patch adds full support for linking SystemZ (ELF s390x) object
files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.
In addition to new platform code and the obvious changes, there were a
few additional changes to common code:
- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
to use a platform-specific name since nothing in the definition of these
relocs is actually platform-specific; it is well possible that other
platforms will need the same.
- A couple of tweaks to TLS relocation handling, as the particular
semantics of the s390x versions differ slightly. See comments in the
code.
This was tested by building and testing >1500 Fedora packages, with only
a handful of failures; as these also have issues when building with LLD
on other architectures, they seem unrelated.
Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
Today `-split-machine-functions` and `-fbasic-block-sections={all,list}`
cannot be combined with `-basic-block-sections=labels` (the labels
option will be ignored).
The inconsistency comes from the way basic block address map -- the
underlying mechanism for basic block labels -- encodes basic block
addresses
(https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html).
Specifically, basic block offsets are computed relative to the function
begin symbol. This relies on functions being contiguous which is not the
case for MFS and basic block section binaries. This means Propeller
cannot use binary profiles collected from these binaries, which limits
the applicability of Propeller for iterative optimization.
To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section
binaries, we propose modifying the encoding of this section as follows.
First let us review the current encoding which emits the address of each
function and its number of basic blocks, followed by basic block entries
for each basic block.
| | |
|--|--|
| Address of the function | Function Address |
| Number of basic blocks in this function | NumBlocks |
| BB entry 1
| BB entry 2
| ...
| BB entry #NumBlocks
To make this work for basic block sections, we treat each basic block
section similar to a function, except that basic block sections of the
same function must be encapsulated in the same structure so we can map
all of them to their single function.
We modify the encoding to first emit the number of basic block sections
(BB ranges) in the function. Then we emit the address map of each basic
block section section as before: the base address of the section, its
number of blocks, and BB entries for its basic block. The first section
in the BB address map is always the function entry section.
| | |
|--|--|
| Number of sections for this function | NumBBRanges |
| Section 1 begin address | BaseAddress[1] |
| Number of basic blocks in section 1 | NumBlocks[1] |
| BB entries for Section 1
|..................|
| Section #NumBBRanges begin address | BaseAddress[NumBBRanges] |
| Number of basic blocks in section #NumBBRanges |
NumBlocks[NumBBRanges] |
| BB entries for Section #NumBBRanges
The encoding of basic block entries remains as before with the minor
change that each basic block offset is now computed relative to the
begin symbol of its containing BB section.
This patch adds a new boolean codegen option `-basic-block-address-map`.
Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD
flag `--lto-basic-block-address-map` are introduced.
Analogously, we add a new TargetOption field `BBAddrMap`. This means BB
address maps are either generated for all functions in the compiling
unit, or for none (depending on `TargetOptions::BBAddrMap`).
This patch keeps the functionality of the old
`-fbasic-block-sections=labels` option but does not remove it. A
subsequent patch will remove the obsolete option.
We refactor the `BasicBlockSections` pass by separating the BB address
map and BB sections handing to their own functions (named
`handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers
basic blocks and places them in their assigned sections.
`handleBBAddrMap` is invoked after `handleBBSections` (if requested) and
only renumbers the blocks.
- New tests added:
- Two tests basic-block-address-map-with-basic-block-sections.ll and
basic-block-address-map-with-mfs.ll to exercise the combination of
`-basic-block-address-map` with `-basic-block-sections=list` and
'-split-machine-functions`.
- A driver sanity test for the `-fbasic-block-address-map` option
(basic-block-address-map.c).
- An LLD test for testing the `--lto-basic-block-address-map` option.
This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`.
- Renamed and modified the two existing codegen tests for basic block
address map (`basic-block-sections-labels-functions-sections.ll` and
`basic-block-sections-labels.ll`)
- Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of
`SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2
will happen in a separate PR in a few months.
Based on https://reviews.llvm.org/D45375 . Introduce a new InputFile
kind `InternalKind`, use it for
* `ctx.internalFile`: for linker-defined symbols and some synthesized
`Undefined`
* `createInternalFile`: for symbol assignments and --defsym
I picked "internal" instead of "synthetic" to avoid confusion with
SyntheticSection.
Currently a symbol's file is one of: nullptr, ObjKind, SharedKind,
BitcodeKind, BinaryKind. Now it's non-null (I plan to add an
`assert(file)` to Symbol::Symbol and change `toString(const InputFile
*)`
separately).
Debugging and error reporting gets improved. The immediate user-facing
difference is more descriptive "File" column in the --cref output. This
patch may unlock further simplification.
Currently each symbol assignment gets its own
`createInternalFile(cmd->location)`. Two symbol assignments in a linker
script do not share the same file. Making the file the same would be
nice, but would require non trivial code.
Maintaining the long list of known -z options
(https://reviews.llvm.org/D48621) turns out to be cumbersome. Go the
D48433 route instead.
max-page-size/common-page-size are claimed when `target` is available.
Inspired by: https://reviews.llvm.org/D48433