The motivating use case is to support import the function declaration
across modules to construct call graph edges for indirect calls [1]
when importing the function definition costs too much compile time
(e.g., the function is too large has no `noinline` attribute).
1. Currently, when the compiled IR module doesn't have a function
definition but its postlink combined summary contains the function
summary or a global alias summary with this function as aliasee, the
function definition will be imported from source module by IRMover. The
implementation is in FunctionImporter::importFunctions [2]
2. In order for FunctionImporter to import a declaration of a function,
both function summary and alias summary need to carry the def / decl
state. Specifically, all existing summary fields doesn't differ across
import modules, but the def / decl state of is decided by
`<ImportModule, Function>`.
This change encodes the def/decl state in `GlobalValueSummary::GVFlags`.
In the subsequent changes
1. The indexing step `computeImportForModule` [3]
will compute the set of definitions and the set of declarations for each
module, and passing on the information to bitcode writer.
2. Bitcode writer will look up the def/decl state and sets the state
when it writes out the flag value. This is demonstrated in
https://github.com/llvm/llvm-project/pull/87600
3. Function importer will read the def/decl state when reading the
combined summary to figure out two sets of global values, and IRMover
will be updated to import the declaration (aka linkGlobalValuePrototype [4])
into the destination module.
- The next change is https://github.com/llvm/llvm-project/pull/87600
[1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5
[2] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L1608-L1764)
[3] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L856)
[4] 3b337242ee/llvm/lib/Linker/IRMover.cpp (L605)
Avoid replacing replacing a chunk with one from a different type. It's
mostly a concern for ARM64X, where we don't want to merge aarch64 and
arm64ec chunks, but it may also in theory happen between arm64ec and
x86_64 chunks.
When archive member extraction involving ENTRY happens after
`addScriptReferencedSymbolsToSymTable`,
`addScriptReferencedSymbolsToSymTable` may fail to define some PROVIDE
symbols used by ENTRY. This is an edge case that regressed after #84512.
(The interaction with PROVIDE and ENTRY-in-archive was not considered
before).
While here, also ensure that --undefined-glob extracted object files
are parsed before `addScriptReferencedSymbolsToSymTable`.
Fixes: ebb326a51f
Pull Request: https://github.com/llvm/llvm-project/pull/87530
Adds support for ARM64EC, which should use the same search paths as
ARM64.
It's similar to #87370 and #87495. The test is based on the existing x86
test. Generally ARM64EC libraries are shipped together with native ARM64
libraries (using ECSYMBOLS section mechanism).
getMachineArchType uses Triple::thumb, while the existing
implementation uses Triple::arm. It's ultimately passed to
MSVCPaths.cpp functions, so modify them to accept both forms.
The unstable partition in partitionRels might reverse IRELATIVE
relocations, so stable_partition in computeRels would lead to IRELATIVE
relocations ordered by decreasing offset. Use stable_partition in
partitionRels to get IRELATIVE relocations ordered by increasing offset.
For a DSO with all DT_NEEDED entries accounted for, if it contains an
undefined non-weak symbol that shares a name with a non-exported
definition (hidden visibility or localized by a version script), and
there is no DSO definition, we should report an error.
#70769 implemented the error when we see `ref.so def-hidden.so`. This patch
implementes the error when we see `def-hidden.so ref.so`, matching GNU
ld.
Close#86777
lld/test/MachO/objc-relative-method-lists-simple.s fails on AArch64
hosts running 32-bit ARM binaries, such as
armv8l-unknown-linux-gnueabihf.
Disable the test on the failing targets for now, to keep the
buildbots passing.
This is a new ld64 flag (along with `-warn_duplicate_libraries`), where
the warning is enabled by default, and it can be useful to ignore since
it can be hard to dedup library flags across large builds. This doesn't
ignore the enabling version since if someone manually passed that and
lld didn't respect it, we probably want the user to know that.
The CloudABI (removed from Clang Driver) change from
https://reviews.llvm.org/D29982 does not make sense. GNU ld and gold
don't create dynamic sections for a non-PIC static link when
--export-dynamic is specified.
Creating dynamic sections is harmful in this scenario because we would
consider undefined weak symbols preemptible and generate GLOB_DAT
relocations, breaking the expectation that non-PIC static links only
contain IRELATIVE relocations.
In addition, there are other options that export symbols
(--export-dynamic-symbol, --dynamic-list, etc). It does not make sense
to special case --export-dynamic.
Previously, `makeSyntheticInputSection` would create a new
`ConcatInputSection` without setting `live` explicitly for it. Without
`-dead_strip` this would be OK since `live` would default to `true`.
However, with `-dead_strip`, `live` would default to false, and it would
remain set to `false`.
This hasn't resulted in any issues so far since no code paths that
exposed this issue were present.
However a recent change - ObjC relative method lists
(https://github.com/llvm/llvm-project/pull/86231) exposes this issue by
creating relocations to the `SyntheticInputSection`.
When these relocations are attempted to be written, this ends up with a
crash(assert), since the `SyntheticInputSection` they refer to is marked
as dead (`live` = `false`).
With this change, we set the correct behavior - `live` will always be
`true`. We add a test case that before this change would trigger an
assert in the linker.
The MachO format supports relative offsets for ObjC method lists. This
support is present already in ld64. With this change we implement this
support in lld also.
Relative method lists can be identified by a specific flag (0x80000000)
in the method list header. When this flag is present, the method list
will contain 32-bit relative offsets to the current Program Counter
(PC), instead of absolute pointers.
Additionally, when relative method lists are used, the offset to the
selector name will now be relative and point to the selector reference
(selref) instead of the name itself.
Current Bionic processes relocations in this order:
* DT_ANDROID_REL[A]
* DT_RELR
* DT_REL[A]
* DT_JMPREL
If an IRELATIVE relocation is in DT_ANDROID_REL[A], it would read
unrelocated (incorrect) global variables associated with RELR when
--pack-dyn-relocs=android+relr is enabled. Work around this by placing
IRELATIVE in .rel[a].plt (DT_JMPREL).
Link: https://r.android.com/3014185
#78772 added similar support for .def file parser and import library
writer. This PR adds missing bits in LLD to propagate EXPORTAS name and
allow it in `/export` parser. This is syntax is used by MSVC for ARM64EC
`__declspec(dllexport)` handling.
Previously, linker was unnecessarily including a PROVIDE symbol which
was referenced by another unused PROVIDE symbol. For example, if a
linker script contained the below code and 'not_used_sym' provide symbol
is not included, then linker was still unnecessarily including 'foo' PROVIDE
symbol because it was referenced by 'not_used_sym'. This commit fixes
this behavior.
PROVIDE(not_used_sym = foo)
PROVIDE(foo = 0x1000)
This commit fixes this behavior by using dfs-like algorithm to find
all the symbols referenced in provide expressions of included provide
symbols.
This commit also fixes the issue of unused section not being garbage-collected
if a symbol of the section is referenced by an unused PROVIDE symbol.
Closes#74771Closes#84730
Co-authored-by: Fangrui Song <i@maskray.me>
`relaIplt` was added so that IRELATIVE relocations are placed at the end
of .rela.dyn (since https://reviews.llvm.org/D65651) or .rela.plt
(--pack-dyn-relocs=android[+relr]). Unfortunately, handling `relaIplt`
requires special cases all over the code base. We can extend
partitionRels/computeRels to partition both RELATIVE and IRELATIVE
relocations, rendering `relaIplt` unneeded.
The change allows IRELATIVE relocations in the DT_ANDROID_REL[A] table
(untested?!), which may be processed before other types of relocations.
This seems acceptable for Bionic's DEFINE_IFUNC_FOR use cases.
In addition, this change simplies changing .rel[a].dyn to a compact
relocation format (CREL).
SHF_INFO_LINK is removed from .rel[a].dyn with IRELATIVE relocations.
(See https://reviews.llvm.org/D89828).
The existing implementation didn't handle when the input text section
was some offset from the output section.
This resulted in an assert in relaxGot() with an lld built with asserts
for some large binaries, or even worse, a silently broken binary with an
lld without asserts.
When adding fixups for RISCV_TLSDESC_ADD_LO and RISCV_TLSDESC_LOAD_LO,
the local label added for RISCV TLSDESC relocations have STT_TLS set,
which is incorrect. Instead, these labels should have `STT_NOTYPE`.
This patch stops adding such fixups and avoid setting the STT_TLS on
these symbols. Failing to do so can cause LLD to emit an error `has an
STT_TLS symbol but doesn't have an SHF_TLS section`. We additionally,
adjust how LLD services these relocations to avoid errors with
incompatible relocation and symbol types.
Reviewers: topperc, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/85817
Currently, we mistakenly mark the local labels used in RISC-V TLSDESC as
TLS symbols, when they should not be. This patch adds tests with the
current incorrect behavior, and subsequent patches will address the
issue.
Reviewers: MaskRay, topperc
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/85816
Following the commit of #83972 which added COFF support for SPGO, this
patch ports the support of the option -lto-sample-profile that was only
available in the ELF variant of LLD to the COFF variant to enable
running the SPGO passes in the LTO/thinLTO pipelines.
As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.
Currently, inrange is specified as follows:
```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```
The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:
```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```
This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:
```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```
The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.
This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.
This change adds a flag to lld to enable category merging for MachoO +
ObjC.
It adds the '-objc_category_merging' flag for enabling this option and
uses the existing '-no_objc_category_merging' flag for disabling it.
In ld64, this optimization is enabled by default, but in lld, for now,
we require explicitly passing the '-objc_category_merging' flag in order
to enable it.
Behavior: if in the same link unit, multiple categories are extending
the same class, then they get merged into a single category.
Ex: Cat1(method1+method2,protocol1) + Cat2(method3+method4,protocol2,
property1) = Cat1_2(method1+method2+method3+method4,
protocol1+protocol2, property1)
Notes on implementation decisions made in this diff:
There is a possibility to further improve the current implementation by
directly merging the category data into the base class (if the base
class is present in the link unit) - this improvement may be done as a
follow-up. This improved functionality is already present in ld64.
We do the merging on the raw inputSections - after dead-stripping
(categories can't be dead stripped anyway).
The changes are mostly self-contained to ObjC.cpp, except for adding a
new flag (linkerOptimizeReason) to ConcatInputSection and StringPiece to
mark that this data has been optimized away. Another way to do it would
have been to just mark the pieces as not 'live' but this would cause the
old symbols to show up in the linker map as being dead-stripped - even
if dead-stripping is disabled. This flag allows us to match the ld64
behavior.
Note: This is a re-land of
https://github.com/llvm/llvm-project/pull/82928 after fixing using
already freed memory in `generatedSectionData`. This issue was detected
by ASAN build.
---------
Co-authored-by: Alex B <alexborcan@meta.com>
GNU readelf prints a blank line before the first hex/string dump, which
serves as a separator when there are other dump operations. Port the
behavior.
Pull Request: https://github.com/llvm/llvm-project/pull/85744
This change adds a flag to lld to enable category merging for MachoO +
ObjC.
It adds the '-objc_category_merging' flag for enabling this option and
uses the existing '-no_objc_category_merging' flag for disabling it.
In ld64, this optimization is enabled by default, but in lld, for now,
we require explicitly passing the '-objc_category_merging' flag in order
to enable it.
Behavior: if in the same link unit, multiple categories are extending
the same class, then they get merged into a single category.
Ex: `Cat1(method1+method2,protocol1) + Cat2(method3+method4,protocol2,
property1) = Cat1_2(method1+method2+method3+method4,
protocol1+protocol2, property1)`
Notes on implementation decisions made in this diff:
1. There is a possibility to further improve the current implementation
by directly merging the category data into the base class (if the base
class is present in the link unit) - this improvement may be done as a
follow-up. This improved functionality is already present in ld64.
2. We do the merging on the raw inputSections - after dead-stripping
(categories can't be dead stripped anyway).
3. The changes are mostly self-contained to ObjC.cpp, except for adding
a new flag (linkerOptimizeReason) to ConcatInputSection and StringPiece
to mark that this data has been optimized away. Another way to do it
would have been to just mark the pieces as not 'live' but this would
cause the old symbols to show up in the linker map as being
dead-stripped - even if dead-stripping is disabled. This flag allows us
to match the ld64 behavior.
---------
Co-authored-by: Alex B <alexborcan@meta.com>
Unknown section sections may require special linking rules, and
rejecting such sections for older linkers may be desired. For example,
if we introduce a new section type to replace a control structure (e.g.
relocations), it would be nice for older linkers to reject the new
section type. GNU ld allows certain unknown section types:
* [SHT_LOUSER,SHT_HIUSER] and non-SHF_ALLOC
* [SHT_LOOS,SHT_HIOS] and non-SHF_OS_NONCONFORMING
but reports errors and stops linking for others (unless
--no-warn-mismatch is specified). Port its behavior. For convenience, we
additionally allow all [SHT_LOPROC,SHT_HIPROC] types so that we don't
have to hard code all known types for each processor.
Close https://github.com/llvm/llvm-project/issues/84812
#69295 demoted Defined symbols relative to discarded sections.
If such a symbol is unreferenced, the desired behavior is to
eliminate it from .symtab just like --gc-sections discarded
definitions.
Linux kernel's CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y configuration expects
that the unreferenced `unused` is not emitted to .symtab
(https://github.com/ClangBuiltLinux/linux/issues/2006).
For relocations referencing demoted symbols, the symbol index restores
to 0 like older lld (`R_X86_64_64 0` in `discard-section.s`).
Fix#85048
--compress-sections <section-glib>=[none|zlib|zstd] is similar to
--compress-debug-sections but applies to broader sections without the
SHF_ALLOC flag. lld will report an error if a SHF_ALLOC section is
matched. An interesting use case is to compress `.strtab`/`.symtab`,
which consume a significant portion of the file size (15.1% for a
release build of Clang).
An older revision is available at https://reviews.llvm.org/D154641 .
This patch focuses on non-allocated sections for safety. Moving
`maybeCompress` as D154641 does not handle STT_SECTION symbols for
`-r --compress-debug-sections=zlib` (see `relocatable-section-symbol.s`
from #66804).
Since different output sections may use different compression
algorithms, we need CompressedData::type to generalize
config->compressDebugSections.
GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27452
Link: https://discourse.llvm.org/t/rfc-compress-arbitrary-sections-with-ld-lld-compress-sections/71674
Pull Request: https://github.com/llvm/llvm-project/pull/84855
This makes a difference when linking executables with delay loaded
libraries for arm32; the delay loader implementation can load data from
the registry with instructions that assume alignment.
This issue does not show up when linking in MinGW mode, because a
PseudoRelocTableChunk gets injected, which also sets alignment, even if
the chunk itself is empty.
This allows handling importlibs produced by llvm-dlltool in #78772.
ARM64EC import libraries use it by default, but it's supported by MSVC
link.exe on other platforms too.
This also avoids assuming null-terminated input, like in #78769.
The convention is for such MC-specific options to reside in
MCTargetOptions. However, CompressDebugSections/RelaxELFRelocations do
not follow the convention: `CompressDebugSections` is defined in both
TargetOptions and MCAsmInfo and there is forwarding complexity.
Move the option to MCTargetOptions and hereby simplify the code. Rename
the misleading RelaxELFRelocations to X86RelaxRelocations. llvm-mc
-relax-relocations and llc -x86-relax-relocations can now be unified.
The lexer is overly permissive. When parsing file patterns in an input
section description and there is a missing `)`, we would accept many
non-sensible tokens (e.g. `}`) as patterns, leading to confusion, e.g.
`*(SORT_BY_ALIGNMENT(SORT_BY_NAME(.text*)) } PROVIDE_HIDDEN(__code_end = .)`
(#81804).
Ideally, the lexer should be stateful to report more errors like GNU ld
and get rid of hacks like `ScriptLexer::maybeSplitExpr`, but that would
require a large rewrite of the lexer. For now, just reject certain
non-wildcard meta characters to detect common mistakes.
Pull Request: https://github.com/llvm/llvm-project/pull/84130