Move symbol names from directly under `objc` scope to
`objc::symbol_names`.
Ex: `objc::klass` -> `objc::symbol_names::klass`
Co-authored-by: Alex B <alexborcan@meta.com>
A distinction that doesn't _usually_ matter is that the
MachO::SymbolKind is really a mapping of entries in TBD files not
symbols. To better understand this, rename the enum so it represents an
encoding mapped to TBDs as opposed to symbols alone.
For example, it can be a bit confusing that "GlobalSymbol" is a enum
value when all of those values can represent a GlobalSymbol.
This patch renames {starts,ends}with to {starts,ends}_with for
consistency with std::{string,string_view}::{starts,ends}_with in
C++20. Since there are only a handful of occurrences, this patch
skips the deprecation phase and simply renames them.
When forming MachO STABS, this change detects if the DW_AT_name of the
compile unit is already absolute (as allowed by DWARF), and if so, does
not prepend DW_AT_comp_dir.
Fixes#70995
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
LLD resolves symbols before performing LTO compilation, assuming that the symbols in question are resolved by the resulting object files from LTO. However, there is a scenario where the prevailing symbols might be resolved incorrectly due to specific assembly symbols not appearing in the symbol table of the bitcode. This patch deals with such a scenario by generating an error instead of silently allowing a mis-linkage.
If a prevailing symbol is resolved through post-loaded archives via LC linker options, a warning will now be issued.
Reviewed By: #lld-macho, thevinster
Differential Revision: https://reviews.llvm.org/D158003
LLD resolves symbols regardless of LTO modes early when reading and parsing input files in order. The object files built from LTO passes are appended later.
Because LLD eagerly resolves the LC linker options while parsing a new object file (and its chain of dependent libraries), the prior decision on pending prevailing symbols (belonging to some bitcode files) can change to ones in those native libraries that are just loaded.
This patch delays processing LC linker options until all the native object files are added after LTO is done, similar to LD64. This way we preserve the decision on prevailing symbols LLD made, regardless of LTO modes.
- When parsing a new object file in `parseLinkerOptions()`, it just parses LC linker options in the header, and saves those contents to `unprocessedLCLinkerOptions`.
- After LTO is finished, `resolveLCLinkerOptions()` is called to recursively load dependent libraries, starting with initial linker options collected in `unprocessedLCLinkerOptions` (which also updates during recursions)
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D157716
Details:
calling getMemoryBufferRef() on an empty archive can trigger segfault so the code should check before calling this.
this seems like a bug in the Archive API but that can be fixed separately.
P.S: follow up to D156468
Differential Revision: https://reviews.llvm.org/D157300
Two changes:
- Avoid crashing in predicate functions.
Querying the property of the Symbols via these is*() functions shouldn't crash the program - the answer should just be "false".
Currently, having them throw UNREACHABLE already (incorrectly) crash certain code paths involving macho::validateSymbolRelocation() .
- Simply ignore input archives with incompatible arch (changes from PRESIDENT810@)
Differential Revision: https://reviews.llvm.org/D156468
We never really supported 32-bit ARM arch entirely, and partial support was added for
very specific features. Regardless, it fails to even link the most basic applications that at
this point, it might be better to move this arch as unsupported. Given that Apple will be
moving towards arm64 long term, I don't see any reason for anyone to invest time in
supporting this either, and for those who still need it should use apple's ld64 linker.
Fixes#62691
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D150544
In particular, make it `foo.a(foo.o)$ARCHIVE_OFFSET`. The goal is to
make it more similar to both ld64 implementation, which uses the
`foo.a(foo.o)$MODULE_ID` format. We dump some of these names in LTO
code, so matching ld64's format is helpful. This format is also more
similar to LLD-ELF's, which is `foo.a(foo.o at $ARCHIVE_OFFSET)`.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D148828
This implements ld64's checks for duplicate method names in categories &
classes.
In addition, this sets us up for implementing Obj-C category merging.
This diff handles the most of the parsing work; what's left is rewriting
those category / class structures.
Numbers for chromium_framework:
base diff difference (95% CI)
sys_time 2.182 ± 0.027 2.200 ± 0.047 [ -0.2% .. +1.8%]
user_time 6.451 ± 0.034 6.479 ± 0.062 [ -0.0% .. +0.9%]
wall_time 6.841 ± 0.048 6.885 ± 0.105 [ -0.1% .. +1.4%]
samples 33 22
Fixes https://github.com/llvm/llvm-project/issues/54912.
Issues seen with the previous land will be fixed in the next commit.
Reviewed By: #lld-macho, thevinster, oontvoo
Differential Revision: https://reviews.llvm.org/D142916
This is also what ld64 does. This will make it easier to compare their
respective map files.
Reviewed By: #lld-macho, thevinster
Differential Revision: https://reviews.llvm.org/D145654
This supersedes {D139069}. In some ways we are now closer to ld64's
behavior: we previously only did this coalescing for private-label
symbols, but now we do it for all locals, just like ld64. However, we no
longer generate weak binds when a local alias to a weak symbol is
referenced. This is merely for implementation simplicity; it's not clear
to me that any real-world programs depend on us emulating this behavior.
The problem with the previous approach is that we ended up with
duplicate references to the same symbol instance in our InputFiles,
which translated into duplicate symbols in our output. While we could
work around that problem by performing a dedup step before emitting the
symbol table, it seems cleaner to not generate duplicate references in
the first place.
Numbers for chromium_framework on my 16 Core Intel Mac Pro:
base diff difference (95% CI)
sys_time 2.243 ± 0.093 2.231 ± 0.066 [ -2.5% .. +1.4%]
user_time 6.529 ± 0.087 6.080 ± 0.050 [ -7.5% .. -6.3%]
wall_time 6.928 ± 0.175 6.474 ± 0.112 [ -7.7% .. -5.4%]
samples 26 31
Yep, that's a massive win... because it turns out that {D140606} and
{D139069} caused a regression (of about the same size.) I just didn't
think to measure them back then. I'm guessing all the extra symbols we
have been emitting did not help perf at all...
Reviewed By: lgrey
Differential Revision: https://reviews.llvm.org/D145455
This implements ld64's checks for duplicate method names in categories &
classes.
In addition, this sets us up for implementing Obj-C category merging.
This diff handles the most of the parsing work; what's left is rewriting
those category / class structures.
Numbers for chromium_framework:
base diff difference (95% CI)
sys_time 2.182 ± 0.027 2.200 ± 0.047 [ -0.2% .. +1.8%]
user_time 6.451 ± 0.034 6.479 ± 0.062 [ -0.0% .. +0.9%]
wall_time 6.841 ± 0.048 6.885 ± 0.105 [ -0.1% .. +1.4%]
samples 33 22
Fixes https://github.com/llvm/llvm-project/issues/54912.
Reviewed By: #lld-macho, thevinster, oontvoo
Differential Revision: https://reviews.llvm.org/D142916
At some point PlatformInfo's Target changed types to a type that also
has minimum deployment target info. This caused ambiguity if you tried
to get the target triple from the Target, as the actual minimum version
info was being stored separately. This bulk of this change is changing
the parsing of these values to support this.
Differential Revision: https://reviews.llvm.org/D145263
This nearly mirrors ld64's error for this case:
ld: warning: ignoring file path/to/file, file is universal (armv7,arm64) but does not contain the x86_64 architecture: path/to/file
Differential Revision: https://reviews.llvm.org/D141729
Adds support for the following flags:
* --thinlto-index-only, --thinlto-index-only=
* --thinlto-emit-imports-files
* --thinlto-emit-index-files
* --thinlto-object-suffix-replace=
* --thinlto-prefix-replace=
See https://blog.llvm.org/2016/06/thinlto-scalable-and-incremental-lto.html
for some words on --thinlto-index-only.
I don't really need the other flags, but they were in the vicinity
and _someone_ might need them, so I figured I'd add them too.
`-object_path_lto` now sets `c.AlwaysEmitRegularLTOObj` as in the other ports,
which means it can now only point to a filename for non-thin LTO.
I think that was the intent of D129705 anyways, so update
test/MachO/lto-object-path.ll to use a non-thin bitcode file for that test.
Differential Revision: https://reviews.llvm.org/D138451
This mirrors ld64's behavior. In many cases this likely still leads to a
link failure but if you didn't actually use anything from the library
it can be ignored. If you care about these invalid cases -fatal_warnings
still upgrades it back to an error.
Example: https://github.com/keith/ld64.lld/issues/3
Differential Revision: https://reviews.llvm.org/D141638
In https://reviews.llvm.org/D137982 we found that on Mach-O private
aliases could trigger an assert in lld when the aliasee was a
weak_def_can_be_hidden symbol.
This appears to be incorrect, and should be allowed in Mach-O.
Disallowing this behavior is also inconsistent with how ld64 handles
a private alias to weak_def_can_be_hidden symbols.
This patch removes the assert and tests that LLD handles such aliases
gracefully.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D141082
We can technically handle them, but since they shouldn't come up in any
real-world programs (since ld64 dedups strings unconditionally), there's
no reason to support them.
It's a thoroughly untested code path too -- as evidenced by the fact
that the only test this change breaks is one that verifies that we
reject relocations when dedup'ing. There is no test that covers the case
where we handle relocations in cstring sections when dedup is disabled.
Reviewed By: #lld-macho, oontvoo, keith, thakis
Differential Revision: https://reviews.llvm.org/D141025
Errors / warnings that originate from a particular file should be of the
form `$file: $message`.
Reviewed By: #lld-macho, keith
Differential Revision: https://reviews.llvm.org/D140634
This will enable us to re-land {D139069}.
The issue with the original diff was that we were folding all
private-label symbols. We were not merging the symbol flags during this
folding; instead we just made all references to the folded symbol point
to its aliasee. This caused some flags to be incorrectly discarded. This
surfaced as code that was incorrectly stripped due to LLD dropping the
`.no_dead_strip` flag.
This diff fixes things by only folding flag-less private-label aliases.
Most (maybe all) of the `ltmp<N>` symbols that are generated by the MC
aarch64 backend are flag-less, so this conservative folding behavior
does the job.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D140606
Previously by default, when not using `--ifc=`, lld would not
deduplicate string literals. This reveals reliance on undefined behavior
where string literal addresses are compared instead of using string
equality checks. While ideally you would be able to easily identify and
eliminate the reliance on this UB, this can be difficult, especially for
third party code, and increases the friction and risk of users migrating
to lld. This flips the default to deduplicate strings unless
`--no-deduplicate-strings` is passed, matching ld64's behavior.
Differential Revision: https://reviews.llvm.org/D140517
This reverts commit 09c5aab7f8.
New changes:
Temporarily skip checking the output bundle's cpu/cpu-subtype
Suspected there's a bug in selecting targets which caused the produced bundle
to be arm64 bundle (even though it was specifically linked with -arch x86_64).
The current test that link succeeded should be sufficient, because
it would have failed with "unable to find matching tagets" prior to this patch.
Differential Revision: https://reviews.llvm.org/D139572
This reverts commit 52a118d08f.
New changes:
Fix tests to dump both slices in the fat-archive because otool
isn't deterministic about which slice it prints across different archs.
(It printed x86 on x86 machines but arm64 on arm64, this was why
the test failed on arm64)
Differential Revision: https://reviews.llvm.org/D139572
This reverts commit 66692c822a.
New changes:
- update test to require aarch64
- update test to not hard-code cpu[sub] type values (since they could change)
- update test to temporarily skip windows (because llvm-mc on windows doesn't seem to work with triple arm64-apple-macos)
Differential Revision: https://reviews.llvm.org/D139572
Fixes#59162. The test has a comment explaining what's going on.
See also Symbol::extract() in lld/ELF/Symbols.cpp.
The included test sadly also passes if I pass just bd448f01a6,
while doing that isn't enough to make my bigger repro case work
(if I port just that, something else asserts later on, but with
this fix here everything's fine in my bigger repro).
Differential Revision: https://reviews.llvm.org/D139199
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
If we have two files with the same weak symbol like so:
```
ltmp0:
_weak:
<contents>
```
and
```
ltmp1:
_weak:
<contents>
```
Linking them together should leave only one copy of `<contents>`, not
two. Previously, we would keep around both copies because of the
private-label `ltmp<N>` symbols (i.e. symbols that start with `l`) -- we
would not coalesce those, so we would treat them as retaining the
contents.
This matters for more than just size -- we are depending upon this
behavior internally for emitting a certain file format. This file
format's header is repeated in each object file, but we want it to
appear just once in our output.
Why can't we not emit those aliases to `_weak`, or reference the
`ltmp<N>` symbols instead of `_weak`? Well, MC actually adds `ltmp<N>`
symbols as part of the assembly-to-binary translation step. So any
codegen at the clang level can't access them.
All that said... this solution is actually kind of hacky. Here, we avoid
creating the private-label symbols at parse time. This is acceptable
since we never emit those symbols in our output. However, in ld64, any
aliasing temporary symbols (ignored or otherwise) won't retain coalesced
data. But implementing this is harder -- we would have to create those
symbols first (so we can emit their names later), but we would have to
ensure the linker correctly shuffles them around when their aliasees get
coalesced.
Additionally, ld64 treats these temporary symbols as functionally
equivalent to the weak symbols themselves -- that is, it will emit weak
binds when those non-weak temporary aliases are referenced. We have
imitated this behavior for private-label symbols, but implementing it for
local aliases in general seems substantially more difficult. I'm not
sure if any programs actually depend on this behavior though, so maybe
it's a moot point.
Finally, ld64 does all this regardless of whether
`.subsections_via_symbols` is specified. We don't. But again, given how
rare the lack of that directive is (I've only seen it from hand-written
assembly inputs), I don't think we need to worry about it.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D139069
While extending the map file to cover unwind info, I realized we had two
issues with our EH_Frame symbols:
1. Their size was not set
2. We would create two EH_Frame symbols per frame when we only needed
one. This was because the Defined constructor would add the symbol
itself to InputSection::symbols, but we were also manually appending
the symbol to that same vector.
Note that ld64 prints "CIE" and "FDE for: <function>" instead of just
"EH_Frame", but I'm punting on that for now unless we discover that
users really depend upon it.
Reviewed By: #lld-macho, Roger
Differential Revision: https://reviews.llvm.org/D137370
I love C++17!
chromium_framework_less_dwarf on my 16-core Mac Pro shows no stat sig change in wall time but a slight decrease in user time:
```
base diff difference (95% CI)
sys_time 1.759 ± 0.037 1.761 ± 0.033 [ -0.9% .. +1.1%]
user_time 4.920 ± 0.043 4.886 ± 0.051 [ -1.2% .. -0.2%]
wall_time 5.950 ± 0.117 5.900 ± 0.116 [ -1.8% .. +0.2%]
samples 26 37
```
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D136518