Saves several MB of memory in larger applications after linking finishes
by clearing DenseMap storage that is empty. This does not attempt to
shrink partially full materialization infos. The assumption is that
adding more after linking finishes is rare.
rdar://126145336
Same idea as 006aaf3225 -- reduce boilerplate and improve readability. This
time updates will be piecemeal to make it easier to identify errors.
Coding my way home: 2.18555S, 93.78063W
Add an ExecutionSession state verifier, enabled under EXPENSIVE_CHECKS, that can
be used to identify inconsistent session state to assist in tracking down bugs.
This initial version was motivated by investigation of the EDU-update bug that
was fixed in a671ceec33.
rdar://125376708
We were catching a local variable, SymMI, by value instead of by reference
during EDU cleanup and this was leaving the dependence graph in an
inconsistent state that could lead to crashes on subsequent emits. Fixing this
bug required us to also avoid aliasing between SymMI and MI (which would have
caused cleanup to clear the MI.DependantEDUs set that we're iterating over).
No testcase: the crash only triggered in very specific circumstances
(including concurrent linking) in an out-of-tree ORC client. I'm working on a
session state verifier that could be turned on when compiling with
expensive-checks turned on and that should help us catch issues like this in
the future.
rdar://125164262
Coding my way home: 0.89527S, 89.61313W
`TargetEndianness` is long and unwieldy. "Target" in the name is confusing. Rename it to "Endianness".
I cannot find noticeable out-of-tree users of `TargetEndianness`, but
keep `TargetEndianness` to make this patch safer. `TargetEndianness`
will be removed by a subsequent change.
Transform section$start$<section-name> and section$end$<section-name> external
symbols into defined symbols when a section named <section-name> is present.
rdar://125357048
Coding my way home: 8.98112N, 79.52094W
This commit adds section start and stop symbol handling to ELF/aarch64, and
fixes the section symbol prefixes (using `__start_` and `__stop_`, rather than
`__start` and `__end`). It also adds a testcase for handling of these symbols.
If you build LLVM with `-DCMAKE_CXX_VISIBILITY_PRESET=hidden` to help
reduce binary size, these symbols end up becoming local, and getting
stripped. This forces default visibility to override the global setting
in that case.
Relevant:
https://github.com/llvm/llvm-project/issues/62815#issuecomment-1560078260
Only platform specific darwin OS values (e.g. macosx, ios, watchos, ...) can be
mapped to an LC_BUILD_VERSION platform. For all other values return an empty
optional to indicate that the load command can't be constructed.
Also fixes the simulator conditions to return the correct platform, and adds a
testcase.
LLVM is inconsistent about how it converts `errno` to `std::error_code`.
This can cause problems because values outside of `std::errc` compare
differently if one is system and one is generic on POSIX systems.
This is even more of a problem on Windows where use of the system
category is just wrong, as that is for Windows errors, which have a
completely different mapping than POSIX/generic errors. This patch fixes
one instance of this mistake in `JSONTransport.cpp`.
This patch adds `errnoAsErrorCode()` which makes it so people do not
need to think about this issue in the future. It also cleans up a lot of
usage of `errno` in LLVM and Clang.
If notifyEmitted encounters a failure (either because some plugin returned one,
or because the ResourceTracker was defunct) then we need to deallocate the
FinalizedAlloc manually.
No testcase yet: This requires a concurrent setup -- we'll need to build some
infrastructure to coordinate links and deliberately injected failures in order
to reliably test this.
Remove an overly aggressive cantFail: This call to defineMaterializing should
never fail with a duplicate symbols error (since all new symbols shoul be
weak), but may fail if the tracker has become defunct in the mean time. In that
case we need to propagate the error.
[ORC] Re-land https://github.com/llvm/llvm-project/pull/81826
This patch adds two plugins: VTuneSupportPlugin.cpp and
JITLoaderVTune.cpp. The testing is done in a manner similar to
llvm-jitlistener. Currently, we only support the old version of Intel
VTune API.
The base class llvm::ThreadPoolInterface will be renamed
llvm::ThreadPool in a subsequent commit.
This is a breaking change: clients who use to create a ThreadPool must
now create a DefaultThreadPool instead.
API clients can now set a MachO::HeaderOptions::BuildVersionOpts field to have
MachOPlatform add an LC_BUILD_VERSION load command to the Mach header for each
JITDylib.
No testcase yet. In the future we'll try to add a MachO parser to the ORC
runtime and extra test options to llvm-jitlink for this.
This commit also incidentally fixes a bug in the MachOBuilder class that lead to
a delegation cycle.
This patch adds two plugins: VTuneSupportPlugin.cpp and
JITLoaderVTune.cpp. The testing is done in a manner similar to
llvm-jitlistener. Currently, we only support the old version of Intel
VTune API.
This pull request is stacked on top of
https://github.com/llvm/llvm-project/pull/81825
The SectCreateMaterializationUnit creates a LinkGraph with a single named
section containing a single named block whose content is given by a
MemoryBuffer. It is intended to support emulation of ld64's -sectcreate option.
Right now InProcessMemoryManager only releases a standard segment (via
sys::Memory::releaseMappedMemory) in `deallocate` when there is a
DeallocAction associated, leaving residual memory pages in the process
until termination.
Despite being a de facto memory leak, it won't cause a major issue if
users only create a single LLJIT instance per process, which is the most
common use cases. It will, however, drain virtual memory pages if we
create thousands of ephemeral LLJIT instances in the same process.
This patch fixes this issue by releasing every standard segments
regardless of the attached DeallocAction.
Switch the primary implementation of EPC lookupSymbols to be async,
keeping a synchronous wrapper for compatibility. Use the new async
implementation inside EPCDynamicLibrarySearchGenerator to work
working towards a fully async search generator.
Provide an asynchronous lookup API for EPCGenericDylibManager and adopt
that from the SimpleRemoteEPC. This enables an end-to-end async
EPCDynamicLibrarySearchGenerator. Note: currently we keep the current
per-dlhandle lookup model, but a future improvement could do a single
async call for a given lookup operation.
As noted in issues #68594 and #73935, `JITLink/RISCV/ELF_ehframe.s`
fails with libstdc++'s expensive checks because `getRISCVPCRelHi20`
calls `std::equal_range` on the edges which may not be ordered by their
offset. Instead let `ELFJITLinker_riscv` build a hashmap of all edges
with type `R_RISCV_PCREL_HI20` that can be looked up in constant time.
Closes#73935
The error check should be performed after the iterator increment, not before
it. Thanks to @dcb314 for catching this!
Fixes github.com/apple/swift/issues/81119
This code appears to be a hack to set the features to include compressed
instructions if the ELF EFLAGS flags bit is present, but the ELF
attribute for the ISA string is no present or not accurate.
We can't remove the hack because llvm-mc doesn't create ELF attributes
by default so a lot of tests fail to disassembler properly. Using clang
as the assembler does set the attributes.
This patch changes the hack to only set Zca since that is the minimum
implied by the flag. Setting anything else potentially conflicts with
the ISA string containing Zcmp or Zcmt.
JITLink also needs to be updated to recognize Zca in addition to C.
Removes the MaterializationResponsibility::addDependencies and
addDependenciesForAll methods, and transfers dependency registration to
the notifyEmitted operation. The new dependency registration allows
dependencies to be specified for arbitrary subsets of the
MaterializationResponsibility's symbols (rather than just single symbols
or all symbols) via an array of SymbolDependenceGroups (pairs of symbol
sets and corresponding dependencies for that set).
This patch aims to both improve emission performance and simplify
dependence tracking. By eliminating some states (e.g. symbols having
registered dependencies but not yet being resolved or emitted) we make
some errors impossible by construction, and reduce the number of error
cases that we need to check. NonOwningSymbolStringPtrs are used for
dependence tracking under the session lock, which should reduce
ref-counting operations, and intra-emit dependencies are resolved
outside the session lock, which should provide better performance when
JITing concurrently (since some dependence tracking can happen in
parallel).
The Orc C API is updated to account for this change, with the
LLVMOrcMaterializationResponsibilityNotifyEmitted API being modified and
the LLVMOrcMaterializationResponsibilityAddDependencies and
LLVMOrcMaterializationResponsibilityAddDependenciesForAll operations
being removed.
The JITLink AArch32 backend reached feature-parity with RuntimeDyld
on ELF-based systems. This patch changes the default JIT-linker in Orc's
LLJIT for these platforms.
This allows us to run clang-repl with JITLink on ARM and use all the
features we had with RuntimeDyld before. All existing tests for
clang-repl are passing.
This stub type loads an absolute address directly into the PC register.
It's the simplest and most compatible way to implement a branch
indirection across the entire address space (and probably the slowest as
well). It's the ideal fallback for all targets for which we did not
(yet) implement a more performant solution.
`R_ARM_PREL31` is a 31-bits relative data relocation where the
most-significant bit is preserved. It's used primarily in `.ARM.exidx`
sections, which we skipped processing until now, because we didn't
support the relocation type. This was implemented in RuntimeDyld with
https://reviews.llvm.org/D25069 and I implemented it in a similar way in
JITLink in order to reach feature parity.
We want to emit stubs that match the instruction set state of the
relocation site. This is important for branches that have no built-in
switch for the instruction set state. It's the case for Jump24
relocations. Relocations on instructions that support switching on
the fly will be rewritten in a relaxation step in the future. This
affects Call relocations on `BL`/`BLX` instructions.
In this patch, the StubManager gains a second stub symbol slot for each
target and selects which one to use based on the relocation type. For
testing, we select the appropriate slot with a stub-kind filter, i.e.
`arm` or `thumb`. With that we can implement Armv7 stubs and test
that we can have both kinds of stubs for a single external symbol.
We use `jitlink-check` lines in LIT tests as the primary tool for
testing JITLink backends. Parsing and evaluation of the expressions is
implemented in `RuntimeDyldChecker`. The `stub_addr(obj, name)`
expression allows to obtain the linker-generated stub for the external
symbol `name` in object file `obj`.
This patch adds support for a filter parameter to select one out of many
stubs. This is necessary for the AArch32 JITLink backend, which must be
able to emit two different kinds of stubs depending on the instruction
set state (Arm/Thumb) of the relocation site. Since the new parameter is
optional, we don't have to update existing tests.
Filters are regular expressions without brackets that match exactly one
existing stub. Given object file `armv7.o` with two stubs for external
function `ext` of kinds `armv7_abs_le` and `thumbv7_abs_le`, we get the
following filter results e.g.:
```
stub_addr(armv7.o, ext, thumb) thumbv7_abs_le
stub_addr(armv7.o, ext, thumbv7) thumbv7_abs_le
stub_addr(armv7.o, ext, armv7_abs_le) armv7_abs_le
stub_addr(armv7.o, ext, v7_.*_le) Error: "ext" has 2 candidate stubs in file "armv7.o". Please refine stub-kind filter "v7_.*_le" for disambiguation (encountered kinds are "thumbv7_abs_le", "armv7_abs_le").
stub_addr(armv7.o, ext, v8) Error: "ext" has 2 stubs in file "armv7.o", but none of them matches the stub-kind filter "v8" (all encountered kinds are "thumbv7_abs_le", "armv7_abs_le").
```
Add a HeaderOptions struct that can be used to configure commonly-used
load commands LC_ID_DYLIB, LC_LOAD_DYLIB, and LC_RPATH when setupDylib
creates a mach-o header.