The relaxation algorithm used to only update offsets of relaxable edges.
This caused non-relaxable edges that appear after a relaxed instruction
to have an incorrect offset and be applied at the wrong location. This
patch fixes this by updating the offsets of all edges.
Note that this bug was caused by an incorrect translation of LLD's
relaxation algorithm. LLD always uses all edges during relaxation while
I decided to filter-out relaxable edges to prevent having to iterate
non-relaxable edges at each step. However, this had the side-effect of
only updating offsets of relaxable edges. This patch leaves the
filtering of relaxable edges as-is but iterates all edges when updating
offsets.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D153515
When llvm Interpreter is used to execute bitcode, it shall be able to handle TargetExtType.
Reviewed By: jcranmer-intel
Differential Revision: https://reviews.llvm.org/D148283
This patch introduces a skeleton JITLink ppc64 support header and ELF/ppc64
backend. No relocations are supported in this initial version, but given a
program requiring no relocations (e.g. one that just returns a constant value
from main) the new backend is able to construct a LinkGraph from a ppc64 ELF
relocatable object, and the llvm-jitlink tool is able to execute it.
This commit should also serve as a good example of how to introduce a JITLink
backend for a new architecture.
Reviewed By: sgraenitz, v.g.vassilev, vchuravy, nemanjai, jain98, MaskRay
Differential Revision: https://reviews.llvm.org/D148192
For linker relaxation (D149526), a new edge kind (`CallRelaxable`) was
introduced. However, this new kind was not taken into account by
`PerGraphGOTAndPLTStubsBuilder_ELF_riscv`. This patch fixes this.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D150957
These were previously re-enabled in d771f54107, but had to be disabled again
in 2060a72b4d due to test failures.
This is a next step to landing https://reviews.llvm.org/D148192, which adds
a skeleton JITLink backend for PowerPC.
The fixes for those failures were (1) to explicitly specify IsLittleEndian =
true for the MachO YAML testcases, (2) disable some example tests for examples
that aren't supported on PowerPC yet, and (3) fixing the endianness of a
relocation read/write (for ELF R_AARCH64_TSTBR14) in RuntimeDyldELF.
This relocation is used for the 14-bit immediate in test and branch
instructions.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D150778
This is a follow-up to b71edfaa4e
since I forgot the lit.local.cfg files in that one.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: barannikov88, kwk
Differential Revision: https://reviews.llvm.org/D150762
This patch is essentially an adaption of LLD's algorithm to JITLink.
Currently, only relaxing R_RISCV_CALL(_PLT) and R_RISCV_ALIGN is
implemented, other relocations can follow later.
From a high level, the algorithm works as follows. In the first phase
(relaxBlock), we iteratively try to relax all instructions that have a
R_RISCV_RELAX relocation:
- If, based on the current symbol values, an instruction sequence can be
relaxed (i.e., replaced by a shorter instruction), we record how many
bytes would be removed, the new instruction (Writes), and the new
relocation type (EdgeKinds).
- We keep track of the total number of bytes that got removed up to each
relocation in the RelocDeltas array. This is the cumulative sum of the
number of bytes removed for each relocation.
- Symbol values and sizes are updated based on the number of removed
bytes.
- If for any relocation, the current RelocDeltas value doesn't match the
one from the previous iteration, something changed and we need to run
another iteration as some symbols might now have different values.
In the second phase (finalizeBlockRelax), all code is moved based on
RelocDeltas, the relaxed instructions are rewritten using Writes, and
R_RISCV_ALIGN is handled (moving instructions to ensure alignment and
inserting the correct NOP-sequence if needed). Finally, edge kinds and
offsets are updated and all R_RISCV_RELAX and R_RISCV_ALIGN edges are
removed (they are not needed anymore for the fixup linking stage).
Linker relaxation is implemented as a pass and added to PreFixupPasses
in the default configuration on RISC-V.
Since linker relaxation removes instructions, the memory for blocks
should ideally be reallocated. However, I believe this is currently not
possible in JITLink. Therefore, relaxation directly modifies the memory
of blocks, reducing the number of instructions but not the size of
blocks. I'm not very familiar with JITLink's memory allocators so I
might be overlooking something here, though.
Note on testing: some of the tests rely on the debug output of
llvm-jitlink. The main reason for this is the verification of symbol
sizes (which may change due to relaxation). I don't believe this can be
done using jitlink-check checks alone.
Note that there is a slightly unrelated change that makes
Symbol::setOffset public to be able to update symbol offsets during
relaxation.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D149526
Prior to this patch, for the DWARF frame base LLVM uses the frame pointer
register if available, otherwise the stack pointer register. If the stack
pointer register is being used and a call or other code modifies the stack
pointer during the body of the function this results in the locations being
wrong and the debugger displaying the wrong values for variables.
By using DW_OP_call_frame_cfa in these situations the emitted location for
the variable will automatically handle changes in the stack pointer.
The CFA needs to be adjusted for the offset between the frame pointer/stack
pointer to allow the variable locations themselves to remain unchanged by
this patch.
Reviewed By: #debug-info, scott.linder, jryans
Differential Revision: https://reviews.llvm.org/D143463
These are an attempt to more systematically test the features covered by the
MCJIT regression tests (though these tests apply to lli's default mode, which
is now -jit-kind=orc).
This first batch of tests includes a basic smoke test (trivial-return-zero),
tests for single function calls and data references, and alignment handling.
This relocation is used for the 19-bit immediate in conditional branch
and compare and branch instructions.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D149138
These tests passed on my ppc64le test machine. If they survive testing by the
buildbots then we can leave them enabled, and this will allow us to land the
new ppc64 JITLink backend (https://reviews.llvm.org/D148192).
This patch ports PerfJITEventListener to a JITLink plugin, but adds unwind record support and drops debuginfo support temporarily. Debuginfo can be enabled in the future by providing a way to obtain a DWARFContext from a LinkGraph.
See D146060 for an experimental implementation that adds debuginfo parsing.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D146169
Adds support for the R_X86_64_GOTPC32 relocation, which is a 32-bit delta to
the global offset table.
Since the delta to the GOT doesn't actually require any GOT entries to exist
this commit adds an extra fallback path to the getOrCreateGOTSymbol function:
If the symbol is in the extenal symbols list but no entry exists then the
symbol is turned into an absolute symbol pointing to an arbitrary address in
the current graph's allocation (accessing this address via the symbol would be
illegal, but any access should have triggered creation of a GOT entry which
would prevent this fallback path from being taken in the first place).
This commit also updates the llvm-jitlink tool to scrape the addresses of the
absolute symbols in the graph so that the testcase can see the now-absolute
_GLOBAL_OFFSET_TABLE_ symbol.
Tests in test/ExecutionEngine/JITLink/AArch64 are for aarch64 only, so we don't
need to repeat 'aarch64' / 'arm64' in the individual test names.
Commit 8ad75c1037 made a similar change for the x86-64 tests.
Tests in test/ExecutionEngine/JITLink/X86 were for x86-64 only (never i386) so
it makes sense to name the test directory x86-64, and drop x86-64 from the
individual test names.
When LLVM_ENABLE_THREADS=Off, disables tests under
llvm/test/ExecutionEngine/MCJIT/remote/ that require threads.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D147803
This reapplies 371cb1af61, which was reverted in 0b2240eda0 due to bot
failures.
The clang-repl test failure is fixed by dropping the process symbols definition
generator that was manually attached to the main JITDylib, since LLJIT now
exposes process symbols by default. (The bug was triggered when JIT'd code used
the process atexit provided by the generator, rather than the JIT atexit which
has been moved into the platform JITDylib).
Any LLJIT clients that see crashes in static destructors should likewise remove
any process symbol generators attached to their main JITDylib.
The various ADD/SUB relocations work by reading the current value the
relocation points to, transforming it, and then writing it back to
memory. While the current implementation writes the value back to
working memory, it reads the current value from the execution address of
the relocation. This causes at least wrong results, but often crashes,
when the addresses of working memory are not equal to execution
addresses. This patch fixes this by reading the current value from
working memory.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D147693
This commit includes several related ergonomic improvements to LLJIT.
(1) Adds a default JITDylibSearchOrder to be appended to the initial link order
of JITDylibs created via LLJIT::createJITDylib (dropping any duplicate entries).
This was introduced to support automatic reflection of process symbols (see
(2) below), but has been made visible to clients as it's generically useful,
e.g. if clients have some extra set of libraries that they want to be visible
to JIT'd code by default.
The default JITDylibSearchOrder is only appended to the link order of JITDylibs
created via LLJIT::createJITDylib, and will not be apply to JITDylibs created by
directly calling the underlying ExecutionSession -- in that case clients can set
up the link order manually.
(2) Makes process symbols visible to JIT'd code by default via the new "Process"
JITDylib, which is added to the default link order.
LLJIT clients usually want symbols in the executor process to be accessible to
JIT'd code. Until now clients have been left to set this up themselves by adding
a DynamicLibrarySearchGenerator to the Main JITDylib. This patch adds a new
process symbols JITDylib that will be created by default (with an
EPCDynamicLibrarySearchGenerator attached) and added to the default link order,
making process symbols available to JIT'd code.
Clients who do not want process symbols to be visible to JIT'd code by default
can call setLinkProcessSymbolsByDefault(false) on their LLJITBuilder to disable
this:
LLJITBuilder()
...
.setLinkProcessSymbolsByDefault(false)
...
.create();
Clients can also call setProcessSymbolsJITDylibSetup to take over responsibility
for configuring the process symbols JITDylib (the callback that the client
supplies will be called on the bare process symbols JITDylib immediately after
it is created).
If setLinkProcessSymbolsByDefault(false) is called and no JITDylib setup
callback has been set then the process symbols JITDylib will not be created and
LLJIT::getProcessSymbolsJITDylib will return null.
(3) Adds an ExecutorNativePlatform utility that makes it easier to enable
native platform features.
Some object format features (e.g. native static initializers and thread locals)
require runtime support in the executing process. Support for these features in
ORC is implemented cooperatively between the ORC runtime and the LLVM Platform
subclasses (COFFPlatform, ELFNixPlatform, and MachOPlatform).
ExecutorNativePlatfrom simplifies the process of loading the ORC runtime and
creating the appropriate platform class for the executor process.
ExecutorNativePlatform takes a path to the ORC runtime (or a MemoryBuffer
containing the runtime) and other required runtimes for the executor platform
(e.g. MSVC on Windows) and then configures LLJIT with an appropriate platform
class based on the executor's target triple:
LLJITBuilder()
.setPlatformSetUp(ExecutorNativePlatform("/path/to/orc-runtime.a"));
(The ORC runtime is built as part of compiler-rt, and the exact name of the
archive is platform dependent).
The ORC runtime and platform symbols will be added to a new "Platform" JITDylib,
which will be added to the *front* of the default link order (so JIT'd code will
prefer symbol definitions in the platform/runtime to definitions in the executor
process).
ExecutorNativePlatform assumes that the Process JITDylib is available, as
the ORC runtime may depend on symbols provided by the executor process.
Differential Revision: https://reviews.llvm.org/D144276
The original -show-graph option dumped the LinkGraph for all graphs loaded into
the session, but can make it difficult to see small graphs (e.g. reduced test
cases) among the surrounding larger files (especially the ORC runtime).
The new -show-graphs option takes a regex and dumps only those graphs matching
the regex. This allows testcases to specify exactly which graphs to dump.
This first version lays the foundations for AArch32 support in JITLink. ELFLinkGraphBuilder_aarch32 processes REL-type relocations and populates LinkGraphs from ELF object files for both big- and little-endian systems. The ArmCfg member controls subarchitecture-specific details throughout the linking process (i.e. it's passed to ELFJITLinker_aarch32).
Relocation types follow the ABI documentation's division into classes: Data (endian-sensitive), Arm (32-bit little-endian) and Thumb (2x 16-bit little-endian, "Thumb32" in the docs). The implementation of instruction encoding/decoding for relocation resolution is implemented symmetrically and is testable in isolation (see AArch32 category in JITLinkTests).
Callable Thumb functions are marked with a ThumbSymbol target-flag and stored in the LinkGraph with their real addresses. The thumb-bit is added back in when the owning JITDylib requests the address for such a symbol.
The StubsManager can generate (absolute) Thumb-state stubs for branch range extensions on v7+ targets. Proper GOT/PLT handling is not yet implemented.
This patch is based on the backend implementation in ez-clang and has just enough functionality to model the infrastructure and link a Thumb function `main()` that calls `printf()` to dump "Hello Arm!" on Armv7a. It was tested on Raspberry Pi with 32-bit Raspbian OS.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D144083
This first version lays the foundations for AArch32 support in JITLink. ELFLinkGraphBuilder_aarch32 processes REL-type relocations and populates LinkGraphs from ELF object files for both big- and little-endian systems. The ArmCfg member controls subarchitecture-specific details throughout the linking process (i.e. it's passed to ELFJITLinker_aarch32).
Relocation types follow the ABI documentation's division into classes: Data (endian-sensitive), Arm (32-bit little-endian) and Thumb (2x 16-bit little-endian, "Thumb32" in the docs). The implementation of instruction encoding/decoding for relocation resolution is implemented symmetrically and is testable in isolation (see AArch32 category in JITLinkTests).
Callable Thumb functions are marked with a ThumbSymbol target-flag and stored in the LinkGraph with their real addresses. The thumb-bit is added back in when the owning JITDylib requests the address for such a symbol.
The StubsManager can generate (absolute) Thumb-state stubs for branch range extensions on v7+ targets. Proper GOT/PLT handling is not yet implemented.
This patch is based on the backend implementation in ez-clang and has just enough functionality to model the infrastructure and link a Thumb function `main()` that calls `printf()` to dump "Hello Arm!" on Armv7a. It was tested on Raspberry Pi with 32-bit Raspbian OS.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D144083
This reapplies 57aeb30546, which was reverted in
f721fcb6ed due to buildbot failures.
The cause of the failure was missing support for R_AARCH64_ABS32, which was
added in fb1b9945be.
Simplifies the process of building an LLJIT instance that supports the native
platform features (initializers, TLV, etc.).
SetUpExecutorNativePlatform can be passed to LLJITBuilder::setPlatformSetUp
method. It takes a reference to the ORC runtime (as a path or an in-memory
archive) and automatically sets the platform for LLJIT's ExecutionSession based
on the executor process's triple.
Differential Revision: https://reviews.llvm.org/D144276
By default ELFLinkGraphBuilder will now create LinkGraph sections with NoAlloc
lifetime for debug info sections in the original object. Debug sections are not
kept alive by default, and will be dead-stripped unless some plugin marks them
as live in the pre-prune phase.
This speeds up section lookup by name.
This change was motivated by poor performance of a testcase while trying to fix
the NoAlloc lifetime patch that was originally landed as 2cc64df0bd. The
NoAlloc lifetime patch causes ELF non-SHF_ALLOC sections to be given a JITLink
Section (previously they were skipped), and the
llvm/test/ExecutionEngine/JITLink/X86/ELF_shndex.s testcase creates > 64k
non-SHF_ALLOC sections, each of which now needs to be checked to ensure that its
name does not clash. Moving to a DenseMap allows us to implement this check
efficiently.
Allocation actions may run JIT'd code, which isn't permitted in -noexec mode.
Testcases that depend on actions running should be moved to the ORC runtime.
The classification of TLS symbols in ELF was changed from ST_Data to
ST_Other in the following commit:
018a484cd2
RuntimeDyldELF::processRelocationRef() needs to be updated to also
handle ST_Other symbols so that it handles TLS relocations correctly.
The current tests did not fail because we have a shortcut for global
symbols that are already defined.
Differential Revision: https://reviews.llvm.org/D143568
According to the IR verifier, "Declaration[s] may not be in a Comdat!"
This is a re-commit of 76b3f0b4d5 and
87d7838202 with updates to the test:
* Force emission of the extra-module, to trigger the bug after D138264,
by providing a second symbol @g, and making the comdat nodeduplicate.
(Technically only one is needed, but two should be safer.)
* Name the comdat $f to avoid failure on Windows:
LLVM ERROR: Associative COMDAT symbol 'c' does not exist.
* Mark the test as UNSUPPORTED on macOS, MachO doesn't support COMDATs.
Differential Revision: https://reviews.llvm.org/D142443