Use SymbolStringPtr for Symbol names in LinkGraph. This reduces string interning
on the boundary between JITLink and ORC, and allows pointer comparisons (rather
than string comparisons) between Symbol names. This should improve the
performance and readability of code that bridges between JITLink and ORC (e.g.
ObjectLinkingLayer and ObjectLinkingLayer::Plugins).
To enable use of SymbolStringPtr a std::shared_ptr<SymbolStringPool> is added to
LinkGraph and threaded through to its construction sites in LLVM and Bolt. All
LinkGraphs that are to have symbol names compared by pointer equality must point
to the same SymbolStringPool instance, which in ORC sessions should be the pool
attached to the ExecutionSession.
---------
Co-authored-by: Lang Hames <lhames@gmail.com>
This patch extends the PGO infrastructure with an option to prefer the
instrumentation of loop entry blocks.
This option is a generalization of
19fb5b467b,
and helps to cover cases where the loop exit is never executed.
An example where this can occur are event handling loops.
Note that change does NOT change the default behavior.
Added option "--unify-instantiations" to llvm-cov export to combine
branch execution counts of C++ template instantiations.
on-behalf-of: @e-solutions-GmbH <info@esolutions.de>
This PR adds support in the gSYM format for call site information and
adds support for loading call sites from a YAML file. The support for
YAML input is mostly for testing purposes - so we have a way to test the
functionality.
Note that this data is not currently used in the gSYM tooling - the
logic to use call sites will be added in a later PR.
The reason why we need call site information in gSYM files is so that we
can support better call stack function disambiguation in the case where
multiple functions have been merged due to optimization (linker ICF).
When resolving a merged function on the callstack, we can use the call
site information of the calling function to narrow down the actual
function that is being called, from the set of all merged functions.
See [this
RFC](https://discourse.llvm.org/t/rfc-extending-gsym-format-with-call-site-information-for-merged-function-disambiguation/80682)
for more details on this change.
Some distinct metadata nodes, e.g DICompileUnit, have implicit nullptrs
inside them. Iterating over them with dyn_cast leads to a crash, change
the behavior so that the nullptr operands are skipped.
Add the test distinct-metadata-nullptr.ll which will crash if null
pointers are not handled correctly.
This patch adds MemProfReader::takeMemProfData, a function to return
the complete MemProf profile from the reader. We can directly pass
its return value to InstrProfWriter::addMemProfData without having to
deal with the indivual components of the MemProf profile. The new
function is named "take", but it doesn't do std::move yet because of
type differences (DenseMap v.s. MapVector).
The end state I'm trying to get to is roughly as follows:
- MemProfReader accepts IndexedMemProfData as a parameter as opposed
to the three individual components (frames, call stacks, and
records).
- MemProfReader keeps IndexedMemProfData as a class member without
decomposing it into its individual components.
- MemProfReader returns IndexedMemProfData like:
IndexedMemProfData takeMemProfData() {
return std::move(MemProfData);
}
While llvm-exegesis has explicit support for setting EFLAGS which
contains DF, it can be nice sometimes to explicitly set DF, especially
given that it is modeled as a separate register within LLVM. This patch
adds the ability to do that by lowering setting the value to 0 or 1 to
cld and std respectively.
LazyObjectLinkingLayer can be used to add object files that will not be linked
into the executor unless some function that they define is called at runtime.
(References to data members defined by these objects will still trigger
immediate linking)
To implement lazy linking, LazyObjectLinkingLayer uses the lazyReexports
utility to construct stubs for each function in a given object file, and an
ObjectLinkingLayer::Plugin to rename the function bodies at link-time. (Data
symbols are not renamed)
The llvm-jitlink utility is extended with a -lazy option that can be
passed before input files or archives to add them using the lazy linking
layer rather than the base ObjectLinkingLayer.
This patch removes MemProf format Version 0 now that version 2 and 3
seem to be working well.
I'm not touching version 1 for now because some tests still rely on
version 1.
Note that Version 0 is identical to Version 1 except that the MemProf
section of the indexed format has a MemProf version field.
* Have clang always append & pass System/Library/SubFrameworks when determining default sdk search paths.
* Teach clang-installapi to traverse there for framework input.
* Teach llvm-readtapi that the library files (TBD or binary) in there should be considered private.
resolves: rdar://137457006
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasis its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
don't need to static cast to `LLVMTargetMachine` every time they need a
CodeGen-specific feature of the `TargetMachine`.
5. More importantly, does not change any requirements regarding library
linking.
cc @arsenm @aeubanks
This implements a global function merging pass. Unlike traditional
function merging passes that use IR comparators, this pass employs a
structurally stable hash to identify similar functions while ignoring
certain constant operands. These ignored constants are tracked and
encoded into a stable function summary. When merging, instead of
explicitly folding similar functions and their call sites, we form a
merging instance by supplying different parameters via thunks. The
actual size reduction occurs when identically created merging instances
are folded by the linker.
Currently, this pass is wired to a pre-codegen pass, enabled by the
`-enable-global-merge-func` flag.
In a local merging mode, the analysis and merging steps occur
sequentially within a module:
- `analyze`: Collects stable function hashes and tracks locations of
ignored constant operands.
- `finalize`: Identifies merge candidates with matching hashes and
computes the set of parameters that point to different constants.
- `merge`: Uses the stable function map to optimistically create a
merged function.
We can enable a global merging mode similar to the global function
outliner
(https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753/),
which will perform the above steps separately.
- `-codegen-data-generate`: During the first round of code generation,
we analyze local merging instances and publish their summaries.
- Offline using `llvm-cgdata` or at link-time, we can finalize all these
merging summaries that are combined to determine parameters.
- `-codegen-data-use`: During the second round of code generation, we
optimistically create merging instances within each module, and finally,
the linker folds identically created merging instances.
Depends on #112664
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
This patch introduces a new generic target, `gfx9-4-generic`. Since it doesn’t support FP8 and XF32-related instructions, the patch includes several code reorganizations to accommodate these changes.
These symbols need to be exported for llvm-pdbutil when using windows
shared library builds.
Exclude the YAML traits declared in llvm-pdbutil so there not declared
as dllimported which will causing missing symbol errors for windows
shared library builds.
This is part of the work to enable LLVM_BUILD_LLVM_DYLIB and plugins on
window.
Use LLVM_ALWAYS_EXPORT for __jit_debug_descriptor and
__jit_debug_register_code so there exported even if LLVM is not built as
a shared library.
This is part of the work to enable LLVM_BUILD_LLVM_DYLIB and plugins on
windows #109483.
Claiming AArch64 support for llvm-exegesis is a bit of a stretch in my
opinion as only a couple of opcodes with GPR64 operands will work for
snippet benchmarking, so I propose to clarify that AArch64 support is
very experimental. Also added some clarifications about its libpfm4
dependency.
This introduces a new cgdata format for stable function maps. The raw
data is embedded in the __llvm_merge section during compile time. This
data can be read and merged using the llvm-cgdata tool, into an indexed
cgdata file. Consequently, the tool is now capable of handling either
outlined hash trees, stable function maps, or both, as they are
orthogonal.
Depends on #112662.
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
This patch makes X86 llvm-exegesis unconditionally use older
instructions to load the lower vector registers, rather than trying to
use AVX512 for everything when available. This fixes a case where we
would try and load AVX512 registers using the older instructions if such
a snippet was constructed while -mcpu was set to something that did not
support AVX512. This would lead to a machine code verification error
rather than resulting in incomplete snippet setup, which seems to be the
intention of how this should work.
Fixes#114691.
The early simplication pipeline is used in non-LTO and (Thin/Full)LTO
pre-link
stage. There are some passes that we want them in non-LTO mode, but not
at LTO
pre-link stage. The control is missing currently. This PR adds the
support. To
demonstrate the use, we only enable the internalization pass in non-LTO
mode for
AMDGPU because having it run in pre-link stage causes some issues.
Add support for generating random hotness in the memprof profile writer,
to be used for testing. The random seed is printed to stderr, and an
additional option enables providing a specific seed in order to
reproduce a particular random profile.
llvm-cxxfilt can demangle names of data symbols, in addition to function
names.
$ llvm-cxxfilt _ZN6garden5gnomeE
garden::gnome
And type names too, on request:
$ llvm-cxxfilt -t i
int
Update some overly specific the wording in the --help and documentation
that suggests otherwise.
The Intel C++ Compiler (ICX) passes linker flags through the driver
unlike MSVC and clang-cl, and therefore needs them to be prefixed with
`/Qoption,link` (the equivalent of `-Wl,` for gcc on *nix).
Use `LINKER:` prefix wherever supported by cmake, when that's not
possible fall-back to `${CMAKE_CXX_LINKER_WRAPPER_FLAG}`. CMake replaces
these with `/Qoption,link` for ICX and with the empty string for MSVC
and clang-cl.
For `target_link_libraries` neither `LINKER:` (not supported prior to
CMake 3.32) nor `${CMAKE_CXX_LINKER_WRAPPER_FLAG}` (does not begin with
`-` would be taken as a library name) works, use `-Qoption,link`
directly within a conditional generator expression that we're linking
with ICX.
For MSVC and clang-cl no functional change is intended.
Tested by compiling with ICX and setting
`CMAKE_(EXE|SHARED|STATIC|MODULE)_LINKER_FLAGS_INIT` to
`-Werror=unknown-argument`.
RFC:
https://discourse.llvm.org/t/rfc-cmake-linker-flags-need-wl-equivalent-for-intel-c-icx-on-windows/82446
This is one of the many PRs to fix errors with LLVM_ENABLE_WERROR=on.
Built by GCC 11.
Refactor the code to avoid the false warning
llvm-project/llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
llvm-project/llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp: In
function ‘int LLVMFuzzerInitialize(int*, char***)’:
llvm-project/llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp:141:43:
error: ISO C++ forbids zero-size array ‘argv’ [-Werror=pedantic]
141 | ExitOnError ExitOnErr(std::string(*argv[0]) + ": error:");
|
Provide a option (--no-object-timestamp) to ignore object file timestamp
mismatches. We already have a similar option for Swift modules
(--no-swiftmodule-timestamp).
rdar://123975869
I (re)discovered that dsymutil was instantiating two BinaryHolders: one
for parsing the debug map and one for linking. That really defeats the
purpose of the BinaryHolder as it serves as a cache. Fix the issue and
remove an old FIXME.
This is useful when looking at LLVM/MLIR assembly produced from C++
sources. For example
cir.call @_ZN3aie4tileILi1ELi4EE7programIZ4mainE3$_0EEvOT_(%2, %7) :
will be translated to
cir.call @"void aie::tile<1, 4>::program<main::$_0>(main::$_0&&)"(%2,
%7) : which can be parsed as valid MLIR by the right mlir-lsp-server.
If a symbol is already quoted, do not quote it more.
---------
Co-authored-by: James Henderson <jh7370@my.bristol.ac.uk>