This patch adds debuginfod support into llvm-profdata to
find the assosicated executable by a build id in a raw
profile to correlate a profile with a provided correlation
kind (debug-info or binary).
This patch addresses compatibility issues with the lit internal shell by
expanding and rewriting test scripts in the compiler-rt subproject.
These changes were prompted by the FileNotFound unresolved errors
encountered during the testing process, specifically when running the
command `LIT_USE_INTERNAL_SHELL=1 ninja check compiler-rt`.
**Why the error occurred:**
The error occurred because the original test scripts used process
substitution `(<(...))` in their diff commands. Process substitution
creates temporary files or FIFOs to hold command output, and these are
then passed to `diff`. However, the lit internal shell, which is more
limited than a typical shell like `bash`, does not support process
substitution. When lit tries to execute these commands, it is unable to
create or access the temporary files or FIFOs generated by process
substitution. As a result, lit attempts to open a file or directory that
doesn't exist, leading to the `FileNotFoundError`.
**Changes Made:**
- Instead of using process substitution, the commands now explicitly
redirect the output of `llvm-profdata show` to temporary files before
performing the `diff` comparison. This ensures that the lit internal
shell can correctly find and open these files, resolving the
`FileNotFoundError`.
[This change is relevant [RFC] Enabling the lit internal shell by
Default](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179)
fixes: #106111
When running tests in compiler-rt using the lit internal shell with the
following command:
```
LIT_USE_INTERNAL_SHELL=1 ninja check-compiler-rt
```
one common error that occurs is:
```
'XRAY_OPTIONS=patch_premain=false:verbosity=1': command not found
```
This error, along with over 50 similar "environment variable not found"
errors, appears across 35 files in complier-rt. These errors happen
because the environment variables are not being set correctly when the
internal shell is used, leading to commands failing due to unrecognized
environment variables.
This patch addresses the issue by using the `env` command to properly
set the environment variables before running the tests. By explicitly
setting the environment variables through `env`, the internal shell can
correctly interpret and apply them, allowing the tests to pass.
fixes: #102395
[link to
RFC](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179)
VTable value profiling can create reference edges from `mod1:func_foo`
to `mod2:local-vtable`. Indirect call profiling can create reference
edges from `mod1:func_foo` to `mod2:local_func_bar`.
Given a ref chain `mod1:func_foo -> mod2:local-var`,`local-var` doesn't
get imported by default.
Compiler checks / requires the module of 'local-var' is the same as the
function that referenced it(`mod1:func_foo`). This is to prevent
mis-compilation when both `mod1` and `mod2` has `local-var` of the same
name, and cpp files are compiled without full path.
This patch allows the import when one of the following conditions
happen:
1) Introduce an option `import-assume-local-unique`. When the compiler
user can guarantee that all files are compiled with full paths, they can
set this option.
2) When there is one instance of value summary.
Test:
* A/B testing this option alone gives -0.16% statistically consistent
cpu cycle reduction on one search workload (no throughput increase)
* Testing it together with existing more-efficient ICP bumps the
throughput increase by a margin (0.05%~0.1%)
* No regressions observed.
Before this change, when `file.profdata` have vtable profiles but `--enable-vtable-value-profiling` is not on for optimized build, warnings from this line [1] will show up. They are benign for performance but confusing.
It's better to automatically annotate vtable profiles if `file.profdata` has them. This PR implements it in profile use pass.
* If `-icp-max-num-vtables` is zero (default value is 6), vtable profiles won't be annotated.
[1] 464d321ee8/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1762-L1768)
Clang's `-fwhole-program-vtables` is required for this optimization to
take place. If `-fwhole-program-vtables` is not enabled, this change is
no-op.
* Function-comparison (before):
```
%vtable = load ptr, ptr %obj
%vfn = getelementptr inbounds ptr, ptr %vtable, i64 1
%func = load ptr, ptr %vfn
%cond = icmp eq ptr %func, @callee
br i1 %cond, label bb1, label bb2:
bb1:
call @callee
bb2:
call %func
```
* VTable-comparison (after):
```
%vtable = load ptr, ptr %obj
%cond = icmp eq ptr %vtable, @vtable-address-point
br i1 %cond, label bb1, label bb2:
bb1:
call @callee
bb2:
%vfn = getelementptr inbounds ptr, ptr %vtable, i64 1
%func = load ptr, ptr %vfn
call %func
```
Key changes:
1. Find out virtual calls and the vtables they come from.
- The ICP relies on type intrinsic `llvm.type.test` to find out virtual
calls and the
compatible vtables, and relies on type metadata to find the address
point for comparison.
2. ICP pass does cost-benefit analysis and compares vtable only when the
number of vtables for a function candidate is within (option specified)
threshold.
3. Sink the function addressing and vtable load instruction to indirect
fallback.
- The sink helper functions are simplified versions of
`InstCombinerImpl::tryToSinkInstruction`. Currently debug intrinsics are
not handled. Ideally `InstCombinerImpl::tryToSinkInstructionDbgValues`
and `InstCombinerImpl::tryToSinkInstructionDbgVariableRecords` could be
moved into Transforms/Utils/Local.cpp (or another util cpp file) to
handle debug intrinsics when moving instructions across basic blocks.
4. Keep value profiles updated
1) Update vtable value profiles after inline
2) For either function-based comparison or vtable-based comparison,
update both vtable and indirect call value profiles.
- The indexed iFDO profiles contains compressed vtable names for `llvm-profdata show --show-vtables` debugging
usage. An optimized build doesn't need it and doesn't decompress the blob now [1], since optimized binary has the
source code and IR to find vtable symbols.
- The motivation is to avoid increasing profile size when it's not necessary.
- This doesn't change the indexed profile format and thereby doesn't need a version change.
[1] eac925fb81/llvm/include/llvm/ProfileData/InstrProfReader.h (L696-L699)
This patch canonicalizes constant expression GEPs to use i8 source
element type, aka ptradd. This is the ConstantFolding equivalent of the
InstCombine canonicalization introduced in #68882.
I believe all our optimizations working on constant expression GEPs
(like GlobalOpt etc) have already been switched to work on offsets, so I
don't expect any significant fallout from this change.
This is part of:
https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699
`Linux/instrprof-vtable-value-prof.cpp` needs to be built for the test
to run. However, cpp compile & link failed with undefined-ABI error [1].
See original failure in
https://lab.llvm.org/buildbot/#/builders/18/builds/16429
[1]
```
FAIL: Profile-powerpc64 :: Linux/instrprof-vtable-value-prof.cpp (2406 of 2414)
******************** TEST 'Profile-powerpc64 :: Linux/instrprof-vtable-value-prof.cpp' FAILED ********************
Exit Code: 1
Command Output (stderr):
--
RUN: at line 3: /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/build_debug/./bin/clang --driver-mode=g++ -m64 -ldl -fprofile-generate -fuse-ld=lld -O2 -g -fprofile-generate=. -mllvm -enable-vtable-value-profiling /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/llvm-project/compiler-rt/test/profile/Linux/instrprof-vtable-value-prof.cpp -o /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/build_debug/runtimes/runtimes-bins/compiler-rt/test/profile/Profile-powerpc64/Linux/Output/instrprof-vtable-value-prof.cpp.tmp-test
+ /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/build_debug/./bin/clang --driver-mode=g++ -m64 -ldl -fprofile-generate -fuse-ld=lld -O2 -g -fprofile-generate=. -mllvm -enable-vtable-value-profiling /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/llvm-project/compiler-rt/test/profile/Linux/instrprof-vtable-value-prof.cpp -o /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/build_debug/runtimes/runtimes-bins/compiler-rt/test/profile/Profile-powerpc64/Linux/Output/instrprof-vtable-value-prof.cpp.tmp-test
ld.lld: error: /lib/../lib64/Scrt1.o: ABI version 1 is not supported
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```
(The profile format change is split into a standalone change into https://github.com/llvm/llvm-project/pull/81691)
* For InstrFDO value profiling, implement instrumentation and lowering for virtual table address.
* This is controlled by `-enable-vtable-value-profiling` and off by default.
* When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads.
* Implement profile reader and writer support
* Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols.
* Indexed profile reader is used by `llvm-profdata` and compiler. When initialized, the reader stores a pointer to the beginning of in-memory compressed vtable names and the length of string. When used in `llvm-profdata`, reader decompress the string to show symbols of a profiled site. When used in compiler, string decompression doesn't
happen since IR is used to construct InstrProfSymtab.
* Indexed profile writer collects the list of vtable names, and stores that to index profiles.
* Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type.
* `llvm-profdata show -show-vtables <args> <profile>` is implemented.
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
This PR exposes four PGO functions
- `__llvm_profile_set_filename`
- `__llvm_profile_reset_counters`,
- `__llvm_profile_dump`
- `__llvm_orderfile_dump`
to user programs through the new header `instr_prof_interface.h` under
`compiler-rt/include/profile`. This way, the user can include the header
`profile/instr_prof_interface.h` to introduce these four names to their
programs.
Additionally, this PR defines macro `__LLVM_INSTR_PROFILE_GENERATE` when
the program is compiled with profile generation, and defines macro
`__LLVM_INSTR_PROFILE_USE` when the program is compiled with profile
use. `__LLVM_INSTR_PROFILE_GENERATE` together with
`instr_prof_interface.h` define the PGO functions only when the program
is compiled with profile generation. When profile generation is off,
these PGO functions are defined away and leave no trace in the user's
program.
Background:
https://discourse.llvm.org/t/pgo-are-the-llvm-profile-functions-stable-c-apis-across-llvm-releases/75832
When merging instrFDO profiles with afdo profile as supplementary, instrFDO counters for static functions are stored with function's PGO name (with filename.cpp; prefix).
- This pull request fixes the delimiter used when a PGO function name is 'normalized' for AFDO look-up.
Debug info correlation is an option in InstrProfiling pass, which is used by
both IR instrumentation and front-end instrumentation. So, Clang coverage can
also benefits the binary size saving from it.
Reviewed By: ellis
Differential Revision: https://reviews.llvm.org/D157913
When using debug info correlation, value profiling needs to be switched off.
So, we are only merging counter sections. In that case the existance of data
section is just used to provide an extra check in case of corrupted profile.
This patch performs counter merging by iterating the counter section by counter
size and add them together.
Reviewed By: ellis, MaskRay
Differential Revision: https://reviews.llvm.org/D157632
Original patch: D157632.
Fixed the issue when data is 0 by relaxing the condition.
Reviewed By: ellis
Differential Revision: https://reviews.llvm.org/D159000
Emit warnings when `InstrProfCorrelator` finds problems with debug info for lightweight instrumentation profile correlation. To prevent excessive printing, only emit the first 5 warnings.
In addition, remove a diagnostic about missing debug info in `InstrProfiling.cpp`. Some compiler-generated functions, e.g., `__clang_call_terminate`, does not emit debug info and will fail a build if `-Werror` is used. This warning is not actionable by the user and I have not seen non-compiler-generated functions fail this test.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D156006
When using debug info correlation, value profiling needs to be switched off.
So, we are only merging counter sections. In that case the existance of data
section is just used to provide an extra check in case of corrupted profile.
This patch performs counter merging by iterating the counter section by counter
size and add them together.
Reviewed By: ellis, MaskRay
Differential Revision: https://reviews.llvm.org/D157632
The `llvm-profdata show` command is not sufficient to check that two
profiles are identical because it doesn't show the counter values. Add
the flags `--all-functions --counts` so that counter values are checked.
This was changed in https://reviews.llvm.org/D135929 but I failed to
remember that counter values were not printed by default.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D157665
This is an ongoing series of commits that are reformatting our
Python code. This catches the last of the python files to
reformat. Since they where so few I bunched them together.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: jhenderson, #libc, Mordante, sivachandra
Differential Revision: https://reviews.llvm.org/D150784
This adds the --check-binary-id flag that makes sure that an object file
is available for every binary ID mentioned in the given profile. This
should help make the tool more robust in CI environments where it's
expected that coverage mappings should be available for every object
contributing to the profile.
Reviewed By: gulfem
Differential Revision: https://reviews.llvm.org/D144308
Since binary ID lookup makes CLI object arguments optional, it should be
possible to pass a list of source files without a binary. Unfortunately,
the current syntax will always interpret the first source file as a
binary. This change adds a `-sources` option to cause all later
positional arguments to be considered sources.
Reviewed By: gulfem
Differential Revision: https://reviews.llvm.org/D144207
Depends on https://reviews.llvm.org/D142861.
Alternative to https://reviews.llvm.org/D137601.
xxHash is much faster than djbHash. This makes a simple Rust test case with a large constant string 10% faster to compile.
Previous attempts at changing this hash function (e.g. https://reviews.llvm.org/D97396) had to be reverted due to breaking tests that depended on iteration order.
No additional tests fail with this patch compared to `main` when running `check-all` with `-DLLVM_ENABLE_PROJECTS="all"` (on a Linux host), so I hope I found everything that needs to be changed.
Differential Revision: https://reviews.llvm.org/D142862
The tests contain a redundant condition in the else if branch that can
be simplified with fb13dcf343.
Update the condition used to prevent it from getting removed.
Alive2 proof for check removal:
https://alive2.llvm.org/ce/z/iFBnsy
This patch adds support for including binary ids in an indexed profile.
It adds a new field into the header that points to the offset of the
binary id section. The binary id section consists of a size of the
section, and a list of binary ids (if they are present) that consist
of two parts: length and data.
This patch guarantees that indexed profile is backwards compatible
after adding binary ids.
Differential Revision: https://reviews.llvm.org/D135929
This caused lld on mac to assert when building instrumented clang (or
instrumented code in general). See comment on the code review for
reproducer.
> In many cases, we can use an alias to avoid a symbolic relocations,
> instead of using the public, interposable symbol. When the instrumented
> function is in a COMDAT, we can use a hidden alias, and still avoid
> references to discarded sections.
>
> New compiler-rt tests are Linux only for now.
>
> Previous versions of this patch allowed the compiler to name the
> generated alias, but that would only be valid when the functions were
> local. Since the alias may be used across TUs we use a more
> deterministic naming convention, and add a `.local` suffix to the alias
> name just as we do for relative vtables aliases.
>
> Reviewed By: phosek
>
> Differential Revision: https://reviews.llvm.org/D137982
This reverts commit c42e50fede.
In many cases, we can use an alias to avoid a symbolic relocations,
instead of using the public, interposable symbol. When the instrumented
function is in a COMDAT, we can use a hidden alias, and still avoid
references to discarded sections.
New compiler-rt tests are Linux only for now.
Previous versions of this patch allowed the compiler to name the
generated alias, but that would only be valid when the functions were
local. Since the alias may be used across TUs we use a more
deterministic naming convention, and add a `.local` suffix to the alias
name just as we do for relative vtables aliases.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D137982