Fixes#83844.
This PR adds callbacks to mark futex syscalls as blocking. Unfortunately
we didn't have a mechanism before to mark syscalls as a blocking call,
so I had to implement it, but it mostly reuses the `BlockingCall`
implementation
[here](96819daa3d/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp (L362-L380)).
The issue includes some information but this issue was discovered
because Rust uses futexes directly. So most likely we need to update
Rust as well to use these callbacks.
Also see the latest comments in #85188 for some context.
I also sent another PR #84162 to mark `pthread_*_lock` calls as
blocking.
New change on top of [reviewed
patch](https://github.com/llvm/llvm-project/pull/81691) are [in commits
after this
one](d0757f46b3).
Previous commits are restored from the remote branch with timestamps.
1. Fix build breakage for non-ELF platforms, by defining the missing
functions {`__llvm_profile_begin_vtables`, `__llvm_profile_end_vtables`,
`__llvm_profile_begin_vtabnames `, `__llvm_profile_end_vtabnames`}
everywhere.
* Tested on mac laptop (for darwins) and Windows. Specifically,
functions in `InstrProfilingPlatformWindows.c` returns `NULL` to make it
more explicit that type prof isn't supported; see comments for the
reason.
* For the rest (AIX, other), mostly follow existing examples (like this
[one](f95b2f1acf))
2. Rename `__llvm_prf_vtabnames` -> `__llvm_prf_vns` for shorter section
name, and make returned pointers
[const](a825d2a4ec (diff-4de780ce726d76b7abc9d3353aef95013e7b21e7bda01be8940cc6574fb0b5ffR120-R121))
**Original Description**
* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
- Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/write change and llvm-profdata change.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600
---------
Co-authored-by: modiking <modiking213@gmail.com>
This is a follow up of
7b9d73c2f9
and
5ef9ba7412
* The definition of `Type::getInt8PtrTy` is deleted. This doesn't cause
a compile error because the `Initializer` part of the macro doesn't run.
* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
- Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/write change and llvm-profdata change.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600
---------
Co-authored-by: modiking <modiking213@gmail.com>
There were buildbot failures when running memprof tests:
Failed Tests (12):
MemProfiler-x86_64-linux :: TestCases/interface_test.cpp
MemProfiler-x86_64-linux :: TestCases/log_path_test.cpp
MemProfiler-x86_64-linux :: TestCases/memprof_merge_mib.cpp
MemProfiler-x86_64-linux :: TestCases/memprof_profile_dump.cpp
MemProfiler-x86_64-linux :: TestCases/profile_reset.cpp
MemProfiler-x86_64-linux :: TestCases/unaligned_loads_and_stores.cpp
MemProfiler-x86_64-linux-dynamic :: TestCases/interface_test.cpp
MemProfiler-x86_64-linux-dynamic :: TestCases/log_path_test.cpp
MemProfiler-x86_64-linux-dynamic :: TestCases/memprof_merge_mib.cpp
MemProfiler-x86_64-linux-dynamic :: TestCases/memprof_profile_dump.cpp
MemProfiler-x86_64-linux-dynamic :: TestCases/profile_reset.cpp
MemProfiler-x86_64-linux-dynamic ::
TestCases/unaligned_loads_and_stores.cpp
See
- https://lab.llvm.org/buildbot/#/builders/258/builds/8852
- https://lab.llvm.org/buildbot/#/builders/258/builds/12876
I suspect the failure is because when build with
-DLLVM_ENABLE_RUNTIMES=compiler-rt -DCOMPILER_RT_BUILD_SANITIZERS=OFF,
the headers sanitizer/allocator_interface.h and
sanitizer/common_interface_defs.h
are not copied to the build tree, and not installed.
But in the failed memprof tests,
sanitizer/allocator_interface.h or sanitizer/memprof_interface.h is
included.
This patch adds sanitizer/allocator_interface.h and
sanitizer/memprof_interface.h to memprof headers if
COMPILER_RT_BUILD_SANITIZERS is false.
This PR exposes four PGO functions
- `__llvm_profile_set_filename`
- `__llvm_profile_reset_counters`,
- `__llvm_profile_dump`
- `__llvm_orderfile_dump`
to user programs through the new header `instr_prof_interface.h` under
`compiler-rt/include/profile`. This way, the user can include the header
`profile/instr_prof_interface.h` to introduce these four names to their
programs.
Additionally, this PR defines macro `__LLVM_INSTR_PROFILE_GENERATE` when
the program is compiled with profile generation, and defines macro
`__LLVM_INSTR_PROFILE_USE` when the program is compiled with profile
use. `__LLVM_INSTR_PROFILE_GENERATE` together with
`instr_prof_interface.h` define the PGO functions only when the program
is compiled with profile generation. When profile generation is off,
these PGO functions are defined away and leave no trace in the user's
program.
Background:
https://discourse.llvm.org/t/pgo-are-the-llvm-profile-functions-stable-c-apis-across-llvm-releases/75832
## Motivation
Since we don't need the metadata sections at runtime, we can somehow
offload them from memory at runtime. Initially, I explored [debug info
correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113),
which is used for PGO with value profiling disabled. However, it
currently only works with DWARF and it's be hard to add such artificial
debug info for every function in to CodeView which is used on Windows.
So, offloading profile metadata sections at runtime seems to be a
platform independent option.
## Design
The idea is to use new section names for profile name and data sections
and mark them as metadata sections. Under this mode, the new sections
are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime
and can be stripped away as a post-linking step. After the process
exits, the generated raw profiles will contains only headers + counters.
llvm-profdata can be used correlate raw profiles with the unstripped
binary to generate indexed profile.
## Data
For chromium base_unittests with code coverage on linux, the binary size
overhead due to instrumentation reduced from 64M to 38.8M (39.4%) and
the raw profile files size reduce from 128M to 68M (46.9%)
```
$ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped
FILE SIZE VM SIZE
-------------- --------------
+121% +30.4Mi +121% +30.4Mi .text
[NEW] +14.6Mi [NEW] +14.6Mi __llvm_prf_data
[NEW] +10.6Mi [NEW] +10.6Mi __llvm_prf_names
[NEW] +5.86Mi [NEW] +5.86Mi __llvm_prf_cnts
+95% +1.75Mi +95% +1.75Mi .eh_frame
+108% +400Ki +108% +400Ki .eh_frame_hdr
+9.5% +211Ki +9.5% +211Ki .rela.dyn
+9.2% +95.0Ki +9.2% +95.0Ki .data.rel.ro
+5.0% +87.3Ki +5.0% +87.3Ki .rodata
[ = ] 0 +13% +47.0Ki .bss
+40% +1.78Ki +40% +1.78Ki .got
+12% +1.49Ki +12% +1.49Ki .gcc_except_table
[ = ] 0 +65% +1.23Ki .relro_padding
+62% +1.20Ki [ = ] 0 [Unmapped]
+13% +448 +19% +448 .init_array
+8.8% +192 [ = ] 0 [ELF Section Headers]
+0.0% +136 +0.0% +80 [7 Others]
+0.1% +96 +0.1% +96 .dynsym
+1.2% +96 +1.2% +96 .rela.plt
+1.5% +80 +1.2% +64 .plt
[ = ] 0 -99.2% -3.68Ki [LOAD #5 [RW]]
+195% +64.0Mi +194% +64.0Mi TOTAL
$ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped
FILE SIZE VM SIZE
-------------- --------------
+121% +30.4Mi +121% +30.4Mi .text
[NEW] +5.86Mi [NEW] +5.86Mi __llvm_prf_cnts
+95% +1.75Mi +95% +1.75Mi .eh_frame
+108% +400Ki +108% +400Ki .eh_frame_hdr
+9.5% +211Ki +9.5% +211Ki .rela.dyn
+9.2% +95.0Ki +9.2% +95.0Ki .data.rel.ro
+5.0% +87.3Ki +5.0% +87.3Ki .rodata
[ = ] 0 +13% +47.0Ki .bss
+40% +1.78Ki +40% +1.78Ki .got
+12% +1.49Ki +12% +1.49Ki .gcc_except_table
+13% +448 +19% +448 .init_array
+0.1% +96 +0.1% +96 .dynsym
+1.2% +96 +1.2% +96 .rela.plt
+1.2% +64 +1.2% +64 .plt
+2.9% +64 [ = ] 0 [ELF Section Headers]
+0.0% +40 +0.0% +40 .data
+1.2% +32 +1.2% +32 .got.plt
+0.0% +24 +0.0% +8 [5 Others]
[ = ] 0 -22.9% -872 [LOAD #5 [RW]]
-74.5% -1.44Ki [ = ] 0 [Unmapped]
[ = ] 0 -76.5% -1.45Ki .relro_padding
+118% +38.8Mi +117% +38.8Mi TOTAL
```
A few things to note:
1. llvm-profdata doesn't support filter raw profiles by binary id yet,
so when a raw profile doesn't belongs to the binary being digested by
llvm-profdata, merging will fail. Once this is implemented,
llvm-profdata should be able to only merge raw profiles with the same
binary id as the binary and discard the rest (with mismatched/missing
binary id). The workflow I have in mind is to have scripts invoke
llvm-profdata to get all binary ids for all raw profiles, and
selectively choose the raw pnrofiles with matching binary id and the
binary to llvm-profdata for merging.
2. Note: In COFF, currently they are still loaded into memory but not
used. I didn't do it in this patch because I noticed that `.lcovmap` and
`.lcovfunc` are loaded into memory. A separate patch will address it.
3. This should works with PGO when value profiling is disabled as debug
info correlation currently doing, though I haven't tested this yet.
Add __memprof_profile_reset() interface which can be used to facilitate
dumping multiple rounds of profiles from a single binary run. This
closes the current file descriptor and resets the internal file
descriptor to invalid (-1), which ensures the underlying writer reopens
the recorded profile filename. This can be used once the client is done
moving or copying a dumped profile, to prepare for reinvoking profile
dumping.
As an intrinsic, `_ReturnAddress` does not need it; additionally,
if someone else declares `_ReturnAddress` without `__cdecl` (for
example, `<intrin.h>`)
Additionally, actually add a test for this change. I've tested it
locally with both LLVM and MSVC.
This adds a new helper that can be called from application code to
ensure that no mutexes are held on specific code paths. This is useful
for multiple scenarios, including ensuring no locks are held:
- at thread exit
- in peformance-critical code
- when a coroutine is suspended (can cause deadlocks)
See this discourse thread for more discussion:
https://discourse.llvm.org/t/add-threadsanitizer-check-to-prevent-coroutine-suspending-while-holding-a-lock-potential-deadlock/74051
This resubmits and fixes#69372 (was reverted because of build
breakage).
This also includes the followup change #71471 (to fix a land race).
The new lit test fails, see comment on the PR. This also reverts
the follow-up commit, see below.
> This adds a new helper that can be called from application code to
> ensure that no mutexes are held on specific code paths. This is useful
> for multiple scenarios, including ensuring no locks are held:
>
> - at thread exit
> - in peformance-critical code
> - when a coroutine is suspended (can cause deadlocks)
>
> See this discourse thread for more discussion:
>
> https://discourse.llvm.org/t/add-threadsanitizer-check-to-prevent-coroutine-suspending-while-holding-a-lock-potential-deadlock/74051
This reverts commit bd841111f3.
This reverts commit 16a395b74d.
in https://github.com/llvm/llvm-project/pull/69625 @strega-nil added
cdecl to a huge number of sanitizer interface declarations. It looks
like she was racing against @kennyyu adding a tsan interface function. I
noticed this when merging in the latest changes from llvm/main and
corrected it.
Co-authored-by: Charlie Barto <Charles.Barto@microsoft.com>
Public headers intended for user code should not define `__has_feature`,
because this can break preprocessor checks done later in user code, e.g.
if they test `#ifdef __has_feature` to check for real support in the
compiler.
Replace the only use in the public header with a check for it being
supported before trying to use it. Define the fallback definition in the
internal headers, so that other internal sanitizer headers can continue
to use it as preferred.
This resolves a bug reported to GCC as https://gcc.gnu.org/PR109882
This is necessary for many projects which pass `/Gz` to their compiles,
which makes their default calling convention `__stdcall`.
(personal note, I _really_ wish there was a pragma for this)
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.
Differential Revision: https://reviews.llvm.org/D138846
This seems to cause Clang to crash, see comments on the code review. Reverting
until the problem can be investigated.
> Part 1 of 3. This includes the LLVM back-end processing and profile
> reading/writing components. compiler-rt changes are included.
>
> Differential Revision: https://reviews.llvm.org/D138846
This reverts commit a50486fd73.
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.
Differential Revision: https://reviews.llvm.org/D138846
This makes the implicit conversion that is happening explicit.
Otherwise, each user is forced to suppress this
implicit-integer-sign-change runtime error in their their UBSAN
suppressions file.
For example, the runtime error might look like:
runtime error: implicit conversion from type 'long' of value -9223372036854775808 (64-bit, signed) to type 'uint64_t' (aka 'unsigned long') changed the value to 9223372036854775808 (64-bit, unsigned)
#0 0x55fe29dea91d in long FuzzedDataProvider::ConsumeIntegralInRange<long>(long, long) src/./test/fuzz/FuzzedDataProvider.h:233:25
[...]
SUMMARY: UndefinedBehaviorSanitizer: implicit-integer-sign-change test/fuzz/FuzzedDataProvider.h:233:25 in
Differential Revision: https://reviews.llvm.org/D155206
The Clang built-in function is void __xray_typedevent(size_t, const void *, size_t),
but the LLVM intrinsics has smaller integer types. Since we only allow
64-bit ELF/Mach-O targets, we can change llvm.xray.typedevent to
i64/ptr/i64.
This allows encoding more information and avoids i16 legalization for
many non-X86 targets.
fdrLoggingHandleTypedEvent only supports uint16_t event type.
The primary motivation for this change is to allow FreeHooks to obtain
the allocated size of the pointer being freed in a fast, efficient manner.
Differential Revision: https://reviews.llvm.org/D151360
Revert "[xray] Ignore -Wc++20-extensions in xray_records.h [NFC]"
Not needed. The fix is 3826a74fc7.
This reverts commit 231c1d4134.
This reverts commit 7f191e6d2c.
This revision updates the description of
`__sanitizer_annotate_contiguous_container` in includes. Possibilites of
the function were changed in D132522 and it supports:
- unaligned beginning,
- shared first/last granule with other objects.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D149341
This broke the lit tests on Mac, see comment on the code review.
> This change the types to match the ones used in:
> Darwin/debug_external.cpp
> debugging.cpp
>
> Reviewed By: vitalybuka
>
> Differential Revision: https://reviews.llvm.org/D148214
This reverts commit ea7d6e658e.
This change the types to match the ones used in:
Darwin/debug_external.cpp
debugging.cpp
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D148214
It broke lit tests on Mac, see comments on the code review.
> Reviewed By: vitalybuka, dvyukov
>
> Differential Revision: https://reviews.llvm.org/D147337
This reverts commit ebb0f1d063 and
follow-up commit 3c83aeee6b.
As described in [0], this extends IRPGO to support //Temporal Profiling//.
When `-pgo-temporal-instrumentation` is used we add the `llvm.instrprof.timestamp()` intrinsic to the entry of functions which in turn gets lowered to a call to the compiler-rt function `INSTR_PROF_PROFILE_SET_TIMESTAMP()`. A new field in the `llvm_prf_cnts` section stores each function's timestamp. Then in `llvm-profdata merge` we convert these function timestamps into a //trace// and add it to the indexed profile.
Since these traces could significantly increase the profile size, we've added `-max-temporal-profile-trace-length` and `-temporal-profile-trace-reservoir-size` to limit the length of a trace and the number of traces in a profile, respectively.
In a future diff we plan to use these traces to construct an optimized function order to reduce the number of page faults during startup.
Special thanks to Julian Mestre for helping with reservoir sampling.
[0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068
Reviewed By: snehasish
Differential Revision: https://reviews.llvm.org/D147287
D147005 introduced __sanitizer_get_allocated_begin, with a return
value of void*. This involved a few naughty casts that dropped the
const. This patch adds back the const qualifier.
Differential Revision: https://reviews.llvm.org/D147489