This reverts commit 97fb21ac1d.
Due to an issue brought up in #112780:
> Unfortunately this breaks the build on our (automerger) bots, which
have -mmacosx-version-min=10.13 and also
-Werror=unguarded-availability-new. I was thinking about patching it
by wrapping the calls in a __builtin_available check (which I believe is
the right one to use, as it should match the -mmacosx-version-min) - but
I can't actually think of a quick fix, due to interceptors being defined
via C macros.
>
```
llvm-project/compiler-rt/lib/rtsan/rtsan_interceptors_posix.cpp:475:21: error: 'aligned_alloc' is only available on macOS 10.15 or newer [-Werror,-Wunguarded-availability-new]
  475 | INTERCEPTOR(void *, aligned_alloc, SIZE_T alignment, SIZE_T size) {
```
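For reference, the kind of guard the comment has in mind might look like this (a minimal sketch; the interceptor body is illustrative, not the actual rtsan code):

```cpp
// Sketch only: __builtin_available checks against the deployment target set
// by -mmacosx-version-min, silencing -Wunguarded-availability-new.
INTERCEPTOR(void *, aligned_alloc, SIZE_T alignment, SIZE_T size) {
  if (__builtin_available(macOS 10.15, *)) {
    // ... usual rtsan interception bookkeeping ...
    return REAL(aligned_alloc)(alignment, size);
  }
  return nullptr; // aligned_alloc does not exist before macOS 10.15
}
```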
# What
This PR renames the newly introduced LLVM attribute
`sanitize_realtime_unsafe` to `sanitize_realtime_blocking`. Likewise,
sibling variables such as `SanitizeRealtimeUnsafe` are renamed to
`SanitizeRealtimeBlocking`. There are no other functional changes.
# Why?
- There are a number of problems that can cause a function to be
real-time "unsafe".
- We wish to communicate which problems rtsan detects and *why* they are
unsafe.
- A generic "unsafe" attribute is, in our opinion, too broad a net, which
may lead to future implementations needing extra contextual information
passed through them in order to communicate meaningful reasons to users.
- We want to avoid this situation and keep the runtime library boundary
API/ABI as simple as possible.
- We believe that restricting the scope of attributes to names like
`sanitize_realtime_blocking` is an effective means of doing so.
We also feel that the symmetry between `[[clang::blocking]]` and
`sanitize_realtime_blocking` is easier to follow as a developer.
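For illustration, the correspondence reads naturally as you move between source and IR (a sketch; the function name is hypothetical and the IR is abbreviated):

```cpp
// A function the developer marks as blocking...
void wait_for_event() [[clang::blocking]];

// ...conceptually lowers to a definition carrying the renamed IR attribute:
//   define void @wait_for_event() #0 { ... }
//   attributes #0 = { sanitize_realtime_blocking }
```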
# Concerns
- I'm aware that the LLVM attribute `sanitize_realtime_unsafe` has been
part of the tree for a few weeks now (introduced here:
https://github.com/llvm/llvm-project/pull/106754). Given that it hasn't
been released in version 20 yet, am I correct in considering this not to
be a breaking change?
This fixes the remaining issues from my previous PR #90959.
Changes:
- Removed dependency on LLVM header in `xray_interface.cpp`
- Fixed XRay patching for some targets, which was broken due to missing
changes in the architecture-specific patching functions
- Addressed some remaining compiler warnings that I missed in the
previous patch
- Formatting
I have tested these changes on `x86_64` (natively), as well as
`ppc64le`, `aarch64` and `arm32` (cross-compiled and emulated using
qemu).
**Original description:**
This PR introduces shared library (DSO) support for XRay based on a
revised version of the implementation outlined in [this
RFC](https://discourse.llvm.org/t/rfc-upstreaming-dso-instrumentation-support-for-xray/73000).
The feature enables the patching and handling of events from DSOs,
supporting both libraries linked at startup and libraries loaded
explicitly, e.g. via `dlopen`.
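As a rough usage sketch (assuming both the main binary and the DSO are built with `-fxray-instrument -fxray-shared`; the handler body is illustrative):

```cpp
#include <dlfcn.h>
#include "xray/xray_interface.h"

static void Handler(int32_t FuncId, XRayEntryType Type) {
  // Record FuncId/Type; called on entry/exit of every patched function.
}

int main() {
  void *DSO = dlopen("libinstrumented.so", RTLD_NOW); // explicitly loaded DSO
  __xray_set_handler(Handler);
  __xray_patch();   // patches sleds in the main binary and all registered DSOs
  // ... run the workload ...
  __xray_unpatch();
  dlclose(DSO);
  return 0;
}
```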
This patch adds the following:
- The `-fxray-shared` flag to enable the feature (turned off by default)
- A small runtime library that is linked into every instrumented DSO,
providing position-independent trampolines and code to register with the
main XRay runtime
- Changes to the XRay runtime to support management and patching of
multiple objects
These changes are fully backward compatible, i.e. running without
instrumented DSOs will produce identical traces (in terms of recorded
function IDs) to the previous implementation.
Due to my limited ability to test on other architectures, this feature
is only implemented and tested on x86_64. Extending support to other
architectures is fairly straightforward, requiring only a
position-independent implementation of the architecture-specific
trampolines (see `compiler-rt/lib/xray/xray_trampoline_x86_64.S` for
reference).
This patch does not include any functionality to resolve function IDs
from DSOs for the provided logging/tracing modes. These modes still work
and will record calls from DSOs, but symbol resolution for these
functions is not available. Getting this to work properly requires
recording information about the loaded DSOs and should IMO be discussed
in a separate RFC, as there are multiple feasible approaches.
---------
Co-authored-by: Sebastian Kreutzer <sebastian.kreutzer@tu-darmstadt.de>
Some platforms (e.g. 64-bit CHERI) have stronger alignment requirements
on values returned from allocators. For all other platforms this does
not result in any functional change.
Reviewed By: cjappl, vitalybuka
Pull Request: https://github.com/llvm/llvm-project/pull/84440
According to the Arm Architecture Reference Manual for A-profile
architecture, you can't have one feature without the other:
ID_AA64ZFR0_EL1.AES, bits [7:4]
> FEAT_SVE_AES implements the functionality identified by the value
0b0001.
> FEAT_SVE_PMULL128 implements the functionality identified by the value
0b0010.
> The permitted values are 0b0000 and 0b0010.
(The following was removed from the latest release of the specification,
but that appears to be a mistake that was not intended to relax the
architectural constraints. The discrepancy has been reported.)
ID_AA64ISAR0_EL1.AES, bits [7:4]
> FEAT_AES implements the functionality identified by the value 0b0001.
> FEAT_PMULL implements the functionality identified by the value
0b0010.
> From Armv8, the permitted values are 0b0000 and 0b0010.
Approved in ACLE as https://github.com/ARM-software/acle/pull/352
If we split these features in the compiler (see the relevant pull request
https://github.com/llvm/llvm-project/pull/109299), we would only be able
to hand-write a 'memtag2' version using inline assembly, since the
compiler cannot generate the instructions that become available with
FEAT_MTE2. However, these instructions only work at Exception Level 1, so
they would be unusable, since FMV is a user-space facility. I am
therefore unifying them.
Approved in ACLE as https://github.com/ARM-software/acle/pull/351
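For context, AArch64 FMV dispatches between versions declared with `target_version`; after the unification a single version name covers the user-space-visible MTE functionality (a hedged sketch; the function is hypothetical and the feature spelling follows the ACLE FMV feature list):

```cpp
// Selected on hardware that reports MTE support; otherwise the default runs.
__attribute__((target_version("memtag"))) int probe(void) { return 1; }
__attribute__((target_version("default"))) int probe(void) { return 0; }
```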
For such threads we have no registers, so no exact
stack range, and no guarantee that the stack is mapped
at all.
To avoid crashes on unmapped memory,
`MemCpyAccessible` copies the interesting range into
a temporary buffer, and we search for pointers there.
The instruction is present in some library in the 24H2 update for
Windows 11:
```
==8508==interception_win: unhandled instruction at 0x7ff83e193a40: 44 0f b6 1a 4c 8b d2 48
```
This could be generalized, but getting all the ModR/M byte combinations
right is tricky. Many other classes of instructions handled in this file
could use some generalization too.
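A hedged sketch of the kind of entry this adds, modeled on the switch-based size decoder in `interception_win.cpp` (the surrounding function is simplified; `44 0f b6 1a` decodes to `movzx r11d, byte ptr [rdx]`, a 4-byte instruction):

```cpp
// Match the first four instruction bytes, read little-endian.
static size_t InstructionSizeSketch(const unsigned char *address) {
  switch (*reinterpret_cast<const unsigned int *>(address)) {
    case 0x1ab60f44:  // 44 0f b6 1a : movzx r11d, BYTE PTR [rdx]
      return 4;
    // ... many more hand-listed patterns ...
  }
  return 0;  // unhandled instruction
}
```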
For POSIX, the implementation is similar to
`IsAccessibleMemoryRange`, using `pipe`.
We need this because we can't rely on a non-atomic
`IsAccessibleMemoryRange` + `memcpy`: the
protection or mapping may change in between and we may
crash.
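A minimal self-contained sketch of the pipe trick (assumed shape only; the real helper lives in sanitizer_common and differs in details):

```cpp
#include <stddef.h>
#include <unistd.h>

// Probe readability of [p, p + size) without risking a crash: write() returns
// -1 with EFAULT for an unmapped source range instead of faulting the process.
static bool ProbeReadable(const void *p, size_t size) {
  int fds[2];
  if (pipe(fds) != 0) return false;
  const char *cur = static_cast<const char *>(p);
  char drain[4096];
  bool ok = true;
  while (size > 0 && ok) {
    size_t chunk = size < sizeof(drain) ? size : sizeof(drain);
    ssize_t written = write(fds[1], cur, chunk);
    if (written <= 0) {
      ok = false;  // EFAULT: the range is not readable
    } else {
      // Drain what we just wrote so the pipe buffer never fills up; a full
      // pipe would make the next write() block.
      ssize_t drained = read(fds[0], drain, static_cast<size_t>(written));
      (void)drained;
      cur += written;
      size -= static_cast<size_t>(written);
    }
  }
  close(fds[0]);
  close(fds[1]);
  return ok;
}
```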
The comment stated that it's slow, but more likely it's a deadlock,
as the write can block.
Also, we can't be sure that `page_size * 10` is an appropriate size.
Still, this is most likely NFC, as the max `size` we use is 32,
which should fit in any buffer.
This PR registers the writeout and reset functions for `gcov` for all
modules in the PGO runtime, instead of registering them
using global constructors in each module. The change is made for AIX
only, but the same mechanism works on Linux on Power.
When registering such functions using global constructors in each module
without `-ffunction-sections`, the AIX linker cannot garbage collect
unused undefined symbols, because such symbols are grouped in the same
section as the `__sinit` symbol. Keeping such undefined symbols causes
link errors (see test case
https://github.com/llvm/llvm-project/pull/108570/files#diff-500a7e1ba871e1b6b61b523700d5e30987900002add306e1b5e4972cf6d5a4f1R1
for this scenario). This PR implements the initialization in the
runtime, hence avoiding introducing `__sinit` into each module.
The implementation adds a new global variable `__llvm_covinit_functions`
to each module. This new global variable contains the function pointers
to the `Writeout` and `Reset` functions. `__llvm_covinit_functions`'s
section is the named section `__llvm_covinit`. The linker will aggregate
all the `__llvm_covinit` sections from each module
to form one single named section in the final binary. The pair of
functions
```
const __llvm_gcov_init_func_struct *__llvm_profile_begin_covinit();
const __llvm_gcov_init_func_struct *__llvm_profile_end_covinit();
```
are implemented to return the start and end address of this named
section in the final binary, and they are used in function
```
__llvm_profile_gcov_initialize()
```
(which is a constructor function in the runtime) so the runtime knows
the addresses of all the `Writeout` and `Reset` functions from all the
modules.
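Conceptually, the runtime constructor can then walk the aggregated section (a sketch: the struct fields and the `llvm_gcov_init` registration call are assumptions modeled on the existing GCDA profiling runtime):

```cpp
typedef void (*fn_ptr)(void);

// Assumed layout of one __llvm_covinit entry emitted per module.
struct __llvm_gcov_init_func_struct {
  fn_ptr WriteoutFunction;
  fn_ptr ResetFunction;
};

const __llvm_gcov_init_func_struct *__llvm_profile_begin_covinit();
const __llvm_gcov_init_func_struct *__llvm_profile_end_covinit();
void llvm_gcov_init(fn_ptr wfn, fn_ptr rfn); // existing registration hook

void __llvm_profile_gcov_initialize() {
  for (const __llvm_gcov_init_func_struct *I = __llvm_profile_begin_covinit(),
                                          *E = __llvm_profile_end_covinit();
       I != E; ++I)
    llvm_gcov_init(I->WriteoutFunction, I->ResetFunction);
}
```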
One noticeable implementation detail relevant to AIX is that, to preserve
the `__llvm_covinit` section from the linker's garbage collection, a
`.ref` pseudo-instruction is inserted into it, referring to the section
that contains the `__llvm_gcov_ctr` variables used in the instrumented
code. The `__llvm_gcov_ctr` variables did not belong to named sections
before, but this PR adds them to the `__llvm_gcov_ctr_section` named
section, so we can add a `.ref` pseudo-instruction that refers to them in
the `__llvm_covinit` section.
The `check-fuzzer` target runs fine with a cl-built LLVM, but the
following lit tests fail with a clang-cl-built LLVM:
```
********************
Timed Out Tests (2):
libFuzzer-x86_64-default-Windows :: fork-ubsan.test
libFuzzer-x86_64-default-Windows :: fuzzer-oom.test
********************
Failed Tests (22):
libFuzzer-x86_64-default-Windows :: acquire-crash-state.test
libFuzzer-x86_64-default-Windows :: cross_over_copy.test
libFuzzer-x86_64-default-Windows :: cross_over_insert.test
libFuzzer-x86_64-default-Windows :: exit_on_src_pos.test
libFuzzer-x86_64-default-Windows :: fuzzer-alignment-assumption.test
libFuzzer-x86_64-default-Windows :: fuzzer-implicit-integer-sign-change.test
libFuzzer-x86_64-default-Windows :: fuzzer-implicit-signed-integer-truncation-or-sign-change.test
libFuzzer-x86_64-default-Windows :: fuzzer-implicit-signed-integer-truncation.test
libFuzzer-x86_64-default-Windows :: fuzzer-implicit-unsigned-integer-truncation.test
libFuzzer-x86_64-default-Windows :: fuzzer-printcovpcs.test
libFuzzer-x86_64-default-Windows :: fuzzer-timeout.test
libFuzzer-x86_64-default-Windows :: fuzzer-ubsan.test
libFuzzer-x86_64-default-Windows :: minimize_crash.test
libFuzzer-x86_64-default-Windows :: minimize_two_crashes.test
libFuzzer-x86_64-default-Windows :: null-deref-on-empty.test
libFuzzer-x86_64-default-Windows :: null-deref.test
libFuzzer-x86_64-default-Windows :: print-func.test
libFuzzer-x86_64-default-Windows :: stack-overflow-with-asan.test
libFuzzer-x86_64-default-Windows :: trace-malloc-2.test
libFuzzer-x86_64-default-Windows :: trace-malloc-unbalanced.test
libFuzzer-x86_64-default-Windows :: trace-malloc.test
```
The related commits are
53a81d4d26
and
e31efd8f6f.
Following the change in
e31efd8f6f
fixes these failures.
As for the issue mentioned in the comment that alternatename support in
clang is not good enough (https://bugs.llvm.org/show_bug.cgi?id=40218):
I found that using `__builtin_function_start(func)` instead of using
`func` directly makes it work as intended.
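A hedged sketch of the pattern (names hypothetical; the real macros live in libFuzzer's Windows support code):

```cpp
// Fallback used when the user does not define the hook.
extern "C" void HookDefault(void) {}
extern "C" void Hook(void);
// MSVC-style weak aliasing: resolve Hook to HookDefault if it is undefined.
#pragma comment(linker, "/alternatename:Hook=HookDefault")

// Detect whether the user supplied their own Hook. Taking &Hook directly did
// not behave as intended under clang-cl (see the bug above);
// __builtin_function_start compares the actual function start addresses.
static bool UserProvidedHook() {
  return __builtin_function_start(Hook) !=
         __builtin_function_start(HookDefault);
}
```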
The goal is to move the `SuspendedThreadsList`-related code to
the beginning of the loop, and to prepare for extracting the rest
of the loop body into a function.
Introduces a new type of suppression:
1. function-name-matches - allows users to disable `malloc`, `free`,
`pthread_mutex_lock` or similar. This could be helpful if a user thinks
these are real-time safe on their OS. It also allows disabling any
function marked `[[blocking]]`.
This is useful as a **more performant "early out" compared to the
`call-stack-contains` suppression**. `call-stack-contains` is inherently
VERY costly, needing to inspect every frame of every stack for a
matching string. This new suppression bails out before we unwind
the stack.
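For example, a suppressions file along these lines (passed the usual sanitizer way, e.g. `RTSAN_OPTIONS=suppressions=rtsan.supp`; the entries are illustrative) skips unwinding entirely for the named functions:

```
function-name-matches:malloc
function-name-matches:pthread_mutex_lock
call-stack-contains:MyKnownRealtimeSafePath
```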
Fixes a bug where a device that supports tagged pointers doesn't use
the tagged pointer when computing the checksum.
Adds tests to verify that double frees result in a chunk-state error,
not corrupted-header errors.
The important part of the test is to have a correct
`ThreadDescriptorSize` after `InitTlsSize()`.
It's not a problem if another test called
`InitTlsSize()` earlier.
Fixes #112399.
This PR adds a `__sanitizer_copy_contiguous_container_annotations`
function, which copies annotations from one memory area to another. The
new area is annotated in the same way as the old region at the beginning
(within the limitations of ASan).
Overlapping case: the function supports overlapping containers, but no
guarantees are made beyond the absence of false positives in the new
buffer area. (It doesn't modify old container annotations where it's not
necessary, so false negatives may happen in the edge granules of the new
container area.) I don't expect this function to be used with
overlapping buffers, but it's designed to work with them and not to
produce incorrect ASan errors (false positives).
If the buffers have a granularity-aligned distance between them (`old_beg %
granularity == new_beg % granularity`), the copying algorithm works
faster. If the distance is not granularity-aligned, annotations are
copied byte by byte.
```cpp
void __sanitizer_copy_contiguous_container_annotations(
    const void *old_storage_beg_p, const void *old_storage_end_p,
    const void *new_storage_beg_p, const void *new_storage_end_p) {
```
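A hypothetical usage sketch (buffer names and sizes are illustrative; the declaration is assumed to come from the sanitizer interface headers):

```cpp
#include <sanitizer/common_interface_defs.h>

void relocate_annotations() {
  alignas(8) static char old_buf[32];
  alignas(8) static char new_buf[32];
  // ... old_buf holds an annotated container's storage ...
  __sanitizer_copy_contiguous_container_annotations(
      old_buf, old_buf + sizeof(old_buf),
      new_buf, new_buf + sizeof(new_buf));
  // new_buf now carries the addressability annotations old_buf had.
}
```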
This function aims to help with short-string annotations and similar
container annotations. Right now we change the trait types of
`std::basic_string` when compiling with ASan, and the purpose of this
function is to let us revert that change as soon as possible.
87f3407856/libcxx/include/string (L738-L751)
The goal is to not change `__trivially_relocatable` when compiling with
ASan. If this function is accepted and upstreamed, the next step is
creating a function like `__memcpy_with_asan` that moves memory with
ASan awareness, and then using that function instead of
`__builtin_memcpy` when moving trivially relocatable objects.
11a6799740/libcxx/include/__memory/uninitialized_algorithms.h (L644-L646)
---
I'm wondering whether there is a good way to address the fact that in a
container the new buffer is usually bigger than the previous one. We may
add two more arguments to the function to address it (the beginning and
the end of the whole buffer).
Another potential change is removing `new_storage_end_p`, as it's
redundant given that we require both regions to have the same size.
Potential future work is creating a function `__asan_unsafe_memmove`,
which would basically be memmove but with instrumentation turned off
(therefore allowing data to be copied from a poisoned area).
---------
Co-authored-by: Vitaly Buka <vitalybuka@google.com>