`DlSymAllocator` allows early allocations, when
tsan is not yet initialized, e.g. from `dlsym`.
All other sanitizers with interceptors already use
`DlSymAllocator`.
Existing `in_symbolizer()` tsan logic is very similar.
However, we need to keep both as `DlSymAllocator`
does not support large allocations, needed for Symolizer.
Fixes asan, msan crash on check added in #108684.
The #108684 includes reproducer of the issue.
Change interface of `GetThreadStackAndTls` to
set `tls_begin` and `tls_end` at the same time.
We can use an internal linkage variable to make it clear the variable is
not exported. The special section .preinit_array is a GC root.
Pull Request: https://github.com/llvm/llvm-project/pull/98584
In extremely rare cases (estimated 1 in 3 million), minor allocations
that happen after the memory layout was checked in
InitializePlatformEarly() [1] may result in the memory layout
unexpectedly being incompatible in InitializePlatform(). We fix this by
adding another memory layout check (and opportunity to re-exec without
ASLR) in InitializePlatform().
To improve future debuggability, this patch also dumps the process map
if the memory layout is unexpectedly incompatible.
[1]
```
__sanitizer::InitializePlatformEarly();
__tsan::InitializePlatformEarly();
#if !SANITIZER_GO
InitializeAllocator(); // <-- ~8MB mmap'ed
ReplaceSystemMalloc();
#endif
if (common_flags()->detect_deadlocks)
ctx->dd = DDetector::Create(flags()); // <-- ~4MB mmap'ed
Processor *proc = ProcCreate(); // <-- ~1MB mmap'ed
ProcWire(proc, thr);
InitializeInterceptors(); <-- ~3MB mmap'ed
InitializePlatform();
```
MemoryAccessRangeT overestimates the size of the shadow region by 8x,
occasionally leading to assertion failure:
```
RawShadow* shadow_mem = MemToShadow(addr);
...
// Check that end of shadow is valid
if (!IsShadowMem(shadow_mem + size * kShadowCnt - 1)) {
DCHECK(IsShadowMem(shadow_mem + size * kShadowCnt - 1));
```
It is erroneous for two separate reasons:
- it uses kShadowCnt (== 4) instead of kShadowMultiplier (== 2)
- since shadow_mem is a RawShadow*, pointer arithmetic is multiplied by
sizeof(RawShadow) == 4
This patch fixes the calculation, and also improves the debugging
information.
The assertion error was observed on a buildbot
(https://lab.llvm.org/staging/#/builders/89/builds/656/steps/13/logs/stdio):
```
Bad shadow addr 0x3000000190bc (7fffffffe85f)
ThreadSanitizer: CHECK failed: tsan_rtl_access.cpp:690 "((IsShadowMem(shadow_mem + size * kShadowCnt - 1))) != (0)" (0x0, 0x0) (tid=2202676)
```
Notice that 0x3000000190bc is not the correct shadow for the end address
0x7fffffffe85f.
This error is more commonly observed on high-entropy ASLR systems, since
ASLR may be disabled (if the randomized memory layout is incompatible),
leading to an allocation near the boundaries of the high app memory
region (and therefore a shadow end that may be erroneously calculated to
be past the end of the shadow region). Also note that the assertion is
guarded by SANITIZER_DEBUG.
---------
Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
Start functions with BTI in order to identify the function as a valid
branch target.
Also add the BTI marker to tsan_rtl_aarch64.S.
With this patch, libclang_rt.tsan.so can now be generated with
DT_AARCH64_BTI_PLT when built with -mbranch-protection=standard.
According to https://reviews.llvm.org/D114250
this was to handle Mac specific issue, however
the test is Linux only.
The test effectively prevents to lock main allocator
on fork, but we do that on Linux for other
sanitizers for years, and need to do the same
for TSAN to avoid deadlocks.
We use REAL() calls in interceptors, but
DEFINE_REAL_PTHREAD_FUNCTIONS has nothing to do
with them and only used for internal maintenance
threads.
This is done to avoid confusion like in #96456.
Sometime tsan runtimes calls, like
`__tsan_mutex_create ()`, need to store a stack
in the StackDepot, and the Depot may need to start
and maintenance thread.
Example:
```
__sanitizer::FutexWait ()
__sanitizer::Semaphore::Wait ()
__sanitizer::Mutex::Lock ()
__tsan::SlotLock ()
__tsan::SlotLocker::SlotLocker ()
__tsan::Acquire ()
__tsan::CallUserSignalHandler ()
__tsan::ProcessPendingSignalsImpl ()
__tsan::ProcessPendingSignals ()
__tsan::ScopedInterceptor::~ScopedInterceptor ()
___interceptor_mmap ()
pthread_create ()
__sanitizer::internal_start_thread ()
__sanitizer::(anonymous namespace)::CompressThread::NewWorkNotify ()
__sanitizer::StackDepotNode::store ()
__sanitizer::StackDepotBase<__sanitizer::StackDepotNode, 1, 20>::Put ()
__tsan::CurrentStackId ()
__tsan::MutexCreate ()
__tsan_mutex_create ()
```
pthread_create() implementation may hit other
interceptors recursively, which may invoke
ProcessPendingSignals, which deadlocks.
Alternative solution could be block interceptors
closer to TSAN runtime API function, like
`__tsan_mutex_create`, or just before
`StackDepotPut``, but it's not needed for most
calls, only when new thread is created using
`real_pthread_create`.
I don't see a reasonable way to create a
regression test.
Update the folder titles for targets in the monorepository that have not
seen taken care of for some time. These are the folders that targets are
organized in Visual Studio and XCode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake's IDE generator.
* Ensure that every target is in a folder
* Use a folder hierarchy with each LLVM subproject as a top-level folder
* Use consistent folder names between subprojects
* When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property`, but are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
In glibc versions before 2.33. `libc_nonshared.a` defines
`__fxstat/__fxstat64` but there is no `fstat/fstat64`. glibc 2.33 added
`fstat/fstat64` and obsoleted `__fxstat/__fxstat64`. Ports added after
2.33 do not provide `__fxstat/__fxstat64`, so our `fstat/fstat64`
interceptors using `__fxstat/__fxstat64` interceptors would lead to
runtime failures on such ports (LoongArch and certain RISC-V ports).
Similar to https://reviews.llvm.org/D118423, refine the conditions that
we define fstat{,64} interceptors. `fstat` is supported by musl/*BSD
while `fstat64` is glibc only.
Fixes#83844.
This PR adds callbacks to mark futex syscalls as blocking. Unfortunately
we didn't have a mechanism before to mark syscalls as a blocking call,
so I had to implement it, but it mostly reuses the `BlockingCall`
implementation
[here](96819daa3d/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp (L362-L380)).
The issue includes some information but this issue was discovered
because Rust uses futexes directly. So most likely we need to update
Rust as well to use these callbacks.
Also see the latest comments in #85188 for some context.
I also sent another PR #84162 to mark `pthread_*_lock` calls as
blocking.
Fixes#83561.
When a thread is blocked on a mutex and we send an async signal to that
mutex, it never arrives because tsan thinks that `pthread_mutex_lock` is
not a blocking function. This patch marks `pthread_*_lock` functions as
blocking so we can successfully deliver async signals like `SIGPROF`
when the thread is blocked on them.
See the issue also for more details. I also added a test, which is a
simplified version of the compiler explorer example I posted in the
issue.
Please let me know if you have any other ideas or things to improve!
Happy to work on them.
Also I filed #83844 which is more tricky because we don't have a libc
wrapper for `SYS_futex`. I'm not sure how to intercept this yet. Please
let me know if you have ideas on that as well. Thanks!
Since this standalone build configuration uses the runtime libraries that
are being built just now, we need to ensure that e.g. the TSan unit tests
depend on the tsan runtime library. Also fix TSAN_DEPS being overridden
to not include the tsan runtime (commit .....).
This change fixes a build race seen in the CI checks for
TsanRtlTest-x86_64-Test in https://github.com/llvm/llvm-project/pull/83088.
Reviewed By: vitalybuka
Pull Request: https://github.com/llvm/llvm-project/pull/83650
Blocking that signal causes inter-blocking for profilers that monitor
threads through that signal.
Update tests accordingly to use an uncaught signal.
This is a recommit of 6f3f659ce9 with the
tests fixed.
Fix#83844 and #83561
All these unit tests already include ${COMPILER_RT_GTEST_SOURCE} as an
input source file and the target llvm_gtest does not exist for
standalone builds. Currently the DEPS argument is ignored for standalone
builds so the missing target is not a problem, but as part of fixing a
build race for standalone builds I am planning to include those
dependencies in COMPILER_RT_TEST_STANDALONE_BUILD_LIBS configurations.
Reviewed By: vitalybuka
Pull Request: https://github.com/llvm/llvm-project/pull/83649
Most sanitizers don't support static linking. One primary reason is the
incompatibility with interceptors. `GetTlsSize` is another reason.
asan/memprof use `__interception::DoesNotSupportStaticLinking`
(`_DYNAMIC` reference) to reject -static at link time. Port this
detector to other sanitizers. dfsan actually supports -static for
certain cases. Don't touch dfsan.
`__tsan::Acquire()`, which is called by `__tsan_acquire()`, has a
performance optimization which attempts to avoid acquiring the atomic
variable's mutex if the variable has no associated memory model state.
However, if the atomic variable was recently written to by a
`compare_exchange_weak/strong` on another thread, the memory model state
may be created *after* the atomic variable is updated. This is a data
race, and can cause the thread calling `Acquire()` to not realize that
the atomic variable was previously written to by another thread.
Specifically, if you have code that writes to an atomic variable using
`compare_exchange_weak/strong`, and then in another thread you read the
value using a relaxed load, followed by an
`atomic_thread_fence(memory_order_acquire)`, followed by a call to
`__tsan_acquire()`, TSAN may not realize that the store happened before
the fence, and so it will complain about any other variables you access
from both threads if the thread-safety of those accesses depended on the
happens-before relationship between the store and the fence.
This change eliminates the unsafe optimization in `Acquire()`. Now,
`Acquire()` acquires the mutex before checking for the existence of the
memory model state.
This can be useful because dlsym() may call malloc on failure which could
result in other interposed functions being called that could eventually
make use of TLS. While the crash that I experienced originally has been
fixed differently (by not using global-dynamic TLS accesses in the mutex
deadlock detector, see https://github.com/llvm/llvm-project/pull/83890),
moving this interception earlier is still a good since it makes the code
a bit more robust against initialization order problems.
Reviewed By: MaskRay, vitalybuka
Pull Request: https://github.com/llvm/llvm-project/pull/83886
My previous patch, "Re-exec TSan with no ASLR if memory layout is incompatible on Linux (#78351)" (0784b1eefa) hoisted the 'personality' call, to share the code between Android and non-Android Linux. Unfortunately, this eager call to 'personality' may trigger sandbox violations on non-Android Linux.
This patch fixes the issue by only calling 'personality' on non-Android Linux if the memory mapping is incompatible. This may still cause a sandbox violation, but only if it was going to abort anyway due to an incompatible memory mapping.
(The behavior on Android Linux is unchanged by this patch or the previous patch.)
SANITIZER_WEAK_IMPORT is useful for any call that needs to be
conditionally linked in. This is currently used for the
tsan_dispatch_interceptors, but can be used for other calls introduced
in newer versions of MacOS. (such as `aligned_alloc` in this PR
https://github.com/llvm/llvm-project/pull/79198).
This PR moves the definition to a higher level so it can be used in
other sanitizers.
In 0784b1eefa some code for re-execution was moved to
`ReExecIfNeeded()`, but also extended with a few Linux-only features.
This leads to compile errors on FreeBSD, or other non-Linux platforms:
compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp:247:25: error: use of
undeclared identifier 'personality'
247 | int old_personality = personality(0xffffffff);
| ^
compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp:249:54: error: use of
undeclared identifier 'ADDR_NO_RANDOMIZE'
249 | (old_personality != -1) && ((old_personality & ADDR_NO_RANDOMIZE)
== 0);
| ^
compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp:281:46: error: use of
undeclared identifier 'ADDR_NO_RANDOMIZE'
281 | CHECK_NE(personality(old_personality | ADDR_NO_RANDOMIZE), -1);
| ^
Surround the affected part with a `#if SANITIZER_LINUX` block for now.
TSan's shadow mappings only support 30-bits of ASLR entropy on x86
Linux, and it is not practical to support the maximum of 32-bits (due to pointer compression and the overhead of shadow mappings). Instead, this patch changes TSan to re-exec without ASLR if it encounters an
incompatible memory layout, as suggested by Dmitry in
https://github.com/google/sanitizers/issues/1716.
If ASLR is already disabled but the memory layout is still incompatible,
it will abort.
This patch involves a bit of refactoring, because the old code is:
1. InitializePlatformEarly()
2. InitializeAllocator()
3. InitializePlatform(): CheckAndProtect()
but it may already segfault during InitializeAllocator() if the memory
layout is incompatible, before we get a chance to check in
CheckAndProtect().
This patch adds CheckAndProtect() during InitializePlatformEarly(), before the allocator is initialized. Naturally, it is necessary to ensure that CheckAndProtect() does *not* allow the heap regions to be occupied here, hence we generalize CheckAndProtect() to optionally check the heap
regions. We keep the original behavior of CheckAndProtect() in InitializePlatform() as a last line of defense.
We need to be careful not to prematurely abort if ASLR is disabled but TSan was going to re-exec for other reasons (e.g., unlimited stack size); we implement this by moving all the re-exec logic into ReExecIfNeeded().