Once part of PR #144648, follow the reviewer's advice and split into
this separate PR.
`unmap` works at page granularity, but supports an arbitrary non-zero
size as an argument, which results in possible shadow undercleaning in
the existing TSan implementation when `size % kShadowCell != 0`.
This change introduces two test cases to verify the shadow cleaning
effect in `unmap`.
- java_heap_init2.cpp: Imitating java_heap_init cpp, verify the
incomplete cleaning of meta
- munmap_clear_shadow.c: verify the incomplete cleaning of shadow
This reverts commit 5eb5f0d876 i.e.,
relands 1b71ea411a.
Test case was failing on aarch64 because the long double type is
implemented differently on x86 vs aarch64. This reland restricts the
test to x86.
----
Original CL description:
A commonly used aid for debugging MSan reports is
`__msan_print_shadow()`, which requires manual app code annotations
(typically of the variable in the UUM report or nearby). This is in
contrast to ASan, which automatically prints out the shadow map when a
check fails.
This patch changes MSan to print the shadow that failed an outlined
check (checks are outlined per function after the
`-msan-instrumentation-with-call-threshold` is exceeded) if verbosity >=
1. Note that we do not print out the shadow map of "neighboring"
variables because this is technically infeasible; see "Caveat" below.
This patch can be easier to use than `__msan_print_shadow()` because
this does not require manual app code annotations. Additionally, due to
optimizations, `__msan_print_shadow()` calls can sometimes spuriously
affect whether a variable is initialized.
As a side effect, this patch also enables outlined checks for
arbitrary-sized shadows (vs. the current hardcoded handlers for
{1,2,4,8}-byte shadows).
Caveat: the shadow does not necessarily correspond to an individual user
variable, because MSan instrumentation may combine and/or truncate
multiple shadows prior to emitting a check that the mangled shadow is
zero (e.g., `convertShadowToScalar()`,
`handleSSEVectorConvertIntrinsic()`, `materializeInstructionChecks()`).
OTOH it is arguably a strength that this feature emit the shadow that
directly matters for the MSan check, but which cannot be obtained using
the MSan API.
A commonly used aid for debugging MSan reports is `__msan_print_shadow()`, which requires manual app code annotations (typically of the variable in the UUM report or nearby). This is in contrast to ASan, which automatically prints out the shadow map when a check fails.
This patch changes MSan to print the shadow that failed an outlined check (checks are outlined per function after the `-msan-instrumentation-with-call-threshold` is exceeded) if verbosity >= 1. Note that we do not print out the shadow map of "neighboring" variables because this is technically infeasible; see "Caveat" below.
This patch can be easier to use than `__msan_print_shadow()` because this does not require manual app code annotations. Additionally, due to optimizations, `__msan_print_shadow()` calls can sometimes spuriously affect whether a variable is initialized.
As a side effect, this patch also enables outlined checks for arbitrary-sized shadows (vs. the current hardcoded handlers for {1,2,4,8}-byte shadows).
Caveat: the shadow does not necessarily correspond to an individual user variable, because MSan instrumentation may combine and/or truncate multiple shadows prior to emitting a check that the mangled shadow is zero (e.g., `convertShadowToScalar()`, `handleSSEVectorConvertIntrinsic()`, `materializeInstructionChecks()`). OTOH it is arguably a strength that this feature emit the shadow that directly matters for the MSan check, but which cannot be obtained using the MSan API.
Correct the interval desc of ReleaseMemoryPagesToOS from [beg, end] to
[beg, end), as it actually does.
The previous incorrect description of [beg, end] might cause an
incorrect invoke as follows: `ReleaseMemoryPagesToOS(0, kPageSize - 1);`
In the current gcov implementation, all lines within a basic block are
attributed to the source file of the block's containing function. This
is inaccurate when a block contains lines from other files (e.g., via
#include "foo.inc").
Commit
[406e81b](406e81b79d)
attempted to address this by filtering lines based on debug info types,
but this approach has two limitations:
* **Over-filtering**: Some valid lines belonging to the function are
incorrectly excluded.
* **Under-counting**: Lines not belonging to the function are filtered
out and omitted from coverage statistics.
**GCC Reference Behavior**
GCC's gcov implementation handles this case correctly.This change aligns
the LLVM behavior with GCC.
**Proposed Solution**
1. **GCNO Generation**:
* **Current**: Each block stores a single GCOVLines record (filename +
lines).
* **New**: Dynamically create new GCOVLines records whenever consecutive
lines in a block originate from different source files. Group subsequent
lines from the same file under one record.
2. **GCNO Parsing**:
* **Current**: Lines are directly attributed to the function's source
file.
* **New**: Introduce a GCOVLocation type to track filename/line mappings
within blocks. Statistics will reflect the actual source file for each
line.
This commit changes the interval shadow/meta address check from
inclusive-inclusive ( $[\mathrm{start}, \mathrm{end}]$ ) to
inclusive-exclusive ( $[\mathrm{start}, \mathrm{end})$ ), to resolve the
ambiguity of the end point address. This also aligns the logic with the
check for `isAppMem` (i.e., inclusive-exclusive), ensuring consistent
behavior across all memory classifications.
1. The `isShadowMem` and `isMetaMem` checks previously used an
inclusive-inclusive interval, i.e., $[\mathrm{start}, \mathrm{end}]$,
which could lead to a boundary address being incorrectly classified as
both Shadow and Meta memory, e.g., 0x3000_0000_0000 in
`Mapping48AddressSpace`.
- What's more, even when Shadow doesn't border Meta, `ShadowMem::end`
cannot be considered a legal shadow address, as TSan protects the gap,
i.e., `ProtectRange(ShadowEnd(), MetaShadowBeg());`
2. `ShadowMem`/`MetaMem` addresses are derived from `AppMem` using an
affine-like transformation (`* factor + bias`). This transformation
includes two extra modifications: high- and low-order bits are masked
out, and for Shadow Memory, an optional XOR operation may be applied to
prevent conflicts with certain AppMem regions.
- Given that all AppMem regions are defined as inclusive-exclusive
intervals, $[\mathrm{start}, \mathrm{end})$, the resulting Shadow/Meta
regions should logically also be inclusive-exclusive.
Note: This change is purely for improving code consistency and should
have no functional impact. In practice, the exact endpoint addresses of
the Shadow/Meta regions are generally not reached.
Today, the `uncaught-exception.test` fuzzer test checks for the string
"libFuzzer: deadly signal" in the program output as the result of an
uncaught exception.
Although this is correct for `clang`, `msvc` reports a different error
message: "libFuzzer: uncaught C++ exception". Since `msvc` reuses the
`libFuzzer` infrastructure for ASan regression testing, it would help us
greatly if the test handled the `msvc` divergence more gracefully.
**This PR:** augments this test so check for a different string (namely
"libFuzzer: uncaught C++ exception") if the compiler target matches the
`msvc` naming scheme.
I understand if this is outside the scope of support for LLVM as well,
and I'm also open for different approaches here. Thanks!
Fix for #144495 by 6f4add3 broke sanitizer-aarch64-linux buildbot.
compiler-rt/lib/fuzzer/tests build failed because the linker was
looking gcc_s without '-l' appended.
The CMake script was adding the library name without the required
'-l' prefix. This patch adds the -l prefix changing gcc_s to -lgcc_s
and gcc to -lgcc.
https://lab.llvm.org/buildbot/#/builders/51/builds/18170
Mark as many of the reportXX functions that take pointers const. This
avoid the need to use const_cast when calling these functions on an
already const pointer.
Fix reportHeaderCorruption calls where an argument was passed into an
append call that didn't use them.
compiler-rt/lib/fuzzer/tests build was failing on armv7, with undefined
references to unwinder symbols, such as __aeabi_unwind_cpp_pr0.
This occurs because the test is built with `-nostdlib++` but `libunwind`
is not explicitly linked to the final test executable.
This patch resolves the issue by adding CMake logic to explicitly link
the required unwinder to the fuzzer tests, inspired by the same solution
used to fix Scudo build failures by https://reviews.llvm.org/D142888.
MSan should unpoison the parameters of extended signal handlers.
However, MSan unpoisoned the second parameter with the wrong size
`sizeof(__sanitizer_sigaction)`, inconsistent with its real type
`siginfo_t`.
This commit fixes this issue by correcting the size to
`sizeof(__sanitizer_siginfo)`.
Currently, `ENABLE_BAREMETAL_AARCH64_FMV` is added to builtin defines
for all baremetal targets though it is only needed for aarch64. This
patch fixes this by adding it only for aarch64 target.
Adds support to LSan for `free_sized` and `free_aligned_sized` from C23.
Other sanitizers will be handled with their own separate PRs.
For #144435
Signed-off-by: Justin King <jcking@google.com>
In https://github.com/llvm/llvm-project/pull/143183 was mistakenly added
a default value to `python_root_dir` in lit tests of compiler-rt.
This is unused by the lit tests of compiler-rt, as it was meant to be
used by `lldb`.
This patch removes this change.
AAPCS64 reserves any of X9-X15 for a compiler to choose to use for this
purpose, and says not to use X16 or X18 like GCC (and the previous
implementation) chose to use. The X18 register may need to get used by
the kernel in some circumstances, as specified by the platform ABI, so
it is generally an unwise choice. Simply choosing a different register
fixes the problem of this being broken on any platform that actually
follows the platform ABI (which is all of them except EABI, if I am
reading this linux kernel bug correctly
https://lkml2.uits.iu.edu/hypermail/linux/kernel/2001.2/01502.html). As
a side benefit, also generate slightly better code and avoids needing
the compiler-rt to be present. I did that by following the XCore
implementation instead of PPC (although in hindsight, following the
RISCV might have been slightly more readable). That X18 is wrong to use
for this purpose has been known for many years (e.g.
https://www.mail-archive.com/gcc@gcc.gnu.org/msg76934.html) and also
known that fixing this to use one of the correct registers is not an ABI
break, since this only appears inside of a translation unit. Some of the
other temporary registers (e.g. X9) are already reserved inside llvm for
internal use as a generic temporary register in the prologue before
saving registers, while X15 was already used in rare cases as a scratch
register in the prologue as well, so I felt that seemed the most logical
choice to choose here.
The runtimes build expects libclang_rt.tysan.a to be available, but the
check-tysan target does not actually depend on it when built using a
runtimes build with LLVM_ENABLE_RUNTIMES pointing at ./llvm. This means
we get test failures when running check-compiler-rt due to the missing
static archive.
This patch also makes check-tysan depend on tysan when we are using the
runtimes build.
This is causing premerge failures currently since we recently migrated
to the runtimes build.
## Purpose
The compiler-rt project `check-same-common-code.test` test case started
failing after #142861 was merged. This change addresses the failure.
## Overview
This patch replicates the changes made by #142861 to
llvm/include/llvm/ProfileData/InstrProfData.inc in the duplicated file
compiler-rt/include/profile/InstrProfData.inc. These files otherwise
match.
## Validation
Locally built `check-profile` target and verified
`check-same-common-code.test` now passes.
Some of the aeabi functions were missing PACBTI support. The lack of it
resulted in exceptions at runtime if the running environment had PAC
and/or BTI enabled.
This patch adds this support. This involves the addition of PACBTI
instructions, depending on whether each of these features is enabled,
plus the saving and restoring of the PAC code that resides in r12. Some
of the common code has been put in preprocessor macros to reduce
duplication, but not all, especially since some register saving and
restoring is very specific to each context.
When a CHECK() fails during TSan initialization, it may segfault (e.g., https://github.com/google/sanitizers/issues/837#issuecomment-2939267531). This is because a failed CHECK() will attempt to print a symbolized stack trace, which requires dl_iterate_phdr, but the interceptor may not yet be set up.
This patch fixes the issue by not symbolizing the stack trace if the dl_iterate_phdr interceptor is not ready.
When testing LLDB, we want to make sure to use the same Python as the
one we used to build it.
This patch used the CMake variable `Python3_ROOT_DIR` to set the
`PYTHONHOME` env variable in LLDB lit tests, in order to ensure of this.
Please see https://github.com/swiftlang/swift/pull/82063 for the
original issue.
Commit 75c3ff8c0b inadvertently removed
some files from the build related to the SME ABI routines.
This patch fixes the issue by reintroducing the files to the build in
CMake.
`sanitizer_procmaps_solaris.cpp` currently uses `#undef
_FILE_OFFSET_BITS` to hack around the fact that old versions of Solaris
`<procfs.h>` don't work in a largefile environment:
```
/usr/include/sys/procfs.h:42:2: error: #error "Cannot use procfs in the large file compilation environment"
42 | #error "Cannot use procfs in the large file compilation environment"
| ^~~~~
```
However, this is no longer an issue on either Solaris 11.4 or Illumos.
The workaround only existed for the benefit of Solaris 11.3. While that
had never been supported by LLVM, the sanitizer runtime libs were
imported into GCC's `libsanitzer`. With the removal of Solaris 11.3
support in GCC 15, this is no longer an issue and the workaround can be
removed.
Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.
Currently, if TSan needs to disable ASLR but is unable to do so, it
aborts with a cryptic message:
```
ThreadSanitizer: CHECK failed: tsan_platform_linux.cpp:290 "((personality(old_personality | ADDR_NO_RANDOMIZE))) != ((-1))"
```
and a segfault
(https://github.com/google/sanitizers/issues/837#issuecomment-2939267531).
This patch replaces the CHECK with more user-friendly diagnostics and
suggestions via printf, followed by Die().
The existing implementations, written in assembly, make use of unaligned
accesses for performance reasons. They are not compatible with strict
aligned configurations, i.e. with `-mno-unaligned-access`.
If the functions are used in this scenario, an exception is raised due
to unaligned memory accesses.
This patch reintroduces vanilla implementations for these functions to
be used in strictly aligned configurations. The actual code is largely
based on the code from https://github.com/llvm/llvm-project/pull/77496
This changes the save/restore procedures to save/restore registers one
by one - to match the stack alignment for the ILP32E/LP64E ABIs, rather
than the larger batches of the conventional ABIs. The implementations of
the save routines are not tail-shared, to reduce the number of
instructions. I think this also helps code size but I need to check this
again.
I would expect (but haven't measured) that the majority of functions
compiled for the ILP32E/LP64E ABIs will in fact use both callee-saved
registers, and therefore there are still savings to be had, but I think
those can come later, with more data (especially if those changes are
just to the instruction sequences we use to save the registers, rather
than the number and alignment of how this is done).
This is a potential break for all of the ILP32E/LP64E ABI - we may
instead have to teach the compiler to emit the CFI information correctly
for the grouping we already have implemented (because that grouping
matches GCC). It depends on how intentional we think the grouping is in
the original ILP32E/LP64E save/restore implementation was, and whether
we think we can fix that now.
Created for Wine's memset by clang or mingw-gcc,
the latter places it quite at the start of the function:
```
0x00006ffffb67e210 <memset+0>: 0f b6 d2 movzbl %dl,%edx
0x00006ffffb67e213 <memset+3>: 48 b8 01 01 01 01 01 01 01 01 movabs $0x101010101010101,%rax
```
`3200 uint64_t v = 0x101010101010101ull * (unsigned char)c;`
290fd532ee/dlls/msvcrt/string.c (L3200)