Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#…
(#145959)
This reapplies cbf781f0bd, with fixes for
the shared-library build and the unconventional sanitizer-runtime build.
Original Description:
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
Most pacbti instructions are a nop when the target does not have pacbti,
and thus safe to execute, but bxaut is an undefined instruction. When we
don't have pacbti (e.g. if we're compiling compiler-rt with
-mbranch-protection=standard in order to be forward-compatible with
pacbti while still working on targets without it) then we need to use
separate aut and bx instructions.
In TSan, every `k` bytes of application memory (where `k = 8`) maps to a
single shadow/meta cell. This design leads to two distinct outcomes when
calculating the end of a shadow range using `MemToShadow(addr_end)`,
depending on the alignment of `addr_end`:
- **Exclusive End:** If `addr_end` is aligned (`addr_end % k == 0`),
`MemToShadow(addr_end)` points to the first shadow cell *past* the
intended range. This address is an exclusive boundary marker, not a cell
to be operated on.
- **Inclusive End:** If `addr_end` is not aligned (`addr_end % k != 0`),
`MemToShadow(addr_end)` points to the last shadow cell that *is* part of
the range (i.e., the same cell as `MemToShadow(addr_end - 1)`).
Different TSan functions have different expectations for whether the
shadow end should be inclusive or exclusive. However, these expectations
are not always explicitly enforced, which can lead to subtle bugs or
reliance on unstated invariants.
The core of this patch is to ensure that functions ONLY requiring an
**exclusive shadow end** behave correctly.
1. Enforcing Existing Invariants:
For functions like `MetaMap::MoveMemory` and `MapShadow`, the assumption
is that the end address is always `k`-aligned. While this holds true in
the current codebase (e.g., due to some external implicit conditions),
this invariant is not guaranteed by the function's internal context. We
add explicit assertions to make this requirement clear and to catch any
future changes that might violate this assumption.
2. Fixing Latent Bugs:
In other cases, unaligned end addresses are possible, representing a
latent bug. This was the case in `UnmapShadow`. The `size` of a memory
region being unmapped is not always a multiple of `k`. When this
happens, `UnmapShadow` would fail to clear the final (tail) portion of
the shadow memory.
This patch fixes `UnmapShadow` by rounding up the `size` to the next
multiple of `k` before clearing the shadow memory. This is safe because
the underlying OS `unmap` operation is page-granular, and the page size
is guaranteed to be a multiple of `k`.
Notably, this fix makes `UnmapShadow` consistent with its inverse
operation, `MemoryRangeImitateWriteOrResetRange`, which already performs
a similar size round-up.
In summary, this PR:
- **Adds assertions** to `MetaMap::MoveMemory` and `MapShadow` to
enforce their implicit requirement for k-aligned end addresses.
- **Fixes a latent bug** in `UnmapShadow` by rounding up the size to
ensure the entire shadow range is cleared. Two new test cases have been
added to cover this scenario.
- Removes a redundant assertion in `__tsan_java_move`.
- Fixes an incorrect shadow end calculation introduced in commit
4052de6. The previous logic, while fixing an overestimation issue, did
not properly account for `kShadowCell` alignment and could lead to
underestimation.
Once part of PR #144648, follow the reviewer's advice and split into
this separate PR.
`unmap` works at page granularity, but supports an arbitrary non-zero
size as an argument, which results in possible shadow undercleaning in
the existing TSan implementation when `size % kShadowCell != 0`.
This change introduces two test cases to verify the shadow cleaning
effect in `unmap`.
- java_heap_init2.cpp: Imitating java_heap_init cpp, verify the
incomplete cleaning of meta
- munmap_clear_shadow.c: verify the incomplete cleaning of shadow
This reverts commit 5eb5f0d876 i.e.,
relands 1b71ea411a.
Test case was failing on aarch64 because the long double type is
implemented differently on x86 vs aarch64. This reland restricts the
test to x86.
----
Original CL description:
A commonly used aid for debugging MSan reports is
`__msan_print_shadow()`, which requires manual app code annotations
(typically of the variable in the UUM report or nearby). This is in
contrast to ASan, which automatically prints out the shadow map when a
check fails.
This patch changes MSan to print the shadow that failed an outlined
check (checks are outlined per function after the
`-msan-instrumentation-with-call-threshold` is exceeded) if verbosity >=
1. Note that we do not print out the shadow map of "neighboring"
variables because this is technically infeasible; see "Caveat" below.
This patch can be easier to use than `__msan_print_shadow()` because
this does not require manual app code annotations. Additionally, due to
optimizations, `__msan_print_shadow()` calls can sometimes spuriously
affect whether a variable is initialized.
As a side effect, this patch also enables outlined checks for
arbitrary-sized shadows (vs. the current hardcoded handlers for
{1,2,4,8}-byte shadows).
Caveat: the shadow does not necessarily correspond to an individual user
variable, because MSan instrumentation may combine and/or truncate
multiple shadows prior to emitting a check that the mangled shadow is
zero (e.g., `convertShadowToScalar()`,
`handleSSEVectorConvertIntrinsic()`, `materializeInstructionChecks()`).
OTOH it is arguably a strength that this feature emit the shadow that
directly matters for the MSan check, but which cannot be obtained using
the MSan API.
A commonly used aid for debugging MSan reports is `__msan_print_shadow()`, which requires manual app code annotations (typically of the variable in the UUM report or nearby). This is in contrast to ASan, which automatically prints out the shadow map when a check fails.
This patch changes MSan to print the shadow that failed an outlined check (checks are outlined per function after the `-msan-instrumentation-with-call-threshold` is exceeded) if verbosity >= 1. Note that we do not print out the shadow map of "neighboring" variables because this is technically infeasible; see "Caveat" below.
This patch can be easier to use than `__msan_print_shadow()` because this does not require manual app code annotations. Additionally, due to optimizations, `__msan_print_shadow()` calls can sometimes spuriously affect whether a variable is initialized.
As a side effect, this patch also enables outlined checks for arbitrary-sized shadows (vs. the current hardcoded handlers for {1,2,4,8}-byte shadows).
Caveat: the shadow does not necessarily correspond to an individual user variable, because MSan instrumentation may combine and/or truncate multiple shadows prior to emitting a check that the mangled shadow is zero (e.g., `convertShadowToScalar()`, `handleSSEVectorConvertIntrinsic()`, `materializeInstructionChecks()`). OTOH it is arguably a strength that this feature emit the shadow that directly matters for the MSan check, but which cannot be obtained using the MSan API.
Correct the interval desc of ReleaseMemoryPagesToOS from [beg, end] to
[beg, end), as it actually does.
The previous incorrect description of [beg, end] might cause an
incorrect invoke as follows: `ReleaseMemoryPagesToOS(0, kPageSize - 1);`
In the current gcov implementation, all lines within a basic block are
attributed to the source file of the block's containing function. This
is inaccurate when a block contains lines from other files (e.g., via
#include "foo.inc").
Commit
[406e81b](406e81b79d)
attempted to address this by filtering lines based on debug info types,
but this approach has two limitations:
* **Over-filtering**: Some valid lines belonging to the function are
incorrectly excluded.
* **Under-counting**: Lines not belonging to the function are filtered
out and omitted from coverage statistics.
**GCC Reference Behavior**
GCC's gcov implementation handles this case correctly.This change aligns
the LLVM behavior with GCC.
**Proposed Solution**
1. **GCNO Generation**:
* **Current**: Each block stores a single GCOVLines record (filename +
lines).
* **New**: Dynamically create new GCOVLines records whenever consecutive
lines in a block originate from different source files. Group subsequent
lines from the same file under one record.
2. **GCNO Parsing**:
* **Current**: Lines are directly attributed to the function's source
file.
* **New**: Introduce a GCOVLocation type to track filename/line mappings
within blocks. Statistics will reflect the actual source file for each
line.
This commit changes the interval shadow/meta address check from
inclusive-inclusive ( $[\mathrm{start}, \mathrm{end}]$ ) to
inclusive-exclusive ( $[\mathrm{start}, \mathrm{end})$ ), to resolve the
ambiguity of the end point address. This also aligns the logic with the
check for `isAppMem` (i.e., inclusive-exclusive), ensuring consistent
behavior across all memory classifications.
1. The `isShadowMem` and `isMetaMem` checks previously used an
inclusive-inclusive interval, i.e., $[\mathrm{start}, \mathrm{end}]$,
which could lead to a boundary address being incorrectly classified as
both Shadow and Meta memory, e.g., 0x3000_0000_0000 in
`Mapping48AddressSpace`.
- What's more, even when Shadow doesn't border Meta, `ShadowMem::end`
cannot be considered a legal shadow address, as TSan protects the gap,
i.e., `ProtectRange(ShadowEnd(), MetaShadowBeg());`
2. `ShadowMem`/`MetaMem` addresses are derived from `AppMem` using an
affine-like transformation (`* factor + bias`). This transformation
includes two extra modifications: high- and low-order bits are masked
out, and for Shadow Memory, an optional XOR operation may be applied to
prevent conflicts with certain AppMem regions.
- Given that all AppMem regions are defined as inclusive-exclusive
intervals, $[\mathrm{start}, \mathrm{end})$, the resulting Shadow/Meta
regions should logically also be inclusive-exclusive.
Note: This change is purely for improving code consistency and should
have no functional impact. In practice, the exact endpoint addresses of
the Shadow/Meta regions are generally not reached.
Today, the `uncaught-exception.test` fuzzer test checks for the string
"libFuzzer: deadly signal" in the program output as the result of an
uncaught exception.
Although this is correct for `clang`, `msvc` reports a different error
message: "libFuzzer: uncaught C++ exception". Since `msvc` reuses the
`libFuzzer` infrastructure for ASan regression testing, it would help us
greatly if the test handled the `msvc` divergence more gracefully.
**This PR:** augments this test so check for a different string (namely
"libFuzzer: uncaught C++ exception") if the compiler target matches the
`msvc` naming scheme.
I understand if this is outside the scope of support for LLVM as well,
and I'm also open for different approaches here. Thanks!
Fix for #144495 by 6f4add3 broke sanitizer-aarch64-linux buildbot.
compiler-rt/lib/fuzzer/tests build failed because the linker was
looking gcc_s without '-l' appended.
The CMake script was adding the library name without the required
'-l' prefix. This patch adds the -l prefix changing gcc_s to -lgcc_s
and gcc to -lgcc.
https://lab.llvm.org/buildbot/#/builders/51/builds/18170
Mark as many of the reportXX functions that take pointers const. This
avoid the need to use const_cast when calling these functions on an
already const pointer.
Fix reportHeaderCorruption calls where an argument was passed into an
append call that didn't use them.
compiler-rt/lib/fuzzer/tests build was failing on armv7, with undefined
references to unwinder symbols, such as __aeabi_unwind_cpp_pr0.
This occurs because the test is built with `-nostdlib++` but `libunwind`
is not explicitly linked to the final test executable.
This patch resolves the issue by adding CMake logic to explicitly link
the required unwinder to the fuzzer tests, inspired by the same solution
used to fix Scudo build failures by https://reviews.llvm.org/D142888.
MSan should unpoison the parameters of extended signal handlers.
However, MSan unpoisoned the second parameter with the wrong size
`sizeof(__sanitizer_sigaction)`, inconsistent with its real type
`siginfo_t`.
This commit fixes this issue by correcting the size to
`sizeof(__sanitizer_siginfo)`.
Currently, `ENABLE_BAREMETAL_AARCH64_FMV` is added to builtin defines
for all baremetal targets though it is only needed for aarch64. This
patch fixes this by adding it only for aarch64 target.
Adds support to LSan for `free_sized` and `free_aligned_sized` from C23.
Other sanitizers will be handled with their own separate PRs.
For #144435
Signed-off-by: Justin King <jcking@google.com>
In https://github.com/llvm/llvm-project/pull/143183 was mistakenly added
a default value to `python_root_dir` in lit tests of compiler-rt.
This is unused by the lit tests of compiler-rt, as it was meant to be
used by `lldb`.
This patch removes this change.
AAPCS64 reserves any of X9-X15 for a compiler to choose to use for this
purpose, and says not to use X16 or X18 like GCC (and the previous
implementation) chose to use. The X18 register may need to get used by
the kernel in some circumstances, as specified by the platform ABI, so
it is generally an unwise choice. Simply choosing a different register
fixes the problem of this being broken on any platform that actually
follows the platform ABI (which is all of them except EABI, if I am
reading this linux kernel bug correctly
https://lkml2.uits.iu.edu/hypermail/linux/kernel/2001.2/01502.html). As
a side benefit, also generate slightly better code and avoids needing
the compiler-rt to be present. I did that by following the XCore
implementation instead of PPC (although in hindsight, following the
RISCV might have been slightly more readable). That X18 is wrong to use
for this purpose has been known for many years (e.g.
https://www.mail-archive.com/gcc@gcc.gnu.org/msg76934.html) and also
known that fixing this to use one of the correct registers is not an ABI
break, since this only appears inside of a translation unit. Some of the
other temporary registers (e.g. X9) are already reserved inside llvm for
internal use as a generic temporary register in the prologue before
saving registers, while X15 was already used in rare cases as a scratch
register in the prologue as well, so I felt that seemed the most logical
choice to choose here.
The runtimes build expects libclang_rt.tysan.a to be available, but the
check-tysan target does not actually depend on it when built using a
runtimes build with LLVM_ENABLE_RUNTIMES pointing at ./llvm. This means
we get test failures when running check-compiler-rt due to the missing
static archive.
This patch also makes check-tysan depend on tysan when we are using the
runtimes build.
This is causing premerge failures currently since we recently migrated
to the runtimes build.
## Purpose
The compiler-rt project `check-same-common-code.test` test case started
failing after #142861 was merged. This change addresses the failure.
## Overview
This patch replicates the changes made by #142861 to
llvm/include/llvm/ProfileData/InstrProfData.inc in the duplicated file
compiler-rt/include/profile/InstrProfData.inc. These files otherwise
match.
## Validation
Locally built `check-profile` target and verified
`check-same-common-code.test` now passes.
Some of the aeabi functions were missing PACBTI support. The lack of it
resulted in exceptions at runtime if the running environment had PAC
and/or BTI enabled.
This patch adds this support. This involves the addition of PACBTI
instructions, depending on whether each of these features is enabled,
plus the saving and restoring of the PAC code that resides in r12. Some
of the common code has been put in preprocessor macros to reduce
duplication, but not all, especially since some register saving and
restoring is very specific to each context.
When a CHECK() fails during TSan initialization, it may segfault (e.g., https://github.com/google/sanitizers/issues/837#issuecomment-2939267531). This is because a failed CHECK() will attempt to print a symbolized stack trace, which requires dl_iterate_phdr, but the interceptor may not yet be set up.
This patch fixes the issue by not symbolizing the stack trace if the dl_iterate_phdr interceptor is not ready.
When testing LLDB, we want to make sure to use the same Python as the
one we used to build it.
This patch used the CMake variable `Python3_ROOT_DIR` to set the
`PYTHONHOME` env variable in LLDB lit tests, in order to ensure of this.
Please see https://github.com/swiftlang/swift/pull/82063 for the
original issue.
Commit 75c3ff8c0b inadvertently removed
some files from the build related to the SME ABI routines.
This patch fixes the issue by reintroducing the files to the build in
CMake.
`sanitizer_procmaps_solaris.cpp` currently uses `#undef
_FILE_OFFSET_BITS` to hack around the fact that old versions of Solaris
`<procfs.h>` don't work in a largefile environment:
```
/usr/include/sys/procfs.h:42:2: error: #error "Cannot use procfs in the large file compilation environment"
42 | #error "Cannot use procfs in the large file compilation environment"
| ^~~~~
```
However, this is no longer an issue on either Solaris 11.4 or Illumos.
The workaround only existed for the benefit of Solaris 11.3. While that
had never been supported by LLVM, the sanitizer runtime libs were
imported into GCC's `libsanitzer`. With the removal of Solaris 11.3
support in GCC 15, this is no longer an issue and the workaround can be
removed.
Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.
Currently, if TSan needs to disable ASLR but is unable to do so, it
aborts with a cryptic message:
```
ThreadSanitizer: CHECK failed: tsan_platform_linux.cpp:290 "((personality(old_personality | ADDR_NO_RANDOMIZE))) != ((-1))"
```
and a segfault
(https://github.com/google/sanitizers/issues/837#issuecomment-2939267531).
This patch replaces the CHECK with more user-friendly diagnostics and
suggestions via printf, followed by Die().