The x86_64_lam_qemu buildbots started failing
(https://lab.llvm.org/buildbot/#/builders/139/builds/5462/steps/2/logs/stdio).
Based on the logs, it appears the HWASan check is correct but it did not
match the stderr/stdout output. This patch attempts to fix the issue by
flushing stderr/stdout as appropriate.
Compiling with `-fxray-shared` requires position-independent code
(introduced in #113548).
Some tests do not explicitly specify this, thus falling back to the
compiler default.
If, for example, Clang is compiled with
`-DCLANG_DEFAULT_PIE_ON_LINUX=OFF`, these checks fail.
This patch addresses this issue in two tests:
- Removing a check in `xray-shared.cpp` that only tests default PIC
behavior
- Adding `-fPIC` explicitly in `clang-xray-shared.cpp`
This fixes remaining issues in my previous PR #90959.
Changes:
- Removed dependency on LLVM header in `xray_interface.cpp`
- Fixed XRay patching for some targets due to missing changes in
architecture-specific patching functions
- Addressed some remaining compiler warnings that I missed in the
previous patch
- Formatting
I have tested these changes on `x86_64` (natively), as well as
`ppc64le`, `aarch64` and `arm32` (cross-compiled and emulated using
qemu).
**Original description:**
This PR introduces shared library (DSO) support for XRay based on a
revised version of the implementation outlined in [this
RFC](https://discourse.llvm.org/t/rfc-upstreaming-dso-instrumentation-support-for-xray/73000).
The feature enables the patching and handling of events from DSOs,
supporting both libraries linked at startup or explicitly loaded, e.g.
via `dlopen`.
This patch adds the following:
- The `-fxray-shared` flag to enable the feature (turned off by default)
- A small runtime library that is linked into every instrumented DSO,
providing position-independent trampolines and code to register with the
main XRay runtime
- Changes to the XRay runtime to support management and patching of
multiple objects
These changes are fully backward compatible, i.e. running without
instrumented DSOs will produce identical traces (in terms of recorded
function IDs) to the previous implementation.
Due to my limited ability to test on other architectures, this feature
is only implemented and tested with x86_64. Extending support to other
architectures is fairly straightforward, requiring only a
position-independent implementation of the architecture-specific
trampoline implementation (see
`compiler-rt/lib/xray/xray_trampoline_x86_64.S` for reference).
This patch does not include any functionality to resolve function IDs
from DSOs for the provided logging/tracing modes. These modes still work
and will record calls from DSOs, but symbol resolution for these
functions in not available. Getting this to work properly requires
recording information about the loaded DSOs and should IMO be discussed
in a separate RFC, as there are mulitple feasible approaches.
---------
Co-authored-by: Sebastian Kreutzer <sebastian.kreutzer@tu-darmstadt.de>
I am currently trying to test the LLVM runtimes (including compiler-rt)
against an installed LLVM tree rather than a build tree (since that is
no longer available). Currently, the runtimes build of compiler-rt assumes
that LLVM_BINARY_DIR is writable since it uses configure_file() to write
there during the CMake configure stage. Instead, generate this file inside
CMAKE_CURRENT_BINARY_DIR, which will match LLVM_BINARY_DIR when invoked
from llvm/runtimes/CMakeLists.txt.
I also needed to make a minor change to the hwasan tests: hwasan_symbolize
was previously found in the LLVM_BINARY_DIR, but since it is generated as
part of the compiler-rt build it is now inside the CMake build directory
instead. I fixed this by passing the output directory to lit as
config.compiler_rt_bindir and using llvm_config.add_tool_substitutions().
For testing that we no longer write to the LLVM install directory as
part of testing or configuration, I created a read-only bind mount and
configured the runtimes builds as follows:
```
$ sudo mount --bind --read-only ~/llvm-install /tmp/upstream-llvm-readonly
$ cmake -DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_C_COMPILER=/tmp/upstream-llvm-readonly/bin/clang \
-DCMAKE_CXX_COMPILER=/tmp/upstream-llvm-readonly/bin/clang++ \
-DLLVM_INCLUDE_TESTS=TRUE -DLLVM_ENABLE_ASSERTIONS=TRUE \
-DCOMPILER_RT_INCLUDE_TESTS=TRUE -DCOMPILER_RT_DEBUG=OFF \
-DLLVM_ENABLE_RUNTIMES=compiler-rt \
-DLLVM_BINARY_DIR=/tmp/upstream-llvm-readonly \
-G Ninja -S ~/upstream-llvm-project/runtimes \
-B ~/upstream-llvm-project/runtimes/cmake-build-debug-llvm-git
```
Reviewed By: ldionne
Pull Request: https://github.com/llvm/llvm-project/pull/86209
The use of CLANG_NO_DEFAULT_CONFIG in the tests was added because some
Linux distributions had a global default config file, that added flags
relating to hardening, which interfere with the sanitizer tests. By
setting CLANG_NO_DEFAULT_CONFIG, the global default config files that
are found are ignored, and the sanitizers get the expected default
compiler behaviour.
(This was https://github.com/llvm/llvm-project/issues/60394, which was
fixed in 8ab762557fb057af1a3015211ee116a975027e78.)
However, some toolchains may rely on default config files for mandatory
parts required for functioning at all - setting things like sysroots,
-rtlib, -unwindlib, -stdlib, -fuse-ld etc. In such a case we can't
forcibly disable any default config, because it will break the otherwise
working toolchain.
Add a test for whether the compiler works while passing
--no-default-config to it. If the option is accepted and the toolchain
still works while that is set, set CLANG_NO_DEFAULT_CONFIG while running
tests.
(This adds a little bit of inconsistency, as we're testing for the
command line option, while using the environment variable. However doing
compile testing with an environment variable isn't quite as easily
doable, and passing an extra command line flag to all compile commands
while testing, is a bit clumsy - therefore this inconsistency.)
Some of the profile test files fail if they have CRLF newlines;
add a .gitattributes file that forces them to be checked out
with LF newlines, regarless of the user Git configuration.
This parameter seems unintentional here; we're trying to grep
the input on stdin, from the earlier stage in the pipeline.
Since a recent update on Github Actions runners, the previous
form (grepping a file, while piping in data on stdin) would fail
running the test, with the test runner Python script throwing
an exception when evaluating it:
File "D:\a\llvm-mingw\llvm-mingw\llvm-project\llvm\utils\lit\lit\TestRunner.py", line 935, in _executeShCmd
out = procs[i].stdout.read()
^^^^^^^^^^^^^^^^^^^^^^
File "C:\hostedtoolcache\windows\Python\3.12.7\x64\Lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: a bytes-like object is required, not 'NoneType'
This PR registers the writeout and reset functions for `gcov` for all
modules in the PGO runtime, instead of registering them
using global constructors in each module. The change is made for AIX
only, but the same mechanism works on Linux on Power.
When registering such functions using global constructors in each module
without `-ffunction-sections`, the AIX linker cannot garbage collect
unused undefined symbols, because such symbols are grouped in the same
section as the `__sinit` symbol. Keeping such undefined symbols causes
link errors (see test case
https://github.com/llvm/llvm-project/pull/108570/files#diff-500a7e1ba871e1b6b61b523700d5e30987900002add306e1b5e4972cf6d5a4f1R1
for this scenario). This PR implements the initialization in the
runtime, hence avoiding introducing `__sinit` into each module.
The implementation adds a new global variable `__llvm_covinit_functions`
to each module. This new global variable contains the function pointers
to the `Writeout` and `Reset` functions. `__llvm_covinit_functions`'s
section is the named section `__llvm_covinit`. The linker will aggregate
all the `__llvm_covinit` sections from each module
to form one single named section in the final binary. The pair of
functions
```
const __llvm_gcov_init_func_struct *__llvm_profile_begin_covinit();
const __llvm_gcov_init_func_struct *__llvm_profile_end_covinit();
```
are implemented to return the start and end address of this named
section in the final binary, and they are used in function
```
__llvm_profile_gcov_initialize()
```
(which is a constructor function in the runtime) so the runtime knows
the addresses of all the `Writeout` and `Reset` functions from all the
modules.
One noticeable implementation detail relevant to AIX is that to preserve
the `__llvm_covinit` from the linker's garbage collection, a `.ref`
pseudo instruction is inserted into them, referring to the section that
contains the `__llvm_gcov_ctr` variables, which are used in the
instrumented code. The `__llvm_gcov_ctr` variables did not belong to
named sections before, but this PR added them to the
`__llvm_gcov_ctr_section` named section, so we can add a `.ref` pseudo
instruction that refers to them in the `__llvm_covinit` section.
Introduces a new type of suppression:
1. function-name-matches - allows users to disable `malloc`, `free`,
`pthread_mutex_lock` or similar. This could be helpful if a user thinks
these are real-time safe on their OS. Also allows disabling of any
function marked [[blocking]].
This is useful as a **more performant "early outs" compared to the
`call-stack-contains` suppression**. `call-stack-contains` is inherently
VERY costly, needing to inspect every frame of every stack for a
matching string. This new suppression has an early out before we unwind
the stack.
This patch adds support for forced loading of archive members, similar to the
behavior of the -all_load and -ObjC options in ld64. To enable this, the
StaticLibraryDefinitionGenerator class constructors are extended with a
VisitMember callback that is called on each member file in the archive at
generator construction time. This callback can be used to unconditionally add
the member file to a JITDylib at that point.
To test this the llvm-jitlink utility is extended with -all_load (all platforms)
and -ObjC (darwin only) options. Since we can't refer to symbols in the test
objects directly (these would always cause the member to be linked in, even
without the new flags) we instead test side-effects of force loading: execution
of constructors and registration of Objective-C metadata.
rdar://134446111
This PR adds a `__sanitizer_copy_contiguous_container_annotations`
function, which copies annotations from one memory area to another. New
area is annotated in the same way as the old region at the beginning
(within limitations of ASan).
Overlapping case: The function supports overlapping containers, however
no assumptions should be made outside of no false positives in new
buffer area. (It doesn't modify old container annotations where it's not
necessary, false negatives may happen in edge granules of the new
container area.) I don't expect this function to be used with
overlapping buffers, but it's designed to work with them and not result
in incorrect ASan errors (false positives).
If buffers have granularity-aligned distance between them (`old_beg %
granularity == new_beg % granularity`), copying algorithm works faster.
If the distance is not granularity-aligned, annotations are copied byte
after byte.
```cpp
void __sanitizer_copy_contiguous_container_annotations(
const void *old_storage_beg_p, const void *old_storage_end_p,
const void *new_storage_beg_p, const void *new_storage_end_p) {
```
This function aims to help with short string annotations and similar
container annotations. Right now we change trait types of
`std::basic_string` when compiling with ASan and this function purpose
is reverting that change as soon as possible.
87f3407856/libcxx/include/string (L738-L751)
The goal is to not change `__trivially_relocatable` when compiling with
ASan. If this function is accepted and upstreamed, the next step is
creating a function like `__memcpy_with_asan` moving memory with ASan.
And then using this function instead of `__builtin__memcpy` while moving
trivially relocatable objects.
11a6799740/libcxx/include/__memory/uninitialized_algorithms.h (L644-L646)
---
I'm thinking if there is a good way to address fact that in a container
the new buffer is usually bigger than the previous one. We may add two
more arguments to the functions to address it (the beginning and the end
of the whole buffer.
Another potential change is removing `new_storage_end_p` as it's
redundant, because we require the same size.
Potential future work is creating a function `__asan_unsafe_memmove`,
which will be basically memmove, but with turned off instrumentation
(therefore it will allow copy data from poisoned area).
---------
Co-authored-by: Vitaly Buka <vitalybuka@google.com>
When testing on Linux/sparc64 with a `runtimes` build, the
`UBSan-Standalone-sparc :: TestCases/Misc/Linux/sigaction.cpp` test
`FAIL`s:
```
runtimes/runtimes-bins/compiler-rt/test/ubsan/Standalone-sparc/TestCases/Misc/Linux/Output/sigaction.cpp.tmp: error while loading shared libraries: libclang_rt.ubsan_standalone.so: wrong ELF class: ELFCLASS64
```
It turns out SPARC needs the same `LD_LIBRARY_PATH` handling as x86.
This is what this patch does, at the same time noticing that the current
duplication between `lit.common.cfg.py` and
`asan/Unit/lit.site.cfg.py.in` isn't necessary.
Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.
When investigating PR #101634, it turned out that
`UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp`
isn't Linux-specific at all. In fact, none of the
`ubsan/TestCases/Misc/Linux` tests are.
Therefore this patch moves them to `Misc/Posix` instead.
Tested on `sparc64-unknown-linux-gnu`, `sparcv9-sun-solaris2.11`,
`x86_64-pc-linux-gnu`, and `amd64-pc-solaris2.11`.
After https://github.com/llvm/llvm-project/pull/80007 Fuchsia builds are
now always building cxx_shared for arm64 and x64 Linux. Ultimately, this
is because the LIBCXX_ENABLE_SHARED is not used in compiler-rt to select
the correct libc++ target, and because cxx_shared is now always defined,
it is selected as a dependency when building runtimes tests.
---------
Co-authored-by: Petr Hosek <phosek@google.com>
I saw cases that this finised before finding `BINGO`, possibly
insufficient number of iteration. In my case, 11,067,133 satisfied.
So, increase the number for now. This change may increase the duration
of this in failing (`BINGO` not found) case.
There are hard to debug leaks which look like
false.
In general, repeating leak checking should not
affect set of leaks significantly, especial
`at_exit` leak checking.
But if we see significant discrepancy, it may give
us a clue for investigation.
This PR introduces shared library (DSO) support for XRay based on a
revised version of the implementation outlined in [this
RFC](https://discourse.llvm.org/t/rfc-upstreaming-dso-instrumentation-support-for-xray/73000).
The feature enables the patching and handling of events from DSOs,
supporting both libraries linked at startup or explicitly loaded, e.g.
via `dlopen`.
This patch adds the following:
- The `-fxray-shared` flag to enable the feature (turned off by default)
- A small runtime library that is linked into every instrumented DSO,
providing position-independent trampolines and code to register with the
main XRay runtime
- Changes to the XRay runtime to support management and patching of
multiple objects
These changes are fully backward compatible, i.e. running without
instrumented DSOs will produce identical traces (in terms of recorded
function IDs) to the previous implementation.
Due to my limited ability to test on other architectures, this feature
is only implemented and tested with x86_64. Extending support to other
architectures is fairly straightforward, requiring only a
position-independent implementation of the architecture-specific
trampoline implementation (see
`compiler-rt/lib/xray/xray_trampoline_x86_64.S` for reference).
This patch does not include any functionality to resolve function IDs
from DSOs for the provided logging/tracing modes. These modes still work
and will record calls from DSOs, but symbol resolution for these
functions in not available. Getting this to work properly requires
recording information about the loaded DSOs and should IMO be discussed
in a separate RFC, as there are mulitple feasible approaches.
@petrhosek @jplehr
Issue: #105181
extendhfxf2 calls extendhfXfy to convert _Float16 to double, then type
casts this converted value to long double.
__uint128_t may not be available on all architectures. Thus I din't use
extendhfXfy to widen precision to 128 bits.
When ASan testing is enabled on SPARC as per PR #107405, the
```
AddressSanitizer-sparc-linux :: TestCases/Linux/ptrace.cpp
```
`FAIL`s on Linux/sparc64. This happens because the `ptrace` interceptor
has no support for that target at all.
This patch adds the missing parts and accounts for a couple of issues
specific to this target:
- In some cases, SPARC just needs to be included in the list of
supported targets.
- Besides, the types used by the `PTRACE_GETREGS` and `PTRACE_GETFPREGS`
requests need to be filled in.
- `ptrace` has a weird quirk on this target: for a couple of requests,
the meaning of the `data` and `addr` args is reversed. All of the
`Linux/ptrace.cpp` test and the interceptor, pre-syscall and
post-syscall hooks need to account for that swap in their checks.
Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.