Default cutoff is not usefull here. Decision to
enable or not sanitizer causes more significant
performance impact, than a typical optimizations
which rely on `profile-summary-cutoff-hot`.
Currently, the builtins used for implementing `va_list` handling
unconditionally take their arguments as unqualified `ptr`s i.e. pointers
to AS 0. This does not work for targets where the default AS is not 0 or
AS 0 is not a viable AS (for example, a target might choose 0 to
represent the constant address space). This patch changes the builtins'
signature to take generic `anyptr` args, which corrects this issue. It
is noisy due to the number of tests affected. A test for an upstream
target which does not use 0 as its default AS (SPIRV for HIP device
compilations) is added as well.
atomicrmw xchg also accepts pointer and floating-point values. To handle
those, insert necessary casts to and from integer. This is what we do
for cmpxchg as well.
Fixes https://github.com/llvm/llvm-project/issues/85226.
Previously, runtime calls introduced by ASan instrumentation into EH
pads were missing the funclet token expected by WinEHPrepare.
WinEHPrepare would then identify the containing BB as invalid and
discard it, causing invalid code generation that most likely crashes.
Also fixed localescape test, switching its EH personality to match code
without funclets.
This PR is based on the Phabricator patch
https://reviews.llvm.org/D143108
Fixes https://github.com/llvm/llvm-project/issues/64990
For COFF, available_externally global will be instrumented because of
the lack of filtering, and will trigger the Verifier pass assertion and
crash the compilation. This patch will filter out the
available_externally global for COFF.
For non-COFF, `!G->hasExactDefinition()` in line 1954 will filter out
the available_externally globals.
There is a related bug reported in
https://bugs.llvm.org/show_bug.cgi?id=47950 /
https://github.com/llvm/llvm-project/issues/47294. I tried the
reproducer posted on the page and this will fix the problem.
Reproducer:
```
#include <locale>
void grouping_impl() {
std::use_facet<std::numpunct<char>>(std::locale());
}
// clang -fsanitize=address -D_DLL -std=c++14 -c format.cc
```
New change on top of [reviewed
patch](https://github.com/llvm/llvm-project/pull/81691) are [in commits
after this
one](d0757f46b3).
Previous commits are restored from the remote branch with timestamps.
1. Fix build breakage for non-ELF platforms, by defining the missing
functions {`__llvm_profile_begin_vtables`, `__llvm_profile_end_vtables`,
`__llvm_profile_begin_vtabnames `, `__llvm_profile_end_vtabnames`}
everywhere.
* Tested on mac laptop (for darwins) and Windows. Specifically,
functions in `InstrProfilingPlatformWindows.c` returns `NULL` to make it
more explicit that type prof isn't supported; see comments for the
reason.
* For the rest (AIX, other), mostly follow existing examples (like this
[one](f95b2f1acf))
2. Rename `__llvm_prf_vtabnames` -> `__llvm_prf_vns` for shorter section
name, and make returned pointers
[const](a825d2a4ec (diff-4de780ce726d76b7abc9d3353aef95013e7b21e7bda01be8940cc6574fb0b5ffR120-R121))
**Original Description**
* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
- Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/write change and llvm-profdata change.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600
---------
Co-authored-by: modiking <modiking213@gmail.com>
* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
- Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
[The originally reviewed
patch](https://github.com/llvm/llvm-project/pull/66825) will have
profile reader/write change and llvm-profdata change.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in https://github.com/llvm/llvm-project/pull/80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600
---------
Co-authored-by: modiking <modiking213@gmail.com>
llvm.dbg.assign intrinsics have 2 {value, expression} pairs; fix hwasan to
update the second expression.
Fixes#76545. This is #78606 rebased and with the addition of DPValue handling.
Note the addition of --try-experimental-debuginfo-iterators in the tests and
some shuffling of code in MemoryTaggingSupport.cpp.
The performance of cold functions shouldn't matter too much, so if we
care about binary sizes, add an option to mark cold functions as
optsize/minsize for binary size, or optnone for compile times [1]. Clang
patch will be in a future patch.
This is intended to replace `shouldOptimizeForSize(Function&, ...)`.
We've seen multiple cases where calls to this expensive function, if not
careful, can blow up compile times. I will clean up users of that
function in a followup patch.
Initial version: https://reviews.llvm.org/D149800
[1]
https://discourse.llvm.org/t/rfc-new-feature-proposal-de-optimizing-cold-functions-using-pgo-info/56388
In `StackInfoBuilder::visit(Instruction &Inst)`, operations are
performed on memory-related instructions, including debug intrinsics
that refer to "interesting" allocas. There is a block that also visits
DPValues attached to the instruction, but this block is near the end of
the function; this has two problems:
1. The DPValues attached to an instruction precede that instruction, so
they should always be processed before the instruction itself.
2. More importantly, some of the paths for visiting other instructions
contain early returns, which will result in the DPValues not being
visited at all.
This patch simply moves the DPValue-visiting block to the top of the
function, which should resolve both of these problems.
Modify #77393 to clear shadow memory using `llvm.memset.*` when the size
is large, similar to `shouldUseBZeroPlusStoresToInitialize` in clang for
`-ftrivial-auto-var-init=`. The intrinsic, if lowered to libcall, will
use the msan interceptor.
The instruction selector lowers a `StoreInst` to multiple stores, not
utilizing `memset`. When the size is large (e.g.
`store { [100 x i32] } zeroinitializer, ptr %12, align 1`), the
generated code will be long (and `CodeGenPrepare::optimizeInst` will
even crash for a huge size).
```
// Test stack size
template <class T>
void DoNotOptimize(const T& var) { // deprecated by https://github.com/google/benchmark/pull/1493
asm volatile("" : "+m"(const_cast<T&>(var)));
}
int main() {
using LargeArray = std::array<int, 1000000>;
auto large_stack = []() { DoNotOptimize(LargeArray()); };
/////// CodeGenPrepare::optimizeInst triggers an assertion failure when creating an integer type with a bit width>2**23
large_stack();
}
```
Assertion failure `(i >= FTy->getNumParams() || FTy->getParamType(i) ==
Args[i]->getType()) && "Calling a function with a bad signature!"'. The
'llvm.memcpy' intercepted by ASan instrumentation pass is implemented by
it's own __asan_memcpy implementation. The second argument of
llvm.memcpy accepts ptr to addrspace(4), __asan_memcpy also has to
follow ptr to addrspace(4) convention.
---------
Co-authored-by: Amit Pandey <amit.pandey@amd.com>
This patch extends HWASAN to support maintenance of debug-info that
isn't stored as intrinsics, but is instead in a DPValue object. This is
straight-forwards: we collect any such objects in StackInfoBuilder, and
apply the same operations to them as we would to dbg.value and similar
intrinsics.
I've also replaced some calls to getNextNode with debug-info skipping
next calls, and use iterators for instruction insertion rather than
instruction pointers. This avoids any difference in output between
intrinsic / non-intrinsic debug-info, but also means that any debug-info
comes before code inserted by HWAsan, rather than afterwards. See the
test modifications, where the variable assignment (presented as a
dbg.value) jumps up over all the code inserted by HWAsan. Seeing how the
code inserted by HWAsan is always (AFAIUI) given the source-location of
the instruction being instrumented, I don't believe this will have any
effect on which lines variable assignments become visible on; it may
extend the number of instructions covered by the assignments though.
- With PGO, indirect call edges are constructed using value profiles, and the profile address is mapped to a function's PGO name. The PGO name is computed using a functions linkage before LTO internalization or global promotion.
- With ThinLTO, local functions [could be
promoted](2663d2cb9c/llvm/lib/Transforms/Utils/FunctionImportUtils.cpp (L288)) to have external linkage; and with
[full](2663d2cb9c/llvm/lib/LTO/LTO.cpp (L1328))
or
[thin](2663d2cb9c/llvm/lib/LTO/LTO.cpp (L448))
LTO, global functions could be internalized. Edge construction should use a function's PGO name before its linkage is updated.
KMSAN defaults to `msan-handle-asm-conservative`, which inserts
`__msan_instrument_asm_store` calls to unpoison indirect outputs in
inline assembly (e.g. `=m` constraints in source).
```c
unsigned f() {
unsigned v;
// __msan_instrument_asm_store unpoisons v before invoking the asm.
asm("movl $1,%0" : "=m"(v));
return v;
}
```
Extend the mechanism to userspace, but require explicit
`-mllvm -msan-handle-asm-conservative` for experiments for now.
As
https://docs.kernel.org/dev-tools/kmsan.html#inline-assembly-instrumentation
says, this approach may mask certain errors (an indirect output may not
actually be initialized), but it also helps to avoid a lot of false
positives.
Link: https://github.com/google/sanitizers/issues/192
StackSafetyAnalysis determines whether stack-allocated variables are
guaranteed to be safe from memory access bugs and enables the removal of
certain unneeded instrumentations.
(hwasan enables StackSafetyAnalysis in https://reviews.llvm.org/D108381)
In a release build of clang, text sections are 9% smaller.
Test updates:
* asan-stack-safety.ll: test the -asan-use-stack-safety=1 default
* lifetime-uar-uas.ll: switch to an indexed store to prevent
StackSafetyAnalysis from optimizing out instrumentation for %c
* alloca_vla_interact.cpp: add a load to prevent StackSafetyAnalysis
from optimizing out `__asan_alloca_poison` for the VLA `array`
* scariness_score_test.cpp: add -asan-use-stack-safety=0 to make a load
of a `__asan_poison_memory_region`-poisoned local variable fail as
intended.
* other .ll tests: add -asan-use-stack-safety=0
Reviewed By: kstoimenov
Pull Request: https://github.com/llvm/llvm-project/pull/77210