Unconditionally change std::string's alignment to 8.
This change saves memory by providing the allocator more freedom to
allocate the most
efficient size class by dropping the alignment requirements for
std::string's
pointer from 16 to 8. This changes the output of std::string::max_size,
which makes it ABI breaking.
That said, the discussion concluded that we don't care about this ABI
break. and would like this change enabled universally.
The ABI break isn't one of layout or "class size", but rather the value
of "max_size()" changes, which in turn changes whether `std::bad_alloc`
or `std::length_error` is thrown for large allocations.
This change is the child of PR #68807, which enabled the change behind
an ABI flag.
Previously, there was a ternary conditional with a less-than comparison
appearing inside a template argument, which was really confusing because
of the <...> of the function template. This patch rewrites the same
statement on two lines for clarity.
Originally merged here: https://github.com/llvm/llvm-project/pull/75882
Reverted here: https://github.com/llvm/llvm-project/pull/78627
Reverted due to failing buildbots. The problem was not caused by the
annotations code, but by code in the `UniqueFunctionBase` class and in
the `JSON.h` file. That code caused the program to write to memory that
was already being used by string objects, which resulted in an ASan
error.
Fixes are implemented in:
- https://github.com/llvm/llvm-project/pull/79065
- https://github.com/llvm/llvm-project/pull/79066
Problematic code from `UniqueFunctionBase` for example:
```cpp
#ifndef NDEBUG
// In debug builds, we also scribble across the rest of the storage.
memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize);
#endif
```
---
Original description:
This commit turns on ASan annotations in `std::basic_string` for short
stings (SSO case).
Originally suggested here: https://reviews.llvm.org/D147680
String annotations added here:
https://github.com/llvm/llvm-project/pull/72677
Requires to pass CI without fails:
- https://github.com/llvm/llvm-project/pull/75845
- https://github.com/llvm/llvm-project/pull/75858
Annotating `std::basic_string` with default allocator is implemented in
https://github.com/llvm/llvm-project/pull/72677 but annotations for
short strings (SSO - Short String Optimization) are turned off there.
This commit turns them on. This also removes
`_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to
support turning on and off short string annotations.
Support in ASan API exists since
dd1b7b797a.
You can turn off annotations for a specific allocator based on changes
from
2fa1bec7a2.
This PR is a part of a series of patches extending AddressSanitizer C++
container overflow detection capabilities by adding annotations, similar
to those existing in `std::vector` and `std::deque` collections. These
enhancements empower ASan to effectively detect instances where the
instrumented program attempts to access memory within a collection's
internal allocation that remains unused. This includes cases where
access occurs before or after the stored elements in `std::deque`, or
between the `std::basic_string`'s size (including the null terminator)
and capacity bounds.
The introduction of these annotations was spurred by a real-world
software bug discovered by Trail of Bits, involving an out-of-bounds
memory access during the comparison of two strings using the
`std::equals` function. This function was taking iterators
(`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison,
using a custom comparison function. When the `iter1` object exceeded the
length of `iter2`, an out-of-bounds read could occur on the `iter2`
object. Container sanitization, upon enabling these annotations, would
effectively identify and flag this potential vulnerability.
If you have any questions, please email:
advenam.tacet@trailofbits.comdisconnect3d@trailofbits.com
If a user passes a comparator that doesn't satisfy strict weak ordering
(see https://eel.is/c++draft/algorithms#alg.sorting.general) to
a sorting algorithm, the algorithm can produce an incorrect result or
even lead
to an out-of-bounds access. Unfortunately, comprehensively validating
that a given comparator indeed satisfies the strict weak ordering
requirement is prohibitively expensive (see [the related
RFC](https://discourse.llvm.org/t/rfc-strict-weak-ordering-checks-in-the-debug-libc/70217)).
As a result, we have three independent sets of checks:
- assertions that catch out-of-bounds accesses within the algorithms'
implementation. These are relatively cheap; however, they cannot catch
the underlying cause and cannot prevent the case where an invalid
comparator would result in an incorrectly-sorted sequence without
actually triggering an OOB access;
- debug comparators that wrap a given comparator and on each comparison
check that if `(a < b)`, then `!(b < a)`, where `<` stands for the
user-provided comparator. This performs up to 2x number of comparisons
but doesn't affect the algorithmic complexity. While this approach can
find more issues, it is still a heuristic;
- a comprehensive check of the comparator that validates up to 100
elements in the resulting sorted sequence (see the RFC above for
details). The check is expensive but the 100 element limit can somewhat
compensate for that, especially for large values of `N`.
The first set of checks is enabled in the fast hardening mode while the
other two are only enabled in the debug mode.
This patch also removes the
`_LIBCPP_DEBUG_STRICT_WEAK_ORDERING_CHECK` macro that
previously was used to selectively enable the 100-element check.
Now this check is enabled unconditionally in the debug mode.
Also, introduce a new category
`_LIBCPP_ASSERT_SEMANTIC_REQUIREMENT`. This category is
intended for checking the semantic requirements from the Standard.
Typically, these are hard or impossible to completely validate, so
these checks are expected to be heuristic in nature and potentially
quite expensive.
See https://reviews.llvm.org/D150264 for additional background.
Fixes#71496
Using std::regex_search with the regex_constant match_default and a
simple regex pattern `$` is expected to match general strings such as
_"a", "ab", "abc"..._ at `[last, last)` positions. But, the current
implementation fails to do so.
Fixes#75042
The CUDA SDK contains an unfortunate definition for the `__noinline__`
macro. This patch works around it by using `__attribute__((noinline))`
instead of `__attribute__((__noinline__))` on CUDA. We are still waiting
for a long-term resolution to this issue in NVIDIA/cccl#1235.
This reverts commit 7d9b5aa65b since
std/utilities/format/format.arguments/format.arg/visit.return_type.pass.cpp
is failing on Windows when building with Clang-cl.
This patch implements __cxa_init_primary_exception, an extension to the
Itanium C++ ABI. This extension is already present in both libsupc++ and
libcxxrt. This patch also starts making use of this function in
std::make_exception_ptr: instead of going through a full throw/catch
cycle, we are now able to initialize an exception directly, thus making
std::make_exception_ptr around 30x faster.
Currently std::expected can have some padding bytes in its tail due to
[[no_unique_address]]. Those padding bytes can be used by other objects.
For example, in the current implementation:
sizeof(std::expected<std::optional<int>, bool>) ==
sizeof(std::expected<std::expected<std::optional<int>, bool>, bool>)
As a result, the data layout of an
std::expected<std::expected<std::optional<int>, bool>, bool>
can look like this:
+-- optional "has value" flag
| +--padding
/---int---\ | |
00 00 00 00 01 00 00 00
| |
| +- "outer" expected "has value" flag
|
+- expected "has value" flag
This is problematic because `emplace()`ing the "inner" expected can not
only overwrite the "inner" expected "has value" flag (issue #68552) but
also the tail padding where other objects might live.
This patch fixes the problem by ensuring that std::expected has no tail
padding, which is achieved by conditional usage of [[no_unique_address]]
based on the tail padding that this would create.
This is an ABI breaking change because the following property changes:
sizeof(std::expected<std::optional<int>, bool>) <
sizeof(std::expected<std::expected<std::optional<int>, bool>, bool>)
Before the change, this relation didn't hold. After the change, the relation
does hold, which means that the size of std::expected in these cases increases
after this patch. The data layout will change in the following cases where
tail padding can be reused by other objects:
class foo : std::expected<std::optional<int>, bool> {
bool b;
};
or using [[no_unique_address]]:
struct foo {
[[no_unique_address]] std::expected<std::optional<int>, bool> e;
bool b;
};
The vendor communication is handled in #70820.
Fixes: #70494
Co-authored-by: philnik777 <nikolasklauser@berlin.de>
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
While the code is technically correct because the index is never
actually moved from (and anyway that wouldn't matter since it's an
integer), it's still better style not to access an object after it has
been moved-from. Since this is so easy to do, just save the index in a
temporary variable.
rdar://120501577
Introduce a new `argument-within-domain` category that covers cases
where the given arguments make it impossible to produce a correct result
(or create a valid object in case of constructors). While the incorrect
result doesn't create an immediate problem within the library (like e.g.
a null pointer dereference would), it always indicates a logic error in
user code and is highly likely to lead to a bug in the program once the
value is used.
When I implemented `condition_variable_any::wait`, I missed the most
important paragraph in the spec:
> The following wait functions will be notified when there is a stop
request on the passed stop_token.
> In that case the functions return immediately, returning false if the
predicate evaluates to false.
From
https://eel.is/c++draft/thread.condition#thread.condvarany.intwait-1.
Fixes#76807
The tag name was long for an ABI tag. The name was misleading too, the
tag is first introduced in LLVM 18 in 2024 and not in 2023.
---------
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
This commit turns on ASan annotations in `std::basic_string` for short
stings (SSO case).
Originally suggested here: https://reviews.llvm.org/D147680
String annotations added here:
https://github.com/llvm/llvm-project/pull/72677
Requires to pass CI without fails:
- https://github.com/llvm/llvm-project/pull/75845
- https://github.com/llvm/llvm-project/pull/75858
Annotating `std::basic_string` with default allocator is implemented in
https://github.com/llvm/llvm-project/pull/72677 but annotations for
short strings (SSO - Short String Optimization) are turned off there.
This commit turns them on. This also removes
`_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to
support turning on and off short string annotations.
Support in ASan API exists since
dd1b7b797a.
You can turn off annotations for a specific allocator based on changes
from
2fa1bec7a2.
This PR is a part of a series of patches extending AddressSanitizer C++
container overflow detection capabilities by adding annotations, similar
to those existing in `std::vector` and `std::deque` collections. These
enhancements empower ASan to effectively detect instances where the
instrumented program attempts to access memory within a collection's
internal allocation that remains unused. This includes cases where
access occurs before or after the stored elements in `std::deque`, or
between the `std::basic_string`'s size (including the null terminator)
and capacity bounds.
The introduction of these annotations was spurred by a real-world
software bug discovered by Trail of Bits, involving an out-of-bounds
memory access during the comparison of two strings using the
`std::equals` function. This function was taking iterators
(`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison,
using a custom comparison function. When the `iter1` object exceeded the
length of `iter2`, an out-of-bounds read could occur on the `iter2`
object. Container sanitization, upon enabling these annotations, would
effectively identify and flag this potential vulnerability.
If you have any questions, please email:
advenam.tacet@trailofbits.comdisconnect3d@trailofbits.com
Previously there were two ways to override the verbose abort function
which gets called when a hardening assertion is triggered:
- compile-time: define the `_LIBCPP_VERBOSE_ABORT` macro;
- link-time: provide a definition of `__libcpp_verbose_abort` function.
This patch adds a new configure-time approach: the vendor can provide
a path to a custom header file which will get copied into the build by
CMake and included by the library. The header must provide a definition
of the
`_LIBCPP_ASSERTION_HANDLER` macro which is what will get called should
a hardening assertion fail. As of this patch, overriding
`_LIBCPP_VERBOSE_ABORT` will still work, but the previous mechanisms
will be effectively removed in a follow-up patch, making the
configure-time mechanism the sole way of overriding the default handler.
Note that `_LIBCPP_ASSERTION_HANDLER` only gets invoked when a hardening
assertion fails. It does not affect other cases where
`_LIBCPP_VERBOSE_ABORT` is currently used (e.g. when an exception is
thrown in the `-fno-exceptions` mode).
The library provides a default version of the custom header file that
will get used if it's not overridden by the vendor. That allows us to
always test the override mechanism and reduces the difference in
configuration between the pristine version of the library and
a platform-specific version.
The behavior of `std::regex_search` for patterns anchored both to the
start and to the end of the input went wrong after merging #77256 .
Patterns like `"^b*$"` started matching the strings such as `"a"`, which
is not expected.
Reverts the PR: #77256
This commit simplifies ASan helper functions in `std::vector` by
removing arguments which can be calculated later.
Short term it improves readability of helper functions in `std::vector`.
Long term it aims to help with a bigger refactor of container
annotations.
Clang-tidy 18 no longer has false positives with the spaceship operator.
Note that I'm quite sure there are more occurrences in our headers that
are not caught.
This simplifies the IWYU generation script by treating everything as a
file, instead of dealing with directories and files separately.
This has the downside that the `libcxx.imp` file is a lot larger than it
used to be, however we now have the flexibility of mapping files under
detail directories to different public headers. For example, this allows
us to map <__fwd/subrange.h> to <ranges> but <__fwd/pair.h> to
<utility>.
This patch also adds basic validation to ensure that we never map a
header to a public header that doesn't exist. We may still be missing
some mappings or we may be mapping to incorrect headers, but we won't be
mapping to headers that downright don't exist.
Fixes#63346
As suggested in #73262 this enable the stream printing on Apple
backdeployment targets. This omits the check whether the file is a
terminal. This is not entirely conforming, but the differences should be
minor and are typically not observable.
Fixes https://github.com/llvm/llvm-project/issues/75225
Previously the header included several headers, possibly granularized
threading headers. This could lead to build errors when these headers
were incompatible with threading disabled.
Now test the guard before inclusion. This matches the pattern used for
no localization and no wide characters.
Fixes: https://github.com/llvm/llvm-project/issues/76620
We discussed the removal of these enable-all macros in the libc++
monthly meeting and we agreed that we should deprecate these macros in
LLVM 18, and then remove them in LLVM 19 since they can silently enable
deprecated features that are implemented after the first release of the
macro.
This patch does the first part of this -- it deprecates the macro.
Note that the file
test/libcxx/depr/enable_removed_cpp20_features.compile.pass.cpp
does not exist so this file is not adapted. Since the feature is
deprecated and slated for removal soon the missing test is not
implemented.
Partly addresses: https://github.com/llvm/llvm-project/issues/75976
---------
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
The overloads of `println` are specified in terms of `format`. The
function `format` is specified to work with ranges.
The implementations for `println` do not include `<format>`, but
libc++'s granularized header. This means the following example does not
work
#include <vector>
#include <print>
int main() {
std::vector<int> v{1, 2, 3};
std::println("{}", v);
}
(The other print functions also require this to work, they are specified
in terms of other format functions.)
Fixes: https://github.com/llvm/llvm-project/issues/71925
This commit turns on ASan annotations in `std::basic_string` for all
allocators by default.
Originally suggested here: https://reviews.llvm.org/D146214
String annotations added here:
https://github.com/llvm/llvm-project/pull/72677
This commit is part of our efforts to support container annotations with
(almost) every allocator. Annotating `std::basic_string` with default
allocator is implemented in
https://github.com/llvm/llvm-project/pull/72677.
Additionally it removes `__begin != nullptr` because `data()` should
never return a nullptr.
Support in ASan API exists since
1c5ad6d2c0.
This patch removes the check in std::basic_string annotation member
function (__annotate_contiguous_container) to support different
allocators.
You can turn off annotations for a specific allocator based on changes
from
2fa1bec7a2.
The motivation for a research and those changes was a bug, found by
Trail of Bits, in a real code where an out-of-bounds read could happen
as two strings were compared via a call to `std::equal` that took
`iter1_begin`, `iter1_end`, `iter2_begin` iterators (with a custom
comparison function). When object `iter1` was longer than `iter2`, read
out-of-bounds on `iter2` could happen. Container sanitization would
detect it.
If you have any questions, please email:
- advenam.tacet@trailofbits.com
- disconnect3d@trailofbits.com
As described in #69994, using the escape hatch makes us non-conforming
in C++20 due to incorrect constexpr-ness. It also leads to bad
diagnostics as reported by #63900. We discussed the issue in the libc++
monthly meeting and we agreed that we should deprecate the macro in LLVM
18, and then remove it in LLVM 19 since it causes too many problems.
This patch does the first part of this -- it deprecates the macro.
Fixes#69994Fixes#63900
Partially addresses #75975
This commit is a refactor (increases readability) and optimization fix.
This is a fixed commit of
https://github.com/llvm/llvm-project/pull/76200 First reverthed here:
1ea7a56057
Please, check original PR for details.
The difference is a return type of the lambda.
Original description:
This commit addresses optimization and instrumentation challenges
encountered within comma constructors.
1) _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS does not work in comma
constructors.
2) Code inside comma constructors is not always correctly optimized.
Problematic code examples:
- `: __r_(((__str.__is_long() ? 0 : (__str.__annotate_delete(), 0)),
std::move(__str.__r_))) {`
- `: __r_(__r_([&](){ if(!__s.__is_long()) __s.__annotate_delete();
return std::move(__s.__r_);}())) {`
However, lambda with argument seems to be correctly optimized. This
patch uses that fact.
Use of lambda based on idea from @ldionne.
Since we use _LIBCPP_USING_IF_EXISTS to handle missing C library functions
now, _LIBCPP_C_HAS_NO_GETS shouldn't be necessary anymore.
See the discussion thread in #77242 for more details.