41a94de75caacb979070ec7a010dfe3c4e9f116f
417 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
69b54c1a05 |
[libcxx][algorithm] Optimize std::stable_sort via radix sort algorithm (#104683)
LSD radix sort speeds up std::stable_sort dramatically when sorting integers. The speedup ranges from relatively small to 10x or more, depending on the type of the sorted elements and the initial state of the sorted array.

```
Running ./libcxx/test/benchmarks/stable_sort.bench.out
Run on (12 X 2600 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB
  L1 Instruction 32 KiB
  L2 Unified 256 KiB (x6)
  L3 Unified 12288 KiB
Load Average: 3.48, 3.38, 3.08
---------------------------------------------------------------------------
Benchmark                                      After         Before
---------------------------------------------------------------------------
BM_StableSort_int8_Random_1                    3.39 ns       3.58 ns
BM_StableSort_int8_Random_4                    21.1 ns       21.9 ns
BM_StableSort_int8_Random_16                   142 ns        147 ns
BM_StableSort_int8_Random_64                   893 ns        903 ns
BM_StableSort_int8_Random_256                  409 ns        5810 ns
BM_StableSort_int8_Random_1024                 1235 ns       29973 ns
BM_StableSort_int8_Random_4096                 4410 ns       141880 ns
BM_StableSort_int8_Random_16384                18044 ns      620540 ns
BM_StableSort_int8_Random_65536                144030 ns     2592013 ns
BM_StableSort_int8_Random_262144               858350 ns     10935814 ns
BM_StableSort_int8_Random_524288               2929988 ns    27060729 ns
BM_StableSort_int8_Random_1048576              6058292 ns    49622720 ns
BM_StableSort_int8_Ascending_1                 3.42 ns       3.92 ns
BM_StableSort_int8_Ascending_4                 5.86 ns       8.08 ns
BM_StableSort_int8_Ascending_16                10.6 ns       12.0 ns
BM_StableSort_int8_Ascending_64                28.9 ns       30.6 ns
BM_StableSort_int8_Ascending_256               415 ns        391 ns
BM_StableSort_int8_Ascending_1024              1666 ns       2309 ns
BM_StableSort_int8_Ascending_4096              7748 ns       12269 ns
BM_StableSort_int8_Ascending_16384             40588 ns      60181 ns
BM_StableSort_int8_Ascending_65536             178843 ns     298221 ns
BM_StableSort_int8_Ascending_262144            919959 ns     1402692 ns
BM_StableSort_int8_Ascending_524288            2397397 ns    3036984 ns
BM_StableSort_int8_Ascending_1048576           5080043 ns    7218581 ns
BM_StableSort_int8_Descending_1                3.44 ns       3.53 ns
BM_StableSort_int8_Descending_4                7.94 ns       8.29 ns
BM_StableSort_int8_Descending_16               59.6 ns       57.7 ns
BM_StableSort_int8_Descending_64               1051 ns       1027 ns
BM_StableSort_int8_Descending_256              422 ns        4718 ns
BM_StableSort_int8_Descending_1024             1676 ns       21044 ns
BM_StableSort_int8_Descending_4096             7766 ns       64827 ns
BM_StableSort_int8_Descending_16384            40230 ns      93981 ns
BM_StableSort_int8_Descending_65536            190978 ns     421151 ns
BM_StableSort_int8_Descending_262144           1055141 ns    1918927 ns
BM_StableSort_int8_Descending_524288           2875115 ns    3809153 ns
BM_StableSort_int8_Descending_1048576          5854135 ns    8713690 ns
BM_StableSort_int8_SingleElement_1             3.52 ns       3.46 ns
BM_StableSort_int8_SingleElement_4             6.25 ns       5.79 ns
BM_StableSort_int8_SingleElement_16            10.7 ns       11.4 ns
BM_StableSort_int8_SingleElement_64            29.3 ns       30.3 ns
BM_StableSort_int8_SingleElement_256           858 ns        380 ns
BM_StableSort_int8_SingleElement_1024          3036 ns       2231 ns
BM_StableSort_int8_SingleElement_4096          11580 ns      11866 ns
BM_StableSort_int8_SingleElement_16384         44956 ns      59621 ns
BM_StableSort_int8_SingleElement_65536         182006 ns     297853 ns
BM_StableSort_int8_SingleElement_262144        962181 ns     1432857 ns
BM_StableSort_int8_SingleElement_524288        2256687 ns    2975707 ns
BM_StableSort_int8_SingleElement_1048576       4522556 ns    6949948 ns
BM_StableSort_int8_PipeOrgan_1                 3.26 ns       3.64 ns
BM_StableSort_int8_PipeOrgan_4                 6.21 ns       6.58 ns
BM_StableSort_int8_PipeOrgan_16                23.7 ns       25.4 ns
BM_StableSort_int8_PipeOrgan_64                250 ns        248 ns
BM_StableSort_int8_PipeOrgan_256               414 ns        2498 ns
BM_StableSort_int8_PipeOrgan_1024              1697 ns       10946 ns
BM_StableSort_int8_PipeOrgan_4096              7840 ns       37238 ns
BM_StableSort_int8_PipeOrgan_16384             41402 ns      74805 ns
BM_StableSort_int8_PipeOrgan_65536             180107 ns     357891 ns
BM_StableSort_int8_PipeOrgan_262144            988273 ns     1647296 ns
BM_StableSort_int8_PipeOrgan_524288            2547374 ns    3245991 ns
BM_StableSort_int8_PipeOrgan_1048576           5128783 ns    7342444 ns
BM_StableSort_int8_QuickSortAdversary_1        3.14 ns       4.01 ns
BM_StableSort_int8_QuickSortAdversary_4        6.05 ns       7.02 ns
BM_StableSort_int8_QuickSortAdversary_16       10.5 ns       11.9 ns
BM_StableSort_int8_QuickSortAdversary_64       520 ns        516 ns
BM_StableSort_int8_QuickSortAdversary_256      920 ns        386 ns
BM_StableSort_int8_QuickSortAdversary_1024     3083 ns       2299 ns
BM_StableSort_int8_QuickSortAdversary_4096     11659 ns      12295 ns
BM_StableSort_int8_QuickSortAdversary_16384    45721 ns      60931 ns
BM_StableSort_int8_QuickSortAdversary_65536    186334 ns     295423 ns
BM_StableSort_int8_QuickSortAdversary_262144   946262 ns     1399973 ns
BM_StableSort_int8_QuickSortAdversary_524288   2282004 ns    2832266 ns
BM_StableSort_int8_QuickSortAdversary_1048576
```
|
||
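To make the radix sort idea in the commit above concrete, here is a minimal sketch of a stable LSD radix sort. It is not libc++'s internal implementation: the name `lsd_radix_sort` is hypothetical, it handles unsigned keys only, and it always allocates a full-size scratch buffer. Signed integers would need an extra key transformation that is omitted here.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical helper, not libc++'s internal implementation.
// Stable LSD radix sort for unsigned integers, one byte per pass.
template <class UInt>
void lsd_radix_sort(std::vector<UInt>& values) {
    static_assert(std::is_unsigned_v<UInt>, "sketch handles unsigned keys only");
    std::vector<UInt> buffer(values.size());
    for (std::size_t byte = 0; byte < sizeof(UInt); ++byte) {
        const unsigned shift = static_cast<unsigned>(byte * 8);
        // Count how many keys have each value of the current byte.
        std::array<std::size_t, 257> offsets{};
        for (UInt v : values)
            ++offsets[((v >> shift) & 0xFF) + 1];
        // Prefix sums turn the counts into the first output index of each bucket.
        for (std::size_t i = 1; i < offsets.size(); ++i)
            offsets[i] += offsets[i - 1];
        // Scatter in input order, so keys with equal digits keep their relative order.
        for (UInt v : values)
            buffer[offsets[(v >> shift) & 0xFF]++] = v;
        values.swap(buffer);
    }
}
```

Each pass is linear in the input size and performs no comparisons, which is consistent with the largest gains in the benchmark appearing at sizes where comparison-based merging dominates.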
|
|
e99c4906e4 | [libc++] Granularize <cstddef> includes (#108696) | ||
|
|
09e3a36058 |
[libc++][modules] Fix missing and incorrect includes (#108850)
This patch adds a large number of missing includes in the libc++ headers and the test suite. Those were found as part of the effort to move towards a mostly monolithic top-level std module. |
||
|
|
94e7c0b051 |
[libc++] Remove get_temporary_buffer and return_temporary_buffer (#100914)
Works towards P0619R4 / #99985. Uses of `std::get_temporary_buffer` and `std::return_temporary_buffer` are replaced with a `unique_ptr`-based RAII buffer holder.
Escape hatches:
- `_LIBCPP_ENABLE_CXX20_REMOVED_TEMPORARY_BUFFER` restores `std::get_temporary_buffer` and `std::return_temporary_buffer`.
Drive-by changes:
- In `<syncstream>`, state that `get_temporary_buffer` is now removed, because `<syncstream>` was added in C++20. |
||
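The `unique_ptr`-based RAII buffer holder mentioned in the commit above could look roughly like the sketch below. The names `raw_buffer_deleter`, `raw_buffer`, and `allocate_raw_buffer` are hypothetical and are not taken from libc++. The sketch assumes `T` is not over-aligned and does not guard against overflow of `count * sizeof(T)`.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical deleter: releases raw storage obtained from ::operator new.
// It does not run any destructors; the storage is treated as uninitialized.
struct raw_buffer_deleter {
    void operator()(void* p) const noexcept { ::operator delete(p); }
};

template <class T>
using raw_buffer = std::unique_ptr<T, raw_buffer_deleter>;

// Hypothetical factory: returns uninitialized storage for `count` objects,
// or an empty holder if allocation fails (mirroring the old "may return
// null" behaviour of get_temporary_buffer).
template <class T>
raw_buffer<T> allocate_raw_buffer(std::size_t count) {
    static_assert(alignof(T) <= __STDCPP_DEFAULT_NEW_ALIGNMENT__,
                  "sketch does not handle over-aligned types");
    void* p = ::operator new(count * sizeof(T), std::nothrow);
    return raw_buffer<T>(static_cast<T*>(p));
}
```

Because the holder owns the storage, every early return or exception path releases it automatically, which is the main advantage over pairing the two removed free functions by hand.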
|
|
f73050e722 |
[libc++] Fix several double-moves in the code base (#104616)
This patch hardens the "test iterators" we use to test algorithms by ensuring that they don't get double-moved. As a result of this hardening, the tests started reporting multiple failures where we would double-move iterators, which are being fixed in this patch. In particular:
- Fixed a double-move in pstl.partition
- Add coverage for begin()/end() in subrange tests
- Fix tests for ranges::ends_with and ranges::contains, which were incorrectly calling begin() twice on the same subrange containing non-copyable input iterators.

Fixes #100709 |
||
|
|
257831582c |
[libc++] Check correctly ref-qualified __is_callable in algorithms (#101553)
We were only checking that the comparator was callable as an rvalue, when in reality the algorithms always call comparators as lvalues. This patch also refactors the tests for callable requirements and expands them to a few missing algorithms. This is take 2 of #73451, which was reverted because it broke some CI bots. The issue was that we checked __is_callable with arguments in the wrong order inside std::upper_bound. This has now been fixed and a test was added. Fixes #69554 |
||
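The lvalue-versus-rvalue distinction in the commit above can be illustrated with the standard `std::is_invocable_r_v` trait. This is only an analogy: `__is_callable` is a libc++ internal, and the comparator types and variable templates below are hypothetical.

```cpp
#include <type_traits>

// Hypothetical comparator that can only be called on an lvalue.
struct lvalue_only_less {
    bool operator()(int a, int b) & { return a < b; }
    bool operator()(int, int) && = delete;
};

// Hypothetical comparator that can only be called on an rvalue.
struct rvalue_only_less {
    bool operator()(int a, int b) && { return a < b; }
};

// Checking Comp& asks "can a named comparator object be called?",
// which matches how algorithms actually invoke their comparator.
template <class Comp>
inline constexpr bool callable_as_lvalue = std::is_invocable_r_v<bool, Comp&, int, int>;

// Checking plain Comp asks the question for a temporary instead.
template <class Comp>
inline constexpr bool callable_as_rvalue = std::is_invocable_r_v<bool, Comp, int, int>;

static_assert(callable_as_lvalue<lvalue_only_less>);
static_assert(!callable_as_rvalue<lvalue_only_less>);
static_assert(!callable_as_lvalue<rvalue_only_less>);
static_assert(callable_as_rvalue<rvalue_only_less>);
```

A check written against the rvalue form would wrongly reject `lvalue_only_less` and wrongly accept `rvalue_only_less`, even though only the former works inside an algorithm that stores and calls its comparator.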
|
|
d07fdf9779 |
[libc++] Optimize lexicographical_compare (#65279)
If the comparison operation is equivalent to < and that is a total order, we know that we can use equality comparison on that type instead to extract some information. Furthermore, if equality comparison on that type is trivial, the user can't observe that we're calling it. So instead of using the user-provided total order, we use std::mismatch, which uses equality comparison (and is vectorized). Additionally, if the type is trivially lexicographically comparable, we can go one step further and use std::memcmp directly instead of calling std::mismatch.

Benchmarks:
```
-------------------------------------------------------------------------------------
Benchmark                                                 old          new
-------------------------------------------------------------------------------------
bm_lexicographical_compare<unsigned char>/1               1.17 ns      2.34 ns
bm_lexicographical_compare<unsigned char>/2               1.64 ns      2.57 ns
bm_lexicographical_compare<unsigned char>/3               2.23 ns      2.58 ns
bm_lexicographical_compare<unsigned char>/4               2.82 ns      2.57 ns
bm_lexicographical_compare<unsigned char>/5               3.34 ns      2.11 ns
bm_lexicographical_compare<unsigned char>/6               3.94 ns      2.21 ns
bm_lexicographical_compare<unsigned char>/7               4.56 ns      2.11 ns
bm_lexicographical_compare<unsigned char>/8               5.25 ns      2.11 ns
bm_lexicographical_compare<unsigned char>/16              9.88 ns      2.11 ns
bm_lexicographical_compare<unsigned char>/64              38.9 ns      2.36 ns
bm_lexicographical_compare<unsigned char>/512             317 ns       6.54 ns
bm_lexicographical_compare<unsigned char>/4096            2517 ns      41.4 ns
bm_lexicographical_compare<unsigned char>/32768           20052 ns     488 ns
bm_lexicographical_compare<unsigned char>/262144          159579 ns    4409 ns
bm_lexicographical_compare<unsigned char>/1048576         640456 ns    20342 ns
bm_lexicographical_compare<signed char>/1                 1.18 ns      2.37 ns
bm_lexicographical_compare<signed char>/2                 1.65 ns      2.60 ns
bm_lexicographical_compare<signed char>/3                 2.23 ns      2.83 ns
bm_lexicographical_compare<signed char>/4                 2.81 ns      3.06 ns
bm_lexicographical_compare<signed char>/5                 3.35 ns      3.30 ns
bm_lexicographical_compare<signed char>/6                 3.90 ns      3.99 ns
bm_lexicographical_compare<signed char>/7                 4.56 ns      3.78 ns
bm_lexicographical_compare<signed char>/8                 5.20 ns      4.02 ns
bm_lexicographical_compare<signed char>/16                9.80 ns      6.21 ns
bm_lexicographical_compare<signed char>/64                39.0 ns      3.16 ns
bm_lexicographical_compare<signed char>/512               318 ns       7.58 ns
bm_lexicographical_compare<signed char>/4096              2514 ns      47.4 ns
bm_lexicographical_compare<signed char>/32768             20096 ns     504 ns
bm_lexicographical_compare<signed char>/262144            156617 ns    4146 ns
bm_lexicographical_compare<signed char>/1048576           624265 ns    19810 ns
bm_lexicographical_compare<int>/1                         1.15 ns      2.12 ns
bm_lexicographical_compare<int>/2                         1.60 ns      2.36 ns
bm_lexicographical_compare<int>/3                         2.21 ns      2.59 ns
bm_lexicographical_compare<int>/4                         2.74 ns      2.83 ns
bm_lexicographical_compare<int>/5                         3.26 ns      3.06 ns
bm_lexicographical_compare<int>/6                         3.81 ns      4.53 ns
bm_lexicographical_compare<int>/7                         4.41 ns      4.72 ns
bm_lexicographical_compare<int>/8                         5.08 ns      2.36 ns
bm_lexicographical_compare<int>/16                        9.54 ns      3.08 ns
bm_lexicographical_compare<int>/64                        37.8 ns      4.71 ns
bm_lexicographical_compare<int>/512                       309 ns       24.6 ns
bm_lexicographical_compare<int>/4096                      2422 ns      204 ns
bm_lexicographical_compare<int>/32768                     19362 ns     1947 ns
bm_lexicographical_compare<int>/262144                    155727 ns    19793 ns
bm_lexicographical_compare<int>/1048576                   623614 ns    80180 ns
bm_ranges_lexicographical_compare<unsigned char>/1        1.07 ns      2.35 ns
bm_ranges_lexicographical_compare<unsigned char>/2        1.72 ns      2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/3        2.46 ns      2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/4        3.17 ns      2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/5        3.86 ns      2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/6        4.55 ns      2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/7        5.25 ns      2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/8        5.95 ns      2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/16       11.7 ns      2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/64       45.5 ns      2.36 ns
bm_ranges_lexicographical_compare<unsigned char>/512      366 ns       6.35 ns
bm_ranges_lexicographical_compare<unsigned char>/4096     2886 ns      40.9 ns
bm_ranges_lexicographical_compare<unsigned char>/32768    23054 ns     489 ns
bm_ranges_lexicographical_compare<unsigned char>/262144   185302 ns    4339 ns
bm_ranges_lexicographical_compare<unsigned char>/1048576  741576 ns    19430 ns
bm_ranges_lexicographical_compare<signed char>/1          1.10 ns      2.12 ns
bm_ranges_lexicographical_compare<signed char>/2          1.66 ns      2.35 ns
bm_ranges_lexicographical_compare<signed char>/3          2.23 ns      2.58 ns
bm_ranges_lexicographical_compare<signed char>/4          2.82 ns      2.82 ns
bm_ranges_lexicographical_compare<signed char>/5          3.34 ns      3.06 ns
bm_ranges_lexicographical_compare<signed char>/6          3.92 ns      3.99 ns
bm_ranges_lexicographical_compare<signed char>/7          4.64 ns      4.10 ns
bm_ranges_lexicographical_compare<signed char>/8          5.21 ns      4.61 ns
bm_ranges_lexicographical_compare<signed char>/16         9.79 ns      7.42 ns
bm_ranges_lexicographical_compare<signed char>/64         38.9 ns      2.93 ns
bm_ranges_lexicographical_compare<signed char>/512        317 ns       7.31 ns
bm_ranges_lexicographical_compare<signed char>/4096       2500 ns      47.5 ns
bm_ranges_lexicographical_compare<signed char>/32768      19940 ns     496 ns
bm_ranges_lexicographical_compare<signed char>/262144     159166 ns    4393 ns
bm_ranges_lexicographical_compare<signed char>/1048576    638206 ns    19786 ns
bm_ranges_lexicographical_compare<int>/1                  1.10 ns      2.12 ns
bm_ranges_lexicographical_compare<int>/2                  1.64 ns      3.04 ns
bm_ranges_lexicographical_compare<int>/3                  2.23 ns      2.58 ns
bm_ranges_lexicographical_compare<int>/4                  2.81 ns      2.81 ns
bm_ranges_lexicographical_compare<int>/5                  3.35 ns      3.05 ns
bm_ranges_lexicographical_compare<int>/6                  3.94 ns      4.60 ns
bm_ranges_lexicographical_compare<int>/7                  4.60 ns      4.81 ns
bm_ranges_lexicographical_compare<int>/8                  5.19 ns      2.35 ns
bm_ranges_lexicographical_compare<int>/16                 9.85 ns      2.87 ns
bm_ranges_lexicographical_compare<int>/64                 38.9 ns      4.70 ns
bm_ranges_lexicographical_compare<int>/512                318 ns       24.5 ns
bm_ranges_lexicographical_compare<int>/4096               2494 ns      202 ns
bm_ranges_lexicographical_compare<int>/32768              20000 ns     1939 ns
bm_ranges_lexicographical_compare<int>/262144             160433 ns    19730 ns
bm_ranges_lexicographical_compare<int>/1048576            642636 ns    80760 ns
```
|
||
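The mismatch-based strategy described in the commit above can be sketched as follows. The function `lex_less_via_mismatch` is a hypothetical illustration restricted to `std::vector<int>` with `operator<`; it is not the dispatch logic libc++ uses, and the further `std::memcmp` step is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper; assumes the ordering is operator< on int, a total
// order whose equality comparison the user cannot observe.
bool lex_less_via_mismatch(const std::vector<int>& a, const std::vector<int>& b) {
    const auto common = static_cast<std::ptrdiff_t>(std::min(a.size(), b.size()));
    const auto a_end = a.begin() + common;
    // Find the first position where the ranges differ, using equality only.
    auto [it_a, it_b] = std::mismatch(a.begin(), a_end, b.begin());
    if (it_a == a_end)
        return a.size() < b.size();  // one range is a prefix of the other
    return *it_a < *it_b;            // a single ordering comparison decides
}
```

The equality scan is the part that can be vectorized; the ordering comparison runs at most once, on the first differing pair.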
|
|
451bba6fbf |
[libc++] Revert "Check correctly ref-qualified __is_callable in algorithms (#73451)"
This reverts commit
|
||
|
|
8d151f804f |
[libc++] Check correctly ref-qualified __is_callable in algorithms (#73451)
We were only checking that the comparator was callable as an rvalue, when in reality the algorithms always call comparators as lvalues. This patch also refactors the tests for callable requirements and expands them to a few missing algorithms. Fixes #69554 |
||
|
|
f90e51a508 |
[libcxx][test] Mark sort.pass.cpp as a long test (#100720)
Picolib testing skips any test requiring this feature; I just didn't know the feature existed until now. |
||
|
|
c6b192ac2e |
[libc++][test] Do not assume array::iterator is a pointer (#100603)
In the tests I added for `ranges::find_last{_if{_not}}`, I accidentally
introduced an assumption that `same_as<array<T, 0>::iterator, T*>`; this
is a faulty assumption on MSVC-STL.
Fixes #100498.
|
||
|
|
929b474991 |
[libcxx][test] Explain picolib unsupported in sort.pass.cpp
This is not a hidden bug, it's just a very slow test under emulation. |
||
|
|
04760bfadb |
[libc++][ranges] P1223R5: find_last (#99312)
Implements [P1223R5][] completely. Includes an implementation of `find_last`, `find_last_if`, and `find_last_if_not`. [P1223R5]: https://wg21.link/p1223r5 |
||
|
|
a0662176a9 |
[libc++] Speed up set_intersection() by fast-forwarding over ranges of non-matching elements with one-sided binary search. (#75230)
One-sided binary search, aka meta binary search, has been in the public domain for decades, and has the general advantage of being constant time in the best case, with the downside of executing at most 2*log(N) comparisons vs classic binary search's exact log(N). There are two scenarios in which it really shines: the first one is when operating over non-random-access iterators, because the classic algorithm requires knowing the container's size upfront, which adds N iterator increments to the complexity. The second one is when traversing the container in order, trying to fast-forward to the next value: in that case the classic algorithm requires at least O(N*log(N)) comparisons and, for non-random-access iterators, O(N^2) iterator increments, whereas the one-sided version will yield O(N) operations on both counts, with a best-case of O(log(N)) comparisons which is very common in practice. |
||
|
|
dfddc0c484 |
[libc++] Include the rest of the detail headers by version in the umbrella headers (#96032)
This is a follow-up to #83740. |
||
|
|
7918e624ad |
[libc++] Test suite portability improvements (#98527)
This patch contains a number of small portability improvements for the test suite, making it easier to run the test suite with other standard library implementations.
- Guard checks for _LIBCPP_HARDENING_MODE to avoid -Wundef
- Avoid defining _LIBCPP_HARDENING_MODE even when no hardening mode is specified -- we should use the default mode of the library in that case.
- Add missing includes and qualify a few function calls.
- Avoid opening namespace std to forward declare stdlib containers. The test suite should represent user code, and user code isn't allowed to do that. |
||
|
|
9e9404387d |
[libc++] Remove annotations for GCC 13 and update the documentation (#97744)
GCC 14 was released a while ago. We've updated the CI to use GCC 14 now. This removes any old annotations in the tests and updates the documentation to reflect the updated version requirements. |
||
|
|
a0cdd32b79 |
[libc++] [test] Consistently use REQUIRES: has-unix-headers (#94122)
There were 7 occurrences of `UNSUPPORTED: !has-unix-headers`, versus 212 occurrences of `REQUIRES: has-unix-headers`. I don't completely understand how libc++ uses UNSUPPORTED versus REQUIRES, but it seems better to be consistent, and to avoid the double negation in "this is unsupported if we don't have unix headers". (This came to my attention because of the single occurrence in `libcxx/test/std`. Our MSVC-internal test harness isn't aware of lit features, so we teach it to skip tests via the incredibly primitive method of searching for specific comments, so I had to deal with this comment inconsistency.) |
||
|
|
037a0528bb |
[libc++] Handle 0 size case for testing support operator new (#93834)
The return value of malloc is implementation-defined when the requested size is 0. On platforms (such as AIX) that return a null pointer for size 0, operator new will throw a bad_alloc exception. operator new should return a non-null pointer for size 0 instead. |
||
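One common way to handle the zero-size case described above is to round the request up to one byte before calling `malloc`. The sketch below shows that approach in a free function with the hypothetical name `allocate_for_operator_new`; it is an assumption about the shape of the fix, not the code from the patch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical allocation helper for a test-support operator new.
void* allocate_for_operator_new(std::size_t size) {
    // malloc(0) may legally return a null pointer; request one byte instead
    // so that a zero-size request still yields a non-null pointer.
    if (size == 0)
        size = 1;
    void* p = std::malloc(size);
    if (p == nullptr)
        throw std::bad_alloc();
    return p;
}
```

Memory obtained this way is released with `std::free`, so a matching `operator delete` replacement would forward to it.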
|
|
2ba0838615 |
[libc++] [test] Fix portability issues for MSVC (#93259)
* Guard `std::__make_from_tuple_impl` tests with `#ifdef _LIBCPP_VERSION` and `LIBCPP_STATIC_ASSERT`.
* Change `_LIBCPP_CONSTEXPR_SINCE_CXX20` to `TEST_CONSTEXPR_CXX20`.
+ Other functions in `variant.swap/swap.pass.cpp` were already using the proper test macro.
* Mark `what` as `[[maybe_unused]]` when used by `TEST_LIBCPP_REQUIRE`.
+ This updates one occurrence in `libcxx/test/libcxx` for consistency.
* Windows `_putenv_s()` takes 2 arguments, not 3.
+ See MSVC documentation: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/putenv-s-wputenv-s?view=msvc-170
+ POSIX `setenv()` takes `int overwrite`, but Windows `_putenv_s()` always overwrites.
* Avoid non-Standard zero-length arrays.
+ Followup to #74183 and #79792.
* Add `operator++()` to `unsized_it`.
+ The Standard requires this due to [N4981][] [move.iter.requirements]/1 "The template parameter `Iterator` shall
either meet the *Cpp17InputIterator* requirements ([input.iterators])
or model `input_iterator` ([iterator.concept.input])."
+ MSVC's STL requires this because it has a strengthened exception
specification in `move_iterator` that inspects the underlying iterator's
increment operator.
* `uniform_int_distribution` forbids `int8_t`/`uint8_t`.
+ See [N4981][] [rand.req.genl]/1.5. MSVC's STL enforces this.
+ Note that when changing the distribution's `IntType`, we need to be
careful to preserve the original value range of `[0, max_input]`.
* fstreams are constructible from `const fs::path::value_type*` on wide systems.
+ See [ifstream.cons], [ofstream.cons], [fstream.cons].
* In `msvc_stdlib_force_include.h`, map `_HAS_CXX23` to `TEST_STD_VER` 23 instead of 99.
+ On 2023-05-23,
|
||
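The `uniform_int_distribution` item in the list above can be illustrated with a small sketch: draw from a wider integer type while keeping the original value range `[0, max_input]`, then narrow the result. The function name `random_byte` and the choice of `unsigned` as the wider type are assumptions for this example.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: produces a value in [0, max_input] without
// instantiating uniform_int_distribution with an 8-bit IntType.
std::uint8_t random_byte(std::mt19937& gen, std::uint8_t max_input) {
    // The bounds stay exactly [0, max_input], so the narrowing cast is lossless.
    std::uniform_int_distribution<unsigned> dist(0u, max_input);
    return static_cast<std::uint8_t>(dist(gen));
}
```

Keeping the bounds unchanged matters: widening the type without also fixing the upper bound would silently change which inputs the test exercises.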
|
|
bd3f5a4bd3 |
[libc++][pstl] Improve exception handling (#88998)
There were various places where we incorrectly handled exceptions in the PSTL. Typical issues were missing `noexcept` and taking iterators by value instead of by reference. This patch fixes those inconsistent and incorrect instances, and adds proper tests for all of those. Note that the previous tests were often incorrectly turned into no-ops by the compiler due to copy elision, which doesn't happen with these new tests. |
||
|
|
05cc2d5fe1 | [libc++] Vectorize std::mismatch with trivially equality comparable types (#87716) | ||
|
|
f0ea888e01 |
[libcxx] applies changes regarding post-commit feedback to #75259 (#76534)
Some of the feedback was also relevant to other files, and has been applied there too. |
||
|
|
04dbf7ad44 |
[libc++][ranges] Avoid using distance in ranges::contains_subrange (#87155)
Both `std::distance` and `ranges::distance` are inefficient for non-sized ranges. Also, calculating the range length using the `int` type is seriously problematic. This patch avoids using `distance` and avoids calculating the length of non-sized ranges. Fixes #86833. |
||
|
|
985c1a44f8 |
[libc++] Optimize the two range overload of mismatch (#86853)
```
-----------------------------------------------------------------------------
Benchmark                                       old          new
-----------------------------------------------------------------------------
bm_mismatch_two_range_overload<char>/1          0.941 ns     1.88 ns
bm_mismatch_two_range_overload<char>/2          1.43 ns      2.15 ns
bm_mismatch_two_range_overload<char>/3          1.95 ns      2.55 ns
bm_mismatch_two_range_overload<char>/4          2.58 ns      2.90 ns
bm_mismatch_two_range_overload<char>/5          3.75 ns      3.31 ns
bm_mismatch_two_range_overload<char>/6          5.00 ns      3.83 ns
bm_mismatch_two_range_overload<char>/7          5.59 ns      4.35 ns
bm_mismatch_two_range_overload<char>/8          6.37 ns      4.84 ns
bm_mismatch_two_range_overload<char>/16         11.8 ns      6.72 ns
bm_mismatch_two_range_overload<char>/64         45.5 ns      2.59 ns
bm_mismatch_two_range_overload<char>/512        366 ns       12.6 ns
bm_mismatch_two_range_overload<char>/4096       2890 ns      91.6 ns
bm_mismatch_two_range_overload<char>/32768      23038 ns     758 ns
bm_mismatch_two_range_overload<char>/262144     142813 ns    6573 ns
bm_mismatch_two_range_overload<char>/1048576    366679 ns    26710 ns
bm_mismatch_two_range_overload<short>/1         0.934 ns     1.88 ns
bm_mismatch_two_range_overload<short>/2         1.30 ns      2.58 ns
bm_mismatch_two_range_overload<short>/3         1.76 ns      3.28 ns
bm_mismatch_two_range_overload<short>/4         2.24 ns      3.98 ns
bm_mismatch_two_range_overload<short>/5         2.80 ns      4.92 ns
bm_mismatch_two_range_overload<short>/6         3.58 ns      6.01 ns
bm_mismatch_two_range_overload<short>/7         4.29 ns      7.03 ns
bm_mismatch_two_range_overload<short>/8         4.67 ns      7.39 ns
bm_mismatch_two_range_overload<short>/16        9.86 ns      13.1 ns
bm_mismatch_two_range_overload<short>/64        38.9 ns      4.55 ns
bm_mismatch_two_range_overload<short>/512       348 ns       27.7 ns
bm_mismatch_two_range_overload<short>/4096      2881 ns      225 ns
bm_mismatch_two_range_overload<short>/32768     23111 ns     1715 ns
bm_mismatch_two_range_overload<short>/262144    184846 ns    14416 ns
bm_mismatch_two_range_overload<short>/1048576   742885 ns    57264 ns
bm_mismatch_two_range_overload<int>/1           0.838 ns     1.19 ns
bm_mismatch_two_range_overload<int>/2           1.19 ns      1.65 ns
bm_mismatch_two_range_overload<int>/3           1.83 ns      2.06 ns
bm_mismatch_two_range_overload<int>/4           2.38 ns      2.42 ns
bm_mismatch_two_range_overload<int>/5           3.60 ns      2.47 ns
bm_mismatch_two_range_overload<int>/6           3.68 ns      3.05 ns
bm_mismatch_two_range_overload<int>/7           4.32 ns      3.36 ns
bm_mismatch_two_range_overload<int>/8           5.18 ns      3.58 ns
bm_mismatch_two_range_overload<int>/16          10.6 ns      2.84 ns
bm_mismatch_two_range_overload<int>/64          39.0 ns      7.78 ns
bm_mismatch_two_range_overload<int>/512         247 ns       53.9 ns
bm_mismatch_two_range_overload<int>/4096        1927 ns      429 ns
bm_mismatch_two_range_overload<int>/32768       15569 ns     3393 ns
bm_mismatch_two_range_overload<int>/262144      125413 ns    28504 ns
bm_mismatch_two_range_overload<int>/1048576     504549 ns    112729 ns
```
|
||
|
|
beaff78528 |
[libc++] Optimize the std::mismatch tail (#83440)
This adds vectorization to the last 0-3 vectors and, if the range is large enough, the remaining elements that don't fill a vector completely.

```
-----------------------------------------------------------------------
Benchmark                    old         full vectors   partial vector
-----------------------------------------------------------------------
bm_mismatch<char>/1          1.40 ns     1.62 ns        2.09 ns
bm_mismatch<char>/2          1.88 ns     2.10 ns        2.33 ns
bm_mismatch<char>/3          2.67 ns     2.56 ns        2.72 ns
bm_mismatch<char>/4          3.01 ns     3.20 ns        3.70 ns
bm_mismatch<char>/5          3.51 ns     3.73 ns        3.64 ns
bm_mismatch<char>/6          4.71 ns     4.85 ns        4.37 ns
bm_mismatch<char>/7          5.12 ns     5.33 ns        4.37 ns
bm_mismatch<char>/8          5.79 ns     6.02 ns        4.75 ns
bm_mismatch<char>/15         9.20 ns     10.5 ns        7.23 ns
bm_mismatch<char>/16         10.2 ns     10.1 ns        7.46 ns
bm_mismatch<char>/17         10.2 ns     10.8 ns        7.57 ns
bm_mismatch<char>/31         17.6 ns     17.1 ns        10.8 ns
bm_mismatch<char>/32         17.4 ns     1.64 ns        1.64 ns
bm_mismatch<char>/33         23.3 ns     2.10 ns        2.33 ns
bm_mismatch<char>/63         31.8 ns     16.9 ns        2.33 ns
bm_mismatch<char>/64         32.6 ns     2.10 ns        2.10 ns
bm_mismatch<char>/65         33.6 ns     2.57 ns        2.80 ns
bm_mismatch<char>/127        67.3 ns     18.1 ns        3.27 ns
bm_mismatch<char>/128        2.17 ns     2.14 ns        2.57 ns
bm_mismatch<char>/129        2.36 ns     2.80 ns        3.27 ns
bm_mismatch<char>/255        67.5 ns     19.6 ns        4.68 ns
bm_mismatch<char>/256        3.76 ns     3.71 ns        3.97 ns
bm_mismatch<char>/257        3.77 ns     4.04 ns        4.43 ns
bm_mismatch<char>/511        70.8 ns     22.1 ns        7.47 ns
bm_mismatch<char>/512        7.27 ns     7.30 ns        6.95 ns
bm_mismatch<char>/513        7.11 ns     7.05 ns        6.96 ns
bm_mismatch<char>/1023       75.9 ns     27.4 ns        13.3 ns
bm_mismatch<char>/1024       13.9 ns     13.8 ns        12.4 ns
bm_mismatch<char>/1025       13.6 ns     13.6 ns        12.8 ns
bm_mismatch<char>/2047       87.3 ns     37.5 ns        25.4 ns
bm_mismatch<char>/2048       26.8 ns     27.4 ns        24.0 ns
bm_mismatch<char>/2049       26.7 ns     27.3 ns        25.5 ns
bm_mismatch<char>/4095       112 ns      64.7 ns        48.7 ns
bm_mismatch<char>/4096       53.0 ns     54.2 ns        46.8 ns
bm_mismatch<char>/4097       52.7 ns     54.2 ns        48.4 ns
bm_mismatch<char>/8191       160 ns      118 ns         98.4 ns
bm_mismatch<char>/8192       107 ns      108 ns         96.0 ns
bm_mismatch<char>/8193       106 ns      108 ns         97.2 ns
bm_mismatch<char>/16383      283 ns      234 ns         215 ns
bm_mismatch<char>/16384      227 ns      223 ns         217 ns
bm_mismatch<char>/16385      221 ns      221 ns         215 ns
bm_mismatch<char>/32767      547 ns      499 ns         488 ns
bm_mismatch<char>/32768      495 ns      492 ns         492 ns
bm_mismatch<char>/32769      491 ns      489 ns         488 ns
bm_mismatch<char>/65535      1028 ns     979 ns         971 ns
bm_mismatch<char>/65536      976 ns      970 ns         974 ns
bm_mismatch<char>/65537      970 ns      965 ns         971 ns
bm_mismatch<char>/131071     2031 ns     1948 ns        2005 ns
bm_mismatch<char>/131072     1973 ns     1955 ns        1974 ns
bm_mismatch<char>/131073     1989 ns     1932 ns        2001 ns
bm_mismatch<char>/262143     4469 ns     4244 ns        4223 ns
bm_mismatch<char>/262144     4443 ns     4183 ns        4243 ns
bm_mismatch<char>/262145     4400 ns     4232 ns        4246 ns
bm_mismatch<char>/524287     10169 ns    9733 ns        9592 ns
bm_mismatch<char>/524288     10154 ns    9664 ns        9843 ns
bm_mismatch<char>/524289     10113 ns    9641 ns        10003 ns
bm_mismatch<short>/1         1.86 ns     2.53 ns        2.32 ns
bm_mismatch<short>/2         2.57 ns     2.77 ns        2.55 ns
bm_mismatch<short>/3         3.26 ns     3.00 ns        2.79 ns
bm_mismatch<short>/4         3.95 ns     3.39 ns        3.15 ns
bm_mismatch<short>/5         4.83 ns     3.97 ns        3.72 ns
bm_mismatch<short>/6         5.43 ns     4.34 ns        4.03 ns
bm_mismatch<short>/7         6.11 ns     4.73 ns        4.44 ns
bm_mismatch<short>/8         6.84 ns     5.02 ns        4.79 ns
bm_mismatch<short>/15        11.5 ns     7.12 ns        6.50 ns
bm_mismatch<short>/16        13.9 ns     1.87 ns        2.11 ns
bm_mismatch<short>/17        14.0 ns     3.00 ns        2.47 ns
bm_mismatch<short>/31        23.1 ns     7.87 ns        2.47 ns
bm_mismatch<short>/32        23.8 ns     2.57 ns        2.81 ns
bm_mismatch<short>/33        24.5 ns     3.70 ns        2.94 ns
bm_mismatch<short>/63        44.8 ns     9.37 ns        3.46 ns
bm_mismatch<short>/64        2.32 ns     2.57 ns        2.64 ns
bm_mismatch<short>/65        2.52 ns     3.02 ns        3.51 ns
bm_mismatch<short>/127       45.6 ns     9.97 ns        5.18 ns
bm_mismatch<short>/128       3.85 ns     3.93 ns        3.94 ns
bm_mismatch<short>/129       3.82 ns     4.20 ns        4.70 ns
bm_mismatch<short>/255       50.4 ns     12.6 ns        8.07 ns
bm_mismatch<short>/256       7.23 ns     6.91 ns        6.98 ns
bm_mismatch<short>/257       7.24 ns     7.19 ns        7.55 ns
bm_mismatch<short>/511       52.3 ns     17.8 ns        14.0 ns
bm_mismatch<short>/512       13.6 ns     13.7 ns        13.6 ns
bm_mismatch<short>/513       13.9 ns     13.8 ns        18.5 ns
bm_mismatch<short>/1023      60.9 ns     30.9 ns        26.3 ns
bm_mismatch<short>/1024      26.7 ns     27.7 ns        25.7 ns
bm_mismatch<short>/1025      27.7 ns     27.6 ns        25.3 ns
bm_mismatch<short>/2047      88.4 ns     58.0 ns        51.6 ns
bm_mismatch<short>/2048      52.8 ns     55.3 ns        50.6 ns
bm_mismatch<short>/2049      55.2 ns     54.8 ns        48.7 ns
bm_mismatch<short>/4095      153 ns      113 ns         102 ns
bm_mismatch<short>/4096      105 ns      110 ns         101 ns
bm_mismatch<short>/4097      110 ns      110 ns         99.1 ns
bm_mismatch<short>/8191      277 ns      219 ns         206 ns
bm_mismatch<short>/8192      226 ns      214 ns         250 ns
bm_mismatch<short>/8193      226 ns      207 ns         208 ns
bm_mismatch<short>/16383     519 ns      492 ns         488 ns
bm_mismatch<short>/16384     494 ns      492 ns         492 ns
bm_mismatch<short>/16385     492 ns      488 ns         489 ns
bm_mismatch<short>/32767     1007 ns     968 ns         964 ns
bm_mismatch<short>/32768     977 ns      972 ns         970 ns
bm_mismatch<short>/32769     972 ns      962 ns         967 ns
bm_mismatch<short>/65535     1978 ns     1918 ns        1956 ns
bm_mismatch<short>/65536     1940 ns     1927 ns        1970 ns
bm_mismatch<short>/65537     1937 ns     1922 ns        1959 ns
bm_mismatch<short>/131071    4524 ns     4193 ns        4304 ns
bm_mismatch<short>/131072    4445 ns     4196 ns        4306 ns
bm_mismatch<short>/131073    4452 ns     4278 ns        4311 ns
bm_mismatch<short>/262143    9801 ns     10188 ns       9634 ns
bm_mismatch<short>/262144    9738 ns     10151 ns       9651 ns
bm_mismatch<short>/262145    9716 ns     10171 ns       9715 ns
bm_mismatch<short>/524287    19944 ns    20718 ns       20044 ns
bm_mismatch<short>/524288    21139 ns    20647 ns       20008 ns
bm_mismatch<short>/524289    21162 ns    19512 ns       20068 ns
bm_mismatch<int>/1           1.40 ns     1.84 ns        1.87 ns
bm_mismatch<int>/2           1.87 ns     2.08 ns        2.09 ns
bm_mismatch<int>/3           2.36 ns     2.31 ns        2.87 ns
bm_mismatch<int>/4           3.06 ns     2.72 ns        2.95 ns
bm_mismatch<int>/5           3.66 ns     3.37 ns        3.42 ns
bm_mismatch<int>/6           4.55 ns     3.65 ns        3.73 ns
bm_mismatch<int>/7           5.03 ns     3.93 ns        3.94 ns
bm_mismatch<int>/8           5.67 ns     1.86 ns        1.87 ns
bm_mismatch<int>/15          9.89 ns     4.41 ns        2.34 ns
bm_mismatch<int>/16          10.1 ns     2.33 ns        2.34 ns
bm_mismatch<int>/17          10.2 ns     3.34 ns        2.86 ns
bm_mismatch<int>/31          17.2 ns     5.54 ns        3.28 ns
bm_mismatch<int>/32          2.16 ns     2.15 ns        2.58 ns
bm_mismatch<int>/33          2.36 ns     3.01 ns        3.28 ns
bm_mismatch<int>/63          17.7 ns     6.50 ns        4.93 ns
bm_mismatch<int>/64          3.81 ns     3.58 ns        3.90 ns
bm_mismatch<int>/65          3.74 ns     4.36 ns        4.45 ns
bm_mismatch<int>/127         19.5 ns     9.56 ns        7.74 ns
bm_mismatch<int>/128         7.30 ns     6.41 ns        6.85 ns
bm_mismatch<int>/129         7.09 ns     7.04 ns        7.06 ns
bm_mismatch<int>/255         24.7 ns     14.8 ns        13.3 ns
bm_mismatch<int>/256         14.0 ns     12.1 ns        12.3 ns
bm_mismatch<int>/257         13.8 ns     12.7 ns        12.8 ns
bm_mismatch<int>/511         34.3 ns     26.3 ns        24.8 ns
bm_mismatch<int>/512         27.6 ns     23.6 ns        23.9 ns
bm_mismatch<int>/513         27.3 ns     24.4 ns        25.1 ns
bm_mismatch<int>/1023        62.5 ns     50.9 ns        48.3 ns
bm_mismatch<int>/1024        54.4 ns     46.1 ns        46.6 ns
bm_mismatch<int>/1025        54.2 ns     48.4 ns        47.5 ns
bm_mismatch<int>/2047        116 ns      97.8 ns        94.1 ns
bm_mismatch<int>/2048        108 ns      92.6 ns        92.4 ns
bm_mismatch<int>/2049        108 ns      104 ns         94.0 ns
bm_mismatch<int>/4095        233 ns      222 ns         205 ns
bm_mismatch<int>/4096        226 ns      223 ns         225 ns
bm_mismatch<int>/4097        221 ns      219 ns         210 ns
bm_mismatch<int>/8191        499 ns      485 ns         488 ns
bm_mismatch<int>/8192        496 ns      490 ns         495 ns
bm_mismatch<int>/8193        491 ns      485 ns         488 ns
bm_mismatch<int>/16383       982 ns      962 ns         964 ns
bm_mismatch<int>/16384       974 ns      971 ns         971 ns
bm_mismatch<int>/16385       971 ns      961 ns         968 ns
bm_mismatch<int>/32767       2003 ns     1959 ns        1920 ns
bm_mismatch<int>/32768       1996 ns     1947 ns        1928 ns
bm_mismatch<int>/32769       1990 ns     1945 ns        1926 ns
bm_mismatch<int>/65535       4434 ns     4275 ns        4312 ns
bm_mismatch<int>/65536       4437 ns     4267 ns        4321 ns
bm_mismatch<int>/65537       4442 ns     4261 ns        4321 ns
bm_mismatch<int>/131071      9673 ns     9648 ns        9465 ns
bm_mismatch<int>/131072      9667 ns     9671 ns        9465 ns
bm_mismatch<int>/131073      9661 ns     9653 ns        9464 ns
bm_mismatch<int>/262143      20595 ns    19605 ns       19064 ns
bm_mismatch<int>/262144      19894 ns    19572 ns       19009 ns
bm_mismatch<int>/262145      19851 ns    19656 ns       18999 ns
bm_mismatch<int>/524287      39556 ns    39364 ns       38131 ns
bm_mismatch<int>/524288      39678 ns    39573 ns       38183 ns
bm_mismatch<int>/524289      40168 ns    39301 ns       38121 ns
```
|
||
|
|
b68e2eba0b |
[libc++] Vectorize mismatch (#73255)
```
---------------------------------------------------
Benchmark                     old          new
---------------------------------------------------
bm_mismatch<char>/1           0.835 ns     2.37 ns
bm_mismatch<char>/2           1.44 ns      2.60 ns
bm_mismatch<char>/3           2.06 ns      2.83 ns
bm_mismatch<char>/4           2.60 ns      3.29 ns
bm_mismatch<char>/5           3.15 ns      3.77 ns
bm_mismatch<char>/6           3.82 ns      4.17 ns
bm_mismatch<char>/7           4.29 ns      4.52 ns
bm_mismatch<char>/8           4.78 ns      4.86 ns
bm_mismatch<char>/16          9.06 ns      7.54 ns
bm_mismatch<char>/64          31.7 ns      19.1 ns
bm_mismatch<char>/512         249 ns       8.16 ns
bm_mismatch<char>/4096        1956 ns      44.2 ns
bm_mismatch<char>/32768       15498 ns     501 ns
bm_mismatch<char>/262144      123965 ns    4479 ns
bm_mismatch<char>/1048576     495668 ns    21306 ns
bm_mismatch<short>/1          0.710 ns     2.12 ns
bm_mismatch<short>/2          1.03 ns      2.66 ns
bm_mismatch<short>/3          1.29 ns      3.56 ns
bm_mismatch<short>/4          1.68 ns      4.29 ns
bm_mismatch<short>/5          1.96 ns      5.18 ns
bm_mismatch<short>/6          2.59 ns      5.91 ns
bm_mismatch<short>/7          2.86 ns      6.63 ns
bm_mismatch<short>/8          3.19 ns      7.33 ns
bm_mismatch<short>/16         5.48 ns      13.0 ns
bm_mismatch<short>/64         16.6 ns      4.06 ns
bm_mismatch<short>/512        130 ns       13.8 ns
bm_mismatch<short>/4096       985 ns       93.8 ns
bm_mismatch<short>/32768      7846 ns      1002 ns
bm_mismatch<short>/262144     63217 ns     10637 ns
bm_mismatch<short>/1048576    251782 ns    42471 ns
bm_mismatch<int>/1            0.716 ns     1.91 ns
bm_mismatch<int>/2            1.21 ns      2.49 ns
bm_mismatch<int>/3            1.38 ns      3.46 ns
bm_mismatch<int>/4            1.71 ns      4.04 ns
bm_mismatch<int>/5            2.00 ns      4.98 ns
bm_mismatch<int>/6            2.43 ns      5.67 ns
bm_mismatch<int>/7            3.05 ns      6.38 ns
bm_mismatch<int>/8            3.22 ns      7.09 ns
bm_mismatch<int>/16           5.18 ns      12.8 ns
bm_mismatch<int>/64           16.6 ns      5.28 ns
bm_mismatch<int>/512          129 ns       25.2 ns
bm_mismatch<int>/4096         1009 ns      201 ns
bm_mismatch<int>/32768        7776 ns      2144 ns
bm_mismatch<int>/262144       62371 ns     20551 ns
bm_mismatch<int>/1048576      254750 ns    90097 ns
```
|
||
|
|
07b18c5e1b |
[libc++] Optimize ranges::fill{,_n} for vector<bool>::iterator (#84642)
```
------------------------------------------------------
Benchmark                            old            new
------------------------------------------------------
bm_ranges_fill_n/1             1.64 ns         3.06 ns
bm_ranges_fill_n/2             3.45 ns         3.06 ns
bm_ranges_fill_n/3             4.88 ns         3.06 ns
bm_ranges_fill_n/4             6.46 ns         3.06 ns
bm_ranges_fill_n/5             8.03 ns         3.06 ns
bm_ranges_fill_n/6             9.65 ns         3.07 ns
bm_ranges_fill_n/7             11.5 ns         3.06 ns
bm_ranges_fill_n/8             13.0 ns         3.06 ns
bm_ranges_fill_n/16            25.9 ns         3.06 ns
bm_ranges_fill_n/64             103 ns         4.62 ns
bm_ranges_fill_n/512            711 ns         4.40 ns
bm_ranges_fill_n/4096          5642 ns         9.86 ns
bm_ranges_fill_n/32768        45135 ns         33.6 ns
bm_ranges_fill_n/262144      360818 ns          243 ns
bm_ranges_fill_n/1048576    1442828 ns          982 ns
bm_ranges_fill/1               1.63 ns         3.17 ns
bm_ranges_fill/2               3.43 ns         3.28 ns
bm_ranges_fill/3               4.97 ns         3.31 ns
bm_ranges_fill/4               6.53 ns         3.27 ns
bm_ranges_fill/5               8.12 ns         3.33 ns
bm_ranges_fill/6               9.76 ns         3.32 ns
bm_ranges_fill/7               11.6 ns         3.29 ns
bm_ranges_fill/8               13.2 ns         3.26 ns
bm_ranges_fill/16              26.3 ns         3.26 ns
bm_ranges_fill/64               104 ns         4.92 ns
bm_ranges_fill/512              716 ns         4.47 ns
bm_ranges_fill/4096            5772 ns         8.21 ns
bm_ranges_fill/32768          45778 ns         33.1 ns
bm_ranges_fill/262144        351422 ns          241 ns
bm_ranges_fill/1048576      1404710 ns          965 ns
```
|
||
|
|
a6b846ae1e | [libc++][ranges] Implement ranges::contains_subrange (#66963) | ||
|
|
ef83894810 |
[libc++][test] Fix zero-length arrays and copy-pasted lambdas in ranges.contains.pass.cpp (#79792)
* Fix MSVC error C2466: cannot allocate an array of constant size 0
+ MSVC rejects this non-Standard extension. Previous fixes: #74183
* Fix MSVC warning C4805: `'=='`: unsafe mix of type `'int'` and type `'const bool'` in operation
+ AFAICT, these lambdas were copy-pasted, and didn't intend to take and return `int` here. This part of the test is using `vector<bool>` for random-access but non-contiguous iterators, and it's checking how many times the projection is invoked, but the projection doesn't need to do anything squirrely, it should otherwise be an identity.
* Fix typos: "continuous" => "contiguous".
|
||
|
|
c9535d7b61 |
[libc++][test] Silence MSVC warnings (#79791)
* `libcxx/test/std/algorithms/alg.nonmodifying/alg.find/find.pass.cpp`
emits a bunch of warnings, all caused by what appears to be intentional
code:
+ Silence MSVC warning C4245: conversion from `'int'` to `'wchar_t'`,
signed/unsigned mismatch
- Caused by: `test<U>(0, -1);`
+ Silence MSVC warning C4305: 'argument': truncation from `'int'` to
`'bool'`
- Caused by: `test<U>(0, -1);`
+ Silence MSVC warning C4310: cast truncates constant value
- Caused by: `test<U>(T(-129), U(-129));`
+ Silence MSVC warning C4805: `'=='`: unsafe mix of type `'char'` and
type `'bool'` in operation
- Caused by: `bool expect_match = val == to_find;`
*
`libcxx/test/std/algorithms/alg.nonmodifying/alg.fold/left_folds.pass.cpp`
+ Silence MSVC warning C4244: 'argument': conversion from `'double'` to
`'const int'`, possible loss of data
- Caused by `[](int const x, double const y) { return x + y; }`
deliberately being given `double`s to truncate.
*
`libcxx/test/std/numerics/numeric.ops/numeric.ops.midpoint/midpoint.pointer.pass.cpp`
+ Silence MSVC warnings about C++20 deprecated `volatile`.
- Caused by: `runtime_test< volatile T>();`
|
||
|
|
ad01447d30 |
[libcxx] Fix typo in parallel for_each_n test (#78954)
This fixes a trivial copy and paste error where we forgot to change `for_each` to `for_each_n` |
||
|
|
8dfc67d672 |
[libc++][hardening] Rework how the assertion handler can be overridden. (#77883)
Previously there were two ways to override the verbose abort function which gets called when a hardening assertion is triggered:
- compile-time: define the `_LIBCPP_VERBOSE_ABORT` macro;
- link-time: provide a definition of `__libcpp_verbose_abort` function.
This patch adds a new configure-time approach: the vendor can provide a path to a custom header file which will get copied into the build by CMake and included by the library. The header must provide a definition of the `_LIBCPP_ASSERTION_HANDLER` macro which is what will get called should a hardening assertion fail. As of this patch, overriding `_LIBCPP_VERBOSE_ABORT` will still work, but the previous mechanisms will be effectively removed in a follow-up patch, making the configure-time mechanism the sole way of overriding the default handler.
Note that `_LIBCPP_ASSERTION_HANDLER` only gets invoked when a hardening assertion fails. It does not affect other cases where `_LIBCPP_VERBOSE_ABORT` is currently used (e.g. when an exception is thrown in the `-fno-exceptions` mode).
The library provides a default version of the custom header file that will get used if it's not overridden by the vendor. That allows us to always test the override mechanism and reduces the difference in configuration between the pristine version of the library and a platform-specific version.
|
||
|
|
b203d5320d |
[libc++] Optimize std::find if types are integral and have the same signedness (#70345)
Fixes #70238 |
||
|
|
3903438860 |
[libcxx] adds ranges::fold_left_with_iter and ranges::fold_left (#75259)
Notable things in this commit:
* refactors `__indirect_binary_left_foldable`, making it slightly different (but equivalent) to _`indirect-binary-left-foldable`_, which improves readability (a [patch to the Working Paper][patch] was made)
* omits `__cpo` namespace, since it is not required for implementing niebloids (a cleanup should happen in 2024)
* puts tests ensuring invocable robustness and dangling correctness inside the correctness testing to ensure that the algorithms' results are still correct
[patch]: https://github.com/cplusplus/draft/pull/6734
|
||
|
|
fdd089b500 |
[libc++] Implement ranges::contains (#65148)
Differential Revision: https://reviews.llvm.org/D159232
```
Running ./ranges_contains.libcxx.out
Run on (10 X 24.121 MHz CPU s)
CPU Caches:
  L1 Data 64 KiB (x10)
  L1 Instruction 128 KiB (x10)
  L2 Unified 4096 KiB (x5)
Load Average: 3.37, 6.77, 5.27
--------------------------------------------------------------------
Benchmark                          Time            CPU    Iterations
--------------------------------------------------------------------
bm_contains_char/16             1.88 ns        1.87 ns     371607095
bm_contains_char/256            7.48 ns        7.47 ns      93292285
bm_contains_char/4096           99.7 ns        99.6 ns       7013185
bm_contains_char/65536          1296 ns        1294 ns        540436
bm_contains_char/1048576       23887 ns       23860 ns         29302
bm_contains_char/16777216     389420 ns      389095 ns          1796
bm_contains_int/16              7.14 ns        7.14 ns      97776288
bm_contains_int/256             90.4 ns        90.3 ns       7558089
bm_contains_int/4096            1294 ns        1290 ns        543052
bm_contains_int/65536          20482 ns       20443 ns         34334
bm_contains_int/1048576       328817 ns      327965 ns          2147
bm_contains_int/16777216     5246279 ns     5239361 ns           133
bm_contains_bool/16             2.19 ns        2.19 ns     322565780
bm_contains_bool/256            3.42 ns        3.41 ns     205025467
bm_contains_bool/4096           22.1 ns        22.1 ns      31780479
bm_contains_bool/65536           333 ns         332 ns       2106606
bm_contains_bool/1048576        5126 ns        5119 ns        135901
bm_contains_bool/16777216      81656 ns       81574 ns          8569
```
---------
Co-authored-by: Nathan Gauër <brioche@google.com>
|
||
|
|
a35629cd8d |
[libc++] Remove assumptions that std::array::iterator is a raw pointer (#74624)
This patch removes assumptions that std::array's iterators are raw pointers in the source code and in our test suite. While this is true right now, this doesn't have to be true, and in the future we might want to enable bounded iterators in std::array, which would require this change. This is a pre-requisite for landing #74482 |
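A minimal sketch of the kind of adjustment this implies, not the patch itself. It assumes a hypothetical pointer-based helper `sum_range`; code that needs raw pointers asks the array for them through `data()` rather than passing `begin()` and `end()` and relying on the iterator type being a pointer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper that only accepts raw pointers.
inline int sum_range(const int* first, const int* last) {
  int total = 0;
  for (; first != last; ++first)
    total += *first;
  return total;
}

// Portable: data() always returns a pointer, whereas the iterator type of
// std::array is only required to be a contiguous iterator.
template <std::size_t N>
int sum_array(const std::array<int, N>& a) {
  // Non-portable alternative: sum_range(a.begin(), a.end());
  // that compiles only where the iterator happens to be const int*.
  return sum_range(a.data(), a.data() + a.size());
}

// Small usage example.
inline void sum_array_example() {
  std::array<int, 4> values = {{1, 2, 3, 4}};
  assert(sum_array(values) == 10);
}
```

The same idea applies in tests: wherever a raw pointer is required, `data()` (or, in C++20, `std::to_address`) states the intent without depending on the implementation's iterator type.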
||
|
|
f7407411a1 |
[libc++] Optimize std::find for segmented iterators (#67224)
```
--------------------------------------------------------------------------
Benchmark                                          old                 new
--------------------------------------------------------------------------
bm_find<std::deque<char>>/1                    6.06 ns             10.6 ns
bm_find<std::deque<char>>/2                    15.5 ns             10.6 ns
bm_find<std::deque<char>>/3                    19.0 ns             10.6 ns
bm_find<std::deque<char>>/4                    20.8 ns             10.6 ns
bm_find<std::deque<char>>/5                    22.0 ns             10.6 ns
bm_find<std::deque<char>>/6                    23.0 ns             10.5 ns
bm_find<std::deque<char>>/7                    24.8 ns             10.7 ns
bm_find<std::deque<char>>/8                    25.7 ns             10.6 ns
bm_find<std::deque<char>>/16                   28.3 ns             10.6 ns
bm_find<std::deque<char>>/64                   44.2 ns             27.0 ns
bm_find<std::deque<char>>/512                   133 ns             37.6 ns
bm_find<std::deque<char>>/4096                  867 ns             53.1 ns
bm_find<std::deque<char>>/32768                6838 ns              160 ns
bm_find<std::deque<char>>/262144              52897 ns             1495 ns
bm_find<std::deque<char>>/1048576            215621 ns             6077 ns
bm_find<std::deque<short>>/1                   6.03 ns             6.28 ns
bm_find<std::deque<short>>/2                   15.8 ns             15.8 ns
bm_find<std::deque<short>>/3                   20.5 ns             20.3 ns
bm_find<std::deque<short>>/4                   21.0 ns             21.0 ns
bm_find<std::deque<short>>/5                   23.0 ns             22.1 ns
bm_find<std::deque<short>>/6                   22.6 ns             23.0 ns
bm_find<std::deque<short>>/7                   23.4 ns             23.7 ns
bm_find<std::deque<short>>/8                   24.4 ns             24.9 ns
bm_find<std::deque<short>>/16                  26.6 ns             27.2 ns
bm_find<std::deque<short>>/64                  43.2 ns             40.9 ns
bm_find<std::deque<short>>/512                  124 ns             90.7 ns
bm_find<std::deque<short>>/4096                 845 ns              525 ns
bm_find<std::deque<short>>/32768               7273 ns             3194 ns
bm_find<std::deque<short>>/262144             53710 ns            24385 ns
bm_find<std::deque<short>>/1048576           216086 ns            96195 ns
bm_find<std::deque<int>>/1                     6.03 ns             10.3 ns
bm_find<std::deque<int>>/2                     15.6 ns             10.3 ns
bm_find<std::deque<int>>/3                     19.1 ns             10.3 ns
bm_find<std::deque<int>>/4                     22.3 ns             10.3 ns
bm_find<std::deque<int>>/5                     23.5 ns             10.4 ns
bm_find<std::deque<int>>/6                     23.1 ns             10.3 ns
bm_find<std::deque<int>>/7                     23.7 ns             10.2 ns
bm_find<std::deque<int>>/8                     24.5 ns             10.2 ns
bm_find<std::deque<int>>/16                    27.9 ns             26.6 ns
bm_find<std::deque<int>>/64                    42.6 ns             32.2 ns
bm_find<std::deque<int>>/512                    123 ns             43.0 ns
bm_find<std::deque<int>>/4096                   874 ns             93.5 ns
bm_find<std::deque<int>>/32768                 7031 ns              751 ns
bm_find<std::deque<int>>/262144               57723 ns             6169 ns
bm_find<std::deque<int>>/1048576             230867 ns            35851 ns
bm_ranges_find<std::deque<char>>/1             5.97 ns             10.6 ns
bm_ranges_find<std::deque<char>>/2             16.0 ns             10.5 ns
bm_ranges_find<std::deque<char>>/3             19.5 ns             10.5 ns
bm_ranges_find<std::deque<char>>/4             21.1 ns             10.6 ns
bm_ranges_find<std::deque<char>>/5             22.8 ns             10.5 ns
bm_ranges_find<std::deque<char>>/6             22.8 ns             10.6 ns
bm_ranges_find<std::deque<char>>/7             23.4 ns             10.8 ns
bm_ranges_find<std::deque<char>>/8             24.1 ns             10.5 ns
bm_ranges_find<std::deque<char>>/16            26.9 ns             10.6 ns
bm_ranges_find<std::deque<char>>/64            50.2 ns             27.2 ns
bm_ranges_find<std::deque<char>>/512            126 ns             38.3 ns
bm_ranges_find<std::deque<char>>/4096           868 ns             53.8 ns
bm_ranges_find<std::deque<char>>/32768         6695 ns              161 ns
bm_ranges_find<std::deque<char>>/262144       54411 ns             1497 ns
bm_ranges_find<std::deque<char>>/1048576     241699 ns             6042 ns
bm_ranges_find<std::deque<short>>/1            6.39 ns             6.31 ns
bm_ranges_find<std::deque<short>>/2            15.8 ns             15.9 ns
bm_ranges_find<std::deque<short>>/3            19.0 ns             19.8 ns
bm_ranges_find<std::deque<short>>/4            20.8 ns             20.9 ns
bm_ranges_find<std::deque<short>>/5            21.8 ns             22.1 ns
bm_ranges_find<std::deque<short>>/6            23.0 ns             23.0 ns
bm_ranges_find<std::deque<short>>/7            23.2 ns             23.9 ns
bm_ranges_find<std::deque<short>>/8            23.7 ns             24.4 ns
bm_ranges_find<std::deque<short>>/16           26.6 ns             26.8 ns
bm_ranges_find<std::deque<short>>/64           43.4 ns             39.7 ns
bm_ranges_find<std::deque<short>>/512           131 ns             90.5 ns
bm_ranges_find<std::deque<short>>/4096          851 ns              523 ns
bm_ranges_find<std::deque<short>>/32768        7370 ns             3166 ns
bm_ranges_find<std::deque<short>>/262144      60778 ns            24814 ns
bm_ranges_find<std::deque<short>>/1048576    229288 ns            99273 ns
bm_ranges_find<std::deque<int>>/1              6.43 ns             10.2 ns
bm_ranges_find<std::deque<int>>/2              16.6 ns             10.2 ns
bm_ranges_find<std::deque<int>>/3              19.6 ns             10.2 ns
bm_ranges_find<std::deque<int>>/4              21.0 ns             10.2 ns
bm_ranges_find<std::deque<int>>/5              21.9 ns             10.4 ns
bm_ranges_find<std::deque<int>>/6              22.7 ns             10.2 ns
bm_ranges_find<std::deque<int>>/7              23.9 ns             10.2 ns
bm_ranges_find<std::deque<int>>/8              23.8 ns             10.2 ns
bm_ranges_find<std::deque<int>>/16             27.2 ns             27.1 ns
bm_ranges_find<std::deque<int>>/64             42.4 ns             32.4 ns
bm_ranges_find<std::deque<int>>/512             122 ns             43.0 ns
bm_ranges_find<std::deque<int>>/4096            895 ns             93.7 ns
bm_ranges_find<std::deque<int>>/32768          6890 ns              756 ns
bm_ranges_find<std::deque<int>>/262144        54025 ns             6102 ns
bm_ranges_find<std::deque<int>>/1048576      221558 ns            32783 ns
```
|
||
|
|
64addd6521 |
[libc++][test] Enhance ADDITIONAL_COMPILE_FLAGS, use TEST_MEOW_DIAGNOSTIC_IGNORED sparingly (#75317)
This is the last PR that's needed (for now) to get libc++'s tests working with MSVC's STL. The ADDITIONAL_COMPILE_FLAGS machinery is very useful, but also very problematic for MSVC, as it doesn't understand most of Clang's compiler options. We've been dealing with this by simply marking anything that uses ADDITIONAL_COMPILE_FLAGS as FAIL or SKIPPED, but that creates significant gaps in test coverage. Fortunately, ADDITIONAL_COMPILE_FLAGS also supports "features", which can be slightly enhanced to send Clang-compatible and MSVC-compatible options to the right compilers. This patch adds the gcc-style-warnings and cl-style-warnings Lit features, and uses that to pass the appropriate warning flags to tests. It also uses TEST_MEOW_DIAGNOSTIC_IGNORED for a few local suppressions of MSVC warnings. |
||
|
|
6a66467499 |
[libc++] P2770R0: Stashing stashing iterators for proper flattening (#66033)
- Partially implements P2770R0 (http://wg21.link/p2770)
- Fixes https://wg21.link/LWG3698, https://wg21.link/LWG3700, and https://wg21.link/LWG3791
- join_with_view hasn't been done yet since this type isn't implemented yet
- Rename tuple test directory to match the standard (which changed in P2770R0)
- Rename join_view test directory to match the standard
|
||
|
|
b2cc4b994e |
[libc++][test] Fix more MSVC and Clang warnings (#74965)
Found while running libc++'s tests with MSVC's STL.
*
`libcxx/test/std/algorithms/alg.sorting/alg.heap.operations/sort.heap/ranges_sort_heap.pass.cpp`
+ Fix Clang `-Wunused-variable`, because `LIBCPP_ASSERT` expands to
nothing for MSVC's STL.
+ This is the same "always void-cast" change that #73437 applied to the
neighboring `complexity.pass.cpp`. I missed that
`ranges_sort_heap.pass.cpp` was also affected because we had disabled
this test.
*
`libcxx/test/std/input.output/file.streams/fstreams/ifstream.members/buffered_reads.pass.cpp`
*
`libcxx/test/std/input.output/file.streams/fstreams/ofstream.members/buffered_writes.pass.cpp`
+ Fix MSVC "warning C4244: '`=`': conversion from '`__int64`' to
'`_Ty`', possible loss of data".
+ This is a valid warning, possibly the best one that MSVC found in this
entire saga. We're accumulating a `std::vector<std::streamsize>` and
storing the result in `std::streamsize total_size` but we actually have
to start with `std::streamsize{0}` or we'll truncate.
*
`libcxx/test/std/input.output/filesystems/fs.enum/enum.path.format.pass.cpp`
+ Fix Clang `-Wunused-local-typedef` because the following usage is
libc++-only.
+ I'm just expanding it at the point of use, and using the dedicated
`LIBCPP_STATIC_ASSERT` to keep the line length down.
*
`libcxx/test/std/input.output/syncstream/syncbuf/syncstream.syncbuf.assign/swap.pass.cpp`
+ Fix MSVC "warning C4242: 'argument': conversion from '`int`' to
'`const _Elem`', possible loss of data".
+ This is a valid warning (possibly the second-best) as `sputc()`
returns `int_type`. If `sputc()` returns something unexpected, we want
to know, so we should separately say `expected.push_back(CharT('B'))`.
*
`libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.single/new.size_align_nothrow.pass.cpp`
*
`libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.single/new.size_nothrow.pass.cpp`
+ Fix MSVC "warning C6001: Using uninitialized memory '`x`'."
+ [N4964](https://wg21.link/N4964) \[new.delete.single\]/12:
> *Effects:* The deallocation functions
(\[basic.stc.dynamic.deallocation\]) called by a *delete-expression*
(\[expr.delete\]) to render the value of `ptr` invalid.
+ \[basic.stc.general\]/4:
> When the end of the duration of a region of storage is reached, the
values of all pointers representing the address of any part of that
region of storage become invalid pointer values (\[basic.compound\]).
Indirection through an invalid pointer value and passing an invalid
pointer value to a deallocation function have undefined behavior. Any
other use of an invalid pointer value has implementation-defined
behavior.
+ In certain configurations, after `delete x;` MSVC will consider `x` to
be radioactive (and in other configurations, it'll physically null out
`x` as a safety measure). We can copy it into `old_x` before deletion,
which the implementation finds acceptable.
*
`libcxx/test/std/ranges/range.adaptors/range.elements/general.pass.cpp`
*
`libcxx/test/std/ranges/range.adaptors/range.elements/iterator/deref.pass.cpp`
+ Fix MSVC "warning C4242: 'initializing': conversion from '`_Ty`' to
'`_Ty`', possible loss of data".
+ This was being emitted in `pair` and `tuple`'s perfect forwarding
constructors. Passing `short{1}` allows MSVC to see that no truncation
is happening.
*
`libcxx/test/std/ranges/range.adaptors/range.elements/iterator/member_types.compile.pass.cpp`
+ Fix MSVC "warning C4242: 'initializing': conversion from '`_Ty`' to
'`_Ty2`', possible loss of data".
+ Similarly, this was being emitted in `pair`'s perfect forwarding
constructor. After passing `short{1}`, I reduced repetition by relying
on CTAD. (I can undo that cleanup if it's stylistically undesirable.)
*
`libcxx/test/std/utilities/function.objects/refwrap/refwrap.const/type_conv_ctor.pass.cpp`
+ Fix MSVC "warning C4930: '`std::reference_wrapper<int> purr(void)`':
prototyped function not called (was a variable definition intended?)".
+ There's no reason for `purr()` to be locally declared (aside from
isolating it to a narrow scope, which has minimal benefits); it can be
declared like `meow()` above. 😸
*
`libcxx/test/std/utilities/memory/util.smartptr/util.smartptr.shared/util.smartptr.shared.create/make_shared_for_overwrite.pass.cpp`
*
`libcxx/test/std/utilities/smartptr/unique.ptr/unique.ptr.create/make_unique_for_overwrite.default_init.pass.cpp`
+ Fix MSVC static analysis warnings when replacing `operator new`:
```
warning C28196: The requirement that '(_Param_(1)>0)?(return!=0):(1)' is
not satisfied. (The expression does not evaluate to true.)
warning C6387: 'return' could be '0': this does not adhere to the
specification for the function 'new'.
warning C6011: Dereferencing NULL pointer 'reinterpret_cast<char
*>ptr+i'.
```
+ All we need is a null check, which appears in other `operator new`
replacements:
|
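For the `std::streamsize` accumulation item in the commit message above (warning C4244), a minimal sketch of why the initial value matters. `total_size` is a hypothetical name, not the test's actual code; the point is that `std::accumulate` takes its accumulator type from the initial value, so a plain `0` would make every partial sum an `int`.

```cpp
#include <cassert>
#include <ios>      // std::streamsize
#include <numeric>  // std::accumulate
#include <type_traits>
#include <vector>

// Hypothetical helper: sums stream sizes without narrowing.
inline std::streamsize total_size(const std::vector<std::streamsize>& sizes) {
  // With `0` as the initial value the accumulator would be int and each
  // addition would be converted back to int; std::streamsize{0} keeps the
  // accumulator as wide as the elements.
  return std::accumulate(sizes.begin(), sizes.end(), std::streamsize{0});
}

// The accumulator (and result) type follows the initial value.
static_assert(
    std::is_same<decltype(std::accumulate(static_cast<const std::streamsize*>(nullptr),
                                          static_cast<const std::streamsize*>(nullptr),
                                          std::streamsize{0})),
                 std::streamsize>::value,
    "accumulating from std::streamsize{0} yields std::streamsize");
```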
||
|
|
774295ca1d |
[libc++][test] Fix MSVC warnings with static_casts (#74962)
Found while running libc++'s tests with MSVC's STL.
* `libcxx/test/std/algorithms/alg.modifying.operations/alg.unique/ranges_unique_copy.pass.cpp`
+ Fix MSVC "warning C4389: '`==`': signed/unsigned mismatch".
+ This was x86-specific for me. The LHS is `int` and the RHS is `size_t`. We know the `array`'s size, so `static_cast<int>` is certainly safe, and this matches the following `numberOfProj` comparisons.
* `libcxx/test/std/containers/sequences/insert_range_sequence_containers.h`
+ Fix MSVC "warning C4267: 'argument': conversion from '`size_t`' to '`const int`', possible loss of data".
+ `test_case.index` is `size_t`:
|
||
|
|
bfdc562d0c |
[libc++] Fix copy-paste damage in ranges::rotate_copy and its test (#74544)
Found while running libc++'s tests with MSVC's STL.
`ranges::rotate_copy` takes `forward_iterator`s as this test's comment banner correctly depicts. However, this test had bogus assertions expecting that `ranges::rotate_copy` would be constrained away for not-quite-**bidi** iterators. @philnik777 confirmed that these were copy-paste relics from the `ranges::reverse_copy` test. I fixed this by replacing the assertions with the test types that aren't quite **forward** iterators/ranges.
Additionally, I noticed that the top-level `test()` function was missing coverage with the weakest possible `forward_iterator<int*>`. This revealed that the product code in `ranges_rotate_copy.h` was similarly damaged. In addition to fixing it by taking `forward_iterator` and `forward_range` as depicted in the Standard, this drops the inclusion of `<__iterator/reverse_iterator.h>` as this algorithm doesn't need `std::__reverse_range`.
|
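To illustrate the forward-iterator requirement, here is a small sketch that rotates a singly linked list, whose iterators are forward-only. It deliberately uses the classic `std::rotate_copy`, which has the same forward-iterator requirement, so that it compiles before C++20; in C++20 `std::ranges::rotate_copy` accepts the same four arguments. `rotated_copy` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <vector>

// Copies `in` into a vector, rotated left by `shift` positions.
// `shift` must not exceed the number of elements in `in`.
inline std::vector<int> rotated_copy(const std::forward_list<int>& in, std::ptrdiff_t shift) {
  std::vector<int> out;
  // std::forward_list only offers forward iterators: no operator--, no
  // random access. rotate_copy needs nothing more than that.
  auto middle = std::next(in.begin(), shift);
  std::rotate_copy(in.begin(), middle, in.end(), std::back_inserter(out));
  return out;
}
```

Calling `rotated_copy` on the list `1, 2, 3, 4, 5` with a shift of 2 copies `[middle, last)` first and then `[first, middle)`, giving `3, 4, 5, 1, 2`.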
||
|
|
f1db578f0d |
[libc++][test] Fix assumptions that std::array iterators are pointers (#74430)
Found while running libc++'s tests with MSVC's STL, where `std::array` iterators are never pointers.
Most of these changes are reasonably self-explanatory (the `std::array`s are right there, and the sometimes-slightly-wrapped raw pointer types are a short distance away). A couple of changes are less obvious:
In `libcxx/test/std/containers/from_range_helpers.h`, `wrap_input()` is called with `Iter` types that are constructible from raw pointers. It's also sometimes called with an `array` as the `input`, so the first overload was implicitly assuming that `array` iterators are pointers. We can fix this assumption by providing a dedicated overload for `array`, just like the one for `vector` immediately below. Finally, `from_range_helpers.h` should explicitly include both `<array>` and `<vector>`, even though they were apparently being dragged in already.
In `libcxx/test/std/containers/views/views.span/span.cons/iterator_sentinel.pass.cpp`, fix `throw_operator_minus`. The error was pretty complicated, caused by the concepts machinery noticing that `value_type` and `element_type` were inconsistent. In the template instantiation context, you can see the critical detail that `throw_operator_minus<std::_Array_iterator>` is being formed.
Fortunately, the fix is extremely simple. To produce `element_type` (which retains any cv-qualification, unlike `value_type`), we shouldn't attempt to `remove_pointer` with the iterator type `It`. Instead, we've already obtained the `reference` type, so we can `remove_reference_t`. (This is modern code, where we have access to the alias templates, so I saw no reason to use the older verbose form.)
|
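The `element_type` point can be sketched in isolation. The alias below, `element_type_t`, is a hypothetical name and not the test's code; it derives the element type from the iterator's `reference` type, which works whether or not the iterator is a raw pointer. It is written with `std::remove_reference<...>::type` instead of `remove_reference_t` only so the sketch also compiles as C++11.

```cpp
#include <array>
#include <cassert>
#include <iterator>
#include <type_traits>

// Hypothetical alias: element type (cv-qualification kept) taken from the
// iterator's reference type, rather than remove_pointer on the iterator.
template <class It>
using element_type_t =
    typename std::remove_reference<typename std::iterator_traits<It>::reference>::type;

// Raw pointers: const is preserved.
static_assert(std::is_same<element_type_t<const int*>, const int>::value,
              "const int* refers to const int");

// std::array iterators: same answer whether they are pointers or class types.
static_assert(std::is_same<element_type_t<std::array<int, 3>::const_iterator>, const int>::value,
              "const_iterator refers to const int");
static_assert(std::is_same<element_type_t<std::array<int, 3>::iterator>, int>::value,
              "iterator refers to int");
```

By contrast, `std::remove_pointer` leaves a class-type iterator unchanged, which is how a type such as `throw_operator_minus<std::_Array_iterator>` could end up with inconsistent `value_type` and `element_type`.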
||
|
|
164c204a19 |
[libc++][test] Fix simple warnings (#74186)
Found while running libc++'s tests with MSVC's STL. This fixes 3 kinds of warnings:
- Add void-casts to fix `-Wunused-variable` warnings.
- Avoid sign/truncation warnings in `ConvertibleToIntegral.h`.
- Add `TEST_STD_AT_LEAST_23_OR_RUNTIME_EVALUATED` to avoid mixing preprocessor and runtime tests.
- Cleanup: Add `TEST_STD_AT_LEAST_20_OR_RUNTIME_EVALUATED` for consistency.
|
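A sketch of the void-cast pattern from the first item, under stated assumptions: `MYLIB_ONLY_ASSERT` is a hypothetical stand-in for a library-specific assertion macro that expands to nothing when the tests run against a different standard library, and `count_positive` is an illustrative function rather than code from the test suite.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a library-specific assertion macro; here it is
// hard-wired to the "expands to nothing" configuration.
#define MYLIB_ONLY_ASSERT(expr) ((void)0)

inline int count_positive(const std::vector<int>& v) {
  const std::size_t size_before = v.size(); // only read inside the macro below
  int count = 0;
  for (int x : v) {
    if (x > 0)
      ++count;
  }
  // Without this cast, size_before has no remaining use once the macro
  // expands to nothing, and -Wunused-variable fires.
  (void)size_before;
  MYLIB_ONLY_ASSERT(v.size() == size_before);
  return count;
}
```

The cast is unconditional, so the code reads the same in both configurations.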
||
|
|
c000f754bf |
[libc++][test] Avoid non-Standard zero-length arrays (#74183)
Found while running libc++'s test suite with MSVC's STL, where we use
both MSVC's compiler and Clang/LLVM.
MSVC's compiler rejects the non-Standard extension of zero-length
arrays. For conformance, I'm changing these occurrences to
`std::array<int, 0>`.
Many of these files already had `#include <array>`; I'm adding it to the
rest.
I wanted to add `-Wzero-length-array` to
`libcxx/utils/libcxx/test/params.py` to prevent future occurrences, but
it complained about product code 😿 :
```
In file included from /home/runner/_work/llvm-project/llvm-project/libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/long.pass.cpp:18:
In file included from /home/runner/_work/llvm-project/llvm-project/build/generic-cxx03/include/c++/v1/istream:170:
In file included from /home/runner/_work/llvm-project/llvm-project/build/generic-cxx03/include/c++/v1/ostream:172:
In file included from /home/runner/_work/llvm-project/llvm-project/build/generic-cxx03/include/c++/v1/__system_error/error_code.h:18:
In file included from /home/runner/_work/llvm-project/llvm-project/build/generic-cxx03/include/c++/v1/__system_error/error_category.h:15:
/home/runner/_work/llvm-project/llvm-project/build/generic-cxx03/include/c++/v1/string:811:25: error: zero size arrays are an extension [-Werror,-Wzero-length-array]
811 | char __padding_[sizeof(value_type) - 1];
| ^~~~~~~~~~~~~~~~~~~~~~
/home/runner/_work/llvm-project/llvm-project/build/generic-cxx03/include/c++/v1/string:817:19: note: in instantiation of member class 'std::basic_string<char>::__short' requested here
817 | static_assert(sizeof(__short) == (sizeof(value_type) * (__min_cap + 1)), "__short has an unexpected size.");
| ^
/home/runner/_work/llvm-project/llvm-project/build/generic-cxx03/include/c++/v1/string:2069:5: note: in instantiation of template class 'std::basic_string<char>' requested here
2069 | _LIBCPP_STRING_V1_EXTERN_TEMPLATE_LIST(_LIBCPP_DECLARE, char)
| ^
/home/runner/_work/llvm-project/llvm-project/build/generic-cxx03/include/c++/v1/__string/extern_template_lists.h:31:60: note: expanded from macro '_LIBCPP_STRING_V1_EXTERN_TEMPLATE_LIST'
31 | _Func(_LIBCPP_EXPORTED_FROM_ABI basic_string<_CharType>& basic_string<_CharType>::replace(size_type, size_type, value_type const*, size_type)) \
| ^
```
I pushed a tiny commit to fix unrelated comment typos, in an attempt to
clear out spurious CI failures.
|
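A minimal sketch of the replacement described in the commit message above, assuming a hypothetical helper `empty_range_contains`. A declaration such as `int empty[0];` is a compiler extension that MSVC rejects, while `std::array<int, 0>` is a Standard type whose `begin()` equals its `end()`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical helper: searches an empty, fixed-size range.
inline bool empty_range_contains(int value) {
  // Standard-conforming spelling of "an array with no elements".
  std::array<int, 0> empty = {};
  return std::find(empty.begin(), empty.end(), value) != empty.end();
}

// size() is usable in constant expressions, like the bound of a built-in array.
static_assert(std::array<int, 0>().size() == 0, "zero-length std::array is empty");
```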
||
|
|
be811d1617 |
[libc++] Run picolibc tests with qemu
This patch actually runs the tests for picolibc behind an emulator, removing a few workarounds and increasing coverage. Differential Revision: https://reviews.llvm.org/D155521 |
||
|
|
ed27a4edb0 |
[libc++][PSTL] Implement std::equal (#72448)
Differential Revision: https://reviews.llvm.org/D157131 Co-authored-by: Louis Dionne <ldionne.2@gmail.com> |
||
|
|
2b7cca1ccf |
[libc++] Add missing REQUIRES for exception handling test
It otherwise fails on Windows. |
||
|
|
f5832bab6f |
[libc++][test] Cleanup typos and unnecessary semicolons (#73435)
I've structured this into a series of commits for even easier reviewing, if that helps. I could easily split this up into separate PRs if desired, but as this is low-risk with simple edits, I thought one PR would be easiest.
* Drop unnecessary semicolons after function definitions.
* Cleanup comment typos.
* Cleanup `static_assert` typos.
* Cleanup test code typos.
+ There should be no functional changes, assuming I've changed all occurrences.
* ~~Fix massive test code typos.~~
+ This was a real problem, but needed more surgery. I reverted those changes here, and @philnik777 is fixing this properly with #73444.
* clang-formatting as requested by the CI.
|