Commit Graph

220 Commits

Author SHA1 Message Date
lntue
d0a0de50f7 [libc] Fix implicit conversion warnings in tests. (#131362) 2025-03-14 14:24:11 -04:00
Vinay Deshmukh
257e483715 [libc] Add -Wno-sign-conversion & re-attempt -Wconversion (#129811)
Relates to
https://github.com/llvm/llvm-project/issues/119281#issuecomment-2699470459
2025-03-10 11:57:09 -04:00
Augie Fackler
da61b0ddc5 Revert "[libc] Enable -Wconversion for tests. (#127523)"
This reverts commit 1e6e845d49 because it
changed the 1st parameter of adjust() to be unsigned, but libc itself
calls adjust() with a negative argument in align_backward() in
op_generic.h.
2025-03-05 16:42:40 -05:00
Vinay Deshmukh
1e6e845d49 [libc] Enable -Wconversion for tests. (#127523)
Relates to: #119281
2025-03-04 10:24:35 -05:00
Nick Desaulniers
431ea2d076 [libc] move bcmp, bzero, bcopy, index, rindex, strcasecmp, strncasecmp to strings.h (#118899)
docgen relies on the convention that we have a file foo.cpp in
libc/src/\<header\>/. Because the above functions weren't in libc/src/strings/
but rather libc/src/string/, docgen could not find that we had implemented
these.

Rather than add special carve outs to docgen, let's fix up our sources for
these 7 functions to stick with the existing conventions the rest of the
codebase follows.

Link: #118860
Fixes: #118875
2024-12-10 08:58:45 -08:00
Michael Jones
a0c4f854ca [libc] Change ctype to be encoding independent (#110574)
The previous implementation of the ctype functions assumed ASCII.
This patch changes to a switch/case implementation that looks odd, but
actually is easier for the compiler to understand and optimize.
2024-12-03 12:36:04 -08:00
Job Henandez Lara
33bdb53d86 [libc] Remove the #include <stdlib.h> header (#114453) 2024-11-01 21:49:57 -07:00
George Burgess IV
50c44478fe [libc] fix behavior of strrchr(x, '\0') (#112620)
`strrchr("foo", '\0')` is defined to point to the end of `foo`, rather
than returning NULL. This wasn't caught by tests, since llvm-libc's
`ASSERT_STREQ(nullptr, "");` is not an assertion error.

While I'm here, refactor the test slightly to check for NULL more
specifically. I considered adding fancier `ASSERT`s (and changing the
semantics of `ASSERT_STREQ`), but opted for a more local fix by fair
dice roll.
2024-10-30 15:08:03 -07:00
George Burgess IV
b03c8c4fdd libc: strlcpy/strlcat shouldn't bzero the rest of buf (#114259)
When running Bionic's testsuite over llvm-libc, tests broke because
e.g.,

```
const char *str = "abc";
char buf[7]{"111111"};
strlcpy(buf, str, 7);
ASSERT_EQ(buf, {'1', '1', '1', '\0', '\0', '\0', '\0'});
```

On my machine (Debian w/ glibc and clang-16), a `printf` loop over `buf`
gets unrolled into a series of const `printf` at compile-time:
```
printf("%d\n", '1');
printf("%d\n", '1');
printf("%d\n", '1');
printf("%d\n", 0);
printf("%d\n", '1');
printf("%d\n", '1');
printf("%d\n", 0);
```

Seems best to match existing precedent here.
2024-10-30 12:28:32 -07:00
Joseph Huber
cbfcea1fc2 [libc] Temporarily disable strerror test on NVPTX
Summary:
This is failing on the NVPTX buildbot,
https://lab.llvm.org/buildbot/#/builders/69/builds/6997/. I cannot
reproduce it locally so I'm disabling it temporarily so the bot is
green.
2024-10-10 20:20:05 -05:00
Sirui Mu
ded080152a [libc] Add osutils for Windows and make libc and its tests build on Windows target (#104676)
This PR first adds osutils for Windows, and changes some libc code to
make libc and its tests build on the Windows target. It then temporarily
disables some libc tests that are currently problematic on Windows.

Specifically, the changes besides the addition of osutils include:

- Macro `LIBC_TYPES_HAS_FLOAT16` is disabled on Windows. `clang-cl`
generates calls to functions in `compiler-rt` to handle float16
arithmetic and these functions are currently not linked in on Windows.
- Macro `LIBC_TYPES_HAS_INT128` is disabled on Windows.
- The invocation to `::aligned_malloc` is changed to an invocation to
`::_aligned_malloc`.
- The following unit tests are temporarily disabled because they
currently fail on Windows:
  - `test.src.__support.big_int_test`
  - `test.src.__support.arg_list_test`
  - `test.src.fenv.getenv_and_setenv_test`
- Tests involving `__m128i`, `__m256i`, and `__m512i` in
`test.src.string.memory_utils.op_tests.cpp`
- `test_range_errors` in `libc/test/src/math/smoke/AddTest.h` and
`libc/test/src/math/smoke/SubTest.h`
2024-09-11 23:41:32 -04:00
Joseph Huber
f7cee44ef2 [libc] Add strerror and strerror_k to the GPU (#99083)
Summary:
The GPU ignores `errno` primarily, but targets want these functions to
be defined for certain C standard interfaces. This patch enables them
and makes the test function on non-Linux targets.
2024-07-16 16:17:01 -05:00
OverMighty
88f0dc48d6 [libc] Fix warnings emitted by GCC (#98751)
Fixes #98709.
2024-07-15 17:20:53 +02:00
Petr Hosek
5ff3ff33ff [libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98597)
This is a part of #97655.
2024-07-12 09:28:41 -07:00
Mehdi Amini
ce9035f5bd Revert "[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration" (#98593)
Reverts llvm/llvm-project#98075

bots are broken
2024-07-12 09:12:13 +02:00
Petr Hosek
3f30effe1b [libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98075)
This is a part of #97655.
2024-07-11 12:35:22 -07:00
Joseph Huber
398162ddbc [libc] Fix typo in test message 2024-05-15 07:12:54 -05:00
Jan Patrick Lehr
ccbf908b08 [libc] Fix GPU test build error (#92235)
This fixes a build error on the AMDGPU buildbot introduced in PR
https://github.com/llvm/llvm-project/pull/92172
2024-05-15 07:06:40 -05:00
Guillaume Chatelet
292b300c51 [libc][bug] Fix out of bound write in memcpy w/ software prefetching (#90591)
This patch adds tests for `memcpy` and `memset` making sure that we
don't access buffers out of bounds. It relies on POSIX `mmap` /
`mprotect` and works only when FULL_BUILD_MODE is disabled.

The bug showed up while enabling software prefetching.
`loop_and_tail_offset` is always running at least one iteration but in
some configurations loop unrolled prefetching was actually needing only
the tail operation and no loop iterations at all.
2024-05-14 13:55:24 +02:00
Guillaume Chatelet
07d7b9c255 [libc] Fix forward arm32 builtbot (#84794)
Introduced by https://github.com/llvm/llvm-project/pull/83441.
2024-03-11 18:17:18 +01:00
Guillaume Chatelet
a84e66a92d [libc] Provide LIBC_TYPES_HAS_INT64 (#83441)
Umbrella bug #83182
2024-03-09 09:43:07 +01:00
Schrodinger ZHU Yifan
57a337378f [libc][c23] add memset_explicit (#83577) 2024-03-07 14:57:35 -05:00
michaelrj-google
3eb1e6d8e9 [libc] Move libc_errno inside of LIBC_NAMESPACE (#80774)
Having libc_errno outside of the namespace causes versioning issues when
trying to link the tests against LLVM-libc. Most of this patch is just
moving libc_errno inside the namespace in tests. This isn't necessary in
the function implementations since those are already inside the
namespace.
2024-02-06 10:36:05 +01:00
Guillaume Chatelet
73874f7a39 [libc][NFC] Use specific ASSERT macros to test errno (#79573)
This patch provides specific test macros to deal with `errno`.
This will help abstract away the differences between unit test and integration/hermetic tests in #79319.
In one case we use `libc_errno` which is a struct, in the other case we deal directly with `errno`.
2024-01-26 15:18:22 +01:00
Guillaume Chatelet
9ca6e5bb86 [libc] Fix buggy AVX2 / AVX512 memcmp (#77081)
Fixes #77080.
2024-01-11 11:45:37 +01:00
Nick Desaulniers
7c6b4be615 [libc] fix msan failure in mempcpy_test (#75532)
Internal builds of the unittests with msan flagged mempcpy_test.

    ==6862==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x55e34d7d734a in length
llvm-project/libc/src/__support/CPP/string_view.h:41:11
#1 0x55e34d7d734a in string_view
llvm-project/libc/src/__support/CPP/string_view.h:71:24
#2 0x55e34d7d734a in
__llvm_libc_9999_0_0_git::testing::Test::testStrEq(char const*, char
const*, char const*, char const*,
__llvm_libc_9999_0_0_git::testing::internal::Location)
llvm-project/libc/test/UnitTest/LibcTest.cpp:284:13
#3 0x55e34d7d4e09 in LlvmLibcMempcpyTest_Simple::Run()
llvm-project/libc/test/src/string/mempcpy_test.cpp:20:3
#4 0x55e34d7d6dff in
__llvm_libc_9999_0_0_git::testing::Test::runTests(char const*)
llvm-project/libc/test/UnitTest/LibcTest.cpp:133:8
#5 0x55e34d7d86e0 in main
llvm-project/libc/test/UnitTest/LibcTestMain.cpp:21:10

SUMMARY: MemorySanitizer: use-of-uninitialized-value
llvm-project/libc/src/__support/CPP/string_view.h:41:11 in length

What's going on here is that mempcpy_test.cpp's Simple test is using
ASSERT_STREQ with a partially initialized char array. ASSERT_STREQ calls
Test::testStrEq which constructs a cpp:string_view. That constructor
calls the
private method cpp::string_view::length. When built with msan, the loop
is
transformed into multi-byte access, which then fails upon access.

I took a look at libc++'s __constexpr_strlen which just calls
__builtin_strlen(). Replacing the implementation of
cpp::string_view::length
with a call to __builtin_strlen() may still result in out of bounds
access when
the test is built with msan.

It's not safe to use ASSERT_STREQ with a partially initialized array.
Initialize the whole array so that the test passes.
2023-12-14 13:39:33 -08:00
Guillaume Chatelet
1d89478830 [reland][libc][NFC] Remove __support/bit.h and use __support/CPP/bit.h instead (#73939) (#74446)
Same as #73939 but also fix `libc/src/string/memory_utils/op_aarch64.h`
that was still using `deferred_static_assert`.
2023-12-05 11:35:13 +01:00
Guillaume Chatelet
de7fdc5b54 Revert "[libc][NFC] Remove __support/bit.h and use __support/CPP/bit.h instead" (#74444)
Reverts llvm/llvm-project#73939

This broke libc-aarch64-ubuntu build bot 
https://lab.llvm.org/buildbot/#/builders/138/builds/56186
2023-12-05 11:25:39 +01:00
Guillaume Chatelet
b140948850 [libc][NFC] Remove __support/bit.h and use __support/CPP/bit.h instead (#73939) 2023-12-05 11:21:07 +01:00
Joseph Huber
4cb6c1c7cb [libc] Enable missing memory tests on the GPU (#68111)
Summary:
There were a few tests that weren't enabled on the GPU. This is because
the logic caused them to be skipped as we don't use CPU featured on the
host. This also disables the logic making multiple versions of the
memory functions.
2023-10-06 08:27:36 -05:00
Siva Chandra
aecb58005c [libc][NFC] Remove an inappropriate -ffreestanding arg to memory_utils test. (#67435) 2023-09-26 08:04:08 -07:00
Guillaume Chatelet
b6bc9d72f6 [libc] Mass replace enclosing namespace (#67032)
This is step 4 of
https://discourse.llvm.org/t/rfc-customizable-namespace-to-allow-testing-the-libc-when-the-system-libc-is-also-llvms-libc/73079
2023-09-26 11:45:04 +02:00
Guillaume Chatelet
1c814c99aa [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in sub-sequent patches.

The code been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-30 13:00:58 +00:00
Guillaume Chatelet
bd1cba9f4f Revert D148717 "[libc] Improve memcmp latency and codegen"
Once integrated in our codebase the patch triggered a bunch of failing
tests. We do not yet understand where the bug is but we revert it to
move forward with integration.
This reverts commit 5e32765c15.
2023-06-21 12:37:14 +00:00
Guillaume Chatelet
9902fc8dad [libc] Enable custom logging in LibcTest
This patch mimics the behavior of Google Test and allow users to log custom messages after all flavors of ASSERT_ / EXPECT_.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152630
2023-06-14 13:37:50 +00:00
Guillaume Chatelet
bdb07c98c4 Revert D152630 "[libc] Enable custom logging in LibcTest"
Failing buildbot https://lab.llvm.org/buildbot/#/builders/73/builds/49707
This reverts commit 9a7b4c9348.
2023-06-14 10:31:49 +00:00
Guillaume Chatelet
9a7b4c9348 [libc] Enable custom logging in LibcTest
This patch mimics the behavior of Google Test and allow users to log custom messages after all flavors of ASSERT_ / EXPECT_.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152630
2023-06-14 10:26:18 +00:00
Guillaume Chatelet
2cfae7cdf4 [libc] Dispatch memmove to memcpy when buffers are disjoint
Most of the time `memmove` is called on buffers that are disjoint, in that case we can use `memcpy` which is faster.
The additional test is branchless on x86, aarch64 and RISCV with the zbb extension (bitmanip).
On x86 this patch adds a latency of 2 to 3 cycles.

Before
```
--------------------------------------------------------------------------------
Benchmark                      Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------
BM_Memmove/0/0_median       5.00 ns         5.00 ns           10 bytes_per_cycle=1.25477/s bytes_per_second=2.62933G/s items_per_second=199.87M/s __llvm_libc::memmove,memmove Google A
BM_Memmove/1/0_median       6.21 ns         6.21 ns           10 bytes_per_cycle=3.22173/s bytes_per_second=6.75106G/s items_per_second=160.955M/s __llvm_libc::memmove,memmove Google B
BM_Memmove/2/0_median       8.09 ns         8.09 ns           10 bytes_per_cycle=5.31462/s bytes_per_second=11.1366G/s items_per_second=123.603M/s __llvm_libc::memmove,memmove Google D
BM_Memmove/3/0_median       5.95 ns         5.95 ns           10 bytes_per_cycle=2.71865/s bytes_per_second=5.69687G/s items_per_second=167.967M/s __llvm_libc::memmove,memmove Google L
BM_Memmove/4/0_median       5.63 ns         5.63 ns           10 bytes_per_cycle=2.28294/s bytes_per_second=4.78383G/s items_per_second=177.615M/s __llvm_libc::memmove,memmove Google M
BM_Memmove/5/0_median       5.68 ns         5.68 ns           10 bytes_per_cycle=2.16798/s bytes_per_second=4.54295G/s items_per_second=176.015M/s __llvm_libc::memmove,memmove Google Q
BM_Memmove/6/0_median       7.46 ns         7.46 ns           10 bytes_per_cycle=3.97619/s bytes_per_second=8.332G/s items_per_second=134.044M/s __llvm_libc::memmove,memmove Google S
BM_Memmove/7/0_median       5.40 ns         5.40 ns           10 bytes_per_cycle=1.79695/s bytes_per_second=3.76546G/s items_per_second=185.211M/s __llvm_libc::memmove,memmove Google U
BM_Memmove/8/0_median       5.62 ns         5.62 ns           10 bytes_per_cycle=3.18747/s bytes_per_second=6.67927G/s items_per_second=177.983M/s __llvm_libc::memmove,memmove Google W
BM_Memmove/9/0_median        101 ns          101 ns           10 bytes_per_cycle=9.77359/s bytes_per_second=20.4803G/s items_per_second=9.9333M/s __llvm_libc::memmove,uniform 384 to 4096
```
After
```
BM_Memmove/0/0_median       3.57 ns         3.57 ns           10 bytes_per_cycle=1.71375/s bytes_per_second=3.59112G/s items_per_second=280.411M/s __llvm_libc::memmove,memmove Google A
BM_Memmove/1/0_median       4.52 ns         4.52 ns           10 bytes_per_cycle=4.47557/s bytes_per_second=9.37843G/s items_per_second=221.427M/s __llvm_libc::memmove,memmove Google B
BM_Memmove/2/0_median       5.70 ns         5.70 ns           10 bytes_per_cycle=7.37396/s bytes_per_second=15.4519G/s items_per_second=175.399M/s __llvm_libc::memmove,memmove Google D
BM_Memmove/3/0_median       4.47 ns         4.47 ns           10 bytes_per_cycle=3.4148/s bytes_per_second=7.15563G/s items_per_second=223.743M/s __llvm_libc::memmove,memmove Google L
BM_Memmove/4/0_median       4.53 ns         4.53 ns           10 bytes_per_cycle=2.86071/s bytes_per_second=5.99454G/s items_per_second=220.69M/s __llvm_libc::memmove,memmove Google M
BM_Memmove/5/0_median       4.19 ns         4.19 ns           10 bytes_per_cycle=2.5484/s bytes_per_second=5.3401G/s items_per_second=238.924M/s __llvm_libc::memmove,memmove Google Q
BM_Memmove/6/0_median       5.02 ns         5.02 ns           10 bytes_per_cycle=5.94164/s bytes_per_second=12.4505G/s items_per_second=199.14M/s __llvm_libc::memmove,memmove Google S
BM_Memmove/7/0_median       4.03 ns         4.03 ns           10 bytes_per_cycle=2.47028/s bytes_per_second=5.17641G/s items_per_second=247.906M/s __llvm_libc::memmove,memmove Google U
BM_Memmove/8/0_median       4.70 ns         4.70 ns           10 bytes_per_cycle=3.84975/s bytes_per_second=8.06706G/s items_per_second=212.72M/s __llvm_libc::memmove,memmove Google W
BM_Memmove/9/0_median       90.7 ns         90.7 ns           10 bytes_per_cycle=10.8681/s bytes_per_second=22.7739G/s items_per_second=11.02M/s __llvm_libc::memmove,uniform 384 to 4096
```

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D152811
2023-06-14 08:29:15 +00:00
Guillaume Chatelet
5e32765c15 [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in sub-sequent patches.

The code been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-12 13:47:16 +00:00
Guillaume Chatelet
1ec995cc1c Revert D148717 "[libc] Improve memcmp latency and codegen"
This broke aarch64 debug buildbot https://lab.llvm.org/buildbot/#/builders/223/builds/21703
This reverts commit bd4f978754.
2023-06-12 08:32:00 +00:00
Guillaume Chatelet
bd4f978754 [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in sub-sequent patches.

The code been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-12 07:56:23 +00:00
Guillaume Chatelet
e49a608511 Revert D148717 "[libc] Improve memcmp latency and codegen"
This reverts commit 9ec6ebd3ce.

The patch broke RISCV and aarch64 builtbots.
2023-06-05 09:50:30 +00:00
Guillaume Chatelet
9ec6ebd3ce [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in sub-sequent patches.

The code been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-05 09:46:05 +00:00
Guillaume Chatelet
c76a3e795e [libc][NFC] Fixing various typos 2023-05-31 12:11:09 +00:00
Roland McGrath
1d4e8f0ea6 [libc] Fix compilation issues in memory_check_utils.h
Strict warnings require explicit static_cast to counteract
default widening of types narrower than int.

Functions in header files should have vague linkage (inline
keyword), not internal linkage (static) or external linkage
(no inline keyword) even for template functions.  Note these
don't use the LIBC_INLINE macro since this is only for test code.

Reviewed By: abrachet

Differential Revision: https://reviews.llvm.org/D151494
2023-05-25 14:09:14 -07:00
Guillaume Chatelet
298843cd66 [libc][test] Drastically reduce mem test runtime
Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D151450
2023-05-25 15:05:21 +00:00
Tue Ly
111d274841 [libc][math] Implement double precision log2 function correctly rounded to all rounding modes.
Implement double precision log2 function correctly rounded to all
rounding modes.

See https://reviews.llvm.org/D150014 for a more detail description of the algorithm.

**Performance**

  - For `0.5 <= x <= 2`, the fast pass hitting rate is about 99.91%.

  - Reciprocal throughput from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log2
GNU libc version: 2.35
GNU libc release: stable

-- CORE-MATH reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 15.458 + 0.204 clc/call; Median-Min = 0.224 clc/call; Max = 15.867 clc/call;

-- CORE-MATH reciprocal throughput -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 23.711 + 0.524 clc/call; Median-Min = 0.443 clc/call; Max = 25.307 clc/call;

-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 14.807 + 0.199 clc/call; Median-Min = 0.211 clc/call; Max = 15.137 clc/call;

-- LIBC reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 17.666 + 0.274 clc/call; Median-Min = 0.298 clc/call; Max = 18.531 clc/call;

-- LIBC reciprocal throughput -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 26.534 + 0.418 clc/call; Median-Min = 0.462 clc/call; Max = 27.327 clc/call;

```
  - Latency from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log2 --latency
GNU libc version: 2.35
GNU libc release: stable

-- CORE-MATH latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 46.048 + 1.643 clc/call; Median-Min = 1.694 clc/call; Max = 48.018 clc/call;

-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 62.333 + 0.138 clc/call; Median-Min = 0.119 clc/call; Max = 62.583 clc/call;

-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 45.206 + 1.503 clc/call; Median-Min = 1.467 clc/call; Max = 47.229 clc/call;

-- LIBC latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 43.042 + 0.454 clc/call; Median-Min = 0.484 clc/call; Max = 43.912 clc/call;

-- LIBC latency -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 57.016 + 1.636 clc/call; Median-Min = 1.655 clc/call; Max = 58.816 clc/call;
```
  - Accurate pass latency:
```
$ ./perf.sh log2 --latency --simple_stat
GNU libc version: 2.35
GNU libc release: stable

-- CORE-MATH latency -- with FMA
177.632

-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
231.332

-- LIBC latency -- with FMA
459.751

-- LIBC latency -- without FMA
463.850
```

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D150374
2023-05-23 10:49:30 -04:00
Guillaume Chatelet
f4a3549250 [libc] Add optimized memcpy for RISCV
This patch adds two versions of memcpy optimized for architectures where unaligned accesses are either illegal or extremely slow.
It is currently enabled for RISCV 64 and RISCV 32 but it could be used for ARM 32 architectures as well.

Here is the before / after output of `libc.benchmarks.memory_functions.opt_host --benchmark_filter=BM_Memcpy` on a quad core Linux starfive RISCV 64 board running at 1.5GHz.

Before:
```
Run on (4 X 1500 MHz CPU s)
CPU Caches:
  L1 Instruction 32 KiB (x4)
  L1 Data 32 KiB (x4)
  L2 Unified 2048 KiB (x1)
------------------------------------------------------------------------
Benchmark              Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------
BM_Memcpy/0/0        474 ns          474 ns      1483776 bytes_per_cycle=0.243492/s bytes_per_second=348.318M/s items_per_second=2.11097M/s __llvm_libc::memcpy,memcpy Google A
BM_Memcpy/1/0        210 ns          209 ns      3649536 bytes_per_cycle=0.233819/s bytes_per_second=334.481M/s items_per_second=4.77519M/s __llvm_libc::memcpy,memcpy Google B
BM_Memcpy/2/0       1814 ns         1814 ns       396288 bytes_per_cycle=0.247899/s bytes_per_second=354.622M/s items_per_second=551.402k/s __llvm_libc::memcpy,memcpy Google D
BM_Memcpy/3/0       89.3 ns         89.2 ns      7459840 bytes_per_cycle=0.217415/s bytes_per_second=311.014M/s items_per_second=11.2071M/s __llvm_libc::memcpy,memcpy Google L
BM_Memcpy/4/0        134 ns          134 ns      3815424 bytes_per_cycle=0.226584/s bytes_per_second=324.131M/s items_per_second=7.44567M/s __llvm_libc::memcpy,memcpy Google M
BM_Memcpy/5/0       52.8 ns         52.6 ns     11001856 bytes_per_cycle=0.194893/s bytes_per_second=278.797M/s items_per_second=19.0284M/s __llvm_libc::memcpy,memcpy Google Q
BM_Memcpy/6/0        180 ns          180 ns      4101120 bytes_per_cycle=0.231884/s bytes_per_second=331.713M/s items_per_second=5.55957M/s __llvm_libc::memcpy,memcpy Google S
BM_Memcpy/7/0        195 ns          195 ns      3906560 bytes_per_cycle=0.232951/s bytes_per_second=333.239M/s items_per_second=5.1217M/s __llvm_libc::memcpy,memcpy Google U
BM_Memcpy/8/0        152 ns          152 ns      4789248 bytes_per_cycle=0.227507/s bytes_per_second=325.452M/s items_per_second=6.58187M/s __llvm_libc::memcpy,memcpy Google W
BM_Memcpy/9/0       6036 ns         6033 ns       118784 bytes_per_cycle=0.249158/s bytes_per_second=356.423M/s items_per_second=165.75k/s __llvm_libc::memcpy,uniform 384 to 4096
```

After:
```
BM_Memcpy/0/0        126 ns          126 ns      5770240 bytes_per_cycle=1.04707/s bytes_per_second=1.46273G/s items_per_second=7.9385M/s __llvm_libc::memcpy,memcpy Google A
BM_Memcpy/1/0       75.1 ns         75.0 ns     10204160 bytes_per_cycle=0.691143/s bytes_per_second=988.687M/s items_per_second=13.3289M/s __llvm_libc::memcpy,memcpy Google B
BM_Memcpy/2/0        333 ns          333 ns      2174976 bytes_per_cycle=1.39297/s bytes_per_second=1.94596G/s items_per_second=3.00002M/s __llvm_libc::memcpy,memcpy Google D
BM_Memcpy/3/0       49.6 ns         49.5 ns     16092160 bytes_per_cycle=0.710161/s bytes_per_second=1015.89M/s items_per_second=20.1844M/s __llvm_libc::memcpy,memcpy Google L
BM_Memcpy/4/0       57.7 ns         57.7 ns     11213824 bytes_per_cycle=0.561557/s bytes_per_second=803.314M/s items_per_second=17.3228M/s __llvm_libc::memcpy,memcpy Google M
BM_Memcpy/5/0       48.0 ns         47.9 ns     16437248 bytes_per_cycle=0.346708/s bytes_per_second=495.97M/s items_per_second=20.8571M/s __llvm_libc::memcpy,memcpy Google Q
BM_Memcpy/6/0       67.5 ns         67.5 ns     10616832 bytes_per_cycle=0.614173/s bytes_per_second=878.582M/s items_per_second=14.8142M/s __llvm_libc::memcpy,memcpy Google S
BM_Memcpy/7/0       84.7 ns         84.6 ns     10480640 bytes_per_cycle=0.819077/s bytes_per_second=1.14424G/s items_per_second=11.8174M/s __llvm_libc::memcpy,memcpy Google U
BM_Memcpy/8/0       61.7 ns         61.6 ns     11191296 bytes_per_cycle=0.550078/s bytes_per_second=786.893M/s items_per_second=16.2279M/s __llvm_libc::memcpy,memcpy Google W
BM_Memcpy/9/0        981 ns          981 ns       703488 bytes_per_cycle=1.52333/s bytes_per_second=2.12807G/s items_per_second=1019.81k/s __llvm_libc::memcpy,uniform 384 to 4096
```

It is not as good as glibc for now so there's room for improvement. I suspect a path pumping 16 bytes at once given the doubled numbers for large copies.
```
BM_Memcpy/0/1        146 ns         82.5 ns      8576000 bytes_per_cycle=1.35236/s bytes_per_second=1.88922G/s items_per_second=12.1169M/s glibc memcpy,memcpy Google A
BM_Memcpy/1/1        112 ns         63.7 ns     10634240 bytes_per_cycle=0.628018/s bytes_per_second=898.387M/s items_per_second=15.702M/s glibc memcpy,memcpy Google B
BM_Memcpy/2/1        315 ns          180 ns      4079616 bytes_per_cycle=2.65229/s bytes_per_second=3.7052G/s items_per_second=5.54764M/s glibc memcpy,memcpy Google D
BM_Memcpy/3/1       85.3 ns         43.1 ns     15854592 bytes_per_cycle=0.774164/s bytes_per_second=1107.45M/s items_per_second=23.2249M/s glibc memcpy,memcpy Google L
BM_Memcpy/4/1        105 ns         54.3 ns     13427712 bytes_per_cycle=0.7793/s bytes_per_second=1114.8M/s items_per_second=18.4109M/s glibc memcpy,memcpy Google M
BM_Memcpy/5/1       77.1 ns         43.2 ns     16476160 bytes_per_cycle=0.279808/s bytes_per_second=400.269M/s items_per_second=23.1428M/s glibc memcpy,memcpy Google Q
BM_Memcpy/6/1        112 ns         62.7 ns     11236352 bytes_per_cycle=0.676078/s bytes_per_second=967.137M/s items_per_second=15.9387M/s glibc memcpy,memcpy Google S
BM_Memcpy/7/1        131 ns         65.5 ns     11751424 bytes_per_cycle=0.965616/s bytes_per_second=1.34895G/s items_per_second=15.2762M/s glibc memcpy,memcpy Google U
BM_Memcpy/8/1        104 ns         55.0 ns     12314624 bytes_per_cycle=0.583336/s bytes_per_second=834.468M/s items_per_second=18.1937M/s glibc memcpy,memcpy Google W
BM_Memcpy/9/1        932 ns          466 ns      1480704 bytes_per_cycle=3.17342/s bytes_per_second=4.43321G/s items_per_second=2.14679M/s glibc memcpy,uniform 384 to 4096
```

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D150202
2023-05-10 08:42:07 +00:00
Joseph Huber
632fa3798c [libc] Enable running libc unit tests on AMDGPU
The previous patches added the necessary support for global constructors
used to register tests. This patch enables the AMDGPU target to build
and run the unit tests on the GPU. Currently this only tests the `ctype`
tests, but adding more should be straightforward from here on.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D149517
2023-05-04 06:32:52 -05:00
Joseph Huber
644b63bd31 [libc] Add 'UNIT_TEST_ONLY' and 'HERMETIC_TEST_ONLY' to 'add_libc_test'
The `add_libc_test` option allows us to enable both kinds of tests in a
single option. However, some tests cannot be made hermetic with the
current approach. Such as tests that rely on system utilities or
libraries. This patch adds two options `UNIT_TEST_ONLY` and
`HERMETIC_TEST_ONLY` to offer more fine-grained control over which
version gets built. This makes it explicit which version a test supports
and why.

Depends on D149662

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D149691
2023-05-02 18:51:12 -05:00