clang-p2996/libc/src/string
Guillaume Chatelet bd4f978754 [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.

The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-12 07:56:23 +00:00