clang-p2996

Author	SHA1	Message	Date
Tue Ly	5814b7b279	[libc][math] Implement log10 function correctly rounded for all rounding modes Implement double precision log10 function correctly rounded for all rounding modes. This implementation currently needs FMA instructions for correctness. Use 2 passes: Fast pass: - 1 step range reduction with a lookup table of `2^7 = 128` elements to reduce the ranges to `[-2^-7, 2^-7]`. - Use a degree-7 minimax polynomial generated by Sollya, evaluated using a mix of double-double and double precision. - Apply Ziv's test for accuracy. Accurate pass: - Apply 5 more range reduction steps to reduce the ranges further to [-2^-27, 2^-27]. - Use a degree-4 minimax polynomial generated by Sollya, evaluated using 192-bit precision. - By the result of Lefevre (add quote), this is more than enough for correct rounding in all rounding modes. In progress: adding detailed documentation about the algorithm. Depends on: https://reviews.llvm.org/D136799 Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D139846	2023-01-08 17:41:54 -05:00
Guillaume Chatelet	436c8f4420	[reland][libc] Add bcopy Differential Revision: https://reviews.llvm.org/D138994	2022-12-01 10:07:04 +00:00
Guillaume Chatelet	c5fe7eb216	Revert D138994 "[libc] Add bcopy" Broke build bot This reverts commit `186a15f7a9`.	2022-12-01 09:55:36 +00:00
Guillaume Chatelet	186a15f7a9	[libc] Add bcopy Differential Revision: https://reviews.llvm.org/D138994	2022-12-01 09:52:10 +00:00
Joseph Huber	dabb7514f5	[libc] Fix assert.h and ctype.h not being built The `assert.h` and `ctype.h` headers are never built despite their entrypoints being present in the generated library. This patch adds a dependency on these headers so that they will be built properly. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D138142	2022-11-16 12:00:41 -06:00
Raman Tenneti	78f172e45a	[libc] Implement gettimeofday Implement gettimeofday per .../onlinepubs/9699919799/functions/gettimeofday.html. This calls clock_gettime to implement the gettimeofday function. Tested: Limited unit test: This makes a call and checks that no error was returned. Used nanosleep for 100 microseconds and verified that the elapsed time is more than 100 microseconds and less than 300 microseconds. Co-authored-by: Jeff Bailey <jeffbailey@google.com> Differential Revision: https://reviews.llvm.org/D137881	2022-11-11 18:02:33 -08:00
Tue Ly	45233cc1ca	[libc][math] Add place-holder implementation for pow function. Add place-holder implementation for pow function to unblock libc demo examples. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D137109	2022-10-31 17:23:33 -04:00
Tue Ly	97b4cc83e1	[libc][math] Add place-holder implementation for asin to unblock demo examples. Add a place-holder implementation for asin to unblock libc demo examples. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D137105	2022-10-31 17:22:12 -04:00
Alex Brachet	5fd03c8176	[libc] Implement getopt Differential Revision: https://reviews.llvm.org/D133487	2022-10-31 16:55:53 +00:00
Alex Brachet	d6ac84bce8	Revert "[libc] Implement getopt" This reverts commit `a678f86351`.	2022-10-27 06:47:24 +00:00
Alex Brachet	a678f86351	[libc] Implement getopt Differential Revision: https://reviews.llvm.org/D133487	2022-10-27 06:23:33 +00:00
Siva Chandra	07b7023181	[libc] Enable more entrypoints on aarch64.	2022-10-26 14:03:15 -07:00
Siva Chandra	f8490601c2	[libc] Enable a few entrypoints on aarch64 already available on x86_64.	2022-10-25 15:41:41 -07:00
Raman Tenneti	12204429f2	[libc] Add implementation of difftime function. The difftime function computes the difference between two calendar times, time1 - time0, as per section 7.27.2.2 of http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2478.pdf. double difftime(time_t time1, time_t time0); Tested: Unit tests Co-authored-by: Jeff Bailey <jeffbailey@google.com> Reviewed By: jeffbailey Differential Revision: https://reviews.llvm.org/D136631	2022-10-24 15:14:26 -07:00
Siva Chandra Reddy	3f965818b6	[libc] Add POSIX execv and execve functions. The POSIX global variable environ has also been added. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D135351	2022-10-06 19:50:23 +00:00
Siva Chandra	5e56e294ae	[libc][Obvious] Enable some of the recently added functions on aarch64.	2022-09-29 15:06:44 -07:00
Raman Tenneti	8f1e362ee9	Implement nanosleep per https://pubs.opengroup.org/onlinepubs/009695399/basedefs/time.h.html Tested: Limited unit test: This makes a call and checks that no error was returned, but we currently don't have the ability to ensure that time has elapsed as expected. Co-authored-by: Jeff Bailey <jeffbailey@google.com> Reviewed By: sivachandra, jeffbailey Differential Revision: https://reviews.llvm.org/D134095	2022-09-24 00:13:58 +00:00
Tue Ly	a752460d73	[libc][math] Implement exp10f function correctly rounded to all rounding modes. Implement exp10f function correctly rounded to all rounding modes. Algorithm: perform range reduction to reduce ``` 10^x = 2^(hi + mid) * 10^lo ``` where: ``` hi is an integer, 0 <= mid * 2^5 < 2^5, -log10(2) / 2^6 <= lo <= log10(2) / 2^6 ``` Then `2^mid` is stored in a table of 32 entries and the product `2^hi * 2^mid` is performed by adding `hi` into the exponent field of `2^mid`. `10^lo` is then approximated by a degree-5 minimax polynomial generated by Sollya with: ``` > P = fpminimax((10^x - 1)/x, 4, [\|D...\|], [-log10(2)/64, log10(2)/64]); ``` Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 10.215 System LIBC reciprocal throughput : 7.944 LIBC reciprocal throughput : 38.538 LIBC reciprocal throughput : 12.175 (with `-msse4.2` flag) LIBC reciprocal throughput : 9.862 (with `-mfma` flag) $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 40.744 System LIBC latency : 37.546 BEFORE LIBC latency : 48.989 LIBC latency : 44.486 (with `-msse4.2` flag) LIBC latency : 40.221 (with `-mfma` flag) ``` This patch relies on https://reviews.llvm.org/D134002 Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D134104	2022-09-19 10:01:40 -04:00
Siva Chandra Reddy	7fb96fb5d3	[libc] Add implementation of POSIX "uname" function. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D134065	2022-09-16 21:21:29 +00:00
Siva Chandra Reddy	d23d858d04	[libc] Add the implementation of the "remove" function. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D133922	2022-09-15 17:32:02 +00:00
Siva Chandra Reddy	6e675fba3a	[libc] Add POSIX functions pread and pwrite. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D133888	2022-09-14 20:52:20 +00:00
Siva Chandra Reddy	419580c699	[libc] Add implementation of POSIX function "access". Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D133814	2022-09-14 07:44:47 +00:00
Siva Chandra Reddy	8989aa003f	[libc] Add POSIX functions dup, dup2, and GNU extension function dup3. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D133748	2022-09-13 18:06:30 +00:00
Tue Ly	463dcc8749	[libc][math] Implement acosf function correctly rounded for all rounding modes. Implement acosf function correctly rounded for all rounding modes. We perform range reduction as follows: - When `\|x\| < 2^(-10)`, we use cubic Taylor polynomial: ``` acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 / 6. ``` - When `2^(-10) <= \|x\| <= 0.5`, we use the same approximation that is used for `asinf(x)` when `\|x\| <= 0.5`: ``` acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 * P(x^2). ``` - When `0.5 < x <= 1`, we use the double angle formula: `cos(2y) = 1 - 2 * sin^2 (y)` to reduce to: ``` acos(x) = 2 * asin( sqrt( (1 - x)/2 ) ) ``` - When `-1 <= x < -0.5`, we reduce to the positive case above using the formula: ``` acos(x) = pi - acos(-x) ``` Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh acosf GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 28.613 System LIBC reciprocal throughput : 29.204 LIBC reciprocal throughput : 24.271 $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 55.554 System LIBC latency : 76.879 LIBC latency : 62.118 ``` Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D133550	2022-09-09 09:55:30 -04:00
Tue Ly	e2f065c2a3	[libc][math] Implement asinf function correctly rounded for all rounding modes. Implement asinf function correctly rounded for all rounding modes. For `\|x\| <= 0.5`, we approximate `asin(x)` by ``` asin(x) = x * P(x^2) ``` where `P(X^2) = Q(X)` is a degree-20 minimax even polynomial approximating `asin(x)/x` on `[0, 0.5]` generated by Sollya with: ``` > Q = fpminimax(asin(x)/x, [\|0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20\|], [\|1, D...\|], [0, 0.5]); ``` When `\|x\| > 0.5`, we perform range reduction as follows: Assume further that `0.5 < x <= 1`, and let: ``` y = asin(x) ``` We will use the double angle formula: ``` cos(2X) = 1 - 2 sin^2(X) ``` and the complement angle identity: ``` x = sin(y) = cos(pi/2 - y) = 1 - 2 sin^2 (pi/4 - y/2) ``` So: ``` sin(pi/4 - y/2) = sqrt( (1 - x)/2 ) ``` And hence: ``` pi/4 - y/2 = asin( sqrt( (1 - x)/2 ) ) ``` Equivalently: ``` asin(x) = y = pi/2 - 2 * asin( sqrt( (1 - x)/2 ) ) ``` Let `u = (1 - x)/2`, then ``` asin(x) = pi/2 - 2 * asin(sqrt(u)) ``` Moreover, since `0.5 < x <= 1`, ``` 0 <= u < 1/4, and 0 <= sqrt(u) < 0.5. ``` And hence we can reuse the same polynomial approximation of `asin(x)` when `\|x\| <= 0.5`: ``` asin(x) = pi/2 - 2 * sqrt(u) * P(u). ``` Performance benchmark using `perf` tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf CORE-MATH reciprocal throughput : 23.418 System LIBC reciprocal throughput : 27.310 LIBC reciprocal throughput : 22.741 $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 58.884 System LIBC latency : 62.055 LIBC latency : 62.037 ``` Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D133400	2022-09-07 19:27:47 -04:00
Kirill Okhotnikov	77e1d9beed	[libc][math] Added atanf function. Performance by core-math (core-math/glibc 2.31/current llvm-14): 28.879/20.843/20.15 Differential Revision: https://reviews.llvm.org/D132842	2022-08-30 22:39:54 +02:00
Kirill Okhotnikov	6c1fc7e430	[libc][math] Added atanhf function. Performance by core-math (core-math/glibc 2.31/current llvm-14): 10.845/43.174/13.467 The review is done on top of D132809. Differential Revision: https://reviews.llvm.org/D132811	2022-08-30 22:39:54 +02:00
Siva Chandra Reddy	f6506ec443	[libc] Implement POSIX truncate and ftruncate functions for Linux. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D132705	2022-08-26 19:27:24 +00:00
Siva Chandra Reddy	b8be3dabde	[libc] Add Linux implementation of GNU extension function sendfile. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D132721	2022-08-26 19:13:40 +00:00
Siva Chandra Reddy	00e51f04e8	[libc] Implement linux link, linkat, symlink, symlinkat, readlink, readlinkat. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D132619	2022-08-25 18:50:39 +00:00
Siva Chandra Reddy	85dff76416	[libc] Add linux implementation of POSIX fchmodat function. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D132533	2022-08-24 18:46:29 +00:00
Siva Chandra	d00e97df0f	[libc][Obvious] Fix typo in chmod implementation. This now allows enabling the chmod function on aarch64.	2022-08-23 15:01:21 -07:00
Siva Chandra	8856137ce7	[libc] Enable a few entrypoints on aarch64 which are now available on x86_64.	2022-08-23 12:31:42 -07:00
Siva Chandra Reddy	84517b6fb1	[libc] Add more headers to the linux x86_64 and aarch64 configs.	2022-08-19 07:15:08 +00:00
Michael Jones	e0e7fa36d3	[libc] enable s(n)printf without fullbuild To use the FILE data structure, LLVM-libc must be in fullbuild mode since it expects its own implementation. This means that (f)printf can't be used without fullbuild, but s(n)printf only uses strings. This patch adjusts the CMake to allow for this. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D131913	2022-08-15 13:45:34 -07:00
Tue Ly	82d6e77048	[libc] Implement tanf function correctly rounded for all rounding modes. Implement tanf function correctly rounded for all rounding modes. We use the range reduction that is shared with `sinf`, `cosf`, and `sincosf`: ``` k = round(x * 32/pi) and y = x * (32/pi) - k. ``` Then we use the tangent of sum formula: ``` tan(x) = tan((k + y)* pi/32) = tan((k mod 32) * pi / 32 + y * pi/32) = (tan((k mod 32) * pi/32) + tan(y * pi/32)) / (1 - tan((k mod 32) * pi/32) * tan(y * pi/32)) ``` We need to make a further reduction when `k mod 32 >= 16` due to the pole at `pi/2` of `tan(x)` function: ``` if (k mod 32 >= 16): k = k - 31, y = y - 1.0 ``` And to compute the final result, we store `tan(k * pi/32)` for `k = -15..15` in a table of 32 double values, and evaluate `tan(y * pi/32)` with a degree-11 minimax odd polynomial generated by Sollya with: ``` > P = fpminimax(tan(y * pi/32)/y, [\|0, 2, 4, 6, 8, 10\|], [\|D...\|], [0, 1.5]); ``` Performance benchmark using `perf` tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf CORE-MATH reciprocal throughput : 18.586 System LIBC reciprocal throughput : 50.068 LIBC reciprocal throughput : 33.823 LIBC reciprocal throughput : 25.161 (with `-msse4.2` flag) LIBC reciprocal throughput : 19.157 (with `-mfma` flag) $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf --latency GNU libc version: 2.31 GNU libc release: stable CORE-MATH latency : 55.630 System LIBC latency : 106.264 LIBC latency : 96.060 LIBC latency : 90.727 (with `-msse4.2` flag) LIBC latency : 82.361 (with `-mfma` flag) ``` Reviewed By: orex Differential Revision: https://reviews.llvm.org/D131715	2022-08-12 09:21:05 -04:00
Kirill Okhotnikov	5ef987c985	[libc][math] Added tanhf function. Correct rounding function. Performance ~2x faster than glibc analog. Performance (llvm 12 intel): ``` CORE_MATH_PERF_MODE=rdtsc PERF_ARGS='' ./perf.sh tanhf GNU libc version: 2.31 GNU libc release: stable 13.279 37.492 18.145 CORE_MATH_PERF_MODE=rdtsc PERF_ARGS='--latency' ./perf.sh tanhf GNU libc version: 2.31 GNU libc release: stable 40.658 109.582 66.568 ``` Differential Revision: https://reviews.llvm.org/D130780	2022-08-01 22:43:00 +02:00
Kirill Okhotnikov	a7f55f0805	[libc][math] Added sinhf function. Differential Revision: https://reviews.llvm.org/D129278	2022-07-29 17:20:53 +02:00
Kirill Okhotnikov	fcb9d7e2cf	[libc][math] Added coshf function. Differential Revision: https://reviews.llvm.org/D129275	2022-07-29 16:57:28 +02:00
Siva Chandra	98fdabecf5	[libc] Enable a few stdlib and time functions on aarch64.	2022-07-14 14:37:50 -07:00
Siva Chandra	75a628925e	[libc] Enable few stdio functions on aarch64.	2022-07-14 13:40:48 -07:00
Siva Chandra	edee61b55c	[libc] Enable few pthread and threads functions on aarch64.	2022-07-14 13:25:21 -07:00
Alex Brachet	c179bcc151	[libc] Add imaxabs Differential Revision: https://reviews.llvm.org/D129517	2022-07-11 21:28:21 +00:00
Kirill Okhotnikov	b8e8012aa2	[libc][math] fmod/fmodf implementation. This is an implementation of the find-remainder function fmod from the standard libm. The underlying algorithm was developed by myself, but it was probably invented before. Some features of the implementation: 1. The code is written in more-or-less modern C++. 2. One general implementation for both float and double precision numbers. 3. Platform/architecture dependent and independent code and tests are split. 4. Tests cover 100% of the code for both float and double numbers. Test cases with NaN/Inf etc. are copied from glibc. 5. The new implementation is in general 2-4 times faster for “regular” x,y values. It can be 20 times faster for huge x/y values, but can also be 2 times slower in the double denormalized range (according to the perf tests provided). 6. Two different implementations of the division loop are provided. On some platforms division can be a very time-consuming operation. Depending on the platform, it can be 3-10 times slower than multiplication. Performance tests: The test is based on the core-math project (https://gitlab.inria.fr/core-math/core-math). At Tue Ly's suggestion I took the hypot function and used it as a template for fmod, preserving all test cases. `./check.sh <--special\|--worst> fmodf` passed. `CORE_MATH_PERF_MODE=rdtsc ./perf.sh fmodf` results are ``` GNU libc version: 2.35 GNU libc release: stable 21.166 <-- FPU 51.031 <-- current glibc 37.659 <-- this fmod version. ```	2022-06-24 23:09:14 +02:00
Alex Brachet	b1183305f8	[libc] Add strlcat Differential Revision: https://reviews.llvm.org/D125978	2022-05-19 21:48:39 +00:00
Alex Brachet	fc2c8b2371	[libc] Add strlcpy Differential Revision: https://reviews.llvm.org/D125806	2022-05-18 17:45:05 +00:00
Siva Chandra	3f5287125a	[libc] Add stdio entrypoints to aarch64 fullbuild.	2022-04-26 00:25:12 -07:00
Siva Chandra Reddy	0258f56646	[libc] Add a definition of pthread_attr_t and its getters and setters. Not all attributes have been added to pthread_attr_t in this patch. They will be added gradually in future patches. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D123423	2022-04-11 16:08:49 +00:00
Siva Chandra Reddy	83f153ce34	[libc] Add pthread_mutexattr_t type and its setters and getters. A simple implementation of the getters and setters has been added. More logic can be added to them in future as required. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D122969	2022-04-04 18:11:12 +00:00
Siva Chandra	97417e0300	[libc] Enable threads.h functions on aarch64. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D122788	2022-03-31 08:42:07 -07:00
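
The range reductions described in the asinf (`e2f065c2a3`) and acosf (`463dcc8749`) commit messages can be checked numerically with a minimal Python sketch. It is not the libc implementation: `asin_core` is a hypothetical stand-in that calls `math.asin` where the commits use the Sollya-generated polynomial `x * P(x^2)`, the function names are invented for illustration, and the cubic Taylor branch for very small inputs is left out. The sketch only shows that every branch hands the core an argument with magnitude at most 0.5.

```python
import math


def asin_core(v: float) -> float:
    """Hypothetical stand-in for x * P(x^2); only valid for |v| <= 0.5."""
    assert abs(v) <= 0.5, "core approximation called outside [-0.5, 0.5]"
    return math.asin(v)


def asin_reduced(x: float) -> float:
    """asin(x) on [-1, 1] using only the core on [-0.5, 0.5]."""
    ax = abs(x)
    if ax <= 0.5:
        return asin_core(x)
    # 0.5 < |x| <= 1: u = (1 - |x|)/2 lies in [0, 1/4), so sqrt(u) < 0.5.
    u = (1.0 - ax) / 2.0
    r = math.pi / 2.0 - 2.0 * asin_core(math.sqrt(u))
    # asin is odd, so restore the sign of the input.
    return math.copysign(r, x)


def acos_reduced(x: float) -> float:
    """acos(x) on [-1, 1] using only the core on [-0.5, 0.5]."""
    if abs(x) <= 0.5:
        return math.pi / 2.0 - asin_core(x)
    if x > 0.5:
        # Double angle formula: acos(x) = 2 * asin(sqrt((1 - x)/2)).
        return 2.0 * asin_core(math.sqrt((1.0 - x) / 2.0))
    # -1 <= x < -0.5: reduce to the positive case.
    return math.pi - acos_reduced(-x)
```

Comparing `asin_reduced` and `acos_reduced` against `math.asin` and `math.acos` on a few points in `[-1, 1]` is a quick way to confirm the identities. It says nothing about correct rounding, which depends on the actual polynomial and its evaluation.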


120 Commits
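
The exp10f commit (`a752460d73`) splits `10^x = 2^(hi + mid) * 10^lo` with a 32-entry table for `2^mid`. A minimal Python sketch of that decomposition follows, under stated assumptions: the helper name `exp10_reduced` and the constants are hypothetical, `10.0 ** lo` stands in for the degree-5 minimax polynomial, and `math.ldexp` stands in for adding `hi` to the exponent field. It illustrates the reduction only, not the correctly rounded implementation.

```python
import math

LOG2_10 = math.log2(10.0)   # log2(10)
LOG10_2 = math.log10(2.0)   # log10(2)
# Table of 2^mid for mid = j/32, j = 0..31 (32 entries, as in the commit).
EXP2_MID = [2.0 ** (j / 32.0) for j in range(32)]


def exp10_reduced(x: float):
    """Return (10^x, hi, j, lo) with 10^x = 2^(hi + j/32) * 10^lo."""
    # k / 32 is the multiple of 1/32 nearest to x * log2(10).
    k = round(x * LOG2_10 * 32.0)
    # Floor division keeps j in 0..31 for negative k as well.
    hi, j = divmod(k, 32)
    # Remainder in base-10 exponent units; |lo| <= log10(2)/64 up to rounding.
    lo = x - k * LOG10_2 / 32.0
    # Stand-in for the polynomial approximation of 10^lo.
    tail = 10.0 ** lo
    # ldexp scales by 2^hi, mimicking the exponent-field adjustment.
    return math.ldexp(EXP2_MID[j], hi) * tail, hi, j, lo
```

For example, `exp10_reduced(1.0)` gives `hi = 3` and `j = 10`, with the returned value matching `10.0` to within floating-point rounding. Keeping `lo` this small is what lets a low-degree polynomial reach the needed accuracy.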