clang-p2996

Author	SHA1	Message	Date
Vinayak Dev	3b961d113e	[libc] Implement roundeven C23 math functions (#87678) Implements the functions `roundeven()`, `roundevenf()`, `roundevenl()` from the roundeven family of functions introduced in C23. Also implements `roundevenf128()`.	2024-04-05 08:36:12 -04:00
OverMighty	a8c59750d9	[libc][math][c23] Add exp2m1f C23 math function (#86996) Fixes #86502. cc @lntue	2024-04-04 08:22:45 -04:00
Vinayak Dev	986435c765	[libc] Move {f,d}sqrt to higher functions in docs (#87445) Moves the functions `fsqrt()` and `dsqrt()` from basic functions to higher math functions in the math docs.	2024-04-02 22:39:48 -04:00
Vinayak Dev	2fb5440e76	[libc] Re-organize the math function tables in docs (#87412) Re-organizes the tables that listed libc's support for math functions, and adds two new columns to the tables indicating where the respective function definitions and error handling methods are located in the C23 standard draft WG14-N3096.	2024-04-02 22:23:35 -04:00
lntue	2be722587f	[libc][math] Implement atan2f correctly rounded to all rounding modes. (#86716) We compute atan2f(y, x) in 2 stages: - Fast step: perform computations in double precision, with relative errors < 2^-50 - Accurate step: if the result from the Fast step fails Ziv's rounding test, then we perform computations in double-double precision, with relative errors < 2^-100. On Ryzen 5900X, worst-case latency is ~ 200 clocks, compared to average latency ~ 60 clocks, and average reciprocal throughput ~ 20 clocks.	2024-04-01 13:31:07 -04:00
Nick Desaulniers	8a071678a9	Revert "[libc][math][c23] Add remaining linux/* entrypoints for {,u}fromfp{,x}* (#86692)" This reverts commit `cd17082b24` because the newly added tests fail on 32b ARM. Link: #86692 Link: https://lab.llvm.org/buildbot/#/builders/229/builds/24458	2024-03-27 13:28:26 -07:00
OverMighty	cd17082b24	[libc][math][c23] Add remaining linux/* entrypoints for {,u}fromfp{,x}* (#86692)	2024-03-27 12:28:27 -07:00
Shourya Goel	19ca79e867	[libc][math][c23] Implement canonicalize functions (#85940) Fixes: #85286	2024-03-26 08:28:22 -04:00
OverMighty	b282259711	[libc][math][c23] Add {,u}fromfp{,x}{,f,l,f128} functions (#86003) Fixes #85279. cc @lntue	2024-03-25 10:26:22 -04:00
OverMighty	85b6af198f	[libc][math][c23] Add linux/* entrypoints for nextup* and nextdown* (#85803) See https://github.com/llvm/llvm-project/pull/85484#discussion_r1526971653. There already were entrypoints for linux/x86_64. I haven't tested on the other targets and will rely on the buildbots.	2024-03-19 12:47:01 -07:00
OverMighty	a2bad75879	[libc][math][c23] Add nextupl and nextdownl functions (#85484) Fixes #85283. cc @lntue	2024-03-16 17:21:07 -04:00
Michael Flanders	b43965adac	[libc][math][c23] adds `nanf128` (#85201) Continuing #84689, this one required more changes than the others, so I am making it a separate PR. Extends some stuff in `str_to_float.h`, `str_to_integer.h` to work on types wider than `unsigned long long` and `uint64_t`. cc @lntue for review.	2024-03-15 13:31:50 -04:00
Michael Flanders	15a55486a5	[libc][math] Adds entrypoint and test for `nextafterf128` (#84882)	2024-03-12 23:25:05 -04:00
lntue	4d21e75210	[libc][math][c23] Add fmodl and fmodf128 math functions. (#84600) - Allow `FMod` template to have different computational types and make it work for 80-bit long double. - Switch to use `uint64_t` as the intermediate computational type for `float`, significantly reducing the latency of `fmodf` when the exponent difference is large.	2024-03-11 16:27:42 -04:00
lntue	d99bb01422	[libc][NFC] Clean up test/src/math/differential_testing folder, renaming it to performance_testing. (#84646) Removing all the diff tests.	2024-03-11 11:38:39 -04:00
lntue	99f5e9634b	[libc][math][c23] Add modff128 C23 math function. (#84532)	2024-03-09 11:47:22 -05:00
lntue	60d7bf3dbe	[libc][math][c23] Add (l|ll)rintf128 and (l|ll)roundf128 math functions. (#84504)	2024-03-08 12:15:02 -05:00
lntue	14171b87a3	[libc][stdfix] Add exp function for short _Accum and _Accum types. (#84391)	2024-03-07 17:58:28 -05:00
lntue	ad33fe1281	[libc][stdfix] Add integer square root with fixed point output functions. (#83959) Fix https://github.com/llvm/llvm-project/issues/83924.	2024-03-06 18:35:44 -05:00
lntue	aa95aa69b9	[libc][math][c23] Add C23 math functions ilogbf128, logbf128, and llogb(f|l|f128). (#82144)	2024-02-27 12:23:19 -05:00
lntue	ded4ea9752	[libc][stdfix] Add sqrt for fixed point types. (#83042)	2024-02-26 19:36:30 -05:00
Joseph Huber	69c0b2febe	[libc][NFC] Remove all trailing spaces from libc (#82831) Summary: There are a lot of random trailing spaces on various lines. This patch just got rid of all of them with `sed 's/\ \+$//g'`.	2024-02-23 16:34:00 -06:00
lntue	f01ed3bc88	[libc][stdfix] Add round functions for fixed point types. (#81994)	2024-02-16 12:45:26 -05:00
lntue	2c45bda802	[libc][stdfix] Add abs functions for signed fixed point types. (#81823)	2024-02-15 18:09:40 -05:00
lntue	ff409d39ce	[libc][math] Add C23 ldexpf128 math function and fix DyadicFloat conversions for subnormal ranges and 80-bit floating points. (#81780)	2024-02-14 21:35:00 -05:00
lntue	84277fe90f	[libc][stdfix] Generate stdfix.h header with fixed point precision macros according to ISO/IEC TR 18037:2008 standard, and add fixed point type support detection. (#81255) Fixed point extension standard: https://standards.iso.org/ittf/PubliclyAvailableStandards/c051126_ISO_IEC_TR_18037_2008.zip	2024-02-13 16:48:14 -05:00
lntue	637c37025d	[libc][math] Add C23 math function frexpf128. (#81337)	2024-02-09 21:13:14 -05:00
lntue	1f20bc2cd2	[libc][math] Add C23 math function fdimf128. (#81074)	2024-02-09 11:21:04 -05:00
lntue	bcc1635c7f	[libc] Enable float128 entrypoints on aarch64 and riscv64. (#80682)	2024-02-07 13:39:19 -05:00
lntue	6ba9d2988b	[libc][math] Add float128 rounding functions (ceilf128, floorf128, roundf128, truncf128). (#80634)	2024-02-05 07:37:57 -05:00
felixh5678	0b0cce8978	[libc] Add fminf128 and fmaxf128 implementations for Linux x86_64. (#79307) Co-authored-by: Felix <felix@Dirks-MacBook-Pro.local>	2024-01-25 15:04:18 -05:00
felixh5678	777eb35614	[libc] Add sqrtf128 implementation for Linux x86_64. (#79195) Co-authored-by: Tue Ly <lntue@google.com> Co-authored-by: Felix <felix@Dirks-MacBook-Pro.local>	2024-01-24 10:16:12 -05:00
Petr Hosek	23edf782a2	[libc] Include missing RISC-V stdlib.h and math.h entrypoints (#79034) This matches the entrypoints for baremetal ARM.	2024-01-22 18:46:48 -08:00
Nick Desaulniers	3044d75485	[libc][arm] add more math.h entrypoints (#77839) In particular, we have internal customers that would like to use nanf and scalbnf. The differences between various entrypoint files can be checked via: $ comm -3 <(grep libc\.src path/to/entrypoints.txt | sort) <(grep libc\.src path/to/other/entrypoints.txt | sort)	2024-01-18 08:18:13 -08:00
lntue	1048b5999b	[libc][math] Add C23 math function fabsf128. (#77825)	2024-01-12 15:00:16 -05:00
Nishant Mittal	0504e93288	[libc][math] Implement nan(f|l) functions (#76690) Specification: https://en.cppreference.com/w/c/numeric/math/nan	2024-01-05 08:23:23 -05:00
Nishant Mittal	0c49fc4c68	[libc][math] Implement nexttoward functions (#72763) Implements the `nexttoward`, `nexttowardf` and `nexttowardl` functions. Also raises the exceptions required by the standard in the `nextafter` functions. cc: @lntue	2023-11-21 09:02:51 -05:00
lntue	3f906f513e	[libc][math] Add initial support for C23 float128 math functions, starting with copysignf128. (#71731)	2023-11-10 14:32:59 -05:00
lntue	bc7a3bd864	[libc][math] Implement powf function correctly rounded to all rounding modes. (#71188) We compute `pow(x, y)` using the formula ``` pow(x, y) = x^y = 2^(y * log2(x)) ``` We follow similar steps as in `log2f(x)` and `exp2f(x)`, by breaking down into `hi + mid + lo` parts, in which `hi` parts are computed using the exponent field directly, `mid` parts will use look-up tables, and `lo` parts are approximated by polynomials. We add some speedup for common use-cases: ``` pow(2, y) = exp2(y) pow(10, y) = exp10(y) pow(x, 2) = x * x pow(x, 1/2) = sqrt(x) pow(x, -1/2) = rsqrt(x) - to be added ```	2023-11-06 16:54:25 -05:00
lntue	da28593d71	[libc][math] Implement double precision expm1 function correctly rounded for all rounding modes. (#67048) Implementing expm1 function for double precision based on exp function algorithm: - Reduced x = log2(e) * (hi + mid1 + mid2) + lo, where: * hi is an integer * mid1 * 2^-6 is an integer * mid2 * 2^-12 is an integer * |lo| < 2^-13 + 2^-30 - Then exp(x) - 1 = 2^hi * 2^mid1 * 2^mid2 * exp(lo) - 1 ~ 2^hi * (2^mid1 * 2^mid2 * (1 + lo * P(lo)) - 2^(-hi) ) - In the fast pass, P(lo) is a degree-3 Taylor polynomial of (e^lo - 1) / lo evaluated in double precision - If the Ziv accuracy test fails, we use a degree-6 Taylor polynomial of (e^lo - 1) / lo in double-double precision - If the Ziv accuracy test still fails, we re-evaluate everything in 128-bit precision.	2023-09-28 16:43:15 -04:00
Tue Ly	76bb278ebb	[libc][math] Implement double precision exp10 function correctly rounded for all rounding modes. Implement double precision exp10 function correctly rounded for all rounding modes. Using the same algorithm as double precision exp (https://reviews.llvm.org/D158551) and exp2 (https://reviews.llvm.org/D158812) functions. Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D159143	2023-08-30 08:43:50 -04:00
Tue Ly	8ca614aa22	[libc][math] Implement double precision exp2 function correctly rounded for all rounding modes. Implement double precision exp2 function correctly rounded for all rounding modes. Using the same algorithm as double precision exp function in https://reviews.llvm.org/D158551. Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D158812	2023-08-25 10:15:08 -04:00
Tue Ly	434bf16084	[libc][math] Implement double precision exp function correctly rounded for all rounding modes. Implement double precision exp function correctly rounded for all rounding modes. Using 4 stages: - Range reduction: reduce to `exp(x) = 2^hi * 2^mid1 * 2^mid2 * exp(lo)`. - Use 64 + 64 LUT for 2^mid1 and 2^mid2, and use cubic Taylor polynomial to approximate `(exp(lo) - 1) / lo` in double precision. Relative error in this step is bounded by 1.5 * 2^-63. - If the rounding test fails, use degree-6 Taylor polynomial to approximate `exp(lo)` in double-double precision. Relative error in this step is bounded by 2^-99. - If the rounding test still fails, use degree-7 Taylor polynomial to compute `exp(lo)` in ~128-bit precision. Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D158551	2023-08-24 10:17:17 -04:00
Tue Ly	f320fefc4a	[libc][math] Implement erff function correctly rounded to all rounding modes. Implement correctly rounded `erff` functions. For `x >= 4`, `erff(x) = 1` for `FE_TONEAREST` or `FE_UPWARD`, `0x1.ffffep-1` for `FE_DOWNWARD` or `FE_TOWARDZERO`. For `0 <= x < 4`, we divide into 32 sub-intervals of length `1/8`, and use a degree-15 odd polynomial to approximate `erff(x)` in each sub-interval: ``` erff(x) ~ x * (c0 + c1 * x^2 + c2 * x^4 + ... + c7 * x^14). ``` For `x < 0`, we can use the same formula as above, since the odd part is factored out. Performance tested with `perf.sh` tool from the CORE-MATH project on AMD Ryzen 9 5900X: Reciprocal throughput (clock cycles / op) ``` $ ./perf.sh erff --path2 GNU libc version: 2.35 GNU libc release: stable -- CORE-MATH reciprocal throughput -- with -march=native (with FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 11.790 + 0.182 clc/call; Median-Min = 0.154 clc/call; Max = 12.255 clc/call; -- CORE-MATH reciprocal throughput -- with -march=x86-64-v2 (without FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 14.205 + 0.151 clc/call; Median-Min = 0.159 clc/call; Max = 15.893 clc/call; -- System LIBC reciprocal throughput -- [####################] 100 % Ntrial = 20 ; Min = 45.519 + 0.445 clc/call; Median-Min = 0.552 clc/call; Max = 46.345 clc/call; -- LIBC reciprocal throughput -- with -mavx2 -mfma (with FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 9.595 + 0.214 clc/call; Median-Min = 0.220 clc/call; Max = 9.887 clc/call; -- LIBC reciprocal throughput -- with -msse4.2 (without FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 10.223 + 0.190 clc/call; Median-Min = 0.222 clc/call; Max = 10.474 clc/call; ``` and latency (clock cycles / op): ``` $ ./perf.sh erff --path2 GNU libc version: 2.35 GNU libc release: stable -- CORE-MATH latency -- with -march=native (with FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 38.566 + 0.391 clc/call; Median-Min = 0.503 clc/call; Max = 39.170 clc/call; -- CORE-MATH latency -- with -march=x86-64-v2 (without FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 43.223 + 0.667 clc/call; Median-Min = 0.680 clc/call; Max = 43.913 clc/call; -- System LIBC latency -- [####################] 100 % Ntrial = 20 ; Min = 111.613 + 1.267 clc/call; Median-Min = 1.696 clc/call; Max = 113.444 clc/call; -- LIBC latency -- with -mavx2 -mfma (with FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 40.138 + 0.410 clc/call; Median-Min = 0.536 clc/call; Max = 40.729 clc/call; -- LIBC latency -- with -msse4.2 (without FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 44.858 + 0.872 clc/call; Median-Min = 0.814 clc/call; Max = 46.019 clc/call; ``` Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D153683	2023-06-28 13:58:37 -04:00
Tue Ly	e557b8a142	[libc][RISCV] Add log, log2, log1p, log10 for RISC-V64 entrypoints. Add log, log2, log1p, log10 RISCV64 entrypoints. Reviewed By: michaelrj, sivachandra Differential Revision: https://reviews.llvm.org/D151674	2023-05-30 14:18:19 -04:00
Tue Ly	0bda541829	[libc][doc] Update math function status page to show more targets. Show availability of math functions on each target. Reviewed By: jeffbailey Differential Revision: https://reviews.llvm.org/D151489	2023-05-25 19:24:33 -04:00
Kazu Hirata	9a515d8142	[libc] Fix typos in documentation	2023-05-22 23:27:59 -07:00
Kazu Hirata	e042efdab6	[libc] Fix typos in documentation	2023-04-24 23:31:48 -07:00
Tue Ly	f63025f52f	[libc][Obvious] Fix the performance table in math function documentation.	2023-04-18 14:10:26 -04:00
Tue Ly	9af8dca70f	[libc][math] Update range reduction step for log10f and reduce its latency. Simplify the range reduction steps by choosing the reduction constants carefully so that the reduced arguments v = r*m_x - 1 and v^2 are exact in double precision, even without FMA instructions, and -2^-8 <= v < 2^-7. This allows the polynomial evaluations to be parallelized more efficiently. Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D147676	2023-04-07 10:31:46 -04:00
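
Several of the commits above (exp, exp2, exp10, expm1) describe the same range reduction, `exp(x) = 2^hi * 2^mid1 * 2^mid2 * exp(lo)`, with two 64-entry look-up tables and a cubic Taylor polynomial for `(exp(lo) - 1) / lo`. The following is only a rough Python sketch of that fast path, not the LLVM libc implementation: the names `exp_sketch`, `TABLE_MID1` and `TABLE_MID2` are hypothetical, the tables are built with ordinary floating-point powers rather than precomputed constants, and the rounding test and the double-double and 128-bit fallback stages are omitted entirely.

```python
import math

LN2 = math.log(2.0)
LOG2E = 1.0 / LN2

# Hypothetical 64-entry tables: 2^(i/64) and 2^(i/4096) for i in 0..63.
TABLE_MID1 = [2.0 ** (i / 64.0) for i in range(64)]
TABLE_MID2 = [2.0 ** (i / 4096.0) for i in range(64)]


def exp_sketch(x):
    """Approximate exp(x) using only the fast path of the scheme above."""
    # k / 2^12 is x * log2(e) rounded to 12 fractional bits.
    k = round(x * LOG2E * 4096.0)
    hi = k >> 12            # integer part (floor, also for negative k)
    mid1 = (k >> 6) & 63    # next 6 bits, index into TABLE_MID1
    mid2 = k & 63           # lowest 6 bits, index into TABLE_MID2
    # Remainder after removing hi + mid1/64 + mid2/4096 (in units of ln 2).
    lo = x - k * (LN2 / 4096.0)
    # Cubic Taylor polynomial for (exp(lo) - 1) / lo, in Horner form.
    p = 1.0 + lo * (0.5 + lo * (1.0 / 6.0 + lo * (1.0 / 24.0)))
    # Recombine: 2^hi * 2^mid1 * 2^mid2 * (1 + lo * P(lo)).
    return math.ldexp(TABLE_MID1[mid1] * TABLE_MID2[mid2] * (1.0 + lo * p), hi)


if __name__ == "__main__":
    for x in (-20.0, -1.5, 0.0, 0.3, 1.0, 7.25, 20.0):
        print(x, exp_sketch(x), math.exp(x))
```

Because the reduced argument `lo` is tiny, a low-degree polynomial is enough for the fast path; a correctly rounded implementation additionally needs the accuracy test and higher-precision fallbacks that the commit messages describe.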


52 Commits