52 Commits

Author SHA1 Message Date
Vinayak Dev
3b961d113e [libc] Implement roundeven C23 math functions (#87678)
Implements the functions `roundeven()`, `roundevenf()`, `roundevenl()`
from the roundeven family of functions introduced in C23. Also
implements `roundevenf128()`.
2024-04-05 08:36:12 -04:00
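
The tie-breaking rule behind the roundeven family can be illustrated with a minimal Python sketch. This is not the libc implementation; the name `roundeven_sketch` is hypothetical and the code only mirrors the round-half-to-even behaviour for doubles.

```python
import math

def roundeven_sketch(x: float) -> float:
    """Round to the nearest integer, sending halfway cases to the even neighbour."""
    if math.isnan(x) or math.isinf(x):
        return x  # NaN and infinities pass through unchanged
    lower = math.floor(x)   # largest integer <= x
    frac = x - lower        # distance from that integer up to x
    if frac < 0.5:
        result = lower
    elif frac > 0.5:
        result = lower + 1
    else:
        # Exact tie: pick whichever neighbour is even.
        result = lower if lower % 2 == 0 else lower + 1
    # copysign keeps the sign of zero, so -0.5 rounds to -0.0.
    return math.copysign(float(result), x)
```

For example, `roundeven_sketch(2.5)` and `roundeven_sketch(1.5)` both give `2.0`, whereas the classic `round()` from C would give `3.0` and `2.0`.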
OverMighty
a8c59750d9 [libc][math][c23] Add exp2m1f C23 math function (#86996)
Fixes #86502.

cc @lntue
2024-04-04 08:22:45 -04:00
Vinayak Dev
986435c765 [libc] Move {f,d}sqrt to higher functions in docs (#87445)
Moves the functions `fsqrt()` and `dsqrt()` from basic functions to
higher math functions in math docs
2024-04-02 22:39:48 -04:00
Vinayak Dev
2fb5440e76 [libc] Re-organize the math function tables in docs (#87412)
Re-organizes the tables that listed libc's support for math functions,
and adds two new columns to the tables indicating where the respective
function definitions and error handling methods are located in the C23
standard draft WG14-N3096.
2024-04-02 22:23:35 -04:00
lntue
2be722587f [libc][math] Implement atan2f correctly rounded to all rounding modes. (#86716)
We compute atan2f(y, x) in 2 stages:
- Fast step: perform computations in double precision, with relative
errors < 2^-50
- Accurate step: if the result from the Fast step fails Ziv's rounding
test, then we perform computations in double-double precision, with
relative errors < 2^-100.

On Ryzen 5900X, worst-case latency is ~ 200 clocks, compared to average
latency ~ 60 clocks, and average reciprocal throughput ~ 20 clocks.
2024-04-01 13:31:07 -04:00
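
The two-stage structure described in the atan2f commit above can be sketched as follows. This is a simplified illustration, not the libc code: `math.atan2` stands in for the fast step and is merely assumed to meet the 2^-50 relative error bound, `accurate_step` is a hypothetical callable for the double-double stage, and only round-to-nearest is covered.

```python
import math
import struct

def to_f32(x: float) -> float:
    """Round a double to the nearest single-precision value (ties to even)."""
    return struct.unpack("f", struct.pack("f", x))[0]

def ziv_round_f32(approx: float, rel_err: float):
    """Ziv-style rounding test.

    Returns the float32 result when both ends of the error interval round
    to the same value, and None when the fast result is inconclusive.
    """
    slack = abs(approx) * rel_err
    lo = to_f32(approx - slack)
    hi = to_f32(approx + slack)
    return lo if lo == hi else None

def atan2f_sketch(y: float, x: float, accurate_step=None) -> float:
    # Fast step: double precision, assumed relative error < 2^-50.
    fast = math.atan2(y, x)
    result = ziv_round_f32(fast, 2.0 ** -50)
    if result is not None:
        return result
    # Accurate step: only reached when the rounding test fails.
    if accurate_step is None:
        raise NotImplementedError("accurate step is not part of this sketch")
    return accurate_step(y, x)
```

Almost all inputs return from the fast step. A value such as `1.0 + 2.0**-24`, which sits exactly halfway between two float32 numbers, is the kind of case where `ziv_round_f32` returns `None` and the slower stage would take over.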
Nick Desaulniers
8a071678a9 Revert "[libc][math][c23] Add remaining linux/* entrypoints for {,u}fromfp{,x}* (#86692)"
This reverts commit cd17082b24 because the newly
added tests fail on 32b ARM.

Link: #86692
Link: https://lab.llvm.org/buildbot/#/builders/229/builds/24458
2024-03-27 13:28:26 -07:00
OverMighty
cd17082b24 [libc][math][c23] Add remaining linux/* entrypoints for {,u}fromfp{,x}* (#86692) 2024-03-27 12:28:27 -07:00
Shourya Goel
19ca79e867 [libc][math][c23] Implement canonicalize functions (#85940)
Fixes: #85286
2024-03-26 08:28:22 -04:00
OverMighty
b282259711 [libc][math][c23] Add {,u}fromfp{,x}{,f,l,f128} functions (#86003)
Fixes #85279.

cc @lntue
2024-03-25 10:26:22 -04:00
OverMighty
85b6af198f [libc][math][c23] Add linux/* entrypoints for nextup* and nextdown* (#85803)
See
https://github.com/llvm/llvm-project/pull/85484#discussion_r1526971653.

There already were entrypoints for linux/x86_64. I haven't tested on the other
targets and will rely on the buildbots.
2024-03-19 12:47:01 -07:00
OverMighty
a2bad75879 [libc][math][c23] Add nextupl and nextdownl functions (#85484)
Fixes #85283.

cc @lntue
2024-03-16 17:21:07 -04:00
Michael Flanders
b43965adac [libc][math][c23] adds nanf128 (#85201)
Continuing #84689, this one required more changes than the others, so I
am making it a separate PR.

Extends some stuff in `str_to_float.h`, `str_to_integer.h` to work on
types wider than `unsigned long long` and `uint64_t`.

cc @lntue for review.
2024-03-15 13:31:50 -04:00
Michael Flanders
15a55486a5 [libc][math] Adds entrypoint and test for nextafterf128 (#84882) 2024-03-12 23:25:05 -04:00
lntue
4d21e75210 [libc][math][c23] Add fmodl and fmodf128 math functions. (#84600)
- Allow `FMod` template to have different computational types and make
it work for 80-bit long double.
- Switch to `uint64_t` as the intermediate computational type for
`float`, which significantly reduces the latency of `fmodf` when the
exponent difference is large.
2024-03-11 16:27:42 -04:00
lntue
d99bb01422 [libc][NFC] Clean up test/src/math/differential_testing folder, renaming it to performance_testing. (#84646)
Removing all the diff tests.
2024-03-11 11:38:39 -04:00
lntue
99f5e9634b [libc][math][c23] Add modff128 C23 math function. (#84532) 2024-03-09 11:47:22 -05:00
lntue
60d7bf3dbe [libc][math][c23] Add (l|ll)rintf128 and (l|ll)roundf128 math functions. (#84504) 2024-03-08 12:15:02 -05:00
lntue
14171b87a3 [libc][stdfix] Add exp function for short _Accum and _Accum types. (#84391) 2024-03-07 17:58:28 -05:00
lntue
ad33fe1281 [libc][stdfix] Add integer square root with fixed point output functions. (#83959)
Fix https://github.com/llvm/llvm-project/issues/83924.
2024-03-06 18:35:44 -05:00
lntue
aa95aa69b9 [libc][math][c23] Add C23 math functions ilogbf128, logbf128, and llogb(f|l|f128). (#82144) 2024-02-27 12:23:19 -05:00
lntue
ded4ea9752 [libc][stdfix] Add sqrt for fixed point types. (#83042) 2024-02-26 19:36:30 -05:00
Joseph Huber
69c0b2febe [libc][NFC] Remove all trailing spaces from libc (#82831)
Summary:
There are a lot of random trailing spaces on various lines. This patch
just got rid of all of them with `sed 's/\ \+$//g'`.
2024-02-23 16:34:00 -06:00
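
For readers without GNU sed, the effect of the substitution quoted in the commit above can be approximated in Python. This is a sketch with a hypothetical function name; like the sed expression, it removes only trailing space characters and leaves tabs alone.

```python
import re

def strip_trailing_spaces(text: str) -> str:
    """Remove runs of spaces that sit immediately before a line end."""
    # With re.MULTILINE, "$" matches before each newline and at the end of the text.
    return re.sub(r" +$", "", text, flags=re.MULTILINE)
```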
lntue
f01ed3bc88 [libc][stdfix] Add round functions for fixed point types. (#81994) 2024-02-16 12:45:26 -05:00
lntue
2c45bda802 [libc][stdfix] Add abs functions for signed fixed point types. (#81823) 2024-02-15 18:09:40 -05:00
lntue
ff409d39ce [libc][math] Add C23 ldexpf128 math function and fix DyadicFloat conversions for subnormal ranges and 80-bit floating points. (#81780) 2024-02-14 21:35:00 -05:00
lntue
84277fe90f [libc][stdfix] Generate stdfix.h header with fixed point precision macros according to ISO/IEC TR 18037:2008 standard, and add fixed point type support detection. (#81255)
Fixed point extension standard:
https://standards.iso.org/ittf/PubliclyAvailableStandards/c051126_ISO_IEC_TR_18037_2008.zip
2024-02-13 16:48:14 -05:00
lntue
637c37025d [libc][math] Add C23 math function frexpf128. (#81337) 2024-02-09 21:13:14 -05:00
lntue
1f20bc2cd2 [libc][math] Add C23 math function fdimf128. (#81074) 2024-02-09 11:21:04 -05:00
lntue
bcc1635c7f [libc] Enable float128 entrypoints on aarch64 and riscv64. (#80682) 2024-02-07 13:39:19 -05:00
lntue
6ba9d2988b [libc][math] Add float128 rounding functions (ceilf128, floorf128, roundf128, truncf128). (#80634) 2024-02-05 07:37:57 -05:00
felixh5678
0b0cce8978 [libc] Add fminf128 and fmaxf128 implementations for Linux x86_64. (#79307)
Co-authored-by: Felix <felix@Dirks-MacBook-Pro.local>
2024-01-25 15:04:18 -05:00
felixh5678
777eb35614 [libc] Add sqrtf128 implementation for Linux x86_64. (#79195)
Co-authored-by: Tue Ly <lntue@google.com>
Co-authored-by: Felix <felix@Dirks-MacBook-Pro.local>
2024-01-24 10:16:12 -05:00
Petr Hosek
23edf782a2 [libc] Include missing RISC-V stdlib.h and math.h entrypoints (#79034)
This matches the entrypoints for baremetal ARM.
2024-01-22 18:46:48 -08:00
Nick Desaulniers
3044d75485 [libc][arm] add more math.h entrypoints (#77839)
In particular, we have internal customers that would like to use nanf and
scalbnf.

The differences between various entrypoint files can be checked via:

    $ comm -3 <(grep libc\.src path/to/entrypoints.txt | sort) \
       <(grep libc\.src path/to/other/entrypoints.txt | sort)
2024-01-18 08:18:13 -08:00
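
The `comm -3` comparison quoted in the commit above can be mirrored in Python when a shell with process substitution is not available. This is a rough equivalent under stated assumptions: the function name is hypothetical, duplicates are collapsed because sets are used, and lines are compared verbatim, so differing indentation counts as a difference.

```python
import re

def entrypoint_diff(text_a: str, text_b: str):
    """Return (only_in_a, only_in_b) for lines that mention libc.src."""
    pattern = re.compile(r"libc\.src")
    a = {line for line in text_a.splitlines() if pattern.search(line)}
    b = {line for line in text_b.splitlines() if pattern.search(line)}
    # comm -3 suppresses the common lines and keeps the two unique columns.
    return sorted(a - b), sorted(b - a)
```

Passing the contents of two `entrypoints.txt` files returns the entrypoints unique to each side, which is the information the two remaining `comm` columns carry.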
lntue
1048b5999b [libc][math] Add C23 math function fabsf128. (#77825) 2024-01-12 15:00:16 -05:00
Nishant Mittal
0504e93288 [libc][math] Implement nan(f|l) functions (#76690)
Specification: https://en.cppreference.com/w/c/numeric/math/nan
2024-01-05 08:23:23 -05:00
Nishant Mittal
0c49fc4c68 [libc][math] Implement nexttoward functions (#72763)
Implements the `nexttoward`, `nexttowardf` and `nexttowardl` functions.
Also raises the exceptions required by the standard in the `nextafter` functions.

cc: @lntue
2023-11-21 09:02:51 -05:00
lntue
3f906f513e [libc][math] Add initial support for C23 float128 math functions, starting with copysignf128. (#71731) 2023-11-10 14:32:59 -05:00
lntue
bc7a3bd864 [libc][math] Implement powf function correctly rounded to all rounding modes. (#71188)
We compute `pow(x, y)` using the formula
```
  pow(x, y) = x^y = 2^(y * log2(x))
```
We follow similar steps as in `log2f(x)` and `exp2f(x)`, by breaking
down into `hi + mid + lo` parts, in which `hi` parts are computed using
the exponent field directly, `mid` parts will use look-up tables, and
`lo` parts are approximated by polynomials.

We add some speedup for common use-cases:
```
  pow(2, y) = exp2(y)
  pow(10, y) = exp10(y)
  pow(x, 2) = x * x
  pow(x, 1/2) = sqrt(x)
  pow(x, -1/2) = rsqrt(x) - to be added
```
2023-11-06 16:54:25 -05:00
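
The identity and the shortcuts listed in the powf commit above can be sketched in Python. This only illustrates the dispatch order for positive `x` in double precision; it says nothing about the `hi + mid + lo` evaluation or correct rounding. The name `pow_sketch` is hypothetical, and `10.0 ** y` stands in for an `exp10` routine, which Python lacks.

```python
import math

# math.exp2 exists from Python 3.11; fall back to the ** operator otherwise.
_exp2 = getattr(math, "exp2", None) or (lambda t: 2.0 ** t)

def pow_sketch(x: float, y: float) -> float:
    if x <= 0.0:
        raise ValueError("this sketch covers x > 0 only")
    # Shortcuts for common use-cases.
    if y == 2.0:
        return x * x
    if y == 0.5:
        return math.sqrt(x)
    if x == 2.0:
        return _exp2(y)
    if x == 10.0:
        return 10.0 ** y  # stand-in for exp10(y)
    # General case: x^y = 2^(y * log2(x)).
    return _exp2(y * math.log2(x))
```

The general path loses a few bits because `y * log2(x)` is rounded before exponentiation, which is one reason a real implementation needs extra precision in the intermediate steps.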
lntue
da28593d71 [libc][math] Implement double precision expm1 function correctly rounded for all rounding modes. (#67048)
Implementing expm1 function for double precision based on exp function
algorithm:

- Reduced x = log(2) * (hi + mid1 + mid2) + lo, where:
  * hi is an integer
  * mid1 * 2^6 is an integer
  * mid2 * 2^12 is an integer
  * |lo| < 2^-13 + 2^-30
- Then exp(x) - 1 = 2^hi * 2^mid1 * 2^mid2 * exp(lo) - 1 ~ 2^hi *
(2^mid1 * 2^mid2 * (1 + lo * P(lo)) - 2^(-hi) )
- In the fast pass, we take P(lo) to be a degree-3 Taylor polynomial of
(e^lo - 1) / lo in double precision
- If the Ziv accuracy test fails, we use a degree-6 Taylor polynomial of
(e^lo - 1) / lo in double-double precision
- If the Ziv accuracy test still fails, we re-evaluate everything in
128-bit precision.
2023-09-28 16:43:15 -04:00
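
The factored form `2^hi * (2^mid1 * 2^mid2 * (1 + lo * P(lo)) - 2^(-hi))` from the expm1 commit above can be tried out with a plain-double Python sketch. It illustrates the algebra only: the name `expm1_sketch` is hypothetical, a single `2.0 ** (...)` call stands in for the two table look-ups, there is no Ziv test, and accuracy degrades for `x` close to zero because the final subtraction cancels.

```python
import math

LN2 = math.log(2.0)

def expm1_sketch(x: float) -> float:
    # k / 4096 approximates x / ln(2); its low 12 bits carry mid1 + mid2.
    k = round(x * 4096.0 / LN2)
    hi = k >> 12      # integer part (floor, also for negative k)
    rem = k & 4095    # rem / 4096 = mid1 + mid2, in [0, 1)
    lo = x - k * (LN2 / 4096.0)
    # Degree-3 Taylor polynomial of (e^lo - 1) / lo.
    p = 1.0 + lo * (0.5 + lo * (1.0 / 6.0 + lo / 24.0))
    mid = 2.0 ** (rem / 4096.0)  # stand-in for 2^mid1 * 2^mid2
    # Subtract 2^(-hi) inside the bracket, then scale the difference by 2^hi.
    return math.ldexp(mid * (1.0 + lo * p) - math.ldexp(1.0, -hi), hi)
```

Moving the `- 1` inside the bracket as `- 2^(-hi)` keeps the subtraction at the scale of the table product, so the final multiplication by `2^hi` is exact.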
Tue Ly
76bb278ebb [libc][math] Implement double precision exp10 function correctly rounded for all rounding modes.
Implement double precision exp10 function correctly rounded for all
rounding modes.  Using the same algorithm as double precision exp
(https://reviews.llvm.org/D158551) and exp2 (https://reviews.llvm.org/D158812)
functions.

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D159143
2023-08-30 08:43:50 -04:00
Tue Ly
8ca614aa22 [libc][math] Implement double precision exp2 function correctly rounded for all rounding modes.
Implement double precision exp2 function correctly rounded for all
rounding modes.  Using the same algorithm as double precision exp function in
https://reviews.llvm.org/D158551.

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D158812
2023-08-25 10:15:08 -04:00
Tue Ly
434bf16084 [libc][math] Implement double precision exp function correctly rounded for all rounding modes.
Implement double precision exp function correctly rounded for all
rounding modes.  Using 4 stages:
- Range reduction: reduce to `exp(x) = 2^hi * 2^mid1 * 2^mid2 * exp(lo)`.
- Use 64 + 64 LUT for 2^mid1 and 2^mid2, and use cubic Taylor polynomial to
approximate `(exp(lo) - 1) / lo` in double precision.  Relative error in this
step is bounded by 1.5 * 2^-63.
- If the rounding test fails, use degree-6 Taylor polynomial to approximate
`exp(lo)` in double-double precision.  Relative error in this step is bounded by
2^-99.
- If the rounding test still fails, use degree-7 Taylor polynomial to compute
`exp(lo)` in ~128-bit precision.

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D158551
2023-08-24 10:17:17 -04:00
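
The range reduction and fast step from the exp commit above can be sketched in Python as follows. This is a simplified double-precision illustration, not the libc code: the names are hypothetical, the tables are built with `**` rather than precomputed constants, `lo` is obtained with a single subtraction, and the rounding test and both fallback stages are omitted.

```python
import math

LN2 = math.log(2.0)
# Two 64-entry look-up tables: 2^(i/64) for mid1 and 2^(j/4096) for mid2.
EXP2_MID1 = [2.0 ** (i / 64.0) for i in range(64)]
EXP2_MID2 = [2.0 ** (j / 4096.0) for j in range(64)]

def reduce_exp(x: float):
    """Split x so that exp(x) = 2^hi * 2^mid1 * 2^mid2 * exp(lo)."""
    k = round(x * 4096.0 / LN2)   # k / 4096 approximates x / ln(2)
    hi = k >> 12                  # integer part (floor, also for negative k)
    i = (k >> 6) & 63             # mid1 = i / 64
    j = k & 63                    # mid2 = j / 4096
    lo = x - k * (LN2 / 4096.0)   # small remainder, |lo| < 2^-13
    return hi, i, j, lo

def exp_fast(x: float) -> float:
    hi, i, j, lo = reduce_exp(x)
    # Cubic Taylor polynomial for (exp(lo) - 1) / lo.
    p = 1.0 + lo * (0.5 + lo * (1.0 / 6.0 + lo / 24.0))
    return math.ldexp(EXP2_MID1[i] * EXP2_MID2[j] * (1.0 + lo * p), hi)
```

Because `|lo|` stays below `2^-13`, a cubic polynomial already approximates `(exp(lo) - 1) / lo` to far better than double precision; the remaining error in this sketch comes mostly from rounding in the reduction itself.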
Tue Ly
f320fefc4a [libc][math] Implement erff function correctly rounded to all rounding modes.
Implement correctly rounded `erff` functions.

For `x >= 4`, `erff(x) = 1` for `FE_TONEAREST` or `FE_UPWARD`, and `0x1.fffffep-1` (the largest single-precision value below 1) for `FE_DOWNWARD` or `FE_TOWARDZERO`.

For `0 <= x < 4`, we divide into 32 sub-intervals of length `1/8`, and use a degree-15 odd polynomial to approximate `erff(x)` in each sub-interval:
```
  erff(x) ~ x * (c0 + c1 * x^2 + c2 * x^4 + ... + c7 * x^14).
```

For `x < 0`, we can use the same formula as above, since the odd part is factored out.

Performance tested with `perf.sh` tool from the CORE-MATH project on AMD Ryzen 9 5900X:

Reciprocal throughput (clock cycles / op)
```
$ ./perf.sh erff --path2
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH reciprocal throughput --  with -march=native      (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 11.790 + 0.182 clc/call; Median-Min = 0.154 clc/call; Max = 12.255 clc/call;
-- CORE-MATH reciprocal throughput --  with -march=x86-64-v2      (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 14.205 + 0.151 clc/call; Median-Min = 0.159 clc/call; Max = 15.893 clc/call;

-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 45.519 + 0.445 clc/call; Median-Min = 0.552 clc/call; Max = 46.345 clc/call;

-- LIBC reciprocal throughput --  with -mavx2 -mfma     (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 9.595 + 0.214 clc/call; Median-Min = 0.220 clc/call; Max = 9.887 clc/call;
-- LIBC reciprocal throughput --  with -msse4.2     (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 10.223 + 0.190 clc/call; Median-Min = 0.222 clc/call; Max = 10.474 clc/call;
```

and latency (clock cycles / op):
```
$ ./perf.sh erff --path2
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency --  with -march=native      (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 38.566 + 0.391 clc/call; Median-Min = 0.503 clc/call; Max = 39.170 clc/call;
-- CORE-MATH latency --  with -march=x86-64-v2      (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 43.223 + 0.667 clc/call; Median-Min = 0.680 clc/call; Max = 43.913 clc/call;

-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 111.613 + 1.267 clc/call; Median-Min = 1.696 clc/call; Max = 113.444 clc/call;

-- LIBC latency --  with -mavx2 -mfma     (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 40.138 + 0.410 clc/call; Median-Min = 0.536 clc/call; Max = 40.729 clc/call;
-- LIBC latency --  with -msse4.2     (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 44.858 + 0.872 clc/call; Median-Min = 0.814 clc/call; Max = 46.019 clc/call;
```

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153683
2023-06-28 13:58:37 -04:00
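
The sub-interval selection and odd-polynomial evaluation described in the erff commit above can be sketched in Python. The real coefficient table is not given in the message, so this sketch fills in only the first row, using Maclaurin coefficients of erf as a stand-in that is valid on `[0, 1/8)`. All names are hypothetical.

```python
import math

# Stand-in for row 0 of a 32-row table: Maclaurin coefficients c0..c7 of erf(x) / x.
TWO_OVER_SQRT_PI = 2.0 / math.sqrt(math.pi)
ROW0 = [TWO_OVER_SQRT_PI * (-1) ** n / (math.factorial(n) * (2 * n + 1))
        for n in range(8)]

def subinterval_index(x: float) -> int:
    """Index of the length-1/8 sub-interval containing |x| (0..31 for |x| < 4)."""
    return int(abs(x) * 8.0)

def eval_odd_poly(x: float, coeffs) -> float:
    """Evaluate x * (c0 + c1*x^2 + ... + c7*x^14) with Horner's rule in x^2."""
    x2 = x * x
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x2 + c
    return x * acc

def erf_sketch(x: float) -> float:
    if subinterval_index(x) != 0:
        raise NotImplementedError("only the first sub-interval has coefficients here")
    return eval_odd_poly(x, ROW0)
```

Because the polynomial in `x^2` is even and the leading factor `x` carries the sign, negative inputs need no separate branch: `erf_sketch(-0.1)` is exactly `-erf_sketch(0.1)`.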
Tue Ly
e557b8a142 [libc][RISCV] Add log, log2, log1p, log10 for RISC-V64 entrypoints.
Add log, log2, log1p, log10 RISCV64 entrypoints.

Reviewed By: michaelrj, sivachandra

Differential Revision: https://reviews.llvm.org/D151674
2023-05-30 14:18:19 -04:00
Tue Ly
0bda541829 [libc][doc] Update math function status page to show more targets.
Show availability of math functions on each target.

Reviewed By: jeffbailey

Differential Revision: https://reviews.llvm.org/D151489
2023-05-25 19:24:33 -04:00
Kazu Hirata
9a515d8142 [libc] Fix typos in documentation 2023-05-22 23:27:59 -07:00
Kazu Hirata
e042efdab6 [libc] Fix typos in documentation 2023-04-24 23:31:48 -07:00
Tue Ly
f63025f52f [libc][Obvious] Fix the performance table in math function documentation. 2023-04-18 14:10:26 -04:00
Tue Ly
9af8dca70f [libc][math] Update range reduction step for log10f and reduce its latency.
Simplify the range reduction steps by choosing the reduction constants
carefully so that the reduced arguments v = r*m_x - 1 and v^2 are exact in double
precision, even without FMA instructions, and -2^-8 <= v < 2^-7.  This allows the
polynomial evaluations to be parallelized more efficiently.

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D147676
2023-04-07 10:31:46 -04:00