Commit Graph

97 Commits

Author SHA1 Message Date
lntue
ca8b14de51 [libc][math] Implement fast pass for double precision atan2 with 1 ULP errors. (#100648) 2024-07-26 09:56:46 -04:00
Mikhail R. Gadelha
e90d552c77 [libc][NFC] Update riscv documentation (#100578)
This adds linux-riscv32 to the documentation and fixes riscv's
entrypoint broken link.
2024-07-25 13:25:09 -03:00
Job Henandez Lara
7b51777ed8 [libc][math][c23] add entrypoints and tests for totalordermag{f,l,f128} (#100159)
Fixes https://github.com/llvm/llvm-project/issues/100139
2024-07-24 19:53:23 -04:00
Job Henandez Lara
c1562374c8 [libc][math][c23] Add entrypoints and tests for dsqrt{l,f128} (#99815) 2024-07-21 15:55:11 -04:00
Job Henandez Lara
af0f58cf14 [libc][math][c23] Add entrypoints and tests for fsqrt{,l,f128} (#99669) 2024-07-21 11:17:41 -04:00
aaryanshukla
a2f61ba08b [libc][math]fadd implementation (#99694)
- **[libc] math fadd**
- **[libc][math] implemented fadd**
2024-07-19 14:40:34 -07:00
OverMighty
9fb049c8c6 [libc][math][c23] Add {f,d}mul{l,f128} and f16mul{,f,l,f128} C23 math functions (#98972)
Part of #93566.
                
Fixes #94833.
2024-07-18 19:50:49 +02:00
lntue
7fc9fb9f3f [libc][math] Implement double precision cbrt correctly rounded to all rounding modes. (#99262)
Division-less Newton iterations algorithm for cube roots.

1. **Range reduction**

For `x = (-1)^s * 2^e * (1.m)`, we get 2 reduced arguments `x_r` and `a`
as:
```
  x_r = 1.m
  a   = (-1)^s * 2^(e % 3) * (1.m)
```
Then `cbrt(x) = x^(1/3)` can be computed as:
```
  x^(1/3) = 2^(e / 3) * a^(1/3).
```

In order to avoid division, we compute `a^(-2/3)` using Newton method
and then
multiply the results by a:
```
  a^(1/3) = a * a^(-2/3).
```

2. **First approximation to a^(-2/3)**

First, we use a degree-7 minimax polynomial generated by Sollya to
approximate `x_r^(-2/3)` for `1 <= x_r < 2`.
```
  p = P(x_r) ~ x_r^(-2/3),
```
with relative errors bounded by:
```
  | p / x_r^(-2/3) - 1 | < 1.16 * 2^-21.
```

Then we multiply with `2^(e % 3)` from a small lookup table to get:
```
  x_0 = 2^(-2*(e % 3)/3) * p
      ~ 2^(-2*(e % 3)/3) * x_r^(-2/3)
      = a^(-2/3)
```
with relative errors:
```
  | x_0 / a^(-2/3) - 1 | < 1.16 * 2^-21.
```
This step is done in double precision.

3. **First Newton iteration**

We follow the method described in:
Sibidanov, A. and Zimmermann, P., "Correctly rounded cubic root
evaluation
in double precision", https://core-math.gitlabpages.inria.fr/cbrt64.pdf
to derive multiplicative Newton iterations as below:
Let `x_n` be the nth approximation to `a^(-2/3)`. Define the n^th error
as:
```
  h_n = x_n^3 * a^2 - 1
```
Then:
```
  a^(-2/3) = x_n / (1 + h_n)^(1/3)
           = x_n * (1 - (1/3) * h_n + (2/9) * h_n^2 - (14/81) * h_n^3 + ...)
```
using the Taylor series expansion of `(1 + h_n)^(-1/3)`.

Apply to `x_0` above:
```
  h_0 = x_0^3 * a^2 - 1
      = a^2 * (x_0 - a^(-2/3)) * (x_0^2 + x_0 * a^(-2/3) + a^(-4/3)),
```
it's bounded by:
```
  |h_0| < 4 * 3 * 1.16 * 2^-21 * 4 < 2^-17.
```
So in the first iteration step, we use:
```
  x_1 = x_0 * (1 - (1/3) * h_n + (2/9) * h_n^2 - (14/81) * h_n^3)
```
Its relative error is bounded by:
```
  | x_1 / a^(-2/3) - 1 | < 35/242 * |h_0|^4 < 2^-70.
```
Then we perform Ziv's rounding test and check if the answer is exact.
This step is done in double-double precision.

4. **Second Newton iteration**

If the Ziv's rounding test from the previous step fails, we define the
error
term:
```
  h_1 = x_1^3 * a^2 - 1,
```
And perform another iteration:
```
  x_2 = x_1 * (1 - h_1 / 3)
```
with the relative errors exceed the precision of double-double.
We then check the Ziv's accuracy test with relative errors < 2^-102 to
compensate for rounding errors.

5. **Final iteration**
 
If the Ziv's accuracy test from the previous step fails, we perform
another
iteration in 128-bit precision and check for exact outputs.
2024-07-17 12:23:14 -04:00
Joseph Huber
60ff9c2ea5 [libc] Add support for powi as an LLVM libc extension on the GPU (#98236)
Summary:
This function is used by the CUDA / HIP / OpenMP headers and exists as
an NVIDIA extension basically. This function is implemented in the C23
standard as `pown`, but for now we need to provide `powi` for backwards
compatibility. In the future this entrypoint will just be a redirect to
`pown` once that is implemented.
2024-07-09 20:51:36 -05:00
lntue
c9ee6b1977 [libc][math] Implement cbrtf function correctly rounded to all rounding modes. (#97936)
Fixes https://github.com/llvm/llvm-project/issues/92874

Algorithm: Let `x = (-1)^s * 2^e * (1 + m)`.
- Step 1: Range reduction: reduce the exponent with:
```
  y = cbrt(x) = (-1)^s * 2^(floor(e/3)) * 2^((e % 3)/3) * (1 + m)^(1/3)
```
- Step 2: Use the first 4 bit fractional bits of `m` to look up for a
degree-7 polynomial approximation to:
```
  (1 + m)^(1/3) ~ 1 + m * P(m).
```
- Step 3: Perform the multiplication:
```
  2^((e % 3)/3) * (1 + m)^(1/3).
```
- Step 4: Check for exact cases to prevent rounding and clear
`FE_INEXACT` floating point exception.
- Step 5: Combine with the exponent and sign before converting down to
`float` and return.
2024-07-08 10:02:12 -04:00
Hendrik Hübner
f8834ed24b [libc][C23][math] Implement cospif function correctly rounded for all rounding modes (#97464)
I also fixed a comment in sinpif.cpp in the first commit. Should this be
included in this PR?

All tests were passed, including the exhaustive test.

CC: @lntue
2024-07-06 09:24:05 -04:00
OverMighty
ac76ce2693 [libc][math][c23] Classify f16fma{,f,l} as LLVM libc extensions (#97728) 2024-07-05 09:58:01 -04:00
lntue
7d68d9d2f2 [libc][math] Implement correctly rounded double precision tan (#97489)
Using the same range reduction as `sin`, `cos`, and `sincos`:
1) Reducing `x = k*pi/128 + u`, with `|u| <= pi/256`, and `u` is in
double-double.
2) Approximate `tan(u)` using degree-9 Taylor polynomial.
3) Compute
```
   tan(x) ~ (sin(k*pi/128) + tan(u) * cos(k*pi/128)) / (cos(k*pi/128) - tan(u) * sin(k*pi/128))
```
using the fast double-double division algorithm in [the CORE-MATH
project](https://gitlab.inria.fr/core-math/core-math/-/blob/master/src/binary64/tan/tan.c#L1855).
4) Perform relative-error Ziv's accuracy test
5) If the accuracy tests failed, we redo the computations using 128-bit
precision `DyadicFloat`.

Fixes https://github.com/llvm/llvm-project/issues/96930
2024-07-03 18:05:24 -04:00
OverMighty
4e56724213 [libc][math][c23] Add f16{add,sub}{,l,f128} C23 math functions (#97072)
Part of #93566.
2024-07-02 19:27:09 -04:00
OverMighty
12a1e6dd12 [libc][math][c23] Add f16{add,sub}f C23 math functions (#96787)
Part of #93566.
2024-07-02 09:16:12 -04:00
Hendrik Hübner
ea93c538c7 [libc][math][c23] Implemented sinpif function correctly rounded for all rounding modes. (#97149)
This implements the sinpif function. An exhaustive test shows it's
correct for all rounding modes.

Issue:  #94895
2024-07-01 16:38:03 -04:00
OverMighty
6c1c451b86 [libc][math][c23] Add f16sqrt{,l,f128} C23 math functions (#96642)
Part of #95250.
2024-06-30 19:20:39 -04:00
OverMighty
56ef6a2eb2 [libc][math][c23] Add f16div{,l,f128} C23 math functions (#97054)
Part of #93566.
2024-06-29 18:48:12 -04:00
OverMighty
e34dbb127a [libc][math][c23] Add f16fma{,l,f128} C23 math function (#96711)
Part of #93566.
2024-06-27 14:44:19 -04:00
lntue
4080f174ab [libc][math] Implement double precision sincos correctly rounded to all rounding modes. (#96719)
Sharing the same algorithm as double precision sin:
https://github.com/llvm/llvm-project/pull/95736 and cos:
https://github.com/llvm/llvm-project/pull/96591
2024-06-27 10:15:22 -04:00
lntue
88f80aeb0c [libc][math] Implement double precision cos correctly rounded to all rounding modes. (#96591)
Sharing the same algorithm as double precision sin:
https://github.com/llvm/llvm-project/pull/95736
2024-06-25 16:51:31 -04:00
OverMighty
edbe698ead [libc][math][c23] Add f16divf C23 math function (#96131)
Part of #93566.
2024-06-25 08:48:28 -04:00
lntue
16903ace18 [libc][math] Implement double precision sin correctly rounded to all rounding modes. (#95736)
- Algorithm:
- Step 1 - Range reduction: for a double precision input `x`, return `k`
and `u` such that
    - k is an integer
    - u = x - k * pi / 128, and |u| < pi/256
- Step 2 - Calculate `sin(u)` and `cos(u)` in double-double using Taylor
polynomials with errors < 2^-70 with FMA or < 2^-66 w/o FMA.
- Step 3 - Calculate `sin(x) = sin(k*pi/128) * cos(u) + cos(k*pi/128) *
sin(u)` using look-up table for `sin(k*pi/128)` and `cos(k*pi/128)`.
- Step 4 - Use Ziv's rounding test to decide if the result is correctly
rounded.
- Step 4' - If the Ziv's rounding test failed, redo step 1-3 using
128-bit precision.
- Currently, without FMA instructions, the large range reduction only
works correctly for the default rounding mode (FE_TONEAREST).
- Provide `LIBC_MATH` flag so that users can set `LIBC_MATH =
LIBC_MATH_SKIP_ACCURATE_PASS` to build the `sin` function without step 4
and 4'.
2024-06-24 17:57:08 -04:00
OverMighty
b5efd21429 [libc][math][c23] Add {ldexp,scalbn,scalbln}f16 C23 math functions (#94797)
Part of #93566.
2024-06-21 09:01:47 -04:00
OverMighty
1107575c95 [libc][math][c23] Add {getpayload,setpayload,setpayloadsig}f16 C23 math functions (#95159)
Part of #93566.
2024-06-20 13:33:34 -04:00
OverMighty
f3aceeee8a [libc][math][c23] Add f16fmaf C23 math function (#95483)
Part of #93566.
2024-06-14 12:31:32 -04:00
OverMighty
a239343521 [libc][math][c23] Add f16sqrtf C23 math function (#95251)
Part of #95250.
2024-06-13 12:57:24 -04:00
OverMighty
f5dcfb9968 [libc][math][c23] Add {totalorder,totalordermag}f16 C23 math functions (#95014)
Part of #93566.
2024-06-11 11:04:48 -04:00
OverMighty
7683a16dbf [libc][math][c23] Add {remainder,remquo}f16 C23 math functions (#94773)
Part of #93566.
2024-06-10 11:02:09 -04:00
OverMighty
10cd96dd33 [libc][math][c23] Add {frexp,ilogb,llogb,logb,modf}f16 C23 math functions (#94758)
Part of #93566.
2024-06-10 08:38:47 -04:00
OverMighty
cb1a727dea [libc][math][c23] Add nanf16 C23 math function (#94767)
Part of #93566.
2024-06-10 00:19:22 -04:00
Hendrik Hübner
44aecca020 [libc][math][C23] Implemented remquof128 function (#94809)
Added remquof128 function. Closes #94312
2024-06-08 15:08:45 -04:00
Job Henandez Lara
263be9fb00 [libc][math][c23] fmul correcly rounded to all rounding modes (#91537)
This is an implementation of floating point multiplication:

It will consist of 
   - `double x double -> float`
2024-06-08 15:07:27 -04:00
OverMighty
0cdb0b7473 [libc][math][c23] Add fmodf16 C23 math function (#94629)
Part of #93566.
2024-06-07 18:26:58 -04:00
OverMighty
dd1cd02a43 [libc][math][c23] Add {fmaximum,fminimum}{,_mag,_mag_num,_num} C23 math functions (#94510)
#93566
2024-06-06 11:20:29 -04:00
OverMighty
63cda2d19c [libc][math][c23] Add {nextafter,nexttoward,nextup,nextdown}f16 C23 math functions (#94535)
#93566
2024-06-05 23:06:48 -04:00
Hendrik Hübner
8e67495326 [libc][math][c23] Implement fmaxf16 and fminf16 function (#94131)
Implements fmaxf16 and fminf16, which are two missing functions listed
here: #93566
2024-06-05 22:44:44 -04:00
OverMighty
c537f35646 [libc][math][c23] Add fdimf16 C23 math function (#94354)
#93566
2024-06-05 10:37:55 -04:00
OverMighty
6c97303681 [libc][math][c23] Add copysignf16 C23 math function (#94351)
#93566
2024-06-05 09:46:36 -04:00
OverMighty
3614beede1 [libc][math][c23] Add canonicalizef16 C23 math function (#94341)
#93566
2024-06-05 08:24:23 -04:00
OverMighty
6b5ae148e5 [libc][math][c23] Add {fromfp,fromfpx,ufromfp,ufromfpx}f16 C23 math functions (#94254)
https://github.com/llvm/llvm-project/issues/93566
2024-06-04 18:29:53 -04:00
OverMighty
2635d0419e [libc][math][c23] Add {nearbyint,rint,lrint,llrint,lround,llround}f16 C23 math functions (#94218)
https://github.com/llvm/llvm-project/issues/93566
2024-06-04 10:03:31 -04:00
OverMighty
25b037bdb5 [libc][math][c23] Add {ceil,floor,round,roundeven,trunc}f16 C23 math functions (#94001) 2024-06-03 14:28:51 -04:00
OverMighty
0eb9e021b1 [libc][math][c23] Add fabsf16 C23 math function (#93567)
cc @lntue
2024-05-30 15:37:15 -04:00
Michael Flanders
5e9937d1b3 [libc][math] Adds entrypoint and tests for nearbyintf128,scalbnf128 (#88443)
Closes #84689.

Adding @lntue for review.

I was curious about the implementation of
`round_using_current_rounding_mode` used for the `nearbyint` functions.
It has one of the rounding modes as unreachable
([here](https://github.com/llvm/llvm-project/blob/main/libc/src/__support/FPUtil/NearestIntegerOperations.h#L243)),
and I was wondering if this was okay for the `nearbyint` functions.

---------

Co-authored-by: Michael Flanders <mkf727@cs.washington.edu>
2024-04-29 19:25:45 -04:00
Vinayak Dev
3b961d113e [libc] Implement roundeven C23 math functions (#87678)
Implements the functions `roundeven()`, `roundevenf()`, `roundevenl()`
from the roundeven family of functions introduced in C23. Also
implements `roundevenf128()`.
2024-04-05 08:36:12 -04:00
OverMighty
a8c59750d9 [libc][math][c23] Add exp2m1f C23 math function (#86996)
Fixes #86502.

cc @lntue
2024-04-04 08:22:45 -04:00
Vinayak Dev
986435c765 [libc] Move {f,d}sqrt to higher functions in docs (#87445)
Moves the functions `fsqrt()` and `dsqrt()` from basic functions to
higher math functions in math docs
2024-04-02 22:39:48 -04:00
Vinayak Dev
2fb5440e76 [libc] Re-organize the math function tables in docs (#87412)
Re-organizes the tables that listed libc's support for math functions,
and adds two new columns to the tables indicating where the respective
function definitions and error handling methods are located in the C23
standard draft WG14-N3096.
2024-04-02 22:23:35 -04:00
lntue
2be722587f [libc][math] Implement atan2f correctly rounded to all rounding modes. (#86716)
We compute atan2f(y, x) in 2 stages:
- Fast step: perform computations in double precision , with relative
errors < 2^-50
- Accurate step: if the result from the Fast step fails Ziv's rounding
test, then we perform computations in double-double precision, with
relative errors < 2^-100.

On Ryzen 5900X, worst-case latency is ~ 200 clocks, compared to average
latency ~ 60 clocks, and average reciprocal throughput ~ 20 clocks.
2024-04-01 13:31:07 -04:00