120 Commits

Author SHA1 Message Date
Tue Ly
5814b7b279 [libc][math] Implement log10 function correctly rounded for all rounding modes
Implement double precision log10 function correctly rounded for all
rounding modes.  This implementation currently needs FMA instructions for
correctness.

Use 2 passes:
Fast pass:
- 1 step range reduction with a lookup table of `2^7 = 128` elements to reduce the ranges to `[-2^-7, 2^-7]`.
- Use a degree-7 minimax polynomial generated by Sollya, evaluated using a mix of double-double and double precision.
- Apply Ziv's test for accuracy.
Accurate pass:
- Apply 5 more range reduction steps to reduce the ranges further to [-2^-27, 2^-27].
- Use a degree-4 minimax polynomial generated by Sollya, evaluated using 192-bit precision.
- By the result of Lefevre (add quote), this is more than enough for correct rounding in all rounding modes.
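
The single-step table reduction of the fast pass can be sketched as follows. This is only an illustration under stated assumptions: the table of midpoint reciprocals `R`, the helper names `reduce_log10` and `log10_fast`, and the Taylor polynomial (used here in place of the Sollya minimax polynomial) are hypothetical and not taken from the patch, and the sketch uses plain double arithmetic with no double-double evaluation, no Ziv's test, and no accurate pass.

```python
import math

# Hypothetical 128-entry table: R[i] approximates 1/m for m in
# [1 + i/128, 1 + (i+1)/128). Midpoint reciprocals are an assumption,
# not the table used by the patch.
R = [1.0 / (1.0 + (i + 0.5) / 128.0) for i in range(128)]
LOG10_R = [math.log10(r) for r in R]
LOG10_2 = math.log10(2.0)


def reduce_log10(x):
    """Split x > 0 as x = 2^e * m and return (e, i, u) with u = R[i]*m - 1."""
    m, e = math.frexp(x)        # x = m * 2^e with 0.5 <= m < 1
    m, e = 2.0 * m, e - 1       # renormalize to 1 <= m < 2
    i = int((m - 1.0) * 128.0)  # index from the top 7 fraction bits
    u = R[i] * m - 1.0          # reduced argument, |u| < 2^-7
    return e, i, u


def log10_fast(x):
    """log10(x) = e*log10(2) - log10(R[i]) + log10(1 + u)."""
    e, i, u = reduce_log10(x)
    # Degree-7 Taylor polynomial of log(1 + u) in Horner form
    # (a stand-in for the minimax polynomial).
    p = 0.0
    for n in range(7, 0, -1):
        p = u * ((1.0 if n % 2 else -1.0) / n + p)
    return e * LOG10_2 - LOG10_R[i] + p / math.log(10.0)
```

Because `x = 2^e * (1 + u) / R[i]`, the three terms in the return statement add up to `log10(x)`; the real implementation would keep each of them in higher precision to make the rounding test meaningful.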

In progress: Adding detailed documentation about the algorithm.

Depend on: https://reviews.llvm.org/D136799

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D139846
2023-01-08 17:41:54 -05:00
Guillaume Chatelet
436c8f4420 [reland][libc] Add bcopy
Differential Revision: https://reviews.llvm.org/D138994
2022-12-01 10:07:04 +00:00
Guillaume Chatelet
c5fe7eb216 Revert D138994 "[libc] Add bcopy"
Broke build bot

This reverts commit 186a15f7a9.
2022-12-01 09:55:36 +00:00
Guillaume Chatelet
186a15f7a9 [libc] Add bcopy
Differential Revision: https://reviews.llvm.org/D138994
2022-12-01 09:52:10 +00:00
Joseph Huber
dabb7514f5 [libc] Fix assert.h and ctype.h not being built
The `assert.h` and `ctype.h` headers are never built despite their
entrypoints being present in the generated library. This patch adds a
dependency on these headers so that they will be built properly.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D138142
2022-11-16 12:00:41 -06:00
Raman Tenneti
78f172e45a [libc] Implement gettimeofday
Implement gettimeofday per
.../onlinepubs/9699919799/functions/gettimeofday.html.
This calls clock_gettime to implement the gettimeofday function.

Tested:
Limited unit test: This makes a call and checks that no error was
returned. Used nanosleep for 100 microseconds and verified that the
elapsed time it reports is more than 100 microseconds and less
than 300 microseconds.
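
As a rough illustration of deriving a `gettimeofday`-style result from `clock_gettime`, the sketch below splits a nanosecond clock reading into seconds and microseconds. The choice of `CLOCK_REALTIME` and the helper names `gettimeofday_sketch` and `elapsed_usec` are assumptions made for this example; it relies on Python's `time.clock_gettime_ns`, which is available on Unix-like systems only.

```python
import time


def gettimeofday_sketch():
    """Return (tv_sec, tv_usec) derived from a clock_gettime reading."""
    # Assumption: the realtime clock is the one being queried.
    ns = time.clock_gettime_ns(time.CLOCK_REALTIME)
    tv_sec, tv_nsec = divmod(ns, 1_000_000_000)
    return tv_sec, tv_nsec // 1000  # nanoseconds -> microseconds


def elapsed_usec(start, end):
    """Microseconds between two (tv_sec, tv_usec) pairs."""
    return (end[0] - start[0]) * 1_000_000 + (end[1] - start[1])
```

A check in the spirit of the unit test described above would take one reading, sleep for 100 microseconds, take a second reading, and compare `elapsed_usec` against the expected bounds.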

Co-authored-by: Jeff Bailey <jeffbailey@google.com>

Differential Revision: https://reviews.llvm.org/D137881
2022-11-11 18:02:33 -08:00
Tue Ly
45233cc1ca [libc][math] Add place-holder implementation for pow function.
Add place-holder implementation for pow function to unblock libc demo
examples.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D137109
2022-10-31 17:23:33 -04:00
Tue Ly
97b4cc83e1 [libc][math] Add place-holder implementation for asin to unblock demo examples.
Add a place-holder implementation for asin to unblock libc demo
examples.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D137105
2022-10-31 17:22:12 -04:00
Alex Brachet
5fd03c8176 [libc] Implement getopt
Differential Revision: https://reviews.llvm.org/D133487
2022-10-31 16:55:53 +00:00
Alex Brachet
d6ac84bce8 Revert "[libc] Implement getopt"
This reverts commit a678f86351.
2022-10-27 06:47:24 +00:00
Alex Brachet
a678f86351 [libc] Implement getopt
Differential Revision: https://reviews.llvm.org/D133487
2022-10-27 06:23:33 +00:00
Siva Chandra
07b7023181 [libc] Enable more entrypoints on aarch64. 2022-10-26 14:03:15 -07:00
Siva Chandra
f8490601c2 [libc] Enable a few entrypoints on aarch64 already available on x86_64. 2022-10-25 15:41:41 -07:00
Raman Tenneti
12204429f2 [libc] Add implementation of difftime function.
The difftime function computes the difference between two calendar
times: time1 - time0, as per section 7.27.2.2 in
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2478.pdf.

  double difftime(time_t time1, time_t time0);

Tested:
Unit tests

Co-authored-by: Jeff Bailey <jeffbailey@google.com>

Reviewed By: jeffbailey

Differential Revision: https://reviews.llvm.org/D136631
2022-10-24 15:14:26 -07:00
Siva Chandra Reddy
3f965818b6 [libc] Add POSIX execv and execve functions.
The POSIX global variable environ has also been added.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D135351
2022-10-06 19:50:23 +00:00
Siva Chandra
5e56e294ae [libc][Obvious] Enable some of the recently added functions on aarch64. 2022-09-29 15:06:44 -07:00
Raman Tenneti
8f1e362ee9 Implement nanosleep per https://pubs.opengroup.org/onlinepubs/009695399/basedefs/time.h.html
Tested:
Limited unit test: This makes a call and checks that no error was
returned, but we currently don't have the ability to ensure that
time has elapsed as expected.

Co-authored-by: Jeff Bailey <jeffbailey@google.com>

Reviewed By: sivachandra, jeffbailey

Differential Revision: https://reviews.llvm.org/D134095
2022-09-24 00:13:58 +00:00
Tue Ly
a752460d73 [libc][math] Implement exp10f function correctly rounded to all rounding modes.
Implement exp10f function correctly rounded to all rounding modes.

Algorithm: perform range reduction to reduce
```
  10^x = 2^(hi + mid) * 10^lo
```
where:
```
  hi is an integer,
  0 <= mid * 2^5 < 2^5
  -log10(2) / 2^6 <= lo <= log10(2) / 2^6
```
Then `2^mid` is stored in a table of 32 entries and the product `2^hi * 2^mid` is
performed by adding `hi` into the exponent field of `2^mid`.
`10^lo` is then approximated by a degree-5 minimax polynomial generated by Sollya with:
```
  > P = fpminimax((10^x - 1)/x, 4, [|D...|], [-log10(2)/64, log10(2)/64]);
```
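
The decomposition `10^x = 2^(hi + mid) * 10^lo` can be sketched in a few lines. This is a hedged illustration in double precision, not the patch's code: the names `EXP2_MID`, `scale_by_pow2`, and `exp10_sketch` are hypothetical, a degree-5 Taylor polynomial of `e^t` stands in for the Sollya polynomial, and the exponent-field trick is only valid while the result stays in the normal range.

```python
import math
import struct

LOG2_10 = math.log2(10.0)
LOG10_2 = math.log10(2.0)
# 2^mid for mid = j/32, j = 0..31 (the 32-entry table described above).
EXP2_MID = [2.0 ** (j / 32.0) for j in range(32)]


def scale_by_pow2(v, hi):
    """Multiply a normal double v by 2^hi by adding hi to its exponent field."""
    bits = struct.unpack("<Q", struct.pack("<d", v))[0]
    bits += hi << 52  # the exponent field starts at bit 52
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def exp10_sketch(x):
    k = round(x * LOG2_10 * 32.0)   # k / 32 = hi + mid
    hi, j = divmod(k, 32)           # hi is an integer, mid = j/32 in [0, 1)
    lo = x - k * (LOG10_2 / 32.0)   # |lo| <= log10(2)/64, up to rounding
    t = lo * math.log(10.0)         # 10^lo = e^t
    # Degree-5 Taylor polynomial of e^t (stand-in for the minimax polynomial).
    p = 1.0 + t * (1.0 + t * (0.5 + t * (1 / 6 + t * (1 / 24 + t / 120))))
    return scale_by_pow2(EXP2_MID[j], hi) * p
```

Using `divmod` keeps `j` in `0..31` for negative `x` as well, so the table lookup never needs a sign check.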
Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH reciprocal throughput   : 10.215
System LIBC reciprocal throughput : 7.944

LIBC reciprocal throughput        : 38.538
LIBC reciprocal throughput        : 12.175   (with `-msse4.2` flag)
LIBC reciprocal throughput        : 9.862    (with `-mfma` flag)

$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency   : 40.744
System LIBC latency : 37.546

BEFORE
LIBC latency        : 48.989
LIBC latency        : 44.486   (with `-msse4.2` flag)
LIBC latency        : 40.221   (with `-mfma` flag)
```
This patch relies on https://reviews.llvm.org/D134002

Reviewed By: orex, zimmermann6

Differential Revision: https://reviews.llvm.org/D134104
2022-09-19 10:01:40 -04:00
Siva Chandra Reddy
7fb96fb5d3 [libc] Add implementation of POSIX "uname" function.
Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D134065
2022-09-16 21:21:29 +00:00
Siva Chandra Reddy
d23d858d04 [libc] Add the implementation of the "remove" function.
Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D133922
2022-09-15 17:32:02 +00:00
Siva Chandra Reddy
6e675fba3a [libc] Add POSIX functions pread and pwrite.
Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D133888
2022-09-14 20:52:20 +00:00
Siva Chandra Reddy
419580c699 [libc] Add implementation of POSIX function "access".
Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D133814
2022-09-14 07:44:47 +00:00
Siva Chandra Reddy
8989aa003f [libc] Add POSIX functions dup, dup2, and GNU extension function dup3.
Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D133748
2022-09-13 18:06:30 +00:00
Tue Ly
463dcc8749 [libc][math] Implement acosf function correctly rounded for all rounding modes.
Implement acosf function correctly rounded for all rounding modes.

We perform range reduction as follows:

- When `|x| < 2^(-10)`, we use cubic Taylor polynomial:
```
  acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 / 6.
```
- When `2^(-10) <= |x| <= 0.5`, we use the same approximation that is used for `asinf(x)` when `|x| <= 0.5`:
```
  acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 * P(x^2).
```
- When `0.5 < x <= 1`, we use the double angle formula: `cos(2y) = 1 - 2 * sin^2 (y)` to reduce to:
```
  acos(x) = 2 * asin( sqrt( (1 - x)/2 ) )
```
- When `-1 <= x < -0.5`, we reduce to the positive case above using the formula:
```
  acos(x) = pi - acos(-x)
```
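
The four cases above can be put together as in the following sketch. It is an illustration only: `asin_small` is a hypothetical helper that calls `math.asin` as a stand-in for the shared polynomial approximation on `|x| <= 0.5`, and everything runs in plain double precision with no attention to correct rounding.

```python
import math


def asin_small(x):
    """Stand-in for the shared |x| <= 0.5 approximation (really a polynomial)."""
    assert abs(x) <= 0.5
    return math.asin(x)


def acos_sketch(x):
    ax = abs(x)
    if ax < 2.0 ** -10:
        # Cubic Taylor polynomial: acos(x) ~ pi/2 - x - x^3/6.
        return math.pi / 2 - x - x ** 3 / 6
    if ax <= 0.5:
        # acos(x) = pi/2 - asin(x).
        return math.pi / 2 - asin_small(x)
    if x > 0.5:
        # Double angle formula: acos(x) = 2 * asin(sqrt((1 - x)/2)).
        return 2.0 * asin_small(math.sqrt((1.0 - x) / 2.0))
    # -1 <= x < -0.5: reduce to the positive case.
    return math.pi - acos_sketch(-x)
```

For `0.5 < x <= 1` the argument `sqrt((1 - x)/2)` is below `0.5`, which is why the same small-argument approximation can be reused there.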

Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh acosf
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH reciprocal throughput   : 28.613
System LIBC reciprocal throughput : 29.204
LIBC reciprocal throughput        : 24.271

$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency   : 55.554
System LIBC latency : 76.879
LIBC latency        : 62.118
```

Reviewed By: orex, zimmermann6

Differential Revision: https://reviews.llvm.org/D133550
2022-09-09 09:55:30 -04:00
Tue Ly
e2f065c2a3 [libc][math] Implement asinf function correctly rounded for all rounding modes.
Implement asinf function correctly rounded for all rounding modes.

For `|x| <= 0.5`, we approximate `asin(x)` by
```
  asin(x) = x * P(x^2)
```
where `P(X^2) = Q(X)` is a degree-20 minimax even polynomial approximating
`asin(x)/x` on `[0, 0.5]` generated by Sollya with:
```
  > Q = fpminimax(asin(x)/x, [|0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20|],
                 [|1, D...|], [0, 0.5]);
```

When `|x| > 0.5`, we perform range reduction as follows:
Assume further that `0.5 < x <= 1`, and let:
```
  y = asin(x)
```
We will use the double angle formula:
```
  cos(2X) = 1 - 2 sin^2(X)
```
and the complement angle identity:
```
  x = sin(y) = cos(pi/2 - y)
              = 1 - 2 sin^2 (pi/4 - y/2)
```
So:
```
  sin(pi/4 - y/2) = sqrt( (1 - x)/2 )
```
And hence:
```
  pi/4 - y/2 = asin( sqrt( (1 - x)/2 ) )
```
Equivalently:
```
  asin(x) = y = pi/2 - 2 * asin( sqrt( (1 - x)/2 ) )
```
Let `u = (1 - x)/2`, then
```
  asin(x) = pi/2 - 2 * asin(sqrt(u))
```
Moreover, since `0.5 < x <= 1`,
```
  0 <= u < 1/4, and 0 <= sqrt(u) < 0.5.
```
And hence we can reuse the same polynomial approximation of `asin(x)` when
`|x| <= 0.5`:
```
  asin(x) = pi/2 - 2 * sqrt(u) * P(u).
```
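
A small sketch of this scheme, with `u = (1 - x)/2` and the reduced argument `sqrt(u)`, is given below. It is not the patch's implementation: the coefficients are the Taylor coefficients of `asin(t)/t` (a stand-in for the Sollya minimax polynomial), the names `COEFFS`, `P`, and `asin_sketch` are hypothetical, and negative inputs are handled through the oddness of `asin`, which is an assumption added for completeness.

```python
import math

# Taylor coefficients of asin(t)/t in powers of t^2:
# asin(t)/t = sum_n C(2n, n) / (4^n * (2n + 1)) * t^(2n).
COEFFS = [math.comb(2 * n, n) / (4.0 ** n * (2 * n + 1)) for n in range(25)]


def P(z):
    """Evaluate P(z) ~ asin(sqrt(z))/sqrt(z) for 0 <= z <= 0.25 (Horner)."""
    acc = 0.0
    for c in reversed(COEFFS):
        acc = acc * z + c
    return acc


def asin_sketch(x):
    ax = abs(x)
    if ax <= 0.5:
        return x * P(x * x)          # asin(x) = x * P(x^2)
    u = (1.0 - ax) / 2.0             # 0 <= u < 1/4
    s = math.sqrt(u)                 # 0 <= sqrt(u) < 0.5
    r = math.pi / 2 - 2.0 * s * P(u)  # pi/2 - 2 * asin(sqrt(u))
    return math.copysign(r, x)       # asin is odd
```

Since `asin(t) = t * P(t^2)`, substituting `t = sqrt(u)` gives `asin(sqrt(u)) = sqrt(u) * P(u)`, so the polynomial is evaluated at `u` itself.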

Performance benchmark using `perf` tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf
CORE-MATH reciprocal throughput   : 23.418
System LIBC reciprocal throughput : 27.310
LIBC reciprocal throughput        : 22.741

$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency   : 58.884
System LIBC latency : 62.055
LIBC latency        : 62.037
```

Reviewed By: orex, zimmermann6

Differential Revision: https://reviews.llvm.org/D133400
2022-09-07 19:27:47 -04:00
Kirill Okhotnikov
77e1d9beed [libc][math] Added atanf function.
Performance by core-math (core-math/glibc 2.31/current llvm-14):
28.879/20.843/20.15

Differential Revision: https://reviews.llvm.org/D132842
2022-08-30 22:39:54 +02:00
Kirill Okhotnikov
6c1fc7e430 [libc][math] Added atanhf function.
Performance by core-math (core-math/glibc 2.31/current llvm-14):
10.845/43.174/13.467

The review is done on top of D132809.

Differential Revision: https://reviews.llvm.org/D132811
2022-08-30 22:39:54 +02:00
Siva Chandra Reddy
f6506ec443 [libc] Implement POSIX truncate and ftruncate functions for Linux.
Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D132705
2022-08-26 19:27:24 +00:00
Siva Chandra Reddy
b8be3dabde [libc] Add Linux implementation of GNU extension function sendfile.
Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D132721
2022-08-26 19:13:40 +00:00
Siva Chandra Reddy
00e51f04e8 [libc] Implement linux link, linkat, symlink, symlinkat, readlink, readlinkat.
Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D132619
2022-08-25 18:50:39 +00:00
Siva Chandra Reddy
85dff76416 [libc] Add linux implementation of POSIX fchmodat function.
Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D132533
2022-08-24 18:46:29 +00:00
Siva Chandra
d00e97df0f [libc][Obvious] Fix typo in chmod implementation.
This now allows enabling the chmod function on aarch64.
2022-08-23 15:01:21 -07:00
Siva Chandra
8856137ce7 [libc] Enable a few entrypoints on aarch64 which are now available on x86_64. 2022-08-23 12:31:42 -07:00
Siva Chandra Reddy
84517b6fb1 [libc] Add more headers to the linux x86_64 and aarch64 configs. 2022-08-19 07:15:08 +00:00
Michael Jones
e0e7fa36d3 [libc] enable s(n)printf without fullbuild
To use the FILE data structure, LLVM-libc must be in fullbuild mode
since it expects its own implementation. This means that (f)printf can't
be used without fullbuild, but s(n)printf only uses strings. This patch
adjusts the CMake to allow for this.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D131913
2022-08-15 13:45:34 -07:00
Tue Ly
82d6e77048 [libc] Implement tanf function correctly rounded for all rounding modes.
Implement tanf function correctly rounded for all rounding modes.

We use the range reduction that is shared with `sinf`, `cosf`, and `sincosf`:
```
  k = round(x * 32/pi) and y = x * (32/pi) - k.
```
Then we use the tangent of sum formula:
```
  tan(x) = tan((k + y)* pi/32) = tan((k mod 32) * pi / 32 + y * pi/32)
         = (tan((k mod 32) * pi/32) + tan(y * pi/32)) / (1 - tan((k mod 32) * pi/32) * tan(y * pi/32))
```
We need to make a further reduction when `k mod 32 >= 16` due to the pole at `pi/2` of `tan(x)` function:
```
  if (k mod 32 >= 16): k = k - 31, y = y - 1.0
```
And to compute the final result, we store `tan(k * pi/32)` for `k = -15..15` in a table of 32 double values,
and evaluate `tan(y * pi/32)` with a degree-11 minimax odd polynomial generated by Sollya with:
```
>  P = fpminimax(tan(y * pi/32)/y, [|0, 2, 4, 6, 8, 10|], [|D...|], [0, 1.5]);
```
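
The reduction and the tangent-of-sum step can be sketched as below. This is a plain double-precision illustration under stated assumptions: the lookup `TAN_K`, the helper `tan_small`, and `tan_sketch` are hypothetical names, the range reduction is a naive multiplication by `32/pi`, and a degree-11 odd Taylor polynomial stands in for the Sollya minimax polynomial.

```python
import math

# tan(k * pi/32) for k = -15..15.
TAN_K = {k: math.tan(k * math.pi / 32.0) for k in range(-15, 16)}


def tan_small(y):
    """Degree-11 odd Taylor polynomial of tan(y * pi/32) for |y| <= 1.5."""
    t = y * math.pi / 32.0
    t2 = t * t
    return t * (1 + t2 * (1 / 3 + t2 * (2 / 15 + t2 * (17 / 315
               + t2 * (62 / 2835 + t2 * (1382 / 155925))))))


def tan_sketch(x):
    xs = x * (32.0 / math.pi)
    k = round(xs)          # k = round(x * 32/pi)
    y = xs - k             # y in [-0.5, 0.5]
    j = k % 32             # tan has period pi, i.e. 32 steps of pi/32
    if j >= 16:
        # Stay away from the pole at pi/2: j + y is shifted by 32 in total.
        j -= 31
        y -= 1.0
    a = TAN_K[j]
    b = tan_small(y)
    return (a + b) / (1.0 - a * b)   # tangent of sum formula
```

After the shift, `j` lies in `-15..0` and `y` in `[-1.5, -0.5]`, which matches the `[0, 1.5]` interval used for the polynomial fit.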

Performance benchmark using `perf` tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf
CORE-MATH reciprocal throughput   : 18.586
System LIBC reciprocal throughput : 50.068

LIBC reciprocal throughput        : 33.823
LIBC reciprocal throughput        : 25.161     (with `-msse4.2` flag)
LIBC reciprocal throughput        : 19.157     (with `-mfma` flag)

$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf --latency
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency   : 55.630
System LIBC latency : 106.264

LIBC latency        : 96.060
LIBC latency        : 90.727    (with `-msse4.2` flag)
LIBC latency        : 82.361    (with `-mfma` flag)
```

Reviewed By: orex

Differential Revision: https://reviews.llvm.org/D131715
2022-08-12 09:21:05 -04:00
Kirill Okhotnikov
5ef987c985 [libc][math] Added tanhf function.
Correctly rounded function. Roughly 2x faster than the glibc equivalent.

Performance (llvm 12 intel):
```
CORE_MATH_PERF_MODE=rdtsc PERF_ARGS='' ./perf.sh tanhf
GNU libc version: 2.31
GNU libc release: stable
13.279
37.492
18.145
CORE_MATH_PERF_MODE=rdtsc PERF_ARGS='--latency' ./perf.sh tanhf
GNU libc version: 2.31
GNU libc release: stable
40.658
109.582
66.568
```

Differential Revision: https://reviews.llvm.org/D130780
2022-08-01 22:43:00 +02:00
Kirill Okhotnikov
a7f55f0805 [libc][math] Added sinhf function.
Differential Revision: https://reviews.llvm.org/D129278
2022-07-29 17:20:53 +02:00
Kirill Okhotnikov
fcb9d7e2cf [libc][math] Added coshf function.
Differential Revision: https://reviews.llvm.org/D129275
2022-07-29 16:57:28 +02:00
Siva Chandra
98fdabecf5 [libc] Enable a few stdlib and time functions on aarch64. 2022-07-14 14:37:50 -07:00
Siva Chandra
75a628925e [libc] Enable few stdio functions on aarch64. 2022-07-14 13:40:48 -07:00
Siva Chandra
edee61b55c [libc] Enable few pthread and threads functions on aarch64. 2022-07-14 13:25:21 -07:00
Alex Brachet
c179bcc151 [libc] Add imaxabs
Differential Revision: https://reviews.llvm.org/D129517
2022-07-11 21:28:21 +00:00
Kirill Okhotnikov
b8e8012aa2 [libc][math] fmod/fmodf implementation.
This is an implementation of the floating-point remainder function fmod from the standard libm.
The underlying algorithm was developed by myself, but it was probably
invented before.
Some features of the implementation:
1. The code is written in more-or-less modern C++.
2. One general implementation for both float and double precision numbers.
3. Platform/architecture dependent and independent code and tests are split.
4. Tests cover 100% of the code for both float and double numbers. Test cases with NaN/Inf etc. are copied from glibc.
5. The new implementation is in general 2-4 times faster for “regular” x, y values. It can be 20 times faster for huge values of x/y, but can also be 2 times slower in the double denormalized range (according to the perf tests provided).
6. Two different implementations of the division loop are provided. On some platforms division can be a very time-consuming operation; depending on the platform, it can be 3-10 times slower than multiplication.

Performance tests:

The test is based on the core-math project (https://gitlab.inria.fr/core-math/core-math). At Tue Ly's suggestion I took the hypot function and used it as a template for fmod, preserving all test cases.

`./check.sh <--special|--worst> fmodf` passed.
`CORE_MATH_PERF_MODE=rdtsc ./perf.sh fmodf` results are

```
GNU libc version: 2.35
GNU libc release: stable
21.166 <-- FPU
51.031 <-- current glibc
37.659 <-- this fmod version.
```
2022-06-24 23:09:14 +02:00
Alex Brachet
b1183305f8 [libc] Add strlcat
Differential Revision: https://reviews.llvm.org/D125978
2022-05-19 21:48:39 +00:00
Alex Brachet
fc2c8b2371 [libc] Add strlcpy
Differential Revision: https://reviews.llvm.org/D125806
2022-05-18 17:45:05 +00:00
Siva Chandra
3f5287125a [libc] Add stdio entrypoints to aarch64 fullbuild. 2022-04-26 00:25:12 -07:00
Siva Chandra Reddy
0258f56646 [libc] Add a definition of pthread_attr_t and its getters and setters.
Not all attributes have been added to pthread_attr_t in this patch. They
will be added gradually in future patches.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D123423
2022-04-11 16:08:49 +00:00
Siva Chandra Reddy
83f153ce34 [libc] Add pthread_mutexattr_t type and its setters and getters.
A simple implementation of the getters and setters has been added. More
logic can be added to them in future as required.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D122969
2022-04-04 18:11:12 +00:00
Siva Chandra
97417e0300 [libc] Enable threads.h functions on aarch64.
Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D122788
2022-03-31 08:42:07 -07:00