390 Commits

Author SHA1 Message Date
Schrodinger ZHU Yifan
3db5c1eeb0 revert all tid changes (#100915) 2024-07-27 22:29:21 -07:00
RoseZhang03
134b4484d8 [libc] Updated GettingStarted.rst with PyYAML version (#100649)
New Headergen requires PyYAML version 5.1 or newer in order to generate
header files.
2024-07-26 18:31:00 +00:00
lntue
ca8b14de51 [libc][math] Implement fast pass for double precision atan2 with 1 ULP errors. (#100648) 2024-07-26 09:56:46 -04:00
Daniel Thornburgh
0c10bdc05f [libc] Lazily initialize freelist malloc using symbols (#99254)
This requires the user to set the upper bounds of the heap by defining
the symbol `__libc_heap_limit`. The heap begins at `_end` and ends
`__libc_heap_limit` bytes afterwards. This prevents a completely unused
heap from requiring any space, and it prevents the heap from being
zeroed at initialization time as part of BSS. It also allows users to
customize the available heap location without recompiling libc.
    
I'd think this should eventually be replaced with an implementation based
on a morecore() library. This would allow the same implementation to use
sbrk() on POSIX, `_end` and `__libc_heap_limit` on embedded, and a
buffer in tests. It would also provide better "wilderness" behavior that
tends to decrease heap fragmentation (see Wilson et al.).

See #98096
2024-07-25 13:38:06 -07:00
Mikhail R. Gadelha
88fb56ebf2 [libc] Fix broken table introduced by PR #100578 2024-07-25 19:57:06 +02:00
Mikhail R. Gadelha
e90d552c77 [libc][NFC] Update riscv documentation (#100578)
This adds linux-riscv32 to the documentation and fixes riscv's
entrypoint broken link.
2024-07-25 13:25:09 -03:00
Job Henandez Lara
7b51777ed8 [libc][math][c23] add entrypoints and tests for totalordermag{f,l,f128} (#100159)
Fixes https://github.com/llvm/llvm-project/issues/100139
2024-07-24 19:53:23 -04:00
aaryanshukla
8b094c9df3 [libc][newheadergen]: PyYaml Version Update (#100463)
- a lot of builds had an issue using new headergen because they do not
have PyYaml installed.
2024-07-24 15:04:05 -07:00
Joseph Huber
2e3ee31d29 [libc] Enable 'sscanf' on the GPU #100211
Summary:
We can enable the sscanf function on the GPU now. This required adding
the configs to the scanf list so that the GPU build didn't do float
conversions.
2024-07-24 14:16:57 -05:00
Joseph Huber
8d8fa01a66 Reapply "[libc] Remove 'packaged' GPU build support (#100208)"
This reverts commit 550b83d658.
2024-07-24 10:24:53 -05:00
Joseph Huber
550b83d658 Revert "[libc] Remove 'packaged' GPU build support (#100208)"
Summary:
I forgot that the OpenMP tests still look for this, reverting for now
until I can make a fix.

This reverts commit c1c6ed83e9.
2024-07-24 07:51:47 -05:00
Joseph Huber
9914609468 Revert "[libc] Enable 'sscanf' on the GPU (#100211)"
Summary:
This fails tests in some situations, revert until it can be fixed.
This reverts commit 445bb35f95.
2024-07-24 07:46:39 -05:00
Joseph Huber
445bb35f95 [libc] Enable 'sscanf' on the GPU (#100211)
Summary:
We can enable the `sscanf` function on the GPU now.
2024-07-24 07:41:32 -05:00
Joseph Huber
c1c6ed83e9 [libc] Remove 'packaged' GPU build support (#100208)
Summary:
Previously, the GPU built the `libc` in a fat binary version that was
used to pass this to the link job in offloading languages like CUDA or
OpenMP. This was mostly required because NVIDIA couldn't consume the
standard static library version. Recent patches have now created the
`clang-nvlink-wrapper` which lets us do that. Now, the C library is just
included implicitly by the toolchain (or passed with -Xoffload-linker
-lc).

This code can be fully removed, which will heavily simplify the build
(and remove some bugs and garbage files I've encountered).
2024-07-24 07:22:49 -05:00
Joseph Huber
0420d2f97e [libc] Fix leftover debug commandline argument
Summary:
Fixes https://github.com/llvm/llvm-project/issues/100289
2024-07-23 21:35:42 -05:00
RoseZhang03
8972979c37 [libc] Updated header_generation.rst (#99712)
Added new headergen documentation.
2024-07-22 20:15:26 +00:00
Job Henandez Lara
c1562374c8 [libc][math][c23] Add entrypoints and tests for dsqrt{l,f128} (#99815) 2024-07-21 15:55:11 -04:00
Job Henandez Lara
af0f58cf14 [libc][math][c23] Add entrypoints and tests for fsqrt{,l,f128} (#99669) 2024-07-21 11:17:41 -04:00
Schrodinger ZHU Yifan
29be889c2c reland "[libc] implement cached process/thread identity (#98989)" (#99765) 2024-07-20 10:25:40 -07:00
aaryanshukla
a2f61ba08b [libc][math]fadd implementation (#99694)
- **[libc] math fadd**
- **[libc][math] implemented fadd**
2024-07-19 14:40:34 -07:00
Schrodinger ZHU Yifan
415ca24f8e Revert "[libc] implement cached process/thread identity" (#99559)
Reverts llvm/llvm-project#98989
2024-07-18 13:31:04 -07:00
Schrodinger ZHU Yifan
5c9fc3cdd7 [libc] implement cached process/thread identity (#98989)
migrated from https://github.com/llvm/llvm-project/pull/95965 due to
corrupted git history
2024-07-18 13:27:50 -07:00
OverMighty
9fb049c8c6 [libc][math][c23] Add {f,d}mul{l,f128} and f16mul{,f,l,f128} C23 math functions (#98972)
Part of #93566.
                
Fixes #94833.
2024-07-18 19:50:49 +02:00
Joseph Huber
38f1dd2e45 [libc] Remove strerror_r on the GPU for now
Summary:
This function has conflicting definitions, which makes it difficult to
use in an offloading setting. Disable it for now.
2024-07-18 06:54:03 -05:00
lntue
7fc9fb9f3f [libc][math] Implement double precision cbrt correctly rounded to all rounding modes. (#99262)
Division-less Newton iterations algorithm for cube roots.

1. **Range reduction**

For `x = (-1)^s * 2^e * (1.m)`, we get 2 reduced arguments `x_r` and `a`
as:
```
  x_r = 1.m
  a   = (-1)^s * 2^(e % 3) * (1.m)
```
Then `cbrt(x) = x^(1/3)` can be computed as:
```
  x^(1/3) = 2^(e / 3) * a^(1/3).
```

In order to avoid division, we compute `a^(-2/3)` using Newton's method
and then multiply the result by `a`:
```
  a^(1/3) = a * a^(-2/3).
```
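The range reduction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `reduce_cbrt_argument` and `cbrt_via_inverse_power` are hypothetical, `e / 3` and `e % 3` are taken as floor division and a non-negative remainder, and an ordinary floating-point power stands in for the Newton-refined `a^(-2/3)`.

```python
import math

def reduce_cbrt_argument(x):
    """Split a finite nonzero x into (q, a) with cbrt(x) = 2**q * cbrt(a)."""
    mant, exp = math.frexp(x)        # x = mant * 2**exp, 0.5 <= |mant| < 1
    sign = -1.0 if mant < 0 else 1.0
    x_r = abs(mant) * 2.0            # x_r = 1.m, in [1, 2)
    e = exp - 1                      # x = sign * 2**e * x_r
    q, r = divmod(e, 3)              # e = 3*q + r with r in {0, 1, 2}
    a = sign * 2.0**r * x_r          # 1 <= |a| < 8
    return q, a

def cbrt_via_inverse_power(x):
    """Compute cbrt(x) as 2**q * a * a^(-2/3), with no division."""
    q, a = reduce_cbrt_argument(x)
    # Stand-in only: the commit refines a^(-2/3) with Newton iterations.
    inv = (a * a) ** (-1.0 / 3.0)
    return math.ldexp(a * inv, q)
```

For example, `reduce_cbrt_argument(10.0)` gives `q = 1` and `a = 1.25`, since `10 = 2^3 * 1.25`.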

2. **First approximation to a^(-2/3)**

First, we use a degree-7 minimax polynomial generated by Sollya to
approximate `x_r^(-2/3)` for `1 <= x_r < 2`.
```
  p = P(x_r) ~ x_r^(-2/3),
```
with relative errors bounded by:
```
  | p / x_r^(-2/3) - 1 | < 1.16 * 2^-21.
```

Then we multiply with `2^(e % 3)` from a small lookup table to get:
```
  x_0 = 2^(-2*(e % 3)/3) * p
      ~ 2^(-2*(e % 3)/3) * x_r^(-2/3)
      = a^(-2/3)
```
with relative errors:
```
  | x_0 / a^(-2/3) - 1 | < 1.16 * 2^-21.
```
This step is done in double precision.

3. **First Newton iteration**

We follow the method described in:
Sibidanov, A. and Zimmermann, P., "Correctly rounded cubic root
evaluation
in double precision", https://core-math.gitlabpages.inria.fr/cbrt64.pdf
to derive multiplicative Newton iterations as below:
Let `x_n` be the nth approximation to `a^(-2/3)`. Define the n^th error
as:
```
  h_n = x_n^3 * a^2 - 1
```
Then:
```
  a^(-2/3) = x_n / (1 + h_n)^(1/3)
           = x_n * (1 - (1/3) * h_n + (2/9) * h_n^2 - (14/81) * h_n^3 + ...)
```
using the Taylor series expansion of `(1 + h_n)^(-1/3)`.

Apply to `x_0` above:
```
  h_0 = x_0^3 * a^2 - 1
      = a^2 * (x_0 - a^(-2/3)) * (x_0^2 + x_0 * a^(-2/3) + a^(-4/3)),
```
it's bounded by:
```
  |h_0| < 4 * 3 * 1.16 * 2^-21 * 4 < 2^-17.
```
So in the first iteration step, we use:
```
  x_1 = x_0 * (1 - (1/3) * h_0 + (2/9) * h_0^2 - (14/81) * h_0^3)
```
Its relative error is bounded by:
```
  | x_1 / a^(-2/3) - 1 | < 35/243 * |h_0|^4 < 2^-70.
```
Then we perform Ziv's rounding test and check if the answer is exact.
This step is done in double-double precision.
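The convergence of this step can be checked with exact rational arithmetic. The sketch below is not the libc implementation: the function name `newton_step_order3`, the value `a = 3`, and the starting guess are assumptions chosen for illustration, and `Fraction` replaces double-double so that rounding plays no role.

```python
from fractions import Fraction

def newton_step_order3(x_n, a):
    """One multiplicative step toward a^(-2/3); returns (x_next, h_n)."""
    h = x_n**3 * a**2 - 1                       # h_n = x_n^3 * a^2 - 1
    correction = (1 - h / 3
                  + Fraction(2, 9) * h**2
                  - Fraction(14, 81) * h**3)    # cubic Taylor polynomial of (1 + h)^(-1/3)
    return x_n * correction, h

a = Fraction(3)
x_0 = Fraction(48075, 100000)                   # rough guess for 3^(-2/3) = 0.48074...
x_1, h_0 = newton_step_order3(x_0, a)
h_1 = x_1**3 * a**2 - 1                         # error term after the step

# The new error is of fourth order in the old one.
print(float(h_0), float(h_1))
```

Because the polynomial matches the series through the cubic term, the new error `h_1` is on the order of `h_0^4`.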

4. **Second Newton iteration**

If Ziv's rounding test from the previous step fails, we define the
error term:
```
  h_1 = x_1^3 * a^2 - 1,
```
and perform another iteration:
```
  x_2 = x_1 * (1 - h_1 / 3)
```
with relative errors smaller than the precision of double-double.
We then apply Ziv's accuracy test with relative errors < 2^-102 to
compensate for rounding errors.

5. **Final iteration**
 
If Ziv's accuracy test from the previous step fails, we perform
another iteration in 128-bit precision and check for exact outputs.
2024-07-17 12:23:14 -04:00
Joseph Huber
49b2c30feb [libc][docs] Document printf support on the GPU target (#99241)
Summary:
Title
2024-07-16 16:40:37 -05:00
Joseph Huber
8393ea5d1d [libc] Implement clock_gettime for the monotonic clock on the GPU (#99067)
Summary:
This patch implements `clock_gettime` using the monotonic clock. This
allows users to get time elapsed at nanosecond resolution. This is
primarily to facilitate compiling the `chrono` library from `libc++`.
For this reason we provide both `CLOCK_MONOTONIC`, which we can
implement
with the GPU's global fixed-frequency clock, and `CLOCK_REALTIME` which
we cannot. The latter is provided just to make people who use this
header happy and it will always return failure.
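As a host-side illustration of the usage pattern this enables, the sketch below measures elapsed nanoseconds with the monotonic clock. It uses Python's `time.clock_gettime_ns` wrapper on a Unix host rather than the GPU entrypoint, and the helper name `elapsed_ns` is hypothetical.

```python
import time

def elapsed_ns(work):
    """Return the nanoseconds taken by work(), per the monotonic clock."""
    start = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    work()
    return time.clock_gettime_ns(time.CLOCK_MONOTONIC) - start

# The monotonic clock never goes backwards, so the difference is non-negative.
print(elapsed_ns(lambda: sum(range(1000))))
```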
2024-07-16 16:17:34 -05:00
lntue
a6d2da8b9d [libc][stdlib] Implement heap sort. (#98582) 2024-07-16 08:13:25 -04:00
Petr Hosek
69258491d2 [libc] Support configurable errno modes (#98287)
Rather than selecting the errno implementation based on the platform
which doesn't provide the necessary flexibility, make it configurable.

The errno value location is returned by `int *__llvm_libc_errno()` which
is a common design used by other C libraries.
2024-07-13 10:52:42 -07:00
Petr Hosek
5ff3ff33ff [libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98597)
This is a part of #97655.
2024-07-12 09:28:41 -07:00
Mehdi Amini
ce9035f5bd Revert "[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration" (#98593)
Reverts llvm/llvm-project#98075

bots are broken
2024-07-12 09:12:13 +02:00
Petr Hosek
3f30effe1b [libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98075)
This is a part of #97655.
2024-07-11 12:35:22 -07:00
Joseph Huber
60ff9c2ea5 [libc] Add support for powi as an LLVM libc extension on the GPU (#98236)
Summary:
This function is used by the CUDA / HIP / OpenMP headers and exists as
an NVIDIA extension basically. This function is implemented in the C23
standard as `pown`, but for now we need to provide `powi` for backwards
compatibility. In the future this entrypoint will just be a redirect to
`pown` once that is implemented.
2024-07-09 20:51:36 -05:00
Joseph Huber
4b53cbb069 [libc] Export configs for scanf support (#98066)
Summary:
This patch adds the options for configuring floating point and index
mode for scanf just like `printf`. Not enabling it on the GPU yet, need
to fix something else first.
2024-07-08 15:11:52 -05:00
Joseph Huber
12e47aabd4 [libc] Add config option for fast math optimizations (#98029)
Summary:
This patch adds `LIBC_COPT_MATH_OPTIMIZATIONS`, which allows users to
configure the different math optimizations.
2024-07-08 11:01:03 -05:00
lntue
c9ee6b1977 [libc][math] Implement cbrtf function correctly rounded to all rounding modes. (#97936)
Fixes https://github.com/llvm/llvm-project/issues/92874

Algorithm: Let `x = (-1)^s * 2^e * (1 + m)`.
- Step 1: Range reduction: reduce the exponent with:
```
  y = cbrt(x) = (-1)^s * 2^(floor(e/3)) * 2^((e % 3)/3) * (1 + m)^(1/3)
```
- Step 2: Use the first 4 fractional bits of `m` to look up a
degree-7 polynomial approximation to:
```
  (1 + m)^(1/3) ~ 1 + m * P(m).
```
- Step 3: Perform the multiplication:
```
  2^((e % 3)/3) * (1 + m)^(1/3).
```
- Step 4: Check for exact cases to prevent rounding and clear
`FE_INEXACT` floating point exception.
- Step 5: Combine with the exponent and sign before converting down to
`float` and return.
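Steps 1 to 3 and 5 can be sketched as follows. Several parts are assumptions made for illustration: the name `cbrtf_sketch` is hypothetical, a short Taylor series around each subinterval midpoint stands in for the looked-up degree-7 polynomials, everything runs in double precision, and the exact-case handling of Step 4 and the final conversion to `float` are left out.

```python
import math

# 2^(r/3) for r = 0, 1, 2: the small table used in Step 3.
CBRT_POW2 = (1.0, 2.0 ** (1.0 / 3.0), 2.0 ** (2.0 / 3.0))

def cbrtf_sketch(x):
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x
    mant, exp = math.frexp(abs(x))   # |x| = mant * 2**exp, mant in [0.5, 1)
    m = 2.0 * mant - 1.0             # |x| = 2**e * (1 + m), 0 <= m < 1
    e = exp - 1
    q, r = divmod(e, 3)              # Step 1: e = 3*q + r, r in {0, 1, 2}
    idx = int(m * 16.0)              # Step 2: top 4 fractional bits of m
    c = 1.0 + (idx + 0.5) / 16.0     # midpoint of the selected subinterval
    t = (1.0 + m - c) / c            # |t| <= 1/32
    # Stand-in for the table entry and polynomial: (1 + m)^(1/3) is
    # c^(1/3) * (1 + t)^(1/3), with a degree-4 Taylor series for the last factor.
    root_c = c ** (1.0 / 3.0)
    poly = 1.0 + t * (1.0 / 3.0 + t * (-1.0 / 9.0
                 + t * (5.0 / 81.0 + t * (-10.0 / 243.0))))
    core = CBRT_POW2[r] * root_c * poly           # Step 3
    return math.copysign(math.ldexp(core, q), x)  # Step 5: exponent and sign
```

Splitting `[1, 2)` into 16 subintervals keeps `|t|` small, which is why a low-degree series is accurate enough in this sketch.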
2024-07-08 10:02:12 -04:00
Izaak Schroeder
b151c7e36a [libc] Add dlfcn.h placeholder (#97501)
Adds `dlopen` and friends. This is needed as part of the effort to
compile `libunwind` + `libc` without baremetal mode. This is part of
https://github.com/llvm/llvm-project/issues/97191. This should still be
spec compliant, since `dlopen` always returns `NULL` and `dlerror`
always returns an error message.

> If dlopen() fails for any reason, it returns NULL.

> The function dlclose() returns 0 on success, and nonzero on error.

> Since the value of the symbol could actually be NULL (so that a NULL
return from dlsym() need not indicate an error), the correct way to test
for an error is to call dlerror() to clear any old error conditions,
then call dlsym(), and then call dlerror() again, saving its return
value into a variable, and check whether this saved value is not NULL.


See:
- https://linux.die.net/man/3/dlopen
2024-07-06 16:01:59 -07:00
Hendrik Hübner
f8834ed24b [libc][C23][math] Implement cospif function correctly rounded for all rounding modes (#97464)
I also fixed a comment in sinpif.cpp in the first commit. Should this be
included in this PR?

All tests passed, including the exhaustive test.

CC: @lntue
2024-07-06 09:24:05 -04:00
OverMighty
ac76ce2693 [libc][math][c23] Classify f16fma{,f,l} as LLVM libc extensions (#97728) 2024-07-05 09:58:01 -04:00
PiJoules
665efe8967 [libc] Add LIBC_NAMESPACE_DECL macro (#97109)
This expands to LIBC_NAMESPACE with
`__attribute__((visibility("hidden")))` so all the symbols under it have
hidden visibility. This new macro should be used when declaring a new
namespace that will have internal functions/globals and LIBC_NAMESPACE
should be used as a means of accessing functions/globals declared within
LIBC_NAMESPACE_DECL.
2024-07-03 17:02:57 -07:00
Michael Jones
2ef5b8227a [libc][docs] Update full host build docs (#97643)
Add a note explaining how to fix the missing `asm` folder, as well as a
warning about installing without setting a sysroot.
2024-07-03 16:29:19 -07:00
lntue
7d68d9d2f2 [libc][math] Implement correctly rounded double precision tan (#97489)
Using the same range reduction as `sin`, `cos`, and `sincos`:
1) Reduce `x = k*pi/128 + u`, with `|u| <= pi/256`, where `u` is in
double-double.
2) Approximate `tan(u)` using a degree-9 Taylor polynomial.
3) Compute
```
   tan(x) ~ (sin(k*pi/128) + tan(u) * cos(k*pi/128)) / (cos(k*pi/128) - tan(u) * sin(k*pi/128))
```
using the fast double-double division algorithm in [the CORE-MATH
project](https://gitlab.inria.fr/core-math/core-math/-/blob/master/src/binary64/tan/tan.c#L1855).
4) Perform Ziv's relative-error accuracy test.
5) If the accuracy test fails, we redo the computations using 128-bit
precision `DyadicFloat`.
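
Steps 1 to 3 can be sketched in plain double precision as below. This is an illustration, not the committed algorithm: the name `tan_sketch` is hypothetical, `math.sin` and `math.cos` stand in for the values of `sin(k*pi/128)` and `cos(k*pi/128)`, ordinary division replaces the double-double division, and the accuracy tests of steps 4 and 5 are omitted.

```python
import math

def tan_sketch(x):
    # Step 1: x = k*pi/128 + u with |u| <= pi/256 (up to rounding).
    k = round(x * 128.0 / math.pi)
    u = x - k * (math.pi / 128.0)
    # Step 2: degree-9 Taylor polynomial of tan(u).
    u2 = u * u
    tan_u = u * (1.0 + u2 * (1.0 / 3.0 + u2 * (2.0 / 15.0
                 + u2 * (17.0 / 315.0 + u2 * (62.0 / 2835.0)))))
    # Step 3: angle-addition formula for tan(k*pi/128 + u).
    s = math.sin(k * math.pi / 128.0)
    c = math.cos(k * math.pi / 128.0)
    return (s + tan_u * c) / (c - tan_u * s)
```

Away from the poles of `tan`, the result agrees with `math.tan` to roughly double precision, for example at `x = 1.0`.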

Fixes https://github.com/llvm/llvm-project/issues/96930
2024-07-03 18:05:24 -04:00
OverMighty
f1a8f94bba [libc][docs] Add doc for using containers to test on a different arch (#97431) 2024-07-03 11:07:49 -04:00
OverMighty
4e56724213 [libc][math][c23] Add f16{add,sub}{,l,f128} C23 math functions (#97072)
Part of #93566.
2024-07-02 19:27:09 -04:00
OverMighty
12a1e6dd12 [libc][math][c23] Add f16{add,sub}f C23 math functions (#96787)
Part of #93566.
2024-07-02 09:16:12 -04:00
Hendrik Hübner
ea93c538c7 [libc][math][c23] Implemented sinpif function correctly rounded for all rounding modes. (#97149)
This implements the sinpif function. An exhaustive test shows it's
correct for all rounding modes.

Issue:  #94895
2024-07-01 16:38:03 -04:00
Joseph Huber
3c64a98180 [libc] Partially implement 'errno' on the GPU (#97107)
Summary:
The `errno` variable is expected to be `thread_local` by the standard.
However, the GPU targets do not support `thread_local` and implementing
that would be a large endeavor. Because of that, we previously didn't
provide the `errno` symbol at all. However, to build some programs we at
least need to be able to link against `errno`. Many things that would
normally set `errno` completely ignore it currently (i.e. stdio) but
some programs still need to be able to link against correct C programs.

For this purpose this patch exports the `errno` symbol as a simple
global. Internally, this will be updated atomically so it's at least not
racy. Externally, this will be on the user. I've updated the
documentation to state as such. This is required to get `libc++` to
build.
2024-07-01 06:30:15 -05:00
Joseph Huber
ec0e6ef09b [libc] Implement the 'remove' function on the GPU (#97096)
Summary:
Straightforward RPC implementation of the `remove` function for the GPU.
Copies over the string and calls `remove` on it, passing the result
back. This is required for building some `libc++` functionality.
2024-07-01 06:29:48 -05:00
OverMighty
6c1c451b86 [libc][math][c23] Add f16sqrt{,l,f128} C23 math functions (#96642)
Part of #95250.
2024-06-30 19:20:39 -04:00
OverMighty
56ef6a2eb2 [libc][math][c23] Add f16div{,l,f128} C23 math functions (#97054)
Part of #93566.
2024-06-29 18:48:12 -04:00