Commit Graph

125 Commits

Author SHA1 Message Date
OverMighty
d97f6d1ae9 [libc][math][c23] Add sqrtf16 C23 math function (#112406)
Part of #95250.
2024-10-19 01:41:52 +02:00
OverMighty
69d3a44ede [libc][math][c23] Add log10f16 C23 math function (#106091)
Part of #95250.
2024-10-19 01:40:40 +02:00
OverMighty
6d347fdfbd [libc][math][c23] Add log2f16 C23 math function (#106084)
Part of #95250.
2024-10-19 01:10:32 +02:00
OverMighty
65cf7afb6d [libc][math][c23] Add logf16 C23 math function (#106072)
Part of #95250.
2024-10-18 22:35:12 +02:00
OverMighty
fdd7c0353f [libc][math][c23] Add tanhf16 C23 math function (#106006)
Part of #95250.
2024-10-18 14:22:45 +02:00
OverMighty
ed3d051782 [libc][math][c23] Add sinhf16 and coshf16 C23 math functions (#105947)
Part of #95250.
2024-10-17 20:44:23 +02:00
OverMighty
95c24cb9de [libc][math][c23] Add exp10m1f16 C23 math function (#105706)
Part of #95250.
2024-10-16 16:33:13 +02:00
Joseph Huber
fe6a3d46aa [libc] Implement the 'rename' function on the GPU (#109814)
Summary:
Straightforward implementation like the other `stdio.h` functions.
2024-09-24 09:32:42 -07:00
Joseph Huber
3bbe0f90f3 [libc] Add 'strings.h' header on the GPU (#109661)
Summary:
These are GNU extensions but still show up in practice. The entrypoints were
enabled, but we weren't emitting the header, so they couldn't be used.
2024-09-23 14:19:33 -07:00
Joseph Huber
16d11e26f3 [libc] Add GPU support for the 'system' function (#109687)
Summary:
This function can easily be implemented by forwarding it to the host
process. This shows up in a few places where we might want to test the
GPU, so it should be provided. Also, I find the idea of the GPU
offloading work to the CPU via `system` very funny.
2024-09-23 14:04:28 -07:00
Michael Jones
f009f72df5 [libc] Add printf strerror conversion (%m) (#105891)
This patch adds the %m conversion to printf, which prints the
strerror(errno). The rationale is explained below. This patch also updates
the docs, tests, and build system to accommodate this.

The standard for syslog in POSIX specifies it uses the same format as
printf, but adds %m which prints the error message string for the
current value of errno. For ease of implementation, it's standard
practice for libc implementers to just add %m to printf instead of
creating a separate parser for syslog.
2024-09-19 10:48:08 -07:00
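The effect of `%m` described in commit f009f72df5 can be approximated portably with `strerror`. The following is only a minimal sketch, not the LLVM libc parser: `format_m` is a hypothetical helper that expands the first `%m` in a string and copies everything else verbatim, handling no other conversions.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: expands the first "%m" in `fmt` to strerror(errno).
   It does not understand "%%" or any other conversion specifier. */
int format_m(char *out, size_t out_size, const char *fmt) {
  int saved_errno = errno; /* capture before any call can clobber it */
  assert(out != NULL && fmt != NULL);
  const char *m = strstr(fmt, "%m");
  if (m == NULL)
    return snprintf(out, out_size, "%s", fmt);
  /* prefix before "%m", then the error string, then the remainder */
  return snprintf(out, out_size, "%.*s%s%s", (int)(m - fmt), fmt,
                  strerror(saved_errno), m + 2);
}
```

Saving `errno` on entry matters because later library calls are allowed to modify it before the message is formatted.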
Joseph Huber
5c019bdb7a [libc] Add support for 'string.h' locale variants (#105719)
Summary:
This adds the locale variants of the string functions. As previously,
these do not use the locale information at all and simply copy the
non-locale version which expects the "C" locale.
2024-08-29 14:20:15 -05:00
Joseph Huber
a87105121d [libc] Implement locale variants for 'stdlib.h' functions (#105718)
Summary:
This provides the `_l` variants for the `stdlib.h` functions. These are
just copies of the same entrypoint and don't do anything with the locale
information.
2024-08-29 14:18:37 -05:00
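The pattern described in commits 5c019bdb7a and a87105121d, where a locale variant forwards to the plain function, can be sketched as below. The names `sketch_locale_t` and `sketch_strtol_l` are hypothetical stand-ins so the example does not depend on `locale_t` being available; this is not the LLVM libc source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for locale_t. */
typedef struct sketch_locale *sketch_locale_t;

/* Sketch of a locale variant: the locale argument is accepted and ignored,
   so the behaviour is always that of the "C" locale. */
long sketch_strtol_l(const char *str, char **str_end, int base,
                     sketch_locale_t locale) {
  assert(str != NULL);
  (void)locale; /* intentionally unused */
  return strtol(str, str_end, base);
}
```

Passing a null locale is harmless in this sketch because the argument is never dereferenced.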
Joseph Huber
439d7de14d [libc] Disable failing scanf test on AMDGPU temporarily
Summary:
This test currently fails in the `amdgpu-attributor` pass. I haven't
figured out anything beyond that yet as it's difficult to reduce.
2024-08-28 07:04:15 -05:00
Joseph Huber
856dadb33c [libc] Add ctype.h locale variants (#102711)
Summary:
This patch adds all the libc ctype variants. These ignore the locale
information completely, so they're pretty much just stubs. Because these
use locale information, which is system scope, we do not enable building
them outside of full build mode.
2024-08-22 13:51:54 -05:00
Joseph Huber
78d8ab2ab9 [libc] Initial support for 'locale.h' in the LLVM libc (#102689)
Summary:
This patch adds the macros and entrypoints associated with the
`locale.h` header. These are mostly stubs, as we (for now and the
foreseeable future) only expect to support the C and maybe C.UTF-8
locales in the LLVM libc.
2024-08-22 12:58:46 -05:00
Joseph Huber
2f4232db0b Revert " [libc] Add ctype.h locale variants (#102711)"
This reverts commit 8f005f8306.
2024-08-22 12:45:16 -05:00
Joseph Huber
8f005f8306 [libc] Add ctype.h locale variants (#102711)
Summary:
This patch adds all the libc ctype variants. These ignore the locale
information completely, so they're pretty much just stubs. Because these
use locale information, which is system scope, we do not enable building
them outside of full build mode.
2024-08-22 12:41:20 -05:00
Joseph Huber
6b98a72365 [libc] Add scanf support to the GPU build (#104812)
Summary:
The `scanf` function has a "system file" configuration, which is pretty
much what the GPU implementation does at this point. So we should be
able to use it in much the same way.
2024-08-21 18:02:04 -05:00
Joseph Huber
bd9f2c2ba0 [libc] Add missing math definitions for round and scal for GPU (#104636)
Summary:
These can be enabled
2024-08-16 16:27:03 -05:00
Joseph Huber
55aa4ea1c7 [libc] Add definition for atan2l on 64-bit long double platforms (#104489)
Summary:
This just adds `atan2l` for platforms that can implement it as an alias
to `atan2`.
2024-08-15 14:59:28 -05:00
Joseph Huber
dc2f39e96c [libc] Enable all supported math functions on the GPU (#102563)
Summary:
Simply copies the x64 versions to the GPU directory. Ignoring f128 for
now, but adding long double entrypoints which are identical to `double`
on the target.
2024-08-12 13:12:44 -05:00
aaryanshukla
d0fe470fd2 [libc][math] Add scalbln{,f,l,f128} math functions (#102219)
Co-authored-by: OverMighty <its.overmighty@gmail.com>
2024-08-08 14:33:50 -07:00
Joseph Huber
1a92cc5a0a [libc] Implement 'getenv' on the GPU target (#102376)
Summary:
This patch implements 'getenv'. I was torn on how to implement this,
since realistically we only have access to this environment pointer in
the "loader" interface. An alternative would be to use an RPC call every
time, but I think that's overkill for what this will be used for. A
better solution is just to emit a common `DataEnvironment` that contains
all of the host visible resources to initialize. Right now this is the
`env_ptr`, `clock_freq`, and `rpc_client`.

I did this by making the `app.h` interface that Linux uses more general,
could possibly move that into a separate patch, but I figured it's
easier to see with the usage.
2024-08-08 06:45:42 -05:00
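A lookup over a loader-provided environment, as described in commit 1a92cc5a0a, might look like the sketch below. The struct is a hypothetical mirror of `DataEnvironment`: the field names come from the commit message, but their types and the function name `sketch_getenv` are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the host-visible data; types are assumptions. */
struct sketch_data_environment {
  char **env_ptr;           /* NULL-terminated array of "NAME=value" strings */
  unsigned long clock_freq; /* unused in this sketch */
  void *rpc_client;         /* unused in this sketch */
};

/* Linear scan for NAME; returns a pointer to the value or NULL. */
const char *sketch_getenv(const struct sketch_data_environment *env,
                          const char *name) {
  assert(env != NULL && name != NULL);
  size_t len = strlen(name);
  if (env->env_ptr == NULL || len == 0)
    return NULL;
  for (char **entry = env->env_ptr; *entry != NULL; ++entry) {
    /* match the whole name, followed immediately by '=' */
    if (strncmp(*entry, name, len) == 0 && (*entry)[len] == '=')
      return *entry + len + 1;
  }
  return NULL;
}
```

Because the environment is copied in once at startup, the lookup needs no round trip to the host, which is the trade-off the commit message argues for over a per-call RPC.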
Joseph Huber
3645ca58f4 [libc] Enable quick_exit routines on the GPU (#102242)
Summary:
We should be able to use these on the GPU just like exit.
2024-08-07 08:01:11 -05:00
Joseph Huber
88d288489e [libc] Add lgamma and lgamma_r stubs for the GPU (#102019)
Summary:
These functions are used by the <random> implementation in libc++ and
cause a lot of tests to fail. For now we provide these through the
vendor abstraction until we have a real version. The NVPTX version
doesn't even update the output correctly so these are just temporary.
2024-08-05 14:53:05 -05:00
Joseph Huber
c4ec19b985 [libc] Add support for 'features.h' when targeting the GPU (#102037)
Summary:
`features.h` provides some information about the C library. Provide this
on the GPU so external users can tell if it's the LLVM C library.
2024-08-05 14:52:44 -05:00
Joseph Huber
bde51232ba [libc] Provide 'signal.h' header for the GPU (#101996)
Summary:
This header is practically useless, but we provide it mostly for the
macros so that applications can compile. I'm only doing this for the
`libc++` unittests that want it, and it is part of the C standard
technically. I just made an RPC call to do `raise`. Anything more isn't
going to work since it'd be way too annoying to make the CPU call into
some signal handler the GPU registered.
2024-08-05 14:52:14 -05:00
Joseph Huber
97f723bab0 [libc] Fix 'vasprintf' not working in non-fullbuild mode
2024-08-01 15:36:29 -05:00
Job Henandez Lara
ed12f80ff0 [libc][math][c23] add entrypoints and tests for getpayload{,f,f128} (#101285)
2024-07-31 23:16:42 -04:00
Joseph Huber
38ef6929a3 [libc] Add vsscanf function (#101402)
Summary:
Adds support for the `vsscanf` function similar to `sscanf`.
Based off of https://github.com/llvm/llvm-project/pull/97529.
2024-07-31 16:53:25 -05:00
Joseph Huber
bf42a7860a [libc] Implement placeholder memory functions on the GPU (#101082)
Summary:
These functions are needed for `libc++` to link successfully. We can't
implement them well currently, so simply provide some stand-in
implementations. `realloc` will currently copy garbage and potentially
fault, and `aligned_alloc` will work unless the requested alignment is
more than 4K. However, these should work in practice to get tests
running. I will write a real allocator soon™.
2024-07-30 10:15:30 -05:00
Joseph Huber
dbb8b7a0f4 Reapply "[OpenMP][libc] Remove special handling for OpenMP printf (#98940)"
This reverts commit fea5914c92.
2024-07-26 17:21:56 -05:00
Joseph Huber
fea5914c92 Revert "[OpenMP][libc] Remove special handling for OpenMP printf (#98940)"
This reverts commit 069e8bcd82.

Summary:
Some tests failing, revert this for now.
2024-07-26 16:39:12 -05:00
Joseph Huber
069e8bcd82 [OpenMP][libc] Remove special handling for OpenMP printf (#98940)
Summary:
Currently there are several layers to handle `printf`. Since we now have
varargs and an implementation of `printf` this can be heavily
simplified.

1. The frontend renames `printf` into `omp_vprintf` and gives it an
   argument buffer.

Removing 1. triggered some code in the AMDGPU backend meant for HIP /
OpenCL, so I added an exception to it.

2. Forward this to CUDA vprintf or ignore it.

We no longer need special handling for it since we have varargs. So now
we just forward this to CUDA vprintf if we have libc, otherwise just
leave `printf` as an external function and expect that `libc` will be
linked in.
2024-07-26 16:03:36 -05:00
OverMighty
81ce796095 [libc][math][c23] Enable C23 _Float16 math functions on GPUs (#99248)
2024-07-25 21:09:49 +02:00
Joseph Huber
2e3ee31d29 [libc] Enable 'sscanf' on the GPU #100211
Summary:
We can enable the sscanf function on the GPU now. This required adding
the configs to the scanf list so that the GPU build didn't do float
conversions.
2024-07-24 14:16:57 -05:00
Joseph Huber
9914609468 Revert "[libc] Enable 'sscanf' on the GPU (#100211)"
Summary:
This fails tests in some situations, revert until it can be fixed.
This reverts commit 445bb35f95.
2024-07-24 07:46:39 -05:00
Joseph Huber
445bb35f95 [libc] Enable 'sscanf' on the GPU (#100211)
Summary:
We can enable the `sscanf` function on the GPU now.
2024-07-24 07:41:32 -05:00
Joseph Huber
e0649a5dfc [NVPTX] Fix internal indirect call prototypes not obeying the ABI (#100131)
Summary:
The NVPTX backend optimizes the ABI for functions that are internal,
however, this is not legal for indirect call prototypes. Previously, we
would modify the ABI on an aggregate byval type passed to an indirect
call prototype, which would make PTXAS error. This patch just passes the
function as a nullptr to force strict ABI compliance without
modification in the helper function.

Fixes https://github.com/llvm/llvm-project/issues/100055
2024-07-23 12:54:00 -05:00
Joseph Huber
e7a2405383 [libc] Remove workarounds for lack of functional NVPTX linker (#96972)
Summary:
Currently we have several hacks to work around the fact that the NVPTX
linker, 'nvlink', does not support static libraries or LTO linking.
The patch in https://github.com/llvm/llvm-project/pull/96561 introduces
a wrapper in the toolchain that allows us to use a standard `ld.lld`
like interface. This means all the divergence with this target can be
removed.

Depends on https://github.com/llvm/llvm-project/pull/96561
2024-07-22 22:16:50 -05:00
aaryanshukla
a2f61ba08b [libc][math]fadd implementation (#99694)
- **[libc] math fadd**
- **[libc][math] implemented fadd**
2024-07-19 14:40:34 -07:00
Joseph Huber
38f1dd2e45 [libc] Remove strerror_r on the GPU for now
Summary:
This function has conflicting definitions, which makes it difficult to
use in an offloading setting. Disable it for now.
2024-07-18 06:54:03 -05:00
lntue
7fc9fb9f3f [libc][math] Implement double precision cbrt correctly rounded to all rounding modes. (#99262)
Division-less Newton iterations algorithm for cube roots.

1. **Range reduction**

For `x = (-1)^s * 2^e * (1.m)`, we get two reduced arguments `x_r` and `a` as:
```
  x_r = 1.m
  a   = (-1)^s * 2^(e % 3) * (1.m)
```
Then `cbrt(x) = x^(1/3)` can be computed as:
```
  x^(1/3) = 2^(e / 3) * a^(1/3).
```

In order to avoid division, we compute `a^(-2/3)` using Newton's method
and then multiply the result by `a`:
```
  a^(1/3) = a * a^(-2/3).
```

2. **First approximation to a^(-2/3)**

First, we use a degree-7 minimax polynomial generated by Sollya to
approximate `x_r^(-2/3)` for `1 <= x_r < 2`.
```
  p = P(x_r) ~ x_r^(-2/3),
```
with relative errors bounded by:
```
  | p / x_r^(-2/3) - 1 | < 1.16 * 2^-21.
```

Then we multiply by `2^(-2*(e % 3)/3)` from a small lookup table to get:
```
  x_0 = 2^(-2*(e % 3)/3) * p
      ~ 2^(-2*(e % 3)/3) * x_r^(-2/3)
      = a^(-2/3)
```
with relative errors:
```
  | x_0 / a^(-2/3) - 1 | < 1.16 * 2^-21.
```
This step is done in double precision.

3. **First Newton iteration**

We follow the method described in:
Sibidanov, A. and Zimmermann, P., "Correctly rounded cubic root evaluation
in double precision", https://core-math.gitlabpages.inria.fr/cbrt64.pdf
to derive multiplicative Newton iterations as below:
Let `x_n` be the nth approximation to `a^(-2/3)`. Define the nth error as:
```
  h_n = x_n^3 * a^2 - 1
```
Then:
```
  a^(-2/3) = x_n / (1 + h_n)^(1/3)
           = x_n * (1 - (1/3) * h_n + (2/9) * h_n^2 - (14/81) * h_n^3 + ...)
```
using the Taylor series expansion of `(1 + h_n)^(-1/3)`.

Apply to `x_0` above:
```
  h_0 = x_0^3 * a^2 - 1
      = a^2 * (x_0 - a^(-2/3)) * (x_0^2 + x_0 * a^(-2/3) + a^(-4/3)),
```
it's bounded by:
```
  |h_0| < 4 * 3 * 1.16 * 2^-21 * 4 < 2^-17.
```
So in the first iteration step, we use:
```
  x_1 = x_0 * (1 - (1/3) * h_0 + (2/9) * h_0^2 - (14/81) * h_0^3)
```
Its relative error is bounded by:
```
  | x_1 / a^(-2/3) - 1 | < 35/243 * |h_0|^4 < 2^-70.
```
Then we perform Ziv's rounding test and check if the answer is exact.
This step is done in double-double precision.

4. **Second Newton iteration**

If Ziv's rounding test from the previous step fails, we define the error
term:
```
  h_1 = x_1^3 * a^2 - 1,
```
and perform another iteration:
```
  x_2 = x_1 * (1 - h_1 / 3)
```
with relative errors that exceed the precision of double-double.
We then apply Ziv's accuracy test with relative errors < 2^-102 to
compensate for rounding errors.

5. **Final iteration**
 
If Ziv's accuracy test from the previous step fails, we perform another
iteration in 128-bit precision and check for exact outputs.
2024-07-17 12:23:14 -04:00
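The range reduction and multiplicative Newton steps from commit 7fc9fb9f3f can be illustrated in plain double precision. This is a rough sketch under stated assumptions, not the committed implementation: the degree-7 minimax polynomial is replaced by a straight chord through `x_r^(-2/3)` on `[1, 2]`, the iteration count is fixed at four, and there is no double-double arithmetic or Ziv test, so the result is not correctly rounded. The name `sketch_cbrt` is hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Sketch only: plain double arithmetic and a crude starting guess. */
double sketch_cbrt(double x) {
  if (x == 0.0 || isnan(x) || isinf(x))
    return x; /* cbrt maps these inputs to themselves */

  /* 1. Range reduction: |x| = 2^(3q + r) * x_r, 1 <= x_r < 2, r in {0,1,2}. */
  double ax = x < 0.0 ? -x : x;
  int e;
  double x_r = 2.0 * frexp(ax, &e); /* frexp yields [0.5, 1); rescale */
  e -= 1;
  int r = ((e % 3) + 3) % 3; /* non-negative remainder */
  int q = (e - r) / 3;       /* floor(e / 3) */
  double a = ldexp(x_r, r);  /* a = 2^r * x_r, so 1 <= a < 8 */

  /* 2. Starting guess for a^(-2/3): chord through x_r^(-2/3) on [1, 2]
        (an assumption replacing the minimax polynomial), scaled by
        2^(-2r/3) from a three-entry table. */
  static const double SCALE[3] = {1.0, 0.6299605249474366,
                                  0.3968502629920499};
  double x_n = (1.0 - 0.3700394750525634 * (x_r - 1.0)) * SCALE[r];

  /* 3. Newton steps: h = x_n^3 * a^2 - 1, then multiply x_n by the
        truncated series 1 - h/3 + 2h^2/9 - 14h^3/81. */
  for (int i = 0; i < 4; ++i) {
    double h = x_n * x_n * x_n * a * a - 1.0;
    x_n *= 1.0 + h * (-1.0 / 3.0 + h * (2.0 / 9.0 - h * (14.0 / 81.0)));
  }

  /* a^(1/3) = a * a^(-2/3); reattach the exponent and the sign. */
  double result = ldexp(a * x_n, q);
  return x < 0.0 ? -result : result;
}
```

The loop contains no division by a runtime value, which is the point of iterating on `a^(-2/3)` rather than on `a^(1/3)` directly. With the chord as a starting guess the first error `h` is far larger than the bound quoted in the commit message, which is why this sketch runs several iterations instead of one.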
Joseph Huber
8393ea5d1d [libc] Implement clock_gettime for the monotonic clock on the GPU (#99067)
Summary:
This patch implements `clock_gettime` using the monotonic clock. This
allows users to get time elapsed at nanosecond resolution. This is
primarily to facilitate compiling the `chrono` library from `libc++`.
For this reason we provide both `CLOCK_MONOTONIC`, which we can implement
with the GPU's global fixed-frequency clock, and `CLOCK_REALTIME`, which
we cannot. The latter is provided just to make people who use this
header happy and it will always return failure.
2024-07-16 16:17:34 -05:00
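Converting a fixed-frequency tick count into nanosecond-resolution time, the idea behind `CLOCK_MONOTONIC` in commit 8393ea5d1d, can be sketched as follows. The function name and both parameters are hypothetical; a real implementation would read the counter from the hardware and take the frequency from the loader.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Sketch: convert ticks of a fixed-frequency counter into a timespec. */
struct timespec sketch_ticks_to_timespec(uint64_t ticks, uint64_t freq_hz) {
  assert(freq_hz != 0);
  struct timespec ts;
  ts.tv_sec = (time_t)(ticks / freq_hz);
  /* Scale only the sub-second remainder so the multiplication cannot
     overflow 64 bits for frequencies below roughly 18 GHz. */
  ts.tv_nsec = (long)((ticks % freq_hz) * UINT64_C(1000000000) / freq_hz);
  return ts;
}
```

Splitting the tick count into whole seconds and a remainder before scaling avoids the overflow that multiplying the full count by one billion would hit after a few seconds of uptime.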
Joseph Huber
f7cee44ef2 [libc] Add strerror and strerror_r to the GPU (#99083)
Summary:
The GPU mostly ignores `errno`, but targets want these functions to
be defined for certain C standard interfaces. This patch enables them
and makes the test function on non-Linux targets.
2024-07-16 16:17:01 -05:00
Joseph Huber
94ed08d6b2 [libc] Enable 'wchar.h' for the GPU (#98973)
Summary:
This file is not really well populated, but is required for some targets
to configure. Enable it on the GPU for now.
2024-07-15 22:21:17 -05:00
aaryanshukla
34e06dc371 [libc] newheadergen: added assert.yaml (#98826)
- removed assert macro definitions in api.td
- included macro definitions in assert.h.def
- added assert.yaml
2024-07-15 16:52:32 -07:00
Petr Hosek
69258491d2 [libc] Support configurable errno modes (#98287)
Rather than selecting the errno implementation based on the platform
which doesn't provide the necessary flexibility, make it configurable.

The errno value location is returned by `int *__llvm_libc_errno()` which
is a common design used by other C libraries.
2024-07-13 10:52:42 -07:00
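The design mentioned in commit 69258491d2, where errno is reached through a function returning `int *`, can be sketched as below. Every name here is hypothetical; only the shape of a function returning a pointer to the errno value comes from the commit message, and thread-local storage is just one of the possible modes.

```c
#include <assert.h>

/* One possible backing store: a thread-local integer (an assumption). */
static _Thread_local int sketch_errno_storage;

/* The accessor hides where the value actually lives. */
int *sketch_errno_location(void) { return &sketch_errno_storage; }

/* Callers read and write `sketch_errno` like an ordinary variable. */
#define sketch_errno (*sketch_errno_location())

/* Example of a failing routine reporting its error code. */
int sketch_fail_with(int code) {
  assert(code != 0); /* zero conventionally means "no error" */
  sketch_errno = code;
  return -1;
}
```

Because callers only ever go through the accessor, the storage behind it can change per configuration without touching any code that uses the macro.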
Joseph Huber
40effc7af5 [libc] Implement (v|f)printf on the GPU (#96369)
Summary:
This patch implements the `printf` family of functions on the GPU using
the new variadic support. This patch adapts the old handling in the
`rpc_fprintf` placeholder, but adds an extra RPC call to get the size of
the buffer to copy. This prevents the GPU from needing to parse the
string. While it's theoretically possible for the pass to know the size
of the struct, it's prohibitively difficult to do while maintaining ABI
compatibility with NVIDIA's varargs.

Depends on https://github.com/llvm/llvm-project/pull/96015.
2024-07-12 19:36:13 -05:00