clang-p2996

Author	SHA1	Message	Date
OverMighty	d97f6d1ae9	[libc][math][c23] Add sqrtf16 C23 math function (#112406 ) Part of #95250.	2024-10-19 01:41:52 +02:00
OverMighty	69d3a44ede	[libc][math][c23] Add log10f16 C23 math function (#106091 ) Part of #95250.	2024-10-19 01:40:40 +02:00
OverMighty	6d347fdfbd	[libc][math][c23] Add log2f16 C23 math function (#106084 ) Part of #95250.	2024-10-19 01:10:32 +02:00
OverMighty	65cf7afb6d	[libc][math][c23] Add logf16 C23 math function (#106072 ) Part of #95250.	2024-10-18 22:35:12 +02:00
OverMighty	fdd7c0353f	[libc][math][c23] Add tanhf16 C23 math function (#106006 ) Part of #95250.	2024-10-18 14:22:45 +02:00
OverMighty	ed3d051782	[libc][math][c23] Add sinhf16 and coshf16 C23 math functions (#105947 ) Part of #95250.	2024-10-17 20:44:23 +02:00
OverMighty	95c24cb9de	[libc][math][c23] Add exp10m1f16 C23 math function (#105706 ) Part of #95250.	2024-10-16 16:33:13 +02:00
Joseph Huber	fe6a3d46aa	[libc] Implement the 'rename' function on the GPU (#109814 ) Summary: Straightforward implementation like the other `stdio.h` functions.	2024-09-24 09:32:42 -07:00
Joseph Huber	3bbe0f90f3	[libc] Add 'strings.h' header on the GPU (#109661 ) Summary: These are GNU extensions but still show up, the entrypoints were enabled but we weren't emitting the header so they couldn't be used.	2024-09-23 14:19:33 -07:00
Joseph Huber	16d11e26f3	[libc] Add GPU support for the 'system' function (#109687 ) Summary: This function can easily be implemented by forwarding it to the host process. This shows up in a few places that we might want to test the GPU so it should be provided. Also, I find the idea of the GPU offloading work to the CPU via `system` very funny.	2024-09-23 14:04:28 -07:00
Michael Jones	f009f72df5	[libc] Add printf strerror conversion (%m) (#105891 ) This patch adds the %m conversion to printf, which prints the strerror(errno). An explanation of why is below; this patch also updates the docs, tests, and build system to accommodate this. The standard for syslog in POSIX specifies that it uses the same format as printf, but adds %m, which prints the error message string for the current value of errno. For ease of implementation, it's standard practice for libc implementers to just add %m to printf instead of creating a separate parser for syslog.	2024-09-19 10:48:08 -07:00
Joseph Huber	5c019bdb7a	[libc] Add support for 'string.h' locale variants (#105719 ) Summary: This adds the locale variants of the string functions. As previously, these do not use the locale information at all and simply copy the non-locale version which expects the "C" locale.	2024-08-29 14:20:15 -05:00
Joseph Huber	a87105121d	[libc] Implement locale variants for 'stdlib.h' functions (#105718 ) Summary: This provides the `_l` variants for the `stdlib.h` functions. These are just copies of the same entrypoint and don't do anything with the locale information.	2024-08-29 14:18:37 -05:00
Joseph Huber	439d7de14d	[libc] Disable failing scanf test on AMDGPU temporarily Summary: This test currently fails in the `amdgpu-attributor` pass. I haven't figured out anything beyond that yet as it's difficult to reduce.	2024-08-28 07:04:15 -05:00
Joseph Huber	856dadb33c	[libc] Add `ctype.h` locale variants (#102711 ) Summary: This patch adds all the libc ctype variants. These ignore the locale information completely, so they're pretty much just stubs. Because these use locale information, which is system scope, we do not enable building them outside of full build mode.	2024-08-22 13:51:54 -05:00
Joseph Huber	78d8ab2ab9	[libc] Initial support for 'locale.h' in the LLVM libc (#102689 ) Summary: This patch adds the macros and entrypoints associated with the `locale.h` entrypoints. These are mostly stubs, as we (for now and the foreseeable future) only expect to support the C and maybe C.UTF-8 locales in the LLVM libc.	2024-08-22 12:58:46 -05:00
Joseph Huber	2f4232db0b	Revert "[libc] Add `ctype.h` locale variants (#102711 )" This reverts commit `8f005f8306`.	2024-08-22 12:45:16 -05:00
Joseph Huber	8f005f8306	[libc] Add `ctype.h` locale variants (#102711 ) Summary: This patch adds all the libc ctype variants. These ignore the locale information completely, so they're pretty much just stubs. Because these use locale information, which is system scope, we do not enable building them outside of full build mode.	2024-08-22 12:41:20 -05:00
Joseph Huber	6b98a72365	[libc] Add `scanf` support to the GPU build (#104812 ) Summary: The `scanf` function has a "system file" configuration, which is pretty much what the GPU implementation does at this point. So we should be able to use it in much the same way.	2024-08-21 18:02:04 -05:00
Joseph Huber	bd9f2c2ba0	[libc] Add missing math definitions for round and scal for GPU (#104636 ) Summary: These can be enabled	2024-08-16 16:27:03 -05:00
Joseph Huber	55aa4ea1c7	[libc] Add definition for `atan2l` on 64-bit long double platforms (#104489 ) Summary: This just adds `atan2l` for platforms that can implement it as an alias to `atan2`.	2024-08-15 14:59:28 -05:00
Joseph Huber	dc2f39e96c	[libc] Enable all supported math functions on the GPU (#102563 ) Summary: Simply copies the x64 versions to the GPU directory. Ignoring f128 for now, but adding long double entrypoints which are identical to `double` on the target.	2024-08-12 13:12:44 -05:00
aaryanshukla	d0fe470fd2	[libc][math] Add scalbln{,f,l,f128} math functions (#102219 ) Co-authored-by: OverMighty <its.overmighty@gmail.com>	2024-08-08 14:33:50 -07:00
Joseph Huber	1a92cc5a0a	[libc] Implement 'getenv' on the GPU target (#102376 ) Summary: This patch implements 'getenv'. I was torn on how to implement this, since realistically we only have access to this environment pointer in the "loader" interface. An alternative would be to use an RPC call every time, but I think that's overkill for what this will be used for. A better solution is just to emit a common `DataEnvironment` that contains all of the host visible resources to initialize. Right now this is the `env_ptr`, `clock_freq`, and `rpc_client`. I did this by making the `app.h` interface that Linux uses more general, could possibly move that into a separate patch, but I figured it's easier to see with the usage.	2024-08-08 06:45:42 -05:00
Joseph Huber	3645ca58f4	[libc] Enable quick_exit routines on the GPU (#102242 ) Summary: We should be able to use these on the GPU just like exit.	2024-08-07 08:01:11 -05:00
Joseph Huber	88d288489e	[libc] Add `lgamma` and `lgamma_r` stubs for the GPU (#102019 ) Summary: These functions are used by the <random> implementation in libc++ and cause a lot of tests to fail. For now we provide these through the vendor abstraction until we have a real version. The NVPTX version doesn't even update the output correctly so these are just temporary.	2024-08-05 14:53:05 -05:00
Joseph Huber	c4ec19b985	[libc] Add support for 'features.h' when targeting the GPU (#102037 ) Summary: `features.h` provides some information about the C library, provide this on the GPU so external users can tell if it's the LLVM C library.	2024-08-05 14:52:44 -05:00
Joseph Huber	bde51232ba	[libc] Provide 'signal.h' header for the GPU (#101996 ) Summary: This header is practically useless, but we provide it mostly for the macros so that applications can compile. I'm only doing this for the `libc++` unittests that want it, and it is part of the C standard technically. I just made an RPC call to do `raise`. Anything more isn't going to work since it'd be way too annoying to make the CPU call into some signal handler the GPU registered.	2024-08-05 14:52:14 -05:00
Joseph Huber	97f723bab0	[libc] Fix 'vasprintf' not working in non-fullbuild mode	2024-08-01 15:36:29 -05:00
Job Henandez Lara	ed12f80ff0	[libc][math][c23] add entrypoints and tests for getpayload{,f,f128} (#101285 )	2024-07-31 23:16:42 -04:00
Joseph Huber	38ef6929a3	[libc] Add vsscanf function (#101402 ) Summary: Adds support for the `vsscanf` function similar to `sscanf`. Based off of https://github.com/llvm/llvm-project/pull/97529.	2024-07-31 16:53:25 -05:00
Joseph Huber	bf42a7860a	[libc] Implement placeholder memory functions on the GPU (#101082 ) Summary: These functions are needed for `libc++` to link successfully. We can't implement them well currently, so simply provide some stand-in implementations. `realloc` will currently copy garbage and potentially fault and `aligned_alloc` will work unless your alignment is more than 4K alignment. However, these should work in practice to get tests running. I will write a real allocator soon™.	2024-07-30 10:15:30 -05:00
Joseph Huber	dbb8b7a0f4	Reapply "[OpenMP][libc] Remove special handling for OpenMP printf (#98940 )" This reverts commit `fea5914c92`.	2024-07-26 17:21:56 -05:00
Joseph Huber	fea5914c92	Revert "[OpenMP][libc] Remove special handling for OpenMP printf (#98940 )" This reverts commit `069e8bcd82`. Summary: Some tests failing, revert this for now.	2024-07-26 16:39:12 -05:00
Joseph Huber	069e8bcd82	[OpenMP][libc] Remove special handling for OpenMP printf (#98940 ) Summary: Currently there are several layers to handle `printf`. Since we now have varargs and an implementation of `printf` this can be heavily simplified. 1. The frontend renames `printf` into `omp_vprintf` and gives it an argument buffer. Removing 1. triggered some code in the AMDGPU backend meant for HIP / OpenCL, so I added an exception to it. 2. Forward this to CUDA vprintf or ignore it. We no longer need special handling for it since we have varargs. So now we just forward this to CUDA vprintf if we have libc, otherwise just leave `printf` as an external function and expect that `libc` will be linked in.	2024-07-26 16:03:36 -05:00
OverMighty	81ce796095	[libc][math][c23] Enable C23 _Float16 math functions on GPUs (#99248 )	2024-07-25 21:09:49 +02:00
Joseph Huber	2e3ee31d29	[libc] Enable 'sscanf' on the GPU #100211 Summary: We can enable the sscanf function on the GPU now. This required adding the configs to the scanf list so that the GPU build didn't do float conversions.	2024-07-24 14:16:57 -05:00
Joseph Huber	9914609468	Revert "[libc] Enable 'sscanf' on the GPU (#100211 )" Summary: This fails tests in some situations, revert until it can be fixed. This reverts commit `445bb35f95`.	2024-07-24 07:46:39 -05:00
Joseph Huber	445bb35f95	[libc] Enable 'sscanf' on the GPU (#100211 ) Summary: We can enable the `sscanf` function on the GPU now.	2024-07-24 07:41:32 -05:00
Joseph Huber	e0649a5dfc	[NVPTX] Fix internal indirect call prototypes not obeying the ABI (#100131 ) Summary: The NVPTX backend optimizes the ABI for functions that are internal, however, this is not legal for indirect call prototypes. Previously, we would modify the ABI on an aggregate byval type passed to an indirect call prototype, which would make PTXAS error. This patch just passes the function as a nullptr to force strict ABI compliance without modification in the helper function. Fixes https://github.com/llvm/llvm-project/issues/100055	2024-07-23 12:54:00 -05:00
Joseph Huber	e7a2405383	[libc] Remove workarounds for lack of functional NVPTX linker (#96972 ) Summary: Currently we have several hacks to work around the fact that the NVPTX linker, 'nvlink', does not support static libraries or LTO linking. The patch in https://github.com/llvm/llvm-project/pull/96561 introduces a wrapper in the toolchain that allows us to use a standard `ld.lld` like interface. This means all the divergence with this target can be removed. Depends on https://github.com/llvm/llvm-project/pull/96561	2024-07-22 22:16:50 -05:00
aaryanshukla	a2f61ba08b	[libc][math]fadd implementation (#99694 ) - [libc] math fadd - [libc][math] implemented fadd	2024-07-19 14:40:34 -07:00
Joseph Huber	38f1dd2e45	[libc] Remove `strerror_r` on the GPU for now Summary: This function has conflicting definitions, which makes it difficult to use in an offloading setting. Disable it for now.	2024-07-18 06:54:03 -05:00
lntue	7fc9fb9f3f	[libc][math] Implement double precision cbrt correctly rounded to all rounding modes. (#99262 ) Division-less Newton iterations algorithm for cube roots. 1. Range reduction For `x = (-1)^s * 2^e * (1.m)`, we get 2 reduced arguments `x_r` and `a` as: ``` x_r = 1.m a = (-1)^s * 2^(e % 3) * (1.m) ``` Then `cbrt(x) = x^(1/3)` can be computed as: ``` x^(1/3) = 2^(e / 3) * a^(1/3). ``` In order to avoid division, we compute `a^(-2/3)` using Newton's method and then multiply the result by `a`: ``` a^(1/3) = a * a^(-2/3). ``` 2. First approximation to a^(-2/3) First, we use a degree-7 minimax polynomial generated by Sollya to approximate `x_r^(-2/3)` for `1 <= x_r < 2`. ``` p = P(x_r) ~ x_r^(-2/3), ``` with relative errors bounded by: ``` | p / x_r^(-2/3) - 1 | < 1.16 * 2^-21. ``` Then we multiply by `2^(-2(e % 3)/3)` from a small lookup table to get: ``` x_0 = 2^(-2(e % 3)/3) p ~ 2^(-2(e % 3)/3) x_r^(-2/3) = a^(-2/3) ``` with relative errors: ``` | x_0 / a^(-2/3) - 1 | < 1.16 * 2^-21. ``` This step is done in double precision. 3. First Newton iteration We follow the method described in: Sibidanov, A. and Zimmermann, P., "Correctly rounded cubic root evaluation in double precision", https://core-math.gitlabpages.inria.fr/cbrt64.pdf to derive multiplicative Newton iterations as below: Let `x_n` be the nth approximation to `a^(-2/3)`. Define the nth error as: ``` h_n = x_n^3 * a^2 - 1 ``` Then: ``` a^(-2/3) = x_n / (1 + h_n)^(1/3) = x_n * (1 - (1/3) * h_n + (2/9) * h_n^2 - (14/81) * h_n^3 + ...) ``` using the Taylor series expansion of `(1 + h_n)^(-1/3)`. Apply to `x_0` above: ``` h_0 = x_0^3 * a^2 - 1 = a^2 * (x_0 - a^(-2/3)) * (x_0^2 + x_0 * a^(-2/3) + a^(-4/3)), ``` it's bounded by: ``` |h_0| < 4 * 3 * 1.16 * 2^-21 * 4 < 2^-17. ``` So in the first iteration step, we use: ``` x_1 = x_0 * (1 - (1/3) * h_0 + (2/9) * h_0^2 - (14/81) * h_0^3) ``` Its relative error is bounded by: ``` | x_1 / a^(-2/3) - 1 | < 35/242 * |h_0|^4 < 2^-70.
``` Then we perform Ziv's rounding test and check if the answer is exact. This step is done in double-double precision. 4. Second Newton iteration If Ziv's rounding test from the previous step fails, we define the error term: ``` h_1 = x_1^3 * a^2 - 1, ``` And perform another iteration: ``` x_2 = x_1 * (1 - h_1 / 3) ``` with relative errors that exceed the precision of double-double. We then apply Ziv's accuracy test with relative errors < 2^-102 to compensate for rounding errors. 5. Final iteration If Ziv's accuracy test from the previous step fails, we perform another iteration in 128-bit precision and check for exact outputs.	2024-07-17 12:23:14 -04:00
Joseph Huber	8393ea5d1d	[libc] Implement `clock_gettime` for the monotonic clock on the GPU (#99067 ) Summary: This patch implements `clock_gettime` using the monotonic clock. This allows users to get time elapsed at nanosecond resolution. This is primarily to facilitate compiling the `chrono` library from `libc++`. For this reason we provide both `CLOCK_MONOTONIC`, which we can implement with the GPU's global fixed-frequency clock, and `CLOCK_REALTIME` which we cannot. The latter is provided just to make people who use this header happy and it will always return failure.	2024-07-16 16:17:34 -05:00
Joseph Huber	f7cee44ef2	[libc] Add `strerror` and `strerror_r` to the GPU (#99083 ) Summary: The GPU ignores `errno` primarily, but targets want these functions to be defined for certain C standard interfaces. This patch enables them and makes the test function on non-Linux targets.	2024-07-16 16:17:01 -05:00
Joseph Huber	94ed08d6b2	[libc] Enable 'wchar.h' for the GPU (#98973 ) Summary: This file is not really well populated, but is required for some targets to configure. Enable it on the GPU for now.	2024-07-15 22:21:17 -05:00
aaryanshukla	34e06dc371	[libc] newheadergen: added assert.yaml (#98826 ) - removed assert macro definitions in api.td - included macro definitions in assert.h.def - added assert.yaml	2024-07-15 16:52:32 -07:00
Petr Hosek	69258491d2	[libc] Support configurable errno modes (#98287 ) Rather than selecting the errno implementation based on the platform which doesn't provide the necessary flexibility, make it configurable. The errno value location is returned by `int *__llvm_libc_errno()` which is a common design used by other C libraries.	2024-07-13 10:52:42 -07:00
Joseph Huber	40effc7af5	[libc] Implement (v|f)printf on the GPU (#96369 ) Summary: This patch implements the `printf` family of functions on the GPU using the new variadic support. This patch adapts the old handling in the `rpc_fprintf` placeholder, but adds an extra RPC call to get the size of the buffer to copy. This prevents the GPU from needing to parse the string. While it's theoretically possible for the pass to know the size of the struct, it's prohibitively difficult to do while maintaining ABI compatibility with NVIDIA's varargs. Depends on https://github.com/llvm/llvm-project/pull/96015.	2024-07-12 19:36:13 -05:00

125 Commits