clang-p2996

Author	SHA1	Message	Date
Schrodinger ZHU Yifan	3db5c1eeb0	revert all tid changes (#100915 )	2024-07-27 22:29:21 -07:00
RoseZhang03	134b4484d8	[libc] Updated GettingStarted.rst with PyYAML version (#100649 ) New Headergen requires PyYAML version 5.1 or newer in order to generate header files.	2024-07-26 18:31:00 +00:00
lntue	ca8b14de51	[libc][math] Implement fast pass for double precision atan2 with 1 ULP errors. (#100648 )	2024-07-26 09:56:46 -04:00
Daniel Thornburgh	0c10bdc05f	[libc] Lazily initialize freelist malloc using symbols (#99254 ) This requires the user to set the upper bounds of the heap by defining the symbol `__libc_heap_limit`. The heap begins at `_end` and ends `__libc_heap_limit` bytes afterwards. This prevents a completely unused heap from requiring any space, and it prevents the heap from being zeroed at initialization time as part of BSS. It also allows users to customize the available heap location without recompiling libc. I'd think this should eventually be replaced with an implementation based on a morecore() library. This would allow the same implementation to use sbrk() on POSIX, `_end` and `__libc_heap_limit` on embedded, and a buffer in tests. It would also provide better "wilderness" behavior that tends to decrease heap fragmentation (see Wilson et al.) See #98096	2024-07-25 13:38:06 -07:00
Mikhail R. Gadelha	88fb56ebf2	[libc] Fix broken table introduced by PR #100578	2024-07-25 19:57:06 +02:00
Mikhail R. Gadelha	e90d552c77	[libc][NFC] Update riscv documentation (#100578 ) This adds linux-riscv32 to the documentation and fixes riscv's entrypoint broken link.	2024-07-25 13:25:09 -03:00
Job Henandez Lara	7b51777ed8	[libc][math][c23] add entrypoints and tests for totalordermag{f,l,f128} (#100159 ) Fixes https://github.com/llvm/llvm-project/issues/100139	2024-07-24 19:53:23 -04:00
aaryanshukla	8b094c9df3	[libc][newheadergen]: PyYaml Version Update (#100463 ) - a lot of builds had an issue using new headergen because they do not have PyYaml installed.	2024-07-24 15:04:05 -07:00
Joseph Huber	2e3ee31d29	[libc] Enable 'sscanf' on the GPU #100211 Summary: We can enable the sscanf function on the GPU now. This required adding the configs to the scanf list so that the GPU build didn't do float conversions.	2024-07-24 14:16:57 -05:00
Joseph Huber	8d8fa01a66	Reapply "[libc] Remove 'packaged' GPU build support (#100208 )" This reverts commit `550b83d658`.	2024-07-24 10:24:53 -05:00
Joseph Huber	550b83d658	Revert "[libc] Remove 'packaged' GPU build support (#100208 )" Summary: I forgot that the OpenMP tests still look for this, reverting for now until I can make a fix. This reverts commit `c1c6ed83e9`.	2024-07-24 07:51:47 -05:00
Joseph Huber	9914609468	Revert "[libc] Enable 'sscanf' on the GPU (#100211 )" Summary: This fails tests in some situations, revert until it can be fixed. This reverts commit `445bb35f95`.	2024-07-24 07:46:39 -05:00
Joseph Huber	445bb35f95	[libc] Enable 'sscanf' on the GPU (#100211 ) Summary: We can enable the `sscanf` function on the GPU now.	2024-07-24 07:41:32 -05:00
Joseph Huber	c1c6ed83e9	[libc] Remove 'packaged' GPU build support (#100208 ) Summary: Previously, the GPU built the `libc` in a fat binary version that was used to pass this to the link job in offloading languages like CUDA or OpenMP. This was mostly required because NVIDIA couldn't consume the standard static library version. Recent patches have now created the `clang-nvlink-wrapper` which lets us do that. Now, the C library is just included implicitly by the toolchain (or passed with -Xoffload-linker -lc). This code can be fully removed, which will heavily simplify the build (and remove some bugs and garbage files I've encountered).	2024-07-24 07:22:49 -05:00
Joseph Huber	0420d2f97e	[libc] Fix leftover debug commandline argument Summary: Fixes https://github.com/llvm/llvm-project/issues/100289	2024-07-23 21:35:42 -05:00
RoseZhang03	8972979c37	[libc] Updated header_generation.rst (#99712 ) Added new headergen documentation.	2024-07-22 20:15:26 +00:00
Job Henandez Lara	c1562374c8	[libc][math][c23] Add entrypoints and tests for dsqrt{l,f128} (#99815 )	2024-07-21 15:55:11 -04:00
Job Henandez Lara	af0f58cf14	[libc][math][c23] Add entrypoints and tests for fsqrt{,l,f128} (#99669 )	2024-07-21 11:17:41 -04:00
Schrodinger ZHU Yifan	29be889c2c	reland "[libc] implement cached process/thread identity (#98989 )" (#99765 )	2024-07-20 10:25:40 -07:00
aaryanshukla	a2f61ba08b	[libc][math]fadd implementation (#99694 ) - [libc] math fadd - [libc][math] implemented fadd	2024-07-19 14:40:34 -07:00
Schrodinger ZHU Yifan	415ca24f8e	Revert "[libc] implement cached process/thread identity" (#99559 ) Reverts llvm/llvm-project#98989	2024-07-18 13:31:04 -07:00
Schrodinger ZHU Yifan	5c9fc3cdd7	[libc] implement cached process/thread identity (#98989 ) migrated from https://github.com/llvm/llvm-project/pull/95965 due to corrupted git history	2024-07-18 13:27:50 -07:00
OverMighty	9fb049c8c6	[libc][math][c23] Add {f,d}mul{l,f128} and f16mul{,f,l,f128} C23 math functions (#98972 ) Part of #93566. Fixes #94833.	2024-07-18 19:50:49 +02:00
Joseph Huber	38f1dd2e45	[libc] Remove `strerror_r` on the GPU for now Summary: This function has conflicting definitions, which makes it difficult to use in an offloading setting. Disable it for now.	2024-07-18 06:54:03 -05:00
lntue	7fc9fb9f3f	[libc][math] Implement double precision cbrt correctly rounded to all rounding modes. (#99262 ) Division-less Newton iterations algorithm for cube roots. 1. Range reduction For `x = (-1)^s * 2^e * (1.m)`, we get 2 reduced arguments `x_r` and `a` as: ``` x_r = 1.m a = (-1)^s * 2^(e % 3) * (1.m) ``` Then `cbrt(x) = x^(1/3)` can be computed as: ``` x^(1/3) = 2^(e / 3) * a^(1/3). ``` In order to avoid division, we compute `a^(-2/3)` using Newton's method and then multiply the result by a: ``` a^(1/3) = a * a^(-2/3). ``` 2. First approximation to a^(-2/3) First, we use a degree-7 minimax polynomial generated by Sollya to approximate `x_r^(-2/3)` for `1 <= x_r < 2`. ``` p = P(x_r) ~ x_r^(-2/3), ``` with relative errors bounded by: ``` | p / x_r^(-2/3) - 1 | < 1.16 * 2^-21. ``` Then we multiply with `2^(e % 3)` from a small lookup table to get: ``` x_0 = 2^(-2(e % 3)/3) p ~ 2^(-2(e % 3)/3) x_r^(-2/3) = a^(-2/3) ``` with relative errors: ``` | x_0 / a^(-2/3) - 1 | < 1.16 * 2^-21. ``` This step is done in double precision. 3. First Newton iteration We follow the method described in: Sibidanov, A. and Zimmermann, P., "Correctly rounded cubic root evaluation in double precision", https://core-math.gitlabpages.inria.fr/cbrt64.pdf to derive multiplicative Newton iterations as below: Let `x_n` be the nth approximation to `a^(-2/3)`. Define the nth error as: ``` h_n = x_n^3 * a^2 - 1 ``` Then: ``` a^(-2/3) = x_n / (1 + h_n)^(1/3) = x_n * (1 - (1/3) * h_n + (2/9) * h_n^2 - (14/81) * h_n^3 + ...) ``` using the Taylor series expansion of `(1 + h_n)^(-1/3)`. Apply to `x_0` above: ``` h_0 = x_0^3 * a^2 - 1 = a^2 * (x_0 - a^(-2/3)) * (x_0^2 + x_0 * a^(-2/3) + a^(-4/3)), ``` it's bounded by: ``` |h_0| < 4 * 3 * 1.16 * 2^-21 * 4 < 2^-17. ``` So in the first iteration step, we use: ``` x_1 = x_0 * (1 - (1/3) * h_0 + (2/9) * h_0^2 - (14/81) * h_0^3) ``` Its relative error is bounded by: ``` | x_1 / a^(-2/3) - 1 | < 35/243 * |h_0|^4 < 2^-70. ``` Then we perform Ziv's rounding test and check if the answer is exact. This step is done in double-double precision. 4. Second Newton iteration If Ziv's rounding test from the previous step fails, we define the error term: ``` h_1 = x_1^3 * a^2 - 1, ``` and perform another iteration: ``` x_2 = x_1 * (1 - h_1 / 3) ``` with relative errors that exceed the precision of double-double. We then apply Ziv's accuracy test with relative errors < 2^-102 to compensate for rounding errors. 5. Final iteration If Ziv's accuracy test from the previous step fails, we perform another iteration in 128-bit precision and check for exact outputs.	2024-07-17 12:23:14 -04:00
Joseph Huber	49b2c30feb	[libc][docs] Document printf support on the GPU target (#99241 ) Summary: Title	2024-07-16 16:40:37 -05:00
Joseph Huber	8393ea5d1d	[libc] Implement `clock_gettime` for the monotonic clock on the GPU (#99067 ) Summary: This patch implements `clock_gettime` using the monotonic clock. This allows users to get time elapsed at nanosecond resolution. This is primarily to facilitate compiling the `chrono` library from `libc++`. For this reason we provide both `CLOCK_MONOTONIC`, which we can implement with the GPU's global fixed-frequency clock, and `CLOCK_REALTIME` which we cannot. The latter is provided just to make people who use this header happy and it will always return failure.	2024-07-16 16:17:34 -05:00
lntue	a6d2da8b9d	[libc][stdlib] Implement heap sort. (#98582 )	2024-07-16 08:13:25 -04:00
Petr Hosek	69258491d2	[libc] Support configurable errno modes (#98287 ) Rather than selecting the errno implementation based on the platform which doesn't provide the necessary flexibility, make it configurable. The errno value location is returned by `int *__llvm_libc_errno()` which is a common design used by other C libraries.	2024-07-13 10:52:42 -07:00
Petr Hosek	5ff3ff33ff	[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98597 ) This is a part of #97655.	2024-07-12 09:28:41 -07:00
Mehdi Amini	ce9035f5bd	Revert "[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration" (#98593 ) Reverts llvm/llvm-project#98075 bots are broken	2024-07-12 09:12:13 +02:00
Petr Hosek	3f30effe1b	[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98075 ) This is a part of #97655.	2024-07-11 12:35:22 -07:00
Joseph Huber	60ff9c2ea5	[libc] Add support for `powi` as an LLVM libc extension on the GPU (#98236 ) Summary: This function is used by the CUDA / HIP / OpenMP headers and exists as an NVIDIA extension basically. This function is implemented in the C23 standard as `pown`, but for now we need to provide `powi` for backwards compatibility. In the future this entrypoint will just be a redirect to `pown` once that is implemented.	2024-07-09 20:51:36 -05:00
Joseph Huber	4b53cbb069	[libc] Export configs for `scanf` support (#98066 ) Summary: This patch adds the options for configuring floating point and index mode for scanf just like `printf`. Not enabling it on the GPU yet, need to fix something else first.	2024-07-08 15:11:52 -05:00
Joseph Huber	12e47aabd4	[libc] Add config option for fast math optimizations (#98029 ) Summary: This patch adds `LIBC_COPT_MATH_OPTIMIZATIONS` that allows users to configure the different math optimizations.	2024-07-08 11:01:03 -05:00
lntue	c9ee6b1977	[libc][math] Implement cbrtf function correctly rounded to all rounding modes. (#97936 ) Fixes https://github.com/llvm/llvm-project/issues/92874 Algorithm: Let `x = (-1)^s * 2^e * (1 + m)`. - Step 1: Range reduction: reduce the exponent with: ``` y = cbrt(x) = (-1)^s * 2^(floor(e/3)) * 2^((e % 3)/3) * (1 + m)^(1/3) ``` - Step 2: Use the first 4 fractional bits of `m` to look up a degree-7 polynomial approximation to: ``` (1 + m)^(1/3) ~ 1 + m * P(m). ``` - Step 3: Perform the multiplication: ``` 2^((e % 3)/3) * (1 + m)^(1/3). ``` - Step 4: Check for exact cases to prevent rounding and clear the `FE_INEXACT` floating point exception. - Step 5: Combine with the exponent and sign before converting down to `float` and return.	2024-07-08 10:02:12 -04:00
Izaak Schroeder	b151c7e36a	[libc] Add `dlfcn.h` placeholder (#97501 ) Adds `dlopen` and friends. This is needed as part of the effort to compile `libunwind` + `libc` without baremetal mode. This is part of https://github.com/llvm/llvm-project/issues/97191. This should still be spec compliant, since `dlopen` always returns `NULL` and `dlerror` always returns an error message. > If dlopen() fails for any reason, it returns NULL. > The function dlclose() returns 0 on success, and nonzero on error. > Since the value of the symbol could actually be NULL (so that a NULL return from dlsym() need not indicate an error), the correct way to test for an error is to call dlerror() to clear any old error conditions, then call dlsym(), and then call dlerror() again, saving its return value into a variable, and check whether this saved value is not NULL. See: - https://linux.die.net/man/3/dlopen	2024-07-06 16:01:59 -07:00
Hendrik Hübner	f8834ed24b	[libc][C23][math] Implement cospif function correctly rounded for all rounding modes (#97464 ) I also fixed a comment in sinpif.cpp in the first commit. Should this be included in this PR? All tests were passed, including the exhaustive test. CC: @lntue	2024-07-06 09:24:05 -04:00
OverMighty	ac76ce2693	[libc][math][c23] Classify f16fma{,f,l} as LLVM libc extensions (#97728 )	2024-07-05 09:58:01 -04:00
PiJoules	665efe8967	[libc] Add LIBC_NAMESPACE_DECL macro (#97109 ) This macro expands to LIBC_NAMESPACE with `__attribute__((visibility("hidden")))` so all the symbols under it have hidden visibility. This new macro should be used when declaring a new namespace that will have internal functions/globals and LIBC_NAMESPACE should be used as a means of accessing functions/globals declared within LIBC_NAMESPACE_DECL.	2024-07-03 17:02:57 -07:00
Michael Jones	2ef5b8227a	[libc][docs] Update full host build docs (#97643 ) Add a note explaining how to fix the missing `asm` folder, as well as a warning about installing without setting a sysroot.	2024-07-03 16:29:19 -07:00
lntue	7d68d9d2f2	[libc][math] Implement correctly rounded double precision tan (#97489 ) Using the same range reduction as `sin`, `cos`, and `sincos`: 1) Reduce `x = k*pi/128 + u`, with `|u| <= pi/256`, and `u` is in double-double. 2) Approximate `tan(u)` using a degree-9 Taylor polynomial. 3) Compute ``` tan(x) ~ (sin(k*pi/128) + tan(u) * cos(k*pi/128)) / (cos(k*pi/128) - tan(u) * sin(k*pi/128)) ``` using the fast double-double division algorithm in [the CORE-MATH project](https://gitlab.inria.fr/core-math/core-math/-/blob/master/src/binary64/tan/tan.c#L1855). 4) Perform relative-error Ziv's accuracy test. 5) If the accuracy test fails, we redo the computations using 128-bit precision `DyadicFloat`. Fixes https://github.com/llvm/llvm-project/issues/96930	2024-07-03 18:05:24 -04:00
OverMighty	f1a8f94bba	[libc][docs] Add doc for using containers to test on a different arch (#97431 )	2024-07-03 11:07:49 -04:00
OverMighty	4e56724213	[libc][math][c23] Add f16{add,sub}{,l,f128} C23 math functions (#97072 ) Part of #93566.	2024-07-02 19:27:09 -04:00
OverMighty	12a1e6dd12	[libc][math][c23] Add f16{add,sub}f C23 math functions (#96787 ) Part of #93566.	2024-07-02 09:16:12 -04:00
Hendrik Hübner	ea93c538c7	[libc][math][c23] Implemented sinpif function correctly rounded for all rounding modes. (#97149 ) This implements the sinpif function. An exhaustive test shows it's correct for all rounding modes. Issue: #94895	2024-07-01 16:38:03 -04:00
Joseph Huber	3c64a98180	[libc] Partially implement 'errno' on the GPU (#97107 ) Summary: The `errno` variable is expected to be `thread_local` by the standard. However, the GPU targets do not support `thread_local` and implementing that would be a large endeavor. Because of that, we previously didn't provide the `errno` symbol at all. However, to build some programs we at least need to be able to link against `errno`. Many things that would normally set `errno` completely ignore it currently (i.e. stdio) but some programs still need to be able to link against correct C programs. For this purpose this patch exports the `errno` symbol as a simple global. Internally, this will be updated atomically so it's at least not racy. Externally, this will be on the user. I've updated the documentation to state as such. This is required to get `libc++` to build.	2024-07-01 06:30:15 -05:00
Joseph Huber	ec0e6ef09b	[libc] Implement the 'remove' function on the GPU (#97096 ) Summary: Straightforward RPC implementation of the `remove` function for the GPU. Copies over the string and calls `remove` on it, passing the result back. This is required for building some `libc++` functionality.	2024-07-01 06:29:48 -05:00
OverMighty	6c1c451b86	[libc][math][c23] Add f16sqrt{,l,f128} C23 math functions (#96642 ) Part of #95250.	2024-06-30 19:20:39 -04:00
OverMighty	56ef6a2eb2	[libc][math][c23] Add f16div{,l,f128} C23 math functions (#97054 ) Part of #93566.	2024-06-29 18:48:12 -04:00
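
As a rough illustration of the division-less Newton step described in commit 7fc9fb9f3f (double precision cbrt), the Python sketch below applies `h = x^3 * a^2 - 1` and the truncated Taylor update to an approximation of `a^(-2/3)`, then multiplies by `a`. It is not the libc implementation: the function name `cbrt_newton_sketch` is hypothetical, the seed is supplied by the caller rather than computed from the degree-7 minimax polynomial, and everything runs in plain double precision with no Ziv rounding test.

```python
def cbrt_newton_sketch(a: float, x0: float, steps: int = 2) -> float:
    """Approximate a**(1/3) for a > 0 without any division in the loop.

    x0 is a rough approximation to a**(-2/3) (hypothetical seed; the commit
    uses a minimax polynomial plus a small lookup table instead).
    """
    x = x0
    a2 = a * a
    for _ in range(steps):
        # Error term from the commit message: h_n = x_n^3 * a^2 - 1.
        h = x * x * x * a2 - 1.0
        # Truncated Taylor expansion of (1 + h)^(-1/3).
        x = x * (1.0 - h * (1.0 / 3.0) + h * h * (2.0 / 9.0) - h * h * h * (14.0 / 81.0))
    # a^(1/3) = a * a^(-2/3)
    return a * x


if __name__ == "__main__":
    a = 5.0
    # Stand-in seed: the exact value perturbed by 0.1 percent.
    seed = a ** (-2.0 / 3.0) * 1.001
    print(cbrt_newton_sketch(a, seed), a ** (1.0 / 3.0))
```

Because the update drops terms of order `h^4`, each step roughly raises the relative error to the fourth power, which is why the commit needs only two or three iterations after a seed accurate to about 21 bits.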

390 Commits