clang-p2996

Author	SHA1	Message	Date
Tue Ly	484319f497	[libc] Make expm1f correctly rounded when the targets have no FMA instructions. Add another exceptional value and fix the case when |x| is small. Performance tests with CORE-MATH project scripts: With FMA instructions on Ryzen 1700: ``` $ ./perf.sh expm1f LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE-MATH reciprocal throughput : 15.362 System LIBC reciprocal throughput : 53.194 LIBC reciprocal throughput : 14.595 $ ./perf.sh expm1f --latency LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE-MATH latency : 57.755 System LIBC latency : 147.020 LIBC latency : 60.269 ``` Without FMA instructions: ``` $ ./perf.sh expm1f LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE-MATH reciprocal throughput : 15.362 System LIBC reciprocal throughput : 53.300 LIBC reciprocal throughput : 18.020 $ ./perf.sh expm1f --latency LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE-MATH latency : 57.758 System LIBC latency : 147.025 LIBC latency : 70.304 ``` Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D123440	2022-06-03 15:57:48 -04:00
Tue Ly	614567a7bf	[libc] Automatically add -mfma flag for architectures supporting FMA. Detect if the architecture supports FMA instructions and if the targets depend on fma. Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D123615	2022-06-03 01:21:20 -04:00
Michael Jones	1170951c73	[libc] add uint128 implementation Some platforms don't support proper 128 bit integers, but some algorithms use them, such as any that use long doubles. This patch modifies the existing UInt class to support the necessary operators. This does not put this new class into use, that will be in followup patches. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D124959	2022-05-12 11:16:53 -07:00
Tue Ly	c5f8a0a1e9	[libc] Add support for x86-64 targets that do not have FMA instructions. Make FMA flag checks more accurate for x86-64 targets, and refactor polyeval to use multiply and add instead when FMA instructions are not available. Reviewed By: michaelrj, sivachandra Differential Revision: https://reviews.llvm.org/D123335	2022-04-08 14:12:24 -04:00
Tue Ly	a5466f0436	[libc] Improve the performance of expm1f. Improve the performance of expm1f: - Rearrange the selection logic for different cases to improve the overall throughput. - Use the same degree-4 polynomial for large inputs as `expf` (https://reviews.llvm.org/D122418), reduced from a degree-7 polynomial. Performance benchmark using perf tool from CORE-MATH project (https://gitlab.inria.fr/core-math/core-math/-/tree/master): Before this patch: ``` $ ./perf.sh expm1f CORE-MATH reciprocal throughput : 15.362 System LIBC reciprocal throughput : 53.288 LIBC reciprocal throughput : 54.572 $ ./perf.sh expm1f --latency CORE-MATH latency : 57.759 System LIBC latency : 147.146 LIBC latency : 118.057 ``` After this patch: ``` $ ./perf.sh expm1f CORE-MATH reciprocal throughput : 15.359 System LIBC reciprocal throughput : 53.188 LIBC reciprocal throughput : 14.600 $ ./perf.sh expm1f --latency CORE-MATH latency : 57.774 System LIBC latency : 147.119 LIBC latency : 60.280 ``` Reviewed By: michaelrj, santoshn, zimmermann6 Differential Revision: https://reviews.llvm.org/D122538	2022-03-30 19:23:25 -04:00
Michael Jones	9276074271	[libc][obvious] Add mfma to log2f In the previous patch adding -mfma to functions that need it for windows builds I missed log2f. Differential Revision: https://reviews.llvm.org/D122693	2022-03-29 16:34:52 -07:00
Michael Jones	2f8829aba3	[libc] Add mfma option to functions that use fma On Windows the functions that use fma don't properly include the fma intrinsics unless -mfma is added to the compile options. This patch adds the compile option to all of the functions that need it. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D122689	2022-03-29 16:23:36 -07:00
Tue Ly	6168b42225	[libc] Improve the performance of expf. Reduce the polynomial's degree from 7 down to 4. Currently we use a degree-7 minimax polynomial on an interval of length 2^-7 around 0 to compute `expf`. Based on the suggestion of @santoshn and the RLIBM project (https://github.com/rutgers-apl/rlibm-all/blob/main/source/float/exp.c) and the improvement we made with `exp2f` in https://reviews.llvm.org/D122346, it is possible to have a good polynomial of degree-4 on a subinterval of length 2^(-7) to approximate e^x. We did try to either reduce the degree of the polynomial down to 3 or increase the interval size to 2^(-6), but in both cases the number of exceptional values exploded. So we settle on a degree-4 polynomial on an interval of size 2^(-7) around 0. Reviewed By: sivachandra, zimmermann6, santoshn Differential Revision: https://reviews.llvm.org/D122418	2022-03-25 12:20:20 -04:00
Tue Ly	b9d87d7466	[libc] Improve the performance of exp2f. Reduce the range-reduction table size from 128 entries down to 64 entries, and reduce the polynomial's degree from 6 down to 4. Currently we use a degree-6 minimax polynomial on an interval of length 2^-7 around 0 to compute exp2f. Based on the suggestion of @santoshn and the RLIBM project (https://github.com/rutgers-apl/rlibm-prog/blob/main/libm/float/exp2.c) it is possible to have a good polynomial of degree-4 on a subinterval of length 2^(-6) to approximate 2^x. We did try to either reduce the degree of the polynomial down to 3 or increase the interval size to 2^(-5), but in both cases the number of exceptional values exploded. So we settle on a degree-4 polynomial on an interval of size 2^(-6) around 0. Reviewed By: michaelrj, sivachandra, zimmermann6, santoshn Differential Revision: https://reviews.llvm.org/D122346	2022-03-24 18:06:37 -04:00
Tue Ly	64af346b18	[libc] Implement expm1f function that is correctly rounded for all rounding modes. Implement expm1f function that is correctly rounded for all rounding modes. This is based on expf implementation. Exhaustive testing shows that using the expf implementation and subtracting 1.0 before rounding the final result to single precision gives correctly rounded results for all |x| > 2^-4, with 1 exception. When |x| < 2^-25, we use x + x^2 (implemented with a single fma). And for 2^-25 <= |x| <= 2^-4, we use a single degree-8 minimax polynomial generated by Sollya. Reviewed By: sivachandra, zimmermann6 Differential Revision: https://reviews.llvm.org/D121574	2022-03-15 10:24:56 -04:00
Tue Ly	58edd26255	[libc] Include -150 to the special cases at the beginning of exp2f function.	2022-03-14 10:06:27 -04:00
Tue Ly	64721a3312	[libc] Implement exp2f function that is correctly rounded for all rounding modes. Implement exp2f function that is correctly rounded for all rounding modes. Reviewed By: sivachandra, zimmermann6 Differential Revision: https://reviews.llvm.org/D121463	2022-03-14 09:42:37 -04:00
Tue Ly	38cadd90b7	[libc] Implement expf function that is correctly rounded for all rounding modes. Implement expf function that is correctly rounded for all rounding modes. Reviewed By: sivachandra, zimmermann6 Differential Revision: https://reviews.llvm.org/D121440	2022-03-11 07:16:47 -05:00
Tue Ly	76ec69a911	[libc] Remove the redundant header FPUtil/FEnvUtils.h Remove the redundant header FPUtil/FEnvUtils.h, use FPUtil/FEnvImpl.h header instead. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D120965	2022-03-04 14:09:47 -05:00
Siva Chandra Reddy	dd33f9cdef	[libc] Make the errno macro resolve to the thread local variable directly. With modern architectures having a thread pointer and language supporting thread locals, there is no reason to use a function intermediary to access the thread local errno value. The entrypoint corresponding to errno has been replaced with an object library as there is no formal entrypoint for errno anymore. Reviewed By: jeffbailey, michaelrj Differential Revision: https://reviews.llvm.org/D120920	2022-03-04 17:29:49 +00:00
Alex Brachet	64f5f6d759	[libc] Use '+' constraint on inline assembly As suggested by @mcgrathr in D118099 Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D119978	2022-02-17 03:00:17 +00:00
Tue Ly	f1ec99f973	[libc] Improve hypotf performance with different algorithm correctly rounded to all rounding modes. Algorithm for hypotf: compute (a*a + b*b) in double precision, then use Dekker's algorithm to find the rounding error, and then correct it after taking its square root. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D118157	2022-02-16 09:48:51 -05:00
Guillaume Chatelet	7e7ecef980	[libc] Replace type punning with bit_cast Although type punning is defined for union in C, it is UB in C++. This patch introduces a bit_cast function to convert between types in a safe way. This is necessary to get llvm-libc to compile with GCC. This patch is extracted from D119002. Differential Revision: https://reviews.llvm.org/D119145	2022-02-08 20:45:59 +00:00
Tue Ly	e5e93f60ee	[libc] Return a float NaN for log1pf instead of double NaN.	2022-02-07 21:07:09 -05:00
Tue Ly	9e7688c71e	[libc] Implement log1pf correctly rounded to all rounding modes. Implement log1pf correctly rounded to all rounding modes relying on logf implementation for exponent > 2^(-8). Reviewed By: sivachandra, zimmermann6 Differential Revision: https://reviews.llvm.org/D118962	2022-02-07 16:17:18 -05:00
Tue Ly	700aebaf74	[libc] Set default CXX_STANDARD to C++17 and let targets set their own standard if needed. CMAKE_CXX_STANDARD 14 is set in the llvm-project/llvm folder overriding all COMPILE_OPTIONS -std=c++17. We need to override the CXX_STANDARD property of the target in order to set the correct C++ standard flags. Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D118871	2022-02-04 09:59:21 -05:00
Tue Ly	ad4ee2d778	[libc] Refactor sqrt implementations and add tests for generic sqrt implementations. Re-apply https://reviews.llvm.org/D118173 with fix for aarch64. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D118433	2022-01-28 13:39:03 -05:00
Siva Chandra Reddy	4beba3a32a	[libc] Revert "Refactor sqrt implementations and add tests for generic sqrt implementations." This reverts commit `21c4c82c20`.	2022-01-27 21:06:14 +00:00
Tue Ly	21c4c82c20	[libc] Refactor sqrt implementations and add tests for generic sqrt implementations. Refactor sqrt implementations: - Move architecture specific instructions from `src/math/<arch>` to `src/__support/FPUtil/<arch>` folder. - Move generic implementation of `sqrt` to `src/__support/FPUtil/generic` folder and add it as a header library. - Use `src/__support/FPUtil/sqrt.h` for architecture/generic selections. - Add unit tests for generic implementation of `sqrt`. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D118173	2022-01-27 11:54:54 -05:00
Tue Ly	82df72cc67	[libc] Make logf function correctly rounded for all rounding modes. Make logf function correctly rounded for all rounding modes. Reviewed By: sivachandra, zimmermann6, santoshn, jpl169 Differential Revision: https://reviews.llvm.org/D118149	2022-01-25 15:22:21 -05:00
Alex Brachet	ce368e1aa5	[libc][NFC] Workaround clang assertion in inline asm The clobber list "cc" is added to inline assembly to workaround a clang assertion that triggers when building with a clang built with assertions enabled. See bug [53391](https://github.com/llvm/llvm-project/issues/53391). See https://godbolt.org/z/z3bc6a9PM showing functionally same output assembly. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D118099	2022-01-25 16:39:55 +00:00
Tue Ly	e581841e8c	[libc] Implement log10f correctly rounded for all rounding modes. Based on RLIBM implementation similar to logf and log2f. Most of the exceptional inputs are the exact powers of 10. Reviewed By: sivachandra, zimmermann6, santoshn, jpl169 Differential Revision: https://reviews.llvm.org/D118093	2022-01-25 10:33:39 -05:00
Tue Ly	1f3f90ab88	[libc] Make log2f correctly rounded for all rounding modes when FMA is not available. Add two more exceptional cases to log2f that arise when fma is not used for polyeval. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D117812	2022-01-20 16:16:11 -05:00
Tue Ly	d4baf3b132	[libc] Use get_round() instead of floating point tricks in generic hypot implementation. The floating point tricks used to get rounding mode require -frounding-math flag, which behaves differently on aarch64. Reverting back to use get_round instead. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D117824	2022-01-20 14:54:57 -05:00
Tue Ly	aad04534c4	[libc] Implement correct rounding with all rounding modes for hypot functions. Update the rounding logic for generic hypot function so that it will round correctly with all rounding modes. Reviewed By: sivachandra, zimmermann6 Differential Revision: https://reviews.llvm.org/D117590	2022-01-20 13:33:20 -05:00
Siva Chandra Reddy	75d2fcb03f	[libc] Add a naming rule for global constants. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D117645	2022-01-19 22:11:16 +00:00
Siva Chandra Reddy	d7c8d51f94	[libc][Obvious] Add -Wno-c++17-extensions to sinf, cosf and sincosf targets.	2022-01-19 06:22:17 +00:00
Tue Ly	b0cd3abf03	[libc] Remove as_double usage as constant initializations in sincosf implementation. Use hexadecimal floats with C++17 instead of as_double as floating point constant initializations. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D117628	2022-01-18 23:48:48 -05:00
Tue Ly	63d2df003e	[libc] Implement correctly rounded log2f based on RLIBM library. Implement log2f based on RLIBM library correctly rounded for all rounding modes. Reviewed By: sivachandra, michaelrj, santoshn, jpl169, zimmermann6 Differential Revision: https://reviews.llvm.org/D115828	2022-01-14 12:40:49 -05:00
Tue Ly	e11e973e68	[libc] Update exhaustive testing documentations.	2022-01-14 11:10:05 -05:00
Michael Jones	3e52096809	[libc][NFC] fix variable name A variable was named in a way that doesn't match the format. This patch renames it to match the format. Differential Revision: https://reviews.llvm.org/D116228	2021-12-23 10:42:30 -08:00
Tue Ly	9369aa1444	[libc][Obvious] Change func_ to <func>_ in add_math_function.md.	2021-12-17 13:32:51 -05:00
Tue Ly	d08a801b5f	[libc] Implement correctly rounded logf based on RLIBM library. Implement correctly rounded logf based on RLIBM library: https://people.cs.rutgers.edu/~sn349/rlibm/. Reviewed By: sivachandra, santoshn, jpl169, zimmermann6 Differential Revision: https://reviews.llvm.org/D115408	2021-12-16 13:43:15 -05:00
Tue Ly	a2b3e6bed8	[libc] Add documentation about how to add a math function to LLVM-libc. Add documentation about how to add a math function to LLVM-libc. Differential Revision: https://reviews.llvm.org/D115608	2021-12-16 12:12:21 -05:00
Tue Ly	08aa40b9e6	[libc] Add ADD_FMA_FLAG macro to add -mfma flag to functions that require it. Add ADD_FMA_FLAG macro to add -mfma flag to functions that require it. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D115572	2021-12-11 16:21:33 -05:00
Michael Jones	1c92911e9e	[libc] apply new lint rules This patch applies the lint rules described in the previous patch. There was also a significant amount of effort put into manually fixing things, since all of the templated functions, or structs defined in /spec, were not updated and had to be handled manually. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D114302	2021-12-07 10:49:47 -08:00
Guillaume Chatelet	cca8e1e415	[libc][NFC] Fix typo in CMakeLists documentation	2021-12-03 13:52:09 +01:00
Michael Jones	155f5a6dac	[libc][clang-tidy] fix namespace check for externals Up until now, all references to `errno` were marked with `NOLINT`, since it was technically calling an external function. This fixes the lint rules so that `errno`, as well as `malloc`, `calloc`, `realloc`, and `free` are all allowed to be called as external functions. All of the relevant `NOLINT` comments have been removed, and the documentation has been updated. Reviewed By: sivachandra, lntue, aaron.ballman Differential Revision: https://reviews.llvm.org/D113946	2021-11-30 11:44:24 -08:00
Siva Chandra Reddy	f362aea42d	[libc][NFC] Move utils/CPP to src/__support/CPP. The idea is to move all pieces related to the actual libc sources to the "src" directory. This allows downstream users to ship and build just the "src" directory. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D112653	2021-10-28 15:50:00 +00:00
Siva Chandra Reddy	ca6b354229	[libc] Add range reduction functions based on Payne and Hanek algorithm. These functions will be used in a future patch to implement trigonometric functions. Unit tests have been added, but to the libc-long-running-tests suite. The unit tests are long running because we compare against MPFR computations performed at 1280 bits of precision. Some cleanups or elimination of repeated patterns can be done as follow up changes. Differential Revision: https://reviews.llvm.org/D104817	2021-08-23 05:18:41 +00:00
Michael Jones	c120edc7b3	[libc][nfc] move ctype_utils and FPUtils to __support Some ctype functions are called from other libc functions (e.g. isspace is used in atoi). By moving ctype_utils.h to __support it becomes easier to include just the implementations of these functions. For these reasons the implementation for isspace was moved into ctype_utils as well. FPUtils was moved to simplify the build order, and to clarify which files are a part of the actual libc. Many files were modified to accommodate these changes, mostly changing the #include paths. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D107600	2021-08-06 17:29:41 +00:00
Siva Chandra Reddy	a58b2827fe	[libc] Add hardware implementations of x86_64 sqrt functions.	2021-06-14 21:25:37 +00:00
Tue Ly	4e5f8b4d8d	[libc] Add implementation of expm1f. Use expm1f(x) = exp(x) - 1 for |x| > ln(2). For |x| <= ln(2), divide it into 3 subintervals: [-ln2, -1/8], [-1/8, 1/8], [1/8, ln2] and use a degree-6 polynomial approximation generated by Sollya's fpminimax for each interval. Errors < 1.5 ULPs when we use fma to evaluate the polynomials. Differential Revision: https://reviews.llvm.org/D101134	2021-06-10 14:58:34 -04:00
Siva Chandra Reddy	7deb5ef44f	[libc][NFC] Instead of erroring, skip math targets with missing implementations. Fixes Aarch64 bot.	2021-05-13 19:22:11 +00:00
Siva Chandra Reddy	861dc75906	[libc] Add x86_64 implementations of double precision cos, sin and tan. The implementations use the x86_64 FPU instructions. These instructions are extremely slow compared to a polynomial based software implementation. Also, their accuracy falls drastically once the input goes beyond 2PI. To improve both the speed and accuracy, we will be taking the following approach going forward: 1. As a follow up to this CL, we will implement a range reduction algorithm which will expand the accuracy to the entire double precision range. 2. After that, we will replace the HW instructions with a polynomial implementation to improve the run time. After step 2, the implementations will be accurate, performant and target architecture independent. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D102384	2021-05-13 19:02:00 +00:00
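
Several entries above (c5f8a0a1e9, 1f3f90ab88, 484319f497) concern evaluating polynomials with or without FMA instructions. The following is a minimal sketch of that idea, not LLVM-libc's actual code: `multiply_add` and this `polyeval` are hypothetical helpers written for illustration, and the sketch assumes the `__FMA__` macro that GCC and Clang define when building with `-mfma` on x86-64.

```cpp
#include <cmath>

// Hypothetical helper (an assumption, not the library's API): one Horner step.
// Uses std::fma when the compiler targets hardware FMA, and falls back to a
// separate multiply and add otherwise, so no slow software fma is called.
inline double multiply_add(double x, double y, double z) {
#if defined(__FMA__)
  return std::fma(x, y, z); // single rounding
#else
  return x * y + z;         // two roundings
#endif
}

// Base case: a polynomial with a single coefficient.
inline double polyeval(double /*x*/, double a0) { return a0; }

// Horner's scheme: a0 + x * (a1 + x * (a2 + ...)).
template <typename... Ts>
inline double polyeval(double x, double a0, Ts... rest) {
  return multiply_add(x, polyeval(x, rest...), a0);
}
```

For example, `polyeval(2.0, 1.0, 2.0, 3.0)` evaluates 1 + 2x + 3x^2 at x = 2. Because the two code paths round differently, a correctly rounded function can need additional exceptional values when FMA is unavailable, which is the situation the log2f and expm1f commits above describe.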


91 Commits