Detect if the architecture supports FMA instructions and if
the targets depend on fma.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D123615
Some platforms don't support proper 128 bit integers, but some
algorithms use them, such as any that use long doubles. This patch
modifies the existing UInt class to support the necessary operators.
This does not put this new class into use, that will be in followup
patches.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D124959
Make FMA flag checks more accurate for x86-64 targets, and refactor
polyeval to use multiply and add instead when FMA instructions are not
available.
Reviewed By: michaelrj, sivachandra
Differential Revision: https://reviews.llvm.org/D123335
In the previous patch adding -mfma to functions that need it for windows
builds I missed log2f.
Differential Revision: https://reviews.llvm.org/D122693
On Windows the functions that use fma don't properly include the fma
intrinsics unless -mfma is added to the compile options. This patch adds
the compile option to all of the functions that need it.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D122689
Reduce the polynomial's degree from 7 down to 4.
Currently we use a degree-7 minimax polynomial on an interval of length 2^-7
around 0 to compute `expf`. Based on the suggestion of @santoshn and the RLIBM
project (https://github.com/rutgers-apl/rlibm-all/blob/main/source/float/exp.c)
and the improvement we made with `exp2f` in https://reviews.llvm.org/D122346,
it is possible to have a good polynomial of degree-4 on a subinterval of length
2^(-7) to approximate e^x.
We did try to either reduce the degree of the polynomial down to 3 or increase
the interval size to 2^(-6), but in both cases the number of exceptional values
exploded. So we settle with using a degree-4 polynomial of the interval of
size 2^(-7) around 0.
Reviewed By: sivachandra, zimmermann6, santoshn
Differential Revision: https://reviews.llvm.org/D122418
Reduce the range-reduction table size from 128 entries down to 64 entries, and
reduce the polynomial's degree from 6 down to 4.
Currently we use a degree-6 minimax polynomial on an interval of length 2^-7
around 0 to compute exp2f. Based on the suggestion of @santoshn and the RLIBM
project (https://github.com/rutgers-apl/rlibm-prog/blob/main/libm/float/exp2.c)
it is possible to have a good polynomial of degree-4 on a subinterval of length
2^(-6) to approximate 2^x.
We did try to either reduce the degree of the polynomial down to 3 or increase
the interval size to 2^(-5), but in both cases the number of exceptional values
exploded. So we settle with using a degree-4 polynomial of the interval of
size 2^(-6) around 0.
Reviewed By: michaelrj, sivachandra, zimmermann6, santoshn
Differential Revision: https://reviews.llvm.org/D122346
Implement expm1f function that is correctly rounded for all rounding modes. This is based on expf implementation.
From exhaustive testings, using expf implementation, and subtract 1.0 before rounding the final result to single precision
gives correctly rounded results for all |x| > 2^-4 with 1 exception. When |x| < 2^-25, we use x + x^2 (implemented with a
single fma). And for 2^-25 <= |x| <= 2^-4, we use a single degree-8 minimax polynomial generated by Sollya.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D121574
Implement exp2f function that is correctly rounded for all rounding modes.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D121463
Implement expf function that is correctly rounded for all rounding modes.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D121440
With modern architectures having a thread pointer and language supporting
thread locals, there is no reason to use a function intermediary to access
the thread local errno value.
The entrypoint corresponding to errno has been replaced with an object
library as there is no formal entrypoint for errno anymore.
Reviewed By: jeffbailey, michaelrj
Differential Revision: https://reviews.llvm.org/D120920
Algorithm for hypotf: compute (a*a + b*b) in double precision, then use Dekker's algorithm to find the rounding error, and then correcting it after taking its square-root.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D118157
Although type punning is defined for union in C, it is UB in C++.
This patch introduces a bit_cast function to convert between types in a safe way.
This is necessary to get llvm-libc compile with GCC.
This patch is extracted from D119002.
Differential Revision: https://reviews.llvm.org/D119145
CMAKE_CXX_STANDARD 14 is set in the llvm-project/llvm folder overriding all COMPILE_OPTIONS -std=c++17. We need to override the CXX_STANDARD property of the target in order to set the correct C++ standard flags.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D118871
Refactor sqrt implementations:
- Move architecture specific instructions from `src/math/<arch>` to `src/__support/FPUtil/<arch>` folder.
- Move generic implementation of `sqrt` to `src/__support/FPUtil/generic` folder and add it as a header library.
- Use `src/__support/FPUtil/sqrt.h` for architecture/generic selections.
- Add unit tests for generic implementation of `sqrt`.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D118173
Make logf function correctly rounded for all rounding modes.
Reviewed By: sivachandra, zimmermann6, santoshn, jpl169
Differential Revision: https://reviews.llvm.org/D118149
Based on RLIBM implementation similar to logf and log2f. Most of the exceptional inputs are the exact powers of 10.
Reviewed By: sivachandra, zimmermann6, santoshn, jpl169
Differential Revision: https://reviews.llvm.org/D118093
Add to log2f 2 more exceptional cases got when not using fma for polyeval.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D117812
The floating point tricks used to get rounding mode require -frounding-math flag, which behaves differently on aarch64. Reverting back to use get_round instead.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D117824
Update the rounding logic for generic hypot function so that it will round correctly with all rounding modes.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D117590
Use hexadecimal floats with C++17 instead of as_double as floating point constant initializations.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D117628
A variable was named in a way that doesn't match the format. This patch
renames it to match the format.
Differential Revision: https://reviews.llvm.org/D116228
Add ADD_FMA_FLAG macro to add -mfma flag to functions that requires it.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D115572
This patch applies the lint rules described in the previous patch. There
was also a significant amount of effort put into manually fixing things,
since all of the templated functions, or structs defined in /spec, were
not updated and had to be handled manually.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D114302
Up until now, all references to `errno` were marked with `NOLINT`, since
it was technically calling an external function. This fixes the lint
rules so that `errno`, as well as `malloc`, `calloc`, `realloc`, and
`free` are all allowed to be called as external functions. All of the
relevant `NOLINT` comments have been removed, and the documentation has
been updated.
Reviewed By: sivachandra, lntue, aaron.ballman
Differential Revision: https://reviews.llvm.org/D113946
The idea is to move all pieces related to the actual libc sources to the
"src" directory. This allows downstream users to ship and build just the
"src" directory.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D112653
These functions will be used in a future patch to implement
trigonometric functions. Unit tests have been added but to the
libc-long-running-tests suite. The unit tests long running because we
compare against MPFR computations performed at 1280 bits of precision.
Some cleanups or elimination of repeated patterns can be done as follow
up changes.
Differential Revision: https://reviews.llvm.org/D104817
Some ctype functions are called from other libc functions (e.g. isspace
is used in atoi). By moving ctype_utils.h to __support it becomes easier
to include just the implementations of these functions. For these
reasons the implementation for isspace was moved into
ctype_utils as well.
FPUtils was moved to simplify the build order, and to clarify which
files are a part of the actual libc.
Many files were modified to accomodate these changes, mostly changing
the #include paths.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D107600
Use expm1f(x) = exp(x) - 1 for |x| > ln(2).
For |x| <= ln(2), divide it into 3 subintervals: [-ln2, -1/8], [-1/8, 1/8], [1/8, ln2]
and use a degree-6 polynomial approximation generated by Sollya's fpminmax for each interval.
Errors < 1.5 ULPs when we use fma to evaluate the polynomials.
Differential Revision: https://reviews.llvm.org/D101134
The implementations use the x86_64 FPU instructions. These instructions
are extremely slow compared to a polynomial based software
implementation. Also, their accuracy falls drastically once the input
goes beyond 2PI. To improve both the speed and accuracy, we will be
taking the following approach going forward:
1. As a follow up to this CL, we will implement a range reduction algorithm
which will expand the accuracy to the entire double precision range.
2. After that, we will replace the HW instructions with a polynomial
implementation to improve the run time.
After step 2, the implementations will be accurate, performant and target
architecture independent.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D102384