Implement a correctly rounded `erff` function.

For `x >= 4`, `erff(x) = 1` for `FE_TONEAREST` or `FE_UPWARD`, and `0x1.fffffep-1` (the largest `float` below 1) for `FE_DOWNWARD` or `FE_TOWARDZERO`.

For `0 <= x < 4`, we divide the range into 32 sub-intervals of length `1/8`, and use a degree-15 odd polynomial to approximate `erff(x)` in each sub-interval:
```
  erff(x) ~ x * (c0 + c1 * x^2 + c2 * x^4 + ... + c7 * x^14).
```

For `x < 0`, we can use the same formula as above, since the odd part is factored out.

Performance tested with the `perf.sh` tool from the CORE-MATH project on an AMD Ryzen 9 5900X:

Reciprocal throughput (clock cycles / op)
```
$ ./perf.sh erff --path2
GNU libc version: 2.35
GNU libc release: stable

-- CORE-MATH reciprocal throughput -- with -march=native (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 11.790 + 0.182 clc/call; Median-Min = 0.154 clc/call; Max = 12.255 clc/call;

-- CORE-MATH reciprocal throughput -- with -march=x86-64-v2 (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 14.205 + 0.151 clc/call; Median-Min = 0.159 clc/call; Max = 15.893 clc/call;

-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 45.519 + 0.445 clc/call; Median-Min = 0.552 clc/call; Max = 46.345 clc/call;

-- LIBC reciprocal throughput -- with -mavx2 -mfma (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 9.595 + 0.214 clc/call; Median-Min = 0.220 clc/call; Max = 9.887 clc/call;

-- LIBC reciprocal throughput -- with -msse4.2 (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 10.223 + 0.190 clc/call; Median-Min = 0.222 clc/call; Max = 10.474 clc/call;
```

and latency (clock cycles / op):
```
$ ./perf.sh erff --path2
GNU libc version: 2.35
GNU libc release: stable

-- CORE-MATH latency -- with -march=native (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 38.566 + 0.391 clc/call; Median-Min = 0.503 clc/call; Max = 39.170 clc/call;

-- CORE-MATH latency -- with -march=x86-64-v2 (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 43.223 + 0.667 clc/call; Median-Min = 0.680 clc/call; Max = 43.913 clc/call;

-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 111.613 + 1.267 clc/call; Median-Min = 1.696 clc/call; Max = 113.444 clc/call;

-- LIBC latency -- with -mavx2 -mfma (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 40.138 + 0.410 clc/call; Median-Min = 0.536 clc/call; Max = 40.729 clc/call;

-- LIBC latency -- with -msse4.2 (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 44.858 + 0.872 clc/call; Median-Min = 0.814 clc/call; Max = 46.019 clc/call;
```

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153683
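The scheme described in this message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the patch: all function names are hypothetical, the evaluation is done in `double` by assumption, and the coefficients are the Taylor coefficients of `erf` around 0, standing in for the fitted per-interval tables that the message does not list. As a result, the polynomial part is only meaningful on the first sub-interval `[0, 1/8)`.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// 2/sqrt(pi), the leading factor of the erf Taylor series.
const double kTwoOverSqrtPi = 1.1283791670955126;

// ASSUMPTION: Taylor coefficients (-1)^n / (n! * (2n + 1)), n = 0..7, used as
// a stand-in for the real per-interval coefficients c0..c7. They are only
// accurate near 0, so this table covers the first sub-interval alone.
const double kTaylor[8] = {1.0,         -1.0 / 3.0,    1.0 / 10.0,   -1.0 / 42.0,
                           1.0 / 216.0, -1.0 / 1320.0, 1.0 / 9360.0, -1.0 / 75600.0};

// Index (0..31) of the length-1/8 sub-interval that contains |x|, for |x| < 4.
int sub_interval_index(float x) {
  float ax = std::fabs(x);
  assert(ax < 4.0f);
  return static_cast<int>(ax * 8.0f);  // scaling by 8 is exact in binary
}

// Evaluates x * (c[0] + c[1] * x^2 + ... + c[7] * x^14) with Horner's scheme.
// Because x is factored out, the result is odd in x and works for x < 0 too.
double eval_odd_poly(double x, const double *c) {
  double x2 = x * x;
  double p = c[7];
  for (int i = 6; i >= 0; --i)
    p = p * x2 + c[i];
  return x * p;
}

// Hypothetical helper: erff on the first sub-interval only, |x| < 1/8.
float erff_near_zero(float x) {
  assert(sub_interval_index(x) == 0);
  return static_cast<float>(kTwoOverSqrtPi * eval_odd_poly(x, kTaylor));
}

// Hypothetical helper: the saturated result for x >= 4, which depends on the
// current rounding mode because erf(x) is strictly below 1.
float erff_saturated(float x) {
  assert(x >= 4.0f);
  int mode = std::fegetround();
  if (mode == FE_DOWNWARD || mode == FE_TOWARDZERO)
    return std::nextafter(1.0f, 0.0f);  // largest float below 1, i.e. 1 - 2^-24
  return 1.0f;
}
```

A full implementation would select one of 32 coefficient rows with `sub_interval_index(x)` before calling something like `eval_odd_poly`; the single Taylor row here only shows the shape of that evaluation.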
Building and Testing LLVM libc on Windows
Setting Up Environment
To build LLVM libc on Windows, first build Clang using the following steps.
- Open Command Prompt in Windows.
- Set TEMP and TMP to a directory. Creating this path is necessary for a successful Clang build.
    - Create tmp under your preferred directory or under C:\src:

          cd C:\src
          mkdir tmp

    - In the Start menu, search for "environment variables for your account". Set TEMP and TMP to C:\src\tmp or the corresponding path elsewhere.
- Download Visual Studio Community.
- Install CMake and Ninja (optional; both are included in Visual Studio).
- Load the Visual Studio environment variables using this command. This is crucial as it allows you to use build tools like CMake and Ninja:

      "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvarsall.bat" amd64

  Note: Rerun this command every time you open a new Command Prompt window.
- If you have not used Git before, install Git for Windows. Check out the LLVM source tree from GitHub using:

      git clone https://github.com/llvm/llvm-project.git

- Ensure you have access to Clang, either by downloading it from LLVM Download or by building it yourself.
Building LLVM libc
In this section, Clang is used to compile LLVM libc, and the resulting libc is then built and tested.
- Create an empty build directory in C:\src or your preferred directory and cd to it using:

      mkdir libc-build
      cd libc-build

- Run the following CMake command to generate build files. LLVM libc must be built by Clang, so ensure Clang is specified as the C and C++ compiler.

      cmake -G Ninja ../llvm-project/llvm -DCMAKE_C_COMPILER=C:/src/clang-build/bin/clang-cl.exe -DCMAKE_CXX_COMPILER=C:/src/clang-build/bin/clang-cl.exe -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_FORCE_BUILD_RUNTIME=libc -DLLVM_ENABLE_PROJECTS=libc -DLLVM_NATIVE_ARCH=x86_64 -DLLVM_HOST_TRIPLE=x86_64-window-x86-gnu

  Some LLVM libc math unit tests check correctness and accuracy against results from the GNU MPFR library. If you want to run math tests which use MPFR, and MPFR is not installed in the default include and linker lookup directories on your machine, you can specify the MPFR install directory by passing an additional CMake option as follows:

      -DLLVM_LIBC_MPFR_INSTALL_PATH=<path/mpfr/install/dir>

  If the above option is specified, then ${LLVM_LIBC_MPFR_INSTALL_PATH}/include will be added to the include directories, and ${LLVM_LIBC_MPFR_INSTALL_PATH}/lib will be added to the linker lookup directories.

  NOTE: The GNU MPFR library depends on the GNU GMP library. If you specify the above option, then it will be assumed that GMP is also installed in the same directory or available in the default paths.
- Build LLVM libc using:

      ninja libc

- Run tests using:

      ninja checklibc