In full build mode, the fuzzing tests fail to build. This PR disabled MPFR related tests in full build
```
[2/4] Building CXX object projects/libc/fuzzing/stdio/CMakeFiles/libc.fuzzing.stdio.printf_float_conv_fuzz.dir/printf_float_conv_fuzz.cpp.o
FAILED: projects/libc/fuzzing/stdio/CMakeFiles/libc.fuzzing.stdio.printf_float_conv_fuzz.dir/printf_float_conv_fuzz.cpp.o
/usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_19_0_0_git -I/home/schrodingerzy/Documents/llvm/llvm-project/build/projects/libc/fuzzing/stdio -I/home/schrodingerzy/Documents/llvm/llvm-project/libc/fuzzing/stdio -I/home/schrodingerzy/Documents/llvm/llvm-project/libc -isystem /home/schrodingerzy/Documents/llvm/llvm-project/build/projects/libc/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fsanitize=fuzzer -O2 -g -DNDEBUG -std=c++17 -MD -MT projects/libc/fuzzing/stdio/CMakeFiles/libc.fuzzing.stdio.printf_float_conv_fuzz.dir/printf_float_conv_fuzz.cpp.o -MF projects/libc/fuzzing/stdio/CMakeFiles/libc.fuzzing.stdio.printf_float_conv_fuzz.dir/printf_float_conv_fuzz.cpp.o.d -o projects/libc/fuzzing/stdio/CMakeFiles/libc.fuzzing.stdio.printf_float_conv_fuzz.dir/printf_float_conv_fuzz.cpp.o -c /home/schrodingerzy/Documents/llvm/llvm-project/libc/fuzzing/stdio/printf_float_conv_fuzz.cpp
In file included from /home/schrodingerzy/Documents/llvm/llvm-project/libc/fuzzing/stdio/printf_float_conv_fuzz.cpp:19:
In file included from /home/schrodingerzy/Documents/llvm/llvm-project/libc/utils/MPFRWrapper/mpfr_inc.h:21:
In file included from /usr/include/mpfr.h:53:
In file included from /usr/include/gmp.h:35:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/iosfwd:38:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/requires_hosted.h:31:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/x86_64-pc-linux-gnu/bits/c++config.h:679:
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/x86_64-pc-linux-gnu/bits/os_defines.h:44:5: error: function-like macro '__GLIBC_PREREQ' is not defined
44 | #if __GLIBC_PREREQ(2,15) && defined(_GNU_SOURCE)
| ^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/x86_64-pc-linux-gnu/bits/os_defines.h:55:5: error: function-like macro '__GLIBC_PREREQ' is not defined
55 | #if __GLIBC_PREREQ(2, 26) \
| ^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/x86_64-pc-linux-gnu/bits/os_defines.h:66:6: error: function-like macro '__GLIBC_PREREQ' is not defined
66 | # if __GLIBC_PREREQ(2, 27)
| ^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/x86_64-pc-linux-gnu/bits/os_defines.h:78:6: error: function-like macro '__GLIBC_PREREQ' is not defined
78 | # if __GLIBC_PREREQ(2, 34)
| ^
In file included from /home/schrodingerzy/Documents/llvm/llvm-project/libc/fuzzing/stdio/printf_float_conv_fuzz.cpp:19:
In file included from /home/schrodingerzy/Documents/llvm/llvm-project/libc/utils/MPFRWrapper/mpfr_inc.h:21:
In file included from /usr/include/mpfr.h:53:
In file included from /usr/include/gmp.h:35:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/iosfwd:42:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/postypes.h:40:
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:64:11: error: no member named 'mbstate_t' in the global namespace
64 | using ::mbstate_t;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:143:11: error: no member named 'btowc' in the global namespace
143 | using ::btowc;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:144:11: error: no member named 'fgetwc' in the global namespace
144 | using ::fgetwc;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:145:11: error: no member named 'fgetws' in the global namespace
145 | using ::fgetws;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:146:11: error: no member named 'fputwc' in the global namespace
146 | using ::fputwc;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:147:11: error: no member named 'fputws' in the global namespace
147 | using ::fputws;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:148:11: error: no member named 'fwide' in the global namespace
148 | using ::fwide;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:149:11: error: no member named 'fwprintf' in the global namespace
149 | using ::fwprintf;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:150:11: error: no member named 'fwscanf' in the global namespace
150 | using ::fwscanf;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:151:11: error: no member named 'getwc' in the global namespace
151 | using ::getwc;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:152:11: error: no member named 'getwchar' in the global namespace
152 | using ::getwchar;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:153:11: error: no member named 'mbrlen' in the global namespace
153 | using ::mbrlen;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:154:11: error: no member named 'mbrtowc' in the global namespace
154 | using ::mbrtowc;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:155:11: error: no member named 'mbsinit' in the global namespace
155 | using ::mbsinit;
| ~~^
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/cwchar:156:11: error: no member named 'mbsrtowcs' in the global namespace
156 | using ::mbsrtowcs;
| ~~^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
```
Context: https://github.com/llvm/llvm-project/pull/87017
- Add proxy header `libc/hdr/math_macros.h` that will:
- include `<math.h>` in overlay mode,
- include `"include/llvm-libc-macros/math-macros.h"` in full build mode.
- Its corresponding CMake target `libc.hdr.math_macros` will only depend
on `libc.include.math` and `libc.include.llvm-libc-macros.math_macros`
in full build mode.
- Replace all `#include "include/llvm-libc-macros/math-macros.h"` with
`#include "hdr/math_macros.h"`.
- Add dependency to `libc.hdr.math_macros` CMake target when using
`add_fp_unittest`.
- Update the remaining dependency.
- Update bazel overlay: add `libc:hdr_math_macros` target, and replacing
all dependency on `libc:llvm_libc_macros_math_macros` with
`libc:hdr_math_macros`.
This script+config should help us generate more consistent documentation wrt.
what we currently support or not.
As an example usage:
$ ./libc/utils/docgen/docgen.py fenv.h
Will spit out an RST formatted table that can be copy+pasted into our docs.
The config is not filled out entirely, but doing so and then updating our docs
would be great beginner bugs for new contributors.
Having python+json generate things like docs, or headers (as imagined in
https://github.com/nickdesaulniers/llvm-project/tree/hdr-gen2) is perhaps
easier to work with than tablegen, and doesn't introduce a dependency on a host
tool that needs to be compiled from llvm sources before building the rest of
the libc. This can probably be merged with whatever we end up doing to replace
libc-hdrgen.
Please use
https://llvm.org/docs/CodingStandards.html#python-version-and-source-code-formatting
for keeping this file formatted.
Implements the functions `roundeven()`, `roundevenf()`, `roundevenl()`
from the roundeven family of functions introduced in C23. Also
implements `roundevenf128()`.
Summary:
This patch adds a temporary implementation that uses a struct-based
interface in lieu of varargs support. Once varargs support exists we
will move this implementation to the "real" printf implementation.
Conceptually, this patch has the client copy over its format string and
arguments to the server. The server will then scan the format string
searching for any specifiers that are actually a string. If it is a
string then we will send the pointer back to the server to tell it to
copy it back. This copied value will then replace the pointer when the
final formatting is done.
This will require a built-in extension to the varargs support to get
access to the underlying struct. The varargs used on the GPU will simply
be a struct wrapped in a varargs ABI.
The code does some (overly simple?) checks on file syntax. These checks
assume unix line endings and fail on windows. This commit updates the
code to strip extra whitespace, making the checks more robust,
particularly in the presence of windows line endings.
Fixes#86023
We compute atan2f(y, x) in 2 stages:
- Fast step: perform computations in double precision , with relative
errors < 2^-50
- Accurate step: if the result from the Fast step fails Ziv's rounding
test, then we perform computations in double-double precision, with
relative errors < 2^-100.
On Ryzen 5900X, worst-case latency is ~ 200 clocks, compared to average
latency ~ 60 clocks, and average reciprocal throughput ~ 20 clocks.
Summary:
The current implementation of RPC tied everything to device IDs and
forced us to do init / shutdown to manage some global state. This turned
out to be a bad idea in situations where we want to track multiple
hetergeneous devices that may report the same device ID in the same
process.
This patch changes the interface to instead create an opaque handle to
the internal device and simply allocates it via `new`. The user will
then take this device and store it to interface with the attached
device. This interface puts the burden of tracking the device identifier
to mapped d evices onto the user, but in return heavily simplifies the
implementation.
Fixes#86546 and removes the macro `LIBC_HAS_BUILTIN`. This was
necessary to support older compilers that did not support
`__has_builtin`. All of the compilers we support already have this
builtin.
See: https://libc.llvm.org/compiler_support.html
All uses now use `__has_builtin` directly
cc @nickdesaulniers
This patch updates the construction of packet headers to replace the
usage of ACQUIRE/RELEASE with SCACQUIRE/SCRELEASE which is now
recommended.
The patch also ensures consistency across kernel dispatches.
Reland of #84991
A downstream overlay mode user ran into issues with the isnan macro not
working in our sources with a specific libc configuration. This patch
replaces the last direct includes of math.h with our internal
math_macros.h, along with the necessary build system changes.
Summary:
The libc build has a few utilties that need to be built before we can do
everything in the full build. The one requirement currently is the
`libc-hdrgen` binary. If we are doing a full build runtimes mode we
first add `libc` to the projects list and then only use the `projects`
portion to buld the `libc` portion. We also use utilities for the GPU
build, namely the loader utilities. Previously we would build these
tools on-demand inside of the cross-build, which tool some hacky
workarounds for the dependency finding and target triple. This patch
instead just builds them similarly to libc-hdrgen and then passses them
in. We now either pass it manually it it was built, or just look it up
like we do with the other `clang` tools.
Depends on https://github.com/llvm/llvm-project/pull/84664
Summary:
We previously changed the data layout for the RPC buffer to make it lane
size agnostic. I put off changing the size for the server case to make
the patch smaller. This patch simply reorganizes code by making the lane
size an argument to the port rather than a templtae size. Heavily
simplifies a lot of code, no more `std::variant`.
Summary:
These values were incorrectly flipped, setting the size of the blocks to
the threads and vice-versa. When I originally wrote the thread utilities
it was using COV4 which used an implicit format. Then when I updated I
accidentally flipped them and never noticed because nothing depended on
the size of the threads until I checked it manually.
Towards the goal of getting `ninja libc-lint` back to green, fix the numerous
instances of:
warning: header guard does not follow preferred style [llvm-header-guard]
This is because many of our header guards start with `__LLVM` rather than
`LLVM`.
To filter just these warnings:
$ ninja -k2000 libc-lint 2>&1 | grep llvm-header-guard
To automatically apply fixits:
$ find libc/src libc/include libc/test -name \*.h | \
xargs -n1 -I {} clang-tidy {} -p build/compile_commands.json \
-checks='-*,llvm-header-guard' --fix --quiet
Some manual cleanup is still necessary as headers that were missing header
guards outright will have them inserted before the license block (we prefer
them after).
Summary:
Recent changes added an include path in the float128 type that used the
internal `libc` path to find the macro. This doesn't work once it's
installed because we need to search from the root of the install dir.
This patch adds "include/" to the include path so that our inclusion
of installed headers always match the internal use.
Summary:
This directory is leftover from when we handled both AMDGPU and NVPTX in
the same build and merged them into a pseudo triple. Now the only thing
it contains is the RPC server header. This gets rid of it, but now that
it's in the base install directory we should make it clear that it's an
LLVM libc header.
Summary:
This is a massive patch because it reworks the entire build and
everything that depends on it. This is not split up because various bots
would fail otherwise. I will attempt to describe the necessary changes
here.
This patch completely reworks how the GPU build is built and targeted.
Previously, we used a standard runtimes build and handled both NVPTX and
AMDGPU in a single build via multi-targeting. This added a lot of
divergence in the build system and prevented us from doing various
things like building for the CPU / GPU at the same time, or exporting
the startup libraries or running tests without a full rebuild.
The new appraoch is to handle the GPU builds as strict cross-compiling
runtimes. The first step required
https://github.com/llvm/llvm-project/pull/81557 to allow the `LIBC`
target to build for the GPU without touching the other targets. This
means that the GPU uses all the same handling as the other builds in
`libc`.
The new expected way to build the GPU libc is with
`LLVM_LIBC_RUNTIME_TARGETS=amdgcn-amd-amdhsa;nvptx64-nvidia-cuda`.
The second step was reworking how we generated the embedded GPU library
by moving it into the library install step. Where we previously had one
`libcgpu.a` we now have `libcgpu-amdgpu.a` and `libcgpu-nvptx.a`. This
patch includes the necessary clang / OpenMP changes to make that not
break the bots when this lands.
We unfortunately still require that the NVPTX target has an `internal`
target for tests. This is because the NVPTX target needs to do LTO for
the provided version (The offloading toolchain can handle it) but cannot
use it for the native toolchain which is used for making tests.
This approach is vastly superior in every way, allowing us to treat the
GPU as a standard cross-compiling target. We can now install the GPU
utilities to do things like use the offload tests and other fun things.
Some certain utilities need to be built with
`--target=${LLVM_HOST_TRIPLE}` as well. I think this is a fine
workaround as we
will always assume that the GPU `libc` is a cross-build with a
functioning host.
Depends on https://github.com/llvm/llvm-project/pull/81557
Summary:
The RPC interface needs to handle an entire warp or wavefront at once.
This is currently done by using a compile time constant indicating the
size of the buffer, which right now defaults to some value on the client
(GPU) side. However, there are currently attempts to move the `libc`
library to a single IR build. This is problematic as the size of the
wave fronts changes between ISAs on AMDGPU. The builitin
`__builtin_amdgcn_wavefrontsize()` will return the appropriate value,
but it is only known at runtime now.
In order to support this, this patch restructures the packet. Now
instead of having an array of arrays, we simply have a large array of
buffers and slice it according to the runtime value if we don't know it
ahead of time. This also somewhat has the advantage of making the buffer
contiguous within a page now that the header has been moved out of it.
Summary:
The RPC interface uses several ports to provide parallel access. Right
now we begin the search at the beginning, which heavily contests the
early ports. Using the SMID allows us to stagger the starting index
based off of the cluster identifier that is executing the current warp.
Multiple warps can share an SM, but it will guaruntee that the
contention for the low indices is lower.
This also increases the maximum port size to around 4096, this is
because 512 isn't enough to cover the full hardare parallelism needed to
guarantee this doesdn't deadlock.
The semantics for casting can range from "bitcast" (same representation)
to "different representation", to "type promotion". Here we remove the
cast operator and force usage of `get_val` as the only function to get
the floating point value, making the intent clearer and more consistent.
fixes https://github.com/llvm/llvm-project/issues/78743
- For normal objects, the patch removes `RTTI` and exceptions in `fullbuild`
- For FP tests, the patch adds links to `stdc++` and `gcc_s` if `MPFR` is used.
Prior to this change, we wouldn't build headers that aren't referenced
by other parts of the libc which would result in a build error during
installation. To address this, we make the header target a dependency of
the libc archive. Additionally, we also redo the install targets, moving
the install targets closer to build targets and simplifying the
hierarchy and generally matching what we do for other runtimes.
The support introduced in 675702f356b0c3a540fa2e8af4192f7d658b2988 is
not working correctly in all scenarios. Instead of setup_host_tool
function, we can use the existing targets introduced by add_tablegen
macro.
When crosscompiling tools for a different architecture, we need to build
native libc-hdrgen which can be achieved using the existing CMake
support for crosscompiling tablegen tools.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This one might be a bit controversial since the terminology has been
introduced from the start but I think `FRACTION_LEN` is a better name
here. AFAICT it really is "the number of bits after the decimal dot when
the number is in normal form."
`MANTISSA_WIDTH` is less precise as it's unclear whether we take the
leading bit into account.
This patch also renames most of the properties to use the `_LEN` suffix
and fixes useless casts or variables.
According to [wikipedia](https://en.wikipedia.org/wiki/Exponent_bias)
the "biased exponent" is the encoded form that is always positive
whereas the unbiased form is the actual "real" exponent that can be
positive or negative.
`FPBits` seems to be using `unbiased_exponent` to describe the encoded
form (unsigned). This patch simply use `biased` instead of `unbiased`.
Summary:
This pointer has been causing issues. Allocating and reading from coarse
memory on the CPU is not guaranteed and varies depending on the kernel
version and support. Previously we attempted to pin the memory but this
caused unexpected failures. This should be a legal operation and work
around the problem as fine-grained memory should be always legal to
write to by both sides.
Summary:
This may be problematic to pin a stack pointer. Allocate it via the OS
allocator instead as the documentation suggests.
For some reason, if you attempt to free this pointer after the memory
region has been unlocked, it will return an invalid pointer.
Summary:
Previously, we determined that coarse grained memory cannot be used in
the general case. That removed the buffer used to transfer the memory,
however we still had this lookup. Though we do not access the symbol
directly, it still conflicts with the agents apparently. Pin this as
well.
This resolves the problems @lntue was having with the `libc` GPU build.
Summary:
This portion of code handles mapping the RPC client memory over to the
device. HSA copies need to be between two slices of memory that HSA has
allocated. Previously we used coarse-grained memory to act as the host
source. However, the support for this varies depending on the kernel and
version and should not be relied upon. This patch changes that handling
to use the `hsa_amd_memory_lock` API to explicitly pin memory to a
location sufficient for a DMA transfer to the GPU.