This PR fixes broken links in all files describing libc usage modes.
Please let me know if there are any other places that need updating.
---------
Co-authored-by: shubhp@perlmutter <shubhp@perlmutter.com>
This PR fixes the feature detection for RISC-V floating-point support in
LLVM's libc implementation.
The `__riscv_flen` macro represents the floating-point register width in
bits (32, 64, or 128). Since Extension D is specifically documented as
implying F, we can use simple >= comparisons to detect them.
For half-precision support, the implementation checks for the Zfhmin
extension as RVA22 and RVA23 profiles only require Zfhmin rather than
the full Zfh extension. Zfh also implies Zfhmin, so checking for Zfhmin
should cover all cases.
isComplete previously meant different things for different conversion
directions.
Refactored bytes_processed to bytes_stored which now consistently
increments on every push and decrements on pop making both directions
more consistent with each other
Updating x87 floating point register significantly affect the
performance of the functions.
All the floating point exception reads will merge the results from both
mxcsr and x87 registers anyway.
If you enable `LIBC_CONF_PRINTF_FLOAT_TO_STR_USE_FLOAT320` and use a
`%f` style printf format directive to print a nonzero number too small
to show up in the output digits, e.g. `printf("%.2f", 0.001)`, then the
output would be intermittently incorrect, because
`DyadicFloat::as_mantissa_type_rounded` would try to shift the 320-bit
mantissa right by more than 320 bits, invoking the 'undefined behavior'
clause commented in the `shift()` function in `big_int.h`.
There were already tests in the libc test suite exercising this case,
e.g. the subnormal tests in `LlvmLibcSPrintfTest.FloatDecimalConv` use
`%f` at the default precision of 6 decimal places on tiny numbers such
as 2^-1027. But because the behavior is undefined, they don't visibly
fail all the time, and in all previous test runs we'd tried with
USE_FLOAT320, they had got lucky.
The fix is simply to detect an out-of-range right shift before doing it,
and instead just set the output value to zero.
In #94078, `write_to_stdout` had not been fully implemented. However,
now that it has been implemented, to conform with the C standard
(7.23.6.3. The printf function, specifically point 2), we use `stdout`.
This issue is tracked in #94685.
- Also prefer `static constexpr`
- Made it explicit that we are writing to `stdout`
Removed strcmp, strlen, and memset calls from table.h and replaced them
with internal functions.
---------
Co-authored-by: Sriya Pratipati <sriyap@google.com>
The printf parser uses errno for setting up the %m conversion. It was
presumably getting this include indirectly until a recent change. This
patch adds a direct dependency to fix it.
The previous internal fcntl implementation modified errno directly, this
patch fixes that. This patch also moves open and close into OSUtil since
they are used in multiple places. There are more places that need
similar cleanup but only got comments in this patch to keep it
relatively reviewable.
Related to: https://github.com/llvm/llvm-project/issues/143937
Fixes a couple of bugs found when building. The PR to enable the headers
can be found here: #144114.
- math.yaml: float128 guard
- wchar.yaml: __restrict keyword order
This reverts commit a93e55e57e and fixes
build and test failures:
* Proper include added to setvbuf_test.cpp
* fgetc/fgetc_unlocked/fgets tests are ported to ErrnoSetterMatcher and
are made more precise. This fixes inconsistencies between expectations
in regular and GPU builds - ErrnoSetterMatcher is configured to omit
errno matching on GPUs, as fgetc implementation on GPU doesn't set
errno, in contrast to Linux.
Reduce the direct use of libc_errno in stdio unit tests by adopting
ErrnoCheckingTest where appropriate.
Also removes the libc_errno.h inclusions from stdlib.h tests that were
accidentally added in d87eea35fa
Summary:
We need to set the bitfield memory to zero because the system does not
guarantee zeroed out memory. Even if fresh pages are zero, the system
allows re-use so we would need a `kfd` level API to skip this step.
Because we can't this patch updates the logic to perform the zero
initialization wave-parallel. This reduces the amount of time it takes
to allocate a fresh by up to a tenth.
This has the unfortunate side effect that the control flow is more
convoluted and we waste some extra registers, but it's worth it to
reduce the slab allocation latency.