We should use the LIBC wrappers for inline assembly more consistently.
It's much safer to mark all of these volatile as well.
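A minimal sketch of the idea, assuming a GCC- or Clang-compatible compiler. The macro name `SKETCH_INLINE_ASM` and the helper `opaque_value` are hypothetical and are not the library's actual wrappers; they only illustrate a wrapper that always applies `volatile`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper macro (not the library's real name). Every use is
// volatile, so the compiler may not delete or hoist the statement.
#define SKETCH_INLINE_ASM __asm__ __volatile__

// An empty asm statement that claims to read and modify `value`. The compiler
// has to treat `value` as opaque at this point and cannot fold across it.
inline uint32_t opaque_value(uint32_t value) {
  SKETCH_INLINE_ASM("" : "+r"(value) : : "memory");
  return value;
}

// Usage: the value round-trips unchanged; only the optimizer's view changes.
inline void opaque_value_example() { assert(opaque_value(42u) == 42u); }
```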
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D152294
ea8f4b9841 broke some build configurations because it was enabled by
default. Some people use a just-built libc/clang/LLVM to work on other
projects, where having a just-built LLVM libc in one of Clang's default
include directories can make things unusable.
Differential Revision: https://reviews.llvm.org/D152190
A previous patch added general support for printing via the RPC
interface. We should consolidate this functionality and get rid of the
old opcode that was used for simple testing.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D152211
If CUDA is not found this variable will expand into nothing. We need to
wrap it in quotes so that it is always a string, otherwise it will cause
build failures.
Differential Revision: https://reviews.llvm.org/D152209
This patch adds the initial support required for basic printing in
`stdio.h` via `puts` and `fputs`. This is done using the existing LLVM C
library `File` API. In this sense we can think of the RPC interface as
our system call to dump the character string to the file. We carry a
`uintptr_t` reference as our native "file descriptor" as it will be used
as an opaque reference to the host's version once functions like
`fopen` are supported.
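The "RPC as system call" idea can be sketched as follows. This is an illustration only: `FakeHost`, `SketchFile`, `sketch_fputs`, and `sketch_puts` are hypothetical names, and the real `File` API and RPC client look different. The point is that the device side holds nothing but an opaque `uintptr_t` handle and forwards the bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in for the host side of the RPC interface: it receives
// an opaque handle plus a byte range and appends the bytes to a buffer.
struct FakeHost {
  std::string out; // everything the "host" has written so far
  size_t write(uintptr_t handle, const char *data, size_t size) {
    (void)handle; // a real host would map this to its own stream
    out.append(data, size);
    return size;
  }
};

// Hypothetical device-side file: it only stores the opaque handle.
struct SketchFile {
  uintptr_t handle;
  FakeHost *host;
};

// fputs-like helper: forwards the whole string in one "system call".
inline int sketch_fputs(const char *str, SketchFile &file) {
  size_t len = std::strlen(str);
  return file.host->write(file.handle, str, len) == len ? 0 : -1;
}

// puts-like helper: same as above, plus the trailing newline.
inline int sketch_puts(const char *str, SketchFile &file) {
  if (sketch_fputs(str, file) != 0)
    return -1;
  return file.host->write(file.handle, "\n", 1) == 1 ? 0 : -1;
}
```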
For some unknown reason the declaration of the `StdIn` variable causes
both the AMDGPU and NVPTX backends to crash if I use the `READ` flag.
This is not used currently as we only support output now, but it needs
to be fixed.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D151282
This patch adds support for the `malloc` and `free` functions. These
currently aren't implemented in-tree so we first add the interface
files.
This patch provides the most basic support for a true `malloc` and
`free` by using the RPC interface. This is functional, but in the future
we will want to implement a more intelligent system and primarily use
the RPC interface more as a `brk()` or `sbrk()` interface only called
when absolutely necessary. We will need to design an intelligent
allocator in the future.
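One possible shape of the "`sbrk()`-like" design mentioned above, as a rough sketch rather than a proposal. `HostChunkSource` is a hypothetical stand-in for the RPC call (here it just calls the host `malloc`), and `BumpAllocator` never frees or reuses memory, which a real allocator would have to do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the RPC call that asks the host for more memory.
// It forwards to the host's malloc and counts how often it is called.
struct HostChunkSource {
  int calls = 0;
  void *grow(size_t bytes) {
    ++calls;
    return std::malloc(bytes);
  }
};

// Bump allocator that only contacts the "host" when the current chunk is
// exhausted. It never returns memory and abandons the tail of an old chunk.
class BumpAllocator {
  HostChunkSource &source;
  char *cursor = nullptr;
  size_t remaining = 0;
  static constexpr size_t CHUNK = 4096;
  static constexpr size_t ALIGN = alignof(std::max_align_t);

public:
  explicit BumpAllocator(HostChunkSource &s) : source(s) {}

  void *allocate(size_t size) {
    // Round up so every returned pointer stays suitably aligned.
    size = (size + ALIGN - 1) / ALIGN * ALIGN;
    if (size > remaining) {
      size_t request = size > CHUNK ? size : CHUNK;
      cursor = static_cast<char *>(source.grow(request));
      if (!cursor) {
        remaining = 0;
        return nullptr;
      }
      remaining = request;
    }
    void *result = cursor;
    cursor += size;
    remaining -= size;
    return result;
  }
};
```

With this layout, many small allocations share a single round trip to the host, which is the property the paragraph above is after.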
The semantics of these memory allocations will need to be checked. I am
somewhat iffy on the details. I've heard that HSA can allocate
asynchronously which seems to work with my tests at least. CUDA uses an
implicit synchronization scheme so we need to use an explicitly separate
stream from the one launching the kernel or the default stream. I will
need to test the NVPTX case.
I would appreciate it if anyone more experienced with the implementation details
here could chime in for the HSA and CUDA cases.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D151735
This is based on ideas from @nafi to:
- use a branchless version of 'cmp' for 'uint32_t',
- completely resolve the lexicographic comparison through vector
operations when wide types are available. We also get rid of byte
reloads and serializing '__builtin_ctzll'.
I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.
The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.
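For illustration, one common branchless formulation of a three-way `uint32_t` comparison is sketched below. It is not necessarily the formulation used in this patch, and the helper names (`load_be32`, `cmp_uint32`, `memcmp4`) are hypothetical. The bytes are read in big-endian order so that integer order matches lexicographic byte order.

```cpp
#include <cassert>
#include <cstdint>

// Reads four bytes as a big-endian integer, so that integer order matches
// lexicographic byte order regardless of the host's endianness.
inline uint32_t load_be32(const unsigned char *p) {
  return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
         (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Three-way comparison without an if/else: each comparison yields 0 or 1.
inline int cmp_uint32(uint32_t a, uint32_t b) {
  return static_cast<int>(a > b) - static_cast<int>(a < b);
}

// memcmp-like comparison of exactly four bytes.
inline int memcmp4(const void *lhs, const void *rhs) {
  return cmp_uint32(load_be32(static_cast<const unsigned char *>(lhs)),
                    load_be32(static_cast<const unsigned char *>(rhs)));
}
```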
Reviewed By: nafi3000
Differential Revision: https://reviews.llvm.org/D148717
This patch moves the location of libllvmlibc.a within the build tree to
within ./lib/<target triple>. This more closely matches the behavior of
other runtime builds and allows for clang in the same build tree to
automatically be able to link against llvmlibc since this path is by
default included by the driver.
Also removes the LIBC_BINARY_DIR CMake flag since it isn't used anywhere
in the tree (based on a quick grep).
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D151624
The C standard asserts that the `errno` value is an l-value thread local
integer. We cannot provide a generic thread local integer on the GPU
currently without some workarounds. Previously, we worked around this by
implementing the `errno` value as a special consumer class that made all
the writes disappear. However, this is problematic for internal tests.
Currently there are build failures because of this handling and it's
only likely to cause more problems the more we do this.
This patch instead makes the internal target used for testing export the
`errno` value as a simple global integer. This allows us to use and test
the `errno` interface correctly assuming we run with a single thread.
Because this is only used for the non-exported target we still do not
provide this feature in the version that users will use so we do not
need to worry about it being incorrect in general.
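The difference between the two approaches can be sketched like this. The names `DiscardingErrno`, `sketch_errno`, and `set_and_read` are hypothetical and do not match the library's internals; the sketch only shows why a write-discarding object is awkward for tests that set and then read the value.

```cpp
#include <cassert>

// Sketch of the earlier workaround: an object that accepts assignments and
// discards them, so errno-style writes compile but have no effect.
struct DiscardingErrno {
  void operator=(int) {}
  operator int() const { return 0; } // reads always observe "no error"
};

// Sketch of the testing-only alternative: a plain global integer. This is
// only correct when a single thread touches it.
static int sketch_errno = 0;

// A test-like helper that writes a value and reads it back.
template <typename Errno> int set_and_read(Errno &e, int value) {
  e = value;
  return e;
}
```

With the discarding object the helper always reads back 0, so a test cannot observe the value it just stored; with the plain global it reads back what was written.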
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D152015
There were regressions in the testing framework due to none of the
functioning buildbots having a 32 bit long. This allowed the 32 bit
version of the strtointeger function to go untested. This patch adds
tests for strtoint32 and strtoint64, which are internal testing
functions that use constant integer sizes. It also fixes the tests to
properly handle these situations.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D151935
Update implementation status table for Date and Time Functions to include different targets.
Reviewed By: jeffbailey
Differential Revision: https://reviews.llvm.org/D151809
This patch simply moves the special handling for `linux` files to a
subdirectory. This is done to make it easier in the future to extend
this support to targets (like the GPU) that will have different
dependencies.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D151231
This reverts commit d763c6e5e2.
Adds the patch by @hans from
https://github.com/llvm/llvm-project/issues/62719
This patch fixes the Windows build.
d763c6e5e2 reverted the reviews
D144509 [CMake] Bumps minimum version to 3.20.0.
This partly undoes D137724.
This change has been discussed on discourse
https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-cmake-version/66193
Note this does not remove work-arounds for older CMake versions, that
will be done in followup patches.
D150532 [OpenMP] Compile assembly files as ASM, not C
Since CMake 3.20, CMake explicitly passes "-x c" (or equivalent)
when compiling a file which has been set as having the language
C. This behaviour change only takes place if "cmake_minimum_required"
is set to 3.20 or newer, or if the policy CMP0119 is set to new.
Attempting to compile assembly files with "-x c" fails, however
this is worked around in many cases, as OpenMP overrides this with
"-x assembler-with-cpp", however this is only added for non-Windows
targets.
Thus, after increasing cmake_minimum_required to 3.20, this breaks
compiling the GNU assembly for Windows targets; the GNU assembly is
used for ARM and AArch64 Windows targets when building with Clang.
This patch unbreaks that.
D150688 [cmake] Set CMP0091 to fix Windows builds after the cmake_minimum_required bump
The build uses another mechanism to select the runtime.
Fixes #62719
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D151344
Along the way, a couple of additional things have been done:
1. Move `ErrnoSetterMatcher.h` to `test/UnitTest` as all other matchers live
there now.
2. `ErrnoSetterMatcher` ignores matching `errno` on GPUs.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D151129
This passed locally but unfortunately it seems some tests are not ready
to be made hermetic. Revert for now until we can investigate
specifically which tests are failing and mark those as `UNIT_TEST_ONLY`.
This reverts commit 417ea79e79.
This patch enables us to run the floating point tests as hermetic.
Importantly we now use the internal versions of the `fesetround` and
`fegetround` functions.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D151123
Strict warnings require explicit static_cast to counteract
default widening of types narrower than int.
Functions in header files should have vague linkage (inline
keyword), not internal linkage (static) or external linkage
(no inline keyword) even for template functions. Note these
don't use the LIBC_INLINE macro since this is only for test code.
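Both points can be illustrated with a small header-style sketch. The function names are hypothetical and chosen only for the example.

```cpp
#include <cassert>
#include <cstdint>

// `a + b` is computed as int because of integer promotion. The explicit cast
// documents the intended narrowing and keeps strict conversion warnings quiet.
inline uint8_t add_wrapping(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>(a + b);
}

// Header-style template helper declared `inline` rather than `static`.
template <typename T> inline T twice(T value) {
  return static_cast<T>(value + value);
}
```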
Reviewed By: abrachet
Differential Revision: https://reviews.llvm.org/D151494
It resolves to thread_local on all platforms except for the GPUs on which
it resolves to nothing. The use of thread_local in the source code has been
replaced with the new macro.
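A rough sketch of such a macro is shown below. The name `SKETCH_THREAD_LOCAL` is hypothetical, and the detection logic is an assumption based on the target macros `__AMDGPU__` and `__NVPTX__` that Clang predefines for those targets; the real definition may be structured differently.

```cpp
#include <cassert>

// Hypothetical macro name; the real macro and its detection logic may differ.
#if defined(__AMDGPU__) || defined(__NVPTX__)
#define SKETCH_THREAD_LOCAL
#else
#define SKETCH_THREAD_LOCAL thread_local
#endif

// Declared through the macro instead of spelling `thread_local` directly.
static SKETCH_THREAD_LOCAL int per_thread_counter = 0;

inline int bump_counter() { return ++per_thread_counter; }
```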
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D151486
This is an ongoing series of commits that are reformatting our
Python code. This catches the last of the Python files to
reformat. Since they were so few I bunched them together.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: jhenderson, #libc, Mordante, sivachandra
Differential Revision: https://reviews.llvm.org/D150784
We want to do this so that build systems like ninja don't end up running
the hermetic and unit tests in parallel. Running in parallel can cause
problems for tests which read/write disk files as the hermetic and unit
tests can end up stepping on each other.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D151291
This is largely a cosmetic change done with a few goals:
1. Reduce the conditionals in picking the correct set of tables for the
platform.
2. Avoid exposing, for example, Linux errors when building for non-Linux
platforms. This also prevents build failures when Linux errors are not
defined on the target non-Linux platform.
3. Some "_table" suffixes have been removed to avoid repeated
occurrence of "table" like "tables/linux_error_table.h".
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D151367
`getrandom` is implemented as a syscall.
We don't want to test the Linux implementation of the syscall. We just want to verify that it reacts as expected to sensible values.
Runtime before
```
[ RUN ] LlvmLibcGetRandomTest.InvalidFlag
[ OK ] LlvmLibcGetRandomTest.InvalidFlag (took 0 ms)
[ RUN ] LlvmLibcGetRandomTest.InvalidBuffer
[ OK ] LlvmLibcGetRandomTest.InvalidBuffer (took 0 ms)
[ RUN ] LlvmLibcGetRandomTest.ReturnsSize
[ OK ] LlvmLibcGetRandomTest.ReturnsSize (took 83 ms)
[ RUN ] LlvmLibcGetRandomTest.PiEstimation
[ OK ] LlvmLibcGetRandomTest.PiEstimation (took 9882 ms)
```
Runtime after
```
[ RUN ] LlvmLibcGetRandomTest.InvalidFlag
[ OK ] LlvmLibcGetRandomTest.InvalidFlag (took 0 ms)
[ RUN ] LlvmLibcGetRandomTest.InvalidBuffer
[ OK ] LlvmLibcGetRandomTest.InvalidBuffer (took 0 ms)
[ RUN ] LlvmLibcGetRandomTest.ReturnsSize
[ OK ] LlvmLibcGetRandomTest.ReturnsSize (took 0 ms)
[ RUN ] LlvmLibcGetRandomTest.CheckValue
[ OK ] LlvmLibcGetRandomTest.CheckValue (took 0 ms)
```
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D151336
Summary:
This was accidentally dropped from a previous patch following a rebase.
Fix it to where it's consistent.
Differential Revision: https://reviews.llvm.org/D151232
Currently we have the `send_n` and `recv_n` routines to stream data,
such as a string to print, to the other side. The first operation is to
send the size so the other side knows the number of bytes to receive.
However, this wasted 56 bytes that could've been sent. This meant that
small values, like the arguments to a function to call on the host for
example, needed to perform an extra send. This patch sends the first 56
bytes in the first packet and continues if necessary.
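The packing scheme can be sketched on the CPU as follows. This is a simplified model, not the RPC implementation: `Packet`, `send_n_sketch`, and `recv_n_sketch` are hypothetical names, and the 64-byte packet (one 8-byte size word plus 56 payload bytes in the first packet, 64 payload bytes in later ones) is an assumption derived from the numbers above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical 64-byte packet; the real buffer type may differ.
struct Packet {
  uint64_t data[8];
};

constexpr size_t INLINE_BYTES = sizeof(Packet) - sizeof(uint64_t); // 56

// The first packet carries the total size in its first word and up to 56
// payload bytes in the rest. Later packets carry up to 64 payload bytes.
inline std::vector<Packet> send_n_sketch(const void *src, uint64_t size) {
  const char *bytes = static_cast<const char *>(src);
  std::vector<Packet> packets;
  Packet first{};
  first.data[0] = size;
  size_t head = size < INLINE_BYTES ? size : INLINE_BYTES;
  std::memcpy(&first.data[1], bytes, head);
  packets.push_back(first);
  for (uint64_t idx = head; idx < size; idx += sizeof(Packet)) {
    Packet next{};
    uint64_t rest = size - idx;
    std::memcpy(next.data, bytes + idx,
                rest < sizeof(Packet) ? rest : sizeof(Packet));
    packets.push_back(next);
  }
  return packets;
}

// Reassembles the byte stream produced by send_n_sketch.
inline std::string recv_n_sketch(const std::vector<Packet> &packets) {
  uint64_t size = packets[0].data[0];
  std::string out(size, '\0');
  size_t idx = size < INLINE_BYTES ? size : INLINE_BYTES;
  std::memcpy(&out[0], &packets[0].data[1], idx);
  for (size_t p = 1; p < packets.size(); ++p, idx += sizeof(Packet)) {
    uint64_t rest = size - idx;
    std::memcpy(&out[idx], packets[p].data,
                rest < sizeof(Packet) ? rest : sizeof(Packet));
  }
  return out;
}
```

In this model, anything up to 56 bytes fits in a single packet, which is the case the patch optimizes for.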
Depends on D150992
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D151041
We provide the `send_n` and `recv_n` utilities as a generic way to
stream data between both sides of the process. This was previously
tested and performed as expected when using a string of constant size.
However, when the size was allowed to diverge between the threads in the
warp or wavefront this could deadlock. This did not occur on NVPTX
because of the use of the explicit warp sync. However, on AMD one of the
work items in the wavefront could continue executing and hit the next
`recv` call before the other threads, then we would deadlock as we
violated the RPC invariants.
This patch replaces the for loop with a thread ballot. This will cause
every thread in the warp or wavefront to continue executing the loop
until all of them can exit. This acts as a more explicit wavefront sync.
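The control flow can be modelled on the CPU as below. This is only a simulation under stated assumptions: no GPU intrinsics are used, `lockstep_iterations` is a hypothetical name, and the inner loops stand in for what the hardware ballot does across lanes. Each lane has a different amount of data, yet all lanes leave the loop in the same iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU-side simulation of a warp or wavefront. Each entry of `sizes` is the
// number of bytes one lane has to send; `step` is the bytes sent per packet.
// Returns how many lock-step iterations every lane executed.
inline int lockstep_iterations(const std::vector<uint64_t> &sizes,
                               uint64_t step) {
  assert(sizes.size() <= 64 && step > 0); // one ballot bit per lane
  std::vector<uint64_t> sent(sizes.size(), 0);
  int iterations = 0;
  for (;;) {
    // "Ballot": collect one bit per lane that still has work to do.
    uint64_t ballot = 0;
    for (size_t lane = 0; lane < sizes.size(); ++lane)
      if (sent[lane] < sizes[lane])
        ballot |= uint64_t(1) << lane;
    // Every lane sees the same ballot, so all lanes exit together.
    if (ballot == 0)
      break;
    // Lanes with data left make progress; finished lanes idle this round.
    for (size_t lane = 0; lane < sizes.size(); ++lane)
      if (ballot & (uint64_t(1) << lane))
        sent[lane] += step;
    ++iterations;
  }
  return iterations;
}
```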
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D150992
The AMDGPU backend has a built-in pass to lower constructors. We do this
manually in the `start.cpp` implementation so we can disable this to
keep the binaries smaller.
Differential Revision: https://reviews.llvm.org/D151213
With more tests added to LLVM libc each week we want to keep track of the unit tests' runtime, especially for low-end build bots.
The top offenders can be tracked with a bit of scripting (spoiler alert, mem function sweep tests are in the top ones)
```
ninja check-libc | grep "ms)" | awk '{print $(NF-1),$0}' | sort -nr | cut -f2- -d' '
```
Unfortunately this doesn't work for hermetic tests since `clock` is unavailable.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D151097