clang-p2996

Author	SHA1	Message	Date
Petr Hosek	0ab14951db	[NFC][libc] Use the new style includes for tests This was accidentally omitted from D154746.	2023-07-10 07:42:13 +00:00
Petr Hosek	fb149e4beb	[libc] Use the new style includes for tests This is a follow up to D154529 covering tests. Differential Revision: https://reviews.llvm.org/D154746	2023-07-08 05:15:44 +00:00
Petr Hosek	9654bc3960	Revert "[libc] Set include directories for the str_to_float test" This reverts commit `147c0640a3` since it broke GPU builds.	2023-07-07 21:25:23 +00:00
Petr Hosek	147c0640a3	[libc] Set include directories for the str_to_float test This test uses libc headers and needs to explicitly include them. Differential Revision: https://reviews.llvm.org/D154277	2023-07-07 20:33:54 +00:00
Joseph Huber	c850ea1498	[libc] Support fopen / fclose on the GPU This patch adds the necessary support for the fopen and fclose functions to work on the GPU via RPC. I added a new test that enables testing this with the minimal features we have on the GPU. I will update it once we have `fread` and `fwrite` to actually check the outputted strings. For now I just relied on checking manually via the output temp file. Reviewed By: JonChesterfield, sivachandra Differential Revision: https://reviews.llvm.org/D154519	2023-07-05 18:31:58 -05:00
Joseph Huber	7e88e26d38	[libc] Add GPU support for the 'inttypes.h' functions Another low hanging fruit we can put on the GPU, this ports the tests over to the hermetic framework so we can run them on the GPU. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D154540	2023-07-05 17:47:10 -05:00
Joseph Huber	515bd1c9b8	[libc][Obvious] Fix timing on AMDGPU not being initialized Summary: Reviewer requested that this routine not be a macro, however that means that it was not being initialized as the static initializer was done before the memcpy from the device. Fix this so we can get timing information.	2023-07-05 16:08:37 -05:00
Joseph Huber	80504b06ad	[libc][Obvious] Fix bad macro check on NVPTX tests Summary: I forgot to add the `defined()` check on NVPTX.	2023-07-05 15:54:12 -05:00
Joseph Huber	5db39796bf	[libc] Support timing information in libc tests This patch adds the necessary support to provide timing information in `libc` tests. This is useful for determining which tests take what amount of time. We also can use this as a test basis for providing more fine-grained timing when implementing things on the GPU. The main difficulty with this is the fact that the AMDGPU fixed frequency clock operates at an unknown frequency. We need to read this on a per-card basis from the driver and then copy it in. NVPTX on the other hand has a fixed clock at a resolution of 1ns. I have also increased the resolution of the print-outs as the majority of these are below a millisecond for me. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D154446	2023-07-05 14:27:08 -05:00
Siva Chandra	3db36d6a9b	[libc] Initialize the global pointer in riscv startup code. Reviewed By: mikhail.ramalho Differential Revision: https://reviews.llvm.org/D151539	2023-07-05 07:32:31 +00:00
Guillaume Chatelet	1c814c99aa	[libc] Improve memcmp latency and codegen This is based on ideas from @nafi to: - use a branchless version of 'cmp' for 'uint32_t', - completely resolve the lexicographic comparison through vector operations when wide types are available. We also get rid of byte reloads and serializing '__builtin_ctzll'. I did not include the suggestion to replace comparisons of 'uint16_t' with two 'uint8_t' as it did not seem to help the codegen. This can be revisited in subsequent patches. The code has been rewritten to reduce nested function calls, making the job of the inliner easier and preventing harmful code duplication. Reviewed By: nafi3000 Differential Revision: https://reviews.llvm.org/D148717	2023-06-30 13:00:58 +00:00
Tue Ly	de19101e33	[libc][NFC] Set rounding mode for sincosf exhaustive test.	2023-06-28 20:30:54 -04:00
Tue Ly	f320fefc4a	[libc][math] Implement erff function correctly rounded to all rounding modes. Implement correctly rounded `erff` functions. For `x >= 4`, `erff(x) = 1` for `FE_TONEAREST` or `FE_UPWARD`, `0x1.ffffep-1` for `FE_DOWNWARD` or `FE_TOWARDZERO`. For `0 <= x < 4`, we divide into 32 sub-intervals of length `1/8`, and use a degree-15 odd polynomial to approximate `erff(x)` in each sub-interval: ``` erff(x) ~ x * (c0 + c1 * x^2 + c2 * x^4 + ... + c7 * x^14). ``` For `x < 0`, we can use the same formula as above, since the odd part is factored out. Performance tested with `perf.sh` tool from the CORE-MATH project on AMD Ryzen 9 5900X: Reciprocal throughput (clock cycles / op) ``` $ ./perf.sh erff --path2 GNU libc version: 2.35 GNU libc release: stable -- CORE-MATH reciprocal throughput -- with -march=native (with FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 11.790 + 0.182 clc/call; Median-Min = 0.154 clc/call; Max = 12.255 clc/call; -- CORE-MATH reciprocal throughput -- with -march=x86-64-v2 (without FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 14.205 + 0.151 clc/call; Median-Min = 0.159 clc/call; Max = 15.893 clc/call; -- System LIBC reciprocal throughput -- [####################] 100 % Ntrial = 20 ; Min = 45.519 + 0.445 clc/call; Median-Min = 0.552 clc/call; Max = 46.345 clc/call; -- LIBC reciprocal throughput -- with -mavx2 -mfma (with FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 9.595 + 0.214 clc/call; Median-Min = 0.220 clc/call; Max = 9.887 clc/call; -- LIBC reciprocal throughput -- with -msse4.2 (without FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 10.223 + 0.190 clc/call; Median-Min = 0.222 clc/call; Max = 10.474 clc/call; ``` and latency (clock cycles / op): ``` $ ./perf.sh erff --path2 GNU libc version: 2.35 GNU libc release: stable -- CORE-MATH latency -- with -march=native (with FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 38.566 + 0.391 clc/call; Median-Min = 0.503 clc/call; Max = 39.170 clc/call; -- CORE-MATH latency -- with -march=x86-64-v2 (without FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 43.223 + 0.667 clc/call; Median-Min = 0.680 clc/call; Max = 43.913 clc/call; -- System LIBC latency -- [####################] 100 % Ntrial = 20 ; Min = 111.613 + 1.267 clc/call; Median-Min = 1.696 clc/call; Max = 113.444 clc/call; -- LIBC latency -- with -mavx2 -mfma (with FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 40.138 + 0.410 clc/call; Median-Min = 0.536 clc/call; Max = 40.729 clc/call; -- LIBC latency -- with -msse4.2 (without FMA instructions) [####################] 100 % Ntrial = 20 ; Min = 44.858 + 0.872 clc/call; Median-Min = 0.814 clc/call; Max = 46.019 clc/call; ``` Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D153683	2023-06-28 13:58:37 -04:00
Tue Ly	e9074d019e	[libc] Fix missing dependency and linking option for sqrtf exhaustive test.	2023-06-28 08:13:53 -04:00
Tue Ly	9532074a9d	[libc][math] Clean up exhaustive tests implementations. Clean up exhaustive tests. Let check functions return number of failures instead of passed/failed. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D153682	2023-06-28 07:58:46 -04:00
Jon Chesterfield	d4d8cd8446	[libc] Factor specifics of packet type out of process NFC. Simplifies process slightly, gives more options for testing it. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D153604	2023-06-23 03:45:23 +01:00
Jon Chesterfield	85c66f5d18	[libc] Instantiate and sanity check rpc class CMake plumbing cargo culted from other tests. Minor changes to Process to allow statically allocating a buffer. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D153594	2023-06-23 02:11:18 +01:00
Guillaume Chatelet	bd1cba9f4f	Revert D148717 "[libc] Improve memcmp latency and codegen" Once integrated in our codebase the patch triggered a bunch of failing tests. We do not yet understand where the bug is but we revert it to move forward with integration. This reverts commit `5e32765c15`.	2023-06-21 12:37:14 +00:00
Siva Chandra Reddy	75d70b7306	[libc] Make close function of the internal File class cleanup the file object. Before this change, a separate static method named cleanup was used to cleanup the file. Instead, now the close method cleans up the full file object using the platform's close function. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D153377	2023-06-21 05:05:04 +00:00
Mikhail R. Gadelha	a2df87c2b0	[libc] Fix libmath test compilation when using UInt<T> This patch: (1) adds the add_with_carry_const and sub_with_borrow_const constexpr calls to add and sub, respectively. Both add and sub are constexpr calls and were calling the non-constexpr version of add/sub_with_borrow. (2) adds explicit UIntType constructor calls in some fp tests. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D150223	2023-06-20 15:41:18 -03:00
Tue Ly	5dbd5118ec	[libc][math] Improve tanhf performance. Re-order exceptional branches and slightly adjust the evaluation. Performance tested with the CORE-MATH project on AMD EPYC 7B12 (clocks/op) Reciprocal throughputs: ``` --- BEFORE --- $ CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf [####################] 100 % (with -mavx2 -mfma) Ntrial = 20 ; Min = 7.794 + 0.102 clc/call; Median-Min = 0.066 clc/call; Max = 8.267 clc/call; [####################] 100 %. (with -msse4.2) Ntrial = 20 ; Min = 10.783 + 0.172 clc/call; Median-Min = 0.144 clc/call; Max = 11.446 clc/call; [####################] 100 %. (SSE2) Ntrial = 20 ; Min = 18.926 + 0.381 clc/call; Median-Min = 0.342 clc/call; Max = 19.623 clc/call; --- AFTER --- $ CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf [####################] 100 % (with -mavx2 -mfma) Ntrial = 20 ; Min = 6.598 + 0.085 clc/call; Median-Min = 0.052 clc/call; Max = 6.868 clc/call; [####################] 100 % (with -msse4.2) Ntrial = 20 ; Min = 9.245 + 0.304 clc/call; Median-Min = 0.248 clc/call; Max = 10.675 clc/call; [####################] 100 %. (SSE2) Ntrial = 20 ; Min = 11.724 + 0.440 clc/call; Median-Min = 0.444 clc/call; Max = 12.262 clc/call; ``` Latency: ``` --- BEFORE --- $ PERF_ARGS="--latency" CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf [####################] 100 % (with -mavx2 -mfma) Ntrial = 20 ; Min = 38.821 + 0.157 clc/call; Median-Min = 0.122 clc/call; Max = 39.539 clc/call; [####################] 100 %. (with -msse4.2) Ntrial = 20 ; Min = 44.767 + 0.766 clc/call; Median-Min = 0.681 clc/call; Max = 45.951 clc/call; [####################] 100 %. (SSE2) Ntrial = 20 ; Min = 55.055 + 1.512 clc/call; Median-Min = 1.571 clc/call; Max = 57.039 clc/call; --- AFTER --- $ PERF_ARGS="--latency" CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf [####################] 100 % (with -mavx2 -mfma) Ntrial = 20 ; Min = 36.147 + 0.194 clc/call; Median-Min = 0.181 clc/call; Max = 36.536 clc/call; [####################] 100 % (with -msse4.2) Ntrial = 20 ; Min = 40.904 + 0.728 clc/call; Median-Min = 0.557 clc/call; Max = 42.231 clc/call; [####################] 100 %. (SSE2) Ntrial = 20 ; Min = 55.776 + 0.557 clc/call; Median-Min = 0.542 clc/call; Max = 56.551 clc/call; ``` Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D153026	2023-06-20 09:25:07 -04:00
Siva Chandra Reddy	21e1651c0c	[libc] Remove the requirement of a platform-flush operation in File abstraction. The libc flush operation is not supposed to trigger a platform level flush operation. See "Notes" on this Linux man page: https://man7.org/linux/man-pages/man3/fflush.3.html Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D153182	2023-06-19 18:38:29 +00:00
Joseph Huber	485e2de6d5	[libc][nfc] Silence two warnings in tests These currently give warnings for unused variables or a default case where everything is covered. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D153137	2023-06-16 12:52:06 -05:00
Joseph Huber	ed34cb2cd7	[libc] Add a test for `fputs` to check using `stdout` and `stderr` This patch adds a test directly for the `fputs` function similar to the existing `puts` test. This lets us know that the default file pointers are functional and the `fputs` interface works. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D152288	2023-06-16 11:01:55 -05:00
Joseph Huber	7e8b0c27f2	[libc] Disable the strtod and strtold tests on NVPTX These tests have a single line that fails with a value off-by-one, see https://lab.llvm.org/buildbot/#/builders/46/builds/50055/steps/12/logs/stdio . Disable these for now so we can figure out what the error is later. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D153056	2023-06-15 13:29:42 -05:00
Joseph Huber	dcdfc963d7	[libc] Export GPU extensions to `libc` for external use The GPU port of the LLVM C library needs to export a few extensions to the interface such that users can interface with it. This patch adds the necessary logic to define a GPU extension. Currently, this only exports a `rpc_reset_client` function. This allows us to use the server in D147054 to set up the RPC interface outside of `libc`. Depends on https://reviews.llvm.org/D147054 Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152283	2023-06-15 11:02:24 -05:00
Tue Ly	53d4057622	[libc] Fix merging issue with test/src/math/exhaustive/expm1f_test	2023-06-14 11:00:13 -04:00
Tue Ly	055be3c30c	[libc] Enable hermetic floating point tests again. Fixing an issue with LLVM libc's fenv.h defining rounding mode macros differently from system libc, making get_round() return different values from fegetround(). Also letting math tests skip rounding modes that cannot be set. This should allow math tests to be run on platforms in which fenv.h is not implemented yet. This allows us to re-enable hermetic floating point tests in https://reviews.llvm.org/D151123 and revert https://reviews.llvm.org/D152742. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D152873	2023-06-14 10:53:35 -04:00
Guillaume Chatelet	9902fc8dad	[libc] Enable custom logging in LibcTest This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D152630	2023-06-14 13:37:50 +00:00
Guillaume Chatelet	bdb07c98c4	Revert D152630 "[libc] Enable custom logging in LibcTest" Failing buildbot https://lab.llvm.org/buildbot/#/builders/73/builds/49707 This reverts commit `9a7b4c9348`.	2023-06-14 10:31:49 +00:00
Guillaume Chatelet	9a7b4c9348	[libc] Enable custom logging in LibcTest This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D152630	2023-06-14 10:26:18 +00:00
Guillaume Chatelet	2cfae7cdf4	[libc] Dispatch memmove to memcpy when buffers are disjoint Most of the time `memmove` is called on buffers that are disjoint, in that case we can use `memcpy` which is faster. The additional test is branchless on x86, aarch64 and RISCV with the zbb extension (bitmanip). On x86 this patch adds a latency of 2 to 3 cycles. Before ``` -------------------------------------------------------------------------------- Benchmark Time CPU Iterations UserCounters... -------------------------------------------------------------------------------- BM_Memmove/0/0_median 5.00 ns 5.00 ns 10 bytes_per_cycle=1.25477/s bytes_per_second=2.62933G/s items_per_second=199.87M/s __llvm_libc::memmove,memmove Google A BM_Memmove/1/0_median 6.21 ns 6.21 ns 10 bytes_per_cycle=3.22173/s bytes_per_second=6.75106G/s items_per_second=160.955M/s __llvm_libc::memmove,memmove Google B BM_Memmove/2/0_median 8.09 ns 8.09 ns 10 bytes_per_cycle=5.31462/s bytes_per_second=11.1366G/s items_per_second=123.603M/s __llvm_libc::memmove,memmove Google D BM_Memmove/3/0_median 5.95 ns 5.95 ns 10 bytes_per_cycle=2.71865/s bytes_per_second=5.69687G/s items_per_second=167.967M/s __llvm_libc::memmove,memmove Google L BM_Memmove/4/0_median 5.63 ns 5.63 ns 10 bytes_per_cycle=2.28294/s bytes_per_second=4.78383G/s items_per_second=177.615M/s __llvm_libc::memmove,memmove Google M BM_Memmove/5/0_median 5.68 ns 5.68 ns 10 bytes_per_cycle=2.16798/s bytes_per_second=4.54295G/s items_per_second=176.015M/s __llvm_libc::memmove,memmove Google Q BM_Memmove/6/0_median 7.46 ns 7.46 ns 10 bytes_per_cycle=3.97619/s bytes_per_second=8.332G/s items_per_second=134.044M/s __llvm_libc::memmove,memmove Google S BM_Memmove/7/0_median 5.40 ns 5.40 ns 10 bytes_per_cycle=1.79695/s bytes_per_second=3.76546G/s items_per_second=185.211M/s __llvm_libc::memmove,memmove Google U BM_Memmove/8/0_median 5.62 ns 5.62 ns 10 bytes_per_cycle=3.18747/s bytes_per_second=6.67927G/s items_per_second=177.983M/s __llvm_libc::memmove,memmove Google W BM_Memmove/9/0_median 101 ns 101 ns 10 bytes_per_cycle=9.77359/s bytes_per_second=20.4803G/s items_per_second=9.9333M/s __llvm_libc::memmove,uniform 384 to 4096 ``` After ``` BM_Memmove/0/0_median 3.57 ns 3.57 ns 10 bytes_per_cycle=1.71375/s bytes_per_second=3.59112G/s items_per_second=280.411M/s __llvm_libc::memmove,memmove Google A BM_Memmove/1/0_median 4.52 ns 4.52 ns 10 bytes_per_cycle=4.47557/s bytes_per_second=9.37843G/s items_per_second=221.427M/s __llvm_libc::memmove,memmove Google B BM_Memmove/2/0_median 5.70 ns 5.70 ns 10 bytes_per_cycle=7.37396/s bytes_per_second=15.4519G/s items_per_second=175.399M/s __llvm_libc::memmove,memmove Google D BM_Memmove/3/0_median 4.47 ns 4.47 ns 10 bytes_per_cycle=3.4148/s bytes_per_second=7.15563G/s items_per_second=223.743M/s __llvm_libc::memmove,memmove Google L BM_Memmove/4/0_median 4.53 ns 4.53 ns 10 bytes_per_cycle=2.86071/s bytes_per_second=5.99454G/s items_per_second=220.69M/s __llvm_libc::memmove,memmove Google M BM_Memmove/5/0_median 4.19 ns 4.19 ns 10 bytes_per_cycle=2.5484/s bytes_per_second=5.3401G/s items_per_second=238.924M/s __llvm_libc::memmove,memmove Google Q BM_Memmove/6/0_median 5.02 ns 5.02 ns 10 bytes_per_cycle=5.94164/s bytes_per_second=12.4505G/s items_per_second=199.14M/s __llvm_libc::memmove,memmove Google S BM_Memmove/7/0_median 4.03 ns 4.03 ns 10 bytes_per_cycle=2.47028/s bytes_per_second=5.17641G/s items_per_second=247.906M/s __llvm_libc::memmove,memmove Google U BM_Memmove/8/0_median 4.70 ns 4.70 ns 10 bytes_per_cycle=3.84975/s bytes_per_second=8.06706G/s items_per_second=212.72M/s __llvm_libc::memmove,memmove Google W BM_Memmove/9/0_median 90.7 ns 90.7 ns 10 bytes_per_cycle=10.8681/s bytes_per_second=22.7739G/s items_per_second=11.02M/s __llvm_libc::memmove,uniform 384 to 4096 ``` Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D152811	2023-06-14 08:29:15 +00:00
Tue Ly	1557256ab0	[libc] Add Int<> type and fix (U)Int<128> compatibility issues. Add Int<> and Int128 types to replace the usage of __int128_t in math functions. Clean up to make sure that (U)Int128 and __(u)int128_t are interchangeable in the code base. Reviewed By: sivachandra, mikhail.ramalho Differential Revision: https://reviews.llvm.org/D152459	2023-06-13 09:40:48 -04:00
Joseph Huber	746e72910f	[libc] Fix floating point test failing to build on the GPU A patch enabled this test which uses `add_fp_unittest`. Unfortunately we do not support these on the GPU because it attempts to link in the floating point utils which are not built supporting hermetic tests. This was attempted to be fixed in D151123 but that had to be reverted. For now disable these so the tests pass. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D152742	2023-06-12 15:06:33 -05:00
Michael Jones	d3074f16a6	[libc] Add qsort_r This patch adds the reentrant qsort entrypoint, qsort_r. This is done by extending the qsort functionality and moving it to a shared utility header. For this reason the qsort_r tests focus mostly on the places where it differs from qsort, since they share the same sorting code. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D152467	2023-06-12 11:12:17 -07:00
Guillaume Chatelet	5e32765c15	[libc] Improve memcmp latency and codegen This is based on ideas from @nafi to: - use a branchless version of 'cmp' for 'uint32_t', - completely resolve the lexicographic comparison through vector operations when wide types are available. We also get rid of byte reloads and serializing '__builtin_ctzll'. I did not include the suggestion to replace comparisons of 'uint16_t' with two 'uint8_t' as it did not seem to help the codegen. This can be revisited in subsequent patches. The code has been rewritten to reduce nested function calls, making the job of the inliner easier and preventing harmful code duplication. Reviewed By: nafi3000 Differential Revision: https://reviews.llvm.org/D148717	2023-06-12 13:47:16 +00:00
Tue Ly	a982431295	[libc] Add platform independent floating point rounding mode checks. Many math functions need to check for floating point rounding modes to return correct values. Currently most of them use the internal implementation of `fegetround`, which is platform-dependent and blocks math functions from being enabled on platforms with unimplemented `fegetround`. In this change, we add platform independent rounding mode checks and switch math functions to use them instead. https://github.com/llvm/llvm-project/issues/63016 Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152280	2023-06-12 09:36:41 -04:00
Guillaume Chatelet	1ec995cc1c	Revert D148717 "[libc] Improve memcmp latency and codegen" This broke aarch64 debug buildbot https://lab.llvm.org/buildbot/#/builders/223/builds/21703 This reverts commit `bd4f978754`.	2023-06-12 08:32:00 +00:00
Guillaume Chatelet	bd4f978754	[libc] Improve memcmp latency and codegen This is based on ideas from @nafi to: - use a branchless version of 'cmp' for 'uint32_t', - completely resolve the lexicographic comparison through vector operations when wide types are available. We also get rid of byte reloads and serializing '__builtin_ctzll'. I did not include the suggestion to replace comparisons of 'uint16_t' with two 'uint8_t' as it did not seem to help the codegen. This can be revisited in subsequent patches. The code has been rewritten to reduce nested function calls, making the job of the inliner easier and preventing harmful code duplication. Reviewed By: nafi3000 Differential Revision: https://reviews.llvm.org/D148717	2023-06-12 07:56:23 +00:00
Guillaume Chatelet	8e44b849da	[libc][NFC] Introduce a Location object for consistent failure logging This is just an implementation detail. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152532	2023-06-10 06:58:15 +00:00
Guillaume Chatelet	0fd0b74289	[libc][NFC] Clean up matchers namespace This is a follow up to https://reviews.llvm.org/D152503 Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152533	2023-06-10 06:55:16 +00:00
Tue Ly	37458f6693	[libc][math] Move str method from FPBits class to testing utils. str method of FPBits class is only used for pretty printing its objects in tests. It brings cpp::string dependency to FPBits class, which is not ideal for embedded use case. We move str method to a free function in test utils and remove this dependency of FPBits class. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152607	2023-06-10 02:50:58 -04:00
Joseph Huber	168fa31816	[libc] Fix some tests on NVPTX due to insufficient stack size A few of these tests were disabled due to failing on NVPTX. After looking into it the vast majority of these cases were due to insufficient stack memory. This can be worked around by increasing the stack size in the loader or by reducing the memory usage in the case of large string constants. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D152583	2023-06-09 16:42:14 -05:00
Guillaume Chatelet	fd2c74c8ed	[libc][NFC] Simplify LibcTest and trim down string allocations This is a bit of cleanup before working on logging via stream operator (i.e., `EXPECT_XXX() << ...`). Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152503	2023-06-09 09:36:18 +00:00
Joseph Huber	63710fd529	[libc] Disable uint test on NVPTX GPUs This test started failing on Nvidia, we need to disable it to keep the bot green until we can investigate the root cause. Differential Revision: https://reviews.llvm.org/D152481	2023-06-08 17:54:27 -05:00
Michael Jones	5df182f121	[libc] disable printf Lf tests for float128 plats The results for the %Lf tests were calculated on 80 bit long double systems, meaning the results are not necessarily accurate for float128 systems. In future these will be fixed, but for the moment I'm just turning them off. Differential Revision: https://reviews.llvm.org/D152471	2023-06-08 14:35:51 -07:00
Michael Jones	688b9730d1	[libc] add options to printf decimal floats This patch adds three options for printf decimal long doubles, and these can also apply to doubles. 1. Use a giant table which is fast and accurate (but takes up ~5MB). 2. Use dyadic floats for approximations, which only gives ~50 digits of accuracy but is very fast. 3. Use large integers for approximations, which is accurate but very slow. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D150399	2023-06-08 14:23:15 -07:00
Tue Ly	b95ed8b6d9	[libc] Remove operator T from cpp::expected. The libc's equivalent of std::expected has a non-standard and non-explicit operator T - https://github.com/llvm/llvm-project/issues/62738 Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D152270	2023-06-06 13:57:44 -04:00
Joseph Huber	e6c401b5e8	[libc] Add initial support for 'puts' and 'fputs' to the GPU This patch adds the initial support required to support basic printing in `stdio.h` via `puts` and `fputs`. This is done using the existing LLVM C library `File` API. In this sense we can think of the RPC interface as our system call to dump the character string to the file. We carry a `uintptr_t` reference as our native "file descriptor" as it will be used as an opaque reference to the host's version once functions like `fopen` are supported. For some unknown reason the declaration of the `StdIn` variable causes both the AMDGPU and NVPTX backends to crash if I use the `READ` flag. This is not used currently as we only support output now, but it needs to be fixed. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D151282	2023-06-05 17:56:55 -05:00
Joseph Huber	a621308881	[libc] Implement basic `malloc` and `free` support on the GPU This patch adds support for the `malloc` and `free` functions. These currently aren't implemented in-tree so we first add the interface files. This patch provides the most basic support for a true `malloc` and `free` by using the RPC interface. This is functional, but in the future we will want to implement a more intelligent system and primarily use the RPC interface more as a `brk()` or `sbrk()` interface only called when absolutely necessary. We will need to design an intelligent allocator in the future. The semantics of these memory allocations will need to be checked. I am somewhat iffy on the details. I've heard that HSA can allocate asynchronously which seems to work with my tests at least. CUDA uses an implicit synchronization scheme so we need to use an explicitly separate stream from the one launching the kernel or the default stream. I will need to test the NVPTX case. I would appreciate if anyone more experienced with the implementation details here could chime in for the HSA and CUDA cases. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D151735	2023-06-05 17:56:53 -05:00

929 Commits