clang-p2996

Author	SHA1	Message	Date
Nick Desaulniers	315a5cce89	[libc] move __stack_chk_fail to src/ from startup/ (#75863 ) __stack_chk_fail should be provided by libc.a, not startup files. Add __stack_chk_fail to existing linux and arm entrypoints. On Windows (when not targeting MinGW), it seems that the corresponding function identifier is __security_check_cookie, so no entrypoint is added for Windows. Baremetal targets also ought to be compilable with `-fstack-protector`. There is no common header for this prototype, since calls to __stack_chk_fail are meant to be inserted by the compiler upon function return when compiled with `-fstack-protector`.	2023-12-19 11:05:12 -08:00
Schrodinger ZHU Yifan	6c1f56fdb5	[libc] expose aux vector (#75806 ) This patch lifts aux vector related definitions to app.h. Because startup's refactoring is in progress, this patch still contains duplicated changes. This problem will be addressed very soon in an incoming patch.	2023-12-18 12:27:30 -08:00
michaelrj-google	8180ea8694	[libc] Add bind function (#74014 ) This patch adds the bind function to go with the socket function. It also cleans up a lot of socket related data structures.	2023-12-12 13:36:11 -08:00
Schrodinger ZHU Yifan	86bde5adc8	[libc] implement prctl (#74386 ) Implement `prctl` as specified in https://man7.org/linux/man-pages/man2/prctl.2.html. This patch also includes test cases covering two simple use cases: 1. `PR_GET_NAME/PR_SET_NAME`: where userspace data is passed via arg2. 2. `PR_GET_THP_DISABLE`: where the return value is passed via the syscall retval.	2023-12-05 12:31:00 -08:00
Schrodinger ZHU Yifan	ff51b60b18	[libc] Revert #73704 and subsequent fixes #73984 , #74026 (#74355 ) The test cases of mincore require getting correct page size from OS. As `sysconf` is not functioning correctly, these patches are implemented in a somewhat confusing way. We revert such patches and will reintroduce mincore after we correct sysconf. This reverts `54878b8`, `985c0d1` and `418a3a4`.	2023-12-04 12:49:12 -08:00
Schrodinger ZHU Yifan	a0eda10947	[libc][NFC] unify startup library's code style with the rest (#74041 ) This PR unifies the startup library's code style with the rest of libc.	2023-12-04 10:31:18 -08:00
Schrodinger ZHU Yifan	418a3a4577	[libc][SysMMan] implement mincore (#73704 ) Implement `mincore` as specified in https://man7.org/linux/man-pages/man2/mincore.2.html	2023-11-30 14:22:36 -05:00
Schrodinger ZHU Yifan	e399a317ef	[libc] fix build on aarch64 (#73739 ) * avoid implicit narrowing conversion * move hsearch entrypoints to FULL_BUILD	2023-11-28 22:39:00 -05:00
Schrodinger ZHU Yifan	81e3e7e5d4	[libc] [search] implement hcreate(_r)/hsearch(_r)/hdestroy(_r) (#73469 ) This patch implements `hcreate(_r)/hsearch(_r)/hdestroy(_r)` as specified in https://man7.org/linux/man-pages/man3/hsearch.3.html. Notice that `neon/asimd` extension is not yet added in this patch. - The implementation is largely simplified from rust's [`hashbrown`](https://github.com/rust-lang/hashbrown/blob/master/src/raw/mod.rs) as we only consider fixed-size, insertion-only hashtables. Technical details are provided in code comments. - This patch also contains a portable string hash function, which is derived from [`aHash`](https://github.com/tkaitchuck/aHash)'s fallback routine. Without using any SIMD acceleration, it has good enough quality (passing all SMHasher tests) and is not too bad in speed. - Some general functionalities are added, such as `memory_size`, `offset_to`(alignment), `next_power_of_two`, `is_power_of_two`. `ctz/clz` are extended to support shorter integers.	2023-11-28 21:02:25 -05:00
Nishant Mittal	0c49fc4c68	[libc][math] Implement nexttoward functions (#72763 ) Implements the `nexttoward`, `nexttowardf` and `nexttowardl` functions. Also, raise excepts required by the standard in `nextafter` functions. cc: @lntue	2023-11-21 09:02:51 -05:00
michaelrj-google	4db99c8b54	[libc] Add base for target config within cmake (#72318 ) Currently the only way to add or remove entrypoints is to modify the entrypoints.txt file for the current target. This isn't ideal since a user would have to carry a diff for this file when updating their checkout. This patch adds a basic mechanism to allow the user to remove entrypoints without modifying the repository.	2023-11-17 11:32:27 -08:00
lntue	3f906f513e	[libc][math] Add initial support for C23 float128 math functions, starting with copysignf128. (#71731 )	2023-11-10 14:32:59 -05:00
doshimili	3153aa4c95	[libc] Adding a version of memset with software prefetching (#70857 ) Software prefetching helps recover performance when hardware prefetching is disabled. The 'LIBC_COPT_MEMSET_X86_USE_SOFTWARE_PREFETCHING' compile time option allows users to use this patch.	2023-11-10 10:56:16 +01:00
lntue	bc7a3bd864	[libc][math] Implement powf function correctly rounded to all rounding modes. (#71188 ) We compute `pow(x, y)` using the formula ``` pow(x, y) = x^y = 2^(y * log2(x)) ``` We follow similar steps as in `log2f(x)` and `exp2f(x)`, by breaking down into `hi + mid + lo` parts, in which `hi` parts are computed using the exponent field directly, `mid` parts will use look-up tables, and `lo` parts are approximated by polynomials. We add some speedup for common use-cases: ``` pow(2, y) = exp2(y) pow(10, y) = exp10(y) pow(x, 2) = x * x pow(x, 1/2) = sqrt(x) pow(x, -1/2) = rsqrt(x) - to be added ```	2023-11-06 16:54:25 -05:00
Joseph Huber	25bf1ae99b	[libc] Enable remaining string functions on the GPU (#68346 ) Summary: We previously had to disable these string functions because they were not compatible with the definitions coming from the GNU / host environment. The GPU, when exporting its declarations, has a very difficult requirement that it be compatible with the host environment as both sides of the compilation need to agree on definitions and what's present. This patch more or less gives up and just copies the definitions as expected by `glibc` if they are provided that way, otherwise we fall back to the accepted way. This is the alternative solution to an existing PR which instead disables GCC's handling.	2023-10-23 13:16:20 -04:00
Joseph Huber	630037ede4	[libc] Partially implement 'rand' for the GPU (#66167 ) Summary: This patch partially implements the `rand` function on the GPU. This is partial because the GPU currently doesn't support thread local storage or static initializers. To implement this on the GPU, I use 1/8th of the local / shared memory quota to treat the shared memory as thread local storage. This is done by simply allocating enough storage for each thread in the block and indexing into this based off of the thread id. The downside to this is that it does not initialize `srand` to `1` as the standard requires; it is also wasteful. In the future we should figure out a way to support TLS on the GPU so that this can be completely common and less resource intensive.	2023-10-19 17:01:43 -04:00
Anton Rydahl	c73ad025b1	[libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (#66034 ) This patch populates the GPU version of `libm` with missing vendor entrypoints. The vendor math entrypoints are disabled by default but can be enabled with the CMake option `LIBC_GPU_VENDOR_MATH=ON`.	2023-10-19 12:24:50 -07:00
alfredfo	74b0465fe9	[libc] Add simple features.h with implementation macro (#69402 ) In the future this should probably be autogenerated so it defines library version. See: Discussion in #libc https://discord.com/channels/636084430946959380/636732994891284500/1163979080979460176	2023-10-19 04:08:13 +02:00
Joseph Huber	ddc30ff802	[libc] Implement the 'ungetc' function on the GPU (#69248 ) Summary: This function follows closely with the pattern of all the other functions. That is, making a new opcode and forwarding the call to the host. However, this also required modifying the test somewhat. It seems that not all `libc` implementations follow the same error rules as are tested here, and it is not explicit in the standard, so we simply disable these EOF checks when targeting the GPU.	2023-10-17 13:02:31 -05:00
lntue	da28593d71	[libc][math] Implement double precision expm1 function correctly rounded for all rounding modes. (#67048 ) Implementing expm1 function for double precision based on exp function algorithm: - Reduced x = log2(e) * (hi + mid1 + mid2) + lo, where: * hi is an integer * mid1 * 2^-6 is an integer * mid2 * 2^-12 is an integer * |lo| < 2^-13 + 2^-30 - Then exp(x) - 1 = 2^hi * 2^mid1 * 2^mid2 * exp(lo) - 1 ~ 2^hi * (2^mid1 * 2^mid2 * (1 + lo * P(lo)) - 2^(-hi) ) - In the fast pass, P(lo) is a degree-3 Taylor polynomial of (e^lo - 1) / lo evaluated in double precision - If the Ziv accuracy test fails, we use a degree-6 Taylor polynomial of (e^lo - 1) / lo in double-double precision - If the Ziv accuracy test still fails, we re-evaluate everything in 128-bit precision.	2023-09-28 16:43:15 -04:00
Joseph Huber	cc2445589d	[libc] Fix wrapper headers for some ctype macros and C++ decls Summary: These wrapper headers need to work around things in the standard headers. The existing workarounds didn't correctly handle the macros for `isascii` and `toascii`. Additionally, `memrchr` can't be used because it has a different declaration for C++ mode. Fix this so it can be compiled.	2023-09-28 10:00:34 -05:00
Mikhail R. Gadelha	e3087c4b8c	[libc] Start to refactor riscv platform abstraction to support both 32 and 64 bits versions This patch enables the compilation of libc for rv32 by unifying the current rv64 and rv32 implementation into a single rv implementation. We updated the cmake file to match the new riscv32 arch and force LIBC_TARGET_ARCHITECTURE to be "riscv" whenever we find "riscv32" or "riscv64". This is required as LIBC_TARGET_ARCHITECTURE is used in the path for several platform specific implementations. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D148797	2023-09-26 12:32:25 -03:00
Joseph Huber	7ac8e26fc7	[libc] Implement `fseek`, `fflush`, and `ftell` on the GPU (#67160 ) Summary: This patch adds the necessary entrypoints to handle the `fseek`, `fflush`, and `ftell` functions. These are all very straightforward: we simply make RPC calls to the associated function on the other end. Implementing it this way allows us to more or less borrow the state of the stream from the server as we intentionally maintain no internal state on the GPU device. However, this does not implement the `errno` functionality so that must be ignored.	2023-09-26 09:46:46 -05:00
Guillaume Chatelet	b6bc9d72f6	[libc] Mass replace enclosing namespace (#67032 ) This is step 4 of https://discourse.llvm.org/t/rfc-customizable-namespace-to-allow-testing-the-libc-when-the-system-libc-is-also-llvms-libc/73079	2023-09-26 11:45:04 +02:00
michaelrj-google	a5a008ff4f	[libc] Refactor scanf reader to match printf (#66023 ) In a previous patch, the printf writer was rewritten to use a single writer class with a buffer and a callback hook. This patch refactors scanf's reader to match conceptually.	2023-09-22 12:50:02 -07:00
Joseph Huber	59896c168a	[libc] Remove the 'rpc_reset' routine from the RPC implementation (#66700 ) Summary: This patch removes the `rpc_reset` function. This was previously used to initialize the RPC client on the device by setting up the pointers to communicate with the server. The purpose of this was to make it easier to initialize the device for testing. However, this prevented us from enforcing an invariant that the buffers are all read-only from the client side. The expected way to initialize the server is now to copy it from the host runtime. This will allow us to maintain that the RPC client is in the constant address space on the GPU, potentially through inference, and improving caching behaviour.	2023-09-21 11:07:09 -05:00
Joseph Huber	b8f64431ea	[libc] Add GPU config file using the new format (#66635 ) Summary: This patch copies a config file for the GPU similar to the baremetal/embedded implementation. This will configure the implementations of functions like `sprintf` and `snprintf` to be compiled into more simple versions that can be run on the GPU. These functions cannot be enabled yet as Vararg support hasn't landed, but it will be used then.	2023-09-18 08:06:59 -05:00
Joseph Huber	a1be5d69df	[libc] Implement more input functions on the GPU (#66288 ) Summary: This patch implements the `fgets`, `getc`, `fgetc`, and `getchar` functions on the GPU. Their implementations are straightforward enough. One thing worth noting is that the implementation of `fgets` will be extremely slow due to the high latency to read a single char. A faster solution would be to make a new RPC call to call `fgets` (due to the special rule that newline or null breaks the stream). But this is left out because performance isn't the primary concern here.	2023-09-14 15:39:29 -05:00
Mikhail R. Gadelha	72e6f06119	[libc] Fix start up crash on 32 bit systems (#66210 ) This patch changes the default types of argc/argv so it's no longer a uint64_t in all systems, instead, it's now a uintptr_t, which fixes crashes in 32-bit systems that expect 32-bit types. This patch also adds two uintptr_t types (EnvironType and AuxEntryType) for the same reason. The patch also adds a PgrHdrTableType type behind an ifdef that's Elf64_Phdr in 64-bit systems and Elf32_Phdr in 32-bit systems.	2023-09-14 09:02:35 -04:00
Siva Chandra	17114f8b19	[libc] Remove common_libc_tuners.cmake and move options into config.json. (#66226 ) The name has been changed to adhere to the config option naming format. The necessary build changes to use the new option have also been made.	2023-09-13 22:17:00 -07:00
michaelrj-google	380eb46b13	[libc] Move long double table option to new config (#66151 ) This patch adds the long double table option for printf into the new configuration scheme. This allows it to be set for most targets but unset for baremetal.	2023-09-13 10:43:05 -07:00
Joseph Huber	bf85f27370	[libc] Implement 'qsort' and 'bsearch' on the GPU (#66230 ) Summary: This patch simply adds the necessary config to enable qsort and bsearch on the GPU. It is highly unlikely that anyone will use these, as they are single threaded, but we may as well support all entrypoints that we can.	2023-09-13 12:06:34 -05:00
Joseph Huber	60c0d303d6	[libc] Implement stdio writing functions for the GPU port (#65809 ) Summary: This patch implements fwrite, putc, putchar, and fputc on the GPU. These are very straightforward, the main difference for the GPU implementation is that we are currently ignoring `errno`. This patch also introduces a minimal smoke test for `putc` that is an exact copy of the `puts` test except we print the string char by char. This also modifies the `fopen` test to use `fwrite` to mirror its use of `fread` so that it is tested as well.	2023-09-09 13:27:07 -05:00
Siva Chandra Reddy	0f1507af41	[libc] Add a JSON based config option system. A few printf config options have been set up using this new config system along with their baremetal overrides. A follow up patch will add generation of doc/config.rst, which will contain the full list of libc config options and a short description explaining how they affect the libc. Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D159158	2023-09-05 14:19:18 +00:00
Petr Hosek	89fc7e52ab	[libc] Include (v)s(n)printf in baremetal configs These are commonly used on baremetal targets. We disable float support and other features to reduce the binary size. This would ideally eventually be handled using the proposed config mechanism: https://discourse.llvm.org/t/rfc-systematic-way-to-introduce-and-use-libc-config-options/72943 but for now we use a CMake conditional. Differential Revision: https://reviews.llvm.org/D159067	2023-09-04 02:39:02 +00:00
Joseph Huber	533145c458	[libc] Support 'assert.h' on the GPU This patch adds the necessary support to provide `assert` functionality through the GPU `libc` implementation. This implementation creates a special-case GPU implementation rather than relying on the common version. This is because the GPU has special considerations for printing. The assertion is printed out in chunks with `write_to_stderr`, however when combined with the GPU execution model this causes 32+ threads to all execute in lock-step, meaning that we'll get a horribly fragmented message. Furthermore, potentially thousands of threads could hit the assertion at once and try to print even if we had it all in one `printf`. This is solved by having a one-time lock that each thread group / wave / warp will attempt to claim. We only let one thread group pass through while the others simply stop executing. Finally only the first thread in that group will do the printing until we finally abort execution. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D159296	2023-08-31 15:04:43 -05:00
Joseph Huber	07102a1194	[libc] Implement the 'abort' function on the GPU This function implements the `abort` function on the GPU. The implementation here closely mirrors the `exit` call where we first synchronize with the RPC server to make sure it's listening and then we exit on the GPU. I was unsure if this should be a simple `__builtin_assert` on the GPU. I elected to go with an RPC approach to make this a more "true" `abort` call. That is, it should invoke some signal handlers and exit with the proper code according to the implemented C library on the server. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D159210	2023-08-31 08:40:15 -05:00
Joseph Huber	ca10bc4f41	[libc] Implement the 'nanosleep' function on the GPU The GPU has the ability to sleep for very short periods of time. We can map this to the existing `nanosleep` utility. This patch maps the nanosleep utility to the existing hardware instructions as best as possible. Depends on D159118 Reviewed By: JonChesterfield, sivachandra Differential Revision: https://reviews.llvm.org/D159225	2023-08-30 18:34:59 -05:00
Joseph Huber	30307a7bb7	[libc] Implement the 'clock()' function on the GPU This patch implements the `clock()` function on the GPU. This function is supposed to return a timestamp that can be converted into seconds using the `CLOCKS_PER_SEC` macro. The GPU has a fixed frequency timer that can be used for this purpose. However, there are some considerations. First is that AMDGPU does not have a statically known fixed frequency. I know internally that the gfx10xx and gfx11xx series use a 100 MHz clock which will probably remain for the future. Gfx9xx typically uses a 25 MHz clock except for the Vega 10 GPU. The only way to know for sure is to look it up from the runtime. For this purpose, I elected to default it to some known values and assign these to an externally visible symbol that can be initialized if needed. If we do not have a good guess we just return zero. Second is that the `CLOCKS_PER_SEC` macro only gives about a microsecond of resolution. POSIX demands that it's 1,000,000 so it's best that we keep with this tradition as almost all targets seem to respect this. The reason this is important is because on the GPU we will almost assuredly be copying the host's macro value (see the wrapper header) so we should go with the POSIX version that's most likely to be set. (We could probably make a warning if the included header doesn't match the expected value). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D159118	2023-08-30 16:16:34 -05:00
Tue Ly	76bb278ebb	[libc][math] Implement double precision exp10 function correctly rounded for all rounding modes. Implement double precision exp10 function correctly rounded for all rounding modes. Using the same algorithm as double precision exp (https://reviews.llvm.org/D158551) and exp2 (https://reviews.llvm.org/D158812) functions. Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D159143	2023-08-30 08:43:50 -04:00
Tue Ly	8ca614aa22	[libc][math] Implement double precision exp2 function correctly rounded for all rounding modes. Implement double precision exp2 function correctly rounded for all rounding modes. Using the same algorithm as double precision exp function in https://reviews.llvm.org/D158551. Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D158812	2023-08-25 10:15:08 -04:00
Siva Chandra	37f6e3c0e9	[libc] Add sys/time.h to the list of aarch64 headers. Differential Revision: https://reviews.llvm.org/D158809	2023-08-24 21:03:15 -07:00
Siva Chandra	82f41192e2	[libc] Add abort and __llvm_libc_syscall to aarch64 fullbuild config. Differential Revision: https://reviews.llvm.org/D158794	2023-08-24 16:26:15 -07:00
Tue Ly	434bf16084	[libc][math] Implement double precision exp function correctly rounded for all rounding modes. Implement double precision exp function correctly rounded for all rounding modes. Using 4 stages: - Range reduction: reduce to `exp(x) = 2^hi * 2^mid1 * 2^mid2 * exp(lo)`. - Use 64 + 64 LUT for 2^mid1 and 2^mid2, and use cubic Taylor polynomial to approximate `(exp(lo) - 1) / lo` in double precision. Relative error in this step is bounded by 1.5 * 2^-63. - If the rounding test fails, use degree-6 Taylor polynomial to approximate `exp(lo)` in double-double precision. Relative error in this step is bounded by 2^-99. - If the rounding test still fails, use degree-7 Taylor polynomial to compute `exp(lo)` in ~128-bit precision. Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D158551	2023-08-24 10:17:17 -04:00
Alfred Persson Forsberg	d3e045934a	Revert "[libc] Add limits.h" limits.h currently interferes with Clang's limits.h. include_next emits a warning because it is a GNU extension. Will re-add this once we figure out a good solution. This reverts commits `13bbca8d69`, `002cba0329`, and `0fb3066873`.	2023-08-17 06:21:50 +02:00
Joseph Huber	1e573f378c	[libc] Implement fopen, fclose, and fread on the GPU This patch implements the `fopen`, `fclose`, and `fread` functions on the GPU. These are pretty much re-implemented from what existed but using the new interface. Having this subset allows us to test the interface a bit more strenuously since we can write and read to a file. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D157622	2023-08-16 09:14:38 -05:00
Alfred Persson Forsberg	0fb3066873	[libc] Add limits.h This header contains implementation specific constants. The compiler already provides its own limits.h with numerical limits conforming to freestanding ISO C. But it is missing extensions like POSIX, and does for example not include <linux/limits.h> which is expected on a Linux system, therefore, an LLVM libc implementation of limits.h is needed for hosted (__STDC_HOSTED__) environments. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D156961	2023-08-14 01:35:44 +01:00
Joseph Huber	d04494ccc9	[libc] Rework the file handling for the GPU The GPU has much tighter requirements for handling IO functions. Previously we attempted to define the GPU as one of the platform files. Using a common interface allowed us to easily define these functions without much extra work. However, it became more clear that this was a poor fit for the GPU. The file interface uses function pointers, which prevented inlining and caused bad performance and resource usage on the GPU. Further, using an actual `FILE` type rather than referring to it as a host stub prevented us from using files coming from the host on the GPU device. After talking with @sivachandra, the approach now is to simply define GPU specific versions of the functions we intend to support. Also, we are ignoring `errno` for the time being as it is unlikely we will ever care about supporting it fully. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D157427	2023-08-09 14:42:20 -05:00
Michael Jones	16d5c24226	[libc] Add v variants of printf functions The v variants of the printf functions take their variadic arguments as a va_list instead of as individual arguments. They are otherwise identical to the corresponding printf variants. This patch adds them (vprintf, vfprintf, vsprintf, and vsnprintf) as well as tests. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D157138	2023-08-04 14:50:24 -07:00
Mikhail R. Gadelha	c9783d2bda	[libc] Add support to compile some syscalls on 32 bit platform This patch adds a bunch of ifdefs to handle the 32 bit versions of some syscalls, which often only append a 64 to the name of the syscall (with the exception of SYS_lseek -> SYS_llseek and SYS_futex -> SYS_futex_time64). This patch also tries to handle cases where wait4 is not available (as in riscv32): to implement wait, wait4 and waitpid when wait4 is not available, we check for alternative wait calls and ultimately rely on waitid to implement them all. In riscv32, only waitid is available, so we need it to support this platform. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D148371	2023-08-03 10:08:01 -03:00
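
Commit 16d5c24226 above describes the `v` variants of the printf functions as taking a `va_list` in place of individual variadic arguments. A minimal sketch of why that matters for callers follows: a variadic wrapper can only forward its arguments through a `v` variant. The helper name `log_format` and its `"[log] "` prefix are hypothetical and are not taken from the LLVM libc sources; only the standard `snprintf`, `vsnprintf`, `va_start`, and `va_end` are used.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of LLVM libc): writes "[log] " followed by
 * the formatted message into buf. Like snprintf, it returns the number of
 * characters the full output would need (excluding the terminator), or a
 * negative value on error. */
int log_format(char *buf, size_t size, const char *fmt, ...) {
  assert(buf != NULL && fmt != NULL);

  int prefix = snprintf(buf, size, "[log] ");
  if (prefix < 0)
    return prefix;

  /* Clamp the offset so we never write past the end of buf. */
  size_t used = (size_t)prefix < size ? (size_t)prefix : size;

  /* A variadic function cannot pass "..." on directly; it captures the
   * arguments in a va_list and hands that to the v variant. */
  va_list args;
  va_start(args, fmt);
  int body = vsnprintf(buf + used, size - used, fmt, args);
  va_end(args);

  if (body < 0)
    return body;
  return prefix + body;
}
```

Calling `log_format(buf, sizeof buf, "%s=%d", "x", 42)` with a sufficiently large buffer leaves `[log] x=42` in `buf`. With a buffer that is too small the output is truncated but still null-terminated, and the return value reports the length that would have been needed.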


446 Commits