clang-p2996

Author	SHA1	Message	Date
Joseph Huber	a4f553fcde	[libc] Fix using the `libcgpu.a` for NVPTX in non-LTO builds CUDA requires a PTX feature to be compiled generally, because the `libcgpu.a` archive contains LLVM-IR we need to have one present to compile it. Currently, the wrapper fatbinary format we use to incorporate these into single-source offloading languages has a special option to provide this. Since this was not present in the builds, if the user did not specify it via `-foffload-lto` it would not compile from CUDA or OpenMP due to the missing PTX features. Fix this by passing it to the packager invocation. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D154864	2023-07-10 13:54:47 -05:00
Petr Hosek	36c15be20b	[libc] Use LIBC_INCLUDE_DIR in CMake rules D152592 introduced LIBC_INCLUDE_DIR for the location of the include directory, use it in relevant CMake rules. Differential Revision: https://reviews.llvm.org/D154278	2023-07-10 07:32:24 +00:00
Petr Hosek	bf171aaa7a	Revert "[libc] Use LIBC_INCLUDE_DIR in CMake rules" This reverts commit `6e821f0b3a` since it broke the libc-aarch64-ubuntu-fullbuild-dbg bot.	2023-07-07 20:52:54 +00:00
Petr Hosek	6e821f0b3a	[libc] Use LIBC_INCLUDE_DIR in CMake rules D152592 introduced LIBC_INCLUDE_DIR for the location of the include directory, use it in relevant CMake rules. Differential Revision: https://reviews.llvm.org/D154278	2023-07-07 20:42:25 +00:00
Petr Hosek	e1cb5924cb	Revert "[libc] Use LIBC_INCLUDE_DIR in CMake rules" This reverts commit `046deabd93` since it broke libc-aarch64-ubuntu-fullbuild-dbg.	2023-07-05 17:20:11 +00:00
Petr Hosek	046deabd93	[libc] Use LIBC_INCLUDE_DIR in CMake rules D152592 introduced LIBC_INCLUDE_DIR for the location of the include directory, use it in relevant CMake rules. Differential Revision: https://reviews.llvm.org/D154278	2023-07-05 17:16:19 +00:00
Tue Ly	f9753ef189	[libc][Obvious] Fix a typo in setting FMA control option for RISCV64.	2023-06-02 11:15:29 -04:00
Joseph Huber	e7735a57b9	[libc] Correctly pass 'CXX_STANDARD' to the packaged GPU build We need to perform the GPU build separately. The `CXX_STANDARD` option was not being passed properly. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D149225	2023-04-26 16:52:31 -05:00
Joseph Huber	1c968f7a1f	[libc] Ignore unknown CUDA versions for `libc` targeting NVPTX Summary: We generally need a CUDA toolchain to build the tests for the GPU `libc` targeting NVPTX. However, clang will commonly emit warnings on versions that are too new. We can ignore these in `libc` since we are manually specifying the `+ptx` version to use whenever we compile. So we do not need to worry about unexpected changes and we do not depend on any newer features. So this should not be problematic.	2023-04-21 13:27:45 -05:00
Joseph Huber	a73cd00d87	[libc] Bump up sm_60's CUDA feature to +ptx63 Summary: The sm_60 GPU is the oldest model that's supported for using the RPC features of the `libc` GPU runtime. This also requires at least `+ptx63` to enable use of the active mask. So, this patch sets that as the minimum.	2023-04-21 13:27:45 -05:00
Joseph Huber	8704c3a31f	[libc] Set minimum CUDA PTX feature to +ptx60 Summary: The `+ptx` features correspond to the related CUDA version. We require a certain set of features from the `ptxas` assembler, which is tied to the CUDA version. Some of the ones set here were insufficient, so I am simply setting a cutoff to the CUDA 9.0 release as the minimum. This roughly corresponds to what should be required for sm_60 to be compiled with the source.	2023-04-20 18:01:01 -05:00
Joseph Huber	24214832fd	[libc] Fix `nvptx_options` variable not being reset in CMake Summary: This variable was not being reset, which caused the options to be compounded when building multiple architectures. This was very problematic as the architectures are not compatible.	2023-04-19 15:28:26 -05:00
Joseph Huber	e2356fb07e	[libc] Add special handling for CUDA PTX features The NVIDIA compilation path requires some special options. This is mostly because compilation is dependent on having a valid CUDA toolchain. We don't actually need the CUDA toolchain to create the exported `libcgpu.a` library because it's pure LLVM-IR. However, for some language features we need the PTX version to be set. This is normally set by checking the CUDA version, but without one installed it will fail to build. We instead choose a minimum set of features on the desired target, inferred from https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#release-notes and the PTX reference for functions like `nanosleep`. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D148532	2023-04-17 11:51:34 -05:00
Tue Ly	6edad0c8f0	[libc][RISCV] Let RISCV64 targets test implementations with and without FMA. Let RISCV64 targets test math implementations with and without FMA automatically. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D146730	2023-04-06 09:23:48 -04:00
Joseph Huber	9f5c6dcf59	[libc] Search for the CUDA path explicitly when testing The packaged version of the `libc` library does not depend on the CUDA installation because it only uses `clang` and emits LLVM-IR. However, for testing we directly need the CUDA toolkit to emit and execute the files. This patch explicitly passes `--cuda-path` to the relevant compilations for NVPTX testing. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D147653	2023-04-05 15:14:47 -05:00
Joseph Huber	2dc60b4ea4	[libc] Use LTO for AMDGPU compilation and linking Summary: The AMDGPU ABI isn't stable or well defined. For that reason we prefer to rely on LTO to ensure that multiple files get linked correctly. Currently the internal targets used for testing mix LLVM-IR and assembly. We should be consistent here.	2023-03-29 14:20:44 -05:00
Joseph Huber	dab75a4378	[libc] Remove leftover debug prints	2023-03-14 15:14:12 -05:00
Joseph Huber	ab107b3fac	[libc] Fix CMake deduplication `-Xclang` arguments Summary: We use `-Xclang` to pass the GPU binary to be embedded. In the case of multi-source objects this will be passed more than once, but CMake implicitly deduplicates arguments. Use the special generator to prevent this from happening.	2023-03-14 15:04:37 -05:00
Joseph Huber	597cef4486	[libc] Fix GPU fatbinary dependencies for multi-source object libraries Summary: Multi-source object libraries require some additional handling, this logic wasn't correctly setting the dependency on each filename individually and was instead using the last one. This meant that only the last file was built for multi-object libraries.	2023-03-14 15:04:37 -05:00
Joseph Huber	c2a17bff24	[libc] Set the stub filename to the target name instead of the source The GPU target requires some weird special case handling to create fat binaries. CMake offers no way to set the name of an object library. The only way to do this is to create a file with the desired name and use that. Currently we name it after the source filename. However, this breaks if there is more than a single source. This patch changes the logic to instead look up the object target name and use that. E.g. `src.__support.OSUtil.osutil` will be `osutil.cpp`. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D145912	2023-03-14 10:31:04 -05:00
Joseph Huber	a031f72187	[libc] Correctly pass the compile options to the internal GPU compilation Summary: We use an internal option to create the GPU binary used for testing. This wasn't getting the proper flags passed to it due to a missing variable name.	2023-03-14 08:19:13 -05:00
Joseph Huber	8a712bf7c4	[libc] Fix common compile options not getting passed to GPU object Summary: This variable was named incorrectly. We weren't getting needed flags passed to object library builds.	2023-03-10 16:54:20 -06:00
Siva Chandra Reddy	772e37f893	[libc] Add ALIAS option to add_object_library rule. This ALIAS option is now used with threads/callonce target. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D145409	2023-03-06 22:11:33 +00:00
Siva Chandra Reddy	bfeef8b794	[libc] Add a linting target named "libc-lint". Lint targets for individual entrypoints have also been cleaned up. The target "libc-lint" depends on the individual lint targets. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D144705	2023-02-24 19:55:43 +00:00
Joseph Huber	98697f4764	[libc] Fix LIBC_GPU_ARCHITECTURES not being used Summary: This variable is supposed to control the architectures to build for. At some point this was changed for testing and never fixed.	2023-02-21 13:00:16 -06:00
Joseph Huber	eb71ecfa10	[libc] Fix GPU include directories not being set properly Summary: For some reason, this variable was set after where it was used, causing weird behaviour with including the standard headers. Fix it.	2023-02-20 15:43:16 -06:00
Joseph Huber	4a872412d8	[libc] Fix dependencies for generating the GPU binary file This patch adjusts the way dependencies are handled in the packaged version of the GPU libc runtime. Previously the files were not getting updated properly in the install when they changed. Depends on D144214 Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D144280	2023-02-20 09:37:19 -06:00
Joseph Huber	d51d2b5909	[libc] Support add_object_library for the GPU build This patch unifies the handling of generating the GPU build targets between the `add_entrypoint_library` and the `add_object_library` functions. The `_build_gpu_objects` function will create two targets. One contains a single object file with several GPU binaries embedded in it, a so-called fatbinary. The other is a direct compile of the supported target to be used internally only. This patch pulls out some of the properties logic so that we can handle both more easily. This patch also required adding an override `NO_GPU_BUILD` for cases when we only want to build the source file as normal. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D144214	2023-02-20 09:37:18 -06:00
Guillaume Chatelet	c3228714cc	[libc][NFC] Make tuning macros start with LIBC_COPT_ Rename preprocessor definitions that control tuning of llvm libc. Differential Revision: https://reviews.llvm.org/D143913	2023-02-15 10:00:16 +00:00
Joseph Huber	5fde2d9951	[libc] Write stub files to a new directory to avoid conflicts Summary: This hack with stub files is used to make the final object archive have human-understandable names. We currently output these into the current binary directory, which sometimes interferes with the actual source file. Put these in their own directory to be certain they don't conflict.	2023-02-13 16:39:59 -06:00
Tue Ly	c1e252417e	[libc] Add -mavx2 together with -mfma to allow clang pre-12 to generate fma instructions. For clang-11, having -mfma without -mavx2 does not generate fma instructions, causing a build bot to fail on log10_test. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D143234	2023-02-03 15:12:09 -05:00
Joseph Huber	6d0e137358	[libc] Remove OpenMP and build the GPU libc directly The current `libcgpu.a` is actually an archive of fatbinaries. The host file contains nothing but a section called `LLVM_OFFLOADING` that contains embedded device code. This used to be handled implicitly by borrowing the OpenMP toolchain, which did this packaging internally. Passing the OpenMP flags causes problems with trying to move to testing. This patch pulls this logic out into the CMake and handles it manually. This patch is a lot of noise, but it fundamentally comes down to the following changes. 1. Build the source for every GPU architecture (GPU architectures are generally not backwards compatible) 2. Combine all of these files into a single binary blob 3. Embed that binary blob into a host file 4. Package these host files into a `.a` archive. 5. The device code will be extracted and managed by the offloading linker. Another important point. Right now we are maintaining an important distinction with the GPU build. That is, when we build the exported library we will build for many GPU architectures. However, the internal version will only be built for a single GPU architecture, one that was found on the user's system. This is intended to be used for internal testing, very similar to the current path where `libc` is compiled for a single target triple. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D143089	2023-02-02 09:47:03 -06:00
Siva Chandra Reddy	23872aae12	[libc] Add an off-by-default option to silence "skipping" messages from CMake. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D142802	2023-01-30 16:39:41 +00:00
Guillaume Chatelet	4e9ac30816	[reland][libc][NFC] Add -fno-lax-vector-conversions compilation flag Now that a3d2c344773cc4fc95136fd67245880b34d8e335 has been submitted.	2022-12-27 10:32:41 +00:00
Guillaume Chatelet	d065472c9e	Revert "[libc][NFC] Add -fno-lax-vector-conversions compilation flag" This breaks aarch64 build. This reverts commit `32f4c3f103`.	2022-12-27 08:30:19 +00:00
Guillaume Chatelet	32f4c3f103	[libc][NFC] Add -fno-lax-vector-conversions compilation flag	2022-12-27 08:25:32 +00:00
Guillaume Chatelet	6d9d387f73	Use -Wstrict-prototypes with clang only	2022-12-18 15:54:21 +01:00
Joseph Huber	55151e138d	[libc] Add initial support for a libc implementation for the GPU This patch contains the initial support for building LLVM's libc as a target for the GPU. Currently this only supports a handful of very basic functions that can be implemented without an operating system. The GPU code is built using the existing OpenMP toolchain. This allows us to minimally change the existing codebase and get a functioning static library. This patch allows users to create a static library called `libcgpu.a` that contains fat binaries containing device IR. Current limitations are the lack of test support and the fact that only one target OS can be built at a time. That is, the user cannot get a `libc` for Linux and one for the GPU simultaneously. This introduces two new CMake variables to control the behavior. `LLVM_LIBC_TARET_OS` is exported so the user can now specify it to equal `"gpu"`. `LLVM_LIBC_GPU_ARCHITECTURES` is also used to configure how many targets to build for at once. Depends on D138607 Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D138608	2022-11-29 14:51:54 -06:00
Siva Chandra Reddy	2e4ef9b6ef	[libc][NFC] Add a few compiler warning flags. A bunch of cleanup to suppress the new warnings is also done. Reviewed By: abrachet Differential Revision: https://reviews.llvm.org/D130723	2022-08-04 23:46:38 +00:00
Tue Ly	d883a4ad02	[libc] Implement sinf function that is correctly rounded to all rounding modes. Implement sinf function that is correctly rounded to all rounding modes. - We use a simple range reduction for `pi/16 < |x|` : Let `k = round(x / pi)` and `y = (x/pi) - k`. So `k` is an integer and `-0.5 <= y <= 0.5`. Then ``` sin(x) = sin(y*pi + k*pi) = (-1)^(k & 1) * sin(y*pi) ~ (-1)^(k & 1) * y * P(y^2) ``` where `y*P(y^2)` is a degree-15 minimax polynomial generated by Sollya with: ``` > P = fpminimax(sin(x*pi)/x, [|0, 2, 4, 6, 8, 10, 12, 14|], [|D...|], [0, 0.5]); ``` - Performance benchmark using perf tool from CORE-MATH project (https://gitlab.inria.fr/core-math/core-math/-/tree/master) on Ryzen 1700: Before this patch (not correctly rounded): ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinf CORE-MATH reciprocal throughput : 17.892 System LIBC reciprocal throughput : 25.559 LIBC reciprocal throughput : 29.381 ``` After this patch (correctly rounded): ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinf CORE-MATH reciprocal throughput : 17.896 System LIBC reciprocal throughput : 25.740 LIBC reciprocal throughput : 27.872 LIBC reciprocal throughput : 20.012 (with `-msse4.2` flag) LIBC reciprocal throughput : 14.244 (with `-mfma` flag) ``` Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D123154	2022-07-22 10:07:31 -04:00
Tue Ly	ed261e7106	[libc] Add float type and flag for nearest_integer to enable SSE4.2. Add float type and flag for nearest integer to automatically test with and without SSE4.2 flag. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D129916	2022-07-22 09:29:41 -04:00
Tue Ly	ae5c82502e	[libc][Obvious] Do not add __NO_ to targets with FLAG__NO suffix.	2022-06-30 10:45:59 -04:00
Guillaume Chatelet	aeccc16497	Re-land [libc] Apply no-builtin everywhere, remove unnecessary flags This is a reland of D126773 / `b2a9ea4420`. The removal of `-mllvm -combiner-global-alias-analysis` has landed separately in D128051 / `7b73f53790`. And the removal of `-mllvm --tail-merge-threshold=0` is scheduled for removal in a subsequent patch.	2022-06-22 12:30:20 +00:00
Guillaume Chatelet	4a6929f811	Revert "[libc] Apply no-builtin everywhere, remove unnecessary flags" This reverts commit `b2a9ea4420`.	2022-06-16 09:28:17 +00:00
Tue Ly	667863d8a8	[libc] Fix cmake compatibility issue with list(POP_FRONT). list(POP_FRONT) is only added to cmake in 3.15, while our base line version is 3.13 Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D127129	2022-06-06 13:36:03 -04:00
Tue Ly	614567a7bf	[libc] Automatically add -mfma flag for architectures supporting FMA. Detect if the architecture supports FMA instructions and if the targets depend on fma. Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D123615	2022-06-03 01:21:20 -04:00
Guillaume Chatelet	b2a9ea4420	[libc] Apply no-builtin everywhere, remove unnecessary flags Note, this is a re-submission of D125894 with `features = ["-header_modules"]` added to the main BUILD.bazel file. Some functions like `stpncpy` are implemented in terms of `memset` but are not currently using `-fno-builtin-memset`. This is somewhat hidden by the fact that we use `-ffreestanding` globally and that `-ffreestanding` implies `-fno-builtin` for Clang. This patch also removes `-mllvm -combiner-global-alias-analysis` that is Clang specific and that does not bring substantial gains on modern processors. Also we keep `-mllvm --tail-merge-threshold=0` for aarch64 in CMakeLists.txt but we omit it in the Bazel config. This is because Bazel consumes the source files directly and so it can use PGO to take optimal decisions locally. Differential Revision: https://reviews.llvm.org/D126773	2022-06-01 13:34:36 +00:00
Tue Ly	800051487f	[libc] Implement FLAGS option for generating all combinations for targets. Add FLAGS option for add_header_library, add_object_library, add_entrypoint_object, and add_libc_unittest. In general, a flag is a string provided for supported functions under the multi-valued option `FLAGS`. It should be one of the following forms: FLAG_NAME FLAG_NAME__NO FLAG_NAME__ONLY A target will inherit all the flags of its upstream dependency. When we create a target `TARGET_NAME` with a flag using (add_header_library, add_object_library, ...), its behavior will depend on the flag form as follows: - FLAG_NAME: The following 2 targets will be generated: `TARGET_NAME` that has `FLAG_NAME` in its `FLAGS` property. `TARGET_NAME.__NO_FLAG_NAME` that depends on `DEP.__NO_FLAG_NAME` if `TARGET_NAME` depends on `DEP` and `DEP` has `FLAG_NAME` in its `FLAGS` property. - FLAG_NAME__ONLY: Only generate 1 target `TARGET_NAME` that has `FLAG_NAME` in its `FLAGS` property. - FLAG_NAME__NO: Only generate 1 target `TARGET_NAME.__NO_FLAG_NAME` that depends on `DEP.__NO_FLAG_NAME` if `DEP` is in its DEPENDS list and `DEP` has `FLAG_NAME` in its `FLAGS` property. To show all the targets generated, pass SHOW_INTERMEDIATE_OBJECTS=ON to cmake. To show all the targets' dependency and flags, pass `SHOW_INTERMEDIATE_OBJECTS=DEPS` to cmake. To completely disable a flag FLAG_NAME expansion, set the variable `SKIP_FLAG_EXPANSION_FLAG_NAME=TRUE`. Reviewed By: michaelrj, sivachandra Differential Revision: https://reviews.llvm.org/D125174	2022-06-01 00:54:07 -04:00
Guillaume Chatelet	0443bfabe7	Revert "[libc] Apply no-builtin everywhere, remove unnecessary flags" This reverts commit `94d6dd9057`.	2022-05-20 14:37:17 +00:00
Guillaume Chatelet	94d6dd9057	[libc] Apply no-builtin everywhere, remove unnecessary flags Some functions like `stpncpy` are implemented in terms of `memset` but are not currently using `-fno-builtin-memset`. This is somewhat hidden by the fact that we use `-ffreestanding` globally and that `-ffreestanding` implies `-fno-builtin` for Clang. This patch also removes `-mllvm -combiner-global-alias-analysis` that is Clang specific and that does not bring substantial gains on modern processors. Also we keep `-mllvm --tail-merge-threshold=0` for aarch64 in CMakeLists.txt but we omit it in the Bazel config. This is because Bazel consumes the source files directly and so it can use PGO to take optimal decisions locally. Differential Revision: https://reviews.llvm.org/D125894	2022-05-19 09:08:42 +00:00
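
The three flag forms described in commit 800051487f (D125174) can be modelled roughly as follows. This is only a sketch of the naming rule as that commit message states it, written in Python rather than CMake. The function name `expand_flag` and the example flag `FMA_OPT` are assumptions made for illustration, and the sketch ignores dependency propagation and any later changes to the rule.

```python
def expand_flag(target_name: str, flag: str) -> list:
    """Return the target names generated for a single flag.

    Hypothetical helper: mirrors the FLAG_NAME / FLAG_NAME__ONLY /
    FLAG_NAME__NO forms described in the D125174 commit message.
    """
    if flag.endswith("__ONLY"):
        # Only the flagged variant is generated.
        return [target_name]
    if flag.endswith("__NO"):
        # Only the unflagged variant is generated.
        flag_name = flag[: -len("__NO")]
        return [f"{target_name}.__NO_{flag_name}"]
    # Plain FLAG_NAME: both variants are generated.
    return [target_name, f"{target_name}.__NO_{flag}"]


if __name__ == "__main__":
    for form in ("FMA_OPT", "FMA_OPT__ONLY", "FMA_OPT__NO"):
        print(form, "->", expand_flag("libc.src.math.sinf", form))
```

With two independent plain flags on one target, applying this rule once per flag would yield the "all combinations" the commit title refers to; that composition is not shown here.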

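The range reduction described in the `sinf` commit (d883a4ad02) can be checked numerically with a short Python sketch. It assumes double precision and substitutes `math.sin` for the minimax polynomial `y * P(y^2)`, whose coefficients are not given in the commit message, so it illustrates only the identity `sin(x) = (-1)^(k & 1) * sin(y*pi)`, not the correctly rounded implementation. The function names are hypothetical.

```python
import math


def reduce_sin(x: float):
    """Split x/pi into an integer k and a remainder y with |y| <= 0.5."""
    k = round(x / math.pi)
    y = x / math.pi - k
    return k, y


def sin_via_reduction(x: float) -> float:
    """Evaluate sin(x) through the reduction; math.sin stands in for y*P(y^2)."""
    k, y = reduce_sin(x)
    sign = -1.0 if (k & 1) else 1.0  # (-1)^(k & 1)
    return sign * math.sin(y * math.pi)


if __name__ == "__main__":
    for x in (0.25, 1.0, 3.0, -7.5, 42.0):
        print(x, sin_via_reduction(x), math.sin(x))
```

The two printed columns should agree to within floating-point rounding of the division by `pi`.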
70 Commits