This is to avoid err_target_unknown_abi which is caused by use default TargetTriple instead of shader model target triple.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D125585
`-gen-reproducer` causes crash reproduction to be emitted
even when clang didn't crash, and now can optionally take an
argument of never, on-crash (default), on-error and always.
Differential revision: https://reviews.llvm.org/D120201
-gen-reproducer causes crash reproduction to be emitted even
when clang didn't crash, and now can optionally take an argument
of never, on-crash (default), on-error and always.
Differential revision: https://reviews.llvm.org/D120201
The new driver uses an augmented linker wrapper to perform the device
linking phase, but to the user looks like a regular linker invocation.
Contrary to the old driver, the new driver contains all the information
necessary to produce a linked device image in the host object itself.
Currently, we infer the usage of the device linker by the user
specifying an offloading toolchain, e.g. (--offload-arch=...) or
(-fopenmp-targets=...), but this shouldn't be strictly necessary.
This patch introduces a new option `--offload-link` to tell
the driver to use the offloading linker instead. So a compilation flow
can now look like this,
```
clang foo.cu --offload-new-driver -fgpu-rdc --offload-arch=sm_70 -c
clang foo.o --offload-link -lcudart
```
I was considering if this could be merged into the `-fuse-ld` option,
but because the device linker wraps over the users linker it would
conflict with that. In the future it's possible to merge this into `lld`
completely or `gold` via a plugin and we would use this option to
enable the device linking feature. Let me know what you think for this.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D126398
The Clang driver additional stages to build a complete offloading
program for applications using CUDA or OpenMP offloading. This normally
requires either a source file input or a valid object file to be
handled. This would cause problems when trying to compile an assembly or
LLVM IR file through clang with flags that would enable offloading. This
patch simply adds a check to prevent the offloading toolchain from being
used if we don't have a valid source file.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125705
In order to do offloading compilation we need to embed files into the
host and create fatbainaries. Clang uses a special binary format to
bundle several files along with their metadata into a single binary
image. This is currently performed using the `-fembed-offload-binary`
option. However this is not very extensibile since it requires changing
the command flag every time we want to add something and makes optional
arguments difficult. This patch introduces a new tool called
`clang-offload-packager` that behaves similarly to CUDA's `fatbinary`.
This tool takes several input files with metadata and embeds it into a
single image that can then be embedded in the host.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D125165
Currently we require the `-fopenmp-targets=` option to specify the
triple to use for the offloading toolchains, and the `-Xopenmp-target=`
option to specify architectures to a specific toolchain. The changes
made in D124721 allowed us to use `--offload-arch=` to specify multiple
target architectures. However, this can become combersome with many
different architectures. This patch introduces functinality that
attempts to deduce the target triple and architectures from the
offloading action. Currently we will deduce known GPU architectures when
only `-fopenmp` is specified.
This required a bit of a hack to cache the deduced architectures,
without this we would've just thrown an error when we tried to look up
the architecture again when generating the job. Normally we require the
user to manually specify the toolchain arguments, but here they would
confict unless we overrode them.
Depends on: D124721
Reviewed By: saiislam
Differential Revision: https://reviews.llvm.org/D125050
This patch adds support for OpenMP to use the `--offload-arch` and
`--no-offload-arch` options. Traditionally, OpenMP has only supported
compiling for a single architecture via the `-Xopenmp-target` option.
Now we can pass in a bound architecture and use that if given, otherwise
we default to the value of the `-march` option as before.
Note that this only applies the basic support, the OpenMP target runtime
does not yet know how to choose between multiple architectures.
Additionally other parts of the offloading toolchain (e.g. LTO) require
the `-march` option, these should be worked out later.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D124721
This change makes sure that Flang's driver recognises LLVM IR and BC as
supported file formats. To this end, `isFortran` is extended and renamed
as `isSupportedByFlang` (the latter better reflects the new
functionality).
New tests are added to verify that the target triple is correctly
overridden by the frontend driver's default value or the value specified
with `-triple`. Strictly speaking, this is not a functionality that's
new in this patch (it was added in D124664). This patch simply enables
us to write such tests and hence I'm including them here.
Differential Revision: https://reviews.llvm.org/D124667
If clang is passed "-include foo.h", it will rewrite to "-include-pch foo.h.pch"
before passing it to cc1, if foo.h.pch exists.
Existence is checked, but validity is not. This is probably a reasonable
assumption for the compiler itself, but not for clang-based tools where the
actual compiler may be a different version of clang, or even GCC.
In the end, we lose our -include, we gain a -include-pch that can't be used,
and the file often fails to parse.
I would like to turn this off for all non-clang invocations (i.e.
createInvocationFromCommandLine), but we have explicit tests of this behavior
for libclang and I can't work out the implications of changing it.
Instead this patch:
- makes it optional in the driver, default on (no change)
- makes it optional in createInvocationFromCommandLine, default on (no change)
- changes driver to do IO through the VFS so it can be tested
- tests the option
- turns the option off in clangd where the problem was reported
Subsequent patches should make libclang opt in explicitly and flip the default
for all other tools. It's probably also time to extract an options struct
for createInvocationFromCommandLine.
Fixes https://github.com/clangd/clangd/issues/856
Fixes https://github.com/clangd/vscode-clangd/issues/324
Differential Revision: https://reviews.llvm.org/D124970
OpenMP recently moved to the new offloading driver, this had the effect
of making it more difficult to inspect intermediate code for the device.
This patch adds `-foffload-host-only` and `-foffload-device-only` to
control which sides get compiled. This will allow users to more easily
inspect output without needing the temp files.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D124220
This patch adds the basic support for the clang driver to compile and link CUDA
using the new offloading driver. This requires handling the CUDA offloading kind
and embedding the generated files into the host. This will allow us to link
OpenMP code with CUDA code in the linker wrapper. More support will be required
to create functional CUDA / HIP binaries using this method.
Depends on D120270 D120271 D120934
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D120272
In preparation for allowing other offloading kinds to use the new driver
a new opt-in flag `-foffload-new-driver` is added. This is distinct from
the existing `-fopenmp-new-driver` because OpenMP will soon use the new
driver by default while the others should not.
Reviewed By: yaxunl, tra
Differential Revision: https://reviews.llvm.org/D123325
Summary:
We provide the `-f(no-)openmp-new-driver` option to allow users to use
the old or new driver. Previously this wasn't handled in the expected
way and only `-fno-openmp-new-driver` was checked. This patch fixes that
by using the `hasFlag` method as is standard.
When the -fdirectives-only option is used together with -E, the preprocessor
output reflects evaluation of if/then/else directives.
Thus it preserves macros that are still live after such processing.
This output can be consumed by a second compilation to produce a header unit.
We automatically invoke this (with -E) when we know that the job produces a
header unit so that the preprocessed output reflects the macros that will be
defined when the binary HU is emitted.
Differential Revision: https://reviews.llvm.org/D121591
Allow an invocation like clang -fmodule-header bar.h (which will be a C++
compilation, but using a header which will be recognised as a C one).
Also we do not want to produce:
"treating 'c-header' input as 'c++-header' when in C++ mode"
diagnostics when the user has been specific about the intent.
Differential Revision: https://reviews.llvm.org/D121590
These command-line flags are alternates to providing the -x
c++-*-header indicators that we are building a header unit.
Act on fmodule-header= for headers on the c/l:
If we have x.hh -fmodule-header, then we should treat that header
as a header unit input (equivalent to -xc++-header-unit-header x.hh).
Likewise, for fmodule-header={user,system} the source should be now
recognised as a header unit input (since this can affect the job list
that we need).
It's not practical to recognise a header without any suffix so
-fmodule-header=system foo isn't going to happen. Although
-fmodule-header=system foo.hh will work OK. However we can make it
work if the user indicates that the item without a suffix is a valid
header. (so -fmodule-header=system -xc++-header vector)
Differential Revision: https://reviews.llvm.org/D121589
This adds file types and handling for three input types, representing a C++20
header unit source:
1. When provided with a complete pathname for the header.
2. For a header to be looked up (by the frontend) in the user search paths
3. For a header to be looked up in the system search paths.
We also add a pre-processed file type (although that is a single type, regardless
of the original input type).
These types may be specified with -xc++-{user,system,header-unit}-header xxxx.
These types allow us to disambiguate header unit jobs from PCH ones, and thus
we handle these differently from other header jobs in two ways:
1. The job construction is altered to build a C++20 header unit (rather than a
PCH file, as would be the case for other headers).
2. When the type is "user" or "system" we defer checking for the file until the
front end is run, since we need to look up the header in the relevant paths
which are not known at this point.
Differential Revision: https://reviews.llvm.org/D121588
Summary:
A new offloading action builder line was added that wasn't guarded with
the new driver for OpenMP. This doesn't affect anything now but could
potentially cause problems.
Previously an opt-in flag `-fopenmp-new-driver` was used to enable the
new offloading driver. After passing tests for a few months it should be
sufficiently mature to flip the switch and make it the default. The new
offloading driver is now enabled if there is OpenMP and OpenMP
offloading present and the new `-fno-openmp-new-driver` is not present.
The new offloading driver has three main benefits over the old method:
- Static library support
- Device-side LTO
- Unified clang driver stages
Depends on D122683
Differential Revision: https://reviews.llvm.org/D122831
The target profile option(/T) decide the shader model when compile hlsl.
The format is shaderKind_major_minor like ps_6_1.
The shader model is saved as llvm::Triple is clang/llvm like
dxil-unknown-shadermodel6.1-hull.
The main job to support the option is translating ps_6_1 into
shadermodel6.1-pixel.
That is done inside tryParseProfile at HLSL.cpp.
To integrate the option into clang Driver, a new DriverMode DxcMode is
created. When DxcMode is enabled, OSType for TargetTriple will be
forced into Triple::ShaderModel. And new ToolChain HLSLToolChain will
be created when OSType is Triple::ShaderModel.
In HLSLToolChain, ComputeEffectiveClangTriple is overridden to call
tryParseProfile when targetProfile option is set.
To make test work, Fo option is added and .hlsl is added for active
-xhlsl.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D122865
Patch by: Xiang Li <python3kgae@outlook.com>
This adds a PS5-specific ToolChain subclass, which defines some basic
PS5 driver behavior. Future patches will add more target-specific
driver behavior.
Add CSKY target toolchains to support csky in linux and elf environment.
It can leverage the basic universal Linux toolchain for linux environment, and only add some compile or link parameters.
For elf environment, add a CSKYToolChain to support compile and link.
Also add some parameters into basic codebase of clang driver.
Differential Revision: https://reviews.llvm.org/D121445
--overlay-platform-toolchain inserts a whole new toolchain path with
higher priority than system default, which could be achieved by
composing smaller options. We need to figure out alternative solution
and what is missing among these basic options.
In some cases, we need to set alternative toolchain path other than the
default with system (headers, libraries, dynamic linker prefix, ld path,
etc.), e.g., to pick up newer components, but keep sysroot at the same
time (to pick up extra packages).
This change introduces a new option --overlay-platform-toolchain to set
up such alternative toolchain path.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D121992
clang -extract-api should accept multiple headers and forward them to a
single CC1 instance. This change introduces a new ExtractAPIJobAction.
Currently API Extraction is done during the Precompile phase as this is
the current phase that matches the requirements the most. Adding a new
phase would need to change some logic in how phases are scheduled. If
the headers scheduled for API extraction are of different types the
driver emits a diagnostic.
Differential Revision: https://reviews.llvm.org/D121936
This path refactors the new driver to be less dependent on OpenMP. This
is done in preparation for the new driver to be able to handle other
offloading kinds and compile them together.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120934
When both CUDA or HIP programs and C++ programs are passed
to clang driver without -c, C++ programs are treated as CUDA
or HIP program, which is incorrect.
This is because action builder sets the offloading kind of input
job actions to the linking action to be the union of offloading
kind of the input job actions, i.e. if there is one HIP or CUDA
input to the linker, then all the input to the linker is marked
as HIP or CUDA.
To fix this issue, the offload action builder tracks the originating
input argument of each host action, which allows it to determine
the active offload kind of each host action. Then the offload
kind of each input action to the linker can be determined
individually.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D120911
When both HIP and C++ programs are input files to clang
with -c, clang treats C++ programs as HIP programs,
which is incorrect.
This is due to action builder does not set correct
offloading kind for job actions for C++ programs.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D120910
When we are creating jobs for the new driver we first check the cache to
see if the job was already created as a part of the offloading
toolchain. This would sometimes fail if the bound architecture was set
for the host during offloading. We want to ingore this because it is not
relevant for looking up host actions. Previously it was set on some
machines and would cause the cache lookup to fail.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D118858
This patch implements the fist support for handling LTO in the
offloading pipeline. The flag `-foffload-lto` is used to control if
bitcode is embedded into the device. If bitcode is found in the device,
the extracted files will be sent to the LTO pipeline to be linked and
sent to the backend. This implementation does not separately link the
device bitcode libraries yet.
Depends on D116675
Differential Revision: https://reviews.llvm.org/D116975
This patch introduces a linker wrapper tool that allows us to preprocess
files before they are sent to the linker. This adds a dummy action and
job to the driver stage that builds the linker command as usual and then
replaces the command line with the wrapper tool.
Depends on D116543
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D116544
This patch introduces the `-fopenmp-new-driver` option which instructs
the compiler to use a new driver scheme for producing offloading code.
In this scheme we create a complete offloading object file and then pass
it as input to the host compilation phase. This will allow us to embed
the object code in the backend phase.
This is the start of a series of commits to rework the OpenMP offloading driver
pipeline. The goal of this is to simplify the steps required for creating an
offloading program. This patch changes the driver's configuration to simply pass
the device file back to the host as an input so it can be embedded as an LLVM IR
global during the backend, then simply passes that object file to the linker.
This driver implementation will currently create the following phases,
```
$ clang input.c -fopenmp -fopenmp-targets=nvptx64 -fopenmp-new-driver -ccc-print-phases
+- 0: input, "input.c", c, (host-openmp)
+- 1: preprocessor, {0}, cpp-output, (host-openmp)
+- 2: compiler, {1}, ir, (host-openmp)
| | +- 3: input, "input.c", c, (device-openmp)
| | +- 4: preprocessor, {3}, cpp-output, (device-openmp)
| |- 5: compiler, {4}, ir, (device-openmp)
| +- 6: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (nvptx64)" {5}, ir
| +- 7: backend, {6}, assembler, (device-openmp)
|- 8: assembler, {7}, object, (device-openmp)
+- 9: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (nvptx64)" {8}, ir
+- 10: backend, {9}, assembler, (host-openmp)
+- 11: assembler, {10}, object, (host-openmp)
12: clang-linker-wrapper, {11}, image, (host-openmp)
```
Which will map to the following bindings
```
# "x86_64-unknown-linux-gnu" - "clang", inputs: ["input.c"], output: "/tmp/input-bae62e.bc"
# "nvptx64" - "clang", inputs: ["input.c", "/tmp/input-bae62e.bc"], output: "/tmp/input-76784e.s"
# "nvptx64" - "NVPTX::Assembler", inputs: ["/tmp/input-76784e.s"], output: "/tmp/input-8f29db.o"
# "x86_64-unknown-linux-gnu" - "clang", inputs: ["/tmp/input-bae62e.bc", "/tmp/input-8f29db.o"], output: "/tmp/input-545450.o"
# "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["/tmp/input-545450.o"], output: "a.out"
```
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D116541
This patch builds on the change in D117634 that expanded the short
triples when passed in by the user. This patch adds the same
functionality for the `-Xopenmp-target=` flag. Previously it was
unintuitive that passing `-fopenmp-targets=nvptx64
-Xopenmp-target=nvptx64 <arg>` would not forward the arg because the
triples did not match on account of `nvptx64` being expanded to
`nvptx64-nvidia-cuda`.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D118495
The --offload option was added in D110622 to "override the default
device target". When it landed it supported only HIP. This patch
extends that option to support SPIR-V targets for CUDA.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D117137
The OpenMP offloading libraries are built with fixed triples and linked
in during compile time. This would cause un-helpful errors if the user
passed in the wrong expansion of the triple used for the bitcode
library. because we only support these triples for OpenMP offloading we
can normalize them to the full verion used in the bitcode library.
Reviewed By: jdoerfert, JonChesterfield
Differential Revision: https://reviews.llvm.org/D117634
Add support of linking files compiled into SPIR-V objects
using spirv-link.
Command line inteface examples:
clang --target=spirv64 test1.cl test2.cl
clang --target=spirv64 test1.cl -o test1.o
clang --target=spirv64 test1.o test2.cl -o test_app.out
This works independently from the SPIR-V generation method
(via an external tool or an internal backend) and applies
to either approach that is being used.
Differential Revision: https://reviews.llvm.org/D116266
When passing a set of flags to configure defaults for a specific
target (similar to the cmake settings `CLANG_DEFAULT_RTLIB`,
`CLANG_DEFAULT_UNWINDLIB`, `CLANG_DEFAULT_CXX_STDLIB` and
`CLANG_DEFAULT_LINKER`, but without hardcoding them in the binary),
some of the flags may cause warnings (e.g. `-stdlib=` when compiling C
code). Allow requesting selectively ignoring unused arguments among
some of the arguments on the command line, without needing to resort
to `-Qunused-arguments` or `-Wno-unused-command-line-argument`.
Fix up the existing diagnostics.c testcase. It was added in
response to PR12181 to fix handling of
`-Werror=unused-command-line-argument`, but the command line option
in the test (`-fzyzzybalubah`) now triggers "error: unknown argument"
instead of the intended warning. Change it into a linker input
(`-lfoo`) which triggers the intended diagnostic. Extend the
existing test case to check more cases and make sure that it keeps
testing the intended case.
Add testing of the new option to this existing test.
Differential Revision: https://reviews.llvm.org/D116503
Currently when -fgpu-rdc is specified, HIP toolchain always does host linking even
if --cuda-device-only is specified.
This patch fixes that. Only device linking is performed when --cuda-device-only
is specified.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D116840
Clang searches for runtimes (e.g. libclang_rt*) first in a
subdirectory named for the target triple (corresponding to
LLVM_ENABLE_PER_TARGET_RUNTIME_DIR=ON), then if it's not found uses
.../lib/<os>/libclang_rt* with a suffix corresponding to the arch and
environment name.
Android triples optionally include an API level indicating the minimum
Android version to be run on
(e.g. aarch64-unknown-linux-android21). When compiler-rt is built with
LLVM_ENABLE_PER_TARGET_RUNTIME_DIR=ON this API level is part of the
output path.
Linking code built for a later API level against a runtime built for
an earlier one is safe. In projects with several API level targets
this is desireable to avoid re-building the same runtimes many
times. This is difficult with the current runtime search method: if
the API levels don't exactly match Clang gives up on the per-target
runtime directory path.
To enable this more simply, this change tries target triple without
the API level before falling back on the old layout.
Another option would be to try every API level in the triple,
e.g. check aarch-64-unknown-linux-android21, then ...20, then ...19,
etc.
Differential Revision: https://reviews.llvm.org/D115049