Add handling for FPFastMathMode in SPIR-V shaders. This is a first pass
that simply does a direct translation when the proper extension is
available.
This will unblock work for HLSL. However, it is not a full solution.
The default math mode for SPIR-V is determined by the API. When
targeting Vulkan, many of the fast math options are assumed, so Vulkan
targets will need special handling.
We will also need to handle the HLSL "precise" keyword correctly when
FPFastMathMode is not available.
Unblocks https://github.com/llvm/llvm-project/issues/140739, but we are
keeping it open to track the remaining issues mentioned above.
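As a rough illustration of what a direct translation could look like, the sketch below maps a set of LLVM-style fast-math flags onto the SPIR-V `FPFastMathMode` bitmask. The `FastMathFlags` struct, the `HasExtendedBits` parameter, and `toSPIRVFastMathMode` are hypothetical names, not the code in this patch. The bit values follow the `FPFastMathMode` table in the SPIR-V specification, and the two high bits are assumed to be usable only when a suitable extension is enabled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for per-instruction fast-math flags.
struct FastMathFlags {
  bool NoNaNs = false;
  bool NoInfs = false;
  bool NoSignedZeros = false;
  bool AllowReciprocal = false;
  bool AllowContract = false;
  bool AllowReassoc = false;
};

// Bit values as listed for FPFastMathMode in the SPIR-V specification.
enum FPFastMathModeBits : uint32_t {
  FPFMM_None = 0x0,
  FPFMM_NotNaN = 0x1,
  FPFMM_NotInf = 0x2,
  FPFMM_NSZ = 0x4,
  FPFMM_AllowRecip = 0x8,
  FPFMM_AllowContract = 0x10000,
  FPFMM_AllowReassoc = 0x20000,
};

// Translate each flag to its corresponding bit. HasExtendedBits is an
// assumption of this sketch: it models whether the extension providing
// the contract/reassoc bits is available.
uint32_t toSPIRVFastMathMode(const FastMathFlags &F, bool HasExtendedBits) {
  uint32_t Mask = FPFMM_None;
  if (F.NoNaNs)
    Mask |= FPFMM_NotNaN;
  if (F.NoInfs)
    Mask |= FPFMM_NotInf;
  if (F.NoSignedZeros)
    Mask |= FPFMM_NSZ;
  if (F.AllowReciprocal)
    Mask |= FPFMM_AllowRecip;
  if (HasExtendedBits) {
    if (F.AllowContract)
      Mask |= FPFMM_AllowContract;
    if (F.AllowReassoc)
      Mask |= FPFMM_AllowReassoc;
  }
  return Mask;
}
```

The sketch deliberately ignores the Vulkan defaults and the "precise" handling described above, which remain open.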
Add the `dead_on_return` attribute, which is meant to be taken advantage
of by the frontend, and states that the memory pointed to by the argument
is dead upon function return. As with `byval`, it is supposed to be
used for passing aggregates by value. The difference lies in the ABI:
`byval` implies that the pointer is explicitly passed as argument to
the callee (during codegen the copy is emitted as per byval contract),
whereas a `dead_on_return`-marked argument implies that the copy
already exists in the IR, is located at a specific stack offset within
the caller, and this memory will not be read further by the caller upon
callee return – or otherwise poison, if read before being written.
RFC: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.
This PR introduces support for the DWARF64 format, enabling handling of
64-bit DWARF sections as defined by the DWARF specification. The update
includes adjustments to header parsing and modification of form values
to accommodate 64-bit offsets and values.
Also added a test case to verify the DWARF64 format.
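For context, the DWARF specification distinguishes the two formats by the initial length field: a 32-bit value of `0xffffffff` signals DWARF64 and is followed by a 64-bit length, and offsets are then 8 bytes wide instead of 4. The following is a minimal sketch of that header check, assuming little-endian input. `UnitLength`, `readLE`, and `parseInitialLength` are hypothetical names, not the parser changed in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

struct UnitLength {
  bool IsDWARF64;
  uint64_t Length;         // unit length, excluding the length field itself
  unsigned OffsetSize;     // 4 for DWARF32, 8 for DWARF64
  std::size_t HeaderBytes; // bytes consumed by the initial length field
};

// Reads N little-endian bytes starting at Pos (caller checks bounds).
static uint64_t readLE(const std::vector<uint8_t> &Data, std::size_t Pos,
                       unsigned N) {
  uint64_t V = 0;
  for (unsigned I = 0; I < N; ++I)
    V |= static_cast<uint64_t>(Data[Pos + I]) << (8 * I);
  return V;
}

std::optional<UnitLength> parseInitialLength(const std::vector<uint8_t> &Data) {
  if (Data.size() < 4)
    return std::nullopt;
  uint64_t First = readLE(Data, 0, 4);
  if (First == 0xffffffffu) {
    // DWARF64 escape: the real length follows as a 64-bit value.
    if (Data.size() < 12)
      return std::nullopt;
    return UnitLength{true, readLE(Data, 4, 8), 8, 12};
  }
  // 0xfffffff0..0xfffffffe are reserved by the specification.
  if (First >= 0xfffffff0u)
    return std::nullopt;
  return UnitLength{false, First, 4, 4};
}
```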
This change simplifies the API by removing the error handling complexity.
- Changed `Embedder::create()` to return `std::unique_ptr<Embedder>` directly instead of `Expected<std::unique_ptr<Embedder>>`
- Updated documentation and tests to reflect the new API
- Added death test for invalid IR2Vec kind in debug mode
- In release mode, simply returns nullptr for invalid kinds instead of creating an error
(Tracking issue - #141817)
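The shape of the simplified factory can be sketched as follows. This is an illustration with hypothetical stand-in types (`EmbedderKind`, `SymbolicEmbedder`), not the IR2Vec classes themselves, and it models only the release-mode behaviour of returning `nullptr`. The debug-mode assertion covered by the death test is left out so the example can run with assertions enabled.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical kinds; Unknown stands in for any invalid value.
enum class EmbedderKind { Symbolic, Unknown };

class Embedder {
public:
  virtual ~Embedder() = default;
  virtual std::string name() const = 0;
  // Returns the embedder directly; callers check for nullptr instead of
  // unwrapping an Expected<...>.
  static std::unique_ptr<Embedder> create(EmbedderKind K);
};

class SymbolicEmbedder : public Embedder {
public:
  std::string name() const override { return "symbolic"; }
};

std::unique_ptr<Embedder> Embedder::create(EmbedderKind K) {
  switch (K) {
  case EmbedderKind::Symbolic:
    return std::make_unique<SymbolicEmbedder>();
  case EmbedderKind::Unknown:
    break;
  }
  // Invalid kind: no error object is created, the caller just sees nullptr.
  return nullptr;
}
```

A caller then writes `if (auto E = Embedder::create(Kind)) { ... }` rather than handling an error value.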
Scale opcodes, types, and args once in `IR2VecVocabAnalysis` so that we avoid rescaling each time embeddings are computed. This PR refactors the vocabulary to explicitly define three sections (Opcodes, Types, and Arguments) used for computing embeddings.
(Tracking issue - #141817 ; partly fixes - #141832)
This patch adds a new document describing the LLVM Qualification Group,
modeled after the Security Group documentation. The goal is to create an
open working group focused on enabling LLVM use in safety-critical
applications, such as those requiring ISO 26262 qualification.
The group is intended to be non-enforcing and collaborative, and to act
as a public coordination point for contributors working on
safety-relevant concerns in LLVM.
See:
https://discourse.llvm.org/t/rfc-proposal-to-establish-a-safety-group-in-llvm/86916
In this review, I’d really appreciate your feedback on both the overall
structure and wording, especially if anything could be made clearer,
more balanced, or more aligned with LLVM’s values and documentation
tone. What feels right? What could be improved to better reflect LLVM
community expectations?
---------
Co-authored-by: Wendi Urribarri (Woven by Toyota) <wendi.urribarri@woven-planet.global>
Require using the `[[maybe_unused]]` attribute for assert-only variables
that may be unused in builds without assertions enabled, to suppress
unused-variable warnings.
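A small example of the pattern, using a hypothetical helper `popBack`: the variable `SizeBefore` is read only inside `assert`, so with `NDEBUG` defined it would otherwise be reported as unused.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Removes and returns the last element of a non-empty vector.
int popBack(std::vector<int> &V) {
  // Only read inside assert(); [[maybe_unused]] keeps builds that define
  // NDEBUG free of unused-variable warnings.
  [[maybe_unused]] const std::size_t SizeBefore = V.size();
  int Last = V.back();
  V.pop_back();
  assert(V.size() + 1 == SizeBefore && "pop_back should remove one element");
  return Last;
}
```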
---------
Co-authored-by: James Henderson <James.Henderson@sony.com>
Co-authored-by: Nikita Popov <github@npopov.com>
As far as I know binutils does not have a similar option and I don't
know of a reason we shouldn't accept the RVC hint instructions.
The wording in the spec in the past suggested that maybe these
weren't valid instruction names, but that's been modified recently.
Similar to the existing implementations for X86 and PPC, support
symbolizing branch targets for AArch64. Do not omit the address for ADRP
as the target is typically not at an intended location.
Pull Request: https://github.com/llvm/llvm-project/pull/145009
The RFC for this removal can be found at:
https://discourse.llvm.org/t/rfc-removing-pstl/86807
Note, libc++ still supports PSTL. That support is integrated directly
into the libc++ source tree.
There is no release note for this removal because it's not really clear
that these were user-facing facilities or where such a release note
should live.
## Purpose
Simplify the logic used to define `LLVM_ABI` and related macros,
eliminate the `LLVM_ABI_FRIEND` macro, and update the `LLVM_ABI` macro
to always resolve to `__attribute__((visibility("default")))` when
building LLVM as a shared library for ELF or Mach-O targets.
## Background
Previously, `LLVM_ABI` was defined to the C++ style attribute
`[[gnu::visibility("default")]]` when compiling with gcc, which has more
restrictions on its placement. Of note, the C++ style attributes cannot
decorate `friend` functions and must not appear after `extern` on
variable declarations.
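The two placements in question can be illustrated with the GNU-style spelling. This is a sketch only: `EXAMPLE_ABI`, `Counter`, `bump`, and `GlobalStep` are hypothetical names, and the macro is a stand-in for `LLVM_ABI` rather than its actual definition.

```cpp
#include <cassert>

// Hypothetical stand-in for LLVM_ABI using the GNU-style attribute where
// the compiler supports it.
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_ABI __attribute__((visibility("default")))
#else
#define EXAMPLE_ABI
#endif

class Counter {
  int Value = 0;

public:
  // GNU-style attribute on a friend function declaration.
  EXAMPLE_ABI friend int bump(Counter &C);
};

// GNU-style attribute placed after `extern` on a variable declaration.
extern EXAMPLE_ABI int GlobalStep;
int GlobalStep = 2;

int bump(Counter &C) {
  C.Value += GlobalStep;
  return C.Value;
}
```

With the C++ style `[[gnu::visibility("default")]]` spelling, neither of these two annotated declarations would be accepted as written.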
Documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Overview
- Define a new CMake config value,
`LLVM_ENABLE_LLVM_EXPORT_ANNOTATIONS`, which is implicitly set whenever
`LLVM_BUILD_LLVM_DYLIB`, `LLVM_BUILD_SHARED_LIBS`, or
`LLVM_ENABLE_PLUGINS` is set. Add it as a `#cmakedefine` to
llvm-config.h so its definition is available to projects building
against LLVM as required so clients see `__declspec(dllimport)` on
Windows.
- Gate the `LLVM_ABI` macro definitions in Compiler.h behind the new
`LLVM_ENABLE_LLVM_EXPORT_ANNOTATIONS` definition. This is
simpler/cleaner, but should be equivalent to the previous logic.
- Maintain `LLVM_BUILD_STATIC` as an override to be used by specific
targets that don't want to build against the DLL/shared library, such as
tablegen.
- For ELF and Mach-O targets, directly define `LLVM_ABI` as
`__attribute__((visibility("default")))` instead of
`LLVM_ATTRIBUTE_VISIBILITY_DEFAULT`, which resolves to C++ style
`[[gnu::visibility("default")]]` when compiling with gcc.
- Remove the `LLVM_ABI_FRIEND` macro and replace all usages of it with
`LLVM_ABI`.
- Update the documentation for exporting friend functions to no longer
reference `LLVM_ABI_FRIEND`.
## Validation
- Built as static lib with clang and gcc on Linux.
- Built as static with clang-cl and MSVC on Windows.
- Built as shared lib with clang and gcc on Linux (+ additional local
changes not yet merged).
- Built as DLL with clang-cl and MSVC on Windows (+ additional local
changes not yet merged).
---------
Co-authored-by: SquallATF <squallatf@gmail.com>
Also fix the LangRef to match the implementation. This was checking
against the alloca address space size rather than the default address
space.
The check was also more permissive than the LangRef. The error
check permitted any size less than the pointer size; follow the
stricter wording of the LangRef.
Barrier instructions are no-ops in single-wave workgroups. This includes
s_barrier_signal_isfirst, which will leave SCC unmodified.
Model this correctly (via an implicit use of SCC) and ensure SCC==1
before the barrier instruction (if the wave is the only one of the
workgroup, then it is the first).
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
This change adds support for the family-specific architecture variants introduced in [PTX ISA
8.8](https://docs.nvidia.com/cuda/parallel-thread-execution/#ptx-isa-version-8-8).
These architecture variants have an "f" suffix, for example sm_100f.
This change doesn't promote existing features to family-specific
architecture.
Background: The yaml-strtab format looks just like the yaml format,
except that the values in the key/value pairs of the remarks are
deduplicated and replaced by indices into a string table (see removed
test cases for examples). The motivation behind this format was to
reduce size of the remarks files. However, it was quickly superseded by
the bitstream format.
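The deduplication idea behind the format can be sketched with a small string table: each distinct value is stored once and referenced by its index. This is an illustration only; the `StringTable` class below is hypothetical and is not the remarks library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

class StringTable {
  std::unordered_map<std::string, std::size_t> Index;
  std::vector<std::string> Strings;

public:
  // Returns the index of S, adding it to the table on first use.
  std::size_t intern(const std::string &S) {
    auto [It, Inserted] = Index.try_emplace(S, Strings.size());
    if (Inserted)
      Strings.push_back(S);
    return It->second;
  }

  // Maps an index back to the original string.
  const std::string &lookup(std::size_t I) const { return Strings.at(I); }

  std::size_t size() const { return Strings.size(); }
};
```

A remark value such as a pass name would then be serialized as the index returned by `intern`, and the table itself emitted once per file.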
Therefore, remove the yaml-strtab format, as it doesn't have a good
use case anymore:
- It isn't particularly efficient
- It isn't human-readable
- It isn't straightforward to parse in external tools that can't use the
remarks library. We don't even support it in opt-viewer.
llvm-remarkutil is also missing options to parse/convert yaml-strtab, so
the chance that anyone is actually using this format is low.
Shaders compiled with DXC/LLPC generate these relocations, and even if
that changes in the future we want to handle existing binaries. The
friction to support this and the maintenance cost long term both seem
incredibly low, considering other targets like ARM support both REL/RELA
static relocations behind the same interface.
Update the guideline to reduce the chance of miscompilation/performance
regression.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
Co-authored-by: Antonio Frighetto <me@antoniofrighetto.com>
Some data members are only part of a class definition in a Debug build,
e.g. `LVObject::ID`. If `debuginfologicalview` is used as a library,
`NDEBUG` cannot be used for this purpose, as this PP macro may have a
different definition in a downstream project, which in turn triggers an
ODR violation. Fix it by
- Making `LVObject::ID` an unconditional data member.
- Making `LVObject::dump()` non-virtual. Rationale: `virtual` is not
needed (and it calls `print()`, which is virtual anyway).
Fixes #139098.
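To make the problem concrete, the sketch below shows how a member guarded by `NDEBUG` gives the same class two different layouts, followed by the shape of the fix. The two namespaces exist only so both views can coexist in one file, and all type names are hypothetical simplifications rather than the real `LVObject`.

```cpp
#include <cassert>
#include <cstdint>

// The "same" class as seen by a translation unit built with assertions...
namespace as_seen_with_asserts {
struct Object {
  uint32_t Offset = 0;
  uint32_t ID = 0; // present only when NDEBUG is not defined
};
} // namespace as_seen_with_asserts

// ...and by a downstream translation unit that defines NDEBUG.
namespace as_seen_with_ndebug {
struct Object {
  uint32_t Offset = 0;
};
} // namespace as_seen_with_ndebug

// Shape of the fix: the member is unconditional, and dump() is a plain
// member function that forwards to the virtual print().
struct FixedObject {
  uint32_t Offset = 0;
  uint32_t ID = 0;
  virtual ~FixedObject() = default;
  virtual int print() const { return 1; }
  int dump() const { return print(); }
};

struct DerivedObject : FixedObject {
  int print() const override { return 2; }
};
```

Because `dump()` calls the virtual `print()`, derived classes still customize its output without `dump()` itself being virtual.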
When no profile is provided, but the new --empty-profile option is
specified, the export/report/show commands now emit coverage data
equivalent to that obtained from a profile with all zero counters
("baseline coverage").
This is useful for build systems (e.g. Bazel) that can track coverage
information for each build target, even those that are never linked into
tests and thus don't have runtime coverage data recorded. By merging in
baseline coverage, lines in files that aren't linked into tests are
correctly reported as uncovered.
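The effect of merging in baseline coverage can be sketched with a toy model in which coverage is a map from file to per-line execution counts. This is not the llvm-cov data model; `Coverage`, `mergeCoverage`, and `uncoveredLines` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using LineCounts = std::map<unsigned, uint64_t>;
using Coverage = std::map<std::string, LineCounts>;

// Starts from the all-zero baseline and adds the runtime counts on top, so
// files that never ran keep their zero entries instead of being absent.
Coverage mergeCoverage(const Coverage &Baseline, const Coverage &Runtime) {
  Coverage Merged = Baseline;
  for (const auto &[File, Lines] : Runtime)
    for (const auto &[Line, Count] : Lines)
      Merged[File][Line] += Count;
  return Merged;
}

// Counts lines whose execution count is zero.
unsigned uncoveredLines(const Coverage &C) {
  unsigned N = 0;
  for (const auto &[File, Lines] : C)
    for (const auto &[Line, Count] : Lines)
      if (Count == 0)
        ++N;
  return N;
}
```

Without the baseline, a file that is never linked into a test would simply not appear in the merged result, so its lines would go unreported rather than show up as uncovered.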
Reland with fixes to `CoverageMappingTest.cpp`.
Reverts llvm/llvm-project#144121
Reverts llvm/llvm-project#117910
```
/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/unittests/ProfileData/CoverageMappingTest.cpp
/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/unittests/ProfileData/CoverageMappingTest.cpp:281:28: error: 'std::reference_wrapper' may not intend to support class template argument deduction [-Werror,-Wctad-maybe-unsupported]
281 | std::make_optional(std::reference_wrapper(*ProfileReader));
| ^
/usr/lib/gcc/ppc64le-redhat-linux/8/../../../../include/c++/8/bits/refwrap.h:289:11: note: add a deduction guide to suppress this warning
289 | class reference_wrapper
| ^
```
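The warning above is triggered by class template argument deduction on `std::reference_wrapper`. Two common ways to avoid relying on deduction are `std::ref` and spelling out the template argument. The sketch below shows both with a hypothetical `Reader` type; it is not necessarily the fix applied in the reland.

```cpp
#include <cassert>
#include <functional>
#include <optional>

struct Reader { // hypothetical stand-in for the profile reader
  int Version = 1;
};

// Option 1: std::ref builds the wrapper without class template argument
// deduction.
std::optional<std::reference_wrapper<Reader>> wrapWithRef(Reader &R) {
  return std::make_optional(std::ref(R));
}

// Option 2: name the template argument explicitly.
std::optional<std::reference_wrapper<Reader>> wrapExplicit(Reader &R) {
  return std::make_optional(std::reference_wrapper<Reader>(R));
}
```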
This PR fixes a bug in GatherToLDSOpLowering: the destination's
MemRefType was being taken from the source. Additionally, some related
typos are corrected.
CC: @krzysz00 @umangyadav @lialan
When no profile is provided, but the new --empty-profile option is
specified, the export/report/show commands now emit coverage data
equivalent to that obtained from a profile with all zero counters
("baseline coverage").
This is useful for build systems (e.g. Bazel) that can track coverage
information for each build target, even those that are never linked into
tests and thus don't have runtime coverage data recorded. By merging in
baseline coverage, lines in files that aren't linked into tests are
correctly reported as uncovered.
This patch is part of a series that adds origin-tracking to the debugify
source location coverage checks, allowing us to report symbolized stack
traces of the point where missing source locations appear.
This patch adds the configuration options needed to enable this feature,
in the form of a new CMake option that enables a flag in
`llvm-config.h`; this is not an entirely new CMake flag, but a new
option, `COVERAGE_AND_ORIGIN`, for the existing flag
`LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING`. This patch contains
documentation, but no actual implementation for the flag itself.
The following GitHub organizations were merged into the ROCm org:
* ROCm-Developer-Tools
* RadeonOpenCompute
* ROCmSoftwarePlatform
Ensure that all hyperlinks to the old organizations now point to the new
organization at https://github.com/ROCm.
This patch extends the TMA G2S intrinsics with the
support for cta_group::1/2 available from Blackwell onwards.
The existing intrinsics are auto-upgraded with a default
value of '0' for the `cta_group` flag operand.
* lit tests are added for all combinations of the newer variants.
* Negative tests are added to validate the error-handling
when the value of the cta_group flag falls out-of-range.
* The generated PTX is verified with a 12.8 ptxas executable.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
This is a CLI tool that tests the conformance of LLVM's mustache
implementation against the public Mustache spec, hosted at
https://github.com/mustache/spec. This is a revised version of the
patches in #111487.
Co-authored-by: Peter Chou <peter.chou@mail.utoronto.ca>
This patch extends the llvm.histogram intrinsic to support additional
update operations beyond the existing add. Specifically, the new
supported operations are:
* umax: unsigned maximum
* umin: unsigned minimum
* uadd.sat: unsigned saturated addition
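The semantics of the update operations can be modelled with a scalar reference loop. This is a simplified sketch: it uses bucket indices and 8-bit buckets instead of the intrinsic's vector of pointers and mask, and `HistOp` and `histogramUpdate` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class HistOp { Add, UMax, UMin, UAddSat };

// Applies Op with operand Inc to the bucket selected by each index, in
// order. Repeated indices update the same bucket repeatedly.
void histogramUpdate(std::vector<uint8_t> &Buckets,
                     const std::vector<std::size_t> &Indices, uint8_t Inc,
                     HistOp Op) {
  for (std::size_t I : Indices) {
    uint8_t &B = Buckets.at(I);
    switch (Op) {
    case HistOp::Add: // wraps around on overflow
      B = static_cast<uint8_t>(B + Inc);
      break;
    case HistOp::UMax:
      B = std::max(B, Inc);
      break;
    case HistOp::UMin:
      B = std::min(B, Inc);
      break;
    case HistOp::UAddSat: { // clamps at the maximum bucket value
      unsigned Sum = static_cast<unsigned>(B) + Inc;
      B = static_cast<uint8_t>(std::min(Sum, 255u));
      break;
    }
    }
  }
}
```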
Based on the discussion from:
https://discourse.llvm.org/t/rfc-expanding-the-experimental-histogram-intrinsic/84673
This change is to support target extension types in vectors. The change
allows sized target extension types to opt-in to being a valid vector
element.
Allowing target extension types as vector elements will allow backends
to use vector operations such as `insertelement` and `extractelement` on
their target types with minimal changes.
RFC:
https://discourse.llvm.org/t/rfc-supporting-sized-target-extension-types-in-vector/86431