The nested switch exists to share setting IntrinsicsTypes to {ResultType}.
clz/ctz return before we reach that so they can just be in the top
level switch.
Currently, the `DropTypeTests` parameter only fully works with phi nodes
and llvm.assume instructions. However, we'd like CFI to work in
conjunction with FatLTO, in so far as the bitcode section should be able
to contain the CFI instrumentation, while any incompatible bits are
dropped when compiling the object code.
To do that, we need to drop the llvm.type.test instructions everywhere,
and not just their uses in phi nodes. This patch updates the
LowerTypeTest pass so that uses are removed, and replaced with `true` in
all cases, and not just in phi nodes.
Addressing this will allow us to fix#112053 by modifying the FatLTO
pipeline.
Reviewers: pcc, nikic
Reviewed By: pcc
Pull Request: https://github.com/llvm/llvm-project/pull/112787
UAVs and SRVs have already been converted to use LLVM target types and
we can disable generating of the !hlsl.uavs and !hlsl.srvs! annotations.
This will enable adding tests for structured buffers with user defined
types that this old resource annotations code does not handle (it
crashes).
Part 1 of #114126
Reproducer:
```
//--- a.cppm
export module a;
int func();
static int a = func();
//--- a.cpp
import a;
```
The `func()` should only execute once. However, before this patch we
will somehow import `static int a` from a.cppm incorrectly and
initialize that again.
This is super bad and can introduce serious runtime behaviors.
And also surprisingly, it looks like the root cause of the problem is
simply some oversight choosing APIs.
Enables the support of `-fcf-protection=return` on RISC-V, which
requires Zicfiss. It also adds a string attribute "hw-shadow-stack"
to every function if the option is set on RISC-V
Pure Scalable Types are defined in AAPCS64 here:
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#pure-scalable-types-psts
And should be passed according to Rule C.7 here:
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#682parameter-passing-rules
This part of the ABI is completely unimplemented in Clang, instead it
treats PSTs sometimes as HFAs/HVAs, sometime as general composite types.
This patch implements the rules for passing PSTs by employing the
`CoerceAndExpand` method and extending it to:
* allow array types in the `coerceToType`; Now only `[N x i8]` are
considered padding.
* allow mismatch between the elements of the `coerceToType` and the
elements of the `unpaddedCoerceToType`; AArch64 uses this to map
fixed-length vector types to SVE vector types.
Corectly passing a PST argument needs a decision in Clang about whether
to pass it in memory or registers or, equivalently, whether to use the
`Indirect` or `Expand/CoerceAndExpand` method. It was considered
relatively harder (or not practically possible) to make that decision in
the AArch64 backend.
Hence this patch implements the register counting from AAPCS64 (cf.
`NSRN`, `NPRN`) to guide the Clang's decision.
Remove these intrinsics which can be better represented by load
instructions with `!invariant.load` metadata:
- llvm.nvvm.ldg.global.i
- llvm.nvvm.ldg.global.f
- llvm.nvvm.ldg.global.p
# What
This PR renames the newly-introduced llvm attribute
`sanitize_realtime_unsafe` to `sanitize_realtime_blocking`. Likewise,
sibling variables such as `SanitizeRealtimeUnsafe` are renamed to
`SanitizeRealtimeBlocking` respectively. There are no other functional
changes.
# Why?
- There are a number of problems that can cause a function to be
real-time "unsafe",
- we wish to communicate what problems rtsan detects and *why* they're
unsafe, and
- a generic "unsafe" attribute is, in our opinion, too broad a net -
which may lead to future implementations that need extra contextual
information passed through them in order to communicate meaningful
reasons to users.
- We want to avoid this situation and make the runtime library boundary
API/ABI as simple as possible, and
- we believe that restricting the scope of attributes to names like
`sanitize_realtime_blocking` is an effective means of doing so.
We also feel that the symmetry between `[[clang::blocking]]` and
`sanitize_realtime_blocking` is easier to follow as a developer.
# Concerns
- I'm aware that the LLVM attribute `sanitize_realtime_unsafe` has been
part of the tree for a few weeks now (introduced here:
https://github.com/llvm/llvm-project/pull/106754). Given that it hasn't
been released in version 20 yet, am I correct in considering this to not
be a breaking change?
ARM ACLE PR#323[1] adds new modal types for 8-bit floating point
intrinsic.
From the PR#323:
```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
````
The type should be an opaque type and its format in undefined in Clang.
Only defined in the backend by a status/format register, for AArch64 the
FPMR.
This patch is an attempt to the add the mfloat8_t scalar type. It has a
parser and codegen for the new scalar type.
The patch it is lowering to and 8bit unsigned as it has no format. But
maybe we should add another opaque type.
[1] https://github.com/ARM-software/acle/pull/323
This patch implements an approach to communicate errors between the
OMPIRBuilder and its users. It introduces `llvm::Error` and
`llvm::Expected` objects to replace the values returned by callbacks
passed to `OMPIRBuilder` codegen functions. These functions then check
the result for errors when callbacks are called and forward them back to
the caller, which has the flexibility to recover, exit cleanly or dump a
stack trace.
This prevents a failed callback to leave the IR in an invalid state and
still continue the codegen process, triggering unrelated assertions or
segmentation faults. In the case of MLIR to LLVM IR translation of the
'omp' dialect, this change results in the compiler emitting errors and
exiting early instead of triggering a crash for not-yet-implemented
errors. The behavior in Clang and openmp-opt stays unchanged, since
callbacks will continue always returning 'success'.
If a call using the musttail attribute returns it's value through an
sret argument pointer, we must forward an incoming sret pointer to it,
instead of creating a new alloca. This is always possible because the
musttail attribute requires the caller and callee to have the same
argument and return types.
Extends `nowait` support for other device directives. This PR refactors
the task generation utils used for the `target` directive so that they
are general enough to be reused for other device directives as well.
Fixes: #113187
Avoid to create init function since clang does not support global
variable with flexible array init.
It will cause assertion failure later.
When compiling HIP source for AMDGCN flavoured SPIR-V that is expected
to be consumed by the ROCm HIP RT, it's not desirable to set the OpenCL
Kernel CC on `__global__` functions. On one hand, this is not an OpenCL
RT, so it doesn't compose with e.g. OCL specific attributes. On the
other it is a "noisy" CC that carries semantics, and breaks overload
resolution when using [generic dispatchers such as those used by
RAJA](186d4194a5/src/common/HipDataUtils.hpp (L39)).
Currently, for AMDGPU, when compiling for OpenCL, we unconditionally use
`private` as the default address space. This is wrong for cases where
the `generic` address space is available, and is corrected via this
patch. In general, this AS map abuse is a bad hack and we should re-work
it altogether, but at least after this patch we will stop being
incorrect for e.g. OpenCL 2.0.
Fixed: #113044
the type of `ArrayTypeTraitExpr` can be changed, use i32 directly is
incorrect.
---------
Co-authored-by: Eli Friedman <efriedma@quicinc.com>
Currently both True/False counts were folded. It lost the information,
"It is True or False before folding." It prevented recalling branch
counts in merging template instantiations.
In `llvm-cov`, a folded branch is shown as:
- `[True: n, Folded]`
- `[Folded, False n]`
In the case If `n` is zero, a branch is reported as "uncovered". This is
distinguished from "folded" branch. When folded branches are merged,
`Folded` may be dissolved.
In the coverage map, either `Counter` is `Zero`. Currently both were
`Zero`.
Since "partial fold" has been introduced, either case in `switch` is
omitted as `Folded`.
Each `case:` in `switch` is reported as `[True: n, Folded]`, since
`False` count doesn't show meaningful value.
When `switch` doesn't have `default:`, `switch (Cond)` is reported as
`[Folded, False: n]`, since `True` count was just the sum of `case`(s).
`switch` with `default` can be considered as "the statement that doesn't
have any `False`(s)".
In #99726, `-fptrauth-type-info-vtable-pointer-discrimination` was
introduced, which is intended to enable type and address discrimination
for type_info vtable pointers. However, some codegen logic for actually
enabling address discrimination was missing. This patch addresses the
issue.
Fixes#101716
Adds `@_init_resource_bindings()` function to module initialization that
includes `handle.fromBinding` intrinsic calls for simple resource
declarations. Arrays of resources or resources inside user defined types
are not supported yet.
While this unblocks our progress on [Compile a runnable shader from
clang](https://github.com/llvm/wg-hlsl/issues/7) milestone, this is
probably not the way we would like to handle resource binding
initialization going forward. Ideally, it should be done via the
resource class constructors in order to support dynamic resource binding
or unbounded arrays if resources.
Depends on PRs #110327 and #111203.
Part 1 of #105076
The whole struct is specificed in the __bdos. The calculation of the
whole size of the structure can be done in two ways:
1) sizeof(struct S) + count * sizeof(typeof(fam))
2) offsetof(struct S, fam) + count * sizeof(typeof(fam))
The first will add any remaining whitespace that might exist after
allocation while the second method is more precise, but not quite
expected from programmers. See [1] for a discussion of the topic.
GCC isn't (currently) able to calculate __bdos on a pointer to the whole
structure. Therefore, because of the above issue, we'll choose to match
what GCC does for consistency's sake.
[1] https://lore.kernel.org/lkml/ZvV6X5FPBBW7CO1f@archlinux/
Co-authored-by: Eli Friedman <efriedma@quicinc.com>
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.
The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.
Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
When the arch in the triple in "spirv", the default target codegen is
currently used. We should be using the spir-v target codegen. This will
be used to have SPIR-V specific lowering of the HLSL types.
Commit 84ee629bc5 ("clang: Remove some pointer bitcasts (#112324)",
2024-10-15) triggered some "Call parameter type does not match function
signature!" errors when using the OpenCL pipe builtin functions under
the spir triple, due to a missing addrspacecast.
This would have been caught by the pipe_builtin.cl test if that had used
the `spir-unknown-unknown` triple, so extend the test to use that
triple too.
- create a clang built-in in Builtins.td
- add semantic checking in SemaHLSL.cpp
- link the WaveReadLaneAt api in hlsl_intrinsics.h
- add lowering to spirv backend op GroupNonUniformShuffle
with Scope = 2 (Group) in SPIRVInstructionSelector.cpp
- add WaveReadLaneAt intrinsic to IntrinsicsDirectX.td and mapping
to DXIL.td
- add tests for HLSL intrinsic lowering to spirv intrinsic in
WaveReadLaneAt.hlsl
- add tests for sema checks in WaveReadLaneAt-errors.hlsl
- add spir-v backend tests in WaveReadLaneAt.ll
- add test to show scalar dxil lowering functionality
- note that this doesn't include support for the scalarizer to
handle WaveReadLaneAt will be added in a future pr
This is the first part #70104
Translates `RWBuffer` and `StructuredBuffer` resources buffer types to
DirectX target types `dx.TypedBuffer` and `dx.RawBuffer`.
Includes a change of `HLSLAttributesResourceType` from 'sugar' type to
full canonical type. This is required for codegen and other clang
infrastructure to work property on HLSL resource types.
Fixes#95952 (part 2/2)