17398 Commits

Author SHA1 Message Date
Phoebe Wang
c72a751dab [X86][AMX] Support AMX-TRANSPOSE (#113532)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-11-01 16:45:03 +08:00
Craig Topper
cd8d507b07 [RISCV] Pull __builtin_riscv_clz/ctz out of a nested switch. NFC
The nested switch exists to share setting IntrinsicsTypes to {ResultType}.
clz/ctz return before we reach that so they can just be in the top
level switch.
2024-10-31 11:01:58 -07:00
Simon Pilgrim
fcaa8c6e22 Fix MSVC "signed/unsigned mismatch" warning. NFC.
2024-10-31 11:50:19 +00:00
Stanislav Mekhanoshin
ba1a09da8d [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (#113610)
The same handling as for __builtin_amdgcn_mov_dpp.
2024-10-31 02:19:20 -07:00
Paul Kirth
b01e2a8b56 [llvm] Allow always dropping all llvm.type.test sequences
Currently, the `DropTypeTests` parameter only fully works with phi nodes
and llvm.assume instructions. However, we'd like CFI to work in
conjunction with FatLTO, in so far as the bitcode section should be able
to contain the CFI instrumentation, while any incompatible bits are
dropped when compiling the object code.

To do that, we need to drop the llvm.type.test instructions everywhere,
and not just their uses in phi nodes. This patch updates the
LowerTypeTest pass so that uses are removed, and replaced with `true` in
all cases, and not just in phi nodes.

Addressing this will allow us to fix #112053 by modifying the FatLTO
pipeline.

Reviewers: pcc, nikic

Reviewed By: pcc

Pull Request: https://github.com/llvm/llvm-project/pull/112787
2024-10-30 16:56:30 -07:00
Helena Kotas
74d8f3952c [HLSL] Remove old resource annotations for UAVs and SRVs (#114139)
UAVs and SRVs have already been converted to use LLVM target types, so
we can disable generation of the !hlsl.uavs and !hlsl.srvs annotations.
This will enable adding tests for structured buffers with user-defined
types, which the old resource annotation code does not handle (it
crashes).

Part 1 of #114126
2024-10-30 14:06:42 -07:00
Jay Foad
463a4c16ea [clang] Remove some uses of llvm::StructType::setBody. NFC. (#113691)
It is simple to create the struct body up front, now that we have
transitioned to opaque pointers.
2024-10-30 16:53:08 +00:00
Chuanqi Xu
259eaa6878 [C++20] [Modules] Fix the duplicated static initializer problem (#114193)
Reproducer:

```
//--- a.cppm
export module a;
int func();
static int a = func();

//--- a.cpp
import a;
```

`func()` should execute only once. However, before this patch we
incorrectly imported `static int a` from a.cppm and initialized it
again.

This is a serious bug that can cause incorrect runtime behavior.

Surprisingly, the root cause appears to be simply an oversight in
choosing APIs.
2024-10-30 17:27:04 +08:00
Jesse Huang
335e68d8bc [Clang][RISCV] Support -fcf-protection=return for RISC-V (#112477)
Enables the support of `-fcf-protection=return` on RISC-V, which
requires Zicfiss. It also adds a string attribute "hw-shadow-stack"
to every function if the option is set on RISC-V
2024-10-29 15:47:49 +08:00
joaosaffran
481bce018e Adding splitdouble HLSL function (#109331)
- Adding hlsl `splitdouble` intrinsics
- Adding DXIL lowering
- Adding SPIRV lowering
- Adding test

Fixes: #108901

---------

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
2024-10-28 13:26:59 -07:00
Steven Perron
98e3075df9 [HLSL][SPIRV] Add convergence tokens to entry point wrapper (#112757)
Inlining currently assumes that either all functions use controlled
convergence or none of them do. This is why the entry point wrapper
needs to use controlled convergence.


c85611e858/llvm/lib/Transforms/Utils/InlineFunction.cpp (L2431-L2439)
2024-10-28 13:25:04 -04:00
Aaron Ballman
af7c58b7ea Remove support for RenderScript (#112916)
See
https://discourse.llvm.org/t/rfc-deprecate-and-eventually-remove-renderscript-support/81284
for the RFC
2024-10-28 12:48:42 -04:00
Momchil Velikov
53f7f8ecca [Clang][AArch64] Fix Pure Scalable Types argument passing and return (#112747)
Pure Scalable Types are defined in AAPCS64 here:

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#pure-scalable-types-psts

And should be passed according to Rule C.7 here:

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#682parameter-passing-rules

This part of the ABI is completely unimplemented in Clang; instead, it
treats PSTs sometimes as HFAs/HVAs, sometimes as general composite types.

This patch implements the rules for passing PSTs by employing the
`CoerceAndExpand` method and extending it to:
* allow array types in the `coerceToType`; now only `[N x i8]` arrays are
considered padding.
* allow a mismatch between the elements of the `coerceToType` and the
elements of the `unpaddedCoerceToType`; AArch64 uses this to map
fixed-length vector types to SVE vector types.

Correctly passing a PST argument requires a decision in Clang about whether
to pass it in memory or registers or, equivalently, whether to use the
`Indirect` or `Expand/CoerceAndExpand` method. Making that decision in
the AArch64 backend was considered relatively harder (or not practically
possible).
Hence this patch implements the register counting from AAPCS64 (cf.
`NSRN`, `NPRN`) to guide Clang's decision.
2024-10-28 15:43:14 +00:00
Simon Pilgrim
d6d4569dd9 Fix MSVC "signed/unsigned mismatch" warnings. NFC.
2024-10-28 11:45:36 +00:00
Alex MacLean
fb33af08e4 [NVPTX] Remove nvvm.ldg.global.* intrinsics (#112834)
Remove these intrinsics which can be better represented by load
instructions with `!invariant.load` metadata:

- llvm.nvvm.ldg.global.i
- llvm.nvvm.ldg.global.f
- llvm.nvvm.ldg.global.p
2024-10-27 16:14:13 -07:00
davidtrevelyan
4102625380 [rtsan][llvm][NFC] Rename sanitize_realtime_unsafe attr to sanitize_realtime_blocking (#113155)
# What

This PR renames the newly-introduced llvm attribute
`sanitize_realtime_unsafe` to `sanitize_realtime_blocking`. Likewise,
sibling variables such as `SanitizeRealtimeUnsafe` are renamed to
`SanitizeRealtimeBlocking` respectively. There are no other functional
changes.


# Why?

- There are a number of problems that can cause a function to be
real-time "unsafe",
- we wish to communicate what problems rtsan detects and *why* they're
unsafe, and
- a generic "unsafe" attribute is, in our opinion, too broad a net -
which may lead to future implementations that need extra contextual
information passed through them in order to communicate meaningful
reasons to users.
- We want to avoid this situation and make the runtime library boundary
API/ABI as simple as possible, and
- we believe that restricting the scope of attributes to names like
`sanitize_realtime_blocking` is an effective means of doing so.

We also feel that the symmetry between `[[clang::blocking]]` and
`sanitize_realtime_blocking` is easier to follow as a developer.

# Concerns

- I'm aware that the LLVM attribute `sanitize_realtime_unsafe` has been
part of the tree for a few weeks now (introduced here:
https://github.com/llvm/llvm-project/pull/106754). Given that it hasn't
been released in version 20 yet, am I correct in considering this to not
be a breaking change?
2024-10-26 13:06:11 +01:00
Gang Chen
4ac0e7e400 [AMDGPU] Add a type for the named barrier (#113614)
2024-10-25 11:24:47 -07:00
CarolineConcatto
49940514e2 [CLANG][AArch64] Add the modal 8 bit floating-point scalar type (#97277)
ARM ACLE PR#323[1] adds new modal types for 8-bit floating-point
intrinsics.

From PR#323:
```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
```

The type should be an opaque type, and its format is undefined in Clang.
It is defined only in the backend by a status/format register; for
AArch64, the FPMR.

This patch is an attempt to add the mfloat8_t scalar type. It has a
parser and codegen for the new scalar type.

The patch lowers the type to an 8-bit unsigned integer, as it has no
format. But maybe we should add another opaque type.

[1]  https://github.com/ARM-software/acle/pull/323
2024-10-25 13:59:46 +01:00
Sergio Afonso
d87964de78 [OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533)
This patch implements an approach to communicate errors between the
OMPIRBuilder and its users. It introduces `llvm::Error` and
`llvm::Expected` objects to replace the values returned by callbacks
passed to `OMPIRBuilder` codegen functions. These functions then check
the result for errors when callbacks are called and forward them back to
the caller, which has the flexibility to recover, exit cleanly or dump a
stack trace.

This prevents a failed callback from leaving the IR in an invalid state
while the codegen process continues, which would trigger unrelated
assertions or segmentation faults. In the case of MLIR to LLVM IR
translation of the 'omp' dialect, this change results in the compiler
emitting errors and exiting early instead of crashing in
not-yet-implemented cases. The behavior in Clang and openmp-opt stays
unchanged, since callbacks will continue to always return 'success'.
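
For illustration, the forwarding pattern described above can be sketched without LLVM headers. This is only a stand-in under stated assumptions: `MaybeError`, `BodyGenCallback`, and `createRegion` are hypothetical names, and `std::optional<std::string>` replaces the `llvm::Error` / `llvm::Expected` objects the patch actually uses.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Stand-in for an error value: an empty optional means success.
// (Assumption: the real patch uses llvm::Error / llvm::Expected instead.)
using MaybeError = std::optional<std::string>;

// Hypothetical callback type: generates a region body and may fail.
using BodyGenCallback = std::function<MaybeError()>;

// Hypothetical codegen function. It checks the callback result and forwards
// any failure to the caller before touching further state, so a failed
// callback cannot leave later steps running on a half-built region.
inline MaybeError createRegion(const BodyGenCallback &bodyGen, bool &finalized) {
  if (MaybeError err = bodyGen())
    return err; // forward the error; skip finalization
  finalized = true;
  return std::nullopt;
}
```

The caller then decides whether to recover, exit cleanly, or report the message, which mirrors the flexibility the commit message describes.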
2024-10-25 11:30:16 +01:00
Kiran
a96c14eeb8 [Clang] Always forward sret parameters to musttail calls
If a call using the musttail attribute returns its value through an
sret argument pointer, we must forward an incoming sret pointer to it,
instead of creating a new alloca. This is always possible because the
musttail attribute requires the caller and callee to have the same
argument and return types.
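
A minimal sketch of the kind of code this affects, assuming a target ABI that returns the struct indirectly. `Big`, `makeBig`, `forwardBig`, and the `SKETCH_MUSTTAIL` macro are hypothetical names; the macro expands to `[[clang::musttail]]` only when compiling with a Clang that supports it, and to nothing otherwise.

```cpp
#include <cassert>

// Use Clang's statement attribute when available; otherwise fall back to an
// ordinary call so the sketch still compiles with other compilers.
#if defined(__clang__) && defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define SKETCH_MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef SKETCH_MUSTTAIL
#  define SKETCH_MUSTTAIL
#endif

// Large enough that common ABIs return it through a hidden (sret) pointer.
struct Big {
  long values[8];
};

Big makeBig(long seed) {
  Big b{};
  for (int i = 0; i < 8; ++i)
    b.values[i] = seed + i;
  return b;
}

// Same parameter and return types as makeBig, as musttail requires, so the
// incoming return slot can be handed straight to the callee.
Big forwardBig(long seed) {
  SKETCH_MUSTTAIL return makeBig(seed);
}
```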
2024-10-25 09:34:08 +01:00
Jay Foad
4dd55c567a [clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399)
Follow up to #109133.
2024-10-24 10:23:40 +01:00
CarolineConcatto
6dad29aebc [CLANG][AArch64]Add Neon vectors for mfloat8_t (#99865)
This patch adds these new vector sizes for Neon, mfloat8x16_t and
mfloat8x8_t, according to ARM ACLE PR#323[1].

[1] ARM-software/acle#323
2024-10-23 13:23:18 +01:00
Kareem Ergawy
ad70f3e095 [flang][OpenMP] Support target enter|update|exit .. nowait (#113305)
Extends `nowait` support for other device directives. This PR refactors
the task generation utils used for the `target` directive so that they
are general enough to be reused for other device directives as well.
2024-10-23 10:48:54 +02:00
Carl Ritson
076aac59ac [AMDGPU] Add a new target for gfx1153 (#113138)
2024-10-23 12:56:58 +09:00
Congcong Cai
bd6c430dcb [clang codegen] avoid crashing when emitting init func for global variable with flexible array init (#113336)
Fixes: #113187
Avoid creating an init function, since clang does not support global
variables with flexible array init.
Creating one would cause an assertion failure later.
2024-10-23 09:21:27 +08:00
Florian Hahn
4334f317e7 [TBAA] Extend pointer TBAA to pointers of non-builtin types. (#110569)
Extend the logic added in 123c036bd3
(https://github.com/llvm/llvm-project/pull/76612) to support pointers to
non-builtin types by using the mangled name of the canonical type.

PR: https://github.com/llvm/llvm-project/pull/110569
2024-10-22 16:23:34 -07:00
Alex Voicu
2074de252b [clang][HIP] Don't use the OpenCLKernel CC when targeting AMDGCNSPIRV (#110447)
When compiling HIP source for AMDGCN flavoured SPIR-V that is expected
to be consumed by the ROCm HIP RT, it's not desirable to set the OpenCL
Kernel CC on `__global__` functions. On one hand, this is not an OpenCL
RT, so it doesn't compose with e.g. OCL specific attributes. On the
other it is a "noisy" CC that carries semantics, and breaks overload
resolution when using [generic dispatchers such as those used by
RAJA](186d4194a5/src/common/HipDataUtils.hpp (L39)).
2024-10-22 17:16:46 +01:00
Alex Voicu
6e0b0038cd [clang][OpenCL][CodeGen][AMDGPU] Do not use private as the default AS for when generic is available (#112442)
Currently, for AMDGPU, when compiling for OpenCL, we unconditionally use
`private` as the default address space. This is wrong for cases where
the `generic` address space is available, and is corrected via this
patch. In general, this AS map abuse is a bad hack and we should re-work
it altogether, but at least after this patch we will stop being
incorrect for e.g. OpenCL 2.0.
2024-10-22 12:05:48 +01:00
Congcong Cai
c0c36aa018 [clang codegen] fix crash emitting __array_rank (#113186)
Fixed: #113044
The type of `ArrayTypeTraitExpr` can vary, so using i32 directly is
incorrect.
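
As a reminder of what the trait computes, here is a small sketch using the standard-library `std::rank`, which yields the same quantity (the number of array dimensions) as a `std::size_t` rather than a fixed 32-bit integer. The helper name `arrayRank` is hypothetical and is not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: number of array dimensions of T, as std::size_t.
template <typename T>
constexpr std::size_t arrayRank() {
  return std::rank<T>::value;
}

static_assert(arrayRank<int>() == 0, "non-array types have rank 0");
static_assert(arrayRank<int[4]>() == 1, "one dimension");
static_assert(arrayRank<int[2][3]>() == 2, "two dimensions");
```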

---------

Co-authored-by: Eli Friedman <efriedma@quicinc.com>
2024-10-22 17:03:51 +08:00
Stanislav Mekhanoshin
622e398d88 [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (#112447)
We need to support 64-bit data types (the intrinsics already support
them). FP arguments were also being silently converted to integer; this
is fixed as well.
2024-10-21 11:57:18 -07:00
Piyou Chen
c77e836123 [RISCV][FMV] Remove support for negative priority (#112161)
Ensure that target_version and target_clones do not accept negative
numbers for the priority feature.

Based on the discussion in
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85.
2024-10-21 16:10:22 +08:00
NAKAMURA Takumi
4a011ac84f [Coverage] Introduce "partial fold" on BranchRegion (#112694)
Previously, both the True and False counts were folded. This lost the
information of whether the branch was True or False before folding, which
prevented recovering branch counts when merging template instantiations.

In `llvm-cov`, a folded branch is shown as:

- `[True: n, Folded]`
- `[Folded, False: n]`

If `n` is zero, the branch is reported as "uncovered". This is distinct
from a "folded" branch. When folded branches are merged, `Folded` may be
dissolved.

In the coverage map, one of the two `Counter`s is `Zero`. Previously both
were `Zero`.

With "partial fold" introduced, one side of each branch in a `switch` is
omitted as `Folded`.

Each `case:` in a `switch` is reported as `[True: n, Folded]`, since the
`False` count carries no meaningful value.

When `switch` doesn't have `default:`, `switch (Cond)` is reported as
`[Folded, False: n]`, since `True` count was just the sum of `case`(s).
`switch` with `default` can be considered as "the statement that doesn't
have any `False`(s)".
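
To make the `switch` reporting concrete, here is a small hypothetical function annotated with the forms the message above says `llvm-cov` uses. The function `classify` is invented for illustration, and the comments restate the commit message rather than captured tool output.

```cpp
#include <cassert>

// Hypothetical function; comments describe how each branch is said to be
// reported after "partial fold" was introduced.
inline int classify(int cond) {
  switch (cond) { // no `default:`, so reported as [Folded, False: n]
  case 0:         // reported as [True: n, Folded]
    return 10;
  case 1:         // reported as [True: n, Folded]
    return 20;
  }
  return -1; // reached only when no case matches
}
```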
2024-10-20 12:30:35 +09:00
Boaz Brickner
09cc75e2cc [clang] Deduplicate the logic that only warns once when stack is almost full (#112552)
Zero diff in behavior.
2024-10-18 10:11:14 +02:00
Sven van Haastregt
5a09ce9e03 [OpenCL] Replace a CreatePointerCast call; NFC (#112676)
With opaque pointers, the only purpose of the cast here is to cast
between address spaces, similar to the 4-argument case below.
2024-10-18 09:10:05 +02:00
Daniil Kovalev
6bb63002fc [PAC] Fix address discrimination for type info vtable pointers (#102199)
In #99726, `-fptrauth-type-info-vtable-pointer-discrimination` was
introduced, which is intended to enable type and address discrimination
for type_info vtable pointers. However, some codegen logic for actually
enabling address discrimination was missing. This patch addresses the
issue.

Fixes #101716
2024-10-18 08:58:26 +03:00
Helena Kotas
7dbfa7b981 [HLSL] Add handle initialization for simple resource declarations (#111207)
Adds `@_init_resource_bindings()` function to module initialization that
includes `handle.fromBinding` intrinsic calls for simple resource
declarations. Arrays of resources or resources inside user defined types
are not supported yet.

While this unblocks our progress on [Compile a runnable shader from
clang](https://github.com/llvm/wg-hlsl/issues/7) milestone, this is
probably not the way we would like to handle resource binding
initialization going forward. Ideally, it should be done via the
resource class constructors in order to support dynamic resource binding
or unbounded arrays of resources.

Depends on PRs #110327 and #111203.

Part 1 of #105076
2024-10-17 17:59:08 -07:00
Bill Wendling
8c62bf54df [Clang] Disable use of the counted_by attribute for whole struct pointers (#112636)
The whole struct is specified in the __bdos. The calculation of the
whole size of the structure can be done in two ways:

    1) sizeof(struct S) + count * sizeof(typeof(fam))
    2) offsetof(struct S, fam) + count * sizeof(typeof(fam))

The first will add any remaining whitespace (padding) that might exist
after allocation, while the second method is more precise but not quite
what programmers expect. See [1] for a discussion of the topic.
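
The difference between the two formulas can be seen with a small sketch. The struct `S` and the helper names are hypothetical, and the flexible array member `fam[]` relies on a GNU/Clang extension when compiled as C++ (it is standard only in C).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct for illustration. `fam[]` is a flexible array member,
// accepted in C++ only as a compiler extension.
struct S {
  int count;
  char tag;
  char fam[];
};

// Method 1: sizeof(struct S) + count * sizeof(element).
// sizeof(S) includes any tail padding that follows `tag`.
inline std::size_t wholeSizeViaSizeof(std::size_t count) {
  return sizeof(S) + count * sizeof(char);
}

// Method 2: offsetof(struct S, fam) + count * sizeof(element).
// This starts counting exactly where the array begins.
inline std::size_t wholeSizeViaOffsetof(std::size_t count) {
  return offsetof(S, fam) + count * sizeof(char);
}
```

On a typical ABI where `int` is 4 bytes, `fam` starts at offset 5 while `sizeof(S)` is 8, so the first method reports three more bytes than the second for the same `count`.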

GCC isn't (currently) able to calculate __bdos on a pointer to the whole
structure. Therefore, because of the above issue, we'll choose to match
what GCC does for consistency's sake.

[1] https://lore.kernel.org/lkml/ZvV6X5FPBBW7CO1f@archlinux/

Co-authored-by: Eli Friedman <efriedma@quicinc.com>
2024-10-17 21:52:40 +00:00
Matt Arsenault
51b4ada458 clang/AMDGPU: Set noalias.addrspace metadata on atomicrmw (#102462)
2024-10-17 17:10:45 +04:00
NAKAMURA Takumi
5bcc66dc00 VisitIfStmt: Prune a redundant condition.
`S->isConsteval()` is evaluated at the top of this method.
Likely a mis-merge in #75425
2024-10-17 20:04:00 +09:00
Nikita Popov
255a99c29f [APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (#80309)
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.

The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.
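
The condition being asserted can be sketched as two range checks. This is an illustration only: `fitsUnsigned` and `fitsSigned` are hypothetical helpers, not the APInt implementation, and they cover bit widths from 1 to 64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: does `value` fit in an unsigned integer of `bits` bits?
inline bool fitsUnsigned(std::uint64_t value, unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  return bits == 64 || (value >> bits) == 0;
}

// Hypothetical helper: does `value` fit in a signed (two's complement)
// integer of `bits` bits?
inline bool fitsSigned(std::int64_t value, unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  if (bits == 64)
    return true;
  const std::int64_t max = (std::int64_t{1} << (bits - 1)) - 1;
  const std::int64_t min = -max - 1;
  return value >= min && value <= max;
}
```

For example, `-1` stored in a 64-bit unsigned carrier does not fit 32 unsigned bits but does fit 32 signed bits, which is the kind of case where setting the signedness flag correctly matters.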

Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
2024-10-17 08:48:08 +02:00
Steven Perron
2c8ecb3272 [HLSL][SPIRV] Use Spirv target codegen (#112573)
When the arch in the triple is "spirv", the default target codegen is
currently used. We should be using the SPIR-V target codegen. This will
be used to have SPIR-V specific lowering of the HLSL types.
2024-10-16 12:46:45 -04:00
Hiroshi Yamauchi
1de15c15bc Add arrangeCXXMethodCall to the CodeGenABITypes interface. (#111597)
In MSVC, the calling conventions for free functions and C++ instance
methods can differ, so it makes sense to have this variant there.
2024-10-16 09:35:05 -07:00
Simon Pilgrim
cf5e295ec0 Fix MSVC "not all control paths return a value" warning. NFC.
2024-10-16 17:15:47 +01:00
Sven van Haastregt
caa7301bc8 [OpenCL] Restore addrspacecast for pipe builtins (#112514)
Commit 84ee629bc5 ("clang: Remove some pointer bitcasts (#112324)",
2024-10-15) triggered some "Call parameter type does not match function
signature!" errors when using the OpenCL pipe builtin functions under
the spir triple, due to a missing addrspacecast.

This would have been caught by the pipe_builtin.cl test if that had used
the `spir-unknown-unknown` triple, so extend the test to use that
triple too.
2024-10-16 13:58:12 +02:00
Finn Plummer
6d13cc9411 [HLSL] Implement WaveReadLaneAt intrinsic (#111010)
- create a clang built-in in Builtins.td
- add semantic checking in SemaHLSL.cpp
- link the WaveReadLaneAt API in hlsl_intrinsics.h
- add lowering to the SPIR-V backend op GroupNonUniformShuffle
  with Scope = 2 (Group) in SPIRVInstructionSelector.cpp
- add WaveReadLaneAt intrinsic to IntrinsicsDirectX.td and mapping
  to DXIL.td

- add tests for HLSL intrinsic lowering to the SPIR-V intrinsic in
  WaveReadLaneAt.hlsl
- add tests for sema checks in WaveReadLaneAt-errors.hlsl
- add SPIR-V backend tests in WaveReadLaneAt.ll
- add test to show scalar DXIL lowering functionality

- note that this doesn't include support for the scalarizer to
  handle WaveReadLaneAt; that will be added in a future PR

This is the first part of #70104
2024-10-15 18:49:40 -07:00
Helena Kotas
3b4512074e [HLSL] Make HLSLAttributedResourceType canonical and add code paths to convert HLSL types to DirectX target types (#110327)
Translates `RWBuffer` and `StructuredBuffer` resource buffer types to
DirectX target types `dx.TypedBuffer` and `dx.RawBuffer`.

Includes a change of `HLSLAttributedResourceType` from a 'sugar' type to
a full canonical type. This is required for codegen and other clang
infrastructure to work properly on HLSL resource types.

Fixes #95952 (part 2/2)
2024-10-15 13:38:15 -07:00
Matt Arsenault
84ee629bc5 clang: Remove some pointer bitcasts (#112324)
Obsolete since opaque pointers.
2024-10-15 22:46:24 +04:00
Mariya Podchishchaeva
b528b131b6 [clang] Fix crash related to _BitInt constant split (#112218)
9ad72df55c added split of _BitInt
constants when required. Before folding back, check that the constant
exists.
2024-10-15 09:44:20 +02:00
yabinc
627746581b Reapply "[clang][CodeGen] Zero init unspecified fields in initializers in C" (#109898) (#110051)
This reverts commit d50eaac12f. Also fixes
a bug calculating offsets for bit fields in the original patch.
2024-10-14 16:32:24 -07:00
Artem Belevich
30a06e8022 [CUDA] Add support for CUDA-12.6 and sm_100 (#112028)
This is a copy of #97402 (with minor updates), which is now ready to land.

---------

Co-authored-by: Sergey Kozub <skozub@nvidia.com>
2024-10-14 11:51:05 -07:00