clang-p2996

Author	SHA1	Message	Date
Phoebe Wang	c72a751dab	[X86][AMX] Support AMX-TRANSPOSE (#113532 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368	2024-11-01 16:45:03 +08:00
Craig Topper	cd8d507b07	[RISCV] Pull __builtin_riscv_clz/ctz out of a nested switch. NFC The nested switch exists to share setting IntrinsicsTypes to {ResultType}. clz/ctz return before we reach that so they can just be in the top level switch.	2024-10-31 11:01:58 -07:00
Simon Pilgrim	fcaa8c6e22	Fix MSVC "signed/unsigned mismatch" warning. NFC.	2024-10-31 11:50:19 +00:00
Stanislav Mekhanoshin	ba1a09da8d	[AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (#113610 ) The same handling as for __builtin_amdgcn_mov_dpp.	2024-10-31 02:19:20 -07:00
Paul Kirth	b01e2a8b56	[llvm] Allow always dropping all llvm.type.test sequences Currently, the `DropTypeTests` parameter only fully works with phi nodes and llvm.assume instructions. However, we'd like CFI to work in conjunction with FatLTO, in so far as the bitcode section should be able to contain the CFI instrumentation, while any incompatible bits are dropped when compiling the object code. To do that, we need to drop the llvm.type.test instructions everywhere, and not just their uses in phi nodes. This patch updates the LowerTypeTest pass so that uses are removed, and replaced with `true` in all cases, and not just in phi nodes. Addressing this will allow us to fix #112053 by modifying the FatLTO pipeline. Reviewers: pcc, nikic Reviewed By: pcc Pull Request: https://github.com/llvm/llvm-project/pull/112787	2024-10-30 16:56:30 -07:00
Helena Kotas	74d8f3952c	[HLSL] Remove old resource annotations for UAVs and SRVs (#114139 ) UAVs and SRVs have already been converted to use LLVM target types and we can disable generating of the !hlsl.uavs and !hlsl.srvs! annotations. This will enable adding tests for structured buffers with user defined types that this old resource annotations code does not handle (it crashes). Part 1 of #114126	2024-10-30 14:06:42 -07:00
Jay Foad	463a4c16ea	[clang] Remove some uses of llvm::StructType::setBody. NFC. (#113691 ) It is simple to create the struct body up front, now that we have transitioned to opaque pointers.	2024-10-30 16:53:08 +00:00
Chuanqi Xu	259eaa6878	[C++20] [Modules] Fix the duplicated static initializer problem (#114193 ) Reproducer: ``` //--- a.cppm export module a; int func(); static int a = func(); //--- a.cpp import a; ``` The `func()` should only execute once. However, before this patch we will somehow import `static int a` from a.cppm incorrectly and initialize that again. This is super bad and can introduce serious runtime behaviors. And also surprisingly, it looks like the root cause of the problem is simply some oversight choosing APIs.	2024-10-30 17:27:04 +08:00
Jesse Huang	335e68d8bc	[Clang][RISCV] Support -fcf-protection=return for RISC-V (#112477 ) Enables the support of `-fcf-protection=return` on RISC-V, which requires Zicfiss. It also adds a string attribute "hw-shadow-stack" to every function if the option is set on RISC-V	2024-10-29 15:47:49 +08:00
joaosaffran	481bce018e	Adding splitdouble HLSL function (#109331 ) - Adding hlsl `splitdouble` intrinsics - Adding DXIL lowering - Adding SPIRV lowering - Adding test Fixes: #108901 --------- Co-authored-by: Joao Saffran <jderezende@microsoft.com>	2024-10-28 13:26:59 -07:00
Steven Perron	98e3075df9	[HLSL][SPIRV] Add convergence tokens to entry point wrapper (#112757 ) Inlining currently assumes that either all functions use controlled convergence or none of them do. This is why we need to have the entry point wrapper use controlled convergence. `c85611e858/llvm/lib/Transforms/Utils/InlineFunction.cpp (L2431-L2439)`	2024-10-28 13:25:04 -04:00
Aaron Ballman	af7c58b7ea	Remove support for RenderScript (#112916 ) See https://discourse.llvm.org/t/rfc-deprecate-and-eventually-remove-renderscript-support/81284 for the RFC	2024-10-28 12:48:42 -04:00
Momchil Velikov	53f7f8ecca	[Clang][AArch64] Fix Pure Scalables Types argument passing and return (#112747 ) Pure Scalable Types are defined in AAPCS64 here: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#pure-scalable-types-psts And should be passed according to Rule C.7 here: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#682parameter-passing-rules This part of the ABI is completely unimplemented in Clang; instead it treats PSTs sometimes as HFAs/HVAs, sometimes as general composite types. This patch implements the rules for passing PSTs by employing the `CoerceAndExpand` method and extending it to: * allow array types in the `coerceToType`; Now only `[N x i8]` are considered padding. * allow mismatch between the elements of the `coerceToType` and the elements of the `unpaddedCoerceToType`; AArch64 uses this to map fixed-length vector types to SVE vector types. Correctly passing a PST argument needs a decision in Clang about whether to pass it in memory or registers or, equivalently, whether to use the `Indirect` or `Expand/CoerceAndExpand` method. It was considered relatively harder (or not practically possible) to make that decision in the AArch64 backend. Hence this patch implements the register counting from AAPCS64 (cf. `NSRN`, `NPRN`) to guide Clang's decision.	2024-10-28 15:43:14 +00:00
Simon Pilgrim	d6d4569dd9	Fix MSVC "signed/unsigned mismatch" warnings. NFC.	2024-10-28 11:45:36 +00:00
Alex MacLean	fb33af08e4	[NVPTX] Remove nvvm.ldg.global.* intrinsics (#112834 ) Remove these intrinsics which can be better represented by load instructions with `!invariant.load` metadata: - llvm.nvvm.ldg.global.i - llvm.nvvm.ldg.global.f - llvm.nvvm.ldg.global.p	2024-10-27 16:14:13 -07:00
davidtrevelyan	4102625380	[rtsan][llvm][NFC] Rename sanitize_realtime_unsafe attr to sanitize_realtime_blocking (#113155 ) # What This PR renames the newly-introduced llvm attribute `sanitize_realtime_unsafe` to `sanitize_realtime_blocking`. Likewise, sibling variables such as `SanitizeRealtimeUnsafe` are renamed to `SanitizeRealtimeBlocking` respectively. There are no other functional changes. # Why? - There are a number of problems that can cause a function to be real-time "unsafe", - we wish to communicate what problems rtsan detects and why they're unsafe, and - a generic "unsafe" attribute is, in our opinion, too broad a net - which may lead to future implementations that need extra contextual information passed through them in order to communicate meaningful reasons to users. - We want to avoid this situation and make the runtime library boundary API/ABI as simple as possible, and - we believe that restricting the scope of attributes to names like `sanitize_realtime_blocking` is an effective means of doing so. We also feel that the symmetry between `[[clang::blocking]]` and `sanitize_realtime_blocking` is easier to follow as a developer. # Concerns - I'm aware that the LLVM attribute `sanitize_realtime_unsafe` has been part of the tree for a few weeks now (introduced here: https://github.com/llvm/llvm-project/pull/106754). Given that it hasn't been released in version 20 yet, am I correct in considering this to not be a breaking change?	2024-10-26 13:06:11 +01:00
Gang Chen	4ac0e7e400	[AMDGPU] Add a type for the named barrier (#113614 )	2024-10-25 11:24:47 -07:00
CarolineConcatto	49940514e2	[CLANG][AArch64] Add the modal 8 bit floating-point scalar type (#97277 ) ARM ACLE PR#323[1] adds new modal types for 8-bit floating point intrinsic. From the PR#323: ``` ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3 8-bit floating-point formats. It is a storage and interchange only type with no arithmetic operations other than intrinsic calls. ``` The type should be an opaque type and its format is undefined in Clang. It is only defined in the backend by a status/format register, for AArch64 the FPMR. This patch is an attempt to add the mfloat8_t scalar type. It has a parser and codegen for the new scalar type. The patch lowers it to an 8-bit unsigned integer as it has no format. But maybe we should add another opaque type. [1] https://github.com/ARM-software/acle/pull/323	2024-10-25 13:59:46 +01:00
Sergio Afonso	d87964de78	[OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533 ) This patch implements an approach to communicate errors between the OMPIRBuilder and its users. It introduces `llvm::Error` and `llvm::Expected` objects to replace the values returned by callbacks passed to `OMPIRBuilder` codegen functions. These functions then check the result for errors when callbacks are called and forward them back to the caller, which has the flexibility to recover, exit cleanly or dump a stack trace. This prevents a failed callback to leave the IR in an invalid state and still continue the codegen process, triggering unrelated assertions or segmentation faults. In the case of MLIR to LLVM IR translation of the 'omp' dialect, this change results in the compiler emitting errors and exiting early instead of triggering a crash for not-yet-implemented errors. The behavior in Clang and openmp-opt stays unchanged, since callbacks will continue always returning 'success'.	2024-10-25 11:30:16 +01:00
Kiran	a96c14eeb8	[Clang] Always forward sret parameters to musttail calls If a call using the musttail attribute returns its value through an sret argument pointer, we must forward an incoming sret pointer to it, instead of creating a new alloca. This is always possible because the musttail attribute requires the caller and callee to have the same argument and return types.	2024-10-25 09:34:08 +01:00
Jay Foad	4dd55c567a	[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399 ) Follow up to #109133.	2024-10-24 10:23:40 +01:00
CarolineConcatto	6dad29aebc	[CLANG][AArch64]Add Neon vectors for mfloat8_t (#99865 ) This patch adds these new vector sizes for neon: mfloat8x16_t and mfloat8x8_t According to the ARM ACLE PR#323[1]. [1] ARM-software/acle#323	2024-10-23 13:23:18 +01:00
Kareem Ergawy	ad70f3e095	[flang][OpenMP] Support `target enter|update|exit .. nowait` (#113305 ) Extends `nowait` support for other device directives. This PR refactors the task generation utils used for the `target` directive so that they are general enough to be reused for other device directives as well.	2024-10-23 10:48:54 +02:00
Carl Ritson	076aac59ac	[AMDGPU] Add a new target for gfx1153 (#113138 )	2024-10-23 12:56:58 +09:00
Congcong Cai	bd6c430dcb	[clang codegen] avoid to crash when emit init func for global variable with flexible array init (#113336 ) Fixes: #113187 Avoid to create init function since clang does not support global variable with flexible array init. It will cause assertion failure later.	2024-10-23 09:21:27 +08:00
Florian Hahn	4334f317e7	[TBAA] Extend pointer TBAA to pointers of non-builtin types. (#110569 ) Extend the logic added in `123c036bd3` (https://github.com/llvm/llvm-project/pull/76612) to support pointers to non-builtin types by using the mangled name of the canonical type. PR: https://github.com/llvm/llvm-project/pull/110569	2024-10-22 16:23:34 -07:00
Alex Voicu	2074de252b	[clang][HIP] Don't use the OpenCLKernel CC when targeting AMDGCNSPIRV (#110447 ) When compiling HIP source for AMDGCN flavoured SPIR-V that is expected to be consumed by the ROCm HIP RT, it's not desirable to set the OpenCL Kernel CC on `__global__` functions. On one hand, this is not an OpenCL RT, so it doesn't compose with e.g. OCL specific attributes. On the other it is a "noisy" CC that carries semantics, and breaks overload resolution when using [generic dispatchers such as those used by RAJA](`186d4194a5/src/common/HipDataUtils.hpp (L39)`).	2024-10-22 17:16:46 +01:00
Alex Voicu	6e0b0038cd	[clang][OpenCL][CodeGen][AMDGPU] Do not use `private` as the default AS for when `generic` is available (#112442 ) Currently, for AMDGPU, when compiling for OpenCL, we unconditionally use `private` as the default address space. This is wrong for cases where the `generic` address space is available, and is corrected via this patch. In general, this AS map abuse is a bad hack and we should re-work it altogether, but at least after this patch we will stop being incorrect for e.g. OpenCL 2.0.	2024-10-22 12:05:48 +01:00
Congcong Cai	c0c36aa018	[clang codegen] fix crash emitting __array_rank (#113186 ) Fixed: #113044 the type of `ArrayTypeTraitExpr` can be changed, use i32 directly is incorrect. --------- Co-authored-by: Eli Friedman <efriedma@quicinc.com>	2024-10-22 17:03:51 +08:00
Stanislav Mekhanoshin	622e398d88	[AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (#112447 ) We need to support 64-bit data types (intrinsics do support it). We are also silently converting FP to integer argument now, also fixed.	2024-10-21 11:57:18 -07:00
Piyou Chen	c77e836123	[RISCV][FMV] Remove support for negative priority (#112161 ) Ensure that target_version and target_clones do not accept negative numbers for the priority feature. Based on the discussion at https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85.	2024-10-21 16:10:22 +08:00
NAKAMURA Takumi	4a011ac84f	[Coverage] Introduce "partial fold" on BranchRegion (#112694 ) Currently both True/False counts were folded. It lost the information, "It is True or False before folding." It prevented recalling branch counts in merging template instantiations. In `llvm-cov`, a folded branch is shown as: - `[True: n, Folded]` - `[Folded, False: n]` In the case where `n` is zero, a branch is reported as "uncovered". This is distinguished from a "folded" branch. When folded branches are merged, `Folded` may be dissolved. In the coverage map, either `Counter` is `Zero`. Currently both were `Zero`. Since "partial fold" has been introduced, either case in `switch` is omitted as `Folded`. Each `case:` in `switch` is reported as `[True: n, Folded]`, since the `False` count doesn't show a meaningful value. When `switch` doesn't have `default:`, `switch (Cond)` is reported as `[Folded, False: n]`, since the `True` count was just the sum of `case`(s). `switch` with `default` can be considered as "the statement that doesn't have any `False`(s)".	2024-10-20 12:30:35 +09:00
Boaz Brickner	09cc75e2cc	[clang] Deduplicate the logic that only warns once when stack is almost full (#112552 ) Zero diff in behavior.	2024-10-18 10:11:14 +02:00
Sven van Haastregt	5a09ce9e03	[OpenCL] Replace a CreatePointerCast call; NFC (#112676 ) With opaque pointers, the only purpose of the cast here is to cast between address spaces, similar to the 4-argument case below.	2024-10-18 09:10:05 +02:00
Daniil Kovalev	6bb63002fc	[PAC] Fix address discrimination for type info vtable pointers (#102199 ) In #99726, `-fptrauth-type-info-vtable-pointer-discrimination` was introduced, which is intended to enable type and address discrimination for type_info vtable pointers. However, some codegen logic for actually enabling address discrimination was missing. This patch addresses the issue. Fixes #101716	2024-10-18 08:58:26 +03:00
Helena Kotas	7dbfa7b981	[HLSL] Add handle initialization for simple resource declarations (#111207 ) Adds `@_init_resource_bindings()` function to module initialization that includes `handle.fromBinding` intrinsic calls for simple resource declarations. Arrays of resources or resources inside user defined types are not supported yet. While this unblocks our progress on [Compile a runnable shader from clang](https://github.com/llvm/wg-hlsl/issues/7) milestone, this is probably not the way we would like to handle resource binding initialization going forward. Ideally, it should be done via the resource class constructors in order to support dynamic resource binding or unbounded arrays of resources. Depends on PRs #110327 and #111203. Part 1 of #105076	2024-10-17 17:59:08 -07:00
Bill Wendling	8c62bf54df	[Clang] Disable use of the counted_by attribute for whole struct pointers (#112636 ) The whole struct is specified in the __bdos. The calculation of the whole size of the structure can be done in two ways: 1) sizeof(struct S) + count * sizeof(typeof(fam)) 2) offsetof(struct S, fam) + count * sizeof(typeof(fam)) The first will add any remaining whitespace that might exist after allocation while the second method is more precise, but not quite expected from programmers. See [1] for a discussion of the topic. GCC isn't (currently) able to calculate __bdos on a pointer to the whole structure. Therefore, because of the above issue, we'll choose to match what GCC does for consistency's sake. [1] https://lore.kernel.org/lkml/ZvV6X5FPBBW7CO1f@archlinux/ Co-authored-by: Eli Friedman <efriedma@quicinc.com>	2024-10-17 21:52:40 +00:00
Matt Arsenault	51b4ada458	clang/AMDGPU: Set noalias.addrspace metadata on atomicrmw (#102462 )	2024-10-17 17:10:45 +04:00
NAKAMURA Takumi	5bcc66dc00	VisitIfStmt: Prune a redundant condition. `S->isConsteval()` is evaluated at the top of this method. Likely mis-merging in #75425	2024-10-17 20:04:00 +09:00
Nikita Popov	255a99c29f	[APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (#80309 ) This fixes all the places that hit the new assertion added in https://github.com/llvm/llvm-project/pull/106524 in tests. That is, cases where the value passed to the APInt constructor is not an N-bit signed/unsigned integer, where N is the bit width and signedness is determined by the isSigned flag. The fixes either set the correct value for isSigned, set the implicitTrunc flag, or perform more calculations inside APInt. Note that the assertion is currently still disabled by default, so this patch is mostly NFC.	2024-10-17 08:48:08 +02:00
Steven Perron	2c8ecb3272	[HLSL][SPIRV] Use Spirv target codegen (#112573 ) When the arch in the triple is "spirv", the default target codegen is currently used. We should be using the spir-v target codegen. This will be used to have SPIR-V specific lowering of the HLSL types.	2024-10-16 12:46:45 -04:00
Hiroshi Yamauchi	1de15c15bc	Add arrangeCXXMethodCall to the CodeGenABITypes interface. (#111597 ) In MSVC, the calling conventions for free functions and C++ instance methods could be different, it makes sense to have this variant there.	2024-10-16 09:35:05 -07:00
Simon Pilgrim	cf5e295ec0	Fix MSVC "not all control paths return a value" warning. NFC.	2024-10-16 17:15:47 +01:00
Sven van Haastregt	caa7301bc8	[OpenCL] Restore addrspacecast for pipe builtins (#112514 ) Commit `84ee629bc5` ("clang: Remove some pointer bitcasts (#112324)", 2024-10-15) triggered some "Call parameter type does not match function signature!" errors when using the OpenCL pipe builtin functions under the spir triple, due to a missing addrspacecast. This would have been caught by the pipe_builtin.cl test if that had used the `spir-unknown-unknown` triple, so extend the test to use that triple too.	2024-10-16 13:58:12 +02:00
Finn Plummer	6d13cc9411	[HLSL] Implement `WaveReadLaneAt` intrinsic (#111010 ) - create a clang built-in in Builtins.td - add semantic checking in SemaHLSL.cpp - link the WaveReadLaneAt api in hlsl_intrinsics.h - add lowering to spirv backend op GroupNonUniformShuffle with Scope = 2 (Group) in SPIRVInstructionSelector.cpp - add WaveReadLaneAt intrinsic to IntrinsicsDirectX.td and mapping to DXIL.td - add tests for HLSL intrinsic lowering to spirv intrinsic in WaveReadLaneAt.hlsl - add tests for sema checks in WaveReadLaneAt-errors.hlsl - add spir-v backend tests in WaveReadLaneAt.ll - add test to show scalar dxil lowering functionality - note that this doesn't include support for the scalarizer to handle WaveReadLaneAt will be added in a future pr This is the first part #70104	2024-10-15 18:49:40 -07:00
Helena Kotas	3b4512074e	[HLSL] Make HLSLAttributedResourceType canonical and add code paths to convert HLSL types to DirectX target types (#110327 ) Translates `RWBuffer` and `StructuredBuffer` resources buffer types to DirectX target types `dx.TypedBuffer` and `dx.RawBuffer`. Includes a change of `HLSLAttributesResourceType` from 'sugar' type to full canonical type. This is required for codegen and other clang infrastructure to work properly on HLSL resource types. Fixes #95952 (part 2/2)	2024-10-15 13:38:15 -07:00
Matt Arsenault	84ee629bc5	clang: Remove some pointer bitcasts (#112324 ) Obsolete since opaque pointers.	2024-10-15 22:46:24 +04:00
Mariya Podchishchaeva	b528b131b6	[clang] Fix crash related to _BitInt constant split (#112218 ) `9ad72df55c` added split of _BitInt constants when required. Before folding back, check that the constant exists.	2024-10-15 09:44:20 +02:00
yabinc	627746581b	Reapply "[clang][CodeGen] Zero init unspecified fields in initializers in C" (#109898 ) (#110051 ) This reverts commit `d50eaac12f`. Also fixes a bug calculating offsets for bit fields in the original patch.	2024-10-14 16:32:24 -07:00
Artem Belevich	30a06e8022	[CUDA] Add support for CUDA-12.6 and sm_100 (#112028 ) This is a copy of #97402 (with minor updates), which is now ready to land. --------- Co-authored-by: Sergey Kozub <skozub@nvidia.com>	2024-10-14 11:51:05 -07:00

17398 Commits