Follow-up to #136022: this ensures formatting tests are run with an empty
`.clang-format-ignore` in their root directory, to prevent failures if
the file also exists higher in the tree.
The MSVC STL includes specializations of `_Is_memfunptr` for every
function pointer type, including every calling convention.
The problem is the AMDGPU target doesn't support the x86 `vectorcall`
calling convention so clang sets it to the default CC. This ends up
clashing with the already-existing overload for the default CC, so we
get a duplicate definition error when including `type_traits` (which we
heavily use in the SYCL STL) and compiling for AMDGPU on Windows.
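To illustrate the clash, here is a minimal sketch (illustrative only, not the actual MSVC STL code; the names are made up):
```
// One partial specialization per calling convention, mirroring the pattern
// the MSVC STL uses for _Is_memfunptr.
template <class T>
struct is_memfunptr_sketch { static constexpr bool value = false; };

template <class R, class C, class... A>
struct is_memfunptr_sketch<R (C::*)(A...)> { static constexpr bool value = true; };

// On x64 Windows this is a distinct type and everything is fine. If the
// target does not support __vectorcall and the compiler silently rewrites it
// to the default calling convention, this partial specialization becomes
// identical to the one above and we get a redefinition error.
template <class R, class C, class... A>
struct is_memfunptr_sketch<R (__vectorcall C::*)(A...)> { static constexpr bool value = true; };
```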
This doesn't happen for pure AMDGPU non-SYCL because it doesn't include
the C++ STL, and it doesn't happen for CUDA/HIP because a similar
workaround was done
[here](fa49c3a888).
I am not an expert in Sema, so this is a somewhat hardcoded fix; please
let me know if there is a better way to do it.
As far as I can tell we can't do exactly the same fix that was done for
CUDA because we can't differentiate between device and host code so
easily.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
This change removes the uint64_t constructor on LocationSize, preventing
implicit conversion, and fixes up the APIs that use it to adapt to
the change. Note that I'm adding a couple of explicit conversion points
on routines where passing in a fixed offset as an integer seems likely
to have well understood semantics.
We had an unfortunate case which arose if you tried to pass a TypeSize
value to a parameter of LocationSize type. We'd find the implicit
conversion path through TypeSize -> uint64_t -> LocationSize which works
just fine for fixed values, but loses information and fails assertions
if the TypeSize was scalable. This change breaks the first link in that
implicit conversion chain since that seemed to be the easier one.
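For illustration, a simplified sketch of the lossy step with stand-in classes (not the real LLVM types):
```
#include <cstdint>

// Stand-ins only, to show where information is lost.
struct TypeSizeLike {
  uint64_t MinValue;
  bool Scalable;
  // Converting to a plain integer drops the 'Scalable' bit.
  operator uint64_t() const { return MinValue; }
};

struct LocationSizeLike {
  uint64_t Raw;
  // With the uint64_t constructor removed (modelled here as 'explicit'),
  // a raw integer can no longer silently become a LocationSize.
  explicit LocationSizeLike(uint64_t R) : Raw(R) {}
};

void takesLocationSize(LocationSizeLike) {}

void caller(TypeSizeLike TS) {
  uint64_t Fixed = TS;  // the scalable flag is already gone here
  // The remaining conversion must now be written out, making such lossy
  // points easy to spot and audit during review.
  takesLocationSize(LocationSizeLike(Fixed));
}
```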
Repack `amdgpu.swizzle_bitmode` arguments and lower it to
`rocdl.ds_swizzle`.
The repacking logic is as follows:
* `sizeof(arg) < sizeof(i32)`: bitcast to an integer, zext to i32, then trunc and bitcast back.
* `sizeof(arg) == sizeof(i32)`: just bitcast to i32 and back if not already i32.
* `sizeof(arg) > sizeof(i32)`: bitcast to `vector<Nxi32>`, extract the individual elements, apply a series of `rocdl.ds_swizzle`, then recompose the vector and bitcast back.
Added repacking logic to LLVM utils so it can be used elsewhere. I'm
planning to use it for `gpu.shuffle` later.
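For reference, a plain C++ sketch of the packing scheme (conceptual only, not the MLIR lowering code; `swizzleLane` stands in for a single `rocdl.ds_swizzle`):
```
#include <cstdint>
#include <cstring>

// Stand-in for one per-lane rocdl.ds_swizzle; the identity is used here.
static uint32_t swizzleLane(uint32_t Lane) { return Lane; }

// Wide case: "bitcast" to 2 x i32, swizzle each lane, recompose the value.
static double swizzleF64(double Value) {
  uint32_t Lanes[2];
  std::memcpy(Lanes, &Value, sizeof(Value));
  for (uint32_t &Lane : Lanes)
    Lane = swizzleLane(Lane);
  double Result;
  std::memcpy(&Result, Lanes, sizeof(Result));
  return Result;
}

// Narrow case: zero-extend the raw bits to i32, swizzle, truncate back.
static uint16_t swizzleF16Bits(uint16_t Bits) {
  return static_cast<uint16_t>(swizzleLane(Bits));
}
```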
`FunctionOpInterface` assumed that the function type (an attribute of the
operation) can be cloned with arbitrary lists of function arguments and
results in order to support argument and result list mutation. This is not
always correct; in particular, LLVM dialect functions require exactly one
result, making it impossible to erase the result.
Allow function type cloning to fail and propagate this failure through
various APIs that use it. The common assumption is that, when cloning
fails, existing IR has not been modified.
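As a rough model of the new contract (stand-in types, not the actual interface methods):
```
#include <optional>
#include <vector>

// Stand-in for a function type: lists of argument and result "types".
struct FunctionTypeModel {
  std::vector<int> inputs, results;
};

// Cloning may now refuse the requested signature instead of asserting,
// e.g. for an LLVM-dialect-like function that requires exactly one result.
std::optional<FunctionTypeModel> cloneWith(std::vector<int> inputs,
                                           std::vector<int> results) {
  if (results.size() != 1)
    return std::nullopt;  // failure propagates to the caller, IR untouched
  return FunctionTypeModel{std::move(inputs), std::move(results)};
}
```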
Fixes #131142.
This also demonstrates a bug that's a consequence of the two different
paths for the single-threaded and multithreaded cases. The parallel path goes
through bitcode serialization and does preserve the uselistorder. It
therefore survives and we can observe a reduced uselistorder with deleted
instructions. In the CloneModule case, nothing is reduced.
This patch makes the __config_site header modular, which solves various
problems with non-modular headers. This requires going back to
generating the modulemap file, since we only know how to make
__config_site modular when we're not using the per-target runtime dir.
The patch also adds a test that we support
-Wnon-modular-include-in-module, which warns about non-modular includes
from modules.
---------
Co-authored-by: Konstantin Varlamov <varconst@apple.com>
This commit is in preparation of the One-Shot Dialect Conversion
refactoring, which removes the rollback from the dialect conversion
framework.
`GenericAtomicRMWOpLowering` (`generic_atomic_rmw`) triggered a rollback
in two test cases. The lowering pattern adds additional basic blocks to
the enclosing operation, which used to be a `func.func` (now
`llvm.func`). Adding a basic block triggers legalization of the op that
owns the basic block. This fails when running
`--convert-to-llvm="filter-dialects=memref"` because no lowering
patterns for the `func` dialect were populated and only `llvm` ops are
considered "legal" by the `convert-to-llvm` pass, causing a rollback of
the entire `GenericAtomicRMWOpLowering` pattern.
Also add extra `CHECK-INTERFACE` to make sure that all test cases are
correctly lowered with `--convert-to-llvm="filter-dialects=memref"`.
Summary:
Currently, we build a single `libomptarget.devicertl.a` which is a
fatbinary. It is a host object file that contains the embedded archive
files for both the NVIDIA and AMDGPU targets. This was done primarily as
a convenience due to naming conflicts. Now that the clang driver for the
GPU targets can link appropriately via the per-target runtime dir, we can
just make two separate static libraries and remove the indirection.
This patch creates two new static libraries that get installed into
```
/lib/amdgcn-amd-amdhsa/libomp.a
/lib/nvptx64-nvidia-cuda/libomp.a
```
for AMDGPU and NVPTX respectively. The link job created by the linker
wrapper now simply needs to do `-lomp` and it will search those
directories and link those static libraries. This requires far less
special handling.
This patch is a precursor to changing the build system entirely to be a
runtimes based one. Soon this target will be a standard `add_library`
and done through the GPU runtime targets.
NOTE that this actually does remove an additional optimization step.
Previously we merged all of the files into a single bitcode object and
forcibly internalized some definitions. This, instead, just treats them
like a normal static library. This may possibly affect performance for
some files, but I think it's better overall to use static library
semantics because it allows us to have an 'include-what-you-use'
relationship with the library.
Performance testing will be required. If we really need the merged blob
then we can simply pack that into a new static library.
getMergedLocation uses a common parent scope of the two input locations
as the scope of the output location.
It doesn't consider the case when the common parent scope is from a file
other than L1's and L2's files. In that case, it produces a merged location
with an erroneous scope (https://github.com/llvm/llvm-project/issues/122846).
In some cases, such as https://github.com/llvm/llvm-project/pull/125780#issuecomment-2651657856,
L1 and L2 having a common parent scope from another file indicates that
the code at L1 and L2 is included from the same source location.
With this commit, getMergedLocation detects whether the files of L1, L2, or their
common parent scope differ. If so, it assumes that L1 and L2 were included
from some source location, and tries to attach the output location to a scope
with the nearest common source location with regard to L1 and L2.
If the nearest common location is also from another file, getMergedLocation returns it
as the merged location, assuming that L1 and L2 belong to files that were both included
at that location.
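A toy model of the "nearest common source location" search (illustrative only; the real implementation works on DILocation/DIScope chains):
```
#include <string>
#include <vector>

struct Loc {
  std::string File;
  int Line;
  const Loc *IncludedAt = nullptr;  // where this location was included from
};

// Chain of "included at" locations, innermost first.
static std::vector<const Loc *> inclusionChain(const Loc *L) {
  std::vector<const Loc *> Chain;
  for (; L; L = L->IncludedAt)
    Chain.push_back(L);
  return Chain;
}

// Nearest location from which both L1 and L2 were (transitively) included,
// or nullptr if there is none.
static const Loc *nearestCommonInclusion(const Loc *L1, const Loc *L2) {
  for (const Loc *A : inclusionChain(L1))
    for (const Loc *B : inclusionChain(L2))
      if (A == B)
        return A;
  return nullptr;
}
```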
Fixes https://github.com/llvm/llvm-project/issues/122846.
It needs to be `TEST_F` to access `received_entries`.
Disabling also works based on the test name, not the fixture name.
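A minimal gtest sketch of the difference (hypothetical fixture and test names):
```
#include "gtest/gtest.h"
#include <vector>

class TelemetryTestFixture : public ::testing::Test {
protected:
  std::vector<int> received_entries;  // only reachable from TEST_F bodies
};

// TEST_F runs the body as a method on a class derived from the fixture, so
// received_entries is visible. A plain TEST(...) would fail with
// "use of undeclared identifier".
TEST_F(TelemetryTestFixture, CollectsEntries) {
  received_entries.push_back(1);
  ASSERT_EQ(1U, received_entries.size());
}

// To disable just this test, prefix the test name:
// TEST_F(TelemetryTestFixture, DISABLED_CollectsEntries) { ... }
```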
Build failure:
```
lldb/unittests/Core/TelemetryTest.cpp:110:17: error: use of undeclared identifier 'received_entries'
110 | ASSERT_EQ(1U, received_entries.size());
| ^
lldb/unittests/Core/TelemetryTest.cpp:112:61: error: use of undeclared identifier 'received_entries'
112 | llvm::dyn_cast<lldb_private::FakeTelemetryInfo>(received_entries[0])
| ^
```
Fixes: 159b872b37
- Move the code generated by DecoderEmitter to anonymous namespace.
- Move AMDGPU's usage of this code from header file to .cpp file.
Note, we get build errors like "call to function 'decodeInstruction'
that is neither visible in the template definition nor found by
argument-dependent lookup" if we do not change AMDGPU.
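A single-file sketch of the failure mode (hypothetical names; this snippet is expected to be rejected, which is the point):
```
struct Insn { int bits; };

// Template defined before decodeInstruction is visible, as when the caller
// lives in a header.
template <typename T>
int tryDecode(T insn) {
  return decodeInstruction(insn);  // dependent, unqualified call
}

namespace {
// The anonymous namespace is not an associated namespace of Insn, so ADL at
// the point of instantiation cannot find this either.
int decodeInstruction(Insn insn) { return insn.bits; }
} // namespace

int use() { return tryDecode(Insn{42}); }
// clang: call to function 'decodeInstruction' that is neither visible in the
// template definition nor found by argument-dependent lookup

// Moving the call site into the .cpp below the anonymous namespace (as done
// for AMDGPU here) makes the name visible where the template is defined.
```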
As reported in issue #103477, visibility of instantiated member
functions used to be ignored when calculating visibility of a
specialization.
This patch modifies `getLVForClassMember` to look up the source template
for an instantiated member, and changes `mergeTemplateLV` to apply it.
A similar issue was reported in #31462, but it seems that an `extern`
declaration with visibility prevents the function from being emitted as
hidden. This behavior seems correct, even though GCC emits it with
default visibility instead.
Both tests from #103477 and #31462 are added as LIT tests `test72` and
`test73` respectively.
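A rough reduction of the scenario (hypothetical, not the exact test72/test73 sources):
```
template <typename T>
struct __attribute__((visibility("default"))) Widget {
  __attribute__((visibility("hidden"))) void impl() {}
};

// With this patch, the visibility computed for the instantiated member
// Widget<int>::impl takes the attribute on the member of the source template
// into account instead of ignoring it.
template struct Widget<int>;
```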
As the title says, we need to call `Timer::clear` to avoid extra log output like this:
```
===-------------------------------------------------------------------------===
...
===-------------------------------------------------------------------------===
Total Execution Time: 0.0000 seconds (0.0000 wall clock)
---Wall Time--- --- Name ---
----- ....
----- Total
```
In current `FlattenCFG`, using `isNoAlias` for two instructions is
imprecise. For example, when passing a store instruction and a load
instruction directly into `AA->isNoAlias`, it will always return
`NoAlias`. This happens because when checking the types of the two
Values, the store instruction (which has a `void` type) causes the
analysis to return `NoAlias`.
For instructions, we should use `getModRefInfo` instead of `isNoAlias`,
as aliasing is a concept of memory locations.
In this patch, `AAResults::getModRefInfo` is extended to accept two
instructions. It checks whether the two instructions may access the same
memory location or not. And in `FlattenCFG`, we use this new helper
function to do the check instead of `isNoAlias`.
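A toy model of the distinction (illustrative only, not LLVM's AAResults or MemoryLocation):
```
#include <cassert>
#include <optional>

struct MemLoc { const void *Ptr; };  // stand-in for a memory location

struct Instr {
  std::optional<MemLoc> Accessed;  // location read or written, if any
  bool Writes = false;
};

enum class ModRef { NoModRef, Ref, Mod };

// "May these two instructions touch the same memory?" -- the question the
// two-instruction getModRefInfo helper answers; aliasing itself is only
// defined between memory locations, not between instruction Values.
static ModRef getModRefInfoToy(const Instr &A, const Instr &B) {
  if (!A.Accessed || !B.Accessed || A.Accessed->Ptr != B.Accessed->Ptr)
    return ModRef::NoModRef;
  return A.Writes ? ModRef::Mod : ModRef::Ref;
}

int main() {
  int X = 0;
  Instr Store{MemLoc{&X}, /*Writes=*/true};
  Instr Load{MemLoc{&X}, /*Writes=*/false};
  // A store and a load of the same address conflict, which a value-based
  // isNoAlias query on the instructions themselves fails to see.
  assert(getModRefInfoToy(Store, Load) == ModRef::Mod);
}
```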
Unit tests and lit tests are also included in this patch.
This avoids the need to have special handling at every use site.
Unfortunately this means we unnecessarily emit AssertZext in the DAG
(where we already directly understand the range of the intrinsic), and
we regress in undefined cases as we don't fold out asserts on undef.
925e195 introduced a regression: since then, we have accepted invalid
trailing commas in many expression lists where the grammar does not
allow them. The issue came from the fact that an additional invalid
state, previously handled by ParseExpressionList, was overlooked in
that patch.
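For example, input like the following must be rejected again, since the C++ grammar does not allow a trailing comma in a call's expression list:
```
int f(int, int);
int x = f(1, 2,);  // error: expected expression
```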
Fixes https://github.com/llvm/llvm-project/issues/136254
No release entry because I want to backport it.
Note that we don't have to worry about CallstackProfileData[Id]
default-constructing the value side of a new map entry. If that
happens, AccessHistogramSize > 0 wouldn't be true, and the new map
entry gets deleted right away.
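For reference, the container behavior being discussed (a plain std::map is used here for illustration; the actual container may differ):
```
#include <cassert>
#include <map>

int main() {
  std::map<int, int> Profile;
  // operator[] value-initializes the mapped value when the key is missing,
  // creating a new entry as a side effect of the lookup.
  int &Size = Profile[42];
  assert(Size == 0 && Profile.size() == 1);
  // If the entry turns out to be unneeded, erasing it right away restores
  // the previous state, which is what the note above relies on.
  Profile.erase(42);
  assert(Profile.empty());
}
```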
This patch replaces:
llvm::copy(Src, std::back_inserter(Dst));
with:
llvm::append_range(Dst, Src);
for brevity.
One side benefit is that llvm::append_range eventually calls
llvm::SmallVector::reserve if Dst is an llvm::SmallVector.
The fixup output is a debug aid and should not be used to test
target-specific relocation generation implementation. The llvm-mc
-filetype=obj output is what truly matters.
For RELA targets, fixup kinds that force relocations (GOT, TLS, ALIGN,
RELAX, etc) can bypass `applyFixup` and be encoded as
`FirstRelocationKind+i`, as seen in LoongArch. This patch removes
redundant fixup kinds and adopts the `FirstRelocationKind+i` encoding.
The `llvm-mc -show-encoding` output no longer displays descriptive fixup
names, as this information is removed from
`RISCVAsmBackend::getFixupKindInfo`. While a backend hook could be added
to call `llvm::object::getELFRelocationTypeName`, it's unnecessary since
the relocation in `-filetype=obj` output is what truly matters.
Pull Request: https://github.com/llvm/llvm-project/pull/136088