clang-p2996

Author	SHA1	Message	Date
Abhina Sree	a9ee1797b7	Remove helper function and use target agnostic needConversion function (#146680 ) This patch adds back the needed AutoConvert.h header and removes the unneeded include guard of MVS to prevent this header from being removed in the future	2025-07-02 10:02:46 -04:00
Michael Buch	fc00256b2b	[lldb][test][NFC] Rename libcxx unordered_map tests to unordered_map-iterator The actual `unordered_map` tests live in `data-formatter-stl/generic/unordered`. The tests here are only testing `std::unordered_map::iterator`. This patch renames the directory accordingly. This is in preparation for moving all of the STL tests into the `generic` directory.	2025-07-02 14:36:41 +01:00
Jay Foad	2b03efc7fb	[AMDGPU] Use isImage. NFC. (#146677 )	2025-07-02 14:18:42 +01:00
Matt Arsenault	dbe441e716	X86: Avoid some uses of getPointerTy (#146306 ) In most contexts the pointer type is implied by the operation and should be propagated; getPointerTy is for niche cases where there is a synthesized value.	2025-07-02 22:14:16 +09:00
Ross Brunton	4f02965ae2	[Offload] Store kernel name in GenericKernelTy (#142799 ) GenericKernelTy has a pointer to the name that was used to create it. However, the name passed in as an argument may not outlive the kernel. Instead, GenericKernelTy now contains a std::string, and copies the name into there.	2025-07-02 14:11:05 +01:00
Alexandre Ganea	e63de82d90	[LLD][COFF] Disallow importing DllMain from import libraries (#146610 ) This is a workaround for https://github.com/llvm/llvm-project/issues/82050 by skipping the `DllMain` symbol if seen in an import library. If this situation occurs, after this commit a warning will also be displayed. The warning can be silenced with `/ignore:exporteddllmain`	2025-07-02 08:53:18 -04:00
Callum Fare	acb52a8a98	[Offload] Improve liboffload documentation (#142403 ) - Update the main README to reflect the current project status - Rework the main API generation documentation. General fixes/tidying, but also spell out explicitly how to make API changes at the top of the document since this is what most people will care about. --------- Co-authored-by: Martin Grant <martingrant@outlook.com>	2025-07-02 13:52:27 +01:00
Steven Perron	4e213159af	[SPIRV] Add FloatControl2 capability (#144371 ) Add handling for FPFastMathMode in SPIR-V shaders. This is a first pass that simply does a direct translation when the proper extension is available. This will unblock work for HLSL. However, it is not a full solution. The default math mode for SPIR-V is determined by the API. When targeting Vulkan many of the fast math options are assumed. We should do something particular when targeting Vulkan. We will also need to handle the HLSL "precise" keyword correctly when FPFastMathMode is not available. Unblocks https://github.com/llvm/llvm-project/issues/140739, but we are keeping it open to track the remaining issues mentioned above.	2025-07-02 08:48:57 -04:00
jyli0116	9c0743fbc5	[GlobalISel] Allow expansion of urem by constant in prelegalizer (#145914 ) This patch allows urem by a constant to be expanded more efficiently to avoid the need for expensive udiv instructions. This is part of the resolution to issue #118090	2025-07-02 13:46:36 +01:00
Kunqiu Chen	0aafeb8ba1	Reland [TSan] Clarify and enforce shadow end alignment (#146676 ) #144648 was reverted because it failed the new sanitizer test `munmap_clear_shadow.c` in the iOS CI. That issue can be fixed by disabling the test on the platforms it is incompatible with. In detail, we should disable the test on FreeBSD, Apple, NetBSD, Solaris, and Haiku, where `ReleaseMemoryPagesToOS` executes `madvise(beg, end, MADV_FREE)`, which tags the relevant pages as 'FREE' and does not release them immediately.	2025-07-02 20:28:30 +08:00
Shilei Tian	c0e9084b1c	[AMDGPU] Add a debug option `-amdgpu-snop-padding` for `GCNHazardRecognizer` (#146587 ) This can help to identify whether there are potential hazards. Co-authored-by: Byrnes, Jeffrey <Jeffrey.Byrnes@amd.com>	2025-07-02 08:16:38 -04:00
Kunqiu Chen	9eac5f72f6	Revert "[TSan] Clarify and enforce shadow end alignment" (#146674 ) Reverts llvm/llvm-project#144648 due to a failure of the newly added test case `munmap_clear_shadow.c` in the iOS CI.	2025-07-02 20:11:11 +08:00
Mehdi Amini	6ec9b1b366	[MLIR] Remove spurious space when printing `prop-dict` (#145962 ) When properties are elided, there used to be an extra space inserted in the prop-dict printing before the dictionary. Fix #145695	2025-07-02 14:07:17 +02:00
David Sherwood	f575b18fdc	[LV] Add support for partial reductions without a binary op (#133922 ) Consider IR such as this: for.body: %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ] %accum = phi i32 [ 0, %entry ], [ %add, %for.body ] %gep.a = getelementptr i8, ptr %a, i64 %iv %load.a = load i8, ptr %gep.a, align 1 %ext.a = zext i8 %load.a to i32 %add = add i32 %ext.a, %accum %iv.next = add i64 %iv, 1 %exitcond.not = icmp eq i64 %iv.next, 1025 br i1 %exitcond.not, label %for.exit, label %for.body Conceptually we can vectorise this using partial reductions too, although the current loop vectoriser implementation requires the accumulation of a multiply. For AArch64 this is easily done with a udot or sdot with an identity operand, i.e. a vector of (i16 1). In order to do this I had to teach getScaledReductions that the accumulated value may come from a unary op, hence there is only one extension to consider. Similarly, I updated the vplan and AArch64 TTI cost model to understand the possible unary op. --------- Co-authored-by: Matt Devereau <matthew.devereau@arm.com>	2025-07-02 13:05:51 +01:00
Joseph Huber	dea4f3213d	[libc] Use is aligned builtin instead of ptrtoint (#146402 ) Summary: This avoids a ptrtoint by just using the clang builtin. This is clang specific but only clang can compile GPU code anyway so I do not bother with a fallback.	2025-07-02 07:03:11 -05:00
DrSergei	5fe63ae9a3	[lldb-dap] Fix flaky test TestDAP_server (#145231 ) This patch fixes a possible data race between the main and event handler threads. A terminated event can be sent from the `Disconnect` function or from the event handler. Consequently, there are several possible sequences of events. We must check events twice, because without getting an exited event, `exit_status` will be None. But we don't know the order of events (for example, we can get the terminated event before the exited event), so we check events by filter. This is correct, because the terminated event will be sent only once (guarded by `llvm::call_once`). This patch was moved from [145010](https://github.com/llvm/llvm-project/pull/145010) and is based on an idea from this [comment](https://github.com/llvm/llvm-project/pull/145010#discussion_r2159637210).	2025-07-02 12:16:48 +01:00
Matt Arsenault	585b41c2ec	TargetOptions: Look up frame-pointer attribute once (#146639 ) Same as `07a86a525e`, except in the other case here.	2025-07-02 20:09:20 +09:00
Stephen Tozer	35626e97d8	[DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (#143591 ) This patch is part of a series that adds origin-tracking to the debugify source location coverage checks, allowing us to report symbolized stack traces of the point where missing source locations appear. This patch adds a pair of new functions in `signals.h` that can be used to collect and symbolize stack traces respectively. This has major implementation overlap with the existing stack trace collection/symbolizing methods, but the existing functions are specialized for dumping a stack trace to stderr when LLVM crashes, while these new functions are meant to be called repeatedly during the execution of the program, and therefore we need a separate set of functions.	2025-07-02 12:01:17 +01:00
Andrei Safronov	a2c9f7dbcc	[Xtensa] Implement lowering SELECT_CC/BRCC for Xtensa FP Option. (#145544 ) Also minor format changes in disassembler test for Xtensa FP Option.	2025-07-02 13:48:49 +03:00
Paul Walker	7cc8fe2a2c	[LLVM][AArch64] Relax SVE/SME codegen predicates. (#145322 ) Code generation predicates like HasSVE2_or_SME implemented a strict divide between streaming and non-streaming which meant some SME instructions were not available unless a matching SVE feature was enabled.	2025-07-02 11:39:33 +01:00
Simon Pilgrim	38200e94f1	[DAG] visitFREEZE - always allow freezing multiple operands (#145939 ) Always try to fold freeze(op(....)) -> op(freeze(),freeze(),freeze(),...). This patch proposes we drop the opt-in limit for opcodes that are allowed to push a freeze through the op to freeze all its operands, through the tree towards the roots. I'm struggling to find a strong reason for this limit apart from the DAG freeze handling being immature for so long - as we've improved coverage in canCreateUndefOrPoison/isGuaranteedNotToBeUndefOrPoison it looks like the regressions are not as severe. Hopefully this will help some of the regression issues in #143102 etc.	2025-07-02 11:28:37 +01:00
nerix	4c7a706589	[LLDB] Simplify libstdc++ string summaries (#146562 ) From #143177. This combines the summaries for the pre- and post C++ 11 `std::string` as well as `std::wstring`. In all cases, the data pointer is reachable through `_M_dataplus._M_p`. It has the correct type (i.e. `char`/`wchar_t`) and it's null terminated, so LLDB knows how to format it as expected when using `GetSummaryAsCString`.	2025-07-02 11:21:31 +01:00
Michael Buch	40275a4ee3	[lldb][test] Add tests for formatting pointers to std::unordered_map Ever since #143501 and #144517, these should pass. Adds tests for https://github.com/llvm/llvm-project/issues/146040	2025-07-02 11:21:02 +01:00
Mel Chen	bc8dad1c7e	[VPlan] Emit VPVectorEndPointerRecipe for reverse interleave pointer adjustment (#144864 ) A reverse interleave access is essentially composed of multiple load/store operations with the same negative stride, and their addresses are based on the last lane address of member 0 in the interleaved group. Currently, we already have VPVectorEndPointerRecipe for computing the last lane address of consecutive reverse memory accesses. This patch extends VPVectorEndPointerRecipe to support constant stride and extracts the reverse interleave group address adjustment from VPInterleaveRecipe::execute, replacing it with a VPVectorEndPointerRecipe. The final goal is to support interleaved accesses with EVL tail folding. VPInterleaveRecipe is large and tightly coupled: it combines both load and store, and embeds operations like reverse pointer adjustment (GEP), widen load/store, deinterleave/interleave, and reversal. Breaking it down into smaller, dedicated recipes may allow VPlanTransforms::tryAddExplicitVectorLength to lower them into EVL-aware form more effectively. One foreseeable challenge is that VPlanTransforms::convertToConcreteRecipes currently runs after tryAddExplicitVectorLength, so decomposing VPInterleaveRecipe will likely need to happen earlier in the pipeline to be effective.	2025-07-02 18:16:02 +08:00
Hanyang (Eric) Xu	6e1e89ee38	[SLP] Avoid -passes=instcombine stages in SLP tests (#146257 ) Fixes #145511 Note that there are still two instances of --passes=slp-vectorizer,instcombine left unchanged because it seems that the tests are meant to run in conjunction with instcombine and removing instcombine would invalidate their original objective: [llvm/test/Transforms/SLPVectorizer/arith-div-undef.ll](https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/SLPVectorizer/arith-div-undef.ll) [llvm/test/Transforms/SLPVectorizer/slp-hr-with-reuse.ll](https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/SLPVectorizer/slp-hr-with-reuse.ll)	2025-07-02 06:14:41 -04:00
Kazu Hirata	7ead20db28	[lldb] Use llvm::erase_if (NFC) (#146624 ) Note that erase_if combines erase and remove_if.	2025-07-02 11:00:58 +01:00
Qi Zhao	82c0a53763	[LoongArch] Pre-commit for optimizing insert extracted pair elements	2025-07-02 17:38:08 +08:00
Tom Eccles	1b7cbe1f87	[flang][OpenMP] Create unique reduction decls for different logical kinds (#146558 ) Some Fujitsu tests showed incorrect results because we were sharing reduction declarations for different kinds for logical variables.	2025-07-02 10:25:43 +01:00
Simon Pilgrim	651c5208f8	VPlanRecipes.cpp - fix "'llvm::VPExpressionRecipe::computeCost': not all control paths return a value" MSVC warning. NFC.	2025-07-02 09:59:01 +01:00
Graham Hunter	85bc868417	[AArch64][TTI] Reduce cost for splatting whole first vector segment (SVE) (#145701 ) Improve cost modeling for splatting the first 128b segment.	2025-07-02 09:51:56 +01:00
Jannick Kremer	a75587d271	[clang][python][test] Move python binding tests to lit framework (#146486 ) As discussed in PR #142353, the current testsuite of the `clang` Python bindings has several issues: - If `libclang.so` cannot be loaded into `python` to run the testsuite, the whole `ninja check-all` aborts. - The result of running the testsuite isn't reported like the `lit`-based tests, rendering them almost invisible. - The testsuite is disabled in a non-obvious way (`RUN_PYTHON_TESTS`) in `tests/CMakeLists.txt`, which again doesn't show up in the test results. All these issues can be avoided by integrating the Python bindings tests with `lit`, which is what this patch does: - The actual test lives in `clang/test/bindings/python/bindings.sh` and is run by `lit`. - The current `clang/bindings/python/tests` directory (minus the now-superfluous `CMakeLists.txt`) is moved into the same directory. - The check if `libclang` is loadable (originally from PR #142353) is now handled via a new `lit` feature, `libclang-loadable`. - The various ways to disable the tests have been turned into `XFAIL`s as appropriate. This isn't complete and not completely tested yet. Tested on `sparc-sun-solaris2.11`, `sparcv9-sun-solaris2.11`, `i386-pc-solaris2.11`, `amd64-pc-solaris2.11`, `i686-pc-linux-gnu`, and `x86_64-pc-linux-gnu`. Co-authored-by: Rainer Orth <ro@gcc.gnu.org>	2025-07-02 10:11:48 +02:00
Zhaoxin Yang	2c1900860c	[lld][LoongArch] Support TLSDESC GD/LD to IE/LE (#123715 ) Support TLSDESC to initial-exec or local-exec optimizations. Introduce a new hook RE_LOONGARCH_RELAX_TLS_GD_TO_IE_PAGE_PC and use the existing R_RELAX_TLS_GD_TO_IE_ABS to support TLSDESC => IE, while using the existing R_RELAX_TLS_GD_TO_LE to support TLSDESC => LE. In the normal or medium code model, there are two forms of code sequences: * pcalau12i $a0, %desc_pc_hi20(sym_desc) * addi.d $a0, $a0, %desc_pc_lo12(sym_desc) * ld.d $ra, $a0, %desc_ld(sym_desc) * jirl $ra, $ra, %desc_call(sym_desc) ------ * pcaddi $a0, %desc_pcrel_20(sym_desc) * ld.d $ra, $a0, %desc_ld(sym_desc) * jirl $ra, $ra, %desc_call(sym_desc) Convert to IE: * pcalau12i $a0, %ie_pc_hi20(sym_ie) * ld.[wd] $a0, $a0, %ie_pc_lo12(sym_ie) Convert to LE: * lu12i.w $a0, %le_hi20(sym_le) # le_hi20 != 0, otherwise NOP * ori $a0 src, %le_lo12(sym_le) # le_hi20 != 0, src = $a0, otherwise src = $zero For simplicity, in both tlsdescToIe and tlsdescToLe we always convert the preceding instructions to NOPs, because both forms of code sequence (corresponding to the relocation combinations R_LARCH_TLS_DESC_PC_HI20+R_LARCH_TLS_DESC_PC_LO12 and R_LARCH_TLS_DESC_PCREL20_S2) are processed the same way. TODO: When relaxation is enabled, redundant NOPs can be removed. It will be implemented in a future patch. Note: The forms of TLSDESC code sequences should not appear interleaved in the normal, medium or extreme code model; compilers do not generate such code and lld does not support it. This is thanks to the guard in PostRASchedulerList.cpp in llvm. ``` Calls are not scheduling boundaries before register allocation, but post-ra we don't gain anything by scheduling across calls since we don't need to worry about register pressure. ```	2025-07-02 16:09:51 +08:00
Antonio Frighetto	f1cc0b607b	[IR] Introduce `dead_on_return` attribute Add `dead_on_return` attribute, which is meant to be taken advantage of by the frontend, and states that the memory pointed to by the argument is dead upon function return. As with `byval`, it is supposed to be used for passing aggregates by value. The difference lies in the ABI: `byval` implies that the pointer is explicitly passed as argument to the callee (during codegen the copy is emitted as per byval contract), whereas a `dead_on_return`-marked argument implies that the copy already exists in the IR, is located at a specific stack offset within the caller, and this memory will not be read further by the caller upon callee return – or otherwise poison, if read before being written. RFC: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.	2025-07-02 09:29:36 +02:00
Fangrui Song	d5608d6751	MC,test: Improve section group test Also add a case for #146581 ``` .section sec,"ax" .section .foo,"axG",@progbits,sec nop ```	2025-07-02 00:28:40 -07:00
Matthias Springer	647aa6616f	[mlir][SPIRVToLLVM] Set valid insertion point after op erasure (#146551 ) Erasing/replacing an op, which is also the current insertion point, invalidates the insertion point. Explicitly set the insertion point, so that `copy` does not crash after the One-Shot Dialect Conversion refactoring. (`ConversionPatternRewriter` will start behaving more like a "normal" rewriter.)	2025-07-02 09:25:24 +02:00
Nikita Popov	83272a4849	[InstCombine] Fold icmp of gep chain with base (#144065 ) Fold icmp between a chain of geps and its base pointer. Previously only a single gep was supported. This will be extended to handle the case of two gep chains with a common base in a followup. This helps to avoid regressions after #137297.	2025-07-02 09:23:36 +02:00
Haojian Wu	0588e8188c	[Serialization] Use the SourceLocation::UIntTy instead of the raw type for the offset, NFC	2025-07-02 09:11:55 +02:00
Markus Böck	6c9be27b52	[mlir][tensor] Fold identity `reshape` of 0d-tensors (#146375 ) Just like 1d-tensors, reshapes of 0d-tensors (aka scalars) are always no-ops as they only have one possible layout. This PR adds logic to the `fold` implementation to optimize these away, as is currently implemented for 1d-tensors.	2025-07-02 09:09:03 +02:00
Fangrui Song	9262ac3ee4	Revert "ELFObjectWriter: Optimize isInSymtab" This reverts commit `1108cf6419`. Caused a regression for a weird but interesting case (STT_SECTION symbol as group signature). We no longer define `sec` ``` .section sec,"ax" .section .foo,"axG",@progbits,sec nop ``` Fix #146581	2025-07-02 00:08:42 -07:00
Fangrui Song	eac1a1d3a8	MCAssembler: Consistently place MCFragment parameter before MCFixup ... to be consistent with other places, e.g. `recordRelocation`. While here, use references instead of non-null pointers.	2025-07-01 23:59:35 -07:00
zbenzion	b68e8f1de7	[mlir][linalg] Allow promotion to use the original subview size (#144334 ) linalg promotion attempts to compute a constant upper bound for the allocated buffer size. Only when it fails to compute an upper bound does it fall back to the original subview size, which may be dynamic. This adds a promotion option to use the original subview size by default, thus minimizing the allocation size. Fixes #144268.	2025-07-02 08:47:51 +02:00
Fangrui Song	3c6cade485	MCObjectStreamer: De-virtualize emitInstToFragment	2025-07-01 23:05:35 -07:00
Kazu Hirata	f4b938b7c0	[TableGen] Use range-based for loops (NFC) (#146626 )	2025-07-01 22:50:11 -07:00
Kazu Hirata	b809d5e2ac	[ProfileData] Use lambdas instead of std::bind (NFC) (#146625 ) Lambdas are a lot shorter than std::bind here.	2025-07-01 22:50:04 -07:00
Kazu Hirata	838b91d7f6	[clangd] Drop const from a return type (NFC) (#146623 ) We don't need const on a return type.	2025-07-01 22:49:56 -07:00
Kazu Hirata	7b4dbb4f37	[Sema] Remove an unnecessary cast (NFC) (#146622 ) Since both alignment and Alignment are of the same type, this patch renames alignment to Alignment while removing the cast statement.	2025-07-01 22:49:48 -07:00
Mateusz Mikuła	2723a6d992	[LLVM][Cygwin] Enable dynamic linking of libLLVM (#146440 ) These changes allow linking everything against the shared LLVM library with the MSYS2 "Cygwin" toolchain.	2025-07-01 22:30:12 -07:00
Timm Baeder	984c78f27d	[clang][bytecode] Add back missing initialize call (#146589 ) This was only accidentally dropped, so add it back.	2025-07-02 07:15:47 +02:00
Craig Topper	c9bfdae620	[RISCV] Use uint64_t for Insn in getInstruction32 and getInstruction16. NFC (#146619 ) Insn is passed to decodeInstruction which is a template function based on the type of Insn. By using uint64_t we ensure only one version of decodeInstruction is created. This reduces the file size of RISCVDisassembler.cpp.o by ~25% in my local build.	2025-07-01 21:45:02 -07:00
Shilei Tian	f1a4bb6245	[RFC][NFC][AMDGPU] Remove explicit value assignments from `AMDGPU::GPUKind` (#146567 ) We don't seem to rely on the specific values of these enums, so removing the explicit assignments simplifies the process of adding new targets.	2025-07-01 23:39:01 -04:00
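
The note in commit 7ead20db28 that erase_if combines erase and remove_if can be illustrated with a standalone sketch. The helper `sketch::erase_if` and the function `dropEvens` are hypothetical names written for this example; this is not LLVM's implementation, only the erase-remove idiom wrapped in one call, assuming a container that provides `erase(first, last)`.

```cpp
#include <algorithm>
#include <vector>

namespace sketch {
// Hypothetical stand-in for an erase_if-style helper (not LLVM's code).
// std::remove_if moves the kept elements to the front and returns the new
// logical end; erase then drops the leftover tail in a single call.
template <typename Container, typename Pred>
void erase_if(Container &c, Pred pred) {
  c.erase(std::remove_if(c.begin(), c.end(), pred), c.end());
}
} // namespace sketch

// Returns a copy of the input with all even values removed.
std::vector<int> dropEvens(std::vector<int> values) {
  sketch::erase_if(values, [](int v) { return v % 2 == 0; });
  return values;
}
```

The single call replaces the two-step `c.erase(std::remove_if(...), c.end())` spelling at each call site, which is the kind of shortening the commit describes.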
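
Commit 9c0743fbc5 mentions expanding urem by a constant so that no udiv instruction is needed. A minimal sketch of the general idea, assuming a 32-bit operand and the divisor 10 (both chosen here for illustration, as is the name `uremBy10`): the quotient comes from a multiply by a precomputed "magic" constant plus a shift, and the remainder is recovered by a multiply and subtract. This is not the GlobalISel code, which selects constants per divisor.

```cpp
#include <cstdint>

// Remainder of x / 10 without a hardware divide (illustrative sketch only).
// 0xCCCCCCCD equals ceil(2^35 / 10); the 64-bit product shifted right by 35
// gives floor(x / 10) for every 32-bit x.
constexpr std::uint32_t uremBy10(std::uint32_t x) {
  return x - static_cast<std::uint32_t>(
                 (static_cast<std::uint64_t>(x) * 0xCCCCCCCDull) >> 35) *
                 10u;
}

// Compile-time spot checks against the built-in operator.
static_assert(uremBy10(0u) == 0u % 10u, "zero");
static_assert(uremBy10(1234567u) == 1234567u % 10u, "typical value");
static_assert(uremBy10(4294967295u) == 4294967295u % 10u, "max value");
```

The `static_assert` lines only spot-check a few inputs; they are not a proof that the constant is valid for other divisors or bit widths.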


543148 Commits