clang-p2996

Author	SHA1	Message	Date
nerix	4c7a706589	[LLDB] Simplify libstdc++ string summaries (#146562 ) From #143177. This combines the summaries for the pre- and post C++ 11 `std::string` as well as `std::wstring`. In all cases, the data pointer is reachable through `_M_dataplus._M_p`. It has the correct type (i.e. `char`/`wchar_t`) and it's null terminated, so LLDB knows how to format it as expected when using `GetSummaryAsCString`.	2025-07-02 11:21:31 +01:00
Michael Buch	40275a4ee3	[lldb][test] Add tests for formatting pointers to std::unordered_map Ever since #143501 and #144517, these should pass. Adds tests for https://github.com/llvm/llvm-project/issues/146040	2025-07-02 11:21:02 +01:00
Mel Chen	bc8dad1c7e	[VPlan] Emit VPVectorEndPointerRecipe for reverse interleave pointer adjustment (#144864) A reverse interleave access is essentially composed of multiple load/store operations with the same negative stride, and their addresses are based on the last lane address of member 0 in the interleaved group. Currently, we already have VPVectorEndPointerRecipe for computing the last lane address of consecutive reverse memory accesses. This patch extends VPVectorEndPointerRecipe to support constant stride and extracts the reverse interleave group address adjustment from VPInterleaveRecipe::execute, replacing it with a VPVectorEndPointerRecipe. The final goal is to support interleaved accesses with EVL tail folding. VPInterleaveRecipe is large and tightly coupled: it combines both load and store, and embeds operations like reverse pointer adjustment (GEP), widen load/store, deinterleave/interleave, and reversal. Breaking it down into smaller, dedicated recipes may allow VPlanTransforms::tryAddExplicitVectorLength to lower them into EVL-aware form more effectively. One foreseeable challenge is that VPlanTransforms::convertToConcreteRecipes currently runs after tryAddExplicitVectorLength, so decomposing VPInterleaveRecipe will likely need to happen earlier in the pipeline to be effective.	2025-07-02 18:16:02 +08:00
Hanyang (Eric) Xu	6e1e89ee38	[SLP] Avoid -passes=instcombine stages in SLP tests (#146257 ) Fixes #145511 Note that there are still two instances of --passes=slp-vectorizer,instcombine left unchanged because it seems that the tests are meant to run in conjunction with instcombine and removing instcombine would invalidate their original objective: [llvm/test/Transforms/SLPVectorizer/arith-div-undef.ll](https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/SLPVectorizer/arith-div-undef.ll) [llvm/test/Transforms/SLPVectorizer/slp-hr-with-reuse.ll](https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/SLPVectorizer/slp-hr-with-reuse.ll)	2025-07-02 06:14:41 -04:00
Kazu Hirata	7ead20db28	[lldb] Use llvm::erase_if (NFC) (#146624 ) Note that erase_if combines erase and remove_if.	2025-07-02 11:00:58 +01:00
Qi Zhao	82c0a53763	[LoongArch] Pre-commit for optimizing insert extracted pair elements	2025-07-02 17:38:08 +08:00
Tom Eccles	1b7cbe1f87	[flang][OpenMP] Create unique reduction decls for different logical kinds (#146558 ) Some Fujitsu tests showed incorrect results because we were sharing reduction declarations for different kinds for logical variables.	2025-07-02 10:25:43 +01:00
Simon Pilgrim	651c5208f8	VPlanRecipes.cpp - fix "'llvm::VPExpressionRecipe::computeCost': not all control paths return a value" MSVC warning. NFC.	2025-07-02 09:59:01 +01:00
Graham Hunter	85bc868417	[AArch64][TTI] Reduce cost for splatting whole first vector segment (SVE) (#145701 ) Improve cost modeling for splatting the first 128b segment.	2025-07-02 09:51:56 +01:00
Jannick Kremer	a75587d271	[clang][python][test] Move python binding tests to lit framework (#146486) As discussed in PR #142353, the current testsuite of the `clang` Python bindings has several issues: - If `libclang.so` cannot be loaded into `python` to run the testsuite, the whole `ninja check-all` aborts. - The result of running the testsuite isn't reported like the `lit`-based tests, rendering them almost invisible. - The testsuite is disabled in a non-obvious way (`RUN_PYTHON_TESTS`) in `tests/CMakeLists.txt`, which again doesn't show up in the test results. All these issues can be avoided by integrating the Python bindings tests with `lit`, which is what this patch does: - The actual test lives in `clang/test/bindings/python/bindings.sh` and is run by `lit`. - The current `clang/bindings/python/tests` directory (minus the now-superfluous `CMakeLists.txt`) is moved into the same directory. - The check if `libclang` is loadable (originally from PR #142353) is now handled via a new `lit` feature, `libclang-loadable`. - The various ways to disable the tests have been turned into `XFAIL`s as appropriate. This isn't complete and not completely tested yet. Tested on `sparc-sun-solaris2.11`, `sparcv9-sun-solaris2.11`, `i386-pc-solaris2.11`, `amd64-pc-solaris2.11`, `i686-pc-linux-gnu`, and `x86_64-pc-linux-gnu`. Co-authored-by: Rainer Orth <ro@gcc.gnu.org>	2025-07-02 10:11:48 +02:00
Zhaoxin Yang	2c1900860c	[lld][LoongArch] Support TLSDESC GD/LD to IE/LE (#123715) Support TLSDESC to initial-exec or local-exec optimizations. Introduce a new hook RE_LOONGARCH_RELAX_TLS_GD_TO_IE_PAGE_PC and use existing R_RELAX_TLS_GD_TO_IE_ABS to support TLSDESC => IE, while using existing R_RELAX_TLS_GD_TO_LE to support TLSDESC => LE. In the normal or medium code model, there are two forms of code sequences: * pcalau12i $a0, %desc_pc_hi20(sym_desc) * addi.d $a0, $a0, %desc_pc_lo12(sym_desc) * ld.d $ra, $a0, %desc_ld(sym_desc) * jirl $ra, $ra, %desc_call(sym_desc) ------ * pcaddi $a0, %desc_pcrel_20(sym_desc) * ld.d $ra, $a0, %desc_ld(sym_desc) * jirl $ra, $ra, %desc_call(sym_desc) Convert to IE: * pcalau12i $a0, %ie_pc_hi20(sym_ie) * ld.[wd] $a0, $a0, %ie_pc_lo12(sym_ie) Convert to LE: * lu12i.w $a0, %le_hi20(sym_le) # le_hi20 != 0, otherwise NOP * ori $a0 src, %le_lo12(sym_le) # le_hi20 != 0, src = $a0, otherwise src = $zero For simplicity, in both tlsdescToIe and tlsdescToLe we always convert the preceding instructions to NOPs, because both forms of code sequence (corresponding to relocation combinations: R_LARCH_TLS_DESC_PC_HI20+R_LARCH_TLS_DESC_PC_LO12 and R_LARCH_TLS_DESC_PCREL20_S2) are processed the same way. TODO: When relaxation is enabled, redundant NOPs can be removed. It will be implemented in a future patch. Note: All forms of TLSDESC code sequences should not appear interleaved in the normal, medium or extreme code model; compilers do not generate such sequences and lld does not support them. This is thanks to the guard in PostRASchedulerList.cpp in llvm. ``` Calls are not scheduling boundaries before register allocation, but post-ra we don't gain anything by scheduling across calls since we don't need to worry about register pressure. ```	2025-07-02 16:09:51 +08:00
Antonio Frighetto	f1cc0b607b	[IR] Introduce `dead_on_return` attribute Add `dead_on_return` attribute, which is meant to be taken advantage of by the frontend, and states that the memory pointed to by the argument is dead upon function return. As with `byval`, it is supposed to be used for passing aggregates by value. The difference lies in the ABI: `byval` implies that the pointer is explicitly passed as argument to the callee (during codegen the copy is emitted as per byval contract), whereas a `dead_on_return`-marked argument implies that the copy already exists in the IR, is located at a specific stack offset within the caller, and this memory will not be read further by the caller upon callee return, or is otherwise poison if read before being written. RFC: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.	2025-07-02 09:29:36 +02:00
Fangrui Song	d5608d6751	MC,test: Improve section group test Also add a case for #146581 ``` .section sec,"ax" .section .foo,"axG",@progbits,sec nop ```	2025-07-02 00:28:40 -07:00
Matthias Springer	647aa6616f	[mlir][SPIRVToLLVM] Set valid insertion point after op erasure (#146551 ) Erasing/replacing an op, which is also the current insertion point, invalidates the insertion point. Explicitly set the insertion point, so that `copy` does not crash after the One-Shot Dialect Conversion refactoring. (`ConversionPatternRewriter` will start behaving more like a "normal" rewriter.)	2025-07-02 09:25:24 +02:00
Nikita Popov	83272a4849	[InstCombine] Fold icmp of gep chain with base (#144065 ) Fold icmp between a chain of geps and its base pointer. Previously only a single gep was supported. This will be extended to handle the case of two gep chains with a common base in a followup. This helps to avoid regressions after #137297.	2025-07-02 09:23:36 +02:00
Haojian Wu	0588e8188c	[Serialization] Use the SourceLocation::UIntTy instead of the raw type for the offset, NFC	2025-07-02 09:11:55 +02:00
Markus Böck	6c9be27b52	[mlir][tensor] Fold identity `reshape` of 0d-tensors (#146375) Just like 1d-tensors, reshapes of 0d-tensors (aka scalars) are always no-ops as they only have one possible layout. This PR adds logic to the `fold` implementation to optimize these away as is currently implemented for 1d tensors.	2025-07-02 09:09:03 +02:00
Fangrui Song	9262ac3ee4	Revert "ELFObjectWriter: Optimize isInSymtab" This reverts commit `1108cf6419`. Caused a regression for a weird but interesting case (STT_SECTION symbol as group signature). We no longer define `sec` ``` .section sec,"ax" .section .foo,"axG",@progbits,sec nop ``` Fix #146581	2025-07-02 00:08:42 -07:00
Fangrui Song	eac1a1d3a8	MCAssembler: Consistently place MCFragment parameter before MCFixup ... to be consistent with other places, e.g. `recordRelocation`. While here, use references instead of non-null pointers.	2025-07-01 23:59:35 -07:00
zbenzion	b68e8f1de7	[mlir][linalg] Allow promotion to use the original subview size (#144334) linalg promotion attempts to compute a constant upper bound for the allocated buffer size. Only when it fails to compute an upper bound does it fall back to the original subview size, which may be dynamic. This adds a promotion option to use the original subview size by default, thus minimizing the allocation size. Fixes #144268.	2025-07-02 08:47:51 +02:00
Fangrui Song	3c6cade485	MCObjectStreamer: De-virtualize emitInstToFragment	2025-07-01 23:05:35 -07:00
Kazu Hirata	f4b938b7c0	[TableGen] Use range-based for loops (NFC) (#146626 )	2025-07-01 22:50:11 -07:00
Kazu Hirata	b809d5e2ac	[ProfileData] Use lambdas instead of std::bind (NFC) (#146625 ) Lambdas are a lot shorter than std::bind here.	2025-07-01 22:50:04 -07:00
Kazu Hirata	838b91d7f6	[clangd] Drop const from a return type (NFC) (#146623 ) We don't need const on a return type.	2025-07-01 22:49:56 -07:00
Kazu Hirata	7b4dbb4f37	[Sema] Remove an unnecessary cast (NFC) (#146622 ) Since both alignment and Alignment are of the same type, this patch renames alignment to Alignment while removing the cast statement.	2025-07-01 22:49:48 -07:00
Mateusz Mikuła	2723a6d992	[LLVM][Cygwin] Enable dynamic linking of libLLVM (#146440) These changes allow linking everything to the shared LLVM library with the MSYS2 "Cygwin" toolchain.	2025-07-01 22:30:12 -07:00
Timm Baeder	984c78f27d	[clang][bytecode] Add back missing initialize call (#146589 ) This was only accidentally dropped, so add it back.	2025-07-02 07:15:47 +02:00
Craig Topper	c9bfdae620	[RISCV] Use uint64_t for Insn in getInstruction32 and getInstruction16. NFC (#146619 ) Insn is passed to decodeInstruction which is a template function based on the type of Insn. By using uint64_t we ensure only one version of decodeInstruction is created. This reduces the file size of RISCVDisassembler.cpp.o by ~25% in my local build.	2025-07-01 21:45:02 -07:00
Shilei Tian	f1a4bb6245	[RFC][NFC][AMDGPU] Remove explicit value assignments from `AMDGPU::GPUKind` (#146567 ) We don't seem to rely on the specific values of these enums, so removing the explicit assignments simplifies the process of adding new targets.	2025-07-01 23:39:01 -04:00
Alex Crichton	a8a9a7f95a	[WebAssembly] Fix inline assembly with vector types (#146574 ) This commit fixes using inline assembly with v128 results. Previously this failed with an internal assertion about a failure to legalize a `CopyFromReg` where the source register was typed `v8f16`. It looks like the type used for the destination register was whatever was listed first in the `def V128 : WebAssemblyRegClass` listing, so the types were shuffled around to have a default-supported type. A small test was added as well which failed to generate previously and should now pass in generation. This test passed on LLVM 18 additionally and regressed by accident in #93228 which was first included in LLVM 19.	2025-07-01 20:26:30 -07:00
Peter Collingbourne	2a702cdc38	Driver: Avoid llvm::sys::path::append if resource directory absolute. After #145996 CLANG_RESOURCE_DIR can be an absolute path so we need to handle it correctly in the driver. llvm::sys::path::append does not append absolute paths in the way that I expected (or consistent with other similar APIs such as C++17 std::filesystem::path::append or Python os.path.join); instead, it effectively discards the leading / and appends the resulting relative path (e.g. append(P, "/bar") with P = "/foo" sets P to "/foo/bar"). Many tests start failing if I try to align llvm::sys::path::append with the other APIs because of callers that expect the existing behavior, so for now let's add a special case here for absolute resource paths, and document the behavior in Path.h. Reviewers: MaskRay Reviewed By: MaskRay Pull Request: https://github.com/llvm/llvm-project/pull/146449	2025-07-01 20:21:51 -07:00
XiangZhang	aa1d9a4c31	[MLIR][Affine] Enhance simplifyAdd for AffineExpr mod (#146492) Currently, AffineExpr Add can simplify `s1 + (s1 // c * -c)` to `s1 % c`, but cannot simplify `(s0 + s1) + (s1 // c * -c)`. This patch enables that simplification, so the expression becomes `s0 + s1 % c`.	2025-07-02 11:08:58 +08:00
Kazu Hirata	eb07f0d4a9	[Analysis] Use range-based for loops (NFC) (#146466 )	2025-07-01 19:38:28 -07:00
Ashwin Banwari	2599a9aeb5	[clang] [modules] Implement P3618R0: Allow attaching main to the global module (#146461 ) Remove the prior warning for attaching extern "C++" to main.	2025-07-02 09:52:10 +08:00
Ami-zhang	3deed4211a	[docs] Add clang release notes for LoongArch (#146481 )	2025-07-02 09:21:33 +08:00
Jonas Devlieghere	a87b27fd51	[lldb] Fix the hardware breakpoint decorator (#146609 ) A decorator to skip or XFAIL a test takes effect when the function that's passed in returns a reason string. The wrappers around hw_breakpoints_supported were doing that incorrectly by inverting (calling `not`) on the result, turning it into a boolean, which means the test is always skipped.	2025-07-01 18:01:19 -07:00
Matt Arsenault	7502af89fc	clang: Forward exception_model flag for bitcode inputs (#146342 ) This will enable removal of a hack from the wasm backend in a future change. This feels unnecessarily clunky. I would assume something was automatically parsing this and propagating it in the C++ case, but I can't seem to find it. In particular it feels wrong that I need to parse out the individual values, given they are listed in the options.td file. We should also be parsing and forwarding every flag that corresponds to something else in TargetOptions, which requires auditing.	2025-07-02 09:39:46 +09:00
Wenju He	b0e6faae08	[libclc] Add missing clc_lgamma_r with generic address space pointer arg (#146495 ) There is no change to amdgcn--amdhsa.bc and nvptx64--nvidiacl.bc because __opencl_c_generic_address_space is not defined for them.	2025-07-02 08:28:01 +08:00
Wenju He	93fe52f19e	[libclc] Add __clc_nan implementation with signed nancode argument (#146485 ) In OpenCL Extended Instruction Set Specification, nancode can be signed integer or vector of signed integers values. This PR has no change to amdgcn--amdhsa.bc and nvptx64--nvidiacl.bc because the newly added clc functions are not used in OpenCL library.	2025-07-02 08:27:46 +08:00
Kewen12	2b16af8df2	[Offload][cmake] Add GPU test job limit for AMDGPU buildbot cmake cache (#146611 ) Added GPU test job limit to make it consistent with current config https://github.com/llvm/llvm-zorg/blob/main/buildbot/osuosl/master/config/builders.py#L2027C31-L2027C77	2025-07-01 19:18:28 -05:00
Aiden Grossman	6b7e1b97f4	[CI] Use Github Native Groups in monolithic-* scripts This patch updates monolithic-linux.sh and monolithic-windows.sh to emit expandable groups in the Github logs. The syntax this replaces originally worked to produce the same functionality on Buildkite, but Github uses a different syntax. https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions#grouping-log-lines Reviewers: cmtice, DavidSpickett, tstellar, lnihlen, Endilll Reviewed By: Endilll, DavidSpickett Pull Request: https://github.com/llvm/llvm-project/pull/143481	2025-07-01 16:15:27 -07:00
Jonas Devlieghere	e89458d398	[lldb] Fix PipeTest name collision in unit tests We had two classes named `PipeTest`: one in `PipeTestUtilities.h` and one in `PipeTest.cpp`. The latter was unintentionally using the wrong class (from the header) which didn't initialize the HostInfo subsystem. This resulted in a crash due to a nullptr dereference (`g_fields`) when `PipePosix::CreateWithUniqueName` called `HostInfoBase::GetProcessTempDir`.	2025-07-01 16:01:38 -07:00
jjasmine	e9c9f8f374	[WebAssembly] Fold any/alltrue (setcc x, 0, eq/ne) to [not] any/alltrue x (#144741) Fixes https://github.com/llvm/llvm-project/issues/50142, a miss of further vectorization, where we can only achieve zext (xor (any_true), -1). Now in test case simd-setcc-reductions, it's converted to all_true. Also fixes https://github.com/llvm/llvm-project/issues/145177, which covers: all_true (setcc x, 0, eq) -> not any_true; any_true (setcc x, 0, ne) -> any_true; all_true (setcc x, 0, ne) -> all_true. --------- Co-authored-by: badumbatish <--show-origin>	2025-07-01 15:27:37 -07:00
jjasmine	4a8c1f7d12	[WebAssembly] [Backend] Wasm optimize illegal bitmask (#145627) Wasm optimize illegal bitmask for #131980. Currently, for an illegal bitmask (v32i8 or v64i8), two (or four) v128 vectors are concatenated at the SelectionDAG level and then all SETCC'd by the same pseudo illegal instruction, which requires expansion later on. I opt for SETCC-ing them separately, bitcasting and zero-extending the results, and then adding them together at the end. --------- Co-authored-by: badumbatish <--show-origin>	2025-07-01 15:13:08 -07:00
James Y Knight	ae2104897c	[SelectionDAG] Fix NaN regression in fma dag-combine. (#146592 ) After `901e1390c9` (#127770), the DAG combine would transform `fma(x, 0.0, 1.0)` into `1.0` if `-fp-contract=fast` was enabled, in addition to when 'x' is marked nnan/ninf. It's only valid in the latter case, not the former, so delete the extra condition.	2025-07-01 18:10:30 -04:00
Alex MacLean	475cd8dfaf	[NVPTX] Further cleanup call isel (#146411 ) This change continues rewriting and cleanup around DAG ISel for formal-arguments, return values, and function calls. This causes some incidental changes, mostly to instruction ordering and register naming but also a couple improvements caused by using scalar types earlier in the lowering.	2025-07-01 14:55:04 -07:00
Skrai Pardus	5ed852f7f7	[mlir][arith] Add `arith::ConstantIntOp` constructor (#144638 ) This PR adds a `build()` constructor for `ConstantIntOp` that takes in an `APInt`. Creating an `arith` constant value with an `APInt` currently requires a structure like the following: ```c b.create<arith::ConstantOp>(IntegerAttr::get(apintValue, 5)); ``` In comparison, the`ConstantFloatOp` already has an `APFloat` constructor which allows for the following: ```c b.create<arith::ConstantFloatOp>(floatType, apfloatValue); ``` Thus, intuitively, it makes sense that a similar `ConstantIntOp` constructor is made for `APInts` like so: ```c b.create<arith::ConstantIntOp>(intType, apintValue); ``` Depends on https://github.com/llvm/llvm-project/pull/144636	2025-07-01 23:50:39 +02:00
Florian Hahn	863e17a5be	[VPlan] Make Phi operand for VPReductionPHIRecipe optional (NFC). VPReductionPHIRecipe doesn't rely on the underlying phi any longer, allow empty underlying values when cloning. NFC at the moment but will enable follow-up patches.	2025-07-01 22:49:27 +01:00
zGoldthorpe	f393211454	[Reland][IPO] Added attributor for identifying invariant loads (#146584) Patched and tested the `AAInvariantLoadPointer` attributor from #141800, which identifies pointers whose loads are eligible to be marked as `!invariant.load`. The bug in the attributor was due to `AAMemoryBehavior` always identifying pointers obtained from `alloca`s as having no writes. I'm not entirely sure why `AAMemoryBehavior` behaves this way, but it seems to be because it identifies the scope of an `alloca` to be limited to only that instruction (and, certainly, no memory writes occur within the `alloca` instruction). This patch just adds a check to disallow all loads from `alloca` pointers from being marked `!invariant.load` (since any well-defined program will have to write to stack pointers at some point).	2025-07-01 17:46:19 -04:00
Changpeng Fang	d99b14623f	AMDGPU: Implement tensor_save and tensor_stop for gfx1250 (#146590 ) MC layer only.	2025-07-01 14:28:38 -07:00
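
Note on 7ead20db28 ("erase_if combines erase and remove_if"): a minimal sketch of what that combination means. The helper `eraseIf` below is a hypothetical stand-in written for illustration, not the actual `llvm::erase_if` source; it is only assumed to behave the same way on a `std::vector`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Classic two-step idiom: remove_if moves the kept elements to the front and
// returns the new logical end; erase then drops the leftover tail.
inline void dropOddsEraseRemove(std::vector<int> &V) {
  V.erase(std::remove_if(V.begin(), V.end(),
                         [](int X) { return X % 2 != 0; }),
          V.end());
}

// Hypothetical one-call helper (illustrative name) that bundles both steps.
template <typename Container, typename Pred>
void eraseIf(Container &C, Pred P) {
  C.erase(std::remove_if(C.begin(), C.end(), P), C.end());
}

// Same filtering as dropOddsEraseRemove, expressed through the helper.
inline void dropOddsEraseIf(std::vector<int> &V) {
  eraseIf(V, [](int X) { return X % 2 != 0; });
}
```

Both functions leave the same elements in the same order; the helper only removes the chance of forgetting the trailing `V.end()` argument.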
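
Note on 2a702cdc38 (appending an absolute path): the sketch below contrasts the C++17 `std::filesystem` behavior with the behavior the commit message attributes to `llvm::sys::path::append`. `appendDiscardingRoot` is a hypothetical model built only from that description (it handles `/` separators only); it is not LLVM's implementation.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// std::filesystem semantics: appending an absolute path replaces the base.
inline std::string stdAppend(const std::string &Base,
                             const std::string &Component) {
  fs::path P(Base);
  P /= Component;
  return P.generic_string();
}

// Hypothetical model of the described LLVM behavior: leading separators of
// the component are discarded and the remainder is appended as a relative
// path.
inline std::string appendDiscardingRoot(std::string Base,
                                        std::string Component) {
  std::string::size_type First = Component.find_first_not_of('/');
  Component = (First == std::string::npos) ? "" : Component.substr(First);
  if (!Base.empty() && Base.back() != '/' && !Component.empty())
    Base += '/';
  return Base + Component;
}
```

With `Base = "/foo"` and `Component = "/bar"`, the first function yields `/bar` while the model yields `/foo/bar`, which is the mismatch the commit special-cases for absolute resource directories.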
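
Note on aa1d9a4c31 (AffineExpr `simplifyAdd`): the rewrite rests on the identity `s1 - c * floor(s1 / c) = s1 mod c`. The sketch below checks both the existing and the extended form numerically. It assumes `c > 0` and floor semantics for `//` and `%`; `floorDiv`, `floorMod`, and `identityHolds` are illustrative helpers, not MLIR APIs.

```cpp
#include <cassert>
#include <cstdint>

// Floor division: rounds toward negative infinity, unlike C++ '/', which
// truncates toward zero. Assumes C > 0.
inline int64_t floorDiv(int64_t X, int64_t C) {
  int64_t Q = X / C;
  return (X % C != 0 && X < 0) ? Q - 1 : Q;
}

// Matching modulo: the result is always in [0, C). Assumes C > 0.
inline int64_t floorMod(int64_t X, int64_t C) {
  int64_t R = X % C;
  return R < 0 ? R + C : R;
}

// Checks s1 + (s1 // c * -c) == s1 % c and the extended form
// (s0 + s1) + (s1 // c * -c) == s0 + s1 % c.
inline bool identityHolds(int64_t S0, int64_t S1, int64_t C) {
  bool Simple = S1 + floorDiv(S1, C) * -C == floorMod(S1, C);
  bool Extended = (S0 + S1) + floorDiv(S1, C) * -C == S0 + floorMod(S1, C);
  return Simple && Extended;
}
```

The extended form follows from the simple one by adding `s0` to both sides, which is why the extra term does not block the simplification.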
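
Note on e9c9f8f374 (any/alltrue folds): a scalar model of the three folds listed in that commit, assuming the usual lane semantics (`any_true` is set if any lane is non-zero, `all_true` if every lane is non-zero, and `setcc` yields all-ones for a true lane). The 4-lane `Vec` type and all function names are illustrative, not backend code.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

using Vec = std::array<int, 4>;

inline bool anyTrue(const Vec &V) {
  return std::any_of(V.begin(), V.end(), [](int L) { return L != 0; });
}

inline bool allTrue(const Vec &V) {
  return std::all_of(V.begin(), V.end(), [](int L) { return L != 0; });
}

// Lane-wise compare against zero: -1 (all ones) where true, 0 where false.
inline Vec setccEq0(const Vec &V) {
  Vec R{};
  for (std::size_t I = 0; I < V.size(); ++I)
    R[I] = (V[I] == 0) ? -1 : 0;
  return R;
}

inline Vec setccNe0(const Vec &V) {
  Vec R{};
  for (std::size_t I = 0; I < V.size(); ++I)
    R[I] = (V[I] != 0) ? -1 : 0;
  return R;
}

// True when each folded form agrees with the unfolded form for input X.
inline bool foldsAgree(const Vec &X) {
  return allTrue(setccEq0(X)) == !anyTrue(X) &&
         anyTrue(setccNe0(X)) == anyTrue(X) &&
         allTrue(setccNe0(X)) == allTrue(X);
}
```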
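
Note on ae2104897c (fma NaN regression): a small sketch of why folding `fma(x, 0.0, 1.0)` to `1.0` is only valid when `x` is known to be neither NaN nor infinite. It uses the standard `std::fma` and assumes default IEEE-754 floating-point behavior (no fast-math flags); the wrapper name is illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Computes x * 0.0 + 1.0 as a single fused operation.
inline double fmaTimesZeroPlusOne(double X) { return std::fma(X, 0.0, 1.0); }

// True when replacing the fma with the constant 1.0 would preserve the
// result for this particular X.
inline bool foldToOneIsExact(double X) {
  return fmaTimesZeroPlusOne(X) == 1.0;
}
```

For finite inputs the product is zero and the result is `1.0`. For NaN the result is NaN, and for an infinity the product `inf * 0.0` is itself NaN, so the fold changes the value in both cases.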

543569 Commits