## Purpose
This patch ensures that the BLAKE3 implementation in the LLVM Support
library exports its public interface with `__declspec(dllexport)` when
building LLVM as a Windows DLL.
## Background
The effort to support building LLVM as a Windows DLL is tracked in
#109483. Additional context is provided in [this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307).
## Overview
Replicate [this
logic](https://github.com/llvm/llvm-project/blob/main/llvm/cmake/modules/AddLLVM.cmake#L662-L664)
from `llvm_add_library()` for the `LLVMSupportBlake3` target. Without
this change, the `llvm_blake3_*` functions will only be annotated with
`__declspec(dllimport)` when building LLVM as a Windows DLL, which leads
to inconsistent DLL linkage warnings from MSVC and `clang-cl`.
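For illustration, here is a minimal sketch of the annotation mismatch (the macro name and the simplified signature are placeholders, not the actual LLVM headers): when the library that defines the symbols is compiled without the define that selects `dllexport`, its headers declare the functions `dllimport` while the definitions export nothing, which is exactly what MSVC and `clang-cl` flag as inconsistent DLL linkage.
```cpp
// Placeholder macro; the real LLVM headers use their own export scheme.
#if defined(BUILDING_LLVM_DLL)
#define DEMO_ABI __declspec(dllexport) // compiling the DLL itself
#else
#define DEMO_ABI __declspec(dllimport) // compiling a client of the DLL
#endif

DEMO_ABI void llvm_blake3_hasher_init(void *Self); // header declaration

// If this definition is built without BUILDING_LLVM_DLL, the declaration
// above said dllimport, so the compiler warns about inconsistent linkage.
void llvm_blake3_hasher_init(void *Self) { /* ... */ }
```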
Add remark format 'Auto', which performs automatic detection of the
remark format using the magic numbers at the beginning of the remarks
files.
The RemarkLinker already did something similar, so that logic has been
streamlined and exposed to llvm-remarkutil.
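As a rough sketch of what the detection amounts to (the magic strings below are illustrative assumptions, not necessarily the exact container magics), the tool peeks at the first bytes of the remarks buffer and picks a parser:
```cpp
#include "llvm/ADT/StringRef.h"

enum class DetectedFormat { YAML, Bitstream, Unknown };

// Illustrative sketch only: choose the remark parser from leading bytes.
DetectedFormat detectRemarkFormat(llvm::StringRef Buffer) {
  if (Buffer.starts_with("RMRK")) // assumed bitstream container magic
    return DetectedFormat::Bitstream;
  if (Buffer.starts_with("--- !")) // assumed YAML remark document header
    return DetectedFormat::YAML;
  return DetectedFormat::Unknown;
}
```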
Allows memcpy to memcpy forwarding in cases where the second memcpy is
larger, but the overread is known to be undef, by shrinking the memcpy
size.
Refs https://github.com/llvm/llvm-project/pull/140954 which laid some of
the groundwork for this.
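A conceptual C++ analogue of the case this enables (hypothetical code, not taken from the patch or its tests):
```cpp
#include <string.h>

void example(char *dst, const char *src) {
  char tmp[32];
  memcpy(tmp, src, 16); // first memcpy: only tmp[0..15] are defined
  memcpy(dst, tmp, 32); // second, larger memcpy: tmp[16..31] are undef
  // Since the overread is undef, the second copy can be shrunk to 16
  // bytes and forwarded directly from the source: memcpy(dst, src, 16).
}
```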
Const-qualifying Values in the analysis result makes them unusable with
IRBuilder. The issue was discovered when attempting to use the result of
the analysis for a transform.
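A small illustration of the problem (the helper below is hypothetical): IRBuilder creation methods take non-const `Value *`, so a const-qualified analysis result forces a `const_cast` at every use.
```cpp
#include "llvm/IR/IRBuilder.h"

// Hypothetical helper, for illustration only.
llvm::Value *emitLoad(llvm::IRBuilder<> &B, llvm::Type *Ty,
                      const llvm::Value *Addr) {
  // B.CreateLoad(Ty, Addr);  // does not compile: CreateLoad wants Value *
  return B.CreateLoad(Ty, const_cast<llvm::Value *>(Addr));
}
```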
Big-endian CRC tables are incorrect due to the initial value of CRC in
genSarwateTable being hard-coded for CRC-8. 128 is the signed-min value
for CRC-8, but it should be generalized to APInt::getSignedMinValue. The
issue was found when writing CRC verification tests for llvm-test-suite.
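A minimal sketch of the generalization (the helper name is made up; the actual change is inside genSarwateTable):
```cpp
#include "llvm/ADT/APInt.h"

// The initial CRC value should have only the top bit set for the table's
// bit width, rather than being hard-coded to 128 (which is CRC-8 only).
llvm::APInt initialCRC(unsigned BitWidth) {
  return llvm::APInt::getSignedMinValue(BitWidth); // 0x80, 0x8000, ...
}
```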
This patch is closely related to #139293 and addresses an existing issue
in the loop transformation codebase. Specifically, it corrects the
handling of the `NumGeneratedLoops` variable in
`OMPLoopTransformationDirective` AST nodes and its inheritors (such as
OMPUnrollDirective, OMPTileDirective, etc.).
Previously, this variable was inaccurately set for certain
transformations like reverse or tile. While this did not lead to
functional bugs, since the value was only checked to determine whether
it was greater than zero or equal to zero, the inconsistency could
introduce problems when supporting more complex directives in the
future.
This just adds some convenience methods to feature control and rewrites
old code in terms of those methods. Also cleans up some names that I
just realized were overloads of another method.
PPCTTIImpl defines hasActiveVectorLength and also getVPMemoryOpCost, but
they appear unused (i.e. no changes to tests).
Remove them, as they complicate the interface for hasActiveVectorLength.
This simplifies the only use in LV, as placeholder values no longer need
to be passed.
PR: https://github.com/llvm/llvm-project/pull/142310
- Add a `gpr32` suffix to test name to denote the specific register
class being checked
- Expand `-mtriple=arm64-apple-ios` to `-march=arm64` to broaden the
test context to the generic architecture, as the specific triple is not
required
- Port `bl` match to Linux too via the regex: `{{_?foo}}`
- Advance `-mcpu=cyclone` to the newer M-series `-mcpu=apple-m1`
- Use `-mcpu` so that `-mattr=-zcm` has a real effect
- Add a test that generic arm64 doesn't optimize for ZCM
- Distinguish 4 different assembly layouts: NOTCPU, CPU, NOTATTR, ATTR
- Fix broken test logic, for example: `; NOT: mov [[REG2:w[0-9]+]], w3`
matched `mov w1, w3`, so `REG2` captured `w1`, but then `; NOT: mov w1,
[[REG2]]` matched `mov w1, w19` by prefix even though it should have
matched `mov w1, w1`. This change adds explicit matches for all of the
generated copies.
Deleting a basic block removes all references from jump tables, which
is O(n). When freeing a MachineFunction, all basic blocks are deleted
before the jump tables, causing O(n^2) runtime. Fix this by deallocating
the jump table first.
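The shape of the fix, as a standalone sketch with made-up types (the actual change is in MachineFunction's teardown, not in code like this):
```cpp
#include <memory>
#include <vector>

struct Block {};
struct JumpTables {}; // deleting a Block normally erases its entries here

struct Func {
  std::vector<std::unique_ptr<Block>> Blocks;
  std::unique_ptr<JumpTables> JTI;

  ~Func() {
    // Free the jump tables first, so deleting each block no longer has to
    // walk them to remove references: O(n) total instead of O(n^2).
    JTI.reset();
    Blocks.clear();
  }
};
```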
Test case generator:
```python
import sys
n = int(sys.argv[1])
print("define void @f(i64 %c, ptr %p) {")
print(" switch i64 %c, label %d [")
for i in range(n):
    print(f" i64 {i}, label %h{i}")
print(f" ]")
for i in range(n):
    print(f'h{i}:')
    print(f' store i64 {i*i}, ptr %p')
    print(f' ret void')
print('d:')
print(' ret void')
print('}')
```
Improvement at 5000 entries:
```
Benchmark 1: ./llc.pre -filetype=obj -O0 <switch5k.bc
  Time (mean ± σ):     49.7 ms ±  1.0 ms
  Range (min … max):   48.0 ms … 52.1 ms    57 runs

Benchmark 2: ./llc.post -filetype=obj -O0 <switch5k.bc
  Time (mean ± σ):     39.4 ms ±  0.8 ms
  Range (min … max):   37.1 ms … 41.1 ms    72 runs

Summary
  ./llc.post -filetype=obj -O0 <switch5k.bc ran
    1.26 ± 0.04 times faster than ./llc.pre -filetype=obj -O0 <switch5k.bc
```
Improvement at 20000 entries:
```
Benchmark 1: ./llc.pre -filetype=obj -O0 <switch20k.bc
  Time (mean ± σ):    281.7 ms ±  1.0 ms
  Range (min … max):  280.2 ms … 283.0 ms    10 runs

Benchmark 2: ./llc.post -filetype=obj -O0 <switch20k.bc
  Time (mean ± σ):    123.9 ms ±  1.5 ms
  Range (min … max):  121.4 ms … 129.2 ms    23 runs

Summary
  ./llc.post -filetype=obj -O0 <switch20k.bc ran
    2.27 ± 0.03 times faster than ./llc.pre -filetype=obj -O0 <switch20k.bc
```
Pull Request: https://github.com/llvm/llvm-project/pull/144108
When dumping evaluate::Expr, show type names, which contain a lot of
useful information.
For example, show
```
expr <Fortran::evaluate::SomeType> {
expr <Fortran::evaluate::SomeKind<Fortran::common::TypeCategory::Integer>> {
expr <Fortran::evaluate::Type<Fortran::common::TypeCategory::Integer, 4>> {
...
```
instead of
```
expr T {
expr T {
expr T {
...
```
### Description
This patch improves the folding efficiency of `vector.insert` and
`vector.extract` operations by not returning early after successfully
converting dynamic indices to static indices.
This PR also renames the test pass `TestConstantFold` to
`TestSingleFold` and adds comprehensive documentation explaining the
single-pass folding behavior.
### Motivation
Since the `OpBuilder::createOrFold` function only calls `fold` **once**,
the current `fold` methods of `vector.insert` and `vector.extract` may
leave the op in a state that can be folded further. For example,
consider the following un-folded IR:
```
%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%c0 = arith.constant 0 : index
%e2 = vector.extract %v1[%c0] : f32 from vector<128xf32>
```
If we use `createOrFold` to create the `vector.extract` op, then the
result will be:
```
%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%e2 = vector.extract %v1[0] : f32 from vector<128xf32>
```
But this is not the optimal result. `createOrFold` should have returned
`%e1`.
The reason is that `fold` returns immediately after
`extractInsertFoldConstantOp` succeeds, causing the subsequent folding
logic to be skipped.
---------
Co-authored-by: Yang Bai <yangb@nvidia.com>
### Summary
This PR resolves https://github.com/llvm/llvm-project/issues/143864
Add the fold `(~a | x) & (a | y) -> (a & (x ^ y)) ^ y` to the
`foldMaskedMerge` function using SDPatternMatch.
After adding this pattern, `ninja check-llvm-codegen` shows that all
other cases remain unchanged, so I added a test case
(fold-masked-merge-demorgan.ll) for it.
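The identity itself can be checked exhaustively per bit; a quick standalone verification (not part of the patch):
```cpp
#include <cassert>
#include <cstdint>

// Verify (~a | x) & (a | y) == (a & (x ^ y)) ^ y over all 8-bit values;
// the identity is bitwise, so this covers every bit pattern.
int main() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned x = 0; x < 256; ++x)
      for (unsigned y = 0; y < 256; ++y) {
        uint8_t lhs = (~a | x) & (a | y);
        uint8_t rhs = (a & (x ^ y)) ^ y;
        assert(lhs == rhs);
        (void)lhs;
        (void)rhs;
      }
  return 0;
}
```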
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
Previously, the segmented iterator optimization was limited to `std::{for_each, for_each_n}`. This patch
extends the optimization to `std::ranges::for_each` and `std::ranges::for_each_n`, ensuring consistent
optimizations across these algorithms. This patch first generalizes the `std` algorithms by introducing
a `Projection` parameter, which is set to `__identity` for the `std` algorithms. Then we let the `ranges`
algorithms directly call their `std` counterparts with a general `__proj` argument. Benchmarks
demonstrate performance improvements of up to 21.4x for ``std::deque::iterator`` and 22.3x for
``join_view`` of ``vector<vector<char>>``.
Addresses a subtask of #102817.
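A simplified sketch of the layering (names here are illustrative, not libc++'s internal spellings): the classic algorithm gains a projection parameter defaulted to identity, and the ranges overload forwards the user's projection to the same implementation.
```cpp
#include <functional>

// Shared implementation; in libc++ the segmented-iterator dispatch would
// live at this level, so both entry points benefit from it.
template <class It, class Sent, class Fn, class Proj>
void for_each_impl(It first, Sent last, Fn f, Proj proj) {
  for (; first != last; ++first)
    f(std::invoke(proj, *first));
}

// std-style entry point: projection fixed to identity.
template <class It, class Fn>
void classic_for_each(It first, It last, Fn f) {
  for_each_impl(first, last, f, std::identity{});
}

// ranges-style entry point: forwards the user's projection unchanged.
template <class It, class Sent, class Fn, class Proj = std::identity>
void ranges_for_each(It first, Sent last, Fn f, Proj proj = {}) {
  for_each_impl(first, last, f, proj);
}
```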
Change the test to check the exit status of the 'ls' command line
(instead of the error message), since the error message differs depending
on the host machine on which 'ls' is run.
Make as many of the reportXX functions that take pointers as possible
take const pointers. This avoids the need to use const_cast when calling
these functions on an already-const pointer.
Fix reportHeaderCorruption calls where an argument was passed for an
append call that didn't use it.
This simply copies the structure of the vector.reverse patterns from
just above, and reimplements them for the vp.reverse intrinsics when the
mask is all ones and the EVLs exactly match.
It's unfortunate that we have three different ways to represent a reverse
(shuffle, vector.reverse, and vp.reverse), but I don't see an obvious way
to remove any of them because the semantics are slightly different.
This significantly improves vectorization in TSVC_2's s112 and s1112
loops when using EVL tail folding.
The message "The atomic variable x should occur exactly once among the
arguments of the top-level [...] operator" was intended to convey that
(1) an atomic variable should be an argument, and (2) it should be
exactly one of the arguments. However, the wording turned out to sow
confusion instead.
Rework the corresponding check, and emit an individual error message for
each problematic situation:
- "atomic variable cannot be a proper subexpression of an argument",
- "atomic variable should appear as an argument",
- "atomic variable should be exactly one of the arguments".
Fixes https://github.com/llvm/llvm-project/issues/144599
On big-endian targets we need to use ld1/st1 for vector loads and stores so
that we get the elements in the correct order, but this prevents
postindex addressing from being used. Fix this by adding the appropriate
ISel patterns, plus the relevant changes in ISelLowering and
ISelDAGToDAG to cause postindex addressing to be used.
The compiler-rt/lib/fuzzer/tests build was failing on armv7 with
undefined references to unwinder symbols such as __aeabi_unwind_cpp_pr0.
This occurs because the test is built with `-nostdlib++` but `libunwind`
is not explicitly linked to the final test executable.
This patch resolves the issue by adding CMake logic to explicitly link
the required unwinder to the fuzzer tests, inspired by the same solution
used to fix Scudo build failures in https://reviews.llvm.org/D142888.
Following the addition of TensorLike and BufferLike type interfaces (see
00eaff3e9c), introduce minimal changes
required to bufferize a custom tensor operation into a custom buffer
operation.
To achieve this, new interface methods are added to the TensorLike type
interface that abstract away the differences between the existing (tensor
-> memref) and custom conversions.
The scope of the changes is intentionally limited (for example,
BufferizableOpInterface is untouched) in order to first understand the
basics and reach consensus design-wise.
---
Notable changes:
* mlir::bufferization::getBufferType() returns BufferLikeType (instead
of BaseMemRefType)
* ToTensorOp / ToBufferOp operate on TensorLikeType / BufferLikeType.
Operation argument "memref" renamed to "buffer"
* ToTensorOp's tensor type inferring builder is dropped (users now need
to provide the tensor type explicitly)
`__has_iterator_typedefs` is only used in the up-to-C++17 implementation
of `type_traits`. To make that clearer, the struct is moved into that
code block.