- Add a `gpr32` suffix to the test name to denote the specific register
class being checked
- Replace `-mtriple=arm64-apple-ios` with `-march=arm64` to broaden the
test to the generic architecture, since the specific triple is not
required
- Port `bl` match to Linux too via the regex: `{{_?foo}}`
- Bump `-mcpu=cyclone` to the newer M-series `-mcpu=apple-m1`
- Use `-mcpu` so that `-mattr=-zcm` has a real effect
- Add a test that generic arm64 doesn't optimize for ZCM
- Distinguish 4 different assembly layouts: NOTCPU, CPU, NOTATTR, ATTR
- Fix broken test logic. For example, `; NOT: mov [[REG2:w[0-9]+]], w3`
matched `mov w1, w3`, so `REG2` captured `w1`, but then `; NOT: mov w1,
[[REG2]]` matched the prefix of `mov w1, w19` even though it should only
have matched `mov w1, w1`. This change adds explicit matches for all of
the generated copies.
Deleting a basic block removes all references from jump tables, which
is O(n). When freeing a MachineFunction, all basic blocks are deleted
before the jump tables, causing O(n^2) runtime. Fix this by deallocating
the jump table first.
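To illustrate the ordering issue, here is a minimal, self-contained sketch using hypothetical stand-ins (not LLVM's actual MachineFunction/MachineJumpTableInfo classes): if each block removal scans the jump tables for references, freeing the jump tables first makes that scan a no-op.
```
#include <memory>
#include <vector>

struct JumpTables {
  std::vector<std::vector<int>> Entries; // block ids referenced by each table
};

struct Function {
  std::vector<int> Blocks;               // stand-in for the basic block list
  std::unique_ptr<JumpTables> JTI;

  void eraseBlock(int Id) {
    if (JTI)                             // O(n) scan over jump-table entries
      for (auto &Table : JTI->Entries)
        std::erase(Table, Id);
  }

  ~Function() {
    JTI.reset();                         // free jump tables first...
    for (int Id : Blocks)                // ...so teardown is O(n), not O(n^2)
      eraseBlock(Id);
  }
};
```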
Test case generator:
```
import sys
n = int(sys.argv[1])
print("define void @f(i64 %c, ptr %p) {")
print(" switch i64 %c, label %d [")
for i in range(n):
    print(f" i64 {i}, label %h{i}")
print(f" ]")
for i in range(n):
    print(f'h{i}:')
    print(f' store i64 {i*i}, ptr %p')
    print(f' ret void')
print('d:')
print(' ret void')
print('}')
```
Improvement at 5000 entries:
Benchmark 1: ./llc.pre -filetype=obj -O0 <switch5k.bc
Time (mean ± σ): 49.7 ms ± 1.0 ms
Range (min … max): 48.0 ms … 52.1 ms 57 runs
Benchmark 2: ./llc.post -filetype=obj -O0 <switch5k.bc
Time (mean ± σ): 39.4 ms ± 0.8 ms
Range (min … max): 37.1 ms … 41.1 ms 72 runs
Summary
./llc.post -filetype=obj -O0 <switch5k.bc ran
1.26 ± 0.04 times faster than ./llc.pre -filetype=obj -O0 <switch5k.bc
Improvement at 20000 entries:
Benchmark 1: ./llc.pre -filetype=obj -O0 <switch20k.bc
Time (mean ± σ): 281.7 ms ± 1.0 ms
Range (min … max): 280.2 ms … 283.0 ms 10 runs
Benchmark 2: ./llc.post -filetype=obj -O0 <switch20k.bc
Time (mean ± σ): 123.9 ms ± 1.5 ms
Range (min … max): 121.4 ms … 129.2 ms 23 runs
Summary
./llc.post -filetype=obj -O0 <switch20k.bc ran
2.27 ± 0.03 times faster than ./llc.pre -filetype=obj -O0 <switch20k.bc
Pull Request: https://github.com/llvm/llvm-project/pull/144108
When dumping evaluate::Expr, show type names, which contain a lot of
useful information.
For example, show
```
expr <Fortran::evaluate::SomeType> {
expr <Fortran::evaluate::SomeKind<Fortran::common::TypeCategory::Integer>> {
expr <Fortran::evaluate::Type<Fortran::common::TypeCategory::Integer, 4>> {
...
```
instead of
```
expr T {
expr T {
expr T {
...
```
### Description
This patch improves the folding efficiency of `vector.insert` and
`vector.extract` operations by not returning early after successfully
converting dynamic indices to static indices.
This PR also renames the test pass `TestConstantFold` to
`TestSingleFold` and adds comprehensive documentation explaining the
single-pass folding behavior.
### Motivation
Since the `OpBuilder::createOrFold` function only calls `fold` **once**,
the current `fold` methods of `vector.insert` and `vector.extract` may
leave the op in a state that can be folded further. For example,
consider the following un-folded IR:
```
%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%c0 = arith.constant 0 : index
%e2 = vector.extract %v1[%c0] : f32 from vector<128xf32>
```
If we use `createOrFold` to create the `vector.extract` op, then the
result will be:
```
%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%e2 = vector.extract %v1[0] : f32 from vector<128xf32>
```
But this is not the optimal result. `createOrFold` should have returned
`%e1`.
The reason is that `fold` returns immediately after
`extractInsertFoldConstantOp` succeeds, causing the subsequent folding
logic to be skipped.
---------
Co-authored-by: Yang Bai <yangb@nvidia.com>
### Summary
This PR resolves https://github.com/llvm/llvm-project/issues/143864
Add the fold `(~a | x) & (a | y) -> (a & (x ^ y)) ^ y` to the
foldMaskedMerge function using SDPatternMatch.
After adding this pattern, running `ninja check-llvm-codegen` shows that
all other cases remain unchanged, so I added a test case
(fold-masked-merge-demorgan.ll) for it.
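As a quick sanity check of the identity (not part of the patch), it can be verified exhaustively over 8-bit values:
```
#include <cassert>
#include <cstdint>

int main() {
  // Check (~a | x) & (a | y) == (a & (x ^ y)) ^ y for all byte combinations.
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned x = 0; x < 256; ++x)
      for (unsigned y = 0; y < 256; ++y) {
        std::uint8_t lhs = (~a | x) & (a | y);
        std::uint8_t rhs = (a & (x ^ y)) ^ y;
        assert(lhs == rhs);
      }
  return 0;
}
```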
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
Previously, the segmented iterator optimization was limited to `std::{for_each, for_each_n}`. This patch
extends the optimization to `std::ranges::for_each` and `std::ranges::for_each_n`, ensuring consistent
optimizations across these algorithms. This patch first generalizes the `std` algorithms by introducing
a `Projection` parameter, which is set to `__identity` for the `std` algorithms. Then we let the `ranges`
algorithms directly call their `std` counterparts with a general `__proj` argument. Benchmarks
demonstrate performance improvements of up to 21.4x for ``std::deque::iterator`` and 22.3x for
``join_view`` of ``vector<vector<char>>``.
Addresses a subtask of #102817.
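A rough, self-contained sketch of the shape of the change (hypothetical helper name, not libc++'s actual internals): the shared helper gains a projection parameter defaulting to the identity, and the ranges overloads forward their projection to it.
```
#include <functional>
#include <iostream>
#include <vector>

// Shared helper: a real implementation would also dispatch to a
// segment-by-segment loop for segmented iterators (e.g. deque, join_view).
template <class It, class Fn, class Proj = std::identity>
void for_each_impl(It first, It last, Fn f, Proj proj = {}) {
  for (; first != last; ++first)
    f(std::invoke(proj, *first));
}

int main() {
  std::vector<int> v{1, 2, 3};
  // "std::for_each" flavour: the projection defaults to the identity.
  for_each_impl(v.begin(), v.end(), [](int x) { std::cout << x << ' '; });
  // "ranges::for_each" flavour: forwards its projection to the same helper.
  for_each_impl(v.begin(), v.end(), [](int x) { std::cout << x << ' '; },
                [](int x) { return x * 10; });
  std::cout << '\n';
}
```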
Change the test to check the exit status of the 'ls' command line
(instead of the error message), since the error message differs when
running the 'ls' command on different host machines.
Mark the pointer parameters of as many of the reportXX functions as
possible const. This avoids the need to use const_cast when calling
these functions with an already-const pointer.
Fix reportHeaderCorruption calls where an argument was passed into an
append call that didn't use it.
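A generic illustration of the motivation (the function names below are made up, not the actual report functions): a const pointer parameter removes the need for const_cast at const call sites.
```
#include <cstdio>

// Before: a non-const pointer parameter forces const_cast at const call sites.
static void reportMutable(char *Name) { std::printf("%s\n", Name); }

// After: a const pointer parameter lets const callers pass their data directly.
static void reportConst(const char *Name) { std::printf("%s\n", Name); }

static void check(const char *Name) {
  reportMutable(const_cast<char *>(Name)); // cast needed before the change
  reportConst(Name);                       // no cast needed after
}

int main() { check("corrupt header"); }
```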
This simply copies the structure of the vector.reverse patterns from
just above, and reimplements them for the vp.reverse intrinsics when the
mask is all ones and the EVLs exactly match.
It's unfortunate that we have three different ways to represent a reverse
(shuffle, vector.reverse, and vp.reverse), but I don't see an obvious way
to remove any of them because the semantics are slightly different.
This significantly improves vectorization in TSVC_2's s112 and s1112
loops when using EVL tail folding.
The message "The atomic variable x should occur exactly once among the
arguments of the top-level [...] operator" was intended to convey that
(1) an atomic variable should be an argument, and (2) it should be
exactly one of the arguments. However, the wording turned out to be
sowing confusion instead.
Rework the corresponding check, and emit an individual error message for
each problematic situation:
- "atomic variable cannot be a proper subexpression of an argument",
- "atomic variable should appear as an argument",
- "atomic variable should be exactly one of the arguments".
Fixes https://github.com/llvm/llvm-project/issues/144599
When big-endian we need to use ld1/st1 for vector loads and stores so
that we get the elements in the correct order, but this prevents
postindex addressing from being used. Fix this by adding the appropriate
ISel patterns, plus the relevant changes in ISelLowering and
ISelDAGToDAG to cause postindex addressing to be used.
compiler-rt/lib/fuzzer/tests build was failing on armv7, with undefined
references to unwinder symbols, such as __aeabi_unwind_cpp_pr0.
This occurs because the test is built with `-nostdlib++` but `libunwind`
is not explicitly linked to the final test executable.
This patch resolves the issue by adding CMake logic to explicitly link
the required unwinder to the fuzzer tests, inspired by the same solution
used to fix Scudo build failures by https://reviews.llvm.org/D142888.
Following the addition of TensorLike and BufferLike type interfaces (see
00eaff3e9c), introduce minimal changes
required to bufferize a custom tensor operation into a custom buffer
operation.
To achieve this, new interface methods are added to TensorLike type
interface that abstract away the differences between existing (tensor ->
memref) and custom conversions.
The scope of the changes is intentionally limited (for example,
BufferizableOpInterface is untouched) in order to first understand the
basics and reach consensus design-wise.
---
Notable changes:
* mlir::bufferization::getBufferType() returns BufferLikeType (instead
of BaseMemRefType)
* ToTensorOp / ToBufferOp operate on TensorLikeType / BufferLikeType.
Operation argument "memref" renamed to "buffer"
* ToTensorOp's tensor type inferring builder is dropped (users now need
to provide the tensor type explicitly)
`__has_iterator_typedefs` is only used in the up-to-C++17 implementation
of `type_traits`. To make that clearer the struct is moved into that
code block.
A while ago, the test workflow was updated with a new preemption regex;
however, it was only applied to the test job, and not to the job that
actually restarts the failed libc++ test runs.
This fix should correct the issue and get the restarter working
again.
There have been a number of deprecation warnings added to Flang; however,
these features are only deprecated when the OpenMP version being used is
5.2 or later. Previously, flang did not take the version into account
when emitting the warnings, so they were always emitted.
Flang now ensures warnings are emitted for the appropriate version of
OpenMP, and tests are updated to reflect this change.
Background: The yaml-strtab format looks just like the yaml format,
except that the values in the key/value pairs of the remarks are
deduplicated and replaced by indices into a string table (see removed
test cases for examples). The motivation behind this format was to
reduce size of the remarks files. However, it was quickly superseded by
the bitstream format.
Therefore, remove the yaml-strtab format, as it doesn't have a good
use case anymore:
- It isn't particularly efficient
- It isn't human-readable
- It isn't straightforward to parse in external tools that can't use the
remarks library. We don't even support it in opt-viewer.
llvm-remarkutil is also missing options to parse/convert yaml-strtab, so
the chance that anyone is actually using this format is low.
I don't think DO CONCURRENT fits the definition of a Canonical Loop Nest
(OpenMP 6.0 section 6.4.1).
It is however explicitly allowed for the LOOP construct (6.0 section
13.8).
There's some obscure language in OpenMP 6.0 for the LOOP construct:
> If the collapsed loop is a DO CONCURRENT loop, neither the
> data-sharing attribute clauses nor the collapse clause may be
> specified.
From the surrounding context, I think "collapsed loop" just means the
loop that the LOOP construct applies to. So I will interpret this to
mean that DO CONCURRENT can only be used with the LOOP construct if it
does not contain the COLLAPSE clause.
This also fixes a bug where the associated clause was never cleared
after it was set.
Fixes #144178
Both the declare mapper directive argument, and the iterator modifier
can contain declaration-type-spec, so make sure that the processing of
one ends before processing of the other begins in semantic analysis.
`unresolvedMaterializations` is a mapping from
`UnrealizedConversionCastOp` to `UnresolvedMaterializationRewrite`. This
mapping is needed to find the correct type converter for an unresolved
materialization.
With this commit, `unresolvedMaterializations` is updated immediately
when an op is being erased. This also cleans up the code base a bit:
`SingleEraseRewriter` is now used only during the "cleanup" phase and no
longer needed as a field of `ConversionRewriterImpl`.
This commit is in preparation for the One-Shot Dialect Conversion
refactoring: `allowPatternRollback = false` will in the future trigger
immediate materialization of all IR changes.
Generally, bufferization should be able to create a memref from a tensor
without needing to know more than just a mlir::Type. Thus, change
BufferizationOptions::UnknownTypeConverterFn to accept just a type
(mlir::TensorType for now) instead of mlir::Value. Additionally, apply
the same rationale to getMemRefType() helper function.
Both changes are prerequisites to enable custom types support in
one-shot bufferization.
Currently, when hazard-padding is enabled, a (fixed-size) hazard slot is
placed in the CS area, just after the frame record. The size of this
slot is part of the "CalleeSaveBaseToFrameRecordOffset". The SVE
epilogue emission code assumed this offset was always zero and
incorrectly set the stack pointer, resulting in all SVE registers
being reloaded from incorrect offsets.
```
| prev_lr                           |
| prev_fp                           |
| (a.k.a. "frame record")           |
|-----------------------------------| <- fp(=x29)
| <hazard padding>                  |
|-----------------------------------| <- callee-saved base
|                                   |
| callee-saved fp/simd/SVE regs     |
|                                   |
|-----------------------------------| <- SVE callee-save base
```
i.e. in the above diagram, the code assumed `fp == callee-saved base`.
The expansion of `memref.atomic_rmw` into a `memref.generic_atomic_rmw`
for floating-point min/max operations is no longer necessary as those
are now supported by the LLVM dialect and LLVM IR.
Furthermore, combining this expansion with direct lowering of
`generic_atomic_rmw` could lead to invalid LLVM dialect IR with
`cmpxchg` operating on floating-point values that it does not support.
Fuchsia sets CLANG_DEFAULT_UNWINDLIB to libunwind. As a result, when
rtlib is set to libgcc and unwindlib is not explicitly specified, tests
using Fuchsia as the default platform will fail. To address this, the
affected tests are now xfailed.
This change fixes the following tests introduced in
45ea46c446:
clang/test/Driver/aarch64-toolchain-extra.c
clang/test/Driver/arm-toolchain-extra.c
clang/test/Driver/aarch64-toolchain.c
clang/test/Driver/arm-toolchain.c