clang-p2996

Author	SHA1	Message	Date
Fraser Cormack	95c683fc1b	[libclc] Move logb/ilogb to CLC library; optimize (#128028 ) This commit moves the logb and ilogb builtins to the CLC library. It simultaneously optimizes them both for vector types and for half types. Vector types were being scalarized in some cases. Half types were previously promoting to float, whereas this commit provides them a native implementation. Everything passes the OpenCL-CTS. I had to intuit some magic numbers used by these implementations in order to generate the half variants. I gave them clearer definitions derived from what I believe are their actual component numbers, but named them 'magic' to convey that they weren't derived from first principles.	2025-05-13 11:47:35 +01:00
Fraser Cormack	0e8f0b51ff	[libclc][NFC] Fix return after else	2025-05-13 11:46:26 +01:00
Fraser Cormack	655151a7e0	[libclc] Move (fast) length & distance to CLC library (#139701 ) This commit also refactors how geometric builtins are defined and declared, by sharing more helpers. It also removes an unnecessary gentype-like helper in favour of the more complete math/gentype.inc. There are no changes to the IR for any of these four builtins. The 'normalize' builtin will follow in a subsequent commit because it would involve the addition of missing halfn-type overloads for completeness.	2025-05-13 11:45:55 +01:00
Paul Walker	49ee674e5d	[NFC][LLVM][CodeGen][X86] Add ConstantInt/FP based vector support to MachineInstr fixup and printing code. (#137331 ) When -use-constant-{int,fp}-for-fixed-length-splat are enabled, constant vector splats take the form of ConstantInt/FP instead of ConstantVector. These constants get linked to MachineInstrs via constant pools for later processing. The processing assumes ConstantInt/FP to always represent scalar constants with this PR extending the code to support vector types. NOTE: The test choices are somewhat artificial because pretty much all the vector tests failed without these changes when the new constants are enabled. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-05-13 11:33:07 +01:00
Aaron Ballman	7866c4091e	Fix crash with invalid member function param list (#139595 ) We cannot consume annotation tokens with ConsumeToken(), so any pragmas present in an invalid initializer would previously crash. Now we handle annotation tokens more generally and avoid the crash. Fixes #113722	2025-05-13 06:31:10 -04:00
Ivan Butygin	91f3cdbd4f	[mlir][gpu] Pattern to promote `gpu.shuffle` to specialized AMDGPU ops (#137109 ) Only swizzle promotion for now, may add DPP ops support later.	2025-05-13 13:26:46 +03:00
jyli0116	382ad6f2e7	[GISel][AArch64] Added more efficient lowering of Bitreverse (#139233 ) GlobalISel was previously inefficient in handling bitreverses of vector types. This deals with i16, i32, i64 vector types and converts them into i8 bitreverses and rev instructions.	2025-05-13 11:21:50 +01:00
Kadir Cetinkaya	3009aa75ca	[clang][Tooling] Extend special symbol mappings for (U)INTN_C	2025-05-13 12:14:09 +02:00
yanming	63ad1492dc	[mlir][NFC] Fix the MLIR example format to conform to SSA form.	2025-05-13 18:08:14 +08:00
Wang Qiang	cece058191	[llvm][mlir][NFC] Fix typos in comments and test descriptions (#139688 ) This patch fixes several typographical errors in comments and test files: 1. Corrected "achive" to "archive" in archive-update.test. 2. Fixed "achive" to "achieve" in a comment in XeGPUSubgroupDistribute.cpp. 3. Corrected "achived" to "achieved" in a test note in SimpleSIVNoValidityCheckFixedSize.ll. These changes are non-functional and intended to improve readability and documentation accuracy. Signed-off-by: Kane Wang <wangqiang1@kylinos.cn> Co-authored-by: Kane Wang <wangqiang1@kylinos.cn>	2025-05-13 11:03:51 +01:00
Pierre van Houtryve	2278f5e65b	[AMDGPU] Hoist readlane/readfirstlane through unary/binary operands (#129037 ) When a read(first)lane is used on a binary operator and the intrinsic is the only user of the operator, we can move the read(first)lane into the operand if the other operand is uniform. Unfortunately IC doesn't let us access UniformityAnalysis and thus we can't truly check uniformity; we have to make do with a basic uniformity check which only allows constants or trivially uniform intrinsic calls. We can also do the same for unary and cast operators.	2025-05-13 12:00:49 +02:00
David Spickett	d05854dfc8	[llvm][docs] Use default checkout location in test suite guide (#139264 ) Step 2 tells you to checkout "llvm-test-suite" to "test-suite", but I don't see a particular reason to use a non-default path. If you're following the instructions exactly, it all works, but if you autopilot that step it is surprising later when things do not work. It's not hard for an individual to fix later, but we should suggest the least surprising thing where we can.	2025-05-13 10:58:29 +01:00
Jay Foad	28b7d6621a	[TableGen][CodeGen] Give every leaf register a unique regunit (#139526 ) Give every leaf register a unique regunit, even if it has ad hoc aliases. Previously only leaf registers without ad hoc aliases would get a unique regunit, but that caused situations where regunits could not be used to distinguish a register from its subregs. For example: - Registers A and B alias. They both get regunit 0 only. - Register C has subregs A and B. It inherits regunits from its subregs, so it also gets regunit 0 only. After this fix, registers A and B will get a unique regunit in addition to the regunit representing the alias, for example: - A will get regunits 0 and 1. - B will get regunits 0 and 2. - C will get regunits 0, 1 and 2.	2025-05-13 10:52:36 +01:00
David Green	671cef029f	[AggressiveInstcombine] Fold away shift in or reduction chain. (#137875 ) If we have `icmp eq or(a, shl(b)), 0` then the shift can be removed so long as it is nuw or nsw. It is still comparing that some bits are non-zero. https://alive2.llvm.org/ce/z/nhrBVX. This is also true of ne, and true for longer or chains.	2025-05-13 10:33:38 +01:00
Nuko Y.	69f4e60093	[AArch64][test] Fix test failing on unknown options (#139696 ) Fixes buildbot failure https://lab.llvm.org/buildbot/#/builders/16/builds/18873 originating from #138448. Normally ignored silently but fails on higher error levels. Buildbot errors: ``` /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/llc < /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/AArch64/reserveXreg.ll -mtriple=aarch64-unknown-linux-gnu \| /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/AArch64/reserveXreg.ll # RUN: at line 6 + /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/AArch64/reserveXreg.ll + /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/llc -mtriple=aarch64-unknown-linux-gnu '+reserve-x8' is not a recognized feature for this target (ignoring feature) '+reserve-x8' is not a recognized feature for this target (ignoring feature) '+reserve-x16' is not a recognized feature for this target (ignoring feature) '+reserve-x16' is not a recognized feature for this target (ignoring feature) '+reserve-x17' is not a recognized feature for this target (ignoring feature) '+reserve-x17' is not a recognized feature for this target (ignoring feature) ```	2025-05-13 10:31:35 +01:00
lorenzo chelini	61536f2781	[mlir] Retire additional `let constructor` (NFC) (#139390 ) Three main changes: - The pass createRequestCWrappersPass is renamed as createLLVMRequestCWrappersPass - createOptimizeForTargetPass is now under the LLVM namespace. It’s unclear why the NVVM namespace was used initially, as all passes in LLVMIR/Transforms/Passes.h consistently reside in the LLVM namespace. - DuplicateFunctionEliminationPass is now in the func namespace.	2025-05-13 11:15:29 +02:00
Tom Eccles	8ecb958b8f	[flang][OpenMP][Semantics] resolve objects in the flush arg list (#139522 ) Fixes #136583 Normally the flush argument list would contain a DataRef to some variable. All DataRefs are handled generically in resolve-names and so the problem wasn't observed. But when a common block name is specified, this is not parsed as a DataRef. There was already handling in resolve-directives for OmpObjectList but not for argument lists. I've added a visitor for FLUSH which ensures all of the arguments have been resolved. The test is there to make sure the compiler doesn't crash when encountering the unresolved symbol. It shows that we currently deny flushing a common block. I'm not sure that it is right to restrict common blocks from flush argument lists, but fixing that can come in a different patch. This one is fixing an ICE.	2025-05-13 10:14:02 +01:00
Timm Baeder	83ce8a44bb	[clang][bytecode] Get BuiltinID from the direct callee (#139675 ) getBuiltinCallee() just checks the direct callee for its builtin id anyway, so let's do this ourselves.	2025-05-13 11:11:47 +02:00
Lucas Ramirez	6456ee056f	Reapply "[AMDGPU][Scheduler] Refactor ArchVGPR rematerialization during scheduling (#125885 )" (#139548 ) This reapplies `067caaa` and `382a085` (reverting `b35f6e2`) with fixes to issues detected by the address sanitizer (MIs have to be removed from live intervals before being removed from their parent MBB). Original commit description below. AMDGPU scheduler's `PreRARematStage` attempts to increase function occupancy w.r.t. ArchVGPR usage by rematerializing trivial ArchVGPR-defining instruction next to their single use. It first collects all eligible trivially rematerializable instructions in the function, then sinks them one-by-one while recomputing occupancy in all affected regions each time to determine if and when it has managed to increase overall occupancy. If it does, changes are committed to the scheduler's state; otherwise modifications to the IR are reverted and the scheduling stage gives up. In both cases, this scheduling stage currently involves repeated queries for up-to-date occupancy estimates and some state copying to enable reversal of sinking decisions when occupancy is revealed not to increase. The current implementation also does not accurately track register pressure changes in all regions affected by sinking decisions. This commit refactors this scheduling stage, improving RP tracking and splitting the stage into two distinct steps to avoid repeated occupancy queries and IR/state rollbacks. - Analysis and collection (`canIncreaseOccupancyOrReduceSpill`). The number of ArchVGPRs to save to reduce spilling or increase function occupancy by 1 (when there is no spilling) is computed. Then, instructions eligible for rematerialization are collected, stopping as soon as enough have been identified to be able to achieve our goal (according to slightly optimistic heuristics). If there aren't enough of such instructions, the scheduling stage stops here. - Rematerialization (`rematerialize`). Instructions collected in the first step are rematerialized one-by-one. Now we are able to directly update the scheduler's state since we have already done the occupancy analysis and know we won't have to rollback any state. Register pressures for impacted regions are recomputed only once, as opposed to at every sinking decision. In the case where the stage attempted to increase occupancy, and if both rematerializations alone and rescheduling after were unable to improve occupancy, then all rematerializations are rolled back.	2025-05-13 11:11:00 +02:00
Timm Baeder	3de2fa91e1	[clang][bytecode] Avoid classifying in visitArrayElemInit() (#139674 ) We usually call this more than once, but the type of the initializer never changes. Let's classify only once and pass that to visitArrayElemInit().	2025-05-13 11:01:59 +02:00
Hans Wennborg	fd3fecfc09	Revert "[lld] Merge equivalent symbols found during ICF (#134342 )" The change would also merge non-equivalent symbols under some circumstances, see comment with a reproducer on the PR. > Fixes a correctness issue for AArch64 when ADRP and LDR instructions are > outlined in separate sections and sections are fed to ICF for > deduplication. > > See test case (based on > https://github.com/llvm/llvm-project/issues/129122) for details. All > rodata.* sections are folded into a single section with ICF. This leads > to all f2_* function sections getting folded into one (as their > relocation target symbols g* belong to .rodata.g* sections that have > already been folded into one). Since relocations still refer original g* > symbols, we end up creating duplicate GOT entry for all such symbols. > This PR addresses that by tracking such folded symbols and create one > GOT entry for all such symbols. > > Fixes https://github.com/llvm/llvm-project/issues/129122 > > Co-authored by: @jyknight This reverts commit `8389d6fad7`.	2025-05-13 10:57:46 +02:00
Timm Baeder	98763433e6	[clang][bytecode] Optimize enum value range checks (#139672 ) Only do the work if we really have to.	2025-05-13 10:55:24 +02:00
Matt Arsenault	6d35ec2335	ObjCARC: Fix regression from using ConstantData uselists (#139609 ) Fixes regression after `9383fb23e1`	2025-05-13 10:52:49 +02:00
Jacques Pienaar	c78e65cc98	[lldb][plugin] Use counter directly for number of readers (#139252 ) Here we were initializing & locking a shared_mutex in a thread, while releasing it in the parent which may/often turned out to be a different thread (shared_mutex::unlock_shared is undefined behavior if called from a thread that doesn't hold the lock). Switch to counter to more simply keep track of number of readers and simply lock/unlock rather than utilizing reader mutex to verify last freed (and so requiring this matching thread init/destroy behavior).	2025-05-13 01:52:36 -07:00
Florian Hahn	ba2dacd276	[VPlan] Print use and definition in verifier on violation. Improves the error message when a use comes before the def by including the use and def, when print utilities are available.	2025-05-13 09:52:02 +01:00
David Green	137aa573ca	[GlobalISel] Add computeNumSignBits for G_BUILD_VECTOR. (#139506 ) The code is similar to SelectionDAG::ComputeNumSignBits, but does not deal with truncating buildvectors.	2025-05-13 09:36:14 +01:00
Daan De Meyer	cdbc297ef5	include-cleaner: Report function decls from __cleanup__ as used (#138669 )	2025-05-13 10:22:32 +02:00
David Green	d2dafded03	[AArch64] Minor test cleanup for postselectopt-dead-cc-defs.mir. NFC Remove the duplicate definition of %12	2025-05-13 09:12:25 +01:00
drazi	eea1e50ac2	[mlir-tblgen] trim method body to empty with only spaces to avoid crash (#139568 ) The method body or default impl must be truly empty. Even if they contain only spaces, ``mlir-tblgen`` considers them non-empty and generates invalid code that leads to a segmentation fault. This is very hard to debug. ```c++ InterfaceMethod< ... /*methodBody=*/ [{ }], // This must be truly empty. Leaving a space here can lead to a segmentation fault whose cause is hard to figure out /*defaultImpl=*/ [{ ... }] ``` This PR trims spaces when the method body or default implementation of an interface method is not empty. Now ``mlir-tblgen`` generates valid code even when they contain only spaces. --------- Co-authored-by: Fung Xie <ftse@nvidia.com> Co-authored-by: Mehdi Amini <joker.eph@gmail.com>	2025-05-13 10:03:06 +02:00
Kohei Yamaguchi	f92dd0083e	[mlir][docs] Add quant dialect pass doc into Passes.md (NFC) (#139363 ) This PR added documentation for the quant dialect passes to `Passes.md`, as it had not been included.	2025-05-13 17:00:45 +09:00
Igor Kirillov	a3fb54c1ae	[LAA][NFC] Unify naming of DepCandidates to DepCands (#139534 ) The MemoryDepChecker::DepCandidates instance in each LoopAccessInfo had multiple names (AccessSets, DepCands, DependentAccesses), which was confusing. This patch renames all references to DepCands for consistency.	2025-05-13 08:52:46 +01:00
Florian Hahn	5c7bc6a0e6	[ComplexDeinterleave] Don't try to combine single FP reductions. (#139469 ) Currently the pass tries to combine floating point reductions, without checking for the correct fast-math flags and it also creates invalid IR (using llvm.reduce.add for FP types). For now, just bail out for non-integer types. PR: https://github.com/llvm/llvm-project/pull/139469	2025-05-13 08:44:11 +01:00
Piotr Fusik	3cfdf2ccdf	[RISCV] Handle more (add x, C) -> (sub x, -C) cases (#138705 ) This is a follow-up to #137309, adding: - multi-use of the constant with different adds - vectors (vadd.vx -> vsub.vx)	2025-05-13 09:12:24 +02:00
Antonio Frighetto	adfd59fdb8	[InstCombine] Introduce `foldICmpBinOpWithConstantViaTruthTable` folding Match icmps of binops where both operands are select with constant arms, i.e., `icmp pred (select A ? C1 : C2) binop (select B ? C3 : C4), C5`. Fold such patterns by creating a truth table of the possible four constant variants, and materialize back the optimal logic from it via `createLogicFromTable` helper. This also generalizes an existing fold, which has therefore been dropped. Proofs: https://alive2.llvm.org/ce/z/NS7Vzu. Fixes: https://github.com/llvm/llvm-project/issues/138212.	2025-05-13 09:04:25 +02:00
Antonio Frighetto	1bfd94b1b9	[InstCombine] Precommit tests for PR139109 (NFC)	2025-05-13 09:03:56 +02:00
Jim Lin	9f274a95b1	[RISCV] Fix indentation for riscv_corev_alu.h in CMakeLists.txt. NFC.	2025-05-13 14:46:08 +08:00
Iris Shi	6abf5b94da	[RISCV][NFC] Fix typos in `RISCVSchedule.td`	2025-05-13 14:32:32 +08:00
Kazu Hirata	13d80b4b12	[AST] Use llvm::upper_bound (NFC) (#139664 )	2025-05-12 23:24:46 -07:00
Kazu Hirata	75e0865837	[clang-tools-extra] Use llvm::unique (NFC) (#139663 )	2025-05-12 23:24:24 -07:00
Kazu Hirata	c95745f2db	[llvm] Use StringRef::{starts_with,find} (NFC) (#139661 ) Calling find/contains in the StringRef domain allows us to avoid creating temporary instances of std::string.	2025-05-12 23:24:07 -07:00
Kazu Hirata	294eb7670f	[TableGen] Fix a warning This patch fixes an unused parameter warning with gcc7 under the release configuration.	2025-05-12 23:18:30 -07:00
Timm Baeder	79eed76c58	[clang][bytecode][NFC] Remove incorrect comment (#139571 ) We don't create function frames for builtin functions anymore.	2025-05-13 08:09:26 +02:00
Helena Kotas	03934d0a21	[DirectX] Implement DXILResourceImplicitBinding pass (#138043 ) The `DXILResourceImplicitBinding` pass uses the results of `DXILResourceBindingAnalysis` to assign register slots to resources that do not have explicit binding. It replaces all `llvm.dx.resource.handlefromimplicitbinding` calls with `llvm.dx.resource.handlefrombinding` using the newly assigned binding. If a binding cannot be found for a resource, the pass will raise a diagnostic error. Currently this diagnostic message does not include the resource name, which will be addressed in a separate task (#137868). Part 2/2 of #136786 Closes #136786	2025-05-12 23:00:00 -07:00
Kazu Hirata	383a825d6d	[BOLT] Use StringRef::contains (NFC) (#139658 ) Once we convert EventNames to StringRef, which is cheap, we can call StringRef::contains without creating a temporary instance of std::string.	2025-05-12 22:59:26 -07:00
Kazu Hirata	0fedccf389	[IR] Use llvm::upper_bound (NFC) (#139656 )	2025-05-12 22:59:05 -07:00
Kazu Hirata	e6e50170b9	[CodeGen] Use llvm::lower_bound (NFC) (#139655 )	2025-05-12 22:58:50 -07:00
Kazu Hirata	510c8a23e6	[llvm] Use llvm::find_if (NFC) (#139654 )	2025-05-12 22:58:30 -07:00
Iris Shi	49ab1d740e	[NFC][RISCV] Remove extra space in `RISCVInstrInfoZfh.td`	2025-05-13 13:53:38 +08:00
Haojian Wu	1d0ee12e34	Reland "Reland [Modules] Remove unnecessary check when generating name lookup table in ASTWriter" (#139253 ) This relands the patch `67b298f6d8`, with some more testcases. The `undefined symbol` error mentioned in https://github.com/llvm/llvm-project/issues/61065#issuecomment-1517725811 doesn't exist anymore from our internal tests. Fixes #61065, #134739 --------- Co-authored-by: Viktoriia Bakalova <bakalova@google.com>	2025-05-13 07:46:43 +02:00
Matt Arsenault	2f9323bc5b	DAG: Stop forcibly adding nsz to expanded minnum/maxnum (#139615 )	2025-05-13 07:37:21 +02:00
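
The fold described in commit 671cef029f above (`icmp eq or(a, shl(b)), 0` with the shift removed when it is nuw or nsw) can be sanity-checked by brute force. The following is a minimal Python sketch, not LLVM code: it assumes a 4-bit integer width so the search is exhaustive, and the helper names (`to_signed`, `shl`, `is_nuw`, `is_nsw`, `count_mismatches`) are hypothetical and introduced only for illustration.

```python
# Brute-force check of the fold from commit 671cef029f on a small bit width.
# All names here are hypothetical helpers for illustration, not LLVM APIs.

WIDTH = 4                      # assumed small width so the search is exhaustive
MASK = (1 << WIDTH) - 1


def to_signed(x):
    """Interpret a WIDTH-bit pattern as a two's-complement integer."""
    return x - (1 << WIDTH) if x & (1 << (WIDTH - 1)) else x


def shl(b, s):
    """Wrapping left shift on WIDTH bits."""
    return (b << s) & MASK


def is_nuw(b, s):
    """True if the shift has no unsigned wrap (no set bit is shifted out)."""
    return (b << s) == shl(b, s)


def is_nsw(b, s):
    """True if the shift has no signed wrap."""
    return (to_signed(b) << s) == to_signed(shl(b, s))


def count_mismatches(require_flag):
    """Count inputs where (a | (b << s)) == 0 disagrees with (a | b) == 0.

    With require_flag=True, only shifts that are nuw or nsw are considered.
    """
    mismatches = 0
    for a in range(1 << WIDTH):
        for b in range(1 << WIDTH):
            for s in range(WIDTH):
                if require_flag and not (is_nuw(b, s) or is_nsw(b, s)):
                    continue
                with_shift = (a | shl(b, s)) == 0
                without_shift = (a | b) == 0
                if with_shift != without_shift:
                    mismatches += 1
    return mismatches
```

Under these assumptions, `count_mismatches(True)` returns 0, while `count_mismatches(False)` is nonzero: for example, with `a = 0`, `b = 8`, `s = 1` the wrapping shift yields 0 even though `b` is not zero, which is why the flag is required.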

537335 Commits