Commit Graph

537335 Commits

Author SHA1 Message Date
Fraser Cormack
95c683fc1b [libclc] Move logb/ilogb to CLC library; optimize (#128028)
This commit moves the logb and ilogb builtins to the CLC library.

It simultaneously optimizes them both for vector types and for half
types. Vector types were being scalarized in some cases. Half types were
previously promoting to float, whereas this commit provides them a
native implementation.

Everything passes the OpenCL-CTS.

I had to intuit some magic numbers used by these implementations in
order to generate the half variants. I gave them clearer definitions
derived from what I believe are their actual component numbers, but
named them 'magic' to convey that they weren't derived from first
principles.
2025-05-13 11:47:35 +01:00
Fraser Cormack
0e8f0b51ff [libclc][NFC] Fix return after else 2025-05-13 11:46:26 +01:00
Fraser Cormack
655151a7e0 [libclc] Move (fast) length & distance to CLC library (#139701)
This commit also refactors how geometric builtins are defined and
declared, by sharing more helpers. It also removes an unnecessary
gentype-like helper in favour of the more complete math/gentype.inc.

There are no changes to the IR for any of these four builtins.

The 'normalize' builtin will follow in a subsequent commit because it
would involve the addition of missing halfn-type overloads for
completeness.
2025-05-13 11:45:55 +01:00
Paul Walker
49ee674e5d [NFC][LLVM][CodeGen][X86] Add ConstantInt/FP based vector support to MachineInstr fixup and printing code. (#137331)
When -use-constant-{int,fp}-for-fixed-length-splat are enabled, constant
vector splats take the form of ConstantInt/FP instead of ConstantVector.
These constants get linked to MachineInstrs via constant pools for later
processing. The processing assumes ConstantInt/FP to always represent
scalar constants with this PR extending the code to support vector
types.

NOTE: The test choices are somewhat artificial because pretty much all
the vector tests failed without these changes when the new constants are
enabled.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-05-13 11:33:07 +01:00
Aaron Ballman
7866c4091e Fix crash with invalid member function param list (#139595)
We cannot consume annotation tokens with ConsumeToken(), so any pragmas
present in an invalid initializer would previously crash. Now we handle
annotation tokens more generally and avoid the crash.

Fixes #113722
2025-05-13 06:31:10 -04:00
Ivan Butygin
91f3cdbd4f [mlir][gpu] Pattern to promote gpu.shuffle to specialized AMDGPU ops (#137109)
Only swizzle promotion for now, may add DPP ops support later.
2025-05-13 13:26:46 +03:00
jyli0116
382ad6f2e7 [GISel][AArch64] Added more efficient lowering of Bitreverse (#139233)
GlobalISel was previously inefficient in handling bitreverses of vector
types. This deals with i16, i32, i64 vector types and converts them into
i8 bitreverses and rev instructions.
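
A minimal Python sketch of the identity behind this lowering, not the GlobalISel code itself: reversing the bits of a wide element gives the same result as reversing its byte order (what a `rev`-style instruction does) and then reversing the bits inside each byte. All helper names here are hypothetical.

```python
def bitreverse(value: int, width: int) -> int:
    """Reference implementation: reverse all `width` bits of `value`."""
    return int(format(value, f"0{width}b")[::-1], 2)


def byte_swap(value: int, width: int) -> int:
    """Reverse the byte order, modelling a rev-style byte swap."""
    return int.from_bytes(value.to_bytes(width // 8, "little"), "big")


def bitreverse_via_bytes(value: int, width: int) -> int:
    """Byte swap first, then bit-reverse each 8-bit chunk independently."""
    swapped = byte_swap(value, width)
    result = 0
    for i in range(width // 8):
        byte = (swapped >> (8 * i)) & 0xFF
        result |= bitreverse(byte, 8) << (8 * i)
    return result


# Both routes agree for the element widths named in the commit message.
for width in (16, 32, 64):
    for value in (0, 1, 0x1234, 0xA5A5, (1 << width) - 1):
        assert bitreverse_via_bytes(value, width) == bitreverse(value, width)
```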
2025-05-13 11:21:50 +01:00
Kadir Cetinkaya
3009aa75ca [clang][Tooling] Extend special symbol mappings for (U)INTN_C 2025-05-13 12:14:09 +02:00
yanming
63ad1492dc [mlir][NFC] Fix the MLIR example format to conform to SSA form. 2025-05-13 18:08:14 +08:00
Wang Qiang
cece058191 [llvm][mlir][NFC] Fix typos in comments and test descriptions (#139688)
This patch fixes several typographical errors in comments and test
files:

1. Corrected "achive" to "archive" in archive-update.test. 
2. Fixed "achive" to "achieve" in a comment in
XeGPUSubgroupDistribute.cpp.
3. Corrected "achived" to "achieved" in a test note in
SimpleSIVNoValidityCheckFixedSize.ll.

These changes are non-functional and intended to improve readability and
documentation accuracy.

Signed-off-by: Kane Wang <wangqiang1@kylinos.cn>
Co-authored-by: Kane Wang <wangqiang1@kylinos.cn>
2025-05-13 11:03:51 +01:00
Pierre van Houtryve
2278f5e65b [AMDGPU] Hoist readlane/readfirstlane through unary/binary operands (#129037)
When a read(first)lane is used on a binary operator and the intrinsic is
the only user of the operator, we can move the read(first)lane into the
operand if the other operand is uniform.

Unfortunately, IC doesn't give us access to UniformityAnalysis, so we
can't truly check uniformity. We have to make do with a basic uniformity
check that only allows constants or trivially uniform intrinsic calls.

We can also do the same for unary and cast operators.
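
The value-level reasoning can be sketched in Python as follows. This is an illustrative model only: a wavefront is a plain list with one value per lane, lane 0 is assumed to be the first active lane, and the function names are hypothetical rather than taken from the patch.

```python
from operator import add, mul, xor


def readfirstlane(lanes):
    """Model of readfirstlane: take the value held by the first lane."""
    return lanes[0]


def read_after_op(lanes, op, uniform):
    """Original form: apply the operator in every lane, then read one lane."""
    return readfirstlane([op(value, uniform) for value in lanes])


def read_before_op(lanes, op, uniform):
    """Hoisted form: read one lane first, then apply the operator once."""
    return op(readfirstlane(lanes), uniform)


lanes = [3, 7, 9, 11]  # divergent per-lane values
for op in (add, mul, xor):
    # The other operand is the same in every lane, so both forms agree.
    assert read_after_op(lanes, op, 5) == read_before_op(lanes, op, 5)
```

The equality relies on the second operand being uniform; if it differed per lane, the hoisted form would have no single value to use.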
2025-05-13 12:00:49 +02:00
David Spickett
d05854dfc8 [llvm][docs] Use default checkout location in test suite guide (#139264)
Step 2 tells you to checkout "llvm-test-suite" to "test-suite", but I
don't see a particular reason to use a non-default path.

If you're following the instructions exactly, it all works, but if you
autopilot that step it is surprising later when things do not work.

It's not hard for an individual to fix later, but we should suggest the
least surprising thing where we can.
2025-05-13 10:58:29 +01:00
Jay Foad
28b7d6621a [TableGen][CodeGen] Give every leaf register a unique regunit (#139526)
Give every leaf register a unique regunit, even if it has ad hoc
aliases.

Previously only leaf registers *without* ad hoc aliases would get a
unique regunit, but that caused situations where regunits could not be
used to distinguish a register from its subregs. For example:

- Registers A and B alias. They both get regunit 0 only.
- Register C has subregs A and B. It inherits regunits from its subregs,
  so it also gets regunit 0 only.

After this fix, registers A and B will get a unique regunit in addition
to the regunit representing the alias, for example:

- A will get regunits 0 and 1.
- B will get regunits 0 and 2.
- C will get regunits 0, 1 and 2.
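
The A/B/C example above can be reproduced with a small Python sketch. It is a toy model, not TableGen's algorithm: the function name, its parameters, and the numbering order (alias units first, then per-leaf units) are assumptions chosen so the output matches the numbers in this message.

```python
def assign_regunits(leaves, aliases, subregs, unique_for_aliased):
    """Toy regunit assignment for leaf registers and their super-registers."""
    units = {reg: set() for reg in leaves}
    next_unit = 0
    # One shared regunit represents each ad hoc alias.
    for a, b in aliases:
        units[a].add(next_unit)
        units[b].add(next_unit)
        next_unit += 1
    aliased = {reg for pair in aliases for reg in pair}
    # Leaves get a unique regunit; aliased leaves only do so after the fix.
    for reg in leaves:
        if unique_for_aliased or reg not in aliased:
            units[reg].add(next_unit)
            next_unit += 1
    # A register with subregs inherits the regunits of its subregs.
    for reg, subs in subregs.items():
        units[reg] = set().union(*(units[s] for s in subs))
    return units


leaves, aliases, subregs = ["A", "B"], [("A", "B")], {"C": ["A", "B"]}
old_scheme = assign_regunits(leaves, aliases, subregs, unique_for_aliased=False)
new_scheme = assign_regunits(leaves, aliases, subregs, unique_for_aliased=True)
print(old_scheme)
print(new_scheme)
```

In the old scheme all three registers end up with the same regunit set, so C cannot be told apart from A or B; in the new scheme the three sets differ.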
2025-05-13 10:52:36 +01:00
David Green
671cef029f [AggressiveInstcombine] Fold away shift in or reduction chain. (#137875)
If we have `icmp eq or(a, shl(b)), 0` then the shift can be removed so
long as it is nuw or nsw. It is still comparing that some bits are
non-zero.
https://alive2.llvm.org/ce/z/nhrBVX.

This is also true of ne, and true for longer or chains.
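
As an informal cross-check (not a substitute for the Alive2 proof linked above), the claim can be brute-forced in Python at a small bit width. The sketch models `nuw` as "the unsigned shift does not overflow" and `nsw` as "the signed shift does not overflow"; the function name is hypothetical.

```python
def fold_is_sound(width: int) -> bool:
    """Check (a | (b << s)) == 0  <=>  (a | b) == 0 whenever the shift is nuw or nsw."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)

    def to_signed(x: int) -> int:
        return x - (1 << width) if x & sign else x

    for b in range(1 << width):
        for s in range(width):
            shifted = (b << s) & mask
            nuw = (b << s) <= mask                        # no unsigned overflow
            nsw = -sign <= to_signed(b) * (1 << s) < sign  # no signed overflow
            if not (nuw or nsw):
                continue
            for a in range(1 << width):
                if ((a | shifted) == 0) != ((a | b) == 0):
                    return False
    return True


assert fold_is_sound(6)

# Without either flag the fold is wrong: for i8, 0x80 << 1 wraps to 0.
assert ((0 | ((0x80 << 1) & 0xFF)) == 0) != ((0 | 0x80) == 0)
```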
2025-05-13 10:33:38 +01:00
Nuko Y.
69f4e60093 [AArch64][test] Fix test failing on unknown options (#139696)
Fixes buildbot failure
https://lab.llvm.org/buildbot/#/builders/16/builds/18873 originating
from #138448. Normally ignored silently but fails on higher error
levels.

Buildbot errors:
```
/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/llc < /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/AArch64/reserveXreg.ll -mtriple=aarch64-unknown-linux-gnu | /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/AArch64/reserveXreg.ll # RUN: at line 6
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/AArch64/reserveXreg.ll
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/llc -mtriple=aarch64-unknown-linux-gnu
'+reserve-x8' is not a recognized feature for this target (ignoring feature)
'+reserve-x8' is not a recognized feature for this target (ignoring feature)
'+reserve-x16' is not a recognized feature for this target (ignoring feature)
'+reserve-x16' is not a recognized feature for this target (ignoring feature)
'+reserve-x17' is not a recognized feature for this target (ignoring feature)
'+reserve-x17' is not a recognized feature for this target (ignoring feature)
```
2025-05-13 10:31:35 +01:00
lorenzo chelini
61536f2781 [mlir] Retire additional let constructor (NFC) (#139390)
Three main changes:

- The pass createRequestCWrappersPass is renamed as
createLLVMRequestCWrappersPass

- createOptimizeForTargetPass is now under the LLVM namespace. It’s
unclear why the NVVM namespace was used initially, as all passes in
LLVMIR/Transforms/Passes.h consistently reside in the LLVM namespace.

- DuplicateFunctionEliminationPass is now in the func namespace.
2025-05-13 11:15:29 +02:00
Tom Eccles
8ecb958b8f [flang][OpenMP][Semantics] resolve objects in the flush arg list (#139522)
Fixes #136583

Normally the flush argument list would contain a DataRef to some
variable. All DataRefs are handled generically in resolve-names and so
the problem wasn't observed. But when a common block name is specified,
this is not parsed as a DataRef. There was already handling in
resolve-directives for OmpObjectList but not for argument lists. I've
added a visitor for FLUSH which ensures all of the arguments have been
resolved.

The test is there to make sure the compiler doesn't crash on encountering
the unresolved symbol. It shows that we currently deny flushing a common
block. I'm not sure that it is right to restrict common blocks from
flush argument lists, but fixing that can come in a different patch.
This one is fixing an ICE.
2025-05-13 10:14:02 +01:00
Timm Baeder
83ce8a44bb [clang][bytecode] Get BuiltinID from the direct callee (#139675)
getBuiltinCallee() just checks the direct callee for its builtin id
anyway, so let's do this ourselves.
2025-05-13 11:11:47 +02:00
Lucas Ramirez
6456ee056f Reapply "[AMDGPU][Scheduler] Refactor ArchVGPR rematerialization during scheduling (#125885)" (#139548)
This reapplies 067caaa and 382a085 (reverting b35f6e2) with fixes to
issues detected by the address sanitizer (MIs have to be removed from
live intervals before being removed from their parent MBB).

Original commit description below.

AMDGPU scheduler's `PreRARematStage` attempts to increase function
occupancy w.r.t. ArchVGPR usage by rematerializing trivial
ArchVGPR-defining instructions next to their single use. It first
collects all eligible trivially rematerializable instructions in the
function, then sinks them one-by-one while recomputing occupancy in all
affected regions each time to determine if and when it has managed to
increase overall occupancy. If it does, changes are committed to the
scheduler's state; otherwise modifications to the IR are reverted and
the scheduling stage gives up.

In both cases, this scheduling stage currently involves repeated queries
for up-to-date occupancy estimates and some state copying to enable
reversal of sinking decisions when occupancy is revealed not to
increase. The current implementation also does not accurately track
register pressure changes in all regions affected by sinking decisions.

This commit refactors this scheduling stage, improving RP tracking and
splitting the stage into two distinct steps to avoid repeated occupancy
queries and IR/state rollbacks.

- Analysis and collection (`canIncreaseOccupancyOrReduceSpill`). The
number of ArchVGPRs to save to reduce spilling or increase function
occupancy by 1 (when there is no spilling) is computed. Then,
instructions eligible for rematerialization are collected, stopping as
soon as enough have been identified to be able to achieve our goal
(according to slightly optimistic heuristics). If there aren't enough of
such instructions, the scheduling stage stops here.
- Rematerialization (`rematerialize`). Instructions collected in the
first step are rematerialized one-by-one. Now we are able to directly
update the scheduler's state since we have already done the occupancy
analysis and know we won't have to roll back any state. Register
pressures for impacted regions are recomputed only once, as opposed to
at every sinking decision.

In the case where the stage attempted to increase occupancy, and if both
rematerializations alone and rescheduling after were unable to improve
occupancy, then all rematerializations are rolled back.
2025-05-13 11:11:00 +02:00
Timm Baeder
3de2fa91e1 [clang][bytecode] Avoid classifying in visitArrayElemInit() (#139674)
We usually call this more than once, but the type of the initializer
never changes. Let's classify only once and pass that to
visitArrayElemInit().
2025-05-13 11:01:59 +02:00
Hans Wennborg
fd3fecfc09 Revert "[lld] Merge equivalent symbols found during ICF (#134342)"
The change would also merge *non-equivalent* symbols under some circumstances,
see comment with a reproducer on the PR.

> Fixes a correctness issue for AArch64 when ADRP and LDR instructions are
> outlined in separate sections and sections are fed to ICF for
> deduplication.
>
> See test case (based on
> https://github.com/llvm/llvm-project/issues/129122) for details. All
> rodata.* sections are folded into a single section with ICF. This leads
> to all f2_* function sections getting folded into one (as their
> relocation target symbols g* belong to .rodata.g* sections that have
> already been folded into one). Since relocations still refer original g*
> symbols, we end up creating duplicate GOT entry for all such symbols.
> This PR addresses that by tracking such folded symbols and create one
> GOT entry for all such symbols.
>
> Fixes https://github.com/llvm/llvm-project/issues/129122
>
> Co-authored by: @jyknight

This reverts commit 8389d6fad7.
2025-05-13 10:57:46 +02:00
Timm Baeder
98763433e6 [clang][bytecode] Optimize enum value range checks (#139672)
Only do the work if we really have to.
2025-05-13 10:55:24 +02:00
Matt Arsenault
6d35ec2335 ObjCARC: Fix regression from using ConstantData uselists (#139609)
Fixes regression after 9383fb23e1
2025-05-13 10:52:49 +02:00
Jacques Pienaar
c78e65cc98 [lldb][plugin] Use counter directly for number of readers (#139252)
Here we were initializing and locking a shared_mutex in a thread, while
releasing it in the parent, which often turned out to be a different
thread (shared_mutex::unlock_shared is undefined behavior if called from
a thread that doesn't hold the lock).

Switch to a counter to keep track of the number of readers more simply,
and just lock/unlock rather than using the reader mutex to verify the
last one freed (which required this matching thread init/destroy behavior).
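
A rough Python analogue of the counter approach, for illustration only; the class and method names are hypothetical and this is not the lldb plugin code. The point is that a mutex-protected integer has no per-thread ownership, so the reader can be registered in one thread and released in another.

```python
import threading


class ReaderCount:
    """Counts active readers; any thread may add or remove a reader."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._readers = 0

    def add_reader(self) -> None:
        with self._lock:
            self._readers += 1

    def remove_reader(self) -> bool:
        """Drop one reader and report whether it was the last one."""
        with self._lock:
            self._readers -= 1
            return self._readers == 0


count = ReaderCount()
worker = threading.Thread(target=count.add_reader)  # registered in a worker thread
worker.start()
worker.join()
was_last = count.remove_reader()  # released from the parent thread
```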
2025-05-13 01:52:36 -07:00
Florian Hahn
ba2dacd276 [VPlan] Print use and definition in verifier on violation.
Improves the error message when a use comes before the def by including
the use and def, when print utilities are available.
2025-05-13 09:52:02 +01:00
David Green
137aa573ca [GlobalISel] Add computeNumSignBits for G_BUILD_VECTOR. (#139506)
The code is similar to SelectionDAG::ComputeNumSignBits, but does not
deal with truncating buildvectors.
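
A sketch of the general idea in Python, under the assumption (not stated in this message) that the result for a build vector is the minimum number of sign bits over its elements, as in the SelectionDAG version. It works on constant elements only and the function names are hypothetical.

```python
def num_sign_bits(value: int, width: int) -> int:
    """Count the leading bits equal to the sign bit, including the sign bit."""
    value &= (1 << width) - 1
    sign = (value >> (width - 1)) & 1
    count = 0
    for bit in range(width - 1, -1, -1):
        if ((value >> bit) & 1) != sign:
            break
        count += 1
    return count


def build_vector_sign_bits(elements, width: int) -> int:
    """Every lane is guaranteed only what the weakest element guarantees."""
    return min(num_sign_bits(element, width) for element in elements)


# For i8: 1 has 7 sign bits, -1 has 8, and -4 has 6, so the vector has 6.
assert build_vector_sign_bits([1, -1, -4], 8) == 6
```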
2025-05-13 09:36:14 +01:00
Daan De Meyer
cdbc297ef5 include-cleaner: Report function decls from __cleanup__ as used (#138669) 2025-05-13 10:22:32 +02:00
David Green
d2dafded03 [AArch64] Minor test cleanup for postselectopt-dead-cc-defs.mir. NFC
Remove the duplicate definition of %12
2025-05-13 09:12:25 +01:00
drazi
eea1e50ac2 [mlir-tblgen] trim method body to empty with only spaces to avoid crash (#139568)
The method body or default implementation must be truly empty. Even if
they contain only spaces, ``mlir-tblgen`` considers them non-empty and
generates invalid code that leads to a segmentation fault, which is very hard to debug.

```c++
    InterfaceMethod<
      ...
      /*methodBody=*/  [{ }],    // This must be truly empty. Leaving a space here can lead to a segmentation fault whose cause is hard to figure out
      /*defaultImpl=*/ [{
        ...
      }]
```

This PR trims spaces from the method body or default implementation of
an interface method when it is not empty. Now ``mlir-tblgen`` generates
valid code even when they contain only spaces.

---------

Co-authored-by: Fung Xie <ftse@nvidia.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-05-13 10:03:06 +02:00
Kohei Yamaguchi
f92dd0083e [mlir][docs] Add quant dialect pass doc into Passes.md (NFC) (#139363)
This PR added documentation for the quant dialect passes to `Passes.md`,
as it had not been included.
2025-05-13 17:00:45 +09:00
Igor Kirillov
a3fb54c1ae [LAA][NFC] Unify naming of DepCandidates to DepCands (#139534)
The MemoryDepChecker::DepCandidates instance in each LoopAccessInfo had multiple names (AccessSets, DepCands, DependentAccesses), which was confusing. This patch renames all references to DepCands for consistency.
2025-05-13 08:52:46 +01:00
Florian Hahn
5c7bc6a0e6 [ComplexDeinterleave] Don't try to combine single FP reductions. (#139469)
Currently the pass tries to combine floating point reductions, without
checking for the correct fast-math flags and it also creates invalid IR
(using llvm.reduce.add for FP types).

For now, just bail out for non-integer types.

PR: https://github.com/llvm/llvm-project/pull/139469
2025-05-13 08:44:11 +01:00
Piotr Fusik
3cfdf2ccdf [RISCV] Handle more (add x, C) -> (sub x, -C) cases (#138705)
This is a follow-up to #137309, adding:
- multi-use of the constant with different adds
- vectors (vadd.vx -> vsub.vx)
2025-05-13 09:12:24 +02:00
Antonio Frighetto
adfd59fdb8 [InstCombine] Introduce foldICmpBinOpWithConstantViaTruthTable folding
Match icmps of binops where both operands are select with constant arms,
i.e., `icmp pred (select A ? C1 : C2) binop (select B ? C3 : C4), C5`.
Fold such patterns by creating a truth table of the possible four
constant variants, and materialize back the optimal logic from it via
`createLogicFromTable` helper. This also generalizes an existing fold,
which has therefore been dropped.
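
The table-building step can be illustrated with a short Python sketch. It only evaluates the compare for the four (A, B) combinations; it does not model `createLogicFromTable` or the InstCombine implementation, and the function name and argument order are hypothetical.

```python
import operator


def truth_table(pred, binop, c1, c2, c3, c4, c5):
    """Evaluate `pred(binop(select A ? c1 : c2, select B ? c3 : c4), c5)` for all A, B."""
    table = {}
    for a in (False, True):
        for b in (False, True):
            lhs = c1 if a else c2
            rhs = c3 if b else c4
            table[(a, b)] = pred(binop(lhs, rhs), c5)
    return table


# icmp eq (select A ? 1 : 0) and (select B ? 1 : 0), 1
table = truth_table(operator.eq, operator.and_, 1, 0, 1, 0, 1)

# Only the A=true, B=true row is true, so the compare is equivalent to `A and B`.
assert table == {(a, b): a and b for a in (False, True) for b in (False, True)}
```

Because the table has only four rows, every such pattern reduces to one of sixteen boolean functions of A and B.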

Proofs: https://alive2.llvm.org/ce/z/NS7Vzu.

Fixes: https://github.com/llvm/llvm-project/issues/138212.
2025-05-13 09:04:25 +02:00
Antonio Frighetto
1bfd94b1b9 [InstCombine] Precommit tests for PR139109 (NFC) 2025-05-13 09:03:56 +02:00
Jim Lin
9f274a95b1 [RISCV] Fix indentation for riscv_corev_alu.h in CMakeLists.txt. NFC. 2025-05-13 14:46:08 +08:00
Iris Shi
6abf5b94da [RISCV][NFC] Fix typos in RISCVSchedule.td 2025-05-13 14:32:32 +08:00
Kazu Hirata
13d80b4b12 [AST] Use llvm::upper_bound (NFC) (#139664) 2025-05-12 23:24:46 -07:00
Kazu Hirata
75e0865837 [clang-tools-extra] Use llvm::unique (NFC) (#139663) 2025-05-12 23:24:24 -07:00
Kazu Hirata
c95745f2db [llvm] Use StringRef::{starts_with,find} (NFC) (#139661)
Calling find/contains in the StringRef domain allows us to avoid
creating temporary instances of std::string.
2025-05-12 23:24:07 -07:00
Kazu Hirata
294eb7670f [TableGen] Fix a warning
This patch fixes an unused parameter warning with gcc7 under the
release configuration.
2025-05-12 23:18:30 -07:00
Timm Baeder
79eed76c58 [clang][bytecode][NFC] Remove incorrect comment (#139571)
We don't create function frames for builtin functions anymore.
2025-05-13 08:09:26 +02:00
Helena Kotas
03934d0a21 [DirectX] Implement DXILResourceImplicitBinding pass (#138043)
The `DXILResourceImplicitBinding` pass uses the results of
`DXILResourceBindingAnalysis` to assign register slots to resources
that do not have explicit binding. It replaces all
`llvm.dx.resource.handlefromimplicitbinding` calls with
`llvm.dx.resource.handlefrombinding` using the newly assigned binding.

If a binding cannot be found for a resource, the pass will raise a
diagnostic error. Currently this diagnostic message does not include the
resource name, which will be addressed in a separate task (#137868).

Part 2/2 of #136786
Closes #136786
2025-05-12 23:00:00 -07:00
Kazu Hirata
383a825d6d [BOLT] Use StringRef::contains (NFC) (#139658)
Once we convert EventNames to StringRef, which is cheap, we can call
StringRef::contains without creating a temporary instance of
std::string.
2025-05-12 22:59:26 -07:00
Kazu Hirata
0fedccf389 [IR] Use llvm::upper_bound (NFC) (#139656) 2025-05-12 22:59:05 -07:00
Kazu Hirata
e6e50170b9 [CodeGen] Use llvm::lower_bound (NFC) (#139655) 2025-05-12 22:58:50 -07:00
Kazu Hirata
510c8a23e6 [llvm] Use llvm::find_if (NFC) (#139654) 2025-05-12 22:58:30 -07:00
Iris Shi
49ab1d740e [NFC][RISCV] Remove extra space in RISCVInstrInfoZfh.td 2025-05-13 13:53:38 +08:00
Haojian Wu
1d0ee12e34 Reland "Reland [Modules] Remove unnecessary check when generating name lookup table in ASTWriter" (#139253)
This relands the patch
67b298f6d8,
with some more testcases.

The `undefined symbol` error mentioned in
https://github.com/llvm/llvm-project/issues/61065#issuecomment-1517725811
no longer occurs in our internal tests.

Fixes #61065, #134739

---------

Co-authored-by: Viktoriia Bakalova <bakalova@google.com>
2025-05-13 07:46:43 +02:00
Matt Arsenault
2f9323bc5b DAG: Stop forcibly adding nsz to expanded minnum/maxnum (#139615) 2025-05-13 07:37:21 +02:00