540737 Commits

Author SHA1 Message Date
Charles Zablit
20d5d09e99 [compiler-rt] remove unused default in compiler-rt lit tests (#143738)
https://github.com/llvm/llvm-project/pull/143183 mistakenly added
a default value to `python_root_dir` in the lit tests of compiler-rt.

This is unused by the lit tests of compiler-rt, as it was meant to be
used by `lldb`.

This patch removes this change.
2025-06-12 11:37:25 +01:00
SahilPatidar
971c49fbf3 [InstCombine] Ensure Safe Handling of Flags in foldFNegIntoConstant (#94148)
Fix #93769 

alive2: https://alive2.llvm.org/ce/z/MHShQY
2025-06-12 19:31:43 +09:00
Nikita Popov
d517f15e09 [LICM] Regenerate test checks (NFC) 2025-06-12 12:30:52 +02:00
Paul Walker
702b9033c1 [LLVM][CodeGen][AArch64] Lower vector-(de)interleave to multi-register uzp/zip instructions. (#143128) 2025-06-12 11:27:30 +01:00
Luke Lau
7ef77eb998 [LV] Support scalable interleave groups for factors 3,5,6 and 7 (#141865)
Currently the loop vectorizer can only vectorize interleave groups for
power-of-2 factors at scalable VFs by recursively interleaving
[de]interleave2 intrinsics.

However after https://github.com/llvm/llvm-project/pull/124825 and
#139893, we now have [de]interleave intrinsics for all factors up to 8,
which is enough to support all types of segmented loads and stores on
RISC-V.

Now that the interleaved access pass has been taught to lower these in
#139373 and #141512, this patch teaches the loop vectorizer to emit
these intrinsics for factors up to 8, which enables scalable
vectorization for non-power-of-2 factors.

As far as I'm aware, no in-tree target will vectorize a scalable
interleave group above factor 8 because the maximum interleave factor is
capped at 4 on AArch64 and 8 on RISC-V, and the
`-max-interleave-group-factor` CLI option defaults to 8, so the
recursive [de]interleaving code has been removed for now.

Factors of 3 with scalable VFs are also turned off in AArch64 since
there's no lowering for [de]interleave3 just yet either.
2025-06-12 11:09:09 +01:00
Antonio Frighetto
5987f1ee5c [InstCombine] Regenerate narrow-switch.ll test (NFC)
`narrow-switch.ll` test has been regenerated via latest UTC using
`--prefix-filecheck-ir-name _`, so as to avoid conflicts with
scripted variable names.
2025-06-12 11:57:22 +02:00
Nikita Popov
a8c6fb4cb8 [MemCpyOpt] Fix lifetime marker sizes in tests (NFC)
As pointed out in https://github.com/llvm/llvm-project/pull/143782,
these tests were specifying the size in bits instead of bytes.

In order to preserve the intent of the tests, add a use of %src,
which prevents stack-move optimization. These are supposed to test
the handling of scoped alias metadata in call slot optimization.
2025-06-12 11:54:50 +02:00
Durgadoss R
3e5d50f9c6 [NVPTX] Add cta_group support to TMA G2S intrinsics (#143178)
This patch extends the TMA G2S intrinsics with the
support for cta_group::1/2 available from Blackwell onwards.
The existing intrinsics are auto-upgraded with a default
value of '0' for the `cta_group` flag operand.

* lit tests are added for all combinations of the newer variants.
* Negative tests are added to validate the error-handling 
   when the value of the cta_group flag falls out-of-range.
* The generated PTX is verified with a 12.8 ptxas executable.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-06-12 15:20:39 +05:30
Omair Javaid
8e4fdff6f0 [X86] Update tailcc-ssp.ll assertions using update_llc_test_checks.py (#143500)
The assertions in llvm/test/CodeGen/X86/tailcc-ssp.ll were outdated. The
initial comment indicated they were generated with
`utils/update_llc_test_checks.py UTC_ARGS: --version 5`, but this was
not accurate based on the file's content.

Running `utils/update_llc_test_checks.py` regenerated the assertions,
aligning them with the current `llc` output.
This commit ensures that the test's claimed behavior accurately reflects
the actual `llc` output, even though the tests were already passing.

This was identified by @efriedma-quic during review of #136290.

Submitting a separate PR to make sure these changes stay isolated.
2025-06-12 14:48:13 +05:00
Chuanqi Xu
1d1f9afe91 [C++20] [Modules] Treat directly imported internal partition unit as reachable
Close https://github.com/llvm/llvm-project/issues/143788

See the discussion for details.
2025-06-12 17:46:33 +08:00
Simon Pilgrim
2a27c059ec [X86] Use BSR passthrough behaviour to fold (CMOV (BSR ?, X), Y, (X == 0)) -> (BSR Y, X) (#143662)
Make use of targets that support BSR "pass through behaviour" on a zero input to remove a CMOV that's performing the same function

BSF will be a trickier patch as we need to make sure it works with the "REP BSF" hack in X86MCInstLower
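
The fold can be illustrated with a small software model. This is only a sketch: `bsrPassthru` and `bsrThenCmov` are hypothetical helper names, and the placeholder value stands in for the unspecified BSR result on a zero input. It models the behaviour described above, not the actual X86 lowering code.

```cpp
#include <cstdint>

// Software model of BSR with "pass through" on a zero input: returns the
// index of the highest set bit of x, or the incoming destination value
// (passthru) when x == 0. Hypothetical helper, for illustration only.
uint32_t bsrPassthru(uint32_t passthru, uint32_t x) {
  if (x == 0)
    return passthru;
  uint32_t index = 0;
  while (x >>= 1)
    ++index;
  return index;
}

// The pattern before the fold: a BSR whose result for x == 0 is not relied
// upon (modelled with an arbitrary placeholder), followed by a CMOV that
// selects y when x == 0.
uint32_t bsrThenCmov(uint32_t y, uint32_t x) {
  uint32_t bsr = bsrPassthru(/*passthru=*/0xDEADBEEFu, x);
  return x == 0 ? y : bsr;
}
```

Under this model `bsrThenCmov(y, x)` and `bsrPassthru(y, x)` agree for every input, which is why the CMOV becomes redundant once Y is passed as the BSR destination.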
2025-06-12 10:46:08 +01:00
Florian Hahn
db8d34db26 [VPlan] Set branch weight metadata on middle term in VPlan (NFC) (#143035)
Manage branch weights for the BranchOnCond in the middle block in VPlan.
This requires updating VPInstruction to inherit from VPIRMetadata, which
in general makes sense as there are a number of opcodes that could take
metadata.

There are other branches (part of the skeleton) that also need branch
weights adding.

PR: https://github.com/llvm/llvm-project/pull/143035
2025-06-12 10:04:08 +01:00
kadir çetinkaya
4551e50355 [clang] Reset FileID based diag state mappings (#143695)
When sharing the same compiler instance across multiple compilations, we reset
the source manager's FileID tables between runs. The diagnostics engine
keeps a cache based on these FileIDs, which became dangling references
across compilations.

This patch makes sure we reset those whenever the source manager discards
its FileIDs.
2025-06-12 10:49:23 +02:00
Pengcheng Wang
ce621041c2 [RISCV] Get host CPU name via hwprobe (#142745)
We can get the `mvendorid/marchid/mimpid` via hwprobe and then we
can compare these IDs with those defined in processors to find the
CPU name.

With this change, `-mcpu/-mtune=native` can set the proper name.
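
A rough sketch of the matching step, assuming the three IDs have already been read. The struct names, the table contents, and the `"generic"` fallback are all hypothetical and are not taken from the patch; the ID values are made up.

```cpp
#include <cstdint>
#include <string_view>

// The three machine IDs the commit message mentions.
struct CpuIds {
  uint32_t mvendorid;
  uint64_t marchid;
  uint64_t mimpid;
};

struct CpuEntry {
  std::string_view name;
  CpuIds ids;
};

// Hypothetical processor table with made-up ID values.
constexpr CpuEntry kKnownCpus[] = {
    {"example-core-a", {0x1, 0x10, 0x100}},
    {"example-core-b", {0x2, 0x20, 0x200}},
};

// Returns the name of the first table entry whose IDs all match, or a
// fallback name (an assumption of this sketch) when nothing matches.
std::string_view cpuNameFor(const CpuIds &probed) {
  for (const CpuEntry &entry : kKnownCpus) {
    if (entry.ids.mvendorid == probed.mvendorid &&
        entry.ids.marchid == probed.marchid &&
        entry.ids.mimpid == probed.mimpid)
      return entry.name;
  }
  return "generic";
}
```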
2025-06-12 16:39:57 +08:00
Matt Arsenault
5434b85d2c ARM: Remove fake entries for divrem libcalls (#143832)
This was defining aliases of the i32 divrem functions for the i8
and i16 cases. This is unnecessary and was unused. The divrem
candidate cases wouldn't have formed with illegal types in the
first place, so codegen wouldn't even query these.
2025-06-12 17:38:52 +09:00
Matt Arsenault
4079ed3c9e ARM: Move setting of more runtime libcalls to RuntimeLibcallInfo (#143826)
These are the easy cases that do not really depend on the subtarget,
other than for the deceptive predicates on the subtarget class. Most
of the rest of the cases here also do not, but this is obscured by
going through helper predicates added onto the subtarget which hide
dependence on TargetOptions.
2025-06-12 17:35:55 +09:00
Simon Pilgrim
edaac11df3 [X86] combineSelect - attempt to combine with shuffles (#143753)
Before legalization we will convert to a vector_shuffle node - but afterward we can try to combine the select into an existing target shuffle chain
2025-06-12 09:29:41 +01:00
Ian Wood
6e5a1423b7 [mlir] Reapply "Loosen restrictions on folding dynamic reshapes" (#142827)
The original PR https://github.com/llvm/llvm-project/pull/137963 had a
nvidia bot failure. This appears to be a flaky test because rerunning
the build was successful.

This change needs commit 6f2ba47 to fix incorrect usage of
`getReassociationIndicesForCollapse`.

Reverts llvm/llvm-project#142639

Co-authored-by: Artem Gindinson <gindinson@roofline.ai>
2025-06-12 10:28:27 +02:00
Simon Pilgrim
2d35b568ef [X86] bsf.ll - add icmp_ne coverage to bsf passthrough tests 2025-06-12 09:27:24 +01:00
Mikael Holmen
77062244ed Fix two instances of -Wparentheses warnings [NFC]
Add parentheses around the assert conditions.

Without this gcc warned like
 ../lib/Target/AMDGPU/GCNSchedStrategy.cpp:2250: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
  2250 |          NewMI != RegionBounds.second && "cannot remove at region end");
and
 ../../clang/lib/Sema/SemaOverload.cpp:11326:39: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
 11326 |          DeferredCandidatesCount == 0 &&
       |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
 11327 |              "Unexpected deferred template candidates");
       |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
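
A minimal sketch of the pattern behind this warning, using hypothetical names (`checkBounds`, `atBegin`, `notAtEnd`) rather than the actual code in the two files. Because `&&` binds tighter than `||`, `A || B && "message"` parses as `A || (B && "message")`. The result is the same since a string literal is always truthy, but GCC flags the mixed operators.

```cpp
#include <cassert>

// Hypothetical helper; the names are illustrative only.
bool checkBounds(bool atBegin, bool notAtEnd) {
  // Before: assert(atBegin || notAtEnd && "cannot remove at region end");
  // GCC: suggest parentheses around '&&' within '||' [-Wparentheses]
  // After: parenthesize the condition so the message applies to all of it.
  assert((atBegin || notAtEnd) && "cannot remove at region end");
  return atBegin || notAtEnd;
}
```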
2025-06-12 09:43:30 +02:00
Nikita Popov
6157028fea [BasicAA][ValueTracking] Increase depth for underlying object search (#143714)
This depth limits a linear search (rather than the usual potentially
exponential one) and is not particularly important for compile-time in
practice.

The change in #137297 is going to increase the length of GEP chains, so
I'd like to increase this limit a bit to reduce the chance of
regressions (https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2419
showed a 13% increase in SearchLimitReached). There is no particular
significance to the new value of 10.

Compile-time is neutral.
2025-06-12 09:19:50 +02:00
Chuanqi Xu
3f0cf742ac [C++20] [Modules] [Reduced BMI] Don't write specializations with local args
Close https://github.com/llvm/llvm-project/issues/119947

As discussed in the above thread, we shouldn't write specializations
with local args in reduced BMI, since users can't find such
specializations anyway.
2025-06-12 14:42:04 +08:00
David Green
9d491bc602 [AArch64][GlobalISel] Enable extract_vec_elt_combines postlegalization. 2025-06-12 07:03:09 +01:00
Martin Storsjö
bec85f3b18 [LLD] [COFF] [test] Readd lto-late-arm.ll (#143494)
This testcase was removed in 4cafd28b7d,
as a082f665f8 had made it no longer
trigger the error that it was supposed to do. (Because the latter of
those two commits makes the symbol "__rt_sdiv" be included among the
potential libcalls listed by lto::LTO::getRuntimeLibcallSymbols().)

Readd the test as a positive test, making sure that such libcalls can
get linked.

We do have preexisting test coverage for LTO libcalls overall in
libcall-archive.ll, but readd this test to cover specifically the ARM
division helper functions as well.
2025-06-12 08:58:26 +03:00
Thurston Dang
2efff47363 [NFCI][msan] Show that shadow for partially undefined constant vectors is computed as fully initialized (#143823)
This happens because `getShadow(Value *V)` has a special case for fully undefined/poisoned values, but partially undefined values fall through and are given a clean shadow. This leads to false negatives (no false positives).

Note: MSan correctly handles InsertElementInst, but the shadow of the initial constant vector may still be wrong and be propagated.

Showing that the same approximation happens for other composite types is left as an exercise for the reader.
2025-06-11 22:43:06 -07:00
Rajveer Singh Bharadwaj
95bbaca6c1 [AArch64] Extend usage of XAR instruction for fixed-length operations (#139460) 2025-06-12 10:54:01 +05:30
Fangrui Song
28bda77843 Introduce MCAsmInfo::UsesSetToEquateSymbol and prefer = to .set
Introduce MCAsmInfo::UsesSetToEquateSymbol to control the preferred
syntax for symbol equating. We now favor the more readable and common
`symbol = expression` syntax over `.set`. This aligns with pre- https://reviews.llvm.org/D44256 behavior.

On Apple platforms, this resolves a clang -S vs -c behavior difference (resolves #104623).

For targets whose = support is unconfirmed, UsesSetToEquateSymbol is set to false.
This also minimizes test updates.

Pull Request: https://github.com/llvm/llvm-project/pull/142289
2025-06-11 22:19:31 -07:00
Fazlay Rabbi
02550da932 [OpenMP 60] Initial parsing/sema for need_device_addr modifier on adjust_args clause (#143442)
Adds initial parsing and semantic analysis for `need_device_addr`
modifier on `adjust_args` clause.
2025-06-11 22:06:11 -07:00
Kazu Hirata
99638537cd [AArch64] Fix a warning
This patch fixes:

  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7157:3: error:
  unannotated fall-through between switch labels
  [-Werror,-Wimplicit-fallthrough]
2025-06-11 21:56:48 -07:00
Alexey Samsonov
a93e55e57e Revert "[libc] Migrate stdio tests to ErrnoCheckingTest." (#143829)
Reverts llvm/llvm-project#143802. Follow-up fix
3c7af175e5 wasn't robust enough and itself
got reverted.
2025-06-11 21:33:46 -07:00
Fangrui Song
c3be4524a5 [ELF,test] Improve weak-undef-got-plt.s 2025-06-11 21:23:06 -07:00
Kareem Ergawy
282e471018 [flang] Erase fir.local ops before lowering fir to llvm (#143687)
`fir.local` ops are not supposed to have any uses at this point (i.e.
during lowering to LLVM). In case of serialization, the
`fir.do_concurrent` users are expected to have been lowered to
`fir.do_loop` nests. In case of parallelization, the `fir.do_concurrent`
users are expected to have been lowered to the target parallel model
(e.g. OpenMP).

This hopefully resolves a build issue introduced by
https://github.com/llvm/llvm-project/pull/142567 (see for example:
https://lab.llvm.org/buildbot/#/builders/199/builds/4009).
2025-06-12 05:58:55 +02:00
Chuanqi Xu
f09050fdc8 [C++20] [Modules] Fix module local lookup ambiguity
Close https://github.com/llvm/llvm-project/issues/61360
Close https://github.com/llvm/llvm-project/issues/129525
Close https://github.com/llvm/llvm-project/issues/143734

We shouldn't identify different module local decls in different modules
as the same entity.
2025-06-12 11:48:09 +08:00
Brandon Wu
5f231db764 [RISCV] Use StringRef for RequiredExtensions in RVVIntrinsicDef (#143503)
This prevents many duplicated copies of required extensions string.
2025-06-12 11:41:52 +08:00
Fangrui Song
2fcaa00d1e [ELF] -z undefs: handle relocations referencing undefined non-weak like undefined weak
* Merge the special case into isStaticLinkTimeConstant
* Generalize isUndefWeak to isUndefined. undefined non-weak is an error
  case. We choose to be general, which also brings us in line with GNU ld.
2025-06-11 20:37:15 -07:00
Yang Zaizhou
968d8eaa44 [OpenMP][Flang]Fix omp_get_cancellation return type from integer to logical (#142990) 2025-06-12 11:28:57 +08:00
Kewen12
a71210e5ab Revert "[libc] Fix stdio tests after #143802" (#143824)
Reverts llvm/llvm-project#143810 

This PR breaks our buildbot:
https://lab.llvm.org/buildbot/#/builders/10/builds/7159 revert to
unblock downstream merge.
2025-06-11 23:24:56 -04:00
Khem Raj
b46f34452e libunwind: Do not use __attribute__((target("gcs"))) with non-clang compilers (#138077)
This attribute is unsupported in GCC. It has worked so far because GCC
before version 15 did not define the `_CHKFEAT_GCS` macro in arm_acle.h [1].

With GCC 15, libunwind's check for this macro succeeds and it ends up
enabling 'gcs' through the function attribute, which works with
clang but not with GCC.

We can see this in the Rust compiler bootstrap for aarch64/musl when the
system uses GCC 15; it ends up with these errors:

Building libunwind.a for aarch64-poky-linux-musl
```
cargo:warning=/mnt/b/yoe/master/sources/poky/build/tmp/work/cortexa57-poky-linux-musl/rust/1.85.1/rustc-1.85.1-src/src/llvm-project/libunwind/src/UnwindLevel1.c:191:1: error: arch extension 'gcs' should be prefixed by '+'
cargo:warning=  191 | unwind_phase2(unw_context_t *uc, unw_cursor_t *cursor, _Unwind_Exception *exception_object) {
cargo:warning=      | ^~~~~~~~~~~~~
cargo:warning=/mnt/b/yoe/master/sources/poky/build/tmp/work/cortexa57-poky-linux-musl/rust/1.85.1/rustc-1.85.1-src/src/llvm-project/libunwind/src/UnwindLevel1.c:337:22: error: arch extension 'gcs' should be prefixed by '+'
cargo:warning=  337 |                      _Unwind_Stop_Fn stop, void *stop_parameter) {
cargo:warning=      |                      ^~~~~~~~~~~~~~~
```

[1] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=5a6af707f0af

Signed-off-by: Khem Raj <raj.khem@gmail.com>
2025-06-11 20:22:08 -07:00
Fangrui Song
d8118ed6db [ELF,test] Improve weak-undef-rw.s 2025-06-11 20:00:45 -07:00
LLVM GN Syncbot
faa49d6662 [gn build] Port de51b2dd3c 2025-06-12 02:53:03 +00:00
Jonas Devlieghere
de51b2dd3c [lldb] Move Transport class into lldb_private (NFC) (#143806)
Move lldb-dap's Transport class into lldb_private so the code can be
shared between the "JSON with header" protocol used by DAP and the JSON
RPC protocol used by MCP (see [1]).

[1]: https://discourse.llvm.org/t/rfc-adding-mcp-support-to-lldb/86798
2025-06-11 19:51:05 -07:00
Chuanqi Xu
bb3b8306dc [NFC] [C++20] [Modules] Add a test for module local declaration lookup
From
https://github.com/llvm/llvm-project/issues/143734, but it looks good on
trunk. Add it as tests are always good.
2025-06-12 10:48:34 +08:00
Jameson Nash
082251bba4 [AArch64] fix trampoline implementation: use X15 (#126743)
AAPCS64 reserves any of X9-X15 for a compiler to choose to use for this
purpose, and says not to use X16 or X18 like GCC (and the previous
implementation) chose to use. The X18 register may need to get used by
the kernel in some circumstances, as specified by the platform ABI, so
it is generally an unwise choice. Simply choosing a different register
fixes the problem of this being broken on any platform that actually
follows the platform ABI (which is all of them except EABI, if I am
reading this Linux kernel bug correctly
https://lkml2.uits.iu.edu/hypermail/linux/kernel/2001.2/01502.html). As
a side benefit, this also generates slightly better code and avoids needing
compiler-rt to be present. I did that by following the XCore
implementation instead of PPC (although in hindsight, following the
RISCV one might have been slightly more readable). That X18 is wrong to use
for this purpose has been known for many years (e.g.
https://www.mail-archive.com/gcc@gcc.gnu.org/msg76934.html) and also
known that fixing this to use one of the correct registers is not an ABI
break, since this only appears inside of a translation unit. Some of the
other temporary registers (e.g. X9) are already reserved inside llvm for
internal use as a generic temporary register in the prologue before
saving registers, while X15 was already used in rare cases as a scratch
register in the prologue as well, so it seemed the most logical
choice here.
2025-06-11 21:49:01 -04:00
Longsheng Mou
52360d195b [NFC] Use llvm::includes instead of std::includes (#143542)
This PR follows up #143297.
2025-06-12 09:27:27 +08:00
Shunsuke Watanabe
c431618041 [Clang][Driver] Override complex number calculation method by -fno-fast-math (#132680)
This patch fixes a bug where -fno-fast-math doesn't revert the complex
number calculation method to the default. The priority of overriding
options related to complex number calculations differs slightly from
GCC, as discussed in:


https://discourse.llvm.org/t/the-priority-of-fno-fast-math-regarding-complex-number-calculations/84679
2025-06-12 10:19:26 +09:00
Jeffrey Byrnes
7034014d08 [InstCombine] Combine or-disjoint (and->mul), (and->mul) to and->mul (#136013)
The canonical pattern for bitmasked mul is currently

```
%val = and %x, %bitMask // where %bitMask is some constant
%cmp = icmp eq %val, 0
%sel = select %cmp, 0, %C // where %C is some constant = C' * %bitMask
```

In certain cases, where we are combining multiple of these bitmasked
muls with common factors, we are able to optimize into and->mul (see
https://github.com/llvm/llvm-project/pull/135274 )

This optimization lends itself to further optimizations. This PR
addresses one such optimization.

In cases where we have

`or-disjoint ( mul(and (X, C1), D) , mul (and (X, C2), D))`

we can combine into

`mul( and (X, (C1 + C2)), D) `

provided C1 and C2 are disjoint.

Generalized proof: https://alive2.llvm.org/ce/z/MQYMui
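
As an informal complement to the Alive2 proof, the identity can be brute-forced at 8 bits. This is only a sketch: `foldHoldsFor` is a hypothetical helper, and inputs where the two products share bits are skipped on the assumption that the `disjoint` flag makes the `or` poison there.

```cpp
#include <cstdint>

// Returns true when, for the given constants, the combined form matches the
// original for every 8-bit X on which the or-disjoint is well defined.
// Hypothetical helper, not part of the patch.
bool foldHoldsFor(uint8_t c1, uint8_t c2, uint8_t d) {
  if (c1 & c2)
    return true; // C1 and C2 overlap: the fold does not apply
  for (unsigned x = 0; x < 256; ++x) {
    uint8_t lhs0 = static_cast<uint8_t>((x & c1) * d);
    uint8_t lhs1 = static_cast<uint8_t>((x & c2) * d);
    if (lhs0 & lhs1)
      continue; // operands share bits: or-disjoint would be poison
    uint8_t before = lhs0 | lhs1;
    uint8_t after = static_cast<uint8_t>((x & (c1 + c2)) * d);
    if (before != after)
      return false;
  }
  return true;
}
```

The check works because disjoint masks give `(X & C1) + (X & C2) == X & (C1 + C2)`, multiplication distributes over that sum, and an `or` of values with no common bits equals their sum.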
2025-06-11 18:07:00 -07:00
Jim Lin
7a3bcf9f71 [RISCV] Add missing predicate for PseudoTHVdotVMAQA family instructions 2025-06-12 08:43:06 +08:00
Jake Egan
d7c6cad744 [sanitizer_common] Implement interception on AIX (#138606)
Adjust AIX interceptor support in sanitizer_common. 

Issue: https://github.com/llvm/llvm-project/issues/138916
2025-06-11 20:22:15 -04:00
Jameson Nash
bc7ea63e9c [MemCpyOpt] handle memcpy from memset for non-constant sizes (#143727)
Allows forwarding memset to memcpy for mismatching unknown sizes if
overread has undef contents. In that case we can refine the undef bytes
to the memset value.

Refs #140954 which laid some of the groundwork for this.
2025-06-11 20:04:27 -04:00
Jorge Gorbe Moya
6c72084a57 [bazel] port 1ecd108cb7 2025-06-11 16:57:14 -07:00