Commit Graph

543139 Commits

Author SHA1 Message Date
Kunqiu Chen
0aafeb8ba1 Reland [TSan] Clarify and enforce shadow end alignment (#146676)
#144648 was reverted because it failed the new sanitizer test
`munmap_clear_shadow.c` in iOS's CI.
That issue can be fixed by disabling the test on the platforms where it
is incompatible.

In detail, we should disable the test on FreeBSD, Apple, NetBSD,
Solaris, and Haiku, where `ReleaseMemoryPagesToOS` executes
`madvise(beg, end, MADV_FREE)`, which tags the relevant pages as 'FREE'
and does not release them immediately.
2025-07-02 20:28:30 +08:00
Shilei Tian
c0e9084b1c [AMDGPU] Add a debug option -amdgpu-snop-padding for GCNHazardRecognizer (#146587)
This can help identify whether there are potential hazards.

Co-authored-by: Byrnes, Jeffrey <Jeffrey.Byrnes@amd.com>
2025-07-02 08:16:38 -04:00
Kunqiu Chen
9eac5f72f6 Revert "[TSan] Clarify and enforce shadow end alignment" (#146674)
Reverts llvm/llvm-project#144648 due to a failure of the newly added
test case `munmap_clear_shadow.c` on iOS.
2025-07-02 20:11:11 +08:00
Mehdi Amini
6ec9b1b366 [MLIR] Remove spurious space when printing prop-dict (#145962)
When there are elided properties, there used to be an extra space
inserted in the prop-dict printing before the dictionary.

Fix #145695
2025-07-02 14:07:17 +02:00
David Sherwood
f575b18fdc [LV] Add support for partial reductions without a binary op (#133922)
Consider IR such as this:

for.body:
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
  %accum = phi i32 [ 0, %entry ], [ %add, %for.body ]
  %gep.a = getelementptr i8, ptr %a, i64 %iv
  %load.a = load i8, ptr %gep.a, align 1
  %ext.a = zext i8 %load.a to i32
  %add = add i32 %ext.a, %accum
  %iv.next = add i64 %iv, 1
  %exitcond.not = icmp eq i64 %iv.next, 1025
  br i1 %exitcond.not, label %for.exit, label %for.body

Conceptually we can vectorise this using partial reductions too,
although the current loop vectoriser implementation requires the
accumulation of a multiply. For AArch64 this is easily done with
a udot or sdot with an identity operand, i.e. a vector of (i16 1).
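
As a rough illustration of why an identity operand works, the sketch below models a 4-way dot-product accumulate in plain Python and feeds it a vector of ones, so each accumulator lane simply sums four adjacent inputs. The names `udot` and `partial_reduce_sum` are hypothetical, and the model assumes the input length is a multiple of the vector width; it is not the vectoriser's actual implementation.

```python
def udot(acc, a, b):
    """Model of a 4-way dot-product accumulate (hypothetical helper).

    Each accumulator lane adds the dot product of four adjacent
    element pairs taken from a and b.
    """
    assert len(a) == len(b) == 4 * len(acc)
    return [
        acc[i] + sum(a[4 * i + j] * b[4 * i + j] for j in range(4))
        for i in range(len(acc))
    ]


def partial_reduce_sum(data, lanes=4):
    """Sum data using udot with a vector of ones as the identity operand."""
    step = 4 * lanes
    # Assumption: no scalar tail; the length is a multiple of the vector width.
    assert len(data) % step == 0
    ones = [1] * step
    acc = [0] * lanes
    for start in range(0, len(data), step):
        acc = udot(acc, data[start:start + step], ones)
    # Final horizontal reduction of the partial sums.
    return sum(acc)
```

Multiplying by one leaves every element unchanged, so the result matches a plain sum of the inputs.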

In order to do this I had to teach getScaledReductions that the
accumulated value may come from a unary op, hence there is only
one extension to consider. Similarly, I updated the vplan and
AArch64 TTI cost model to understand the possible unary op.

---------

Co-authored-by: Matt Devereau <matthew.devereau@arm.com>
2025-07-02 13:05:51 +01:00
Joseph Huber
dea4f3213d [libc] Use is aligned builtin instead of ptrtoint (#146402)
Summary:
This avoids a ptrtoint by just using the clang builtin. This is clang
specific but only clang can compile GPU code anyway so I do not bother
with a fallback.
2025-07-02 07:03:11 -05:00
DrSergei
5fe63ae9a3 [lldb-dap] Fix flaky test TestDAP_server (#145231)
This patch fixes a possible data race between main and event handler
threads. Terminated event can be sent from `Disconnect` function or
event handler. Consequently, there are some possible sequences of
events. We must check events twice, because without getting an exited
event, `exit_status` will be None. But, we don't know the order of
events (for example, we can get terminated event before exited event),
so we check events by filter. It is correct, because terminated event
will be sent only once (guarded by `llvm::call_once`).

This patch was moved from
[145010](https://github.com/llvm/llvm-project/pull/145010) and is based on
an idea from this
[comment](https://github.com/llvm/llvm-project/pull/145010#discussion_r2159637210).
2025-07-02 12:16:48 +01:00
Matt Arsenault
585b41c2ec TargetOptions: Look up frame-pointer attribute once (#146639)
Same as 07a86a525e, except in
the other case here.
2025-07-02 20:09:20 +09:00
Stephen Tozer
35626e97d8 [DLCov] Origin-Tracking: Enable collecting and symbolizing stack traces (#143591)
This patch is part of a series that adds origin-tracking to the debugify
source location coverage checks, allowing us to report symbolized stack
traces of the point where missing source locations appear.

This patch adds a pair of new functions in `signals.h` that can be used
to collect and symbolize stack traces respectively. This has major
implementation overlap with the existing stack trace
collection/symbolizing methods, but the existing functions are
specialized for dumping a stack trace to stderr when LLVM crashes, while
these new functions are meant to be called repeatedly during the
execution of the program, and therefore we need a separate set of
functions.
2025-07-02 12:01:17 +01:00
Andrei Safronov
a2c9f7dbcc [Xtensa] Implement lowering SELECT_CC/BRCC for Xtensa FP Option. (#145544)
Also minor format changes in disassembler test for Xtensa FP Option.
2025-07-02 13:48:49 +03:00
Paul Walker
7cc8fe2a2c [LLVM][AArch64] Relax SVE/SME codegen predicates. (#145322)
Code generation predicates like HasSVE2_or_SME implemented a strict
divide between streaming and non-streaming which meant some SME
instructions were not available unless a matching SVE feature was
enabled.
2025-07-02 11:39:33 +01:00
Simon Pilgrim
38200e94f1 [DAG] visitFREEZE - always allow freezing multiple operands (#145939)
Always try to fold freeze(op(....)) -> op(freeze(),freeze(),freeze(),...).

This patch proposes we drop the opt-in limit for opcodes that are allowed to push a freeze through the op to freeze all its operands, through the tree towards the roots.

I'm struggling to find a strong reason for this limit apart from the DAG freeze handling being immature for so long - as we've improved coverage in canCreateUndefOrPoison/isGuaranteedNotToBeUndefOrPoison it looks like the regressions are not as severe.

Hopefully this will help some of the regression issues in #143102 etc.
2025-07-02 11:28:37 +01:00
nerix
4c7a706589 [LLDB] Simplify libstdc++ string summaries (#146562)
From #143177. This combines the summaries for the pre- and post-C++11
`std::string` as well as `std::wstring`. In all cases, the data pointer
is reachable through `_M_dataplus._M_p`. It has the correct type (i.e.
`char*`/`wchar_t*`) and it's null terminated, so LLDB knows how to
format it as expected when using `GetSummaryAsCString`.
2025-07-02 11:21:31 +01:00
Michael Buch
40275a4ee3 [lldb][test] Add tests for formatting pointers to std::unordered_map
Ever since #143501 and #144517, these should pass.

Adds tests for https://github.com/llvm/llvm-project/issues/146040
2025-07-02 11:21:02 +01:00
Mel Chen
bc8dad1c7e [VPlan] Emit VPVectorEndPointerRecipe for reverse interleave pointer adjustment (#144864)
A reverse interleave access is essentially composed of multiple
load/store operations with the same negative stride, and their addresses are
based on the last lane address of member 0 in the interleaved group.

Currently, we already have VPVectorEndPointerRecipe for computing the
last lane address of consecutive reverse memory accesses. This patch
extends VPVectorEndPointerRecipe to support constant stride and extracts
the reverse interleave group address adjustment from
VPInterleaveRecipe::execute, replacing it with a
VPVectorEndPointerRecipe.

The final goal is to support interleaved accesses with EVL tail folding.
Given that VPInterleaveRecipe is large and tightly coupled — combining
both load and store, and embedding operations like reverse pointer
adjustment (GEP), widen load/store, deinterleave/interleave, and reversal
— breaking it down into smaller, dedicated recipes may allow
VPlanTransforms::tryAddExplicitVectorLength to lower them into EVL-aware
form more effectively.

One foreseeable challenge is that
VPlanTransforms::convertToConcreteRecipes currently runs after
tryAddExplicitVectorLength, so decomposing VPInterleaveRecipe will
likely need to happen earlier in the pipeline to be effective.
2025-07-02 18:16:02 +08:00
Hanyang (Eric) Xu
6e1e89ee38 [SLP] Avoid -passes=instcombine stages in SLP tests (#146257)
Fixes #145511

Note that there are still two instances of
--passes=slp-vectorizer,instcombine left unchanged because it seems that
the tests are meant to run in conjunction with instcombine and removing
instcombine would invalidate their original objective:


[llvm/test/Transforms/SLPVectorizer/arith-div-undef.ll](https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/SLPVectorizer/arith-div-undef.ll)

[llvm/test/Transforms/SLPVectorizer/slp-hr-with-reuse.ll](https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/SLPVectorizer/slp-hr-with-reuse.ll)
2025-07-02 06:14:41 -04:00
Kazu Hirata
7ead20db28 [lldb] Use llvm::erase_if (NFC) (#146624)
Note that erase_if combines erase and remove_if.
2025-07-02 11:00:58 +01:00
Qi Zhao
82c0a53763 [LoongArch] Pre-commit for optimizing insert extracted pair elements
2025-07-02 17:38:08 +08:00
Tom Eccles
1b7cbe1f87 [flang][OpenMP] Create unique reduction decls for different logical kinds (#146558)
Some Fujitsu tests showed incorrect results because we were sharing
reduction declarations for different kinds for logical variables.
2025-07-02 10:25:43 +01:00
Simon Pilgrim
651c5208f8 VPlanRecipes.cpp - fix "'llvm::VPExpressionRecipe::computeCost': not all control paths return a value" MSVC warning. NFC.
2025-07-02 09:59:01 +01:00
Graham Hunter
85bc868417 [AArch64][TTI] Reduce cost for splatting whole first vector segment (SVE) (#145701)
Improve cost modeling for splatting the first 128b segment.
2025-07-02 09:51:56 +01:00
Jannick Kremer
a75587d271 [clang][python][test] Move python binding tests to lit framework (#146486)
As discussed in PR #142353, the current testsuite of the `clang` Python
bindings has several issues:

- If `libclang.so` cannot be loaded into `python` to run the testsuite,
the whole `ninja check-all` aborts.
- The result of running the testsuite isn't reported like the `lit`-based
tests, rendering them almost invisible.
- The testsuite is disabled in a non-obvious way (`RUN_PYTHON_TESTS`) in
`tests/CMakeLists.txt`, which again doesn't show up in the test results.

All these issues can be avoided by integrating the Python bindings tests
with `lit`, which is what this patch does:

- The actual test lives in `clang/test/bindings/python/bindings.sh` and
is run by `lit`.
- The current `clang/bindings/python/tests` directory (minus the
now-superfluous `CMakeLists.txt`) is moved into the same directory.
- The check if `libclang` is loadable (originally from PR #142353) is
now handled via a new `lit` feature, `libclang-loadable`.
- The various ways to disable the tests have been turned into `XFAIL`s
as appropriate. This isn't complete and not completely tested yet.

Tested on `sparc-sun-solaris2.11`, `sparcv9-sun-solaris2.11`,
`i386-pc-solaris2.11`, `amd64-pc-solaris2.11`, `i686-pc-linux-gnu`, and
`x86_64-pc-linux-gnu`.

Co-authored-by: Rainer Orth <ro@gcc.gnu.org>
2025-07-02 10:11:48 +02:00
Zhaoxin Yang
2c1900860c [lld][LoongArch] Support TLSDESC GD/LD to IE/LE (#123715)
Support TLSDESC to initial-exec or local-exec optimizations. Introduce a
new hook RE_LOONGARCH_RELAX_TLS_GD_TO_IE_PAGE_PC and use the existing
R_RELAX_TLS_GD_TO_IE_ABS to support TLSDESC => IE, while using the existing
R_RELAX_TLS_GD_TO_LE to support TLSDESC => LE.
    
In normal or medium code model, there are two forms of code sequences:
* pcalau12i  $a0, %desc_pc_hi20(sym_desc)
* addi.d     $a0, $a0, %desc_pc_lo12(sym_desc)
* ld.d       $ra, $a0, %desc_ld(sym_desc)
* jirl       $ra, $ra, %desc_call(sym_desc)
------
* pcaddi     $a0, %desc_pcrel_20(sym_desc)
* ld.d       $ra, $a0, %desc_ld(sym_desc)
* jirl       $ra, $ra, %desc_call(sym_desc)
    
Convert to IE:
* pcalau12i $a0, %ie_pc_hi20(sym_ie)
* ld.[wd]   $a0, $a0, %ie_pc_lo12(sym_ie)

Convert to LE:
* lu12i.w $a0, %le_hi20(sym_le) # le_hi20 != 0, otherwise NOP
* ori $a0, src, %le_lo12(sym_le) # le_hi20 != 0, src = $a0, otherwise src = $zero
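
The LE sequence above can be modelled in a few lines of Python. This is only a sketch: it assumes `%le_hi20` selects bits 31:12 and `%le_lo12` selects bits 11:0 of the offset, restricts the offset to non-negative values below 2^31 to sidestep sign extension, and uses hypothetical function names.

```python
def split_le_offset(offset):
    """Split a TLS LE offset into (hi20, lo12) parts (assumed bit selection)."""
    # Assumption: small non-negative offsets only, so sign extension is ignored.
    assert 0 <= offset < (1 << 31)
    hi20 = (offset >> 12) & 0xFFFFF
    lo12 = offset & 0xFFF
    return hi20, lo12


def materialize(offset):
    """Model the lu12i.w + ori pair, including the NOP case for hi20 == 0."""
    hi20, lo12 = split_le_offset(offset)
    if hi20 != 0:
        src = hi20 << 12  # lu12i.w $a0, hi20; ori then reads $a0
    else:
        src = 0           # lu12i.w becomes a NOP; ori reads $zero
    return src | lo12     # ori $a0, src, lo12
```

In both branches the result equals the original offset, which is why the `lu12i.w` can be dropped when the high part is zero.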

For simplicity, in both tlsdescToIe and tlsdescToLe we always
convert the preceding instructions to NOPs, because both forms of code
sequence (corresponding to the relocation combinations
R_LARCH_TLS_DESC_PC_HI20+R_LARCH_TLS_DESC_PC_LO12 and
R_LARCH_TLS_DESC_PCREL20_S2) are processed the same way.
    
TODO: When relaxation is enabled, redundant NOPs can be removed. This will be
implemented in a future patch.
    
Note: The different forms of TLSDESC code sequences should not appear
interleaved in the normal, medium or extreme code model; compilers do not
generate such code and lld does not support it. This is thanks to the guard in
PostRASchedulerList.cpp in llvm.
```
Calls are not scheduling boundaries before register allocation,
but post-ra we don't gain anything by scheduling across calls
since we don't need to worry about register pressure.
```
2025-07-02 16:09:51 +08:00
Antonio Frighetto
f1cc0b607b [IR] Introduce dead_on_return attribute
Add `dead_on_return` attribute, which is meant to be taken advantage of
by the frontend, and states that the memory pointed to by the argument
is dead upon function return. As with `byval`, it is supposed to be
used for passing aggregates by value. The difference lies in the ABI:
`byval` implies that the pointer is explicitly passed as argument to
the callee (during codegen the copy is emitted as per byval contract),
whereas a `dead_on_return`-marked argument implies that the copy
already exists in the IR, is located at a specific stack offset within
the caller, and this memory will not be read further by the caller upon
callee return – or otherwise poison, if read before being written.

RFC: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.
2025-07-02 09:29:36 +02:00
Fangrui Song
d5608d6751 MC,test: Improve section group test
Also add a case for #146581
```
.section sec,"ax"
.section .foo,"axG",@progbits,sec
nop
```
2025-07-02 00:28:40 -07:00
Matthias Springer
647aa6616f [mlir][SPIRVToLLVM] Set valid insertion point after op erasure (#146551)
Erasing/replacing an op, which is also the current insertion point,
invalidates the insertion point. Explicitly set the insertion point, so
that `copy` does not crash after the One-Shot Dialect Conversion
refactoring. (`ConversionPatternRewriter` will start behaving more like
a "normal" rewriter.)
2025-07-02 09:25:24 +02:00
Nikita Popov
83272a4849 [InstCombine] Fold icmp of gep chain with base (#144065)
Fold icmp between a chain of geps and its base pointer. Previously only
a single gep was supported.
    
This will be extended to handle the case of two gep chains with a common
base in a followup.

This helps to avoid regressions after #137297.
2025-07-02 09:23:36 +02:00
Haojian Wu
0588e8188c [Serialization] Use the SourceLocation::UIntTy instead of the raw type
for the offset, NFC
2025-07-02 09:11:55 +02:00
Markus Böck
6c9be27b52 [mlir][tensor] Fold identity reshape of 0d-tensors (#146375)
Just like 1d-tensors, reshapes of 0d-tensors (aka scalars) are always
no-ops as they only have one possible layout. This PR adds logic to
the `fold` implementation to optimize these away as is currently
implemented for 1d tensors.
2025-07-02 09:09:03 +02:00
Fangrui Song
9262ac3ee4 Revert "ELFObjectWriter: Optimize isInSymtab"
This reverts commit 1108cf6419.

Caused a regression for a weird but interesting case (STT_SECTION symbol
as group signature). We no longer define `sec`
```
.section sec,"ax"
.section .foo,"axG",@progbits,sec
nop
```

Fix #146581
2025-07-02 00:08:42 -07:00
Fangrui Song
eac1a1d3a8 MCAssembler: Consistently place MCFragment parameter before MCFixup
... to be consistent with other places, e.g. `recordRelocation`.
While here, use references instead of non-null pointers.
2025-07-01 23:59:35 -07:00
zbenzion
b68e8f1de7 [mlir][linalg] Allow promotion to use the original subview size (#144334)
linalg promotion attempts to compute a constant upper bound for the
allocated buffer size. Only when it fails to compute an upper bound does it
fall back to the original subview size, which may be dynamic.

This adds a promotion option to use the original subview size by default,
thus minimizing the allocation size.
Fixes #144268.
2025-07-02 08:47:51 +02:00
Fangrui Song
3c6cade485 MCObjectStreamer: De-virtualize emitInstToFragment
2025-07-01 23:05:35 -07:00
Kazu Hirata
f4b938b7c0 [TableGen] Use range-based for loops (NFC) (#146626)
2025-07-01 22:50:11 -07:00
Kazu Hirata
b809d5e2ac [ProfileData] Use lambdas instead of std::bind (NFC) (#146625)
Lambdas are a lot shorter than std::bind here.
2025-07-01 22:50:04 -07:00
Kazu Hirata
838b91d7f6 [clangd] Drop const from a return type (NFC) (#146623)
We don't need const on a return type.
2025-07-01 22:49:56 -07:00
Kazu Hirata
7b4dbb4f37 [Sema] Remove an unnecessary cast (NFC) (#146622)
Since both alignment and Alignment are of the same type, this patch
renames alignment to Alignment while removing the cast statement.
2025-07-01 22:49:48 -07:00
Mateusz Mikuła
2723a6d992 [LLVM][Cygwin] Enable dynamic linking of libLLVM (#146440)
These changes allow linking everything against the shared LLVM library with
the MSYS2 "Cygwin" toolchain.
2025-07-01 22:30:12 -07:00
Timm Baeder
984c78f27d [clang][bytecode] Add back missing initialize call (#146589)
This was only accidentally dropped, so add it back.
2025-07-02 07:15:47 +02:00
Craig Topper
c9bfdae620 [RISCV] Use uint64_t for Insn in getInstruction32 and getInstruction16. NFC (#146619)
Insn is passed to decodeInstruction which is a template function based
on the type of Insn. By using uint64_t we ensure only one version of
decodeInstruction is created. This reduces the file size of
RISCVDisassembler.cpp.o by ~25% in my local build.
2025-07-01 21:45:02 -07:00
Shilei Tian
f1a4bb6245 [RFC][NFC][AMDGPU] Remove explicit value assignments from AMDGPU::GPUKind (#146567)
We don't seem to rely on the specific values of these enums, so removing
the explicit assignments simplifies the process of adding new targets.
2025-07-01 23:39:01 -04:00
Alex Crichton
a8a9a7f95a [WebAssembly] Fix inline assembly with vector types (#146574)
This commit fixes using inline assembly with v128 results. Previously
this failed with an internal assertion about a failure to legalize a
`CopyFromReg` where the source register was typed `v8f16`. It looks like
the type used for the destination register was whatever was listed first
in the `def V128 : WebAssemblyRegClass` listing, so the types were
shuffled around to have a default-supported type.

A small test was added as well which failed to generate previously and
should now pass in generation. This test also passed on LLVM 18
and regressed by accident in #93228, which was first included in LLVM 19.
2025-07-01 20:26:30 -07:00
Peter Collingbourne
2a702cdc38 Driver: Avoid llvm::sys::path::append if resource directory absolute.
After #145996 CLANG_RESOURCE_DIR can be an absolute path so we need to
handle it correctly in the driver.

llvm::sys::path::append does not append absolute paths in the way
that I expected (or consistent with other similar APIs such as C++17
std::filesystem::path::append or Python os.path.join); instead, it
effectively discards the leading / and appends the resulting relative path
(e.g. append(P, "/bar") with P = "/foo" sets P to "/foo/bar").

Many tests start failing if I try to align llvm::sys::path::append with
the other APIs because of callers that expect the existing behavior,
so for now let's add a special case here for absolute resource paths,
and document the behavior in Path.h.
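
The difference described above can be sketched in Python, using `posixpath.join` for the "other APIs" behavior. `llvm_style_append` and `resource_dir` are hypothetical helpers that only model the behavior as this message describes it; they are not the driver's actual code.

```python
import posixpath


def llvm_style_append(base, component):
    """Hypothetical model of the append behavior described in this message:
    leading separators on the component are dropped and the remainder is
    appended as a relative path."""
    return base.rstrip("/") + "/" + component.lstrip("/")


def resource_dir(driver_dir, configured):
    """Sketch of the special case: an absolute configured path is used as-is."""
    if posixpath.isabs(configured):
        return configured
    return llvm_style_append(driver_dir, configured)


# posixpath.join restarts from an absolute component.
joined = posixpath.join("/foo", "/bar")            # "/bar"
appended = llvm_style_append("/foo", "/bar")       # "/foo/bar"
```

With the special case in place, an absolute resource directory survives unchanged, while relative ones are still appended to the driver's directory.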

Reviewers: MaskRay

Reviewed By: MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/146449
2025-07-01 20:21:51 -07:00
XiangZhang
aa1d9a4c31 [MLIR][Affine] Enhance simplifyAdd for AffineExpr mod (#146492)
Currently, AffineExpr Add can simplify `s1 + (s1 // c * -c)`
to `s1 % c`,
but cannot simplify `(s0 + s1) + (s1 // c * -c)`.
This patch enables that simplification, so the expression can
be simplified to `s0 + s1 % c`.
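
The identity behind this rewrite can be checked numerically. The sketch below assumes `//` denotes floor division and `c` is a positive constant, which matches Python's `//` and `%`; the function names are hypothetical.

```python
def unsimplified(s0, s1, c):
    """(s0 + s1) + (s1 // c * -c), with // as floor division."""
    return (s0 + s1) + (s1 // c * -c)


def simplified(s0, s1, c):
    """s0 + s1 % c, the form the patch aims to produce."""
    return s0 + s1 % c


def identity_holds(limit=20):
    """Exhaustively compare both forms over a small range of inputs."""
    return all(
        unsimplified(s0, s1, c) == simplified(s0, s1, c)
        for s0 in range(-limit, limit)
        for s1 in range(-limit, limit)
        for c in range(1, limit)
    )
```

The check passes because `s1 - (s1 // c) * c` is the definition of `s1 % c` under floor division, and adding `s0` to both sides does not change that.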
2025-07-02 11:08:58 +08:00
Kazu Hirata
eb07f0d4a9 [Analysis] Use range-based for loops (NFC) (#146466)
2025-07-01 19:38:28 -07:00
Ashwin Banwari
2599a9aeb5 [clang] [modules] Implement P3618R0: Allow attaching main to the global module (#146461)
Remove the prior warning for attaching extern "C++" to main.
2025-07-02 09:52:10 +08:00
Ami-zhang
3deed4211a [docs] Add clang release notes for LoongArch (#146481)
2025-07-02 09:21:33 +08:00
Jonas Devlieghere
a87b27fd51 [lldb] Fix the hardware breakpoint decorator (#146609)
A decorator to skip or XFAIL a test takes effect when the function
that's passed in returns a reason string. The wrappers around
hw_breakpoints_supported were doing that incorrectly by inverting
(calling `not` on) the result, turning it into a boolean, which means
the test is always skipped.
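
A minimal model of the failure mode, assuming the decorator treats any value other than `None` as a skip reason (an assumption about the test framework, not confirmed by this message). `hw_breakpoints_unsupported` and `should_skip` are hypothetical stand-ins.

```python
def hw_breakpoints_unsupported(supported):
    """Return a reason string when unsupported, otherwise None (hypothetical)."""
    return None if supported else "hardware breakpoints not supported"


def should_skip(reason_fn):
    """Assumed decorator logic: skip whenever a non-None reason is returned."""
    reason = reason_fn()
    return reason is not None


# Correct usage: pass the reason through unchanged.
ok_supported = should_skip(lambda: hw_breakpoints_unsupported(True))
ok_unsupported = should_skip(lambda: hw_breakpoints_unsupported(False))

# Buggy usage: `not` turns the result into a bool, which is never None.
bug_supported = should_skip(lambda: not hw_breakpoints_unsupported(True))
bug_unsupported = should_skip(lambda: not hw_breakpoints_unsupported(False))
```

Under that assumption both buggy calls report a skip, because `True` and `False` are each distinct from `None`.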
2025-07-01 18:01:19 -07:00
Matt Arsenault
7502af89fc clang: Forward exception_model flag for bitcode inputs (#146342)
This will enable removal of a hack from the wasm backend
in a future change.

This feels unnecessarily clunky. I would assume something was
automatically parsing this and propagating it in the C++ case,
but I can't seem to find it. In particular it feels wrong that
I need to parse out the individual values, given they are listed
in the options.td file. We should also be parsing and forwarding
every flag that corresponds to something else in TargetOptions,
which requires auditing.
2025-07-02 09:39:46 +09:00
Wenju He
b0e6faae08 [libclc] Add missing clc_lgamma_r with generic address space pointer arg (#146495)
There is no change to amdgcn--amdhsa.bc and nvptx64--nvidiacl.bc because
__opencl_c_generic_address_space is not defined for them.
2025-07-02 08:28:01 +08:00