542664 Commits

Author SHA1 Message Date
Jim Lin
c15f422541 [RISCV] Remove required features zvfhmin/zvfbfmin from plain f16/bf16 intrinsics (#145891)
We've checked f16/bf16 vector type support using `checkRVVTypeSupport`.
So it's not necessary to add the required features for plain f16/bf16
intrinsics that do not use actual instructions from zvfhmin/zvfbfmin.
2025-06-27 16:10:10 +08:00
David Sherwood
6f43754e9c [LV] Disable interleaving via hints for uncountable early exit loops (#145877)
Currently, if the user enables interleaving when vectorising
uncountable early exit loops via the interleave_count pragma and the
enable-early-exit-vectorization option, the loop is miscompiled. There is
ongoing work to fix this, but for now it seems safer to ignore the hint
until it is supported.

---------

Co-authored-by: Paul Walker <paul.walker@arm.com>
2025-06-27 09:09:55 +01:00
Kazu Hirata
2529de5c93 [ADT] Deprecate ArrayRef(std::nullopt) (#146011)
Using std::nullopt outside the context of std::optional is something of
an abuse and is not intuitive to newcomers, so this patch deprecates
the constructor. All known uses within the LLVM codebase have been
migrated to other constructors.
2025-06-27 01:03:02 -07:00
Simon Pilgrim
a40a4c552b [MC] MCSectionGOFF.h - fix GCC Wparentheses operator precedence warning around assert message. NFC. 2025-06-27 08:59:46 +01:00
Florian Hahn
ec62dee703 [VPlan] Handle FirstActiveLane when unrolling. (#145394)
Currently, FirstActiveLane is not handled correctly during
unrolling. This causes miscompiles when
vectorizing early-exit loops with interleaving forced.

This patch updates handling of FirstActiveLane to be analogous to
computing final reduction results: during unrolling, the created copies
for its original operand are added as additional operands, and
FirstActiveLane will always produce the index of the first active lane
across all unrolled iterations.
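
A minimal sketch of the semantics described above, not the VPlan implementation: each unrolled copy contributes one mask, and the result is the index of the first active lane in their concatenation. The function name and the "no active lane returns the total lane count" convention are assumptions made for illustration.

```python
from typing import Sequence


def first_active_lane(parts: Sequence[Sequence[bool]]) -> int:
    """Index of the first active lane across all unrolled parts.

    `parts` holds one mask per unrolled copy, in iteration order.
    Assumption: if no lane is active, the total lane count is returned.
    """
    offset = 0
    for mask in parts:
        for lane, active in enumerate(mask):
            if active:
                # Lanes of later parts are numbered after all earlier parts.
                return offset + lane
        offset += len(mask)
    return offset


# Two unrolled parts with 4 lanes each; the first active lane is lane 2 of part 1.
print(first_active_lane([[False] * 4, [False, False, True, False]]))  # 6
```

Looking only at the first part would miss the active lane here, which is the kind of error the message attributes to unrolling.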

Note that some of the generated code is still incorrect, as we also need
to handle ExtractElement with FirstActiveLane operands. I will share
patches for those soon as well.

PR: https://github.com/llvm/llvm-project/pull/145394
2025-06-27 08:44:57 +01:00
Daniel Man
045b827367 [GlobalISel] Use-Vector-Truncate Opt Needs Elt Type Check (#146003)
The pre-legalizer combiner has a bug in the UseVectorTruncate
match-apply optimization. When the destination types do not match the
vector element type of the G_UNMERGE_VALUES instruction, the resulting
collapsed truncate does not preserve the original functional behavior. This
commit introduces a simple type check to ensure that the destination
types match the vector element type.
2025-06-27 16:41:22 +09:00
Baranov Victor
8a839ea791 [analyzer][NFC] Fix clang-tidy warning in Malloc and UnixApi checkers (#145719)
Mostly `else-after-return` and `else-after-continue` warnings
2025-06-27 10:33:09 +03:00
Fangrui Song
205dcf7146 PowerPC: Remove redundant MCSymbolRefExpr::VariantKind casts 2025-06-27 00:28:41 -07:00
Christopher McGirr
96c1611163 [mlir][linalg] fix OuterUnitDims linalg.pack decomposition pattern (#141613)
Given the following example:
```
module {
  func.func @main(%arg0: tensor<1x1x1x4x1xf32>, %arg1: tensor<1x1x4xf32>) -> tensor<1x1x1x4x1xf32> {
    %pack = linalg.pack %arg1 outer_dims_perm = [1, 2, 0] inner_dims_pos = [2, 0] inner_tiles = [4, 1] into %arg0 : tensor<1x1x4xf32> -> tensor<1x1x1x4x1xf32>
    return %pack : tensor<1x1x1x4x1xf32>
  }
}
```

We would generate an invalid transpose operation because the calculated
permutation would be `[0, 2, 0]`, which is semantically incorrect: the
permutation must contain unique integers corresponding to the source
tensor dimensions.

The following change modifies how we calculate the permutation array and
ensures that the dimension indices given in the permutation array are
unique.

The above example would then translate to a transpose with a
permutation of `[1, 2, 0]`, following the rule that the `inner_dims_pos`
indices are appended to the permutation array and the preceding positions
are filled with the remaining dimensions.
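
The rule stated above can be sketched as follows. This only models the rule as worded in this message for the given example; the helper name is hypothetical and the real decomposition pattern handles more than this (for instance `outer_dims_perm`).

```python
from typing import List, Sequence


def pack_transpose_perm(src_rank: int, inner_dims_pos: Sequence[int]) -> List[int]:
    """Remaining source dims first, then the inner_dims_pos indices appended."""
    remaining = [d for d in range(src_rank) if d not in inner_dims_pos]
    perm = remaining + list(inner_dims_pos)
    # A valid permutation uses every source dimension exactly once.
    assert sorted(perm) == list(range(src_rank))
    return perm


# Source tensor<1x1x4xf32> has rank 3 and inner_dims_pos = [2, 0].
print(pack_transpose_perm(3, [2, 0]))  # [1, 2, 0]
```

The uniqueness assertion is the property that the invalid `[0, 2, 0]` permutation violated.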
2025-06-27 09:24:33 +02:00
Joachim Jenke
23384cd581 [OpenMP][test][NFC] Temporarily disabling failing test
The test added with PR #145625 fails with certain build configurations of
libomp, disabling the test until the issue in the runtime is fixed.
2025-06-27 09:03:05 +02:00
quic_hchandel
950d281eb2 [RISCV] Add ISel patterns for Qualcomm uC Xqcicm extension (#145643)
Add codegen patterns for the conditional move instructions in this
extension
2025-06-27 12:25:48 +05:30
Kunqiu Chen
bc90166a50 [TSan] Clarify and enforce shadow end alignment (#144648)
In TSan, every `k` bytes of application memory (where `k = 8`) maps to a
single shadow/meta cell. This design leads to two distinct outcomes when
calculating the end of a shadow range using `MemToShadow(addr_end)`,
depending on the alignment of `addr_end`:

- **Exclusive End:** If `addr_end` is aligned (`addr_end % k == 0`),
`MemToShadow(addr_end)` points to the first shadow cell *past* the
intended range. This address is an exclusive boundary marker, not a cell
to be operated on.
- **Inclusive End:** If `addr_end` is not aligned (`addr_end % k != 0`),
`MemToShadow(addr_end)` points to the last shadow cell that *is* part of
the range (i.e., the same cell as `MemToShadow(addr_end - 1)`).
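
The two cases can be illustrated with a toy model. Here `shadow_cell` is a stand-in for `MemToShadow` that simply returns a cell index; the real mapping also involves address translation that is not modelled, and the helper names are assumptions.

```python
K = 8  # application bytes per shadow cell (the `k` in the text)


def shadow_cell(addr: int) -> int:
    """Toy stand-in for MemToShadow: index of the cell covering `addr`."""
    return addr // K


def end_kind(addr_end: int) -> str:
    """Classify shadow_cell(addr_end) for a range ending at `addr_end`."""
    last_cell_in_range = shadow_cell(addr_end - 1)
    if shadow_cell(addr_end) == last_cell_in_range:
        return "inclusive"  # unaligned end: still inside the last cell
    return "exclusive"      # aligned end: first cell past the range


print(end_kind(16))  # exclusive
print(end_kind(13))  # inclusive
```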

Different TSan functions have different expectations for whether the
shadow end should be inclusive or exclusive. However, these expectations
are not always explicitly enforced, which can lead to subtle bugs or
reliance on unstated invariants.


The core of this patch is to ensure that functions ONLY requiring an
**exclusive shadow end** behave correctly.

1.  Enforcing Existing Invariants:
For functions like `MetaMap::MoveMemory` and `MapShadow`, the assumption
is that the end address is always `k`-aligned. While this holds true in
the current codebase (e.g., due to some external implicit conditions),
this invariant is not guaranteed by the function's internal context. We
add explicit assertions to make this requirement clear and to catch any
future changes that might violate this assumption.

2.  Fixing Latent Bugs:
In other cases, unaligned end addresses are possible, representing a
latent bug. This was the case in `UnmapShadow`. The `size` of a memory
region being unmapped is not always a multiple of `k`. When this
happens, `UnmapShadow` would fail to clear the final (tail) portion of
the shadow memory.

This patch fixes `UnmapShadow` by rounding up the `size` to the next
multiple of `k` before clearing the shadow memory. This is safe because
the underlying OS `unmap` operation is page-granular, and the page size
is guaranteed to be a multiple of `k`.
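
A small sketch of why the round-up matters, again using cell indices rather than real shadow addresses. The function names are hypothetical and `addr` is assumed to be `k`-aligned.

```python
K = 8  # application bytes per shadow cell


def round_up(size: int, k: int = K) -> int:
    """Round `size` up to the next multiple of `k`."""
    return (size + k - 1) // k * k


def cells_cleared(addr: int, size: int, fix: bool) -> range:
    """Half-open range of cell indices cleared for [addr, addr + size)."""
    if fix:
        size = round_up(size)
    return range(addr // K, (addr + size) // K)


# A 13-byte region touches cells 0 and 1.
print(list(cells_cleared(0, 13, fix=False)))  # [0]  (tail cell missed)
print(list(cells_cleared(0, 13, fix=True)))   # [0, 1]
```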

Notably, this fix makes `UnmapShadow` consistent with its inverse
operation, `MemoryRangeImitateWriteOrResetRange`, which already performs
a similar size round-up.

In summary, this PR:

- **Adds assertions** to `MetaMap::MoveMemory` and `MapShadow` to
enforce their implicit requirement for k-aligned end addresses.
- **Fixes a latent bug** in `UnmapShadow` by rounding up the size to
ensure the entire shadow range is cleared. Two new test cases have been
added to cover this scenario.
- Removes a redundant assertion in `__tsan_java_move`.
- Fixes an incorrect shadow end calculation introduced in commit
4052de6. The previous logic, while fixing an overestimation issue, did
not properly account for `kShadowCell` alignment and could lead to
underestimation.
2025-06-27 14:43:34 +08:00
Fangrui Song
7726103d1e WebAssembly: Merge MCExpr into MCAsmInfo
to align with targets that have made the transition.
2025-06-26 23:38:39 -07:00
Kazu Hirata
a277d24ddb [ProfileData] Use llvm::count (NFC) (#146013)
llvm::count is shorter than llvm::count_if plus a lambda.
2025-06-26 23:38:28 -07:00
Kazu Hirata
c7b34b0b44 [mlir] Use a new constructor of ArrayRef (NFC) (#146009)
ArrayRef now has a new constructor that takes a parameter whose type
has data() and size().  This patch migrates:

  ArrayRef<T>(X.data(), X.size())

to:

  ArrayRef<T>(X)
2025-06-26 23:38:20 -07:00
Kazu Hirata
26ec66dc18 [llvm] Use a new constructor of ArrayRef (NFC) (#146008)
ArrayRef now has a new constructor that takes a parameter whose type
has data() and size().  This patch migrates:

  ArrayRef<T>(X.data(), X.size())

to:

  ArrayRef<T>(X)
2025-06-26 23:38:12 -07:00
Kazu Hirata
8f71650baa [clang] Use a new constructor of ArrayRef (NFC) (#146007)
ArrayRef now has a new constructor that takes a parameter whose type
has data() and size().  This patch migrates:

  ArrayRef<T>(X.data(), X.size())

to:

  ArrayRef<T>(X)
2025-06-26 23:38:05 -07:00
Florian Hahn
786ccb2c0e [LV] Directly check if memory or SCEV check blocks are used (NFCI).
Slightly simplify the logic to retrieve check blocks in
GeneratedRTChecks, to prepare for additional refactoring.
2025-06-27 07:24:32 +01:00
Fangrui Song
eb9d22b24c VE: Merge MCExpr into MCAsmInfo 2025-06-26 23:21:46 -07:00
Thurston Dang
afe6af14ff [msan] Add optional flag to improve instrumentation of disjoint OR (#145990)
The disjoint OR (https://github.com/llvm/llvm-project/pull/72583) of two '1's is poison, hence MSan ought to consider the result uninitialized (rather than initialized, i.e. a false negative, as the existing instrumentation does by ignoring disjointness). This patch adds a flag, `-msan-precise-disjoint-or`, which defaults to false (the legacy behavior). A future patch will default this flag to true.
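
As a rough illustration of the IR semantics only (MSan's shadow propagation is not modelled here), a disjoint OR can be sketched with `None` standing in for poison. The function name is hypothetical.

```python
from typing import Optional


def or_disjoint(a: int, b: int) -> Optional[int]:
    """Model of `or disjoint`: None stands in for a poison result."""
    if a & b:
        # Some bit is 1 in both operands, so the disjoint promise is broken.
        return None
    return a | b


print(or_disjoint(0b0101, 0b1010))  # 15
print(or_disjoint(0b0110, 0b0100))  # None
```

The second call is the case the message describes: both operands have a 1 in the same bit position, so the result should not be treated as an ordinary initialized value.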

Updates the test from https://github.com/llvm/llvm-project/pull/145982
2025-06-26 22:55:55 -07:00
Fangrui Song
56b2c7d988 MC: Rename initializeVariantKinds to initializeAtSpecifiers
We introduced VariantKinds after MCSymbolRefExpr::VariantKind and then
deprecated the VariantKind naming in favor of AtSpecifier (#133214).
Rename the function and type to use the recommended convention.
2025-06-26 22:46:20 -07:00
Paul Kirth
9179322447 Revert "[llvm][EmbedBitcodePass] Prevent modifying the module with ThinLTO" (#145987)
Reverts llvm/llvm-project#139999

This has a reported crash in
https://github.com/llvm/llvm-project/pull/139999#issuecomment-2993622494

This PR was intended to fix an error when linking, which is
unfortunately preferable to crashing clang. For now, we'll revert and
investigate the problem.
2025-06-26 22:35:38 -07:00
Chuanqi Xu
d829636f5d [C++20] [Modules] Don't mark namespace decl as module local declaration
Close https://github.com/llvm/llvm-project/issues/145975

According to [basic.namespace.general]/p2:
> A namespace is never attached to a named module and never has a name
> with module linkage.
2025-06-27 13:35:09 +08:00
Changpeng Fang
4729242878 AMDGPU: Add MC layer support for load transpose instructions for gfx1250 (#146024)
Co-authored with @jayfoad
2025-06-26 22:30:31 -07:00
Fangrui Song
7dfcf489fd PowerPC: Separate ELF and XCOFF @ specifiers
`@l` was incorrectly parsed as ELF-specific S_LO. Change it to AIX-specific S_L.
2025-06-26 22:23:51 -07:00
Florian Mayer
60a18d6119 [LowerAllowCheckPass] fix pipeline printing (#146000) 2025-06-26 21:35:35 -07:00
LLVM GN Syncbot
0cde5a8569 [gn build] Port 4f97780a7a 2025-06-27 03:55:29 +00:00
Nicolai Hähnle
61739d76f0 AMDGPU: Trivial doc fixes (#146021) 2025-06-26 20:55:16 -07:00
Jordan Rupprecht
8ed064b979 [bazel] Add targets for transform.debug python extension (#146022)
For #145550 / c08502defe
2025-06-26 22:51:25 -05:00
Fangrui Song
207925ebe7 Xtensa: Move MCExpr into MCAsmInfo
to align with targets that have made the transition.
2025-06-26 20:48:47 -07:00
Erick Velez
ab1e4d55d8 [clang-doc] refactor BitcodeReader::readSubBlock (#145835)
Reduce boilerplate code in readSubBlock by creating a callable from a higher-order lambda based on the block's add need.
2025-06-26 20:44:14 -07:00
Fangrui Song
4f97780a7a LoongArch: Move MCExpr into MCAsmInfo
to align with targets that have made the transition.
2025-06-26 20:42:51 -07:00
Qi Zhao
569fcac458 [LoongArch] Pre-commit tests for optimizing insert extracted fp elements 2025-06-27 11:19:06 +08:00
Jordan Rupprecht
1b2843bae0 [bazel] Port #145995 (#146014)
Commit 0515449f6d
2025-06-26 22:21:48 -05:00
Jim Lin
96ec1c29f2 [RISCV] Add nds.bfos and nds.bfoz for the short forward branch optimization. (#145836)
This adds nds.bfos and nds.bfoz, which are also supported by Andes
45-series CPUs for short forward branch optimization.
2025-06-27 10:54:41 +08:00
Han-Chung Wang
0515449f6d [mlir][tensor][memref] Enhance collapse(expand(src)) canonicalization pattern. (#145995) 2025-06-26 19:39:50 -07:00
ZhaoQi
30e519e1ad [LoongArch] Fix xvshuf instructions lowering (#145868)
Fix https://github.com/llvm/llvm-project/issues/137000.
2025-06-27 10:29:32 +08:00
Nicolai Hähnle
d58b0f23d0 AMDGPU/MC: Try harder to evaluate absolute MC expressions (#145146)
This is a follow-up to commit 24c860547e ("AMDGPU/MC: Fix emitting
absolute expressions (#136789)").

In some downstream work, we end up with an MCTargetExpr that is a
maximum (AGVK_Max) in an instruction operand. getMachineOpValueCommon
recognizes the absolute nature of the expression and doesn't emit a
fixup. getLitEncoding needs to be aligned with this decision, else we
end up with a 0 immediate without a corresponding fixup.

Note that evaluateAsAbsolute checks for MCConstantExpr as a fast path,
so this accepts strictly more cases than before.

I've tried several ways to write a test for this without success. The
challenge is that there is no upstream way to generate this kind of
expression in an instruction operand natively, and trying to create one
via inline assembly fails because the assembly parser evaluates the
expression to a constant during parsing.
2025-06-26 19:22:44 -07:00
Ryan Mansfield
54e2f5ac9c [clang][docs] Fix typo in -fapinotes-modules option. (#145907) 2025-06-26 19:16:35 -07:00
Alex Langford
13da48ddb3 [lldb][NFC] Remove unused ConstString includes in Utility (#145983) 2025-06-26 19:15:23 -07:00
Owen Pan
f2f17e563d [clang-format][NFC] Remove \brief from comments (#145853)
This was done before in https://reviews.llvm.org/D46320
2025-06-26 19:14:34 -07:00
Lang Hames
ad6b597875 [ORC] Fix EPCGenericMemoryAccessTest write-ptrs implementation after f93df5ebd9
The write-pointers operation should be writing a pointer, not a uint64_t. This
bug existed prior to f93df5ebd9, but changes in that commit seem to have
exposed the issue (see e.g.
https://lab.llvm.org/buildbot/#/builders/154/builds/17956).
2025-06-27 12:08:25 +10:00
Adam Glass
9a0a9764f3 [Clang][AArch64] _interlockedbittestand{set,reset}64_{acq,rel,nf} support for AArch64 (#145980)
Adds _interlockedbittestand{set,reset}64_{acq,rel,nf} support for
AArch64
2025-06-26 17:20:27 -07:00
jimingham
ec48d15b20 Fix a bug in the breakpoint ID verifier in CommandObjectBreakpoint. (#145994)
It was assuming that for any location M.N, N was always less than the
number of breakpoint locations. But if you rebuild the target and rerun
multiple times, when the section backing one of the locations is no
longer valid, we remove the location, but we don't reuse the ID. So you
can have a breakpoint that only has location 1.3. The num_locations
check would say that was an invalid location.
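
A hedged sketch of the failure mode, not the code in CommandObjectBreakpoint: comparing N against the number of locations breaks once locations have been removed without their IDs being reused. Both function names and the set-based representation are assumptions.

```python
from typing import Set


def is_valid_location_by_count(location_ids: Set[int], n: int) -> bool:
    """Count-based check of the kind the message describes as buggy."""
    return 1 <= n <= len(location_ids)


def is_valid_location_by_id(location_ids: Set[int], n: int) -> bool:
    """Check whether a location with ID `n` actually exists."""
    return n in location_ids


# Breakpoint 1 once had locations 1.1, 1.2 and 1.3; only 1.3 survives.
surviving = {3}
print(is_valid_location_by_count(surviving, 3))  # False
print(is_valid_location_by_id(surviving, 3))     # True
```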
2025-06-26 17:03:07 -07:00
Chelsea Cassanova
c3811c8474 [lldb][scripts] Use named args in versioning script (#145993)
Using named args means that you don't need to keep track of 5 positional
args.
2025-06-26 17:02:22 -07:00
Lang Hames
f93df5ebd9 [ORC] Add read operations to orc::MemoryAccess. (#145834)
This commit adds operations to orc::MemoryAccess for reading basic types
(uint8_t, uint16_t, uint32_t, uint64_t, pointers, buffers, and strings)
from executor memory.

The InProcessMemoryAccess and EPCGenericMemoryAccess implementations are
updated to support the new operations.
2025-06-27 09:49:17 +10:00
Jonas Devlieghere
76f3cc9e04 [lldb] Fix another race condition in Target::GetExecutableModule (#145991)
c72c0b298c fixed a race condition in Target::GetExecutableModule. The
patch originally added the lock_guard but I suggested using the locking
ModuleList::Modules() helper instead. That didn't consider that the
fallback would still access the ModuleList without holding the lock.
This patch fixes the remaining issue.
2025-06-26 16:44:19 -07:00
Haohai Wen
018548ddff [objcopy][coff] Place section name first in strtab (#145266)
The prioritized string table builder was introduced in 9cc9efc. This
patch gives the section name the highest priority so that it is placed
at the start of the string table, avoiding the issue described in 4d2eda2.
2025-06-27 07:39:04 +08:00
Gheorghe-Teodor Bercea
3df36a2b18 [AMDGPU] Enable vectorization of i8 values. (#134934)
This patch adjusts the cost model to account for the ability of the
AMDGPU optimizer to group together i8 values into i32 values.

Co-authored-by: Erich Keane <ekeane@nvidia.com>
2025-06-26 19:15:31 -04:00
Thurston Dang
9e4981cf11 [NFCI][msan] Add test for "disjoint" OR (#145982)
Disjoint OR is an extension to OR that was introduced in https://github.com/llvm/llvm-project/pull/72583. This patch adds a test case that shows MSan does not handle it correctly.
2025-06-26 15:58:05 -07:00