540368 Commits

Author SHA1 Message Date
Guillot Tony
b896d262eb [C23][N3006] Documented behavior of underspecified object declarations (#140911)
This PR documents Clang's behavior for underspecified object
declarations in C23, as advised by @AaronBallman.
2025-06-09 11:02:20 -04:00
Igor Wodiany
cc2d5facec [mlir][spirv] Make CooperativeMatrixType a ShapedType (#142784)
This is to enable `CooperativeMatrixType` to be used with
`DenseElementsAttr`, so that a `spirv.Constant` can be easily built from
`OpConstantComposite`. For example:

```mlir
%cst = spirv.Constant dense<0.000000e+00> : !spirv.coopmatrix<1x1xf32, Subgroup, MatrixAcc>
```

Constraints of arithmetic operations are changed, as
`SameOperandsAndResultType` can no longer fully verify CoopMatrices.
This is because for shaped types the verifier only checks element type
and shapes, whereas for any other arbitrary type it looks for an exact
match.

This patch does not enable the actual deserialization. This will be
done in a subsequent PR.
2025-06-09 16:01:48 +01:00
Shilei Tian
bc5d8276da [FIX] Update check lines of llvm/test/Transforms/OpenMP/remove_globalization.ll 2025-06-09 10:59:53 -04:00
Kajetan Puchalski
18f8e23815 [flang][OpenMP] Make static duration variables default to shared DSA (#142783)
According to the OpenMP standard, variables with static storage duration
are predetermined as shared.
Add a check when creating implicit symbols for OpenMP to fix them
erroneously getting set to firstprivate.

Fixes llvm#140732.

---------

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2025-06-09 15:52:24 +01:00
Ryosuke Niwa
de961997cb [WebKit checkers] Add an annotation for pointer conversion. (#141277)
This PR adds the WebKit checker support for
[[clang::annotate_type("webkit.pointerconversion")]].

When this attribute is set on the return value of a function, the
function is treated as safe to call anywhere and the return value's
pointer origin is the argument.
2025-06-09 07:33:15 -07:00
Ryosuke Niwa
e5fa38b02b [WebKit checkers] Treat passing of a member variable which is capable of CheckedPtr as safe. (#142485)
It's safe for a member function of a class or struct to call a function
or allocate a local variable with a pointer or a reference to a member
variable, since the "this" pointer, and therefore all its members, will be
kept alive by its caller. Recognize such uses as safe.
2025-06-09 07:32:15 -07:00
Shilei Tian
64bd4d91ef [FIX] Add nvptx as required target for some OpenMP tests
Those tests set nvptx64 in the IR but don't require the target. The optimization
now needs TTI, so if nvptx is not registered it just uses whatever the
default target is, which causes the check lines to mismatch.
2025-06-09 10:27:46 -04:00
Simon Pilgrim
ced1f501ce [X86] IsElementEquivalent - pull out exact matching for same index/op. (#143367)
The types must still be vectors matching MaskSize
2025-06-09 15:24:12 +01:00
Kazu Hirata
b3b8a097fe [mlir] Use *Map::try_emplace (NFC) (#143341)
- try_emplace(Key) is shorter than insert({Key, nullptr}).
- try_emplace performs value initialization without value parameters.
- We overwrite values on successful insertion anyway.
2025-06-09 07:18:26 -07:00
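The three bullets in the try_emplace entry above can be illustrated with a minimal sketch. It uses `std::map` as a stand-in for LLVM's own map types (an assumption; the commit touches `*Map` classes in MLIR), and the helper `getOrAssignId` is hypothetical:

```cpp
#include <cassert>
#include <map>
#include <string>

// Returns the id stored for Name, assigning the next free id on first use.
int getOrAssignId(std::map<std::string, int> &Ids, const std::string &Name) {
  // try_emplace(Key) value-initializes the mapped value (0 for int), so no
  // placeholder such as insert({Name, 0}) has to be spelled out.
  auto [It, Inserted] = Ids.try_emplace(Name);
  if (Inserted)
    It->second = static_cast<int>(Ids.size()); // overwrite the placeholder
  return It->second;
}
```

Calling `getOrAssignId` twice with the same name returns the same id, because the second call finds the existing entry and leaves it untouched.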
Philip Reames
939666380f [SDAG] Add partial_reduce_sumla node (#141267)
We have recently added the partial_reduce_smla and partial_reduce_umla
nodes to represent Acc += ext(a) * ext(b) where the two extends have to
have the same source type and the same extend kind.

For riscv64 w/zvqdotq, we have the vqdot and vqdotu instructions which
correspond to the existing nodes, but we also have vqdotsu which
represents the case where the two extends are sign and zero respectively
(i.e. not the same type of extend).

This patch adds a partial_reduce_sumla node which has sign extension for
A, and zero extension for B. The addition is somewhat mechanical.
2025-06-09 07:17:45 -07:00
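A scalar sketch of the arithmetic that the partial_reduce_sumla entry above describes, assuming 8-bit inputs and a single 32-bit accumulator. The function name `dotSignedUnsigned` is hypothetical, and the real node operates on vectors rather than on a scalar loop:

```cpp
#include <cstddef>
#include <cstdint>

// Scalar model of a mixed-sign multiply-accumulate:
//   Acc += sext(A[i]) * zext(B[i])
// A is sign-extended and B is zero-extended before the multiply.
int32_t dotSignedUnsigned(int32_t Acc, const int8_t *A, const uint8_t *B,
                          std::size_t N) {
  for (std::size_t I = 0; I < N; ++I) {
    int32_t SignExtended = static_cast<int32_t>(A[I]); // int8_t -> sext
    int32_t ZeroExtended = static_cast<int32_t>(B[I]); // uint8_t -> zext
    Acc += SignExtended * ZeroExtended;
  }
  return Acc;
}
```

With A = {-1, 2} and B = {255, 3}, the signed operand contributes -1 * 255 while the unsigned operand keeps 255 as a positive value, which is the distinction the same-kind smla/umla nodes cannot express.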
Shilei Tian
f32b75658f [Attributor] Use known non-flat AS before getAssumedAddrSpace (#143221)
If the underlying object already has a non-flat address space, we simply
use that before calling `getAssumedAddrSpace`.

Partially fixes SWDEV-536263.
2025-06-09 10:11:34 -04:00
Rahul Joshi
7b5ab28e38 [NFC][Clang] Adopt simplified getTrailingObjects in Stmt (#143250) 2025-06-09 06:52:21 -07:00
Orlando Cazalet-Hyams
cf5e2b613d [KeyInstr] MDNodeKeyImpl<DILocation> skip zero values for hash (#143357)
Hashing AtomGroup and AtomRank substantially impacts performance whether Key
Instructions is enabled or not. We can't detect whether it's enabled here
cheaply; avoiding hashing zero values is a good approximation. This affects
Key Instruction builds too, but any potential costs incurred by messing with
the hash distribution (hash_combine(x) != hash_combine(x, 0)) appear to still
be massively outweighed by the overall compile time savings by performing this
check.

See PR for compile-time-tracker numbers.
2025-06-09 14:50:33 +01:00
Rahul Joshi
e4060d3bea [Clang][NFC] Adopt simplified getTrailingObjects in CIRGenFunctionInfo (#143253) 2025-06-09 06:25:08 -07:00
Tomer Shafir
4c555051d7 [AArch64][Subtarget] add missing direct include of Triple.h (#143362)
`AArch64Subtarget.h` uses the complete type of `Triple` but only
forward declared the class; the definition happened to be included
through the following bottom-up path, for example:
"llvm/TargetParser/Triple.h"
"llvm/MC/MCSubtargetInfo.h"
"llvm/CodeGen/TargetSubtargetInfo.h"
2025-06-09 14:16:45 +01:00
Simon Pilgrim
8b8cbe905b [X86] avx512-scalar_mask.ll - remove old FIXME comments
These were addressed by 034adf2683
2025-06-09 14:07:15 +01:00
Un1q32
251a43e193 [Clang] Link libgcc_s.1.dylib when building for macOS 10.5 and older (#141401) 2025-06-09 20:56:08 +08:00
jyli0116
7e00a7c021 [GlobalISel] Fixes unused variable error in testMOPredicate_MO (#143364)
Solves unused variable error in generated Global ISel code due to
changes from #140935
2025-06-09 13:37:38 +01:00
Jay Foad
c400fe24ae [AMDGPU] Update failing test after #129897 2025-06-09 12:33:33 +01:00
Jay Foad
6b25f4439c [AMDGPU] Detect trivially uniform arguments in InstCombine (#129897)
Update one test to use an SGPR argument as the simplest way of getting a
uniform value.
2025-06-09 12:06:03 +01:00
Jay Foad
592e59667a [TableGen] Move getSuperRegForSubReg into CodeGenRegBank. NFC. (#142979)
This method doesn't use anything from CodeGenTarget, so it seems to
belong in CodeGenRegBank.
2025-06-09 12:03:29 +01:00
Simon Pilgrim
66911b7546 [X86] Fold (add X, (srl Y, 7)) -> (sub X, (icmp_sgt 0, Y)) on vXi8 vectors (#143359)
Undo the vectorcombine canonicalisation as SSE has awful vXi8 shift
support, but can easily splat the MSB using the PCMPGTB(0,x) trick.

Alternative to #143106 which could cause infinite loops between srl/sra
conversions

Fixes #130549
2025-06-09 11:52:07 +01:00
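The fold in the entry above relies on a true vector compare producing all-ones (0xFF, i.e. -1) per i8 lane, so subtracting the compare result adds 1 exactly when the MSB of Y is set. A minimal scalar sketch of one lane, with hypothetical function names, that checks the identity exhaustively:

```cpp
#include <cstdint>

// Left-hand side on one i8 lane: add X, (srl Y, 7).
uint8_t addSrl7(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>(X + (Y >> 7));
}

// Right-hand side on one i8 lane: sub X, (icmp_sgt 0, Y).
// A true lane of a vector compare is all-ones (0xFF), a false lane is zero.
uint8_t subIcmpSgt(uint8_t X, uint8_t Y) {
  uint8_t Mask = (0 > static_cast<int8_t>(Y)) ? 0xFF : 0x00;
  return static_cast<uint8_t>(X - Mask);
}

// Compares both forms for every pair of 8-bit inputs.
bool foldHoldsForAllInputs() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      if (addSrl7(static_cast<uint8_t>(X), static_cast<uint8_t>(Y)) !=
          subIcmpSgt(static_cast<uint8_t>(X), static_cast<uint8_t>(Y)))
        return false;
  return true;
}
```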
David Sherwood
891a2c3c34 [AArch64] Change IssueWidth to 6 in AArch64SchedNeoverseV2.td (#142565)
I think that the issue width for neoverse-v2 CPUs is set too
high and does not properly reflect the dispatch constraints.
I tested various values of IssueWidth (16, 8 and 6) with runs
of SPEC2017 on a neoverse-v2 machine and I got the highest
overall geomean score with an issue width of 6, although it's
only a marginal 0.14% improvement. I also observed a 1-2%
improvement when testing the Gromacs application with some
workloads. Here are some notable changes in SPEC2017 ref
runtimes, i.e. has a ~0.5% change or greater ('-' means
faster):

548.exchange2: -1.7%
510.parest: -0.78%
538.imagick: -0.73%
500.perlbench: -0.57%
525.x264: -0.55%
507.cactuBSSN: -0.5%
520.omnetpp: -0.48%
511.povray: +0.57%
544.nab: +0.65%
503.bwaves: +0.68%
526.blender: +0.75%

If this patch causes any major regressions post-commit it can
be easily reverted, but I think it should be an overall
improvement.
2025-06-09 11:36:00 +01:00
Alex Guteniev
8631cddd69 libc++ test: update MinSequenceContainer.h to make some tests pass on MSVC STL (#140287)
Per [sequence.reqmts] there are these member functions.

I did not audit whether any other member functions are missing. Adding
these is enough for MSVC STL.
2025-06-09 18:35:27 +08:00
Tom Eccles
ce603a0f16 [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (#140066)
This adds another puzzle piece for the support of OpenMP DECLARE
REDUCTION functionality.

This adds support for operators with derived types, as well as declaring
multiple different types with the same name or operator.

A new detail class, UserReductionDetails, is introduced to hold the
list of types supported for a given reduction declaration.

Tests for parsing and symbol generation are added.

Declare reduction is still not supported in lowering; it will generate a
"Not yet implemented" fatal error.

Fixes #141306
Fixes #97241
Fixes #92832
Fixes #66453

---------

Co-authored-by: Mats Petersson <mats.petersson@arm.com>
2025-06-09 11:17:03 +01:00
David Spickett
e4447e1273 [lldb][test] Remove Windows xfail from forward declaration tests
Since https://github.com/llvm/llvm-project/pull/141344, they are
passing.
2025-06-09 10:14:54 +00:00
Iris Shi
bfb48363b0 [SelectionDAG] Make (a & x) | (~a & y) -> (a & (x ^ y)) ^ y available for all targets (#137641) 2025-06-09 17:57:15 +08:00
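The rewrite named in the title above is a bitwise select: for each bit, the result is x where a is 1 and y where a is 0. A minimal sketch on 8-bit values, with hypothetical function names, that checks both forms agree for every input:

```cpp
#include <cstdint>

// Original form: (a & x) | (~a & y).
uint8_t selectOrForm(uint8_t A, uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>((A & X) | (~A & Y));
}

// Rewritten form: (a & (x ^ y)) ^ y. Where a bit of A is 1 this yields
// (x ^ y) ^ y == x; where it is 0 this yields 0 ^ y == y.
uint8_t selectXorForm(uint8_t A, uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>((A & (X ^ Y)) ^ Y);
}

// Compares both forms for every triple of 8-bit inputs.
bool formsAgreeOnAllBytes() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned X = 0; X < 256; ++X)
      for (unsigned Y = 0; Y < 256; ++Y)
        if (selectOrForm(static_cast<uint8_t>(A), static_cast<uint8_t>(X),
                         static_cast<uint8_t>(Y)) !=
            selectXorForm(static_cast<uint8_t>(A), static_cast<uint8_t>(X),
                          static_cast<uint8_t>(Y)))
          return false;
  return true;
}
```

The rewritten form needs no inverted copy of a, which is presumably why it is attractive on targets without an and-not instruction; that motivation is an inference, not something the commit title states.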
Charles Zablit
b62488f832 [lldb] make lit use the same PYTHONHOME for building and testing (#143183)
When testing LLDB, we want to make sure to use the same Python as the
one we used to build it.

This patch uses the CMake variable `Python3_ROOT_DIR` to set the
`PYTHONHOME` env variable in LLDB lit tests, in order to ensure this.

Please see https://github.com/swiftlang/swift/pull/82063 for the
original issue.
2025-06-09 10:20:39 +01:00
David Spickett
c738308784 [lldb][test] Remove outdated rdar link in settings test 2025-06-09 08:59:22 +00:00
Kazu Hirata
df4b453516 [lldb] Use llvm::find (NFC) (#143338)
This patch should be mostly obvious, but in one place, this patch
changes:

  const auto &it = std::find(...)

to:

  auto it = llvm::find(...)

We do not need to bind to a temporary with const ref.
2025-06-09 09:56:27 +01:00
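A small sketch of the pattern in the llvm::find entry above. The helper `findIn` is a hypothetical stand-in for a range-based find such as `llvm::find`, written against the standard library only so that the snippet is self-contained:

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a range-based find helper: forwards the whole
// range to std::find instead of taking a begin/end pair.
template <typename Range, typename T>
auto findIn(Range &&R, const T &Value) {
  return std::find(std::begin(R), std::end(R), Value);
}

bool contains(const std::vector<int> &V, int X) {
  // The iterator is returned by value, so plain `auto` is enough. Writing
  // `const auto &It = ...` would also compile (lifetime extension of the
  // temporary), but the reference adds nothing.
  auto It = findIn(V, X);
  return It != V.end();
}
```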
nerix
c6670fa20d [LLDB] Unify DWARF section name matching (#141344)
Different object file formats support DWARF sections (COFF, ELF, MachO,
PE/COFF, WASM). COFF and PE/COFF only matched a subset. This caused some
GCC executables produced on MinGW to have issues later on when debugging.
One example is that `.debug_rnglists` was not matched, which caused
range-extraction to fail when printing a backtrace.

This unifies the parsing of section names in
`ObjectFile::GetDWARFSectionTypeFromName`, so all file formats can use
the same naming convention. Since the prefixes are different,
`GetDWARFSectionTypeFromName` only matches the suffixes (i.e. `.debug_`
needs to be stripped before).

I added two tests to ensure the sections are correctly identified on
Windows executables.
2025-06-09 09:46:50 +01:00
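The prefix-stripping scheme described in the entry above can be sketched as follows. This is not the LLDB implementation: the enum `DwarfSection` and both function names are hypothetical, and only a few section names are listed for illustration:

```cpp
#include <optional>
#include <string_view>

// Hypothetical subset of DWARF section kinds.
enum class DwarfSection { Info, Line, RngLists, Str };

// Matches only the suffix, i.e. the name with the format-specific prefix
// already removed.
std::optional<DwarfSection> sectionFromSuffix(std::string_view Suffix) {
  if (Suffix == "info")
    return DwarfSection::Info;
  if (Suffix == "line")
    return DwarfSection::Line;
  if (Suffix == "rnglists")
    return DwarfSection::RngLists;
  if (Suffix == "str")
    return DwarfSection::Str;
  return std::nullopt;
}

// Each object file format strips its own prefix, then shares the matcher.
std::optional<DwarfSection> sectionFromName(std::string_view Name,
                                            std::string_view Prefix) {
  if (Name.substr(0, Prefix.size()) != Prefix)
    return std::nullopt;
  return sectionFromSuffix(Name.substr(Prefix.size()));
}
```

An ELF or COFF reader would pass `.debug_` as the prefix, while a Mach-O reader would pass `__debug_`; both then reach the same suffix table, so a name such as `.debug_rnglists` cannot be missed by one format only.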
Corentin Jabot
62c9b0cab6 [Clang] Fix a crash when diagnosing a non relocatable with no copy ctr (#143350)
If lookup did not find a copy constructor, it would return null,
leading to an assert in `cast`.

Fixes #143325
2025-06-09 10:29:05 +02:00
jyli0116
f3ffee601c [GISel][AArch64] Allow PatLeafs to be imported in GISel which were previously causing warnings (#140935)
Previously PatLeafs could not be imported, causing the following
warnings to be emitted when running tblgen with
`-warn-on-skipped-patterns`:
```
/work/clean/llvm/lib/Target/AArch64/AArch64InstrInfo.td:2631:1: warning: Skipped pattern: Src pattern child has unsupported predicate
def : Pat<(i64 (mul top32Zero:$Rn, top32Zero:$Rm)),
^
```
These changes allow the patterns to now be imported successfully.
2025-06-09 09:02:56 +01:00
Yingwei Zheng
2f15637e04 [ValueTracking] Update Ordered when both operands are non-NaN. (#143349)
When the original predicate is ordered and both operands are non-NaN,
`Ordered` should be set to true. This variable still matters even if
both operands are non-NaN because FMF only applies to select, not fcmp.

Closes https://github.com/llvm/llvm-project/issues/143123.
2025-06-09 15:46:09 +08:00
Ricardo Jesus
5d3899d293 [AArch64][SVE] Mark AES instructions commutable. (#142919)
We already do this for the Neon versions of the instructions,
just not for SVE.
2025-06-09 08:29:36 +01:00
Ricardo Jesus
c70c0a86a5 [AArch64][InstCombine] Combine AES instructions with zero operands. (#142781)
We currently combine (AES (EOR (A, B)), 0) into (AES A, B) for Neon
intrinsics when the zero operand appears in the RHS of the AES
instruction.

This patch extends the combine to support AES SVE intrinsics and
the case where the zero operand appears in the LHS of the AES
instructions.
2025-06-09 08:27:58 +01:00
David Green
acd264d0ac [AArch64][GlobalISel] Prefer DUPLANE to REV and other shuffles (#142725)
Some shuffles containing undefs can match multiple instructions, such as
<3,u,u,u> being either a duplane or a rev. This changes the order that
different shuffles are considered, so that duplane is preferred which is
simpler and more likely to lead to further combines.
2025-06-09 07:47:17 +01:00
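A simplified model of the ambiguity described in the entry above, assuming undef mask elements are encoded as -1 and treating "rev" as a full reversal of the vector (the real AArch64 REV instructions reverse within fixed-size containers). All names here are hypothetical:

```cpp
#include <vector>

constexpr int Undef = -1;

enum class ShuffleKind { DupLane, Rev, Other };

// True if every defined mask element selects lane `Lane`.
bool isDupLaneMask(const std::vector<int> &Mask, int Lane) {
  for (int M : Mask)
    if (M != Undef && M != Lane)
      return false;
  return true;
}

// True if every defined element I selects lane N-1-I (full reversal).
bool isReverseMask(const std::vector<int> &Mask) {
  const int N = static_cast<int>(Mask.size());
  for (int I = 0; I < N; ++I)
    if (Mask[I] != Undef && Mask[I] != N - 1 - I)
      return false;
  return true;
}

// Tries the simpler duplane match first, so a mask that fits both patterns
// is classified as DupLane.
ShuffleKind classify(const std::vector<int> &Mask) {
  const int N = static_cast<int>(Mask.size());
  for (int Lane = 0; Lane < N; ++Lane)
    if (isDupLaneMask(Mask, Lane))
      return ShuffleKind::DupLane;
  if (isReverseMask(Mask))
    return ShuffleKind::Rev;
  return ShuffleKind::Other;
}
```

The mask <3,u,u,u> satisfies both predicates, so the result depends purely on the order in which `classify` tries them.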
Kazu Hirata
03f616eb3a [llvm] Compare std::optional<T> to values directly (NFC) (#143340)
This patch transforms:

  X && *X == Y

to:

  X == Y

where X is of std::optional<T>, and Y is of T or similar.
2025-06-08 22:37:59 -07:00
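The transformation in the entry above works because `std::optional<T>` compares directly against a value: an empty optional is never equal to a value, and an engaged one compares its contents. A minimal sketch with hypothetical function names:

```cpp
#include <optional>

// Verbose form: check for a value, then compare the contents.
bool matchesVerbose(const std::optional<int> &X, int Y) {
  return X && *X == Y;
}

// Direct form: optional's operator== handles the empty case itself.
bool matchesDirect(const std::optional<int> &X, int Y) {
  return X == Y;
}
```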
Kazu Hirata
3dabeed837 [ADT] Simplify popcount with constexpr if (NFC) (#143339)
Without this patch, we implement 4-byte and 8-byte popcount as
structs.

This patch replaces the template trick with constexpr if, putting the
entire logic in the body of popcount.
2025-06-08 22:37:51 -07:00
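A sketch of what the entry above describes: one function body that selects the 4-byte or 8-byte path with `if constexpr`, instead of dispatching through helper structs. This is not the LLVM code; the name `popcountSketch` is hypothetical and the portable bit-twiddling bodies are an assumption (a real implementation may use compiler builtins):

```cpp
#include <cstdint>
#include <type_traits>

// Counts set bits. The branch is chosen at compile time from sizeof(T), so
// only one of the two bodies is instantiated for a given type.
template <typename T> constexpr int popcountSketch(T Value) {
  static_assert(std::is_unsigned_v<T>, "T must be an unsigned integer type");
  static_assert(sizeof(T) <= 8, "T must be at most 8 bytes wide");
  if constexpr (sizeof(T) <= 4) {
    uint32_t V = Value;
    V = V - ((V >> 1) & 0x55555555u);                 // 2-bit partial sums
    V = (V & 0x33333333u) + ((V >> 2) & 0x33333333u); // 4-bit partial sums
    V = (V + (V >> 4)) & 0x0F0F0F0Fu;                 // 8-bit partial sums
    return static_cast<int>((V * 0x01010101u) >> 24); // add the four bytes
  } else {
    uint64_t V = Value;
    V = V - ((V >> 1) & 0x5555555555555555ULL);
    V = (V & 0x3333333333333333ULL) + ((V >> 2) & 0x3333333333333333ULL);
    V = (V + (V >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return static_cast<int>((V * 0x0101010101010101ULL) >> 56);
  }
}
```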
Kazu Hirata
e47abec513 [DirectoryWatcher] Use llvm::find (NFC) (#143337) 2025-06-08 22:37:44 -07:00
Abhishek Kaushik
d6ecd6a658 [SelectionDAG][X86] Handle llvm.type.test in DAGBuilder (#142939)
Closes #142937
2025-06-09 10:44:55 +05:30
Nathan Ridge
392bd577e3 [clangd] Guard against trivial FunctionProtoTypeLoc when creating inlay hints (#143087)
Fixes https://github.com/llvm/llvm-project/issues/142608
2025-06-09 00:33:20 -04:00
Sudharsan Veeravalli
e27876ad2f [RISCV] Add compress patterns for Xqcibi branch instructions (#143095)
This patch adds patterns to compress from the 48-bit qc.e.bxxi to the
32-bit qc.bxxi branch instructions.
2025-06-09 09:29:02 +05:30
Ami-zhang
0ed5d9aff6 [LoongArch][BF16] Add support for the __bf16 type (#142548)
The LoongArch psABI recently added __bf16 type support. Now we can
enable this new type in clang.

Currently, bf16 operations are automatically supported by promoting to
float. This patch adds bf16 support by ensuring that load extension /
truncate store operations are properly expanded.

This commit also implements support for bf16 truncate/extend on hard FP
targets. The extend operation is implemented by a shift just as in the
standard legalization. This requires custom lowering of the truncate
libcall on hard float ABIs (the normal libcall code path is used on soft
ABIs).
2025-06-09 11:15:41 +08:00
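The "extend is implemented by a shift" remark in the entry above follows from the bf16 layout, which is the upper 16 bits of an IEEE-754 binary32 value. A minimal sketch at the bit level, with a hypothetical function name; it models only the extend direction and says nothing about how the truncate libcall rounds:

```cpp
#include <cstdint>
#include <cstring>

// Extends a bf16 bit pattern to float by placing it in the upper 16 bits of
// a 32-bit word and zero-filling the lower 16 mantissa bits.
float bf16BitsToFloat(uint16_t Bits) {
  uint32_t Wide = static_cast<uint32_t>(Bits) << 16;
  float Result;
  std::memcpy(&Result, &Wide, sizeof(Result)); // bit cast without UB
  return Result;
}
```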
tangaac
90beda2aba [LoongArch] Lower vector_shuffle as lane permute and shuffle for lasx if possible. (#141196) 2025-06-09 09:23:53 +08:00
Amir Ayupov
03bbd04bb7 [BOLT][NFCI] Skip validation in parseLBRSample (#143288)
Parsed branches and fall-throughs are validated in `doBranch` and
`doTrace` respectively. Simplify parseLBRSample by omitting the
validation. This also speeds up perf data processing as checks are only
done once for aggregated branches/fall-throughs and not individual LBR
entries.

Since invalid/external addresses are no longer sanitized during parsing,
sanitize them in `doBranch`.

Test Plan: updated X86/pre-aggregated-perf.test
2025-06-08 17:50:02 -07:00
Amir Ayupov
dcd2ac7ef2 [BOLT] Sort EntryData (#143308)
Aggregated branch data has two containers: `Data` for local branches,
and `EntryData` for external branches. Fix the omission and sort
`EntryData` to ensure stable output fdata profiles.

Test Plan: updated pre-aggregated-perf.test
2025-06-08 17:43:44 -07:00
Amir Ayupov
c480dcddd9 [BOLT][NFC] Move LBREntry from DataReader to DataAggregator (#143287)
LBREntry is only used in DataAggregator.

Test Plan: NFC
2025-06-08 17:41:46 -07:00
Phoebe Wang
4fbf67f73b [X86][FP16] Do not generate X86 FMIN/FMAX for FP16 when VLX not enabled (#143100)
Fixes: https://godbolt.org/z/7jYa3bWK9
2025-06-09 08:35:56 +08:00
Kazu Hirata
f3867f900f [llvm] Use *Map::try_emplace (NFC) (#143321)
- try_emplace(Key) is shorter than insert(std::make_pair(Key, 0)).
- try_emplace performs value initialization without value parameters.
- We overwrite values on successful insertion anyway.
2025-06-08 16:18:46 -07:00