Commit Graph

538236 Commits

Author SHA1 Message Date
Fraser Cormack
0bc7f41db8 [libclc] Move all remquo address spaces to CLC library (#140871)
Previously the OpenCL address space overloads of remquo would call into
the one and only 'private' CLC remquo. This was an outlier compared with
the other pointer-argumented maths builtins.

This commit moves the definitions of all address space overloads to the
CLC library to give more control over each address space to CLC
implementers.

There are some minor changes to the generated bytecode but it's simply
moving IR instructions around.
2025-05-21 11:26:04 +01:00
Benjamin Maxwell
7a8090c037 [AArch64] Remove unused ISD nodes (NFC) (#140706)
Part of #140472.
2025-05-21 10:55:07 +01:00
Tom Eccles
f054aa240f [flang][OpenMP] fix diagnostic for bad cancel type (#140798)
Fixes #133685
2025-05-21 10:48:13 +01:00
Fraser Cormack
80913b44a4 [libclc][NFC] Reuse inc file for OpenCL frexp decl 2025-05-21 10:19:31 +01:00
Benjamin Maxwell
1a08aa2680 [AArch64] Split AArch64ISD::COND_SMSTART/STOP off AArch64::SMSTART/STOP (NFC) (#140711)
The conditional variants of SMSTART/STOP currently take the current
PStateSM as a variadic value. This is not supported by the verification
added in #140472 (which requires variadic values to be of type Register
or RegisterMask), so this patch splits the conditional variants into
new `COND_` nodes, where these extra parameters are fixed arguments.

Suggested in
https://github.com/llvm/llvm-project/pull/140472#discussion_r2094635066

Part of #140472.
2025-05-21 10:05:57 +01:00
Luke Hutton
22a493089a [mlir][tosa] Allow creation of reshape with unranked output (#140617)
This commit allows reshape to be created with an unranked output,
allowing it to be inferred by the shape inference pass.
2025-05-21 09:59:36 +01:00
Hongren Zheng
1b69f7775b [MLIR][Doc] Add documentation for OpAsmAttr/TypeInterface (#140244)
After the introduction of OpAsmAttr/TypeInterface in #121187 #124721,
the documentation for them could be updated alongside the doc for
OpAsmDialectInterface.

#127993 changed the trailing digit behavior for alias name generation.
2025-05-21 16:58:45 +08:00
Fraser Cormack
6553dc30b8 [NVPTX] Support the OpenCL generic addrspace feature by default (#137940)
As best as I can see, all NVPTX architectures support the generic
address space.

I note there's a FIXME in the target's address space map about 'generic'
still having to be added to the target but we haven't observed any
issues with it downstream. The generic address space is mapped to the
same target address space as default/private (0), but this isn't
necessarily a problem for users.
2025-05-21 09:55:11 +01:00
Wenju He
e70568e28e [libclc] Re-use shuffle_decl.inc in OpenCL shuffle2 declaration (#140679)
Also internalize __clc_get_el_* symbols in clc_shuffle2. llvm-diff shows
no change to amdgcn--amdhsa.bc.
2025-05-21 09:49:24 +01:00
Adrian Kuegel
67489c7f0f [mlir] Silence an unused variable warning in builds without asserts. 2025-05-21 08:47:42 +00:00
Peter Smith
5b5c57d322 Add llvm-project archive issues for Chromium bug tracker (#132030)
The Chromium bug tracker is in an archived state. The Security Response
Group has preemptively created llvm-project GitHub issues with PDF
copies of the Chromium issues should the repository become inaccessible.

* Add URLs for redirects from
https://bugs.chromium.org/p/llvm/issues/detail?id=X to
https://issuetracker.google.com/issues/y
* Add URLs to llvm-project archive issues.
* Add an explanation of archive use.
2025-05-21 09:46:56 +01:00
Baranov Victor
64dcf78935 [clang-tidy][NFC] Refactor modernize-pass-by-value check code and tests (#140753)
- Deleted unused includes
- Deleted useless braces
- Modernized tests to use `CHECK-MESSAGES-NOT` and `CHECK-FIXES-NOT` for
better readability and maintainability
2025-05-21 10:34:45 +02:00
Simon Pilgrim
dec214d5c6 [X86] combineINSERT_SUBVECTOR - generalise insert_subvector(x,extract(broadcast)) -> blend (#140516)
Don't match against specific broadcast nodes and let isShuffleEquivalent handle it
2025-05-21 09:28:52 +01:00
Luke Lau
d827588c36 [VectorCombine] Scalarize binop-like intrinsics (#138095)
Currently VectorCombine can scalarize vector compares and binary ops.
This extends it to also scalarize binop-like intrinsics such as umax,
minnum etc.

The motivation behind this is to scalarize more intrinsics in
VectorCombine rather than in DAGCombine, so we can sink splats across
basic blocks: see #137786

This currently has very little effect on generated code because
InstCombine doesn't yet canonicalize binary intrinsics where one operand
is a constant into the form that VectorCombine expects, i.e. `binop
(shuffle insert) const --> shuffle (binop insert const)`. The plan is to
land this first and then in a subsequent patch teach InstCombine to do
the canonicalization to avoid regressions in the meantime.

This uses `isTriviallyVectorizable` to determine whether or not an
intrinsic is safe to scalarize. There's also `isTriviallyScalarizable`,
but this seems more geared towards the Scalarizer pass and includes
intrinsics with multiple return values.

It also only handles intrinsics with two operands with the same type as
the return type. In the future we would generalize this to handle
arbitrary numbers of operands, including unary operators too, e.g. fneg
or fma, as well as different operand types, e.g. powi or scmp.
2025-05-21 09:24:11 +01:00
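The rewrite `binop (shuffle insert) const --> shuffle (binop insert const)` relies on a lane-wise operation over two splats being the same as a splat of the scalar operation. The sketch below models that identity on plain arrays rather than LLVM IR; `Vec4`, `splat`, `umaxVec`, `vectorForm` and `scalarizedForm` are hypothetical names used only for illustration and are not part of VectorCombine.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Assumption: a fixed 4-lane vector of unsigned values stands in for an
// LLVM vector type.
using Vec4 = std::array<unsigned, 4>;

// Broadcast one scalar to every lane.
Vec4 splat(unsigned X) {
  Vec4 V;
  V.fill(X);
  return V;
}

// Lane-wise unsigned max, standing in for a vector umax intrinsic.
Vec4 umaxVec(const Vec4 &A, const Vec4 &B) {
  Vec4 R{};
  for (std::size_t I = 0; I != R.size(); ++I)
    R[I] = std::max(A[I], B[I]);
  return R;
}

// Vector form: splat both operands, then apply the vector operation.
Vec4 vectorForm(unsigned X, unsigned C) { return umaxVec(splat(X), splat(C)); }

// Scalarized form: apply the scalar operation once, then splat the result.
Vec4 scalarizedForm(unsigned X, unsigned C) { return splat(std::max(X, C)); }
```

Both forms produce the same lanes for any `X` and `C`; the scalarized form performs a single scalar operation, which is what makes it possible to sink the splat afterwards.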
cor3ntin
f4cebe5d73 [Clang] Fix a regression introduced by #140576 (#140859)
Lambda bodies should not be treated as subexpressions of the enclosing
scope.
2025-05-21 10:18:14 +02:00
Kazu Hirata
9f4cea209e [llvm] Fix typos in documentation (#140844) 2025-05-21 01:11:08 -07:00
Kazu Hirata
fe6290ef5b [llvm] Use *Map::try_emplace (NFC) (#140843)
try_emplace can default-construct values, so we do not need to do so
on our own.  Plus, try_emplace(Key) is much shorter than
insert(std::make_pair(Key, Value())).
2025-05-21 01:11:01 -07:00
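A minimal sketch of the pattern, assuming `std::unordered_map` in place of LLVM's own map types; the helper `indexWords` and its data are hypothetical and do not come from the patch.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: record, for each word, the positions at which it
// occurs. try_emplace(Key) default-constructs the mapped value only when
// Key is absent, so no explicit Value() has to be spelled out.
std::unordered_map<std::string, std::vector<int>>
indexWords(const std::vector<std::string> &Words) {
  std::unordered_map<std::string, std::vector<int>> Index;
  for (int I = 0, E = static_cast<int>(Words.size()); I != E; ++I) {
    // Longer equivalent:
    //   Index.insert(std::make_pair(Words[I], std::vector<int>()))
    auto [It, Inserted] = Index.try_emplace(Words[I]);
    (void)Inserted; // true only the first time a word is seen
    It->second.push_back(I);
  }
  return Index;
}
```

Calling `indexWords({"a", "b", "a"})` yields two entries, with the positions of `"a"` collected under a single key.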
Adrian Kuegel
ad05543428 [Bazel] Also adapt test/BUILD for 9a553d3766 2025-05-21 07:51:38 +00:00
Adrian Kuegel
462cede350 [Bazel] Follow fixes for 9a553d3766 2025-05-21 07:41:19 +00:00
Kazu Hirata
5bd3c44b79 [clang-tools-extra] Remove redundant control flow statements (NFC) (#140846) 2025-05-21 00:29:17 -07:00
Hsiangkai Wang
9f1da90d6f [mlir][SPIRV] Do not rewrite CompositeInsert for coopmatrix (#137837)
When rewriting multiple CompositeInserts to CompositeConstruct, we need
to know the number of elements of the result type. However, we cannot
query the number of elements for cooperative matrix types.
2025-05-21 08:12:24 +01:00
Timm Baeder
bc546cabe5 [clang][NFC] Clean up ASTContext.cpp (#140847)
Use BuiltinType::{isInteger,isSignedInteger,isUnsignedInteger} instead
of doing the comparisons here.
2025-05-21 09:09:41 +02:00
Adrian Kuegel
49a9bbba67 [Bazel] Port a9ee8e4a45 2025-05-21 07:05:48 +00:00
Jonas Rickert
a21986b152 [MLIR][PDL] Skip over all results in the PDL Bytecode if a Constraint/Rewrite failed (#139255)
Skipping only over the first results leads to the curCodeIt pointing to
the wrong location in the bytecode, causing the execution to continue
with a wrong instruction after the Constraint/Rewrite.

Signed-off-by: Rickert, Jonas <Jonas.Rickert@amd.com>
2025-05-20 23:37:27 -07:00
tangaac
f87bcf1b49 [LoongArch] Add patterns for vstelm instructions (#139201) 2025-05-21 14:21:12 +08:00
Balázs Benics
d50c85df25 [analyzer][NFC] Move PrettyStackTraceLocationContext into dispatchWorkItem (#140035)
This change helps with ensuring that the abstract machine call stack is
only dumped exactly once no matter what checker callback we have the
crash in.

Note that `check::EndAnalysis` callbacks are resolved outside of
`dispatchWorkItem`, but that's the only checker callback that is outside
of `dispatchWorkItem`.

CPP-6476
2025-05-21 08:10:35 +02:00
Ely Ronnen
2ed8c27d03 add @skipIfWindows to unresolved disassemble test on windows (#140852)
Fix https://lab.llvm.org/buildbot/#/builders/141/builds/8867
2025-05-21 08:10:22 +02:00
Timm Baeder
645846d43b [clang][bytecode] Initialize global strings via memcpy (#140789)
If we know the char width is 1, we can just copy
the data over instead of going through the Pointer API.
2025-05-21 07:56:15 +02:00
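The idea can be sketched as follows, assuming a plain `std::vector` as a stand-in for the interpreter's global storage. `initGlobalString` is a hypothetical name, and the element-by-element loop only approximates what going through the Pointer API costs.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for initializing a global from a string literal.
// The returned buffer holds the characters plus a null terminator.
template <typename CharT>
std::vector<CharT> initGlobalString(const std::basic_string<CharT> &Literal) {
  // Value-initialized, so the trailing terminator is already zero.
  std::vector<CharT> Storage(Literal.size() + 1);
  if constexpr (sizeof(CharT) == 1) {
    // Char width is 1: copy the whole literal in one bulk operation.
    std::memcpy(Storage.data(), Literal.data(), Literal.size());
  } else {
    // Wider characters: fall back to initializing element by element.
    for (std::size_t I = 0; I != Literal.size(); ++I)
      Storage[I] = Literal[I];
  }
  return Storage;
}
```

Both paths produce the same contents; only the single-byte case takes the bulk copy.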
Timm Baeder
3c8a6bc3b7 [clang][bytecode] Optimize classify() further (#140735)
Try to do as few checks as possible. Check for builtin types only once,
then look at the BuiltinType Kind. For integers, we cache the int and
long size, since those are used a lot and the ASTContext::getIntWidth()
call is costly.
2025-05-21 07:35:00 +02:00
Ely Ronnen
9d0614e77a [lldb-dap] fix disassembly request instruction offset handling (#140486)
Fix the handling of the `instructionOffset` parameter, which resulted in
always returning the wrong disassembly because VSCode always uses
`instructionOffset = -50` and expects 50 instructions before the given
address, instead of 50 bytes before.
2025-05-21 06:50:13 +02:00
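The difference between the two interpretations can be sketched as below. This is not lldb-dap's code: `startIndex` is a hypothetical helper that assumes the start addresses of the decoded instructions are already available in sorted order.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: given the sorted start addresses of decoded
// instructions, return the index at which disassembly should begin for a
// request at Address with the given instructionOffset. The offset counts
// instructions, not bytes, and the result is clamped to the valid range.
std::size_t startIndex(const std::vector<std::uint64_t> &InstrAddrs,
                       std::uint64_t Address, long InstructionOffset) {
  // Index of the first instruction at or after Address.
  auto It = std::lower_bound(InstrAddrs.begin(), InstrAddrs.end(), Address);
  long Base = static_cast<long>(It - InstrAddrs.begin());
  long Last = static_cast<long>(InstrAddrs.size());
  return static_cast<std::size_t>(
      std::clamp(Base + InstructionOffset, 0L, Last));
}
```

With variable-length instructions, stepping back by a byte count can land in the middle of an instruction, whereas stepping back by index always lands on an instruction boundary.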
Kazu Hirata
1fd24369c3 [CodeGen] Use range-based for loops (NFC) (#140840) 2025-05-20 21:39:59 -07:00
Kazu Hirata
557448e144 [DebugInfo] Use std::map::try_emplace (NFC) (#140839)
This patch provides default member initialization for SymInfo, which
in turn allows us to call std::map::try_emplace without the value.
2025-05-20 21:39:34 -07:00
Kazu Hirata
51b288d9cf [CodeGen] Avoid repeated hash lookups (NFC) (#140838) 2025-05-20 21:39:17 -07:00
Kazu Hirata
adbe155eee [polly] Drop const from return types (NFC) (#140837) 2025-05-20 21:38:49 -07:00
Kazu Hirata
a4782ff903 [llvm-mca] Drop const from a return type (NFC) (#140836) 2025-05-20 21:38:34 -07:00
Javier Lopez-Gomez
211ee04a61 [llvm-debuginfo-analyzer] Fix a couple of unhandled DWARF situations leading to a crash (#137221)
This pull request fixes a couple of unhandled situations in DWARF input
leading to a crash. Specifically,

- If the DWARF input contains a declaration of a C variadic function
(where `...` translates to `DW_TAG_unspecified_parameters`), which is
then followed by a definition, `llvm_unreachable()` is hit in
`LVScope::addMissingElements()`. This is only visible in Debug builds.

- Parsing of instructions in `LVBinaryReader::createInstructions()` does
not check whether `Offset` lies within the `Bytes` ArrayRef. A specially
crafted DWARF input can lead to this condition.
2025-05-21 05:29:41 +01:00
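The second fix amounts to validating an offset before reading from a byte buffer. A minimal sketch of such a check, assuming a `std::vector` in place of `ArrayRef`; `sliceChecked` is a hypothetical helper, not the code in `LVBinaryReader::createInstructions()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: return the bytes in [Offset, Offset + Size), or
// std::nullopt when that range does not lie inside Bytes.
std::optional<std::vector<std::uint8_t>>
sliceChecked(const std::vector<std::uint8_t> &Bytes, std::size_t Offset,
             std::size_t Size) {
  // Compare against the remaining length so that Offset + Size is never
  // computed and cannot overflow.
  if (Offset > Bytes.size() || Size > Bytes.size() - Offset)
    return std::nullopt;
  auto First = Bytes.begin() + static_cast<std::ptrdiff_t>(Offset);
  return std::vector<std::uint8_t>(First,
                                   First + static_cast<std::ptrdiff_t>(Size));
}
```

A caller handling untrusted input would treat `std::nullopt` as malformed data and stop parsing instead of reading past the end of the buffer.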
Srinivasa Ravi
9a553d3766 [MLIR][NVVM] Add NVVMRequiresSM op traits (#126886)
Motivation:
Currently, the NVVMOps are not verified against the supported SM
architectures. This can manifest as an ISel failure in the NVPTX LLVM
backend during CodeGen to PTX ISA. This PR addresses this issue by
adding verifier checks for Target-SM architectures in the NVVM Dialect
itself, thereby catching the errors early on.

Summary:
* Parametric traits named `NVVMRequiresSM` and `NVVMRequiresSMa` are
added to facilitate the version checks for typical and arch-accelerated
versions respectively.
* These traits can be attached to any NVVM Op to enable the checks for
the particular Op. (example shown below)
* An attribute interface named `TargetAttrVerifyInterface` is
added to the GPU dialect which any target attribute seeking to perform
target-verification on the module can implement.
* The checks are performed by the `NVVMTargetAttr` (implementing the
`TargetAttrVerifyInterface` interface) when called from the GPU module
verifier where it walks through the module and performs the checks for
Ops with the `NVVMRequiresSM` traits.
* A few Ops in `NVVMOps.td` have been updated to serve as examples.

Example Usage:
```
       def NVVM_ReduxOp : NVVM_Op<"redux.sync"> {...} 
 ----> def NVVM_ReduxOp : NVVM_Op<"redux.sync", [NVVMRequiresSM<80>]> {...}

       def NVVM_WgmmaFenceAlignedOp : NVVM_Op<"wgmma.fence.aligned"> {...}
 ----> def NVVM_WgmmaFenceAlignedOp : NVVM_Op<"wgmma.fence.aligned", [NVVMRequiresSMa<[90]>]> {...}
```

---------

Co-authored-by: Guray Ozen <guray.ozen@gmail.com>
2025-05-21 08:53:00 +05:30
Shilei Tian
0f2a469951 Revert "[AMDGPU] remove move instruction if there is no user of it (#136735)"
This reverts commit 883afa4ef9 since it is not
technically sound.
2025-05-20 23:12:09 -04:00
Johannes Doerfert
57a90edacd [OpenMP][GPU][FIX] Enable generic barriers in single threaded contexts (#140786)
The generic GPU barrier implementation checked if it was the main thread
in generic mode to identify single threaded regions. This doesn't work
since inside of a non-active (=sequential) parallel, that thread becomes
the main thread of a team, and is not the main thread in generic mode.
At least that is the implementation of the APIs today.

To identify single threaded regions we now check the team size
explicitly.

This exposed three other issues; one is, for now, expected and not a
bug, the second one is a bug and has a FIXME in the
single_threaded_for_barrier_hang_1.c file, and the final one is also
benign as described at the end.

The non-bug issue comes up if we ever initialize a thread state.
Afterwards we will never run any region in parallel. This is a little
conservative, but I guess thread states are really bad for performance
anyway.

The bug comes up if we optimize single_threaded_for_barrier_hang_1 and
execute it in Generic-SPMD mode. For some reason we lose all the
updates to b. This looks very much like a compiler bug, but could also
be another logic issue in the runtime. Needs to be investigated.

Issue number 3 comes up if we have nested parallels inside of a target
region. The clang SPMD-check logic gets confused, determines SPMD (which
is fine) but picks an unreasonable thread count. This is all benign, I
think, just weird:

```
  #pragma omp target teams
  #pragma omp parallel num_threads(64)
  #pragma omp parallel num_threads(10)
  {}
```
Was launched with 10 threads, not 64.
2025-05-20 19:33:54 -07:00
Owen Pan
0dfdf7efbf [clang-format] Handle raw string literals containing JSON code (#140666)
Fix #65400
2025-05-20 19:15:57 -07:00
Younan Zhang
701fe51fd6 [Clang] Fix an inadvertent overwrite of sub-initializers (#140714)
When using InitChecker with VerifyOnly, we create a new designated
initializer to handle anonymous fields. However in the last call to
CheckDesignatedInitializer, the subinitializer isn't properly preserved
but it gets overwritten by the cloned one. Which causes the initializer
to reference the dependent field, breaking assumptions when we
initialize the instantiated specialization.

Fixes https://github.com/llvm/llvm-project/issues/67173
2025-05-21 10:02:34 +08:00
Pat Doyle
064912217c [bazel][mlir] Add missing dep for 747620d (#140830)
fixes the following errors:

ERROR:
/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/6a1efeb401da192d3572f00e2f11245b/external/llvm-project/mlir/BUILD.bazel:3410:11:
Compiling mlir/lib/Dialect/XeGPU/Transforms/XeGPUWgToSgDistribute.cpp
failed: (Exit 1): clang failed: error executing CppCompile command (from
target @@llvm-project//mlir:XeGPUTransforms) /usr/lib/llvm-18/bin/clang
-U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign
-Wunused-but-set-parameter -Wno-free-nonheap-object -fcolor-diagnostics
-fno-omit-frame-pointer ... (remaining 130 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox and retain
the sandbox build root for debugging

external/llvm-project/mlir/lib/Dialect/XeGPU/Transforms/XeGPUWgToSgDistribute.cpp:11:10:
error: module llvm-project//mlir:XeGPUTransforms does not depend on a
module exporting 'mlir/Dialect/Arith/Utils/Utils.h'
   11 | #include "mlir/Dialect/Arith/Utils/Utils.h"
      |          ^

external/llvm-project/mlir/lib/Dialect/XeGPU/Transforms/XeGPUWgToSgDistribute.cpp:13:10:
fatal error: 'mlir/Dialect/Index/IR/IndexDialect.h' file not found
   13 | #include "mlir/Dialect/Index/IR/IndexDialect.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated.
2025-05-20 20:55:20 -05:00
Valentin Clement (バレンタイン クレメン)
6811a3bedf [flang][cuda] Allocate extra descriptor in managed memory when it is coming from device (#140818) 2025-05-20 18:55:13 -07:00
Jonas Devlieghere
7b51339387 [lldb-dap] Avoid double 'new' events for dyld on Darwin (#140810)
I got a bug report where a pedantic DAP client complains about getting
two "new" module events for the same UUID. This is caused by the dyld
transition from the on-disk dyld to the shared cache dyld, which share
the same UUID. The transition is not generating an unloaded event
(because we're not really unloading dyld) but we do get a loaded event
(because the load address changed). This PR fixes the issue by relying
on the modules set as the source of truth instead of relying on the
event type.
2025-05-20 18:51:59 -07:00
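Using the set of known modules as the source of truth can be sketched as follows. `ModuleTracker` and its methods are hypothetical and only illustrate the idea; the strings "new", "changed" and "removed" are the reasons the Debug Adapter Protocol defines for module events.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical tracker keyed by module UUID.
class ModuleTracker {
  std::unordered_set<std::string> Known;

public:
  // Reason to report for a "loaded" notification: "new" the first time a
  // UUID is seen, "changed" if the module is already known (for example
  // when only its load address changed).
  std::string onLoaded(const std::string &UUID) {
    return Known.insert(UUID).second ? "new" : "changed";
  }

  // Reason to report for an "unloaded" notification: "removed" if the
  // module was known, or an empty string when there is nothing to report.
  std::string onUnloaded(const std::string &UUID) {
    return Known.erase(UUID) ? "removed" : "";
  }
};
```

In the dyld scenario described above, the second loaded notification for the same UUID would be reported as "changed" rather than as a second "new".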
tangaac
3cf65656b6 [LoongArch] Remove wrong vector shuffle lowering for lasx. (#140688)
PR https://github.com/llvm/llvm-project/pull/137918 introduces a wrong
lowering for v4f64/v4i64 to generate xvshuf4i.d instruction.
This PR reverts the wrong part of lasx.
2025-05-21 09:39:55 +08:00
Nhat Nguyen
7385772fdb [mlir] [liveness] Conservatively mark operands of return-like op inside non-callable and non-regionbranch op as live (#140793)
Currently the liveness analysis always marks operands yielded in regions
that aren't classified as `RegionBranchOpInterface` or
`CallableOpInterface` as non-live. Examples for these ops include
linalg.generic (with `linalg.yield` as terminator) or gpu ops (with
`gpu.yield` as terminator).

This in turn makes the `remove-dead-values` pass always incorrectly
remove the bodies of these ops, leading to invalid IR. Because these ops
define their own semantics, I have conservatively marked all operands of
these yield ops to be live.
2025-05-20 18:36:26 -07:00
Alan Li
ada2fbfe36 [GISel] Fix ShuffleVector assert (#139769)
Fixes issue: https://github.com/llvm/llvm-project/issues/139752

When G_SHUFFLE_VECTOR has only one element, the vector may have
decayed into a scalar.
2025-05-20 21:25:31 -04:00
Fangrui Song
744a469bba [BOLT,test] Add --image-base to tests that use --section-start
When using -no-pie without a SECTIONS command, the linker uses the
target's default image base. If -Ttext= or --section-start specifies an
output section address below this base, the result is likely unintended.
LLD will give a diagnostic (#140187) and may change the behavior in the future.
It's good to set an explicit image base to avoid relying on its current
behavior. BOLT doesn't seem to care whether a PT_PHDR segment is
present.

Pull Request: https://github.com/llvm/llvm-project/pull/140570
2025-05-20 17:41:27 -07:00
YAMAMOTO Takashi
80f0ffb937 [lld][WebAssembly] Set the target-cpu in LTO config (#140010)
I couldn't find an existing way to pass -mcpu=lime1 equivalent to LTO
codegen.
This commit would provide one. With this commit, you can do so by
passing
`-mllvm -mcpu=lime1` to wasm-ld.
2025-05-20 17:29:01 -07:00
LLVM GN Syncbot
88c4ef2f9f [gn build] Port d561d595c4 2025-05-20 23:20:27 +00:00