542664 Commits

Author SHA1 Message Date
Kazu Hirata
f0311f447a [ADT] Remove a constructor (NFC) (#146010)
ArrayRef now has a new constructor that takes a parameter whose type
has data() and size() methods.  Since the new constructor subsumes
another constructor that takes std::array, this patch removes that
constructor.  Note that std::array also comes with data() and size()
methods.

The only problem is that ASTFileSignature in the clang frontend does
not work with the new ArrayRef constructor because it overrides size,
blocking access to std::array<uint8_t, 20>::size().  This patch adds
an implicit cast operator to ArrayRef.  Note that ASTFileSignature is
defined as:

  struct ASTFileSignature : std::array<uint8_t, 20> {
    using BaseT = std::array<uint8_t, 20>;
    static constexpr size_t size = std::tuple_size<BaseT>::value;
    :
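
The name-hiding problem described above can be reproduced in isolation. The following is a minimal sketch, not the actual llvm::ArrayRef code: `MiniArrayRef` and `Signature` are hypothetical stand-ins, and the SFINAE check is simpler than whatever the real constructor uses. It shows a generic data()/size() constructor being disabled for a type whose static `size` member hides std::array::size(), and a conversion operator restoring the conversion.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <utility>

// Hypothetical stand-in for llvm::ArrayRef (assumption, not the real class).
template <typename T> class MiniArrayRef {
  const T *Data = nullptr;
  std::size_t Length = 0;

public:
  MiniArrayRef(const T *D, std::size_t N) : Data(D), Length(N) {}

  // Generic constructor: participates only when C has callable data() and
  // size() members. This is what lets it subsume a std::array overload.
  template <typename C,
            typename = decltype(std::declval<const C &>().data()),
            typename = decltype(std::declval<const C &>().size())>
  MiniArrayRef(const C &Container)
      : Data(Container.data()), Length(Container.size()) {}

  const T *data() const { return Data; }
  std::size_t size() const { return Length; }
};

// Same shape as ASTFileSignature: the static data member `size` hides
// std::array::size(), so `.size()` is ill-formed and the generic
// constructor above drops out of overload resolution.
struct Signature : std::array<std::uint8_t, 20> {
  using BaseT = std::array<std::uint8_t, 20>;
  static constexpr std::size_t size = std::tuple_size<BaseT>::value;

  // Conversion operator that supplies the conversion explicitly.
  operator MiniArrayRef<std::uint8_t>() const {
    return MiniArrayRef<std::uint8_t>(data(), size);
  }
};
```

With these definitions, `MiniArrayRef<std::uint8_t> R = Sig;` goes through the conversion operator, while a plain `std::array` still converts through the generic constructor.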
2025-06-27 07:39:31 -07:00
Momchil Velikov
3876e887d0 [MLIR][ArmSVE] Add an ArmSVE dialect operation mapping to bfmmla (#145064) 2025-06-27 15:37:13 +01:00
cmtice
da2969b105 [LLDB] Update DIL to handle smart pointers; add more tests. (#143786)
This updates the DIL implementation to handle smart pointers (accessing
field members and dereferencing) in the same way the current 'frame
variable' implementation does. It also adds tests for handling smart
pointers, as well as some additional DIL tests.
2025-06-27 07:30:14 -07:00
Edd Dawson
772009ce4a [PS5][Driver] Allow selection of CRT with -L (#145869)
There's long-standing behaviour in PlayStation to allow user-supplied
library search paths (`-L`) to influence lookup of CRT objects. This
seems to be a historical quirk that has persisted to the present day.

This usage of `-L` for CRT selection is deeply entrenched among users of
the PS5 toolchain. While this change is conceptually bothersome, it does
reflect what's shipped.

SIE tracker: TOOLCHAIN-17706
2025-06-27 15:29:49 +01:00
Erich Keane
3463aba45f [OpenACC][CIR] Implement copyin/copyout/create lowering for compute/combined (#145976)
This patch does the lowering of copyin (represented as an
acc.copyin/acc.delete), copyout (acc.create/acc.copyin), and create
(acc.create/acc.delete).

Additionally, it found a few problems with #144806, so it fixes those as
well.
2025-06-27 07:25:58 -07:00
Ana Mihajlovic
08d747c1ef [AMDGPU] Fix bad removal of s_delay_alu (#145728)
The instructionWaitsForSGPRWrites function covers ALL SALU instructions,
including those like s_waitcnt that don't read from an SGPR. This results
in removing delay_alu instructions in cases like VALU->SGPR->VALU, which
causes a performance regression. This change modifies the function so that
it also checks whether the instruction reads an SGPR.
2025-06-27 16:15:10 +02:00
Ross Brunton
39f19f2f1f [Offload] Store device info tree in device handle (#145913)
Rather than creating a new device info tree for each call to
`olGetDeviceInfo`, we instead do it on device initialisation. As well
as improving performance, this fixes a few lifetime issues with returned
strings.

This does unfortunately mean that device information is immutable,
but hopefully that shouldn't be a problem for any queries we want to
implement.

This also meant allowing offload initialization to fail, which it can
now do.
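
As a rough illustration of why building the info once helps with string lifetimes, here is a minimal sketch. It is not the liboffload implementation: `Device`, `InfoKey`, and `getInfo` are hypothetical names, and a flat map stands in for the real info tree. The point is only that values stored in the device object stay valid for as long as the device does, so repeated queries can return pointers into the same storage.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical query keys (assumption; not the real liboffload enum).
enum class InfoKey { Name, Vendor };

// Hypothetical device handle that builds its info once, at construction.
class Device {
  std::map<InfoKey, std::string> Info; // owned by the device, never rebuilt

public:
  Device(std::string Name, std::string Vendor) {
    Info.emplace(InfoKey::Name, std::move(Name));
    Info.emplace(InfoKey::Vendor, std::move(Vendor));
  }

  // Returns a pointer into device-owned storage, or nullptr if the key is
  // absent. The pointer stays valid for the lifetime of the Device.
  const std::string *getInfo(InfoKey Key) const {
    auto It = Info.find(Key);
    return It == Info.end() ? nullptr : &It->second;
  }
};
```

The trade-off matches the one noted above: because nothing is recomputed per query, the stored information cannot change after initialisation.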
2025-06-27 15:10:43 +01:00
Ross Brunton
102cf1b999 [Offload] Make CUDA Driver Version a string (#146049)
AMD treats this value as a string, so for consistency require this in
NVIDIA as well. This shouldn't change the output of the
`llvm-offload-device-info` tool, but does fix an issue in liboffload
when it tries to query the version.
2025-06-27 15:07:04 +01:00
AZero13
dcea5f1f38 [TargetLowering] Fold (a | b) ==/!= b -> (a & ~b) ==/!= 0 when and-not exists (#145368)
This is especially helpful for AArch64, which simplifies ands + cmp to tst.
Alive2: https://alive2.llvm.org/ce/z/LLgcJJ
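
The identity behind the fold can also be checked by brute force on a small width. The sketch below is only an illustration and not part of the patch; the helper name `foldHoldsForAllBytes` is made up. It compares `(a | b) == b` against `(a & ~b) == 0` for every pair of 8-bit values. The `!=` form follows by negating both sides.

```cpp
#include <cstdint>

// Returns true when (a | b) == b and (a & ~b) == 0 agree for every pair of
// 8-bit values, i.e. the fold preserves the comparison result.
bool foldHoldsForAllBytes() {
  for (unsigned A = 0; A < 256; ++A) {
    for (unsigned B = 0; B < 256; ++B) {
      const std::uint8_t a = static_cast<std::uint8_t>(A);
      const std::uint8_t b = static_cast<std::uint8_t>(B);
      // Cast back to 8 bits because the operands are promoted to int.
      const bool Original = static_cast<std::uint8_t>(a | b) == b;
      const bool Folded = static_cast<std::uint8_t>(a & ~b) == 0;
      if (Original != Folded)
        return false;
    }
  }
  return true;
}
```

Both sides are true exactly when every set bit of `a` is also set in `b`, which is why the and-not form is equivalent.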

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-06-27 14:47:52 +01:00
Simon Pilgrim
f329689ec0 [AMDGPU] add_i128.ll - regenerate test checks 2025-06-27 14:28:33 +01:00
Wendi
2b48ce7034 [docs] Add documentation for LLVM Qualification Group (#145331)
This patch adds a new document describing the LLVM Qualification Group,
modeled after the Security Group documentation. The goal is to create an
open working group focused on enabling LLVM use in safety-critical
applications, such as those requiring ISO 26262 qualification.

The group is intended to be non-enforcing and collaborative, and to act
as a public coordination point for contributors working on
safety-relevant concerns in LLVM.

See:
https://discourse.llvm.org/t/rfc-proposal-to-establish-a-safety-group-in-llvm/86916

In this review, I’d really appreciate your feedback on both the overall
structure and wording, especially if anything could be made clearer,
more balanced, or more aligned with LLVM’s values and documentation
tone. What feels right? What could be improved to better reflect LLVM
community expectations?

---------

Co-authored-by: Wendi Urribarri (Woven by Toyota <wendi.urribarri@woven-planet.global>
2025-06-27 14:23:53 +01:00
Pawan Nirpal
2cbcec4832 [Clang][NFC] - Move attr-cpuspecific-cpus test over to Sema (#146065)
The attr-cpuspecific-cpus test does not have any LLVM IR checks relevant
to CodeGen, so this moves the test over to clang/test/Sema.
2025-06-27 09:22:52 -04:00
Reid Kleckner
f59d270867 [cmake] Ignore pstl in LLVM_ENABLE_PROJECTS (#146070)
This should fix the premerge-monolithic-linux bot. This can be removed
after the next build master restart.
2025-06-27 07:20:05 -06:00
Krzysztof Parzyszek
302ed97b58 [flang][OpenMP] Move lowering of ATOMIC to separate file, NFC (#146067)
Reinstate commits e5559ca4 and 925dbc79 with changes that avoid the
reported failures in Windows builds.

Ref: https://github.com/llvm/llvm-project/pull/144960
2025-06-27 08:19:16 -05:00
John Brawn
ec1c73b2ec [compiler-rt][ARM] Only use bxaut when the target has pacbti (#146057)
Most pacbti instructions are a nop when the target does not have pacbti,
and thus safe to execute, but bxaut is an undefined instruction. When we
don't have pacbti (e.g. if we're compiling compiler-rt with
-mbranch-protection=standard in order to be forward-compatible with
pacbti while still working on targets without it) then we need to use
separate aut and bx instructions.
2025-06-27 13:26:09 +01:00
Egor Zhdan
d8ca77e2b9 [Clang][Sema] Allow qualified type names in swift_name attribute
This allows adding `__attribute__((swift_name("MyNamespace.MyType.my_method()")))` on standalone C++ functions to make Swift import them as member functions of a type that is nested in a C++ namespace.

rdar://138934888
2025-06-27 13:21:37 +01:00
Andrzej Warzyński
541f33e075 [mlir][linalg] Prevent hoisting of transfer pairs in the presence of aliases (#145235)
This patch adds additional checks to the hoisting logic to prevent hoisting of
`vector.transfer_read` / `vector.transfer_write` pairs when the underlying
memref has users that introduce aliases via operations implementing
`ViewLikeOpInterface`.

Note: This may conservatively block some valid hoisting opportunities and could
affect performance. However, as demonstrated by the included tests, the current
logic is too permissive and can lead to incorrect transformations.

If this change prevents hoisting in cases that are provably safe, please share
a minimal repro - I'm happy to explore ways to relax the check.

Special treatment is given to `memref.assume_alignment`, mainly to accommodate
recent updates in:

* https://github.com/llvm/llvm-project/pull/139521

Note that such special casing does not scale and should generally be avoided.
The current hoisting logic lacks robust alias analysis. While better support
would require more work, the broader semantics of `memref.assume_alignment`
remain somewhat unclear. It's possible this op may eventually be replaced with
the "alignment" attribute added in:

* https://github.com/llvm/llvm-project/pull/144344
2025-06-27 13:18:15 +01:00
Matt Arsenault
7e2e030121 GlobalISel: Replace use of report_fatal_error (#145866) 2025-06-27 21:16:23 +09:00
jeanPerier
37e2d10499 Revert "[flang] add option to generate runtime type info as external" (#146064)
Reverts llvm/llvm-project#145901

Broke shared library builds because of the usage of
`skipExternalRttiDefinition` in Lowering.
2025-06-27 14:05:59 +02:00
Akash Banerjee
91f10df794 [Flang][OpenMP] Skip implicit mapping of named constants (#145966)
Added early return when mapping named constants.

This prevents linking error in the following example:

```
program test
   use, intrinsic :: iso_c_binding, only: c_double
   implicit none

   real(c_double) :: x
   integer        :: i
   x = 0.0_c_double
   !$omp target teams distribute parallel do reduction(+:x)
   do i = 0, 9
      x = x + 1.0_c_double
   end do
   !$omp end target teams distribute parallel do
end program test
```
2025-06-27 13:05:22 +01:00
Matt Arsenault
c8ea114741 AMDGPU: Introduce a pass to replace VGPR MFMAs with AGPR (#145024)
In gfx90a-gfx950, it's possible to emit MFMAs which use AGPRs or VGPRs
for vdst and src2. We do not want to use the AGPR form unless
required by register pressure, as it requires cross-bank register
copies from most other instructions. Currently we select the AGPR
or VGPR version depending on a crude heuristic for whether it's possible
AGPRs will be required. We really need the register allocation to
be complete to make a good decision, which is what this pass is for.
    
This adds the pass, but does not yet remove the selection patterns
for AGPRs. This is a WIP, and NFC-ish. It should be a no-op on any
currently selected code. It also does not yet trigger on the real
examples of interest, which require handling batches of MFMAs at
once.
2025-06-27 21:05:03 +09:00
Matt Arsenault
bc1a6a2a93 AMDGPU: Add baseline tests for VGPR MFMA rewriting pass (#145023)
Add baseline tests for a new pass that will rewrite VGPR MFMAs
with copies to AV_* classes into the AGPR form.

Add start of IR test that probably needs to be redone
2025-06-27 21:02:49 +09:00
David Green
cf3d136c22 [AArch64] Do not generate ld1IndexPost when inserting into lane 0 of a zero vector (#145723)
If we are inserting into lane 0 of a zero vector, we can use the ldr
instructions to get the upper-lane zero for free. Do not attempt to make
post-inc operations in that case, which should result in fewer micro-ops
overall.
2025-06-27 12:47:16 +01:00
Nikolas Klauser
e9805235bf [libc++] Move libcxx/test/libcxx/extensions to libcxx/test/extensions and update the tests (#145476)
This patch adds a separate `extensions` directory, since there are quite
a few extensions in libc++ that aren't necessarily libc++-specific. For
example, the tests currently in `libcxx/test/libcxx/extensions` should
also pass with libstdc++, since they originally added the extension.
This also "documents" what users are allowed to rely on and what parts
are just libc++ tests to make sure our implementation is behaving as we
expect, which may be subject to change.

This patch also formats the tests and refactors `.fail.cpp` tests to
`.verify.cpp` tests.
2025-06-27 13:16:37 +02:00
Florian Hahn
5fdcb35aaa [InferAlignment] Add tests with GEP recurrences.
Add some test coverage for GEP recurrences in ValueTracking,
https://github.com/llvm/llvm-project/pull/123518.
2025-06-27 12:10:57 +01:00
jeanPerier
e816817bbb [flang] add option to generate runtime type info as external (#145901)
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It uses linkonce_odr to achieve the derived type
descriptor address "uniqueness" needed to match two derived types
inside the runtime.

This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.

This patch adds an experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.

Note that objects compiled with this option are not compatible with
object files compiled without it: owing to linkonce_odr, files compiled
without it may drop the rtti for types they define when it is not used
in the compilation unit.

I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file objects anyway).
2025-06-27 13:00:29 +02:00
Benjamin Kramer
f0f46e25ec [bazel] Port 3fdf46ad60 2025-06-27 12:52:45 +02:00
jeanPerier
b989c76f39 [flang][NFC] switch ValueRange(nullopt) to ValueRange{} after #146011 (#146043)
Clean up some std::nullopt usages in the FIR ops builder that trigger a
deprecation warning after #146011.
2025-06-27 12:49:34 +02:00
David Spickett
3f00cff5c7 [lldb][test] Disable TestLocationsAfterRebuild.py on Windows
We can't remove the program file while lldb has it open.

Test added in https://github.com/llvm/llvm-project/pull/145994.
2025-06-27 10:49:22 +00:00
Simon Pilgrim
7dde6027a0 [DAG] canCreateUndefOrPoison - add handling for ISD::SELECT (#146046)
Followup to #143760 which handled ISD::VSELECT

I've moved ISD::SELECT/VSELECT under the "No poison except from flags
(which is handled above)" subgroup to try to remind people that these
can have poison generating FMFs (NINF/NNAN), even though this hasn't
been well explained anywhere I can find :(

Helps with regressions from #145939
2025-06-27 11:49:08 +01:00
gbMattN
0158ca21a2 Prevent a crash when a global variable has debug metadata (#145918)
This patch fixes a crash I found when trying to compile some codebases
with -fsanitize=type and -g
2025-06-27 11:43:24 +01:00
Paul Walker
793667017c [NFC][LLVM] Use DL consistently throughout AArch64ISelLowering.cpp. 2025-06-27 10:35:04 +00:00
long.chen
aed8f1992a [NFC][mlir][memref] refine debug message about memref::SubViewOp. (#145470) 2025-06-27 18:34:45 +08:00
Twice
c3e08c9b89 [MLIR] Replace getVoidPtrType with getPtrType in ConvertToLLVMPattern (#145657)
`ConversionPattern::getVoidPtrType` looks a little confusing since the
opaque pointer migration is already done. Also, we cannot specify an
address space in this method.

Maybe we can mark it as deprecated and add a new method `getPtrType()`,
as this PR does : )
2025-06-27 12:31:53 +02:00
Tobias Hieta
1fb786ea93 [clang][scandeps] Improve handling of rawstrings. (#139504) 2025-06-27 12:21:21 +02:00
Nikita Popov
7f223d121d [PassBuilder] Treat pipeline aliases as normal passes (#146038)
Pipelines like `-passes="default<O3>"` are currently parsed in a special
way. Switch them to work like normal, parameterized module passes.
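
To make "parameterized pass" concrete, the sketch below splits a pass string such as `default<O3>` into a name and a parameter string. It is a simplified illustration and not the PassBuilder parsing code: `ParsedPass` and `parsePassName` are hypothetical names, and nested angle brackets are not handled.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type: a pass name plus its raw parameter text.
struct ParsedPass {
  std::string Name;
  std::string Params;
};

// Splits "name<params>" into its parts. A string without '<' is a plain,
// unparameterized pass. Returns std::nullopt for malformed input.
std::optional<ParsedPass> parsePassName(std::string_view Text) {
  if (Text.empty())
    return std::nullopt;
  const std::size_t Open = Text.find('<');
  if (Open == std::string_view::npos)
    return ParsedPass{std::string(Text), ""};
  if (Open == 0 || Text.back() != '>')
    return std::nullopt; // missing name or missing closing bracket
  const std::string_view Params =
      Text.substr(Open + 1, Text.size() - Open - 2);
  return ParsedPass{std::string(Text.substr(0, Open)), std::string(Params)};
}
```

Under this scheme `default<O3>` is just a pass named `default` with the parameter text `O3`, the same shape as any other parameterized pass string.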
2025-06-27 12:07:09 +02:00
Ramkumar Ramachandra
613804cca9 [LV] Improve code using [[maybe_unused]] (NFC) (#137138) 2025-06-27 10:58:17 +01:00
Simon Pilgrim
08f074a59f [TTI] getInstructionCost - consistently treat all undef/poison shuffle masks as free (#146039)
#145920 exposed an issue where we were treating undef/poison shuffles as SK_Select kinds
2025-06-27 10:53:01 +01:00
Ziqing Luo
5f2b9dd90d Re-land "[-Wunterminated-string-initialization] Handle C string literals ending with explicit '\0' (#143487)"
In C, a char array needs no "nonstring" attribute, if its initializer is
a string literal that 1) explicitly ends with '\0' and 2) fits in the
array after a possible truncation.

For example
`char a[4] = "ABC\0"; // fine, needs no "nonstring" attr`

rdar://152506883

This reland disables the test for linux so that it will not block the
buildbot: https://lab.llvm.org/buildbot/#/builders/144/builds/28591.
2025-06-27 17:41:14 +08:00
David Sherwood
bf2b14acf3 [LV] Enable auto-vectorisation of loops with uncountable exits (#133099)
Until now the feature to enable vectorisation of some early exit
loops with uncountable exits was controlled under a flag, off by
default. Now that we have efficient code generation for
vectorising such loops (see PR #130766) and we still have some
time before the next LLVM release, it seems like a good point
to enable the feature by default. If any issues arise post-commit
it can be easily reverted.

Using this patch I built and ran the LLVM test suite successfully,
which on neoverse-v1 led to the vectorisation of 114 additional
early exit loops. I also built and ran SPEC2017 successfully for
both neoverse-v1 and neoverse-v2.
2025-06-27 10:39:33 +01:00
Pavel Labath
2c90c0b90c [lldb] Extract debug server location code (#145706)
.. from the guts of GDBRemoteCommunication to ~top level.

This is motivated by #131519 and by the fact that it's impossible to guess
whether the author of a symlink intended it to be a "convenience
shortcut" -- meaning it should be resolved before looking for related
files; or an "implementation detail" -- meaning the related files should
be located near the symlink itself.

This debate is particularly ridiculous when it comes to lldb-server
running in platform mode, because it also functions as a debug server,
so what we really just need to do is to pass /proc/self/exe in a
platform-independent manner.

Moving the location logic higher up achieves that as lldb-platform (on
non-macos) can pass `HostInfo::GetProgramFileSpec`, while liblldb can
use the existing complex logic (which only worked on liblldb anyway as
lldb-platform doesn't have a lldb_private::Platform instance).

Another benefit of this patch is a reduction in dependency from
GDBRemoteCommunication to the rest of liblldb (achieved by avoiding the
Platform dependency).
2025-06-27 11:16:57 +02:00
Matt Arsenault
7255c3aee3 DAG: Check libcall function is supported before emission (#144314) 2025-06-27 18:09:04 +09:00
David Sherwood
ddb8493ca7 [LV] Fix test issue caused by #145877 (#146041) 2025-06-27 10:02:57 +01:00
Matt Arsenault
b4d3283ab7 AArch64: Add libcall impl declarations for __arm_sc* memory functions (#144977)
These were bypassing the ordinary libcall emission mechanism. Make sure
we have entries in RuntimeLibcalls, which should include all possible
calls the compiler could emit.

Fixes not emitting the # prefix in the arm64ec case.
2025-06-27 17:53:03 +09:00
Matt Arsenault
779f7243c8 XCore: Declare libcalls used for align 4 memcpy (#144976)
This usage was hidden in XCoreSelectionDAGInfo and bypassed
the usual libcall system, so define these for later use.
2025-06-27 17:50:01 +09:00
Matt Arsenault
f38773e980 Hexagon: Add libcall declarations for special memcpy (#144975)
HexagonSelectionDAGInfo was bypassing the ordinary RuntimeLibcallInfo
handling for this case, so define a libcall for it and use it.
2025-06-27 17:46:42 +09:00
Matt Arsenault
4243e502c1 ARM: Add runtime libcall definitions for aeabi memory functions (#144974)
Fix bypassing ordinary RuntimeLibcalls APIs for cases handled
in ARMSelectionDAGInfo
2025-06-27 17:43:46 +09:00
Matt Arsenault
b88e1f6a79 TableGen: Generate enum for runtime libcall implementations (#144973)
Work towards separating the ABI existence of libcalls vs. the
lowering selection. Set libcall selection through enums, rather
than through raw string names.
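
The split between "which libcalls exist" and "which implementation is selected" can be pictured with two enums. This is a minimal sketch under assumed names, not the TableGen-generated code: `Libcall`, `LibcallImpl`, `implName`, and `LibcallSelection` are all hypothetical. Only the symbol names `memcpy` and `__aeabi_memcpy` are real.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Abstract operations the compiler may need (hypothetical subset).
enum class Libcall : std::size_t { Memcpy, Memset, Count };

// Concrete implementations, identified by enum rather than by string.
enum class LibcallImpl { Unsupported, Impl_memcpy, Impl_memset,
                         Impl_aeabi_memcpy };

// Maps an implementation to its symbol name; empty when unsupported.
constexpr std::string_view implName(LibcallImpl Impl) {
  switch (Impl) {
  case LibcallImpl::Impl_memcpy:
    return "memcpy";
  case LibcallImpl::Impl_memset:
    return "memset";
  case LibcallImpl::Impl_aeabi_memcpy:
    return "__aeabi_memcpy";
  default:
    return "";
  }
}

// Per-target selection table: each abstract libcall picks one implementation.
class LibcallSelection {
  // Value-initialized, so every entry starts as Unsupported.
  std::array<LibcallImpl, static_cast<std::size_t>(Libcall::Count)> Selected{};

public:
  void set(Libcall Call, LibcallImpl Impl) {
    Selected[static_cast<std::size_t>(Call)] = Impl;
  }
  LibcallImpl get(Libcall Call) const {
    return Selected[static_cast<std::size_t>(Call)];
  }
  std::string_view name(Libcall Call) const { return implName(get(Call)); }
};
```

Selecting by enum means a misspelled implementation is a compile error rather than a silently wrong symbol name, which is one plausible benefit of moving away from raw strings.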
2025-06-27 17:40:43 +09:00
Matt Arsenault
3fdf46ad60 TableGen: Add runtime libcall backend (#144972)
Replace RuntimeLibcalls.def with a tablegenerated version. This
is in preparation for splitting RuntimeLibcalls into two components.
For now match the existing functionality.
2025-06-27 17:37:03 +09:00
Nikolas Klauser
d163ab3323 [libc++] Remove a bunch of unnecessary type indirections from __tree (#145295)
Most of the diff is replacing `__parent_pointer` with
`__end_node_pointer`. The most interesting diff is that the pointer
aliases are now defined directly inside `__tree` instead of a separate
traits class.
2025-06-27 10:11:20 +02:00