Commit Graph

541398 Commits

Author SHA1 Message Date
Lucas Duarte Prates
6fcdde2a4e [runtimes] Allow use of external llvm-lit on standalone builds (#144347)
When creating a standalone build of the runtimes sub-project, the
current CMake implementation looks for a lit executable that might
potentially exist in the build tree and unconditionally overrides the
value of `LLVM_EXTERNAL_LIT`. Due to this, any value passed via
`-DLLVM_EXTERNAL_LIT` when configuring the CMake project is ignored.
This change adds the `ALLOW_EXTERNAL` argument to the
`get_llvm_lit_path` call in the runtimes' CMakeLists.txt, allowing any
value previously set to be considered.
2025-06-18 10:26:46 +01:00
Karlo Basioli
58c4fa96cb Fix bazel build for #142771 (#144659) 2025-06-18 10:21:37 +01:00
Ying Yi
fe42d34274 [clang][headers]Remove unnecessary guard of !defined(__SCE__). (#144522)
Sony PlayStation now supports C++20, and we wish to change the default
C++ mode to C++20 sometime in the future. As such, the !defined(__SCE__)
guards are redundant and we want to remove them. This in turn makes the
entire guard lines redundant (always true), so this patch removes them
entirely.
2025-06-18 10:13:46 +01:00
Sirui Mu
8e157fdbb7 [CIR] Add support for __builtin_assume (#144376)
This patch adds support for the `__builtin_assume` builtin function.
2025-06-18 17:10:29 +08:00
Kunqiu Chen
355725a25e [TSan] Fix missing inst cleanup (#144067)
Commit 44e875ad5b introduced a change that
replaces `ReplaceInstWithInst` with `Instruction::replaceAllUsesWith`,
without subsequent instruction cleanup.

This results in TSan leaving behind useless `load atomic` instructions
after 'replacing' them.

This commit adds cleanup back, consistent with the context.
2025-06-18 17:09:32 +08:00
Frank Schlimbach
43e1a5a411 [mlir][mesh] adding option for traversal order in sharding propagation (#144079)
The traversal order in sharding propagation was hard-coded. This PR
provides options to the pass to select a suitable order:
- forward-only
- backward-only
- forward-backward
- backward-forward

Default is the previous behavior (backward-forward).
2025-06-18 11:06:48 +02:00
Philipp Jung
669627d0c7 Add check 'cppcoreguidelines-use-enum-class' (#138282)
Warn on non-class enum definitions as suggested by the Core Guidelines:
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Renum-class
2025-06-18 11:02:53 +02:00
Mikael Holmen
c16dc63b44 [OMPIRBuilder] Fix gcc -Wparentheses warning [NFC]
Without this gcc warned like
 /repo/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:7559:68: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
  7559 |         NumStaleCIArgs == (OffloadingArraysToPrivatize.size() + 2) &&
       |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
  7560 |             "Wrong number of arguments for StaleCI when shareds are present");
       |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
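
A minimal sketch of the pattern gcc is complaining about, and one common way to silence it. The function name, parameter names, and the first operand of `||` are hypothetical stand-ins, not copied from OMPIRBuilder.cpp; only the message string comes from the warning text above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the check in OMPIRBuilder.cpp.
// Writing `A || B == C && "msg"` parses as `A || ((B == C) && "msg")`,
// which is what -Wparentheses flags. Grouping the whole condition first and
// then attaching the message makes the intent explicit.
bool staleArgCountIsValid(bool noShareds, std::size_t numStaleArgs,
                          std::size_t numArraysToPrivatize) {
  assert((noShareds || numStaleArgs == (numArraysToPrivatize + 2)) &&
         "Wrong number of arguments for StaleCI when shareds are present");
  return noShareds || numStaleArgs == (numArraysToPrivatize + 2);
}
```

Since the string literal is always truthy, both groupings evaluate to the same value; the extra parentheses only change how the expression reads and whether gcc warns.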
2025-06-18 10:59:18 +02:00
Nikita Popov
7ea7ccd24d [PowerPC][AIX] Specify pointer info and alignment for stack store (#144526)
When lowering call arguments to stack, specify a stack MPI, as well as
the stack alignment, instead of using the defaults (which would be an
unknown location with ABI alignment).

I believe the asm diffs are just changes in scheduling.
2025-06-18 10:50:17 +02:00
Craig Topper
255b55c602 [GlobalOpt] Use cast instead of dyn_cast. NFC (#144634)
The dyn_cast was not checked for null, and the cast is guaranteed to
succeed by an earlier check.
2025-06-18 01:35:56 -07:00
Kareem Ergawy
59d6fbb8ff [flang][fir] Provide allocation block for fir.local when required (#144521)
Extends `fir::FirOpBuilder::getAllocaBlock()` to support `fir.local`.
This allows us to retrieve an allocation block when needed for
`fir.local`.
2025-06-18 10:24:08 +02:00
Pengcheng Wang
ca29c632f0 [RISCV] Support non-power-of-2 types when expanding memcmp
We can convert non-power-of-2 types into extended value types
and then they will be widened.

Reviewers: lukel97

Reviewed By: lukel97

Pull Request: https://github.com/llvm/llvm-project/pull/114971
2025-06-18 16:11:18 +08:00
Mel Chen
ba40a7bc2e [LoopVectorize] Vectorize fixed-order recurrence with vscale x 1. (#142772)
When the fixed-order recurrence phi is live-out from the loop, the
vectorizer uses VPInstruction::ExtractPenultimateElement to extract the
penultimate element from the recurrence vector. However, this is not
feasible when the VF is vscale x 1, since vscale could be 1, making the
vector contain only one element.

This patch changes the behavior for vscale x 1 by extracting the last
element from the vector produced by splicing the recurrence phi and the
previous value. This ensures we can still determine the correct live-out
value of the recurrence phi.
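
To illustrate why the last element of the splice works for any lane count, here is a small model that uses `std::vector<int>` as a stand-in for the IR vectors. The helper names are hypothetical, and the splice is assumed to take the last lane of the first operand followed by all but the last lane of the second.

```cpp
#include <cassert>
#include <vector>

// Models a splice that keeps the last lane of `recurPhi` and then the first
// (n - 1) lanes of `previous`. Both inputs must have the same, non-zero size.
std::vector<int> spliceLastLane(const std::vector<int>& recurPhi,
                                const std::vector<int>& previous) {
  assert(!recurPhi.empty() && recurPhi.size() == previous.size());
  std::vector<int> out;
  out.push_back(recurPhi.back());
  out.insert(out.end(), previous.begin(), previous.end() - 1);
  return out;
}

// Live-out value of the recurrence phi: the last lane of the spliced vector.
int recurrenceLiveOut(const std::vector<int>& recurPhi,
                      const std::vector<int>& previous) {
  return spliceLastLane(recurPhi, previous).back();
}
```

With four lanes, the result is the penultimate lane of `previous`, matching the old extraction. With a single lane, there is no penultimate lane, and the result is the lane carried in by the recurrence phi.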
2025-06-18 16:03:20 +08:00
Simon Tatham
49df87e71b [libc][printf] Fix out-of-range shift in float320 printf (#144542)
If you enable `LIBC_CONF_PRINTF_FLOAT_TO_STR_USE_FLOAT320` and use a
`%f` style printf format directive to print a nonzero number too small
to show up in the output digits, e.g. `printf("%.2f", 0.001)`, then the
output would be intermittently incorrect, because
`DyadicFloat::as_mantissa_type_rounded` would try to shift the 320-bit
mantissa right by more than 320 bits, invoking the 'undefined behavior'
clause commented in the `shift()` function in `big_int.h`.

There were already tests in the libc test suite exercising this case,
e.g. the subnormal tests in `LlvmLibcSPrintfTest.FloatDecimalConv` use
`%f` at the default precision of 6 decimal places on tiny numbers such
as 2^-1027. But because the behavior is undefined, they don't visibly
fail all the time, and in all previous test runs we'd tried with
USE_FLOAT320, they had got lucky.

The fix is simply to detect an out-of-range right shift before doing it,
and instead just set the output value to zero.
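
The same guard can be sketched on a 64-bit integer, used here as a stand-in for the 320-bit mantissa. The function name is hypothetical and this is not the code in `big_int.h`; it only shows the shape of the check.

```cpp
#include <cstdint>

// In C++, shifting a 64-bit value by 64 or more bits is undefined behavior.
// Detect that case up front and return zero, which is the mathematically
// expected result of shifting every bit out.
constexpr std::uint64_t safeShiftRight(std::uint64_t value, unsigned amount) {
  return amount >= 64 ? 0 : value >> amount;
}
```

Callers can then pass any shift amount without first proving that it is in range.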
2025-06-18 08:57:51 +01:00
Robert Imschweiler
a38932ac3c Revert "[GlobalISel] prevent G_UNMERGE_VALUES for vectors with different elements" (#144650)
Reverts llvm/llvm-project#133335
2025-06-18 09:49:32 +02:00
Rajveer Singh Bharadwaj
e07b1b26c3 [DAG] Implement SDPatternMatch m_Abs() matcher (#144512) 2025-06-18 12:59:27 +05:30
Garvit Gupta
45ea46c446 Reland [Driver] Add support for GCC installation detection in Baremetal toolchain (#144640)
This patch introduces enhancements to the Baremetal toolchain to support
GCC toolchain detection.
- If the --gcc-install-dir or --gcc-toolchain options are provided and
point to valid paths, the sysroot is derived from those locations.
- If not, the logic falls back to the existing sysroot inference
mechanism already present in the Baremetal toolchain.
- Support for adding include paths for the libstdc++ library has also
been added.

Additionally, the restriction to always use the integrated assembler has
been removed. With a valid GCC installation, the GNU assembler can now
be used as well.

This patch currently updates and adds tests for the ARM target only.
RISC-V-specific tests will be introduced in a later patch, once the
RISCVToolChain is fully merged into the Baremetal toolchain. At this
stage, there is no way to test the RISC-V target within this PR.

RFC:

https://discourse.llvm.org/t/merging-riscvtoolchain-and-baremetal-toolchains/75524
2025-06-18 12:50:48 +05:30
Simon Pilgrim
44b715293f [PhaseOrdering][X86] Copy FMUL+ADDSUB/FMADDSUB build vector patterns from codegen tests
As detailed on #144489 - confirm the vectorisation of scalar FMUL+ADDSUB/FMADDSUB on various targets
2025-06-18 08:17:08 +01:00
Simon Pilgrim
0875bee2b1 [X86] combineAndNotIntoANDNP - pull out repeated SDLoc(). NFC. 2025-06-18 08:17:07 +01:00
Simon Pilgrim
dac94f28e6 [X86] combineAndNotOrIntoAndNotAnd - pull out repeated SDLoc(). NFC. 2025-06-18 08:17:07 +01:00
Simon Pilgrim
896e187a6e [X86] combineAndMaskToShift - pull out repeated SDLoc(). NFC. 2025-06-18 08:17:07 +01:00
Robert Imschweiler
4d71f20b28 [GlobalISel] prevent G_UNMERGE_VALUES for vectors with different elements (#133335)
This commit prevents building a G_UNMERGE_VALUES instruction with
different source and destination vector elements in
`LegalizationArtifactCombiner::ArtifactValueFinder::tryCombineMergeLike()`,
e.g.:
`%1:_(<2 x s8>), %2:_(<2 x s8>) = G_UNMERGE_VALUES %0:_(<2 x s16>)`

This LLVM defect was identified via the AMD Fuzzing project.
2025-06-18 09:07:08 +02:00
Kunqiu Chen
10f29a6072 [MSan] Fix wrong unpoison size in SignalAction (#144071)
MSan should unpoison the parameters of extended signal handlers. 
However, MSan unpoisoned the second parameter with the wrong size 
`sizeof(__sanitizer_sigaction)`, inconsistent with its real type 
`siginfo_t`.

This commit fixes this issue by correcting the size to 
`sizeof(__sanitizer_siginfo)`.
2025-06-18 14:53:33 +08:00
Kirill Chibisov
74687180dd [mlir][emitc] Make CExpression trait into interface (#142771)
By defining `CExpressionInterface`, we move the side effect detection
logic from `emitc.expression` into the individual operations
implementing the interface allowing operations to gradually tune the
side effect.

It also allows checking each operation for side effects individually.
2025-06-18 07:38:47 +02:00
Craig Topper
ad9e591fd5 [SelectionDAG][RISCV] Fold (add (vscale * C0), (vscale * C1)) to (vscale * (C0 + C1)) in getNode. (#144565)
We already have shl/mul vscale related folds in getNode.

This is an alternative to the DAGCombine proposed in #144507.
2025-06-17 21:33:50 -07:00
Matt Arsenault
7b9d10d2e6 PowerPC: Fix using long double libm functions for f128 intrinsics (#144382)
This wasn't setting the correct libcall names, which default to the
l suffixed libm names.
2025-06-18 13:26:15 +09:00
Matt Arsenault
af49a650e1 PowerPC: Add baseline tests for more f128 libcall handling (#144381)
Some of these incorrectly call the l suffixed version of libm
functions and others assert.
2025-06-18 13:23:17 +09:00
Liao Chunyu
e14f327d80 [RISCV] Pre-test for #144461 2025-06-17 23:32:01 -04:00
Sudharsan Veeravalli
a2ad65661a [RISCV] Add patterns for generating QC_CTO and QC_CLO (#144532)
These instructions count leading/trailing ones in the register.

Currently these are only generated when we have `Zbb` enabled (along
with `Xqcibm`) since it contains the `CTTZ/CTLZ` instructions.
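
The link to `CTTZ/CTLZ` can be sketched as follows, on the assumption that counting ones is expressed as counting zeros of the bitwise complement. The helper names are hypothetical and the loops are a plain reference model, not the selection patterns added by the patch.

```cpp
#include <cstdint>

// Reference count-trailing-zeros; returns 32 for an all-zero input.
constexpr int cttz(std::uint32_t x) {
  int n = 0;
  while (n < 32 && ((x >> n) & 1u) == 0) ++n;
  return n;
}

// Reference count-leading-zeros; returns 32 for an all-zero input.
constexpr int ctlz(std::uint32_t x) {
  int n = 0;
  while (n < 32 && ((x >> (31 - n)) & 1u) == 0) ++n;
  return n;
}

// Counting ones is counting zeros of the complement.
constexpr int countTrailingOnes(std::uint32_t x) {
  return cttz(static_cast<std::uint32_t>(~x));
}
constexpr int countLeadingOnes(std::uint32_t x) {
  return ctlz(static_cast<std::uint32_t>(~x));
}
```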
2025-06-18 07:54:08 +05:30
Jacob Lalonde
a96a3f1b26 [lldb][Minidump Parser] Implement a range data vector for minidump memory ranges (#136040)
Recently I was debugging a Minidump with a few thousand ranges, and came
across the (now deleted) comment:

```
  // I don't have a sense of how frequently this is called or how many memory
  // ranges a Minidump typically has, so I'm not sure if searching for the
  // appropriate range linearly each time is stupid.  Perhaps we should build
  // an index for faster lookups.
```

Blaming this comment shows that it is 9 years old! This simple fix
with a range data vector is much overdue.

I had to add a default constructor to Range in order to implement the
RangeDataVector, but otherwise this is just a replacement of the lookup logic.
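
The general idea of replacing a linear scan with a sorted index can be sketched as below. This is not lldb's `RangeDataVector`; `MemoryRange`, `sortRanges`, and `findRange` are hypothetical names, and the ranges are assumed not to overlap.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical range type; the default member initializers make it
// default-constructible.
struct MemoryRange {
  std::uint64_t base = 0;
  std::uint64_t size = 0;
};

// Sort once by base address so that lookups can use binary search.
inline void sortRanges(std::vector<MemoryRange>& ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const MemoryRange& a, const MemoryRange& b) {
              return a.base < b.base;
            });
}

// Returns the range containing `addr`, or nullptr if there is none.
// Requires `ranges` to be sorted by base and non-overlapping.
inline const MemoryRange* findRange(const std::vector<MemoryRange>& ranges,
                                    std::uint64_t addr) {
  // First range whose base is strictly greater than addr.
  auto it = std::upper_bound(
      ranges.begin(), ranges.end(), addr,
      [](std::uint64_t value, const MemoryRange& r) { return value < r.base; });
  if (it == ranges.begin()) return nullptr;
  --it;  // Last range whose base is <= addr.
  return (addr - it->base < it->size) ? &*it : nullptr;
}
```

Each lookup then costs O(log n) comparisons instead of a scan over every range.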
2025-06-17 18:37:15 -07:00
Jim Lin
8ddada41df [RISCV] Add Andes XAndesVBFHCvt (Andes Vector BFLOAT16 Conversion) extension (#144320)
The spec can be found at:
https://github.com/andestech/andes-v5-isa/releases/tag/ast-v5_4_0-release.

This patch only supports the assembler. The instructions are similar to
`Zvfbfmin`; the only difference from `Zvfbfmin` is that
`XAndesVBFHCvt` doesn't have a mask variant.
2025-06-18 09:17:46 +08:00
Peter Collingbourne
9265b1f0cf LowerTypeTests: Use jump table entry type as value type of jump table alias.
The motivation for this is that it causes the jump table entry's symbol
to have an st_size equal to the jump table entry size, instead of being
equal to the size of the entire jump table, which is incorrect and can
lead to unexpected behavior in binary analysis tools that rely on the
size field such as Bloaty.

Reviewers: fmayer

Reviewed By: fmayer

Pull Request: https://github.com/llvm/llvm-project/pull/144462
2025-06-17 18:15:06 -07:00
Harrison Hao
0defde8e06 [AMDGPU] Support D16 folding for image.sample with multiple extractelement and fptrunc users (#141758)
Currently, we only support D16 folding for `image.sample` instructions with a
single user: an `fptrunc` to half.
However, we can also support D16 folding for `image.sample`
instructions with multiple users,
as long as each user follows the pattern of `extractelement` followed by
`fptrunc` to half.
For example:
```
  %sample = call <4 x float> @llvm.amdgcn.image.sample
  %e0 = extractelement <4 x float> %sample, i32 0
  %h0 = fptrunc float %e0 to half
  %e1 = extractelement <4 x float> %sample, i32 1
  %h1 = fptrunc float %e1 to half
  %e2 = extractelement <4 x float> %sample, i32 2
  %h2 = fptrunc float %e2 to half
```
This change enables D16 folding for such cases and avoids generating
`v_cvt_f16_f32_e32` instructions.
2025-06-18 09:00:07 +08:00
Jianhui Li
86a09f3615 [MLIR][XeGPU] Clean up xegpu op tests (#144592)
Test cleanup:
1) separate layout.mlir from ops.mlir for layout-related tests.
2) remove lane layout for ops working at work item scope.
3) remove redundant tests in create_tdesc/update_tdesc/prefetch.
4) remove "test_" from all test function names.
2025-06-17 19:48:09 -05:00
Jason Molenda
4e090b6e84 [lldb] Re-insert code to search for a binary by filepath if provided
On July 14, 2024, I landed a change to update progress reporting when
loading kernel/firmware binaries:
https://github.com/llvm/llvm-project/pull/98845
In DynamicLoader::LoadBinaryWithUUIDAndAddress I removed code that
was setting the ModuleSpec to the provided name, if the name provided
is that of a file on disk.  With this code missing, if a filepath
name is passed in, this code will fail to find that binary on the local
disk.  Nothing in that PR or its intent called for this change; it
was unintentional.
2025-06-17 17:41:31 -07:00
Matt Arsenault
99e263228f github: Add mips backend to PR autolabeler (#140909) 2025-06-18 09:28:24 +09:00
Andrew Rogers
abbdd1670d [llvm] minor fixes for clang-cl Windows DLL build (#144386)
## Purpose

This patch makes minor changes to LLVM and Clang so that LLVM can be
built as a Windows DLL with `clang-cl`. These changes were not required
for building a Windows DLL with MSVC.

## Background

The Windows DLL effort is tracked in #109483. Additional context is
provided in [this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

## Overview
Specific changes made in this patch:
- Remove `constexpr` fields that reference DLL exported symbols. These
symbols cannot be resolved at compile time when building a Windows DLL
using `clang-cl`, so they cannot be `constexpr`. Instead, they are made
`const` and initialized in the implementation file rather than at
declaration in the header.
- Annotate symbols now defined out-of-line with `LLVM_ABI` so they are
exported when building as a shared library.
- Explicitly add default copy assignment operator for `ELFFile` to
resolve a compiler warning.

## Validation

Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:

- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
2025-06-17 17:21:40 -07:00
Minding
64155a3229 Added clarifying comment to 'LLVMLinkInMCJIT' and 'LLVMLinkInInterpreter' (#92467)
Clarify that these functions are no-ops when linking to LLVM as a shared object.
2025-06-18 10:09:07 +10:00
Shilei Tian
15482c83aa [ElimAvailExtern] Add an option to allow to convert global variables in a specified address space to local (#144287)
Currently, the `EliminateAvailableExternallyPass` only converts certain
available externally functions to local if `avail-extern-to-local` is
set or in contextual profiling mode. For global variables, it only drops
their initializers.

This PR adds an option to allow the pass to convert global variables in
a specified address space to local. The motivation for this change is to
correctly support lowering of LDS variables (`__shared__` variables, in
more generic terminology) when ThinLTO is enabled for AMDGPU.

A `__shared__` variable is lowered to a hidden global variable in a
particular address space by the frontend, which is roughly the same as a
`static` local variable. To properly lower it in the backend, the
compiler needs to check all its uses. Enabling ThinLTO currently breaks
this when a function containing a `__shared__` variable is imported from
another module. Even though the global variable is imported along with
its associated function, and the function is privatized by the
`EliminateAvailableExternallyPass`, the global variable itself is not.

It's safe to privatize such global variables, because they're _local_ to
their associated functions. If the function itself is privatized, its
associated global variables should also be privatized accordingly.
2025-06-17 19:58:24 -04:00
Andrei Safronov
c21a4c6c43 [Xtensa] Implement Xtensa Interrupt/Exception/Debug Options. (#143820)
Implement Xtensa Interrupt, HighInterrupts, Exception, Debug Options.
Also implement small Xtensa Options like PRID, Coprocessor and Timers.
2025-06-18 02:57:47 +03:00
Eli Friedman
f2d2c99866 [clang] Remove separate evaluation step for static class member init. (#142713)
We already evaluate the initializers for all global variables, as
required by the standard. Leverage that evaluation instead of trying to
separately validate static class members.

This has a few benefits:

- Improved diagnostics; we now get notes explaining what failed to
evaluate.
- Improved correctness: is_constant_evaluated is handled correctly.

The behavior follows the proposed resolution for CWG1721.

Fixes #88462. Fixes #99680.
2025-06-17 16:43:55 -07:00
Arthur Eubanks
b164d3613a [gn build] Port 628274dadf 2025-06-17 23:42:47 +00:00
Arthur Eubanks
6652961ae5 [gn build] Manually port 556e69b7 2025-06-17 23:41:29 +00:00
Arthur Eubanks
535291409c [gn build] Port 9ec75a50bc 2025-06-17 23:41:29 +00:00
Arthur Eubanks
a871b919ed [gn build] Port 9e0186d925 2025-06-17 23:41:28 +00:00
Sterling-Augustine
628274dadf [NFC] Extract Printing portions of DWARFCFIProgram to new files (#143762)
CFIPrograms' most common uses are within debug frames, but it is not
their only use. For example, some assembly writers encode them by hand
into .cfi_escape directives. This PR extracts printing code for them
into its own files, which avoids the need for the main class to depend
on DWARFUnit, sections, and similar.

One in a series of NFC DebugInfo/DWARF refactoring changes to layer it
more cleanly, so that binary CFI parsing can be used from low-level
code (such as byte strings created via .cfi_escape) without circular
dependencies. The final goal is to make a more limited dwarf library
usable from lower-level code.

More information can be found at
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665
2025-06-17 16:35:47 -07:00
Matt Arsenault
a9811340b7 AMDGPU: Report special input intrinsics as free (#141948) 2025-06-18 08:24:58 +09:00
Craig Topper
f3af1cd08c [RISCV] Set the exact flag on the SRL created for converting vscale to a read of vlenb. (#144571)
We know that vlenb is a multiple of RVVBytesPerBlock so we aren't
shifting out any non-zero bits.
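
A small sketch of what "exact" means here, assuming a block size of 8 bytes (so the shift amount is 3). The constant and helper names are hypothetical and only model the arithmetic, not the SelectionDAG code.

```cpp
#include <cstdint>

// Assumption: one RVV block is 8 bytes, so vscale = vlenb / 8 = vlenb >> 3.
constexpr unsigned kLog2BytesPerBlock = 3;

constexpr std::uint64_t vscaleFromVlenb(std::uint64_t vlenb) {
  return vlenb >> kLog2BytesPerBlock;
}

// A right shift is exact when no set bits are shifted out, i.e. shifting the
// result back left recovers the original value.
constexpr bool shiftIsExact(std::uint64_t vlenb) {
  return (vscaleFromVlenb(vlenb) << kLog2BytesPerBlock) == vlenb;
}
```

Any `vlenb` that is a multiple of the block size satisfies `shiftIsExact`, which is the property the flag records for later folds.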
2025-06-17 16:24:50 -07:00
Matt Arsenault
f08474ab1f AMDGPU: Add baseline cost model tests for special argument intrinsics (#141947) 2025-06-18 08:21:55 +09:00
Matt Arsenault
54015f36c6 AMDGPU: Cost model for minimumnum/maximumnum (#141946) 2025-06-18 08:19:06 +09:00