Commit Graph

541984 Commits

Author SHA1 Message Date
Kazu Hirata
d4d37d8430 [BOLT] Remove a redundant call to std::unique_ptr<T>::get (NFC) (#145211) 2025-06-23 18:04:19 -07:00
Jim Lin
2f9c97c030 [RISCV] Add Andes AX45MPV processor definition (#145267)
Andes AX45MPV is a 64-bit, in-order, dual-issue, 8-stage-pipeline,
Linux-capable CPU implementing the RV64IMAFDCV ISA extensions. It is
developed by Andes Technology https://www.andestech.com, a RISC-V IP
provider.

Overview of the AX45MPV:
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-ax45mpv/

The scheduling model for the RVV extension will be implemented in a follow-up PR.
2025-06-24 08:57:55 +08:00
Wenju He
9d570d568b [ValueTracking] Return true for AddrSpaceCast in canCreateUndefOrPoison (#144686)
In our downstream GPU target, the following IR is valid before instcombine,
although the second addrspacecast causes UB.
  define i1 @test(ptr addrspace(1) noundef %v) {
    %0 = addrspacecast ptr addrspace(1) %v to ptr addrspace(4)
    %1 = call i32 @llvm.xxxx.isaddr.shared(ptr addrspace(4) %0)
    %2 = icmp eq i32 %1, 0
    %3 = addrspacecast ptr addrspace(4) %0 to ptr addrspace(3)
    %4 = select i1 %2, ptr addrspace(3) null, ptr addrspace(3) %3
    %5 = icmp eq ptr addrspace(3) %4, null
    ret i1 %5
  }
We have a custom optimization that replaces an invalid addrspacecast with
poison, and the IR remains valid since `select` stops poison propagation.

However, the instcombine pass rewrites the `select` into an `or`:
    %0 = addrspacecast ptr addrspace(1) %v to ptr addrspace(4)
    %1 = call i32 @llvm.xxxx.isaddr.shared(ptr addrspace(4) %0)
    %2 = icmp eq i32 %1, 0
    %3 = addrspacecast ptr addrspace(1) %v to ptr addrspace(3)
    %4 = icmp eq ptr addrspace(3) %3, null
    %5 = or i1 %2, %4
    ret i1 %5
The transform is invalid for our target.
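
As an illustration only (not part of the patch), here is a toy Python model of the poison rules involved, assuming the usual LLVM behaviour that `select` propagates poison only from its condition and the chosen arm, while `or` propagates poison from either operand. `POISON`, `select`, and `or_` are hypothetical names for this sketch, not LLVM APIs.

```python
# Toy model of LLVM poison semantics (illustrative only, not LLVM code).
POISON = object()  # sentinel standing in for a poison value


def select(cond, if_true, if_false):
    """select: poison only if the condition or the chosen arm is poison."""
    if cond is POISON:
        return POISON
    return if_true if cond else if_false


def or_(a, b):
    """or: poison if either operand is poison."""
    if a is POISON or b is POISON:
        return POISON
    return a or b


# Case from the message: the pointer is not in the shared address space,
# so %2 is true and the second addrspacecast has been replaced by poison.
not_shared = True
cast_is_null = POISON  # stands for `icmp eq (poison), null`

before = select(not_shared, True, cast_is_null)  # original select form
after = or_(not_shared, cast_is_null)            # form after instcombine
```

In this model `before` is a defined `True`, while `after` is poison, which is the difference the message describes.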

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2025-06-24 08:43:47 +08:00
Anshil Gandhi
a314ac4d22 [Reland][InstCombine] Iterative replacement in PtrReplacer (#145410)
This patch enhances the PtrReplacer as follows:

1. Users are now collected iteratively to be gentle on the call stack.
PHIs with incoming values that have not yet been visited are pushed
back onto the stack for reconsideration.
2. Replace users of the pointer root in a reverse-postorder traversal,
instead of a simple traversal over the collected users. This reordering
ensures that the uses of an instruction are replaced before replacing
the instruction itself.
3. During the replacement of PHI, use the same incoming value if it does
not have a replacement.

This patch specifically fixes the case when an incoming value of a PHI
is addrspacecasted.

This reland PR includes a fix for an assertion failure caused by
https://github.com/llvm/llvm-project/pull/137215, which was reverted.
The failing test involved a phi and gep depending on each other, in
which case the PtrReplacer did not order them correctly for replacement.
This patch fixes it by adding a check during the definition of
`PostOrderWorklist`.
2025-06-23 20:35:40 -04:00
Finn Plummer
310a62f88a [HLSL][RootSignature] Plug-in serialization and add full sample testcase (#144769)
This PR extends `dumpRootElements` to invoke the print methods of all
`RootElement`s now that they are all implemented.

Extends the `RootSignatures-AST.hlsl` testcase to have a root element of
each type parsed, constructed into the in-memory representation, and
then dumped as part of the AST dump.

- Update `HLSLRootSignatureUtils.cpp` to extend `dumpRootElements`
- Extend `AST/HLSL/RootSignatures-AST.hlsl` testcase
- Define the helper `operator<<` for `RootElement`
- Correct the output of `numDescriptors` to be `unbounded`
in the special case

Resolves https://github.com/llvm/llvm-project/issues/124595.
2025-06-23 17:19:12 -07:00
Maksim Levental
a2aa812a31 [mlir][python] bind block predecessors and successors (#145116)
bind `block.getSuccessor` and `block.getPredecessors`.
2025-06-23 19:59:03 -04:00
sribee8
bc5e5c0114 [libc] wcpncpy implementation (#145430)
Implemented wcpncpy and tests.

---------

Co-authored-by: Sriya Pratipati <sriyap@google.com>
2025-06-23 23:35:28 +00:00
sribee8
10d46cf0d5 [libc] mbtowc implementation (#145405)
Implemented mbtowc and tests for the function.

---------

Co-authored-by: Sriya Pratipati <sriyap@google.com>
2025-06-23 23:25:13 +00:00
Prajwal Nadig
23b66a68f1 [ExtractAPI] Include virtual keyword for methods (#145412)
This information was being left out of the symbol graph.

rdar://131780883
2025-06-23 17:10:43 -06:00
Paul Osmialowski
4b9f7cd856 [flang] flang manpage overhaul (#144948)
Make the flang man page look more like clang's.
2025-06-24 00:07:10 +01:00
Shubh Pachchigar
98e8ef2273 [libc] Fix broken links in libc (#145199)
This PR fixes broken links in all files describing libc usage modes.
Please let me know if there are any other places that need updating.

---------

Co-authored-by: shubhp@perlmutter <shubhp@perlmutter.com>
2025-06-23 15:51:43 -07:00
S. VenkataKeerthy
d37325ea95 Revert "[MLGO][IR2Vec] Integrating IR2Vec with MLInliner (#143479)" (#145418)
This reverts commit af2c06ecd6 because it
causes a lit test failure (Transforms/Inline/ML/interactive-mode.ll).
2025-06-24 00:48:16 +02:00
Chelsea Cassanova
92a7f6fbbe [lldb][rpc] Fix bug in convert script for RPC (#145419)
In the script that's used by RPC to convert LLDB headers to LLDB RPC
headers, there's a bug with how it converts namespace usage. An
overeager regex pattern caused *all* text before any `lldb::` namespace
usage to get replaced with `lldb_rpc::` instead of just the namespace
itself. This commit changes that regex pattern to be less overeager and
modifies one of the shell tests for this script to actually check that
the namespace usage replacement is working correctly.
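
To make the failure mode concrete, here is a minimal sketch in Python. Both regex patterns are hypothetical stand-ins chosen to reproduce the described behaviour; they are not the patterns used in the actual convert script.

```python
import re

line = "const lldb::SBTarget &target, lldb::addr_t addr"

# Hypothetical overeager pattern: the leading `.*` swallows everything
# before the last `lldb::` on the line.
overeager = re.sub(r".*lldb::", "lldb_rpc::", line)

# Hypothetical narrower pattern: match only the namespace qualifier itself.
narrow = re.sub(r"\blldb::", "lldb_rpc::", line)
```

With the overeager pattern, the text preceding the namespace is lost; the narrow pattern rewrites each `lldb::` in place and leaves the rest of the line alone.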

rdar://154126268
2025-06-23 15:46:12 -07:00
Aaron St George
3782eb60f8 [mlir][TilingInterface] NFC Improve comment for tiledAndFusedOps member of SCFTileAndFuseResult (#145397)
The comment was a little unclear; hopefully this change is an improvement.
2025-06-23 15:08:42 -07:00
Tex Riddell
509fb931b4 Fix min_vec_size.ll test for changes in vector-combine (#145392)
Running the `vector-combine` pass on this test now produces a single
shuffle on a loaded `<1 x float>` instead of an insert into a `<2 x
float>` followed by a shuffle.

This test change matches changes in other tests in PR #144690, which
introduced the optimization.
2025-06-23 14:53:57 -07:00
Skrai Pardus
a45fda6aeb switch type and value ordering for arith Constant[XX]Op (#144636)
This change standardizes the order of the parameters for the
`Constant[XXX]Op` builders to match all other `Op` `build()` constructors.

In all generated code for the MLIR dialects' Ops (that is,
TableGen using the .td files to create the .h.inc/.cpp.inc files),
the desired result type is always specified before the value.

Examples: 
```
// ArithOps.h.inc
class ConstantOp : public ::mlir::Op<ConstantOp, ::mlir::OpTrait::ZeroRegions, ::mlir::OpTrait::OneResult, ::mlir::OpTrait::OneTypedResult<::mlir::Type>::Impl, ::mlir::OpTrait::ZeroSuccessors, ::mlir::OpTrait::ZeroOperands, ::mlir::OpTrait::OpInvariants, ::mlir::BytecodeOpInterface::Trait, ::mlir::OpTrait::ConstantLike, ::mlir::ConditionallySpeculatable::Trait, ::mlir::OpTrait::AlwaysSpeculatableImplTrait, ::mlir::MemoryEffectOpInterface::Trait, ::mlir::OpAsmOpInterface::Trait, ::mlir::InferIntRangeInterface::Trait, ::mlir::InferTypeOpInterface::Trait> {
public:
....
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type result, ::mlir::TypedAttr value);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypedAttr value);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::TypedAttr value);
  static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
...
```
```
// ArithOps.h.inc
class SubIOp : public ::mlir::Op<SubIOp, ::mlir::OpTrait::ZeroRegions, ::mlir::OpTrait::OneResult, ::mlir::OpTrait::OneTypedResult<::mlir::Type>::Impl, ::mlir::OpTrait::ZeroSuccessors, ::mlir::OpTrait::NOperands<2>::Impl, ::mlir::OpTrait::OpInvariants, ::mlir::BytecodeOpInterface::Trait, ::mlir::ConditionallySpeculatable::Trait, ::mlir::OpTrait::AlwaysSpeculatableImplTrait, ::mlir::MemoryEffectOpInterface::Trait, ::mlir::InferIntRangeInterface::Trait, ::mlir::arith::ArithIntegerOverflowFlagsInterface::Trait, ::mlir::OpTrait::SameOperandsAndResultType, ::mlir::VectorUnrollOpInterface::Trait, ::mlir::OpTrait::Elementwise, ::mlir::OpTrait::Scalarizable, ::mlir::OpTrait::Vectorizable, ::mlir::OpTrait::Tensorizable, ::mlir::InferTypeOpInterface::Trait> {
public:
...
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type result, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlagsAttr overflowFlags);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlagsAttr overflowFlags);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlagsAttr overflowFlags);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type result, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlags overflowFlags = ::mlir::arith::IntegerOverflowFlags::none);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlags overflowFlags = ::mlir::arith::IntegerOverflowFlags::none);
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlags overflowFlags = ::mlir::arith::IntegerOverflowFlags::none);
  static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
  static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
...
```
In comparison, in the distinct case of `ConstantIntOp` and
`ConstantFloatOp`, the ordering of the result type and the value is
switched.

Thus, this PR corrects the ordering of the aforementioned
`Constant[XXX]Ops` to match with other constructors.
2025-06-23 23:35:50 +02:00
Craig Topper
97ad0f4b3d [DAGCombiner][RISCV] Don't propagate the exact flag from udiv/sdiv to urem/srem. (#145387)
If we simplify a udiv/sdiv using the exact flag, we shouldn't
propagate that simplification to any urem/srem that happens to
use the same operands. If the exact flag is wrong, the udiv/sdiv
will produce poison, but that doesn't mean we can make the urem/srem
simplify to 0.
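
A small worked example of the reasoning, written as a Python sketch rather than DAGCombiner code. `udiv_exact` and `urem` are hypothetical helpers, and `None` stands in for poison.

```python
def urem(x: int, y: int) -> int:
    """Unsigned remainder; always well defined for y != 0."""
    return x % y


def udiv_exact(x: int, y: int):
    """Model of `udiv exact`: returns None (poison) if the division is inexact."""
    return x // y if x % y == 0 else None


x, y = 12, 8
quotient = udiv_exact(x, y)   # None: the exact flag was wrong, so poison
remainder = urem(x, y)        # 4: still a defined value, must not fold to 0
```

The remainder is a well-defined 4 even though the exact division is poison, so folding the urem to 0 based on the udiv's flag would change the program's result.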
    
Fixes #145360.
2025-06-23 14:29:17 -07:00
Kazu Hirata
06d78ba953 [lldb] Fix warnings
This patch fixes:

  third-party/unittest/googletest/include/gtest/gtest.h:1379:11:
  error: comparison of integers of different signs: 'const unsigned
  long' and 'const int' [-Werror,-Wsign-compare]
2025-06-23 14:27:56 -07:00
Florian Mayer
61a969b867 Revert "[MSAN] handle assorted AVX permutations" (#145404)
Rolling back while investigating an issue that might be caused by this.
2025-06-23 14:15:39 -07:00
S. VenkataKeerthy
af2c06ecd6 [MLGO][IR2Vec] Integrating IR2Vec with MLInliner (#143479)
Changes to use Symbolic embeddings in MLInliner. 

(Fixes #141836, Tracking issue - #141817)
2025-06-23 14:07:45 -07:00
Jonas Devlieghere
329ae868cb Revert "[Modules] Record whether VarDecl initializers contain side effects" (#145407)
Reverts llvm/llvm-project#143739 because it triggers an assert:

```
Assertion failed: (!isNull() && "Cannot retrieve a NULL type pointer"), function getCommonPtr, file Type.h, line 952.
```
2025-06-23 16:01:58 -05:00
Lei Huang
d715ecba79 Revert "[flang][fir] Add fir.if -> scf.if and add filecheck test … (#142965)" (#145345)
This reverts commit 823750d873.

The test causes a segfault on the AIX flang builder.
2025-06-23 16:46:47 -04:00
Henrich Lauko
179d724867 [CIR] Clean up enum attributes (#144999)
This mirrors incubator changes from https://github.com/llvm/clangir/pull/1678

- Creates CIR-specific EnumAttr bases and prefixes enum attributes with CIR_, which automatically puts the enum in the cir namespace

- Removes unnecessary enum case definitions

- Unifies naming of enum values to use capitals consistently and makes enumerations start from 0
2025-06-23 22:24:29 +02:00
Rahul Joshi
6cf656eca7 [NFC][Clang][AST] Drop llvm:: in front of ArrayRef/MutableArrayRef (#145207) 2025-06-23 13:10:42 -07:00
Andres-Salamanca
66214410c4 [CIR] Add support for DumpRecordLayouts (#145058)
This PR adds support for the `-fdump-record-layouts` flag.
2025-06-23 14:58:28 -05:00
Shay Kleiman
5f74d9bb62 [mlir][linalg] Add support for inlined const to isaFillOpInterface (#144870) 2025-06-23 22:53:41 +03:00
Maksim Levental
653d0d0073 [mlir][python] add MLIR_BINDINGS_PYTHON_INSTALL_PREFIX to make bindings install dir configurable (#124878)
This PR parameterizes the install directory of the MLIR Python bindings in the final distribution.
2025-06-23 15:49:01 -04:00
MaheshRavishankar
7bc956d3d6 [mlir][PartialReductionTilingInterface] Add support for ReductionTilingStrategy::PartialReductionOuterParallel in tileUsingSCF. (#143988)
Following up from https://github.com/llvm/llvm-project/pull/143467,
this PR adds support for
`ReductionTilingStrategy::PartialReductionOuterParallel` to
`tileUsingSCF`. The implementation of
`PartialReductionTilingInterface` for `Linalg` ops has been updated to
support this strategy as well. This brings `tileUsingSCF` on
par with `linalg::tileReductionUsingForall`, which will be deprecated
subsequently.

Changes summary
- `PartialReductionTilingInterface` changes:
  - The `tileToPartialReduction` method needs the induction
    variables of the generated tile loops. This keeps the
    generated code similar to `linalg::tileReductionUsingForall`,
    specifically to create a simplified access for slicing the
    intermediate partial results tensor when tiled in `num_threads` mode.
  - The `getPartialResultTilePosition` method needs the induction
    variables of the generated tile loops for the same reason,
    and also needs the `tilingStrategy` to be passed in to generate
    correct code.

The tests in `transform-tile-reduction.mlir` testing the
`linalg::tileReductionUsingForall` have been moved over to test
`scf::tileUsingSCF` with
`ReductionTilingStrategy::PartialReductionOuterParallel`
strategy. Some of the tests that did further cyclic distribution
of the transformed code from tiling have been removed. Those seem like two
separate transformations that were merged into one. Ideally, that would
happen when resolving the `scf.forall` rather than during
tiling.

Please review only the top commit. Depends on
https://github.com/llvm/llvm-project/pull/143467

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2025-06-23 12:27:26 -07:00
Vitor Sessak
6c232f440f [CUDA] Add missing intrinsics to cuda headers, take #2 (#144851)
LLVM prevents the sm_32_intrinsics.hpp header from being included with a
#define SM_32_INTRINSICS_HPP. It also provides drop-in replacements of
the functions defined in the CUDA header.

One issue is that some intrinsics were added after the replacement was
written, and thus have no replacement, breaking code that calls them
(Raft is one example).

This commit backports the code from sm_32_intrinsics.hpp for the missing
intrinsics.

This is the second try after PR #143664 broke tests.
2025-06-23 12:24:04 -07:00
Florian Hahn
5d01697ec6 [LAA] Be more careful when evaluating AddRecs at symbolic max BTC. (#128061)
Evaluating AR at the symbolic max BTC may wrap, producing an expression
that is less than the start of the AddRec (for example,
consider MaxBTC = -2).

If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get
incremented by EltSize before returning, so this effectively sets ScEnd
to unsigned max. Note that LAA separately checks that accesses cannot
wrap (52ded67249,
https://github.com/llvm/llvm-project/pull/127543), so unsigned max
represents an upper bound.
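
A quick check of the arithmetic above, as a Python sketch. The 64-bit index width (`BITS`) and the helper name are assumptions of this example, not values taken from the patch.

```python
BITS = 64  # assumed index width for this sketch
MASK = (1 << BITS) - 1  # unsigned max for that width


def sc_end_after_increment(elt_size: int) -> int:
    """Set ScEnd to -(EltSize + 1), then add EltSize, modulo 2**BITS."""
    sc_end = (-(elt_size + 1)) & MASK
    return (sc_end + elt_size) & MASK
```

For any element size, the result is `-1` modulo `2**BITS`, i.e. the unsigned maximum.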

When there is a computable backedge-taken count, we are guaranteed to
execute that number of iterations, and if any pointer would wrap it would
be UB (or the access will never be executed, so it cannot alias). This patch
includes new tests from the previous discussion that show a case where we wrap
with a BTC, but it is UB due to the pointer after the object wrapping
(in `evaluate-at-backedge-taken-count-wrapping.ll`).

When we have only a maximum backedge taken count, we instead try to use
dereferenceability information to determine if the pointer access must be in
bounds for the maximum backedge taken count.

PR: https://github.com/llvm/llvm-project/pull/128061
2025-06-23 20:23:40 +01:00
Simon Pilgrim
bf4afb08fe [CostModel] improveShuffleKindFromMask - recognise a SK_PermuteSingleSrc incorrectly tagged as SK_PermuteTwoSrc (#145352)
If a SK_PermuteTwoSrc shuffle kind's mask only references the first
operand, then treat this as SK_PermuteSingleSrc.
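
The check can be sketched in Python as follows. This is a simplified illustration, not the `improveShuffleKindFromMask` implementation; the function names are hypothetical, and `-1` is assumed to mark an undefined lane, following the usual LLVM shuffle-mask convention.

```python
def uses_only_first_operand(mask, num_src_elts):
    """True if every defined mask index selects from the first source.

    Indices in [0, num_src_elts) refer to the first operand, indices in
    [num_src_elts, 2 * num_src_elts) to the second; -1 is an undefined lane.
    """
    return all(m < num_src_elts for m in mask if m != -1)


def improve_kind(kind, mask, num_src_elts):
    """Downgrade a two-source permute to single-source when possible."""
    if kind == "SK_PermuteTwoSrc" and uses_only_first_operand(mask, num_src_elts):
        return "SK_PermuteSingleSrc"
    return kind
```

For example, a 4-element mask `[3, 2, 1, 0]` only reads the first operand, whereas `[0, 5, 2, 7]` reads both and keeps its two-source kind.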

Part of #145335
2025-06-23 20:20:47 +01:00
Andres-Salamanca
f4d31cdee3 [CIR] Add bitfield offset calculation for big-endian targets (#145067)
This PR updates the bitfield offset calculation to correctly handle
big-endian architectures.
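
As a rough illustration only: the sketch below assumes the adjustment mirrors what classic Clang codegen does for big-endian targets, namely measuring the bitfield offset from the opposite end of the storage unit. The function name and parameters are hypothetical, and the CIR implementation may differ in detail.

```python
def bitfield_offset(offset: int, size: int, storage_size: int, big_endian: bool) -> int:
    """Return the bit offset of a field within its storage unit.

    `offset` is the little-endian style offset (counted from bit 0),
    `size` is the field width in bits, and `storage_size` is the width of
    the storage unit in bits. On big-endian targets the offset is assumed
    to be counted from the other end of the storage unit.
    """
    if big_endian:
        return storage_size - (offset + size)
    return offset
```

For a 3-bit field at offset 0 in a 32-bit storage unit, this yields 0 on little-endian and 29 on big-endian under the stated assumption.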
2025-06-23 14:20:26 -05:00
Shuqi Liang
e6f98ff4a8 Fix variable naming style in PPCBoolRetToInt.cpp (#144533)
Change loop variable 'i' to 'I' to conform to LLVM coding standards.
Variable names should start with an upper case letter according to LLVM
coding guidelines.
Fixed locations:
Lines 103-104: for loop variable and usage
Lines 257-258: for loop variable and usage

Co-authored-by: Shuqi Liang <Shuqi.Liang@ibm.com>
2025-06-23 15:00:19 -04:00
qxy11
3095d3a47d [lldb] Add count for number of DWO files loaded in statistics (#144424)
## Summary
Adds new `totalLoadedDwoFileCount` and `totalDwoFileCount` counters to the
available statistics when calling "statistics dump".

1. `GetDwoFileCounts` is created, and returns a pair of ints
representing the number of loaded DWO files and the total number of DWO
files, respectively. An override is implemented for `SymbolFileDWARF`
that loops through each compile unit, adds to a counter if it's a
DWO unit, and then uses `GetDwoSymbolFile(false)` to check whether the
DWO file was already loaded/parsed.

2. In `Statistics`, use `GetSeparateDebugInfo` to sum up the total
number of loaded/parsed DWO files along with the total number of DWO
files. This is done by checking whether the DWO file was already
successfully `loaded` in the collected DWO data, adding to
`totalLoadedDwoFileCount` if so, and adding to `totalDwoFileCount` for
all CUs.
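
The counting logic in step 1 can be sketched as follows. This is a Python illustration, not the LLDB C++ code; `CompileUnit`, its fields, and `get_dwo_file_counts` are hypothetical stand-ins for the real data structures.

```python
from dataclasses import dataclass


@dataclass
class CompileUnit:
    is_dwo: bool              # whether this CU refers to a DWO unit
    dwo_loaded: bool = False  # whether its DWO file has already been parsed


def get_dwo_file_counts(units):
    """Return (loaded, total) DWO counts, matching the order in the summary."""
    total = sum(1 for cu in units if cu.is_dwo)
    loaded = sum(1 for cu in units if cu.is_dwo and cu.dwo_loaded)
    return loaded, total
```

Non-DWO compile units contribute to neither count, which matches the expectation that both counters are 0 when split-dwarf is not used.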

## Expected Behavior
- When binaries are compiled with split-dwarf and separate DWO files,
`totalLoadedDwoFileCount` would be the number of loaded DWO files and
`totalDwoFileCount` would be the total count of DWO files.
- When using a DWP file instead of separate DWO files,
`totalLoadedDwoFileCount` would be the number of parsed compile units,
while `totalDwoFileCount` would be the total number of CUs in the DWP
file. This should be similar to the counts we get from loading separate
DWO files rather than only counting whether a single DWP file was
loaded.
- When not using split-dwarf, we expect both `totalDwoFileCount` and
`totalLoadedDwoFileCount` to be 0 since no separate debug info is
loaded.

## Testing
**Manual Testing**
On an internal script that has many DWO files, `statistics dump` was
called before and after a `type lookup` command. The
`totalLoadedDwoFileCount` increased as expected after the `type lookup`.
```
(lldb) statistics dump
{
  ...
  "totalLoadedDwoFileCount": 29,
}
(lldb) type lookup folly::Optional<unsigned int>::Storage
typedef std::conditional<true, folly::Optional<unsigned int>::StorageTriviallyDestructible, folly::Optional<unsigned int>::StorageNonTriviallyDestructible>::type
typedef std::conditional<true, folly::Optional<unsigned int>::StorageTriviallyDestructible, folly::Optional<unsigned int>::StorageNonTriviallyDestructible>::type
...
(lldb) statistics dump
{
  ...
  "totalLoadedDwoFileCount": 2160,
}
```
**Unit test**
Added three unit tests that build with new "third.cpp" and "baz.cpp"
files. For tests built with the flags `-gsplit-dwarf -gpubnames`, this
generates 2 DWO files. The test then incrementally adds breakpoints
and does a type lookup, and the count should increase for each of these
as new DWO files get loaded to support them.
```
$ bin/lldb-dotest -p TestStats.py ~/llvm-sand/external/llvm-project/lldb/test/API/commands/statistics/basic/
----------------------------------------------------------------------
Ran 20 tests in 211.738s

OK (skipped=3)
```
2025-06-23 11:51:08 -07:00
Henrich Lauko
97e8266172 [CIR] Remove redundant operation trait and use AllTypesMatch instead (#144950)
This mirrors incubator changes from https://github.com/llvm/clangir/pull/1679
2025-06-23 20:42:00 +02:00
sribee8
b215c8e18f [libc] wcpcpy implementation (#144802)
Implemented wcpcpy and tests.

---------

Co-authored-by: Sriya Pratipati <sriyap@google.com>
2025-06-23 18:31:13 +00:00
MaheshRavishankar
71817856f7 [mlir][PartialReductionTilingInterface] Generalize implementation of tileUsingSCF for ReductionTilingStrategy::PartialOuterReduction. (#143467)
This is a precursor to generalizing the `tileUsingSCF` to handle
`ReductionTilingStrategy::PartialOuterParallel` strategy. This change
itself is generalizing/refactoring the current implementation that
supports only `ReductionTilingStrategy::PartialOuterReduction`.

Changes in this PR
- Move the `ReductionTilingStrategy` enum out of
  `scf::SCFTilingOptions` and make them visible to `TilingInterface`.
- `PartialTilingInterface` changes
  - Pass the `tilingStrategy` used for partial reduction to
    `tileToPartialReduction`.
  - Pass the reduction dimension along as `const
    llvm::SetVector<unsigned> &`.
- Allow `scf::SCFTilingOptions` to set the reduction dimensions that
  are to be tiled.
- Change `structured.tiled_reduction_using_for` to allow specification
  of the reduction dimensions to be partially tiled.

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2025-06-23 11:23:46 -07:00
Miguel Cárdenas
e80acd4fae [clang][nvlink-wrapper] Add support for opt-remarks command line options (#145365)
## Problem
When using `-fsave-optimization-record` with offloading, the Clang
driver passes optimization record options like
`-plugin-opt=opt-remarks-format=yaml` to `clang-nvlink-wrapper`.
However, the wrapper doesn't recognize these options, causing
compilation to fail.


## Solution
This patch adds support for the standard optimization record command
line options to `clang-nvlink-wrapper`, matching the interface provided
by LLD and gold-plugin as documented in the [LLVM Remarks
documentation](https://llvm.org/docs/Remarks.html).

## Changes
- **NVLinkOpts.td**: Added definitions for `opt-remarks-filename`,
`opt-remarks-format`, `opt-remarks-filter`, and
`opt-remarks-with-hotness` options
- **NVLinkOpts.td**: Added `plugin-opt=` aliases for these options to
match what the Clang driver sends
- **ClangNVLinkWrapper.cpp**: Updated `createLTO()` to use command line
arguments when available, falling back to existing global variables

## Testing
The fix allows `-fsave-optimization-record` to work correctly with
offloading, generating optimization records during the LTO phase without
throwing unknown argument errors.

This change maintains backward compatibility and follows the existing
pattern used by other LLVM linkers.
2025-06-23 13:15:04 -05:00
Farzon Lotfi
0f173a0f9a [DirectX] make firstbitlow intrinsic use first argument instead of return for overload type (#145350)
Fixes #144966.
Easy fix: just add `dx_firstbitlow` to
`DirectXTTIImpl::isTargetIntrinsicWithOverloadTypeAtArg`.
Zyn
ff865b639a [lldb] Fix SBMemoryRegionInfoListExtensions iter to yield unique refe… (#144815) 2025-06-23 13:02:51 -05:00
Alex MacLean
7ce76e1ad1 [NVPTX] Rename register classes after float register removal (NFC) (#145255) 2025-06-23 10:53:36 -07:00
Sam Elliott
43ae009a9b [RISCV] Make All VType Parts Optional (#144971)
This matches the current binutils behaviour, and the original ratified
spec that states that LMUL=1 when the lmul operand is omitted. A variety
of previously invalid vtype instructions are now accepted by the
assembler.

To match binutils, at least one of the vtype operands must be provided.

Also fixes the MCOperandPredicate definition in VTypeIOp, which had one
logic issue and one syntax issue. This is not used by the MC layer
currently, so this change should not affect the existing implementation.

Fixes #143744
2025-06-23 10:50:17 -07:00
Sam Elliott
a6eb5eee38 [RISCV][NFC] Remove hasStdExtCOrZca (#145139)
As of 20b5728b7b, C always enables Zca, so
the check `C || Zca` is equivalent to just checking for `Zca`.

This replaces any uses of `HasStdExtCOrZca` with a new `HasStdExtZca`
(with the same assembler description, to avoid changes in error
messages), and simplifies everywhere where C++ needed to check for
either C or Zca.

The Subtarget function is just deprecated for the moment.
2025-06-23 10:49:47 -07:00
LLVM GN Syncbot
f1c1456b91 [gn build] Port c594f6e697 2025-06-23 17:43:22 +00:00
Jonas Devlieghere
e391301e0e [lldb] Use proc instead of pro to avoid command ambiguity
Use `proc` instead of `pro` to avoid ambiguity between the `process` and
`protocol-server` command.
2025-06-23 10:35:48 -07:00
Craig Topper
ab17ff0562 [RISCV] Add Zvfh tests for vp.splice. NFC 2025-06-23 10:23:57 -07:00
Craig Topper
53edba8091 [RISCV] Add vp.reverse tests for Zvfh and fractional lmuls. NFC 2025-06-23 10:23:51 -07:00
Wael Yehia
735d721de4 [PowerPC] Fix handling of undefs in the PPC::isSplatShuffleMask query (#145149)
Currently, the query assumes that a single undef byte implies the rest of
the `EltSize - 1` bytes are undefs, but that's not always true.
e.g. isSplatShuffleMask(
<0,1,2,3,4,5,6,7,undef,undef,undef,undef,0,1,2,3>, 8) should return
false.
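
To illustrate why undefs have to be checked byte by byte, here is a simplified Python sketch. It is not the PowerPC backend code: `is_splat_shuffle_mask` and `UNDEF` are hypothetical, and the rule it implements (every defined byte must sit at the same offset within one aligned source element) is an assumed simplification of the real query.

```python
UNDEF = -1  # marker for an undefined mask byte


def is_splat_shuffle_mask(mask, elt_size):
    """Simplified byte-wise splat check (not the PowerPC backend code).

    Every defined byte at offset `i` within its element must equal
    `base + i`, where `base` is shared by all elements and aligned to
    `elt_size`. Undefined bytes are skipped individually.
    """
    base = None
    for pos, byte in enumerate(mask):
        if byte == UNDEF:
            continue
        candidate = byte - (pos % elt_size)
        if base is None:
            base = candidate
        elif candidate != base:
            return False
    return base is None or (base >= 0 and base % elt_size == 0)


U = UNDEF
bad = [0, 1, 2, 3, 4, 5, 6, 7, U, U, U, U, 0, 1, 2, 3]
good = [0, 1, 2, 3, 4, 5, 6, 7, U, U, U, U, 4, 5, 6, 7]
```

The `bad` mask is the one from the message: its second element starts with undef bytes, but the defined bytes `0,1,2,3` sit where `4,5,6,7` would be required, so it is rejected.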

---------

Co-authored-by: Wael Yehia <wyehia@ca.ibm.com>
2025-06-23 13:22:33 -04:00
Henrik G. Olsson
319a51a5ff [Modules] Record whether VarDecl initializers contain side effects (#143739)
Calling `DeclMustBeEmitted` should not lead to more deserialization, as
it may occur before previous deserialization has finished.
When passed a `VarDecl` with an initializer however, `DeclMustBeEmitted`
needs to know whether that initializer contains side effects. When the
`VarDecl` is deserialized but the initializer is not, this triggers
deserialization of the initializer. To avoid this we add a bit to the
serialization format for `VarDecl`s, indicating whether its initializer
contains side effects or not, so that the `ASTReader` can query this
information directly without deserializing the initializer.

rdar://153085264
2025-06-23 10:16:31 -07:00
Timm Baeder
836ff367d0 [clang][bytecode] Fix IntegralAP::{isMin,isMax} (#145339)
We need to take signedness into account here.
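
A minimal sketch of why signedness matters, in Python rather than the bytecode interpreter's C++. The helpers `is_min` and `is_max` are hypothetical and take the bit width and signedness explicitly.

```python
def is_min(value: int, bits: int, signed: bool) -> bool:
    """True if `value` is the smallest value of a `bits`-wide integer."""
    lowest = -(1 << (bits - 1)) if signed else 0
    return value == lowest


def is_max(value: int, bits: int, signed: bool) -> bool:
    """True if `value` is the largest value of a `bits`-wide integer."""
    highest = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    return value == highest
```

For 8 bits, 127 is the maximum only for the signed type and 0 is the minimum only for the unsigned type, so a check that ignores signedness gives the wrong answer for one of them.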
2025-06-23 19:11:17 +02:00