Commit Graph

517431 Commits

Author SHA1 Message Date
Jacob Bramley
16cd5cdf4d [BOLT] Ignore AArch64 markers outside their sections. (#74106)
AArch64 uses $d and $x symbols to delimit data embedded in code.
However, sometimes we see $d symbols, typically in .eh_frame, with
addresses that belong to different sections. These occasionally fall
inside .text functions and cause BOLT to stop disassembling, which in
turn causes DWARF CFA processing to fail.

As a workaround, we just ignore symbols with addresses outside the
section they belong to. This behaviour is consistent with objdump and
similar tools.
2024-11-07 15:16:14 +03:00
Aaron Ballman
3d0b283dcd [C2y] Add test coverage for WG14 N3370 (#115054)
This paper added case ranges in switch statements, which is a GNU
extension Clang has supported since at least Clang 3.0.

It updates the diagnostics to no longer call this a GNU extension except
in C++ mode.
2024-11-07 06:53:29 -05:00
Valery Pykhtin
9470945b66 [CalcSpillWeights] Simplify copy hint register collection. NFC. (#114236)
CopyHints set has been collecting duplicates of a register with
increasing weight and then deduplicated with HintedRegs set. Let's stop
collecting duplicates at the first place.
2024-11-07 12:52:08 +01:00
Jacek Caban
dafbc97594 [LLD][COFF] Append a terminator entry to redirection metadata (#115202)
For MSVC compatibility.
2024-11-07 12:44:45 +01:00
Ramkumar Ramachandra
fef6613e9f ValueTracking: simplify udiv/urem recurrences (#108973)
A urem recurrence has the property that the result can never exceed the
start value. A udiv recurrence has the property that the result can
never exceed either the start value or the numerator, whichever is
greater. Implement a simplification based on these properties.
2024-11-07 11:41:35 +00:00
Ramkumar Ramachandra
abe0cd4621 ValueTracking: pre-commit udiv/urem recurrence tests (#109198) 2024-11-07 11:36:52 +00:00
wanglei
d87dbcbf13 [LoongArch] Reuse GPRRegisterClass to shorten some code in LoongArchRegisterInfo.td. NFC 2024-11-07 19:30:35 +08:00
Nikita Popov
f43ef53dd2 [Mem2Reg] Regenerate test checks (NFC)
Switch to FileCheck and use UTC.
2024-11-07 12:26:52 +01:00
Nikita Popov
4fa1e8f970 [gold] Fix test after pipeline change
After fbd89bcc66 we're not running
FunctionAttrs at O1, so adjust the test expectation accordingly.
2024-11-07 12:25:20 +01:00
Oliver Stannard
9f02950a15 [ARM] Allow spilling FPSCR for MVE adc/sbc intrinsics (#115174)
The MVE VADC and VSBC instructions read and write a carry bit in FPSCR,
which is exposed through the intrinsics. This makes it possible to write
code which has the FPSCR live across a function call, or which uses the
same value twice, so it needs to be possible to spill and reload it.

There is a missed optimisation in one of the test cases, where we reload
the FPSCR from the stack despite it still being live, I've not found a
simple way to prevent the register allocator from doing this.
2024-11-07 11:23:49 +00:00
JaydeepChauhan14
dd98ae358b Test added for x86-instr-mapping (#115170) 2024-11-07 19:09:21 +08:00
Ilia Kuklin
1361c19c04 [lldb] Index static const members of classes, structs and unions as global variables in DWARF 4 and earlier (#111859)
In DWARF 4 and earlier `static const` members of structs, classes and
unions have an entry tag `DW_TAG_member`, and are also tagged as
`DW_AT_declaration`, but otherwise follow the same rules as
`DW_TAG_variable`.
2024-11-07 16:06:03 +05:00
JoelWee
0c0d7a6ec7 [MLIR] Fix bazel after 2f743ac 2024-11-07 10:51:00 +00:00
Sjoerd Meijer
6720ce75f6 [Docs][llvm-exegesis] Clarify AArch64 support (#114989)
Claiming AArch64 support for llvm-exegesis is a bit of a stretch in my
opinion as only a couple of opcodes with GPR64 operands will work for
snippet benchmarking, so I propose to clarify that AArch64 support is
very experimental. Also added some clarifications about its libpfm4
dependency.
2024-11-07 10:48:52 +00:00
Simon Pilgrim
490e58a98e Fix MSVC "not all control paths return a value" warning. NFC 2024-11-07 10:22:05 +00:00
simpal01
f9fecab1fd Add -mno-unaligned-access and -mbig-endian to ARM and AArch64 multilib flags (#114782)
This adds -mno-unaligned-access and -mbig-endian command line
options to the set of flags used by the multilib selection for ARM and
AArch64 targets.
2024-11-07 09:54:41 +00:00
Durgadoss R
1b01064faa [NVPTX] Add TMA bulk tensor copy intrinsics (#96083)
This patch adds NVVM intrinsics and NVPTX codegen for:
* cp.async.bulk.tensor.S2G.1D -> 5D variants, supporting both Tile and
   Im2Col modes. These intrinsics optionally support cache_hints as
   indicated by the boolean flag argument.
* cp.async.bulk.tensor.G2S.1D -> 5D variants, with support for both Tile
  and Im2Col modes. The Im2Col variants have an extra set of offsets as
  parameters. These intrinsics optionally support multicast and cache_hints,
  as indicated by the boolean arguments at the end of the intrinsics.
* The backend looks through these flag arguments and lowers to the
   appropriate PTX instruction.
* Lit tests are added for all combinations of these intrinsics in
  cp-async-bulk-tensor-g2s/s2g.ll.
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst file.
* PTX Spec reference:
  https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async-bulk-tensor

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2024-11-07 15:21:53 +05:30
Nikita Popov
2d7f34f2a5 [ValueTracking] Don't special case depth for phi of select (#114996)
As discussed on
https://github.com/llvm/llvm-project/pull/114689#pullrequestreview-2411822612
and following, there is no principled reason why the phi of select case
should have a different recursion limit than the general case. There may
still be fan-out, and there may still be indirect recursion. Revert that
part of #113707.
2024-11-07 10:14:28 +01:00
abhishek-kaushik22
d2aff182d3 Revert "TLS loads opimization (hoist)" (#114740)
This reverts commit c31014322c.

Based on the discussions in #112772, this pass is not needed after the
introduction of `llvm.threadlocal.address` intrinsic.

Fixes https://github.com/llvm/llvm-project/issues/112771.
2024-11-07 10:10:28 +01:00
serge-sans-paille
5f342816ef [llvm] Use computeConstantRange to improve llvm.objectsize computation (#114673)
Using LazyValueInfo, it is possible to compute valuable information for
allocation functions, GEP and alloca, even in the presence of dynamic
information.

llvm.objectsize plays an important role in _FORTIFY_SOURCE definitions,
so improving its diagnostic in turns improves the security of compiled
application.

As a side note, as a result of recent optimization improvements, clang
no longer passes
https://github.com/serge-sans-paille/builtin_object_size-test-suite This
commit restores the situation and greatly improves the scope of code
handled by the static version of __builtin_object_size.
2024-11-07 09:01:14 +00:00
Pravin Jagtap
9b909b8886 [AMDGPU][NFC] Precommit tests representing agpr spills. (#115270)
Presently we are only marking implicit-def for the
spilled AGPR tuple in the first spill instructions
and not implicit.
2024-11-07 14:28:27 +05:30
Lee Wei
1469d82e1c Remove br i1 undef from some regression tests [NFC] (#115130)
As defined in LangRef, branching on `undef` is undefined behavior.
This PR aims to remove undefined behavior from tests. As UB tests break
Alive2 and may be the root cause of breaking future optimizations.

Here's an Alive2 proof for one of the examples:
https://alive2.llvm.org/ce/z/TncxhP
2024-11-07 08:11:15 +00:00
Boaz Brickner
ae5bfa0cef [clang] Output an error when [[lifetimebound]] attribute is applied on a function implicit object parameter while the function returns void (#114203)
Fixes: https://github.com/llvm/llvm-project/issues/107556
2024-11-07 09:05:46 +01:00
Yingwei Zheng
0b9f1cc024 [SCEV] Disallow simplifying phi(undef, X) to X (#115109)
See the following case:
```
@GlobIntONE = global i32 0, align 4

define ptr @src() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  %retval.0 = phi ptr [ %retval.2.peel, %entry.peel.newph ], [ %retval.2, %cleanup ]
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  %retval.2 = phi ptr [ %retval.0, %for.body ], [ @GlobIntONE, %cleanup.loopexit ]
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}

define ptr @tgt() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2.peel, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}
```
1. `simplifyInstruction(%retval.2.peel)` returns `@GlobIntONE`. Thus,
`ScalarEvolution::createNodeForPHI` returns SCEV expr `@GlobIntONE` for
`%retval.2.peel`.
2. `SimplifyIndvar::replaceIVUserWithLoopInvariant` tries to replace the
use of `%retval.2.peel` in `%retval.2.lcssa.ph` with `@GlobIntONE`.
3. `simplifyLoopAfterUnroll -> simplifyLoopIVs -> SCEVExpander::expand`
reuses `%retval.2.peel = phi ptr [ undef, %for.body.peel ], [
@GlobIntONE, %cleanup.loopexit.peel ]` to generate code for
`@GlobIntONE`. It is incorrect.

This patch disallows simplifying `phi(undef, X)` to `X` by setting
`CanUseUndef` to false.
Closes https://github.com/llvm/llvm-project/issues/114879.
2024-11-07 15:53:51 +08:00
Pengcheng Wang
3850801ca5 [RISCV] Add vcpop.m/vfirst.m to RISCVMaskedPseudosTable
We seem to forget these two instructions.

Reviewers: preames, frasercrmck, lukel97, topperc

Reviewed By: lukel97

Pull Request: https://github.com/llvm/llvm-project/pull/115162
2024-11-07 15:41:46 +08:00
Younan Zhang
adb0d8ddce [Clang] Distinguish expanding-pack-in-place cases for SubstTemplateTypeParmTypes (#114220)
In 50e5411e4, we preserved the pack substitution index within
SubstTemplateTypeParmType nodes and performed in-place expansions of
packs such that type constraints on a lambda that serve as a pattern of
a fold expression could be evaluated if the type constraints contain any
packs that are expanded by the fold expression.

However, we made an incorrect assumption of the condition under which
in-place expansion should occur. For example, a SizeOfPackExpr case
relies on SubstTemplateTypeParmType nodes being transformed to
SubstTemplateTypeParmPackTypes rather than expanding them immediately in
place.

This fixes that by adding a flag to SubstTemplateTypeParmType to
discriminate such in-place expansion situations.

Fixes https://github.com/llvm/llvm-project/issues/113518
2024-11-07 15:37:14 +08:00
Haojian Wu
9f796159f2 Add clang::lifetimebound annotation to llvm::function_ref (#115019)
This helps catch dangling llvm::function_ref references, see #114950,
#114949, #114808, #114789
2024-11-07 08:24:54 +01:00
Luke Lau
343a810725 [RISCV] Allow f16/bf16 with zvfhmin/zvfbfmin as legal strided access (#115264)
This is also split off from the zvfhmin/zvfbfmin
isLegalElementTypeForRVV work.

Enabling this will cause SLP and RISCVGatherScatterLowering to emit
@llvm.experimental.vp.strided.{load,store} intrinsics, and codegen
support for this was added in #109387 and #114750.
2024-11-07 14:40:15 +08:00
Fangrui Song
9b058bb42d [ELF] Replace errorOrWarn(...) with Err 2024-11-06 22:33:51 -08:00
Fangrui Song
f8bae3af74 [ELF] Replace warn(...) with Warn 2024-11-06 22:19:31 -08:00
Fangrui Song
09c2c5e1e9 [ELF] Replace error(...) with ErrAlways or Err
Most are migrated to ErrAlways mechanically.
In the future we should change most to Err.
2024-11-06 22:04:52 -08:00
Fangrui Song
63c6fe4a0b [ELF] Replace fatal(...) with Fatal or Err 2024-11-06 21:17:26 -08:00
vporpo
f7ef7b2ff7 [SandboxVec][Scheduler] Implement rescheduling (#115220)
This patch adds support for re-scheduling already scheduled
instructions. For now this will clear and rebuild the DAG, and will
reschedule the code using the new DAG.
2024-11-06 20:59:49 -08:00
Jeffrey Byrnes
ae6dbed594 [AMDGPU] Use correct DWord for v_dot4 S0 operand (#115224)
Fixes a copy-paste typo.

The typo resulted in producing bad v_perm based operands for the v_dot4
combine. When adding a corresponding byte pair to the v_dot byte pair
chains, we must take note of the byte position in the corresponding
source nodes. These byte positions are used to ensure we extract the
correct DWord from the ultimate source, and formulate a correct
perm_mask from the extracted DWord.

With the typo, we the S0 byte would used the DWord offset for the
corresponding S1 byte. If this offset was not the same as the true DWord
offset for the S0 byte, we would extract and use the wrong byte for S0
in the v_dot.

Fixes https://github.com/llvm/llvm-project/issues/112941
2024-11-06 20:48:20 -08:00
Luke Lau
f0e2301b7c [RISCV] Allow f16/bf16 with zvfhmin/zvfbfmin as legal interleaved access (#115257)
This is another piece split off from the work to add zvfhmin/zvfbfmin to
isLegalElementTypeForRVV.
This is needed to get InterleavedAccessPass to lower [de]interleaves to
segment load/stores.
2024-11-07 12:35:59 +08:00
Luke Lau
481ff22b8b [RISCV] Lower fixed-length vp_{gather,scatter} for zvfhmin/zvfbfmin (#115253)
This uses the same lowering as masked gathers and scatters.
2024-11-07 12:28:13 +08:00
Sergei Barannikov
3bdd71137e [TableGen][GISel] Extract helper function for constraining operands (#115148)
As a side effect, this fixes COPY_TO_REGCLASS not being constrained
if it is not top-level (the reason for changes in tests).
2024-11-07 07:16:54 +03:00
Craig Topper
da032b7903 [RISCV][GISel] Use maskedValueIsZero in RISCVInstructionSelector::selectZExtBits. (#115244) 2024-11-06 20:14:24 -08:00
Han-Kuan Chen
c6091cdbed [SLP][REVEC] Make shufflevector can be vectorized with ReorderIndices and ReuseShuffleIndices. (#114965) 2024-11-07 11:04:34 +08:00
Luke Lau
70bc12e77f [RISCV] Remove unnecessary scalar extensions from test. NFC
Now that f16 and bf16 aren't being scalarized we don't need
zfhmin/zfbfmin.
2024-11-07 10:54:02 +08:00
Richard Smith
de18fa1ace Don't redundantly specify the default template argument to BumpPtrAllocatorImpl (#114857) 2024-11-06 18:45:27 -08:00
Luke Lau
05f87b2d65 [RISCV] Lower fixed-length mload/mstore for zvfhmin/zvfbfmin (#115145)
This is the same idea as #114945.
2024-11-07 10:41:03 +08:00
Luke Lau
7cb66772e2 [RISCV] Rework fixed-length masked load/store tests. NFC
Pass in the mask and vector directly as arguments, and add tests for
zvfhmin and zvfbfmin.
2024-11-07 10:38:21 +08:00
Diego Caballero
af5c471a4d [mlir][Vector] Add vector.extract(vector.shuffle) folder (#115105)
This PR adds a folder for extracting an element from a vector shuffle.
It turns something like:

```
   %shuffle = vector.shuffle %a, %b [0, 8, 7, 15]
     : vector<8xf32>, vector<8xf32>
   %extract = vector.extract %shuffle[3] : f32 from vector<4xf32>
```

into:

```
   %extract = vector.extract %b[7] : f32 from vector<8xf32>
```
2024-11-06 18:17:12 -08:00
Valentin Clement (バレンタイン クレメン)
30d80009e5 [flang][cuda] Allow SHARED actual to DEVICE dummy (#115215)
Update the compatibility rules to allow SHARED actual argument passed to
DEVICE dummy argument. Emit a warning in that case.
2024-11-06 17:45:58 -08:00
Matt Arsenault
29a5c054e6 ValueTracking: Allow getUnderlyingObject to look at vectors (#114311)
We can identify some easy vector of pointer cases, such as
a getelementptr with a scalar base.
2024-11-06 17:14:44 -08:00
Craig Topper
7c82875866 [GISel][RISCV][AMDGPU] Add G_SHL, G_LSHR, G_ASHR to binop_left_to_zero. (#115089)
Shifting 0 by any amount is still zero.
2024-11-06 17:03:04 -08:00
Konstantin Schwarz
cbfe87c253 [GlobalISel] Remove references to rhs of shufflevector if rhs is undef (#115076) 2024-11-06 16:36:13 -08:00
Kazu Hirata
5348a30a58 [ExecutionEngine] Simplify code with DenseMap::operator[] (NFC) (#115115) 2024-11-06 16:33:34 -08:00
Kazu Hirata
84745da74c [Analysis] Fix a warning (NFC)
This patch fixes:

  third-party/unittest/googletest/include/gtest/gtest.h:1379:11:
  error: comparison of integers of different signs: 'const unsigned
  int' and 'const int' [-Werror,-Wsign-compare]
2024-11-06 16:26:27 -08:00