
11406 Commits

Author SHA1 Message Date
Rahul Joshi
bee9664970 [TableGen] Emit OpName as an enum class instead of a namespace (#125313)
- Change InstrInfoEmitter to emit OpName as an enum class
  instead of an anonymous enum in the OpName namespace.
- This will help clearly distinguish between values that are 
  OpNames vs just operand indices and should help avoid
  bugs due to confusion between the two.
- Rename OpName::OPERAND_LAST to NUM_OPERAND_NAMES.
- Emit declaration of getOperandIdx() along with the OpName
  enum so it doesn't have to be repeated in various headers.
- Also updated AMDGPU, RISCV, and WebAssembly backends
  to conform to the new definition of OpName (mostly
  mechanical changes).
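
A minimal sketch of why a scoped enum helps here. Everything in it is hypothetical: the `Demo` namespace, the operand names, the two toy opcodes, and the table contents are made up, and the real `getOperandIdx()` is generated by TableGen and may have a different signature.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a TableGen-generated header. All names and
// table contents below are made up for illustration.
namespace Demo {

enum class OpName : uint8_t {
  src0,
  src1,
  dst,
  NUM_OPERAND_NAMES, // count of real names, not a name itself
};

// A scoped enum does not convert implicitly to an integer, so an OpName
// cannot be passed by accident where an operand index is expected.
static_assert(!std::is_convertible<OpName, int>::value,
              "OpName must not decay to an operand index");

// Toy lookup from (opcode, operand name) to operand index; -1 means the
// opcode has no such operand.
inline int getOperandIdx(unsigned Opcode, OpName Name) {
  assert(Name != OpName::NUM_OPERAND_NAMES && "not a real operand name");
  static constexpr int Table[2][3] = {
      {1, 2, 0},  // opcode 0: src0 at 1, src1 at 2, dst at 0
      {1, -1, 0}, // opcode 1: src0 at 1, no src1, dst at 0
  };
  if (Opcode >= 2)
    return -1;
  return Table[Opcode][static_cast<unsigned>(Name)];
}

} // namespace Demo
```

With the older anonymous-enum form, a call such as `getOperandIdx(Opcode, 1)` would compile silently; with the scoped enum the caller has to spell out `Demo::OpName::src1`.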
2025-02-12 08:19:30 -08:00
Alex MacLean
a282b6c486 [NVPTX] Convert scalar function nvvm.annotations to attributes (#125908)
Replace some more nvvm.annotations with function attributes,
auto-upgrading the annotations as needed. These new attributes will be
more idiomatic and compile-time efficient than the annotations.

- !"maxclusterrank" / !"cluster_max_blocks" -> "nvvm.maxclusterrank"
- !"minctasm" -> "nvvm.minctasm"
- !"maxnreg" -> "nvvm.maxnreg"
2025-02-12 07:33:22 -08:00
Yingwei Zheng
34534442a8 [Docs][LangRef] Fix broken ref to pointer capture. NFC (#126910) 2025-02-12 23:14:58 +08:00
Paul Walker
563d54569e [NFC][LLVM][LangRef] Fix typos within partial.reduce.add documentation. 2025-02-12 11:51:26 +00:00
Paul Walker
01afa8fc0b [NFC][LLVM][LangRef] Improve documentation for partial.reduce.add. (#126728) 2025-02-12 11:33:24 +00:00
Stanislav Mekhanoshin
7639242155 [AMDGPU] Create new directive .amdhsa_inst_pref_size (#126622)
The field INST_PREF_SIZE is available since gfx11.
2025-02-11 08:35:45 -08:00
Benjamin Maxwell
701223ac20 [IR] Add llvm.sincospi intrinsic (#125873)
This adds the `llvm.sincospi` intrinsic, legalization, and lowering
(mostly reusing the lowering for sincos and frexp).

The `llvm.sincospi` intrinsic takes a floating-point value and returns
both the sine and cosine of the value multiplied by pi. It computes the
result more accurately than the naive approach of doing the
multiplication ahead of time, especially for large input values.

```
declare { float, float }          @llvm.sincospi.f32(float  %Val)
declare { double, double }        @llvm.sincospi.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincospi.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincospi.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincospi.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincospi.v4f32(<4 x float>  %Val)
```

Currently, the default lowering of this intrinsic relies on the
`sincospi[f|l]` functions being available in the target's runtime (e.g.
libc).
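
The accuracy point can be illustrated with a small sketch. It is not how LLVM or any libc implements `sincospi`; the function names are hypothetical, and the only technique shown is reducing the argument modulo 2 (exact with `fmod`) before multiplying by pi.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Naive approach: multiply by pi first, then call sin/cos. The rounding
// error of X * Pi grows with |X|, and sin/cos then see the wrong angle.
inline std::pair<double, double> naiveSinCosPi(double X) {
  assert(std::isfinite(X) && "sketch handles finite inputs only");
  const double Pi = 3.14159265358979323846;
  return {std::sin(X * Pi), std::cos(X * Pi)};
}

// Sketch of range reduction: sin(pi*x) and cos(pi*x) have period 2 in x,
// and fmod is exact, so the product R * Pi stays small and accurate.
inline std::pair<double, double> reducedSinCosPi(double X) {
  assert(std::isfinite(X) && "sketch handles finite inputs only");
  const double Pi = 3.14159265358979323846;
  const double R = std::fmod(X, 2.0); // exact remainder in (-2, 2)
  return {std::sin(R * Pi), std::cos(R * Pi)};
}
```

For a large even input such as `1e15`, the reduced version returns exactly `{0.0, 1.0}`, while the naive version returns whatever `sin` and `cos` produce for the already-rounded product. A production implementation would reduce further and treat special cases separately.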
2025-02-11 09:01:30 +00:00
Abhilash Majumder
6a961dc03d [NVPTX] Add intrinsics for prefetch.* (#125887)
[NVPTX] Add Prefetch intrinsics

This PR adds prefetch intrinsics with the relevant eviction priorities.
* Lit tests are added as part of prefetch.ll
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst.

For more information, refer to the PTX ISA
`<https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-prefetch-prefetchu>`_.

---------

Co-authored-by: abmajumder <abmajumder@nvidia.com>
2025-02-11 14:24:46 +05:30
Rahul Joshi
0f674cce82 [NFC][LLVM] Remove unused TargetIntrinsicInfo class (#126003)
Remove the `TargetIntrinsicInfo` class, as it's practically unused (it's
pure virtual with no subclasses), along with its references in the code.
2025-02-10 14:56:30 -08:00
Nico Weber
308d28667c [llvm][docs] Tweak backporting instructions a bit (#126519)
* Drop ".Z" in milestone name since we've been doing X.Y releases
instead of X.Y.Z releases since LLVM 18

* Add "LLVM" prefix since that's what release milestones are named

* Use a numbered list to make it clearer that there are two steps
needed, and add some more details to the first step
2025-02-10 10:58:16 -05:00
David Spickett
f845497f3b [llvm][Docs] Explain how to handle excessive formatting changes (#126239)
Based on some feedback in Discord about a PR where a reviewer asked the
author to move the formatting changes to a new PR, which appears to
contradict the current form of this document.

I've added an explanation here, before the point where the author would
be committing any of the formatting changes.

There are other ways this can go, for example some projects don't want
the churn of formatting, or you can pre-emptively send a formatting PR,
but I don't think enumerating them all here will help the audience for
this text.

So I've recommended one path that will start them off well, and can
branch off if the reviewers make requests.
2025-02-10 10:32:45 +00:00
Nikita Popov
2d31a12dbe [DSE] Don't use initializes on byval argument (#126259)
There are two ways we can fix this problem, depending on how the
semantics of byval and initializes should interact:

* Don't infer initializes on byval arguments. initializes on byval
refers to the original caller memory (or having both attributes is made
a verifier error).
* Infer initializes on byval, but don't use it in DSE. initializes on
byval refers to the callee copy. This matches the semantics of readonly
on byval. This is slightly more powerful, for example, we could do a
backend optimization where byval + initializes will allocate the full
size of byval on the stack but not copy over the parts covered by
initializes.

I went with the second variant here, skipping byval + initializes in DSE
(FunctionAttrs already doesn't propagate initializes past byval). I'm
open to going in the other direction though.

Fixes https://github.com/llvm/llvm-project/issues/126181.
2025-02-10 10:34:03 +01:00
Durgadoss R
f3040498f0 [NVPTX] Add tcgen05 wait/fence/commit intrinsics (#126091)
This patch adds intrinsics for tcgen05 wait,
fence and commit PTX instructions.

lit tests are added and verified with a
ptxas-12.8 executable.

Docs are updated in the NVPTXUsage.rst file.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-02-07 22:10:25 +05:30
Benjamin Maxwell
3fdb348243 [LangRef] Fix some formatting issues (NFC) (#126224)
Fixes codeblock and inline code formatting for the `llvm.modf.*`
intrinsic.
2025-02-07 13:35:22 +00:00
Benjamin Maxwell
4bf97aa818 [IR] Add llvm.modf intrinsic (#121948)
This adds the `llvm.modf` intrinsic, legalization, and lowering (mostly
reusing the lowering for sincos and frexp).

The `llvm.modf` intrinsic takes a floating-point value and returns both
the integral and fractional parts (as a struct).

```
declare { float, float }             @llvm.modf.f32(float  %Val)
declare { double, double }           @llvm.modf.f64(double %Val)
declare { x86_fp80, x86_fp80 }       @llvm.modf.f80(x86_fp80  %Val)
declare { fp128, fp128 }             @llvm.modf.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }     @llvm.modf.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.modf.v4f32(<4 x float>  %Val)
```

This corresponds to the libm `modf` function but returns multiple values
in a struct (rather than take output pointers), which makes it easier to
vectorize.
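
A rough C++ analogue of the "return both parts by value" shape, wrapping the libm `modf`. The struct and function names are hypothetical, and the field order is illustrative only; it is not meant to mirror the element order of the intrinsic's result.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: both parts are returned by value instead of
// writing the integral part through an output pointer.
struct ModfParts {
  float Integral;
  float Fractional;
};

inline ModfParts modfByValue(float Val) {
  float Integral = 0.0f;
  const float Fractional = std::modf(Val, &Integral);
  return {Integral, Fractional};
}

// Usage: no pointer escapes from the call, so each input maps cleanly to
// one pair of outputs.
inline void modfByValueExample() {
  const ModfParts P = modfByValue(3.25f);
  assert(P.Integral == 3.0f && P.Fractional == 0.25f);
}
```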
2025-02-07 09:25:13 +00:00
Sameer Sahasrabuddhe
a6abd0a13d [Docs] Remove outdated reference to "future work" in convergence. 2025-02-06 14:30:20 +05:30
Alex Bradbury
96d46c694d [docs] Improvements to HowToAddABuilder local test guide (#125802)
This patch makes the following improvements:
* Corrects the suggestion that `bbenv` needs to be made within an
llvm-zorg checkout.
* Gives workarounds for following the instructions on a system with
Python 3.13 (it removed some long-deprecated libraries, which causes
problems).
* Adds a note about how some builder workflows involve checking out
llvm-zorg to retrieve additional scripts and gives guidance on how you
can still make and test local changes to those scripts when that's the
case.
2025-02-05 18:32:38 +00:00
Jay Foad
b275309a4c [TableGen][Docs] Fix productionlists for assert and dump (#123739)
These were referring to nonexistent grammar tokens instead of `Value`.
2025-02-05 11:03:35 +00:00
Jay Foad
439de724fe [TableGen][Docs] Fix productionlists for SimpleValue (#123751)
Previously the grammar tokens SimpleValue2 through SimpleValue9 were
unreferenced. This ties them together so that the grammar makes more
sense.
2025-02-05 09:15:38 +00:00
Konstantin Zhuravlyov
fc4210fb6c AMDGPU/Docs: Fix target properties for gfx9-4-generic (#125593)
gfx9-4-generic has architected flat scratch, not absolute
2025-02-04 21:47:43 -05:00
Paweł Bylica
f308af757d [libfuzzer][docs] Update and clarify Output section (#125075)
Update the example output snippets in the libFuzzer documentation page.
The actual output is now slightly different from what is documented.

Improve the documentation of the output section `L:`. It now shows two
numbers.

Closes https://github.com/llvm/llvm-project/issues/42571.
2025-02-04 18:50:35 +01:00
David Spickett
4b720f88a3 [llvm][Docs] Clarify release ABI/API compatibility rules (#123049)
If the current release branch is version X, the phrase "the previous
major release." sounds to me as if it is referring to releases of X-1.
Not to the last release from the current release branch, which is what I
think it intends.

(if it meant X-1, then we could never change the ABI)
2025-02-04 10:31:58 +00:00
Durgadoss R
91cb8f5d32 [NVPTX] Add tcgen05 alloc/dealloc intrinsics (#124961)
This patch adds intrinsics for the tcgen05 alloc/dealloc
family of PTX instructions. This patch also adds an
addrspace 6 for tensor memory which is used by
these intrinsics.

lit tests are added and verified with a ptxas-12.8 executable.

Documentation for these additions is also added in NVPTXUsage.rst.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-02-04 14:31:40 +05:30
Adrian Vogelsgesang
8e6fa15bc3 [lldb-dap] Support column breakpoints (#125347)
This commit adds support for column breakpoints to lldb-dap.

To do so, support for the `breakpointLocations` request was
added. To find all available breakpoint positions, we iterate over
the line table.

The `setBreakpoints` request already forwarded the column correctly to
`SBTarget::BreakpointCreateByLocation`. However, `SourceBreakpointMap`
did not keep track of multiple breakpoints in the same line. To do so,
the `SourceBreakpointMap` is now indexed by line+column instead of by
line only.

This was previously submitted as #113787, but got reverted due to
failures on ARM and macOS. This second attempt has less strict test
case expectations. Also, I added a release note.
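
The indexing change can be sketched with a plain `std::map`. This is a simplified, hypothetical stand-in and not the actual `SourceBreakpointMap` type from lldb-dap; the string payload and helper names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Key is (line, column) rather than line alone, so two breakpoints on the
// same line no longer overwrite each other.
using LineColumn = std::pair<uint32_t, uint32_t>;
using BreakpointMap = std::map<LineColumn, std::string>;

inline BreakpointMap makeExampleBreakpoints() {
  BreakpointMap M;
  M[LineColumn(10, 5)] = "first call on line 10";
  M[LineColumn(10, 22)] = "second call on line 10";
  M[LineColumn(11, 1)] = "start of line 11";
  assert(M.size() == 3 && "same-line entries must stay distinct");
  return M;
}

// Pairs order lexicographically, so all entries for one line are adjacent.
inline std::size_t countOnLine(const BreakpointMap &M, uint32_t Line) {
  const auto First = M.lower_bound(LineColumn(Line, 0));
  const auto Last = M.lower_bound(LineColumn(Line + 1, 0));
  return static_cast<std::size_t>(std::distance(First, Last));
}
```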
2025-02-04 01:23:28 +01:00
FantasqueX
14776c6d13 [Kaleidoscope] Fix typo (#125366)
Remove duplicate word.
2025-02-01 23:33:43 +00:00
Omair Javaid
2bffa5bf7a [lldb][Windows] WoA HW Watchpoint support in LLDB (#108072)
This PR adds support for hardware watchpoints in LLDB for AArch64
Windows targets.

Windows does not provide an API to query the number of available
hardware watchpoints supported by the underlying hardware platform.
Therefore, the current implementation supports only a single hardware
watchpoint, which has been verified on Windows 11 using Microsoft
SQ2 and Snapdragon Elite X hardware.

The LLDB test suite (ninja check-lldb) still fails watchpoint-related tests.
However, tests that do not require more than a single watchpoint
pass successfully when run individually.
2025-01-31 14:11:39 +05:00
Vishakh Prakash
05f8e0806e Update SPIRVUsage.rst (#123897) 2025-01-30 09:47:04 -08:00
Jay Foad
104c2b86a5 [TableGen][Docs] Accept "code" as a Type (#124902)
Previously the Type production did not include "code", which was only
accepted in one place in the grammar:

   BodyItem: (`Type` | "code") `TokIdentifier` ["=" `Value`] ";"

However the parser implementation accepts "code" as a Type with only one
place where it is *not* allowed, corresponding to this production:

   SimpleValue9: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"

This patch changes the production for Type to include "code", thereby
fixing most occurrences of Type in the grammar, and documents the
restriction for BangOperator Types in the text instead of codifying it
in the grammar.
2025-01-30 13:17:41 +00:00
David Sherwood
3bc2dade36 [LoopVectorize] Enable vectorisation of early exit loops with live-outs (#120567)
This work feeds part of PR
https://github.com/llvm/llvm-project/pull/88385, and adds support for
vectorising loops with uncountable early exits and outside users of
loop-defined
variables. When calculating the final value from an uncountable early
exit we need to calculate the vector lane that triggered the exit,
and hence determine the value at the point we exited.

All code for calculating the last value when exiting the loop early
now lives in a new vector.early.exit block, which sits between the
middle.split block and the original exit block. Doing this required
two fixes:

1. The vplan verifier incorrectly assumed that the block containing
a definition always dominates the block of the user. That's not true
if you can arrive at the use block from multiple incoming blocks.
This is possible for early exit loops where both the early exit and
the latch jump to the same block.
2. We were adding the new vector.early.exit to the wrong parent loop.
It needs to have the same parent as the actual early exit block from
the original loop.

I've added a new ExtractFirstActive VPInstruction that extracts the
first active lane of a vector, i.e. the lane of the vector predicate
that triggered the exit.

NOTE: The IR generated for dealing with live-outs from early exit
loops is unoptimised, as opposed to normal loops. This inevitably
leads to poor quality code, but this can be fixed up later.
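
A scalar model of what "extract the first active lane" means. This is only a sketch over `std::array`; it is not the VPlan implementation, and the function name merely echoes the VPInstruction name for readability.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Returns the element of Values at the lowest lane whose Mask bit is set,
// i.e. the value at the lane that triggered the early exit.
template <typename T, std::size_t N>
T extractFirstActive(const std::array<T, N> &Values,
                     const std::array<bool, N> &Mask) {
  for (std::size_t Lane = 0; Lane < N; ++Lane)
    if (Mask[Lane])
      return Values[Lane];
  assert(false && "expected at least one active lane");
  return T();
}
```

For values `{10, 20, 30, 40}` and mask `{false, false, true, true}`, the result is `30`: lane 2 is the first lane where the exit condition held, so later active lanes are ignored.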
2025-01-30 10:37:00 +00:00
Carl Ritson
1f38d38d54 [AMDGPU] Fix documentation table formatting from #118750 (NFC) 2025-01-30 14:27:25 +09:00
Carl Ritson
a3a3e6997b [AMDGPU] Rewrite GFX12 SGPR hazard handling to dedicated pass (#118750)
- Algorithm operates over whole IR to attempt to minimize waits.
- Add support for VALU->VALU SGPR hazards via VA_SDST/VA_VCC.
2025-01-30 11:21:11 +09:00
Joel E. Denny
18f8106f31 [KernelInfo] Implement new LLVM IR pass for GPU code analysis (#102944)
This patch implements an LLVM IR pass, named kernel-info, that reports
various statistics for codes compiled for GPUs. The ultimate goal of
these statistics is to help identify bad code patterns and ways to mitigate
them. The pass operates at the LLVM IR level so that it can, in theory,
support any LLVM-based compiler for programming languages supporting
GPUs. It has been tested so far with LLVM IR generated by Clang for
OpenMP offload codes targeting NVIDIA GPUs and AMD GPUs.

By default, the pass runs at the end of LTO, and options like
``-Rpass=kernel-info`` enable its remarks. Example `opt` and `clang`
command lines appear in `llvm/docs/KernelInfo.rst`. Remarks include
summary statistics (e.g., total size of static allocas) and individual
occurrences (e.g., source location of each alloca). Examples of its
output appear in tests in `llvm/test/Analysis/KernelInfo`.
2025-01-29 12:40:19 -05:00
Nikita Popov
29441e4f5f [IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
David Spickett
9ea64dd878 [lldb] Make Python >= 3.8 required for LLDB 21 (#124735)
As decided on
https://discourse.llvm.org/t/rfc-lets-document-and-enforce-a-minimum-python-version-for-lldb/82731.

LLDB 20 recommended `>= 3.8` but did not remove support for anything
earlier. Now we are in what will become LLDB 21, so I'm removing that
support and making `>= 3.8` required.

See https://docs.python.org/3/c-api/apiabiversion.html#c.PY_VERSION_HEX
for the format of PY_VERSION_HEX.
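
For reference, a small sketch of decoding that format. The layout (major, minor, micro, release level, serial packed from the high byte down) follows the Python documentation linked above; the struct, the function names, and the `0x03080000` threshold are assumptions for illustration and are not taken from the LLDB change itself.

```cpp
#include <cstdint>

// Hypothetical helper type for the decoded fields.
struct PyVersion {
  unsigned Major;
  unsigned Minor;
  unsigned Micro;
  unsigned Level;  // 0xA alpha, 0xB beta, 0xC release candidate, 0xF final
  unsigned Serial;
};

constexpr PyVersion decodePyVersionHex(uint32_t Hex) {
  return PyVersion{(Hex >> 24) & 0xFF, (Hex >> 16) & 0xFF, (Hex >> 8) & 0xFF,
                   (Hex >> 4) & 0xF, Hex & 0xF};
}

// Because the fields are packed most-significant first, a plain integer
// comparison orders versions correctly.
constexpr bool isAtLeastPython38(uint32_t Hex) { return Hex >= 0x03080000; }

// 0x030800F0 is 3.8.0 final; 0x030401A2 is 3.4.1a2.
static_assert(decodePyVersionHex(0x030800F0).Minor == 8, "3.8.0 final");
static_assert(decodePyVersionHex(0x030401A2).Level == 0xA, "3.4.1 alpha 2");
static_assert(!isAtLeastPython38(0x030401A2), "3.4 is older than 3.8");
```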
2025-01-29 09:56:41 +00:00
Tom Stellard
3bd3e06f3f Bump version to 21.0.0git (#124870)
Also clear the release notes.
2025-01-28 19:48:43 -08:00
Veera
98d6dd3988 [LLVM][LangRef][noalias] Remove Redundant Line and Improve Wording (#124685)
Removes a redundant line and improves punctuation and wording in the
paragraph describing the `noalias` attribute.
2025-01-28 20:21:22 -05:00
gulfemsavrun
38902153fe [PassBuilder] Add RelLookupTableConverterPass to LTO (#124053)
This patch adds RelLookupTableConverterPass into the LTO
post-link optimization pass pipeline. This optimization
converts lookup tables to relative lookup tables to make
them PIC-friendly, which is already included in the non-LTO
pass pipeline. This patch adds this optimization to the
post-link optimization pipeline to discover more
opportunities in the LTO context.
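
The idea of a relative lookup table can be sketched by hand in C++. This only approximates what the pass does at the IR level: the data, names, and the choice of a string blob as the base are all made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Absolute table: every entry is a pointer, so position-independent code
// needs one load-time relocation per entry.
static const char *const AbsTable[] = {"zero", "one", "two"};

// Relative table: entries are 32-bit offsets from a single base symbol,
// so the table itself needs no relocations and can stay read-only.
static const char Names[] = "zero\0one\0two";
static const int32_t RelTable[] = {0, 5, 9};

inline const char *lookupName(std::size_t Index) {
  assert(Index < sizeof(RelTable) / sizeof(RelTable[0]) &&
         "index out of range");
  return Names + RelTable[Index];
}

// Both tables describe the same strings.
inline bool sameEntry(std::size_t Index) {
  return std::strcmp(lookupName(Index), AbsTable[Index]) == 0;
}
```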
2025-01-28 15:08:03 -08:00
David Spickett
8353aa2a53 [llvm][Docs] Add LLDB AArch64 GCS Release note
https://github.com/llvm/llvm-project/pull/124295 just
went in and that's the last piece of functionality.
2025-01-28 12:09:05 +00:00
Jeremy Morse
65f81df473 [Docs][DebugInfo] Summarise what people need to do for RemoveDIs now (#124725)
Replace the "what I need to do" section of the RemoveDIs docs with a
paragraph about preserving start-of-block iterators. Hopefully this is
concise enough to remain in people's heads going forward!
2025-01-28 11:31:15 +00:00
David Spickett
b29bf3de05 [llvm][Docs] Re-order the LLDB release notes
To put generic changes first, moving into target specific changes
at the end.
2025-01-28 10:33:44 +00:00
Djordje Todorovic
0cb7636a46 [RISCV] Add MIPS extensions (#121394)
Add two extensions for the MIPS p8700 CPU:
  1. cmove (conditional move)
  2. lsp (load/store pair)

The official product page is here:
https://mips.com/products/hardware/p8700
2025-01-28 08:04:09 +01:00
Petr Hosek
b593110d89 [compiler-rt] Deprecate LLVM_ENABLE_PROJECTS in favor of LLVM_ENABLE_RUNTIMES (#124016)
We plan to make this a hard error in the LLVM 21 release.

Link #124012
2025-01-27 22:32:38 -08:00
quic_hchandel
2d0688797c [RISCV] Renaming muladdi to muliadd as per v0.5 spec. (#124237)
muliadd is more relevant to the operation performed, i.e. multiply by
immediate.

The latest spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest
2025-01-27 20:40:45 -08:00
Vasileios Porpodas
485b1ac8a2 [SandboxIR][Docs] C++ highlighting for code block 2025-01-25 08:25:15 -08:00
Jeffrey Byrnes
db1ee18eda NFC: Typo fix
Change-Id: I08470bc617490558250136ea35a4964003fa9981
2025-01-24 15:59:13 -08:00
Sam Elliott
d910fbcbd1 [RISCV][NFC] cR Constraint Release Note 2025-01-24 14:46:01 -08:00
Jun Wang
77c23fd0aa [AMDGPU] Update AMDGPUUsage.rst to document two intrinsics (#123816)
The AMDGPUUsage.rst file is updated to document two intrinsics:
llvm.amdgcn.mov.dpp and llvm.amdgcn.update.dpp.
2025-01-24 14:12:18 -08:00
David Spickett
4b6fc49346 [llvm][Docs] Clarify the process for requesting a merge on your behalf (#124154)
This makes it clearer what you, the author, must do, and what reviewers
can expect you to do, before an approved PR can be merged. Splitting out
the email bit into a section also means we can link directly to it in
discussions.

This relies on one of those parties actually reading this, but I plan to
tackle the case where they don't with some new automation.
2025-01-24 09:34:37 +00:00
David Spickett
97df7411fd [llvm][Docs] Make it clear where lit test files live (#124121)
Someone on Discord was understandably confused because the build
directory does contain folder structures that look remarkably like the
source directory.

I used this page to explain it but realised that this must be from when
llvm was a separate repository. So `<user home>/llvm` probably was a
common path.

Now it's in llvm-project. So make that obvious in the instructions.
2025-01-24 08:29:44 +00:00
Pradeep Kumar
435609b70c [LLVM][NVPTX] Add support for griddepcontrol instruction (#123511)
This commit adds support for the griddepcontrol PTX instruction, with tests
under griddepcontrol.ll
2025-01-24 09:33:16 +05:30