[NVPTX] Add pm_event intrinsics (#141278)

This patch adds the llvm.nvvm.pm.event.mask intrinsic and its
Clang builtin.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
This commit is contained in:
Durgadoss R
2025-06-06 19:39:33 +05:30
committed by GitHub
parent 016ce351c8
commit c4012bb5de
6 changed files with 61 additions and 0 deletions

@@ -1868,6 +1868,29 @@ If the request failed, the behavior of these intrinsics is undefined.
For more information, refer to the `PTX ISA <https://docs.nvidia.com/cuda/parallel-thread-execution/?a#parallel-synchronization-and-communication-instructions-clusterlaunchcontrol-query-cancel>`__.

Perf Monitor Event Intrinsics
-----------------------------

'``llvm.nvvm.pm.event.mask``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

.. code-block:: llvm

  declare void @llvm.nvvm.pm.event.mask(i16 immarg %mask_val)

Overview:
"""""""""

The '``llvm.nvvm.pm.event.mask``' intrinsic triggers one or more
performance monitor events. Each bit in the 16-bit immediate operand
``%mask_val`` controls an event.

For more information on the pmevent instructions, refer to the PTX ISA
`<https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#miscellaneous-instructions-pmevent>`_.

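
As an illustration, the following is a minimal sketch of a call to the
intrinsic. The function name ``@trigger_events`` and the mask value are
hypothetical, and the mapping of bit *i* to event *i* is an assumption
based on the description above, not something this section specifies.

.. code-block:: llvm

  declare void @llvm.nvvm.pm.event.mask(i16 immarg)

  ; Hypothetical caller, for illustration only.
  define void @trigger_events() {
    ; 5 is 0b0000000000000101, so bits 0 and 2 are set. Under the
    ; assumption stated above, this requests events 0 and 2.
    call void @llvm.nvvm.pm.event.mask(i16 5)
    ret void
  }

Because the operand is marked ``immarg``, the mask must be a constant at
the call site; passing a value computed at run time is rejected by the
IR verifier.
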
Other Intrinsics
----------------