clang-p2996

Files

Durgadoss R 340cc1702e [LLVM][NVPTX]: Add intrinsic for setmaxnreg (#77289 )

This patch adds an intrinsic for setmaxnreg PTX instruction.
* PTX Doc link for this instruction:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#miscellaneous-instructions-setmaxnreg

* The i32 argument, an immediate value, specifies the actual
  absolute register count for the instruction.
* The `setmaxnreg` instruction is available in SM90a.
  So, this patch adds 'hasSM90a' predicate to use in
  the NVPTX backend.
* lit tests are added to verify the lowering of the intrinsic.
* Verifier logic (and tests) are added to test the register
  count range and divisibility-by-8 requirements.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>

2024-01-09 12:04:13 -08:00

lit.local.cfg

…

setmaxnreg.ll

…