The PTX ISA states that an `immOff` must fit in a signed 32-bit integer (https://docs.nvidia.com/cuda/parallel-thread-execution/#addresses-as-operands):

> `[reg+immOff]`
>
> a sum of register `reg` containing a byte address plus a constant
> integer byte offset (signed, 32-bit).
>
> `[var+immOff]`
>
> a sum of address of addressable variable `var` containing a byte
> address plus a constant integer byte offset (signed, 32-bit).

Currently we do not check this constraint, so in some edge cases, when an address is offset by an immediate too large to fit in a signed 32-bit integer, we generate invalid PTX.
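As a rough illustration of the constraint described above, the range check can be sketched as follows. `fitsInSigned32` is a hypothetical standalone helper written for this example, not the code in this change; inside LLVM, `llvm::isInt<32>` from `llvm/Support/MathExtras.h` would likely serve the same purpose.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not an LLVM API): returns true if `offset` can be
// encoded as the signed 32-bit immOff of a PTX [reg+immOff] or [var+immOff]
// address operand.
constexpr bool fitsInSigned32(std::int64_t offset) {
  return offset >= std::numeric_limits<std::int32_t>::min() &&
         offset <= std::numeric_limits<std::int32_t>::max();
}
```

An offset for which this check fails presumably cannot be folded into the address operand and has to be materialized with a separate add instead.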