Switch from `.weak` to `.common` linkage for common global variables where possible. The `.common` linkage is described in [PTX ISA 11.6.4. Linking Directives: .common] (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#linking-directives-common) > Declares identifier to be globally visible but “common”. > >Common symbols are similar to globally visible symbols. However multiple object files may declare the same common symbol and they may have different types and sizes and references to a symbol get resolved against a common symbol with the largest size. > >Only one object file can initialize a common symbol and that must have the largest size among all other definitions of that common symbol from different object files. > >.common linking directive can be used only on variables with .global storage. It cannot be used on function symbols or on symbols with opaque type. I've updated the logic and tests to only use `.common` for PTX 5.0 or greater and verified that the new tests now pass with `ptxas`.
14 lines
576 B
LLVM
14 lines
576 B
LLVM
; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 -mattr=+ptx43 | FileCheck %s --check-prefix PTX43
|
|
; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 -mattr=+ptx50 | FileCheck %s --check-prefix PTX50
|
|
; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcpu=sm_20 -mattr=+ptx43 | %ptxas-verify %}
|
|
; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcpu=sm_20 -mattr=+ptx50 | %ptxas-verify %}
|
|
|
|
; PTX43: .weak .global .align 4 .u32 g
|
|
; PTX50: .common .global .align 4 .u32 g
|
|
@g = common addrspace(1) global i32 zeroinitializer
|
|
|
|
define i32 @func0() {
|
|
%val = load i32, ptr addrspace(1) @g
|
|
ret i32 %val
|
|
}
|