Switch from `.weak` to `.common` linkage for common global variables where possible. The `.common` linkage is described in [PTX ISA 11.6.4. Linking Directives: .common](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#linking-directives-common):

> Declares identifier to be globally visible but “common”.
>
> Common symbols are similar to globally visible symbols. However multiple object files may declare the same common symbol and they may have different types and sizes and references to a symbol get resolved against a common symbol with the largest size.
>
> Only one object file can initialize a common symbol and that must have the largest size among all other definitions of that common symbol from different object files.
>
> .common linking directive can be used only on variables with .global storage. It cannot be used on function symbols or on symbols with opaque type.

I've updated the logic and tests to only use `.common` for PTX 5.0 or greater and verified that the new tests now pass with `ptxas`.
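For reviewers, the selection rule described above can be sketched roughly as follows. This is a standalone illustration, not the actual NVPTX backend code: the `AddrSpace` enum, the `commonLinkageDirective` helper, and the `major * 10 + minor` version encoding (so `ptx43` is 43 and `ptx50` is 50) are all hypothetical names and assumptions chosen to mirror the test.

```cpp
#include <string_view>

// Hypothetical stand-ins for the address spaces used in the test
// (1 = global, 3 = shared, 4 = const). Not LLVM's actual enum.
enum class AddrSpace : unsigned { Global = 1, Shared = 3, Const = 4 };

// Sketch of the directive chosen for a variable with LLVM `common` linkage.
// Assumption: ptxVersion is encoded as major * 10 + minor (PTX 5.0 == 50).
constexpr std::string_view commonLinkageDirective(AddrSpace as,
                                                  unsigned ptxVersion) {
  // .common is only valid on .global variables, and this change only
  // emits it for PTX 5.0 or greater; everything else keeps .weak.
  if (as == AddrSpace::Global && ptxVersion >= 50)
    return ".common";
  return ".weak";
}

// Compile-time checks mirroring the four cases covered by the test.
static_assert(commonLinkageDirective(AddrSpace::Global, 50) == ".common");
static_assert(commonLinkageDirective(AddrSpace::Global, 43) == ".weak");
static_assert(commonLinkageDirective(AddrSpace::Const, 50) == ".weak");
static_assert(commonLinkageDirective(AddrSpace::Shared, 50) == ".weak");
```

The `.const` and `.shared` cases stay on `.weak` at every PTX version, which is why the test checks them with the shared `CHECK` prefix rather than a version-specific one.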
; RUN: llc < %s -march=nvptx -mcpu=sm_20 -mattr=+ptx43 | FileCheck %s --check-prefixes CHECK,PTX43
; RUN: llc < %s -march=nvptx -mcpu=sm_20 -mattr=+ptx50 | FileCheck %s --check-prefixes CHECK,PTX50
; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcpu=sm_20 -mattr=+ptx43 | %ptxas-verify %}
; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcpu=sm_20 -mattr=+ptx50 | %ptxas-verify %}

; PTX43: .weak .global .align 4 .u32 g
; PTX50: .common .global .align 4 .u32 g
@g = common addrspace(1) global i32 0, align 4

; CHECK: .weak .const .align 4 .u32 c
@c = common addrspace(4) global i32 0, align 4

; CHECK: .weak .shared .align 4 .u32 s
@s = common addrspace(3) global i32 0, align 4

define i32 @f1() {
  %1 = load i32, ptr addrspace(1) @g
  ret i32 %1
}

define i32 @f4() {
  %1 = load i32, ptr addrspace(4) @c
  ret i32 %1
}

define i32 @f3() {
  %1 = load i32, ptr addrspace(3) @s
  ret i32 %1
}