[clang] Return larger CXX records in memory (#120670)

We incorrectly return CXX records in AVX registers when they should be
returned in memory. This is a violation of the x86-64 psABI.

Detailed discussion is here:
https://groups.google.com/g/x86-64-abi/c/BjOOyihHuqg/m/KurXdUcWAgAJ
Pranav Kant
2025-02-04 09:42:12 -08:00
committed by GitHub
parent f634223782
commit e8a486ea97
4 changed files with 46 additions and 0 deletions


@@ -45,6 +45,8 @@ C++ Specific Potentially Breaking Changes
ABI Changes in This Version
---------------------------
- Return larger CXX records in memory instead of using AVX registers. Code compiled with older versions of Clang will be incompatible with code compiled with newer versions unless -fclang-abi-compat=20 is provided. (#GH120670)
AST Dumping Potentially Breaking Changes
----------------------------------------


@@ -250,6 +250,11 @@ public:
/// passing them as if they had a size of 1 byte.
Ver19,
/// Attempt to be ABI-compatible with code generated by Clang 20.0.x.
/// This causes clang to:
/// - Incorrectly return C++ records in AVX registers on x86_64.
Ver20,
/// Conform to the underlying platform's C and C++ ABIs as closely
/// as we can.
Latest


@@ -1334,6 +1334,15 @@ class X86_64ABIInfo : public ABIInfo {
return T.isOSLinux() || T.isOSNetBSD();
}
bool returnCXXRecordGreaterThan128InMem() const {
// Clang <= 20.0 did not do this.
if (getContext().getLangOpts().getClangABICompat() <=
LangOptions::ClangABI::Ver20)
return false;
return true;
}
X86AVXABILevel AVXLevel;
// Some ABIs (e.g. X32 ABI and Native Client OS) use 32 bit pointers on
// 64-bit hardware.
@@ -2067,6 +2076,13 @@ void X86_64ABIInfo::classify(QualType Ty, uint64_t OffsetBase, Class &Lo,
classify(I.getType(), Offset, FieldLo, FieldHi, isNamedArg);
Lo = merge(Lo, FieldLo);
Hi = merge(Hi, FieldHi);
if (returnCXXRecordGreaterThan128InMem() &&
(Size > 128 && (Size != getContext().getTypeSize(I.getType()) ||
Size > getNativeVectorSizeForAVXABI(AVXLevel)))) {
// A 256-bit (or 512-bit) vector register can be used for the return value
// only when the CXX record contains a single 256-bit (or 512-bit) element.
Lo = Memory;
}
if (Lo == Memory || Hi == Memory) {
postMerge(Size, Lo, Hi);
return;


@@ -0,0 +1,23 @@
// RUN: %clang %s -S --target=x86_64-unknown-linux-gnu -emit-llvm -O2 -march=x86-64-v3 -o - | FileCheck %s
using UInt64x2 = unsigned long long __attribute__((__vector_size__(16), may_alias));
template<int id>
struct XMM1 {
UInt64x2 x;
};
struct XMM2 : XMM1<0>, XMM1<1> {
};
// CHECK: define{{.*}} @_Z3foov({{.*}} [[ARG:%.*]]){{.*}}
// CHECK-NEXT: entry:
// CHECK-NEXT: store {{.*}}, ptr [[ARG]]{{.*}}
// CHECK-NEXT: [[TMP1:%.*]] = getelementptr {{.*}}, ptr [[ARG]]{{.*}}
// CHECK-NEXT: store {{.*}}, ptr [[TMP1]]{{.*}}
XMM2 foo() {
XMM2 result;
((XMM1<0>*)&result)->x = UInt64x2{1, 2};
((XMM1<1>*)&result)->x = UInt64x2{3, 4};
return result;
}
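
The size condition added to X86_64ABIInfo::classify can be easier to follow in isolation. The following is a minimal sketch, not the Clang implementation: forcesMemoryReturn is a hypothetical helper that takes plain bit widths in place of the QualType and AVXLevel queries, and it leaves out the -fclang-abi-compat gate. The native vector widths used in the checks (128 without AVX, 256 with AVX) are assumptions about what getNativeVectorSizeForAVXABI reports.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-alone model of the size check in the patch.
//   recordBits       - size of the whole C++ record, in bits
//   baseBits         - size of the base class being merged, in bits
//   nativeVectorBits - widest vector register the AVX level permits
//                      (assumed: 128 without AVX, 256 with AVX, 512 with AVX-512)
constexpr bool forcesMemoryReturn(std::uint64_t recordBits,
                                  std::uint64_t baseBits,
                                  std::uint64_t nativeVectorBits) {
  // A record wider than 128 bits stays in a vector register only if one
  // element spans the whole record and that element fits in a native vector.
  return recordBits > 128 &&
         (recordBits != baseBits || recordBits > nativeVectorBits);
}

// XMM2 from the test: two 128-bit bases in a 256-bit record, AVX enabled.
static_assert(forcesMemoryReturn(256, 128, 256), "XMM2 goes to memory");
// A record wrapping one 256-bit element with AVX enabled is not forced to memory.
static_assert(!forcesMemoryReturn(256, 256, 256), "single 256-bit element");
// The same record without AVX exceeds the assumed 128-bit native vector width.
static_assert(forcesMemoryReturn(256, 256, 128), "too wide without AVX");
// Records of 128 bits or less are unaffected by the new check.
static_assert(!forcesMemoryReturn(128, 128, 256), "128-bit record unchanged");
```

Under these assumptions the first check corresponds to the XMM2 test case above: each base is 128 bits wide, so neither spans the 256-bit record and the result is written through the pointer argument that the CHECK lines expect.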