[flang][cuda] Perform scalar assignment of c_devptr inlined (#123407)

Because `c_devptr` has a `c_ptr` field, any assignment was done via the
Assign runtime function. This leads to stack overflow on the device and
excessive memory use. Since we know that `c_devptr` can be copied directly
on assignment, make it a special case.
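
The special case described above can be illustrated with a small standalone sketch. It does not use the real FIR API: `ToyType`, `isCDevPtr`, `canBeMemCopied`, and the `make*` helpers are hypothetical stand-ins, and the rule that any nested record field blocks a memory copy is an assumption inferred from the message, not a copy of the flang logic.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a FIR type; NOT the real fir::RecordType.
struct ToyType {
  std::string name;
  bool isRecord;               // derived (record) type vs. plain scalar
  bool needsDeepCopy;          // e.g. an allocatable component
  std::vector<ToyType> fields; // components, only meaningful for records
};

// Toy analogue of a "is this the builtin c_devptr type" query.
bool isCDevPtr(const ToyType &t) { return t.isRecord && t.name == "c_devptr"; }

// Decide whether assignment can be lowered to a plain memory copy.
bool canBeMemCopied(const ToyType &t) {
  // Special case: c_devptr wraps a c_ptr record, but it is known to be
  // directly copyable, so skip the conservative field walk below.
  if (isCDevPtr(t))
    return true;
  for (const ToyType &field : t.fields) {
    // Assumption: a nested record might carry user-defined assignment,
    // so the general path is conservative and refuses the memory copy.
    if (field.isRecord)
      return false;
    if (field.needsDeepCopy)
      return false;
  }
  return true;
}

// Helpers building the toy types used in the usage note.
ToyType makeCPtr() {
  return ToyType{"c_ptr", true, false, {ToyType{"address", false, false, {}}}};
}
ToyType makeCDevPtr() { return ToyType{"c_devptr", true, false, {makeCPtr()}}; }
ToyType makeWrapper() { return ToyType{"wrapper", true, false, {makeCPtr()}}; }
```

In this sketch `canBeMemCopied(makeCDevPtr())` returns true only because of the early return, while `canBeMemCopied(makeWrapper())`, a record with the same layout but a different name, still takes the conservative path and returns false.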
Valentin Clement (バレンタイン クレメン)
2025-01-17 14:34:47 -08:00
committed by GitHub
parent 0c6e03eea0
commit 2523d3b102
3 changed files with 37 additions and 4 deletions

@@ -1401,6 +1401,10 @@ static void genComponentByComponentAssignment(fir::FirOpBuilder &builder,
 /// Can the assignment of this record type be implement with a simple memory
 /// copy (it requires no deep copy or user defined assignment of components )?
 static bool recordTypeCanBeMemCopied(fir::RecordType recordType) {
+  // c_devptr type is a special case. It has a nested c_ptr field but we know it
+  // can be copied directly.
+  if (fir::isa_builtin_c_devptr_type(recordType))
+    return true;
   if (fir::hasDynamicSize(recordType))
     return false;
   for (auto [_, fieldType] : recordType.getTypeList()) {