[flang][cuda] Perform scalar assignment of c_devptr inlined (#123407)
Because `c_devptr` has a `c_ptr` field, every assignment was done via the Assign runtime function. This leads to stack overflow on the device and excessive memory use. Since we know `c_devptr` can be copied directly on assignment, make it a special case.
committed by GitHub
parent 0c6e03eea0
commit 2523d3b102
@@ -1401,6 +1401,10 @@ static void genComponentByComponentAssignment(fir::FirOpBuilder &builder,
 /// Can the assignment of this record type be implement with a simple memory
 /// copy (it requires no deep copy or user defined assignment of components )?
 static bool recordTypeCanBeMemCopied(fir::RecordType recordType) {
+  // c_devptr type is a special case. It has a nested c_ptr field but we know it
+  // can be copied directly.
+  if (fir::isa_builtin_c_devptr_type(recordType))
+    return true;
   if (fir::hasDynamicSize(recordType))
     return false;
   for (auto [_, fieldType] : recordType.getTypeList()) {
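The effect of the early return can be illustrated with a small standalone model. This is only a sketch and not flang code: `ToyField`, `ToyRecord`, and `canBeMemCopied` are hypothetical names, and the loop body is an assumption (the hunk is cut off before it), namely that any component which is itself a derived type conservatively disables the memory copy.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one component of a derived type.
struct ToyField {
  std::string name;
  bool isRecord; // true when the component is itself a derived type
};

// Hypothetical stand-in for fir::RecordType.
struct ToyRecord {
  std::string name;
  bool hasDynamicSize;
  std::vector<ToyField> fields;
};

// Models the decision made by the patched function. The flag
// specialCaseDevPtr toggles the early return added by this commit.
bool canBeMemCopied(const ToyRecord &rec, bool specialCaseDevPtr) {
  // New behavior: c_devptr is known to be directly copyable.
  if (specialCaseDevPtr && rec.name == "c_devptr")
    return true;
  if (rec.hasDynamicSize)
    return false;
  // Assumption: a nested derived-type component is rejected conservatively,
  // which is what would send c_devptr to the runtime without the special case.
  for (const ToyField &field : rec.fields)
    if (field.isRecord)
      return false;
  return true;
}
```

In this model, a `c_devptr` record holding a single record-typed `cptr` component is rejected when `specialCaseDevPtr` is false and accepted when it is true, while a record with only scalar components is accepted either way.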