Use the original scalar type when building the reduction, and the scalar type when performing the cast, to avoid a compiler crash.