Canonicalize !$CUF KERNEL DO loop nests, similar to OpenACC/OpenMP canonicalization. Check statements and expressions in device contexts for usage that isn't supported. Add more tests, and include some tweaks to standard modules needed to build CUDA Fortran modules. Depends on https://reviews.llvm.org/D150159, https://reviews.llvm.org/D150161, https://reviews.llvm.org/D150162, & https://reviews.llvm.org/D150163. Differential Revision: https://reviews.llvm.org/D150164