Summary: We used to emulate an atomic load with a fetch-add of zero because the NVPTX backend did not handle atomic loads properly. That is no longer an issue, so simply use the proper atomic load builtin.
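For illustration only, here is a minimal sketch of the two approaches using the GCC/Clang `__atomic` builtins. The function names and the sequentially consistent ordering are assumptions made for this example and are not taken from the actual patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the old workaround. Adding zero leaves the value
// unchanged and returns the previous contents, so it behaves like a load,
// but it is a read-modify-write operation rather than a plain load.
inline uint32_t load_via_fetch_add(uint32_t *addr) {
  return __atomic_fetch_add(addr, 0u, __ATOMIC_SEQ_CST);
}

// Hypothetical helper: the direct form, using the atomic load builtin.
inline uint32_t load_via_builtin(uint32_t *addr) {
  return __atomic_load_n(addr, __ATOMIC_SEQ_CST);
}
```

Both helpers return the same value. The second expresses the intent directly and does not require write access to the location.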