- do not create MPI operations if no halo exchange is needed
- allow returning sharding information through `!mesh.sharding`
(gets converted into a tuple of tensors)
- lower `mesh.shard_shape`, including fixes to the operation itself
- global symbol `static_mpi_rank` replaced by a DLTI attribute
(now aligned with MPIToLLVM)
- minor fixes and cleanup
---------
Co-authored-by: Christian Ulmann <christianulmann@gmail.com>