This implementation has several issues and ultimately does not work
on gfx9:
* It does not reduce bank conflicts with wide memory accesses.
* It does not correctly model the conditions under which LDS bank
conflicts occur on amdgpu.
* The implementation is too fragile to be used on real-world code. For
example, the code bails out on any `memref.subview` in the root op, even
when the subview is not a user of any of the `memref.alloc` ops.
I do not see how these issues can be easily fixed, so I think it is
better to delete this code.