SK_PermuteTwoSrc legalization has to assume that any of the legalized source registers could be referenced in the split shuffles. However, if we already know that each 128-bit lane only references elements from the same lane of the source operands, then this scaling won't occur.
Hopefully this can help with #113356 without us having to get full processShuffleMasks canonicalization finished first.
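
For illustration only, the lane-local condition described above could be checked roughly as follows. This is a minimal standalone sketch, not the code in this patch: `isLaneLocalTwoSrcMask` is a hypothetical helper name, and it assumes the usual two-source mask convention (indices `[0, N)` select from the first source, `[N, 2N)` from the second, negative values are undef) with the result the same width as each source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an LLVM API): returns true if every defined mask
// element reads from the same 128-bit lane of its source operand as the lane
// of the result element it writes.
// Assumptions: Mask.size() == N (elements per source), indices in [0, 2N),
// negative index == undef.
bool isLaneLocalTwoSrcMask(const std::vector<int> &Mask, unsigned EltBits) {
  assert(EltBits > 0 && 128 % EltBits == 0 &&
         "element size must evenly divide a 128-bit lane");
  const std::size_t NumElts = Mask.size();
  const std::size_t EltsPerLane = 128 / EltBits;

  for (std::size_t I = 0; I != NumElts; ++I) {
    const int M = Mask[I];
    if (M < 0)
      continue; // Undef elements place no constraint on the lane.

    // Fold the index onto a single source: both sources share lane layout.
    const std::size_t SrcElt = static_cast<std::size_t>(M) % NumElts;
    if (SrcElt / EltsPerLane != I / EltsPerLane)
      return false; // This element crosses a 128-bit lane boundary.
  }
  return true;
}
```

For a v8i32 shuffle, an unpack-style mask such as `{0, 8, 1, 9, 4, 12, 5, 13}` stays within its lanes under this check, whereas changing the first index to `4` makes element 0 read from the upper lane and the check fails.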