If we know the exact VLEN, then the index is effectively taken modulo the number of elements in a single vector register. Our lowering performs this subvector optimization.

A bit of context: this change may look a bit strange on its own, given that we are currently *not* scaling insert/extract cost by LMUL. That costing decision needs to change, but it is very intertwined with SLP profitability and is thus a bit hard to adjust. I'm hoping that https://github.com/llvm/llvm-project/pull/108419 will let me start to untangle this. This change is basically a case of finding a subset I can tackle before the other dependencies are in place, and which does no real harm in the meantime.
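
To illustrate the index arithmetic described above, here is a minimal standalone sketch. The names `SubvecIndex` and `splitIndex` are hypothetical and are not part of the LLVM API; the sketch only assumes that VLEN and the element width are known in bits and that VLEN is a multiple of the element width.

```cpp
// Hypothetical illustration only; not code from the RISC-V backend.
struct SubvecIndex {
  unsigned RegIdx;  // which single register within the LMUL register group
  unsigned ElemIdx; // element position inside that one register
};

// With an exact VLEN, each register holds VLenBits / EltBits elements, so an
// index into the whole group splits into a register number and a remainder.
constexpr SubvecIndex splitIndex(unsigned VLenBits, unsigned EltBits,
                                 unsigned Idx) {
  return {Idx / (VLenBits / EltBits), Idx % (VLenBits / EltBits)};
}

// VLEN=128 with 32-bit elements gives 4 elements per register, so index 5
// lands in register 1 at element 1.
static_assert(splitIndex(128, 32, 5).RegIdx == 1, "register within group");
static_assert(splitIndex(128, 32, 5).ElemIdx == 1, "index modulo 4");
```

Only `ElemIdx` matters once the access has been narrowed to a single register, which is why the index can be treated as if it were reduced modulo the per-register element count.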