I was benchmarking the MatchTable when I found that `getConstantVRegValWithLookThrough` took a non-negligible amount of time: about 7.5% of all of `AArch64PreLegalizerCombinerImpl::tryCombineAll`.

I decided to take a closer look to see if I could squeeze some performance out of it, and I landed on a few changes that:

- Avoid copying `APInt` unnecessarily. In particular, returning `std::optional<APInt>` can be expensive when an out parameter also works.
- Avoid indirect calls by using templated function pointers instead of `function_ref`/`std::function`.

Together, these changes seem to speed up this function by about 50%, but my benchmarking (`perf record`) is inconsistent, so take the measurements with a grain of salt. After the changes I saw as high as 4.5% and as low as 2% for this function on the exact same input. It never got close to 7% again over a few runs, though, so this looks like a stable improvement.
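To illustrate the two ideas outside of LLVM, here is a minimal standalone sketch. It is not the code from this patch: `WideInt`, `Def`, `Chain`, `lookThroughSlow`, `lookThroughFast`, and `isConstantDef` are all hypothetical names, with `WideInt` standing in for `APInt` and `Chain` standing in for a chain of defs to look through.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical stand-in for llvm::APInt: a value type that can be costly to copy.
struct WideInt {
  std::vector<std::uint64_t> Words;
};

// Hypothetical stand-in for a def: either a constant, or a copy-like op that
// forwards to the def at index Src (Src < 0 means "no source").
struct Def {
  bool IsConstant;
  WideInt Value; // meaningful only when IsConstant is true
  int Src;       // meaningful only when IsConstant is false
};

using Chain = std::vector<Def>;

// "Before" shape: the predicate is type-erased, so each call goes through an
// indirect call, and the result is wrapped in a std::optional.
std::optional<WideInt>
lookThroughSlow(const Chain &C, int Idx,
                const std::function<bool(const Def &)> &IsConst) {
  while (!IsConst(C[Idx])) {
    Idx = C[Idx].Src;
    if (Idx < 0)
      return std::nullopt;
  }
  return C[Idx].Value; // constructs a fresh WideInt inside the optional
}

inline bool isConstantDef(const Def &D) { return D.IsConstant; }

// "After" shape: the predicate is a non-type template parameter, so the
// compiler sees the exact callee and can inline it. The result is written
// into caller-provided storage, which the caller can reuse across calls.
template <bool (*IsConst)(const Def &)>
bool lookThroughFast(const Chain &C, int Idx, WideInt &Out) {
  assert(Idx >= 0 && static_cast<std::size_t>(Idx) < C.size());
  while (!IsConst(C[Idx])) {
    Idx = C[Idx].Src;
    if (Idx < 0)
      return false;
  }
  Out = C[Idx].Value; // assignment into existing storage
  return true;
}
```

A caller would write `WideInt Out; if (lookThroughFast<isConstantDef>(C, Idx, Out)) { ... }`. Whether this shape is actually faster depends on the compiler and the call sites, so it is worth measuring rather than assuming.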