A long time ago (back in 2009), commit 52d4d8244b changed the scheduler to not mark height/depth as dirty when adding or removing an SUnit predecessor if the latency on the edge is zero. That commit message claims that the depth or height isn't affected when the latency is zero.

In fact, the depth/height can change even when the edge has zero latency. For example, when adding a new SUnit A as a predecessor of an SUnit B with a zero-latency edge, both the height of A and the depth of B should be marked as dirty. If B has a greater height than A, then the height of A needs to be adjusted even though the latency is zero.

I think this has been wrong for many years. Downstream, we have had commit 52d4d8244b reverted since 2016. There is no motivating lit test for 52d4d8244b (only an incomplete C-level reproducer in https://github.com/llvm/llvm-project/issues/3613). After commit 13d04fa560, an upstream lit test finally appeared showing that we get better code when height/depth are marked as dirty (llvm/test/CodeGen/AArch64/abds.ll).
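The A/B example can be illustrated with a small toy model. This is only a sketch, not the LLVM implementation: the `SUnit` class, `add_pred`, `compute_height`, and the `skip_zero_latency` flag are hypothetical names invented here, only height is modelled (depth is the mirror image over predecessors), and dirtiness is not propagated to transitive predecessors. It assumes height means the longest latency-weighted path from a node to a leaf, cached until marked dirty.

```python
from dataclasses import dataclass, field


@dataclass
class SUnit:
    """Hypothetical, simplified scheduling unit (not llvm::SUnit)."""
    name: str
    succs: list = field(default_factory=list)  # (successor, latency) pairs
    height: int = 0
    height_dirty: bool = True


def compute_height(su):
    """Return the cached height, recomputing it only if marked dirty."""
    if su.height_dirty:
        su.height = max(
            (compute_height(succ) + lat for succ, lat in su.succs),
            default=0,
        )
        su.height_dirty = False
    return su.height


def add_pred(succ, pred, latency, skip_zero_latency):
    """Add the edge pred -> succ.

    skip_zero_latency=True models the behaviour described for 52d4d8244b:
    the cached height is left untouched when the edge latency is zero.
    """
    pred.succs.append((succ, latency))
    if latency != 0 or not skip_zero_latency:
        pred.height_dirty = True


def height_of_a_after_zero_latency_edge(skip_zero_latency):
    a, b, c = SUnit("A"), SUnit("B"), SUnit("C")
    add_pred(c, b, latency=3, skip_zero_latency=False)  # B -> C, so height(B) = 3
    compute_height(a)  # caches height(A) = 0
    compute_height(b)  # caches height(B) = 3
    add_pred(b, a, latency=0, skip_zero_latency=skip_zero_latency)  # A -> B
    return compute_height(a)


print(height_of_a_after_zero_latency_edge(True))   # 0: stale cached height
print(height_of_a_after_zero_latency_edge(False))  # 3: height(B) + 0
```

With the zero-latency shortcut, A keeps its stale height of 0 although the longest path from A now runs through B; marking it dirty yields 3.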