Files
clang-p2996/llvm/lib/CodeGen/MachineFunction.cpp
Alexis Engelke 2a8c65e983 [CodeGen][NFC] Fix quadratic c-t for large jump tables
Deleting a basic block removes all references from jump tables, which
is O(n). When freeing a MachineFunction, all basic blocks are deleted
before the jump tables, causing O(n^2) runtime. Fix this by deallocating
the jump table first.

Test case generator:

    import sys

    n = int(sys.argv[1])
    print("define void @f(i64 %c, ptr %p) {")
    print("  switch i64 %c, label %d [")
    for i in range(n):
        print(f"    i64 {i}, label %h{i}")
    print(f"  ]")
    for i in range(n):
        print(f'h{i}:')
        print(f'  store i64 {i*i}, ptr %p')
        print(f'  ret void')
    print('d:')
    print('  ret void')
    print('}')

Improvement at 5000 entries:

    Benchmark 1: ./llc.pre -filetype=obj -O0 <switch5k.bc
      Time (mean ± σ):      49.7 ms ±   1.0 ms
      Range (min … max):    48.0 ms …  52.1 ms    57 runs

    Benchmark 2: ./llc.post -filetype=obj -O0 <switch5k.bc
      Time (mean ± σ):      39.4 ms ±   0.8 ms
      Range (min … max):    37.1 ms …  41.1 ms    72 runs

    Summary
      ./llc.post -filetype=obj -O0 <switch5k.bc ran
        1.26 ± 0.04 times faster than ./llc.pre -filetype=obj -O0 <switch5k.bc

Improvement at 20000 entries:

    Benchmark 1: ./llc.pre -filetype=obj -O0 <switch20k.bc
      Time (mean ± σ):     281.7 ms ±   1.0 ms
      Range (min … max):   280.2 ms … 283.0 ms    10 runs

    Benchmark 2: ./llc.post -filetype=obj -O0 <switch20k.bc
      Time (mean ± σ):     123.9 ms ±   1.5 ms
      Range (min … max):   121.4 ms … 129.2 ms    23 runs

    Summary
      ./llc.post -filetype=obj -O0 <switch20k.bc ran
        2.27 ± 0.03 times faster than ./llc.pre -filetype=obj -O0 <switch20k.bc

Pull Request: https://github.com/llvm/llvm-project/pull/144108
2025-06-18 18:56:30 +02:00

57 KiB