Files
clang-p2996/llvm/test/CodeGen/AMDGPU/issue48473.mir
Matt Arsenault eefed1dbf0 RegAllocGreedy: Roll back successful recolorings on failure
This is a replacement for the original fix attempted in
c46aab01c0.

This fixes "overlapping insert" assertion failures when trying to
unwind an unsuccessful recoloring attempt.

The problem would occur when there are multiple recoloring candidates
which recursively required recoloring. If one recoloring candidate was
successfully recolored at one level, and the next recoloring candidate
was unsuccessful, we would not roll back the first candidates
successful recoloring. The forgotten successful recoloring may have
been assigned to something that conflicts with a register that needs
to be restored in a parent recoloring attempt.

See the testcase added in issue48473 for a more concrete example with
explanation.
2022-04-12 19:02:48 -04:00

82 lines
5.5 KiB
YAML

# RUN: not llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -start-before=greedy,0 -stop-after=virtregrewriter,1 -verify-machineinstrs -o - 2> %t.err %s | FileCheck %s
# RUN: FileCheck -check-prefix=ERR %s < %t.err
# ERR: error: register allocation failed: maximum depth for recoloring reached. Use -fexhaustive-register-search to skip cutoffs
# ERR-NEXT: error: ran out of registers during register allocation
# This testcase used to fail with an "overlapping insert" assertion
# when trying to roll back an unsucessful recoloring of %25. One of
# the interfering vregs is successfully recolored, and the other is
# not. We need to roll back the successfully recolored interfering
# vreg in order to avoid conflicting with the original assignment of
# the original register when rolling back the second.
# %25 initially assigned to $sgpr60_sgpr61_sgpr62_sgpr63_sgpr64_sgpr65_sgpr66_sgpr67
# interfering candidates %15 %17
# assigned %15 to $sgpr8_sgpr9_sgpr10_sgpr11_sgpr12_sgpr13_sgpr14_sgpr15
# %15
# %18 -> normal recolored $sgpr28_sgpr29_sgpr30_sgpr31
# %20 -> normal recolored $sgpr60_sgpr61_sgpr62_sgpr63
# %17 candidates %37 %39
# tentative assign %17 $sgpr84_sgpr85_sgpr86_sgpr87_sgpr88_sgpr89_sgpr90_sgpr91
# %37 to $sgpr72_sgpr73_sgpr74_sgpr75 succeeded
# %39 last chance recoloring, fails max depth
# Fail to assign: %17 to $sgpr84_sgpr85_sgpr86_sgpr87_sgpr88_sgpr89_sgpr90_sgpr91 at depth 4
# %37 reassign to $sgpr84_sgpr85_sgpr86_sgpr87 unassign from $sgpr72_sgpr73_sgpr74_sgpr75
# %39 reassign to $sgpr88_sgpr89_sgpr90_sgpr91
# %17 candidates %39 %41
# Try to assign: %17 to $sgpr88_sgpr89_sgpr90_sgpr91_sgpr92_sgpr93_sgpr94_sgpr95
# %39 Try assign to $sgpr72_sgpr73_sgpr74_sgpr75 succeeded
# %41 last chance recoloring, fail max depth
# %39 reassign to $sgpr88_sgpr89_sgpr90_sgpr91 unassign from $sgpr72_sgpr73_sgpr74_sgpr75
# %41 reassign to $sgpr92_sgpr93_sgpr94_sgpr95
# %17 candidates %41 %16
# Try assign %41 to $sgpr72_sgpr73_sgpr74_sgpr75 succeeded
# %16 last chance recolor, fail max depth
# fail to recolor %17
#
# Have to roll back the succesful recoloring of %15 when %17's
# recoloring failed. Previously we would leave the recoloring of %18
# and %20 in place. The recoloring of %20 to
# $sgpr60_sgpr61_sgpr62_sgpr63 conflicts with the parent restore of
# %25 to $sgpr60_sgpr61_sgpr62_sgpr63_sgpr64_sgpr65_sgpr66_sgpr67
# CHECK-LABEL: name: issue48473
# CHECK: S_NOP 0, implicit killed renamable $sgpr0_sgpr1_sgpr2_sgpr3, implicit killed renamable $sgpr12_sgpr13_sgpr14_sgpr15, implicit killed renamable $sgpr16_sgpr17_sgpr18_sgpr19_sgpr20_sgpr21_sgpr22_sgpr23, implicit killed renamable $sgpr24_sgpr25_sgpr26_sgpr27_sgpr28_sgpr29_sgpr30_sgpr31, implicit killed renamable $sgpr84_sgpr85_sgpr86_sgpr87, implicit killed renamable $sgpr36_sgpr37_sgpr38_sgpr39_sgpr40_sgpr41_sgpr42_sgpr43, implicit killed renamable $sgpr4_sgpr5_sgpr6_sgpr7, implicit killed renamable $sgpr44_sgpr45_sgpr46_sgpr47_sgpr48_sgpr49_sgpr50_sgpr51, implicit killed renamable $sgpr88_sgpr89_sgpr90_sgpr91, implicit killed renamable $sgpr76_sgpr77_sgpr78_sgpr79_sgpr80_sgpr81_sgpr82_sgpr83, implicit killed renamable $sgpr0_sgpr1_sgpr2_sgpr3, implicit killed renamable $sgpr52_sgpr53_sgpr54_sgpr55_sgpr56_sgpr57_sgpr58_sgpr59, implicit killed renamable $sgpr92_sgpr93_sgpr94_sgpr95, implicit killed renamable $sgpr68_sgpr69_sgpr70_sgpr71_sgpr72_sgpr73_sgpr74_sgpr75, implicit renamable $sgpr68_sgpr69_sgpr70_sgpr71_sgpr72_sgpr73_sgpr74_sgpr75, implicit killed renamable $sgpr96_sgpr97_sgpr98_sgpr99, implicit killed renamable $sgpr8_sgpr9_sgpr10_sgpr11, implicit killed renamable $sgpr60_sgpr61_sgpr62_sgpr63_sgpr64_sgpr65_sgpr66_sgpr67
---
name: issue48473
tracksRegLiveness: true
machineFunctionInfo:
isEntryFunction: true
scratchRSrcReg: '$sgpr100_sgpr101_sgpr102_sgpr103'
stackPtrOffsetReg: '$sgpr32'
argumentInfo:
privateSegmentBuffer: { reg: '$sgpr0_sgpr1_sgpr2_sgpr3' }
privateSegmentWaveByteOffset: { reg: '$sgpr4' }
occupancy: 20
body: |
bb.0:
liveins: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr4_sgpr5_sgpr6_sgpr7
%0:sgpr_128 = COPY $sgpr0_sgpr1_sgpr2_sgpr3
%1:sgpr_128 = COPY $sgpr4_sgpr5_sgpr6_sgpr7
%2:sgpr_256 = S_LOAD_DWORDX8_IMM undef %3:sgpr_64, 1000, 0 :: (load 32, addrspace 6)
%4:sgpr_256 = S_LOAD_DWORDX8_IMM undef %3:sgpr_64, 1088, 0 :: (load 32, addrspace 6)
%5:sgpr_256 = S_LOAD_DWORDX8_IMM undef %3:sgpr_64, 1152, 0 :: (load 32, addrspace 6)
%6:sgpr_256 = S_LOAD_DWORDX8_IMM undef %3:sgpr_64, 1216, 0 :: (load 32, addrspace 6)
%7:sgpr_256 = S_LOAD_DWORDX8_IMM undef %3:sgpr_64, 1280, 0 :: (load 32, addrspace 6)
%8:sgpr_256 = S_LOAD_DWORDX8_IMM undef %3:sgpr_64, 1408, 0 :: (load 32, addrspace 6)
%9:sgpr_128 = S_LOAD_DWORDX4_IMM undef %3:sgpr_64, 0, 0 :: (load 16, addrspace 6)
%10:sgpr_128 = S_LOAD_DWORDX4_IMM undef %3:sgpr_64, 0, 0 :: (load 16, addrspace 6)
%11:sgpr_128 = S_LOAD_DWORDX4_IMM undef %3:sgpr_64, 0, 0 :: (load 16, addrspace 6)
%12:sgpr_128 = S_LOAD_DWORDX4_IMM undef %3:sgpr_64, 0, 0 :: (load 16, addrspace 6)
%13:sgpr_128 = S_LOAD_DWORDX4_IMM undef %3:sgpr_64, 0, 0 :: (load 16, addrspace 6)
%14:sgpr_128 = IMPLICIT_DEF
S_NOP 0, implicit-def %15:sgpr_256, implicit-def %16:sgpr_128, implicit-def %17:sgpr_256
S_NOP 0, implicit %0, implicit %1, implicit %2, implicit %4, implicit %9, implicit %5, implicit %10, implicit %6, implicit %11, implicit %8, implicit %13, implicit %7, implicit %12, implicit %17, implicit %17, implicit %16, implicit %14, implicit %15
S_ENDPGM 0
...