This pass fails when the target register class for a subregister isn't known from the instruction description (e.g. COPY). Currently in this situation the RC is obtained using TargetRegisterInfo::getSubRegisterClass, but in general this does not work. To fix this, two things should be done:

1. Stop processing a subregister if the target register class is unknown (conservative approach).
2. Improve deduction of the subregister's target register class (e.g. by processing the COPY chain).

I was going to implement point 1, but my tests use implicit operands for S_NOP, which have no associated target register class, so all tests fail. Therefore I decided to turn off the pass for now, then implement point 1 and fix my tests.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D152291
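The two points above can be illustrated with a small standalone sketch. It is not the pass's code and does not use LLVM's API: `ToyUse`, `ToyFunction`, `deduceTargetClass` and `shouldProcessSubreg` are hypothetical names, and the model assumes each register has at most one user. It only shows the idea of following a COPY chain to find a register class and giving up conservatively when none is found.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy model for illustration; these are NOT LLVM types.
struct ToyUse {
  bool isCopy;                          // true for a COPY-like instruction
  std::optional<std::string> regClass;  // class named by the instruction description, if any
  int copyDest;                         // for a COPY: index of the register it defines, else -1
};

// uses[r] describes the single instruction that reads toy register r (assumption: one user).
using ToyFunction = std::vector<std::optional<ToyUse>>;

// Point 2: follow a chain of COPYs until some instruction names a class.
std::optional<std::string> deduceTargetClass(const ToyFunction &uses, int reg) {
  // Bounding the walk by the register count guards against cyclic chains.
  for (std::size_t step = 0; step < uses.size(); ++step) {
    if (reg < 0 || static_cast<std::size_t>(reg) >= uses.size() || !uses[reg])
      return std::nullopt;  // no user: nothing to deduce
    const ToyUse &use = *uses[reg];
    if (use.regClass)
      return use.regClass;  // the instruction description names the class
    if (!use.isCopy)
      return std::nullopt;  // e.g. an implicit operand with no class
    reg = use.copyDest;     // keep following the COPY chain
  }
  return std::nullopt;
}

// Point 1: process a subregister only when its target class is known.
bool shouldProcessSubreg(const ToyFunction &uses, int reg) {
  return deduceTargetClass(uses, reg).has_value();
}
```

In this model, a register that feeds two COPYs and then an instruction requiring `vgpr_32` resolves to `vgpr_32`, while a register read only by a class-less operand (like the implicit S_NOP operands mentioned above) is skipped.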
20 lines
1.1 KiB
YAML
# RUN: llc -march=amdgcn -mcpu=tonga %s -start-before detect-dead-lanes -stop-before machine-scheduler -verify-machineinstrs -o - | FileCheck -check-prefix=GCN %s
# RUN: llc -march=amdgcn -mcpu=tonga %s -start-before detect-dead-lanes -stop-before machine-scheduler -verify-machineinstrs -early-live-intervals -o - | FileCheck -check-prefix=GCN %s

# GCN-LABEL: name: dead_lane
# GCN: bb.0:
# GCN-NEXT: undef %3.sub0:vreg_64 = nofpexcept V_MAC_F32_e32 undef %1:vgpr_32, undef %1:vgpr_32, undef %3.sub0, implicit $mode, implicit $exec
# GCN-NEXT: FLAT_STORE_DWORD undef %4:vreg_64, %3.sub0,
---
name: dead_lane
tracksRegLiveness: true
body: |
  bb.0:
    %1:vgpr_32 = nofpexcept V_MAC_F32_e32 undef %0:vgpr_32, undef %0:vgpr_32, undef %0:vgpr_32, implicit $mode, implicit $exec
    %2:vgpr_32 = nofpexcept V_MAC_F32_e32 undef %0:vgpr_32, undef %0:vgpr_32, undef %0:vgpr_32, implicit $mode, implicit $exec
    %3:vreg_64 = REG_SEQUENCE %1:vgpr_32, %subreg.sub0, %2:vgpr_32, %subreg.sub1
    FLAT_STORE_DWORD undef %4:vreg_64, %3.sub0, 0, 0, implicit $exec, implicit $flat_scr
    S_ENDPGM 0

...