With subregister liveness tracking enabled, the aggressive anti-dependency breaker can rename only a part of a live super-register.
This situation happened on Hexagon while experimenting with the subregister liveness tracking.
This is the relevant code before the post-RA scheduler, which the interesting instructions marked:
BB#9: derived from LLVM BB %for.end30 Live Ins: %R16 %R17 %R19 %R22 %R23 Predecessors according to CFG: BB#0 BB#8 [...] --> %R2<def> = A2_add %R23, %R17<kill> %R6<def> = M2_mpyi %R16, %R16 %R22<def,tied1> = M2_accii %R22<tied0>, %R2<kill>, 2 %R7<def> = A2_tfrsi <ga:@.str> %R3<def> = A2_tfr %R16<kill> %D2<def> = A2_tfrp %D0<kill> %R2<def> = L2_loadri_io %R29, 28; mem:LD4[FixedStack5] %R2<def> = M2_mpyi %R6<kill>, %R2<kill> --> %R23<def> = S2_asr_i_r %R22, 31 S2_storeri_io %R29<kill>, 0, %R7<kill>; mem:ST4[Stack] --> %D0<def> = A2_tfrp %D11<kill> J2_call <ga:@check> ; imp-refs removed for brevity
The register D11 is a pair R23:R22. What happens next is that the anti-dependency breaker identifies the anti-dependency on R23 between the first instruction and the S2_asr_i_r. It will then proceed to rename D11 to D13 (i.e. R27:R26) and R23 to R27. The problem is that R22 is still live at this point and is not renamed (to R26).
The relevant part of the DAG before renaming:
SU(0): %R2<def> = A2_add %R23, %R17<kill> # preds left : 0 # succs left : 3 # rdefs left : 0 Latency : 1 Depth : 0 Height : 5 Successors: out SU(6): Latency=1 Reg=%R2 val SU(2): Latency=1 Reg=%R2 antiSU(8): Latency=0 Reg=%R23 [...] SU(8): %R23<def> = S2_asr_i_r %R22, 31 # preds left : 2 # succs left : 1 # rdefs left : 0 Latency : 1 Depth : 2 Height : 2 Predecessors: val SU(2): Latency=1 Reg=%R22 antiSU(0): Latency=0 Reg=%R23 Successors: val SU(10): Latency=1 Reg=%D11 [...] SU(10): %D0<def> = A2_tfrp %D11<kill> # preds left : 3 # succs left : 1 # rdefs left : 0 Latency : 1 Depth : 3 Height : 1 Predecessors: val SU(8): Latency=1 Reg=%D11 antiSU(5): Latency=0 Reg=%D0 val SU(2): Latency=1 Reg=%D11 Successors: ch SU(4294967295) *: Latency=1
Subsequently:
===== Aggressive anti-dependency breaking Available regs: ... Anti: %D0<def> = A2_tfrp %D11<kill> Def Groups: D0=g0->g0(via R0)->g0(via R1) Antidep reg: D0 (zero group) Use Groups: D11=g0->g415(last-use) R22->g416(last-use) R23->g417(last-use) Anti: S2_storeri_io %R29<kill>, 0, %R7<kill>; mem:ST4[Stack] Def Groups: Use Groups: R29=g0->g418(last-use) R7=g0->g419(last-use) Anti: %R23<def> = S2_asr_i_r %R22, 31 Def Groups: R23=g417->g415(via D11) Antidep reg: R23 Rename Candidates for Group g415: D11: DoubleRegs :: D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12 D13 R23: IntRegs :: R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R12 R13 R14 R15 R16 R17 R18 R19 R20 R21 R22 R23 R24 R25 R26 R27 R28 Find Registers: [D13: D13 R27] Breaking anti-dependence edge on R23: D11->D13(1 refs) R23->R27(1 refs) Use Groups: R22=g416
The final code is this. Note that R22 has not been renamed, even though its super-register use (i.e. D11 in the last instruction before the call) has been renamed to D13:
BB#9: derived from LLVM BB %for.end30 Live Ins: %R16 %R17 %R19 %R22 %R23 Predecessors according to CFG: BB#0 BB#8 [...] %R2<def> = A2_add %R23<kill>, %R17<kill> %D2<def> = A2_tfrp %D0<kill> %R6<def> = M2_mpyi %R16, %R16 %R0<def> = L2_loadri_io %R29, 28; mem:LD4[FixedStack5] %R22<def,tied1> = M2_accii %R22<tied0>, %R2<kill>, 2 %R2<def> = M2_mpyi %R6<kill>, %R0<kill> %R3<def> = A2_tfr %R16<kill> %R7<def> = A2_tfrsi <ga:@.str> --> %R27<def> = S2_asr_i_r %R22<kill>, 31 S2_storeri_io %R29<kill>, 0, %R7<kill>; mem:ST4[Stack] --> %D0<def> = A2_tfrp %D13<kill> J2_call <ga:@check>
The proposed solution is to avoid breaking of anti-dependencies on registers that do not start independent live ranges. If the definition of the anti-dep register has successors based on any alias of the anti-dep register, that alias must be completely covered by the anti-dep register.
This seems expensive. I would at least move the Aliases variable out of the loop and simply clear it so the we do not need to allocate the memory again each iteration.