This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix LiveVariables error after lowering SI_END_CF
ClosedPublic

Authored by arsenm on Jan 20 2022, 8:51 AM.

Details

Reviewers
foad
ruiling
Group Reviewers
Restricted Project
Summary

This wasn't accounting for the block change in updating LiveVariables.

Diff Detail

Event Timeline

arsenm created this revision.Jan 20 2022, 8:51 AM
arsenm requested review of this revision.Jan 20 2022, 8:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2022, 8:51 AM
Herald added a subscriber: wdng. · View Herald Transcript
arsenm updated this revision to Diff 401657.Jan 20 2022, 8:54 AM

Drop leftover test

foad added inline comments.Jan 21 2022, 2:07 AM
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
516

Would it make sense to move this code into splitAt, i.e. have it update LV just like it updates LIS?

522

This is a shame. isPHIJoin hasn't been used since rGfac770b865f59cbe615241dad153ad20d5138b9e 9 years ago and I was hoping to remove it.

526

Why do you only need to do this for phi join regs? Couldn't any random register have its last use in SplitBB?

arsenm added inline comments.Jan 21 2022, 8:19 AM
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
516

Probably

522

Well we were hoping to remove LiveVariables completely, so I'm not worried about using a specific piece of it.

526

I derived this based on the phi representation described in the comment for VarInfo. Other registers should have the block already in AliveBlocks

foad added a comment.Jan 24 2022, 2:46 AM

Why do you only need to do this for phi join regs? Couldn't any random register have its last use in SplitBB?

I was thinking of a case like this:

---
name:            phi_visit_order
tracksRegLiveness: true
body:             |
  bb.0:
    successors: %bb.3(0x40000000), %bb.1(0x40000000)
    liveins: $vgpr0
  
    %0:vgpr_32 = COPY killed $vgpr0
    %1:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
    %2:sreg_64_xexec = V_CMP_EQ_U32_e64 0, killed %0, implicit $exec
    %3:sreg_64_xexec = SI_IF %2, %bb.1, implicit-def $exec, implicit-def dead $scc, implicit $exec
    S_BRANCH %bb.3
  
  bb.1:
    successors: %bb.2(0x80000000)
  
    %4:sreg_64_xexec = PHI %5, %bb.3, %3, %bb.0
    %6:vgpr_32 = PHI %7, %bb.3, %1, %bb.0
    S_BRANCH %bb.2
  
  bb.2:
    successors: %bb.3(0x80000000)
  
    %8:sreg_64_xexec = COPY %4
    SI_END_CF killed %8, implicit-def $exec, implicit-def dead $scc, implicit $exec
    %9:vgpr_32 = nsw V_ADD_U32_e32 1, killed %6, implicit $exec
  
  bb.3:
    successors: %bb.3(0x40000000), %bb.1(0x40000000)
  
    %10:vgpr_32 = PHI %9, %bb.2, %7, %bb.3, %1, %bb.0
    GLOBAL_STORE_DWORD undef %11:vreg_64, %10, 0, 0, implicit $exec :: (volatile store (s32), addrspace 1)
    %7:vgpr_32 = COPY killed %10
    %5:sreg_64_xexec = SI_IF %2, %bb.1, implicit-def $exec, implicit-def dead $scc, implicit $exec
    S_BRANCH %bb.3

...

Here %4 which is not a phi join reg has its last use in bb.2. When bb.2 is split into MBB/SplitBB, %4 will be live through MBB. Rather than testing for phi join regs I think you want to do it for any register that is live into MBB.

ruiling added inline comments.Jan 24 2022, 6:03 AM
llvm/test/CodeGen/AMDGPU/lower-control-flow-live-variables-update.mir
61–88

Could you show me the .ll and the llc command to reproduce the issue? the test has irreducible control flow, something must be wrong in early pass.

arsenm added inline comments.Jan 25 2022, 6:42 AM
llvm/test/CodeGen/AMDGPU/lower-control-flow-live-variables-update.mir
61–88

You hit this failure by adding -global-isel to test/CodeGen/AMDGPU/salu-to-valu.ll. GlobalISel avoids some extra copies the DAG happens to emit, which (correctly) allows for tail duplicating the SI_IF. We can't rely on reducible control flow or specific instruction+control flow patterns being preserved

arsenm updated this revision to Diff 403433.Jan 26 2022, 4:12 PM

Fix Jay's testcase. We need to avoid adding registers to the block that are only live in one block

I would hope we can move the block split logic into a separate pass that can be scheduled before we constructing liveness information, like during addPreRegAlloc(). It is expensive to update either LiveVariables or LiveIntervals. Sounds good?

I would hope we can move the block split logic into a separate pass that can be scheduled before we constructing liveness information, like during addPreRegAlloc(). It is expensive to update either LiveVariables or LiveIntervals. Sounds good?

The whole reason this pass is here is because it needs to be done after phi elimination, so you can't really move it anywhere else. LiveVariables we should also just be pushing to eliminate entirely

I would hope we can move the block split logic into a separate pass that can be scheduled before we constructing liveness information, like during addPreRegAlloc(). It is expensive to update either LiveVariables or LiveIntervals. Sounds good?

The whole reason this pass is here is because it needs to be done after phi elimination, so you can't really move it anywhere else. LiveVariables we should also just be pushing to eliminate entirely

I think block split itself could be done before phi elimination. If you detect the source operand of SI_END_CF defined in the same block, you can split the block. That works just like what we are doing now. Did I miss something? Even you deprecate LiveVariables, you still need to update LiveIntervals, it still need to searching against all virtual registers to see whose LiveInterval need updated.

I would hope we can move the block split logic into a separate pass that can be scheduled before we constructing liveness information, like during addPreRegAlloc(). It is expensive to update either LiveVariables or LiveIntervals. Sounds good?

The whole reason this pass is here is because it needs to be done after phi elimination, so you can't really move it anywhere else. LiveVariables we should also just be pushing to eliminate entirely

I think block split itself could be done before phi elimination. If you detect the source operand of SI_END_CF defined in the same block, you can split the block. That works just like what we are doing now. Did I miss something? Even you deprecate LiveVariables, you still need to update LiveIntervals, it still need to searching against all virtual registers to see whose LiveInterval need updated.

Sounds like LiveIntervals does not need expensive update?

ruiling added inline comments.Jan 30 2022, 8:29 AM
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
531

Is it possible that a value defined in MBB (and kept in MBB after split) and killed in successor block, now becomes alive in SplitBB?

arsenm added a comment.Feb 4 2022, 2:38 PM

I would hope we can move the block split logic into a separate pass that can be scheduled before we constructing liveness information, like during addPreRegAlloc(). It is expensive to update either LiveVariables or LiveIntervals. Sounds good?

The whole reason this pass is here is because it needs to be done after phi elimination, so you can't really move it anywhere else. LiveVariables we should also just be pushing to eliminate entirely

I think block split itself could be done before phi elimination. If you detect the source operand of SI_END_CF defined in the same block, you can split the block. That works just like what we are doing now. Did I miss something?

The reason we are splitting the block is so we can place the exec modification before the copies for lowered phis. We cannot split the block as needed while the phis still exist since phis need to be the first instructions in the block

arsenm added inline comments.Feb 4 2022, 2:59 PM
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
531

That's covered by updating the Killed set, replaceKillInstruction

The reason we are splitting the block is so we can place the exec modification before the copies for lowered phis. We cannot split the block as needed while the phis still exist since phis need to be the first instructions in the block

I don't understand, we are moving out instructions after SI_END_CF into a new block, all the PHIs and SI_END_CF will still be in the old block. Why are we breaking the phis?
To make it clear, I am not strongly asking for this now. I am fine with fixing the LiveVariable update issue.

llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
531

It is something like below after split:

MBB:
  %a = PHI ... // this will be lowered into COPY after phi-elim
 ...
SplitBB:
  ...
Succ:
  use(%a)

%a is not the source of SI_END_CF, here we need to insert SplitBB into AliveBlocks of %a.

arsenm added inline comments.Feb 9 2022, 9:01 AM
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
531

The lowered SI_END_CF is not inserted at its original position in the block. It is moved to the top of the block, moving it before the phis which were lowered

ruiling accepted this revision.Feb 9 2022, 8:07 PM

LGTM

llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
531

I missed the trick we have already reordered the instructions during phi elimination. The only COPY comes before SI_END_CF is the one that generate source of SI_END_CF.

This revision is now accepted and ready to land.Feb 9 2022, 8:07 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 5 2022, 6:58 PM
Herald added a subscriber: hsmhsm. · View Herald Transcript