Page MenuHomePhabricator

[AMDGPU] Insert PS early exit at end of control flow
ClosedPublic

Authored by critson on Jun 28 2020, 11:33 PM.

Details

Summary

Depends on D82641.
Exit early if the exec mask is zero at the end of control flow.
Mark the ends of control flow during control flow lowering and
convert these to exits during the insert skips pass.

Diff Detail

Event Timeline

critson created this revision.Jun 28 2020, 11:33 PM

I appreciate there might be some resistance to this way of doing things, i.e. relying on control flow lowering to insert cleanup markers. As an alternative I can rewrite this to look directly at the exec mask processing during the insert skips pass.

Can you explain the reasoning behind how you decide where to insert SI_KILL_CLEANUP?

llvm/lib/Target/AMDGPU/SIInstructions.td
382–383

Shouldn't this also Use EXEC?

critson marked 2 inline comments as done.Jun 29 2020, 9:50 PM

Can you explain the reasoning behind how you decide where to insert SI_KILL_CLEANUP?

I will add this as a comment in the code (because apparently I forgot):
SI_KILL_CLEANUP is inserted immediately after SI_END_CF for control flow where a kill is present.
When SI_KILL_CLEANUP is uniformly reached it will become a test of the EXEC mask immediately after it has been updated by the SI_END_CF to reflect the true state of live lanes.

We could actually blindly insert SI_KILL_CLEANUP after every SI_END_CF, reducing the code complexity, but we'd end up with a lot of early exits that would never be taken.

llvm/lib/Target/AMDGPU/SIInstructions.td
382–383

It does, this entire block of instructions is Use EXEC.

critson updated this revision to Diff 274377.Jun 30 2020, 2:32 AM
critson marked an inline comment as done.
  • Rebase
  • Add comment
nhaehnle accepted this revision.Jul 1 2020, 6:29 AM

Okay, thanks, this all makes sense to me. I do have some suggestions for tightening checks up a little, but either way LGTM.

llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
222

Would it make sense to add an assertion that UseMI->getOpcode() == AMDGPU::SI_ELSE?

228–232

When can the NextExec.isDead() case occur? Unreachable code or the like?

In any case, could you move this around a bit as:

  if (NextExec.isDead()) {
    assert(!SimpleIf);
    break;
  }
...
if (UseMI->getOpcode() == AMDGPU::SI_END_CF) {
  if (hasKill(...)) {
    NeedsKillCleanup.insert(&*UseMI);
    SimpleIf = false;
  }
}
This revision is now accepted and ready to land.Jul 1 2020, 6:29 AM
This revision was automatically updated to reflect the committed changes.