Depends on D82641.
Exit early if the exec mask is zero at the end of control flow.
Mark the ends of control flow during control flow lowering and
convert these to exits during the insert skips pass.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
I appreciate there might be some resistance to this way of doing things, i.e. relying on control flow lowering to insert cleanup markers. As an alternative I can rewrite this to look directly at the exec mask processing during the insert skips pass.
Can you explain the reasoning behind how you decide where to insert SI_KILL_CLEANUP?
llvm/lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
382–383 | Shouldn't this also Use EXEC? |
I will add this as a comment in the code (because apparently I forgot):
SI_KILL_CLEANUP is inserted immediately after SI_END_CF for control flow where a kill is present.
When SI_KILL_CLEANUP is uniformly reached it will become a test of the EXEC mask immediately after it has been updated by the SI_END_CF to reflect the true state of live lanes.
We could actually blindly insert SI_KILL_CLEANUP after every SI_END_CF, reducing the code complexity, but we'd end up with a lot of early exits that would never be taken.
llvm/lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
382–383 | It does, this entire block of instructions is Use EXEC. |
Okay, thanks, this all makes sense to me. I do have some suggestions for tightening checks up a little, but either way LGTM.
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp | ||
---|---|---|
222 | Would it make sense to add an assertion that UseMI->getOpcode() == AMDGPU::SI_ELSE? | |
228–232 | When can the NextExec.isDead() case occur? Unreachable code or the like? In any case, could you move this around a bit as: if (NextExec.isDead()) { assert(!SimpleIf); break; } ... if (UseMI->getOpcode() == AMDGPU::SI_END_CF) { if (hasKill(...)) { NeedsKillCleanup.insert(&*UseMI); SimpleIf = false; } } |
Shouldn't this also Use EXEC?