This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Unify early PS termination blocks
ClosedPublic

Authored by critson on Jun 26 2020, 3:52 AM.

Details

Summary

Depends on D77544.
Generate a single early exit block out-of-line and branch to this
if all lanes are killed. This avoids branching if lanes are active.

Diff Detail

Event Timeline

critson created this revision.Jun 26 2020, 3:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 26 2020, 3:52 AM

This is a stepping stop to adding early termination in more places because it means termination can be added anywhere in a basic block (without splitting basic blocks).

Will need to be revised with D77544 is committed.

Is this the right place for this? We're trying to remove this pass

We are trying to have the kill intrinsic handling early during the instruction selection.
I can try to incorporate it all entirely there including the unify block code (if can't find a better place now to have it).
I will also update D77544 if this patch goes upstream.

We are trying to have the kill intrinsic handling early during the instruction selection.
I can try to incorporate it all entirely there including the unify block code (if can't find a better place now to have it).
I will also update D77544 if this patch goes upstream.

This pass seems to be the logic place for this right now.
Is there any code or ongoing discussion I can refer to for moving kill handling to instruction selection?
I can see a few problems with doing that, but that might just be because I do not know what is planned.

I am fine with this getting blown away when SIInsertSkips is replaced by a better solution, and contributing again to that solution.

I think this patch can be simplified if D77544 is submitted first and the order of SIInsertSkips and SIPreEmitPeephole are swapped.

critson updated this revision to Diff 273995.Jun 28 2020, 10:20 PM

Rework this on top of D77544.
Make insertion of skips near the beginning of blocks work correctly by splitting blocks.

We are trying to have the kill intrinsic handling early during the instruction selection.

Whether this is a good idea depends on what is meant by it :) Kill intrinsics interact with WQM for how pixel shaders are defined, and we really need to move the WQM pass later because it adds instructions that interfere with scheduling in a bad way. Setting up the CFG to be the right shape earlier makes sense to me.

Mostly looks good to me, some minor inline comments, but most importantly: please add a test case in skip-if-dead.ll with a function that triggers the early-exit and returns a value (the only test return right now is @test_kill_control_flow, but it doesn't trigger this early exit).

llvm/lib/Target/AMDGPU/SIInsertSkips.cpp
167

Should probably be renamed to generatePsEndPgm.

222–223

Why the update to NextBBI? The old code didn't do that, I believe on purpose because while it doesn't break correctness, there simply shouldn't be a need to inspect the EarlyExitBlock.

Mostly looks good to me, some minor inline comments, but most importantly: please add a test case in skip-if-dead.ll with a function that triggers the early-exit and returns a value (the only test return right now is @test_kill_control_flow, but it doesn't trigger this early exit).

I see this may be covered by llvm/test/CodeGen/AMDGPU/transform-block-with-return-to-epilog.ll?

critson marked 3 inline comments as done.Jun 30 2020, 1:42 AM
critson added inline comments.
llvm/lib/Target/AMDGPU/SIInsertSkips.cpp
222–223

This update is not part of the iteration.
EarlyExitBlock is not inspected because this code is executed after the iteration of blocks.
This update is required so the SplitBB is inserted in the correct place (immediately following MBB). Technically this is only required when NextBBI is MBB->end(), in which case the EarlyExitBlock will become the next block; however, simply resetting the iterator keeps it more generic w.r.t. EarlyExitBlock placement.

critson updated this revision to Diff 274372.Jun 30 2020, 1:55 AM
critson marked an inline comment as done.
  • Rebase
  • Rename generateEndPgm
  • Edit comment for clarity
  • Add explicit test to skip-if-dead.ll
nhaehnle accepted this revision.Jul 1 2020, 6:21 AM

Thanks, LGTM

This revision is now accepted and ready to land.Jul 1 2020, 6:21 AM
This revision was automatically updated to reflect the committed changes.