Page MenuHomePhabricator

[AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block
ClosedPublic

Authored by alex-t on Aug 26 2020, 8:58 AM.

Details

Summary

optimizeEndCF removes EXEC restoring instruction case this instruction is the only one except the branch to the single successor and that successor contains EXEC mask restoring instruction that was lowered from END_CF belonging to IF_ELSE.
As a result of such optimization we get the basic block with the only one instruction that is a branch to the single successor.
In case the control flow can reach such an empty block from S_CBRANCH_EXEZ/EXECNZ it might happen that spill/reload instructions that were inserted later by register allocator are placed under exec == 0 condition and never execute.
Removing empty block solves the problem.

This change require further work to re-implement LIS updates. Recently, LIS is always nullptr in this pass. To enable it we need another patch to fix many places across the codegen.

Diff Detail

Event Timeline

alex-t created this revision.Aug 26 2020, 8:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2020, 8:58 AM
alex-t requested review of this revision.Aug 26 2020, 8:58 AM
rampitec added inline comments.Aug 26 2020, 10:25 AM
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
158

Does it preserve MDT and MLI now?

606

It will work differently with and without debug info. Imagine that you have a DBG_VALUE in that block, its size will not be 1 anymore. You also need a mir test axactly the same as you are using but with the DBG_VALUE in that block.

Besides in most cases there is no S_BRANCH at the end, just a fall through to successor.

607

Isn't it the same as MBB.succ_begin()?

609

You can simply initialize Preds vector with MBB.predecessors(), loop is not needed.

627

Need a test for this case.

633

What happens to MBB instructions? Will they be erased or become dangling?

635

MBB is already deleted at this point, it will assert or crash. It also needs to remove all instructions, not just first as there can be DBG_VALUEs.

Should also add a few cases with other empty block situations, including with debug info.

Also should add an example where the original problem occurred

llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
606

what if the block is truly empty?

Should also move all of this to a separate function

613–614

replaceSuccessor

alex-t updated this revision to Diff 288430.Aug 27 2020, 12:37 PM

Changes as requested by reviewer.

alex-t marked 8 inline comments as done.Aug 27 2020, 12:39 PM
alex-t added inline comments.
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
158

yes. your initial assumption was a chain of single successors. So, if we remove block, we remove it from that chain.

Should also add a few cases with other empty block situations, including with debug info.

Also should add an example where the original problem occurred

All the above in fact are covered in collapse-encf.mir
As for the original problem, it is mostly about placing the spills in the redundant block under exec == 0.
CFG in this example is trivial if-break loop exit and also covered by the existing test.

rampitec added inline comments.Aug 27 2020, 12:57 PM
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
158

I assume when you remove a block you also need to call at least MLI::removeBlock() and MDT::eraseNode() to call them preserved.

684

You do not need this branch if target is a fall thru block.
You also do not seem a testcase when it is not and this branch is needed.

alex-t updated this revision to Diff 288631.Aug 28 2020, 9:19 AM

Added MachineDominatorTree and MachineLoopInfo update after redundant block removal.

llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
158

Oops. Yes. This is correct.
So, given that neither MDT no LI is not used in SILowerControlFlow I'd perfer to remove setPreserved. Otherwise I'd call getAnalysisIfAvailable just to get a pointer and update them. I'm not sure if it worth doing.

684

Removing block we channge MBBs order.
Verifier complains if the new successor is not a branch target and is not "next node" (MBB.getNextNode()).
It consider legal fall thru to the next node only.

The branch is only necessary in case of fall thru to make verifier happy

rampitec added inline comments.Aug 28 2020, 11:39 AM
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
684

This defeats optimization purpose to reduce control flow. This branch will survive till the end, right?

712

Will this also erase all instructions?

713

I would do it before actually erasing MBB itself.

715

Looks like both MLI and MDT are already destroyed at this point anyway. Do you see any of the analysis available? If not I would just remove it from preserved.

arsenm added inline comments.Aug 28 2020, 11:57 AM
llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
684

You're just missing a call to updateTerminator? You shouldn't need to leave a branch to a fallthrough just to satisfy the verifier

alex-t updated this revision to Diff 288699.Aug 28 2020, 1:56 PM

changed as requested by reviewer

alex-t marked 5 inline comments as done.Aug 28 2020, 1:57 PM
rampitec added inline comments.Aug 28 2020, 2:01 PM
llvm/test/CodeGen/AMDGPU/collapse-endcf.mir
25–26

I still see redundant branches.

alex-t added inline comments.Aug 28 2020, 3:02 PM
llvm/test/CodeGen/AMDGPU/collapse-endcf.mir
25–26

It is impossible to remove block without inserting branch. Otherwise I have to rebuild MBBs layout.
Machine Verifier explicitly requires fallthru to be the next item in MBBs list.

 if (Fallthrough && SuccMBB == MBB->getNextNode())
   continue;
report("MBB has unexpected successors which are not branch targets, "
        "fallthrough, EHPads, or inlineasm_br targets.",
        MBB);

The only difference is that now these redundant branch is inserted by MachineBasicBlock::updateTerminator() as Matt suggested.

The only difference is that now these redundant branch is inserted by MachineBasicBlock::updateTerminator() as Matt suggested.

Use MachineFunction::splice() to reinsert block into a proper place?

alex-t updated this revision to Diff 289162.Sep 1 2020, 7:15 AM

No redundant branches anymore

rampitec accepted this revision.Sep 1 2020, 10:46 AM

LGTM, but please run PSDB before submission. This stuff is quite nontrivial.

This revision is now accepted and ready to land.Sep 1 2020, 10:46 AM
alex-t updated this revision to Diff 289376.Sep 2 2020, 2:34 AM

diff rebased to latest trunk

alex-t added a comment.Sep 7 2020, 8:57 AM

Enhanced PSDB passed