Page MenuHomePhabricator

[CodeGen] Deduplicate restore blocks in branch relaxation
Needs ReviewPublic

Authored by piggynl on Aug 10 2022, 9:18 AM.

Details

Summary

For multiple far branches to the same destination, there are chances that some restore blocks could be shared if they clobber the same registers and share the same restore sequence. Since RISCV generates a same restore block consisting only ld s11, 0(sp) for all jumps to one destination (see D130560), deduplicating them would save a few bytes in that case.

This patch stores all restore blocks for a destination in RestoreBlocks in the destination's BasicBlockInfo, allowing the target to call DeduplicateRestoreBB() to find a previously inserted restore block identical to RestoreBB and insert an unconditional branch to it, then BranchRelaxation::fixupUnconditionalBranch() can erase RestoreBB from the function.

Suppose jump .dest is out of range so need a register to jump indirectly in the following code.

.label_0:
        foo     1
        jump    .dest
.label_1:
        foo     2
        jump    .dest
.label_2:
        bar
        bar
        bar
        bar
.dest:
        baz

Without this patch, it will be relaxed to:

.label_0:
        foo     1
        store   reg1, (sp)
        mov     reg1, .restore_1
        jump    reg1
.label_1:
        foo     2
        store   reg1, (sp)
        mov     reg1, .restore_2
        jump    reg1
.label_2:
        bar
        bar
        bar
        bar
.restore_1:
        load    reg1, (sp)
        jump    .dest
.restore_2:
        load    reg1, (sp)
.dest:
        baz

With this patch, restore blocks can be deduplicated, thus generating the following code:

.label_0:
        foo
        store   reg1, (sp)
        mov     reg1, .restore_1
        jump    reg1
.label_1:
        foo
        store   reg1, (sp)
        mov     reg1, .restore_1
        jump    reg1
.label_2:
        bar
        bar
        bar
        bar
.restore_1:
        load    reg1, (sp)
.dest:
        baz

Depends on D131865, which pre-commits the modified test.

Diff Detail

Event Timeline

piggynl created this revision.Aug 10 2022, 9:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2022, 9:18 AM
piggynl requested review of this revision.Aug 10 2022, 9:18 AM
piggynl edited the summary of this revision. (Show Details)Aug 10 2022, 10:10 AM
piggynl added inline comments.Aug 10 2022, 10:10 AM
llvm/lib/CodeGen/BranchRelaxation.cpp
121

It is possible to use SmallSet instead of SmallVector for RestoreBlocks. It will enable this function to find the reusable block in O(log N) time, but It will take a lot of work. We have to implement a comparator for MachineBasicBlock like FunctionComparator. Also, we cannot reuse MachineInstr::isIdenticalTo() and MachineOperand::isIdenticalTo() because they can only return the equality.

I have no idea how frequently is register spill required for branch relaxation in AMDGPU, but in RISCV, it almost never happens. Also, since RISCV generates a same restore block consisting only ld s11, 0(sp) for all jumps to one destination (see D130560), there should be no difference for RISCV to use SmallVector or SmallSet.

llvm/test/CodeGen/AMDGPU/branch-relax-spill.ll
4312

I have no knowledge of AMDGPU and don't know how to construct some branches that can share or cann't share a same restore block, so please kindly let me know how to construct the test properly.

This is the first added function. I copied @spill() in this file and simply added another branch after the previous one. The two branches look to share the same restore block .LBB2_5.

5274

This is the second added function. I copied @spill_func() in this file and simply added another branch after the previous one. The two branches look don't have a same restore block.

5681–5691

and I don't know why this sequence is forming. Is this intended?

piggynl updated this revision to Diff 451626.Aug 10 2022, 2:08 PM
  • Add test for RISCV.
craig.topper added inline comments.Aug 12 2022, 11:39 AM
llvm/lib/CodeGen/BranchRelaxation.cpp
68

How often are there RestoreBlocks? I wonder if this makes sense to use TinyPtrVector so that we don't enlarge the BasicBlockInfo as much. This SmallVector is probably at least 24 bytes?

craig.topper added inline comments.Aug 12 2022, 11:43 AM
llvm/test/CodeGen/RISCV/branch-relaxation.ll
1299

Pre-commit the test?

piggynl added inline comments.Aug 14 2022, 10:38 AM
llvm/test/CodeGen/RISCV/branch-relaxation.ll
1433

I think this %dest_2 should be %dest_1. Creating D131863 to fix this. I'll pre-commit this @relax_jal_spill_32_deduplicate_restore_block() test based on D131863.

piggynl updated this revision to Diff 452545.Aug 14 2022, 11:26 AM
piggynl edited the summary of this revision. (Show Details)
  • Change the type of BasicBlockInfo::RestoreBlocks from SmallVector to TinyPtrVector.
  • Rebase onto pre-committed test D131865.
piggynl marked 2 inline comments as done.Aug 14 2022, 11:29 AM
piggynl added inline comments.
llvm/test/CodeGen/AMDGPU/branch-relax-spill.ll
4312

Moved these comments about tests for AMDGPU to D131865.

Ping for review :)

Ping for review :)

When would an inserted restore block ever not be identical with the same destination?

llvm/lib/CodeGen/BranchRelaxation.cpp
94

This loop seems overcomplicated for a comparison. Can you just do a range loop, comparing identical instructions until the first terminator?

98

s/has/have/

105

Does this also need to consider the various flags on the block (e.g. alignment)?

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1010

Why is this a callback? Why doesn't the pass just deduplicate to begin with and pass in the new block?