Page MenuHomePhabricator

[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs
Needs RevisionPublic

Authored by cdevadas on Apr 21 2022, 12:18 PM.

Details

Summary

Currently the custom SGPR spill lowering pass would
spill SGPRs into physical VGPR lanes and the remaining
VGPRs are used by regalloc for vector regclass allocation.
This currently imposes many restrictions that we end up
with unsuccessful SGPR spilling when there won't be
enough VGPRs and we are forced to spill the remaining
spills into memory during PEI. The custom spilling during
PEI has many edge cases and often breaks the compiler time
to time.

This patch implements spilling SGPRs into virtual VGPR
lanes. Since we now split the register allocation for
SGPRs and VGPRs, the virtual registers introduced for
spill lanes would get allocated automatically in the
following regalloc invocation.

Spill to virtual registers will always be successful,
even at high-pressure situations and thereby avoiding
most of the edge cases in the PEI apart from the
difficulty with custom SGPR spills for special registers
like frame pointer.

Diff Detail

Event Timeline

cdevadas created this revision.Apr 21 2022, 12:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2022, 12:18 PM
cdevadas requested review of this revision.Apr 21 2022, 12:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2022, 12:18 PM
arsenm added inline comments.Apr 21 2022, 1:58 PM
llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
352–353

Seems worthwhile for this to be its own real function instead of a lambda

376

This could be the end iterator

379

It might be worth adding a target comment flag for this implicit def to comment it's for SGPR spilling

382

Typo " in case if multiple spills"

407–408

Should implement MachineFunctionPass::getClearedProperties instead of clearing these here

llvm/test/CodeGen/AMDGPU/csr-sgpr-spill-live-ins.mir
17–19

Can you precommit a change to add the -NEXTs here

llvm/test/CodeGen/AMDGPU/spill-sgpr-to-virtual-vgpr.mir
27

The test checks seem to not capture that these operands are tied

195

Needs a case where the insert block has no terminators

cdevadas added inline comments.Apr 26 2022, 9:22 AM
llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
352–353

Yes indeed. Will do.

379

I don't think we can do any special handling for them at assembly emission.
IMPLICIT_DEF is handled/printed by the generic part of AsmPrinter and it won't reach the target-specific emitInstruction at all.

407–408

Will do.

llvm/test/CodeGen/AMDGPU/csr-sgpr-spill-live-ins.mir
17–19

Will do.

llvm/test/CodeGen/AMDGPU/spill-sgpr-to-virtual-vgpr.mir
27

The auto-generator didn't generate the tied-operands correctly. I can hand-modify this test to show the tied operand. It's the simplest case.

59

This test is already hand-modified to check the tied operands.

195

I couldn't write one successfully. Will try some unstructured flow to force one.

cdevadas added inline comments.Apr 27 2022, 3:53 AM
llvm/test/CodeGen/AMDGPU/spill-sgpr-to-virtual-vgpr.mir
195

I don't think such a case exists. A fall-through block will have only one successor and that becomes the nearest dominator for its children.
It would be true even for any unstructured flow.

cdevadas updated this revision to Diff 425478.Apr 27 2022, 4:32 AM

Fixed the review comments.
Moved UpdateLaneVGPRDomInstr lambda into a separate function.
Implemented getClearedProperties to clear certain MF properties.
Tes pre-commit + rebase.
Fixed the tied operand cases in certain tests.

As a follow up I think we need to address the loss of being able to share VGPR lanes for unrelated spills

llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
264–265

Typo "the the". It's also not necessarily unstructured

293

IsDominatesChecked is confusingly named and expresses the code not the intent. SeenSpillInBlock?

llvm/test/CodeGen/AMDGPU/need-fp-from-vgpr-spills.ll
202–204

This is an unfortunate regression but what I expected

arsenm added inline comments.Apr 27 2022, 2:07 PM
llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
293

As part of the follow up to allow spill slot sharing, I think we can move all of this allocation stuff out of SIMachineFunctionInfo and into SILowerSGPRSpills

cdevadas added inline comments.Apr 27 2022, 8:24 PM
llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
293

Ya, will try to move it entirely out of SIMachineFunctionInfo.

cdevadas updated this revision to Diff 425689.Apr 27 2022, 8:30 PM

Addressed the review comments.

cdevadas updated this revision to Diff 440294.Jun 27 2022, 10:13 AM
cdevadas edited the summary of this revision. (Show Details)

Code rebase.

arsenm accepted this revision.Jun 27 2022, 5:29 PM

LGTM. Might want to introduce an asm printer flag on the implicit_def to mark it's for SGPR spills in the comment

llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
267

Remove "Is there a better way to handle it?"

307

Extra ()s

This revision is now accepted and ready to land.Jun 27 2022, 5:29 PM

Should also remove the SpillSGPRToVGPR option and handling

Should also remove the SpillSGPRToVGPR option and handling

I don't want to do it separately. A follow-up patch?

Should also remove the SpillSGPRToVGPR option and handling

I don't want to do it separately. A follow-up patch?

Typo in my earlier comment. I want to do that as a separate patch.
I've identified a few more clean up that can be done while removing SpillSGPRToVGPR option.

Should also remove the SpillSGPRToVGPR option and handling

I don't want to do it separately. A follow-up patch?

Yes, that's fine

cdevadas updated this revision to Diff 440735.Jun 28 2022, 12:58 PM
cdevadas edited the summary of this revision. (Show Details)

Code rebase.

arsenm accepted this revision.Jun 28 2022, 3:39 PM

What happens when the register allocator decides to split a live range of virtual registers here, i.e. if it introduces a COPY?

arsenm requested changes to this revision.Jun 29 2022, 1:28 PM

What happens when the register allocator decides to split a live range of virtual registers here, i.e. if it introduces a COPY?

This is totally broken as soon as any of these spill. We need WWM spills if they do. We should boost their priority and they need a guaranteed register to save and restore exec. I’m not sure the best way to go about this

This revision now requires changes to proceed.Jun 29 2022, 1:28 PM