Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs
AcceptedPublic

Authored by cdevadas on Apr 21 2022, 12:18 PM.

Details

Summary

Currently, the custom SGPR spill lowering pass spills
SGPRs into physical VGPR lanes and the remaining VGPRs
are used by regalloc for vector regclass allocation.
This imposes many restrictions that we ended up with
unsuccessful SGPR spilling when there won't be enough
VGPRs and we are forced to spill the leftover into
memory during PEI. The custom spill handling during PEI
has many edge cases and often breaks the compiler time
to time.

This patch implements spilling SGPRs into virtual VGPR
lanes. Since we now split the register allocation for
SGPRs and VGPRs, the virtual registers introduced for
the spill lanes would get allocated automatically in
the subsequent regalloc invocation for VGPRs.

Spill to virtual registers will always be successful,
even in the high-pressure situations, and hence it avoids
most of the edge cases during PEI. We are now left with
only the custom SGPR spills during PEI for special registers
like the frame pointer which is an unproblematic case.

By spilling CSRs into virtual VGPR lanes, we might end up
with broken CFIs that can potentially corrupt the frame
unwinding in the debugger causing either a crash or a
terrible debugging experience. This occurs when regalloc
tries to spill or split the liverange of these virtual VGPRs.
The CFIs should also be inserted at these intermediate
points to correctly propagate the CFI entries. It is not
currently implemented in the compiler. As a short-term fix,
we continue to spill CSR SGPRs into physical VGPR lanes for
the debugger to correctly compute the unwind information.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
arsenm added inline comments.Nov 14 2022, 1:32 PM
llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
478–483

I don't like having state here for a single operation that's happening in one pass and isn't valid for multiple uses. I don't really understand how this is being set and passed around

llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
652

Isn't this always required?

cdevadas added inline comments.Nov 15 2022, 10:30 AM
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
1356–1358

Alex's patch has landed. But this code is still needed to update the liveness for each instruction as eliminateFrameIndex is called here.

llvm/lib/Target/AMDGPU/SIInstrInfo.h
628 ↗(On Diff #472666)

Will change.

llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
60

It isn't.

llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
478–483

CurrentVRegSpilled is needed to track the virtual register (Liverange) for which the physical register was assigned. And it is needed only for fast regalloc . We need this mapping to correctly track the WWM spills as RegAllocFast spills/restore the physical registers directly as there is no VRM. This will be appropriately set with the delegate MRI_NoteVirtualRegisterSpill which is inserted in the RegAllocFast spill/reload functions.

SIMachineFunctionInfo is where the delegates are currently handled and I don't have a better place to move it.

llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
652

No. They are reserved only if RA inserts any whole wave spill.

cdevadas updated this revision to Diff 475518.Nov 15 2022, 10:44 AM

Rebase + Suggestions incorporated.

cdevadas updated this revision to Diff 477532.Nov 23 2022, 9:25 AM

Rebase + Incorporated changes after D138515 to move the handling of physReg to current VirtReg mapping entirely into the generic design.

cdevadas updated this revision to Diff 477752.Nov 24 2022, 5:02 AM

Implemented the WWM spill during RegAllocFast using the additional argument to the spiller interface introduced with patch D138656.

arsenm accepted this revision.Dec 15 2022, 10:45 AM
This revision is now accepted and ready to land.Dec 15 2022, 10:45 AM
This revision was landed with ongoing or failed builds.Dec 16 2022, 10:27 PM
This revision was automatically updated to reflect the committed changes.
jdoerfert reopened this revision.Dec 19 2022, 11:14 PM
jdoerfert added subscribers: jtramm, jhuber6, ronlieb, jdoerfert.

This patch causes OpenMC offloaded via OpenMP on AMDGPUs to crash at runtime. It looks like some corruption in the memory address.
You can find build instructions here: https://github.com/jtramm/openmc_offloading_builder

The commit before this one works fine though, assuming you cherry picked https://reviews.llvm.org/rGee1d000d43321590771a2f047c8c55d07d09ad28 first as it landed after.
I assume other codes will be impacted too.

@jtramm @ronlieb @jhuber6 FYI

This revision is now accepted and ready to land.Dec 19 2022, 11:14 PM

This patch causes OpenMC offloaded via OpenMP on AMDGPUs to crash at runtime. It looks like some corruption in the memory address.
You can find build instructions here: https://github.com/jtramm/openmc_offloading_builder

The commit before this one works fine though, assuming you cherry picked https://reviews.llvm.org/rGee1d000d43321590771a2f047c8c55d07d09ad28 first as it landed after.
I assume other codes will be impacted too.

@jtramm @ronlieb @jhuber6 FYI

Thanks. Going to take a look.

cdevadas updated this revision to Diff 496532.Feb 10 2023, 9:56 AM

Rebased after whole-wave copy implementation.

yassingh updated this revision to Diff 523333.May 18 2023, 4:09 AM
yassingh added a subscriber: yassingh.
  • Rebased
  • Incorporated the downstream code
cdevadas edited the summary of this revision. (Show Details)May 18 2023, 5:06 AM
arsenm added inline comments.Jun 21 2023, 5:27 PM
llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
66

It shouldn't have been SSA to begin with ad this doesn't de-SSA

67

Add a comment explaining the new vregs?

399

You don't need to specially handle the instruction, see AsmPrinterFlags

arsenm requested changes to this revision.Jun 22 2023, 10:55 AM

Just a few more nits

This revision now requires changes to proceed.Jun 22 2023, 10:55 AM
yassingh added inline comments.Jun 26 2023, 4:53 AM
llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
66

Removing this line causes machine-verifier to crash in few tests. Any hints @cdevadas ?

399

Tried adding a new flag here D153754

yassingh updated this revision to Diff 534590.Jun 26 2023, 8:57 AM

Review comments

yassingh added inline comments.Jun 26 2023, 9:07 AM
llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
66

Removing this line works fine when running the whole pipeline as the compiler knows the code here is not in SSA form. However, when SILowerSGPRSpills and related passes are run in isolation the verifier assumes the code to be in SSA form(possibly a bug there, also we are introducing virtual vgprs maybe that's the reason).

I can leave the line as it is or is there some way to update the test files to let the compiler know the input isn't SSA? I tried "isSSA: false", didn't work.

cdevadas added inline comments.Jun 26 2023, 9:41 AM
llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
66

Seems reasonable to retain this line for now. The compiler might not be able to decide that this pass is run post phi-elimination and assume SSA form by default. There must be a serialized option to control it for MIR tests.

yassingh added inline comments.Jun 26 2023, 9:23 PM
llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
66

Yeah, MIRParser::isSSA recomputes the SSA information and sets it to true, also it doesn't expose a way to override it.

yassingh updated this revision to Diff 535257.Jun 28 2023, 12:08 AM

Rebase over ancestor patch changes.

arsenm accepted this revision.Jun 28 2023, 9:25 AM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
1356–1357

This is a pre-existing issue that should be fixed, but we should not be scanning the entire block from the end on every spill. The block iteration should be reversed and we should lazily call enterBasicBlockEnd on the first seen spill

llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
66

this is kind of a mir parser bug

This revision is now accepted and ready to land.Jun 28 2023, 9:25 AM
yassingh updated this revision to Diff 537989.Jul 6 2023, 10:31 PM

fix comment

yassingh updated this revision to Diff 538200.Jul 7 2023, 10:40 AM

Rebase before merge

This revision was landed with ongoing or failed builds.Jul 7 2023, 10:46 AM
This revision was automatically updated to reflect the committed changes.
This revision is now accepted and ready to land.Jul 20 2023, 9:29 AM