This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][SIFrameLowering] Unify PEI SGPR spill saves and restores
ClosedPublic

Authored by cdevadas on Aug 22 2022, 11:19 PM.

Details

Summary

There is a lot of customization and eventually code duplication in the
frame lowering that handles special SGPR spills like the one needed for
the Frame Pointer. Incorporating any additional SGPR spill currently
makes it difficult during PEI. This patch introduces a new spill builder
to efficiently handle such spill requirements. Various spill methods are
special handled using a separate class.

Diff Detail

Event Timeline

cdevadas created this revision.Aug 22 2022, 11:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 22 2022, 11:19 PM
cdevadas requested review of this revision.Aug 22 2022, 11:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 22 2022, 11:19 PM

Looks ok to me in general.

llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
31

This function doesn’t do anything with CSRs, although the name says so?

100–101

Comment should say <FI> for the renamed variable.

273

Why does this code split the copies into SubRegs and copyFromScratchSGPR doesn’t?

llvm/lib/Target/AMDGPU/SIRegisterInfo.h
37–38 ↗(On Diff #454702)

Why not make this an enum class RegSaveKind?

cdevadas added inline comments.Aug 23 2022, 3:23 AM
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
31

You're right. Neither scratch nor CSR should be in the name. I will change it.

100–101

Thanks. Will change it.

273

For the purpose of emitting CFI directives for each copy. There is no CFI emitted in the epilogue.
There is more coming in the downstream version of this patch.

llvm/lib/Target/AMDGPU/SIRegisterInfo.h
37–38 ↗(On Diff #454702)

Will do.

cdevadas updated this revision to Diff 454780.Aug 23 2022, 4:15 AM

Addressed review comments.

sebastian-ne added inline comments.Aug 23 2022, 5:04 AM
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
273

Hm, does that mean CFI cannot represent register pairs?
If so, can we emit a single copy instruction for the super register and multiple CFI instructions for the subregisters?

cdevadas added inline comments.Aug 23 2022, 9:43 AM
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
273

Sure, we will get the benefit for 64-bit copies as we have the 'S_MOV_B64' instruction. But for the CFI emission, we need composite expressions for register pairs. @scott.linder to comment about its impact.
For now, I will change the code to emit a single COPY instruction for the register pair while retaining separate CFI instructions for equivalent SubRegs.

scott.linder added inline comments.Aug 23 2022, 3:43 PM
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
273

I agree with using the most efficient representation of the copy (sounds like a 64-bit MOV is best) and generating whatever CFI we need to describe it.

In the case of the 64-bit program address space PC and the wave64 EXEC this will be a pair of SGPRs, described via a composite expression in the CFI, like how CD mentions. For every other case we can generate independent CFI instructions to separately describe the SubRegs which correspond to DWARF registers.

For example, we don't have a DWARF register number for the pair SGPR40_SGPR41, whereas we do for SGPR40 and SGPR41 separately, so we can still emit something along the lines of:

$sgpr40_sgpr41 = S_MOV_B64 $sgpr50_sgpr51
CFI_INSTRUCTION register sgpr50, sgpr40
CFI_INSTRUCTION register sgpr51, sgpr41

cdevadas updated this revision to Diff 455103.Aug 24 2022, 1:30 AM

Restricted the code to emit a single COPY in case of copy to scratch SGPR. The related CFIs still have split components per sub reg.

sebastian-ne accepted this revision.Aug 24 2022, 1:51 AM

Thanks, looks good to me.

This revision is now accepted and ready to land.Aug 24 2022, 1:51 AM
arsenm added inline comments.Sep 15 2022, 11:35 AM
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
53–54

No else after return

cdevadas updated this revision to Diff 462456.Sep 23 2022, 5:42 AM

Removed else after the return statement.

arsenm added inline comments.Sep 23 2022, 8:56 AM
llvm/lib/Target/AMDGPU/SIRegisterInfo.h
30–37 ↗(On Diff #462456)

Don't see why this needs to pollute SIRegisterInfo. Belongs in MFI or maybe FrameLowering

cdevadas updated this revision to Diff 462534.Sep 23 2022, 10:01 AM

Moved CustomSGPRSaveInfo class into MFI

cdevadas updated this revision to Diff 472280.Nov 1 2022, 6:50 AM
cdevadas retitled this revision from [AMDGPU][SIFrameLowering] Unify custom SGPR spill saves and restores (NFC) to [AMDGPU][SIFrameLowering] Unify PEI SGPR spill saves and restores (NFC).
cdevadas edited the summary of this revision. (Show Details)

Used PrologEpilog instead of the prefix "Custom" to refer SGPR spills during PEI. Also, fixed the review title and summary.

arsenm accepted this revision.Nov 1 2022, 10:42 AM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
197

Really we ought to be stepping back from the end of the entry block and always using reverse liveness (but I guess you're just moving this function, so that's a separate change)

cdevadas updated this revision to Diff 482887.Dec 14 2022, 9:07 AM
cdevadas retitled this revision from [AMDGPU][SIFrameLowering] Unify PEI SGPR spill saves and restores (NFC) to [AMDGPU][SIFrameLowering] Unify PEI SGPR spill saves and restores.

code rebase

arsenm accepted this revision.Dec 14 2022, 10:18 AM
This revision was landed with ongoing or failed builds.Dec 16 2022, 10:20 PM
This revision was automatically updated to reflect the committed changes.