This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Remove return VGPRs from callee save list
ClosedPublic

Authored by critson on Jun 14 2023, 1:08 AM.

Details

Summary

There is no need to generate spill/restore for registers used in
return value. This matters for amdgpu_gfx calling convention
where CSR and Ret definitions overlap.

Diff Detail

Event Timeline

critson created this revision.Jun 14 2023, 1:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2023, 1:08 AM
critson requested review of this revision.Jun 14 2023, 1:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2023, 1:08 AM
critson updated this revision to Diff 531226.Jun 14 2023, 1:17 AM

Rebase on top of pre-commit test.

As pointed out by @sebastian-ne the other way to solve this is to remove the overlap of CSRs and return registers in amdgpu_gfx.

jsilvanus added inline comments.
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
1527

"[...] because these do not [...]"?

1529

"[...] of equal type / size"?

Maybe add an assertion above when finding multiple returns?

critson updated this revision to Diff 531231.Jun 14 2023, 1:53 AM
  • Fix comment
  • Add assertion that all returns are of equal size
critson marked 2 inline comments as done.Jun 14 2023, 1:54 AM

Thanks for the update. Looks good to me, but I'm not very familiar with this, so I'm leaving approval for someone else.

cdevadas added inline comments.Jun 14 2023, 7:06 AM
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
1521

I assume this is only a short-term fix until revising the gfx ABI to move return VGPRs into the scratch regs range to avoid the CSR spills/restores.

sebastian-ne accepted this revision.Jun 14 2023, 7:14 AM

LGTM if there are no further concerns from @cdevadas.

Just to check, I suppose calls to functions with that many return values are already handled fine?

llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
1521

I think this is a long-term solution :)
Callee-saves have the striped pattern, which is good and what we want.
But if a function decides that it wants to return a lot of values – and they fit into registers – then why not use these registers for the return value, even if they are callee-save (just that they cannot be callee-save for this function return).

This revision is now accepted and ready to land.Jun 14 2023, 7:14 AM

LGTM if there are no further concerns from @cdevadas.

No other concerns.

llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
1521

Ok.

Just to check, I suppose calls to functions with that many return values are already handled fine?

As far as I can tell, calls are working fine as-is.

This revision was landed with ongoing or failed builds.Jun 14 2023, 10:07 PM
This revision was automatically updated to reflect the committed changes.