This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Preserve only the inactive lanes of scratch vgprs
ClosedPublic

Authored by cdevadas on Sep 23 2022, 5:50 AM.

Details

Summary

In general, a callee is free to use the scratch registers without
preserving its previous state. However, the VGPR used for SGPR
spilling can potentially have its inactive lanes overwritten by
the writelane instructions. When the function returns, it can
cause unexpected behavior if the VGPR value is not preserved
appropriately.

The current scheme to preserve the inactive lanes of such
scratch VGPRs is not done rightly. It preserves all lanes
and causes the outgoing values (if any) getting overwritten
by the epilog restores. It thus corrupts the return value.

To avoid such situations, this patch ensures we preserve only
the inactive lanes.

Diff Detail

Event Timeline

cdevadas created this revision.Sep 23 2022, 5:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 23 2022, 5:50 AM
cdevadas requested review of this revision.Sep 23 2022, 5:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 23 2022, 5:50 AM
arsenm added inline comments.Sep 23 2022, 7:43 AM
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
876

I'd rename IsCalleeSavedReg to EnableInactiveLanes

1107–1108

Needs a comment explaining we're going to get 2 exec flip and restore blocks

llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
290

This needs to check the MRI callee saved regs, not TRI directly for dynamically changed CSRs

llvm/test/CodeGen/AMDGPU/GlobalISel/image-waterfall-loop-O0.ll
148

In the case of lane exit we'll end up restoring values that weren't preserved, but I guess that's OK since those lanes are exiting as dead anyway

cdevadas updated this revision to Diff 462537.Sep 23 2022, 10:03 AM

Suggestions incorporated.

arsenm accepted this revision.Sep 28 2022, 1:02 PM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
284–287

I don't really like having to split these into two arrays like this but I don't have a better suggestion

290

Actually I've seen this mistake too many times. We should probably rename the TRI version and hide it

This revision is now accepted and ready to land.Sep 28 2022, 1:02 PM
cdevadas updated this revision to Diff 482888.Dec 14 2022, 9:11 AM

code rebase

arsenm accepted this revision.Dec 14 2022, 10:17 AM
This revision was landed with ongoing or failed builds.Dec 16 2022, 10:22 PM
This revision was automatically updated to reflect the committed changes.