This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Save WWM registers in functions
ClosedPublic

Authored by sebastian-ne on Mar 26 2021, 10:49 AM.

Details

Summary

The values of registers in inactive lanes needs to be saved during
function calls.

Save all registers used for whole wave mode, similar to how it is done
for VGPRs that are used for SGPR spilling.

Diff Detail

Event Timeline

sebastian-ne created this revision.Mar 26 2021, 10:49 AM
sebastian-ne requested review of this revision.Mar 26 2021, 10:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 26 2021, 10:49 AM
arsenm added inline comments.Mar 28 2021, 8:01 AM
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
751

I'm not sure what this is setting (I'm also not a huge fan of stepping the iterator back to find the inserted instruction since it spreads the assumption that this only inserts one instruction)

critson added inline comments.
llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
464–466

I wonder if SmallSetVector would be more appropriate, as it would save you iterating this to check if a given register is reserved.

foad added a subscriber: foad.Mar 29 2021, 3:25 AM
sebastian-ne marked 2 inline comments as done.

I'm not sure what this is setting (I'm also not a huge fan of stepping the iterator back to find the inserted instruction since it spreads the assumption that this only inserts one instruction)

Good point, I changed that to be an additional parameter of storeRegToStackSlot and loadRegFromStackSlot.

Change the SmallVector for WWMReservedRegs to a SmallDenseMap. Thanks for the suggestion.

arsenm added inline comments.Mar 30 2021, 3:17 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
1474 ↗(On Diff #333805)

I think I missed adding an operand to the pseudos to tell it what frame register to use. Do we really need a flag for this? Why can't eliminateFrameIndex infer SP/FP/BP from the frame index?

sebastian-ne marked an inline comment as done.

Rebased again

Friendly ping for review.
The previous comments are all fixed, the code should be quite obvious now.

arsenm accepted this revision.Apr 23 2021, 6:11 AM
This revision is now accepted and ready to land.Apr 23 2021, 6:11 AM
This revision was landed with ongoing or failed builds.Apr 23 2021, 7:10 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Apr 23 2021, 7:35 AM

This breaks tests on Windows: http://45.33.8.238/win/37451/step_11.txt

Please take a look, and revert for now if it takes a while to fix.

foad added a comment.Apr 23 2021, 7:48 AM

This breaks tests on Windows: http://45.33.8.238/win/37451/step_11.txt

Please take a look, and revert for now if it takes a while to fix.

You almost certainly just need to add a triple to the test's RUN lines, a la rG29ccc8523a4ab0c245b8a62664db34f28adf4451.

Thanks, resubmitted with -mtriple instead of -march.