The values of registers in inactive lanes need to be saved during
function calls.
Save all registers used for whole wave mode, similar to how it is done
for VGPRs that are used for SGPR spilling.
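As an illustration of why inactive lanes matter, here is a toy model in plain C++ (not LLVM code). It assumes a 64-lane wave, models a VGPR as an array of per-lane values, and models the exec mask as a bitset. The names `calleeUsingWWM`, `copyLanes`, `writeLanes`, and `inactiveLanesPreserved` are hypothetical and do not appear in the patch. The sketch only shows that a save and restore performed under the caller's exec mask loses the inactive lanes once the callee writes the register in whole wave mode, whereas a save with all lanes enabled preserves them.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model: one VGPR holds a 32-bit value per lane; exec selects active lanes.
constexpr std::size_t WaveSize = 64; // assumption: wave64
using ExecMask = std::bitset<WaveSize>;
using VGPR = std::array<std::uint32_t, WaveSize>;

// Write Value into every lane that is enabled in Exec.
void writeLanes(VGPR &Reg, const ExecMask &Exec, std::uint32_t Value) {
  for (std::size_t Lane = 0; Lane < WaveSize; ++Lane)
    if (Exec[Lane])
      Reg[Lane] = Value;
}

// Copy the lanes enabled in Exec from Src to Dst (models a store or load
// that only touches active lanes).
void copyLanes(VGPR &Dst, const VGPR &Src, const ExecMask &Exec) {
  for (std::size_t Lane = 0; Lane < WaveSize; ++Lane)
    if (Exec[Lane])
      Dst[Lane] = Src[Lane];
}

// Hypothetical callee that clobbers Reg in whole wave mode. If SaveAllLanes
// is true, the save/restore runs with every lane enabled; otherwise it runs
// under the caller's exec mask and misses the inactive lanes.
void calleeUsingWWM(VGPR &Reg, const ExecMask &CallerExec, bool SaveAllLanes) {
  ExecMask AllLanes;
  AllLanes.set();
  const ExecMask SaveExec = SaveAllLanes ? AllLanes : CallerExec;

  VGPR StackSlot{};
  copyLanes(StackSlot, Reg, SaveExec);   // prologue: save
  writeLanes(Reg, AllLanes, 0xDEADBEEF); // body: WWM write touches all lanes
  copyLanes(Reg, StackSlot, SaveExec);   // epilogue: restore
}

// Returns true if the caller sees Reg unchanged in every lane after the call.
bool inactiveLanesPreserved(bool SaveAllLanes) {
  VGPR Reg;
  for (std::size_t Lane = 0; Lane < WaveSize; ++Lane)
    Reg[Lane] = static_cast<std::uint32_t>(Lane);
  const VGPR Before = Reg;

  const ExecMask CallerExec(0xFFFFull); // lanes 0-15 active, 16-63 inactive
  assert(CallerExec.count() < WaveSize && "need at least one inactive lane");

  calleeUsingWWM(Reg, CallerExec, SaveAllLanes);
  return Reg == Before;
}
```

In this model, `inactiveLanesPreserved(true)` holds and `inactiveLanesPreserved(false)` does not, which is the situation the summary describes for registers used in whole wave mode.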
Differential D99429
[AMDGPU] Save WWM registers in functions
Closed. Public. Authored by sebastian-ne on Mar 26 2021, 10:49 AM.
Event Timeline
Herald added subscribers: kerbowa, hiraditya, t-tye and 7 others. (Mar 26 2021, 10:49 AM)
sebastian-ne marked 2 inline comments as done.
Good point, I changed that to be an additional parameter of storeRegToStackSlot and loadRegFromStackSlot. Change the SmallVector for WWMReservedRegs to a SmallDenseMap. Thanks for the suggestion.
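The comment does not say what the map stores. As a rough sketch of the container change, the following uses `std::unordered_map` as a stand-in for `SmallDenseMap` and assumes each reserved register maps to an optional stack slot index. The class name, method names, and the `Register`/`FrameIndex` aliases are hypothetical, chosen only for illustration. The point is that a map allows a per-register lookup and a place to record where the register was spilled, which a plain list of registers does not.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>

using Register = unsigned; // hypothetical stand-in for a register id
using FrameIndex = int;    // hypothetical stand-in for a stack slot index

// Sketch: reserved WWM registers keyed by register, each with an optional
// stack slot that is filled in once a slot has been allocated.
class WWMReservedRegs {
  std::unordered_map<Register, std::optional<FrameIndex>> Regs;

public:
  // Reserving the same register twice leaves a single entry.
  void reserve(Register Reg) { Regs.try_emplace(Reg, std::nullopt); }

  bool isReserved(Register Reg) const { return Regs.count(Reg) != 0; }

  // Record the stack slot used to save Reg; Reg must already be reserved.
  void setFrameIndex(Register Reg, FrameIndex FI) {
    auto It = Regs.find(Reg);
    assert(It != Regs.end() && "register was never reserved");
    It->second = FI;
  }

  // Returns the stack slot for Reg, or nullopt if none has been assigned
  // (or if Reg is not reserved at all).
  std::optional<FrameIndex> getFrameIndex(Register Reg) const {
    auto It = Regs.find(Reg);
    if (It == Regs.end())
      return std::nullopt;
    return It->second;
  }

  std::size_t size() const { return Regs.size(); }
};
```

With a vector, the same queries would need a linear search plus a parallel structure for the slot indices; the map keeps both in one place.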
Friendly ping for review.
This revision is now accepted and ready to land. (Apr 23 2021, 6:11 AM)
This revision was landed with ongoing or failed builds. (Apr 23 2021, 7:10 AM)
Closed by commit rG91464c30bfcf: [AMDGPU] Save WWM registers in functions (authored by sebastian-ne).
This revision was automatically updated to reflect the committed changes.
This breaks tests on Windows: http://45.33.8.238/win/37451/step_11.txt
Please take a look, and revert for now if it takes a while to fix.
sebastian-ne added a reverting change: rG22d99cb63f96: Revert "[AMDGPU] Save WWM registers in functions". (Apr 23 2021, 7:39 AM)
You almost certainly just need to add a triple to the test's RUN lines, a la rG29ccc8523a4ab0c245b8a62664db34f28adf4451.
Revision Contents
Diff 333805
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
llvm/lib/Target/AMDGPU/SIInstrInfo.h
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
llvm/test/CodeGen/AMDGPU/wwm-reserved-spill.ll
I'm not sure what this is setting. (I'm also not a huge fan of stepping the iterator back to find the inserted instruction, since it spreads the assumption that this only inserts one instruction.)
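The iterator concern can be shown with a small standalone sketch. It uses `std::list<std::string>` as a stand-in for a basic block, and a hypothetical `emitStore` helper that sometimes emits two instructions instead of one. None of these names come from the patch. Stepping back one position from the insertion point finds the first inserted instruction only when exactly one was emitted; returning the iterator from the helper does not depend on that assumption.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

using Block = std::list<std::string>; // stand-in for a basic block

// Hypothetical helper: emits a store before InsertPt, optionally preceded by
// an address computation. Returns an iterator to the first emitted entry.
Block::iterator emitStore(Block &BB, Block::iterator InsertPt,
                          bool NeedsAddressSetup) {
  if (NeedsAddressSetup) {
    Block::iterator First = BB.insert(InsertPt, std::string("compute_address"));
    BB.insert(InsertPt, std::string("store"));
    return First;
  }
  return BB.insert(InsertPt, std::string("store"));
}

// Fragile: assumes emitStore inserted exactly one instruction.
std::string firstInsertedByStepBack(Block &BB, Block::iterator InsertPt,
                                    bool NeedsAddressSetup) {
  emitStore(BB, InsertPt, NeedsAddressSetup);
  assert(InsertPt != BB.begin() && "nothing was inserted before InsertPt");
  return *std::prev(InsertPt);
}

// Robust: uses the iterator that the helper hands back.
std::string firstInsertedByReturn(Block &BB, Block::iterator InsertPt,
                                  bool NeedsAddressSetup) {
  return *emitStore(BB, InsertPt, NeedsAddressSetup);
}
```

When the helper emits two entries, the step-back version reports the trailing `store` rather than the leading `compute_address`, which is the kind of silent breakage the comment warns about.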