The custom VGPR spills inserted during frame lowering
maintain a separate list for WWM reserved registers.
Added them into WWMSpills that already tracks such
reserved registers. It unifies the spill insertion.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp | ||
---|---|---|
1252 | If we're just going to add these to lane VGPRs, is there any real reason to distinguish the WWM registers? Could we just add them to LaneVGPRs to start with? | |
1252–1253 | Theoretically we could want to call determineCalleeSaves multiple times, so I'm not sure accumulating state here is a good idea. | |
llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp | ||
151 ↗ | (On Diff #424259) | Drop this comment, it's not unconditionally true. The relevant problem is addressed in frame lowering |
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp | ||
---|---|---|
1252 | The separate list was maintained for Serialize info. We could directly add them into LaneVGPRs otherwise. | |
1252–1253 | The best place to fill in the LaneVGPRs (both writelane and WWM reserved VGPRs) would be in processFunctionBeforeFrameFinalized. But that is invoked after determineCalleeSaves where we mask these spills off the CSR list so that the default spill insertion in PEI would defer them, and we could later custom insert them during FrameLowering. At this moment I don't see a better place to move it. We could add a condition to enable this code only once if determineCalleeSaves needs to be called multiple times. |
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp | ||
---|---|---|
1252 | The serialization is supposed to reflect whatever's here. It's not worth preserving for its own sake |
It makes sense to unify that handling, but I object to the term "lane VGPRs". In a way, it's the other way around: the registers that are used for SGPR->VGPR spills are de facto "WWM VGPRs", because they're used in a wave-wide manner, which is the opposite of lane-based usage.
llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h | ||
---|---|---|
474–483 | I'm confused by the relationship between "WWMReservedRegs" and WWMVGPRs. |
Improved the comment for WWMReservedRegs to have more details especially to differentiate it from the superset WWMVGPRs.
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp | ||
---|---|---|
1252 | allocateWWMSpill would be a better name |
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp | ||
---|---|---|
1252–1253 | With the rename it's clearer what's going on here now. I'd prefer to keep the frame index creation in processFunctionBeforeFrameFinalized |
Moved WWM reserved registers frame index creation into processFunctionBeforeFrameFinalized.
@sebastian-ne Is there a case when SIPreAllocateWWMRegs pass allocates a CSR VGPR to operands in WWM instructions?
If it is true, we may have a bug in the current handling that we failed to reset the WWMReservedRegs from the SavedVGPRs during determineCalleeSaves that allow us later to custom insert these spills in the emitPrologue/emitEpilogue functions.
Here is what we do for VGPRs used for SGPR spills.
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp#L1278
It doesn't come under the scope of this patch. But if this case can arise, we must fix it.
Sorry for getting back to you so late. You are correct, WWMReservedRegs should be removed from the SavedVGPRs during determineCalleeSaves.
Currently, if WWM registers fall on callee-saves, they are spilled twice, once as CSR, once as WWM register. At least it’s not incorrect.
Here’s a test demonstrating this: https://github.com/Flakebi/amdvlk-llvm-project/commit/0b0e056c5be110096a92c8175d6650d8535d4e33#diff-eddae1d72701ad027181cd89a7409a0b653da8debae655c3379c5b4d97934a73R1241-R1246
If we're just going to add these to lane VGPRs, is there any real reason to distinguish the WWM registers? Could we just add them to LaneVGPRs to start with?