When the callee requires a dynamic stack realignment,
it is not possible to correctly access the incoming
stack arguments using the stack pointer. We reserve a
base pointer in such cases to access the function arguments
inside the callee. The base pointer will hold the incoming
stack pointer value before any kind of delta added to it.
Details
- Reviewers
t-tye scott.linder arsenm - Commits
- rG7c4e711ef8d8: [AMDGPU] Enable base pointer.
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
llvm/docs/AMDGPUUsage.rst | ||
---|---|---|
3875–3884 | This isn't supposed to be an ABI visible thing, so I'm not sure if this belongs here. The note about replacing it with a 0 immediate offset doesn't make sense too | |
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp | ||
110–121 | I don't think this case is stressed in the tests, but also don't think we should handle this case here. We assume all SGPR->VGPR spills have been resolved in SILowerSGPRSpills. Trying to get an additional VGPR to spill in PEI is less reliable | |
646 | Register() instead of NoRegister | |
676–678 | Should do a single loop over the function with the other case above (also may require .sortUniqueLiveIns) | |
885 |
| |
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
116 | Missing space after comma | |
122 | Can this go in the header or is is the register enum not defined there? | |
311 | This is a unneeded check | |
llvm/test/CodeGen/AMDGPU/stack-realign.ll | ||
163 | Should add a test explanation comment | |
203 | Can you note the BP is saved/restored in an SGPR |
llvm/docs/AMDGPUUsage.rst | ||
---|---|---|
6715 | s/requie/requires/ | |
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp | ||
795–798 | Is this comment accurate anymore? I think it can go back where it was with "base pointer" just changed to "frame pointer". A new comment for the "base pointer" would need to talk about referencing function arguments, not locals. | |
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
118 | Is it possible to have some mechanism for only setting up the BP when it is actually used? Looking at X86 it seems like they don't do this, so I am OK with us not doing it either, just wondering if it is feasible. | |
llvm/test/CodeGen/AMDGPU/callee-frame-setup.ll | ||
302 | This is an example of where we have dynamic realignment but don't actually use the BP to refer to a fixed index. Again, being correct is more important, but if it is feasible it seems nice to only set this up when it is used. |
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp | ||
---|---|---|
795–798 | I don't think we need to stress it out for the FP. | |
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
118 | It sounds good to me. We can enable it only when there is a fixed-frame. |
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
---|---|---|
118 | I think getNumFixedObjects is not the right check here. That won't deal with the case where we have dead stack objects (as is common from eliminated SGPR spils) |
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
---|---|---|
118 | It would be good to have a straight forward check here to see if there is any fixed-frame and getNumFixedObjects would do that. |
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp | ||
---|---|---|
744 | Don't need the parentheses | |
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
118 | It doesn't, because for us it's full of litter from SGPR spills. This will be overly conservative, and trigger when we have no actual stack objects. Is this only called after the frame is finalized? The way the x86 version expresses this makes more sense to me: bool CantUseFP = needsStackRealignment(MF); return CantUseFP && CantUseSP(MFI); This might work for us the same? |
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
---|---|---|
118 | There are calls to hasBasePointer() early during SIIselLowering and it is not always called after the frame is finalized. It doesn't work in the way that we wanted if we include the function CantUseSP(). |
This will require some changes to the CFI, although previously this case was broken anyway so it does not regress the unwind information. I think we will just need to define the CFA relative to the base pointer (if it is actually set up) or the stack pointer (if realignment is needed, but the base pointer is not).
I posted a piglit test to try to exercise this https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/291
This isn't supposed to be an ABI visible thing, so I'm not sure if this belongs here. The note about replacing it with a 0 immediate offset doesn't make sense too