Presently, there is an issue on MI100 (and probably other architecture) where the VGPR used for AGPR copies clobbers VGPR used for AGPR spill. AFAICT this is because in processFunctionBeforeFrameIndicesReplaced we think the VGPR register for AGPR spill is unused. This patch aims to correct that.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
We should work towards eliminating the special case handling of "spilling" between AGPRs and VGPRs. The allocator now knows they have a common allocatable class and introduces new vregs and copies as needed. We only still have this special handling because it breaks an optimization that works by accident now (which is unused subregister related)
Short of this, yes, any register used for special spill purposes needs to be reserved
The special handling is not about unused subregs. It is to allow not to spill the whole wide register into memory if there are free registers.
Add test to capture bug. Consecutive accvgpr reads before buffer_stores result in clobber before patch.
This is something I can explore after this patch -- for now there is an ask to resolve this into 5.5 so I think we will need this patch.
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp | ||
---|---|---|
1172 | Can you only do this if we actually needed to do an allocation? It's actually pretty expensive to recompute this and we do it a few too many times as it is already |
llvm/include/llvm/CodeGen/MachineRegisterInfo.h | ||
---|---|---|
906 ↗ | (On Diff #483238) | Hey Matt -- thanks for comments thus far. Regarding the warning -- were you thinking programmatically enforcing these conditions (e.g. via assert) or a literal warning in comment? Not sure how we can enforce this state with MRI |
llvm/include/llvm/CodeGen/MachineRegisterInfo.h | ||
---|---|---|
916–921 ↗ | (On Diff #483584) | I think blindly unreserving all aliases doesn't make as much sense as it does in the reserve direction. You're also not using this version, so might as well drop it for now. |
906 ↗ | (On Diff #483238) | I don't know how you could programmatically enforce this. Should also assert the reserved regs are already frozen before; this should only be used for incremental updates |
Extension of this work is to replace MRI.freezeReservedRegisters() with MRI.reserveReg() where applicable, and remove the register use tracking logic in allocateVGPRSpillToAGPR (OtherUsedRegs).
Can you only do this if we actually needed to do an allocation? It's actually pretty expensive to recompute this and we do it a few too many times as it is already