This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][SIFrameLowering] Mark VGPR used for AGPR spills as reserved
ClosedPublic

Authored by jrbyrnes on Dec 8 2022, 1:28 PM.

Details

Summary

Presently, there is an issue on MI100 (and probably other architecture) where the VGPR used for AGPR copies clobbers VGPR used for AGPR spill. AFAICT this is because in processFunctionBeforeFrameIndicesReplaced we think the VGPR register for AGPR spill is unused. This patch aims to correct that.

Diff Detail

Event Timeline

jrbyrnes created this revision.Dec 8 2022, 1:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 8 2022, 1:28 PM
jrbyrnes requested review of this revision.Dec 8 2022, 1:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 8 2022, 1:28 PM
arsenm added a comment.Dec 8 2022, 1:30 PM

We should work towards eliminating the special case handling of "spilling" between AGPRs and VGPRs. The allocator now knows they have a common allocatable class and introduces new vregs and copies as needed. We only still have this special handling because it breaks an optimization that works by accident now (which is unused subregister related)

arsenm added a comment.Dec 8 2022, 1:31 PM

We should work towards eliminating the special case handling of "spilling" between AGPRs and VGPRs. The allocator now knows they have a common allocatable class and introduces new vregs and copies as needed. We only still have this special handling because it breaks an optimization that works by accident now (which is unused subregister related)

Short of this, yes, any register used for special spill purposes needs to be reserved

We should work towards eliminating the special case handling of "spilling" between AGPRs and VGPRs. The allocator now knows they have a common allocatable class and introduces new vregs and copies as needed. We only still have this special handling because it breaks an optimization that works by accident now (which is unused subregister related)

The special handling is not about unused subregs. It is to allow not to spill the whole wide register into memory if there are free registers.

jrbyrnes updated this revision to Diff 481729.EditedDec 9 2022, 12:41 PM

Add test to capture bug. Consecutive accvgpr reads before buffer_stores result in clobber before patch.

We should work towards eliminating the special case handling of "spilling" between AGPRs and VGPRs. The allocator now knows they have a common allocatable class and introduces new vregs and copies as needed. We only still have this special handling because it breaks an optimization that works by accident now (which is unused subregister related)

This is something I can explore after this patch -- for now there is an ask to resolve this into 5.5 so I think we will need this patch.

jrbyrnes edited the summary of this revision. (Show Details)Dec 9 2022, 5:01 PM
jrbyrnes added a reviewer: hsmhsm.
arsenm requested changes to this revision.Dec 13 2022, 2:11 PM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
1172

Can you only do this if we actually needed to do an allocation? It's actually pretty expensive to recompute this and we do it a few too many times as it is already

This revision now requires changes to proceed.Dec 13 2022, 2:11 PM
jrbyrnes updated this revision to Diff 482907.Dec 14 2022, 9:59 AM
jrbyrnes edited the summary of this revision. (Show Details)

Use flags to minimize calls to MRI.freezeReservedRegs

jrbyrnes updated this revision to Diff 482910.Dec 14 2022, 10:05 AM

Fix typo (toggle NeedsReserve to false after checking for pending reservation)

arsenm added inline comments.Dec 14 2022, 10:08 AM
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
1173

This is an API problem. Should add a way to incrementally add and remove reserved registers

1174

Don't see why you need to clear this

jrbyrnes updated this revision to Diff 483238.Dec 15 2022, 10:50 AM

Introduce MRI->reserveReg() and use to reserve VGPR for AGPR spills

jrbyrnes marked 3 inline comments as done.Dec 15 2022, 10:51 AM
arsenm added inline comments.Dec 15 2022, 10:52 AM
llvm/include/llvm/CodeGen/MachineRegisterInfo.h
906 ↗(On Diff #483238)

This needs to consider all aliasing registers. This also needs a warning that this cannot be used in the middle of a walk in a function or when you have liveness info available

910 ↗(On Diff #483238)

Ditto

jrbyrnes added inline comments.Dec 16 2022, 9:04 AM
llvm/include/llvm/CodeGen/MachineRegisterInfo.h
906 ↗(On Diff #483238)

Hey Matt -- thanks for comments thus far. Regarding the warning -- were you thinking programmatically enforcing these conditions (e.g. via assert) or a literal warning in comment? Not sure how we can enforce this state with MRI

jrbyrnes updated this revision to Diff 483584.Dec 16 2022, 10:23 AM

Consolidate API. Warn user about required state conditions for use.

arsenm accepted this revision.Dec 16 2022, 10:58 AM
arsenm added inline comments.
llvm/include/llvm/CodeGen/MachineRegisterInfo.h
916–921 ↗(On Diff #483584)

I think blindly unreserving all aliases doesn't make as much sense as it does in the reserve direction. You're also not using this version, so might as well drop it for now.

906 ↗(On Diff #483238)

I don't know how you could programmatically enforce this. Should also assert the reserved regs are already frozen before; this should only be used for incremental updates

This revision is now accepted and ready to land.Dec 16 2022, 10:58 AM
jrbyrnes updated this revision to Diff 483618.Dec 16 2022, 11:37 AM
jrbyrnes marked 4 inline comments as done.

Remove unreserve reg, assert reserved is frozen

This revision was landed with ongoing or failed builds.Dec 16 2022, 12:09 PM
This revision was automatically updated to reflect the committed changes.

Extension of this work is to replace MRI.freezeReservedRegisters() with MRI.reserveReg() where applicable, and remove the register use tracking logic in allocateVGPRSpillToAGPR (OtherUsedRegs).