This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Don't use stack space for SGPR->VGPR spills
ClosedPublic

Authored by arsenm on Oct 26 2016, 12:29 PM.

Details

Summary

Before frame offsets are calculated, try to eliminate the
frame indexes used by SGPR spills. Then we can delete them
after.

I think for now we can be sure that no other instruction
will be re-using the same frame indexes. It should be easy
to notice if this assumption ever breaks since everything
asserts if it tries to use a dead frame index later.

The unused emergency stack slot seems to still be left behind,
so an additional 4 bytes is still wasted.

Diff Detail

Event Timeline

arsenm updated this revision to Diff 75928.Oct 26 2016, 12:29 PM
arsenm retitled this revision from to AMDGPU: Don't use stack space for SGPR->VGPR spills.
arsenm updated this object.
arsenm added a subscriber: llvm-commits.
nhaehnle added inline comments.Oct 27 2016, 3:18 AM
lib/Target/AMDGPU/SIFrameLowering.cpp
115–117

The comment doesn't match the code anymore. Was the drop_back(2) removed on purpose? It seems like it would still be needed.

235

Spurious whitespace.

lib/Target/AMDGPU/SIRegisterInfo.cpp
1323–1332

I wonder if this shouldn't be an error condition. After all, the frontend explicitly requested some maximum number of SGPRs, and now we break that request. It depends a bit on the use case for amdgpu-num-sgpr of course.

arsenm added inline comments.Nov 7 2016, 11:22 AM
lib/Target/AMDGPU/SIFrameLowering.cpp
115–117

Yes, although I'm having trouble remembering why exactly. I think it should be redundant with the isReserved check

lib/Target/AMDGPU/SIRegisterInfo.cpp
1323–1332

I'm not really sure about this. I think we agreed that the other higher level attributes are what people should be using instead of directly specifying the number of registers. These I think were left for compiler stressing, so it probably doesn't really matter

arsenm added inline comments.Nov 7 2016, 11:30 AM
lib/Target/AMDGPU/SIRegisterInfo.cpp
1323–1332

Actually the code size point is only fixed by fitting all of the spills into the 64 element wide slot. The way it works in this patch is to leave the other 63-elements wasted, so the minimum offset is 64.

nhaehnle added inline comments.Nov 18 2016, 7:59 AM
lib/Target/AMDGPU/SIFrameLowering.cpp
115–117

Yes, the isAllocatable should cover it. But please still update the comment and the whitespace change below. Apart from that, the patch LGTM.

arsenm marked an inline comment as done.Feb 2 2017, 3:32 PM

There's a problem with this where the VGPR index to use is currently determined by the frame object offsets that PEI determines, so if you end up with multiple spills there's a chance you incorrectly end up re-using the same indexes. I'll need to replace that with something else first

arsenm updated this revision to Diff 87071.Feb 3 2017, 8:26 PM

Don't use frame offsets for determining spill VGPR

arsenm updated this revision to Diff 87100.Feb 4 2017, 10:31 AM

Skip loop through program if there are no SGPR spills

nhaehnle added inline comments.Feb 10 2017, 9:25 AM
lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
219–225

Should we remember the failure somehow? Otherwise, we will retry the allocation later on for the same FI.

lib/Target/AMDGPU/SIMachineFunctionInfo.h
200

Should be SGPR->VGPR.

arsenm added inline comments.Feb 10 2017, 9:33 AM
lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
219–225

Maybe? I'm not sure it's worth the effort. The benefit would be maybe slightly improved compile time in the exceptional case for more complexity

nhaehnle added inline comments.Feb 15 2017, 5:14 AM
lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
219–225

Fair enough. I think the patch is fine, just that one SGPR -> VGPR comment.

arsenm accepted this revision.Feb 21 2017, 11:23 AM
arsenm marked an inline comment as done.

r295753

This revision is now accepted and ready to land.Feb 21 2017, 11:23 AM
arsenm closed this revision.Feb 21 2017, 11:24 AM