This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Use wider scalar spills for SGPR spilling
ClosedPublic

Authored by arsenm on Oct 28 2016, 4:11 PM.

Details

Reviewers
tstellarAMD
Summary

Since the spill is for the whole wave, scalar spills
don't have the swizzling problems that vector stores do,
and a single 4-byte allocation is enough to spill a 64-element
register. This should reduce the number of spill instructions and
put all the spills for a register in the same cacheline.
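
To illustrate the claimed reduction in spill instructions, here is a minimal sketch of how a wider store width could be chosen for an SGPR tuple. It is not the code from this patch. The helper names `pickSpillEltSize` and `numSpillStores` are hypothetical, and the sketch assumes scalar stores of 4, 8 and 16 bytes are available and that the widest width evenly dividing the register size is used.

```cpp
// Hypothetical helper (not from the patch): pick the widest scalar store
// width, in bytes, that evenly divides the register size.
// Assumes 4-, 8- and 16-byte scalar stores exist on the target.
constexpr unsigned pickSpillEltSize(unsigned RegSizeBytes) {
  return (RegSizeBytes % 16 == 0) ? 16u
       : (RegSizeBytes % 8 == 0)  ? 8u
                                  : 4u;
}

// Number of store instructions needed to spill one register.
// With UseWideStores == false, every 32-bit component gets its own store.
constexpr unsigned numSpillStores(unsigned RegSizeBytes, bool UseWideStores) {
  return RegSizeBytes / (UseWideStores ? pickSpillEltSize(RegSizeBytes) : 4u);
}

// A 512-bit (64-byte) SGPR tuple: 16 dword stores versus 4 wide stores.
static_assert(numSpillStores(64, false) == 16, "one store per dword");
static_assert(numSpillStores(64, true) == 4, "four 16-byte stores");
```

Under these assumptions a 64-bit or 128-bit register spills with a single store, while a 96-bit register falls back to three 4-byte stores because neither wider width divides 12 bytes.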

This should save allocated private size, but for now it doesn't.
The extra slots are allocated for each component, but never used
because the frame layout is essentially finalized before frame
indices are replaced. For always using the scalar store path,
this should probably be moved into processFunctionBeforeFrameFinalized.

Event Timeline

arsenm updated this revision to Diff 76269. Oct 28 2016, 4:11 PM
arsenm retitled this revision to AMDGPU: Use wider scalar spills for SGPR spilling.
arsenm updated this object.
arsenm added a subscriber: llvm-commits.
tstellarAMD accepted this revision. Nov 1 2016, 2:23 PM
tstellarAMD edited edge metadata.

LGTM. It would be good if we could turn some of these spilling tests into MIR tests.

This revision is now accepted and ready to land. Nov 1 2016, 2:23 PM
arsenm closed this revision. Dec 1 2016, 5:05 PM

r288445