This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/SI: xnack_mask is always reserved on VI
ClosedPublic

Authored by nhaehnle on Jan 5 2016, 11:57 AM.

Details

Summary

Somehow, I first interpreted the docs as saying space for xnack_mask is only
reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and
went back to actually test what is happening, and it turns out that xnack_mask
is always reserved at least on Tonga and Carrizo, in the sense that flat_scr
is always fixed below the SGPRs that are used to implement xnack_mask, whether
or not they are actually used.

I confirmed this by writing a shader using inline assembly to tease out the
aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where
we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so
xnack_mask is s[76:77] and vcc is s[78:79]).

This patch changes both the calculation of the total number of SGPRs and the
various register reservations to account for this. In particular, when the
XNACK feature isn't used, the gap between flat_scr and vcc can be used. This
seems to be in line with the hardware documentation.

Note that previously, even before my earlier change in r256794, the SGPRs that
alias to xnack_mask could end up being used as well when flat_scr was unused
and the total number of SGPRs happened to fall on the right alignment
(e.g. highest regular SGPR being used s29 and VCC used would lead to number
of SGPRs being 32, where s28 and s29 alias with xnack_mask). So if there
were some conflict due to such aliasing, we should have noticed that already.

Diff Detail

Event Timeline

nhaehnle updated this revision to Diff 44039.Jan 5 2016, 11:57 AM
nhaehnle retitled this revision from to AMDGPU/SI: xnack_mask is always reserved on VI.
nhaehnle updated this object.
nhaehnle added reviewers: arsenm, tstellarAMD.
nhaehnle added a subscriber: llvm-commits.
nhaehnle updated this revision to Diff 44044.Jan 5 2016, 12:20 PM

Sorry for the churn. We cannot leave a gap when the xnack feature is unused
because that breaks with the accounting done for the number of SGPRs.

tstellarAMD accepted this revision.Jan 6 2016, 8:12 PM
tstellarAMD edited edge metadata.

I had to stare at the extra register computation for a while to convince myself it was correct. Thanks for fixing this.

LGTM.

This revision is now accepted and ready to land.Jan 6 2016, 8:12 PM
This revision was automatically updated to reflect the committed changes.