
[AMDGPU] Define 16 bit VGPR subregs
Needs Review · Public

Authored by rampitec on Feb 19 2020, 2:49 PM.



We have loads that preserve either the low or the high 16 bits of
their destinations. However, we always use a whole 32-bit register
for these. The same happens with 16-bit stores: we have to
use a full 32-bit register, so if the high bits are clobbered the
register needs to be copied. One example of such code is
added to load-hi16.ll.

The proper solution to the problem is to define 16-bit subregs
and use them in operations that do not read the other half
of a VGPR, or that preserve it when the VGPR is written.
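
To make the "preserve the other half" behaviour concrete, here is a minimal sketch in plain C++. It is not LLVM code: Vgpr32, readLo16, readHi16, writeLo16, writeHi16 and demoHi16LoadKeepsLowHalf are hypothetical names that only model one 32-bit register as two independently addressable 16-bit halves.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a single 32-bit VGPR; not LLVM code.
struct Vgpr32 {
  std::uint32_t bits;
};

// Read one 16-bit half without touching the other.
constexpr std::uint16_t readLo16(Vgpr32 r) {
  return static_cast<std::uint16_t>(r.bits & 0xFFFFu);
}
constexpr std::uint16_t readHi16(Vgpr32 r) {
  return static_cast<std::uint16_t>(r.bits >> 16);
}

// Write one 16-bit half and preserve the other half.
constexpr Vgpr32 writeLo16(Vgpr32 r, std::uint16_t v) {
  return Vgpr32{(r.bits & 0xFFFF0000u) | v};
}
constexpr Vgpr32 writeHi16(Vgpr32 r, std::uint16_t v) {
  return Vgpr32{(r.bits & 0x0000FFFFu) | (static_cast<std::uint32_t>(v) << 16)};
}

// A "load into the high half" leaves the low half intact.
inline void demoHi16LoadKeepsLowHalf() {
  Vgpr32 r{0x00001234u};
  r = writeHi16(r, 0xABCD);
  assert(readLo16(r) == 0x1234);
  assert(readHi16(r) == 0xABCD);
}
```

If each half is a register in its own right, an operation that touches only one half no longer has to be treated as reading or clobbering the full 32 bits.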

This patch simply defines subregisters and register classes.
At the moment there should be no difference in code generation.
A lot more work is needed to actually use these new register
classes. Therefore, there are no new tests at this time.

Register weight calculation changes with the new subregs, so
appropriate adjustments were made to keep all results just
as they are now, especially the register pressure calculations.
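
One way to picture why the counts can stay the same: each 32-bit register now has two lanes instead of one, so the two bits of a pair have to be folded together before counting. The sketch below is an illustration only, not the patch's code. The bit layout (bit 2*i for the low half and bit 2*i+1 for the high half of register i) is an assumption, and numCovered32BitRegs is a hypothetical name.

```cpp
#include <cstdint>

// Assumed layout: bit 2*i is the low half and bit 2*i+1 is the high half
// of 32-bit register i. Returns how many 32-bit registers the mask touches.
inline unsigned numCovered32BitRegs(std::uint64_t laneMask) {
  const std::uint64_t lowHalfBits = 0x5555555555555555ULL;
  // Fold every high-half bit onto its low-half neighbour, then keep only
  // the low-half positions so each register contributes at most one bit.
  std::uint64_t folded = (laneMask | (laneMask >> 1)) & lowHalfBits;
  unsigned count = 0;
  while (folded != 0) {
    folded &= folded - 1; // clear the lowest set bit
    ++count;
  }
  return count;
}
```

Under this assumption a register counts once whether the mask covers its low half, its high half, or both, which matches the count from before the split.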


Event Timeline

rampitec created this revision. Feb 19 2020, 2:49 PM
Herald added a project: Restricted Project. Feb 19 2020, 2:50 PM
rampitec updated this revision to Diff 247088. Thu, Feb 27, 1:12 PM

Rebased. Ping.

arsenm added inline comments. Fri, Feb 28, 7:55 AM
39–79 ↗(On Diff #247088)

This should be split to a separate change

753–757 ↗(On Diff #247088)



Can this be a static_assert? I would hope getSubRegIndexLaneMask is constexpr.


This should get a comment noting that there is no encoding for the high registers, and that the low registers are just encoded as the 32-bit register.

rampitec marked 5 inline comments as done. Fri, Feb 28, 12:42 PM
rampitec added inline comments.

Unfortunately it is not a constexpr. I wanted to make a static assert right at getNumCoveredRegs(), but that did not fly.
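
For context on why constexpr matters here: static_assert only accepts a constant expression, so a check on a non-constexpr function has to become a run-time assert. The sketch below uses hypothetical stand-ins (constexprLaneMask, runtimeLaneMask, checkLaneMaskAtRuntime) with made-up mask values. It does not reproduce getSubRegIndexLaneMask or getNumCoveredRegs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constexpr variant: usable inside static_assert.
constexpr std::uint64_t constexprLaneMask(unsigned subRegIdx) {
  return std::uint64_t{1} << subRegIdx;
}

// Hypothetical non-constexpr variant, e.g. one that reads a table.
inline std::uint64_t runtimeLaneMask(unsigned subRegIdx) {
  static const std::uint64_t table[] = {0x1, 0x2, 0x4, 0x8};
  return table[subRegIdx & 3u];
}

// Compiles: the operand is a constant expression.
static_assert(constexprLaneMask(1) == 0x2, "unexpected lane mask");

// Would not compile, because runtimeLaneMask is not constexpr:
// static_assert(runtimeLaneMask(1) == 0x2, "unexpected lane mask");

// Fallback: the same check, performed when the code runs.
inline void checkLaneMaskAtRuntime() {
  assert(runtimeLaneMask(1) == 0x2 && "unexpected lane mask");
}
```

The run-time form only fires in builds with assertions enabled, which is the usual trade-off when the compile-time check is not available.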

rampitec updated this revision to Diff 247349. Fri, Feb 28, 12:44 PM
rampitec marked an inline comment as done.

Split the change and added comments.