This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Define 16 bit VGPR subregs
ClosedPublic

Authored by rampitec on Feb 19 2020, 2:49 PM.

Details

Summary

We have loads preserving low and high 16 bits of their
destinations. However, we always use a whole 32 bit register
for these. The same happens with 16 bit stores, we have to
use full 32 bit register so if high bits are clobbered the
register needs to be copied. One example of such code is
added to the load-hi16.ll.

The proper solution to the problem is to define 16 bit subregs
and use them in the operations which do not read another half
of a VGPR or preserve it if the VGPR is written.

This patch simply defines subregisters and register classes.
At the moment there should be no difference in code generation.
A lot more work is needed to actually use these new register
classes. Therefore, there are no new tests at this time.

Register weight calculation has changed with new subregs so
appropriate changes were made to keep all calculations just
as they are now, especially calculations of register pressure.

Diff Detail

Event Timeline

rampitec created this revision.Feb 19 2020, 2:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 19 2020, 2:50 PM
rampitec updated this revision to Diff 247088.Feb 27 2020, 1:12 PM

Rebased. Ping.

arsenm added inline comments.Feb 28 2020, 7:55 AM
llvm/include/llvm/MC/LaneBitmask.h
39–79 ↗(On Diff #247088)

This should be split to a separate change

llvm/lib/CodeGen/MIRParser/MIParser.cpp
753–757 ↗(On Diff #247088)

Ditto

llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
42

Can this be a state_assert? I would hope getSubRegIndexLaneMask is constexpr

llvm/lib/Target/AMDGPU/SIRegisterInfo.td
457

This should get a comment noting that there is no encoding for the high registers, and the low register are just encoded as the 32-bit register

rampitec marked 5 inline comments as done.Feb 28 2020, 12:42 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
42

Unfortunately it is not a constexpr. I wanted to make a static assert right at getNumCoveredRegs(), but that did not fly.

rampitec updated this revision to Diff 247349.Feb 28 2020, 12:44 PM
rampitec marked an inline comment as done.

Split the change and added comments.

arsenm accepted this revision.Mar 31 2020, 6:34 AM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.td
18–20

Could use a comment for what this accomplishes

This revision is now accepted and ready to land.Mar 31 2020, 6:34 AM
rampitec updated this revision to Diff 253939.Mar 31 2020, 11:34 AM
rampitec marked an inline comment as done.

Rebased.
Added comment.

This revision was automatically updated to reflect the committed changes.