This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Define special SGPR subregs
ClosedPublic

Authored by rampitec on Apr 21 2020, 2:10 PM.

Details

Summary

These are used in SReg_32 and when we start to use SGPR_LO16
there will be compaints that not all registers in RC support
all subreg indexes. For now it is NFC.

Unused regunits are reserved so that verifier does not complain
about missing phys reg live-ins.

Diff Detail

Event Timeline

rampitec created this revision.Apr 21 2020, 2:10 PM
arsenm added inline comments.Apr 21 2020, 3:40 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

Can't you just iterate SGPR_16LO and use reserveRegisterTuples?

rampitec marked an inline comment as done.Apr 21 2020, 3:46 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

This will reserve all SGPRs, not saying that it will reserve SGPR_LO16 which is needed to be used.

rampitec marked an inline comment as done.Apr 21 2020, 4:31 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

Also the point is this is mostly not about SGPR, this is about SReg. Stuff like vcc_lo_hi16. Without reserving verifier complains about missing live-ins in some basic blocks.

arsenm added inline comments.Apr 21 2020, 5:01 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

I mean I would expect there to be an SReg_16LO too

rampitec marked an inline comment as done.Apr 21 2020, 5:24 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

Not really, these are needed. You materialize a 16 bit immediate into SGPR_LO16 so that it can be consumed by a 16 bit operand. Well at least this is a working concept so far.

rampitec marked an inline comment as done.Apr 22 2020, 4:37 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

Look, that is the code I am going to use:

if (Subtarget->has16BitInsts()) {
  addRegisterClass(MVT::i16, &AMDGPU::SGPR_LO16RegClass);
  addRegisterClass(MVT::f16, &AMDGPU::SGPR_LO16RegClass);

For this SGPR_LO16 shall be available.

rampitec marked an inline comment as done.Apr 23 2020, 11:11 AM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

I see the source of confusion now. It is a wrong RC, should be !SGPR_LO16RegClass.contains(Low). Noticed it when a test run out of registers.

rampitec updated this revision to Diff 259639.Apr 23 2020, 11:16 AM

Fixed reserved bitset and rebased.

arsenm added inline comments.Apr 23 2020, 3:36 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

I mean we're missing a register class definition. I would expect a parallel set of _16LO to go with the current SGPR+ sets of registers we already have

rampitec marked an inline comment as done.Apr 23 2020, 3:50 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

Do you mean you want to have SReg_LO16? So far I do not see a real use for it.

rampitec marked an inline comment as done.Apr 23 2020, 4:37 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

It also will not help the logic in this loop.

arsenm added inline comments.Apr 24 2020, 3:26 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

Yes, otherwise there will be an asymmetry. Really SReg_LO16 should be the allocable form so VCC isn't special

arsenm added inline comments.Apr 24 2020, 3:51 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
269–270

Add a note that this is because there's no SReg_LO16 at least

rampitec marked 2 inline comments as done.Apr 24 2020, 4:05 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
266–267

Ok, I will add it. I am not sure it will help physreg livein into a BB thought. I have a patch somewhere which defines it, but I recall I have seen the same issue.

269–270

I will add SReg_LO16 and see how it goes. Thanks.

rampitec marked an inline comment as done.Apr 24 2020, 8:11 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
269–270

Hm... I have added SReg_LO16 and it went completely off rail with a lot of failed tests, but it did not help VCC liveness as I expected. In fact the liveness problem could occur with any physreg, it is just we never explicitely use something like SGPR0, but we do so with VCC.

rampitec updated this revision to Diff 260389.Apr 27 2020, 11:35 AM

Added SReg_LO16.

Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2020, 11:35 AM
arsenm accepted this revision.Apr 28 2020, 12:42 PM
This revision is now accepted and ready to land.Apr 28 2020, 12:42 PM

Commit is temporarily reverted, it asserts in a debug tablegen.

This revision was automatically updated to reflect the committed changes.
rampitec updated this revision to Diff 260777.Apr 28 2020, 2:55 PM

Fixed tablegen crash. Combination of CoveredBySubregs = 1 and isArtificial = 1 in one of the RUs seems to confuse tablegen.