This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Accelerate SIRegisterInfo::getPhysRegClass
ClosedPublic

Authored by critson on Dec 6 2022, 5:48 AM.

Details

Summary

Dynamically build a mapping table from physical register to base
register class for getPhysRegClass.

Diff Detail

Event Timeline

critson created this revision.Dec 6 2022, 5:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 6 2022, 5:48 AM
critson requested review of this revision.Dec 6 2022, 5:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 6 2022, 5:48 AM
arsenm added a comment.Dec 6 2022, 5:50 AM

This is better, but it would be better to have a tablegen generated pre-sorted table

Ideally this would be a static table, but the changes required to TableGen to understand register classes for this are non-trivial.

I am seeing >0.5% speed up in compile time with this change.
So the cost of computing the mapping appears negligible compared to the benefit.

arsenm added a comment.Dec 6 2022, 5:52 AM

Ideally this would be a static table, but the changes required to TableGen to understand register classes for this are non-trivial.

Why? The regclass structure is exactly represented there already

Ideally this would be a static table, but the changes required to TableGen to understand register classes for this are non-trivial.

Why? The regclass structure is exactly represented there already

The register classes are represented, but there is no way to enumerate all the registers in a register class within a standard structure like GenericTable.
RegisterClass MemberList is a dag after all.
This also prevents iterating over MemberList to define some kind of mapping class between a Register and RegisterClass.

The mapping of registers to register classes is one to many, so it is also not a simple case of modifying TableGen to output a new mapping.
As far as I can tell there is no obvious rule that actually defines the base class for a given register (allocation priority and super classing are insufficient).
The best solution I can see is we extend RegisterClass to allow us to mark which classes are "base classes" and what order to enumerate them to produce a mapping table.
This functionality seems very specific to our needs, so I don't know if that is really an appropriate thing to do.

foad accepted this revision.Dec 7 2022, 2:37 AM

LGTM.

llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2772–2773

Surely you can assert this?

This revision is now accepted and ready to land.Dec 7 2022, 2:37 AM
arsenm added a comment.Dec 7 2022, 6:35 AM

This functionality seems very specific to our needs, so I don't know if that is really an appropriate thing to do.

It's plenty appropriate

critson updated this revision to Diff 481225.Dec 8 2022, 3:11 AM
  • Turn this into a proof of concept for D139616
foad added inline comments.Dec 8 2022, 3:31 AM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
40

Remove this

llvm/lib/Target/AMDGPU/SIRegisterInfo.h
49

Remove this

critson updated this revision to Diff 481227.Dec 8 2022, 3:36 AM
  • Remove stray code from earlier revision
foad added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1432–1435

This will sometimes call getPhysRegBaseClass on a virtual register, which I think should be disallowed.

critson updated this revision to Diff 481760.Dec 9 2022, 3:19 PM
  • Avoid passing virtual registers to getPhysRegBaseClass in IsCopyFromSGPR
arsenm requested changes to this revision.Dec 13 2022, 2:12 PM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1434

It's even easier to do the class test on the virtual register

This revision now requires changes to proceed.Dec 13 2022, 2:12 PM
critson added inline comments.Dec 13 2022, 6:28 PM
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1434

I am sorry, I don't know what you mean by "easier".
Easier as in less lines of code? Or, easier as in more computationally efficient?

Perhaps you are suggesting we replace most of this code with a call to TRI->isSGPRReg()?
However, this function should only be true for physical registers (not any SGPR reg).

critson marked 4 inline comments as done.Dec 15 2022, 5:52 PM

@arsenm - please can you elucidate the changes you'd like made.

arsenm added inline comments.Dec 15 2022, 6:01 PM
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1434

I mean saying any virtual registers are not SGPRs is wrong; if this is actually reachable with virtual registers it should try to handle them. If not it should assert

foad accepted this revision.Dec 16 2022, 12:03 AM
foad added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1434

If so that should be a separate patch. This one is NFC.

critson marked 2 inline comments as done.Dec 19 2022, 1:18 AM
critson added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1434

I'll work on a separate patch, it is definitely called on virtual registers so an assertion would not be appropriate. Change it to TRI->isSGPRReg() causes failures in later passes so I will have to investigate that in detail.

arsenm accepted this revision.Dec 19 2022, 5:24 AM
This revision is now accepted and ready to land.Dec 19 2022, 5:24 AM
This revision was landed with ongoing or failed builds.Dec 19 2022, 11:23 PM
This revision was automatically updated to reflect the committed changes.
critson marked an inline comment as done.