This is an archive of the discontinued LLVM Phabricator instance.

RegUsageInfoCollector: Don't iterate all regs for every reg class
ClosedPublic

Authored by arsenm on Jul 8 2019, 5:21 AM.

Details

Reviewers
qcolombet
Summary

This is extremely slow on AMDGPU, which has a lot of physical registers
and a lot of register classes. Visit registers with common regunits
instead. NFC, except for the compile-time improvement.
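
To illustrate the idea of visiting only registers with common regunits, here is a minimal, self-contained sketch. It is a toy model and not the code from this patch: ToyReg, buildUnitIndex, overlappingRegs, exampleRegs, and the register names S0, S1, D0, S2 are all hypothetical, and no LLVM API is used. The assumption is that each register covers a small set of units, so an inverse unit-to-registers index lets us find overlapping registers without scanning every register of every class.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model (hypothetical, not LLVM's data structures): a physical register
// is described by its name and the register units it covers.
struct ToyReg {
  std::string Name;
  std::vector<unsigned> Units;
};

// Build the inverse map: for each unit, the indices of registers covering it.
std::vector<std::vector<std::size_t>>
buildUnitIndex(const std::vector<ToyReg> &Regs, unsigned NumUnits) {
  std::vector<std::vector<std::size_t>> UnitToRegs(NumUnits);
  for (std::size_t R = 0; R < Regs.size(); ++R) {
    for (unsigned U : Regs[R].Units) {
      assert(U < NumUnits && "unit index out of range");
      UnitToRegs[U].push_back(R);
    }
  }
  return UnitToRegs;
}

// Collect every register that shares at least one unit with Reg
// (Reg itself included). Only the units of Reg are visited, so the cost
// does not depend on the total number of registers or register classes.
std::set<std::size_t>
overlappingRegs(const std::vector<ToyReg> &Regs,
                const std::vector<std::vector<std::size_t>> &UnitToRegs,
                std::size_t Reg) {
  assert(Reg < Regs.size() && "register index out of range");
  std::set<std::size_t> Result;
  for (unsigned U : Regs[Reg].Units)
    for (std::size_t Other : UnitToRegs[U])
      Result.insert(Other);
  return Result;
}

// Hypothetical register file: D0 is made of S0 and S1; S2 is independent.
std::vector<ToyReg> exampleRegs() {
  return {{"S0", {0u}}, {"S1", {1u}}, {"D0", {0u, 1u}}, {"S2", {2u}}};
}
```

In this model, marking S0 as used only touches the registers listed under unit 0 (S0 and D0), whereas a class-by-class scan would look at every register once per class. On a target with many registers and many classes, that difference is where a compile-time saving of this kind would come from.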

Event Timeline

arsenm created this revision. Jul 8 2019, 5:21 AM
qcolombet added inline comments. Jul 8 2019, 8:22 AM
lib/CodeGen/RegUsageInfoCollector.cpp:211

We miss the setting of HasAtLeastOneSubreg in the loop.

arsenm updated this revision to Diff 208453. Jul 8 2019, 10:21 AM

Remove the loop entirely. Ultimately, determineCalleeSaves adds everything from MCRegAliasIterator anyway.

Hmm, I missed that. Why did we add this code in the first place?

It was added by D46315

Looks like it was papering over some SystemZ issues... or maybe it was the TRI::getCalleeSavedRegs vs. MRI::getCalleeSavedRegs.

Anyhow, LGTM.

qcolombet accepted this revision. Jul 8 2019, 11:03 AM
This revision is now accepted and ready to land. Jul 8 2019, 11:03 AM
arsenm closed this revision. Jul 8 2019, 11:48 AM

r365370