This is an archive of the discontinued LLVM Phabricator instance.

Reverse subregister saved loops in register usage info collector.
ClosedPublic

Authored by airlied on Jun 25 2018, 8:10 PM.

Details

Summary

On AMDGPU we have 70 register classes, so iterating over all 70
each time and exiting is costly on the CPU. This patch flips the loop
around so that it loops over the 70 register classes first
and exits without doing the inner loop if needed.

On my test, just starting radv, this takes
RegUsageInfoCollector::runOnMachineFunction
from 6.0% of total time to 2.7% of total time,
and reduces the startup time from 2.24s to 2.19s.
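
The loop flip described in the summary can be illustrated with a small standalone sketch. This is not the code from the patch: `RegClass`, `Result`, `registerMajor`, and `classMajor` are hypothetical names invented for illustration, and the real pass works on LLVM's own register info types. The sketch only assumes what the summary states, namely that most register classes can be rejected with a cheap class-level check, so putting the class loop on the outside lets the inner loop be skipped entirely for those classes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a target register class (not an LLVM type).
struct RegClass {
  bool coveredBySubRegs;      // cheap class-level flag
  std::vector<bool> members;  // members[r] is true if register r is in the class
  bool contains(std::size_t r) const { return r < members.size() && members[r]; }
};

struct Result {
  std::vector<bool> covered;    // covered[r]: r is in some flagged class
  std::size_t classVisits = 0;  // (register, class) pairs examined
};

// Register-major order: every unsaved register walks the class list
// until it finds a flagged class that contains it.
Result registerMajor(const std::vector<RegClass>& classes,
                     const std::vector<bool>& saved) {
  Result out;
  out.covered.assign(saved.size(), false);
  for (std::size_t r = 0; r < saved.size(); ++r) {
    if (saved[r]) continue;
    for (const RegClass& rc : classes) {
      ++out.classVisits;
      if (rc.coveredBySubRegs && rc.contains(r)) {
        out.covered[r] = true;
        break;
      }
    }
  }
  return out;
}

// Class-major order: unflagged classes are rejected once, before any
// register is looked at, so their inner loop never runs.
Result classMajor(const std::vector<RegClass>& classes,
                  const std::vector<bool>& saved) {
  Result out;
  out.covered.assign(saved.size(), false);
  for (const RegClass& rc : classes) {
    if (!rc.coveredBySubRegs) continue;
    for (std::size_t r = 0; r < saved.size(); ++r) {
      if (saved[r] || out.covered[r]) continue;
      ++out.classVisits;
      if (rc.contains(r)) out.covered[r] = true;
    }
  }
  return out;
}
```

Both functions compute the same `covered` set; they differ only in how many (register, class) pairs they examine. How much that saves on a real target depends on how many classes carry the flag, which this sketch does not model.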

Diff Detail

Repository
rL LLVM

Event Timeline

airlied created this revision. Jun 25 2018, 8:10 PM
mareko added a subscriber: mareko. Jun 29 2018, 11:39 AM

Adding more people; still reading up on why the covered logic was necessary in the first place...

MatzeB accepted this revision. Jul 13 2018, 12:14 PM
MatzeB added subscribers: marcello.maggioni, bogner.

I fear the covered check will hit us perf-wise either way (+CC some people from our GPU team) and that we need to find a different solution long term (like the targets announcing the saved registers themselves instead of the generic code piecing together the information...).
Anyway, if this works for you, I'd be fine with this patch as a stopgap.

This revision is now accepted and ready to land. Jul 13 2018, 12:14 PM
airlied updated this revision to Diff 163206. Aug 29 2018, 3:25 PM

Rebased onto latest master. I require someone else to push this as I don't have commit rights.

Please upload with context

This revision was automatically updated to reflect the committed changes.