Page MenuHomePhabricator

[AMDGPU] Make GCNRegBankReassign assign based on subreg banks
ClosedPublic

Authored by critson on Jul 29 2020, 9:54 PM.

Details

Summary

When scavenging consider the sub-register of the source operand
to determine the bank of a candidate register (not just sub0).
Without this it is possible to introduce an infinite loop,
e.g. $sgpr15_sgpr16_sgpr17 can be assigned for a conflict between
$sgpr0 and SGPR_96:sub1.

Diff Detail

Event Timeline

critson created this revision.Jul 29 2020, 9:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 29 2020, 9:54 PM
critson requested review of this revision.Jul 29 2020, 9:54 PM
critson updated this revision to Diff 281898.Jul 30 2020, 5:41 AM

Correctly handle determining banks for wide sub-registers.

This revision is now accepted and ready to land.Jul 30 2020, 10:08 AM
critson updated this revision to Diff 282217.Jul 31 2020, 7:33 AM
  • Fix further edge cases involved with sub-registers.
  • Add additional tests to cover bugs fixed.

LGTM

Thank you for the quick review.
While testing this on VulkanCTS I found bug it introduced which I have now fixed.
I would appreciate it if you have a chance to take a quick second look.

rampitec added inline comments.Jul 31 2020, 10:21 AM
llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp
401

I do not like this test gone. We really need to bail here.

critson updated this revision to Diff 282366.Jul 31 2020, 8:45 PM
  • Exclude sub-registers that cover all banks from stall calculations.
critson added inline comments.Jul 31 2020, 8:58 PM
llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp
401

So if I understand correctly, the test was saying "if a sub-register covers all banks then exclude it from the stall count"?
My problem with the test was that it was being applied only when a reg+bank was specified for reassignment.
This meant that the stall count computation method was different when testing a bank assignment, causing stall count mismatches and infinite loops in some cases.

I have reinstated the test, but for all stall computations.
Based on the above understanding this should be correct as we never want to spend time evaluating unavoidable stalls.

rampitec added inline comments.Aug 3 2020, 10:15 AM
llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp
401

Right, thank you! LGTM.

This revision was landed with ongoing or failed builds.Aug 3 2020, 8:55 PM
This revision was automatically updated to reflect the committed changes.
foad added inline comments.Aug 5 2020, 1:08 AM
llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp
401

This meant that the stall count computation method was different when testing a bank assignment, causing stall count mismatches

There are already other reasons why you get stall count mismatches, and I'm not sure it's worth trying to fix them. See D82643.