When scavenging consider the sub-register of the source operand
to determine the bank of a candidate register (not just sub0).
Without this it is possible to introduce an infinite loop,
e.g. $sgpr15_sgpr16_sgpr17 can be assigned for a conflict between
$sgpr0 and SGPR_96:sub1.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
- Fix further edge cases involved with sub-registers.
- Add additional tests to cover bugs fixed.
Thank you for the quick review.
While testing this on VulkanCTS I found bug it introduced which I have now fixed.
I would appreciate it if you have a chance to take a quick second look.
llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp | ||
---|---|---|
401 | I do not like this test gone. We really need to bail here. |
llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp | ||
---|---|---|
401 | So if I understand correctly, the test was saying "if a sub-register covers all banks then exclude it from the stall count"? I have reinstated the test, but for all stall computations. |
llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp | ||
---|---|---|
401 | Right, thank you! LGTM. |
llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp | ||
---|---|---|
401 |
There are already other reasons why you get stall count mismatches, and I'm not sure it's worth trying to fix them. See D82643. |
I do not like this test gone. We really need to bail here.