This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Prevent overlapping subregs in getCoveringSubRegIndexes
ClosedPublic

Authored by Pierre-vh on Jan 12 2023, 12:52 AM.

Details

Summary

If getCoveringSubRegIndexes returns a set of subregister indexes where some subregisters overlap others, it can create unsatisfiable copy bundles that eventually cause VirtRegRewriter to error out due to "cycles in copy bundle".

We can simply prevent this by making the algorithm skip over subregisters indexes that would cause an overlap with already-covered lanes.

Note that in the case of AMDGPU, this problem is caused by the lack of subregisters indexes for 13/14/15-register tuples. We have everything up until 12, then we have 16 and 32 but nothing between 12 and 16.
This means that the best candidate to do the least amount of copies when splitting a 29-register tuple was to copy (e.g.) 0-15 and 14-29, causing an overlap.
With this change, getCoveringSubRegIndexes will now prefer using something like 0-15, 16-28 and 1

Diff Detail

Event Timeline

Pierre-vh created this revision.Jan 12 2023, 12:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2023, 12:52 AM
Pierre-vh requested review of this revision.Jan 12 2023, 12:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2023, 12:52 AM

Testing for this is a bit tricky, fortunately it looks like it's already covered by a test.
If another form of testing is needed, perhaps we could do a unit test? I feel like it'd be a lot more straightforward to just query TRI in a unit test and check all the corner cases there.

Pierre-vh edited the summary of this revision. (Show Details)Jan 12 2023, 1:00 AM
arsenm added a reviewer: foad.
arsenm added inline comments.Jan 13 2023, 7:21 AM
llvm/lib/CodeGen/TargetRegisterInfo.cpp
580

s/as much/as many/

llvm/test/CodeGen/AMDGPU/extend-phi-subrange-not-in-parent.mir
61

Needs a new test for the cases that were failing

foad added inline comments.Jan 13 2023, 7:31 AM
llvm/test/CodeGen/AMDGPU/extend-phi-subrange-not-in-parent.mir
29

Did anything actually change here? If not can you first commit a patch to regenerate the checks for this file, and then rebase this patch?

arsenm added inline comments.Jan 13 2023, 7:31 AM
llvm/lib/CodeGen/TargetRegisterInfo.cpp
576

this isn't memory

Testing for this is a bit tricky, fortunately it looks like it's already covered by a test.
If another form of testing is needed, perhaps we could do a unit test? I feel like it'd be a lot more straightforward to just query TRI in a unit test and check all the corner cases there.

You already have the reduced mir test?

Pierre-vh marked 4 inline comments as done.EditedJan 16 2023, 12:06 AM

Testing for this is a bit tricky, fortunately it looks like it's already covered by a test.
If another form of testing is needed, perhaps we could do a unit test? I feel like it'd be a lot more straightforward to just query TRI in a unit test and check all the corner cases there.

You already have the reduced mir test?

Only for the virtregrewriter pass (so with already split liveranges). Plus, the testcase I have seems highly sensitive to register allocation changes, and as liverange splitting happens in the register allocator it's not a good test case as it may start passing not because of this change but because of other unrelated changes that affect regalloc (for instance it just passes now on trunk without any changes)
Any test case could also be made irrelevant if we add the missing 13-15 tuples subregister indexes

Should I try to come up with an artificial case like the ones in splitkit.mir? What should the structure be to tickle the register allocator just enough to make liverange splitting copy sub0-sub29 of a sreg_1024?

Maybe I can do something based on extend-phi-subrange-not-in-parent, just cleaned up and with a comment saying exactly what it needs to test.

llvm/test/CodeGen/AMDGPU/extend-phi-subrange-not-in-parent.mir
29

That indeed didn't change in this patch, I regenerated the test
https://github.com/llvm/llvm-project/commit/9bdfd7c8db2066e5b50b9599248660bdf8eda7f0

Pierre-vh marked an inline comment as done.

Rebase, address comments

Testing for this is a bit tricky, fortunately it looks like it's already covered by a test.
If another form of testing is needed, perhaps we could do a unit test? I feel like it'd be a lot more straightforward to just query TRI in a unit test and check all the corner cases there.

You already have the reduced mir test?

Only for the virtregrewriter pass (so with already split liveranges). Plus, the testcase I have seems highly sensitive to register allocation changes, and as liverange splitting happens in the register allocator it's not a good test case as it may start passing not because of this change but because of other unrelated changes that affect regalloc (for instance it just passes now on trunk without any changes)

This is a general issue, still having something is better than nothing

Any test case could also be made irrelevant if we add the missing 13-15 tuples subregister indexes

Could also try forcing it on a different target

Should I try to come up with an artificial case like the ones in splitkit.mir? What should the structure be to tickle the register allocator just enough to make liverange splitting copy sub0-sub29 of a sreg_1024?

That would also be good, it can be tricky and time consuming but it’s usually possible. It’s a bit of an art to craft these tests

Maybe I can do something based on extend-phi-subrange-not-in-parent, just cleaned up and with a comment saying exactly what it needs to test.

arsenm accepted this revision.Jan 17 2023, 8:36 AM
This revision is now accepted and ready to land.Jan 17 2023, 8:36 AM
This revision was landed with ongoing or failed builds.Jan 18 2023, 12:49 AM
This revision was automatically updated to reflect the committed changes.