The phi created in a low overhead loop gets created with a default register class it seems. There are then copied inserted between the low overhead loop pseudo instructions (which produce/consume GPRlr instructions) and the phi holding the induction. This patch removes those as a step towards attempting to make t2LoopDec and t2LoopEnd a single instruction, and appears useful in it's own right as shown in the tests.
Details
Diff Detail
Event Timeline
Getting phis to use the right register class sounds like a good generic change to make. If creating them like that is a pain, how about just moving most of this logic into OptimizePhis?
Yeah this might look a little strange. This was really written for a following patch (D91358), where it was more about trying to create the new instructions without copies, than trying to specifically remove them. We do still need to remove those before the phi, but it seems to make sense to do that all together as we are already needing to check here. I was trying to get as much stuff out of the way as I could beforehand that other patch to simplify it.
I'm not sure about doing it in general. We are replacing a phi that accepts many registers with one that only accepts 1. We are being very careful here that it is OK to do that, but in general that might not always be true.
llvm/lib/Target/ARM/MVETPAndVPTOptimisationsPass.cpp | ||
---|---|---|
161 | How about splitting this up in 2 separate functions: RevertLoopWithCall, and MergeLoopEnd? The latter is not really descriptive anymore if it still revert loops. | |
183 | I do need to ramp up on this again, but it wasn't immediately clear what problem we are solving here. | |
189 | nit: perhaps a more descriptive name than OK. |
llvm/lib/Target/ARM/MVETPAndVPTOptimisationsPass.cpp | ||
---|---|---|
161 | This is one of those cases where splitting up the patch has caused more trouble than it has helped! I was trying to make the later patch simpler but it has made these ones more confusing as a result. We don't really need to revert calls here _unless_ we are converting to a t2LoopEndDec. If we do convert, we need to ensure that there is nothing else using LR (so no calls), and that there is no other uses of the induction variable, so all the copies are removed. They are really just parts of converting to a t2LoopEndDec that I was trying to make work on their own, so split them into separate patches. I've added a comment to the start of the function to explain this (which isn't true quite yet, but will be once the other patches are in). Hopefully that makes it OK? We can still split it out if you think that's better, but there is at least some method to this madness. |
Looks like a good change to me, and this looked reasonable:
I'm not sure about doing it in general. We are replacing a phi that accepts many registers with one that only accepts 1. We are being very careful here that it is OK to do that, but in general that might not always be true.
But perhaps wait a day with committing in case @samparker has more ideas/comments about this:
llvm/lib/Target/ARM/MVETPAndVPTOptimisationsPass.cpp | ||
---|---|---|
161 | Thanks, this makes it a lot better! | |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/count_dominates_start.mir | ||
128 ↗ | (On Diff #309915) | Ah, sorry, must have overlooked this MIR test in my first review. |
llvm/test/CodeGen/Thumb2/LowOverheadLoops/count_dominates_start.mir | ||
---|---|---|
128 ↗ | (On Diff #309915) | This was a new one that serendipitously came in with the rebase. Thanks for the review. |
How about splitting this up in 2 separate functions: RevertLoopWithCall, and MergeLoopEnd? The latter is not really descriptive anymore if it still revert loops.