A VORR of the form $qX = VORR $qY, $qY essentially just moves $qY into $qX but is currently not allowed in a tail-predicated loop. If the definition of $qY produces zeroed false lanes then we now consider the VORR to produce zeroed false lanes as well, since it doesn't modify the incoming lane values. This required modifying some existing tests that used a VORR of this format to test unpredicated instructions that defined a live-out.
Details
- Reviewers
dmgreen SjoerdMeijer NickGuy samparker
Diff Detail
Unit Tests
Event Timeline
Hello. Sorry for the delay, I had apparently forgotten about this. I'm not sure I understand the reasoning here, but I'm not sure I understood the reasoning for the FalseLanesZero to begin with.
You can leave the existing tests as they were, and update them with the new codegen. They will either be correct and better, or else we need to look into what is going wrong with them.
Yes, it's complicated to ensure the transformation is correct. The high-level idea is described in a comments in LowOverheadLoop::ValidateLiveOuts(). Please have a look at that, and if it is still unclear, we need to improve that.
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
882 | This is the same for most of the instructions that we check here, and this logic is almost the same as the loop at the bottom of the function. So shouldn't the dominates check just be inserted in there instead? I'm slightly dubious about that tbh! This algorithm is working on the basis that the instructions in FalseLanesZero have been proved to behaviour how we want. Adding a VORR in there based upon future checks breaks our assumptions. I feel like this is better handled in ValidateLiveOuts when you've got the whole loop body analysed and you can add some extra logic when scanning though FalseLanesUnknown. |
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp | ||
---|---|---|
882 | Sorry for the delay in getting back to this and thanks for the feedback. I'll have a go at migrating this to ValidateLiveOuts. |
clang-format: please reformat the code