This is an archive of the discontinued LLVM Phabricator instance.

[ARM][LowOverheadLoops] Allow unpredicated VORRs if the operand def produces zeroed false lanes
AbandonedPublic

Authored by samtebbs on Jul 19 2021, 3:44 AM.

Details

Summary
A VORR of the form $qX = VORR $qY, $qY essentially just moves $qY into
$qX but is currently not allowed in a tail-predicated loop.

If the definition of $qY produces zeroed false lanes then we now
consider the VORR to produce zeroed false lanes as well, since it
doesn't modify the incoming lane values.

This required modifying some existing tests that used a VORR of this
format to test unpredicated instructions that defined a live-out.

Diff Detail

Event Timeline

samtebbs created this revision.Jul 19 2021, 3:44 AM
samtebbs requested review of this revision.Jul 19 2021, 3:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 19 2021, 3:44 AM

Hello. Sorry for the delay, I had apparently forgotten about this. I'm not sure I understand the reasoning here, but I'm not sure I understood the reasoning for the FalseLanesZero to begin with.

You can leave the existing tests as they were, and update them with the new codegen. They will either be correct and better, or else we need to look into what is going wrong with them.

Hello. Sorry for the delay, I had apparently forgotten about this. I'm not sure I understand the reasoning here, but I'm not sure I understood the reasoning for the FalseLanesZero to begin with.

Yes, it's complicated to ensure the transformation is correct. The high-level idea is described in a comments in LowOverheadLoop::ValidateLiveOuts(). Please have a look at that, and if it is still unclear, we need to improve that.

samparker added inline comments.Jul 26 2021, 1:58 AM
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
882

This is the same for most of the instructions that we check here, and this logic is almost the same as the loop at the bottom of the function. So shouldn't the dominates check just be inserted in there instead? I'm slightly dubious about that tbh! This algorithm is working on the basis that the instructions in FalseLanesZero have been proved to behaviour how we want. Adding a VORR in there based upon future checks breaks our assumptions. I feel like this is better handled in ValidateLiveOuts when you've got the whole loop body analysed and you can add some extra logic when scanning though FalseLanesUnknown.

samtebbs added inline comments.Aug 9 2021, 5:57 AM
llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
882

Sorry for the delay in getting back to this and thanks for the feedback. I'll have a go at migrating this to ValidateLiveOuts.