This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Don't transform to CTR loop if the decrement branch instr. would end up in a different loop
ClosedPublic

Authored by nemanjai on Apr 26 2018, 5:22 PM.

Details

Summary

This fixes PR37229.

Basically, we don't want to transform a loop to a CTR loop if the decrement-and-branch instruction would end up in an exiting block that is also part of another nested loop. Previously this couldn't happen because of how SCEV computed the trip count. However, the check is fundamentally needed.

Diff Detail

Repository
rL LLVM

Event Timeline

nemanjai created this revision.Apr 26 2018, 5:22 PM
chandlerc added inline comments.Apr 26 2018, 5:24 PM
test/CodeGen/PowerPC/no-ctr-loop-if-exit-in-nested-loop.ll
2–5

I'm really worried that this test is overreduced.

I think you should craft an IR test that directly forms the pattern you describe where an outer loop has an exiting block shared with an inner loop, but the trip count can only be computed for the outer loop.

I'd also encourage you to not have any undef in the branches. Instead, you really just want to have a "real" (if minimal) loop structure.

efriedma added inline comments.
lib/Target/PowerPC/PPCCTRLoops.cpp
579

The actual condition you need to check is whether the branch executes exactly once per iteration.

For most code, this check is equivalent, but it's possible to have an irreducible "loop" which is not an LLVM Loop.

nemanjai added inline comments.Apr 26 2018, 7:29 PM
lib/Target/PowerPC/PPCCTRLoops.cpp
579

Sorry if this is a naive question, but would adding a check that this loop's header dominates the exit block suffice for such a case?

I don't really know what an irreducible loop looks like, but with this question, I'm making the assumption that for such a loop, there would have to be a path to the exit block that doesn't go through the header. Of course, if this assumption is wrong, please correct me.

test/CodeGen/PowerPC/no-ctr-loop-if-exit-in-nested-loop.ll
2–5

Yeah, I took the easy approach of unleashing bugpoint on the original test case. I'll reduce it manually to keep the original structure in the test.

nemanjai updated this revision to Diff 144271.Apr 26 2018, 9:02 PM

I've produced a test case that closely resembles the original CFG. Also updated the condition to hopefully address the valid point that @efriedma brought up.
Please let me know if this is sufficient.

mkazantsev added inline comments.Apr 26 2018, 9:35 PM
lib/Target/PowerPC/PPCCTRLoops.cpp
578

By definition, loop header dominates all blocks of this loop. So once you've checked that LI->getLoop(*I) is L, it will automatically give you the second part of the check. I think you can assert that DT->dominates(L->getHeader(), *I)) if the first part is true.

nemanjai added inline comments.Apr 27 2018, 3:46 AM
lib/Target/PowerPC/PPCCTRLoops.cpp
578

Ah, OK. That makes sense. So clearly my condition isn't sufficient to address Eli's comment. If either you or Eli can suggest how I might modify this condition to also address irreducible loops completely nested in the outer loop, I would really appreciate it.

hfinkel added inline comments.Apr 27 2018, 5:34 AM
lib/Target/PowerPC/PPCCTRLoops.cpp
579

We have a utility for this. ShrinkWrapping uses it:

if (containsIrreducibleCFG<MachineBasicBlock *>(RPOT, *MLI)) {

as does the loop vectorizer:

if (!containsIrreducibleCFG<const BasicBlock *>(RPOT, *LI)) {

I think that you want to do something here very similar to what the loop vectorizer does:

static void collectSupportedLoops(Loop &L, LoopInfo *LI,

                                OptimizationRemarkEmitter *ORE,
                                SmallVectorImpl<Loop *> &V) {
// Collect inner loops and outer loops without irreducible control flow. For
// now, only collect outer loops that have explicit vectorization hints.
if (L.empty() || (EnableVPlanNativePath && isExplicitVecOuterLoop(&L, ORE))) {
  LoopBlocksRPO RPOT(&L);
  RPOT.perform(LI);
  if (!containsIrreducibleCFG<const BasicBlock *>(RPOT, *LI)) {
    V.push_back(&L);
    // TODO: Collect inner loops inside marked outer loops in case
    // vectorization fails for the outer loop. Do not invoke
    // 'containsIrreducibleCFG' again for inner loops when the outer loop is
    // already known to be reducible. We can use an inherited attribute for
    // that.
    return;
  }
}
for (Loop *InnerL : L)
  collectSupportedLoops(*InnerL, LI, ORE, V);

}

mkazantsev added inline comments.Apr 27 2018, 7:14 PM
lib/Target/PowerPC/PPCCTRLoops.cpp
578

I don't think that SCEV will be able to calculate iteration count for a loop with irreducible control flow. Do you have a test when you really need to support it?

mkazantsev added inline comments.Apr 27 2018, 7:16 PM
lib/Target/PowerPC/PPCCTRLoops.cpp
578

Unless there is a reason why you need it, I would suggest just to limit this transform to the loops that have a single backedge since it's the canonical form.

efriedma added inline comments.Apr 30 2018, 12:40 PM
lib/Target/PowerPC/PPCCTRLoops.cpp
578

SCEV can compute the trip count for a loop which contains irreducible control flow. For example:

define void @foo(i1 %z) {
entry:
  br label %loophead

loophead:
  %i = phi i32 [ 0, %entry ], [ %inc, %irr_exit ]
  br i1 %z, label %irr_head1, label %irr_head2

irr_head1:
  br label %exiting

irr_head2:
  br label %exiting

exiting:
  %inc = add nuw nsw i32 %i, 1
  %exitcond = icmp eq i32 %inc, 100000
  br i1 %exitcond, label %exit, label %irr_cont

irr_cont:
  br i1 %z, label %irr_cont2, label %irr_exit

irr_cont2:
  br i1 %z, label %irr_head1, label %irr_head2

irr_exit:
  br label %loophead

exit:
  ret void
}
nemanjai updated this revision to Diff 144848.May 2 2018, 4:12 AM

Added a check for irreducible CFG within the loop being considered for transformation - thanks for the pointers @efriedma and @hfinkel.
We simply bail out of the transformation if the loop has irreducible control flow rather than analyzing said control flow. This results in a total reduction of 2 loops out of 25,432 that are transformed in a bootstrap build.

hfinkel accepted this revision.May 2 2018, 8:12 AM

LGTM

This revision is now accepted and ready to land.May 2 2018, 8:12 AM
This revision was automatically updated to reflect the committed changes.