This is an archive of the discontinued LLVM Phabricator instance.


Differential D18712

[LoopUnroll] Fix the way we update DT after complete unrolling.
Closed, Public

Authored by mzolotukhin on Apr 1 2016, 1:44 PM.


Details

Reviewers

chandlerc
dberlin
sanjoy
escha
hfinkel

Summary

Updating dominators for exit-blocks of the unrolled loops is not enough,
as shown in PR27157. The proper way is to update dominators for all
dominance-children of exiting blocks. That set is a superset of
exit-blocks set.


Event Timeline

mzolotukhin updated this revision to Diff 52419. Apr 1 2016, 1:44 PM

mzolotukhin retitled this revision to "[LoopUnroll] Fix the way we update DT after complete unrolling."

mzolotukhin updated this object.

mzolotukhin added reviewers: dberlin, chandlerc, hfinkel, sanjoy, escha.

mzolotukhin added a subscriber: llvm-commits.

Could anyone take a look at this please? It's a fix for a stability issue, and I'd appreciate if we can land it fast (not having correct DT information might lead to weird bugs later in the pipeline).

Michael

flyingforyou added a subscriber: flyingforyou. Apr 5 2016, 2:17 PM

Hi Michael,

I don't quite grok this statement "That set is a superset of
exit-blocks set.". What if we have

maybe_goto exit0;

for (...) {
  if (cond) {
    // block a
    maybe_goto exit0;
  } else {
    maybe_goto exit1;
  }
}

exit0:
  return

exit1:
  return

The set of dominance children of exiting blocks of the above loop is
{exit1}, but the set of exit blocks is {exit0, exit1} (so the set of
dominance children of exiting blocks is not a superset of the exit
blocks of the loop). Is there some additional invariant that applies
to unrolled loops that I'm missing here?
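
For readers following along, the counterexample above can be checked mechanically. The sketch below is not LLVM code. It is a self-contained C++ toy that models the pseudo-code as a CFG, computes dominators with the textbook iterative set algorithm, and collects both the exit blocks and the dominance children of the exiting blocks. The block names `entry`, `header`, `a`, `b`, `latch` and the exact edges are assumptions, because the pseudo-code leaves the loop condition unspecified. All helper names (`computeIDoms`, `analyzeExits`, `sanjoyExample`) are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Block = std::string;
using Edge = std::pair<Block, Block>; // (from, to)
using BlockSet = std::set<Block>;

// Immediate dominators via the iterative set-based algorithm.
// Assumes every block is reachable from `entry`.
std::map<Block, Block> computeIDoms(const Block &entry,
                                    const std::vector<Edge> &edges) {
  BlockSet nodes{entry};
  std::map<Block, std::vector<Block>> preds;
  for (const Edge &e : edges) {
    nodes.insert(e.first);
    nodes.insert(e.second);
    preds[e.second].push_back(e.first);
  }
  // dom[n] is the set of blocks dominating n; start large and shrink.
  std::map<Block, BlockSet> dom;
  for (const Block &n : nodes)
    dom[n] = (n == entry) ? BlockSet{entry} : nodes;
  for (bool changed = true; changed;) {
    changed = false;
    for (const Block &n : nodes) {
      if (n == entry)
        continue;
      BlockSet next = nodes;
      for (const Block &p : preds[n]) {
        BlockSet common;
        for (const Block &d : dom[p])
          if (next.count(d))
            common.insert(d);
        next = common;
      }
      next.insert(n);
      if (next != dom[n]) {
        dom[n] = next;
        changed = true;
      }
    }
  }
  // Dominators of n form a chain, so idom(n) is the strict dominator whose
  // own dominator set is exactly one element smaller.
  std::map<Block, Block> idom;
  for (const Block &n : nodes)
    for (const Block &d : dom[n])
      if (d != n && dom[d].size() + 1 == dom[n].size())
        idom[n] = d;
  assert(idom.size() + 1 == nodes.size() && "every block must be reachable");
  return idom;
}

struct ExitInfo {
  BlockSet exitBlocks;           // targets of edges leaving the loop
  BlockSet domChildrenOfExiting; // blocks whose idom is an exiting block
};

ExitInfo analyzeExits(const Block &entry, const std::vector<Edge> &edges,
                      const BlockSet &loop) {
  ExitInfo info;
  BlockSet exiting;
  for (const Edge &e : edges)
    if (loop.count(e.first) && !loop.count(e.second)) {
      exiting.insert(e.first);
      info.exitBlocks.insert(e.second);
    }
  for (const auto &kv : computeIDoms(entry, edges))
    if (exiting.count(kv.second))
      info.domChildrenOfExiting.insert(kv.first);
  return info;
}

// Assumed CFG for the pseudo-code: `a` is the "if" side, `b` the "else" side.
ExitInfo sanjoyExample() {
  const std::vector<Edge> edges = {
      {"entry", "exit0"}, {"entry", "header"}, {"header", "a"},
      {"header", "b"},    {"a", "exit0"},      {"a", "latch"},
      {"b", "exit1"},     {"b", "latch"},      {"latch", "header"}};
  return analyzeExits("entry", edges, {"header", "a", "b", "latch"});
}
```

Under these assumptions `sanjoyExample()` reports exit blocks {exit0, exit1} but only {exit1} as dominance children of exiting blocks, because `exit0` is also reachable directly from `entry` and is therefore immediately dominated by `entry`.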

lib/Transforms/Utils/LoopUnroll.cpp
558–568	Use `auto *` here and below (to make it obvious that you're binding to a pointer).

Generally, I think this change will be somewhat easier to follow if
you update the commit message to "this is the pre-unroll loop A, which
gets unrolled to this loop B, and these are the blocks X and Y we need
to fixup out of which the current technique leaves out X".

I realize that you have a test case, but reviewers are lazy people. :)

Hi Sanjoy,

Thank you for taking a look. You're right, my statement "That set is a superset of exit-blocks set" is incorrect (as shown by your example). But I believe the proposed way to update DT is still correct though.

Let me elaborate on the issue I'm trying to solve here: the attached test illustrates it.
The set of exiting blocks in this test is {loop_latch, loop_exiting_bb1, loop_exiting_bb2}.
The set of exit blocks is {exit2, exit1.loopexit, bb}.
The set of dominance children of exiting blocks is {exit2, exit1.loopexit, exit1, loop_exiting_bb2, bb}.

The loop after unrolling looks like this:

; ITERATION 1
loop_header:                                      ; preds = %entry
  br i1 undef, label %loop_latch, label %loop_exiting_bb1
loop_exiting_bb1:                                 ; preds = %loop_header
  br i1 false, label %loop_exiting_bb2, label %exit1.loopexit
loop_exiting_bb2:                                 ; preds = %loop_exiting_bb1
  br i1 false, label %loop_latch, label %bb
loop_latch:                                       ; preds = %loop_exiting_bb2, %loop_header
  %iv_next = add nuw nsw i64 0, 1
  %cmp = icmp ne i64 %iv_next, 2
  br label %loop_header.1

; ITERATION 2
loop_header.1:                                    ; preds = %loop_latch
  br i1 undef, label %loop_latch.1, label %loop_exiting_bb1.1
loop_exiting_bb1.1:                               ; preds = %loop_header.1
  br i1 false, label %loop_exiting_bb2.1, label %exit1.loopexit
loop_exiting_bb2.1:                               ; preds = %loop_exiting_bb1.1
  br i1 false, label %loop_latch.1, label %bb
loop_latch.1:                                     ; preds = %loop_exiting_bb2.1, %loop_header.1
  %iv_next.1 = add nuw nsw i64 %iv_next, 1
  %cmp.1 = icmp ne i64 %iv_next.1, 2
  br label %exit2

; EXIT BLOCKS
bb:                                               ; preds = %loop_exiting_bb2.1, %loop_exiting_bb2
  br label %exit1
exit1.loopexit:                                   ; preds = %loop_exiting_bb1.1, %loop_exiting_bb1
  br label %exit1
exit1:                                            ; preds = %exit1.loopexit, %bb
  ret void
exit2:                                            ; preds = %loop_latch.1
  ret void

The issue is that currently we don't update dom-info for exit1, because it's not an exit block. Before the unrolling its dominator was loop_exiting_bb1, but now we have another incoming edge from loop_exiting_bb1.1, so its new dominator is loop_header. If, instead of looking at exit-blocks, we start looking at dominance children of exiting blocks, as proposed in this patch, that should be solved.

Now, if one of the exits has an incoming edge from outside the loop, then its dominator shouldn't change, since it's also outside of the loop. So, we should be fine in this case too.

Does it make sense?

Thanks,
Michael
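
The dominator change described in the message above can be reproduced outside LLVM. The following is only a sketch: it transcribes the edges of the PR27157 loop before unrolling and of the unrolled listing into plain edge lists, then recomputes immediate dominators from scratch with a simple iterative algorithm. The helper names (`computeIDoms`, `pr27157Original`, `pr27157Unrolled`) are made up for illustration and are not part of LLVM.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Block = std::string;
using Edge = std::pair<Block, Block>; // (from, to)

// Immediate dominators via the iterative set-based algorithm.
// Assumes every block is reachable from `entry`.
std::map<Block, Block> computeIDoms(const Block &entry,
                                    const std::vector<Edge> &edges) {
  std::set<Block> nodes{entry};
  std::map<Block, std::vector<Block>> preds;
  for (const Edge &e : edges) {
    nodes.insert(e.first);
    nodes.insert(e.second);
    preds[e.second].push_back(e.first);
  }
  std::map<Block, std::set<Block>> dom;
  for (const Block &n : nodes)
    dom[n] = (n == entry) ? std::set<Block>{entry} : nodes;
  for (bool changed = true; changed;) {
    changed = false;
    for (const Block &n : nodes) {
      if (n == entry)
        continue;
      std::set<Block> next = nodes;
      for (const Block &p : preds[n]) {
        std::set<Block> common;
        for (const Block &d : dom[p])
          if (next.count(d))
            common.insert(d);
        next = common;
      }
      next.insert(n);
      if (next != dom[n]) {
        dom[n] = next;
        changed = true;
      }
    }
  }
  // idom(n) is the strict dominator with exactly one fewer dominator.
  std::map<Block, Block> idom;
  for (const Block &n : nodes)
    for (const Block &d : dom[n])
      if (d != n && dom[d].size() + 1 == dom[n].size())
        idom[n] = d;
  assert(idom.size() + 1 == nodes.size() && "every block must be reachable");
  return idom;
}

// Edges of the loop in pr27157.ll before unrolling.
std::vector<Edge> pr27157Original() {
  return {{"entry", "loop_header"},
          {"loop_header", "loop_latch"},
          {"loop_header", "loop_exiting_bb1"},
          {"loop_exiting_bb1", "loop_exiting_bb2"},
          {"loop_exiting_bb1", "exit1.loopexit"},
          {"loop_exiting_bb2", "loop_latch"},
          {"loop_exiting_bb2", "bb"},
          {"loop_latch", "loop_header"},
          {"loop_latch", "exit2"},
          {"bb", "exit1"},
          {"exit1.loopexit", "exit1"}};
}

// Edges of the unrolled listing (iterations 1 and 2 plus exit blocks).
std::vector<Edge> pr27157Unrolled() {
  return {{"entry", "loop_header"},
          {"loop_header", "loop_latch"},
          {"loop_header", "loop_exiting_bb1"},
          {"loop_exiting_bb1", "loop_exiting_bb2"},
          {"loop_exiting_bb1", "exit1.loopexit"},
          {"loop_exiting_bb2", "loop_latch"},
          {"loop_exiting_bb2", "bb"},
          {"loop_latch", "loop_header.1"},
          {"loop_header.1", "loop_latch.1"},
          {"loop_header.1", "loop_exiting_bb1.1"},
          {"loop_exiting_bb1.1", "loop_exiting_bb2.1"},
          {"loop_exiting_bb1.1", "exit1.loopexit"},
          {"loop_exiting_bb2.1", "loop_latch.1"},
          {"loop_exiting_bb2.1", "bb"},
          {"loop_latch.1", "exit2"},
          {"bb", "exit1"},
          {"exit1.loopexit", "exit1"}};
}
```

Calling `computeIDoms("entry", pr27157Original()).at("exit1")` yields `loop_exiting_bb1`, while the same query on `pr27157Unrolled()` yields `loop_header`, which matches the change the message describes for `exit1`.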

Hi Michael,

How about loops like these:

define void @f() {
entry:
  br label %loop.header

loop.header:
  %iv = phi i32 [ 0, %entry ], [ %iv.inc, %latch ]
  %iv.inc = add i32 %iv, 1
  br i1 undef, label %diamond, label %latch

diamond:
  br i1 undef, label %left, label %right

left:
  br i1 undef, label %exit, label %merge

right:
  br i1 undef, label %exit, label %merge

merge:
  br label %latch

latch:
  %end.cond = icmp eq i32 %iv, 1
  br i1 %end.cond, label %exit1, label %loop.header

exit:
  ret void

exit1:
  ret void
}

exit is dominated by diamond (not a loop exiting block), but after unrolling exit becomes dominated by loop.header.

Hmm, that's a good catch. I think we need to check dominance children of all loop blocks then. That should be sufficient, right?

Thanks,
Michael

In D18712#392836, @mzolotukhin wrote:

Hmm, that's a good catch. I think we need to check dominance children of all loop blocks then. That should be sufficient, right?

Sounds right, but I didn't try to prove it.

Update dom-info for all dom-children of original loop blocks.

Use 'auto *' instead of 'auto'.
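
As a rough illustration of the updated rule (for every original loop block, move each dominance child outside the loop to the nearest common dominator of its previous idom and the first latch), here is a small standalone C++ sketch applied to the diamond loop from the thread. It is not the patch itself: the dominator tree is a hand-written idom map transcribed from the `@f` example, and `pathToRoot`, `nearestCommonDominator`, `updateAfterUnroll` and `diamondLoopIDoms` are hypothetical helpers standing in for what the diff does with `DominatorTree`. The sketch only demonstrates the effect on the block `exit`.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each block to its immediate dominator; the root has no entry.
using IDomMap = std::map<std::string, std::string>;

// Blocks on the idom chain from `block` up to the root, inclusive.
std::vector<std::string> pathToRoot(const IDomMap &idom, std::string block) {
  std::vector<std::string> path{block};
  for (auto it = idom.find(block); it != idom.end(); it = idom.find(block)) {
    block = it->second;
    path.push_back(block);
  }
  return path;
}

// First block on a's chain that also lies on b's chain.
std::string nearestCommonDominator(const IDomMap &idom, const std::string &a,
                                   const std::string &b) {
  const std::vector<std::string> chainB = pathToRoot(idom, b);
  const std::set<std::string> onB(chainB.begin(), chainB.end());
  for (const std::string &x : pathToRoot(idom, a))
    if (onB.count(x))
      return x;
  assert(false && "blocks must share a root");
  return "";
}

// For each block outside the loop whose idom is a loop block, set its idom
// to the nearest common dominator of the old idom and the first latch.
IDomMap updateAfterUnroll(const IDomMap &idom,
                          const std::set<std::string> &loopBlocks,
                          const std::string &firstLatch) {
  IDomMap updated = idom;
  for (const auto &kv : idom) {
    const std::string &child = kv.first;
    const std::string &prevIDom = kv.second;
    if (loopBlocks.count(prevIDom) && !loopBlocks.count(child))
      updated[child] = nearestCommonDominator(idom, prevIDom, firstLatch);
  }
  return updated;
}

// Dominator tree of the diamond loop before unrolling, written by hand.
IDomMap diamondLoopIDoms() {
  return {{"loop.header", "entry"}, {"diamond", "loop.header"},
          {"left", "diamond"},      {"right", "diamond"},
          {"merge", "diamond"},     {"latch", "loop.header"},
          {"exit", "diamond"},      {"exit1", "latch"}};
}
```

With the loop blocks {loop.header, diamond, left, right, merge, latch} and `latch` as the first latch, `updateAfterUnroll` moves `exit` from `diamond` to `loop.header`, which is the result the thread expects for this example. Blocks inside the loop are left untouched.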

lgtm with a minor nit inline

lib/Transforms/Utils/LoopUnroll.cpp
564	Isn't `PrevIDom` the same as `BB`?

This revision is now accepted and ready to land. Apr 6 2016, 2:00 PM

Thanks, committed in r265605.

Michael

lib/Transforms/Utils/LoopUnroll.cpp
564	It is, thanks!

Revision Contents

Path	Size
lib/Transforms/Utils/LoopUnroll.cpp	27 lines
test/Transforms/LoopUnroll/pr27157.ll	53 lines

Diff 52846

lib/Transforms/Utils/LoopUnroll.cpp

@@ (251 lines not shown) @@ bool llvm::UnrollLoop(Loop *L, unsigned Count, unsigned TripCount,
   assert(Count > 0);
   assert(TripMultiple > 0);
   assert(TripCount == 0 || TripCount % TripMultiple == 0);
 
   // Are we eliminating the loop control altogether?
   bool CompletelyUnroll = Count == TripCount;
   SmallVector<BasicBlock *, 4> ExitBlocks;
   L->getExitBlocks(ExitBlocks);
+  std::vector<BasicBlock*> OriginalLoopBlocks = L->getBlocks();
 
   // Go through all exits of L and see if there are any phi-nodes there. We just
   // conservatively assume that they're inserted to preserve LCSSA form, which
   // means that complete unrolling might break this form. We need to either fix
   // it in-place after the transformation, or entirely rebuild LCSSA. TODO: For
   // now we just recompute LCSSA for the outer loop, but it should be possible
   // to fix it in-place.
   bool NeedToFixLCSSA = PreserveLCSSA && CompletelyUnroll &&
@@ (272 lines not shown) @@ if (NeedConditional) {
           }
         }
       }
       // Replace the conditional branch with an unconditional one.
       BranchInst::Create(Dest, Term);
       Term->eraseFromParent();
     }
   }
-  // Update dominators of loop exit blocks.
-  // Immediate dominator of an exit block might change, because we add more
+  // Update dominators of blocks we might reach through exits.
+  // Immediate dominator of such block might change, because we add more
   // routes which can lead to the exit: we can now reach it from the copied
-  // iterations too. Thus, the new idom of the exit block will be the nearest
+  // iterations too. Thus, the new idom of the block will be the nearest
   // common dominator of the previous idom and common dominator of all copies of
-  // the exiting block. This is equivalent to the nearest common dominator of
+  // the previous idom. This is equivalent to the nearest common dominator of
   // the previous idom and the first latch, which dominates all copies of the
-  // exiting block.
+  // previous idom.
   if (DT && Count > 1) {
-    for (auto Exit : ExitBlocks) {
-      BasicBlock *PrevIDom = DT->getNode(Exit)->getIDom()->getBlock();
-      BasicBlock *NewIDom =
-          DT->findNearestCommonDominator(PrevIDom, Latches[0]);
-      DT->changeImmediateDominator(Exit, NewIDom);
+    for (auto *BB : OriginalLoopBlocks) {
+      auto *BBDomNode = DT->getNode(BB);
+      for (auto *ChildDomNode : BBDomNode->getChildren()) {
+        auto *ChildBB = ChildDomNode->getBlock();
+        if (L->contains(ChildBB))
+          continue;
+        BasicBlock *PrevIDom = ChildDomNode->getIDom()->getBlock();
+        BasicBlock *NewIDom =
+            DT->findNearestCommonDominator(PrevIDom, Latches[0]);
+        DT->changeImmediateDominator(ChildBB, NewIDom);
+      }
     }
   }
 
   // Merge adjacent basic blocks, if possible.
   SmallPtrSet<Loop *, 4> ForgottenLoops;
   for (BasicBlock *Latch : Latches) {
     BranchInst *Term = cast<BranchInst>(Latch->getTerminator());
     if (Term->isUnconditional()) {
@@ (124 lines not shown) @@

test/Transforms/LoopUnroll/pr27157.ll

This file was added.

; RUN: opt -loop-unroll -debug-only=loop-unroll -disable-output < %s
; REQUIRES: asserts
; Compile this test with debug flag on to verify domtree right after loop unrolling.
target datalayout = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64"

; PR27157
define void @foo() {
entry:
  br label %loop_header
loop_header:
  %iv = phi i64 [ 0, %entry ], [ %iv_next, %loop_latch ]
  br i1 undef, label %loop_latch, label %loop_exiting_bb1
loop_exiting_bb1:
  br i1 false, label %loop_exiting_bb2, label %exit1.loopexit
loop_exiting_bb2:
  br i1 false, label %loop_latch, label %bb
bb:
  br label %exit1
loop_latch:
  %iv_next = add nuw nsw i64 %iv, 1
  %cmp = icmp ne i64 %iv_next, 2
  br i1 %cmp, label %loop_header, label %exit2
exit1.loopexit:
  br label %exit1
exit1:
  ret void
exit2:
  ret void
}

define void @foo2() {
entry:
  br label %loop.header
loop.header:
  %iv = phi i32 [ 0, %entry ], [ %iv.inc, %latch ]
  %iv.inc = add i32 %iv, 1
  br i1 undef, label %diamond, label %latch
diamond:
  br i1 undef, label %left, label %right
left:
  br i1 undef, label %exit, label %merge
right:
  br i1 undef, label %exit, label %merge
merge:
  br label %latch
latch:
  %end.cond = icmp eq i32 %iv, 1
  br i1 %end.cond, label %exit1, label %loop.header
exit:
  ret void
exit1:
  ret void
}