This is an archive of the discontinued LLVM Phabricator instance.

Differential D45754

[PM/LoopUnswitch] Detect irreducible control flow within loops and skip unswitching non-trivial edges.
ClosedPublic

Authored by chandlerc on Apr 17 2018, 11:12 PM.

Details

Reviewers

sanjoy
fedor.sergeev
fhahn

Commits

rG32e62f9c5b85: [PM/LoopUnswitch] Detect irreducible control flow within loops and skip…
rL330357: [PM/LoopUnswitch] Detect irreducible control flow within loops and skip…

Summary

This fixes the bug pointed out in review with non-trivial unswitching.

This also provides a basis that should make it pretty easy to finish
fleshing out a routine to scan an entire function body for irreducible
control flow, but this patch remains minimal for disabling loop
unswitch.

Diff Detail

Repository: rL LLVM

Event Timeline

chandlerc created this revision. Apr 17 2018, 11:12 PM

Herald added subscribers: hiraditya, mcrosier. Apr 17 2018, 11:12 PM

Harbormaster completed remote builds in B17164: Diff 142884. Apr 17 2018, 11:13 PM

@dcaballe added a containsIrreducibleCFG function to include/llvm/Analysis/CFG.h in D40874. I do not have time to take a close look today, but at least for the attached test case, containsIrreducibleCFG seems to do the right thing.

In D45754#1071186, @fhahn wrote:

@dcaballe added a containsIrreducibleCFG function to include/llvm/Analysis/CFG.h in D40874. I do not have time to take a close look today, but at least for the attached test case, containsIrreducibleCFG seems to do the right thing.

Gah.

I looked and looked, even looked specifically in this area, and asked around, all without success at finding this. Now I'm sad I wasted time building a new one. Thanks for pointing it out.

The approaches are... surprisingly different. I computed a slightly different thing. But I think I like the approach of the existing one quite a bit more, although I'm sad about the cost. But I'm willing for us to get bug reports about the cost of this before we really start stressing.

Lemme adjust this to just call that routine. Maybe I can even call it somewhat later to minimize the cost more.

Thanks, Florian!

I looked and looked, even looked specifically in this area, and asked around, all without success at finding this. Now I'm sad I wasted time building a new one. Thanks for pointing it out.

Sorry to hear that. The same was about to happen to me but I just found that implementation in ShrinkWrap and generalized it. Glad to see it's useful somehow.

Thanks,
Diego

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
Line 1930 (on Diff #142884): typo -> complexity

Use the existing facilities and sink this past the point where we find actual
unswitch candidates. This should at least avoid the cost of the loop traversal
when there are no loop invariant conditions.

And fix typo.
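
The ordering described in this update can be illustrated with a minimal standalone sketch. It does not use LLVM's APIs: `LoopModel`, `ScanStats`, and both functions are hypothetical names invented for illustration. It only shows that placing the candidate check first means the costly traversal never runs for loops without loop-invariant conditions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a loop: the IDs of its loop-invariant branches and
// whether it contains an irreducible cycle. Not an LLVM type.
struct LoopModel {
  std::vector<int> InvariantBranches;
  bool HasIrreducibleCycle;
};

// Counts how often the (assumed expensive) traversal actually runs.
struct ScanStats {
  int IrreducibilityScans = 0;
};

// Stand-in for the costly whole-loop traversal.
bool scanForIrreducibleCycle(const LoopModel &L, ScanStats &Stats) {
  ++Stats.IrreducibilityScans;
  return L.HasIrreducibleCycle;
}

// Cheap test first, expensive test second.
bool shouldAttemptNonTrivialUnswitch(const LoopModel &L, ScanStats &Stats) {
  if (L.InvariantBranches.empty())
    return false; // No candidates: bail out before paying for the scan.
  if (scanForIrreducibleCycle(L, Stats))
    return false; // Irreducible control flow: skip non-trivial unswitching.
  return true;
}

// Usage: the scan counter only advances when candidates exist.
void usageExample() {
  ScanStats Stats;
  assert(!shouldAttemptNonTrivialUnswitch(LoopModel{{}, true}, Stats));
  assert(Stats.IrreducibilityScans == 0);
  assert(shouldAttemptNonTrivialUnswitch(LoopModel{{7}, false}, Stats));
  assert(Stats.IrreducibilityScans == 1);
}
```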

Harbormaster completed remote builds in B17191: Diff 142999. Apr 18 2018, 2:27 PM

OK, should be ready for review again! A much, much simpler patch now.

Thanks, LGTM.

Having a single function to detect irreducible CFGs allows us to test it more widely and in case it turns out to be a bottleneck, I think there are a couple of things we can do to improve the performance. The most prominent issue is probably that we re-do work for inner loops, when loops are processed from inner to outer loop nests.
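
A rough way to see the repeated work mentioned here is to count block visits in a toy loop nest. The sketch below is standalone C++ and not LLVM code. `LoopNode` and the helper names are hypothetical, and it assumes that scanning a loop visits every block it contains, including blocks of nested loops.

```cpp
#include <cassert>
#include <vector>

// Hypothetical loop-nest node: blocks owned directly by this loop plus its
// nested loops. Not an LLVM type.
struct LoopNode {
  int OwnBlocks;
  std::vector<LoopNode> SubLoops;
};

// Number of blocks contained in L, including those of nested loops.
int blocksInLoop(const LoopNode &L) {
  assert(L.OwnBlocks >= 0 && "block counts cannot be negative");
  int Count = L.OwnBlocks;
  for (const LoopNode &Sub : L.SubLoops)
    Count += blocksInLoop(Sub);
  return Count;
}

// Total block visits if every loop in the nest is scanned independently,
// each scan walking all blocks the loop contains.
int visitsWhenScanningEveryLoop(const LoopNode &L) {
  int Visits = blocksInLoop(L);
  for (const LoopNode &Sub : L.SubLoops)
    Visits += visitsWhenScanningEveryLoop(Sub);
  return Visits;
}

// A three-deep nest where each loop owns two blocks directly.
LoopNode makeThreeDeepNest() {
  LoopNode Inner{2, {}};
  LoopNode Middle{2, {Inner}};
  LoopNode Outer{2, {Middle}};
  return Outer;
}
```

Under these assumptions the nest has 6 blocks but the independent scans make 2 + 4 + 6 = 12 visits, so the innermost blocks are walked three times.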

This revision is now accepted and ready to land. Apr 19 2018, 7:27 AM

In D45754#1072141, @fhahn wrote:

Thanks, LGTM.

Having a single function to detect irreducible CFGs allows us to test it more widely and in case it turns out to be a bottleneck, I think there are a couple of things we can do to improve the performance. The most prominent issue is probably that we re-do work for inner loops, when loops are processed from inner to outer loop nests.

Thanks for the review, and totally agree about the tradeoffs.

Closed by commit rL330357: [PM/LoopUnswitch] Detect irreducible control flow within loops and skip… (authored by chandlerc). Apr 19 2018, 11:48 AM

This revision was automatically updated to reflect the committed changes.

Ahem... sorry for checking it only now, but this does not seem to address a testcase from PR36379, so it must be a different issue.
I will put this to more testing to see if it improves anything on our workflow.

Hold off on doing more testing. I have more failures w/ non-trivial
unswitching just in the test suite, so I already have multiple
reproductions that I'm working on.

Revision Contents

Path                                                                    Size
llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp                 13 lines
llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll    30 lines

Diff 143140

llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp

[... first 11 lines not shown ...]
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/Sequence.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/ADT/Twine.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/CFG.h"
 #include "llvm/Analysis/CodeMetrics.h"
 #include "llvm/Analysis/LoopAnalysisManager.h"
 #include "llvm/Analysis/LoopInfo.h"
+#include "llvm/Analysis/LoopIterator.h"
 #include "llvm/Analysis/LoopPass.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Constant.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instruction.h"
[... 1,902 lines not shown; enclosing context: if (LI.getLoopFor(BB) == &L) ...]
         if (BI->isConditional() && L.isLoopInvariant(BI->getCondition()) &&
             BI->getSuccessor(0) != BI->getSuccessor(1))
           UnswitchCandidates.push_back(BI);
 
   // If we didn't find any candidates, we're done.
   if (UnswitchCandidates.empty())
     return Changed;
 
+  // Check if there are irreducible CFG cycles in this loop. If so, we cannot
+  // easily unswitch non-trivial edges out of the loop. Doing so might turn the
+  // irreducible control flow into reducible control flow and introduce new
+  // loops "out of thin air". If we ever discover important use cases for doing
+  // this, we can add support to loop unswitch, but it is a lot of complexity
+  // for what seems little or no real world benefit.
+  LoopBlocksRPO RPOT(&L);
+  RPOT.perform(&LI);
+  if (containsIrreducibleCFG<const BasicBlock *>(RPOT, LI))
+    return Changed;
+
   DEBUG(dbgs() << "Considering " << UnswitchCandidates.size()
                << " non-trivial loop invariant conditions for unswitching.\n");
 
   // Given that unswitching these terminators will require duplicating parts of
   // the loop, so we need to be able to model that cost. Compute the ephemeral
   // values and set up a data structure to hold per-BB costs. We cache each
   // block's cost so that we don't recompute this when considering different
   // subsets of the loop for duplication during unswitching.
[... 221 lines not shown ...]

llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll

[... 2,344 lines not shown ...]

 loop_exit:
   %lcssa = phi i32 [ %var_val, %loop_begin ]
   ret i32 %lcssa
 ; CHECK: loop_exit:
 ; CHECK-NEXT: %[[LCSSA:.*]] = phi i32 [ %[[V]], %loop_begin ]
 ; CHECK-NEXT: ret i32 %[[LCSSA]]
 }
+
+; Negative test: we do not switch when the loop contains unstructured control
+; flows as it would significantly complicate the process as novel loops might
+; be formed, etc.
+define void @test_no_unswitch_unstructured_cfg(i1* %ptr, i1 %cond) {
+; CHECK-LABEL: @test_no_unswitch_unstructured_cfg(
+entry:
+  br label %loop_begin
+
+loop_begin:
+  br i1 %cond, label %loop_left, label %loop_right
+
+loop_left:
+  %v1 = load i1, i1* %ptr
+  br i1 %v1, label %loop_right, label %loop_merge
+
+loop_right:
+  %v2 = load i1, i1* %ptr
+  br i1 %v2, label %loop_left, label %loop_merge
+
+loop_merge:
+  %v3 = load i1, i1* %ptr
+  br i1 %v3, label %loop_latch, label %loop_exit
+
+loop_latch:
+  br label %loop_begin
+
+loop_exit:
+  ret void
+}
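
To see why the negative test above counts as irreducible, its CFG can be modeled outside LLVM. The sketch below is standalone C++ and is not the `containsIrreducibleCFG` routine from `llvm/Analysis/CFG.h`. It swaps in the textbook criterion instead: a flow graph is reducible only if every retreating edge found by a depth-first search targets a block that dominates the edge's source. All names (`Graph`, `computeDominators`, `hasIrreducibleControlFlow`, `makeUnstructuredTestCfg`) are hypothetical, and every block is assumed reachable from the entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a CFG: Succs[B] lists the successors of block B.
using Graph = std::vector<std::vector<int>>;
using DomSets = std::vector<std::vector<bool>>;

// Dom[B][D] is true when block D dominates block B. Uses the simple iterative
// data-flow formulation; adequate for tiny graphs, not tuned for speed.
DomSets computeDominators(const Graph &Succs, int Entry) {
  const std::size_t N = Succs.size();
  std::vector<std::vector<int>> Preds(N);
  for (std::size_t B = 0; B < N; ++B)
    for (int S : Succs[B])
      Preds[S].push_back(static_cast<int>(B));

  DomSets Dom(N, std::vector<bool>(N, true));
  Dom[Entry].assign(N, false);
  Dom[Entry][Entry] = true;

  for (bool Changed = true; Changed;) {
    Changed = false;
    for (std::size_t B = 0; B < N; ++B) {
      if (static_cast<int>(B) == Entry)
        continue;
      // Intersect the dominator sets of all predecessors, then add B itself.
      std::vector<bool> New(N, true);
      for (int P : Preds[B])
        for (std::size_t D = 0; D < N; ++D)
          New[D] = New[D] && Dom[P][D];
      New[B] = true;
      if (New != Dom[B]) {
        Dom[B] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}

enum class Color { White, Grey, Black };

// Depth-first search that reports a retreating edge B -> S (S still on the
// DFS stack) whose target S does not dominate B.
static bool findsBadRetreatingEdge(const Graph &Succs, const DomSets &Dom,
                                   std::vector<Color> &State, int B) {
  State[B] = Color::Grey;
  for (int S : Succs[B]) {
    if (State[S] == Color::Grey) {
      if (!Dom[B][S])
        return true; // Cycle entered somewhere other than at S.
    } else if (State[S] == Color::White) {
      if (findsBadRetreatingEdge(Succs, Dom, State, S))
        return true;
    }
  }
  State[B] = Color::Black;
  return false;
}

bool hasIrreducibleControlFlow(const Graph &Succs, int Entry) {
  assert(Entry >= 0 && static_cast<std::size_t>(Entry) < Succs.size());
  DomSets Dom = computeDominators(Succs, Entry);
  std::vector<Color> State(Succs.size(), Color::White);
  return findsBadRetreatingEdge(Succs, Dom, State, Entry);
}

// Block numbering for @test_no_unswitch_unstructured_cfg:
// 0 entry, 1 loop_begin, 2 loop_left, 3 loop_right, 4 loop_merge,
// 5 loop_latch, 6 loop_exit.
Graph makeUnstructuredTestCfg() {
  return {{1}, {2, 3}, {3, 4}, {2, 4}, {5, 6}, {1}, {}};
}
```

In this model the cycle between loop_left and loop_right can be entered at either block from loop_begin, so neither dominates the other and the check reports irreducible control flow. Dropping the loop_right -> loop_left edge leaves only the loop_latch -> loop_begin back edge, whose target dominates its source.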