This is an archive of the discontinued LLVM Phabricator instance.

Differential D67408

[IndVars] An implementation of loop predication without a need for speculation
Closed, Public

Authored by reames on Sep 10 2019, 10:43 AM.

Details

Reviewers

apilipenko
nikic
skatkov
ebrevnov

Commits

rG0200626f0bfe: [IndVars] An implementation of loop predication without a need for speculation
rL373351: [IndVars] An implementation of loop predication without a need for speculation

Summary

This patch implements a variation of a well-known technique for JIT compilers - we have an implementation in tree as LoopPredication - but with an interesting twist. This version does not assume the ability to execute a path which wasn't taken in the original program (such as a guard or widenable.condition intrinsic). The benefit is that this works for arbitrary IR from any frontend (including C/C++/Fortran). The tradeoff is that it's restricted to read-only loops without implicit exits.
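To make the idea concrete, here is a source-level sketch of the rewrite. It is an illustration only, not LLVM IR and not code from this patch; the names `Exit`, `walkOriginal`, `walkPredicated` and `checkEquivalent` are hypothetical. It assumes a read-only loop whose only observable result is which exit it leaves through.

```cpp
#include <cassert>

// Which exit the loop leaves through. This is the only observable result,
// because the loop body is read-only and nothing computed inside the loop
// is used after it.
enum class Exit { Done, OutOfBounds };

// Original shape: the range check `i >= len` is re-evaluated on every
// iteration, so it varies with the loop.
Exit walkOriginal(unsigned n, unsigned len) {
  for (unsigned i = 0; i < n; ++i) {
    if (i >= len)
      return Exit::OutOfBounds;
    // A read-only body (for example a load from a[i]) would go here.
  }
  return Exit::Done;
}

// Predicated shape: the range check would fire at i == len and the latch
// exit at i == n, so the range check is the exit taken exactly when
// len < n. That comparison is loop invariant, and the exit can be taken on
// the first iteration without speculating anything.
Exit walkPredicated(unsigned n, unsigned len) {
  const bool rangeCheckFails = len < n;
  for (unsigned i = 0; i < n; ++i) {
    if (rangeCheckFails)
      return Exit::OutOfBounds;
  }
  return Exit::Done;
}

// Compares both shapes exhaustively for all inputs below `limit`.
void checkEquivalent(unsigned limit) {
  for (unsigned n = 0; n < limit; ++n)
    for (unsigned len = 0; len < limit; ++len)
      assert(walkOriginal(n, len) == walkPredicated(n, len));
}
```

Calling `checkEquivalent(8)` exercises every small combination of `n` and `len`. The equivalence only holds because the loop has no side effects and no live-out values, which is the restriction described above.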

This builds on SCEV, and can thus eliminate the loop-varying portion of any early exit where all exits are understandable by SCEV. A key advantage is that fixing any deficiency exposed in SCEV - already found one while writing test cases - will also benefit all of full redundancy elimination (and most other loop transforms).
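As a rough illustration of how per-exit counts drive this kind of reasoning, the following is a toy model rather than the SCEV API. The names `provablyDeadExits` and `takenExit` are hypothetical, and the model assumes the exits form a straight chain in which a lower index runs earlier within each iteration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, not the SCEV API: exitCounts[k] is the iteration on which
// exit k would fire if it were the only exit. The loop's backedge-taken
// count is the minimum of these counts.
std::vector<bool> provablyDeadExits(const std::vector<unsigned>& exitCounts) {
  std::vector<bool> dead(exitCounts.size(), false);
  if (exitCounts.empty())
    return dead;
  const unsigned btc = *std::min_element(exitCounts.begin(), exitCounts.end());
  bool earlierExitHitsBtc = false;
  for (std::size_t k = 0; k < exitCounts.size(); ++k) {
    // An exit is never taken if some other exit fires strictly earlier, or
    // if an exit that runs before it fires on the same (final) iteration.
    dead[k] = exitCounts[k] > btc || earlierExitHitsBtc;
    if (exitCounts[k] == btc)
      earlierExitHitsBtc = true;
  }
  return dead;
}

// Reference simulation: step through iterations and report which exit
// actually fires first.
std::size_t takenExit(const std::vector<unsigned>& exitCounts) {
  assert(!exitCounts.empty() && "need at least one exit to terminate");
  for (unsigned i = 0;; ++i)
    for (std::size_t k = 0; k < exitCounts.size(); ++k)
      if (i == exitCounts[k])
        return k;
}
```

For counts `{5, 3, 3, 7}` the model marks every exit except index 1 as dead, and the simulation confirms that index 1 is the exit taken. In the real pass these comparisons have to be proven symbolically, so an exit whose relation to the backedge-taken count cannot be proven is simply left alone.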

I haven't seen anything in the literature which quite matches this. Given that, I'm not entirely sure that keeping the name "loop predication" is helpful. Anyone have suggestions for a better name? This is analogous to partial redundancy elimination - since we remove the condition flowing around the backedge - and has some parallels to our existing transforms which try to make conditions invariant in loops.

Factoring-wise, I chose to put this in IndVarSimplify since it's generally applicable to all workloads. I could split this off into its own pass, but we'd then probably want to add that new pass every place we use IndVars.

One solid argument for splitting it off into its own pass is that this transform is "too good". It breaks a huge number of existing IndVars test cases, as they tend to be simple read-only loops. At the moment, I've left it off by default, but if we add this to IndVars and enable it, we'll have to update around 20 test files to add side effects or disable this transform.

Event Timeline

reames created this revision. Sep 10 2019, 10:43 AM

Herald added a project: Restricted Project. Sep 10 2019, 10:43 AM

Herald added subscribers: javed.absar, bollu, mcrosier.

I somehow uploaded the wrong patch. Ignore until I can fix.

Correct patch this time...

I haven't seen anything in the literature which quite matches this. Given that, I'm not entirely sure that keeping the name "loop predication" is helpful. Anyone have suggestions for a better name? This is analogous to partial redundancy elimination - since we remove the condition flowing around the backedge - and has some parallels to our existing transforms which try to make conditions invariant in loops.

Maybe this is because I don't have context on previous work related to LoopPredication, but the name doesn't quite sound like what the transformation is doing. At least, it doesn't generate any explicit predicate for the loop. I don't have anything better to suggest at this time, though.

lib/Transforms/Scalar/IndVarSimplify.cpp
2750	I can't find a place where this prerequisite is actually checked. Am I just missing it?
2758	Returning at this point means no other code (other than loop predication) can be added below this point in the future. I think we should structure it in a more independent way. Given that this function is becoming pretty large, I would suggest factoring the new code out into a separate function.
2769	Looks like many steps in this filter are common with the existing code above. I think it makes sense to unify these two pieces. Ideally ExitingBB should be const to prevent unintended influence, but I'm not sure this is doable without a copy.
2827	No need to call mayThrow explicitly since mayHaveSideEffects covers this.
2835	The first sentence sounds a little odd to me. Would you mind rephrasing it?
2837	Could you clarify this sentence for me? I did not catch it.

reames marked 5 inline comments as done. Sep 12 2019, 11:34 AM

reames added inline comments.

lib/Transforms/Scalar/IndVarSimplify.cpp
2750	The code is assuming lcssa, so the lack of phis in the exit block is what you're looking for.
2758	I'm completely fine factoring this out into its own function, but let's hold off until we decide on pass placement and algorithmic questions.
2769	While there are commonalities, there are enough differences that combining the two is going to end up being extremely confusing. I'd be willing to play around with code structure to see if this works better than I think it would, but I'd like to land and then refactor in that direction if it's okay.
2835	The comment is also stale. I will simply remove it.
2837	Again, stale. I was originally inserting comparisons at the branch under the logic that the exit count might not be invariant. I then realized that was nonsensical when I couldn't construct a test for it, and moved to the bail-out scheme now in the code. I forgot to remove the comment.

ebrevnov added inline comments. Sep 13 2019, 3:37 AM

lib/Transforms/Scalar/IndVarSimplify.cpp
2750	Got it. Thanks!
2816	consistents->consists or constituents?
2824	Did you consider adding support for llvm.experimental.widenable.condition in this or one of the next patches? My understanding is that the semantics of llvm.experimental.widenable.condition allow us to assume that the exit is taken on the first hit, so we can ignore side effects and live-outs coming from the code dominated by the intrinsic.
test/Transforms/IndVarSimplify/loop-predication.ll
714 ↗	(On Diff #219725)	Any plan to support that soon?

reames marked 2 inline comments as done. Sep 13 2019, 9:06 AM

reames added inline comments.

lib/Transforms/Scalar/IndVarSimplify.cpp
2824	WC is a planned follow up. It's a little tricky because the way we structure the WC comparisons currently breaks getExitCount and getBackedgeTakenCount. I definitely want this landed and tested before I start making any of the tweaks for WC.
test/Transforms/IndVarSimplify/loop-predication.ll
714 ↗	(On Diff #219725)	At some point. If it's a case which bites someone, I'll prioritize it.

LGTM.

It looks like you successfully avoid the pitfall the original loop predication fell into by filtering implicit exits and using trip counts. It makes me think that we can revisit loop predication and go back to a SCEV-based implementation.

lib/Transforms/Scalar/IndVarSimplify.cpp
2763	It looks like you need to check `hasLoopInvariantBackedgeTakenCount` first. Excerpt from the `getBackedgeTakenCount` description: "Note that it is not valid to call this method on a loop without a loop-invariant backedge-taken count (see hasLoopInvariantBackedgeTakenCount)."
2770	Typo. exitting - exiting
2770–2772	I don't think I quite follow this comment. Can you please elaborate? Is there a test demonstrating the problem?
2856	What is the situation when they have different types?

This revision is now accepted and ready to land. Sep 25 2019, 5:13 PM

Closed by commit rL373351: [IndVars] An implementation of loop predication without a need for speculation (authored by reames). Oct 1 2019, 10:02 AM

This revision was automatically updated to reflect the committed changes.

reames marked 2 inline comments as done.

Revision Contents

Path	Size
lib/Transforms/Scalar/IndVarSimplify.cpp	93 lines

Diff 219567

lib/Transforms/Scalar/IndVarSimplify.cpp

Context not available.
   bool rewriteNonIntegerIVs(Loop *L);

   bool simplifyAndExtend(Loop *L, SCEVExpander &Rewriter, LoopInfo *LI);
-  bool optimizeLoopExits(Loop *L);
+  bool optimizeLoopExits(Loop *L, SCEVExpander &Rewriter);

   bool canLoopBeDeleted(Loop *L, SmallVector<RewritePhi, 8> &RewritePhiSet);
   bool rewriteLoopExitValues(Loop *L, SCEVExpander &Rewriter);
Context not available.
   return MadeAnyChanges;
 }

-bool IndVarSimplify::optimizeLoopExits(Loop *L) {
+bool IndVarSimplify::optimizeLoopExits(Loop *L, SCEVExpander &Rewriter) {
   SmallVector<BasicBlock*, 16> ExitingBlocks;
   L->getExitingBlocks(ExitingBlocks);

Context not available.
     assert(MaxExitCount->getType() == ExitCount->getType());

     // Can we prove that some other exit must be taken strictly before this
-    // one? TODO: handle cases where ule is known, and equality is covered
-    // by a dominating exit
+    // one?
     if (SE->isLoopEntryGuardedByCond(L, CmpInst::ICMP_ULT,
                                      MaxExitCount, ExitCount)) {
       bool ExitIfTrue = !L->contains(*succ_begin(ExitingBB));
Context not available.
       continue;
     }

-    // TODO: If we can prove that the exiting iteration is equal to the exit
-    // count for this exit and that no previous exit oppurtunities exist within
-    // the loop, then we can discharge all other exits. (May fall out of
-    // previous TODO.)
+    // If we know that this exit's count is equal to the loops exit count, then
+    // we know we won't reach any exits strictly after this one and can fold
+    // them out of existance. There may be an exit before us which causes us
+    // never to reach this one on the exiting iteration, so we still don't know
+    // whether this exit is taken.
+    const SCEV *ExactBTC = SE->getBackedgeTakenCount(L);
+    if (isa<SCEVCouldNotCompute>(ExactBTC))
+      continue;
+    // TODO: Bitwidth conversion.
+    if (!isa<SCEVCouldNotCompute>(ExactBTC) &&
+        SE->isLoopEntryGuardedByCond(L, CmpInst::ICMP_EQ,
+                                     ExactBTC, ExitCount))
+      for (BasicBlock *OtherExitingBB : ExitingBlocks) {
+        // If our exitting block exits multiple loops, we can only rewrite the
+        // innermost one. Otherwise, we're changing how many times the
+        // innermost loop runs before it exits.
+        if (LI->getLoopFor(OtherExitingBB) != L)
+          continue;
+
+        if (!DT->properlyDominates(ExitingBB, OtherExitingBB))
+          continue;
+
+        // Can't rewrite non-branch yet.
+        BranchInst *BI = dyn_cast<BranchInst>(OtherExitingBB->getTerminator());
+        if (!BI)
+          continue;
+
+        // If already constant, nothing to do.
+        if (isa<Constant>(BI->getCondition()))
+          continue;
+
+        bool ExitIfTrue = !L->contains(*succ_begin(OtherExitingBB));
+        auto *OldCond = BI->getCondition();
+        auto *NewCond = ExitIfTrue ? ConstantInt::getFalse(OldCond->getType()) :
+          ConstantInt::getTrue(OldCond->getType());
+        BI->setCondition(NewCond);
+        if (OldCond->use_empty())
+          DeadInsts.push_back(OldCond);
+        Changed = true;
+      }
+
+#if 1
     // TODO: If we can't prove any relation between our exit count and the
     // loops exit count, but taking this exit doesn't require actually running
     // the loop (i.e. no side effects, no computed values used in exit), then
     // we can replace the exit test with a loop invariant test which exits on
     // the first iteration.
+    bool LoopIsReadOnly = true;
+    for (BasicBlock *BB : L->blocks())
+      for (auto &I : *BB)
+        LoopIsReadOnly &= !I.mayHaveSideEffects();
+
+    bool DominatesAllOther = true;
+    for (BasicBlock *OtherExitingBB : ExitingBlocks)
+      if (OtherExitingBB != ExitingBB &&
+          !DT->properlyDominates(ExitingBB, OtherExitingBB)) {
+        // TODO: exclude invariant ones
+        DominatesAllOther = false;
+        break;
+      }
+
+    BasicBlock *ExitBlock = *succ_begin(ExitingBB);
+    if (L->contains(ExitBlock))
+      ExitBlock = *std::next(succ_begin(ExitingBB));
+    bool NoPHI = empty(ExitBlock->phis());
+    dbgs() << ExitBlock->getName() << "\n";
+    dbgs() << LoopIsReadOnly << DominatesAllOther << NoPHI << "\n";
+    if (LoopIsReadOnly && DominatesAllOther && NoPHI) {
+      IRBuilder<> B(BI);
+      dbgs() << *ExactBTC << "\n";
+      dbgs() << *ExitCount << "\n";
+#if 1
+      SCEVExpander Rewriter2(*SE, DL, "indvars2");
+      auto *NewCond = B.CreateICmp(CmpInst::ICMP_EQ,
+                                   Rewriter2.expandCodeFor(ExactBTC),
+                                   Rewriter2.expandCodeFor(ExitCount));
+      auto *OldCond = BI->getCondition();
+      BI->setCondition(NewCond);
+      if (OldCond->use_empty())
+        DeadInsts.push_back(OldCond);
+#endif
+      Changed = true;
+    }
+#endif
   }
   return Changed;
 }
Context not available.
   // Eliminate redundant IV cycles.
   NumElimIV += Rewriter.replaceCongruentIVs(L, DT, DeadInsts);

-  Changed |= optimizeLoopExits(L);
+  Changed |= optimizeLoopExits(L, Rewriter);

   // If we have a trip count expression, rewrite the loop's exit condition
   // using it.
Context not available.