This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] threshold for folding branches with common destination
ClosedPublic

Authored by jingyue on Sep 29 2014, 10:50 AM.

Details

Reviewers

nadav
eliben
meheff
resistor
hfinkel

Commits

rGfc0296704c6a: [SimplifyCFG] threshold for folding branches with common destination
rL218711: [SimplifyCFG] threshold for folding branches with common destination

Summary

This patch adds a threshold that controls the number of bonus instructions
allowed for folding branches with common destination. The original code allows
at most one bonus instruction. With this patch, users can customize the
threshold to allow multiple bonus instructions. The default threshold is still
1, so that the code behaves the same as before when users do not specify this
threshold.

The motivation of this change is that tuning this threshold significantly (up
to 25%) improves the performance of some CUDA programs in our internal code
base. In general, branch instructions are very expensive for GPU programs.
Therefore, it is sometimes worth trading more arithmetic computation for a more
straightened control flow. Here's a reduced example:

__global__ void foo(int a, int b, int c, int d, int e, int n,
                    const int *input, int *output) {
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += (((i ^ a) > b) && (((i | c ) ^ d) > e)) ? 0 : input[i];
  *output = sum;
}

The select statement in the loop body translates to two branch instructions "if
((i ^ a) > b)" and "if (((i | c) ^ d) > e)" which share a common destination.
With the default threshold, SimplifyCFG is unable to fold them, because
computing the condition of the second branch "((i | c) ^ d) > e" requires two
bonus instructions. With the threshold increased, SimplifyCFG can fold the two
branches so that the loop body contains only one branch, making the code
conceptually look like:

sum += (((i ^ a) > b) & (((i | c ) ^ d) > e)) ? 0 : input[i];

Increasing the threshold significantly improves the performance of this
particular example. In the configuration where both conditions are guaranteed
to be true, increasing the threshold from 1 to 2 improves the performance by
18.24%. Even in the configuration where the first condition is false and the
second condition is true, which favors shortcuts, increasing the threshold from
1 to 2 still improves the performance by 4.35%.
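
The equivalence between the short-circuit form and the folded form can be checked on the host. The following is a minimal C++ sketch, not CUDA and not part of the patch; the function names sumShortCircuit and sumFolded are hypothetical. It only shows that the two forms compute the same sum, because both comparisons are free of side effects and cannot trap:

```cpp
#include <cassert>
#include <vector>

// Short-circuit form: the second comparison is only evaluated when the
// first one is true, which is what produces the second branch.
int sumShortCircuit(int a, int b, int c, int d, int e,
                    const std::vector<int> &input) {
  int sum = 0;
  for (int i = 0; i < static_cast<int>(input.size()); ++i)
    sum += (((i ^ a) > b) && (((i | c) ^ d) > e)) ? 0 : input[i];
  return sum;
}

// Folded form: both comparisons are always evaluated and combined with a
// bitwise AND, so the loop body needs only one branch.
int sumFolded(int a, int b, int c, int d, int e,
              const std::vector<int> &input) {
  int sum = 0;
  for (int i = 0; i < static_cast<int>(input.size()); ++i)
    sum += (((i ^ a) > b) & (((i | c) ^ d) > e)) ? 0 : input[i];
  return sum;
}

// Helper that compares the two forms on one configuration.
bool formsAgree(int a, int b, int c, int d, int e,
                const std::vector<int> &input) {
  return sumShortCircuit(a, b, c, d, e, input) ==
         sumFolded(a, b, c, d, e, input);
}
```

The sketch says nothing about performance; the numbers quoted in the summary come from the author's GPU measurements.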

We are still looking for a good threshold and maybe a better cost model than
just counting the number of bonus instructions. However, according to the above
numbers, we think it is at least worth adding a threshold to enable more
experiments and tuning. Let me know what you think. Thanks!

Diff Detail

Event Timeline

jingyue updated this revision to Diff 14181. Sep 29 2014, 10:50 AM

jingyue retitled this revision from to [SimplifyCFG] threshold for folding branches with common destination.

jingyue updated this object.

jingyue edited the test plan for this revision.

jingyue added reviewers: nadav, resistor, eliben, meheff.

jingyue added a subscriber: Unknown Object (MLST).

jingyue updated this object. Sep 29 2014, 10:52 AM

Could this be made a pass parameter rather than (or in addition to) a command line option?

—Owen

hfinkel added a subscriber: hfinkel. Sep 29 2014, 11:25 AM

hfinkel added inline comments.

lib/Transforms/Utils/SimplifyCFG.cpp
2034	I don't think that the hasOneUse check here really does what you want once we allow for more than one instruction. We used to check that the single bonus instruction had one user and this user specifically was the Cond. Now you'd like to allow for some variable number of single-use instructions (regardless of a relationship to Cond). This could change behavior even when allowing only a single instruction. I think that what you really want to do is to walk up the operand graph from Cond, accumulating instructions until you reach your limit, keeping the hasOneUse check and the check that the use is Cond for the first instruction.
2042	Please don't remove these blank lines, I think they make the code easier to read.

Hi Owen,

I made the threshold a pass parameter in this patch. However, it doesn't look
pretty because there are several layers between createCFGSimplificationPass and
the actual use of this threshold (e.g. CFGSimplifyPass and SimplifyCFGOpt).

What are the benefits of having these pass parameters? I saw other passes such
as JumpThreading and LoopUnroll have such pass parameters too, but none of them
seem actually used. One way I can think of using that is target-specific code
can create these passes with a customized threshold, but in that case, I feel
TargetTransformInfo would be a better home.

Jingyue
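
The pass-parameter plumbing discussed here comes down to one rule: a pass argument of -1 means "fall back to the command-line value", and any other argument overrides it. A minimal sketch of that rule, where resolveBonusInstThreshold and commandLineValue are hypothetical names standing in for the constructor logic and the cl::opt in the patch:

```cpp
#include <cassert>

// Hypothetical helper mirroring the constructor logic in the patch:
// -1 is a sentinel meaning "no explicit pass parameter was given".
// commandLineValue stands in for the -bonus-inst-threshold option,
// whose default in the patch is 1.
unsigned resolveBonusInstThreshold(int passParam,
                                   unsigned commandLineValue = 1) {
  return passParam == -1 ? commandLineValue
                         : static_cast<unsigned>(passParam);
}
```

With this shape, a target could create the pass with its own threshold while ordinary users keep the command-line default.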

jingyue added inline comments. Sep 29 2014, 4:42 PM

lib/Transforms/Utils/SimplifyCFG.cpp
2034	Hi Hal, Thanks for the careful review! However, I don't think the modified code changes the behavior when allowing only a single bonus instruction. Note that in Line 2037 I check whether the only user is in the same BB (and appears after the potential bonus instruction otherwise def doesn't dominate use). When there is only one potential bonus instruction, it is either used by Cond or BI (DbgInfoIntrinsic only uses MDNode but not Instruction). Being used by BI is impossible, because BI only uses Cond as its first operand and other operands are all BB labels. Therefore, this bonus instruction must be used by Cond. Does this make sense? I agree it is at least worth a comment. I like your suggestion of early exiting once we reach the limit. I'll change that part. Thanks, Jingyue
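
The check described in this comment can be modeled outside LLVM. The following is a toy sketch only: ToyInst and bonusInstsWithinThreshold are hypothetical names, and the boolean fields stand in for the real queries (isa<DbgInfoIntrinsic>, hasOneUse, isSafeToSpeculativelyExecute, and the same-block test on the single user). It walks the instructions that precede Cond and rejects the fold as soon as the bonus count exceeds the threshold:

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for an instruction that precedes Cond in the block.
struct ToyInst {
  bool isDebugIntrinsic;     // debug intrinsics are skipped, not counted
  unsigned numUses;          // models hasOneUse() when equal to 1
  bool onlyUserInSameBlock;  // models User->getParent() == BB
  bool safeToSpeculate;      // models isSafeToSpeculativelyExecute()
};

// Returns true when every non-debug instruction before Cond qualifies as a
// bonus instruction and their number does not exceed the threshold.
bool bonusInstsWithinThreshold(const std::vector<ToyInst> &beforeCond,
                               unsigned threshold) {
  unsigned numBonusInsts = 0;
  for (const ToyInst &inst : beforeCond) {
    if (inst.isDebugIntrinsic)
      continue;
    if (inst.numUses != 1 || !inst.safeToSpeculate)
      return false;
    if (!inst.onlyUserInSameBlock)
      return false;
    // Early exit once the limit is exceeded.
    if (++numBonusInsts > threshold)
      return false;
  }
  return true;
}
```

In the reduced example from the summary, the second condition needs two bonus instructions, so this model rejects the fold at a threshold of 1 and accepts it at 2.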

----- Original Message -----

From: "Jingyue Wu" <jingyue@google.com>
To: jingyue@google.com, nrotem@apple.com, eliben@google.com, meheff@google.com, resistor@mac.com
Cc: hfinkel@anl.gov, llvm-commits@cs.uiuc.edu
Sent: Monday, September 29, 2014 6:43:08 PM
Subject: Re: [PATCH] [SimplifyCFG] threshold for folding branches with common destination

> Comment at: lib/Transforms/Utils/SimplifyCFG.cpp:2034
> @@ +2033,3 @@
> + continue;
> + if (!I->hasOneUse() || !isSafeToSpeculativelyExecute(I, DL))
> + return false;
>
> hfinkel wrote:
>
> > I don't think that the hasOneUse check here really does what you
> > want once we allow for more than one instruction. We used to
> > check that the single bonus instruction had one user and this user
> > specifically was the Cond. Now you'd like to allow for some
> > variable number of single-use instructions (regardless of a
> > relationship to Cond). This could change behavior even when
> > allowing only a single instruction.
> >
> > I think that what you really want to do is to walk up the operand
> > graph from Cond, accumulating instructions until you reach your
> > limit, keeping the hasOneUse check and the check that the use is
> > Cond for the first instruction.
>
> Hi Hal,
>
> Thanks for the careful review! However, I don't think the modified
> code changes the behavior when allowing only a single bonus
> instruction. Note that in Line 2037 I check whether the only user is
> in the same BB (and appears after the potential bonus instruction
> otherwise def doesn't dominate use). When there is only one
> potential bonus instruction, it is either used by Cond or BI
> (DbgInfoIntrinsic only uses MDNode but not Instruction). Being used
> by BI is impossible, because BI only uses Cond as its first operand
> and other operands are all BB labels. Therefore, this bonus
> instruction must be used by Cond. Does this make sense? I agree it
> is at least worth a comment.

Yes, I understand now, thanks! Please do add a comment.

-Hal

> I like your suggestion of early exiting once we reach the limit. I'll
> change that part.
>
> Thanks,
> Jingyue

http://reviews.llvm.org/D5529

addresses Hal's comments

I understand now. Thanks!

Does the new patch look good to you?

Jingyue

I'm fine with this patch, but you might want to see if Hal has other thoughts on it.

LGTM, thanks!

This revision is now accepted and ready to land. Sep 30 2014, 2:10 PM

jingyue closed this revision. Sep 30 2014, 3:33 PM

Revision Contents

Path

Size

include/

llvm/

Transforms/

Scalar.h

2 lines

Utils/

Local.h

4 lines

lib/

Transforms/

Scalar/

SimplifyCFGPass.cpp

22 lines

Utils/

SimplifyCFG.cpp

146 lines

test/

Transforms/

SimplifyCFG/

branch-fold-threshold.ll

28 lines

Diff 14197

include/llvm/Transforms/Scalar.h

	[... 210 lines not shown ...]
	//			//
	FunctionPass *createJumpThreadingPass(int Threshold = -1);			FunctionPass *createJumpThreadingPass(int Threshold = -1);

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// CFGSimplification - Merge basic blocks, eliminate unreachable blocks,			// CFGSimplification - Merge basic blocks, eliminate unreachable blocks,
	// simplify terminator instructions, etc...			// simplify terminator instructions, etc...
	//			//
	FunctionPass *createCFGSimplificationPass();			FunctionPass *createCFGSimplificationPass(int Threshold = -1);

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// FlattenCFG - flatten CFG, reduce number of conditional branches by using			// FlattenCFG - flatten CFG, reduce number of conditional branches by using
	// parallel-and and parallel-or mode, etc...			// parallel-and and parallel-or mode, etc...
	//			//
	FunctionPass *createFlattenCFGPass();			FunctionPass *createFlattenCFGPass();

	[... 181 lines not shown ...]

include/llvm/Transforms/Utils/Local.h

	[... 132 lines not shown ...]

	/// SimplifyCFG - This function is used to do simplification of a CFG. For			/// SimplifyCFG - This function is used to do simplification of a CFG. For
	/// example, it adjusts branches to branches to eliminate the extra hop, it			/// example, it adjusts branches to branches to eliminate the extra hop, it
	/// eliminates unreachable basic blocks, and does other "peephole" optimization			/// eliminates unreachable basic blocks, and does other "peephole" optimization
	/// of the CFG. It returns true if a modification was made, possibly deleting			/// of the CFG. It returns true if a modification was made, possibly deleting
	/// the basic block that was pointed to.			/// the basic block that was pointed to.
	///			///
	bool SimplifyCFG(BasicBlock *BB, const TargetTransformInfo &TTI,			bool SimplifyCFG(BasicBlock *BB, const TargetTransformInfo &TTI,
				unsigned BonusInstThreshold,
	const DataLayout *TD = nullptr,			const DataLayout *TD = nullptr,
	AssumptionTracker *AT = nullptr);			AssumptionTracker *AT = nullptr);

	/// FlatternCFG - This function is used to flatten a CFG. For			/// FlatternCFG - This function is used to flatten a CFG. For
	/// example, it uses parallel-and and parallel-or mode to collapse			/// example, it uses parallel-and and parallel-or mode to collapse
	// if-conditions and merge if-regions with identical statements.			// if-conditions and merge if-regions with identical statements.
	///			///
	bool FlattenCFG(BasicBlock *BB, AliasAnalysis *AA = nullptr);			bool FlattenCFG(BasicBlock *BB, AliasAnalysis *AA = nullptr);

	/// FoldBranchToCommonDest - If this basic block is ONLY a setcc and a branch,			/// FoldBranchToCommonDest - If this basic block is ONLY a setcc and a branch,
	/// and if a predecessor branches to us and one of our successors, fold the			/// and if a predecessor branches to us and one of our successors, fold the
	/// setcc into the predecessor and use logical operations to pick the right			/// setcc into the predecessor and use logical operations to pick the right
	/// destination.			/// destination.
	bool FoldBranchToCommonDest(BranchInst *BI, const DataLayout *DL = nullptr);			bool FoldBranchToCommonDest(BranchInst *BI, const DataLayout *DL = nullptr,
				unsigned BonusInstThreshold = 1);

	/// DemoteRegToStack - This function takes a virtual register computed by an			/// DemoteRegToStack - This function takes a virtual register computed by an
	/// Instruction and replaces it with a slot in the stack frame, allocated via			/// Instruction and replaces it with a slot in the stack frame, allocated via
	/// alloca. This allows the CFG to be changed around without fear of			/// alloca. This allows the CFG to be changed around without fear of
	/// invalidating the SSA information for the value. It returns the pointer to			/// invalidating the SSA information for the value. It returns the pointer to
	/// the alloca inserted to create a stack slot for X.			/// the alloca inserted to create a stack slot for X.
	///			///
	AllocaInst *DemoteRegToStack(Instruction &X,			AllocaInst *DemoteRegToStack(Instruction &X,
	[... 132 lines not shown ...]

lib/Transforms/Scalar/SimplifyCFGPass.cpp

[... 29 lines not shown ...]
#include "llvm/IR/Attributes.h"		#include "llvm/IR/Attributes.h"
#include "llvm/IR/CFG.h"		#include "llvm/IR/CFG.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
		#include "llvm/Support/CommandLine.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"
using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "simplifycfg"		#define DEBUG_TYPE "simplifycfg"

		static cl::opt<unsigned>
		UserBonusInstThreshold("bonus-inst-threshold", cl::Hidden, cl::init(1),
		cl::desc("Control the number of bonus instructions (default = 1)"));

STATISTIC(NumSimpl, "Number of blocks simplified");		STATISTIC(NumSimpl, "Number of blocks simplified");

namespace {		namespace {
struct CFGSimplifyPass : public FunctionPass {		struct CFGSimplifyPass : public FunctionPass {
static char ID; // Pass identification, replacement for typeid		static char ID; // Pass identification, replacement for typeid
CFGSimplifyPass() : FunctionPass(ID) {		unsigned BonusInstThreshold;
		CFGSimplifyPass(int T = -1) : FunctionPass(ID) {
		BonusInstThreshold = (T == -1) ? UserBonusInstThreshold : unsigned(T);
initializeCFGSimplifyPassPass(*PassRegistry::getPassRegistry());		initializeCFGSimplifyPassPass(*PassRegistry::getPassRegistry());
}		}
bool runOnFunction(Function &F) override;		bool runOnFunction(Function &F) override;

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<AssumptionTracker>();		AU.addRequired<AssumptionTracker>();
AU.addRequired<TargetTransformInfo>();		AU.addRequired<TargetTransformInfo>();
}		}
};		};
}		}

char CFGSimplifyPass::ID = 0;		char CFGSimplifyPass::ID = 0;
INITIALIZE_PASS_BEGIN(CFGSimplifyPass, "simplifycfg", "Simplify the CFG", false,		INITIALIZE_PASS_BEGIN(CFGSimplifyPass, "simplifycfg", "Simplify the CFG", false,
false)		false)
INITIALIZE_AG_DEPENDENCY(TargetTransformInfo)		INITIALIZE_AG_DEPENDENCY(TargetTransformInfo)
INITIALIZE_PASS_DEPENDENCY(AssumptionTracker)		INITIALIZE_PASS_DEPENDENCY(AssumptionTracker)
INITIALIZE_PASS_END(CFGSimplifyPass, "simplifycfg", "Simplify the CFG", false,		INITIALIZE_PASS_END(CFGSimplifyPass, "simplifycfg", "Simplify the CFG", false,
false)		false)

// Public interface to the CFGSimplification pass		// Public interface to the CFGSimplification pass
FunctionPass *llvm::createCFGSimplificationPass() {		FunctionPass *llvm::createCFGSimplificationPass(int Threshold) {
return new CFGSimplifyPass();		return new CFGSimplifyPass(Threshold);
}		}

/// mergeEmptyReturnBlocks - If we have more than one empty (other than phi		/// mergeEmptyReturnBlocks - If we have more than one empty (other than phi
/// node) return blocks, merge them together to promote recursive block merging.		/// node) return blocks, merge them together to promote recursive block merging.
static bool mergeEmptyReturnBlocks(Function &F) {		static bool mergeEmptyReturnBlocks(Function &F) {
bool Changed = false;		bool Changed = false;

BasicBlock *RetBlock = nullptr;		BasicBlock *RetBlock = nullptr;
[... 66 lines not shown ...]	static bool mergeEmptyReturnBlocks(Function &F) {

return Changed;		return Changed;
}		}

/// iterativelySimplifyCFG - Call SimplifyCFG on all the blocks in the function,		/// iterativelySimplifyCFG - Call SimplifyCFG on all the blocks in the function,
/// iterating until no more changes are made.		/// iterating until no more changes are made.
static bool iterativelySimplifyCFG(Function &F, const TargetTransformInfo &TTI,		static bool iterativelySimplifyCFG(Function &F, const TargetTransformInfo &TTI,
const DataLayout *DL,		const DataLayout *DL,
AssumptionTracker *AT) {		AssumptionTracker *AT,
		unsigned BonusInstThreshold) {
bool Changed = false;		bool Changed = false;
bool LocalChange = true;		bool LocalChange = true;
while (LocalChange) {		while (LocalChange) {
LocalChange = false;		LocalChange = false;

// Loop over all of the basic blocks and remove them if they are unneeded...		// Loop over all of the basic blocks and remove them if they are unneeded...
//		//
for (Function::iterator BBIt = F.begin(); BBIt != F.end(); ) {		for (Function::iterator BBIt = F.begin(); BBIt != F.end(); ) {
if (SimplifyCFG(BBIt++, TTI, DL, AT)) {		if (SimplifyCFG(BBIt++, TTI, BonusInstThreshold, DL, AT)) {
LocalChange = true;		LocalChange = true;
++NumSimpl;		++NumSimpl;
}		}
}		}
Changed |= LocalChange;		Changed |= LocalChange;
}		}
return Changed;		return Changed;
}		}

// It is possible that we may require multiple passes over the code to fully		// It is possible that we may require multiple passes over the code to fully
// simplify the CFG.		// simplify the CFG.
//		//
bool CFGSimplifyPass::runOnFunction(Function &F) {		bool CFGSimplifyPass::runOnFunction(Function &F) {
if (skipOptnoneFunction(F))		if (skipOptnoneFunction(F))
return false;		return false;

AssumptionTracker *AT = &getAnalysis<AssumptionTracker>();		AssumptionTracker *AT = &getAnalysis<AssumptionTracker>();
const TargetTransformInfo &TTI = getAnalysis<TargetTransformInfo>();		const TargetTransformInfo &TTI = getAnalysis<TargetTransformInfo>();
DataLayoutPass *DLP = getAnalysisIfAvailable<DataLayoutPass>();		DataLayoutPass *DLP = getAnalysisIfAvailable<DataLayoutPass>();
const DataLayout *DL = DLP ? &DLP->getDataLayout() : nullptr;		const DataLayout *DL = DLP ? &DLP->getDataLayout() : nullptr;
bool EverChanged = removeUnreachableBlocks(F);		bool EverChanged = removeUnreachableBlocks(F);
EverChanged |= mergeEmptyReturnBlocks(F);		EverChanged |= mergeEmptyReturnBlocks(F);
EverChanged |= iterativelySimplifyCFG(F, TTI, DL, AT);		EverChanged |= iterativelySimplifyCFG(F, TTI, DL, AT, BonusInstThreshold);

// If neither pass changed anything, we're done.		// If neither pass changed anything, we're done.
if (!EverChanged) return false;		if (!EverChanged) return false;

// iterativelySimplifyCFG can (rarely) make some loops dead. If this happens,		// iterativelySimplifyCFG can (rarely) make some loops dead. If this happens,
// removeUnreachableBlocks is needed to nuke them, which means we should		// removeUnreachableBlocks is needed to nuke them, which means we should
// iterate between the two optimizations. We structure the code like this to		// iterate between the two optimizations. We structure the code like this to
// avoid reruning iterativelySimplifyCFG if the second pass of		// avoid reruning iterativelySimplifyCFG if the second pass of
// removeUnreachableBlocks doesn't do anything.		// removeUnreachableBlocks doesn't do anything.
if (!removeUnreachableBlocks(F))		if (!removeUnreachableBlocks(F))
return true;		return true;

do {		do {
EverChanged = iterativelySimplifyCFG(F, TTI, DL, AT);		EverChanged = iterativelySimplifyCFG(F, TTI, DL, AT, BonusInstThreshold);
EverChanged |= removeUnreachableBlocks(F);		EverChanged |= removeUnreachableBlocks(F);
} while (EverChanged);		} while (EverChanged);

return true;		return true;
}		}

lib/Transforms/Utils/SimplifyCFG.cpp

[... 38 lines not shown ...]
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/PatternMatch.h"		#include "llvm/IR/PatternMatch.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"		#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"
		#include "llvm/Transforms/Utils/ValueMapper.h"
#include <algorithm>		#include <algorithm>
#include <map>		#include <map>
#include <set>		#include <set>
using namespace llvm;		using namespace llvm;
using namespace PatternMatch;		using namespace PatternMatch;

#define DEBUG_TYPE "simplifycfg"		#define DEBUG_TYPE "simplifycfg"

[... 33 lines not shown ...]	bool operator<(ValueEqualityComparisonCase RHS) const {
return Value < RHS.Value;		return Value < RHS.Value;
}		}

bool operator==(BasicBlock *RHSDest) const { return Dest == RHSDest; }		bool operator==(BasicBlock *RHSDest) const { return Dest == RHSDest; }
};		};

class SimplifyCFGOpt {		class SimplifyCFGOpt {
const TargetTransformInfo &TTI;		const TargetTransformInfo &TTI;
		unsigned BonusInstThreshold;
const DataLayout *const DL;		const DataLayout *const DL;
AssumptionTracker *AT;		AssumptionTracker *AT;
Value *isValueEqualityComparison(TerminatorInst *TI);		Value *isValueEqualityComparison(TerminatorInst *TI);
BasicBlock *GetValueEqualityComparisonCases(TerminatorInst *TI,		BasicBlock *GetValueEqualityComparisonCases(TerminatorInst *TI,
std::vector<ValueEqualityComparisonCase> &Cases);		std::vector<ValueEqualityComparisonCase> &Cases);
bool SimplifyEqualityComparisonWithOnlyPredecessor(TerminatorInst *TI,		bool SimplifyEqualityComparisonWithOnlyPredecessor(TerminatorInst *TI,
BasicBlock *Pred,		BasicBlock *Pred,
IRBuilder<> &Builder);		IRBuilder<> &Builder);
bool FoldValueComparisonIntoPredecessors(TerminatorInst *TI,		bool FoldValueComparisonIntoPredecessors(TerminatorInst *TI,
IRBuilder<> &Builder);		IRBuilder<> &Builder);

bool SimplifyReturn(ReturnInst *RI, IRBuilder<> &Builder);		bool SimplifyReturn(ReturnInst *RI, IRBuilder<> &Builder);
bool SimplifyResume(ResumeInst *RI, IRBuilder<> &Builder);		bool SimplifyResume(ResumeInst *RI, IRBuilder<> &Builder);
bool SimplifyUnreachable(UnreachableInst *UI);		bool SimplifyUnreachable(UnreachableInst *UI);
bool SimplifySwitch(SwitchInst *SI, IRBuilder<> &Builder);		bool SimplifySwitch(SwitchInst *SI, IRBuilder<> &Builder);
bool SimplifyIndirectBr(IndirectBrInst *IBI);		bool SimplifyIndirectBr(IndirectBrInst *IBI);
bool SimplifyUncondBranch(BranchInst *BI, IRBuilder <> &Builder);		bool SimplifyUncondBranch(BranchInst *BI, IRBuilder <> &Builder);
bool SimplifyCondBranch(BranchInst *BI, IRBuilder <>&Builder);		bool SimplifyCondBranch(BranchInst *BI, IRBuilder <>&Builder);

public:		public:
SimplifyCFGOpt(const TargetTransformInfo &TTI, const DataLayout *DL,		SimplifyCFGOpt(const TargetTransformInfo &TTI, unsigned BonusInstThreshold,
AssumptionTracker *AT)		const DataLayout *DL, AssumptionTracker *AT)
: TTI(TTI), DL(DL), AT(AT) {}		: TTI(TTI), BonusInstThreshold(BonusInstThreshold), DL(DL), AT(AT) {}
bool run(BasicBlock *BB);		bool run(BasicBlock *BB);
};		};
}		}

/// SafeToMergeTerminators - Return true if it is safe to merge these two		/// SafeToMergeTerminators - Return true if it is safe to merge these two
/// terminator instructions together.		/// terminator instructions together.
///		///
static bool SafeToMergeTerminators(TerminatorInst *SI1, TerminatorInst *SI2) {		static bool SafeToMergeTerminators(TerminatorInst *SI1, TerminatorInst *SI2) {
[... 1,841 lines not shown ...]	for (BasicBlock::iterator I = PB->begin(), E = PB->end(); I != E; I++) {
}		}
}		}
return false;		return false;
}		}

/// FoldBranchToCommonDest - If this basic block is simple enough, and if a		/// FoldBranchToCommonDest - If this basic block is simple enough, and if a
/// predecessor branches to us and one of our successors, fold the block into		/// predecessor branches to us and one of our successors, fold the block into
/// the predecessor and use logical operations to pick the right destination.		/// the predecessor and use logical operations to pick the right destination.
bool llvm::FoldBranchToCommonDest(BranchInst *BI, const DataLayout *DL) {		bool llvm::FoldBranchToCommonDest(BranchInst *BI, const DataLayout *DL,
		unsigned BonusInstThreshold) {
BasicBlock *BB = BI->getParent();		BasicBlock *BB = BI->getParent();

Instruction *Cond = nullptr;		Instruction *Cond = nullptr;
if (BI->isConditional())		if (BI->isConditional())
Cond = dyn_cast<Instruction>(BI->getCondition());		Cond = dyn_cast<Instruction>(BI->getCondition());
else {		else {
// For unconditional branch, check for a simple CFG pattern, where		// For unconditional branch, check for a simple CFG pattern, where
// BB has a single predecessor and BB's successor is also its predecessor's		// BB has a single predecessor and BB's successor is also its predecessor's
[... 20 lines not shown ...]	else {
if (!Cond)		if (!Cond)
return false;		return false;
}		}

if (!Cond || (!isa<CmpInst>(Cond) && !isa<BinaryOperator>(Cond)) ||		if (!Cond || (!isa<CmpInst>(Cond) && !isa<BinaryOperator>(Cond)) ||
Cond->getParent() != BB || !Cond->hasOneUse())		Cond->getParent() != BB || !Cond->hasOneUse())
return false;		return false;

// Only allow this if the condition is a simple instruction that can be
// executed unconditionally. It must be in the same block as the branch, and
// must be at the front of the block.
BasicBlock::iterator FrontIt = BB->front();

// Ignore dbg intrinsics.
while (isa<DbgInfoIntrinsic>(FrontIt)) ++FrontIt;

// Allow a single instruction to be hoisted in addition to the compare
// that feeds the branch. We later ensure that any values that _it_ uses
// were also live in the predecessor, so that we don't unnecessarily create
// register pressure or inhibit out-of-order execution.
Instruction *BonusInst = nullptr;
if (&*FrontIt != Cond &&
FrontIt->hasOneUse() && FrontIt->user_back() == Cond &&
isSafeToSpeculativelyExecute(FrontIt, DL)) {
BonusInst = &*FrontIt;
++FrontIt;

// Ignore dbg intrinsics.
while (isa<DbgInfoIntrinsic>(FrontIt)) ++FrontIt;
}

// Only a single bonus inst is allowed.
if (&*FrontIt != Cond)
return false;

// Make sure the instruction after the condition is the cond branch.		// Make sure the instruction after the condition is the cond branch.
BasicBlock::iterator CondIt = Cond; ++CondIt;		BasicBlock::iterator CondIt = Cond; ++CondIt;

hfinkel: Please don't remove these blank lines, I think they make the code easier to read.
// Ignore dbg intrinsics.		// Ignore dbg intrinsics.
while (isa<DbgInfoIntrinsic>(CondIt)) ++CondIt;		while (isa<DbgInfoIntrinsic>(CondIt)) ++CondIt;

if (&*CondIt != BI)		if (&*CondIt != BI)
return false;		return false;

		// Only allow this transformation if computing the condition doesn't involve
		// too many instructions and these involved instructions can be executed
		// unconditionally. We denote all involved instructions except the condition
		// as "bonus instructions", and only allow this transformation when the
		// number of the bonus instructions does not exceed a certain threshold.
		unsigned NumBonusInsts = 0;
		for (auto I = BB->begin(); Cond != I; ++I) {
		// Ignore dbg intrinsics.
		if (isa<DbgInfoIntrinsic>(I))
		continue;
		hfinkel: I don't think that the hasOneUse check here really does what you want once we allow for more than one instruction. We used to check that the single bonus instruction had one user and this user specifically was the Cond. Now you'd like to allow for some variable number of single-use instructions (regardless of a relationship to Cond). This could change behavior even when allowing only a single instruction. I think that what you really want to do is to walk up the operand graph from Cond, accumulating instructions until you reach your limit, keeping the hasOneUse check and the check that the use is Cond for the first instruction.
		jingyue: Hi Hal, Thanks for the careful review! However, I don't think the modified code changes the behavior when allowing only a single bonus instruction. Note that in Line 2037 I check whether the only user is in the same BB (and appears after the potential bonus instruction otherwise def doesn't dominate use). When there is only one potential bonus instruction, it is either used by Cond or BI (DbgInfoIntrinsic only uses MDNode but not Instruction). Being used by BI is impossible, because BI only uses Cond as its first operand and other operands are all BB labels. Therefore, this bonus instruction must be used by Cond. Does this make sense? I agree it is at least worth a comment. I like your suggestion of early exiting once we reach the limit. I'll change that part. Thanks, Jingyue
		if (!I->hasOneUse() || !isSafeToSpeculativelyExecute(I, DL))
		return false;
		// I has only one use and can be executed unconditionally.
		Instruction *User = dyn_cast<Instruction>(I->user_back());
		if (User == nullptr || User->getParent() != BB)
		return false;
		// I is used in the same BB. Since BI uses Cond and doesn't have more slots
		// to use any other instruction, User must be an instruction between next(I)
		// and Cond.
		++NumBonusInsts;
		// Early exits once we reach the limit.
		if (NumBonusInsts > BonusInstThreshold)
		return false;
		}

// Cond is known to be a compare or binary operator. Check to make sure that		// Cond is known to be a compare or binary operator. Check to make sure that
// neither operand is a potentially-trapping constant expression.		// neither operand is a potentially-trapping constant expression.
if (ConstantExpr *CE = dyn_cast<ConstantExpr>(Cond->getOperand(0)))		if (ConstantExpr *CE = dyn_cast<ConstantExpr>(Cond->getOperand(0)))
if (CE->canTrap())		if (CE->canTrap())
return false;		return false;
if (ConstantExpr *CE = dyn_cast<ConstantExpr>(Cond->getOperand(1)))		if (ConstantExpr *CE = dyn_cast<ConstantExpr>(Cond->getOperand(1)))
if (CE->canTrap())		if (CE->canTrap())
return false;		return false;
[... 53 lines not shown ...]	if (InvertPredCond) {
NewCond = Builder.CreateNot(NewCond,		NewCond = Builder.CreateNot(NewCond,
PBI->getCondition()->getName()+".not");		PBI->getCondition()->getName()+".not");
}		}

PBI->setCondition(NewCond);		PBI->setCondition(NewCond);
PBI->swapSuccessors();		PBI->swapSuccessors();
}		}

// If we have a bonus inst, clone it into the predecessor block.		// If we have bonus instructions, clone them into the predecessor block.
Instruction *NewBonus = nullptr;		// Note that there may be multiple predecessor blocks, so we cannot move
if (BonusInst) {		// bonus instructions to a predecessor block.
NewBonus = BonusInst->clone();		ValueToValueMapTy VMap; // maps original values to cloned values
		// We already make sure Cond is the last instruction before BI. Therefore,
		// every instructions before Cond other than DbgInfoIntrinsic are bonus
		// instructions.
		for (auto BonusInst = BB->begin(); Cond != BonusInst; ++BonusInst) {
		if (isa<DbgInfoIntrinsic>(BonusInst))
		continue;
		Instruction *NewBonusInst = BonusInst->clone();
		RemapInstruction(NewBonusInst, VMap,
		RF_NoModuleLevelChanges | RF_IgnoreMissingEntries);
		VMap[BonusInst] = NewBonusInst;

// If we moved a load, we cannot any longer claim any knowledge about		// If we moved a load, we cannot any longer claim any knowledge about
// its potential value. The previous information might have been valid		// its potential value. The previous information might have been valid
// only given the branch precondition.		// only given the branch precondition.
// For an analogous reason, we must also drop all the metadata whose		// For an analogous reason, we must also drop all the metadata whose
// semantics we don't understand.		// semantics we don't understand.
NewBonus->dropUnknownMetadata(LLVMContext::MD_dbg);		NewBonusInst->dropUnknownMetadata(LLVMContext::MD_dbg);

PredBlock->getInstList().insert(PBI, NewBonus);		PredBlock->getInstList().insert(PBI, NewBonusInst);
NewBonus->takeName(BonusInst);		NewBonusInst->takeName(BonusInst);
BonusInst->setName(BonusInst->getName()+".old");		BonusInst->setName(BonusInst->getName() + ".old");
}		}

// Clone Cond into the predecessor basic block, and or/and the		// Clone Cond into the predecessor basic block, and or/and the
// two conditions together.		// two conditions together.
Instruction *New = Cond->clone();		Instruction *New = Cond->clone();
if (BonusInst) New->replaceUsesOfWith(BonusInst, NewBonus);		RemapInstruction(New, VMap,
		RF_NoModuleLevelChanges | RF_IgnoreMissingEntries);
PredBlock->getInstList().insert(PBI, New);		PredBlock->getInstList().insert(PBI, New);
New->takeName(Cond);		New->takeName(Cond);
Cond->setName(New->getName()+".old");		Cond->setName(New->getName() + ".old");

if (BI->isConditional()) {		if (BI->isConditional()) {
Instruction *NewCond =		Instruction *NewCond =
cast<Instruction>(Builder.CreateBinOp(Opc, PBI->getCondition(),		cast<Instruction>(Builder.CreateBinOp(Opc, PBI->getCondition(),
New, "or.cond"));		New, "or.cond"));
PBI->setCondition(NewCond);		PBI->setCondition(NewCond);

uint64_t PredTrueWeight, PredFalseWeight, SuccTrueWeight, SuccFalseWeight;		uint64_t PredTrueWeight, PredFalseWeight, SuccTrueWeight, SuccFalseWeight;
[... 461 lines not shown ...]
/// br label %end		/// br label %end
/// end:		/// end:
/// ... = phi i1 [ true, %entry ], [ %tmp, %DEFAULT ], [ true, %entry ]		/// ... = phi i1 [ true, %entry ], [ %tmp, %DEFAULT ], [ true, %entry ]
///		///
/// We prefer to split the edge to 'end' so that there is a true/false entry to		/// We prefer to split the edge to 'end' so that there is a true/false entry to
/// the PHI, merging the third icmp into the switch.		/// the PHI, merging the third icmp into the switch.
static bool TryToSimplifyUncondBranchWithICmpInIt(		static bool TryToSimplifyUncondBranchWithICmpInIt(
ICmpInst *ICI, IRBuilder<> &Builder, const TargetTransformInfo &TTI,		ICmpInst *ICI, IRBuilder<> &Builder, const TargetTransformInfo &TTI,
const DataLayout *DL, AssumptionTracker *AT) {		unsigned BonusInstThreshold, const DataLayout *DL, AssumptionTracker *AT) {
BasicBlock *BB = ICI->getParent();		BasicBlock *BB = ICI->getParent();

// If the block has any PHIs in it or the icmp has multiple uses, it is too		// If the block has any PHIs in it or the icmp has multiple uses, it is too
// complex.		// complex.
if (isa<PHINode>(BB->begin()) || !ICI->hasOneUse()) return false;		if (isa<PHINode>(BB->begin()) || !ICI->hasOneUse()) return false;

Value *V = ICI->getOperand(0);		Value *V = ICI->getOperand(0);
ConstantInt *Cst = cast<ConstantInt>(ICI->getOperand(1));		ConstantInt *Cst = cast<ConstantInt>(ICI->getOperand(1));
[... 16 lines not shown ...]
		if (SI->getDefaultDest() != BB) {
assert(VVal && "Should have a unique destination value");		assert(VVal && "Should have a unique destination value");
ICI->setOperand(0, VVal);		ICI->setOperand(0, VVal);

if (Value *V = SimplifyInstruction(ICI, DL)) {		if (Value *V = SimplifyInstruction(ICI, DL)) {
ICI->replaceAllUsesWith(V);		ICI->replaceAllUsesWith(V);
ICI->eraseFromParent();		ICI->eraseFromParent();
}		}
// BB is now empty, so it is likely to simplify away.		// BB is now empty, so it is likely to simplify away.
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;
}		}

// Ok, the block is reachable from the default dest. If the constant we're		// Ok, the block is reachable from the default dest. If the constant we're
// comparing exists in one of the other edges, then we can constant fold ICI		// comparing exists in one of the other edges, then we can constant fold ICI
// and zap it.		// and zap it.
if (SI->findCaseValue(Cst) != SI->case_default()) {		if (SI->findCaseValue(Cst) != SI->case_default()) {
Value *V;		Value *V;
if (ICI->getPredicate() == ICmpInst::ICMP_EQ)		if (ICI->getPredicate() == ICmpInst::ICMP_EQ)
V = ConstantInt::getFalse(BB->getContext());		V = ConstantInt::getFalse(BB->getContext());
else		else
V = ConstantInt::getTrue(BB->getContext());		V = ConstantInt::getTrue(BB->getContext());

ICI->replaceAllUsesWith(V);		ICI->replaceAllUsesWith(V);
ICI->eraseFromParent();		ICI->eraseFromParent();
// BB is now empty, so it is likely to simplify away.		// BB is now empty, so it is likely to simplify away.
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;
}		}

// The use of the icmp has to be in the 'end' block, by the only PHI node in		// The use of the icmp has to be in the 'end' block, by the only PHI node in
// the block.		// the block.
BasicBlock *SuccBlock = BB->getTerminator()->getSuccessor(0);		BasicBlock *SuccBlock = BB->getTerminator()->getSuccessor(0);
PHINode *PHIUse = dyn_cast<PHINode>(ICI->user_back());		PHINode *PHIUse = dyn_cast<PHINode>(ICI->user_back());
if (PHIUse == nullptr || PHIUse != &SuccBlock->front() ||		if (PHIUse == nullptr || PHIUse != &SuccBlock->front() ||
isa<PHINode>(++BasicBlock::iterator(PHIUse)))		isa<PHINode>(++BasicBlock::iterator(PHIUse)))
[... 1,218 lines not shown ...]
bool SimplifyCFGOpt::SimplifySwitch(SwitchInst *SI, IRBuilder<> &Builder) {		bool SimplifyCFGOpt::SimplifySwitch(SwitchInst *SI, IRBuilder<> &Builder) {
BasicBlock *BB = SI->getParent();		BasicBlock *BB = SI->getParent();

if (isValueEqualityComparison(SI)) {		if (isValueEqualityComparison(SI)) {
// If we only have one predecessor, and if it is a branch on this value,		// If we only have one predecessor, and if it is a branch on this value,
// see if that predecessor totally determines the outcome of this switch.		// see if that predecessor totally determines the outcome of this switch.
if (BasicBlock *OnlyPred = BB->getSinglePredecessor())		if (BasicBlock *OnlyPred = BB->getSinglePredecessor())
if (SimplifyEqualityComparisonWithOnlyPredecessor(SI, OnlyPred, Builder))		if (SimplifyEqualityComparisonWithOnlyPredecessor(SI, OnlyPred, Builder))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;

Value *Cond = SI->getCondition();		Value *Cond = SI->getCondition();
if (SelectInst *Select = dyn_cast<SelectInst>(Cond))		if (SelectInst *Select = dyn_cast<SelectInst>(Cond))
if (SimplifySwitchOnSelect(SI, Select))		if (SimplifySwitchOnSelect(SI, Select))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;

// If the block only contains the switch, see if we can fold the block		// If the block only contains the switch, see if we can fold the block
// away into any preds.		// away into any preds.
BasicBlock::iterator BBI = BB->begin();		BasicBlock::iterator BBI = BB->begin();
// Ignore dbg intrinsics.		// Ignore dbg intrinsics.
while (isa<DbgInfoIntrinsic>(BBI))		while (isa<DbgInfoIntrinsic>(BBI))
++BBI;		++BBI;
if (SI == &*BBI)		if (SI == &*BBI)
if (FoldValueComparisonIntoPredecessors(SI, Builder))		if (FoldValueComparisonIntoPredecessors(SI, Builder))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;
}		}

// Try to transform the switch into an icmp and a branch.		// Try to transform the switch into an icmp and a branch.
if (TurnSwitchRangeIntoICmp(SI, Builder))		if (TurnSwitchRangeIntoICmp(SI, Builder))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;

// Remove unreachable cases.		// Remove unreachable cases.
if (EliminateDeadSwitchCases(SI, DL, AT))		if (EliminateDeadSwitchCases(SI, DL, AT))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;

if (ForwardSwitchConditionToPHI(SI))		if (ForwardSwitchConditionToPHI(SI))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;

if (SwitchToLookupTable(SI, Builder, TTI, DL))		if (SwitchToLookupTable(SI, Builder, TTI, DL))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;

return false;		return false;
}		}

bool SimplifyCFGOpt::SimplifyIndirectBr(IndirectBrInst *IBI) {		bool SimplifyCFGOpt::SimplifyIndirectBr(IndirectBrInst *IBI) {
BasicBlock *BB = IBI->getParent();		BasicBlock *BB = IBI->getParent();
bool Changed = false;		bool Changed = false;

[... 20 lines not shown ...]
		if (IBI->getNumDestinations() == 1) {
// If the indirectbr has one successor, change it to a direct branch.		// If the indirectbr has one successor, change it to a direct branch.
BranchInst::Create(IBI->getDestination(0), IBI);		BranchInst::Create(IBI->getDestination(0), IBI);
EraseTerminatorInstAndDCECond(IBI);		EraseTerminatorInstAndDCECond(IBI);
return true;		return true;
}		}

if (SelectInst *SI = dyn_cast<SelectInst>(IBI->getAddress())) {		if (SelectInst *SI = dyn_cast<SelectInst>(IBI->getAddress())) {
if (SimplifyIndirectBrOnSelect(IBI, SI))		if (SimplifyIndirectBrOnSelect(IBI, SI))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;
}		}
return Changed;		return Changed;
}		}

bool SimplifyCFGOpt::SimplifyUncondBranch(BranchInst *BI, IRBuilder<> &Builder){		bool SimplifyCFGOpt::SimplifyUncondBranch(BranchInst *BI, IRBuilder<> &Builder){
BasicBlock *BB = BI->getParent();		BasicBlock *BB = BI->getParent();

if (SinkCommon && SinkThenElseCodeToEnd(BI))		if (SinkCommon && SinkThenElseCodeToEnd(BI))
return true;		return true;

// If the Terminator is the only non-phi instruction, simplify the block.		// If the Terminator is the only non-phi instruction, simplify the block.
BasicBlock::iterator I = BB->getFirstNonPHIOrDbg();		BasicBlock::iterator I = BB->getFirstNonPHIOrDbg();
if (I->isTerminator() && BB != &BB->getParent()->getEntryBlock() &&		if (I->isTerminator() && BB != &BB->getParent()->getEntryBlock() &&
TryToSimplifyUncondBranchFromEmptyBlock(BB))		TryToSimplifyUncondBranchFromEmptyBlock(BB))
return true;		return true;

// If the only instruction in the block is a seteq/setne comparison		// If the only instruction in the block is a seteq/setne comparison
// against a constant, try to simplify the block.		// against a constant, try to simplify the block.
if (ICmpInst *ICI = dyn_cast<ICmpInst>(I))		if (ICmpInst *ICI = dyn_cast<ICmpInst>(I))
if (ICI->isEquality() && isa<ConstantInt>(ICI->getOperand(1))) {		if (ICI->isEquality() && isa<ConstantInt>(ICI->getOperand(1))) {
for (++I; isa<DbgInfoIntrinsic>(I); ++I)		for (++I; isa<DbgInfoIntrinsic>(I); ++I)
;		;
if (I->isTerminator() &&		if (I->isTerminator() &&
TryToSimplifyUncondBranchWithICmpInIt(ICI, Builder, TTI, DL, AT))		TryToSimplifyUncondBranchWithICmpInIt(ICI, Builder, TTI,
		BonusInstThreshold, DL, AT))
return true;		return true;
}		}

// If this basic block is ONLY a compare and a branch, and if a predecessor		// If this basic block is ONLY a compare and a branch, and if a predecessor
// branches to us and our successor, fold the comparison into the		// branches to us and our successor, fold the comparison into the
// predecessor and use logical operations to update the incoming value		// predecessor and use logical operations to update the incoming value
// for PHI nodes in common successor.		// for PHI nodes in common successor.
if (FoldBranchToCommonDest(BI, DL))		if (FoldBranchToCommonDest(BI, DL, BonusInstThreshold))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;
return false;		return false;
}		}


bool SimplifyCFGOpt::SimplifyCondBranch(BranchInst *BI, IRBuilder<> &Builder) {		bool SimplifyCFGOpt::SimplifyCondBranch(BranchInst *BI, IRBuilder<> &Builder) {
BasicBlock *BB = BI->getParent();		BasicBlock *BB = BI->getParent();

// Conditional branch		// Conditional branch
if (isValueEqualityComparison(BI)) {		if (isValueEqualityComparison(BI)) {
// If we only have one predecessor, and if it is a branch on this value,		// If we only have one predecessor, and if it is a branch on this value,
// see if that predecessor totally determines the outcome of this		// see if that predecessor totally determines the outcome of this
// switch.		// switch.
if (BasicBlock *OnlyPred = BB->getSinglePredecessor())		if (BasicBlock *OnlyPred = BB->getSinglePredecessor())
if (SimplifyEqualityComparisonWithOnlyPredecessor(BI, OnlyPred, Builder))		if (SimplifyEqualityComparisonWithOnlyPredecessor(BI, OnlyPred, Builder))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;

// This block must be empty, except for the setcond inst, if it exists.		// This block must be empty, except for the setcond inst, if it exists.
// Ignore dbg intrinsics.		// Ignore dbg intrinsics.
BasicBlock::iterator I = BB->begin();		BasicBlock::iterator I = BB->begin();
// Ignore dbg intrinsics.		// Ignore dbg intrinsics.
while (isa<DbgInfoIntrinsic>(I))		while (isa<DbgInfoIntrinsic>(I))
++I;		++I;
if (&*I == BI) {		if (&*I == BI) {
if (FoldValueComparisonIntoPredecessors(BI, Builder))		if (FoldValueComparisonIntoPredecessors(BI, Builder))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;
} else if (&*I == cast<Instruction>(BI->getCondition())){		} else if (&*I == cast<Instruction>(BI->getCondition())){
++I;		++I;
// Ignore dbg intrinsics.		// Ignore dbg intrinsics.
while (isa<DbgInfoIntrinsic>(I))		while (isa<DbgInfoIntrinsic>(I))
++I;		++I;
if (&*I == BI && FoldValueComparisonIntoPredecessors(BI, Builder))		if (&*I == BI && FoldValueComparisonIntoPredecessors(BI, Builder))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;
}		}
}		}

// Try to turn "br (X == 0 | X == 1), T, F" into a switch instruction.		// Try to turn "br (X == 0 | X == 1), T, F" into a switch instruction.
if (SimplifyBranchOnICmpChain(BI, DL, Builder))		if (SimplifyBranchOnICmpChain(BI, DL, Builder))
return true;		return true;

// If this basic block is ONLY a compare and a branch, and if a predecessor		// If this basic block is ONLY a compare and a branch, and if a predecessor
// branches to us and one of our successors, fold the comparison into the		// branches to us and one of our successors, fold the comparison into the
// predecessor and use logical operations to pick the right destination.		// predecessor and use logical operations to pick the right destination.
if (FoldBranchToCommonDest(BI, DL))		if (FoldBranchToCommonDest(BI, DL, BonusInstThreshold))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;

// We have a conditional branch to two blocks that are only reachable		// We have a conditional branch to two blocks that are only reachable
// from BI. We know that the condbr dominates the two blocks, so see if		// from BI. We know that the condbr dominates the two blocks, so see if
// there is any identical code in the "then" and "else" blocks. If so, we		// there is any identical code in the "then" and "else" blocks. If so, we
// can hoist it up to the branching block.		// can hoist it up to the branching block.
if (BI->getSuccessor(0)->getSinglePredecessor()) {		if (BI->getSuccessor(0)->getSinglePredecessor()) {
if (BI->getSuccessor(1)->getSinglePredecessor()) {		if (BI->getSuccessor(1)->getSinglePredecessor()) {
if (HoistThenElseCodeToIf(BI, DL))		if (HoistThenElseCodeToIf(BI, DL))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;
} else {		} else {
// If Successor #1 has multiple preds, we may be able to conditionally		// If Successor #1 has multiple preds, we may be able to conditionally
// execute Successor #0 if it branches to Successor #1.		// execute Successor #0 if it branches to Successor #1.
TerminatorInst *Succ0TI = BI->getSuccessor(0)->getTerminator();		TerminatorInst *Succ0TI = BI->getSuccessor(0)->getTerminator();
if (Succ0TI->getNumSuccessors() == 1 &&		if (Succ0TI->getNumSuccessors() == 1 &&
Succ0TI->getSuccessor(0) == BI->getSuccessor(1))		Succ0TI->getSuccessor(0) == BI->getSuccessor(1))
if (SpeculativelyExecuteBB(BI, BI->getSuccessor(0), DL))		if (SpeculativelyExecuteBB(BI, BI->getSuccessor(0), DL))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;
}		}
} else if (BI->getSuccessor(1)->getSinglePredecessor()) {		} else if (BI->getSuccessor(1)->getSinglePredecessor()) {
// If Successor #0 has multiple preds, we may be able to conditionally		// If Successor #0 has multiple preds, we may be able to conditionally
// execute Successor #1 if it branches to Successor #0.		// execute Successor #1 if it branches to Successor #0.
TerminatorInst *Succ1TI = BI->getSuccessor(1)->getTerminator();		TerminatorInst *Succ1TI = BI->getSuccessor(1)->getTerminator();
if (Succ1TI->getNumSuccessors() == 1 &&		if (Succ1TI->getNumSuccessors() == 1 &&
Succ1TI->getSuccessor(0) == BI->getSuccessor(0))		Succ1TI->getSuccessor(0) == BI->getSuccessor(0))
if (SpeculativelyExecuteBB(BI, BI->getSuccessor(1), DL))		if (SpeculativelyExecuteBB(BI, BI->getSuccessor(1), DL))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;
}		}

// If this is a branch on a phi node in the current block, thread control		// If this is a branch on a phi node in the current block, thread control
// through this block if any PHI node entries are constants.		// through this block if any PHI node entries are constants.
if (PHINode *PN = dyn_cast<PHINode>(BI->getCondition()))		if (PHINode *PN = dyn_cast<PHINode>(BI->getCondition()))
if (PN->getParent() == BI->getParent())		if (PN->getParent() == BI->getParent())
if (FoldCondBranchOnPHI(BI, DL))		if (FoldCondBranchOnPHI(BI, DL))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;

// Scan predecessor blocks for conditional branches.		// Scan predecessor blocks for conditional branches.
for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI)		for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI)
if (BranchInst *PBI = dyn_cast<BranchInst>((*PI)->getTerminator()))		if (BranchInst *PBI = dyn_cast<BranchInst>((*PI)->getTerminator()))
if (PBI != BI && PBI->isConditional())		if (PBI != BI && PBI->isConditional())
if (SimplifyCondBranchToCondBranch(PBI, BI))		if (SimplifyCondBranchToCondBranch(PBI, BI))
return SimplifyCFG(BB, TTI, DL, AT) | true;		return SimplifyCFG(BB, TTI, BonusInstThreshold, DL, AT) | true;

return false;		return false;
}		}

/// Check if passing a value to an instruction will cause undefined behavior.		/// Check if passing a value to an instruction will cause undefined behavior.
static bool passingValueIsAlwaysUndefined(Value *V, Instruction *I) {		static bool passingValueIsAlwaysUndefined(Value *V, Instruction *I) {
Constant *C = dyn_cast<Constant>(V);		Constant *C = dyn_cast<Constant>(V);
if (!C)		if (!C)
[... 127 lines not shown ...]
}		}

/// SimplifyCFG - This function is used to do simplification of a CFG. For		/// SimplifyCFG - This function is used to do simplification of a CFG. For
/// example, it adjusts branches to branches to eliminate the extra hop, it		/// example, it adjusts branches to branches to eliminate the extra hop, it
/// eliminates unreachable basic blocks, and does other "peephole" optimization		/// eliminates unreachable basic blocks, and does other "peephole" optimization
/// of the CFG. It returns true if a modification was made.		/// of the CFG. It returns true if a modification was made.
///		///
bool llvm::SimplifyCFG(BasicBlock *BB, const TargetTransformInfo &TTI,		bool llvm::SimplifyCFG(BasicBlock *BB, const TargetTransformInfo &TTI,
		unsigned BonusInstThreshold,
const DataLayout *DL, AssumptionTracker *AT) {		const DataLayout *DL, AssumptionTracker *AT) {
return SimplifyCFGOpt(TTI, DL, AT).run(BB);		return SimplifyCFGOpt(TTI, BonusInstThreshold, DL, AT).run(BB);
}		}
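
Reviewer-style note: the cloning loop in FoldBranchToCommonDest can be hard to follow in side-by-side form, so here is a minimal, self-contained sketch of the same idea outside LLVM. It is an illustration under stated assumptions, not the patch's code: `ToyInst`, `ToyBlock`, and `cloneIntoPred` are hypothetical names, the `std::map` stands in for `ValueToValueMapTy`, and the operand rewrite loosely imitates `RemapInstruction` with `RF_IgnoreMissingEntries` (operands with no map entry are left untouched). The counting step is simplified to "every non-debug instruction before the condition is a bonus instruction".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction. This is NOT an LLVM class.
struct ToyInst {
  std::string Name;
  std::vector<const ToyInst *> Operands; // values this instruction reads
  bool IsDebug = false;                  // models a debug intrinsic
};

using ToyBlock = std::vector<std::unique_ptr<ToyInst>>;

// Hypothetical helper. The last element of BB plays the role of Cond; every
// non-debug instruction before it counts as a bonus instruction. Returns
// false and leaves Pred untouched if the count exceeds Threshold.
bool cloneIntoPred(const ToyBlock &BB, ToyBlock &Pred, unsigned Threshold) {
  if (BB.empty())
    return false;

  // Count bonus instructions and exit early once the limit is exceeded.
  unsigned NumBonus = 0;
  for (std::size_t I = 0; I + 1 < BB.size(); ++I)
    if (!BB[I]->IsDebug && ++NumBonus > Threshold)
      return false;

  // Maps each original instruction to its clone.
  std::map<const ToyInst *, const ToyInst *> VMap;
  for (const auto &Inst : BB) {
    if (Inst->IsDebug)
      continue; // debug markers are not cloned
    auto Clone = std::make_unique<ToyInst>(*Inst);
    // Redirect operands that refer to already-cloned instructions; operands
    // defined outside BB have no entry and stay as they are.
    for (const ToyInst *&Op : Clone->Operands) {
      auto It = VMap.find(Op);
      if (It != VMap.end())
        Op = It->second;
    }
    VMap[Inst.get()] = Clone.get();
    Pred.push_back(std::move(Clone));
  }
  return true;
}
```

With a block modelling `(i | c) ^ d > e` (an `or`, an `xor`, then the compare), a threshold of 1 rejects the fold and a threshold of 2 clones all three instructions, with the cloned `xor` reading the cloned `or` rather than the original. The real code inserts the clones before the predecessor's terminator; the sketch just appends because the toy block has no terminator.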

test/Transforms/SimplifyCFG/branch-fold-threshold.ll

This file was added.

				; RUN: opt %s -simplifycfg -S | FileCheck %s --check-prefix=NORMAL
				; RUN: opt %s -simplifycfg -S -bonus-inst-threshold=2 | FileCheck %s --check-prefix=AGGRESSIVE

				define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d, i32* %input) {
				; NORMAL-LABEL: @foo(
				; AGGRESSIVE-LABEL: @foo(
				entry:
				%cmp = icmp sgt i32 %d, 3
				br i1 %cmp, label %cond.end, label %lor.lhs.false
				; NORMAL: br i1
				; AGGRESSIVE: br i1

				lor.lhs.false:
				%mul = shl i32 %c, 1
				%add = add nsw i32 %mul, %a
				%cmp1 = icmp slt i32 %add, %b
				br i1 %cmp1, label %cond.false, label %cond.end
				; NORMAL: br i1
				; AGGRESSIVE-NOT: br i1

				cond.false:
				%0 = load i32* %input, align 4
				br label %cond.end

				cond.end:
				%cond = phi i32 [ %0, %cond.false ], [ 0, %lor.lhs.false ], [ 0, %entry ]
				ret i32 %cond
				}
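
For readers who prefer source-level code to IR, the following sketch restates what the test above checks. It is an approximation written for illustration: `fooBranchy` and `fooFolded` are hypothetical names, `c * 2` stands in for the `shl`, and the folded form is only conceptually what SimplifyCFG produces with `-bonus-inst-threshold=2` (the exact and/or/not shape in the IR may differ).

```cpp
#include <cassert>

// Branchy form, mirroring the CFG in the test: two conditional branches
// share the destination cond.end, and the second condition needs two extra
// ("bonus") instructions, mul and add.
int fooBranchy(int a, int b, int c, int d, const int *input) {
  if (d > 3)
    return 0;        // entry -> cond.end
  int mul = c * 2;   // bonus instruction 1 (shl in the IR)
  int add = mul + a; // bonus instruction 2
  if (add < b)
    return *input;   // cond.false
  return 0;          // lor.lhs.false -> cond.end
}

// Folded form: the bonus instructions run unconditionally, so a single
// branch on the combined condition remains. The load stays guarded.
int fooFolded(int a, int b, int c, int d, const int *input) {
  int mul = c * 2;
  int add = mul + a;
  bool skip = (d > 3) | !(add < b); // bitwise or: no short-circuit branch
  return skip ? 0 : *input;
}
```

Both functions return the same value for every input that does not overflow, which is why the fold is legal: the speculated multiply and add cannot trap. The cost is two extra arithmetic operations on the path where `d > 3`, which is the trade-off the threshold controls.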

Revision Contents

Diff 14197

include/llvm/Transforms/Scalar.h

include/llvm/Transforms/Utils/Local.h

lib/Transforms/Scalar/SimplifyCFGPass.cpp

lib/Transforms/Utils/SimplifyCFG.cpp

test/Transforms/SimplifyCFG/branch-fold-threshold.ll
