This is an archive of the discontinued LLVM Phabricator instance.

Allow BB duplication threshold to be adjusted through JumpThreading's ctor
ClosedPublic

Authored by hliao on Sep 22 2014, 12:38 PM.

Details

Reviewers

sebpop
nadav
hfinkel

Summary

Some targets may not want to duplicate basic blocks (BBs) when threading jumps, or may want to use a different threshold. Adding a threshold parameter to that pass's ctor and initializer allows BB duplication to be customized.

Diff Detail

Event Timeline

hliao updated this revision to Diff 13947. Sep 22 2014, 12:38 PM

hliao retitled this revision from to Allow BB duplication threshold to be adjusted through JumpThreading's ctor.

hliao updated this object.

hliao edited the test plan for this revision.

hliao added a reviewer: hfinkel.

hliao added a subscriber: Unknown Object (MLST).

spatel added a subscriber: spatel. Sep 22 2014, 2:45 PM

spatel added inline comments.

lib/Transforms/Scalar/JumpThreading.cpp
50	I have no opinion about the functionality of this patch, but how about giving a name to that magic '6'...like: static unsigned BBDuplicateThresholdDefault = 6; Then you can use that constant as the default parameter assignment, and you don't have to introduce yet another magic number (-1) and cast signed to unsigned.

PING again

lib/Transforms/Scalar/JumpThreading.cpp
50	'-1' here is used to tell whether a value other than the default threshold is specified or not. However, that default threshold can be changed through the command-line option. Adding a name for that magic '6' and using it as the default parameter does not work correctly when the default threshold is changed through the command-line option but no value is specified through the pass creator.
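
The difference between the two approaches discussed in this inline thread can be sketched without LLVM. This is a minimal illustration, not the patch itself: the plain global below stands in for the `cl::opt<unsigned>`, and `resolveWithNamedDefault` and `resolveWithSentinel` are hypothetical helper names introduced only for this example.

```cpp
#include <cassert>

// Stand-in for the cl::opt<unsigned> in the patch. A real cl::opt is set
// when the command line is parsed; here that is modeled as a mutable global.
static unsigned BBDuplicateThreshold = 6;

// The named constant suggested in the review, used as a default argument.
static const unsigned BBDuplicateThresholdDefault = 6;

// Variant A (hypothetical): named constant as the default argument.
// "Caller passed nothing" and "caller passed 6" look the same, so the
// command-line value is never consulted.
unsigned resolveWithNamedDefault(unsigned T = BBDuplicateThresholdDefault) {
  return T;
}

// Variant B (mirrors the ctor in the patch): -1 means "not specified", so
// the command-line value is used unless the caller overrides it.
unsigned resolveWithSentinel(int T = -1) {
  assert(T >= -1 && "expected -1 or a non-negative threshold");
  return (T == -1) ? BBDuplicateThreshold : unsigned(T);
}
```

If the global is changed to 10 (as `-jump-threading-threshold=10` would do), Variant A still yields 6 when called with no argument, while Variant B yields 10, and an explicit argument such as 0 wins in both.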

The functionality of the patch looks okay. Please add a comment that says that non-negative values override the internal default.

If this is supposed to allow for target customization, should the number come from TTI? What is the underlying hardware constraint(s) you're trying to model? Also, a test case would be nice.
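
One way to picture the TTI question is the sketch below. It is an assumption-laden illustration only: `TargetCostInfo`, `getJumpThreadingThreshold`, and `pickThreshold` are hypothetical names, not part of LLVM's TargetTransformInfo, and the priority order (ctor value, then target, then command-line default) is a guess at one reasonable policy.

```cpp
#include <cassert>

// Hypothetical stand-in for a TTI-style query; NOT LLVM's actual interface.
struct TargetCostInfo {
  virtual ~TargetCostInfo() = default;
  // Returns -1 when the target has no preference.
  virtual int getJumpThreadingThreshold() const { return -1; }
};

// Hypothetical target that never wants blocks duplicated.
struct NoDuplicationTargetCostInfo : TargetCostInfo {
  int getJumpThreadingThreshold() const override { return 0; }
};

// Assumed policy: an explicit ctor value wins, then the target's preference,
// then the command-line default.
unsigned pickThreshold(int CtorValue, const TargetCostInfo &TCI,
                       unsigned CommandLineDefault) {
  assert(CtorValue >= -1 && "expected -1 or a non-negative threshold");
  if (CtorValue >= 0)
    return unsigned(CtorValue);
  int FromTarget = TCI.getJumpThreadingThreshold();
  if (FromTarget >= 0)
    return unsigned(FromTarget);
  return CommandLineDefault;
}
```

With such a hook, clients of the standard pipeline would get a target-appropriate threshold without passing anything to the pass creator.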

I think that he is trying to disable code duplication because GPU languages may contain barriers that must not be duplicated.

If that's the motivation, then I wonder why our "noduplicate" function attribute is not already sufficient.

Hi Hal

Yeah, "noduplicate" could prevent duplicating of barrier calls but that
patch wants to address the potential issue on processors with divergent
control flow, commonly found in GPUs, e.g. AMD/NVIDIA ones. The
scenario is that, if BB is duplicated to exploit more jump threading,
targets with divergent CF may execute more instructions if the
condition is a divergent one.

For updating that threshold from TTI, yeah, if we are interested in
that case. I could come up with another patch considering both TTI and
a user-specified threshold.

Yours

Michael
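
The divergence concern in the message above can be illustrated with a toy model. This is a heavily simplified sketch under stated assumptions: one warp executes in lock-step, lanes that arrive through different predecessors reconverge at BB when it is not duplicated, and each version of a block is issued once if at least one lane is active in it. `issueSlotsForBB` is a hypothetical helper, not LLVM code.

```cpp
#include <cassert>
#include <vector>

// Issue slots one warp spends on block BB (or on BB plus its copy).
// LaneCameFromP1[i] is true when lane i arrived through the predecessor
// whose path jump threading would specialize.
unsigned issueSlotsForBB(const std::vector<bool> &LaneCameFromP1,
                         unsigned BBSize, bool Duplicated) {
  assert(!LaneCameFromP1.empty() && "a warp has at least one lane");
  bool AnyFromP1 = false, AnyFromOther = false;
  for (bool FromP1 : LaneCameFromP1) {
    if (FromP1)
      AnyFromP1 = true;
    else
      AnyFromOther = true;
  }
  // Without duplication all lanes reconverge and BB is issued once.
  if (!Duplicated)
    return BBSize;
  // With duplication, the copy and the original are issued separately
  // whenever each has at least one active lane.
  return (AnyFromP1 ? BBSize : 0u) + (AnyFromOther ? BBSize : 0u);
}
```

In this model a uniform warp pays `BBSize` either way, while a warp that splits across the two predecessors pays `2 * BBSize` once BB is duplicated.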

----- Original Message -----

From: "Michael Liao" <michael.liao@intel.com>
To: "michael liao" <michael.liao@intel.com>, nrotem@apple.com, hfinkel@anl.gov
Cc: spatel@rotateright.com, llvm-commits@cs.uiuc.edu
Sent: Monday, September 29, 2014 6:34:36 PM
Subject: Re: [PATCH] Allow BB duplication threshold to be adjusted through JumpThreading's ctor

Hi Hal

Yeah, "noduplicate" could prevent duplicating of barrier calls but that
patch wants to address the potential issue on processors with divergent
control flow, commonly found in GPUs, e.g. AMD/NVIDIA ones. The
scenario is that, if BB is duplicated to exploit more jump threading,
targets with divergent CF may execute more instructions if the
condition is a divergent one.

For updating that threshold from TTI, yeah, if we are interested in
that case. I could come up with another patch considering both TTI and
a user-specified threshold.

I suppose that I don't understand what you mean by "if we are interested." Generally speaking, ctor parameters are useful only for clients who are not using the standard optimization pipeline, and we'd like the standard optimization pipeline to generally work well for a wide range of targets. Thus, a TTI interface is preferred.

From a cost modeling perspective, how can you tell whether the instruction duplication will be worthwhile? Can this be something like 2*(instruction costs) <= (branch cost)?

Thanks again,
Hal
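
The rule of thumb floated above, 2*(instruction costs) <= (branch cost), could be written as a small predicate. This is only a sketch of that inequality: `duplicationLooksWorthwhile` is a hypothetical name, and the per-instruction costs and branch cost are assumed to be supplied by some target cost model in comparable units.

```cpp
#include <numeric>
#include <vector>

// Returns true when duplicating a block whose instructions have the given
// costs satisfies 2 * (sum of instruction costs) <= branch cost.
bool duplicationLooksWorthwhile(const std::vector<unsigned> &InstCosts,
                                unsigned BranchCost) {
  unsigned Total = std::accumulate(InstCosts.begin(), InstCosts.end(), 0u);
  return 2 * Total <= BranchCost;
}
```

For example, a block of two unit-cost instructions passes when the branch cost is 4, while a block of three unit-cost instructions does not.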

Yours

Michael

http://reviews.llvm.org/D5444

Hi Hal

This patch has been committed as https://reviews.llvm.org/rL218375

This revision is now accepted and ready to land. Oct 19 2016, 9:25 AM

sebpop closed this revision. Oct 19 2016, 9:25 AM

Revision Contents

Path                                      Size
include/llvm/Transforms/Scalar.h          2 lines
lib/Transforms/Scalar/JumpThreading.cpp   17 lines

Diff 13947

include/llvm/Transforms/Scalar.h

Context not available.
 // JumpThreading - Thread control through mult-pred/multi-succ blocks where some
 // preds always go to some succ.
 //
-FunctionPass *createJumpThreadingPass();
+FunctionPass *createJumpThreadingPass(int Threshold = -1);

 //===----------------------------------------------------------------------===//
 //
Context not available.

lib/Transforms/Scalar/JumpThreading.cpp

Context not available.
 STATISTIC(NumDupes, "Number of branch blocks duplicated to eliminate phi");

 static cl::opt<unsigned>
-Threshold("jump-threading-threshold",
+BBDuplicateThreshold("jump-threading-threshold",
           cl::desc("Max block size to duplicate for jump threading"),
           cl::init(6), cl::Hidden);

Context not available.
 #endif
     DenseSet<std::pair<Value*, BasicBlock*> > RecursionSet;

+    unsigned BBDupThreshold;

     // RAII helper for updating the recursion stack.
     struct RecursionSetRemover {
       DenseSet<std::pair<Value*, BasicBlock*> > &TheSet;
Context not available.
     };
   public:
     static char ID; // Pass identification
-    JumpThreading() : FunctionPass(ID) {
+    JumpThreading(int T = -1) : FunctionPass(ID) {
+      BBDupThreshold = (T == -1) ? BBDuplicateThreshold : unsigned(T);
       initializeJumpThreadingPass(*PassRegistry::getPassRegistry());
     }

Context not available.
                 "Jump Threading", false, false)

 // Public interface to the Jump Threading pass
-FunctionPass *llvm::createJumpThreadingPass() { return new JumpThreading(); }
+FunctionPass *llvm::createJumpThreadingPass(int Threshold) { return new JumpThreading(Threshold); }

 /// runOnFunction - Top level algorithm.
 ///
Context not available.
     return false;
   }

-  unsigned JumpThreadCost = getJumpThreadDuplicationCost(BB, Threshold);
-  if (JumpThreadCost > Threshold) {
+  unsigned JumpThreadCost = getJumpThreadDuplicationCost(BB, BBDupThreshold);
+  if (JumpThreadCost > BBDupThreshold) {
     DEBUG(dbgs() << " Not threading BB '" << BB->getName()
           << "' - Cost is too high: " << JumpThreadCost << "\n");
     return false;
Context not available.
     return false;
   }

-  unsigned DuplicationCost = getJumpThreadDuplicationCost(BB, Threshold);
-  if (DuplicationCost > Threshold) {
+  unsigned DuplicationCost = getJumpThreadDuplicationCost(BB, BBDupThreshold);
+  if (DuplicationCost > BBDupThreshold) {
     DEBUG(dbgs() << " Not duplicating BB '" << BB->getName()
           << "' - Cost is too high: " << DuplicationCost << "\n");
     return false;
Context not available.