This is an archive of the discontinued LLVM Phabricator instance.

[Speculation] Add a SpeculativeExecution mode where the pass does nothing unless TTI::hasBranchDivergence() is true.
Closed, Public

Authored by jlebar on Mar 30 2016, 2:45 PM.

Details

Reviewers

tra
chandlerc

Commits

rGcad81cf6b303: [Speculation] Add a SpeculativeExecution mode where the pass does nothing…
rL266398: [Speculation] Add a SpeculativeExecution mode where the pass does nothing…

Summary

This lets us add this pass to the IR pass manager unconditionally; it
will simply not do anything on targets without branch divergence.
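
The gating described in the summary can be sketched without LLVM. This is a minimal model under stated assumptions, not the patch itself: `FakeTTI` and `FakeSpeculativeExecution` are hypothetical stand-ins for `TargetTransformInfo` and the real pass, and the boolean return value only stands in for "the pass did its work".

```cpp
// Hypothetical stand-in for TargetTransformInfo; only this one query matters.
struct FakeTTI {
  bool BranchDivergence;
  bool hasBranchDivergence() const { return BranchDivergence; }
};

// The two modes named in the patch's file comment.
enum ExecutionMode { ALWAYS, ONLY_IF_DIVERGENT_ARCH };

// Hypothetical stand-in for the pass.
class FakeSpeculativeExecution {
public:
  explicit FakeSpeculativeExecution(ExecutionMode Mode) : Mode(Mode) {}

  // Returns true if the pass would go on to do its work.
  bool runOnFunction(const FakeTTI &TTI) const {
    if (Mode == ONLY_IF_DIVERGENT_ARCH && !TTI.hasBranchDivergence())
      return false; // Nop on targets without divergent branches.
    return true;    // Placeholder for the real hoisting logic.
  }

private:
  const ExecutionMode Mode;
};
```

With this shape, a pipeline can add the pass in `ONLY_IF_DIVERGENT_ARCH` mode for every target and rely on the early return for targets where it is not wanted.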

Diff Detail

Event Timeline

jlebar updated this revision to Diff 52130. Mar 30 2016, 2:45 PM

jlebar retitled this revision to [Speculation] Add a SpeculativeExecution mode where the pass does nothing unless TTI::hasBranchDivergence() is true.

jlebar updated this object.

jlebar added a reviewer: tra.

jlebar added subscribers: chandlerc, rnk, jingyue, llvm-commits.

tra added inline comments. Mar 30 2016, 3:12 PM

lib/Transforms/Scalar/SpeculativeExecution.cpp
272	Could we pass an argument to createSpeculativeExecutionPass() instead of encoding it in the name?

jlebar added inline comments. Mar 30 2016, 3:14 PM

lib/Transforms/Scalar/SpeculativeExecution.cpp
272	We could, although I wouldn't want it to be a boolean; see my (by now quite old) rant about this: http://jlebar.com/2011/12/16/Boolean_parameters_to_API_functions_considered_harmful..html I figured it wasn't worth exposing an enum into the llvm namespace, but if that's preferred, I can do it.

mehdi_amini added a child revision: D18626: [PM] Add a SpeculativeExecution pass for targets with divergent branches. Mar 30 2016, 3:32 PM

mehdi_amini mentioned this in D18626: [PM] Add a SpeculativeExecution pass for targets with divergent branches.

mehdi_amini added a subscriber: mehdi_amini. Mar 30 2016, 3:34 PM

mehdi_amini added inline comments.

lib/Transforms/Scalar/SpeculativeExecution.cpp
272	Offtopic: I'm so glad to see your blog post entry. I'd had the same argument on a patch review once on llvm-commits! I'm trying to avoid it as much as possible.

Two high-level comments:

I'd just add in the PassManagerBuilder change here...

Please add a pass-local flag that allows you to test the pass in both modes, and some test cases exercising the behavior here.

lib/Transforms/Scalar/SpeculativeExecution.cpp
96	LLVM doesn't use this style for enums. However, I wouldn't use an enum here. I would just use a well named boolean.
272	While I'm sympathetic, and I think having separate public functions is fine, I wouldn't reach to the full power of an enum here. Within LLVM and Clang we pretty commonly will use '/*MyFlagName*/ true' as the argument. There is even a clang-tidy check that will ensure the name in the argument and the name in the parameter match so you don't get the ugly bugs here. I'd just go with that pattern as it seems like a lot less overhead for something simple and localized to this file.
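
The argument-comment pattern mentioned in the last inline comment can be illustrated with a small sketch. The helper names here (`describeMode`, `describeDivergentOnly`) are hypothetical and exist only for this example. The comment is written in the `/*Name=*/` form, which is the spelling I believe the clang-tidy argument-comment check matches against the parameter name.

```cpp
#include <string>

// Hypothetical file-local helper that takes a plain bool.
static std::string describeMode(bool OnlyIfDivergentTarget) {
  return OnlyIfDivergentTarget ? "only-if-divergent-target" : "always";
}

// At the call site, the comment names the parameter, so a reader does not
// have to look up the declaration to learn what `true` means.
static std::string describeDivergentOnly() {
  return describeMode(/*OnlyIfDivergentTarget=*/true);
}
```

The comment costs nothing at run time, but nothing forces a caller to write it. That is the gap the author's reply points at for public APIs.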

This revision now requires changes to proceed. Apr 7 2016, 12:32 AM

jlebar marked 3 inline comments as done. Apr 7 2016, 9:15 AM

jlebar added inline comments.

lib/Transforms/Scalar/SpeculativeExecution.cpp
272	> I'd just go with [the "named bool" argument] pattern as it seems like a lot less overhead for something simple and localized to this file.

I think it's a question of the relative public-ness of APIs. For the API that's private to this file, sure, I agree that an enum is overkill, I'll change it to a named bool. But for the public functions, it's a lot harder to guarantee that people are going to call them in a sane way. In fact, people can call them from outside the LLVM project entirely. So those are exactly the kinds of functions I was talking about in the blog post. I also like separate named functions over a public enum in this case.

(FWIW I don't think clang-tidy is relevant until it's run automatically as part of most people's workflows. At the moment, I think I'm one of the only ones who runs clang-format automatically, and I had to write a wrapper around arc to accomplish that. I haven't bothered to figure out clang-tidy, mostly because, eh, but also I think it's nontrivial because I'd need to give it -I paths. If you actually want to rely on clang-tidy in general, we should figure out how to incorporate it into everyone's workflows; otherwise, I can't rely on other people using it. Moreover, the clang-tidy check I'm familiar with only ensures that, if you write the "named bool" comment in a certain way, the name is right. It doesn't ensure that you name the bool at all, and it doesn't catch all of the creative ways you could write the name. So even if it were running on every patch, it wouldn't prevent the problems described in that blog post.)

Anyway, I think we're on the same page here.
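
The split argued for here (a named bool inside the file, separately named functions at the public boundary) might look roughly like the sketch below. Everything prefixed with `Fake` or `createFake` is a hypothetical stand-in. The real factory functions return `FunctionPass *` and are declared in `Scalar.h`.

```cpp
#include <memory>

// Hypothetical public interface, standing in for llvm::FunctionPass.
struct FakeFunctionPass {
  virtual ~FakeFunctionPass() = default;
  virtual bool onlyIfDivergentTarget() const = 0;
};

namespace {
// File-private class: a named bool is enough, since every caller lives in
// this file and can be audited.
class FakeSpeculativeExecution : public FakeFunctionPass {
public:
  explicit FakeSpeculativeExecution(bool OnlyIfDivergentTarget)
      : OnlyIfDivergentTarget(OnlyIfDivergentTarget) {}
  bool onlyIfDivergentTarget() const override { return OnlyIfDivergentTarget; }

private:
  const bool OnlyIfDivergentTarget;
};
} // namespace

// Public entry points: the mode is encoded in the function name, so external
// callers never pass a bare `true` or `false`.
std::unique_ptr<FakeFunctionPass> createFakeSpeculativeExecutionPass() {
  return std::unique_ptr<FakeFunctionPass>(
      new FakeSpeculativeExecution(/*OnlyIfDivergentTarget=*/false));
}

std::unique_ptr<FakeFunctionPass>
createFakeSpeculativeExecutionIfHasBranchDivergencePass() {
  return std::unique_ptr<FakeFunctionPass>(
      new FakeSpeculativeExecution(/*OnlyIfDivergentTarget=*/true));
}
```

The sketch uses `std::unique_ptr` only to keep the example leak-free. The patch itself returns raw pointers, as LLVM's legacy pass factories do.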

> Please add a pass-local flag that allows you to test the pass in both modes, and some test cases exercising the behavior here.

Chandler, how would you feel if I added a test to D18626 instead? (The test would check that we do the speculative execution only on the appropriate targets.)

Adding a flag to force speculative execution on (or off?) doesn't seem to test what we're actually after, namely that we do SpeculativeExecution iff the target has divergent branches.

In D18625#398483, @jlebar wrote:

> > Please add a pass-local flag that allows you to test the pass in both modes, and some test cases exercising the behavior here.
>
> Chandler, how would you feel if I added a test to D18626 instead? (The test would check that we do the speculative execution only on the appropriate targets.)

You should also probably do some testing there, but...

> Adding a flag to force speculative execution on (or off?) doesn't seem to test what we're actually after, namely that we do SpeculativeExecution iff the target has divergent branches.

I'm not suggesting you force speculation on or off here. I'm suggesting you toggle the two modes described in the comments: "always" vs "only if divergent arch". Essentially, a flag that simulates the two public functions you can call to construct the pass.

You can use a target that will trivially never have speculation enabled to ensure you can observe this.

I'm essentially saying you should test two different things:

The pass has an "always" mode that doesn't care about the target. That should involve a flag here.

Targets can differentially control this when not in the "always" mode. That can involve having a flag for a single target that swings it both ways, or a flag that swings all targets, or just comparing two known targets. However, the last of these makes it annoying because we have to disable the test if *either* of the targets are missing.
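
One way to read the request above is as a flag that can switch a pass constructed in "always" mode into "only if divergent target" mode, without ever forcing speculation on or off by itself. The sketch below models that with plain C++. `ForceOnlyIfDivergentTarget` is a hypothetical global standing in for a hidden command-line option, and `FakeTTI` and `FakeSpeculativeExecution` are stand-ins, not the real classes.

```cpp
// Hypothetical stand-in for a hidden testing flag (a cl::opt<bool> in LLVM).
static bool ForceOnlyIfDivergentTarget = false;

// Hypothetical stand-in for TargetTransformInfo.
struct FakeTTI {
  bool BranchDivergence;
  bool hasBranchDivergence() const { return BranchDivergence; }
};

class FakeSpeculativeExecution {
public:
  // The flag is OR'ed with the constructor argument: it can only restrict an
  // "always" pass to divergent targets, never widen a restricted one.
  explicit FakeSpeculativeExecution(bool OnlyIfDivergentTarget = false)
      : OnlyIfDivergentTarget(OnlyIfDivergentTarget ||
                              ForceOnlyIfDivergentTarget) {}

  // Returns true if the pass would go on to do its work.
  bool runOnFunction(const FakeTTI &TTI) const {
    if (OnlyIfDivergentTarget && !TTI.hasBranchDivergence())
      return false;
    return true;
  }

private:
  const bool OnlyIfDivergentTarget;
};
```

A test can then construct the pass the same way in both runs and flip only the flag. On a target without divergent branches, it expects the pass to act when the flag is unset and to do nothing when it is set.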

Add test, and flag for testing.

OK, is this what you had in mind?

Just some minor nits and tweaks to the testing below. Thanks. LGTM, feel free to submit with this stuff addressed, or ask for clarifications if any of it doesn't make sense.

lib/Transforms/Scalar/SpeculativeExecution.cpp
101–103	I would use either ...Arch or ...Target consistently. Does it make sense to make the flag be the default argument rather than an ||? I don't feel strongly either way.
test/Transforms/SpeculativeExecution/divergent-target.ll
7 ↗	(On Diff #53455)	Maybe put this in '-mtriple' arguments to the two target specific ones? That way the "always" mode isn't tied to this target? It would also make the '-march' above more clean as it could be a parallel '-mtriple=x86_64-...'
11 ↗	(On Diff #53455)	This looks stale?

This revision is now accepted and ready to land. Apr 12 2016, 2:48 PM

Address Chandler's comments.

Thank you for the review, Chandler!

test/Transforms/SpeculativeExecution/divergent-target.ll
7 ↗	(On Diff #53455)	Much better, thanks. I'd been trying -mcpu and -march, but that didn't work, and I didn't look closely enough at opt.cpp to see that -mtriple would do the same.

Closed by commit rL266398: [Speculation] Add a SpeculativeExecution mode where the pass does nothing… (authored by jlebar). Apr 14 2016, 5:37 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
include/llvm/LinkAllPasses.h	1 line
include/llvm/Transforms/Scalar.h	4 lines
lib/Transforms/Scalar/SpeculativeExecution.cpp	45 lines

Diff 52130

include/llvm/LinkAllPasses.h

[180 lines not shown] ForcePassLinking() {
       (void) llvm::createInstructionSimplifierPass();
       (void) llvm::createLoopVectorizePass();
       (void) llvm::createSLPVectorizerPass();
       (void) llvm::createBBVectorizePass();
       (void) llvm::createPartiallyInlineLibCallsPass();
       (void) llvm::createScalarizerPass();
       (void) llvm::createSeparateConstOffsetFromGEPPass();
       (void) llvm::createSpeculativeExecutionPass();
+      (void) llvm::createSpeculativeExecutionIfHasBranchDivergencePass();
       (void) llvm::createRewriteSymbolsPass();
       (void) llvm::createStraightLineStrengthReducePass();
       (void) llvm::createMemDerefPrinter();
       (void) llvm::createFloat2IntPass();
       (void) llvm::createEliminateAvailableExternallyPass();

       (void)new llvm::IntervalPartition();
       (void)new llvm::ScalarEvolutionWrapperPass();
[14 lines not shown]

include/llvm/Transforms/Scalar.h

[418 lines not shown]

 //===----------------------------------------------------------------------===//
 //
 // SpeculativeExecution - Aggressively hoist instructions to enable
 // speculative execution on targets where branches are expensive.
 //
 FunctionPass *createSpeculativeExecutionPass();

+// Same as createSpeculativeExecutionPass, but does nothing unless
+// TargetTransformInfo::hasBranchDivergence() is true.
+FunctionPass *createSpeculativeExecutionIfHasBranchDivergencePass();
+
 //===----------------------------------------------------------------------===//
 //
 // LoadCombine - Combine loads into bigger loads.
 //
 BasicBlockPass *createLoadCombinePass();

 //===----------------------------------------------------------------------===//
 //
[68 lines not shown]

lib/Transforms/Scalar/SpeculativeExecution.cpp

[44 lines not shown]
 // SimplifyCFG. SimplifyCFG will not speculate if no selects are introduced and
 // it will speculate at most one instruction. It also will not speculate if
 // there is a value defined in the if-block that is only used in the then-block.
 // These restrictions make sense since the speculation in SimplifyCFG seems
 // aimed at introducing cheap selects, while this pass is intended to do more
 // aggressive speculation while counting on later passes to either capitalize on
 // that or clean it up.
 //
+// This pass operates in one of two modes:
+//
+// - ALWAYS: The pass always runs.
+// - ONLY_IF_DIVERGENT_ARCH: The pass is a nop unless
+//   TargetTransformInfo::hasBranchDivergence() is true.
+//
+// The ONLY_IF_DIVERGENT_ARCH mode lets you include this pass unconditionally in
+// the IR pass pipeline, but only enable it for relevant targets.
+//
 //===----------------------------------------------------------------------===//

 #include "llvm/ADT/SmallSet.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
 #include "llvm/Analysis/ValueTracking.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Operator.h"
[18 lines not shown]
 // further optimization.
 static cl::opt<unsigned> SpecExecMaxNotHoisted(
     "spec-exec-max-not-hoisted", cl::init(5), cl::Hidden,
     cl::desc("Speculative execution is not applied to basic blocks where the "
              "number of instructions that would not be speculatively executed "
              "exceeds this limit."));

 namespace {
+enum ExecutionMode { ALWAYS, ONLY_IF_DIVERGENT_ARCH };
+
 class SpeculativeExecution : public FunctionPass {
 public:
   static char ID;
-  SpeculativeExecution(): FunctionPass(ID) {}
+  SpeculativeExecution() : SpeculativeExecution(ALWAYS) {}
+  explicit SpeculativeExecution(ExecutionMode Mode)
+      : FunctionPass(ID), Mode(Mode) {}

   void getAnalysisUsage(AnalysisUsage &AU) const override;
   bool runOnFunction(Function &F) override;

+  const char *getPassName() const override {
+    switch (Mode) {
+    case ALWAYS:
+      return "Speculatively execute instructions";
+    case ONLY_IF_DIVERGENT_ARCH:
+      return "Speculatively execute instructions if target has divergent "
+             "branches";
+    }
+  }
+
 private:
   bool runOnBasicBlock(BasicBlock &B);
   bool considerHoistingFromTo(BasicBlock &FromBlock, BasicBlock &ToBlock);

+  const ExecutionMode Mode;
   const TargetTransformInfo *TTI = nullptr;
 };
 } // namespace

 char SpeculativeExecution::ID = 0;
 INITIALIZE_PASS_BEGIN(SpeculativeExecution, "speculative-execution",
                       "Speculatively execute instructions", false, false)
 INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
 INITIALIZE_PASS_END(SpeculativeExecution, "speculative-execution",
                     "Speculatively execute instructions", false, false)

 void SpeculativeExecution::getAnalysisUsage(AnalysisUsage &AU) const {
   AU.addRequired<TargetTransformInfoWrapperPass>();
 }

 bool SpeculativeExecution::runOnFunction(Function &F) {
   if (skipOptnoneFunction(F))
     return false;

   TTI = &getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
+  if (Mode == ONLY_IF_DIVERGENT_ARCH && !TTI->hasBranchDivergence()) {
+    DEBUG(dbgs() << "Not running SpeculativeExecution because "
+                    "TTI->hasBranchDivergence() is false.");
+    return false;
+  }

   bool Changed = false;
   for (auto& B : F) {
     Changed |= runOnBasicBlock(B);
   }
   return Changed;
 }

[105 lines not shown] for (auto I = FromBlock.begin(); I != FromBlock.end();) {
     }
   }
   return true;
 }

 namespace llvm {

 FunctionPass *createSpeculativeExecutionPass() {
-  return new SpeculativeExecution();
+  return new SpeculativeExecution(ALWAYS);
+}
+
+FunctionPass *createSpeculativeExecutionIfHasBranchDivergencePass() {
+  return new SpeculativeExecution(ONLY_IF_DIVERGENT_ARCH);
 }

 } // namespace llvm
