This is an archive of the discontinued LLVM Phabricator instance.


Differential D86346

[SimplifyCFG] Accumulate cost against budget
Abandoned · Public

Authored by samparker on Aug 21 2020, 6:07 AM.

Details

Reviewers

mkazantsev
spatel
lebedev.ri

Summary

As highlighted in D82438, the speculation cost isn't accumulated in SpeculativelyExecuteBB. So now include the selects, the speculated instruction and any constant expressions against the set threshold.
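To make the summary concrete, here is a minimal standalone sketch, not code from the patch, that contrasts a per-instruction threshold check with a single accumulated budget. The names `BasicCost`, `withinPerInstructionThreshold` and `withinAccumulatedBudget` are hypothetical; `BasicCost` stands in for `TargetTransformInfo::TCC_Basic`, and the cost lists are made-up inputs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for TargetTransformInfo::TCC_Basic.
constexpr int BasicCost = 1;

// Per-instruction check: each cost is compared against the threshold on its
// own, so the sum of all costs is never bounded.
bool withinPerInstructionThreshold(const std::vector<int> &Costs,
                                   int Threshold) {
  for (int Cost : Costs)
    if (Cost > Threshold * BasicCost)
      return false;
  return true;
}

// Accumulated check: every cost is subtracted from one shared budget, and the
// transform is rejected as soon as the budget goes negative.
bool withinAccumulatedBudget(const std::vector<int> &Costs, int Budget) {
  assert(Budget >= 0 && "budget must start non-negative");
  int BudgetRemaining = Budget;
  for (int Cost : Costs) {
    BudgetRemaining -= Cost;
    if (BudgetRemaining < 0)
      return false;
  }
  return true;
}
```

With three items of cost 2 and a threshold of 2, the per-instruction check accepts while the accumulated check rejects, which is the gap the summary describes.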

Diff Detail

Event Timeline

samparker created this revision. · Aug 21 2020, 6:07 AM

Herald added a project: Restricted Project. · Aug 21 2020, 6:07 AM

Herald added a subscriber: hiraditya.

samparker requested review of this revision. · Aug 21 2020, 6:07 AM

samparker edited the summary of this revision. · Aug 21 2020, 6:10 AM

lebedev.ri added inline comments. · Aug 21 2020, 6:23 AM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2099–2104	Please add a new `cl::opt` for that. I believe the threshold should be the branch misprediction cost, i.e. if it costs less (in latency) to execute the branch than to wrongly predict that it won't be executed, then rewind and execute it anyway, we should just execute it. So I'd go with `20`.

samparker added inline comments. · Aug 21 2020, 6:34 AM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2099–2104	Most of the cost modelling framework actually tries to assume that branches are always predicated though, and I don't understand why making arbitrary decisions about machine details would be a good idea here. Plus, it's also completely irrelevant to code size... It would seem to be better to query the latency / size of the branch, but I highly doubt that would produce results that anyone would want...

samparker added a reviewer: spatel. · Aug 21 2020, 6:34 AM

lebedev.ri added inline comments. · Aug 21 2020, 6:37 AM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2179	Ah, so this transform basically never fires then.
2183	I'm not sure what is going on here.

lebedev.ri mentioned this in D86347: [SimplifyCFG] Two entry phi select costs. · Aug 21 2020, 6:39 AM

lebedev.ri requested changes to this revision. · Aug 21 2020, 6:41 AM

lebedev.ri added inline comments.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2099–2104	I'll rephrase: given that the cost modelling logic changed, the remaining value of `PHINodeFoldingThreshold` is completely wrong.

This revision now requires changes to proceed. · Aug 21 2020, 6:41 AM

samparker added inline comments. · Aug 21 2020, 6:59 AM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2099–2104	Okay... so I guess we need to start at: What is the purpose of this code? I think it's trying to convert any number of phis and speculate, at most, a single instruction (and maybe try to figure out something with constexprs). If PHINodeFoldingThreshold is supposed to represent the cost of a cmp + sel, this doesn't feel completely broken to me but the threshold could just be PHINodeFoldingThreshold * NumPHIs?

lebedev.ri added inline comments. · Aug 21 2020, 7:04 AM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2099–2104	I have no idea, it's you who is changing/fixing this, you should know how it works, so you tell me :) I'm just saying that since the entire cost-modelling logic changed from "let's just ensure that every single instruction is below some threshold, and ignore total cost" to "let's count the total cost", it is pretty obvious that using the old threshold as-is makes no sense.

Harbormaster completed remote builds in B69131: Diff 287018. · Aug 21 2020, 7:06 AM

samparker added inline comments. · Aug 21 2020, 7:51 AM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2099–2104	Yeah, so I think multiplying by the number of phis should bring us back in line with what we were doing before.

Rebased after reorganising the two parts of code that evaluate phis. The budget now allows a basic cost for each phi within the block, plus a basic cost for the one instruction that we may speculate.
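As a reading aid, the budget described here can be sketched as a small standalone function. This mirrors the expression in the diff (`TCC_Basic + TCC_Basic * NumPHIs * PHINodeFoldingThreshold`), but `TCCBasic` and `initialBudget` are hypothetical names introduced only for illustration; `TCCBasic` is assumed to equal `TargetTransformInfo::TCC_Basic`.

```cpp
#include <cassert>

// Hypothetical stand-in for TargetTransformInfo::TCC_Basic.
constexpr int TCCBasic = 1;

// One basic cost for the single instruction that may be speculated, plus
// Threshold basic costs for each phi in the end block.
int initialBudget(unsigned NumPHIs, unsigned Threshold) {
  return TCCBasic + TCCBasic * static_cast<int>(NumPHIs * Threshold);
}
```

Under these assumptions, with the proposed default threshold of 1 a block with three phis starts with a budget of 4, and a block with no phis still has a budget of 1 for the speculated instruction.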

Ping.

Please do address my review comments.

This revision now requires changes to proceed. · Sep 10 2020, 12:46 AM

This review seems to be stuck/dead, consider abandoning if no longer relevant.

This revision now requires review to proceed. · Jan 12 2023, 4:46 PM

Herald added a project: Restricted Project. · Jan 12 2023, 4:46 PM

Herald added a subscriber: StephenFan.

samparker abandoned this revision. · Jan 12 2023, 9:04 PM

Revision Contents

Path	Size
llvm/lib/Transforms/Utils/SimplifyCFG.cpp	33 lines

Diff 287331

llvm/lib/Transforms/Utils/SimplifyCFG.cpp

#include <utility>		#include <utility>
#include <vector>		#include <vector>

using namespace llvm;		using namespace llvm;
using namespace PatternMatch;		using namespace PatternMatch;

#define DEBUG_TYPE "simplifycfg"		#define DEBUG_TYPE "simplifycfg"

// Chosen as 2 so as to be cheap, but still to have enough power to fold		/// The cost of converting each phi to a select, with the total cost allowed
// a select, so the "clamp" idiom (of a min followed by a max) will be caught.		/// for the block is this value multiplied by the total number of phis, plus
// To catch this, we need to fold a compare and a select, hence '2' being the		/// one to allow the speculation of a single instruction.
// minimum reasonable default.
static cl::opt<unsigned> PHINodeFoldingThreshold(		static cl::opt<unsigned> PHINodeFoldingThreshold(
"phi-node-folding-threshold", cl::Hidden, cl::init(2),		"phi-node-folding-threshold", cl::Hidden, cl::init(1),
cl::desc(		cl::desc(
"Control the amount of phi node folding to perform (default = 2)"));		"Control the amount of phi node folding to perform (default = 1)"));

static cl::opt<unsigned> TwoEntryPHINodeFoldingThreshold(		static cl::opt<unsigned> TwoEntryPHINodeFoldingThreshold(
"two-entry-phi-node-folding-threshold", cl::Hidden, cl::init(4),		"two-entry-phi-node-folding-threshold", cl::Hidden, cl::init(4),
cl::desc("Control the maximal total instruction cost that we are willing "		cl::desc("Control the maximal total instruction cost that we are willing "
"to speculatively execute to fold a 2-entry PHI node into a "		"to speculatively execute to fold a 2-entry PHI node into a "
"select (default = 4)"));		"select (default = 4)"));

static cl::opt<bool> DupRet(		static cl::opt<bool> DupRet(
[...]		for (PHINode &PN : EndBB->phis()) {
if (!OrigCE && !ThenCE)		if (!OrigCE && !ThenCE)
continue; // Known safe and cheap.		continue; // Known safe and cheap.

if ((ThenCE && !isSafeToSpeculativelyExecute(ThenCE)) ||		if ((ThenCE && !isSafeToSpeculativelyExecute(ThenCE)) ||
(OrigCE && !isSafeToSpeculativelyExecute(OrigCE)))		(OrigCE && !isSafeToSpeculativelyExecute(OrigCE)))
return false;		return false;
unsigned OrigCost = OrigCE ? ComputeSpeculationCost(OrigCE, TTI) : 0;		unsigned OrigCost = OrigCE ? ComputeSpeculationCost(OrigCE, TTI) : 0;
unsigned ThenCost = ThenCE ? ComputeSpeculationCost(ThenCE, TTI) : 0;		unsigned ThenCost = ThenCE ? ComputeSpeculationCost(ThenCE, TTI) : 0;
unsigned MaxCost =		BudgetRemaining -= OrigCost + ThenCost;
2 * PHINodeFoldingThreshold * TargetTransformInfo::TCC_Basic;
if (OrigCost + ThenCost > MaxCost)
return false;

// Account for the cost of an unfolded ConstantExpr which could end up		// Account for the cost of an unfolded ConstantExpr which could end up
// getting expanded into Instructions.		// getting expanded into Instructions.
// FIXME: This doesn't account for how many operations are combined in the		// FIXME: This doesn't account for how many operations are combined in the
// constant expression.		// constant expression.
++SpeculatedInstructions;		++SpeculatedInstructions;
if (SpeculatedInstructions > 1)		if (SpeculatedInstructions > 1)
return false;		return false;
[...]		bool SimplifyCFGOpt::SpeculativelyExecuteBB(BranchInst *BI, BasicBlock *ThenBB,
const TargetTransformInfo &TTI) {		const TargetTransformInfo &TTI) {
// Be conservative for now. FP select instruction can often be expensive.		// Be conservative for now. FP select instruction can often be expensive.
Value *BrCond = BI->getCondition();		Value *BrCond = BI->getCondition();
if (isa<FCmpInst>(BrCond))		if (isa<FCmpInst>(BrCond))
return false;		return false;

BasicBlock *BB = BI->getParent();		BasicBlock *BB = BI->getParent();
BasicBlock *EndBB = ThenBB->getTerminator()->getSuccessor(0);		BasicBlock *EndBB = ThenBB->getTerminator()->getSuccessor(0);
int BudgetRemaining =
PHINodeFoldingThreshold * TargetTransformInfo::TCC_Basic;		// Enable the cost of speculating a single basic instruction plus a basic cost
		// for each phi in the block.
		unsigned NumPHIs = std::distance(EndBB->phis().begin(), EndBB->phis().end());
		int BudgetRemaining = TargetTransformInfo::TCC_Basic +
		(TargetTransformInfo::TCC_Basic * NumPHIs * PHINodeFoldingThreshold);

// If ThenBB is actually on the false edge of the conditional branch, remember		// If ThenBB is actually on the false edge of the conditional branch, remember
// to swap the select operands later.		// to swap the select operands later.
bool Invert = false;		bool Invert = false;
if (ThenBB != BI->getSuccessor(0)) {		if (ThenBB != BI->getSuccessor(0)) {
assert(ThenBB == BI->getSuccessor(1) && "No edge from 'if' block?");		assert(ThenBB == BI->getSuccessor(1) && "No edge from 'if' block?");
Invert = true;		Invert = true;
}		}
[...]		for (BasicBlock::iterator BBI = ThenBB->begin(),
if (SpeculatedInstructions > 1)		if (SpeculatedInstructions > 1)
return false;		return false;

// Don't hoist the instruction if it's unsafe or expensive.		// Don't hoist the instruction if it's unsafe or expensive.
if (!isSafeToSpeculativelyExecute(I) &&		if (!isSafeToSpeculativelyExecute(I) &&
!(HoistCondStores && (SpeculatedStoreValue = isSafeToSpeculateStore(		!(HoistCondStores && (SpeculatedStoreValue = isSafeToSpeculateStore(
I, BB, ThenBB, EndBB))))		I, BB, ThenBB, EndBB))))
return false;		return false;
if (!SpeculatedStoreValue &&		if (!SpeculatedStoreValue) {
ComputeSpeculationCost(I, TTI) >		BudgetRemaining -= ComputeSpeculationCost(I, TTI);
PHINodeFoldingThreshold * TargetTransformInfo::TCC_Basic)		if (BudgetRemaining < 0)
return false;		return false;
		}

// Store the store speculation candidate.		// Store the store speculation candidate.
if (SpeculatedStoreValue)		if (SpeculatedStoreValue)
SpeculatedStore = cast<StoreInst>(I);		SpeculatedStore = cast<StoreInst>(I);

// Do not hoist the instruction if any of its operands are defined but not		// Do not hoist the instruction if any of its operands are defined but not
// used in BB. The transformation will prevent the operand from		// used in BB. The transformation will prevent the operand from
// being sunk into the use block.		// being sunk into the use block.
[...]		bool SimplifyCFGOpt::SpeculativelyExecuteBB(BranchInst *BI, BasicBlock *ThenBB,
// speculation. Note, while we iterate over a DenseMap here, we are summing		// speculation. Note, while we iterate over a DenseMap here, we are summing
// and so iteration order isn't significant.		// and so iteration order isn't significant.
for (SmallDenseMap<Instruction *, unsigned, 4>::iterator		for (SmallDenseMap<Instruction *, unsigned, 4>::iterator
I = SinkCandidateUseCounts.begin(),		I = SinkCandidateUseCounts.begin(),
E = SinkCandidateUseCounts.end();		E = SinkCandidateUseCounts.end();
I != E; ++I)		I != E; ++I)
if (I->first->hasNUses(I->second)) {		if (I->first->hasNUses(I->second)) {
++SpeculatedInstructions;		++SpeculatedInstructions;
if (SpeculatedInstructions > 1)		if (SpeculatedInstructions > 1)
return false;		return false;
}		}

// Check that we can insert the selects and that it's not too expensive to do		// Check that we can insert the selects and that it's not too expensive to do
// so.		// so.
bool Convert = SpeculatedStore != nullptr;		bool Convert = SpeculatedStore != nullptr;
Convert |= validateAndCostRequiredSelects(BB, ThenBB, EndBB,		Convert |= validateAndCostRequiredSelects(BB, ThenBB, EndBB,
SpeculatedInstructions,		SpeculatedInstructions,
BudgetRemaining, TTI);		BudgetRemaining, TTI);
if (!Convert || BudgetRemaining < 0)		if (!Convert || BudgetRemaining < 0)
return false;		return false;
