This is an archive of the discontinued LLVM Phabricator instance.

[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst
ClosedPublic

Authored by yrouban on May 20 2019, 1:38 AM.


Details

Reviewers

nikic
reames
davide

Commits

rGa3e16719c46a: Resubmit "[CorrelatedValuePropagation] Fix prof branch_weights metadata…
rL362583: Resubmit "[CorrelatedValuePropagation] Fix prof branch_weights metadata…
rG53f2f3286572: [CorrelatedValuePropagation] Fix prof branch_weights metadata handling for…
rL361808: [CorrelatedValuePropagation] Fix prof branch_weights metadata handling for…

Summary

This patch fixes the CorrelatedValuePropagation pass to keep prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper introduced in D62122.
A new test is added.
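
As a rough illustration of what "consistent" means here, the following standalone sketch models a switch whose weights are updated in lockstep with case removal. It is not LLVM code: ModelSwitch and pruneNonNegative are hypothetical names, and the "move the last case into the removed slot" behaviour is an assumption about how case removal reorders cases, chosen because it reproduces the order and weights that the switch1 test in this revision expects (!{!"branch_weights", i32 99, i32 4, i32 3}).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a switch: Weights[0] belongs to the default
// destination and Weights[i + 1] belongs to CaseValues[i].
struct ModelSwitch {
  std::vector<int> CaseValues;
  std::vector<unsigned> Weights;

  // Remove case Idx by overwriting it with the last case (assumed removal
  // strategy), and apply the same move to the weights so they stay aligned.
  void removeCase(std::size_t Idx) {
    assert(Idx < CaseValues.size());
    assert(Weights.size() == CaseValues.size() + 1);
    CaseValues[Idx] = CaseValues.back();
    CaseValues.pop_back();
    Weights[Idx + 1] = Weights.back();
    Weights.pop_back();
  }
};

// Model of "the condition is known to be negative": every case with a
// non-negative value is dead and gets removed.
ModelSwitch pruneNonNegative(ModelSwitch S) {
  for (std::size_t I = 0; I < S.CaseValues.size();) {
    if (S.CaseValues[I] >= 0)
      S.removeCase(I); // Do not advance: a new case now sits at index I.
    else
      ++I;
  }
  return S;
}
```

Feeding it the cases and weights from switch1 (cases 0, 1, -1, -2, 2, 3 with weights 99, 1, 2, 3, 4, 5, 6) leaves cases -2 and -1 with weights 99, 4, 3. Dropping the cases without touching the weights would leave seven weights attached to a three-successor switch.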

Diff Detail

Event Timeline

yrouban created this revision. May 20 2019, 1:38 AM

Herald added a project: Restricted Project. May 20 2019, 1:38 AM

Herald added a subscriber: hiraditya.

yrouban added a parent revision: D62122: [NFC] Introduce SwitchInst wrapper for prof branch_weights handling. May 20 2019, 1:38 AM

yrouban added a child revision: D61179: Verifier: check prof branch_weights.

yrouban mentioned this in D61179: Verifier: check prof branch_weights. May 20 2019, 8:24 PM

nikic added inline comments. May 25 2019, 10:13 AM

llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
402	I think that something can go wrong here... Consider a case where we have a switch with branch weights, then we remove the first case (so that Changed=true in SwitchInstProfBranchWeightsWrapper), then we find that the second case is always true, such that this constant folding call will erase the switch, leaving behind a dangling reference in SwitchInstProfBranchWeightsWrapper, which will be accessed when the destructor runs at the end of the function.

yrouban marked an inline comment as done. May 27 2019, 7:16 AM

yrouban added inline comments.

llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
402	Good catch! ConstantFoldTerminator() can change SwitchInst on its own.

renamed SwitchInstProfBranchWeightsWrapper to SwitchInstProfUpdateWrapper
explicitly narrowed scope of the SwitchInstProfUpdateWrapper object so it does not overlap with ConstantFoldTerminator()
added a test case where switch turns into conditional branch
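
The scoping fix can be sketched outside LLVM as follows. This is a minimal model under stated assumptions, not the patch itself: FakeSwitch, WeightsWrapper, foldTerminator and processWithNarrowScope are hypothetical stand-ins for the SwitchInst, the wrapper, ConstantFoldTerminator() and processSwitch() respectively.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a switch instruction with attached weights.
struct FakeSwitch {
  std::vector<unsigned> Metadata;
};

// Hypothetical stand-in for the wrapper: it caches the weights and writes
// them back to the instruction in its destructor if they were changed.
class WeightsWrapper {
  FakeSwitch &SI;
  std::vector<unsigned> Weights;
  bool Changed = false;

public:
  explicit WeightsWrapper(FakeSwitch &S) : SI(S), Weights(S.Metadata) {}
  ~WeightsWrapper() {
    if (Changed)
      SI.Metadata = Weights; // Touches SI, so SI must still be alive here.
  }
  void dropWeight(std::size_t Idx) {
    assert(Idx < Weights.size());
    Weights.erase(Weights.begin() + Idx);
    Changed = true;
  }
};

// Stand-in for the folding step: it reads the attached metadata and then
// destroys the instruction.
void foldTerminator(std::unique_ptr<FakeSwitch> &Term,
                    std::vector<unsigned> &SeenByFolder) {
  SeenByFolder = Term->Metadata;
  Term.reset();
}

std::vector<unsigned> processWithNarrowScope() {
  std::unique_ptr<FakeSwitch> Term(new FakeSwitch());
  Term->Metadata = {99, 1, 2, 3};
  std::vector<unsigned> SeenByFolder;
  { // The wrapper lives only inside this block.
    WeightsWrapper W(*Term);
    W.dropWeight(1);
  } // Write-back happens here, before the instruction can be destroyed.
  foldTerminator(Term, SeenByFolder);
  return SeenByFolder;
}
```

If the wrapper were instead declared at function scope, its destructor would run after foldTerminator() had already destroyed the instruction, which is the dangling-reference scenario described in the inline comment above.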

This looks ok, but I wonder if it might make sense to have an explicit update() method for cases (like this) where we need to control the point where the SwitchInst is updated. Something like:

void update() {
  if (Changed)
    SI.setMetadata(LLVMContext::MD_prof, buildProfBranchWeightsMD());
  Changed = false;
}
~SwitchInstProfUpdateWrapper() {
  update();
}

I feel like this might be more obvious than extra scopes to control the destruction point.

In D62126#1518460, @nikic wrote:

I considered this approach and found it error-prone. The wrapper stays alive after ConstantFoldTerminator(), so a developer may accidentally reuse it, but its stored weights might be out of sync with the underlying SwitchInst's weights, which ConstantFoldTerminator() may have changed.

LGTM

llvm/test/Transforms/CorrelatedValuePropagation/profmd.ll
49	These check lines should probably be positioned one block higher?

This revision is now accepted and ready to land. May 28 2019, 12:18 AM

yrouban marked an inline comment as done. May 28 2019, 3:21 AM

Closed by commit rL361808: [CorrelatedValuePropagation] Fix prof branch_weights metadata handling for… (authored by yrouban). May 28 2019, 4:32 AM

This revision was automatically updated to reflect the committed changes.

FYI, we're seeing a crash in llvm::SwitchInstProfUpdateWrapper::getProfBranchWeights() following this patch.
I'm trying to get a testcase available.

My guess would be that this is due to the assumption of valid branch_weights inside https://github.com/llvm-mirror/llvm/blob/2015c9266833ebae0e42a641252b2b120ef0f77a/lib/IR/Instructions.cpp#L3908, which is not a guarantee we have at this point.

Providing an obviously incorrect branch_weights for one of the added test cases is enough to cause an assertion failure:

define i32 @crash(i32 %s) {
entry:
  %cmp = icmp sgt i32 %s, 0
  br i1 %cmp, label %positive, label %out

positive:
  switch i32 %s, label %out [
  i32 1, label %next
  i32 -1, label %next
  i32 -2, label %next
  ], !prof !{!"branch_weights"} ; not enough elements

out:
  %p = phi i32 [ -1, %entry ], [ 1, %positive ]
  ret i32 %p

next:
  %q = phi i32 [ 0, %positive ], [ 0, %positive ], [ 0, %positive ]
  ret i32 %q
}
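
The missing guarantee can be illustrated with a small standalone check. This is only a sketch of the idea, assuming that well-formed metadata carries exactly one weight per successor (the default destination plus each case); readBranchWeights is a hypothetical helper, not an LLVM API, and it is not the fix that was eventually submitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies the weights into Out only when there is exactly
// one weight per successor. Returns false and leaves Out untouched otherwise,
// instead of asserting on malformed input.
bool readBranchWeights(const std::vector<unsigned> &Operands,
                       std::size_t NumCases, std::vector<unsigned> &Out) {
  if (Operands.size() != NumCases + 1)
    return false; // Malformed metadata, e.g. !{!"branch_weights"} alone.
  Out = Operands;
  assert(Out.size() == NumCases + 1);
  return true;
}
```

For the @crash function above, the metadata has zero weights for a switch with three cases, so a check of this shape would reject it rather than index past the end of the operand list.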

nikic mentioned this in rG5b32f60ec31c: Revert "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling…. May 28 2019, 2:26 PM

nikic mentioned this in rL361881: Revert "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling….

Reverted with rL361881 for now.

This revision is now accepted and ready to land. May 28 2019, 2:28 PM

Thank you for the quick action!

Thank you very much for the revert, the analysis and the test.
I'm going to make the wrapper class safe by introducing a sticky invalid state.

yrouban mentioned this in D62656: Make SwitchInstProfUpdateWrapper safer. May 30 2019, 6:00 AM

yrouban added a parent revision: D62656: Make SwitchInstProfUpdateWrapper safer.

In D62126#1520054, @nikic wrote:

Providing an obviously incorrect branch_weights for one of the added test cases is enough to cause an assertion failure:

I have just submitted a fix. Please see D62656. Once it lands, this patch can land unchanged.
I decided not to include the crash test case, as it would be caught by the verification proposed in D61179.

This revision is now accepted and ready to land. May 30 2019, 6:08 AM

Closed by commit rL362583: Resubmit "[CorrelatedValuePropagation] Fix prof branch_weights metadata… (authored by yrouban). Jun 4 2019, 10:47 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp	117 lines
llvm/test/Transforms/CorrelatedValuePropagation/profmd.ll	119 lines

Diff 201533

llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp

 /// Simplify a switch instruction by removing cases which can never fire. If the
 /// uselessness of a case could be determined locally then constant propagation
 /// would already have figured it out. Instead, walk the predecessors and
 /// statically evaluate cases based on information available on that edge. Cases
 /// that cannot fire no matter what the incoming edge can safely be removed. If
 /// a case fires on every incoming edge then the entire switch can be removed
 /// and replaced with a branch to the case destination.
-static bool processSwitch(SwitchInst *SI, LazyValueInfo *LVI,
+static bool processSwitch(SwitchInst *I, LazyValueInfo *LVI,
                           DominatorTree *DT) {
   DomTreeUpdater DTU(*DT, DomTreeUpdater::UpdateStrategy::Lazy);
-  Value *Cond = SI->getCondition();
-  BasicBlock *BB = SI->getParent();
+  Value *Cond = I->getCondition();
+  BasicBlock *BB = I->getParent();

   // If the condition was defined in same block as the switch then LazyValueInfo
   // currently won't say anything useful about it, though in theory it could.
   if (isa<Instruction>(Cond) && cast<Instruction>(Cond)->getParent() == BB)
     return false;

   // If the switch is unreachable then trying to improve it is a waste of time.
   pred_iterator PB = pred_begin(BB), PE = pred_end(BB);
   if (PB == PE) return false;

   // Analyse each switch case in turn.
   bool Changed = false;
   DenseMap<BasicBlock*, int> SuccessorsCount;
   for (auto *Succ : successors(BB))
     SuccessorsCount[Succ]++;

+  { // Scope for SwitchInstProfUpdateWrapper. It must not live during
+    // ConstantFoldTerminator() as the underlying SwitchInst can be changed.
+    SwitchInstProfUpdateWrapper SI(*I);
+
     for (auto CI = SI->case_begin(), CE = SI->case_end(); CI != CE;) {
       ConstantInt *Case = CI->getCaseValue();

       // Check to see if the switch condition is equal to/not equal to the case
       // value on every incoming edge, equal/not equal being the same each time.
       LazyValueInfo::Tristate State = LazyValueInfo::Unknown;
       for (pred_iterator PI = PB; PI != PE; ++PI) {
         // Is the switch condition equal to the case value?
         LazyValueInfo::Tristate Value = LVI->getPredicateOnEdge(CmpInst::ICMP_EQ,
                                                                 Cond, Case, *PI,
                                                                 BB, SI);
         // Give up on this case if nothing is known.
         if (Value == LazyValueInfo::Unknown) {
           State = LazyValueInfo::Unknown;
           break;
         }

         // If this was the first edge to be visited, record that all other edges
         // need to give the same result.
         if (PI == PB) {
           State = Value;
           continue;
         }

         // If this case is known to fire for some edges and known not to fire for
         // others then there is nothing we can do - give up.
         if (Value != State) {
           State = LazyValueInfo::Unknown;
           break;
         }
       }

       if (State == LazyValueInfo::False) {
         // This case never fires - remove it.
         BasicBlock *Succ = CI->getCaseSuccessor();
         Succ->removePredecessor(BB);
-        CI = SI->removeCase(CI);
+        CI = SI.removeCase(CI);
         CE = SI->case_end();

         // The condition can be modified by removePredecessor's PHI simplification
         // logic.
         Cond = SI->getCondition();

         ++NumDeadCases;
         Changed = true;
         if (--SuccessorsCount[Succ] == 0)
           DTU.applyUpdatesPermissive({{DominatorTree::Delete, BB, Succ}});
         continue;
       }
       if (State == LazyValueInfo::True) {
         // This case always fires. Arrange for the switch to be turned into an
         // unconditional branch by replacing the switch condition with the case
         // value.
         SI->setCondition(Case);
         NumDeadCases += SI->getNumCases();
         Changed = true;
         break;
       }

       // Increment the case iterator since we didn't delete it.
       ++CI;
     }
+  }

   if (Changed)
     // If the switch has been simplified to the point where it can be replaced
     // by a branch then do so now.
     ConstantFoldTerminator(BB, /*DeleteDeadConditions = */ false,
                            /*TLI = */ nullptr, &DTU);
   return Changed;
 }

 // See if we can prove that the given overflow intrinsic will not overflow.
 static bool willNotOverflow(WithOverflowInst *WO, LazyValueInfo *LVI) {
   Value *RHS = WO->getRHS();
   ConstantRange RRange = LVI->getConstantRange(RHS, WO->getParent(), WO);
   ConstantRange NWRegion = ConstantRange::makeGuaranteedNoWrapRegion(

llvm/test/Transforms/CorrelatedValuePropagation/profmd.ll

This file was added.

				; RUN: opt < %s -correlated-propagation -S | FileCheck %s

				; Removed several cases from switch.
				define i32 @switch1(i32 %s) {
				; CHECK-LABEL: @switch1(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[S:%.*]], 0
				; CHECK-NEXT: br i1 [[CMP]], label [[NEGATIVE:%.*]], label [[OUT:%.*]]
				;
				entry:
				%cmp = icmp slt i32 %s, 0
				br i1 %cmp, label %negative, label %out

				negative:
				; CHECK: negative:
				; CHECK-NEXT: switch i32 [[S]], label [[OUT]] [
				; CHECK-NEXT: i32 -2, label [[NEXT:%.*]]
				; CHECK-NEXT: i32 -1, label [[NEXT]]
				switch i32 %s, label %out [
				i32 0, label %out
				i32 1, label %out
				i32 -1, label %next
				i32 -2, label %next
				i32 2, label %out
				i32 3, label %out
				; CHECK-NEXT: !prof ![[MD0:[0-9]+]]
				], !prof !{!"branch_weights", i32 99, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6}

				out:
				%p = phi i32 [ 1, %entry ], [ -1, %negative ], [ -1, %negative ], [ -1, %negative ], [ -1, %negative ], [ -1, %negative ]
				ret i32 %p

				next:
				%q = phi i32 [ 0, %negative ], [ 0, %negative ]
				ret i32 %q
				}

				; Removed all cases from switch.
				define i32 @switch2(i32 %s) {
				; CHECK-LABEL: @switch2(
				;
				entry:
				%cmp = icmp sgt i32 %s, 0
				br i1 %cmp, label %positive, label %out

				positive:
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[S:%.*]], 0
				; CHECK-NEXT: br i1 [[CMP]], label [[POSITIVE:%.*]], label [[OUT:%.*]]
				switch i32 %s, label %out [
				i32 0, label %out
				i32 -1, label %next
				i32 -2, label %next
				], !prof !{!"branch_weights", i32 99, i32 1, i32 2, i32 3}

				out:
				%p = phi i32 [ -1, %entry ], [ 1, %positive ], [ 1, %positive ]
				ret i32 %p

				next:
				%q = phi i32 [ 0, %positive ], [ 0, %positive ]
				ret i32 %q
				}

				; Change switch into conditional branch.
				define i32 @switch3(i32 %s) {
				; CHECK-LABEL: @switch3(
				;
				entry:
				%cmp = icmp sgt i32 %s, 0
				br i1 %cmp, label %positive, label %out

				positive:
				; CHECK: positive:
				; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 %s, 1
				; CHECK-NEXT: br i1 [[CMP]], label [[NEXT:%.*]], label [[OUT:%.*]], !prof ![[MD1:[0-9]+]]
				switch i32 %s, label %out [
				i32 1, label %next
				i32 -1, label %next
				i32 -2, label %next
				], !prof !{!"branch_weights", i32 99, i32 1, i32 2, i32 3}

				out:
				%p = phi i32 [ -1, %entry ], [ 1, %positive ]
				ret i32 %p

				next:
				%q = phi i32 [ 0, %positive ], [ 0, %positive ], [ 0, %positive ]
				ret i32 %q
				}

				; Removed all cases from switch.
				define i32 @switch4(i32 %s) {
				; CHECK-LABEL: @switch4(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[S:%.*]], 0
				; CHECK-NEXT: br i1 [[CMP]], label [[NEGATIVE:%.*]], label [[OUT:%.*]]
				;
				entry:
				%cmp = icmp slt i32 %s, 0
				br i1 %cmp, label %negative, label %out

				negative:
				; CHECK: negative:
				; CHECK-NEXT: br label %out
				switch i32 %s, label %out [
				i32 0, label %out
				i32 1, label %out
				i32 2, label %out
				i32 3, label %out
				], !prof !{!"branch_weights", i32 99, i32 1, i32 2, i32 3, i32 4}

				out:
				%p = phi i32 [ 1, %entry ], [ -1, %negative ], [ -1, %negative ], [ -1, %negative ], [ -1, %negative ], [ -1, %negative ]
				ret i32 %p
				}

				; CHECK: ![[MD0]] = !{!"branch_weights", i32 99, i32 4, i32 3}
				; CHECK: ![[MD1]] = !{!"branch_weights", i32 1, i32 99}