This is an archive of the discontinued LLVM Phabricator instance.

Differential D19435

[LowerExpectIntrinsic] make default likely/unlikely ratio bigger
ClosedPublic

Authored by spatel on Apr 22 2016, 2:15 PM.

Details

Reviewers

davidxl
deadalnix
hfinkel

Commits

rGd2d2aa52cda3: [LowerExpectIntrinsic] make default likely/unlikely ratio bigger
rL267615: [LowerExpectIntrinsic] make default likely/unlikely ratio bigger

Summary

We need the default ratio to be sufficiently large that it triggers transforms based on block frequency info (BFI) and plays well with the recently introduced BranchProbability used by CGP.

Diff Detail

Repository: rL LLVM

Event Timeline

spatel updated this revision to Diff 54719. Apr 22 2016, 2:15 PM

spatel retitled this revision to [LowerExpectIntrinsic] pin default likely/unlikely weights to min/max values.

spatel updated this object.

spatel added reviewers: davidxl, hfinkel, deadalnix.

spatel added a subscriber: llvm-commits.

Herald added a subscriber: mcrosier. Apr 22 2016, 2:15 PM

spatel mentioned this in D19488: [CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch. Apr 25 2016, 10:35 AM

What is the motivation for this change? Current weight distribution 64:4 is not strong enough for if conversion to not generate cmov?

Ok I see D19488 that provides some context.

David

In D19435#411143, @davidxl wrote:

What is the motivation for this change? Current weight distribution 64:4 is not strong enough for if conversion to not generate cmov?

Yes - 4:64 is 5.88%. I don't think we can safely convert to a branch in that case. We would have to consider mispredict penalty at that point. Please see D19488.
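
For reference, the 5.88% figure is the unlikely weight divided by the sum of both weights: 4 / (64 + 4). A minimal C++ sketch of that arithmetic follows. `branchProbabilityPercent` is a hypothetical helper written for this illustration, not an LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): probability, in percent, of the edge
// carrying weight `taken` when the other edge carries weight `notTaken`.
double branchProbabilityPercent(uint32_t taken, uint32_t notTaken) {
  // Sum in 64 bits so two large 32-bit weights cannot overflow.
  uint64_t total = static_cast<uint64_t>(taken) + notTaken;
  assert(total > 0 && "weights must not both be zero");
  return 100.0 * static_cast<double>(taken) / static_cast<double>(total);
}
```

Under these assumptions, `branchProbabilityPercent(4, 64)` is roughly 5.88, while the 1:2000 pair suggested later in this review gives roughly 0.05 for the unlikely edge.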

But that is not the only reason: builtin_expect() is a programmer override that should affect BB placement too (although this doesn't work today AFAIK). We should set the weights to the extreme values to ensure that happens regardless of other weights that are based on profile data or heuristics. I noticed this could be a problem while looking at test/Transforms/LoopUnswitch/cold-loop.ll (notice the "100000000" branch weight).
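
As a reminder of what the programmer override looks like at the source level, here is a small sketch assuming a GCC- or Clang-compatible compiler. The `LIKELY`/`UNLIKELY` macro names and the `checkedDivide` function are illustrative, not taken from this patch.

```cpp
#include <cassert>

// Hypothetical convenience macros; __builtin_expect itself is a GCC/Clang
// builtin that returns its first argument and records the expected value.
#define LIKELY(x) __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

// Divides num by den, treating a zero denominator as a rare error path.
int checkedDivide(int num, int den) {
  if (UNLIKELY(den == 0))
    return 0; // hinted as the cold path
  return num / den;
}
```

Clang lowers the builtin to the `llvm.expect` intrinsic, which LowerExpectIntrinsic then replaces with `branch_weights` metadata using the default weights this review changes.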

But that is not the only reason: builtin_expect() is a programmer override that should affect BB placement too (although this doesn't work today AFAIK). We should set the weights to the extreme values to ensure that happens regardless of other weights that are based on profile data or heuristics. I noticed this could be a problem while looking at test/Transforms/LoopUnswitch/cold-loop.ll (notice the "100000000" branch weight).

It should work as the BB layout uses 80% as the threshold to follow successors. Do you have evidence that it does not work for block layout?
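
To make the 80% claim concrete, the following sketch checks whether a weight pair clears a given percentage threshold. It assumes only the 80% figure quoted above. `meetsLayoutThreshold` is a hypothetical helper and does not reproduce LLVM's block placement logic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM's block placement code: returns true when the
// likely edge's probability is at least `thresholdPercent`.
bool meetsLayoutThreshold(uint32_t likelyWeight, uint32_t unlikelyWeight,
                          uint32_t thresholdPercent) {
  assert(thresholdPercent <= 100 && "threshold is a percentage");
  uint64_t total = static_cast<uint64_t>(likelyWeight) + unlikelyWeight;
  assert(total > 0 && "weights must not both be zero");
  // Integer form of: likelyWeight / total >= thresholdPercent / 100.
  return static_cast<uint64_t>(likelyWeight) * 100 >=
         static_cast<uint64_t>(thresholdPercent) * total;
}
```

With the old 64:4 defaults the likely edge sits at about 94%, so `meetsLayoutThreshold(64, 4, 80)` holds, which is consistent with the comment that the existing weights should already influence layout.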

I don't have hard evidence, but the LoopUnswitch heuristics made me nervous and (sorry to ping-pong between this and the other patch) I noticed that x86 codegen, at least, wasn't affected whether I made it highly likely true or highly likely false for a simple case.

This basically forces the branch to be as likely/unlikely as possible. I think this makes sense. One should not expect the compiler to do a good job when provided with inaccurate information.

As long as the weight can be specified explicitly, legacy code that relied on the 64/4 behavior can continue to be made to behave as intended. I'd give some time for @davidxl to comment, but as far as I'm concerned, this is for the best.

davidxl added inline comments. Apr 25 2016, 2:45 PM

lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
39 ↗	(On Diff #54719)	Making LikelyWeight more extreme is fine, but I don't see using max unsigned int is needed. Some large value such as 2000 or 2048 (1<<11) should be good enough.
42 ↗	(On Diff #54719)	I suggest not using 0 weight which BFI can not handle well. 1 should be better.

spatel added inline comments. Apr 26 2016, 9:11 AM

lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
42 ↗	(On Diff #54719)	I thought this over, and I really don't want to compromise on either value here. The meaning of builtin_expect() is clear, and we should honor that programmer directive. Picking semi-random values from the air can only lead to unseen problems down the road. Can you explain why/where BFI has a problem with a '0' weight? If it's a simple bug, I will try to fix that before this patch.

davidxl added inline comments. Apr 26 2016, 9:22 AM

lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
42 ↗	(On Diff #54719)	I am not sure what problem you foresee down the road with the proposed 99.95% probability. 2000 is actually not something coming out of thin air: GCC also uses it. Not that it is perfect, but I would guess lots of apps are tuned based on that setting. The BFI problem with 0 weight is not a simple bug; it is due to a limitation of the propagation algorithm. See BlockFrequencyInfoImplBase::addToDist. Due to that, when computing BFI, a 0 weight is changed to 1 anyway, so it might be better to make it an explicit 1.
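
The effect described in this comment can be pictured with a short sketch. It only mimics the stated behavior (a zero weight ends up treated as 1) and is not copied from LLVM. `clampZeroWeights` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative only: models the behavior described in the comment above,
// where a zero branch weight is replaced by 1 before frequencies are computed.
std::vector<uint32_t> clampZeroWeights(std::vector<uint32_t> weights) {
  for (uint32_t &w : weights)
    if (w == 0)
      w = 1;
  return weights;
}
```

Under that assumption, a max/0 pair and a max/1 pair look identical to the frequency computation, which is the argument for writing the 1 explicitly.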

spatel added inline comments. Apr 26 2016, 9:36 AM

lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
42 ↗	(On Diff #54719)	The problem I foresee - known unknown? :) - is that some heuristic-based transform will not trigger because we chose unwisely here. Why be imperfect when we can be perfect? It's the same argument/implementation that we said was correct in D19299. Since BFI is already handling its bug internally, we don't need to propagate knowledge of that bug up here. Anyone else want to register a vote on this? :)

spatel mentioned this in rL267572: [CodeGenPrepare] use branch weight metadata to decide if a select should be…. Apr 26 2016, 10:17 AM

Patch updated:
Changed weights to 2000 and 1 as suggested by David.

lgtm.

lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
42 ↗	(On Diff #55061)	'substitute as actual ...' --> use it to annotate that the branch is likely/unlikely to be taken.

This revision is now accepted and ready to land. Apr 26 2016, 2:27 PM

Closed by commit rL267615: [LowerExpectIntrinsic] make default likely/unlikely ratio bigger (authored by spatel). Apr 26 2016, 3:29 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp	24 lines
llvm/trunk/test/Transforms/LowerExpectIntrinsic/basic.ll	8 lines

Diff 55110

llvm/trunk/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp

(28 unchanged lines hidden)

 using namespace llvm;

 #define DEBUG_TYPE "lower-expect-intrinsic"

 STATISTIC(ExpectIntrinsicsHandled,
           "Number of 'expect' intrinsic instructions handled");

-static cl::opt<uint32_t>
-LikelyBranchWeight("likely-branch-weight", cl::Hidden, cl::init(64),
-                   cl::desc("Weight of the branch likely to be taken (default = 64)"));
-static cl::opt<uint32_t>
-UnlikelyBranchWeight("unlikely-branch-weight", cl::Hidden, cl::init(4),
-                   cl::desc("Weight of the branch unlikely to be taken (default = 4)"));
+// These default values are chosen to represent an extremely skewed outcome for
+// a condition, but they leave some room for interpretation by later passes.
+//
+// If the documentation for __builtin_expect() was made explicit that it should
+// only be used in extreme cases, we could make this ratio higher. As it stands,
+// programmers may be using __builtin_expect() / llvm.expect to annotate that a
+// branch is likely or unlikely to be taken.
+//
+// There is a known dependency on this ratio in CodeGenPrepare when transforming
+// 'select' instructions. It may be worthwhile to hoist these values to some
+// shared space, so they can be used directly by other passes.
+
+static cl::opt<uint32_t> LikelyBranchWeight(
+    "likely-branch-weight", cl::Hidden, cl::init(2000),
+    cl::desc("Weight of the branch likely to be taken (default = 2000)"));
+static cl::opt<uint32_t> UnlikelyBranchWeight(
+    "unlikely-branch-weight", cl::Hidden, cl::init(1),
+    cl::desc("Weight of the branch unlikely to be taken (default = 1)"));

 static bool handleSwitchExpect(SwitchInst &SI) {
   CallInst *CI = dyn_cast<CallInst>(SI.getCondition());
   if (!CI)
     return false;

   Function *Fn = CI->getCalledFunction();
   if (!Fn || Fn->getIntrinsicID() != Intrinsic::expect)

(142 unchanged lines hidden)

llvm/trunk/test/Transforms/LowerExpectIntrinsic/basic.ll

(269 unchanged lines hidden)

 return: ; preds = %if.end, %if.then
   %0 = load i32, i32* %retval
   ret i32 %0
 }

 declare i1 @llvm.expect.i1(i1, i1) nounwind readnone

-; CHECK: !0 = !{!"branch_weights", i32 64, i32 4}
-; CHECK: !1 = !{!"branch_weights", i32 4, i32 64}
-; CHECK: !2 = !{!"branch_weights", i32 4, i32 64, i32 4}
-; CHECK: !3 = !{!"branch_weights", i32 64, i32 4, i32 4}
+; CHECK: !0 = !{!"branch_weights", i32 2000, i32 1}
+; CHECK: !1 = !{!"branch_weights", i32 1, i32 2000}
+; CHECK: !2 = !{!"branch_weights", i32 1, i32 2000, i32 1}
+; CHECK: !3 = !{!"branch_weights", i32 2000, i32 1, i32 1}