Details

Reviewers

davidxl
danielcdh
eraman

Commits

rG2a6b7991d403: Restrict call metadata based hotness detection to Sample PGO mode
rL302844: Restrict call metadata based hotness detection to Sample PGO mode

Summary

Don't use the metadata on call instructions for determining hotness
unless we are in sample PGO mode, where it is needed because profile
counts are not accurate. In instrumentation mode this is not necessary
and does more harm than good when calls have VP metadata that hasn't
been properly scaled after transformations or dropped after constant
prop based devirtualization (both should be fixed, but we don't need
to do this in the first place for instrumentation PGO).

This required adjusting a number of tests to distinguish between sample
and instrumentation PGO handling.

Diff Detail

Build Status

Buildable 6365
Build 6365: arc lint + arc unit

Event Timeline

tejohnson created this revision. May 4 2017, 11:57 AM

Herald added subscribers: Prazek, mehdi_amini, rengolin, aemerson. May 4 2017, 11:57 AM

eraman added a comment.

I feel the underlying problem here is the use of VP metadata to get counts in sample PGO mode. Dehao, even when VP metadata is available, can we add branch_weights to the call? Then only sample PGO will annotate branch weights on calls, and that won't affect instrumented PGO.

tejohnson added a comment.

In reply to D32877#747647 (@eraman):

That sounds like potentially a good longer-term change. But I'd still like this to go in as an immediate fix for the issue, which is causing a noticeable impact on instrumentation PGO performance, and because there is no reason AFAIK for anything other than Sample PGO to need to look at call metadata for determining hotness.

danielcdh added a comment.

In reply to D32877#747667 (@tejohnson):

Just to confirm: with https://reviews.llvm.org/D32773, we should have recovered all of the instrumented PGO performance, right?

lib/Analysis/ProfileSummaryInfo.cpp
82	Looks like you can always get the Kind from Inst, so why would you want to pass in the Summary?

tejohnson added a comment.

In reply to D32877#747704 (@danielcdh):

You're right, I suspect that it may (I will check once it is fully integrated here). My concern is that there may still be other transformations that don't update the VP metadata properly. As I mentioned earlier this week, it appears the metadata isn't being dropped when calls are devirtualized after constant prop, although I don't think this causes the performance issue. These things should be fixed for other reasons, including Sample PGO accuracy, but the metadata is not needed for instrumentation PGO in the first place. Is there any reason for anything other than Sample PGO to look at this instead of the branch weights?

lib/Analysis/ProfileSummaryInfo.cpp
82	It involves doing some work that is already done when the summary is available, so this was added just for the case where it is invoked without a summary.

eraman added inline comments. May 5 2017, 4:10 PM

include/llvm/Analysis/ProfileSummaryInfo.h
62 ↗	(On Diff #97862)	A major rationale for adding ProfileSummaryInfo as a separate analysis is to prevent ProfileSummary from being directly manipulated, so I believe we shouldn't add an interface that takes ProfileSummary as a parameter. As Dehao mentions below, you can get the module from the instruction anyway and get the kind from there, so this is not necessary. My concern there is that this becomes unnecessarily expensive, as we get an invariant value (kind) many times.

tejohnson added inline comments. May 5 2017, 4:21 PM

include/llvm/Analysis/ProfileSummaryInfo.h
62 ↗	(On Diff #97862)	@eraman wrote:

A major rationale for adding ProfileSummaryInfo as a separate analysis is to prevent ProfileSummary from being directly manipulated, so I believe we shouldn't add an interface that takes ProfileSummary as a parameter.

Note that this is only passed in when getProfileCount() is called from a ProfileSummaryInfo method (it passes in the Summary object it owns), so I am not sure what the concern is about directly manipulating it?

@eraman wrote:

As Dehao mentions below, you anyway get the module from the instruction and get the kind from there, so this is not necessary. My concern there is this becomes unnecessarily expensive as we get an invariant value (kind) many times.

Exactly, that's why I don't think we should keep re-finding the profile metadata every time it is called from the ProfileSummaryInfo.

eraman added inline comments. May 5 2017, 4:28 PM

include/llvm/Analysis/ProfileSummaryInfo.h
62 ↗	(On Diff #97862)	The concern is you now have a public method that takes ProfileSummaryInfo * (even though no one outside the class passes it). What you can do is have a private helper method that takes a ProfileSummaryInfo *. Within the class, call this directly. Note that the class already owns the Summary. External users will still call the public method, where you can get the summary from the instruction and call the private method (with a comment about the overhead).
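
The split suggested here, a private helper that accepts the summary plus a public entry point that looks it up, can be sketched roughly as follows. This is a simplified illustration only: `Summary`, `Call`, `ProfileInfo`, `countWith`, `count`, and `hasCount` are hypothetical stand-ins, not the LLVM classes or the code that was committed.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified types for illustration only.
struct Summary {
  bool IsSample;
};

struct Call {
  const Summary *ModuleSummary;          // what a lookup through the module would find
  std::optional<uint64_t> MetadataCount; // total weight attached to the call, if any
};

class ProfileInfo {
  const Summary *Owned; // the analysis already owns its summary

  // Private helper: takes the summary explicitly, so callers inside the class
  // do not repeat the lookup and outside code cannot pass a summary in.
  static std::optional<uint64_t> countWith(const Call &C, const Summary *S) {
    if (S && S->IsSample)
      return C.MetadataCount;
    return std::nullopt;
  }

public:
  explicit ProfileInfo(const Summary *S) : Owned(S) {}

  // Public entry point for external users: it re-finds the summary through
  // the call on every invocation, which is the overhead worth a comment.
  static std::optional<uint64_t> count(const Call &C) {
    return countWith(C, C.ModuleSummary);
  }

  // In-class use: reuse the owned summary directly.
  bool hasCount(const Call &C) const { return countWith(C, Owned).has_value(); }
};
```

External code would call `ProfileInfo::count(C)`, while methods of the class go through `countWith` with the summary they already hold.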

tejohnson added inline comments. May 8 2017, 1:53 PM

include/llvm/Analysis/ProfileSummaryInfo.h
62 ↗	(On Diff #97862)	Done

Address review comments

Harbormaster completed remote builds in B6256: Diff 98202. May 8 2017, 1:53 PM

tejohnson mentioned this in rL302705: Ensure non-null ProfileSummaryInfo passed to ModuleSummaryIndex builder. May 10 2017, 12:05 PM

Update patch now that getProfileCount is not static. This greatly simplified
the code changes, but required adjusting a few more tests which needed
profile summary metadata.

Harbormaster completed remote builds in B6335: Diff 98560. May 10 2017, 4:33 PM

eraman added inline comments. May 10 2017, 5:51 PM

lib/Analysis/ProfileSummaryInfo.cpp
78–85	I wonder if we should check if Summary is non-null and then the summary kind is PSK_Sample. There is one test case down below (inliner count update) where you had to attach the summary to the test case. Is there any reason the summary has to be present to get the count based on entry count and block frequency?

tejohnson added inline comments. May 10 2017, 6:58 PM

lib/Analysis/ProfileSummaryInfo.cpp
78–85	Do you mean only do the metadata-based hotness check when computeSummary() returns true and the kind is PSK_Sample? I.e. if !computeSummary(), then assume instrumentation-based? I could do that; it would mean a few fewer test changes.

eraman added inline comments. May 11 2017, 10:53 AM

lib/Analysis/ProfileSummaryInfo.cpp
78–85	Yes, that's what I should've written. As long as we have function entry counts, we should return the profile count.

Treat missing summary as instrumentation PGO, and remove unneeded test changes
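
The decision agreed on in this thread can be modelled outside LLVM as a small function. The sketch below is only an approximation of the shape of the logic: `SummaryKind`, `CallSiteProfile`, and `profileCount` are hypothetical names, and the block count stands in for whatever the block frequency analysis would report.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for illustration; these are not the LLVM classes.
enum class SummaryKind { Instr, Sample };

struct CallSiteProfile {
  std::optional<uint64_t> MetadataCount; // total weight from !prof on the call, if any
  std::optional<uint64_t> BlockCount;    // count derived from entry count and block frequency
};

// Call metadata is consulted only when a summary is present and it describes
// a sample profile. A missing summary is treated like instrumentation PGO, so
// the block-based count is returned in that case as well.
std::optional<uint64_t> profileCount(const CallSiteProfile &CS,
                                     std::optional<SummaryKind> Summary) {
  if (Summary && *Summary == SummaryKind::Sample && CS.MetadataCount)
    return CS.MetadataCount;
  return CS.BlockCount;
}
```

With this shape, a call that carries stale metadata in an instrumentation-profiled module simply falls through to the block-based count.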

LGTM

This revision is now accepted and ready to land. May 11 2017, 4:29 PM

Closed by commit rL302844: Restrict call metadata based hotness detection to Sample PGO mode (authored by tejohnson). May 11 2017, 4:31 PM

This revision was automatically updated to reflect the committed changes.

Diff 98695

lib/Analysis/ProfileSummaryInfo.cpp

	Optional<uint64_t>			Optional<uint64_t>
	ProfileSummaryInfo::getProfileCount(const Instruction *Inst,			ProfileSummaryInfo::getProfileCount(const Instruction *Inst,
	BlockFrequencyInfo *BFI) {			BlockFrequencyInfo *BFI) {
	if (!Inst)			if (!Inst)
	return None;			return None;
	assert((isa<CallInst>(Inst) || isa<InvokeInst>(Inst)) &&			assert((isa<CallInst>(Inst) || isa<InvokeInst>(Inst)) &&
	"We can only get profile count for call/invoke instruction.");			"We can only get profile count for call/invoke instruction.");
	// Check if there is a profile metadata on the instruction. If it is present,			if (computeSummary() && Summary->getKind() == ProfileSummary::PSK_Sample) {
	// determine hotness solely based on that.			// In sample PGO mode, check if there is a profile metadata on the
				// instruction. If it is present, determine hotness solely based on that,
				// since the sampled entry count may not be accurate.
	uint64_t TotalCount;			uint64_t TotalCount;
	if (Inst->extractProfTotalWeight(TotalCount))			if (Inst->extractProfTotalWeight(TotalCount))
	return TotalCount;			return TotalCount;
				}
	if (BFI)			if (BFI)
	return BFI->getBlockProfileCount(Inst->getParent());			return BFI->getBlockProfileCount(Inst->getParent());
	return None;			return None;
	}			}

	/// Returns true if the function's entry is hot. If it returns false, it			/// Returns true if the function's entry is hot. If it returns false, it
	/// either means it is not hot or it is unknown whether it is hot or not (for			/// either means it is not hot or it is unknown whether it is hot or not (for
	/// example, no profile data is available).			/// example, no profile data is available).

test/Bitcode/thinlto-function-summary-callgraph-profile-summary.ll

This file was copied to test/Bitcode/thinlto-function-summary-callgraph-sample-profile-summary.ll.

	; "none2"			; "none2"
	; CHECK-NEXT: <FUNCTION op0=37 op1=5			; CHECK-NEXT: <FUNCTION op0=37 op1=5
	; "none3"			; "none3"
	; CHECK-NEXT: <FUNCTION op0=42 op1=5			; CHECK-NEXT: <FUNCTION op0=42 op1=5
	; CHECK-LABEL: <GLOBALVAL_SUMMARY_BLOCK			; CHECK-LABEL: <GLOBALVAL_SUMMARY_BLOCK
	; CHECK-NEXT: <VERSION			; CHECK-NEXT: <VERSION
	; CHECK-NEXT: <VALUE_GUID op0=25 op1=123/>			; CHECK-NEXT: <VALUE_GUID op0=25 op1=123/>
	; op4=hot1 op6=cold op8=hot2 op10=hot4 op12=none1 op14=hot3 op16=none2 op18=none3 op20=123			; op4=hot1 op6=cold op8=hot2 op10=hot4 op12=none1 op14=hot3 op16=none2 op18=none3 op20=123
	; CHECK-NEXT: <PERMODULE_PROFILE {{.*}} op4=1 op5=3 op6=5 op7=1 op8=2 op9=3 op10=4 op11=3 op12=6 op13=2 op14=3 op15=3 op16=7 op17=2 op18=8 op19=2 op20=25 op21=3/>			; CHECK-NEXT: <PERMODULE_PROFILE {{.*}} op4=1 op5=3 op6=5 op7=1 op8=2 op9=3 op10=4 op11=1 op12=6 op13=2 op14=3 op15=3 op16=7 op17=2 op18=8 op19=2 op20=25 op21=3/>
	; CHECK-NEXT: </GLOBALVAL_SUMMARY_BLOCK>			; CHECK-NEXT: </GLOBALVAL_SUMMARY_BLOCK>

	; CHECK: <STRTAB_BLOCK			; CHECK: <STRTAB_BLOCK
	; CHECK-NEXT: blob data = 'hot_functionhot1hot2hot3hot4coldnone1none2none3'			; CHECK-NEXT: blob data = 'hot_functionhot1hot2hot3hot4coldnone1none2none3'

	; COMBINED: <GLOBALVAL_SUMMARY_BLOCK			; COMBINED: <GLOBALVAL_SUMMARY_BLOCK
	; COMBINED-NEXT: <VERSION			; COMBINED-NEXT: <VERSION
	; COMBINED-NEXT: <VALUE_GUID			; COMBINED-NEXT: <VALUE_GUID

test/Bitcode/thinlto-function-summary-callgraph-sample-profile-summary.ll

This file was copied from test/Bitcode/thinlto-function-summary-callgraph-profile-summary.ll.

	!llvm.module.flags = !{!1}			!llvm.module.flags = !{!1}
	!20 = !{!"function_entry_count", i64 110, i64 123}			!20 = !{!"function_entry_count", i64 110, i64 123}

	!1 = !{i32 1, !"ProfileSummary", !2}			!1 = !{i32 1, !"ProfileSummary", !2}
	!2 = !{!3, !4, !5, !6, !7, !8, !9, !10}			!2 = !{!3, !4, !5, !6, !7, !8, !9, !10}
	!3 = !{!"ProfileFormat", !"InstrProf"}			!3 = !{!"ProfileFormat", !"SampleProfile"}
	!4 = !{!"TotalCount", i64 10000}			!4 = !{!"TotalCount", i64 10000}
	!5 = !{!"MaxCount", i64 10}			!5 = !{!"MaxCount", i64 10}
	!6 = !{!"MaxInternalCount", i64 1}			!6 = !{!"MaxInternalCount", i64 1}
	!7 = !{!"MaxFunctionCount", i64 1000}			!7 = !{!"MaxFunctionCount", i64 1000}
	!8 = !{!"NumCounts", i64 3}			!8 = !{!"NumCounts", i64 3}
	!9 = !{!"NumFunctions", i64 3}			!9 = !{!"NumFunctions", i64 3}
	!10 = !{!"DetailedSummary", !11}			!10 = !{!"DetailedSummary", !11}
	!11 = !{!12, !13, !14}			!11 = !{!12, !13, !14}
	!12 = !{i32 10000, i64 100, i32 1}			!12 = !{i32 10000, i64 100, i32 1}
	!13 = !{i32 999000, i64 100, i32 1}			!13 = !{i32 999000, i64 100, i32 1}
	!14 = !{i32 999999, i64 1, i32 2}			!14 = !{i32 999999, i64 1, i32 2}
	!15 = !{!"branch_weights", i32 100}			!15 = !{!"branch_weights", i32 100}

test/Transforms/CodeGenPrepare/section-samplepgo.ll

This file was copied from test/Transforms/CodeGenPrepare/section.ll.

define void @cold_func() !prof !16 {		define void @cold_func() !prof !16 {
ret void		ret void
}		}

; CHECK: ![[HOT_ID]] = !{!"function_section_prefix", !".hot"}		; CHECK: ![[HOT_ID]] = !{!"function_section_prefix", !".hot"}
; CHECK: ![[COLD_ID]] = !{!"function_section_prefix", !".unlikely"}		; CHECK: ![[COLD_ID]] = !{!"function_section_prefix", !".unlikely"}
!llvm.module.flags = !{!1}		!llvm.module.flags = !{!1}
!1 = !{i32 1, !"ProfileSummary", !2}		!1 = !{i32 1, !"ProfileSummary", !2}
!2 = !{!3, !4, !5, !6, !7, !8, !9, !10}		!2 = !{!3, !4, !5, !6, !7, !8, !9, !10}
!3 = !{!"ProfileFormat", !"InstrProf"}		!3 = !{!"ProfileFormat", !"SampleProfile"}
!4 = !{!"TotalCount", i64 10000}		!4 = !{!"TotalCount", i64 10000}
!5 = !{!"MaxCount", i64 1000}		!5 = !{!"MaxCount", i64 1000}
!6 = !{!"MaxInternalCount", i64 1}		!6 = !{!"MaxInternalCount", i64 1}
!7 = !{!"MaxFunctionCount", i64 1000}		!7 = !{!"MaxFunctionCount", i64 1000}
!8 = !{!"NumCounts", i64 3}		!8 = !{!"NumCounts", i64 3}
!9 = !{!"NumFunctions", i64 3}		!9 = !{!"NumFunctions", i64 3}
!10 = !{!"DetailedSummary", !11}		!10 = !{!"DetailedSummary", !11}
!11 = !{!12, !13, !14}		!11 = !{!12, !13, !14}
!12 = !{i32 10000, i64 100, i32 1}		!12 = !{i32 10000, i64 100, i32 1}
!13 = !{i32 999000, i64 100, i32 1}		!13 = !{i32 999000, i64 100, i32 1}
!14 = !{i32 999999, i64 1, i32 2}		!14 = !{i32 999999, i64 1, i32 2}
!15 = !{!"function_entry_count", i64 1000}		!15 = !{!"function_entry_count", i64 1000}
!16 = !{!"function_entry_count", i64 1}		!16 = !{!"function_entry_count", i64 1}
!17 = !{!"branch_weights", i32 80}		!17 = !{!"branch_weights", i32 80}
!18 = !{!"branch_weights", i32 1}		!18 = !{!"branch_weights", i32 1}

test/Transforms/CodeGenPrepare/section.ll

This file was copied to test/Transforms/CodeGenPrepare/section-samplepgo.ll.

	; RUN: opt < %s -codegenprepare -S | FileCheck %s			; RUN: opt < %s -codegenprepare -S | FileCheck %s

	target triple = "x86_64-pc-linux-gnu"			target triple = "x86_64-pc-linux-gnu"

	; This tests that hot/cold functions get correct section prefix assigned			; This tests that hot/cold functions get correct section prefix assigned

	; CHECK: hot_func{{.*}}!section_prefix ![[HOT_ID:[0-9]+]]			; CHECK: hot_func{{.*}}!section_prefix ![[HOT_ID:[0-9]+]]
	; The entry is hot			; The entry is hot
	define void @hot_func() !prof !15 {			define void @hot_func() !prof !15 {
	ret void			ret void
	}			}

	; CHECK: hot_call_func{{.*}}!section_prefix ![[HOT_ID]]			; For instrumentation based PGO, we should only look at entry counts,
	; The sum of 2 callsites are hot			; not call site VP metadata (which can exist on value profiled memcpy,
	define void @hot_call_func() !prof !16 {			; or possibly left behind after static analysis based devirtualization).
				; CHECK: cold_func1{{.*}}!section_prefix ![[COLD_ID:[0-9]+]]
				define void @cold_func1() !prof !16 {
	call void @hot_func(), !prof !17			call void @hot_func(), !prof !17
	call void @hot_func(), !prof !17			call void @hot_func(), !prof !17
	ret void			ret void
	}			}

	; CHECK-NOT: normal_func{{.*}}!section_prefix			; CHECK: cold_func2{{.*}}!section_prefix
	; The sum of all callsites are neither hot or cold			define void @cold_func2() !prof !16 {
	define void @normal_func() !prof !16 {
	call void @hot_func(), !prof !17			call void @hot_func(), !prof !17
	call void @hot_func(), !prof !18			call void @hot_func(), !prof !18
	call void @hot_func(), !prof !18			call void @hot_func(), !prof !18
	ret void			ret void
	}			}

	; CHECK: cold_func{{.*}}!section_prefix ![[COLD_ID:[0-9]+]]			; CHECK: cold_func3{{.*}}!section_prefix ![[COLD_ID]]
	; The entry and the callsite are both cold			define void @cold_func3() !prof !16 {
	define void @cold_func() !prof !16 {
	call void @hot_func(), !prof !18			call void @hot_func(), !prof !18
	ret void			ret void
	}			}

	; CHECK: ![[HOT_ID]] = !{!"function_section_prefix", !".hot"}			; CHECK: ![[HOT_ID]] = !{!"function_section_prefix", !".hot"}
	; CHECK: ![[COLD_ID]] = !{!"function_section_prefix", !".unlikely"}			; CHECK: ![[COLD_ID]] = !{!"function_section_prefix", !".unlikely"}
	!llvm.module.flags = !{!1}			!llvm.module.flags = !{!1}
	!1 = !{i32 1, !"ProfileSummary", !2}			!1 = !{i32 1, !"ProfileSummary", !2}

test/Transforms/Inline/prof-update.ll

	; RUN: opt < %s -inline -S | FileCheck %s			; RUN: opt < %s -inline -S | FileCheck %s
	; Checks if inliner updates branch_weights annotation for call instructions.			; Checks if inliner updates branch_weights annotation for call instructions.

	declare void @ext();			declare void @ext();
	declare void @ext1();			declare void @ext1();
	@func = global void ()* null			@func = global void ()* null

	; CHECK: define void @callee(i32 %n) !prof ![[ENTRY_COUNT:[0-9]*]]			; CHECK: define void @callee(i32 %n) !prof ![[ENTRY_COUNT:[0-9]*]]
	define void @callee(i32 %n) !prof !1 {			define void @callee(i32 %n) !prof !15 {
	%cond = icmp sle i32 %n, 10			%cond = icmp sle i32 %n, 10
	br i1 %cond, label %cond_true, label %cond_false			br i1 %cond, label %cond_true, label %cond_false
	cond_true:			cond_true:
	; ext1 is optimized away, thus not updated.			; ext1 is optimized away, thus not updated.
	; CHECK: call void @ext1(), !prof ![[COUNT_CALLEE1:[0-9]*]]			; CHECK: call void @ext1(), !prof ![[COUNT_CALLEE1:[0-9]*]]
	call void @ext1(), !prof !2			call void @ext1(), !prof !16
	ret void			ret void
	cond_false:			cond_false:
	; ext is cloned and updated.			; ext is cloned and updated.
	; CHECK: call void @ext(), !prof ![[COUNT_CALLEE:[0-9]*]]			; CHECK: call void @ext(), !prof ![[COUNT_CALLEE:[0-9]*]]
	call void @ext(), !prof !2			call void @ext(), !prof !16
	%f = load void (), void ()* @func			%f = load void (), void ()* @func
	; CHECK: call void %f(), !prof ![[COUNT_IND_CALLEE:[0-9]*]]			; CHECK: call void %f(), !prof ![[COUNT_IND_CALLEE:[0-9]*]]
	call void %f(), !prof !4			call void %f(), !prof !18
	ret void			ret void
	}			}

	; CHECK: define void @caller()			; CHECK: define void @caller()
	define void @caller() {			define void @caller() {
	; CHECK: call void @ext(), !prof ![[COUNT_CALLER:[0-9]*]]			; CHECK: call void @ext(), !prof ![[COUNT_CALLER:[0-9]*]]
	; CHECK: call void %f.i(), !prof ![[COUNT_IND_CALLER:[0-9]*]]			; CHECK: call void %f.i(), !prof ![[COUNT_IND_CALLER:[0-9]*]]
	call void @callee(i32 15), !prof !3			call void @callee(i32 15), !prof !17
	ret void			ret void
	}			}

	!llvm.module.flags = !{!0}			!llvm.module.flags = !{!1}
	!0 = !{i32 1, !"MaxFunctionCount", i32 2000}			!1 = !{i32 1, !"ProfileSummary", !2}
	!1 = !{!"function_entry_count", i64 1000}			!2 = !{!3, !4, !5, !6, !7, !8, !9, !10}
	!2 = !{!"branch_weights", i64 2000}			!3 = !{!"ProfileFormat", !"SampleProfile"}
	!3 = !{!"branch_weights", i64 400}			!4 = !{!"TotalCount", i64 10000}
	!4 = !{!"VP", i32 0, i64 140, i64 111, i64 80, i64 222, i64 40, i64 333, i64 20}			!5 = !{!"MaxCount", i64 10}
				!6 = !{!"MaxInternalCount", i64 1}
				!7 = !{!"MaxFunctionCount", i64 2000}
				!8 = !{!"NumCounts", i64 2}
				!9 = !{!"NumFunctions", i64 2}
				!10 = !{!"DetailedSummary", !11}
				!11 = !{!12, !13, !14}
				!12 = !{i32 10000, i64 100, i32 1}
				!13 = !{i32 999000, i64 100, i32 1}
				!14 = !{i32 999999, i64 1, i32 2}
				!15 = !{!"function_entry_count", i64 1000}
				!16 = !{!"branch_weights", i64 2000}
				!17 = !{!"branch_weights", i64 400}
				!18 = !{!"VP", i32 0, i64 140, i64 111, i64 80, i64 222, i64 40, i64 333, i64 20}
	attributes #0 = { alwaysinline }			attributes #0 = { alwaysinline }
	; CHECK: ![[ENTRY_COUNT]] = !{!"function_entry_count", i64 600}			; CHECK: ![[ENTRY_COUNT]] = !{!"function_entry_count", i64 600}
	; CHECK: ![[COUNT_CALLEE1]] = !{!"branch_weights", i64 2000}			; CHECK: ![[COUNT_CALLEE1]] = !{!"branch_weights", i64 2000}
	; CHECK: ![[COUNT_CALLEE]] = !{!"branch_weights", i64 1200}			; CHECK: ![[COUNT_CALLEE]] = !{!"branch_weights", i64 1200}
	; CHECK: ![[COUNT_IND_CALLEE]] = !{!"VP", i32 0, i64 84, i64 111, i64 48, i64 222, i64 24, i64 333, i64 12}			; CHECK: ![[COUNT_IND_CALLEE]] = !{!"VP", i32 0, i64 84, i64 111, i64 48, i64 222, i64 24, i64 333, i64 12}
	; CHECK: ![[COUNT_CALLER]] = !{!"branch_weights", i64 800}			; CHECK: ![[COUNT_CALLER]] = !{!"branch_weights", i64 800}
	; CHECK: ![[COUNT_IND_CALLER]] = !{!"VP", i32 0, i64 56, i64 111, i64 32, i64 222, i64 16, i64 333, i64 8}			; CHECK: ![[COUNT_IND_CALLER]] = !{!"VP", i32 0, i64 56, i64 111, i64 32, i64 222, i64 16, i64 333, i64 8}
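
The expected values in the CHECK lines above are consistent with simple proportional scaling: the call site into @callee has weight 400 and the callee's entry count is 1000, so the inlined copy receives 400/1000 of each count and the out-of-line callee keeps the rest. The sketch below reproduces that arithmetic only; `scaleToCaller` and `remainInCallee` are hypothetical helpers, not the inliner's actual functions, and they assume the call site count does not exceed the entry count and that the product does not overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the share of Count attributed to the inlined copy.
uint64_t scaleToCaller(uint64_t Count, uint64_t CallSiteCount,
                       uint64_t EntryCount) {
  assert(EntryCount > 0 && "entry count must be non-zero");
  return Count * CallSiteCount / EntryCount;
}

// Hypothetical helper: the share that stays with the out-of-line callee.
uint64_t remainInCallee(uint64_t Count, uint64_t CallSiteCount,
                        uint64_t EntryCount) {
  return Count - scaleToCaller(Count, CallSiteCount, EntryCount);
}
```

For the `branch_weights` value of 2000 on the call to @ext, this gives 800 for the caller and 1200 for the callee. For the VP total of 140 it gives 56 and 84, and the entry count of 1000 drops to 600, matching the test's expectations.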

unittests/Analysis/ProfileSummaryInfoTest.cpp

TEST_F(ProfileSummaryInfoTest, InstrProf) {		TEST_F(ProfileSummaryInfoTest, InstrProf) {
EXPECT_TRUE(PSI.isHotBB(BB3, &BFI));		EXPECT_TRUE(PSI.isHotBB(BB3, &BFI));

CallSite CS1(BB1->getFirstNonPHI());		CallSite CS1(BB1->getFirstNonPHI());
auto *CI2 = BB2->getFirstNonPHI();		auto *CI2 = BB2->getFirstNonPHI();
CallSite CS2(CI2);		CallSite CS2(CI2);

EXPECT_TRUE(PSI.isHotCallSite(CS1, &BFI));		EXPECT_TRUE(PSI.isHotCallSite(CS1, &BFI));
EXPECT_FALSE(PSI.isHotCallSite(CS2, &BFI));		EXPECT_FALSE(PSI.isHotCallSite(CS2, &BFI));

		// Test that adding an MD_prof metadata with a hot count on CS2 does not
		// change its hotness as it has no effect in instrumented profiling.
		MDBuilder MDB(M->getContext());
		CI2->setMetadata(llvm::LLVMContext::MD_prof, MDB.createBranchWeights({400}));
		EXPECT_FALSE(PSI.isHotCallSite(CS2, &BFI));
}		}

TEST_F(ProfileSummaryInfoTest, SampleProf) {		TEST_F(ProfileSummaryInfoTest, SampleProf) {
auto M = makeLLVMModule("SampleProfile");		auto M = makeLLVMModule("SampleProfile");
Function *F = M->getFunction("f");		Function *F = M->getFunction("f");
ProfileSummaryInfo PSI = buildPSI(M.get());		ProfileSummaryInfo PSI = buildPSI(M.get());

BasicBlock &BB0 = F->getEntryBlock();		BasicBlock &BB0 = F->getEntryBlock();

This is an archive of the discontinued LLVM Phabricator instance.

Restrict call metadata based hotness detection to Sample PGO mode
ClosedPublic