
Details

Reviewers

hoy
wmi
wenlei
davidxl

Commits

rGc460ef61d64f: [CSSPGO][llvm-profgen] Change sample count of dangling probe in llvm-profgen

Summary

Change llvm-profgen to emit UINT64_MAX as the sample count of a dangling probe. The compiler identifies this value for further processing; see https://reviews.llvm.org/D95962 for details.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

wlei created this revision. Feb 16 2021, 1:03 PM

Herald added subscribers: hoy, wenlei, lxfind. Feb 16 2021, 1:03 PM

wlei requested review of this revision. Feb 16 2021, 1:03 PM

Herald added a project: Restricted Project. Feb 16 2021, 1:03 PM

Herald added a subscriber: llvm-commits.

wlei edited the summary of this revision. Feb 16 2021, 1:07 PM

wlei added reviewers: hoy, wmi, wenlei, davidxl.

Harbormaster completed remote builds in B89431: Diff 324084. Feb 16 2021, 1:55 PM

wenlei added inline comments. Feb 16 2021, 9:27 PM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
512	Do we still want to count samples on dangling probes toward total samples? Without deduplication for dangling probes, we could have multiple dangling probes in the same block, and repeatedly counting the samples that cover these probes may bloat the total samples.

hoy added inline comments. Feb 16 2021, 10:40 PM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
512	Good point. Since the samples collected on dangling probes are invalid, I would not count them toward total samples.

wlei added inline comments. Feb 16 2021, 10:49 PM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
512	I see, thanks for the suggestion. What about the count used for `isEntry` at line 506? I guess it should be the original count, not zero. Or can a dangling probe never be the entry probe?

hoy added inline comments. Feb 16 2021, 11:06 PM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
512	That's a good point. It's possible to have a dangling entry probe and we should bail out here. Or we can just return right after adding body samples for dangling probes.

remove count towards total sample for dangling probes

llvm/tools/llvm-profgen/ProfileGenerator.cpp
512	I see. Since the sample counts are invalid, we shouldn't count them toward total samples or use them to infer the inliner's sample count. Thanks for the clarification.
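The behavior agreed on in this thread can be sketched as follows. This is a minimal standalone illustration, not the llvm-profgen code: `ToyProbe`, `ToyProfile`, `populate`, and `kInvalidProbeCount` are hypothetical names introduced here. A dangling probe gets the sentinel as its body count and contributes to neither total samples nor head samples.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for FunctionSamples::InvalidProbeCount.
constexpr uint64_t kInvalidProbeCount = std::numeric_limits<uint64_t>::max();

// Hypothetical toy probe: only the fields this sketch needs.
struct ToyProbe {
  uint32_t Index;
  bool IsDangling;
  bool IsEntry;
};

// Hypothetical toy profile: body counts keyed by probe index.
struct ToyProfile {
  std::map<uint32_t, uint64_t> BodySamples;
  uint64_t TotalSamples = 0;
  uint64_t HeadSamples = 0;
};

ToyProfile populate(const std::vector<std::pair<ToyProbe, uint64_t>> &Counters) {
  ToyProfile Profile;
  for (const auto &Entry : Counters) {
    const ToyProbe &Probe = Entry.first;
    uint64_t Count = Entry.second;
    if (Probe.IsDangling) {
      // Mark the count as untrusted and skip total/head accumulation.
      Profile.BodySamples[Probe.Index] = kInvalidProbeCount;
      continue;
    }
    Profile.BodySamples[Probe.Index] += Count;
    Profile.TotalSamples += Count;
    if (Probe.IsEntry)
      Profile.HeadSamples += Count;
  }
  return Profile;
}
```

With four probes that each saw 14 samples, two of them dangling, the total in this sketch comes to 28 rather than 56.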

Harbormaster completed remote builds in B89563: Diff 324339. Feb 17 2021, 10:32 AM

hoy added inline comments. Feb 26 2021, 10:13 AM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
501	Actually, `UINT64_MAX` may overflow total samples. Even if it doesn't, profile merge may overflow too. That's one of the reasons we were using 0 as a special count. @wmi What do you think about continuing to use 0?
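To make the overflow concern concrete, here is a small standalone sketch. `wrappingAdd` and `saturatingAdd` are hypothetical helpers written for this note and are not claimed to be the functions the profile code uses. Adding any nonzero count to `UINT64_MAX` with plain unsigned arithmetic wraps around to a small number, while a saturating add clamps at the maximum and can report the overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

constexpr uint64_t kMax = std::numeric_limits<uint64_t>::max();

// Plain unsigned addition: well defined, but wraps modulo 2^64.
uint64_t wrappingAdd(uint64_t A, uint64_t B) { return A + B; }

// Saturating addition: clamps to the maximum and optionally reports overflow.
uint64_t saturatingAdd(uint64_t A, uint64_t B, bool *Overflowed = nullptr) {
  uint64_t Sum = A + B;
  bool Wrapped = Sum < A; // Unsigned wraparound makes the sum smaller.
  if (Overflowed)
    *Overflowed = Wrapped;
  return Wrapped ? kMax : Sum;
}
```

Either way, a sentinel that is accumulated like an ordinary count corrupts the total, which is why the sentinel has to be kept out of every accumulation path.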

hoy added inline comments. Mar 3 2021, 12:17 PM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
501	I talked to other folks and they prefer `UINT64_MAX` over 0 because it is less confusing. @wlei we may need to fix the places that accumulate total samples and set the entry count so that they do not use `UINT64_MAX` as a sample count.

wlei added inline comments. Mar 3 2021, 5:46 PM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
501	Thanks for sharing. So on the profgen side we can keep this patch, right? See line 501: it only adds the body sample and doesn't accumulate the total samples. On the compiler side, we need to avoid the addition when we meet `UINT64_MAX`.

hoy added inline comments. Mar 3 2021, 6:24 PM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
501	We may need to check for dangling probes in `CSProfileGenerator::populateFunctionBoundarySamples` when updating the callsite target samples?

hoy accepted this revision. Mar 3 2021, 6:30 PM

hoy added inline comments.

llvm/tools/llvm-profgen/ProfileGenerator.cpp
501	I was wrong. The current implementation looks good. We are not propagating `UINT64_MAX` anywhere except for adding it as a body sample. On the compiler side, the merging of dangling probes is taken care of in D95962 (https://reviews.llvm.org/D95962). Please rebase this change on top of D95962 to share the definition of `FunctionSamples::InvalidProbeCount`.

This revision is now accepted and ready to land. Mar 3 2021, 6:30 PM

wlei added inline comments. Mar 3 2021, 8:56 PM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
501	Thanks for the clarification. Will rebase it after `D95962` has landed.

wenlei added inline comments. Mar 3 2021, 11:55 PM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
501	Since we are using `addBodySamples` here, could the probe count end up not equal to `UINT64_MAX` after the addition? Can we add an assertion?

address reviewers' feedback

wlei added inline comments. Mar 4 2021, 12:13 AM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
501	Good catch. Changed to update the count only if it doesn't exist, and added the assertion.

lgtm.

hoy added inline comments. Mar 4 2021, 12:31 AM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
501	This is a good point. Probes with the same ID may have both dangling and non-dangling copies. Using `SampleRecord::merge` should be safe, like `R->merge(FunctionSamples::InvalidProbeCount, 1)`

Harbormaster completed remote builds in B91983: Diff 328043. Mar 4 2021, 7:48 AM

use merge function to update body samples for probe

wlei added inline comments. Mar 4 2021, 10:13 AM

llvm/tools/llvm-profgen/ProfileGenerator.cpp
501	Thanks for the reminder. I see, then we should also consider the non-dangling case, so I added a helper function `addBodySamplesForProbe` for this. Does this look good to you?
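The case raised above, where copies of the same probe ID are both dangling and non-dangling, could be handled by a merge along these lines. This is a hedged sketch, not the `addBodySamplesForProbe` helper from the patch: `mergeProbeCountSketch` and `kInvalidProbeCount` are hypothetical names, and the rule that an invalid count wins over any valid count is an assumption made here for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>

// Hypothetical stand-in for FunctionSamples::InvalidProbeCount.
constexpr uint64_t kInvalidProbeCount = std::numeric_limits<uint64_t>::max();

// Merge an incoming count for a probe index into the body-sample map.
// Assumption of this sketch: once either side is invalid, the merged
// record stays invalid, so a dangling copy is never diluted by a
// non-dangling copy of the same probe.
void mergeProbeCountSketch(std::map<uint32_t, uint64_t> &Body, uint32_t Index,
                           uint64_t Num) {
  auto It = Body.find(Index);
  if (It == Body.end()) {
    Body[Index] = Num;
    return;
  }
  if (It->second == kInvalidProbeCount || Num == kInvalidProbeCount) {
    It->second = kInvalidProbeCount;
    return;
  }
  // Ordinary accumulation; overflow handling is omitted for brevity.
  It->second += Num;
}
```

Because the invalid state is sticky in this sketch, the order in which dangling and non-dangling copies arrive does not change the result.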

lgtm, thx!

Harbormaster completed remote builds in B92112: Diff 328224. Mar 5 2021, 12:08 AM

This revision was landed with ongoing or failed builds. Mar 8 2021, 2:37 PM

Closed by commit rGc460ef61d64f: [CSSPGO][llvm-profgen] Change sample count of dangling probe in llvm-profgen (authored by wlei).

This revision was automatically updated to reflect the committed changes.

wlei added a commit: rGc460ef61d64f: [CSSPGO][llvm-profgen] Change sample count of dangling probe in llvm-profgen.

Diff 328043

llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test

 ; RUN: llvm-profgen --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript --binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t --show-unwinder-output --csprof-cold-thres=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
 ; RUN: FileCheck %s --input-file %t

 ; CHECK: [main:2 @ foo]:74:0
 ; CHECK-NEXT: 2: 15
 ; CHECK-NEXT: 3: 15
 ; CHECK-NEXT: 4: 14
 ; CHECK-NEXT: 5: 1
 ; CHECK-NEXT: 6: 15
 ; CHECK-NEXT: 8: 14 bar:14
 ; CHECK-NEXT: !CFGChecksum: 138950591924
-; CHECK-NEXT:[main:2 @ foo:8 @ bar]:56:14
+; CHECK-NEXT:[main:2 @ foo:8 @ bar]:28:14
 ; CHECK-NEXT: 1: 14
-; CHECK-NEXT: 2: 14
-; CHECK-NEXT: 3: 14
+; CHECK-NEXT: 2: 18446744073709551615
+; CHECK-NEXT: 3: 18446744073709551615
 ; CHECK-NEXT: 4: 14
 ; CHECK-NEXT: !CFGChecksum: 72617220756


 ; CHECK-UNWINDER: Binary(inline-cs-pseudoprobe.perfbin)'s Range Counter:
 ; CHECK-UNWINDER-EMPTY:
 ; CHECK-UNWINDER-NEXT: (800, 858): 1
 ; CHECK-UNWINDER-NEXT: (80e, 82b): 1
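The value 18446744073709551615 in the updated expectations is UINT64_MAX written in decimal. As a small sketch of how a reader of the text profile might recognize it, the hypothetical helper below (`isDanglingBodyLine` is not part of LLVM) parses a body-sample line of the form `index: count` and compares the count against the maximum 64-bit value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Returns true if a body-sample line such as "2: 18446744073709551615"
// carries the dangling-probe sentinel. Intended only for "index: count"
// lines; std::stoull throws if no number follows the first colon.
bool isDanglingBodyLine(const std::string &Line) {
  std::size_t Colon = Line.find(':');
  if (Colon == std::string::npos)
    return false;
  // std::stoull skips leading whitespace and stops at the first non-digit,
  // so trailing call targets like " bar:14" are ignored.
  unsigned long long Count = std::stoull(Line.substr(Colon + 1));
  return Count == std::numeric_limits<uint64_t>::max();
}
```

In the test above, probes 2 and 3 of `bar` carry the sentinel, and the function total drops from 56 to 28 because only probes 1 and 4 (14 samples each) are still counted.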

llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test

 ; RUN: llvm-profgen --perfscript=%S/Inputs/noinline-cs-pseudoprobe.perfscript --binary=%S/Inputs/noinline-cs-pseudoprobe.perfbin --output=%t --show-unwinder-output --csprof-cold-thres=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
 ; RUN: FileCheck %s --input-file %t

 ; CHECK: [main:2 @ foo]:75:0
 ; CHECK-NEXT: 2: 15
 ; CHECK-NEXT: 3: 15
 ; CHECK-NEXT: 4: 15
 ; CHECK-NEXT: 6: 15
 ; CHECK-NEXT: 8: 15 bar:15
 ; CHECK-NEXT: !CFGChecksum: 138950591924
-; CHECK-NEXT:[main:2 @ foo:8 @ bar]:60:15
+; CHECK-NEXT:[main:2 @ foo:8 @ bar]:30:15
 ; CHECK-NEXT: 1: 15
-; CHECK-NEXT: 2: 15
-; CHECK-NEXT: 3: 15
+; CHECK-NEXT: 2: 18446744073709551615
+; CHECK-NEXT: 3: 18446744073709551615
 ; CHECK-NEXT: 4: 15
 ; CHECK-NEXT: !CFGChecksum: 72617220756


 ; CHECK-UNWINDER: Binary(noinline-cs-pseudoprobe.perfbin)'s Range Counter:
 ; CHECK-UNWINDER-NEXT: main:2
 ; CHECK-UNWINDER-NEXT: (79e, 7bf): 15
 ; CHECK-UNWINDER-NEXT: (7c4, 7cf): 15

llvm/tools/llvm-profgen/ProfileGenerator.cpp

@@ void PseudoProbeCSProfileGenerator::populateBodySamplesWithProbes( @@
   // the Address2ProbeMap
   extractProbesFromRange(RangeCounter, ProbeCounter, Binary);
   for (auto PI : ProbeCounter) {
     const PseudoProbe *Probe = PI.first;
     uint64_t Count = PI.second;
     FunctionSamples &FunctionProfile =
         getFunctionProfileForLeafProbe(ContextStrStack, Probe, Binary);

+    // Use InvalidProbeCount(UINT64_MAX) to mark sample count for a dangling
+    // probe. Dangling probes are the probes associated to an empty block. With
+    // this place holder, sample count on dangling probe will not be trusted by
+    // the compiler and it will rely on the counts inference algorithm to get
+    // the probe a reasonable count.
+    if (Probe->isDangling()) {
+      ErrorOr<uint64_t> R = FunctionProfile.findSamplesAt(Probe->Index, 0);
+      if (!R) {
+        FunctionProfile.addBodySamples(Probe->Index, 0,
+                                       FunctionSamples::InvalidProbeCount);
+      } else {
+        assert(R.get() == FunctionSamples::InvalidProbeCount &&
+               "Dangling probe count should be UINT64_MAX.");
+      }
+      continue;
+    }
     FunctionProfile.addBodySamples(Probe->Index, 0, Count);
     FunctionProfile.addTotalSamples(Count);
     if (Probe->isEntry()) {
       FunctionProfile.addHeadSamples(Count);
       // Look up for the caller's function profile
       const auto *InlinerDesc = Binary->getInlinerDescForProbe(Probe);
       if (InlinerDesc != nullptr) {
         // Since the context id will be compressed, we have to use callee's
         // context id to infer caller's context id to ensure they share the
         // same context prefix.

Lint (pre-merge checks), reported on both uses of `FunctionSamples::InvalidProbeCount` above: clang-tidy: error: no member named 'InvalidProbeCount' in 'llvm::sampleprof::FunctionSamples' [clang-diagnostic-error]

This is an archive of the discontinued LLVM Phabricator instance.

[CSSPGO][llvm-profgen] Change sample count of dangling probe in llvm-profgen
Closed, Public
