This is an archive of the discontinued LLVM Phabricator instance.


Differential D83852

[llvm-profdata] Implement llvm-profdata overlap for sample profiles
ClosedPublic

Authored by weihe on Jul 14 2020, 10:59 PM.


Details

Reviewers

wenlei
hoyFB
davidxl
wmi

Commits

rG540489de6816: [llvm-profdata] Implement llvm-profdata overlap for sample profiles

Summary

Implemented the llvm-profdata overlap feature for sample profiles. It reports weighted similarity and unweighted overlap metrics at the program and function levels for two input profiles. The similarity metrics are symmetric with regard to the order of the two input profiles. By default, the tool reports only the program-level summary. Users can look into function-level details via the additional options --function, --similarity-cutoff, and --value-cutoff.

The similarity metrics are designed as follows:

Program-level summary
- Whole program profile similarity is an aggregate over function-level similarity FS: PS = sum(FS(A) * avg_weight(A)) over all functions A.
- Whole program sample overlap: PSO = common_samples / total_samples.
- Function overlap: FO = #common_function / #total_function.
- Hot-function overlap: HFO = #common_hot_function / #total_hot_function.
- Hot-block overlap: HBO = #common_hot_block / #total_hot_block.
Function-level details
- Function-level similarity is an aggregate over line/block-level similarities BS of all sample lines/blocks in the function, weighted by the closeness of the function's weights in two profiles: FS = sum(BS(i)) * (1 - weight_distance(A)).
- Function-level sample overlap: FSO = common_samples / total_samples for samples in the function.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

weihe created this revision. Jul 14 2020, 10:59 PM

Herald added a project: Restricted Project. Jul 14 2020, 10:59 PM

Herald added a subscriber: llvm-commits.

weihe edited the summary of this revision. Jul 14 2020, 11:25 PM

Harbormaster completed remote builds in B64292: Diff 278075. Jul 14 2020, 11:36 PM

weihe edited the summary of this revision. Jul 15 2020, 6:58 AM

Corrected two typos in comment.

weihe added reviewers: wenlei, hoyFB. Jul 15 2020, 7:22 AM

Harbormaster failed remote builds in B64353: Diff 278180! Jul 15 2020, 8:16 AM

hoyFB added inline comments. Jul 17 2020, 12:19 AM

llvm/tools/llvm-profdata/llvm-profdata.cpp
1058	function-level similarity?
1248	I think you want just `const auto Match = ...`. The return value will be returned by reference if it is large.
1250	Maybe just increment `HotFuncOverlap.UnionCount` before the `continue`, instead of erasing every matched function from `TestHotFunc`? Something like: `for (const auto &F : TestHotFunc) { if (BaseHotFunc.count(F.first())) ++HotFuncOverlap.OverlapCount; else ++HotFuncOverlap.UnionCount; }`
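The counting idea in the last comment (look each test-side hot function up in the base set instead of erasing matches) can be sketched with standard containers. This is an illustration only: `HotOverlapCounts` and `countHotOverlap` are hypothetical names, and `std::set` stands in for the `StringMap` used in the patch.

```cpp
#include <set>
#include <string>

// Hypothetical result type for this sketch.
struct HotOverlapCounts {
  unsigned Overlap = 0;    // hot in both profiles
  unsigned BaseUnique = 0; // hot only in the base profile
  unsigned TestUnique = 0; // hot only in the test profile
};

// Count overlap by lookup; neither input set is modified.
HotOverlapCounts countHotOverlap(const std::set<std::string> &BaseHot,
                                 const std::set<std::string> &TestHot) {
  HotOverlapCounts Counts;
  for (const std::string &Name : TestHot) {
    if (BaseHot.count(Name))
      ++Counts.Overlap;
    else
      ++Counts.TestUnique;
  }
  // Every base-side hot function that was not matched is unique to base.
  Counts.BaseUnique = static_cast<unsigned>(BaseHot.size()) - Counts.Overlap;
  return Counts;
}
```

For example, with a base set {_Z3bari, _Z3fooi, main, _Z3bazi} and a test set {_Z3bari, _Z3fooi2, main2}, the sketch reports one overlapping function, three unique to base, and two unique to test.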

wenlei added reviewers: davidxl, wmi. Jul 17 2020, 11:56 AM

snehasish added a subscriber: snehasish. Jul 17 2020, 3:02 PM

Refactored computeHotFuncOverlap() and renamed computeSampleFunctionOverlap() to computeSampleFunctionInternalOverlap()

weihe marked 2 inline comments as done. Jul 17 2020, 10:25 PM

weihe added inline comments.

llvm/tools/llvm-profdata/llvm-profdata.cpp
1058	Thank you for the comment! This is not exactly the function-level similarity we report in the tool, but an intermediate result towards it. The formula of function-level similarity is given at line 882. I renamed this function to `computeSampleFunctionInternalOverlap()` and moved it to a private member to reduce confusion.
1248	Thank you for the suggestion! I've changed the code accordingly.
1250	This suggestion is really good! The refactored code is much cleaner. Thank you very much!

Harbormaster failed remote builds in B64795: Diff 278969! Jul 17 2020, 10:50 PM

dokyungs added a subscriber: dokyungs. Jul 21 2020, 12:22 PM

@wmi We'd like to hear about your input on this. Thanks!

Thanks for the work. It is a very useful feature.

llvm/tools/llvm-profdata/llvm-profdata.cpp
1166–1167	Can we make the similarity fall within the range 0~1, to be consistent with block and profile similarity? It is more natural to reason about similarity in the range 0~1.
1268–1274	Seemingly, a lot of the function's complexity comes from the lock-step iteration over the maps from the two profiles at the same time. Could you extract the lock-step iteration logic into a separate class? That way we don't have to deal with the logic multiple times when iterating over BodySampleMap, CallsiteSamplesMap and FunctionSamplesMap.
1293–1298	updateForUnmatchedBlock is a special case of the block above. We may be able to share the code.

Extracted lock step iteration logic into class MatchStep and revised code according to other suggestions.

weihe added inline comments. Jul 28 2020, 9:58 PM

llvm/tools/llvm-profdata/llvm-profdata.cpp
1166–1167	Thank you for the suggestion! I have changed the function `computeSampleFunctionInternalOverlap()` to return a double in range 0~1.
1268–1274	I have extracted the logic of lock step iteration to class `MatchStep`. This is a great suggestion. Thank you very much!
1293–1298	Thank you for the suggestion! I combined this code with `updateForUnmatchedBlock()` into a new function `updateOverlapStatsForFunction()`.
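The lock-step iteration discussed in this thread amounts to a merge-style walk over two sorted maps, labelling each key as present in both or in only one of them. A minimal sketch of the idea, written as a free function rather than the `MatchStep` class from the patch; `Step` and `classifyKeys` are hypothetical names used only here:

```cpp
#include <map>
#include <vector>

// Hypothetical labels for this sketch.
enum class Step { Match, FirstUnique, SecondUnique };

// Walk two sorted maps in lock step and record, in key order, whether each
// key appears in both maps or in only one of them.
template <class MapT>
std::vector<Step> classifyKeys(const MapT &First, const MapT &Second) {
  std::vector<Step> Steps;
  auto F = First.begin();
  auto S = Second.begin();
  while (F != First.end() || S != Second.end()) {
    if (F != First.end() && (S == Second.end() || F->first < S->first)) {
      Steps.push_back(Step::FirstUnique);
      ++F;
    } else if (S != Second.end() &&
               (F == First.end() || S->first < F->first)) {
      Steps.push_back(Step::SecondUnique);
      ++S;
    } else {
      // Neither key is smaller and neither side is finished: keys are equal.
      Steps.push_back(Step::Match);
      ++F;
      ++S;
    }
  }
  return Steps;
}
```

Keeping the walk in one place means the same comparison logic can serve any pair of maps with ordered keys, which is the motivation given in the review for extracting it.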

Thanks. Another suggestion about comment. Other than that, LGTM.

llvm/tools/llvm-profdata/llvm-profdata.cpp
1253–1254	Both weightFuncSimilarity and weightByImportance consider the weight of the function in the profile, so at first I was confused about how they differ. As I understand it, weightFuncSimilarity absorbs the weight difference into the similarity so that the similarity stays in the range 0~1, while weightByImportance multiplies the similarity by the weight ratio of the function in the profile (the average ratio across the two profiles), so that the aggregate similarity of all the functions in the profiles stays in the range 0~1. Please correct me if I am wrong. Either way, it would be better to make the intent of these two functions clearer in the comments.

This revision is now accepted and ready to land. Aug 1 2020, 10:14 AM

Grouped previous weightFuncSimilarity() and computeSampleFunctionInternalOverlap() to computeSampleFunctionOverlap() and added comments for readability.

weihe added inline comments. Aug 6 2020, 11:45 PM

llvm/tools/llvm-profdata/llvm-profdata.cpp
1253–1254	Yes, that's right! In fact, the previous `weightFuncSimilarity()` is part of the computation of "function-level similarity", whereas `weightByImportance()` is part of the computation that aggregates "function-level similarity" to "profile-level similarity". So the two functions use weights in slightly different ways. I also realized these two functions may be confusing, so I grouped the previous `weightFuncSimilarity()` (renamed as `weightForFuncSimilarity()`) and `computeSampleFunctionInternalOverlap()` into one `computeSampleFunctionOverlap()` function. In addition, I added comments to `weightForFuncSimilarity()` and `weightByImportance()` to explain the different purpose of these functions. Thank you for pointing this out!
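The two uses of weight described in this exchange can be sketched side by side. The functions below are simplified stand-ins that reuse the names from the review; their signatures, and the choice of |WB - WT| as the weight distance, are assumptions made for illustration and not the code from the patch.

```cpp
#include <cmath>

// Function-level step (assumed form): scale the function's internal overlap
// by how close its weights are in the two profiles. With all inputs in
// [0, 1], the result stays in [0, 1].
double weightForFuncSimilarity(double InternalOverlap, double BaseWeight,
                               double TestWeight) {
  const double WeightDistance = std::fabs(BaseWeight - TestWeight);
  return InternalOverlap * (1.0 - WeightDistance);
}

// Program-level step (assumed form): a function contributes its similarity
// multiplied by its average weight across the two profiles, so that the sum
// over all functions stays in [0, 1].
double weightByImportance(double FuncSimilarity, double BaseWeight,
                          double TestWeight) {
  return FuncSimilarity * (BaseWeight + TestWeight) / 2.0;
}
```

As a rough check, taking the rounded weights listed for _Z3bari in the OVERLAP3 expectations (10.31% and 11.02%) and assuming an internal overlap of 1.0, the first function gives about 0.9929, in line with the 99.29% similarity the test expects for that function.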

Harbormaster completed remote builds in B67419: Diff 283819. Aug 7 2020, 12:35 AM

hoyFB added inline comments. Aug 7 2020, 11:00 AM

llvm/tools/llvm-profdata/llvm-profdata.cpp
1011–1017	Can you please decorate these functions with `const`?

hoyFB accepted this revision. Aug 7 2020, 11:00 AM

Added const keyword to member functions of MatchStep.

weihe added inline comments. Aug 8 2020, 2:47 PM

llvm/tools/llvm-profdata/llvm-profdata.cpp
1011–1017	Thank you for this suggestion! I have added `const` keyword to the member functions.

MaskRay added a subscriber: MaskRay. Aug 8 2020, 3:18 PM

MaskRay added inline comments.

llvm/test/tools/llvm-profdata/sample-overlap.test
2	You may consider a new test utility `split-file`, which can group multiple auxiliary files. D83834

Harbormaster completed remote builds in B67597: Diff 284144. Aug 8 2020, 3:45 PM

Thank you for working on this during internship, @weihe! The extra tweaks here on top of our internal version look good to me as well. The test failure doesn't seem related, I will rebase and land this on your behalf.

llvm/test/tools/llvm-profdata/sample-overlap.test
2	Thanks for the pointer - I wasn't aware of the recently added utility. That will work, but in this particular case, I think it's still better to keep profile inputs separate so they're semantically legal afdo profile by themselves.

This revision was landed with ongoing or failed builds. Aug 8 2020, 6:02 PM

Closed by commit rG540489de6816: [llvm-profdata] Implement llvm-profdata overlap for sample profiles (authored by weihe, committed by wenlei).

This revision was automatically updated to reflect the committed changes.

wenlei added a commit: rG540489de6816: [llvm-profdata] Implement llvm-profdata overlap for sample profiles.

Hello @wenlei, @weihe,

llvm/tools/llvm-profdata/llvm-profdata.cpp

1599

Match is invalidated after this line, so it cannot be compared with BaseFuncProf.end() afterwards at L1607 and L1609.

In a Debug build on Windows/MSVC this asserts in MS-STL:

The following tests fail because of this:

LLVM :: tools/llvm-profdata/compact-sample.proftext
LLVM :: tools/llvm-profdata/sample-overlap.test

The following patch seems to fix the issue, but I thought I would let you decide what to do.

diff --git a/llvm/tools/llvm-profdata/llvm-profdata.cpp b/llvm/tools/llvm-profdata/llvm-profdata.cpp
index 488dc8fa4317..38d9cb9461bb 100644
--- a/llvm/tools/llvm-profdata/llvm-profdata.cpp
+++ b/llvm/tools/llvm-profdata/llvm-profdata.cpp
@@ -1633,6 +1633,7 @@ void SampleOverlapAggregator::computeSampleProfileOverlap(raw_fd_ostream &OS) {
            "except inlinees");
     FuncOverlap.TestSample = TestStats[FuncOverlap.TestName].SampleSum;

+    bool Matched = false;
     const auto Match = BaseFuncProf.find(FuncOverlap.TestName);
     if (Match == BaseFuncProf.end()) {
       const FuncSampleStats &FuncStats = TestStats[FuncOverlap.TestName];
@@ -1677,6 +1678,8 @@ void SampleOverlapAggregator::computeSampleProfileOverlap(raw_fd_ostream &OS) {
       // Remove matched base functions for later reporting functions not found
       // in test profile.
       BaseFuncProf.erase(Match);
+
+      Matched = true;
     }

     // Print function-level similarity information if specified by options.
@@ -1684,9 +1687,8 @@ void SampleOverlapAggregator::computeSampleProfileOverlap(raw_fd_ostream &OS) {
            "TestStats should have records for all functions in test profile "
            "except inlinees");
     if (TestStats[FuncOverlap.TestName].MaxSample >= FuncFilter.ValueCutoff ||
-        (Match != BaseFuncProf.end() &&
-         FuncOverlap.Similarity < LowSimilarityThreshold) ||
-        (Match != BaseFuncProf.end() && !FuncFilter.NameFilter.empty() &&
+        (Matched && FuncOverlap.Similarity < LowSimilarityThreshold) ||
+        (Matched && !FuncFilter.NameFilter.empty() &&
          FuncOverlap.BaseName.toString().find(FuncFilter.NameFilter) !=
              std::string::npos)) {
       assert(ProfOverlap.BaseSample > 0 &&
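The patch above replaces a post-erase iterator comparison with a flag recorded before the erase. A minimal sketch of that pattern, with `std::map` standing in for the container in the tool and `eraseMatched` a hypothetical helper that is not part of llvm-profdata:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: removes every name in TestNames that is present in
// Base and returns the names that were matched.
std::vector<std::string>
eraseMatched(std::map<std::string, int> &Base,
             const std::vector<std::string> &TestNames) {
  std::vector<std::string> MatchedNames;
  for (const std::string &Name : TestNames) {
    bool Matched = false;
    const auto Match = Base.find(Name);
    if (Match != Base.end()) {
      Base.erase(Match); // Match is invalid from here on.
      Matched = true;    // Record the outcome instead of reusing Match.
    }
    // Later decisions consult the flag, never the erased iterator.
    if (Matched)
      MatchedNames.push_back(Name);
  }
  return MatchedNames;
}
```

Comparing an erased iterator against `end()` is undefined behaviour for the standard containers, which is why a checked debug build may assert on it while an optimised build appears to work.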

wenlei added inline comments. Sep 1 2021, 2:31 PM

llvm/tools/llvm-profdata/llvm-profdata.cpp
1599	Good catch, thanks! I will fix it as you suggested.

Revision Contents

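Path                                                              Size
llvm/test/tools/llvm-profdata/Inputs/sample-overlap-0.proftext    18 lines
llvm/test/tools/llvm-profdata/Inputs/sample-overlap-1.proftext    18 lines
llvm/test/tools/llvm-profdata/Inputs/sample-overlap-2.proftext    18 lines
llvm/test/tools/llvm-profdata/Inputs/sample-overlap-3.proftext    18 lines
llvm/test/tools/llvm-profdata/Inputs/sample-overlap-4.proftext    18 lines
llvm/test/tools/llvm-profdata/Inputs/sample-overlap-5.proftext    18 lines
llvm/test/tools/llvm-profdata/sample-overlap.test                 118 lines
llvm/tools/llvm-profdata/llvm-profdata.cpp                        971 lines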

Diff 284158

llvm/test/tools/llvm-profdata/Inputs/sample-overlap-0.proftext

This file was added.

				_Z3bari:20301:1437
				1: 1437
				_Z3fooi:7711:610
				1: 610
				main:184019:0
				4: 534
				4.2: 534
				5: 1075
				5.1: 1075
				6: 2080
				7: 534
				9: 2064 _Z3bari:1471 _Z3fooi:631
				10: inline1:1000
				1: 1000
				10: inline2:2000
				1: 2000
				_Z3bazi:20301:1000
				1: 1000

llvm/test/tools/llvm-profdata/Inputs/sample-overlap-1.proftext

This file was added.

				_Z3bari:203010:14370
				1: 14370
				_Z3fooi:77110:6100
				1: 6100
				main:1840190:0
				4: 5340
				4.2: 5340
				5: 10750
				5.1: 10750
				6: 20800
				7: 5340
				9: 20640 _Z3bari:14710 _Z3fooi:6310
				10: inline1:10000
				1: 10000
				10: inline2:20000
				1: 20000
				_Z3bazi:203010:10000
				1: 10000

llvm/test/tools/llvm-profdata/Inputs/sample-overlap-2.proftext

This file was added.

				_Z3bari:20301:1437
				1: 1437
				_Z3fooi:7711:610
				1: 610
				main:18401:0
				4: 53
				4.2: 53
				5: 107
				5.1: 107
				6: 208
				7: 53
				9: 206 _Z3bari:1471 _Z3fooi:631
				10: inline1:100
				1: 100
				10: inline2:200
				1: 200
				_Z3bazi:20301:1000
				1: 1000

llvm/test/tools/llvm-profdata/Inputs/sample-overlap-3.proftext

This file was added.

				_Z3bari:20301:1437
				1: 1437
				_Z3fooi2:7711:610
				1: 610
				main2:184019:0
				4: 534
				4.2: 534
				5: 1075
				5.1: 1075
				6: 2080
				7: 534
				9: 2064 _Z3bari:1471 _Z3fooi:631
				10: inline1:1000
				1: 1000
				10: inline2:2000
				1: 2000
				_Z3bazi:20301:1000
				1: 100

llvm/test/tools/llvm-profdata/Inputs/sample-overlap-4.proftext

This file was added.

				_Z3bari:20301:1437
				2: 1437
				_Z3fooi:7711:610
				2: 610
				main:184019:0
				5: 534
				5.2: 534
				6: 1075
				6.1: 1075
				7: 208
				8: 534
				10: 206 _Z3bari:1471 _Z3fooi:631
				11: inline1:1000
				1: 1000
				11: inline2:2000
				1: 2000
				_Z3bazi:20301:1000
				2: 1000

llvm/test/tools/llvm-profdata/Inputs/sample-overlap-5.proftext

This file was added.

				_Z3bari:0:0
				1: 0
				_Z3fooi:0:0
				1: 0
				main:0:0
				4: 0
				4.2: 0
				5: 0
				5.1: 0
				6: 0
				7: 0
				9: 0
				10: inline1:0
				1: 0
				10: inline2:0
				1: 0
				_Z3bazi:0:0
				1: 0

llvm/test/tools/llvm-profdata/sample-overlap.test

This file was added.

				; RUN: llvm-profdata overlap --sample %S/Inputs/sample-overlap-0.proftext %S/Inputs/sample-overlap-0.proftext | FileCheck %s --check-prefix=OVERLAP0 --match-full-lines --strict-whitespace
				; OVERLAP0:Program level:
				; OVERLAP0: Whole program profile similarity: 100.000%
				; OVERLAP0: Whole program sample overlap: 100.000%
				; OVERLAP0: percentage of samples unique in base profile: 0.000%
				; OVERLAP0: percentage of samples unique in test profile: 0.000%
				; OVERLAP0: total samples in base profile: 13943
				; OVERLAP0: total samples in test profile: 13943
				; OVERLAP0: Function overlap: 100.000%
				; OVERLAP0: overlap functions: 4
				; OVERLAP0: functions unique in base profile: 0
				; OVERLAP0: functions unique in test profile: 0
				; OVERLAP0: Hot-function overlap: 100.000%
				; OVERLAP0: overlap hot functions: 4
				; OVERLAP0: hot functions unique in base profile: 0
				; OVERLAP0: hot functions unique in test profile: 0
				; OVERLAP0: Hot-block overlap: 100.000%
				; OVERLAP0: overlap hot blocks: 12
				; OVERLAP0: hot blocks unique in base profile: 0
				; OVERLAP0: hot blocks unique in test profile: 0

				; RUN: llvm-profdata overlap --sample %S/Inputs/sample-overlap-0.proftext %S/Inputs/sample-overlap-1.proftext | FileCheck %s --check-prefix=OVERLAP1 --match-full-lines --strict-whitespace
				; OVERLAP1:Program level:
				; OVERLAP1: Whole program profile similarity: 100.000%
				; OVERLAP1: Whole program sample overlap: 10.000%
				; OVERLAP1: percentage of samples unique in base profile: 0.000%
				; OVERLAP1: percentage of samples unique in test profile: 0.000%
				; OVERLAP1: total samples in base profile: 13943
				; OVERLAP1: total samples in test profile: 139430
				; OVERLAP1: Function overlap: 100.000%
				; OVERLAP1: overlap functions: 4
				; OVERLAP1: functions unique in base profile: 0
				; OVERLAP1: functions unique in test profile: 0
				; OVERLAP1: Hot-function overlap: 100.000%
				; OVERLAP1: overlap hot functions: 4
				; OVERLAP1: hot functions unique in base profile: 0
				; OVERLAP1: hot functions unique in test profile: 0
				; OVERLAP1: Hot-block overlap: 100.000%
				; OVERLAP1: overlap hot blocks: 12
				; OVERLAP1: hot blocks unique in base profile: 0
				; OVERLAP1: hot blocks unique in test profile: 0

				; RUN: llvm-profdata overlap --sample --similarity-cutoff=800000 %S/Inputs/sample-overlap-0.proftext %S/Inputs/sample-overlap-2.proftext | FileCheck %s --check-prefix=OVERLAP2 --match-full-lines --strict-whitespace
				; OVERLAP2:Program level:
				; OVERLAP2: Whole program profile similarity: 63.720%
				; OVERLAP2: Whole program sample overlap: 29.649%
				; OVERLAP2: percentage of samples unique in base profile: 0.000%
				; OVERLAP2: percentage of samples unique in test profile: 0.000%
				; OVERLAP2: total samples in base profile: 13943
				; OVERLAP2: total samples in test profile: 4134
				; OVERLAP2: Function overlap: 100.000%
				; OVERLAP2: overlap functions: 4
				; OVERLAP2: functions unique in base profile: 0
				; OVERLAP2: functions unique in test profile: 0
				; OVERLAP2: Hot-function overlap: 100.000%
				; OVERLAP2: overlap hot functions: 4
				; OVERLAP2: hot functions unique in base profile: 0
				; OVERLAP2: hot functions unique in test profile: 0
				; OVERLAP2: Hot-block overlap: 100.000%
				; OVERLAP2: overlap hot blocks: 12
				; OVERLAP2: hot blocks unique in base profile: 0
				; OVERLAP2: hot blocks unique in test profile: 0
				; OVERLAP2:Function-level details:
				; OVERLAP2:Base weight Test weight Similarity Overlap Base unique Test unique Base samples Test samples Function name
				; OVERLAP2:78.15% 26.29% 48.09% 9.98% 0.00% 0.00% 10896 1087 main
				; OVERLAP2:10.31% 34.76% 75.55% 100.00% 0.00% 0.00% 1437 1437 _Z3bari

				; RUN: llvm-profdata overlap --sample --value-cutoff=1000 %S/Inputs/sample-overlap-0.proftext %S/Inputs/sample-overlap-3.proftext | FileCheck %s --check-prefix=OVERLAP3 --match-full-lines --strict-whitespace
				; OVERLAP3:Program level:
				; OVERLAP3: Whole program profile similarity: 14.301%
				; OVERLAP3: Whole program sample overlap: 6.040%
				; OVERLAP3: percentage of samples unique in base profile: 82.522%
				; OVERLAP3: percentage of samples unique in test profile: 88.216%
				; OVERLAP3: total samples in base profile: 13943
				; OVERLAP3: total samples in test profile: 13043
				; OVERLAP3: Function overlap: 33.333%
				; OVERLAP3: overlap functions: 2
				; OVERLAP3: functions unique in base profile: 2
				; OVERLAP3: functions unique in test profile: 2
				; OVERLAP3: Hot-function overlap: 16.667%
				; OVERLAP3: overlap hot functions: 1
				; OVERLAP3: hot functions unique in base profile: 3
				; OVERLAP3: hot functions unique in test profile: 2
				; OVERLAP3: Hot-block overlap: 4.545%
				; OVERLAP3: overlap hot blocks: 1
				; OVERLAP3: hot blocks unique in base profile: 11
				; OVERLAP3: hot blocks unique in test profile: 10
				; OVERLAP3:Function-level details:
				; OVERLAP3:Base weight Test weight Similarity Overlap Base unique Test unique Base samples Test samples Function name
				; OVERLAP3:10.31% 11.02% 99.29% 100.00% 0.00% 0.00% 1437 1437 _Z3bari
				; OVERLAP3:0.00% 83.54% 0.00% 0.00% 0.00% 100.00% 0 10896 main2

				; RUN: llvm-profdata overlap --sample --function=main %S/Inputs/sample-overlap-0.proftext %S/Inputs/sample-overlap-4.proftext | FileCheck %s --check-prefix=OVERLAP4 --match-full-lines --strict-whitespace
				; OVERLAP4:Program level:
				; OVERLAP4: Whole program profile similarity: 17.302%
				; OVERLAP4: Whole program sample overlap: 8.134%
				; OVERLAP4: percentage of samples unique in base profile: 73.542%
				; OVERLAP4: percentage of samples unique in test profile: 82.209%
				; OVERLAP4: total samples in base profile: 13943
				; OVERLAP4: total samples in test profile: 10213
				; OVERLAP4: Function overlap: 100.000%
				; OVERLAP4: overlap functions: 4
				; OVERLAP4: functions unique in base profile: 0
				; OVERLAP4: functions unique in test profile: 0
				; OVERLAP4: Hot-function overlap: 100.000%
				; OVERLAP4: overlap hot functions: 4
				; OVERLAP4: hot functions unique in base profile: 0
				; OVERLAP4: hot functions unique in test profile: 0
				; OVERLAP4: Hot-block overlap: 14.286%
				; OVERLAP4: overlap hot blocks: 3
				; OVERLAP4: hot blocks unique in base profile: 9
				; OVERLAP4: hot blocks unique in test profile: 9
				; OVERLAP4:Function-level details:
				; OVERLAP4:Base weight Test weight Similarity Overlap Base unique Test unique Base samples Test samples Function name
				; OVERLAP4:78.15% 70.17% 23.33% 11.18% 66.14% 74.64% 10896 7166 main

				; RUN: llvm-profdata overlap --sample %S/Inputs/sample-overlap-0.proftext %S/Inputs/sample-overlap-5.proftext | FileCheck %s --check-prefix=OVERLAP5 --match-full-lines --strict-whitespace
				; OVERLAP5:Sum of sample counts for profile {{.*}}/Inputs/sample-overlap-5.proftext is 0.

llvm/tools/llvm-profdata/llvm-profdata.cpp

		if (Overlap.Test.CountSum < 1.0f) {
		exit(0);
		}
		loadInput(WeightedInput, nullptr, &Context);
		overlapInput(BaseFilename, TestFilename, &Context, Overlap, FuncFilter, OS,
		IsCS);
		Overlap.dump(OS);
		}

		namespace {
		struct SampleOverlapStats {
		StringRef BaseName;
		StringRef TestName;
		// Number of overlap units
		uint64_t OverlapCount;
		// Total samples of overlap units
		uint64_t OverlapSample;
		// Number of and total samples of units that only present in base or test
		// profile
		uint64_t BaseUniqueCount;
		uint64_t BaseUniqueSample;
		uint64_t TestUniqueCount;
		uint64_t TestUniqueSample;
		// Number of units and total samples in base or test profile
		uint64_t BaseCount;
		uint64_t BaseSample;
		uint64_t TestCount;
		uint64_t TestSample;
		// Number of and total samples of units that present in at least one profile
		uint64_t UnionCount;
		uint64_t UnionSample;
		// Weighted similarity
		double Similarity;
		// For SampleOverlapStats instances representing functions, weights of the
		// function in base and test profiles
		double BaseWeight;
		double TestWeight;

		SampleOverlapStats()
		: OverlapCount(0), OverlapSample(0), BaseUniqueCount(0),
		BaseUniqueSample(0), TestUniqueCount(0), TestUniqueSample(0),
		BaseCount(0), BaseSample(0), TestCount(0), TestSample(0), UnionCount(0),
		UnionSample(0), Similarity(0.0), BaseWeight(0.0), TestWeight(0.0) {}
		};
		} // end anonymous namespace

		namespace {
		struct FuncSampleStats {
		uint64_t SampleSum;
		uint64_t MaxSample;
		uint64_t HotBlockCount;
		FuncSampleStats() : SampleSum(0), MaxSample(0), HotBlockCount(0) {}
		FuncSampleStats(uint64_t SampleSum, uint64_t MaxSample,
		uint64_t HotBlockCount)
		: SampleSum(SampleSum), MaxSample(MaxSample),
		HotBlockCount(HotBlockCount) {}
		};
		} // end anonymous namespace

		namespace {
		enum MatchStatus { MS_Match, MS_FirstUnique, MS_SecondUnique, MS_None };

		// Class for updating merging steps for two sorted maps. The class should be
		// instantiated with a map iterator type.
		template <class T> class MatchStep {
		public:
		MatchStep() = delete;

		MatchStep(T FirstIter, T FirstEnd, T SecondIter, T SecondEnd)
		: FirstIter(FirstIter), FirstEnd(FirstEnd), SecondIter(SecondIter),
		SecondEnd(SecondEnd), Status(MS_None) {}

		bool areBothFinished() const {
		return (FirstIter == FirstEnd && SecondIter == SecondEnd);
		}

		bool isFirstFinished() const { return FirstIter == FirstEnd; }

		bool isSecondFinished() const { return SecondIter == SecondEnd; }

		/// Advance one step based on the previous match status unless the previous
		/// status is MS_None. Then update Status based on the comparison between two
		/// container iterators at the current step. If the previous status is
		/// MS_None, it means two iterators are at the beginning and no comparison has
		/// been made, so we simply update Status without advancing the iterators.
		void updateOneStep();

		T getFirstIter() const { return FirstIter; }

		T getSecondIter() const { return SecondIter; }

		MatchStatus getMatchStatus() const { return Status; }

		private:
		// Current iterator and end iterator of the first container.
		T FirstIter;
		T FirstEnd;
		// Current iterator and end iterator of the second container.
		T SecondIter;
		T SecondEnd;
		// Match status of the current step.
		MatchStatus Status;
		};
		} // end anonymous namespace

		template <class T> void MatchStep<T>::updateOneStep() {
		switch (Status) {
		case MS_Match:
		++FirstIter;
		++SecondIter;
		break;
		case MS_FirstUnique:
		++FirstIter;
		break;
		case MS_SecondUnique:
		++SecondIter;
		break;
		case MS_None:
		break;
		}

		// Update Status according to iterators at the current step.
		if (areBothFinished())
		return;
		if (FirstIter != FirstEnd &&
		(SecondIter == SecondEnd || FirstIter->first < SecondIter->first))
		Status = MS_FirstUnique;
		else if (SecondIter != SecondEnd &&
		(FirstIter == FirstEnd || SecondIter->first < FirstIter->first))
		Status = MS_SecondUnique;
		else
		Status = MS_Match;
		}

		// Return the sum of line/block samples, the max line/block sample, and the
		// number of line/block samples above the given threshold in a function
		// including its inlinees.
		static void getFuncSampleStats(const sampleprof::FunctionSamples &Func,
		FuncSampleStats &FuncStats,
		uint64_t HotThreshold) {
		for (const auto &L : Func.getBodySamples()) {
		uint64_t Sample = L.second.getSamples();
		FuncStats.SampleSum += Sample;
		FuncStats.MaxSample = std::max(FuncStats.MaxSample, Sample);
		if (Sample >= HotThreshold)
		++FuncStats.HotBlockCount;
		}

		for (const auto &C : Func.getCallsiteSamples()) {
		for (const auto &F : C.second)
		getFuncSampleStats(F.second, FuncStats, HotThreshold);
		}
		}

		/// Predicate that determines if a function is hot with a given threshold. We
		/// keep it separate from its callsites for possible extension in the future.
		static bool isFunctionHot(const FuncSampleStats &FuncStats,
		uint64_t HotThreshold) {
		// We intentionally compare the maximum sample count in a function with the
		// HotThreshold to get an approximate determination on hot functions.
		return (FuncStats.MaxSample >= HotThreshold);
		}

		namespace {
		class SampleOverlapAggregator {
		public:
		SampleOverlapAggregator(const std::string &BaseFilename,
		const std::string &TestFilename,
		double LowSimilarityThreshold, double Epsilon,
		const OverlapFuncFilters &FuncFilter)
		: BaseFilename(BaseFilename), TestFilename(TestFilename),
		LowSimilarityThreshold(LowSimilarityThreshold), Epsilon(Epsilon),
		FuncFilter(FuncFilter) {}

		/// Detect 0-sample input profile and report to output stream. This interface
		/// should be called after loadProfiles().
		bool detectZeroSampleProfile(raw_fd_ostream &OS) const;

		/// Write out function-level similarity statistics for functions specified by
		/// options --function, --value-cutoff, and --similarity-cutoff.
		void dumpFuncSimilarity(raw_fd_ostream &OS) const;

		/// Write out program-level similarity and overlap statistics.
		void dumpProgramSummary(raw_fd_ostream &OS) const;

		/// Write out hot-function and hot-block statistics for base_profile,
		/// test_profile, and their overlap. For both cases, the overlap HO is
		/// calculated as follows:
		/// Given the number of functions (or blocks) that are hot in both profiles
		/// HCommon and the number of functions (or blocks) that are hot in at
		/// least one profile HUnion, HO = HCommon / HUnion.
		void dumpHotFuncAndBlockOverlap(raw_fd_ostream &OS) const;

		/// This function tries matching functions in base and test profiles. For each
		/// pair of matched functions, it aggregates the function-level
		/// similarity into a profile-level similarity. It also dump function-level
		/// similarity information of functions specified by --function,
		/// --value-cutoff, and --similarity-cutoff options. The program-level
		/// similarity PS is computed as follows:
		/// Given function-level similarity FS(A) for all function A, the
		/// weight of function A in base profile WB(A), and the weight of function
		/// A in test profile WT(A), compute PS(base_profile, test_profile) =
		/// sum_A(FS(A) * avg(WB(A), WT(A))) ranging in [0.0f to 1.0f] with 0.0
		/// meaning no-overlap.
		void computeSampleProfileOverlap(raw_fd_ostream &OS);

		/// Initialize ProfOverlap with the sum of samples in base and test
		/// profiles. This function also computes and keeps the sum of samples and
		/// max sample counts of each function in BaseStats and TestStats for later
		/// use to avoid re-computations.
		void initializeSampleProfileOverlap();

		/// Load profiles specified by BaseFilename and TestFilename.
		std::error_code loadProfiles();

		private:
		SampleOverlapStats ProfOverlap;
		SampleOverlapStats HotFuncOverlap;
		SampleOverlapStats HotBlockOverlap;
		std::string BaseFilename;
		std::string TestFilename;
		std::unique_ptr<sampleprof::SampleProfileReader> BaseReader;
		std::unique_ptr<sampleprof::SampleProfileReader> TestReader;
		// BaseStats and TestStats hold FuncSampleStats for each function, with
		// function name as the key.
		StringMap<FuncSampleStats> BaseStats;
		StringMap<FuncSampleStats> TestStats;
		// Low similarity threshold in floating point number
		double LowSimilarityThreshold;
		// Block samples above BaseHotThreshold or TestHotThreshold are considered hot
		// for tracking hot blocks.
		uint64_t BaseHotThreshold;
		uint64_t TestHotThreshold;
		// A small threshold used to round the results of floating point accumulations
		// to resolve imprecision.
		const double Epsilon;
		std::multimap<double, SampleOverlapStats, std::greater<double>>
		FuncSimilarityDump;
		// FuncFilter carries specifications in options --value-cutoff and
		// --function.
		OverlapFuncFilters FuncFilter;
		// Column offsets for printing the function-level details table.
		static const unsigned int TestWeightCol = 15;
		static const unsigned int SimilarityCol = 30;
		static const unsigned int OverlapCol = 43;
		static const unsigned int BaseUniqueCol = 53;
		static const unsigned int TestUniqueCol = 67;
		static const unsigned int BaseSampleCol = 81;
		static const unsigned int TestSampleCol = 96;
		static const unsigned int FuncNameCol = 111;

		/// Return a similarity of two line/block sample counters in the same
		/// function in base and test profiles. The line/block-similarity BS(i) is
		/// computed as follows:
		/// For an offset i, given the sample count at i in base profile BB(i),
		/// the sample count at i in test profile BT(i), the sum of sample counts
		/// in this function in base profile SB, and the sum of sample counts in
		/// this function in test profile ST, compute BS(i) = 1.0 - fabs(BB(i)/SB -
		/// BT(i)/ST), ranging in [0.0f to 1.0f] with 0.0 meaning no-overlap.
		double computeBlockSimilarity(uint64_t BaseSample, uint64_t TestSample,
		const SampleOverlapStats &FuncOverlap) const;

		void updateHotBlockOverlap(uint64_t BaseSample, uint64_t TestSample,
		uint64_t HotBlockCount);

		void getHotFunctions(const StringMap<FuncSampleStats> &ProfStats,
		StringMap<FuncSampleStats> &HotFunc,
		uint64_t HotThreshold) const;

		void computeHotFuncOverlap();

		/// This function updates statistics in FuncOverlap, HotBlockOverlap, and
		/// Difference for two sample units in a matched function according to the
		/// given match status.
		void updateOverlapStatsForFunction(uint64_t BaseSample, uint64_t TestSample,
		uint64_t HotBlockCount,
		SampleOverlapStats &FuncOverlap,
		double &Difference, MatchStatus Status);

		/// This function updates statistics in FuncOverlap, HotBlockOverlap, and
		/// Difference for unmatched callees that are only present in one profile in a
		/// matched caller function.
		void updateForUnmatchedCallee(const sampleprof::FunctionSamples &Func,
		SampleOverlapStats &FuncOverlap,
		double &Difference, MatchStatus Status);

		/// This function updates sample overlap statistics of an overlap function in
		/// base and test profile. It also calculates a function-internal similarity
		/// FIS as follows:
		/// For offsets i that have samples in at least one profile in this
		/// function A, given BS(i) returned by computeBlockSimilarity(), compute
		/// FIS(A) = (2.0 - sum_i(1.0 - BS(i))) / 2, ranging in [0.0f to 1.0f] with
		/// 0.0 meaning no overlap.
		double computeSampleFunctionInternalOverlap(
		const sampleprof::FunctionSamples &BaseFunc,
		const sampleprof::FunctionSamples &TestFunc,
		SampleOverlapStats &FuncOverlap);

		/// Function-level similarity (FS) is a weighted value over function internal
		/// similarity (FIS). This function computes a function's FS from its FIS by
		/// applying the weight.
		double weightForFuncSimilarity(double FuncSimilarity, uint64_t BaseFuncSample,
		uint64_t TestFuncSample) const;

		/// The function-level similarity FS(A) for a function A is computed as
		/// follows:
		/// Compute a function-internal similarity FIS(A) by
		/// computeSampleFunctionInternalOverlap(). Then, with the weight of
		/// function A in base profile WB(A), and the weight of function A in test
		/// profile WT(A), compute FS(A) = FIS(A) * (1.0 - fabs(WB(A) - WT(A)))
		hoyFB: I think you want just `const auto Match = ...`. The return value will be returned by reference if it is large.
		weihe (author): Thank you for the suggestion! I've changed the code accordingly.
		/// ranging in [0.0f to 1.0f] with 0.0 meaning no overlap.
		double
		hoyFB: May just increase `HotFuncOverlap.UnionCount` before continue, instead of erasing every matched function from `TestHotFunc`? Like:
		    for (const auto &F : TestHotFunc) {
		      if (BaseHotFunc.count(F.first()))
		        ++HotFuncOverlap.OverlapCount;
		      else
		        ++HotFuncOverlap.UnionCount;
		    }
		weihe (author): This suggestion is really good! The refactored code is much cleaner. Thank you very much!
		computeSampleFunctionOverlap(const sampleprof::FunctionSamples *BaseFunc,
		const sampleprof::FunctionSamples *TestFunc,
		SampleOverlapStats *FuncOverlap,
		uint64_t BaseFuncSample,
		wmi: Both weightFuncSimilarity and weightByImportance consider the weight of the function in the profile, so at the beginning I was confused about what the difference between them is. I found that weightFuncSimilarity absorbs the weight difference into the similarity, so the similarity is still in the range of 0~1, while weightByImportance multiplies the similarity by the weight ratio of the function in the profile (the average ratio of the two profiles), so the aggregate similarity of all the functions in the profiles will be in the range of 0~1. Please correct me if I am wrong. But it is better to make the intention of these two functions clearer in the comments.
		weihe (author): Yes, that's right! In fact, the previous `weightFuncSimilarity()` is part of the computation of "function-level similarity", whereas `weightByImportance()` is part of the computation that aggregates "function-level similarity" to "profile-level similarity". So the two functions use weights in slightly different ways. I also realized these two functions may be confusing, so I grouped the previous `weightFuncSimilarity()` (renamed as `weightForFuncSimilarity()`) and `computeSampleFunctionInternalOverlap()` into one `computeSampleFunctionOverlap()` function. In addition, I added comments to `weightForFuncSimilarity()` and `weightByImportance()` to explain the different purposes of these functions. Thank you for pointing this out!
		uint64_t TestFuncSample);

		/// Profile-level similarity (PS) is a weighted aggregate over function-level
		/// similarities (FS). This method weights the FS value by the function
		/// weights in the base and test profiles for the aggregation.
		double weightByImportance(double FuncSimilarity, uint64_t BaseFuncSample,
		uint64_t TestFuncSample) const;
		};
		} // end anonymous namespace

		bool SampleOverlapAggregator::detectZeroSampleProfile(
		raw_fd_ostream &OS) const {
		bool HaveZeroSample = false;
		if (ProfOverlap.BaseSample == 0) {
		OS << "Sum of sample counts for profile " << BaseFilename << " is 0.\n";
		HaveZeroSample = true;
		}
		if (ProfOverlap.TestSample == 0) {
		OS << "Sum of sample counts for profile " << TestFilename << " is 0.\n";
		HaveZeroSample = true;
		wmi: Seemingly a lot of the complexity of the function comes from lock step iteration of the maps from two profiles at the same time. Could you extract the lock step iteration logic into a separate class? This way we don't have to deal with the logic multiple times in iterating BodySampleMap, CallsiteSamplesMap and FunctionSamplesMap.
		weihe (author): I have extracted the logic of lock step iteration to class `MatchStep`. This is a great suggestion. Thank you very much!
		}
		return HaveZeroSample;
		}

		double SampleOverlapAggregator::computeBlockSimilarity(
		uint64_t BaseSample, uint64_t TestSample,
		const SampleOverlapStats &FuncOverlap) const {
		double BaseFrac = 0.0;
		double TestFrac = 0.0;
		if (FuncOverlap.BaseSample > 0)
		BaseFrac = static_cast<double>(BaseSample) / FuncOverlap.BaseSample;
		if (FuncOverlap.TestSample > 0)
		TestFrac = static_cast<double>(TestSample) / FuncOverlap.TestSample;
		return 1.0 - std::fabs(BaseFrac - TestFrac);
		}
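
The BS(i) formula implemented by `computeBlockSimilarity()` can be checked by hand with a small standalone sketch. The function name `blockSimilarity` and its four-argument signature are assumptions made for illustration; the patch itself reads the function totals from `SampleOverlapStats`.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical standalone version of BS(i) = 1.0 - |BB(i)/SB - BT(i)/ST|.
// This is a sketch for checking the arithmetic, not the patch's API.
double blockSimilarity(std::uint64_t BaseSample, std::uint64_t TestSample,
                       std::uint64_t BaseFuncTotal,
                       std::uint64_t TestFuncTotal) {
  // A side with no samples in the function contributes a fraction of 0,
  // mirroring the guards in computeBlockSimilarity().
  double BaseFrac =
      BaseFuncTotal > 0 ? static_cast<double>(BaseSample) / BaseFuncTotal : 0.0;
  double TestFrac =
      TestFuncTotal > 0 ? static_cast<double>(TestSample) / TestFuncTotal : 0.0;
  return 1.0 - std::fabs(BaseFrac - TestFrac);
}
```

For instance, a block holding 30 of 100 base samples and 10 of 100 test samples has fractions 0.3 and 0.1, which gives a similarity of about 0.8. A block with the same relative share in both profiles (50 of 100 versus 100 of 200) scores 1.0 even though the raw counts differ.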

		void SampleOverlapAggregator::updateHotBlockOverlap(uint64_t BaseSample,
		uint64_t TestSample,
		uint64_t HotBlockCount) {
		bool IsBaseHot = (BaseSample >= BaseHotThreshold);
		bool IsTestHot = (TestSample >= TestHotThreshold);
		if (!IsBaseHot && !IsTestHot)
		return;

		wmi: updateForUnmatchedBlock is a special case of the block above. We may be able to share the code.
		weihe (author): Thank you for the suggestion! I combined this code with `updateForUnmatchedBlock()` into a new function `updateOverlapStatsForFunction()`.
		HotBlockOverlap.UnionCount += HotBlockCount;
		if (IsBaseHot)
		HotBlockOverlap.BaseCount += HotBlockCount;
		if (IsTestHot)
		HotBlockOverlap.TestCount += HotBlockCount;
		if (IsBaseHot && IsTestHot)
		HotBlockOverlap.OverlapCount += HotBlockCount;
		}

		void SampleOverlapAggregator::getHotFunctions(
		const StringMap<FuncSampleStats> &ProfStats,
		StringMap<FuncSampleStats> &HotFunc, uint64_t HotThreshold) const {
		for (const auto &F : ProfStats) {
		if (isFunctionHot(F.second, HotThreshold))
		HotFunc.try_emplace(F.first(), F.second);
		}
		}

		void SampleOverlapAggregator::computeHotFuncOverlap() {
		StringMap<FuncSampleStats> BaseHotFunc;
		getHotFunctions(BaseStats, BaseHotFunc, BaseHotThreshold);
		HotFuncOverlap.BaseCount = BaseHotFunc.size();

		StringMap<FuncSampleStats> TestHotFunc;
		getHotFunctions(TestStats, TestHotFunc, TestHotThreshold);
		HotFuncOverlap.TestCount = TestHotFunc.size();
		HotFuncOverlap.UnionCount = HotFuncOverlap.TestCount;

		for (const auto &F : BaseHotFunc) {
		if (TestHotFunc.count(F.first()))
		++HotFuncOverlap.OverlapCount;
		else
		++HotFuncOverlap.UnionCount;
		}
		}
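
The counting scheme in `computeHotFuncOverlap()` starts the union at the size of one set and adds only the names unique to the other. A minimal sketch of that idea follows, using `std::set<std::string>` as a stand-in for the patch's `StringMap`; the names `HotOverlap` and `countHotOverlap` are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical result type; the patch stores these in SampleOverlapStats.
struct HotOverlap {
  std::size_t OverlapCount = 0; // names present in both sets
  std::size_t UnionCount = 0;   // names present in at least one set
};

HotOverlap countHotOverlap(const std::set<std::string> &BaseHot,
                           const std::set<std::string> &TestHot) {
  HotOverlap R;
  // Every test name is in the union; base names add to it only when unique.
  R.UnionCount = TestHot.size();
  for (const auto &Name : BaseHot) {
    if (TestHot.count(Name))
      ++R.OverlapCount;
    else
      ++R.UnionCount;
  }
  return R;
}
```

With base hot functions {a, b, c} and test hot functions {b, c, d, e}, this yields an overlap of 2 and a union of 5, without erasing anything from either set.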

		void SampleOverlapAggregator::updateOverlapStatsForFunction(
		uint64_t BaseSample, uint64_t TestSample, uint64_t HotBlockCount,
		SampleOverlapStats &FuncOverlap, double &Difference, MatchStatus Status) {
		assert(Status != MS_None &&
		"Match status should be updated before updating overlap statistics");
		if (Status == MS_FirstUnique) {
		TestSample = 0;
		FuncOverlap.BaseUniqueSample += BaseSample;
		} else if (Status == MS_SecondUnique) {
		BaseSample = 0;
		FuncOverlap.TestUniqueSample += TestSample;
		} else {
		++FuncOverlap.OverlapCount;
		}

		FuncOverlap.UnionSample += std::max(BaseSample, TestSample);
		FuncOverlap.OverlapSample += std::min(BaseSample, TestSample);
		Difference +=
		1.0 - computeBlockSimilarity(BaseSample, TestSample, FuncOverlap);
		updateHotBlockOverlap(BaseSample, TestSample, HotBlockCount);
		}

		void SampleOverlapAggregator::updateForUnmatchedCallee(
		const sampleprof::FunctionSamples &Func, SampleOverlapStats &FuncOverlap,
		double &Difference, MatchStatus Status) {
		assert((Status == MS_FirstUnique || Status == MS_SecondUnique) &&
		"Status must be either of the two unmatched cases");
		FuncSampleStats FuncStats;
		if (Status == MS_FirstUnique) {
		getFuncSampleStats(Func, FuncStats, BaseHotThreshold);
		updateOverlapStatsForFunction(FuncStats.SampleSum, 0,
		FuncStats.HotBlockCount, FuncOverlap,
		Difference, Status);
		} else {
		getFuncSampleStats(Func, FuncStats, TestHotThreshold);
		updateOverlapStatsForFunction(0, FuncStats.SampleSum,
		FuncStats.HotBlockCount, FuncOverlap,
		Difference, Status);
		}
		}

		double SampleOverlapAggregator::computeSampleFunctionInternalOverlap(
		const sampleprof::FunctionSamples &BaseFunc,
		const sampleprof::FunctionSamples &TestFunc,
		SampleOverlapStats &FuncOverlap) {

		using namespace sampleprof;

		double Difference = 0;

		// Accumulate Difference for regular line/block samples in the function.
		// We match them through sort-merge join algorithm because
		// FunctionSamples::getBodySamples() returns a map of sample counters ordered
		// by their offsets.
		MatchStep<BodySampleMap::const_iterator> BlockIterStep(
		BaseFunc.getBodySamples().cbegin(), BaseFunc.getBodySamples().cend(),
		TestFunc.getBodySamples().cbegin(), TestFunc.getBodySamples().cend());
		BlockIterStep.updateOneStep();
		while (!BlockIterStep.areBothFinished()) {
		uint64_t BaseSample =
		BlockIterStep.isFirstFinished()
		? 0
		: BlockIterStep.getFirstIter()->second.getSamples();
		uint64_t TestSample =
		BlockIterStep.isSecondFinished()
		? 0
		: BlockIterStep.getSecondIter()->second.getSamples();
		updateOverlapStatsForFunction(BaseSample, TestSample, 1, FuncOverlap,
		Difference, BlockIterStep.getMatchStatus());

		BlockIterStep.updateOneStep();
		}

		// Accumulate Difference for callsite lines in the function. We match
		// them through sort-merge algorithm because
		// FunctionSamples::getCallsiteSamples() returns a map of callsite records
		// ordered by their offsets.
		MatchStep<CallsiteSampleMap::const_iterator> CallsiteIterStep(
		BaseFunc.getCallsiteSamples().cbegin(),
		BaseFunc.getCallsiteSamples().cend(),
		TestFunc.getCallsiteSamples().cbegin(),
		TestFunc.getCallsiteSamples().cend());
		CallsiteIterStep.updateOneStep();
		while (!CallsiteIterStep.areBothFinished()) {
		MatchStatus CallsiteStepStatus = CallsiteIterStep.getMatchStatus();
		assert(CallsiteStepStatus != MS_None &&
		"Match status should be updated before entering loop body");

		if (CallsiteStepStatus != MS_Match) {
		auto Callsite = (CallsiteStepStatus == MS_FirstUnique)
		? CallsiteIterStep.getFirstIter()
		: CallsiteIterStep.getSecondIter();
		for (const auto &F : Callsite->second)
		updateForUnmatchedCallee(F.second, FuncOverlap, Difference,
		CallsiteStepStatus);
		} else {
		// There may be multiple inlinees at the same offset, so we need to try
		// matching all of them. This match is implemented through sort-merge
		// algorithm because callsite records at the same offset are ordered by
		// function names.
		MatchStep<FunctionSamplesMap::const_iterator> CalleeIterStep(
		CallsiteIterStep.getFirstIter()->second.cbegin(),
		CallsiteIterStep.getFirstIter()->second.cend(),
		CallsiteIterStep.getSecondIter()->second.cbegin(),
		CallsiteIterStep.getSecondIter()->second.cend());
		CalleeIterStep.updateOneStep();
		while (!CalleeIterStep.areBothFinished()) {
		MatchStatus CalleeStepStatus = CalleeIterStep.getMatchStatus();
		if (CalleeStepStatus != MS_Match) {
		auto Callee = (CalleeStepStatus == MS_FirstUnique)
		? CalleeIterStep.getFirstIter()
		: CalleeIterStep.getSecondIter();
		updateForUnmatchedCallee(Callee->second, FuncOverlap, Difference,
		CalleeStepStatus);
		} else {
		// An inlined function can contain other inlinees inside, so compute
		// the Difference recursively.
		Difference += 2.0 - 2 * computeSampleFunctionInternalOverlap(
		CalleeIterStep.getFirstIter()->second,
		CalleeIterStep.getSecondIter()->second,
		FuncOverlap);
		}
		CalleeIterStep.updateOneStep();
		}
		}
		CallsiteIterStep.updateOneStep();
		}

		// Difference reflects the total differences of line/block samples in this
		// function and ranges in [0.0f to 2.0f]. Take (2.0 - Difference) / 2 to
		// reflect the similarity between function profiles in [0.0f to 1.0f].
		return (2.0 - Difference) / 2;
		}
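
The `MatchStep` class used above is not shown in this part of the diff. The lock-step, sort-merge traversal it is described as performing can be sketched over two ordered `std::map`s as below. The names `JoinStats`, `SampleMap`, and `sortMergeJoin` are assumptions for illustration, and the sketch only tracks the min/max sample sums rather than the full statistics of the patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>

// Hypothetical summary of one sort-merge pass over two ordered maps.
struct JoinStats {
  std::uint64_t OverlapSample = 0; // sum of min(base, test) per offset
  std::uint64_t UnionSample = 0;   // sum of max(base, test) per offset
  unsigned Matched = 0;
  unsigned BaseOnly = 0;
  unsigned TestOnly = 0;
};

using SampleMap = std::map<unsigned, std::uint64_t>; // offset -> sample count

JoinStats sortMergeJoin(const SampleMap &Base, const SampleMap &Test) {
  JoinStats S;
  auto B = Base.begin();
  auto T = Test.begin();
  while (B != Base.end() || T != Test.end()) {
    std::uint64_t BaseSample = 0, TestSample = 0;
    if (T == Test.end() || (B != Base.end() && B->first < T->first)) {
      // Offset exists only in the base profile.
      BaseSample = B->second;
      ++S.BaseOnly;
      ++B;
    } else if (B == Base.end() || T->first < B->first) {
      // Offset exists only in the test profile.
      TestSample = T->second;
      ++S.TestOnly;
      ++T;
    } else {
      // Same offset in both profiles: advance both iterators.
      BaseSample = B->second;
      TestSample = T->second;
      ++S.Matched;
      ++B;
      ++T;
    }
    S.UnionSample += std::max(BaseSample, TestSample);
    S.OverlapSample += std::min(BaseSample, TestSample);
  }
  return S;
}
```

Because both maps are ordered by key, each entry is visited exactly once, so the pass is linear in the combined size of the two maps.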

		double SampleOverlapAggregator::weightForFuncSimilarity(
		double FuncInternalSimilarity, uint64_t BaseFuncSample,
		uint64_t TestFuncSample) const {
		// Compute the weight as the distance between the function weights in two
		// profiles.
		double BaseFrac = 0.0;
		double TestFrac = 0.0;
		assert(ProfOverlap.BaseSample > 0 &&
		"Total samples in base profile should be greater than 0");
		BaseFrac = static_cast<double>(BaseFuncSample) / ProfOverlap.BaseSample;
		assert(ProfOverlap.TestSample > 0 &&
		"Total samples in test profile should be greater than 0");
		TestFrac = static_cast<double>(TestFuncSample) / ProfOverlap.TestSample;
		double WeightDistance = std::fabs(BaseFrac - TestFrac);

		// Take WeightDistance into the similarity.
		return FuncInternalSimilarity * (1 - WeightDistance);
		}

		double
		SampleOverlapAggregator::weightByImportance(double FuncSimilarity,
		uint64_t BaseFuncSample,
		uint64_t TestFuncSample) const {

		double BaseFrac = 0.0;
		double TestFrac = 0.0;
		assert(ProfOverlap.BaseSample > 0 &&
		"Total samples in base profile should be greater than 0");
		BaseFrac = static_cast<double>(BaseFuncSample) / ProfOverlap.BaseSample / 2.0;
		assert(ProfOverlap.TestSample > 0 &&
		"Total samples in test profile should be greater than 0");
		TestFrac = static_cast<double>(TestFuncSample) / ProfOverlap.TestSample / 2.0;
		return FuncSimilarity * (BaseFrac + TestFrac);
		}
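
The two weighting steps are easy to confuse, as the inline discussion notes, so a small numeric sketch may help. The helpers below take profile fractions WB(A) and WT(A) directly instead of raw sample counts, and their names (`funcSimilarity`, `importanceWeighted`) are hypothetical.

```cpp
#include <cmath>

// FS(A) = FIS(A) * (1 - |WB(A) - WT(A)|): penalizes a function whose share of
// the profile differs between base and test. The result stays in [0, 1].
double funcSimilarity(double FIS, double WB, double WT) {
  return FIS * (1.0 - std::fabs(WB - WT));
}

// Contribution of A to PS: FS(A) * avg(WB(A), WT(A)). Summed over all
// functions, the average weights add up to 1, so PS also stays in [0, 1].
double importanceWeighted(double FS, double WB, double WT) {
  return FS * (WB + WT) / 2.0;
}
```

For a function with FIS = 0.9 that holds 60% of base samples and 40% of test samples, FS is 0.9 * 0.8 = 0.72 and its contribution to PS is 0.72 * 0.5 = 0.36. For two identical profiles with functions weighted 0.6 and 0.4, each FS is 1 and the contributions sum to 1.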

		double SampleOverlapAggregator::computeSampleFunctionOverlap(
		const sampleprof::FunctionSamples *BaseFunc,
		const sampleprof::FunctionSamples *TestFunc,
		SampleOverlapStats *FuncOverlap, uint64_t BaseFuncSample,
		uint64_t TestFuncSample) {
		// Default function internal similarity before weighting, meaning the two
		// functions have no overlap.
		const double DefaultFuncInternalSimilarity = 0;
		double FuncSimilarity;
		double FuncInternalSimilarity;

		// If BaseFunc or TestFunc is nullptr, it means the functions do not overlap.
		// In this case, we use DefaultFuncInternalSimilarity as the function internal
		// similarity.
		if (!BaseFunc || !TestFunc) {
		FuncInternalSimilarity = DefaultFuncInternalSimilarity;
		} else {
		assert(FuncOverlap != nullptr &&
		"FuncOverlap should be provided in this case");
		FuncInternalSimilarity = computeSampleFunctionInternalOverlap(
		*BaseFunc, *TestFunc, *FuncOverlap);
		// Now, FuncInternalSimilarity may be a little less than 0 due to
		// imprecision of floating point accumulations. Make it zero if the
		// difference is below Epsilon.
		FuncInternalSimilarity = (std::fabs(FuncInternalSimilarity - 0) < Epsilon)
		? 0
		: FuncInternalSimilarity;
		}
		FuncSimilarity = weightForFuncSimilarity(FuncInternalSimilarity,
		BaseFuncSample, TestFuncSample);
		return FuncSimilarity;
		}

		void SampleOverlapAggregator::computeSampleProfileOverlap(raw_fd_ostream &OS) {
		using namespace sampleprof;

		StringMap<const FunctionSamples *> BaseFuncProf;
		const auto &BaseProfiles = BaseReader->getProfiles();
		for (const auto &BaseFunc : BaseProfiles) {
		BaseFuncProf.try_emplace(BaseFunc.second.getFuncName(), &(BaseFunc.second));
		}
		ProfOverlap.UnionCount = BaseFuncProf.size();

		const auto &TestProfiles = TestReader->getProfiles();
		for (const auto &TestFunc : TestProfiles) {
		SampleOverlapStats FuncOverlap;
		FuncOverlap.TestName = TestFunc.second.getFuncName();
		assert(TestStats.count(FuncOverlap.TestName) &&
		"TestStats should have records for all functions in test profile "
		"except inlinees");
		FuncOverlap.TestSample = TestStats[FuncOverlap.TestName].SampleSum;

		const auto Match = BaseFuncProf.find(FuncOverlap.TestName);
		if (Match == BaseFuncProf.end()) {
		const FuncSampleStats &FuncStats = TestStats[FuncOverlap.TestName];
		++ProfOverlap.TestUniqueCount;
		ProfOverlap.TestUniqueSample += FuncStats.SampleSum;
		FuncOverlap.TestUniqueSample = FuncStats.SampleSum;

		updateHotBlockOverlap(0, FuncStats.SampleSum, FuncStats.HotBlockCount);

		double FuncSimilarity = computeSampleFunctionOverlap(
		nullptr, nullptr, nullptr, 0, FuncStats.SampleSum);
		ProfOverlap.Similarity +=
		weightByImportance(FuncSimilarity, 0, FuncStats.SampleSum);

		++ProfOverlap.UnionCount;
		ProfOverlap.UnionSample += FuncStats.SampleSum;
		} else {
		++ProfOverlap.OverlapCount;

		// Two functions match with each other. Compute function-level overlap and
		// aggregate them into profile-level overlap.
		FuncOverlap.BaseName = Match->second->getFuncName();
		assert(BaseStats.count(FuncOverlap.BaseName) &&
		"BaseStats should have records for all functions in base profile "
		"except inlinees");
		FuncOverlap.BaseSample = BaseStats[FuncOverlap.BaseName].SampleSum;

		FuncOverlap.Similarity = computeSampleFunctionOverlap(
		Match->second, &TestFunc.second, &FuncOverlap, FuncOverlap.BaseSample,
		FuncOverlap.TestSample);
		ProfOverlap.Similarity +=
		weightByImportance(FuncOverlap.Similarity, FuncOverlap.BaseSample,
		FuncOverlap.TestSample);
		ProfOverlap.OverlapSample += FuncOverlap.OverlapSample;
		ProfOverlap.UnionSample += FuncOverlap.UnionSample;

		// Accumulate the percentage of base unique and test unique samples into
		// ProfOverlap.
		ProfOverlap.BaseUniqueSample += FuncOverlap.BaseUniqueSample;
		ProfOverlap.TestUniqueSample += FuncOverlap.TestUniqueSample;

		// Remove matched base functions for later reporting functions not found
		// in test profile.
		BaseFuncProf.erase(Match);
		aganea: `Match` is invalidated after this line, so it cannot be compared with `BaseFuncProf.end()` afterwards at L1607 and L1609. In a Debug build on Windows/MSVC this asserts in MS-STL. The following tests fail because of this:
		    LLVM :: tools/llvm-profdata/compact-sample.proftext
		    LLVM :: tools/llvm-profdata/sample-overlap.test
		The following patch seems to fix the issue, but I thought I'd let you decide what to do:
		diff --git a/llvm/tools/llvm-profdata/llvm-profdata.cpp b/llvm/tools/llvm-profdata/llvm-profdata.cpp
		index 488dc8fa4317..38d9cb9461bb 100644
		--- a/llvm/tools/llvm-profdata/llvm-profdata.cpp
		+++ b/llvm/tools/llvm-profdata/llvm-profdata.cpp
		@@ -1633,6 +1633,7 @@ void SampleOverlapAggregator::computeSampleProfileOverlap(raw_fd_ostream &OS) {
		            "except inlinees");
		     FuncOverlap.TestSample = TestStats[FuncOverlap.TestName].SampleSum;
		 
		+    bool Matched = false;
		     const auto Match = BaseFuncProf.find(FuncOverlap.TestName);
		     if (Match == BaseFuncProf.end()) {
		       const FuncSampleStats &FuncStats = TestStats[FuncOverlap.TestName];
		@@ -1677,6 +1678,8 @@ void SampleOverlapAggregator::computeSampleProfileOverlap(raw_fd_ostream &OS) {
		       // Remove matched base functions for later reporting functions not found
		       // in test profile.
		       BaseFuncProf.erase(Match);
		+
		+      Matched = true;
		     }
		 
		     // Print function-level similarity information if specified by options.
		@@ -1684,9 +1687,8 @@ void SampleOverlapAggregator::computeSampleProfileOverlap(raw_fd_ostream &OS) {
		            "TestStats should have records for all functions in test profile "
		            "except inlinees");
		     if (TestStats[FuncOverlap.TestName].MaxSample >= FuncFilter.ValueCutoff ||
		-        (Match != BaseFuncProf.end() &&
		-         FuncOverlap.Similarity < LowSimilarityThreshold) ||
		-        (Match != BaseFuncProf.end() && !FuncFilter.NameFilter.empty() &&
		+        (Matched && FuncOverlap.Similarity < LowSimilarityThreshold) ||
		+        (Matched && !FuncFilter.NameFilter.empty() &&
		          FuncOverlap.BaseName.toString().find(FuncFilter.NameFilter) !=
		              std::string::npos)) {
		       assert(ProfOverlap.BaseSample > 0 &&
		wenlei: Good catch, thanks! I will fix it as you suggested.
		}

		// Print function-level similarity information if specified by options.
		assert(TestStats.count(FuncOverlap.TestName) &&
		"TestStats should have records for all functions in test profile "
		"except inlinees");
		if (TestStats[FuncOverlap.TestName].MaxSample >= FuncFilter.ValueCutoff ||
		(Match != BaseFuncProf.end() &&
		FuncOverlap.Similarity < LowSimilarityThreshold) ||
		(Match != BaseFuncProf.end() && !FuncFilter.NameFilter.empty() &&
		FuncOverlap.BaseName.find(FuncFilter.NameFilter) !=
		FuncOverlap.BaseName.npos)) {
		assert(ProfOverlap.BaseSample > 0 &&
		"Total samples in base profile should be greater than 0");
		FuncOverlap.BaseWeight =
		static_cast<double>(FuncOverlap.BaseSample) / ProfOverlap.BaseSample;
		assert(ProfOverlap.TestSample > 0 &&
		"Total samples in test profile should be greater than 0");
		FuncOverlap.TestWeight =
		static_cast<double>(FuncOverlap.TestSample) / ProfOverlap.TestSample;
		FuncSimilarityDump.emplace(FuncOverlap.BaseWeight, FuncOverlap);
		}
		}

		// Traverse through functions in base profile but not in test profile.
		for (const auto &F : BaseFuncProf) {
		assert(BaseStats.count(F.second->getFuncName()) &&
		"BaseStats should have records for all functions in base profile "
		"except inlinees");
		const FuncSampleStats &FuncStats = BaseStats[F.second->getFuncName()];
		++ProfOverlap.BaseUniqueCount;
		ProfOverlap.BaseUniqueSample += FuncStats.SampleSum;

		updateHotBlockOverlap(FuncStats.SampleSum, 0, FuncStats.HotBlockCount);

		double FuncSimilarity = computeSampleFunctionOverlap(
		nullptr, nullptr, nullptr, FuncStats.SampleSum, 0);
		ProfOverlap.Similarity +=
		weightByImportance(FuncSimilarity, FuncStats.SampleSum, 0);

		ProfOverlap.UnionSample += FuncStats.SampleSum;
		}

		// Now, ProfOverlap.Similarity may be a little greater than 1 due to
		// imprecision of floating point accumulations. Make it 1.0 if the
		// difference is below Epsilon.
		ProfOverlap.Similarity = (std::fabs(ProfOverlap.Similarity - 1) < Epsilon)
		? 1
		: ProfOverlap.Similarity;

		computeHotFuncOverlap();
		}
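
The iterator-invalidation problem raised in the inline comment on this function can be shown in isolation. The sketch below uses `std::map` as a stand-in for the container in the patch, and the helper name `eraseIfPresent` is hypothetical. The point is only the pattern: record the outcome of the lookup in a `bool` before erasing, and do not touch the iterator afterwards.

```cpp
#include <map>
#include <string>

// Removes Name from Pending if present and reports whether it was found.
bool eraseIfPresent(std::map<std::string, int> &Pending,
                    const std::string &Name) {
  auto Match = Pending.find(Name);
  // Capture the lookup result while the iterator is still valid.
  bool Matched = (Match != Pending.end());
  if (Matched)
    Pending.erase(Match); // Match is invalid from here on; use Matched instead.
  return Matched;
}
```

Any later condition that needs to know whether the lookup succeeded should test `Matched`, which is the shape of the patch proposed in the comment.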

		void SampleOverlapAggregator::initializeSampleProfileOverlap() {
		const auto &BaseProf = BaseReader->getProfiles();
		for (const auto &I : BaseProf) {
		++ProfOverlap.BaseCount;
		FuncSampleStats FuncStats;
		getFuncSampleStats(I.second, FuncStats, BaseHotThreshold);
		ProfOverlap.BaseSample += FuncStats.SampleSum;
		BaseStats.try_emplace(I.second.getFuncName(), FuncStats);
		}

		const auto &TestProf = TestReader->getProfiles();
		for (const auto &I : TestProf) {
		++ProfOverlap.TestCount;
		FuncSampleStats FuncStats;
		getFuncSampleStats(I.second, FuncStats, TestHotThreshold);
		ProfOverlap.TestSample += FuncStats.SampleSum;
		TestStats.try_emplace(I.second.getFuncName(), FuncStats);
		}

		ProfOverlap.BaseName = StringRef(BaseFilename);
		ProfOverlap.TestName = StringRef(TestFilename);
		}

		void SampleOverlapAggregator::dumpFuncSimilarity(raw_fd_ostream &OS) const {
		using namespace sampleprof;

		if (FuncSimilarityDump.empty())
		return;

		formatted_raw_ostream FOS(OS);
		FOS << "Function-level details:\n";
		FOS << "Base weight";
		FOS.PadToColumn(TestWeightCol);
		FOS << "Test weight";
		FOS.PadToColumn(SimilarityCol);
		FOS << "Similarity";
		FOS.PadToColumn(OverlapCol);
		FOS << "Overlap";
		FOS.PadToColumn(BaseUniqueCol);
		FOS << "Base unique";
		FOS.PadToColumn(TestUniqueCol);
		FOS << "Test unique";
		FOS.PadToColumn(BaseSampleCol);
		FOS << "Base samples";
		FOS.PadToColumn(TestSampleCol);
		FOS << "Test samples";
		FOS.PadToColumn(FuncNameCol);
		FOS << "Function name\n";
		for (const auto &F : FuncSimilarityDump) {
		double OverlapPercent =
		F.second.UnionSample > 0
		? static_cast<double>(F.second.OverlapSample) / F.second.UnionSample
		: 0;
		double BaseUniquePercent =
		F.second.BaseSample > 0
		? static_cast<double>(F.second.BaseUniqueSample) /
		F.second.BaseSample
		: 0;
		double TestUniquePercent =
		F.second.TestSample > 0
		? static_cast<double>(F.second.TestUniqueSample) /
		F.second.TestSample
		: 0;

		FOS << format("%.2f%%", F.second.BaseWeight * 100);
		FOS.PadToColumn(TestWeightCol);
		FOS << format("%.2f%%", F.second.TestWeight * 100);
		FOS.PadToColumn(SimilarityCol);
		FOS << format("%.2f%%", F.second.Similarity * 100);
		FOS.PadToColumn(OverlapCol);
		FOS << format("%.2f%%", OverlapPercent * 100);
		FOS.PadToColumn(BaseUniqueCol);
		FOS << format("%.2f%%", BaseUniquePercent * 100);
		FOS.PadToColumn(TestUniqueCol);
		FOS << format("%.2f%%", TestUniquePercent * 100);
		FOS.PadToColumn(BaseSampleCol);
		FOS << F.second.BaseSample;
		FOS.PadToColumn(TestSampleCol);
		FOS << F.second.TestSample;
		FOS.PadToColumn(FuncNameCol);
		FOS << F.second.TestName << "\n";
		}
		}

		void SampleOverlapAggregator::dumpProgramSummary(raw_fd_ostream &OS) const {
		OS << "Profile overlap infomation for base_profile: " << ProfOverlap.BaseName
		<< " and test_profile: " << ProfOverlap.TestName << "\nProgram level:\n";

		OS << " Whole program profile similarity: "
		<< format("%.3f%%", ProfOverlap.Similarity * 100) << "\n";

		assert(ProfOverlap.UnionSample > 0 &&
		"Total samples in two profile should be greater than 0");
		double OverlapPercent =
		static_cast<double>(ProfOverlap.OverlapSample) / ProfOverlap.UnionSample;
		assert(ProfOverlap.BaseSample > 0 &&
		"Total samples in base profile should be greater than 0");
		double BaseUniquePercent = static_cast<double>(ProfOverlap.BaseUniqueSample) /
		ProfOverlap.BaseSample;
		assert(ProfOverlap.TestSample > 0 &&
		"Total samples in test profile should be greater than 0");
		double TestUniquePercent = static_cast<double>(ProfOverlap.TestUniqueSample) /
		ProfOverlap.TestSample;

		OS << " Whole program sample overlap: "
		<< format("%.3f%%", OverlapPercent * 100) << "\n";
		OS << " percentage of samples unique in base profile: "
		<< format("%.3f%%", BaseUniquePercent * 100) << "\n";
		OS << " percentage of samples unique in test profile: "
		<< format("%.3f%%", TestUniquePercent * 100) << "\n";
		OS << " total samples in base profile: " << ProfOverlap.BaseSample << "\n"
		<< " total samples in test profile: " << ProfOverlap.TestSample << "\n";

		assert(ProfOverlap.UnionCount > 0 &&
		"There should be at least one function in two input profiles");
		double FuncOverlapPercent =
		static_cast<double>(ProfOverlap.OverlapCount) / ProfOverlap.UnionCount;
		OS << " Function overlap: " << format("%.3f%%", FuncOverlapPercent * 100)
		<< "\n";
		OS << " overlap functions: " << ProfOverlap.OverlapCount << "\n";
		OS << " functions unique in base profile: " << ProfOverlap.BaseUniqueCount
		<< "\n";
		OS << " functions unique in test profile: " << ProfOverlap.TestUniqueCount
		<< "\n";
		}

		void SampleOverlapAggregator::dumpHotFuncAndBlockOverlap(
		raw_fd_ostream &OS) const {
		assert(HotFuncOverlap.UnionCount > 0 &&
		"There should be at least one hot function in two input profiles");
		OS << " Hot-function overlap: "
		<< format("%.3f%%", static_cast<double>(HotFuncOverlap.OverlapCount) /
		HotFuncOverlap.UnionCount * 100)
		<< "\n";
		OS << " overlap hot functions: " << HotFuncOverlap.OverlapCount << "\n";
		OS << " hot functions unique in base profile: "
		<< HotFuncOverlap.BaseCount - HotFuncOverlap.OverlapCount << "\n";
		OS << " hot functions unique in test profile: "
		<< HotFuncOverlap.TestCount - HotFuncOverlap.OverlapCount << "\n";

		assert(HotBlockOverlap.UnionCount > 0 &&
		"There should be at least one hot block in two input profiles");
		OS << " Hot-block overlap: "
		<< format("%.3f%%", static_cast<double>(HotBlockOverlap.OverlapCount) /
		HotBlockOverlap.UnionCount * 100)
		<< "\n";
		OS << " overlap hot blocks: " << HotBlockOverlap.OverlapCount << "\n";
		OS << " hot blocks unique in base profile: "
		<< HotBlockOverlap.BaseCount - HotBlockOverlap.OverlapCount << "\n";
		OS << " hot blocks unique in test profile: "
		<< HotBlockOverlap.TestCount - HotBlockOverlap.OverlapCount << "\n";
		}

+std::error_code SampleOverlapAggregator::loadProfiles() {
+  using namespace sampleprof;
+
+  LLVMContext Context;
+  auto BaseReaderOrErr = SampleProfileReader::create(BaseFilename, Context);
+  if (std::error_code EC = BaseReaderOrErr.getError())
+    exitWithErrorCode(EC, BaseFilename);
+
+  auto TestReaderOrErr = SampleProfileReader::create(TestFilename, Context);
+  if (std::error_code EC = TestReaderOrErr.getError())
+    exitWithErrorCode(EC, TestFilename);
+
+  BaseReader = std::move(BaseReaderOrErr.get());
+  TestReader = std::move(TestReaderOrErr.get());
+
+  if (std::error_code EC = BaseReader->read())
+    exitWithErrorCode(EC, BaseFilename);
+  if (std::error_code EC = TestReader->read())
+    exitWithErrorCode(EC, TestFilename);
+
+  // Load BaseHotThreshold and TestHotThreshold as 99-percentile threshold in
+  // profile summary.
+  const uint64_t HotCutoff = 990000;
+  ProfileSummary &BasePS = BaseReader->getSummary();
+  for (const auto &SummaryEntry : BasePS.getDetailedSummary()) {
+    if (SummaryEntry.Cutoff == HotCutoff) {
+      BaseHotThreshold = SummaryEntry.MinCount;
+      break;
+    }
+  }
+
+  ProfileSummary &TestPS = TestReader->getSummary();
+  for (const auto &SummaryEntry : TestPS.getDetailedSummary()) {
+    if (SummaryEntry.Cutoff == HotCutoff) {
+      TestHotThreshold = SummaryEntry.MinCount;
+      break;
+    }
+  }
+  return std::error_code();
+}
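
The threshold lookup in `loadProfiles()` scans the detailed summary for the entry whose `Cutoff` equals 990000 (the 99th percentile, scaled by 1,000,000) and takes its `MinCount`. A minimal sketch of that lookup outside LLVM is shown below. `SummaryEntry` here is a hypothetical two-field stand-in for the real summary entry type, and `findHotThreshold` is an illustrative name. Unlike the loops above, the sketch reports whether a matching entry was found, so the no-match case is visible to the caller.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a detailed-summary entry; only the two fields
// used by the lookup are modeled.
struct SummaryEntry {
  uint32_t Cutoff;   // percentile scaled by 1,000,000 (990000 means 99%)
  uint64_t MinCount; // smallest count needed to reach that percentile
};

// Returns true and stores the matching MinCount in Threshold if an entry
// with exactly HotCutoff exists; otherwise leaves Threshold untouched.
inline bool findHotThreshold(const std::vector<SummaryEntry> &Summary,
                             uint32_t HotCutoff, uint64_t &Threshold) {
  for (const SummaryEntry &Entry : Summary) {
    if (Entry.Cutoff == HotCutoff) {
      Threshold = Entry.MinCount;
      return true;
    }
  }
  return false;
}
```

Because the match is on exact equality, a summary that lacks a 990000 entry yields no threshold at all; in the patch, that case leaves `BaseHotThreshold` or `TestHotThreshold` unassigned by these loops.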

+void overlapSampleProfile(const std::string &BaseFilename,
+                          const std::string &TestFilename,
+                          const OverlapFuncFilters &FuncFilter,
+                          uint64_t SimilarityCutoff, raw_fd_ostream &OS) {
+  using namespace sampleprof;
+
+  // We use 0.000005 to initialize OverlapAggr.Epsilon because the final metrics
+  // report 2--3 places after decimal point in percentage numbers.
+  SampleOverlapAggregator OverlapAggr(
+      BaseFilename, TestFilename,
+      static_cast<double>(SimilarityCutoff) / 1000000, 0.000005, FuncFilter);
+  if (std::error_code EC = OverlapAggr.loadProfiles())
+    exitWithErrorCode(EC);
+
+  OverlapAggr.initializeSampleProfileOverlap();
+  if (OverlapAggr.detectZeroSampleProfile(OS))
+    return;
+
+  OverlapAggr.computeSampleProfileOverlap(OS);
+
+  OverlapAggr.dumpProgramSummary(OS);
+  OverlapAggr.dumpHotFuncAndBlockOverlap(OS);
+  OverlapAggr.dumpFuncSimilarity(OS);
+}
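
The `--similarity-cutoff` value is described in this patch as "percentage times 10000", and `overlapSampleProfile` divides it by 1000000 to obtain a fraction. The sketch below spells out that unit conversion in both directions. Both helper names are hypothetical and do not appear in the patch, and rounding the percentage to the nearest integer is an assumption made here for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: converts a percentage such as 80.0 into the integer
// form expected by --similarity-cutoff (percentage times 10000).
inline uint64_t percentToCutoff(double Percent) {
  return static_cast<uint64_t>(std::llround(Percent * 10000));
}

// Mirrors the division in overlapSampleProfile: the integer cutoff becomes
// a fraction in [0, 1] when divided by 1000000.
inline double cutoffToFraction(uint64_t SimilarityCutoff) {
  return static_cast<double>(SimilarityCutoff) / 1000000;
}
```

For example, a cutoff of 80% is passed as 800000 and becomes a fraction of 0.8. On the same fractional scale, the 0.000005 epsilon equals 0.0005 percentage points, half a unit in the third decimal place of a printed percentage.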

 static int overlap_main(int argc, const char *argv[]) {
   cl::opt<std::string> BaseFilename(cl::Positional, cl::Required,
                                     cl::desc("<base profile file>"));
   cl::opt<std::string> TestFilename(cl::Positional, cl::Required,
                                     cl::desc("<test profile file>"));
   cl::opt<std::string> Output("output", cl::value_desc("output"), cl::init("-"),
                               cl::desc("Output file"));
   cl::alias OutputA("o", cl::desc("Alias for --output"), cl::aliasopt(Output));
   cl::opt<bool> IsCS("cs", cl::init(false),
                      cl::desc("For context sensitive counts"));
   cl::opt<unsigned long long> ValueCutoff(
       "value-cutoff", cl::init(-1),
       cl::desc(
           "Function level overlap information for every function in test "
           "profile with max count value greater then the parameter value"));
   cl::opt<std::string> FuncNameFilter(
       "function",
       cl::desc("Function level overlap information for matching functions"));
+  cl::opt<unsigned long long> SimilarityCutoff(
+      "similarity-cutoff", cl::init(0),
+      cl::desc(
+          "For sample profiles, list function names for overlapped functions "
+          "with similarities below the cutoff (percentage times 10000)."));
+  cl::opt<ProfileKinds> ProfileKind(
+      cl::desc("Profile kind:"), cl::init(instr),
+      cl::values(clEnumVal(instr, "Instrumentation profile (default)"),
+                 clEnumVal(sample, "Sample profile")));
   cl::ParseCommandLineOptions(argc, argv, "LLVM profile data overlap tool\n");

   std::error_code EC;
   raw_fd_ostream OS(Output.data(), EC, sys::fs::OF_Text);
   if (EC)
     exitWithErrorCode(EC, Output);

-  overlapInstrProfile(BaseFilename, TestFilename,
-                      OverlapFuncFilters{ValueCutoff, FuncNameFilter}, OS,
-                      IsCS);
+  if (ProfileKind == instr)
+    overlapInstrProfile(BaseFilename, TestFilename,
+                        OverlapFuncFilters{ValueCutoff, FuncNameFilter}, OS,
+                        IsCS);
+  else
+    overlapSampleProfile(BaseFilename, TestFilename,
+                         OverlapFuncFilters{ValueCutoff, FuncNameFilter},
+                         SimilarityCutoff, OS);

   return 0;
 }

 typedef struct ValueSitesStats {
   ValueSitesStats()
       : TotalNumValueSites(0), TotalNumValueSitesWithValueProfile(0),
         TotalNumValues(0) {}
 [... 278 unchanged lines hidden ...]
 // print out or let it be an empty string.
 static void dumpHotFunctionList(const std::vector<std::string> &ColumnTitle,
                                 const std::vector<int> &ColumnOffset,
                                 const std::vector<HotFuncInfo> &PrintValues,
                                 uint64_t HotFuncCount, uint64_t TotalFuncCount,
                                 uint64_t HotProfCount, uint64_t TotalProfCount,
                                 const std::string &HotFuncMetric,
                                 raw_fd_ostream &OS) {
-  assert(ColumnOffset.size() == ColumnTitle.size());
-  assert(ColumnTitle.size() >= 4);
-  assert(TotalFuncCount > 0);
+  assert(ColumnOffset.size() == ColumnTitle.size() &&
+         "ColumnOffset and ColumnTitle should have the same size");
+  assert(ColumnTitle.size() >= 4 &&
+         "ColumnTitle should have at least 4 elements");
+  assert(TotalFuncCount > 0 &&
+         "There should be at least one function in the profile");
   double TotalProfPercent = 0;
   if (TotalProfCount > 0)
-    TotalProfPercent = ((double)HotProfCount) / TotalProfCount * 100;
+    TotalProfPercent = static_cast<double>(HotProfCount) / TotalProfCount * 100;

   formatted_raw_ostream FOS(OS);
   FOS << HotFuncCount << " out of " << TotalFuncCount
       << " functions with profile ("
-      << format("%.2f%%", (((double)HotFuncCount) / TotalFuncCount * 100))
+      << format("%.2f%%",
+                (static_cast<double>(HotFuncCount) / TotalFuncCount * 100))
       << ") are considered hot functions";
   if (!HotFuncMetric.empty())
     FOS << " (" << HotFuncMetric << ")";
   FOS << ".\n";
   FOS << HotProfCount << " out of " << TotalProfCount << " profile counts ("
       << format("%.2f%%", TotalProfPercent) << ") are from hot functions.\n";

   for (size_t I = 0; I < ColumnTitle.size(); ++I) {
 [... 24 unchanged lines hidden; hunk context: showHotFunctionList(const StringMap<sampleprof::FunctionSamples> &Profiles, ]
   auto &SummaryVector = PS.getDetailedSummary();
   uint64_t MinCountThreshold = 0;
   for (const ProfileSummaryEntry &SummaryEntry : SummaryVector) {
     if (SummaryEntry.Cutoff == HotFuncCutoff) {
       MinCountThreshold = SummaryEntry.MinCount;
       break;
     }
   }
-  assert(MinCountThreshold != 0);

   // Traverse all functions in the profile and keep only hot functions.
   // The following loop also calculates the sum of total samples of all
   // functions.
   std::multimap<uint64_t, std::pair<const FunctionSamples *, const uint64_t>,
                 std::greater<uint64_t>>
       HotFunc;
   uint64_t ProfileTotalSample = 0;
   uint64_t HotFuncSample = 0;
   uint64_t HotFuncCount = 0;
-  uint64_t MaxCount = 0;
   for (const auto &I : Profiles) {
+    FuncSampleStats FuncStats;
     const FunctionSamples &FuncProf = I.second;
     ProfileTotalSample += FuncProf.getTotalSamples();
-    MaxCount = FuncProf.getMaxCountInside();
+    getFuncSampleStats(FuncProf, FuncStats, MinCountThreshold);

-    // MinCountThreshold is a block/line threshold computed for a given cutoff.
-    // We intentionally compare the maximum sample count in a function with this
-    // threshold to get an approximate threshold for hot functions.
-    if (MaxCount >= MinCountThreshold) {
+    if (isFunctionHot(FuncStats, MinCountThreshold)) {
       HotFunc.emplace(FuncProf.getTotalSamples(),
-                      std::make_pair(&(I.second), MaxCount));
+                      std::make_pair(&(I.second), FuncStats.MaxSample));
       HotFuncSample += FuncProf.getTotalSamples();
       ++HotFuncCount;
     }
   }

   std::vector<std::string> ColumnTitle{"Total sample (%)", "Max sample",
                                        "Entry sample", "Function name"};
   std::vector<int> ColumnOffset{0, 24, 42, 58};
 [... 198 unchanged lines hidden ...]

Diff 284158
