Page MenuHomePhabricator

[CSSPGO][llvm-profgen] Merge and trim profile for cold context to reduce profile size
ClosedPublic

Authored by wlei on Jan 5 2021, 12:22 PM.

Details

Summary

This change allows merging and trimming cold context profile in llvm-profgen to solve profile size bloat problem. Currently when the profile's total sample is below threshold(supported by a switch), it will be considered cold and merged into a base context-less profile, which will at least keep the profile quality as good as the baseline(non-cs).

For example, two input profiles:
[main @ foo @ bar]:60
[main @ bar]:50
Under threshold = 100, the two profiles will be merge into one with the base context, get result:
[bar]:110

Added two switches:
--csprof-cold-thres=<value>: Specified the total samples threshold for a context profile to be considered cold, with 100 being the default. Any cold context profiles will be merged into context-less base profile by default.
--csprof-keep-cold: Force profile generation to keep cold context profiles instead of dropping them. By default, any cold context will not be written to output profile.

Results:
Though not yet evaluating it with the latest CSSPGO, our internal branch shows neutral on performance but significantly reduce the profile size. Detailed evaluation on llvm-profgen with CSSPGO will come later.

Diff Detail

Unit TestsFailed

TimeTest
150 msx64 windows > LLVM.tools/llvm-profgen::merge-cold-profile.test
Script: -- : 'RUN: at line 2'; llvm-profgen --perfscript=C:\ws\w1\llvm-project\premerge-checks\llvm\test\tools\llvm-profgen/Inputs/recursion-compression.perfscript --binary=C:\ws\w1\llvm-project\premerge-checks\llvm\test\tools\llvm-profgen/Inputs/recursion-compression.perfbin --output=C:\ws\w1\llvm-project\premerge-checks\build\test\tools\llvm-profgen\Output\merge-cold-profile.test.tmp --compress-recursion=-1 --csprof-cold-thres=8
210 msx64 windows > LLVM.tools/llvm-profgen::recursion-compression.test
Script: -- : 'RUN: at line 2'; llvm-profgen --perfscript=C:\ws\w1\llvm-project\premerge-checks\llvm\test\tools\llvm-profgen/Inputs/recursion-compression.perfscript --binary=C:\ws\w1\llvm-project\premerge-checks\llvm\test\tools\llvm-profgen/Inputs/recursion-compression.perfbin --output=C:\ws\w1\llvm-project\premerge-checks\build\test\tools\llvm-profgen\Output\recursion-compression.test.tmp --compress-recursion=0 --csprof-cold-thres=0

Event Timeline

wlei created this revision.Jan 5 2021, 12:22 PM
wlei requested review of this revision.Jan 5 2021, 12:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2021, 12:22 PM
wlei edited the summary of this revision. (Show Details)Jan 5 2021, 3:21 PM
wlei added reviewers: wenlei, wmi, davidxl, hoy.
wmi added inline comments.Jan 10 2021, 11:20 PM
llvm/test/tools/llvm-profgen/merge-cold-profile.test
7

I see "[main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb]:9:4" is trimmed but [fa]:14:4 is kept, so the threshold should be 10?

llvm/tools/llvm-profgen/ProfileGenerator.cpp
352

emplace_back(I.getKey(), &I.second)?

wlei updated this revision to Diff 316228.Jan 12 2021, 1:52 PM
wlei edited the summary of this revision. (Show Details)

Rebase and addressing Wei's feedback

wlei added inline comments.Jan 12 2021, 1:53 PM
llvm/test/tools/llvm-profgen/merge-cold-profile.test
7

The second test is for --csprof-keep-cold, which means we only merge the cold profile but don't remove it.
So I intended to use a very high threshold(100) to let all the profiles merge together and keep it.

llvm/tools/llvm-profgen/ProfileGenerator.cpp
352

Good catch, fixed

wmi added inline comments.Jan 14 2021, 12:09 PM
llvm/test/tools/llvm-profgen/merge-cold-profile.test
7

Ok, I see. Thanks for explaining it.

It means if the total count of a profile with context is smaller than the csprof-cold-thres, it will be merged into its base profile. If the total count of the base profile after all merge is done is still smaller than the csprof-cold-thres, it will be trimmed unless csprof-keep-cold flag is specified. Is it correct? Better document it more clearly in the flag desc.

wlei added inline comments.Jan 14 2021, 12:28 PM
llvm/test/tools/llvm-profgen/merge-cold-profile.test
7

Yes, that's exactly what I mean. Will update the flag desc, thanks for your feedback.

wlei updated this revision to Diff 316791.Jan 14 2021, 3:03 PM

Refine the cl::desc for CSProfColdThres and CSProfKeepCold

wmi accepted this revision.Jan 14 2021, 3:31 PM

LGTM. Thanks.

This revision is now accepted and ready to land.Jan 14 2021, 3:31 PM
hoy accepted this revision.Jan 15 2021, 10:00 AM
hoy added inline comments.
llvm/tools/llvm-profgen/ProfileGenerator.cpp
33

Nit: Specify

I was wondering in the future if we'd like a consistent way of computing cold threshold with SampleProfileSummaryBuilder, i.e., counts below 99.9% (aka cut-off) are considered cold, so that the profile generator and the profile consumer will be on the same page.

wlei updated this revision to Diff 317079.Jan 15 2021, 2:35 PM

fix typo and rebase

wlei added inline comments.Jan 15 2021, 2:40 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
33

Thanks for sharing this. I'm not sure how we get the 100 threshold, I guess this is the result of many iterations. Let me dig into the SampleProfileSummaryBuilder to see the chance.

hoy added inline comments.Jan 15 2021, 4:29 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
33

Yes, it is a number from experiments. The logic in the profile summary builder is a bit different in that it sorts all samples by weight and treats the samples below a given percentage (say 0.01%) as cold. In order words, if there are more than 99.99% samples that are hotter than the current one, then the current is treated as cold. That's the logic used in the compiler.

wenlei added inline comments.Jan 20 2021, 9:29 AM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
33

Yeah, +1 for what Hongtao said, I wanted to use %-based cutoff, but didn't have access to SampleProfileSummaryBuilder from early prototype.

I think create_llvm_profile tool have both sample_threshold_frac and sample_threshold_absolute? Similarly both could be useful here. We could refine them as we broaden the adoption and continue tuning - I don't have an opinion whether we make all these improvements here or in another patch.

wlei added inline comments.Jan 20 2021, 10:40 AM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
33

Cool, thanks for your informative suggestions! will have a try and add more experiments on SPEC.

This revision was landed with ongoing or failed builds.Feb 4 2021, 11:05 AM
This revision was automatically updated to reflect the committed changes.