This is an archive of the discontinued LLVM Phabricator instance.

[CSSPGO] Do not recount callee samples when computing profile summary for nested CS profile.
ClosedPublic

Authored by hoy on Feb 10 2022, 3:27 PM.

Details

Summary

When generating nested CS profile with all calling contexts of a function duplicated into a base profile under --generate-merged-base-profiles, do not recount callee samples when computing profile summary. This fixes the profile summary mismatch between flat cs profile and nested cs profile, for both extbinary and text format.

Diff Detail

Event Timeline

hoy created this revision.Feb 10 2022, 3:27 PM
hoy requested review of this revision.Feb 10 2022, 3:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2022, 3:27 PM
hoy updated this revision to Diff 407720.Feb 10 2022, 4:21 PM

Updating D119494: [CSSPGO] Do not recount callee samples when computing profile summary for nested CS profile.

wenlei added inline comments.Feb 10 2022, 6:29 PM
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
128

We should not have ProfileSummaryBuilder depending on profile generation flags from llvm-profgen. You moved the flags out off llvm-profgen to avoid cyclic dependency, but this is still hacky.

If we want to this specific behavior of ProfileSummaryBuilder, i.e. whether callee samples should be counted, we can have ProfileSummaryBuilder expose a proper knob for that.

In this case, maybe we could also consider actually setting a duplicated bit/attribute for those duplicated callee profile, then ProfileSummaryBuilder can naturally skip the duplicated one, which should be equivalent.

hoy added inline comments.Feb 10 2022, 6:45 PM
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
128

For extbiny profile, ProfileSummaryBuilder is called in profile writing time, which is still in the SampleProf library. The command line switches being checked here is because we would like to correct the summary computation during profile writing time. Maybe we can expose a writer flag to for this instead.

A function profile may have both inlined samples and uninlined samples and it's hard to tell them apart once they are merged. We could add an extra attribute to the extbin profile for GenerateMergedBaseProfiles which can be checked here, similar to ProfileIsCSNested. This seems cleaner. WDYT?

hoy added inline comments.Feb 10 2022, 6:47 PM
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
128

Actually an attribute for each callee profile may work better. It can help text profile too.

hoy updated this revision to Diff 407774.Feb 10 2022, 10:00 PM

Switching to using per-context attribute.

wenlei accepted this revision.Feb 10 2022, 10:28 PM

lgtm, thanls.

llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
128

For extbiny profile, ProfileSummaryBuilder is called in profile writing time, which is still in the SampleProf library.

It is ok for llvm-profgen to use ProfileSummaryBuilder, but ProfileSummaryBuilder should not be coupled with anything specific to profile generation. Checking the switches made it coupled with profile generation specifics - that's the hacky part, it breaks proper layering.

The command line switches being checked here is because we would like to correct the summary computation during profile writing time.

I understand the practical part of it, i.e. why you want to do this. I'm arguing the structure of the implementation, not why this is needed.

This revision is now accepted and ready to land.Feb 10 2022, 10:28 PM
wenlei added inline comments.Feb 10 2022, 10:30 PM
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
113–116

nit:

} else if (FS.getContext().hasAttribute(sampleprof::ContextDuplicatedIntoBase)) {
  return;
}
hoy updated this revision to Diff 407778.Feb 10 2022, 10:47 PM

Updating D119494: [CSSPGO] Do not recount callee samples when computing profile summary for nested CS profile.

hoy edited the summary of this revision. (Show Details)Feb 10 2022, 10:48 PM