When generating nested CS profile with all calling contexts of a function duplicated into a base profile under --generate-merged-base-profiles, do not recount callee samples when computing profile summary. This fixes the profile summary mismatch between flat cs profile and nested cs profile, for both extbinary and text format.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Updating D119494: [CSSPGO] Do not recount callee samples when computing profile summary for nested CS profile.
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp | ||
---|---|---|
128 | We should not have ProfileSummaryBuilder depending on profile generation flags from llvm-profgen. You moved the flags out off llvm-profgen to avoid cyclic dependency, but this is still hacky. If we want to this specific behavior of ProfileSummaryBuilder, i.e. whether callee samples should be counted, we can have ProfileSummaryBuilder expose a proper knob for that. In this case, maybe we could also consider actually setting a duplicated bit/attribute for those duplicated callee profile, then ProfileSummaryBuilder can naturally skip the duplicated one, which should be equivalent. |
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp | ||
---|---|---|
128 | For extbiny profile, ProfileSummaryBuilder is called in profile writing time, which is still in the SampleProf library. The command line switches being checked here is because we would like to correct the summary computation during profile writing time. Maybe we can expose a writer flag to for this instead. A function profile may have both inlined samples and uninlined samples and it's hard to tell them apart once they are merged. We could add an extra attribute to the extbin profile for GenerateMergedBaseProfiles which can be checked here, similar to ProfileIsCSNested. This seems cleaner. WDYT? |
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp | ||
---|---|---|
128 | Actually an attribute for each callee profile may work better. It can help text profile too. |
lgtm, thanls.
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp | ||
---|---|---|
128 |
It is ok for llvm-profgen to use ProfileSummaryBuilder, but ProfileSummaryBuilder should not be coupled with anything specific to profile generation. Checking the switches made it coupled with profile generation specifics - that's the hacky part, it breaks proper layering.
I understand the practical part of it, i.e. why you want to do this. I'm arguing the structure of the implementation, not why this is needed. |
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp | ||
---|---|---|
113–116 | nit: } else if (FS.getContext().hasAttribute(sampleprof::ContextDuplicatedIntoBase)) { return; } |
Updating D119494: [CSSPGO] Do not recount callee samples when computing profile summary for nested CS profile.
nit: