Follow-up patch to https://reviews.llvm.org/D125246, support computeSummaryAndThreshold based on context trie.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
435 | It's not completely deleted, it's moved to the ctor(see: ProfileGenerator.h: 180), this is because we don't ever save the llvm-sample-profile into ProfileMap, I made the llvm-sample-profile a temporary variable, it's converted to trie right in the ctor and dropped. So I also moved this part of code there. | |
1045 | Yeah, we can merge the contextless profiles twice, but is there any concern to save one? |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
435 | I see. Profiled functions are also used to on-demand decode pseudo probes and it's needed for non-CS profile generator too. | |
1045 | I guess my questions should be why this the extra parameter needed. Is the original implementation that checks FunctionSamples::ProfileIsCS inside SampleProfileSummaryBuilder::computeSummaryForProfiles not working? |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
1045 | OK, I see your point, to save the unnecessary merge. I'd say if this is not at a visible cost, let's avoid changing the interface. Otherwise maybe we can reset UseContextLessSummary here? A comment would be helpful. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
131–138 | The move is for llvm-sample-profile path. Do we fall back from move to copy for that path now? | |
1044 | Not critical, but in the past we have the ability to use context profile as well when profile-summary-contextless=0 is used. It seems like we now lost that ability? | |
1045 | save/restore UseContextLessSummary sounds like a good idea. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
131–138 | There is no extra copy here, we don't update ProfileMap for CS profile. For llvm-sample-profile path, the input is a reference, the map is converted to a trie in the ctor, so the lifetime of the reference end in the ctor. The ProfileMap is always empty for CS profile until we call buildProfileMap. | |
435 | I see, good catch! I added them back for non-CS path. | |
1044 | I see, I wasn't aware that we can use profile-summary-contextless=0. Yes, now we don't have the ability. I guess we don't need this before pre-inliner? To support this, we can create a full CS map here like we call buildProfileMap, but we need to make a copy instead of move in buildProfileMap, we don't have that for now. it seems it will increase the complexity of the code. or we can call to compute the summary again after pre-inliner. | |
1045 | It seems we can not copy the cl::opt<bool> class, I just set UseContextLessSummary=true here, not sure if we need to recover the UseContextLessSummary value. |
Updating D127026: [CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
1045 | Would be better to recover the value since that'll be used by the extbinary writer to compute and flush out the profile summary on the final ProfileMap. |
llvm/tools/llvm-profgen/ProfileGenerator.h | ||
---|---|---|
182 | This piece of code is needed because Profiles is not saved as a field of ProfileGeneratorBase anymore. Can we still do that to avoid duplicate this code here? |
llvm/tools/llvm-profgen/ProfileGenerator.h | ||
---|---|---|
182 | Before it copied to ProfileMap and now we need to make ProfileMap empty because the buildProfileMap will convert the optimized trie into ProfileMap. Non-empty ProfileMap will mess up the results. Maybe we can add a new field to ProfileGeneratorBase like SampleProfileMap & LLVMSampleProfiles;? |
llvm/tools/llvm-profgen/ProfileGenerator.h | ||
---|---|---|
182 |
Just clean it inside buildProfileMap? |
llvm/tools/llvm-profgen/ProfileGenerator.h | ||
---|---|---|
182 | If cleaning it before, all the FunctionSamples owned by the map are freed, the FunctionSamples pointer in the ContextTrieNode is dangling. |
llvm/tools/llvm-profgen/ProfileGenerator.h | ||
---|---|---|
182 | I see. Maybe we should just compute a contextless profile from ContextTracker in collectProfiledFunctions ? This is less efficient, but should fit better in the pipeline for easier maintenance. The performance of the llvm profile path isn't critical at the end of the day. |
llvm/tools/llvm-profgen/ProfileGenerator.h | ||
---|---|---|
182 | On the second thought, we don't really need a contextless profile, just looping on the context tri and collecting function names should be sufficient? |
Updating D127026: [CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie
Updating D127026: [CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
449 | Make a copy of this in ProfileGeneratorBase::collectProfiledFunctions to keep the function prototype consistent? Actually the function can be made a virtual function. | |
465 | Also check FunctionProfile is not null? A null profile means the function is likely not executed. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
465 | Oh right, functions only showing up in stack samples should be considered as profiled. Actually there was a bug on the ProfileMap path and now your change is fixing it. LGTM. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
476 | Sorry for not making it clear. Can we do something like below? Basically keep the original collectProfiledFunctions as is and add a new override for CSProfileGenerator. void ProfileGeneratorBase::collectProfiledFunctions() { std::unordered_set<const BinaryFunction *> ProfiledFunctions; if (SampleCounters) { .... } else if (!ProfileMap.empty()) .... } Binary->setProfiledFunctions(ProfiledFunctions); } void CSProfileGenerator::collectProfiledFunctions() { deal with ContextTrieNode; Binary->setProfiledFunctions(ProfiledFunctions); } |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
476 | if (SampleCounters) { .... } This part of code is also needed for CSProfileGenerator because collectProfiledFunctions is called before the generateProbeBasedProfile, at that time none of ContextTrieNode is created. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
476 | I see, would this work? void CSProfileGenerator::collectProfiledFunctions() { if (getRootContext) { ... Binary->setProfiledFunctions(ProfiledFunctions); } else { ProfileGeneratorBase::collectProfiledFunctions(); } } |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
476 | This one should work, but it might introduce more complexity. We only have two modes 1) from Sample counter 2) from llvm-sample-profile. Before we use one condition(SampleCounters == null) to differentiate this two mode. But this one adds another condition(getRootContext()) which intend to do the same thing. Is there any concern for the current version? |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
476 | Not a really concern, the current version looks good. I'm trying to see if the code can be simplified furthermore. Now that the context tri and ProfileMap are mutual-exclusive, perhaps using virtual functions to implement them is the best way. Adding an extra check is not worth. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
1046 | Should this be set to false? We want to avoid merging ContextLessProfiles again in computeSummaryForProfiles? |
Updating D127026: [CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
1046 | Oops, it should be false. I also found that set UseContextLessSummary false is not enough, I have to set FunctionSamples::ProfileIsCS to false as well, because the UseContextLessSummary.getNumOccurrences() is 0 here. if (UseContextLessSummary || (sampleprof::FunctionSamples::ProfileIsCS && !UseContextLessSummary.getNumOccurrences())) { ... |
LGTM, thanks for addressing the feedbacks!
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
1046 | Good catch. A comment about why setting the two variables (e.g, "to avoid re-merging profiles") would be good. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
476 | Consider this to give it more structure and help readability? ProfileGeneratorBase::collectProfiledFunctions() { if (collectFunctionsFromRawProfile(ProfiledFunctions)) Binary->setProfiledFunctions(ProfiledFunctions); else if (collectFunctionsFromLLVMProfile(ProfiledFunctions)) Binary->setProfiledFunctions(ProfiledFunctions); else llvm_unreachable(...); } bool ProfileGeneratorBase::collectFunctionsFromRawProfile(...) { if (!SampleCounters) return false; // read raw profile sample counters return true; } bool ProfileGenerator::collectFunctionsFromLLVMProfile(..) { // read ProfileMap return true; } bool CSProfileGenerator::collectFunctionsFromLLVMProfile(..) { // read trie nodes return true; } | |
1044 | I think it's ok to not support profile-summary-contextless=0 given we haven't really use it much. Maybe add an assertion for UseContextLessSummary to be true? |
This piece of code is needed because Profiles is not saved as a field of ProfileGeneratorBase anymore. Can we still do that to avoid duplicate this code here?