Page MenuHomePhabricator

[CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie
ClosedPublic

Authored by wlei on Jun 3 2022, 4:03 PM.

Details

Summary

Follow-up patch to https://reviews.llvm.org/D125246, support computeSummaryAndThreshold based on context trie.

Diff Detail

Event Timeline

wlei created this revision.Jun 3 2022, 4:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2022, 4:03 PM
wlei requested review of this revision.Jun 3 2022, 4:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2022, 4:03 PM
hoy added inline comments.Jun 5 2022, 9:28 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
435

why delete this?

1048

Is false needed here? We are passing in contextless profiles so merging them again inside computeSummaryAndThreshold should be fine?

wlei added inline comments.Jun 6 2022, 11:23 AM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
435

It's not completely deleted, it's moved to the ctor(see: ProfileGenerator.h: 180), this is because we don't ever save the llvm-sample-profile into ProfileMap, I made the llvm-sample-profile a temporary variable, it's converted to trie right in the ctor and dropped. So I also moved this part of code there.

1048

Yeah, we can merge the contextless profiles twice, but is there any concern to save one?

hoy added inline comments.Jun 6 2022, 11:47 AM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
435

I see. Profiled functions are also used to on-demand decode pseudo probes and it's needed for non-CS profile generator too.

1048

I guess my questions should be why this the extra parameter needed. Is the original implementation that checks FunctionSamples::ProfileIsCS inside SampleProfileSummaryBuilder::computeSummaryForProfiles not working?

hoy added inline comments.Jun 6 2022, 11:55 AM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
1048

OK, I see your point, to save the unnecessary merge. I'd say if this is not at a visible cost, let's avoid changing the interface. Otherwise maybe we can reset UseContextLessSummary here? A comment would be helpful.

wenlei added inline comments.Jun 6 2022, 4:15 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
131–138

The move is for llvm-sample-profile path. Do we fall back from move to copy for that path now?

1047

Not critical, but in the past we have the ability to use context profile as well when profile-summary-contextless=0 is used. It seems like we now lost that ability?

1048

save/restore UseContextLessSummary sounds like a good idea.

wlei added inline comments.Jun 6 2022, 7:26 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
131–138

There is no extra copy here, we don't update ProfileMap for CS profile. For llvm-sample-profile path, the input is a reference, the map is converted to a trie in the ctor, so the lifetime of the reference end in the ctor. The ProfileMap is always empty for CS profile until we call buildProfileMap.

435

I see, good catch! I added them back for non-CS path.

1047

I see, I wasn't aware that we can use profile-summary-contextless=0. Yes, now we don't have the ability. I guess we don't need this before pre-inliner?

To support this, we can create a full CS map here like we call buildProfileMap, but we need to make a copy instead of move in buildProfileMap, we don't have that for now. it seems it will increase the complexity of the code. or we can call to compute the summary again after pre-inliner.

1048

It seems we can not copy the cl::opt<bool> class, I just set UseContextLessSummary=true here, not sure if we need to recover the UseContextLessSummary value.

wlei updated this revision to Diff 434674.Jun 6 2022, 7:26 PM

Updating D127026: [CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie

hoy added inline comments.Jun 8 2022, 4:06 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
1048

Would be better to recover the value since that'll be used by the extbinary writer to compute and flush out the profile summary on the final ProfileMap.

hoy added inline comments.Jun 8 2022, 4:23 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
188

This piece of code is needed because Profiles is not saved as a field of ProfileGeneratorBase anymore. Can we still do that to avoid duplicate this code here?

wlei added inline comments.Jun 8 2022, 4:34 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
188

Before it copied to ProfileMap and now we need to make ProfileMap empty because the buildProfileMap will convert the optimized trie into ProfileMap. Non-empty ProfileMap will mess up the results.

Maybe we can add a new field to ProfileGeneratorBase like SampleProfileMap & LLVMSampleProfiles;?

hoy added inline comments.Jun 8 2022, 4:46 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
188

now we need to make ProfileMap empty because the buildProfileMap will convert the optimized trie into ProfileMap. Non-empty ProfileMap will mess up the results.

Just clean it inside buildProfileMap?

wlei added inline comments.Jun 8 2022, 4:50 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
188

If cleaning it before, all the FunctionSamples owned by the map are freed, the FunctionSamples pointer in the ContextTrieNode is dangling.

hoy added inline comments.Jun 8 2022, 5:16 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
188

I see. Maybe we should just compute a contextless profile from ContextTracker in collectProfiledFunctions ? This is less efficient, but should fit better in the pipeline for easier maintenance. The performance of the llvm profile path isn't critical at the end of the day.

hoy added inline comments.Jun 8 2022, 5:25 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
188

On the second thought, we don't really need a contextless profile, just looping on the context tri and collecting function names should be sufficient?

wlei added inline comments.Jun 9 2022, 6:38 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
1048

I see, value recovered.

llvm/tools/llvm-profgen/ProfileGenerator.h
188

Sounds good to use name directly from trie. Just one thing is context tri is only in CSProfileGenerator, so I still need to use different function for this.

wlei updated this revision to Diff 435753.Jun 9 2022, 6:39 PM

Updating D127026: [CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie

wlei updated this revision to Diff 435755.Jun 9 2022, 6:44 PM

Updating D127026: [CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie

hoy added inline comments.Jun 9 2022, 9:25 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
445

Make a copy of this in ProfileGeneratorBase::collectProfiledFunctions to keep the function prototype consistent? Actually the function can be made a virtual function.

470

Also check FunctionProfile is not null? A null profile means the function is likely not executed.

wlei added inline comments.Jun 9 2022, 9:37 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
423

Here is for all context including the non-sampling one.

470

I think we discussed before for the non-executed context and the conclusion is to keep it? because I saw in SampleCounters case, it keep the function there. See above code(line 413)

hoy added inline comments.Jun 9 2022, 9:56 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
470

Oh right, functions only showing up in stack samples should be considered as profiled. Actually there was a bug on the ProfileMap path and now your change is fixing it. LGTM.

wlei updated this revision to Diff 435798.Jun 9 2022, 10:05 PM

changed to use virtual function for collectProfiledFunctions.

hoy added inline comments.Jun 9 2022, 10:15 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
475

Sorry for not making it clear. Can we do something like below? Basically keep the original collectProfiledFunctions as is and add a new override for CSProfileGenerator.

void ProfileGeneratorBase::collectProfiledFunctions() {
  std::unordered_set<const BinaryFunction *> ProfiledFunctions;
  if (SampleCounters) {
     ....
  } else if (!ProfileMap.empty())
     ....
  }
  Binary->setProfiledFunctions(ProfiledFunctions);
}

void CSProfileGenerator::collectProfiledFunctions() {
  deal with ContextTrieNode;
  Binary->setProfiledFunctions(ProfiledFunctions);
}
wlei added inline comments.Jun 9 2022, 10:21 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
475
if (SampleCounters) {
   ....
}

This part of code is also needed for CSProfileGenerator because collectProfiledFunctions is called before the generateProbeBasedProfile, at that time none of ContextTrieNode is created.

hoy added inline comments.Jun 9 2022, 10:24 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
475

I see, would this work?

void CSProfileGenerator::collectProfiledFunctions() {
  if (getRootContext) {
    ... 
    Binary->setProfiledFunctions(ProfiledFunctions);
   }
   else {
     ProfileGeneratorBase::collectProfiledFunctions();
   }
}
wlei added inline comments.Jun 9 2022, 10:48 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
475

This one should work, but it might introduce more complexity. We only have two modes 1) from Sample counter 2) from llvm-sample-profile. Before we use one condition(SampleCounters == null) to differentiate this two mode. But this one adds another condition(getRootContext()) which intend to do the same thing.

Is there any concern for the current version?

hoy added inline comments.Jun 9 2022, 11:03 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
475

Not a really concern, the current version looks good. I'm trying to see if the code can be simplified furthermore. Now that the context tri and ProfileMap are mutual-exclusive, perhaps using virtual functions to implement them is the best way. Adding an extra check is not worth.

hoy added inline comments.Jun 9 2022, 11:05 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
1049

Should this be set to false? We want to avoid merging ContextLessProfiles again in computeSummaryForProfiles?

wlei updated this revision to Diff 435946.Jun 10 2022, 9:22 AM

Updating D127026: [CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie

wlei added inline comments.Jun 10 2022, 9:24 AM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
1049

Oops, it should be false. I also found that set UseContextLessSummary false is not enough, I have to set FunctionSamples::ProfileIsCS to false as well, because the UseContextLessSummary.getNumOccurrences() is 0 here.

  if (UseContextLessSummary || (sampleprof::FunctionSamples::ProfileIsCS &&
                                !UseContextLessSummary.getNumOccurrences())) {
...
hoy accepted this revision.Jun 10 2022, 9:34 AM

LGTM, thanks for addressing the feedbacks!

llvm/tools/llvm-profgen/ProfileGenerator.cpp
1049

Good catch. A comment about why setting the two variables (e.g, "to avoid re-merging profiles") would be good.

This revision is now accepted and ready to land.Jun 10 2022, 9:34 AM
wlei updated this revision to Diff 435967.Jun 10 2022, 10:35 AM

added more commments.

wenlei added inline comments.Jun 22 2022, 11:24 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
475

Consider this to give it more structure and help readability?

ProfileGeneratorBase::collectProfiledFunctions() {
    if (collectFunctionsFromRawProfile(ProfiledFunctions)) 
        Binary->setProfiledFunctions(ProfiledFunctions);
    else if (collectFunctionsFromLLVMProfile(ProfiledFunctions)) 
        Binary->setProfiledFunctions(ProfiledFunctions);
    else 
        llvm_unreachable(...);
}

bool ProfileGeneratorBase::collectFunctionsFromRawProfile(...) {
    if (!SampleCounters)
       return false;
    // read raw profile sample counters
    return true;
}

bool ProfileGenerator::collectFunctionsFromLLVMProfile(..) {
    // read ProfileMap
    return true;
}

bool CSProfileGenerator::collectFunctionsFromLLVMProfile(..) {
    // read trie nodes
    return true;
}
1047

I think it's ok to not support profile-summary-contextless=0 given we haven't really use it much. Maybe add an assertion for UseContextLessSummary to be true?

wlei updated this revision to Diff 439611.Jun 23 2022, 8:16 PM

Addressing Wenlei's feedback

wenlei accepted this revision.Jun 27 2022, 6:11 PM

lgtm, thanks.

This revision was landed with ongoing or failed builds.Jun 27 2022, 11:30 PM
This revision was automatically updated to reflect the committed changes.