This is an archive of the discontinued LLVM Phabricator instance.

[llvm-profgen] Ignore branch count against outline function
ClosedPublic

Authored by wlei on Sep 30 2021, 11:06 AM.

Details

Summary

For some transformations like hot-cold split or coro split, it can outline its part of function ranges. Since sample loader is the early stage of backend and no split happens at that time, compiler can't recognize those function, so in llvm-profgen we should attribute the sample to the original function. This is already done for the body range samples since we use the symbols from dwarf which is created before the split.

But for branch samples, the call from master function to its outlined function is actually not a call to the original function, we shouldn't add head/callsie samples for it. So instead of dwarf symbol, we use the symbols from symbol table and ignore those functions with special suffixes(like .cold ,.resume) for accumulating the callsite/head samples.

Diff Detail

Event Timeline

wlei created this revision.Sep 30 2021, 11:06 AM
wlei requested review of this revision.Sep 30 2021, 11:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 30 2021, 11:06 AM
hoy added inline comments.Oct 1 2021, 11:55 AM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
403

nit: isOutlinedFunction

We should still count samples for it except for the head samples.

407

Check against specific suffix here so that we don't overlook cases we are not aware of?

wlei added inline comments.Oct 1 2021, 12:52 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
403

Yeah, this is only called by populateBoundarySamples, so the body samples can be collected for those outlined functions.

407

So I guess you mean we can't assume here it covers all the outlined function case, right? I was thinking all the non-outlined cases are all handled by getCanonicalFnName.

wlei added inline comments.Oct 1 2021, 1:01 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
407

If we have an outlined function here, say it's name is foo.cold, it's function body code is still count against foo because we use the name from debug table. so all the body samples are collected into foo 's FunctionSamples. This is the right way.

But for the head samples, here we used the name from symbol table, so if we count this, we will have a sample line like

foo.cold:0:100

means the total_samples is zero and head samples is not.

So I'm wondering if we don't do anything for populateBoundarySamples and we have a post-process to trim all the FunctionSamples that headsamples is non-zero but total_samples is zero.

With this, I guess we won't overlook any cases. what do you think?

hoy added inline comments.Oct 1 2021, 1:16 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
407

Actually I meant the change here is a bit too general. We are assuming names with a suffix other than .__uniq, are all for backend-outlined functions. I'm not quite sure that's the case so suggesting we just check against .resume here. For example, the front end could outline functions and when it comes to sample loader, those outlined functions should have their own profiles.

wlei added inline comments.Oct 1 2021, 1:53 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
407

That makes sense, thanks!

Then how about we just leave it unchanged like before(we can leave the test in this patch).

Currently we already use the name from symbol table, so we won't have the issue as our internal tool which used from debug table.

The only issue is we might have some sample like foo.cold:0:100, but it won't have impact.

If it's a pre-SampleLoader outlined function, compiler can consume it otherwise it just ignore it.

hoy added inline comments.Oct 1 2021, 3:07 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
407

That sounds good too. The extra profile there will just be ignored by the sample loader.

wlei updated this revision to Diff 376656.Oct 1 2021, 5:51 PM
wlei retitled this revision from [llvm-profgen] Ignore branch count against outline function to [llvm-profgen] Update test case to check branch count against outline function.
wlei edited the summary of this revision. (Show Details)

Remove the related code to check outlined function,.

If it's an outlined function with suffixes, currently we only generate a zero total count and non-zero head count FunctionSample like "foo.cold:0:100".

If the outlined function is generated before the SampleLoader, the compiler will consume it, otherwise it will ignore it.

For llvm-profgen side, If it gets an outlined function with suffixes, currently we only generate a zero total count and non-zero head count FunctionSample like "foo.cold:0:100".

This seems like an inconsistent state, and it would also confuse profile similarity check depending on whether we have split/outline or not.

For compiler side, If the outlined function is generated before the SampleLoader, the compiler will consume it, otherwise it will ignore it.

Where do we attribute the body samples for the funclet? If that's with the main body (instead of function), we won't be able to consume the funclet profile for splitting before sample loader.

Though I think most split/outline happens after sample loader, and it's cleaner if we attribute all counts to the original function, then left to the split/outline to decide how to split the counts.

wlei added a comment.Oct 5 2021, 5:13 PM

For llvm-profgen side, If it gets an outlined function with suffixes, currently we only generate a zero total count and non-zero head count FunctionSample like "foo.cold:0:100".

This seems like an inconsistent state, and it would also confuse profile similarity check depending on whether we have split/outline or not.

For compiler side, If the outlined function is generated before the SampleLoader, the compiler will consume it, otherwise it will ignore it.

Where do we attribute the body samples for the funclet? If that's with the main body (instead of function), we won't be able to consume the funclet profile for splitting before sample loader.

Though I think most split/outline happens after sample loader, and it's cleaner if we attribute all counts to the original function, then left to the split/outline to decide how to split the counts.

I see, yeah, see the code below, if I understand correctly, the pre-sample-loader funclet won't have profile matched(always missing profile). We use findFunctionSamples which check DILocation2SampleMap for the Inst's profile. However, for every function, the DILocation2SampleMap will be reset to empty. So for a funclet, it can't find the loc from it's master function's DILocation2SampleMap, it will just return null.

bool SampleProfileLoader::runOnFunction(Function &F, ModuleAnalysisManager *AM) {
  DILocation2SampleMap.clear();
  ...
}
template <typename BT>
const FunctionSamples *SampleProfileLoaderBaseImpl<BT>::findFunctionSamples(
    const InstructionT &Inst) const {
  const DILocation *DIL = Inst.getDebugLoc();
  auto it = DILocation2SampleMap.try_emplace(DIL, nullptr);
  ...
  return it.first->second;
}
wlei updated this revision to Diff 377429.Oct 5 2021, 10:29 PM
wlei retitled this revision from [llvm-profgen] Update test case to check branch count against outline function to [llvm-profgen] Ignore branch count against outline function.

revert back to check outlined function

Checked for compiler side, it can't match any of the pre-sample-loader outline function, so we can just ignore all the functions with .* suffix.

wlei updated this revision to Diff 377710.Oct 6 2021, 3:23 PM
wlei edited the summary of this revision. (Show Details)

Filter out hot-cold split and coro split function only.

hoy accepted this revision.Oct 6 2021, 3:43 PM

lgtm, thanks.

This revision is now accepted and ready to land.Oct 6 2021, 3:43 PM
wenlei accepted this revision.Oct 6 2021, 10:43 PM

lgtm, thanks.

Would be good to clarify in commit message that the body samples will always be attributed to the main body based on dwarf symbol name which is pre-split.

wlei edited the summary of this revision. (Show Details)Oct 7 2021, 1:15 PM
This revision was landed with ongoing or failed builds.Oct 7 2021, 2:04 PM
This revision was automatically updated to reflect the committed changes.