Page MenuHomePhabricator

[CSSPGO] Consume pseudo-probe-based AutoFDO profile
ClosedPublic

Authored by hoy on Nov 30 2020, 2:02 PM.

Details

Summary

This change enables pseudo-probe-based sample counts to be consumed by the sample profile loader under the regular -fprofile-sample-use switch with minimal adjustments to the existing sample file formats. After the counts are imported, a probe helper, aka, a PseudoProbeManager object, is automatically launched to verify the CFG checksum of every function in the current compilation against the corresponding checksum from the profile. Mismatched checksums will cause a function profile to be slipped. A SampleProfileProber pass is scheduled before any of the SampleProfileLoader instances so that the CFG checksums as well as probe mappings are available during the profile loading time. The PseudoProbeManager object is set up right after the profile reading is done. In the future a CFG-based fuzzy matching could be done in PseudoProbeManager.

Samples will be applied only to pseudo probe instructions as well as probed callsites once the checksum verification goes through. Those instructions are processed in the same way that regular instructions would be processed in the line-number-based scenario. In other words, a function is processed in a regular way as if it was reduced to just containing pseudo probes (block probes and callsites).

Adjustment to profile format

A CFG checksum field is being added to the existing AutoFDO profile formats. So far only the text format and the extended binary format are supported. For the text format, a new line like

!CFGChecksum: 12345

is added to the end of the body sample lines. For the extended binary profile format, we introduce a metadata section to store the checksum map from function names to their CFG checksums.

Diff Detail

Event Timeline

hoy created this revision.Nov 30 2020, 2:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2020, 2:02 PM
hoy requested review of this revision.Nov 30 2020, 2:02 PM
hoy edited the summary of this revision. (Show Details)Nov 30 2020, 2:05 PM
hoy updated this revision to Diff 311078.Dec 10 2020, 5:44 PM

Rebasing

hoy updated this revision to Diff 311299.Dec 11 2020, 12:13 PM

Updating test.

wmi added inline comments.Dec 13 2020, 12:13 PM
llvm/include/llvm/ProfileData/SampleProf.h
554–556

I guess it makes sense to treat missing entry to have zero count instead of unknown because the function profiles with unmatched CFGchecksum has been dropped, is that right?

If it is, can we check CFGchecksum not zero here instead of ProfileIsProbeBased or ProfileIsCS? When CFGchecksum is not zero and matched, we are confident the profile is accurate no matter what kind of profile it is.

Another thing is: I think debug info based profile can have a similar checksum then it can use a similar strategy here as probe based profile. It may not be a CFGchecksum because debug info based profile has no idea about CFG, so it is better to rename CFGchecksum to something more general.

675–677

Is it correct that for probe based profile all the FunctionHash should already be initialized to a non-zero value?

681–682

Item1 won't be an issue if -funique-internal-linkage-names becomes default (https://reviews.llvm.org/rGad1b9daa4bf40c1907794fd5de7807aad1f0553c).

llvm/lib/IR/PseudoProbe.cpp
38–44

Can you wrap them into a function like extractProbeFromDiscriminator?

llvm/lib/ProfileData/SampleProf.cpp
135–138

Should we output CFGchecksum if it is supported? It will affect the output of 'llvm-profdata show -sample'

llvm/lib/ProfileData/SampleProfReader.cpp
870

Nit: if (!ProfileIsProbeBased) return sampleprof_error::success;

llvm/lib/ProfileData/SampleProfWriter.cpp
171

Nit: if (!FunctionSamples::ProfileIsProbeBased) return sampleprof_error::success;

174

What about the case that the profile use MD5 to represent name?

llvm/tools/llvm-profdata/llvm-profdata.cpp
684–686

This should use exitWithError instead of assertion for user to get the error message in release mode, same as you did for compare below.

hoy marked 5 inline comments as done.Dec 13 2020, 9:22 PM
hoy added inline comments.
llvm/include/llvm/ProfileData/SampleProf.h
554–556

I think the comment I put here is a bit confusing. A missing entry for a probe here really means a missing counter. When a probe counter doesn't appear in a function profile, we are sure that the probe has not been executed (virtually). From our experiment, giving it a zero count is a bit more accurate instead of having the counts inference algorithm to figure out a count for it.

We use CFGchecksum to identify if a profile is debug-info-based or pseudo-probe-based. A probe-based profile will likely have a non-zero checksum due to the way CFGchecksum is computed. As you pointed out, when CFGchecksum is matched, we are confident the profile is accurate (even if there are missing counters for some probes). If not matched, the profile will be completely dropped and no counters will be applied to the function..

It's a good suggestion to apply a similar checksum for debug-info-based profile to tell how close the pervious source is to the current IR. We can discuss more about this.

675–677

A FunctionHash is SampleProfileProber::computeCFGHash which in theory could have zero value, though in practice we haven't seen that even with a single-block function.

681–682

Thanks for pointing this out! The switch will be helpful. Comment updated.

llvm/lib/IR/PseudoProbe.cpp
38–44

Done.

llvm/lib/ProfileData/SampleProf.cpp
135–138

Good point, added printing for CFGchecksum.

llvm/lib/ProfileData/SampleProfWriter.cpp
174

Good question. I thought this is orthogonal from MD5 but correct me if I'm wrong. Does a name index point to the name table that are filled with MD5 codes?

llvm/tools/llvm-profdata/llvm-profdata.cpp
684–686

Sounds good.

hoy updated this revision to Diff 311483.Dec 13 2020, 9:22 PM
hoy marked 3 inline comments as done.

Addressing Wei's feedbacks.

wmi added inline comments.Dec 14 2020, 9:34 AM
llvm/include/llvm/ProfileData/SampleProf.h
554–556

A missing entry for a probe here really means a missing counter.

Could you list the possible cases you have known where probe could be missing?

From our experiment, giving it a zero count is a bit more accurate instead of having the counts inference algorithm to figure out a count for it.

Yes, I agree with that. I just want to know whether it makes sense to check CFGChecksum match instead of checking the profile is probe based or context sensitive.

hoy added inline comments.Dec 14 2020, 10:03 AM
llvm/include/llvm/ProfileData/SampleProf.h
554–556

Yeah. A probe is deleted literally only if the corresponding block is really dead. Otherwise if the probe counter is not sampled, it means the block is there but just was not run. We trust this and assign a zero weight to the probe. There is a third case where a probe is sampled through its adjacent physical instruction but the samples are not trusted because the next physical instruction starts a new block. We call that case a "dangling" probe. This usually reflects that all instructions of a block are moved out of that block and only that probe is left there. For dangling probes we intentionally record a zero count in the profile instead of a missing counter. This helps the counts inference algorithm. Will have a patch to deal with optimizations that result in dangling probes.

I see. So far the check against probe-based is equivalent to checking against CFGChecksum, since that's the way to identify if a profile is probe-based or debug-info based. If we'd like to also employ a CFGChecksum for the debug-info scenario in future, it's reasonable to check CFGChecksum here.

wmi added inline comments.Dec 14 2020, 10:40 AM
llvm/include/llvm/ProfileData/SampleProf.h
554–556

Is there case that optimizations after probe insertion create new BBs and the new BBs somehow don't have probes?

So far the check against probe-based is equivalent to checking against CFGChecksum, since that's the way to identify if a profile is probe-based or debug-info based.

Ok, it is fine to leave the change later.

llvm/lib/ProfileData/SampleProfWriter.cpp
174

Ah, sorry, never mind. It is writing name index intead of name here so it is fine.

hoy added inline comments.Dec 14 2020, 10:40 AM
llvm/include/llvm/ProfileData/SampleProf.h
554–556

Given a second thought, I'm not sure how we'd like to make a CFGChecksum for the debug-info profile. Maybe just check ProfileIsProbeBased for now and change it to checking CFGChecksum when it comes to extending debug-info profile? What do you think?

wmi added inline comments.Dec 15 2020, 5:45 PM
llvm/include/llvm/ProfileData/SampleProf.h
554–556

Yes, I agree with it. It is fine to adjust it later.

554–556

Is there case that optimizations after probe insertion create new BBs and the new BBs somehow don't have probes?

I still curious about the answer to the question above. Could you share your thought?

hoy added inline comments.Dec 15 2020, 6:10 PM
llvm/include/llvm/ProfileData/SampleProf.h
554–556

Sorry for missing the comment.

That's a good question. Yes, there will be blocks created after probe insertion that won't get a probe. I hope that's not too common if the new blocks really contain code from the original IR. And if that happens, we could rely on the counts inference pass to mitigate that. The worst case is that optimizer creates a conditional structure with none of the new blocks copied from the original IR and places some user code in them. Since there's no probe in the new blocks, we won't know anything about how frequent that code is executed during sampling. I guess in that case we would need to fix the optimizer to copy the original probe along with the code, if the optimization is not done on block level.

hoy added inline comments.Dec 15 2020, 6:41 PM
llvm/include/llvm/ProfileData/SampleProf.h
554–556

I could write a pass to validate that every block with real instructions gets a probe and run it after each optimization. That should give us a view of how common the missing probe cases are. Then we can start fixing the offending passes from there. What do you think? Thanks for raising the issue.

wmi added inline comments.Dec 15 2020, 8:31 PM
llvm/include/llvm/ProfileData/SampleProf.h
554–556

That is a good plan. Thanks. And I think it is fine to return zero count for BB without probe here, and revisit it later.

llvm/lib/IR/PseudoProbe.cpp
25

Add a message for it.

llvm/lib/Transforms/IPO/SampleProfile.cpp
790

Add a message for it.

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
46

Change to emplace_back to avoid build issue of old GCC?

llvm/test/Transforms/SampleProfile/pseudo-probe-profile.ll
2

Could you also add a test similar to test/Transforms/SampleProfile/function_metadata.ll? That is to verify pseudo probe works with thinlto -- hot inline instance will be added to function_entry_count metadata so that thinlto can import them.

hoy marked 2 inline comments as done.Dec 15 2020, 11:55 PM
hoy added inline comments.
llvm/lib/IR/PseudoProbe.cpp
25

Done.

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
46

Good catch, thanks!

llvm/test/Transforms/SampleProfile/pseudo-probe-profile.ll
2

Thanks for the suggestion. Adjusted function_metadata.ll for pseudo probe.

hoy updated this revision to Diff 312131.Dec 15 2020, 11:56 PM
hoy marked an inline comment as done.

Addressing Wei's feedbacks.

wmi accepted this revision.Dec 16 2020, 12:16 PM

LGTM.

This revision is now accepted and ready to land.Dec 16 2020, 12:16 PM
This revision was landed with ongoing or failed builds.Dec 16 2020, 4:02 PM
This revision was automatically updated to reflect the committed changes.