This is an archive of the discontinued LLVM Phabricator instance.

[CSSPGO][CSProfileConverter] Remove call target samples when including callee samples into caller.
ClosedPublic

Authored by hoy on May 9 2022, 1:52 PM.

Details

Summary

When a flat CS profile is converted to a nested profile, the call target samples for inlined callee contexts are left over in the callsite target map. This could cause indirect call promotion to function improperly. One issue is that the inlined callsites are treated with double amount of samples. The other is the inlined callsites are reconsidered for subsequent PGO ICP.

I'm fixing this by excluding call targets from the callsite for inlined targets. While fixing this I found that callsite target sum and the number of body samples for that callsite could be mismatched. D122609: [llvm-profgen] Update callsite body samples by summing up all call target samples. has an explanation and a fix for that on llvm-profgen side. For now I'm tolerating it in this change.

Diff Detail

Event Timeline

hoy created this revision.May 9 2022, 1:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2022, 1:52 PM
hoy requested review of this revision.May 9 2022, 1:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2022, 1:52 PM
hoy updated this revision to Diff 428197.May 9 2022, 2:04 PM

Updating D125266: [CSSPGO][CSProfileConverter] Remove call target samples when including callee samples into caller.

hoy edited the summary of this revision. (Show Details)May 9 2022, 2:21 PM
hoy added reviewers: wenlei, wlei.

When a flat CS profile is converted to a nested profile, the call target samples for inlined callee contexts are left over in the callsite target map.

Good catch.. That is also not consistent with AutoFDO profile -- a callee should be in either call site sample or target map of body sample, but not both.

llvm/include/llvm/ProfileData/SampleProf.h
378

The asymmetricity between removeCalledTarget and addCalledTarget can be confusing. We're reducing NumSamples in removeCalledTarget, but we are not adding removeCalledTarget in addCalledTarget. Two questions, 1) should NumSamples include target counts, and are we doing that consistently? 2) Any ways to make them the two functions consistent?

383

You used optional instead relying on zero to indicate no removal here because there could be count be zero count returned due to mismatch but removal is still happening?

llvm/lib/ProfileData/SampleProf.cpp
527

nit: subtractTotalSamples to be clear.

hoy added inline comments.May 10 2022, 3:00 PM
llvm/include/llvm/ProfileData/SampleProf.h
378

NumSamples and target counts are computed separately in llvm-profgen, one is based on lbr ranges while the other is based on branch samples. So they do not have any connection on the profile generator side.

I guess we can separate removeSamples out of removeCalledTarget to be consistent. The same would also be needed on the FunctionSample side. Maybe just rename removeCalledTarget to removeCalledTargetAndBodySamples? For now they should always be reduced together.

383

Yes, it is to identify mismatch or zero-count call targets. The latter should be very rare though.

llvm/lib/ProfileData/SampleProf.cpp
527

Done.

wenlei added inline comments.May 11 2022, 11:35 AM
llvm/include/llvm/ProfileData/SampleProf.h
378

Ok, now I see that they're added separately. I'm wondering what is the definition of NumSamples. If the two don't have connection and NumSamples is samples for this location, and call target count is the count for specific call targets, then they should be changed independently.

Here changing the call/callee representation from CallTargets to nested FunctionSamples shouldn't change the value of NumSamples because the we would still derived the same value from the same ranges?

hoy added inline comments.May 11 2022, 12:09 PM
llvm/include/llvm/ProfileData/SampleProf.h
378

By no connection I mean how they are generated in llvm-profgen and used in the compiler technically. But logically NumSamples should be the sum of all target counts. It reflects the the execution count of the callsite. Removing one target means the callsite won't be executed as many times as before. The reduced count will be instead reflected on another code path that promotes the indirect callsite. Annotating block frequency based on the original callsite count (due to the max operation) could be misleading.

wenlei added inline comments.May 12 2022, 5:03 PM
llvm/include/llvm/ProfileData/SampleProf.h
378

Followed up offline. So the key is that NumSamples of call site BodySample only contains execution counts for not inline calls. And nest profile implies inlining, so changing representation from call target to nest profile does need to reduce NumSample for call site.. So it's good here.

The asymmetricity in naming and what they do is still less than ideal though. Perhaps for SampleRecord we can just make it consistent by having one removeCalledTarget that only removes target, and, and one removeBodySamples that only reduces NumSamples.

For FunctionSamples, removeCalledTarget-> removeCalledTargetAndBodySample?

hoy added inline comments.May 12 2022, 8:17 PM
llvm/include/llvm/ProfileData/SampleProf.h
378

Sounds good. I'm also removing the use of Optional, instead using zero count to simplify the logic.

hoy updated this revision to Diff 429126.May 12 2022, 8:18 PM

Updating D125266: [CSSPGO][CSProfileConverter] Remove call target samples when including callee samples into caller.

wenlei accepted this revision.May 12 2022, 11:17 PM

lgtm except nits.

llvm/include/llvm/ProfileData/SampleProf.h
728

nit: now that we have a removeSamples, name this removeTotalSamples.

llvm/lib/ProfileData/SampleProf.cpp
27

Is this needed?

This revision is now accepted and ready to land.May 12 2022, 11:17 PM
hoy added inline comments.May 13 2022, 9:12 AM
llvm/include/llvm/ProfileData/SampleProf.h
728

Sounds good.

llvm/lib/ProfileData/SampleProf.cpp
27

Good catch. It's not. It was automatically added by my editor,

hoy updated this revision to Diff 429269.May 13 2022, 9:13 AM

Updating D125266: [CSSPGO][CSProfileConverter] Remove call target samples when including callee samples into caller.

This revision was landed with ongoing or failed builds.May 13 2022, 9:19 AM
This revision was automatically updated to reflect the committed changes.