This is an archive of the discontinued LLVM Phabricator instance.

[SampleFDO] Don't mix up the existing indirect call value profile with the new value profile annotated after inlining.
ClosedPublic

Authored by wmi on Mar 17 2021, 6:26 PM.

Details

Summary

In https://reviews.llvm.org/D96806 and https://reviews.llvm.org/D97350, we use the magic number -1 in the value profile to avoid repeated indirect call promotion to the same target for an indirect call. Function updateIDTMetaData is used to mark an target as being promoted in the value profile with the magic number. updateIDTMetaData is also used to update the value profile when an indirect call is inlined and new inline instance profile should be applied. For the second case, currently updateIDTMetaData mixes up the existing value profile of the indirect call with the new profile, leading to the problematic senario that a target count is larger than the total count in the value profile.

The patch fixes the problem. When updateIDTMetaData is used to update the value profile after inlining, all the values in the existing value profile will be dropped except the values with the magic number counts.

Diff Detail

Event Timeline

wmi created this revision.Mar 17 2021, 6:26 PM
wmi requested review of this revision.Mar 17 2021, 6:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2021, 6:26 PM
davidxl added inline comments.Mar 17 2021, 8:19 PM
llvm/lib/Transforms/IPO/SampleProfile.cpp
767–768

The logic seems clearer if the handling is specialized for sum == 0 and != 0 case:

if (sum == 0) {
} else {

// iniitlaize value count map with already promoted targets.
 ....

}

wmi added inline comments.Mar 17 2021, 10:47 PM
llvm/lib/Transforms/IPO/SampleProfile.cpp
767–768

Very good suggestion. I try it and it looks clearer indeed.

wmi updated this revision to Diff 331467.Mar 17 2021, 10:47 PM

Address David's comment.

davidxl accepted this revision.Mar 18 2021, 9:13 AM

lgtm

This revision is now accepted and ready to land.Mar 18 2021, 9:13 AM
hoy added a comment.Mar 18 2021, 10:16 AM

Thanks for fixing the assert failure!

llvm/test/Transforms/SampleProfile/norepeated-icp-3.ll
72

I'd like to understand a bit more about the counts here. In reality, is 8000 from prelink and 5860 from postlink? If the same profile is used, why different callees can be seen?

hoy added inline comments.Mar 18 2021, 10:18 AM
llvm/test/Transforms/SampleProfile/norepeated-icp-3.ll
72

Asking because the failing assert also fired with csspgo, where in postlink the same profile was retrieved differently due to adjusted top-down order in postlink that affects context Tri promotion.

wmi added inline comments.Mar 18 2021, 11:53 AM
llvm/test/Transforms/SampleProfile/norepeated-icp-3.ll
72

One case is 8000 is from outline instance. After inlining, the indirect call get more precise profile 5860 from inline instance. There can be different set of callees in the value profile in the outline instance and in the inline instance.

If it is the same profile applied differently to the same indirect call in csspgo, we should still drop all the existing values in the value profile except those with -1 count and apply the new values, like the patch currently does.

hoy added inline comments.Mar 18 2021, 12:01 PM
llvm/test/Transforms/SampleProfile/norepeated-icp-3.ll
72

I see. Thanks for explaining. The CSSPGO case is a bit different. The inlined indirect callsite always uses the same context profile in prelink and postlink, but in postlink the Tri looks different when it comes to processing the inlined icall site because the inlinee the icall is from gets compiled before the caller, therefore the Tri context is promoted and the inlinee profile can no longer be found.