In the current implementation, we only have accurate profile counts for standalone symbols. For inlined functions, we do not have entry count data because it is not available in LBR. In this patch, we use the frequency of the first instruction to estimate the function's entry count, especially for inlined functions. This may be inaccurate because debug info in optimized code can be imprecise. However, it is a better estimate than the static 80/20 estimation in the current implementation.
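A minimal sketch of the estimate described above, not the patch's actual code. The `FunctionProfile` struct, its fields, and `estimateEntryCount` are hypothetical names for illustration, and the sketch assumes the sample at the lowest line offset stands in for the first instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for one function's samples (not an LLVM type).
struct FunctionProfile {
  bool HasHeadCount;   // true only when LBR gave us a real entry count
  uint64_t HeadCount;  // entry count from LBR, valid if HasHeadCount
  std::map<uint32_t, uint64_t> BodySamples; // line offset -> sample count
};

// Estimate the entry count of a function.
// Standalone symbols keep the LBR-derived count; inlined functions fall back
// to the count of the lowest line offset that has a sample (assumed here to
// correspond to the first instruction).
uint64_t estimateEntryCount(const FunctionProfile &P) {
  if (P.HasHeadCount)
    return P.HeadCount;
  if (P.BodySamples.empty())
    return 0; // nothing to estimate from
  // std::map iterates in key order, so begin() is the lowest offset.
  return P.BodySamples.begin()->second;
}
```

For an inlined function with samples `{1: 120, 3: 80}` and no LBR head count, this returns 120. If debug info attributes the first instruction to a later line, the estimate shifts accordingly, which is the inaccuracy the summary mentions.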
"An indirect callsite may be..."?
Is this annotating the value profile before or after promotion/inlining of the indirect calls recorded as inlined in the profile? If after, I'm not sure why we want to include the counts of the promoted/inlined targets in Sum (which AFAICT is what will happen by computing Sum via findIndirectCallFunctionSamples).
It's after. We need to include all promoted counts in Sum because we do not want to promote too many targets. E.g., if an indirect call site has already been promoted to 3 targets that cover 99% of the cases, we do not want to promote a 4th target for the remaining 1%, even if all of that 1% goes to the 4th target.
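To make the 99%/1% example concrete, here is a rough sketch of how the choice of Sum changes the decision. `worthPromoting` and the 30% threshold are hypothetical and chosen only for illustration; they are not the thresholds or helpers used by the patch or by the ICP pass.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check: promote a target only if it accounts for at least
// MinPercent of Sum. Assumes counts are small enough that
// TargetCount * 100 does not overflow.
bool worthPromoting(uint64_t TargetCount, uint64_t Sum, uint64_t MinPercent) {
  if (Sum == 0)
    return false;
  return TargetCount * 100 >= Sum * MinPercent;
}

// Sum that keeps the counts of already promoted targets.
uint64_t sumWithPromoted(uint64_t PromotedCount, uint64_t RemainingCount) {
  return PromotedCount + RemainingCount;
}
```

With 9900 samples already promoted and 100 remaining samples that all go to a 4th target, `worthPromoting(100, sumWithPromoted(9900, 100), 30)` is false, while `worthPromoting(100, 100, 30)` is true. Dropping the promoted counts from Sum would therefore make the 4th target look like a 100% hit.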
The reason I ask is that this is different than what is done by the main ICP pass. E.g. see ICallPromotionFunc::processFunction and ICallPromotionFunc::tryToPromote, which update the total count before re-generating a new VP annotation. It also makes this patch have a larger effect than what is described in the summary. Suggest splitting this change out and evaluating separately.