When a flat CS profile is converted to a nested profile, the call target samples for inlined callee contexts are left over in the callsite target map. This could cause indirect call promotion to function improperly. One issue is that the inlined callsites are treated with double amount of samples. The other is the inlined callsites are reconsidered for subsequent PGO ICP.
I'm fixing this by excluding call targets from the callsite for inlined targets. While fixing this I found that callsite target sum and the number of body samples for that callsite could be mismatched. D122609: [llvm-profgen] Update callsite body samples by summing up all call target samples. has an explanation and a fix for that on llvm-profgen side. For now I'm tolerating it in this change.
The asymmetricity between removeCalledTarget and addCalledTarget can be confusing. We're reducing NumSamples in removeCalledTarget, but we are not adding removeCalledTarget in addCalledTarget. Two questions, 1) should NumSamples include target counts, and are we doing that consistently? 2) Any ways to make them the two functions consistent?