Page MenuHomePhabricator

Add a pass to generate synthetic function entry counts.

Authored by eraman on Dec 27 2017, 5:03 PM.



This pass synthesizes function entry counts by traversing the callgraph
and using the relative block frequencies of the callsites. The intended
use of these counts is in inlining to determine hot/cold callsites in
the absence of profile information.

The pass is split into two files with the code that propagates the
counts in a callgraph in a Utils file. I plan to add support for
propagation in the thinlto link phase and the propagation code will be
shared and hence this split. I did not add support to the old PM since
hot callsite determination in inlining is not possible in old PM
(although we could use hot callee heuristic with synthetic counts in the
old PM it is not worth the effort tuning it)

Diff Detail


Event Timeline

eraman created this revision.Dec 27 2017, 5:03 PM
davidxl added inline comments.Jan 3 2018, 2:21 PM
27 ↗(On Diff #128250)

add documentation comments here.

256 ↗(On Diff #128250)

Is it better to let getEntryCount to take an optional flag to indicate 'is_synthetic'? The implementation of these two are mostly the same.

48 ↗(On Diff #128250)

We should probably differentiate functions with cold attributes and set them with lower initial count.

70 ↗(On Diff #128250)

This should be restricted to local functions not address escaped. If the local function is only called indirectly, setting its initial count to zero can make it look too cold. For address exposed ones, the initial count should be handled similarly to global functions.

eraman marked 2 inline comments as done.Jan 3 2018, 5:38 PM
eraman added inline comments.
27 ↗(On Diff #128250)

I have the documentation comments next to the implementation in the cpp file.

256 ↗(On Diff #128250)

I actually intended to remove this but this slipped in. I have a follow up patch that makes EntryCount a class and make the getEntryCount return it. Using a boolean for synthetic feels brittle. Shall I remove this get method?

eraman updated this revision to Diff 128588.Jan 3 2018, 5:39 PM

Address review comments.

davidxl added inline comments.Jan 4 2018, 11:50 AM
256 ↗(On Diff #128250)

The new interface that returns EntryCount seems a better approach. Yes, if you can unify the interfaces, this one should be removed. You can do it in two steps -- one takes the boolean flag, and then improved with a follow up patch.

50 ↗(On Diff #128588)

Nit: callsites that call into the SCC come first.

64 ↗(On Diff #128588)

PropagatedCounts --> ExtraSCCEntryCountFromCycles ? or just ExtraSCCEntryCounts?

Also modify the comments to say:

  1. first update the global count of nodes with SCC using cyclic edges first.
  2. the update is done in two steps ..
67 ↗(On Diff #128588)

Weight --> RelativeFreq

It is confusing to see Weight multiplied by Count

106 ↗(On Diff #128588)

Call it CallSiteRelFreq may make the code more readable.

eraman marked 4 inline comments as done.Jan 4 2018, 4:58 PM
eraman added inline comments.
256 ↗(On Diff #128250)

This patch doesn't read the synthetic entry count and hence I am just removing the getSyntheticEntryCount from this. I will leave the boolean on the setEntryCount and in the next patch add an EntryCount class.

64 ↗(On Diff #128588)

I have modified the comments along the lines of what you say. I have renamed the variable to AdditionalCounts - I prefer this to your other suggestions.

eraman updated this revision to Diff 128685.Jan 4 2018, 4:59 PM
eraman marked an inline comment as done.

Address David's comments.

davidxl accepted this revision.Jan 8 2018, 9:39 AM


60 ↗(On Diff #128685)

make it zero perhaps?

This revision is now accepted and ready to land.Jan 8 2018, 9:39 AM
eraman added inline comments.Jan 8 2018, 11:01 AM
60 ↗(On Diff #128685)

I have no strong opinion, but I expect the default values to change after we make the inliner use these synthetic counts and tune the inliner. So I'll leave it as 5 for now.

This revision was automatically updated to reflect the committed changes.