The current prefix based function layout algorithm only looks at function's entry count, which is not sufficient. A function should be grouped together if its entry count or any call edge count is hot.
Details
Diff Detail
- Build Status
Buildable 4949 Build 4949: arc lint + arc unit
Event Timeline
I concur that the current state of checking the hotness of entry count is not good. I am not sure this is the right approach without seeing some numbers. Consider a function foo with calls to bar1 and bar2. Let's say foo's entry is cold. Neither foo->bar1 nor foo->bar2 is hot, but the sum of their counts is hot. Your new solution will still place foo in cold section. So may be we should add all outgoing counts and check its hotness. Even then, we may not want to put foo in cold section if the sum of outgoing edges is not cold.
Updated the patch to change for the cold prefix tool.
In terms of performance, it will only affect ITLB performance on very large applications. For the perf test I've done, it shows improved ITLB and heatmap, but no visible performance difference is observed.
lib/Analysis/ProfileSummaryInfo.cpp | ||
---|---|---|
128 | Based on our offline conversation, I thought we want a function to be labeled only if both its entry count and the outgoing edge count sum is cold, no? With this code, a cold function can still have non-hot non-cold out edges. I think it makes sense to have coldInCallGraph => cold entry count AND cold out edges Do you agree? |
lib/Analysis/ProfileSummaryInfo.cpp | ||
---|---|---|
128 | You are right. Updated the code. |
The code looks fine to me, but the test needs to be updated.
lib/Analysis/ProfileSummaryInfo.cpp | ||
---|---|---|
120 | Nit: function's entry AND total call edge count is cold | |
121 | Comment has been pasted from the previous function and hasn't been properly updated. | |
test/Transforms/CodeGenPrepare/section.ll | ||
1 | The test needs to be updated. You've added one case - a function with cold entry with hot call edge - but the other scenarios (sum of outgoing edges is hot, cold function with non-cold outgoing edges etc) are not tested. |
LGTM after addressing the test changes.
test/Transforms/CodeGenPrepare/section.ll | ||
---|---|---|
1 | It's better now, but I prefer you add cases where there are multiple out edges where sum of outgoing edges is hot (and individual edges are not) and sum of outgoing edges is not cold with individual edges being cold. |
Nit: function's entry AND total call edge count is cold