If a function is hot, put it in text.hot section.
Details
Diff Detail
Event Timeline
lib/CodeGen/TargetLoweringObjectFileImpl.cpp | ||
---|---|---|
308 |
In short, the initial heuristic should be conservative in the sense we only filter out those we are really confident with -- mainly cold ones never executed which can go a long way.
|
include/llvm/ProfileData/ProfileCommon.h | ||
---|---|---|
70 | Document this method -- what does it do? | |
lib/CodeGen/TargetLoweringObjectFileImpl.cpp | ||
248 | Move these two functions into ProfileSummary.h (as a standalone function in llvm:: namespace) | |
252 | Better name it as: GroupFunctionsByHotness | |
255 | I think the initial value should be false. To turn it on by default, more data is needed (and announced more widely). | |
lib/ProfileData/ProfileSummary.cpp | ||
81 | use FIXME | |
87 | FIXME | |
98 | FIXME |
lib/ProfileData/ProfileSummary.cpp | ||
---|---|---|
80 | If this takes Function * as an argument, it can be a wrapper of getEntryCount call | |
91 | I suggest changing this interface's name to be 'isFunctionRarelyCalled'. It is different from 'isFunctionCold' which should be checking the internal block count. These two interfaces are for different purposes. The later can be used to guide decisions such as optimize for size etc, while the former can be used for layout decision to reduce itlb misses. Similarly, isFunctionHot does not mean it is frequently called either -- so the interface need to be split up too. | |
95 | the check of hasProfile is probably redundant. |
include/llvm/ProfileData/ProfileCommon.h | ||
---|---|---|
32 | These two functions should be elsewhere. Perhaps in TargetLowering ? | |
lib/ProfileData/ProfileSummary.cpp | ||
91 | This and isFunctionHot should be made as a member functions of Function. | |
95 | I think what David says is true (!F->genEntryCount() implies !hasProfile()). |
include/llvm/ProfileData/ProfileCommon.h | ||
---|---|---|
32 | The strings themselves are not profile related. One might want to use .unlikely for error handling routines even without profile data. | |
lib/ProfileData/ProfileSummary.cpp | ||
91 | Again, the function itself is not fully profile dependent. I think ProfileCommon should only have APIs like isCountConsideredHot(uint64_t Count). |
include/llvm/ProfileData/ProfileCommon.h | ||
---|---|---|
32 | Aren't those also related to profile too (it is profile that can be determined statically)? | |
lib/ProfileData/ProfileSummary.cpp | ||
91 | You have a point about Function being not profile dependent, but I think we should not restrict the scope of ProfileCommon.h too much like this. It is reasonable for ProfileCommon to reference IR related headers. |
update
lib/ProfileData/ProfileSummary.cpp | ||
---|---|---|
80 | !getEntryCount() does not mean no profile for the binary, but no profile for the function. We need to have this function to indicate if this is an FDO/AFDO build. | |
91 | After a second thought, I think it still makes sense to group "hot" functions together instead of "frequently called" This is because the hot path of an "infrequently called" hot function may have many calls to a frequently called function. So if these two functions are far apart, it will waste an ITLB entry. Additionally, this will pollute the heat map and makes performance tuning more difficult. | |
95 | I've updated the logic, PTAL. |
include/llvm/ProfileData/ProfileCommon.h | ||
---|---|---|
71 | If it is used to indicate profile availability at program level, it should take Module as argument. | |
lib/ProfileData/ProfileSummary.cpp | ||
99 | Should return false -- as we don't know (no profile data) | |
102 | This will work well for instr profile but may produce false positive for sampleFDO -- this will be fixed when summary based computation is used. |
lib/ProfileData/ProfileSummary.cpp | ||
---|---|---|
99 | For AutoFDO, F->setEntryCount() is only called when there is sample in that function. So for those 0-sampled functions, F->getEntryCount() will be false. | |
102 | Yes, but we can always set entry-count as no less than 1 if the function has profile. Otherwise we have no way to calculate bb counts. |
Looks good in general.
- Need a test case covering both regular case and -ffunction-sections (-function-sections for llc)
- Need a follow up discussion to introduce user level option -freorder-function (and turn on by default with profile) in LLVM. Also talk to other maintainers of other main platforms (MachO, etc) to add this support there.
lib/ProfileData/ProfileSummary.cpp | ||
---|---|---|
82 | return M->getMaximumFunctionCount().hasValue(); I know this doesn't work for sample profile, but it works for instrumentation profile and is better than returning false. | |
99 | I think sample profile loader pass should set 0 to the entry count. The meaning of getEntryCount() should not be dependent on the profile format that is used. (To be clear, I am not saying that should be fixed as part of this patch) |
added the unittest. also removed hasProfile to use getEntryCount()'s boolean value to check. I will have a follow-up patch to make getEntryCount() behavior consistent between sample profile and instr profile.
test/CodeGen/X86/partition-sections.ll | ||
---|---|---|
1 ↗ | (On Diff #48716) | Can you also add a test without -function-sections option. |
LGTM.
To repeat the follow ups needed:
- tuning, performance number and turning on by default for PGO
- support for other platforms (and consider user options if applicable).
The test case misses mtriple option.
Dehao, we should not leave buildbot failing for long period of time.
The buildbot message should arrive at your inbox very shortly after
the commit -- can you check if there is any issue there (e.g. in spam
filter)?
thanks,
David
These two functions should be elsewhere. Perhaps in TargetLowering ?