AutoFDO performance is sensitive to profile density, i.e., the number of samples in the profile relative to the program size: a profile with too few samples can be inaccurate due to statistical noise and thus hurt AutoFDO performance. A previous investigation showed that AutoFDO performed better on MySQL with more samples. Therefore, we implement a profile-density computation feature to give users and the compiler hints about profile density.
We define the density of a profile Prof as follows:
- For each function A in the profile, density(A) = total_samples(A) / sizeof(A).
- density(Prof) = min(density(A)) for all functions A that are warm (defined below).
A function is considered warm if its total sample count is within the top N percent of the profile. In the implementation, we reuse ProfileSummaryBuilder::getHotCountThreshold(..) as the threshold, which can be set by percentage (--profile-summary-cutoff-hot) or by value (--profile-summary-hot-count).
We also introduce --hot-function-density-threshold to set the density threshold for hot functions. If the profile density is below it, a suggestion is emitted, since this implies that more samples should be collected.
This also applies to CS (context-sensitive) profiles, with all context profiles merged into the base profile.
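
For illustration, the density definition above could be sketched as follows. This is a minimal standalone sketch, not the code in this patch: `FunctionProfile`, `profile_density`, and `needs_more_samples` are hypothetical names, the hot-count threshold is passed in as a plain integer standing in for the value returned by ProfileSummaryBuilder::getHotCountThreshold(..), and a function is assumed to be warm when its total samples are at or above that threshold.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical flattened view of one function's profile; not an LLVM type.
struct FunctionProfile {
  std::string name;
  uint64_t total_samples; // total_samples(A)
  uint64_t size;          // sizeof(A), in whatever unit the tool uses
};

// Returns min(density(A)) over all warm functions A, where
// density(A) = total_samples(A) / sizeof(A).
// Assumption: A is warm when total_samples(A) >= hot_count_threshold.
// Returns std::nullopt when no function is warm.
std::optional<double>
profile_density(const std::vector<FunctionProfile> &funcs,
                uint64_t hot_count_threshold) {
  std::optional<double> density;
  for (const FunctionProfile &f : funcs) {
    // Skip cold functions and guard against division by zero.
    if (f.size == 0 || f.total_samples < hot_count_threshold)
      continue;
    double d = static_cast<double>(f.total_samples) /
               static_cast<double>(f.size);
    density = density ? std::min(*density, d) : d;
  }
  return density;
}

// True when the profile density is below the user-supplied threshold
// (the role played by --hot-function-density-threshold), i.e. when
// collecting more samples would be suggested.
bool needs_more_samples(std::optional<double> density,
                        double density_threshold) {
  return density.has_value() && *density < density_threshold;
}
```

Taking the minimum over warm functions makes the metric conservative: a single sparsely sampled warm function is enough to pull the profile density down and trigger the suggestion.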
Can we omit raw_fd_ostream &OS as a parameter and use outs() directly inside?
With that, can we also remove the include for raw_os_ostream.h?