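Based on internal testing at Google, we found that setting the profile
summary cutoff threshold to 999950 yields the best overall results in terms of
itlb and icache metrics (as observed on Intel CPUs): it gives the lowest itlb
MPKI in both benchmarks below and the lowest icache MPKI in Search1.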
*default* = Split out code only if no profile count is available for the block
*size-%* = The percentage of bytes split out of .text and .text.hot
*itlb* = Misses per kilo instructions (MPKI) for itlb
*icache* = Misses per kilo instructions (MPKI) for L1 icache
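
To make the cutoff concrete, here is a minimal sketch of how a cutoff-based cold decision could work. It is not the pass's actual implementation. It assumes the cutoff is expressed in parts per million (so 999950 means 99.995% of the total profile count), that a block is cold when its count falls strictly below the threshold count, and that blocks without a count stay cold in both modes. The names `count_threshold`, `split_cold`, and the example block names are hypothetical.

```python
SCALE = 1_000_000  # assumption: cutoffs are expressed in parts per million


def count_threshold(counts, cutoff_ppm):
    """Return the smallest count such that blocks at or above it
    cover at least cutoff_ppm / SCALE of the total profile count."""
    total = sum(counts)
    if total == 0:
        return 0
    covered = 0
    for c in sorted(counts, reverse=True):
        covered += c
        # Compare scaled integers to avoid floating-point rounding.
        if covered * SCALE >= total * cutoff_ppm:
            return c
    return 0


def split_cold(blocks, cutoff_ppm=None):
    """blocks maps a block name to its profile count (None = no count).
    Return the sorted names of the blocks that would be split out."""
    if cutoff_ppm is None:
        # "default" row: split only blocks with no profile count.
        return sorted(n for n, c in blocks.items() if c is None)
    counts = [c for c in blocks.values() if c is not None]
    threshold = count_threshold(counts, cutoff_ppm)
    # Assumption: blocks with no count remain cold in cutoff mode too.
    return sorted(n for n, c in blocks.items() if c is None or c < threshold)


# Hypothetical function with five blocks and 1,000,000 total executions.
blocks = {"entry": 900_000, "loop": 99_900, "rare": 60, "err": 40,
          "unreached": None}

print(split_cold(blocks))           # ['unreached']
print(split_cold(blocks, 999_999))  # ['unreached']
print(split_cold(blocks, 999_950))  # ['err', 'unreached']
print(split_cold(blocks, 990_000))  # ['err', 'rare', 'unreached']
```

Lowering the cutoff raises the threshold count, so more blocks are classified as cold and split out. This matches the growth of the size-% column as the cutoff decreases.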
Search1
cutoff | size-% | itlb | icache |
---|---|---|---|
default | 42.5861 | 0.0822151 | 2.46363 |
999999 | 44.9350 | 0.0767194 | 2.44416 |
999950 | 50.0660 | 0.075744 | 2.4091 |
999500 | 56.9158 | 0.082564 | 2.4188 |
995000 | 63.8625 | 0.0814927 | 2.42832 |
990000 | 71.7314 | 0.106906 | 2.57785 |
Search2
cutoff | size-% | itlb | icache |
---|---|---|---|
default | 2.8845 | 0.626712 | 4.73245 |
999999 | 3.3291 | 0.602309 | 4.70045 |
999950 | 3.8577 | 0.587842 | 4.71632 |
999500 | 4.4170 | 0.63577 | 4.68351 |
995000 | 5.1020 | 0.657969 | 4.82272 |
990000 | 5.7153 | 0.719122 | 5.39496 |
Nit: TTI -> TargetTransformInfo