Using AvgLoopIters on every loop is too imprecise: it makes the cost model favor users inside loop nests regardless of the actual trip count.
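To illustrate the problem, here is a minimal standalone sketch of a bonus that is scaled by an assumed average iteration count per loop level. It does not use any LLVM API; `loopScaledBonus` and its parameters are hypothetical names chosen for this example, and the exact formula in the pass may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for loop-depth scaling: every enclosing loop
// multiplies the bonus by the assumed average iteration count, whether
// or not the loop actually iterates that often.
inline std::uint64_t loopScaledBonus(std::uint64_t baseCost,
                                     unsigned loopDepth,
                                     std::uint64_t avgLoopIters) {
  assert(avgLoopIters > 0 && "average iteration count must be positive");
  std::uint64_t bonus = baseCost;
  for (unsigned depth = 0; depth < loopDepth; ++depth)
    bonus *= avgLoopIters;
  return bonus;
}
```

With an assumed average of 10 iterations, a user with base cost 4 at loop depth 2 is credited 400, even if both loops run only once.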
Compile times -O3
benchmark | nspecs before | nspecs after | instrCnt delta % |
ClamAV | 5 | 5 | +0.006 |
7zip | | | -0.031 |
tramp3d-v4 | | | -0.043 |
kimwitu++ | | | -0.156 |
sqlite3 | 2 | | -0.571 |
mafft | | | -0.029 |
lencod | | | +0.029 |
SPASS | 2 | 2 | -0.038 |
consumer-typeset | | | -0.045 |
Bullet | 1 | | +0.055 |
geomean | | | -0.083 |
Compile times LTO
benchmark | nspecs before | nspecs after | instrCnt delta % |
ClamAV | 1 | | -0.159 |
7zip | | | -0.023 |
tramp3d-v4 | | | -0.018 |
kimwitu++ | | | +0.016 |
sqlite3 | 2 | 1 | +0.357 |
mafft | | | +0.029 |
lencod | | | 0 |
SPASS | 1 | | -0.283 |
consumer-typeset | 1 | | +0.539 |
Bullet | | | +0.013 |
geomean | | | +0.047 |
(Hi! I noticed this patch while trying to get some generated functions specialized, so I would like to share some thoughts :) )
The function specialization cost is calculated as Metrics.NumInsts * InlineConstants::getInstrCost() (i.e., without the TTI per-instruction cost).
I wonder if it would make more sense to make the per-instruction cost calculation consistent, e.g. by using the TTI per-instruction cost here as well.
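A rough standalone sketch of the difference I have in mind, with no LLVM dependencies. `InstSketch`, `flatCost`, `targetAwareCost`, and the constant `kFlatInstrCost` are hypothetical names and values for illustration only; `targetCost` stands in for whatever a TTI query would return for an instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-instruction record; targetCost stands in for the
// value a target-aware (TTI-style) query would report.
struct InstSketch {
  std::uint64_t targetCost;
};

// Assumed flat per-instruction constant, standing in for
// InlineConstants::getInstrCost(); the value here is illustrative.
constexpr std::uint64_t kFlatInstrCost = 5;

// Flat model: every instruction costs the same.
inline std::uint64_t flatCost(const std::vector<InstSketch> &insts) {
  return insts.size() * kFlatInstrCost;
}

// Target-aware model: sum the per-instruction costs.
inline std::uint64_t targetAwareCost(const std::vector<InstSketch> &insts) {
  std::uint64_t total = 0;
  for (const InstSketch &inst : insts) {
    assert(total + inst.targetCost >= total && "cost overflow");
    total += inst.targetCost;
  }
  return total;
}
```

If the cost side uses the flat model while the bonus side uses target-aware costs, one expensive instruction (say a division) moves the bonus a lot but the cost not at all, so the two are not compared on the same scale.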
Also, it seems Weight could give a big bonus to functions called in a hot inner loop under PGO; I wonder how that affects code size in the PGO case.