Currently we use a combined metric TargetTransformInfo::TCK_SizeAndLatency
when estimating the specialization bonus. This is suboptimal, and in some
cases erroneous. For example, we shouldn't be weighting the codesize decrease
attributed to constant propagation by the block frequency of the dead code.
Instead only the latency savings should be weighted by block frequency. The
total codesize savings from all the specialization arguments should be
deducted from the specialization cost.
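As a rough illustration of the split described above, here is a minimal, self-contained sketch. It is not the code from this patch: `UserEstimate`, `Bonus`, `accumulateBonus` and `netSpecializationCost` are hypothetical names, plain integers stand in for LLVM's cost types, and clamping the net cost at zero is an assumption made only for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-user estimate; not an LLVM type.
struct UserEstimate {
  int64_t CodeSize;   // code size saved if this user folds away
  int64_t Latency;    // latency saved per execution of this user
  uint64_t BlockFreq; // relative frequency of the containing block
};

// Hypothetical aggregate of the two metrics, kept separate.
struct Bonus {
  int64_t CodeSize = 0;
  int64_t Latency = 0;
};

// Only the latency savings are weighted by block frequency;
// the codesize savings are summed unweighted.
Bonus accumulateBonus(const std::vector<UserEstimate> &Users) {
  Bonus B;
  for (const UserEstimate &U : Users) {
    B.CodeSize += U.CodeSize;
    B.Latency += U.Latency * static_cast<int64_t>(U.BlockFreq);
  }
  return B;
}

// The total codesize savings are deducted from the specialization cost.
// Clamping at zero is an assumption of this sketch, not taken from the patch.
int64_t netSpecializationCost(int64_t FuncSize, const Bonus &B) {
  return FuncSize > B.CodeSize ? FuncSize - B.CodeSize : 0;
}
```

With two users `{2, 3, 10}` and `{1, 1, 0}`, the second user sits in a block that never executes: it still contributes 1 to the codesize savings but nothing to the latency savings.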
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Comment Actions
I ran some experiments to measure compilation time. It seems that if getUserBonus returns std::pair<Cost,Cost> instead of caching CodeSize and Latency in the InstCostVisitor, it is slightly faster (perhaps a little uglier too).
This patch changes the geomean of instruction count for llvm-test-suite by -0.016% at O3 (an improvement) and by +0.07% at LTO (a regression). The alternative gives -0.036% and +0.064% respectively.
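For readers comparing the two interface shapes mentioned in this comment, a minimal sketch follows. It is not the patch's code: `CachingVisitor`, `estimateUser` and the integer `Cost` alias are hypothetical stand-ins, and the sketch says nothing about which variant compiles faster.

```cpp
#include <cassert>
#include <utility>

using Cost = long long; // stand-in for the real cost type (assumption)

// Variant A: the visitor caches both metrics as members and the
// caller reads them back after visiting all users.
class CachingVisitor {
  Cost CodeSize = 0;
  Cost Latency = 0;

public:
  void visitUser(Cost Size, Cost Lat) {
    CodeSize += Size;
    Latency += Lat;
  }
  Cost getCodeSize() const { return CodeSize; }
  Cost getLatency() const { return Latency; }
};

// Variant B: each query returns both metrics as a pair and the
// caller accumulates them itself.
std::pair<Cost, Cost> estimateUser(Cost Size, Cost Lat) {
  return {Size, Lat};
}
```

Variant A keeps call sites shorter at the price of mutable state in the visitor; Variant B keeps the visitor stateless but makes every caller unpack and sum the pair.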
Comment Actions
This revision is better in compile times (instruction count) for llvm test suite: Geomean with O3 is -0.049%, with LTO is +0.062%.