Loop trip counts can often be resolved during LTO. We should obviously be unrolling small loops once those trip counts have been resolved, but we weren't.
This causes no change in a -flto run of the test-suite for me, but the nature of full loop unrolling is that when it triggers, it causes massive changes in runtime. We observe this on third-party test suites.
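As a minimal sketch of the kind of loop described here (not taken from the patch or from any test suite), consider the C++ below. The names sum_first and sum_four are hypothetical, and the two functions are assumed to live in separate translation units. Compiled separately, the trip count n is unknown; once LTO can inline the callee into the caller, the trip count becomes the constant 4 and the loop is small enough to be a candidate for full unrolling.

```cpp
#include <cstddef>

// Hypothetical library function, imagined to live in its own translation
// unit. Seen in isolation, the trip count `n` is not known at compile time.
int sum_first(const int *data, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// Hypothetical caller, imagined to live in a different translation unit.
// Only when both units are visible together (as in LTO) can the callee be
// inlined here, which fixes the trip count at 4.
int sum_four(const int *data) {
    return sum_first(data, 4);
}
```

Whether a given loop is actually unrolled still depends on the unroller's cost thresholds, so this only shows the shape of the opportunity.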
Note: in the regular pipeline, isn't it inserted before LoopInterchangePass?
Also, there is a comment before another insertion: "// BBVectorize may have significantly shortened a loop body; unroll again." Could the same reasoning apply to the Loop and SLP vectorizers that are inserted below?