This patch adds a selected set of cleanup passes, including a pre-inline pass, that run before LLVM IR PGO instrumentation. The pre-inliner is intended to inline only the obvious/trivial call sites before instrumentation, so that much less instrumentation is needed to get better profile information. This drastically improves the performance of instrumented code for large C++ applications. Another benefit is context-sensitive counts, which can potentially improve PGO optimization.
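To illustrate the context-sensitive counts mentioned above, here is a toy model in Python. It is not how LLVM implements its counters; the names `run`, `is_neg`, `caller_a`, and `caller_b` are hypothetical. It only shows why a branch counter shared by all callers of a trivial helper loses information that per-call-site copies (as after pre-inlining) retain.

```python
from collections import Counter


def run(pre_inline: bool) -> Counter:
    """Simulate the branch counters of a trivial helper called from two callers."""
    counters = Counter()

    def is_neg(x, site):
        # Assumption of this toy model: after pre-inlining, each call site owns
        # its own copy of the helper's branch counter; without pre-inlining,
        # every caller updates the single counter inside the helper.
        key = f"is_neg@{site}" if pre_inline else "is_neg"
        taken = x < 0
        counters[(key, taken)] += 1
        return taken

    for x in [1, 2, 3, 4]:        # caller_a only passes non-negative values
        is_neg(x, "caller_a")
    for x in [-1, -2, -3, -4]:    # caller_b only passes negative values
        is_neg(x, "caller_b")
    return counters


if __name__ == "__main__":
    print(dict(run(pre_inline=False)))
    print(dict(run(pre_inline=True)))
```

In the shared case the branch looks like an even split, while the per-call-site counters show that each caller always takes the same direction.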
This was previously discussed here:
I re-collected some of the data to show the impact, using the following:
(1) spec2006 C and C++ programs
(2) some Google internal benchmarks (C++ applications of various sizes).
(3) instrumented LLVM compiler
The metrics collected are:
(1) instrumented code runtime performance.
(2) program text size (i.e. the sum of the sizes of all sections that have an address and a name, minus the sizes of the writable sections).
(3) program binary size
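
As a sketch of how metric (2) could be computed, assuming the section list has already been extracted from the binary: the `Section` record and the `text_size` helper below are hypothetical, not part of the patch or of any tool. The sketch also assumes that "having an address" means a non-zero address and that the writable sections are a subset of the counted ones.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    address: int
    size: int
    writable: bool


def text_size(sections):
    """Sum of sizes of named, addressed sections minus the writable ones."""
    # Assumption: a section "has an address" when its address is non-zero.
    counted = [s for s in sections if s.name and s.address != 0]
    total = sum(s.size for s in counted)
    writable = sum(s.size for s in counted if s.writable)
    return total - writable


# Made-up section table, for illustration only.
sections = [
    Section(".text", 0x1000, 4096, False),
    Section(".rodata", 0x2000, 1024, False),
    Section(".data", 0x3000, 512, True),
    Section(".bss", 0x4000, 256, True),
    Section(".comment", 0, 64, False),  # no address, so not counted
]

if __name__ == "__main__":
    print(text_size(sections))
```

With this made-up table, only `.text` and `.rodata` contribute to the result, since `.data` and `.bss` are writable and `.comment` has no address.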