This patch adds a selected set of cleanup passes including a pre-inline pass before LLVM IR PGO instrumentation. The inline is only intended to apply those obvious/trivial ones before instrumentation so that much less instrumentation is needed to get better profiling information. This will drastically improve the instrumented code performance for large C++ applications. Another benefit was the context sensitive counts that can potentially improve the PGO optimization.
This is previously discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089425.html
Here I recollected some the data to show the impact using the followings:
(1) spec2006 C and C++ programs
(2) some Google internal benchmarks (which are of different size of C++ applications).
(3) instrumented LLVM compiler
The metrics collected are:
(1) instrumented code runtime performance.
https://docs.google.com/document/d/1DCP7KPjtitHMotX2KgTO8PADd9zjLzPJgSXZ8egd6Mk/edit?usp=sharing
(2) program text size (i.e. sum of sections (having address and name) sizes minus
writable section sizes.)
https://docs.google.com/document/d/1ItOetOEc1vTpU54EBjQK4O0LfulFIv3gs9XXnmzwqdo/edit?usp=sharing
(3) program binary size
https://docs.google.com/document/d/1nmiR6He0cxMFqMzWE-X7FdhL5jSzdS9h7WLxDrfmSxQ/edit?usp=sharing
Is zero or more needed?