This diff implements minimal block coverage instrumentation. When the -pgo-block-coverage option is used, basic blocks will be instrumented for block coverage using single byte booleans. The coverage of some basic blocks can be inferred from others, so not every basic block is instrumented. In fact, we found that only ~60% of basic blocks need to be instrumented. These differences lead to less size overhead when compared to instrumenting block counts. For example, block coverage on the clang binary has an overhead of 20 Mi (17%) compared to 56 Mi (47%) with block counts.
Even though block coverage profiles have less precision than block count profiles, they can still be used to guide optimizations. In PGOUseFunc we use block coverage to populate edge weights such that BFI gives nonzero counts to only covered blocks. We do this by 1) setting the entry count of covered functions to a large value, i.e., 10000 and 2) populating edge weights using block coverage. In the next diff https://reviews.llvm.org/D125743 we use BFI to guide the machine outliner to avoid outlining covered blocks. This -pgo-block-coverage option provides a trade off of generating less precise profiles for faster and smaller instrumented binaries.
The BlockCoverageInference class defines the algorithm to find the minimal set of basic blocks that need to be instrumented for coverage. This is different from the Kirchhoff circuit law optimization that is used for edge counts because that does not work for block coverage. The reason for this is that edge counts can be added together to find a missing count while block coverage cannot since they store boolean values. So we need a new algorithm to find which blocks must be instrumented.
The details on this algorithm can be found in this paper titled "Minimum Coverage Instrumentation": https://arxiv.org/abs/2208.13907
Special thanks to Julian Mestre for creating this block coverage inference algorithm.
Binary size of clang using -O2:
- Base
- .text: 65.8 Mi
- Total: 119 Mi
- IRPGO (-fprofile-generate -mllvm -disable-vp -mllvm -debug-info-correlate)
- .text: 93.0 Mi
- __llvm_prf_cnts: 14.5 Mi
- Total: 175 Mi
- Minimal Block Coverage (-fprofile-generate -mllvm -disable-vp -mllvm -debug-info-correlate -mllvm -pgo-block-coverage)
- .text: 82.1 Mi
- __llvm_prf_cnts: 1.38 Mi
- Total: 139 Mi
LLVM_TRANSFORMS_INSTRUMENTATION_BLOCKCOVERAGEINFERENCE_H