Implemented the llvm-profdata overlap feature for sample profiles. Given two input profiles, it reports weighted similarity metrics and unweighted overlap metrics at both the program level and the function level. The similarity metrics are symmetric with respect to the order of the two input profiles. By default, the tool reports only the program-level summary; function-level details are available through the additional options --function, --similarity-cutoff, and --value-cutoff.
The similarity metrics are designed as follows:
- Program-level summary
  - Whole-program profile similarity is an aggregate over the function-level similarities FS: PS = sum(FS(A) * avg_weight(A)) over all functions A.
  - Whole-program sample overlap: PSO = common_samples / total_samples.
  - Function overlap: FO = #common_function / #total_function.
  - Hot-function overlap: HFO = #common_hot_function / #total_hot_function.
  - Hot-block overlap: HBO = #common_hot_block / #total_hot_block.
- Function-level details
  - Function-level similarity is an aggregate over the line/block-level similarities BS of all sample lines/blocks in the function, scaled by how close the function's weights are in the two profiles: FS = sum(BS(i)) * (1 - weight_distance(A)).
  - Function-level sample overlap: FSO = common_samples / total_samples, counting only the samples in the function.
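To make the shape of these formulas concrete, here is a minimal Python sketch of PS, FS, and FO. It is not the llvm-profdata implementation. The summary above does not define BS(i), weight_distance(A), or avg_weight(A), so the definitions used here are hypothetical stand-ins chosen only to keep the result symmetric and within [0, 1]: BS(i) is the smaller of the two per-profile sample fractions at line i, weight_distance is the normalized difference of the function's weights, and avg_weight is their mean. A function's weight is assumed to be its share of all samples in its profile, and "total" functions is assumed to mean the union of both profiles.

```python
from typing import Dict

# Hypothetical in-memory profile: function name -> {line/block id -> sample count}.
Profile = Dict[str, Dict[int, int]]


def total(counts: Dict[int, int]) -> int:
    """Sum of all sample counts recorded for one function."""
    return sum(counts.values())


def function_similarity(a: Dict[int, int], b: Dict[int, int],
                        weight_a: float, weight_b: float) -> float:
    """FS = sum(BS(i)) * (1 - weight_distance), with assumed BS and distance."""
    total_a, total_b = total(a), total(b)
    if total_a == 0 or total_b == 0:
        return 0.0  # function missing (or empty) in one profile
    # Assumed BS(i): the fraction of the function's samples shared at line i.
    bs_sum = sum(min(a.get(i, 0) / total_a, b.get(i, 0) / total_b)
                 for i in set(a) | set(b))
    # Assumed weight_distance: normalized gap between the two function weights.
    denom = weight_a + weight_b
    weight_distance = abs(weight_a - weight_b) / denom if denom else 0.0
    return bs_sum * (1.0 - weight_distance)


def program_similarity(base: Profile, test: Profile) -> float:
    """PS = sum(FS(A) * avg_weight(A)) over every function in either profile."""
    base_total = sum(total(c) for c in base.values())
    test_total = sum(total(c) for c in test.values())
    ps = 0.0
    for name in set(base) | set(test):
        a, b = base.get(name, {}), test.get(name, {})
        # Assumed weight: the function's share of its profile's samples.
        wa = total(a) / base_total if base_total else 0.0
        wb = total(b) / test_total if test_total else 0.0
        ps += function_similarity(a, b, wa, wb) * (wa + wb) / 2
    return ps


def function_overlap(base: Profile, test: Profile) -> float:
    """FO = #common_function / #total_function, taking 'total' as the union."""
    union = set(base) | set(test)
    return len(set(base) & set(test)) / len(union) if union else 0.0
```

Under these assumptions, two identical profiles get a program similarity of 1 (up to floating-point rounding), two profiles with no function in common get 0, and swapping the arguments does not change the result, which matches the symmetry property stated above.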
You may want to use the new test utility split-file, which lets a single test file hold multiple auxiliary files. See D83834.