The problem to solve:
- When simplify-cfg hoists or sinks indirect callsites with inherently different target values, there isn't a simple lossless way to merge their !prof value profile metadata.
- Examples: hoist of an indirect call https://gcc.godbolt.org/z/o6G68rn3v, sink of an indirect call https://gcc.godbolt.org/z/79E3onono. The C++ comments and test cases briefly explain why and when not preserving the callsites is sub-optimal. Basically, the ICP heuristics promote a target when its value shows up enough times, and merging the profiles by summing their counts changes the distribution.
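
To illustrate the distribution change described above, here is a minimal standalone sketch. It is not LLVM code: `ValueProfile`, `mergeBySum`, `wouldPromote`, and the percentage threshold are hypothetical stand-ins for the value profile metadata and the ICP heuristics.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model of one callsite's value profile: target name -> count.
using ValueProfile = std::map<std::string, uint64_t>;

// Sum of all target counts recorded for a callsite.
uint64_t totalCount(const ValueProfile &VP) {
  uint64_t Total = 0;
  for (const auto &KV : VP)
    Total += KV.second;
  return Total;
}

// Naive merge: add the counts of both callsites target by target.
ValueProfile mergeBySum(const ValueProfile &A, const ValueProfile &B) {
  ValueProfile Merged = A;
  for (const auto &KV : B)
    Merged[KV.first] += KV.second;
  return Merged;
}

// Hypothetical stand-in for the ICP heuristic: promote Target when it
// accounts for at least MinPercent of the callsite's total count.
bool wouldPromote(const ValueProfile &VP, const std::string &Target,
                  unsigned MinPercent) {
  auto It = VP.find(Target);
  uint64_t Total = totalCount(VP);
  if (It == VP.end() || Total == 0)
    return false;
  return It->second * 100 >= Total * MinPercent;
}
```

With an illustrative 60% threshold, a callsite that only ever calls `foo` and a sibling callsite that only ever calls `bar` are each promotable on their own. After the two are merged by summing, each target holds only half of the total count and neither passes the check.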
This patch:
- Before indirect-call-promotion happens, simplify-cfg preserves indirect callsites that carry target value profiles when there isn't a good way to merge them.
- Basically, when two indirect callsites each have value profiles, preserve them; otherwise it's fine to hoist or sink before ICP.
- After indirect-call-promotion happens, always sink or hoist (when possible) and merge the value profiles.
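
The rule above can be sketched as a small decision function. This is a simplified model, not the patch's actual helpers: `IndirectCallsite`, `Action`, `decide`, and the `ICPDone` parameter (standing in for the "ICP has completed" signal) are all hypothetical names.

```cpp
// Hypothetical outcome of the simplify-cfg decision for a pair of callsites.
enum class Action { PreserveCallsites, HoistOrSinkAndMerge };

// Hypothetical summary of an indirect callsite: only whether it carries
// target value profile metadata matters for this decision.
struct IndirectCallsite {
  bool HasValueProfile;
};

// Before ICP has run, keep both callsites if each has its own value profile,
// so ICP still sees the per-callsite distributions. In every other case
// (ICP already done, or at most one profile present) hoisting or sinking
// and merging the profiles is fine.
Action decide(const IndirectCallsite &A, const IndirectCallsite &B,
              bool ICPDone) {
  if (!ICPDone && A.HasValueProfile && B.HasValueProfile)
    return Action::PreserveCallsites;
  return Action::HoistOrSinkAndMerge;
}
```

For example, `decide({true}, {true}, /*ICPDone=*/false)` yields `Action::PreserveCallsites`, while the same pair with `ICPDone` set to `true` yields `Action::HoistOrSinkAndMerge`.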
Implementation:
- Set a module flag after the last ICP pass completes in a binary build pipeline.
- Pass ThinOrFullLTOPhase to the ICP pass constructor. All indirect-call-promotion transformations have happened once the ICP pass completes in a {thin, regular} LTO postlink pipeline or in a non-LTO pipeline.
- Add helper functions to decide whether it's profitable to merge and get merged results.
- SimplifyCFG calls the helper functions above to either preserve callsites so that indirect-call-promotion can still happen, or hoist/sink the indirect calls and merge their value profiles.
- Clearing value profiles after ICP has used them won't work, since the CGProfile pass still uses the <target-value, counter> pairs to compute function hotness.
This is unnecessary if !MDProf