Add the following heuristics for irreducible loop metadata:
- When an irreducible loop header is missing the loop header weight metadata, give it the minimum weight seen among other headers.
- Annotate indirectbr targets with the loop header weight metadata, as they are likely to become irreducible loop headers after indirectbr tail duplication.
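
The first heuristic can be sketched roughly as follows. This is a minimal standalone illustration (C++17), not the code in this patch: the function name `fillMissingHeaderWeights` and the use of `std::optional` to model "metadata absent" are assumptions made for the example, and leaving the weights untouched when no header has metadata is likewise an assumed behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not an LLVM API): each entry models one irreducible
// loop header; std::nullopt means the header weight metadata is missing.
// Missing entries receive the minimum weight seen among the other headers.
// Returns false and changes nothing if no header carries a weight
// (assumed fallback; the patch may handle this case differently).
bool fillMissingHeaderWeights(std::vector<std::optional<uint64_t>> &weights) {
  std::optional<uint64_t> minSeen;
  for (const auto &w : weights)
    if (w)
      minSeen = minSeen ? std::min(*minSeen, *w) : *w;

  if (!minSeen)
    return false;

  for (auto &w : weights)
    if (!w)
      w = *minSeen;

  // Postcondition: every header now has a weight.
  assert(std::all_of(weights.begin(), weights.end(),
                     [](const auto &w) { return w.has_value(); }));
  return true;
}
```

Using the minimum is the conservative choice here: a header with no recorded weight is assumed to be no hotter than the coldest header that does have one.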
These heuristics greatly improve the accuracy of the block frequency info for
the Python interpreter loop (e.g. from ~3-16x off down to ~40-55% off) and
Python performance (e.g. unpack_sequence goes from ~50% slower to ~8% faster
than GCC) due to better register allocation under PGO.
What is the root cause of the missing header weight?