We want to find a better loop top for this common pattern (and similar ones):
           entry
             |
    ------> loop.header (body)
    |97%    /       \
    |      /50%      \50%
    --- latch <--- if.then
          \    97%    /
           \3%       /3%
            loop.end
Currently, Branch Probability Basic Block Placement generates the BlockChain in this order:
entry -> loop.header -> if.then -> latch -> loop.end
With this order, latch needs a branch that jumps back to loop.header whenever the loop condition is true.
A better BlockChain order would be:
entry -> latch -> loop.header -> if.then -> loop.end
With this order, latch can fall through to loop.header without that jump.
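
To make the difference between the two orders concrete, here is a minimal standalone sketch that checks whether the back edge latch -> loop.header is a fall-through in each layout. It is not the MachineBlockPlacement implementation: the names Layout, Edge, isFallThrough, currentLayout, proposedLayout and checkExample are hypothetical helpers introduced only for this illustration, and blocks are modelled as plain strings.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: a layout is just the ordered list of block names.
using Layout = std::vector<std::string>;
using Edge = std::pair<std::string, std::string>; // (from, to)

// True if e.second is placed immediately after e.first in the layout,
// i.e. control can reach it without an explicit jump.
bool isFallThrough(const Layout &layout, const Edge &e) {
  auto from = std::find(layout.begin(), layout.end(), e.first);
  if (from == layout.end())
    return false;
  auto next = std::next(from);
  return next != layout.end() && *next == e.second;
}

// The order generated today, as listed in the description.
Layout currentLayout() {
  return {"entry", "loop.header", "if.then", "latch", "loop.end"};
}

// The order this patch aims for.
Layout proposedLayout() {
  return {"entry", "latch", "loop.header", "if.then", "loop.end"};
}

// Checks the claim about the back edge for both layouts.
void checkExample() {
  const Edge backEdge{"latch", "loop.header"};
  assert(!isFallThrough(currentLayout(), backEdge)); // needs a jump today
  assert(isFallThrough(proposedLayout(), backEdge)); // falls through
}
```

The sketch only looks at adjacency in the layout. It ignores edge probabilities and block frequencies, which the real pass uses when deciding whether a rotation is profitable.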
Thanks to Carrot for pointing out this performance issue: https://llvm.org/bugs/show_bug.cgi?id=25782
We also tested this patch on Power8 by running SPEC2006; gcc and libquantum get 5% improvements.
Sink this to where it is used?