This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344
As noted in the bug report, I have a couple of questions about how to implement this. I've taken my best guesses at those questions in this patch:
- Should SimplifyCFG use metadata and not create a select in the first place for an obviously predictable branch? Or should CGP be responsible for undoing that transform?
I decided that CGP should undo the transform for the same reason that earlier patches have used the same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the IR further with a select in place.
- Since we're relying on branch weight metadata, we need a TLI hook to determine just how lopsided that data must be before favoring branches over a select. What's a good default value for that ratio?
I selected >99% taken or not taken as the default threshold for a highly predictable branch. Even the most limited HW branch predictors will be correct on this branch almost all the time, so even a very large mispredict penalty on the rare misses would be outweighed by the win from all the times the branch was predicted correctly.
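To make the threshold concrete, here is a minimal standalone sketch of the kind of check such a hook could perform on a pair of branch weights. The function name and its parameters are hypothetical and are not taken from the patch or from the LLVM API; the sketch only assumes that weights are 32-bit values and that "highly predictable" means one side strictly exceeds the threshold fraction of the total.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM API): returns true when the more likely
// successor accounts for strictly more than ThresholdNum/ThresholdDen of the
// total branch weight. The default of 99/100 mirrors the ">99%" default
// described above.
bool isHighlyPredictable(uint32_t TrueWeight, uint32_t FalseWeight,
                         uint32_t ThresholdNum = 99,
                         uint32_t ThresholdDen = 100) {
  assert(ThresholdDen != 0 && ThresholdNum <= ThresholdDen &&
         "threshold must be a fraction in [0, 1]");

  // Widen to 64 bits so the sum and the products below cannot overflow.
  uint64_t Sum = uint64_t(TrueWeight) + uint64_t(FalseWeight);
  if (Sum == 0)
    return false; // No profile information: do not claim predictability.

  uint64_t Max = std::max(TrueWeight, FalseWeight);

  // Max / Sum > Num / Den, rewritten without division.
  return Max * ThresholdDen > Sum * ThresholdNum;
}
```

Under these assumptions, weights of 2000 and 1 pass the check (about 99.95%), while weights of 64 and 4 do not (about 94%), and an exact 99-to-1 split also fails because the comparison is strict.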
As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's MispredictPenalty. Or we could just let targets override the default by implementing the hook with that and other target-specific options. Note that trying to statically determine mispredict rates for close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. I.e., a branch that is 50/50 taken/not-taken might still be 100% predictable.
Finally, note that this patch as-is will not solve PR27344 because the current __builtin_expect() branch weight default values are 4 and 64, which is well short of the >99% threshold. I proposed to change that in D19435.
Hard-coding the default value like this makes it hard to run performance experiments. I suggest adding an internal option to control it.