Currently CodeGenPrepare is very time consuming in handling big functions.
Old Algorithm :
It iterate each BB in function, and go on handle very instructions in BB.
Due to some instruction optimizations may affect the BBs' dominate tree.
The old logic will re-iterate and try optimize for each BB.
Suppose we have a big function with 20000 BBs, If we handled the last BB
with fine tuning the dominate tree. We need totally re-iterate and try optimize
the 20000 BBs from the beginning.
The Complex is near N!
And we really encounter somes big tests (> 20000 BBs) that cost more than 30 mins in this pass.
(Debug version compiler will cost 2 hours here)
What this patch do for huge function ?
It mainly changes the iteration way for optimization.
1 First we do optimizeBlock for each BB (that is same with old way).
And, in the meaning time, If BB is changed/updated in the optimization, it will be put into FreshBBs (try do optimizeBlock again).
The new created BB at previous iteration will also put into FreshBBs.
2 For the BBs which not updated at previous iteration, we directly skip it.
Strictly speaking, here may miss some opportunity, but I think the probability is very small. (from my local performance test for specs and some other perf tests, there is no harm to performance. )
3 For Instructions in single BB, we do optimizeInst for each instruction.
If optimizeInst change the instruction dominator in this BB, rather than break and go back to optimize the first BB (the old way),
we directly iterate instructions (to do optimizeInst) in this updated BB again (the new way).
What this patch do for small/normal (not huge) function ?
It is same with the Old Algorithm. (NFC)
Pls add cases for the patch with HugeFuncThresholdInCGPP be specified with small number.