We are seeing r276077 drastically increase compile time for our larger benchmarks in PGO profile-generation builds (in both clang-based and IR-based modes). Compilation can be more than 20x slower than without the patch (for example, going from 30 seconds to 780 seconds).
The extra time is all spent in the LCSSA pass. The problematic code is the PostProcessPHIs handling that runs after each use rewrite. Note that the InsertedPhis list from ssa_updater keeps accumulating (it is never cleared). Since the inserted PHIs are added to the candidate list for each rewrite, the earlier ones are added repeatedly. Later, when the new PHIs are added to the work-list, duplicates are not checked for either. This can result in an extremely long work-list containing tons of duplicated PHIs.
The attached patch fixes the issue by hoisting the code out of the loop.
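
To illustrate why the work-list blows up, here is a minimal standalone sketch of the pattern described above. It is not the LLVM code: `PhiId`, `worklistSizeInsideLoop`, and `worklistSizeHoisted` are hypothetical names, and it assumes each rewrite inserts exactly one PHI. It only models how a never-cleared list that is re-appended after every rewrite grows quadratically, while processing it once after the loop stays linear.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a PHI node; only its identity matters here.
using PhiId = std::size_t;

// Models the slow pattern: the inserted-PHI list is never cleared, and the
// whole list is appended to the work-list after every rewrite.
std::size_t worklistSizeInsideLoop(std::size_t numRewrites) {
  std::vector<PhiId> insertedPhis; // accumulates across all rewrites
  std::vector<PhiId> worklist;
  for (std::size_t i = 0; i < numRewrites; ++i) {
    insertedPhis.push_back(i); // assumption: one new PHI per rewrite
    // Post-processing inside the loop re-adds every earlier PHI as well,
    // with no duplicate check.
    worklist.insert(worklist.end(), insertedPhis.begin(), insertedPhis.end());
  }
  return worklist.size(); // grows as n * (n + 1) / 2
}

// Models the hoisted version: post-process once, after all rewrites.
std::size_t worklistSizeHoisted(std::size_t numRewrites) {
  std::vector<PhiId> insertedPhis;
  std::vector<PhiId> worklist;
  for (std::size_t i = 0; i < numRewrites; ++i) {
    insertedPhis.push_back(i); // assumption: one new PHI per rewrite
  }
  // Each inserted PHI is added to the work-list exactly once.
  worklist.insert(worklist.end(), insertedPhis.begin(), insertedPhis.end());
  assert(worklist.size() == insertedPhis.size());
  return worklist.size(); // grows as n
}
```

With 1000 rewrites under these assumptions, the first function builds a work-list of 500500 entries while the second builds one of 1000 entries, which matches the kind of slowdown seen on large functions.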