MergeBlockIntoPredecessor unconditionally calls RemoveRedundantDbgInstrs,
which iterates twice over the instructions of the basic block.
MergeBlockIntoPredecessor is itself called in a hot inner loop. As a
result, RemoveRedundantDbgInstrs accounted for 98% of the runtime in
some cases and caused 10-minute compile times in some downstream
workloads.
Make the call to RemoveRedundantDbgInstrs optional in
MergeBlockIntoPredecessor, and call RemoveRedundantDbgInstrs after the
loop is unrolled in LoopUnrollPass. This reduces the share of runtime
spent in RemoveRedundantDbgInstrs to 3%. Moving the call elsewhere
could lower that share further, but a function named
"simplifyLoopAfterUnroll" seemed a good fit.
See: Bug 47746
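
The shape of the change can be illustrated with a small standalone sketch. This is not LLVM code: Block, removeRedundantDbg, mergeInto, mergeAll and the g_visited counter are hypothetical stand-ins, and the toy cleanup makes a single pass rather than two. It only shows why running a whole-block cleanup on every merge costs more than running it once after the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a basic block; NOT an LLVM type.
struct Block {
  std::vector<std::string> instrs;
};

// Counts instructions visited by the cleanup, to make its cost visible.
static std::size_t g_visited = 0;

// Hypothetical cleanup: drop a "dbg..." entry that is identical to the
// entry immediately before it. Walks the whole block every time.
void removeRedundantDbg(Block &b) {
  std::vector<std::string> kept;
  for (const std::string &i : b.instrs) {
    ++g_visited;
    const bool isDbg = i.rfind("dbg", 0) == 0; // true if i starts with "dbg"
    if (isDbg && !kept.empty() && kept.back() == i)
      continue; // redundant repeat of the previous debug entry
    kept.push_back(i);
  }
  b.instrs = std::move(kept);
}

// Append succ to pred; the cleanup is optional so callers in hot loops
// can skip it and run it once themselves afterwards.
void mergeInto(Block &pred, const Block &succ, bool cleanup) {
  assert(&pred != &succ && "cannot merge a block into itself");
  pred.instrs.insert(pred.instrs.end(), succ.instrs.begin(),
                     succ.instrs.end());
  if (cleanup)
    removeRedundantDbg(pred);
}

// Merge all parts into one block. With cleanupEachMerge == false the
// cleanup is deferred and runs a single time after the loop.
Block mergeAll(const std::vector<Block> &parts, bool cleanupEachMerge) {
  Block result;
  for (const Block &p : parts)
    mergeInto(result, p, cleanupEachMerge);
  if (!cleanupEachMerge)
    removeRedundantDbg(result);
  return result;
}
```

In this toy, both modes produce the same final block, but the eager mode revisits the growing block on every merge while the deferred mode visits each instruction once.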
Please document under what circumstances one would pass true to the last two parameters.