Split critical edges coming out of indirect branches, when it is easy to do: we match a "jump table" pattern where the jump table has local linkage, is used by exactly one indirectbr, and has no other uses.
Having a critical edge survive until MI means that when we go out of SSA, we have to define all the live-ins of a destination block within the origin block. Normally, MachineSink tries to split critical edges and sink these definitions into the destination block, but teaching it to split indirect edges on the MI level in a generic way looks problematic. So, instead, this tries to split these edges in IR.
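For reference, an edge is critical when its source block has more than one successor and its destination block has more than one predecessor. A minimal sketch of that check in C, assuming a hypothetical toy CFG stored as a flat edge list (the `struct edge` type and helper names are illustrative only and are not LLVM APIs):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy CFG: each edge is a (from, to) pair of block indices. */
struct edge { int from; int to; };

/* Number of edges leaving `block`. */
static int count_succs(const struct edge *edges, size_t n, int block) {
    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (edges[i].from == block)
            count++;
    return count;
}

/* Number of edges entering `block`. */
static int count_preds(const struct edge *edges, size_t n, int block) {
    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (edges[i].to == block)
            count++;
    return count;
}

/* Critical edge: source has several successors AND destination has
 * several predecessors, so code cannot be placed on the edge without
 * inserting a new block. */
static bool is_critical_edge(const struct edge *edges, size_t n, struct edge e) {
    return count_succs(edges, n, e.from) > 1 && count_preds(edges, n, e.to) > 1;
}
```

With block 0 standing in for the indirectbr block (successors 2 and 3) and block 1 for a second predecessor of block 2, only the edge 0 -> 2 is critical.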
This is motivated by the use of computed gotos in Python 2.7: PyEval_EvalFrame() ends up using an indirect branch with ~100 successors, and passing a constant to each of those. This causes us to emit about 100 defs of registers containing constants, which we then fail to sink because the edge is critical (the destination block has incoming edges from both the indirectbr and a switch). So, at each goto, we "spill" about a hundred constants.
The end result is that a clang-compiled Python interpreter is about 2.5x (!) slower on a simple Python reduction loop than a gcc-compiled interpreter.
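The kind of source that leads to this pattern can be sketched with a toy interpreter. This is an illustration only, not CPython code: the opcodes, the `run` function and the `dispatch` table are hypothetical names, and the example relies on the GNU C "labels as values" extension supported by both gcc and clang. The function-local static table of label addresses plays the role of the jump table, and each indirect goto through it is lowered to an indirect branch.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy interpreter; not taken from CPython. */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Executes `code` until OP_HALT and returns the accumulator. */
static int run(const unsigned char *code) {
    /* Function-local table of label addresses (GNU C extension),
     * indexed by opcode. */
    static void *dispatch[] = { &&op_inc, &&op_double, &&op_halt };
    int acc = 0;
    size_t pc = 0;

/* Fetch the next opcode and jump to its handler via the table. */
#define DISPATCH() goto *dispatch[code[pc++]]

    DISPATCH();
op_inc:
    acc += 1;
    DISPATCH();
op_double:
    acc *= 2;
    DISPATCH();
op_halt:
    return acc;
#undef DISPATCH
}
```

Every handler label here is reachable from each computed goto, so a real interpreter with around a hundred handlers gets an indirect branch with around a hundred successors.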