If the block being cloned contains a PHI node, in general, we need to clone that PHI node, even though it's trivial. If the operand of the PHI is an instruction in the block being cloned, the correct value for the operand doesn't exist until SSAUpdater constructs it.
We usually don't hit this issue because we try to avoid threading across loop headers, but it's possible to hit this in some cases involving irreducible CFGs. I added a flag to allow threading across loop headers to make the testcase simpler.
Thanks to Brian Rzycki for reducing the testcase.