This patch identifies non-header phis that have no cyclic dependencies with
header phis (reduction/induction/first-order recurrence phis).
If those phis have outside uses, we can still vectorize the loop and extract the
last element. This is because the iteration dependence distance for these phis
can be widened up to VF (as we already do for inductions and reductions), since they
do not have a cyclic dependence with header phis.
The key point is to extract the last element from the vectorized phi and update
the scalar loop exit block phi so that it receives this extracted element from the
vector loop.
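
To illustrate the idea, here is a minimal source-level sketch in plain C++. It is a simulation, not the vectorizer's actual implementation: the function names, the fixed `VF` of 4, and the lane-by-lane "widening" are all assumptions made for the example. The value `t` corresponds to a phi in the if/else merge block (a non-header phi) that does not feed back into any header phi and is used after the loop.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Assumed vectorization factor, chosen only for this illustration.
constexpr std::size_t VF = 4;

// Scalar reference loop. `t` is produced at the if/else merge point inside
// the loop body (a non-header phi in IR terms) and is used after the loop.
// Hypothetical example; assumes a.size() == b.size().
int scalarLoop(const std::vector<int> &a, const std::vector<int> &b) {
  int t = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i] > 0)
      t = a[i] * 2;
    else
      t = b[i];
  }
  return t; // outside use of the non-header phi
}

// Simulated vectorized version: each vector iteration computes VF values of
// `t` at once. No iteration depends on the previous value of `t`, so the
// lanes are independent.
int vectorizedLoop(const std::vector<int> &a, const std::vector<int> &b) {
  const std::size_t n = a.size();
  const std::size_t vecEnd = n - n % VF; // iterations covered by the vector loop
  std::array<int, VF> widened{};         // stands in for the widened phi

  for (std::size_t i = 0; i < vecEnd; i += VF)
    for (std::size_t lane = 0; lane < VF; ++lane)
      widened[lane] = (a[i + lane] > 0) ? a[i + lane] * 2 : b[i + lane];

  // "Middle block": extract the last element of the widened phi. This is the
  // value the exit phi receives when the vector loop ran.
  int liveOut = 0;
  if (vecEnd > 0)
    liveOut = widened[VF - 1];

  // Scalar remainder loop handles the leftover iterations, if any.
  for (std::size_t i = vecEnd; i < n; ++i)
    liveOut = (a[i] > 0) ? a[i] * 2 : b[i];

  return liveOut;
}
```

Both functions should return the same value for any pair of equally sized inputs, which is the property the extract-last-element step is meant to preserve.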
Worth updating the above comment to also include non-header phis.
Perhaps leave behind a TODO somewhere noting that other (non-phi) instructions should also be able to work with the new extract-scalar-of-last-iteration code.