If the vectorizer unrolls a loop without vectorizing it, and the cost model requires that an epilogue be generated for correctness, code generation must actually generate one.
With an unmodified opt, the included test case accesses memory one past the expected bound.
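As a rough illustration of why the epilogue matters (this is a hypothetical C sketch, not the included test case and not the vectorizer's actual output), consider a loop unrolled by a factor of 2 with no vectorization. The function name `sum_unrolled` and the unroll factor are assumptions chosen for the example.

```c
#include <stddef.h>

/* Hypothetical sketch: a reduction loop unrolled by 2, not vectorized.
   The main loop only runs while a full group of 2 iterations remains;
   the scalar epilogue handles whatever is left over. */
long sum_unrolled(const int *a, size_t n) {
    long s = 0;
    size_t i = 0;

    /* Unrolled main loop: stops before the bound is crossed. */
    for (; i + 2 <= n; i += 2) {
        s += a[i];
        s += a[i + 1];
    }

    /* Scalar epilogue: required for correctness when n is odd.
       If this loop were omitted and the main loop instead ran while
       i < n, the access to a[i + 1] would read one past the bound. */
    for (; i < n; ++i)
        s += a[i];

    return s;
}
```

The point of the sketch is only that once the main loop's trip count has been computed on the assumption that an epilogue exists, the epilogue cannot be dropped without either skipping iterations or accessing memory past the bound.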
I believe this to be the root cause of the issues seen with 3e5ce49e, but even if it isn't, it's definitely a bug.
That seems like the simplest way to both fix the bug and avoid pessimizing the unroll-only case, i.e.: