Predicating stores requires creating extra blocks. It's much cleaner to do this in a single dedicated pass than to mutate the CFG while writing vector instructions.
Besides, we can use helper functions to update the dominator tree for us, reducing the work we need to do.
This may seem like a trivial cleanup, but it reduces the amount of work done in any one pass of the loop vectorizer, which will be very important when it gets extracted into a utility.