If we're extracting an element and inserting into an undef vector
with the same number of elements, we can use the original vector.
This pattern occurs around reductions that have been cascaded
together.
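
As a rough illustration of why the fold is sound, here is a minimal
standalone model in plain C++. It is a sketch, not the actual lowering
code: the names Lane, Vec, insertExtractIntoUndef and refines are
hypothetical, undef lanes are modelled as empty optionals, and it assumes
the extract and the insert use the same index.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model: a lane is either a concrete value or undef (nullopt).
using Lane = std::optional<int>;
using Vec = std::vector<Lane>;

// Models insertelement(undef, extractelement(src, idx), idx), where the
// undef vector has the same number of elements as src.
Vec insertExtractIntoUndef(const Vec &src, std::size_t idx) {
  assert(idx < src.size() && "index must be in range");
  Vec result(src.size(), std::nullopt); // all lanes start as undef
  result[idx] = src[idx];               // only lane idx is defined
  return result;
}

// Returns true if `candidate` may stand in for `pattern`: every lane that
// `pattern` defines must hold the same value in `candidate`. Undef lanes in
// `pattern` place no constraint on `candidate`.
bool refines(const Vec &candidate, const Vec &pattern) {
  if (candidate.size() != pattern.size())
    return false;
  for (std::size_t i = 0; i < pattern.size(); ++i)
    if (pattern[i].has_value() && pattern[i] != candidate[i])
      return false;
  return true;
}
```

In this model, refines(src, insertExtractIntoUndef(src, idx)) holds for any
in-range idx, which is the property that lets the original vector replace
the insert/extract pair.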
This can be generalized to wider/narrower vectors by using
insert_subvector/extract_subvector, but we don't currently have lit
tests for that case.
We can also support a non-undef destination vector by using a slide or vmv.v.v.
Do we need to handle the case when VL=0 and the destination register is preserved?