Currently, we default to a vmv.s.x and vslide1up sequence for inserting elements into a vector. This lowering has two downsides. First, it requires a temporary register to hold the scalar-as-vector. Second, for inserts into the middle of a vector, the VL chosen needs to be Idx + 1. This causes VL toggles, since these odd VLs are unlikely to be shared with neighboring instructions.
Instead, we can use a vmerge.vx to perform the insert. This avoids both the temporary register and the odd VLs, but requires populating a mask register. For the moment, usage is restricted to cases where a single vmv.v.i can populate the mask, i.e. indices less than 5. If we like the direction of this patch, this restriction can be lifted by using a vmseq(vid, index) sequence, but I'll defer that to later work.
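To make the comparison concrete, here is a rough element-wise model of the two lowerings in plain Python. This is only a sketch, not LLVM or backend code: the function names are hypothetical and merely mirror the instruction mnemonics, vectors are modeled as Python lists, and the slide-based model assumes elements past VL are left undisturbed.

```python
def vid(vl):
    """Model of vid.v: element i holds its own index i."""
    return list(range(vl))


def vmseq_vx(vec, scalar):
    """Model of vmseq.vx: mask bit i is set where vec[i] == scalar."""
    return [elem == scalar for elem in vec]


def vmerge_vxm(vec, scalar, mask):
    """Model of vmerge.vxm: take the scalar where the mask is set, else vec[i]."""
    return [scalar if bit else elem for elem, bit in zip(vec, mask)]


def insert_via_merge(vec, idx, scalar):
    """Insert using a mask with only bit idx set, at the vector's full VL."""
    mask = vmseq_vx(vid(len(vec)), idx)  # the vmseq(vid, index) form of the mask
    return vmerge_vxm(vec, scalar, mask)


def insert_via_slide(vec, idx, scalar):
    """Insert using a scalar-as-vector temporary and a slide with VL = idx + 1."""
    tmp = [scalar]          # temporary register holding the scalar in element 0
    vl = idx + 1            # the "odd" VL the slide-based lowering needs
    out = list(vec)         # assumption: elements outside [idx, vl) are untouched
    for i in range(idx, vl):
        out[i] = tmp[i - idx]
    return out


if __name__ == "__main__":
    v = [10, 20, 30, 40]
    print(insert_via_merge(v, 2, 99))
    print(insert_via_slide(v, 2, 99))
```

Both functions produce the same vector. The difference the model highlights is that the merge form runs at the vector's full length and needs only a mask, while the slide form needs the temporary and a VL tied to the insert index.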
Could we still use vmv.v.i on scalable vectors in theory, since Idx isn't scaled by vscale for insert_vector_elt?