This removes the existing TableGen patterns for inserting two lanes into an f16/i16 vector register using VINS, and instead uses a DAG combine to match the same code sequences. The TableGen patterns were already on the large side (foreach LANE = [0, 2, 4, 6]) and did not handle all the cases they could. Moving this to a DAG combine, whilst not less code, makes it easier to control and expand where VINS is selected. For example, it allows us to remove the AddedComplexity on VCVTT.
The extra trick that this has learned in the process is to move two adjacent lanes using a single f32 vmov, allowing some extra inefficiencies to be removed.
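As a rough illustration of why the single f32 move is equivalent, here is a small standalone C++ sketch. It is not the code from this patch: it models an 8 x i16 register as a plain array, and the names V8I16, moveTwoLanes16 and moveOneLane32 are hypothetical. It assumes both lane indices are even, so that the pair of 16-bit lanes lines up with one 32-bit lane.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of an 8 x i16 vector register (not an LLVM type).
using V8I16 = std::array<std::uint16_t, 8>;

// Move two adjacent 16-bit lanes one at a time (two separate lane moves).
V8I16 moveTwoLanes16(V8I16 dst, const V8I16 &src, unsigned dstLane,
                     unsigned srcLane) {
  dst[dstLane] = src[srcLane];
  dst[dstLane + 1] = src[srcLane + 1];
  return dst;
}

// Move the same two lanes as a single 32-bit chunk, which is what one
// f32 lane move does. Assumption: dstLane and srcLane are both even.
V8I16 moveOneLane32(V8I16 dst, const V8I16 &src, unsigned dstLane,
                    unsigned srcLane) {
  assert(dstLane % 2 == 0 && srcLane % 2 == 0 && "lanes must be even");
  assert(dstLane + 1 < dst.size() && srcLane + 1 < src.size());
  // Copy 4 bytes, i.e. both 16-bit elements, in one operation.
  std::memcpy(&dst[dstLane], &src[srcLane], sizeof(std::uint32_t));
  return dst;
}
```

For even lane indices both helpers produce the same register contents, so two 16-bit lane moves can be replaced by one 32-bit move.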
@dmgreen I'm seeing unused variable warnings for Val1Copy, Val2Copy + VecCopy