Does this actually have the correct semantics at runtime? It looks like the "index" instruction is using the wrong width.
Changed predicate instructions to use the correct bitwidths
Do you mind expanding your tests to add these ones, named test_insert_into_undef_<type>?
%b = insertelement <vscale x 4 x float> undef, float %a, i32 0
%b = insertelement <vscale x 4 x float> undef, float %a, i64 %idx
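As a rough sketch only, the two cases above could be wrapped in test functions like the following. The nxv4f32 suffix is one assumed instance of the <type> placeholder, and the second function name is hypothetical; neither is taken from the patch, and the CHECK lines for the expected codegen are left out.

```llvm
; Constant index 0: insert the scalar into lane 0 of an undef scalable vector.
define <vscale x 4 x float> @test_insert_into_undef_nxv4f32(float %a) {
  %b = insertelement <vscale x 4 x float> undef, float %a, i32 0
  ret <vscale x 4 x float> %b
}

; Variable index (hypothetical test name): the lane is only known at runtime.
define <vscale x 4 x float> @test_insert_with_index_nxv4f32(float %a, i64 %idx) {
  %b = insertelement <vscale x 4 x float> undef, float %a, i64 %idx
  ret <vscale x 4 x float> %b
}
```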
Changed tests to include all nxv data types, not just 2f16, 4f16, and 2f32.
Thank you @DylanFleming-arm