This was for some reason skipping operands that are subregisters
instead of keeping the same subregister index.
v_movreld_b32 expects src0 to be the subregister of the tied
super register use/def.
e.g. v_movreld_b32 v0, v9, <imp-def, tied3> v[0:3], <imp-use, tied2> v[0:3] was being replaced with v[4:7] = copy v[0:3] v_movreld_b32 v0, v9, <imp-def, tied3> v[4:7], <imp-use, tied2> v[4:7], which really writes to v[0:3]