I don't think copies of these should ever be produced (at least for the high half), since the high half is not really addressable and only appears that way to some instructions. Where are the copies coming from?
V_PACK_B32_F16 has some FP flushing properties and is not suitable for a copy. I think you have to do essentially what D74740 does.
First, hi16 registers are used by the load_hi instructions as their destination. Second, RA can happily copy anything to anything. For sanity we need to know how to copy any register.
I cannot do it here: I would need to scavenge a physreg for a mask, whether I use v_perm_b32 (if available) or v_bfi_b32... In fact, I do not see a good instruction to do it if v_pack_b32 does not work.
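To make the mask problem concrete, here is a minimal bit-level sketch in C++ of what a v_bfi_b32-based copy of the high half would compute. It only models the arithmetic, not the real instruction selection or register scavenging. The helper names `bfi` and `copyHi16ViaBfi` are hypothetical, and the constant `HiMask` stands in for the mask that would have to live in a scavenged register.

```cpp
#include <cassert>
#include <cstdint>

// Bit-level model of a bitfield insert: take bits from A where Mask is set,
// and bits from B where Mask is clear.
constexpr uint32_t bfi(uint32_t Mask, uint32_t A, uint32_t B) {
  return (Mask & A) | (~Mask & B);
}

// Hypothetical helper: copy Src.hi16 into Dst.hi16 and preserve Dst.lo16.
// HiMask is the extra value that would need a scavenged register.
constexpr uint32_t copyHi16ViaBfi(uint32_t Dst, uint32_t Src) {
  constexpr uint32_t HiMask = 0xFFFF0000u;
  return bfi(HiMask, Src, Dst);
}

static_assert(copyHi16ViaBfi(0x11112222u, 0xAAAABBBBu) == 0xAAAA2222u,
              "high half comes from Src, low half stays from Dst");
```

The point of the sketch is that the operation itself is trivial; the cost is entirely in materializing `HiMask` somewhere at the time the copy is expanded.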
The high result isn't what's encoded though, so they really are writing the 32-bit register. They only read the low 16 bits. I think the correct way to model this is a 32-bit write but only a 16-bit read.
Another thing that doesn't work is a mov with SDWA. It needs dst_preserve, and that in turn needs a tied operand.
If pack doesn't work, I can only think of an extremely ugly solution for the general case: clear the destination bits with two shifts, then use v_or_b16 with op_sel.
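As a rough illustration of the shift-and-or sequence, here is a bit-level C++ sketch for the high-half case. It is an assumption-laden model rather than the actual expansion: `clearHi16`, `or16IntoHi` and `copyHi16ViaShifts` are hypothetical names, and `or16IntoHi` only approximates a 16-bit OR whose op_sel bits select the high halves of its operands and of the destination.

```cpp
#include <cassert>
#include <cstdint>

// Clear the high 16 bits with two shifts: left by 16, then logical right by 16.
constexpr uint32_t clearHi16(uint32_t V) { return (V << 16) >> 16; }

// Model of a 16-bit OR that reads the high halves of A and B and writes the
// high half of the result, leaving the low half of A untouched.
constexpr uint32_t or16IntoHi(uint32_t A, uint32_t B) {
  uint32_t Hi = ((A >> 16) | (B >> 16)) & 0xFFFFu;
  return (Hi << 16) | (A & 0xFFFFu);
}

// Hypothetical helper: copy Src.hi16 into Dst.hi16 without a mask register.
constexpr uint32_t copyHi16ViaShifts(uint32_t Dst, uint32_t Src) {
  return or16IntoHi(clearHi16(Dst), Src);
}

static_assert(copyHi16ViaShifts(0x11112222u, 0xAAAABBBBu) == 0xAAAA2222u,
              "high half comes from Src, low half stays from Dst");
```

This needs no extra register for a mask, but it costs three instructions per copy, which is why it is only a fallback.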