Attaching an implicit VGPR to each of these copies ensures that the VGPR is allocated and available at the time of copy lowering, so we no longer need register scavenging. The price is that in kernels with high register pressure (RP), this may cause additional spilling.
This is a WIP while I investigate what is needed to do the same for AGPR copy "spills" (and potentially remove VGPRForAGPRCopy).