MachineLICM can hoist instructions out of loops, but chooses not to do so for cheap instructions, which include all COPY instructions. On Cortex-M CPUs, where there isn't really a big difference between a MOV and any other instruction, we should really hoist these out of loops, even if they increase the immediate register pressure. This adds a shouldHoistCheapInsts target hook to allow hoisting instructions that otherwise would not be hoisted. MachineLICM will still test whether the register pressure limit has been reached.
This is especially true for MVE code, where we sink VDUPs into blocks in an attempt to fold them into the register variants of vector instructions. Where the scalar is a float value, we are left with a COPY from SPR to GPR, which needs to be hoisted. Other than that, this did not seem to cause a lot of changes.
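
As a rough illustration of the decision described above, here is a minimal standalone sketch. It is not the MachineLICM code: `Instr`, `Target`, and `isProfitableToHoist` are hypothetical stand-ins, the signature given to `shouldHoistCheapInsts` is an assumption, and the register pressure check is reduced to a single counter.

```cpp
// Simplified model only. None of these types are LLVM classes.
struct Instr {
  bool IsLoopInvariant;     // operands are all defined outside the loop
  bool IsCheap;             // e.g. a COPY, treated as cheap as a move
  int RegPressureIncrease;  // extra live registers if hoisted
};

struct Target {
  bool HoistCheap;
  // Assumed shape of the hook: a target-wide yes/no answer.
  bool shouldHoistCheapInsts() const { return HoistCheap; }
};

// Returns true if the (modelled) pass would hoist I out of the loop.
bool isProfitableToHoist(const Instr &I, const Target &T,
                         int CurPressure, int PressureLimit) {
  if (!I.IsLoopInvariant)
    return false;
  // Cheap instructions are left in the loop unless the target opts in.
  if (I.IsCheap && !T.shouldHoistCheapInsts())
    return false;
  // The register pressure limit is still respected either way.
  return CurPressure + I.RegPressureIncrease <= PressureLimit;
}
```

In this model, a loop-invariant COPY is hoisted only when the target returns true from the hook and hoisting keeps the pressure within the limit. A target that returns false keeps the previous behaviour.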