This takes the same logic and reasoning as the X86 implementation of the hook. The benefit is slightly less clear when the soft-float ABI is used (i.e. there's no transfer from an FPR to a GPR), but I've opted not to gate it on the ABI - feedback welcome.
I've got no reason to believe this is a particularly meaningful optimisation (i.e. I spotted the hook and figured the X86 logic applied to RISC-V, rather than seeing this issue in real code), but it still seems marginally better for these test cases.
On the off chance you want to apply this locally and experiment, the base test file is in D140409 (I've not committed it directly, in case people aren't keen on adding this hook).