[AMDGPU] Fix saving fp and bp
Spilling the frame pointer (fp) or base pointer (bp) to scratch could
overwrite the VGPR contents of inactive lanes. Fix that by using only the
active lanes of the scavenged VGPR.
This relies on the following assumptions:
- a function is never called with exec=0
- lanes do not die in a function, i.e. exec!=0 in the function epilog
- no new lanes become active before exiting the function, i.e. exec in
  the epilog is a subset of exec in the prolog
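As an illustration only, the reasoning behind these assumptions can be
sketched with a small host-side model. This is not the code from the patch:
the 8-lane width and the names VGPR, ExecMask, writeActiveLanes,
readFirstActiveLane and assumptionsHold are hypothetical and exist only for
this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model: 8 lanes instead of a real 32- or 64-lane wavefront.
constexpr unsigned NumLanes = 8;
using VGPR = std::array<uint32_t, NumLanes>;
using ExecMask = uint8_t; // bit N set means lane N is active

// Write Value only into lanes that are active in Exec. Inactive lanes keep
// their previous contents, which is the property the fix depends on.
void writeActiveLanes(VGPR &Reg, ExecMask Exec, uint32_t Value) {
  for (unsigned Lane = 0; Lane < NumLanes; ++Lane)
    if (Exec & (1u << Lane))
      Reg[Lane] = Value;
}

// Read the value held by the lowest active lane. Requires Exec != 0.
uint32_t readFirstActiveLane(const VGPR &Reg, ExecMask Exec) {
  assert(Exec != 0 && "no active lane to read from");
  for (unsigned Lane = 0; Lane < NumLanes; ++Lane)
    if (Exec & (1u << Lane))
      return Reg[Lane];
  return 0; // unreachable when Exec != 0
}

// Check the three assumptions listed above for a prolog/epilog exec pair:
// exec is non-zero on entry, non-zero on exit, and the epilog mask is a
// subset of the prolog mask.
bool assumptionsHold(ExecMask PrologExec, ExecMask EpilogExec) {
  return PrologExec != 0 && EpilogExec != 0 &&
         (EpilogExec & ~PrologExec) == 0;
}
```

In this model, when assumptionsHold returns true, every lane active in the
epilog was also written in the prolog, so reading any active lane yields the
saved value while inactive lanes are never touched.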
Differential Revision: https://reviews.llvm.org/D96869