This is an alternate approach to D57970.
Currently, funclets reuse the same stack slots that the parent function uses for saving callee-saved xmm registers. If the parent function modifies a callee-saved xmm register before an exception is thrown, the catch handler will overwrite the original saved value.
This patch allocates space in the funclet's stack frame for saving callee-saved xmm registers and uses RSP instead of RBP to access that memory.
I think you should include this offset in the return value from getWinEHFuncletFrameSize. With your change, getWinEHFuncletFrameSize no longer returns the true funclet frame size, and it would be a bug to have a call site that forgets to include the XMM size offset. In fact, I found and fixed this exact bug in the catch funclet establishing frame offset code. We should fix it by design before relanding.
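To make the suggestion concrete, here is a minimal standalone sketch of a frame-size helper that folds the XMM save area into its return value, so no call site has to add it separately. It is a simplified model, not the actual X86FrameLowering code: the FuncletFrameInfo struct, its field names, the getFuncletFrameSize name, and the 16-byte alignment constant are all assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the per-function frame data.
// None of these names come from LLVM.
struct FuncletFrameInfo {
  uint64_t CalleeSavedGPRSize; // bytes of pushed callee-saved GPRs
  uint64_t MaxCallFrameSize;   // bytes reserved for outgoing call arguments
  uint64_t XMMSpillSize;       // bytes needed to save callee-saved XMM registers
};

// Assumed stack alignment for this sketch.
constexpr uint64_t StackAlign = 16;

constexpr uint64_t alignTo(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Returns the number of bytes the funclet prologue subtracts from RSP.
// The XMM save area is part of the result, so callers cannot forget it.
uint64_t getFuncletFrameSize(const FuncletFrameInfo &FI) {
  // Align the whole frame, counting the pushed GPRs, then subtract the
  // pushes because they have already moved RSP.
  uint64_t Total = alignTo(
      FI.CalleeSavedGPRSize + FI.MaxCallFrameSize + FI.XMMSpillSize,
      StackAlign);
  return Total - FI.CalleeSavedGPRSize;
}
```

With this shape, any code that needs the funclet frame size (for example, the establishing frame offset computation) gets the XMM area for free rather than having to remember a separate addend.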