When -fsplit-stack is used, the function prologue contains two additional blocks: the first checks the stack bound, and the second calls __morestack. The second block may be laid out later in the function body, as it is an unlikely block. However, it is still part of the prologue: the callee-saved registers and the frame pointer (if used) have not been pushed yet. Currently we do not take this into account when generating the unwind info. The CFIInstrInserter pass handles the return address correctly, but it does not handle the frame pointer and callee-saved registers.
This change fixes that by emitting .cfi_restore for each saved register, which resets its rule to the one in effect at function entry. We also emit .cfi_remember_state before this block and .cfi_restore_state after it, so that the blocks before and after it are not affected no matter where this block is laid out. As this is specific to split-stack, we do it in X86FrameLowering instead of teaching CFIInstrInserter about saved-register handling (which is generally not needed).
Added basic support for .cfi_remember_state and .cfi_restore_state to CFIInstrInserter and AsmPrinter.
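The effect of bracketing the __morestack block can be sketched with a toy interpreter for the directives. This is only a simplified model, not LLVM code: register rules are a plain dict, the CFA itself is not modeled, and the register names, offsets, and the `run_cfi` helper are hypothetical.

```python
def run_cfi(directives):
    """Interpret CFI directives in layout order; return a snapshot after each one.

    Simplified model (an assumption of this sketch): the rules are a dict
    mapping a register name to its save-slot offset from the CFA. A register
    missing from the dict still has its function-entry rule.
    """
    rules = {}
    saved_states = []  # stack used by remember_state / restore_state
    trace = []
    for op, *args in directives:
        if op == "offset":            # .cfi_offset reg, off
            reg, off = args
            rules[reg] = off
        elif op == "restore":         # .cfi_restore reg: back to the entry rule
            rules.pop(args[0], None)
        elif op == "remember_state":  # .cfi_remember_state
            saved_states.append(dict(rules))
        elif op == "restore_state":   # .cfi_restore_state
            rules = saved_states.pop()
        else:
            raise ValueError(f"unknown directive: {op}")
        trace.append(dict(rules))
    return trace


# Hypothetical layout: the regular prologue, then the __morestack block.
directives = [
    ("offset", "rbp", -16),   # after push %rbp
    ("offset", "rbx", -24),   # after push %rbx
    ("remember_state",),      # start of the __morestack block
    ("restore", "rbp"),
    ("restore", "rbx"),
    ("restore_state",),       # end of the __morestack block
]
trace = run_cfi(directives)
inside_block = trace[4]   # rules in effect for the __morestack call
after_block = trace[5]    # rules seen by whatever is laid out next
print(inside_block)  # {}
print(after_block)   # {'rbp': -16, 'rbx': -24}
```

Inside the block every saved register is back to its entry rule, and the code laid out after the block sees exactly the rules that were in effect before it.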
I feel this is not the correct meaning of .cfi_remember_state. The directive remembers the current set of CFI rules at that point in the directive stream, so it is tied to block layout rather than to the function's control flow. calculateOutgoingCFAInfo, however, is based on the function's control flow.
For allocMBB, .cfi_remember_state remembers the state of the previous MBB in layout order, not the state of its predecessor MBB, which is the function entry.
I think we should ignore .cfi_remember_state and .cfi_restore_state inside calculateOutgoingCFAInfo, because they are not directives associated with CFI-changing machine instructions.
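The layout-versus-control-flow distinction can be shown with a small model. This is a sketch under assumptions, not the CFIInstrInserter algorithm: the three-block function, the register rule, and the `exit_state` helper are hypothetical, and only register rules are tracked.

```python
def exit_state(entry_rules, directives):
    """Apply one block's directives to a copy of entry_rules (register rules only)."""
    rules = dict(entry_rules)
    for op, *args in directives:
        if op == "offset":
            rules[args[0]] = args[1]
        elif op == "restore":
            rules.pop(args[0], None)
    return rules


# Hypothetical function; dict order is the layout order.
# name -> (directives in the block, CFG predecessors)
blocks = {
    "entry": ([], []),                              # stack-bound check
    "body": ([("offset", "rbp", -16)], ["entry"]),  # regular prologue and body
    "allocMBB": ([], ["entry"]),                    # calls __morestack
}

# Rules in effect at the top of each block when the directives are read
# linearly, which is how .cfi_remember_state sees them.
layout_state = {}
rules = {}
for name, (directives, _preds) in blocks.items():
    layout_state[name] = dict(rules)
    rules = exit_state(rules, directives)

# Rules implied by control flow: the exit state of allocMBB's CFG predecessor.
pred = blocks["allocMBB"][1][0]
cfg_state = exit_state({}, blocks[pred][0])

print(layout_state["allocMBB"])  # {'rbp': -16}
print(cfg_state)                 # {}
```

A .cfi_remember_state at the top of allocMBB would save the first state (inherited from the block laid out before it), while a control-flow analysis would associate the second state with the block.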