Page MenuHomePhabricator

[WinEH] Allocate space in funclets stack to save XMM CSRs
ClosedPublic

Authored by pengfei on Aug 22 2019, 8:09 AM.

Details

Summary

This is an alternate approach to D63396

Currently funclets reuse the same stack slots that are used in the
parent function for saving callee-saved xmm registers. If the parent
function modifies a callee-saved xmm register before an excpetion is
thrown, the catch handler will overwrite the original saved value.

This patch allocates space in funclets stack for saving callee-saved xmm
registers and uses RSP instead RBP to access memory.

Signed-off-by: Pengfei Wang <pengfei.wang@intel.com>

Diff Detail

Repository
rL LLVM

Event Timeline

pengfei created this revision.Aug 22 2019, 8:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 22 2019, 8:09 AM
rnk added a comment.Aug 22 2019, 10:46 AM

Thanks, this seems more straightforward. :)

llvm/lib/Target/X86/X86FrameLowering.cpp
1397–1398 ↗(On Diff #216627)

I think this else can be simplified by using getFrameIndexReferenceSP. The .seh_savexmm directive expects an offset relative to SP-after-prologue, but getFrameIndexReference may use an RBP offset, and the SEHFrameOffset is the difference between them at this point. Making the preference for RSP offsets explicit seems clearer.

At first I wanted to find a way to eliminate the need for the if/else here, but I see it's not possible without a "prefer SP" variant to the new XMM frame index ref helper.

1862 ↗(On Diff #216627)

This calculation is only correct in funclets. I think what I had in mind was that we'd pass in IsFunclet to this method, since all the callers have to calculate it anyway. That way we avoid one more conditional at the call site.

Is 32 always correct? Shouldn't it be MFI.getMaxCallFrameSize()?

2042 ↗(On Diff #216627)

What is the desired change in behavior here? It looks like this can be simplified. SpillSlotOffset is always negative and Align is always positive, so I would expect Remainder to always be zero or negative.

Would this expression calculate the same value?

SpillSlotOffset = -alignTo(-SpillSlotOffset, Align);
2054 ↗(On Diff #216627)

"for funclets"

llvm/lib/Target/X86/X86MachineFunctionInfo.h
41 ↗(On Diff #216627)

Perhaps initialize it to INT32_MIN for safety?

llvm/lib/Target/X86/X86RegisterInfo.cpp
713 ↗(On Diff #216627)

I would expect that for a multi-bb cleanup funclet, MBB.isEHFuncletEntry() wouldn't be sufficient, and you'd need to check if the terminator is a funclet return.

pengfei updated this revision to Diff 216824.Aug 23 2019, 6:38 AM
pengfei marked 5 inline comments as done.

Address review comments.

pengfei marked 2 inline comments as done and an inline comment as not done.Aug 23 2019, 6:59 AM

Thanks for the patient review and valuable suggestions.

llvm/lib/Target/X86/X86FrameLowering.cpp
1397–1398 ↗(On Diff #216627)

The return value of getFrameIndexReferenceSP(MF, FI, IgnoredFrameReg, SEHFrameOffset) has 8 bytes gap as against the former math. It may need more code to eliminate the gap. So it's better to keep it.

1862 ↗(On Diff #216627)

Because we always need condition IsFunclet to select this method in caller emitPrologue.
I changed the function name to getWin64EHFrameIndexRef, which may be more appropriate.

Right, MFI.getMaxCallFrameSize() is more robust.

2042 ↗(On Diff #216627)

The former code doesn't align it in some cases, e.g. SpillSlotOffset = -8 and Align = 64.

Yes, it can be simplified to this. Thanks!

llvm/lib/Target/X86/X86MachineFunctionInfo.h
41 ↗(On Diff #216627)

I have used a FI->Slot map to replace this. It's more robust and flexible.

llvm/lib/Target/X86/X86RegisterInfo.cpp
713 ↗(On Diff #216627)

You are correct. Thanks for the reminding!

pengfei updated this revision to Diff 216989.Aug 23 2019, 5:58 PM
pengfei marked an inline comment as not done.

Fix the error in calculating funclet stack pointer.

rnk accepted this revision.Aug 26 2019, 2:17 PM

lgtm, thanks!

llvm/lib/Target/X86/X86RegisterInfo.cpp
713 ↗(On Diff #216989)

I confirmed this is O(#terminators) instead of O(#instrs), so this isn't O(n^2) in BB size. :)

This revision is now accepted and ready to land.Aug 26 2019, 2:17 PM
This revision was automatically updated to reflect the committed changes.