This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Rewrite stack frame handling for win64 vararg functions
ClosedPublic

Authored by mstorsjo on Jul 26 2017, 2:41 PM.

Details

Summary

The previous attempt, which made do with a single offset in computeCalleeSaveRegisterPairs, wasn't quite enough. The previous attempt only worked as long as CombineSPBump == true (since the offset would be adjusted later in fixupCalleeSaveRestoreStackOffset).

Instead include the size for the fixed stack area used for win64 varargs in calculations in emitPrologue/emitEpilogue. The stack consists of mainly three parts;

  • AFI->getLocalStackSize()
  • AFI->getCalleeSavedStackSize()
  • FixedObject

Most of the places in the code which previously used the CSStackSize now use PrologueSaveSize instead, which is the sum of the latter two, while some cases which need exactly the middle one use AFI->getCalleeSavedStackSize() explicitly instead of a local variable.

In addition to moving the offsetting into emitPrologue/emitEpilogue (which fixes functions with CombineSPBump == false), also set the frame pointer to point to the right location, where the frame pointer and link register actually are stored. In addition to the prologue/epilogue, this also requires changes to resolveFrameIndexReference.

Add tests for a function that keeps a frame pointer and that uses a VLA.

Diff Detail

Event Timeline

mstorsjo created this revision.Jul 26 2017, 2:41 PM

Hmm, I think this still breaks some part of my setup - I need to check it more. But does this look like a good direction?

t.p.northover edited edge metadata.Jul 26 2017, 3:15 PM

There's far too much ad-hoc hackery going on here for my tastes. This is the 3rd (soon to be 4th) patch trying to get something working, I think it's time to step back and really look at the code that's being modified. Or at least come up with a thorough set of tests.

There's far too much ad-hoc hackery going on here for my tastes. This is the 3rd (soon to be 4th) patch trying to get something working, I think it's time to step back and really look at the code that's being modified. Or at least come up with a thorough set of tests.

Yeah. The first approach I had (which redefined the canonical frame address) was ugly but probably didn't have any of these other issues.

I'll try to get all codepaths thoroughly tested here before trying to get the next version landed.

mgrang added a subscriber: mgrang.Jul 28 2017, 11:43 AM

@mstorsjo Could you please add me as subscriber to your COFF ARM64 patches? We are trying to build spec2k/spec2k6 for COFF and running into issues with vararg functions. Some of your patches have fixed our issues :)

@mstorsjo Could you please add me as subscriber to your COFF ARM64 patches? We are trying to build spec2k/spec2k6 for COFF and running into issues with vararg functions. Some of your patches have fixed our issues :)

Sorry, I forgot to add you on this one :-) I've got a revision of this one coming up pretty soon, I'm trying to get all the corner cases tested properly this time. A partial fix is to skip the change in line 536/539, but that change is also needed (with some other bits) for completeness.

mstorsjo updated this revision to Diff 108715.Jul 28 2017, 2:30 PM
mstorsjo retitled this revision from [AArch64] Fix callee saved registers in win64 vararg functions with CombineSPBump=0 to [AArch64] Rewrite stack frame handling for win64 vararg functions.
mstorsjo edited the summary of this revision. (Show Details)

Tim: Updated after a whole lot of more testing, testing all codepaths through the emit prologue/epilogue as far as I can see (tested both with and without a frame pointer), testing unwinding for C++ exceptions through a win64cc function, etc. Hopefully this instills a bit more confidence this time. I've included a few open questions about how to rename variables in the patch message and inline that I hope you can give your opinion on.

@mgrang does this version fix all your issues with vararg functions, or are there still issues that can be traced back to this? With this in place, all of the test code I've tried to build seems to work. (Also, have you tried using lld-link for linking? It seems complete enough for coff/arm64 for me now.)

@mgrang does this version fix all your issues with vararg functions, or are there still issues that can be traced back to this? With this in place, all of the test code I've tried to build seems to work. (Also, have you tried using lld-link for linking? It seems complete enough for coff/arm64 for me now.)

Ping @mgrang - did this solve your issues with vararg functions, or is there still some case that I've missed?

@mgrang does this version fix all your issues with vararg functions, or are there still issues that can be traced back to this? With this in place, all of the test code I've tried to build seems to work. (Also, have you tried using lld-link for linking? It seems complete enough for coff/arm64 for me now.)

Ping @mgrang - did this solve your issues with vararg functions, or is there still some case that I've missed?

Yes, so far we went past all issues related to varargs for spec2000. I think this patch LGTM. Also, we don't use lld-link for linking (we have an in-house ARM linker).

mgrang accepted this revision.Aug 1 2017, 11:58 AM

LGTM.

This revision is now accepted and ready to land.Aug 1 2017, 11:58 AM
t.p.northover added inline comments.Aug 1 2017, 1:00 PM
lib/Target/AArch64/AArch64FrameLowering.cpp
512–514

I'm marginally in favour of all the TODOs, but either way the decision should be made and resolved before committing. FWIW my preferred bikeshed colour would be something like PrologueSaveSize or PrologueStackSize.

mstorsjo updated this revision to Diff 109200.Aug 1 2017, 1:52 PM
mstorsjo edited the summary of this revision. (Show Details)

Applied the TODOs as suggested by Tim, renamed CSStackSize to PrologueSaveSize (and renamed FixedStack to FixedObject), wrapped some long lines with clang-format-diff.

t.p.northover accepted this revision.Aug 1 2017, 1:56 PM

Thanks Martin, I think this looks reasonable now too.

This revision was automatically updated to reflect the committed changes.