This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Reserve a 16 byte aligned amount of fixed stack for win64 varargs
ClosedPublic

Authored by mstorsjo on Jul 21 2017, 4:55 AM.

Details

Summary

Create a dummy 8 byte fixed object for the unused slot below the first stored vararg.

Alternative ideas tested but skipped: One could try to align the whole fixed object to 16, but I haven't found how to add an offset to the stack frame used in LowerWin64_VASTART.

If only the size of the fixed stack object size is padded but not the offset, via MFI.CreateFixedObject(alignTo(GPRSaveSize, 16), -(int)GPRSaveSize, false), PrologEpilogInserter crashes due to "Attempted to reset backwards range!".

This fixes misconceptions about where registers are spilled, since AArch64FrameLowering.cpp assumes the offset from fixed objects is aligned to 16 bytes (and the Win64 case there already manually aligns the offset to 16 bytes).

This fixes cases where local stack allocations could overwrite callee saved registers on the stack.

Diff Detail

Repository
rL LLVM

Event Timeline

mstorsjo created this revision.Jul 21 2017, 4:55 AM

To word the description differently: This makes sure that the generic stack object allocator and the custom callee saved register spiller in AArch64FrameLowering agree about where registers are stored.

t.p.northover added inline comments.Jul 21 2017, 9:49 AM
lib/Target/AArch64/AArch64ISelLowering.cpp
2894–2895 ↗(On Diff #107653)

This is all a bit inconsistent. Parts of it are trying to handle GPRSaveSize % 16 != 8 but it just goes right ahead and makes the slot 8 bytes.

mstorsjo added inline comments.Jul 21 2017, 9:55 AM
lib/Target/AArch64/AArch64ISelLowering.cpp
2894–2895 ↗(On Diff #107653)

Which way would you suggest doing it? In practice it will only ever be 8 or 0. Using 16 - alignTo(size,16) as size for this fixed object? Using some other condition than & 15 for checking if it's needed?

mstorsjo updated this revision to Diff 107803.Jul 22 2017, 2:35 PM
mstorsjo edited the summary of this revision. (Show Details)

Updated the code to be consistent and use generic expressions for the size of the filler, instead of hardcoding 8 bytes. Added a comment to indicate that it in practice always will be 8 anyway.

Added a larger testcase for the case where the previous code broke down, where a local stack allocation clobbered a callee saved register. (This is stripped down from building parts of the MS ucrt headers - is this ok, or should I skip it?)

t.p.northover accepted this revision.Jul 24 2017, 1:59 PM

Thanks, this looks reasonable now I think.

This revision is now accepted and ready to land.Jul 24 2017, 1:59 PM
This revision was automatically updated to reflect the committed changes.