This is an archive of the discontinued LLVM Phabricator instance.

[PPC64] Add split-stack support.
ClosedPublic

Authored by sfertile on Sep 14 2018, 8:26 AM.

Details

Summary

This support is slightly different then the X86_64 implementation in that calls to __morestack don't need to get rewritten to calls to __moresatck_non_split when a split-stack caller calls a non-split-stack callee. Instead the size of the stack frame requested by the caller is adjusted prior to the call to __morestack. The size the stack-frame will be adjusted by is tune-able through a new --split-stack-adjust-size option.

Diff Detail

Repository
rL LLVM

Event Timeline

sfertile created this revision.Sep 14 2018, 8:26 AM
ruiu added inline comments.Sep 14 2018, 9:41 AM
ELF/Arch/PPC64.cpp
663 ↗(On Diff #165494)

It perhaps needs a high-level comment describing what instruction sequence is rewritten to what instruction sequence.

665 ↗(On Diff #165494)

It's not clear to me what this code does -- is this PPC-specific thing?

676 ↗(On Diff #165494)

nit: remove {}

681 ↗(On Diff #165494)

Ditto

685 ↗(On Diff #165494)

nit: I'd name HiImm and LoImm

We usually define one variable per line, so

int16_t HiImm = 0;
int16_t LoImm = 0;
694 ↗(On Diff #165494)

Can this happen?

703 ↗(On Diff #165494)

Is this an ABI violation, or legitimate code has such code sequence in which an instruction other than addi or nop follows the first instruction?

ELF/Driver.cpp
284–285 ↗(On Diff #165494)

This may be a silly question, but why we have --split-stack-adjust-size only for PPC64 and not for x86-64? If we can live without that option with x86-64, why do we need it for PPC64?

ELF/Target.h
80 ↗(On Diff #165494)

This should be a boolean public member variable instead of a virtual function call for the sake of simplicity.

sfertile added inline comments.Sep 14 2018, 11:31 AM
ELF/Arch/PPC64.cpp
665 ↗(On Diff #165494)

Yes, its V2 abi specific. A function can have 2 entry points:

  1. A global entry point which will setup the TOC pointer if the function needs it.
  2. A local entry point, which will be the entry if the caller shares the same value for the TOC pointer, this skips over the TOC pointer initialization.

The 3 most significant bits of the functions st_other flags are used to encode the offset from the global entry to the local entry.

This is something we are already doing in getRelocTargetVA for R_PPC_CALL RelExpr and I believe it comes up in the position-dependent long branch thunks patch I posted. I'll move the calculation into its own function and try to explain the how and why more clearly there so its not so opaque to people not familiar with PowerPC64 abi.

694 ↗(On Diff #165494)

I believe so, but I'm not 100% sure. I'm extrapolating from the x86-64-split-stack-prologue-adjust-silent.s test which implies that having a .note.GNU-no-split-stack section tells the linker there may be unrecognizable prologues. I wouldn't expect a compiler to generate such a prologue but was guessing it should be valid to write a custom prologue that called __morestack_non_split directly and the program would still be valid.

703 ↗(On Diff #165494)

Like the above answer I'm not really sure. I believe a compliant compiler is expected to always emit this sequence but I don't think its necessarily disallowed to emit a different sequence. The behavior with the GNU-no-split-stack note indicates to me that different sequences are allowed but the handshake with the linker is broken and its up to the user to have implemented something that makes sense, but I can't find this documented anywhere,

ELF/Driver.cpp
284–285 ↗(On Diff #165494)

I'm not very familiar with the x86-64 implementation, but from my understanding on x86-64 you don't have the room to rewrite the instructions to test with the adjusted stack size for all of the supported prologues types (and/or arbitrary stack frame adjustment sizes). So instead of adjusting the check in the function prologue you have to always jump to __morestack_non_split and rely on it doing the adjusted check. On PowerPC we will always emit 2 instructions for adjusting the stack, which is enough to rewrite the adjustment for any supported stack frame size, so we don't need to call __more_stack_non_split.

ruiu added inline comments.Sep 14 2018, 1:26 PM
ELF/Arch/PPC64.cpp
665 ↗(On Diff #165494)

A documentation about that kind of ABI-specific stuff is sometimes extremely useful to understand code. Can you write a file comment describing what you just wrote? If there are other PPC-specific things, it's worth to be written as well.

sfertile marked an inline comment as done.Sep 18 2018, 7:01 AM
sfertile added inline comments.
ELF/Arch/PPC64.cpp
665 ↗(On Diff #165494)

I've posted https://reviews.llvm.org/D52231 which is an nfc patch adding a helper function to calculate the offset as well as document the differences between global entry and local entry. I'll update this patch to be dependent on that one.

sfertile updated this revision to Diff 166062.Sep 18 2018, 7:09 PM
  • Added an explanation of the expected prologue instructions and how the linker is supposed to modify them.
  • Add check for overflow on the adjusted stack size
  • Add check that the register operands are the expected ones.
  • Changed global-entry to local-entry offset calculation to use new helper function.
sfertile marked 5 inline comments as done.Sep 18 2018, 7:10 PM
sfertile edited the summary of this revision. (Show Details)
ruiu added inline comments.Sep 19 2018, 7:09 AM
ELF/Arch/PPC64.cpp
717 ↗(On Diff #166062)

there's an extra space after "prologue"

761 ↗(On Diff #166062)

I think this can be triggered by feeding an object file that doesn't comply with the ABI standard, so this shouldn't be an assert. I'd just omit this check.

794 ↗(On Diff #166062)

Do you need a cast?

sfertile updated this revision to Diff 168313.Oct 4 2018, 9:18 AM

Removed extra space in comment, and incorrect assert.

sfertile marked an inline comment as done.Oct 4 2018, 9:24 AM
sfertile added inline comments.
ELF/Arch/PPC64.cpp
761 ↗(On Diff #166062)

Good point. I've updated to ensure that if LoImm is set and the second instruction is not a nop or we don't adjust the stack.

794 ↗(On Diff #166062)

Since LoImm and HiImm can be negative I added the casts so that we zero extend in the conversion from 16 bits to 32 bits. I could instead add a mask for the lower 16bits if that is clearer eg write32(Loc + 4, 0x3D810000 | (0x0000FFFF & HiImm));

ruiu accepted this revision.Oct 4 2018, 9:57 AM

LGTM

ELF/Config.h
220 ↗(On Diff #168313)

Replace two spaces with one space.

This revision is now accepted and ready to land.Oct 4 2018, 9:57 AM
This revision was automatically updated to reflect the committed changes.