Page MenuHomePhabricator

[PowerPC] Only use PLT annotations if using PIC relocation model
ClosedPublic

Authored by jhibbits on Nov 21 2019, 2:39 PM.

Details

Summary

The default static (non-PIC, non-PIE) model for 32-bit powerpc does not
use @PLT annotations and relocations in GCC. LLVM shouldn't use @PLT
annotations either, because it breaks secure-PLT linking with GNU LD.

Update the available-externally.ll test to reflect that default mode should be
the same as the static relocation, by using the same check prefix.

Diff Detail

Event Timeline

jhibbits created this revision.Nov 21 2019, 2:39 PM

This doesn't seem quite sufficient for FreeBSD kernel modules. It's emitting R_PPC_REL24 instead of R_PPC_ADDR16_HA/R_PPC_ADDR16_LO pairs for stuff like memset and memcpy, which are unusable because we're running modules in KVA and the kernel text in the 32 bit DMAP. Still need to figure out why these are being emitted this way when using freestanding.

vit9696 added a subscriber: vit9696.

This change makes good sense to me, as I suggested pretty much the same thing 2 years ago (D38554). In the end for us it was easier to fix LLD and switch to GNU ABI ignoring IFUNC support entirely than to convince the others and support the change. Yet either way believe that it is logical not to emit PLT for static memory model. Personally I can accept this, but I wonder whether there are more comments coming.

On the FreeBSD kernel side, I put some time into working out the missing parts on ppc32 and slightly extending the kernel linker to support secure-plt PIC kernel modules. Working on that over at https://reviews.freebsd.org/D22608.

emaste added a subscriber: emaste.Dec 5 2019, 9:56 AM

Hi Justin,

I'd like to help reviewing this patch. I know how calls work in the 2 64-bit ELF abis pretty well but am not familiar with 32-bit PowerPC at all so I have a handful of questions to start:

  1. I've been using this ABI DOC. Is this the 32-bit ABI supported in LLVM? Are there other 32-bit ELF PowerPC Abis I need to be aware of?
  2. What secure-plt reference or references should I read to get up to speed?

Finally I'd like to just verify my base understanding of the difference between bl sym and bl sym@PLT.

bl sym will produce a call instruction with an R_PPC_REL24 relocation, which will tell the linker to fill in the instruction with the offset from the program counter at the call instruction to sym.
bl sym@PLT will produce a call instruction with an R_PPC_PLTREL24 which will instruct the linker to create a PLT, allocate a PLT entry for sym, and fill in the instruction with the offset from the program counter to the PLT entry for 'sym'.

I'm used to having 1 relocation type for both calls, and the linker can determine based on the properties of 'sym' and whether its linking a shared object or not whether the call needs to indirect through the PLT.

  • If you emit a call instruction with the PLTREL relocation but the symbol ends up module local, does the linker convert the call to a local one?
  • Is it an error to not use a PLTREL relocation for a call that needs to be indirected through the PLT or is the linker smart enough to add a PLT entry for the symbol and branch to there instead?

This doesn't seem quite sufficient for FreeBSD kernel modules. It's emitting R_PPC_REL24 instead of R_PPC_ADDR16_HA/R_PPC_ADDR16_LO pairs for stuff like memset and memcpy,
which are unusable because we're running modules in KVA and the kernel text in the 32 bit DMAP. Still need to figure out why these are being emitted this way when using freestanding.

With those relocations, are you building up the address of memset/memcpy directly and performing an indirect call?
eg, something along the lines of:

addis r12, r0, memset@ha
addi r12, r12, memset@l
mtctr r12
bctrl
sfertile removed a subscriber: sfertile.

@Bdragon28 we might need to use -mlongcall for modules, if we aren't already. This should(?) force the compiler to generate the appropriate long-distance code sequences. I'm not sure, though, if clang supports this yet.

@sfertile Thanks for reviewing this. That ABI spec has been superseded generally by https://www.polyomino.org.uk/publications/2011/Power-Arch-32-bit-ABI-supp-1.0-Unified.pdf, which unifies that with the Linux ABI(s) and embedded ABI(s). That also includes the secure-PLT reference.

Your overall understanding of the relocations involved and code generated is accurate. To answer your questions:

  • Newer linkers (GNU ld post-2.17.50) might be smart enough to convert to a local call. However, GNU ld uses PLTREL relocations without the +0x8000 offset as an indicator of "BSS-PLT" in its heuristics. This is what led to this change.
  • It is an error to not use a PLTREL relocation for a call that needs to be indirected through the PLT. The linker will complain to recompile with -fPIC.
MaskRay added a subscriber: MaskRay.EditedDec 10 2019, 10:50 AM

Hi Justin,

I'd like to help reviewing this patch. I know how calls work in the 2 64-bit ELF abis pretty well but am not familiar with 32-bit PowerPC at all so I have a handful of questions to start:

  1. I've been using this ABI DOC. Is this the 32-bit ABI supported in LLVM? Are there other 32-bit ELF PowerPC Abis I need to be aware of?
  2. What secure-plt reference or references should I read to get up to speed?

Power Architecture® 32-bit Application Binary Interface Supplement 1.0 - Linux® & Embedded (https://www.polyomino.org.uk/publications/2011/Power-Arch-32-bit-ABI-supp-1.0-Unified.pdf) is the newest version I can find. You can find secure PLT information there. Unfortunately I think a lot of details are undocumented. They are essentially implementation details in gcc and binutils, so we have to observe gcc/as/ld output or read their source code.

@jhibbits This patch needs a rebase after D70126.

Finally I'd like to just verify my base understanding of the difference between bl sym and bl sym@PLT.

bl sym will produce a call instruction with an R_PPC_REL24 relocation, which will tell the linker to fill in the instruction with the offset from the program counter at the call instruction to sym.

Matches my observations.

bl sym@PLT will produce a call instruction with an R_PPC_PLTREL24 which will instruct the linker to create a PLT, allocate a PLT entry for sym, and fill in the instruction with the offset from the program counter to the PLT entry for 'sym'.

This is like R_X86_64_PLT32. A PLT entry may be optimized out if it is not necessary (the symbol is non-preemptible). See

// lld/ELF/Relocations.cpp
static RelExpr fromPlt(RelExpr expr) {
  // We decided not to use a plt. Optimize a reference to the plt to a
  // reference to the symbol itself.
  switch (expr) {
  case R_PLT_PC:
  case R_PPC32_PLTREL:
    return R_PC;

The tricky part is that the addend may be 0 (BSS-PLT or Secure-PLT -fpic/-fpie) or 0x8000 (Secure-PLT -fPIC/-fPIE).

As an aside: https://reviews.llvm.org/D70937 was committed to fix PPC64 problems, but we need to think how to make the code more general (less PPC32 specific hack).

I'm used to having 1 relocation type for both calls, and the linker can determine based on the properties of 'sym' and whether its linking a shared object or not whether the call needs to indirect through the PLT.

  • If you emit a call instruction with the PLTREL relocation but the symbol ends up module local, does the linker convert the call to a local one?

It does.

  • Is it an error to not use a PLTREL relocation for a call that needs to be indirected through the PLT or is the linker smart enough to add a PLT entry for the symbol and branch to there instead?

For Secure PLT -fPIC/-fPIE which uses 0x8000 as the addend of an R_PPC_PLTREL24 relocation, R_PPC_PLTREL24 must be used. It is an error in GNU ld and lld to use R_PPC_REL24.

@MaskRay , @jhibbits Thank you for providing the correct ABI doc, and for the answers. I'll start going through it.

@Bdragon28 we might need to use -mlongcall for modules, if we aren't already. This should(?) force the compiler to generate the appropriate long-distance code sequences. I'm not sure, though, if clang supports this yet.

Clang does support it, there is a test test/Driver/ppc-features.cpp. Only calls where the callee is a GlobalAddressSDNode are transformed though. ExternalSymbolSDNode callees might need to be transformed as well: For example I believe the selection dag can create calls to memcpy/memset/memmove where the callee is an ExternalSymbolSDNode. Is transforming every call into a direct call overkill though? Would it be possible to perform a direct all to any local callee, and indirect calls for external symbols?

IIUC this patch is a pretty drastic change. Take fopr example the following IR:

target datalayout = "E-m:e-p:32:32-i64:64-n32"
target triple = "powerpc-unknown-linux-gnu-unknown"

@Dst = external global [512 x i32], align 4
@Src = external global [512 x i32], align 4
define dso_local void @caller(i32 %i) local_unnamed_addr {
entry:
  %call = tail call i8* @memcpy(i8* bitcast ([512 x i32]* @Dst to i8*), i8* bitcast ([512 x i32]* @Src to i8*), i32 %i)
  ret void
}
declare i8* @memcpy(i8*, i8*, i32) local_unnamed_addr

!llvm.module.flags = !{!0}
!0 = !{i32 1, !"wchar_size", i32 4}

Will now produce: bl memcpy instead of bl memcpy@PLT

Combined with:

  • It is an error to not use a PLTREL relocation for a call that needs to be indirected through the PLT. The linker will complain to recompile with -fPIC.

Means you can't link static execs against shared objects anymore. Is this the intent? Or am I missing something obvious?

@MaskRay , @jhibbits Thank you for providing the correct ABI doc, and for the answers. I'll start going through it.

  • It is an error to not use a PLTREL relocation for a call that needs to be indirected through the PLT. The linker will complain to recompile with -fPIC.

Means you can't link static execs against shared objects anymore. Is this the intent? Or am I missing something obvious?

My phrasing wasn't quite right, I guess.
The linker will create the PLT stubs by nature of it being an external shared object, or something to that effect. What I really meant was that it's an error to need a PLT and not have it (linker will warn or error if you try to create a .so and use anything other than PLTREL relocations).

  • It is an error to not use a PLTREL relocation for a call that needs to be indirected through the PLT. The linker will complain to recompile with -fPIC.

Means you can't link static execs against shared objects anymore. Is this the intent? Or am I missing something obvious?

My phrasing wasn't quite right, I guess.
The linker will create the PLT stubs by nature of it being an external shared object, or something to that effect. What I really meant was that it's an error to need a PLT and not have it (linker will warn or error if you try to create a .so and use anything other than PLTREL relocations).

Ok, that makes sense then.

Like MaskRay mentioned previously, you will have to rebase this as there was a pretty large refactor that landed to the call lowering code.

MaskRay accepted this revision.Dec 17 2019, 6:18 PM

Since I recently picked up some stuff on lld PPC32, I have run powerpc-linux-gnu-gcc many times to observe its behaviors. This matches what I see, but you probably also need @sfertile's thumbs-up.

because it breaks secure-PLT linking with GNU LD

Can you be more specific? I guess this is because clang emitted R_PPC_PLTREL24 causes GNU ld to fall back to --bss-plt due to a deficiency in its bfd/elf32-ppc.c:ppc_elf_select_plt_layout code.

powerpc-linux-gnu-ld: bss-plt forced due to b.o

This revision is now accepted and ready to land.Dec 17 2019, 6:18 PM
sfertile accepted this revision as: sfertile.Dec 18 2019, 6:02 AM

LGTM also. I have one minor question: Is it worthwhile to add a small lit test explicitly for this? ie.

  1. we don't emit the '@PLT' annotation on a call when linking with static-relocation model.
  2. emit the annotation with PIC relocation model.

That gives us somewhere a bit more visible you can add the comment about how emitting the annotation for static relocation model breaks secure-plt linking with ld.bfd.

jhibbits updated this revision to Diff 234542.Dec 18 2019, 8:03 AM

Update diff. It would help if I actually compiled when rebasing...

@sfertile I figured the available-externally.ll test would suffice for this.
Do you think a separate test is needed, or a comment is warranted on this
test for it?

Update diff. It would help if I actually compiled when rebasing...

@sfertile I figured the available-externally.ll test would suffice for this.
Do you think a separate test is needed, or a comment is warranted on this
test for it?

I see the available externally test more as a test of that specific linkage type, but anyone that tried to update this behavior would need to modify that test, and see the comment so I'd be equally happy with either.

sfertile added inline comments.Dec 18 2019, 8:22 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5110–5116

Minor nit: I belive clang-format will produce a different indentation for this. can you run that before committing.

jhibbits updated this revision to Diff 234550.Dec 18 2019, 8:28 AM

Updated diff after clang-format. I actually had run git-clang-format on it
before submitting, but forgot to amend it to the change before submitting
for review.

This revision was automatically updated to reflect the committed changes.