This is an archive of the discontinued LLVM Phabricator instance.

[ELF][PPC64] Implement IPLT code sequence for non-preemptible IFUNC
ClosedPublic

Authored by MaskRay on Dec 13 2019, 9:30 PM.

Details

Summary

Non-preemptible IFUNCs are placed in in.iplt (.glink on EM_PPC64). If
there is a non-GOT non-PLT relocation, for pointer equality, we change
the type of the symbol from STT_IFUNC to STT_FUNC and bind it to the
.glink entry.

On EM_386, EM_X86_64, EM_ARM, and EM_AARCH64, the PLT code sequence
loads the address from its associated .got.plt slot. An IPLT also has an
associated .got.plt slot and can use the same code sequence.

On EM_PPC64, the PLT code sequence is actually a bl instruction in
.glink. It jumps to __glink_PLTresolve (the PLT header), and
__glink_PLTresolve computes the .plt slot (relocated by
R_PPC64_JUMP_SLOT).

An IPLT does not have an associated R_PPC64_JUMP_SLOT, so we cannot use
bl in .iplt. Instead, create a call stub with a code sequence similar
to PPC64PltCallStub. We don't save the TOC pointer, so the following
scenario will not work: a function pointer to a non-preemptible ifunc
that resolves to a function defined in another DSO. This is the
restriction described by https://sourceware.org/glibc/wiki/GNU_IFUNC
(though on many architectures it works in practice):

Requirement (a): Resolver must be defined in the same translation unit as the implementations.

If the address of an ifunc is taken but the ifunc is never called,
technically we don't need an entry for it, but we currently create one.
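
The call stub described above can be sketched as a small encoder. This is a minimal sketch under stated assumptions, not lld's actual code: encode_iplt, ha16 and lo16 are hypothetical names, and toc_off is assumed to be the distance from the TOC base (held in r2) to the ifunc's .got.plt slot. It emits a load-and-branch sequence of the kind a PLT call stub uses, without a std r2, 24(r1) instruction in front of it.

```c
#include <assert.h>
#include <stdint.h>

/* High-adjusted (@ha) and low (@l) 16-bit halves of a TOC-relative offset. */
static uint16_t ha16(int64_t v) {
  return (uint16_t)(((uint64_t)v + 0x8000) >> 16);
}
static uint16_t lo16(int64_t v) {
  return (uint16_t)((uint64_t)v & 0xffff);
}

/* Hypothetical helper: write a 4-instruction IPLT-style sequence to out[].
 * toc_off is the assumed offset of the .got.plt slot from the TOC base. */
void encode_iplt(uint32_t out[4], int64_t toc_off) {
  /* ld is a DS-form instruction, so the displacement must be 4-byte aligned. */
  assert((toc_off & 3) == 0);
  out[0] = 0x3d820000u | ha16(toc_off); /* addis r12, r2, toc_off@ha  */
  out[1] = 0xe98c0000u | lo16(toc_off); /* ld    r12, toc_off@l(r12)  */
  out[2] = 0x7d8903a6u;                 /* mtctr r12                  */
  out[3] = 0x4e800420u;                 /* bctr                       */
}
```

The @ha half is rounded up whenever the low half has its sign bit set, because the ld displacement is sign-extended when it is added to r12.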

This patch makes

// clang -fuse-ld=lld a.c
#include <stdio.h>
static void impl(void) { puts("meow"); }
void thefunc(void) __attribute__((ifunc("resolver")));
void *resolver(void) { return &impl; }
int main(void) {
  thefunc();
  void (*theptr)(void) = &thefunc;
  theptr();
}

work on Linux glibc and FreeBSD. Calling a function pointer that points
to a non-preemptible IFUNC never worked before.

Event Timeline

MaskRay created this revision. Dec 13 2019, 9:30 PM
MaskRay added a comment. Edited Dec 13 2019, 10:01 PM

This patch solves part of the problem, but there is another issue.

#include <stdio.h>
static void impl(void) { puts("meow"); }
void thefunc(void) __attribute__((ifunc("resolver")));
void *resolver(void) { return &impl; }
int main(void) {
  thefunc();
  void (*theptr)(void) = &thefunc;
  theptr();
}

GNU ld for PowerPC tries very hard to keep the type of thefunc as STT_IFUNC and may produce many R_PPC64_IRELATIVE relocations (one for the GOT entry of the IPLT entry, one for the R_PPC64_ADDR64 relocating the .toc entry, etc.).

% ~/llvm/Release/bin/clang -fPIC -fuse-ld=bfd -g b.c -o b.bfd
% readelf -r b.bfd
Relocation section '.rela.dyn' at offset 0x380 contains 3 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
000000001001ff08  0000000200000026 R_PPC64_ADDR64         0000000000000000 __gmon_start__ + 0
000000001001ff28  00000000000000f8 R_PPC64_IRELATIVE                         10000660
0000000010020028  00000000000000f8 R_PPC64_IRELATIVE                         10000660
...
% readelf -Ws b | grep thefunc
    50: 0000000010000480     0 NOTYPE  LOCAL  DEFAULT   12 00000019.plt_call.thefunc
    66: 0000000010000660    32 IFUNC   GLOBAL DEFAULT [<localentry>: 8]    12 thefunc
% ./b.bfd
meow
meow

lld creates a canonical PLT when a non-PLT non-GOT relocation is found. The type of thefunc is changed from STT_IFUNC to STT_FUNC.

The direct function call thefunc() jumps to the call stub __plt_thefunc, which takes care of saving and setting up the TOC pointer.

The function pointer call triggers an ld.so assertion failure. The likely cause is that the .glink entry (the iplt entry, a bl instruction) jumps to the PLT header (__glink_PLTresolve). __glink_PLTresolve computes a PLT index from the address of the bl instruction, and the ld.so resolver/binder fails because that index does not correspond to a valid R_PPC64_JUMP_SLOT.
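
The suspected failure can be illustrated with a toy model. This is only a sketch under assumptions: each .glink call entry is taken to be a single 4-byte instruction, the index is taken to be the entry's distance from the first call entry divided by 4, and glink_index and has_jump_slot are hypothetical names rather than anything from lld or ld.so.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { GLINK_ENTRY_SIZE = 4 }; /* assumed: one branch instruction per entry */

/* Index that the lazy resolver would derive from the entry's own address. */
uint64_t glink_index(uint64_t first_entry, uint64_t entry) {
  assert(entry >= first_entry);
  assert((entry - first_entry) % GLINK_ENTRY_SIZE == 0);
  return (entry - first_entry) / GLINK_ENTRY_SIZE;
}

/* The binder can only service an index that has an R_PPC64_JUMP_SLOT. */
bool has_jump_slot(uint64_t index, uint64_t num_jump_slots) {
  return index < num_jump_slots;
}
```

With one regular PLT entry and an iplt entry placed right after it, the iplt entry yields index 1 while only index 0 has a jump slot relocation, which matches the kind of mismatch the assertion reports.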

% ~/llvm/Release/bin/clang -fPIC -fuse-ld=lld -g b.c -o b.lld
% readelf -r b.lld
Relocation section '.rela.dyn' at offset 0x468 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000010030be0  00000000000000f8 R_PPC64_IRELATIVE                         10010700
...
% readelf -Ws b.lld | grep thefunc                                          
    22: 00000000100108e8    20 FUNC    LOCAL  DEFAULT   14 __plt_thefunc
    39: 00000000100109b8     0 FUNC    GLOBAL DEFAULT [<localentry>: 8]    17 thefunc
% ./b
meow
Inconsistency detected by ld.so: dl-runtime.c: 80: _dl_fixup: Assertion `ELFW(R_TYPE)(reloc->r_info) == ELF_MACHINE_JMP_SLOT' failed!

How x86_64 ifunc works

Disassembly of section .plt:

0000000000201830 <puts@plt-0x10>:
  201830:       ff 35 f2 21 00 00       push   QWORD PTR [rip+0x21f2]        # 203a28 <__TMC_END__+0x8>
  201836:       ff 25 f4 21 00 00       jmp    QWORD PTR [rip+0x21f4]        # 203a30 <__TMC_END__+0x10>
  20183c:       0f 1f 40 00             nop    DWORD PTR [rax+0x0]

0000000000201840 <puts@plt>:
  201840:       ff 25 f2 21 00 00       jmp    QWORD PTR [rip+0x21f2]        # 203a38 <puts@GLIBC_2.2.5>
  201846:       68 00 00 00 00          push   0x0
  20184b:       e9 e0 ff ff ff          jmp    201830 <_fini+0x14>

0000000000201850 <thefunc>:
  201850:       ff 25 ea 21 00 00       jmp    QWORD PTR [rip+0x21ea]        # 203a40 <__TMC_END__+0x20>
###### push 0x1 is incorrect w/o this patch, but the following two instructions aren't used anyway
  201856:       68 01 00 00 00          push   0x1
  20185b:       e9 d0 ff ff ff          jmp    201830 <_fini+0x14>

Relocation section '.rela.dyn' at offset 0x470 contains 3 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000202a00  0000000100000006 R_X86_64_GLOB_DAT      0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0
0000000000202a08  0000000200000006 R_X86_64_GLOB_DAT      0000000000000000 __gmon_start__ + 0
0000000000203a40  0000000000000025 R_X86_64_IRELATIVE                        201740

Relocation section '.rela.plt' at offset 0x4b8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000203a38  0000000500000007 R_X86_64_JUMP_SLOT     0000000000000000 puts@GLIBC_2.2.5 + 0

The iplt entry loads the .got.plt slot, which has been resolved by an R_X86_64_IRELATIVE. Note that an iplt entry appears to supply a PLT index to the PLT header, but those two instructions are actually ignored.
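
The mechanism can be mimicked in plain C. This is a toy simulation, not linker or loader code: got_plt_slot stands in for the .got.plt slot, apply_irelative for the loader processing the IRELATIVE relocation, and iplt_thefunc for the indirect jump in the iplt entry. All of these names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_t)(void);

static int impl(void) { return 42; }

/* The ifunc resolver: returns the implementation to use. */
static fn_t resolver(void) { return impl; }

/* Stand-in for the .got.plt slot associated with the iplt entry. */
static fn_t got_plt_slot = NULL;

/* Loader side: an IRELATIVE relocation means "call the resolver named by
 * the addend and store its result in the relocated slot". */
void apply_irelative(void) { got_plt_slot = resolver(); }

/* Linker side: the iplt entry is an indirect jump through the slot. */
int iplt_thefunc(void) {
  assert(got_plt_slot != NULL); /* relocation must have been applied */
  return got_plt_slot();
}
```

Both a direct call and a call through a pointer to iplt_thefunc end up in impl, which is why binding the symbol to the iplt entry preserves pointer equality.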

How AArch64 ifunc works

I don't have AArch64 libraries on my machine, but they are irrelevant to our discussion.

clang -target aarch64-linux -fuse-ld=lld ifunc.c -nostdlib -o ifunc -Wl,--defsym=puts=0

Disassembly of section .plt:

00000000002101f0 thefunc:
  2101f0: 90 00 00 90                   adrp    x16, #65536
  2101f4: 11 02 41 f9                   ldr     x17, [x16, #512]
  2101f8: 10 02 08 91                   add     x16, x16, #512
  2101fc: 20 02 1f d6                   br      x17

Relocation section '.rela.dyn' at offset 0x158 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000220200  0000000000000408 R_AARCH64_IRELATIVE                       210178

Similar to x86_64, an AArch64 iplt entry loads a .got.plt slot.
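
As a quick check of the dump above, the slot address can be recomputed from the adrp/ldr pair. The helper names are hypothetical, and the sketch assumes the disassembler prints the adrp operand as a byte offset from the instruction's page.

```c
#include <stdint.h>

/* adrp: 4 KiB page of the instruction's address plus a byte offset. */
uint64_t adrp_result(uint64_t pc, int64_t byte_off) {
  return (pc & ~UINT64_C(0xfff)) + (uint64_t)byte_off;
}

/* Address read by `adrp x16, #adrp_off; ldr x17, [x16, #ldr_off]`. */
uint64_t aarch64_iplt_slot(uint64_t pc, int64_t adrp_off, uint64_t ldr_off) {
  return adrp_result(pc, adrp_off) + ldr_off;
}
```

For the entry at 0x2101f0 with operands 65536 and 512 this gives 0x220200, the offset of the R_AARCH64_IRELATIVE relocation.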

How ARM works

Disassembly of section .plt:

000111a0 thefunc:
   111a0: 00 c6 8f e2                   add     r12, pc, #0, #12
   111a4: 01 ca 8c e2                   add     r12, r12, #4096
   111a8: 08 f0 bc e5                   ldr     pc, [r12, #8]!

000111ac $d:
   111ac:       d4 d4 d4 d4     .word   0xd4d4d4d4

Relocation section '.rel.dyn' at offset 0x134 contains 1 entry:
 Offset     Info    Type                Sym. Value  Symbol's Name
000121b0  000000a0 R_ARM_IRELATIVE

Similar.
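
The same check works for the ARM entry. The helper below is hypothetical and assumes ARM state, where reading pc yields the address of the current instruction plus 8.

```c
#include <stdint.h>

/* Address loaded into pc by the three-instruction ARM iplt entry. */
uint32_t arm_iplt_slot(uint32_t entry, uint32_t add1, uint32_t add2,
                       uint32_t ldr_off) {
  uint32_t r12 = entry + 8 + add1; /* add r12, pc, #add1        */
  r12 += add2;                     /* add r12, r12, #add2       */
  return r12 + ldr_off;            /* ldr pc, [r12, #ldr_off]!  */
}
```

For the entry at 0x111a0 with immediates 0, 4096 and 8 this gives 0x121b0, the offset of the R_ARM_IRELATIVE relocation.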

MaskRay updated this revision to Diff 233944. Dec 14 2019, 1:35 PM
MaskRay retitled this revision from [ELF][PowerPC] Set PltSection alignment to 4 and fix IPLT index to [ELF][PPC64] Fix IPLT entry in .glink.
MaskRay edited subscribers, added: kbarton; removed: wuzish, kristof.beyls, arphaman and 2 others.

.

MaskRay updated this revision to Diff 233948. Dec 14 2019, 6:06 PM
MaskRay retitled this revision from [ELF][PPC64] Fix IPLT entry in .glink to [ELF][PPC64] Implement IPLT code sequence for non-preemptible IFUNC.
MaskRay edited the summary of this revision.
MaskRay removed a subscriber: Bdragon28.

Rebase on other patches

MaskRay edited the summary of this revision. Dec 14 2019, 6:18 PM
MaskRay updated this revision to Diff 233979. Dec 15 2019, 10:11 AM
MaskRay edited the summary of this revision.
MaskRay added reviewers: Restricted Project, Bdragon28, ruiu, sfertile.
MaskRay removed a subscriber: Bdragon28.

Add reviewers

MaskRay updated this revision to Diff 234138. Dec 16 2019, 1:36 PM
MaskRay edited the summary of this revision.

Update tests

ruiu added a comment. Dec 16 2019, 8:10 PM

Looks good but I want someone from PPC side to take a look at this change.

I'm not super knowledgeable on IFuncs so I really appreciate the detailed descriptions you've provided. I'll start brushing up on ifuncs so I can review this. I have one early observation though: if it's valid to tail call a non-preemptible ifunc (and since it's non-preemptible I'm assuming it is tail-callable), then saving the TOC pointer in the stub is not safe.

Take for example a function A that calls B, which tail calls C, where C is a non-preemptible ifunc, and A and B are either in different linkage modules or B is preemptible.

A --> Plt stub --> B --> IPltstub --> C

The PLT stub between A and B saves the TOC pointer from A's module to the linkage area. When B tail calls C, it has already popped the stack frame it allocated (if it had one), so the IPLT stub will overwrite the TOC pointer from A's module with the TOC pointer shared by B and C. When we return to the call site in A we will restore the wrong TOC pointer. Since the IFunc is non-preemptible and LLD doesn't support multiple TOC bases in a module (I believe bfd has an option for this when using the small code model, but I am not sure), I would think the caller and callee have to share the same TOC pointer, so no TOC save is needed.
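
The scenario can be replayed with a toy model of the two pieces of state involved. This is a simulation only: struct machine and toc_after_return are hypothetical, r2 models the TOC pointer register, and toc_save models the TOC save slot at 24(r1) in A's frame.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy state: r2 is the TOC pointer, toc_save is the slot at 24(r1). */
struct machine {
  uint64_t r2;
  uint64_t toc_save;
};

/* Returns the r2 value A ends up with after
 * A -> PLT stub -> B -> IPLT stub -> C, where B tail calls C. */
uint64_t toc_after_return(uint64_t toc_a, uint64_t toc_bc,
                          bool iplt_saves_toc) {
  struct machine m = { .r2 = toc_a, .toc_save = 0 };

  m.toc_save = m.r2; /* PLT stub A->B: std r2, 24(r1)          */
  m.r2 = toc_bc;     /* B sets up the TOC shared by B and C    */

  /* B has popped its frame before the tail call, so r1 is A's again
   * and 24(r1) names the very slot the first stub wrote. */
  if (iplt_saves_toc)
    m.toc_save = m.r2; /* IPLT stub: std r2, 24(r1)            */

  /* C returns straight to A's call site, which reloads the TOC. */
  m.r2 = m.toc_save;   /* ld r2, 24(r1)                        */
  return m.r2;
}
```

With a saving IPLT stub, A resumes with the TOC pointer of B and C. Without the save, A gets its own TOC pointer back.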

MaskRay updated this revision to Diff 234376. Dec 17 2019, 1:05 PM
MaskRay edited the summary of this revision.

Delete std r2, 24(r1) from the IPLT code sequence. If my understanding of sfertile's example is correct, we have to accept some loss of functionality: either we can't use ifunc tail calls, or we can't support a non-preemptible ifunc resolving to a function in a different module.

It seems that we can sacrifice the latter because https://sourceware.org/glibc/wiki/GNU_IFUNC says:

Requirement (a): Resolver must be defined in the same translation unit as the implementations.

Making it work is an extension (I believe it works on EM_386, EM_X86_64, EM_ARM, and EM_AARCH64). Unfortunately we have to add the restriction for EM_PPC64.

MaskRay updated this revision to Diff 234392. Dec 17 2019, 2:33 PM

Rebase on top of D71631. I realized that EM_PPC needs const Symbol &s.

MaskRay updated this revision to Diff 234398. Dec 17 2019, 3:01 PM

Properly rebase on top of D71631


This revision was not accepted when it landed; it landed in state Needs Review. Dec 29 2019, 10:43 PM
This revision was automatically updated to reflect the committed changes.