This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Fix R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX when target is IFUNC.
AbandonedPublic

Authored by grimar on Nov 6 2018, 3:50 AM.

Details

Summary

This is https://bugs.llvm.org/show_bug.cgi?id=38074. This simple reproducible is below.

Imagine the following code:

user.cpp:

extern "C" int myfunc();

int main() {
 int (*p)()  = &myfunc;

 return p();
}

library.cpp:

int myfunc2() {
   return 2;
}

extern "C" int myfunc();
__asm__ (".type myfunc, @gnu_indirect_function");

typeof(myfunc) * myfunc_dispatch (void) __asm__ ("myfunc");
typeof(myfunc) * myfunc_dispatch (void)  {
  return myfunc2;
}

If you run the application linked with gold, the output will be correct:

> clang library.o user.o -fPIC -static -o test
> ./test 
> echo $?
2

The application linked with LLD (-fuse-ld=lld) shows 32 for me instead.

The issue is that when calling a function, for example,
LLD generates a .got entry that points to the IFUNC resolver function when
instead, it should use the PLT entries properly for handling the IFUNC.

So we should create a got entry that points to PLT entry, which itself loads the value from
.got.plt, relocated with R_X86_64_IRELATIVE to make things work. Patch do that.

Diff Detail

Event Timeline

grimar created this revision.Nov 6 2018, 3:50 AM

I've tested this out on X86, Arm and AArch64 and it works for X86 and Arm. Unfortunately it doesn't seem to work with AArch64 yet. I don't see any R_AARCH64_IRELATIVE relocations.

I haven't found out why as yet. I'll do some digging to see if there is anything obvious that we are missing. If you have access to a X86 Linux box this is reproducible with a GCC aarch64-linux-gnu cross compilation toolchain and the linux user mode emulator qemu-aarch64.

test/ELF/gotpc-relax.s
26

my git diff is showing some trailing whitespace characters on the lines ending with 0x203000

Not a massive problem, but worth removing.

grimar added a comment.Nov 6 2018, 7:26 AM

I've tested this out on X86, Arm and AArch64 and it works for X86 and Arm. Unfortunately it doesn't seem to work with AArch64 yet. I don't see any R_AARCH64_IRELATIVE relocations.

I haven't found out why as yet. I'll do some digging to see if there is anything obvious that we are missing. If you have access to a X86 Linux box this is reproducible with a GCC aarch64-linux-gnu cross compilation toolchain and the linux user mode emulator qemu-aarch64.

Thanks! My quick guess is that perhaps some new Expr is passed to toPlt(RelExpr Expr) for AArch64 case and hence condition if (needsPlt(Expr) && !Sym.isInPlt()) is not satisfied.
I'll try to take a look at what is going on there during this week.

I've tested this out on X86, Arm and AArch64 and it works for X86 and Arm. Unfortunately it doesn't seem to work with AArch64 yet. I don't see any R_AARCH64_IRELATIVE relocations.

I haven't found out why as yet. I'll do some digging to see if there is anything obvious that we are missing. If you have access to a X86 Linux box this is reproducible with a GCC aarch64-linux-gnu cross compilation toolchain and the linux user mode emulator qemu-aarch64.

Thanks! My quick guess is that perhaps some new Expr is passed to toPlt(RelExpr Expr) for AArch64 case and hence condition if (needsPlt(Expr) && !Sym.isInPlt()) is not satisfied.
I'll try to take a look at what is going on there during this week.

Yes I agree. It looks like AArch64 uses R_AARCH64_ADR_GOT_PAGE which translates to R_GOT_PAGE_PC. Whereas on ARM we have R_ARM_GOT_PREL which translates to R_GOT_PC. So at a first approximation it looks like adding support for R_GOT_PAGE_PC_PLT might work.

grimar added a comment.Nov 8 2018, 6:05 AM

I've tested this out on X86, Arm and AArch64 and it works for X86 and Arm. Unfortunately, it doesn't seem to work with AArch64 yet. I don't see any R_AARCH64_IRELATIVE relocations.

I see the R_AARCH64_IRELATIVE for some reason. Am I doing something differently?

I haven't found out why as yet. I'll do some digging to see if there is anything obvious that we are missing. If you have access to a X86 Linux box this is reproducible with a GCC aarch64-linux-gnu cross compilation toolchain and the linux user mode emulator qemu-aarch64.

It seems there is something different on AArch64 perhaps. If I take user.cpp and library.cpp files from description
and clang version 8.0.0 (trunk 346310) + LLD 8.0.0 (trunk 345820) (without this patch applied) I have:

clang -target aarch64-linux-gnueabi user.cpp -c -o user.o
clang -target aarch64-linux-gnueabi library.cpp -c -o library.o
clang -target aarch64-linux-gnueabi user.o library.o -static -fuse-ld=lld -o out
qemu-aarch64 -L /usr/aarch64-linux-gnu out
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

And output has IRELATIVE:

readelf -r out -W

Relocation section '.rela.plt' at offset 0x238 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
00000000002819f0  0000000000000408 R_AARCH64_IRELATIVE                       2201c8

If I use gcc (gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)
to produce objects instead, a signal code is different, but the situation is about the same in general:

aarch64-linux-gnu-gcc library.cpp -c -o library.o
aarch64-linux-gnu-gcc user.cpp -c -o user.o
~/LLVM/build_lldb/bin/clang -target aarch64-linux-gnueabi user.o library.o -static -fuse-ld=lld -o out
 qemu-aarch64 -L /usr/aarch64-linux-gnu out
�S����[����#qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted (core dumped)
readelf -r out -W
Relocation section '.rela.plt' at offset 0x238 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
00000000002819f0  0000000000000408 R_AARCH64_IRELATIVE                       2201bc

If I stop using LLD (so, use default /usr/lib/gcc-cross/aarch64-linux-gnu/5.4.0/../../../../aarch64-linux-gnu/bin/ld -vGNU ld (GNU Binutils for Ubuntu) 2.28)
it works fine.

So I am observing R_AARCH64_IRELATIVE is present in both cases, but it crashes even without this patch.

I would start tracking AArch64 issue status separately probably and I think we can land this patch first.

In my example I was using -fpic , without it then we get a different relocation type R_AARCH64_ADR_PREL_PG_HI21 (with -fpic it is R_AARCH64_ADR_GOT_PAGE), I suspect that is why I didn't see the IRELATIVE relocation. I don't mind if we keep this patch to be specific to non-AArch64, but if we do the we either shouldn't close the PR until we've made it work, or raise another PR to make sure we don't forget about it.

grimar added a comment.Nov 8 2018, 6:38 AM

In my example I was using -fpic , without it then we get a different relocation type R_AARCH64_ADR_PREL_PG_HI21 (with -fpic it is R_AARCH64_ADR_GOT_PAGE), I suspect that is why I didn't see the IRELATIVE relocation. I don't mind if we keep this patch to be specific to non-AArch64, but if we do the we either shouldn't close the PR until we've made it work, or raise another PR to make sure we don't forget about it.

Ah, my mistake, sorry. I did not use -fPIC because of wrong assumption (it did not segfault with fPIC and I started to try to catch the crash, though it seems to be a different issue).
So, you right, with -fPIC I do not see the relocation and was able to reproduce the issue mentioned in description finally.
I'll try to fix it then.

(and sure we should not close the PR without a fix for AArch64.)

ruiu added a comment.Nov 12 2018, 11:04 PM

Where do you create .got.iplt?

Where do you create .got.iplt?

After converting expression to PLT expression (Expr = toPlt(Expr);):
https://github.com/llvm-mirror/lld/blob/master/ELF/Relocations.cpp#L993

code flow goes inside if (needsPlt(Expr) && !Sym.isInPlt()) {:
https://github.com/llvm-mirror/lld/blob/master/ELF/Relocations.cpp#L1018

And adds an entry to In.IgotPlt inside addPltEntry.

ruiu added a comment.Nov 13 2018, 1:53 AM

Oh okay, that was just a typo. (Your patch description contains a mention about .got.iplt)

grimar edited the summary of this revision. (Show Details)Nov 13 2018, 2:28 AM

Oh okay, that was just a typo. (Your patch description contains a mention about .got.iplt)

Right. I thought In.IgotPlt has .got.iplt input section name. But it is .got.plt atm, just like the output.

My understanding is that the .igot, .igot.plt come from the bfd linker script.

.got            : { *(.got) *(.igot) }
. = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .);
.got.plt        : { *(.got.plt)  *(.igot.plt) }

As you can see they are logically part of the .got with the .igot used to make sure the ifunc resolver entries happen last (and are resolved last).

grimar added a comment.EditedNov 13 2018, 2:50 AM

My understanding is that the .igot, .igot.plt come from the bfd linker script.

.got            : { *(.got) *(.igot) }
. = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .);
.got.plt        : { *(.got.plt)  *(.igot.plt) }

As you can see they are logically part of the .got with the .igot used to make sure the ifunc resolver entries happen last (and are resolved last).

Yeah, look like I saw them there. I thought we do the same though.
i.e. that means our script can not discard/move igot.plt now it seems since we use the `got.plt' for both on x64. Does not seem it is useful though.

grimar updated this revision to Diff 175437.Nov 27 2018, 2:43 AM
  • Removed trailing whitespace from the test case.
  • Rebased.

Ping.
Rui, can we land it?

ruiu added inline comments.Dec 28 2018, 1:59 PM
ELF/Relocations.h
57

Can you add a comment as to what this enum value represents?

grimar updated this revision to Diff 179684.Dec 29 2018, 3:51 AM
grimar marked 2 inline comments as done.
  • Addressed review comment.

Note:
Russia starts celebrating the NY, the day after NY, the day after the day after NY and so on. I'll be back on 9 January.

grimar added inline comments.Dec 29 2018, 3:52 AM
ELF/Relocations.h
57

Done.

grimar added a reviewer: pcc.Jan 28 2019, 10:53 AM
ruiu accepted this revision.Jan 28 2019, 3:26 PM

LGTM

This revision is now accepted and ready to land.Jan 28 2019, 3:26 PM

@grimar D57371 fixed the "R_X86_64_REX_GOTPCRELX referencing ifunc should not be relaxed" bug, via the following line:

if (!Sym.IsPreemptible && !Sym.isGnuIFunc()) {

Target::adjustRelaxExpr is not called for a non-preemptible ifunc. I think this patch is not needed.

clang -fuse-ld=lld -Wa,-mrelax-relocations=yes a.c b.c -fPIC -static -o a

I know one issue, which is unrelated to this patch: config->hasDynSymTab = !sharedFiles.empty() || config->isPic || config->exportDynamic;
This condition does not take dynamic relocations into account.

So, clang -fuse-ld=lld -Wa,-mrelax-relocations=yes a.c b.c -fPIC -static -o a -z ifunc-noplt -z notext does not generate a dynamic R_X86_64_REX_GOTPCRELX.
Adding -Wl,--export-dynamic will produce a dynamic R_X86_64_REX_GOTPCRELX.
config->hasDynSymTab can affect Symbol::includeInDynSym, which can be called before Writer. So I don't know an easy fix.

@markj

grimar abandoned this revision.EditedFeb 11 2020, 1:38 AM

@grimar D57371 fixed the "R_X86_64_REX_GOTPCRELX referencing ifunc should not be relaxed" bug, via the following line:

if (!Sym.IsPreemptible && !Sym.isGnuIFunc()) {

Target::adjustRelaxExpr is not called for a non-preemptible ifunc. I think this patch is not needed.

I think you are right. Seems I did not realise that the last comment from Peter Wu (https://bugs.llvm.org/show_bug.cgi?id=38074#c9)
is actually a one+ year old when reopened the bug ~2 weeks ago.

Thank you for spotting!

markj added a comment.Feb 11 2020, 8:07 AM

I know one issue, which is unrelated to this patch: config->hasDynSymTab = !sharedFiles.empty() || config->isPic || config->exportDynamic;
This condition does not take dynamic relocations into account.

So, clang -fuse-ld=lld -Wa,-mrelax-relocations=yes a.c b.c -fPIC -static -o a -z ifunc-noplt -z notext does not generate a dynamic R_X86_64_REX_GOTPCRELX.
Adding -Wl,--export-dynamic will produce a dynamic R_X86_64_REX_GOTPCRELX.
config->hasDynSymTab can affect Symbol::includeInDynSym, which can be called before Writer. So I don't know an easy fix.

I can't see an easy solution other than to further constrain the use of -zifunc-noplt to require !-static || --export-dynamic. This would be fine for FreeBSD's use of the option.