This is an archive of the discontinued LLVM Phabricator instance.

ELF: Allow GOT relocs pointing to non-preemptable ifunc to resolve to an IRELATIVE where possible.
ClosedPublic

Authored by pcc on Jan 28 2019, 10:08 PM.

Details

Summary

Non-GOT non-PLT relocations to non-preemptible ifuncs result in the
creation of a canonical PLT, which now takes the identity of the IFUNC
in the symbol table. This (a) ensures address consistency inside and
outside the module, and (b) fixes a bug where some of these relocations
end up pointing to the resolver.

Fixes (at least) PR40474 and PR40501.

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

pcc created this revision.Jan 28 2019, 10:08 PM

Thanks for sharing the WIP. I think the approach will work and should match the behaviour of ld.bfd more closely. I've been trying to think of alternatives to redirecting all the symbols for the non GOT generating case. Off the top of my head:

  • Keep the _PLT RelExpr variants and instead of changing the symbol alter the Expr for the non-plt non-got generating reference. This has the disadvantage of needing _PLT variants for any RelType that might fall into that category.
  • Still create a new symbol for the non-plt, non-got generating reference, but instead of replacing the existing symbol, just change Sym for the affected relocations, leaving the others untouched. I think this would mean you wouldn't need to create DirectSym but it would mean having to store the symbol somewhere so that subsequent non-plt, non-got generating references could use it.

Not sure if either of those are better at this point though.

lld/ELF/InputSection.cpp
613 ↗(On Diff #184016)

I think that the only use for the _PLT RelType is for redirection to ifunc PLT entries, you might be able to get rid of them from Relocations.h.

lld/ELF/Relocations.cpp
1029 ↗(On Diff #184016)

I'd be tempted to reverse the condition to put the non-ifunc or preemptible case first; it is much simpler and more commonly encountered.

1033 ↗(On Diff #184016)

As this comes out of WIP, I think it would be good to expand the comment to cover ifunc handling in general as there isn't a good external reference for them. The decisions below may seem a bit arbitrary without prior knowledge.

1048 ↗(On Diff #184016)

Just to check I understand. This is the case where we have a non-generating address taking reference of the ifunc symbol that we need to redirect to the PLT entry for the ifunc symbol. Instead of changing the RelExpr to a _PLT form so that the PLT address of the ifunc is returned, we create a new non-ifunc symbol at the offset of the PLT entry for the ifunc symbol. This symbol inherits the PltIndex and GotIndex so that all RelExpr expressions resolve to the same address. As this symbol replaces the original one a new symbol (DirectSym) above needs to be created just for the purposes of the IRELATIVE relocation.

I think that this will have the property of changing the address of the symbol in the static and the dynamic symbol table (if the symbol is exported or a DSO references it). The former might be a little confusing but is otherwise harmless. Would be interesting to see what the dynamic loader will report to the shared library in that case. I note that ld.bfd gives an error message to recompile with PIC if the symbol is address taken using a non-got generating reference and exported; I think it would be worth following suit and giving an error.

1051 ↗(On Diff #184016)

One advantage of the indirection is that we don't end up with a dynamic (ish) relocation such as IRELATIVE in the .text section for the case:

extern void f2();
void f() {
   void (*fp)(void) = fff; 
}
pcc updated this revision to Diff 184460.Jan 30 2019, 11:46 PM
pcc marked 5 inline comments as done.
pcc retitled this revision from [wip] ELF: Allow GOT relocs pointing to non-preemptable ifunc to resolve to an IRELATIVE where possible. to ELF: Allow GOT relocs pointing to non-preemptable ifunc to resolve to an IRELATIVE where possible..
pcc edited the summary of this revision. (Show Details)

Address TODOs and review comments

pcc added a comment.Jan 30 2019, 11:47 PM

PTAL, this should be ready for review now.

lld/ELF/InputSection.cpp
613 ↗(On Diff #184016)

It looks like R_PLT is still used for R_MICROMIPS_26_S1. I was able to remove R_AARCH64_PLT_PAGE_PC, though, by noting that the only remaining caller of toPlt is now in the thunk creation code which I believe operates on relocations that were relaxed by fromPlt.

By the way, if I comment out that call to toPlt, all the tests still pass. We might want to add a test for that code.

lld/ELF/Relocations.cpp
1033 ↗(On Diff #184016)

I wrote a long comment here which is basically a brain dump of everything that I've learned about ifuncs while working on PR40474.

1048 ↗(On Diff #184016)

Just to check I understand. This is the case where we have a non-generating address taking reference of the ifunc symbol that we need to redirect to the PLT entry for the ifunc symbol. Instead of changing the RelExpr to a _PLT form so that the PLT address of the ifunc is returned, we create a new non-ifunc symbol at the offset of the PLT entry for the ifunc symbol.

Correct.

This symbol inherits the PltIndex and GotIndex so that all RelExpr expressions resolve to the same address.

Only all PLT-generating RelExprs resolve to the same address. The symbol would never have received a GotIndex because in the non-preemptible ifunc case we always just create a PLT if the relocation is GOT-generating. If a GOT is required after creating the canonical PLT (or if it was required before creating the PLT, which is now kept track of using the field GotInIgot), a GOT entry is created that points to the canonical PLT instead of the ifunc itself, to ensure address consistency between GOT and direct relocations. This handles the same case as what used to be handled by R_AARCH64_GOT_PAGE_PC_PLT.

As this symbol replaces the original one a new symbol (DirectSym) above needs to be created just for the purposes of the IRELATIVE relocation.

I think that this will have the property of changing the address of the symbol in the static and the dynamic symbol table (if the symbol is exported or a DSO references it).

Correct.

The former might be a little confusing but is otherwise harmless.

While updating the tests I realised that in fact the objdump output will normally look a little nicer if the ifunc is redirected because normally an ifunc is declared as an alias of a resolver, which means that the resolver's disassembly will be labelled with the resolver's symbol name and the plt will be labelled with the ifunc's symbol name.

Would be interesting to see what the dynamic loader will report to the shared library in that case.

Since I change the symbol type to STT_FUNC it should just be handled like a regular symbol, which means that the shared library will see the address of the plt. This is what I observed with your repro in PR40501 if I change it to take the address of fff in code rather than in an initializer.

I note that ld.bfd gives an error message to recompile with PIC if the symbol is address taken using a non-got generating reference and exported; I think it would be worth following suit and giving an error.

That might just be because it's difficult to teach ld.bfd to redirect a dynamic symbol. We handle that case just fine so I'm inclined not to do that unless there is a need to.

1051 ↗(On Diff #184016)

In my implementation I've added a check that the section is writable before producing the IRELATIVE. This is similar to the check before producing a RELATIVE near the top of processRelocAux.

Thanks for the update. I'll be able to take a look at this in detail next week.

lld/ELF/InputSection.cpp
613 ↗(On Diff #184016)

I'll add a test case next week. It may be possible to simplify toPlt as thunks should only use a subset of the expressions.

I've suggested a test case that fails when toPlt() is removed in D57743. Aiming to have comments on this review by the end of the day.

Herald added a project: Restricted Project. · View Herald TranscriptFeb 5 2019, 3:38 AM

I've made a few suggestions on the comments, otherwise looks good to me. I think it is better to keep the complexity in one place and document it than spreading it out. It would be good to get some input from George and Rui, or someone else that hasn't been deep diving on ifuncs recently to see if they had any problems following the code. I've run this on some address comparison tests I had lying around for AArch64 and it looks good on these. I'll run these on Arm tomorrow, as that puts the .igot.plt in the .got and will comment if I find any problems, but I'm not expecting any.

I seem to remember FreeBsd making heavy use of ifuncs for some configuration, may be worth adding emaste as a reviewer to see if he can arrange for someone from that project to take a look and maybe run some tests.

lld/ELF/Relocations.cpp
1115 ↗(On Diff #184460)

May I suggest highlighting the use of DirectSym as it wasn't immediately apparent what canonical meant in the context, although with the comment above explaining canonical this might be less of a problem. Maybe something like:
"Create a copy of the symbol to use as the target of the IRELATIVE relocation in the igotPlt. This is in case we make the PLT canonical later, which would overwrite the original symbol."

1142 ↗(On Diff #184460)

Maybe worth expanding last part to "create a GOT entry pointing to the PLT entry for Sym."

1160 ↗(On Diff #184460)

I initially missed the Sym.GotInIgot = false; and thought about what might happen if there was a GOT generating reference after canonicalisation. Perhaps something like:
"We previously encountered a GOT generating reference that we redirected to the Igot. Now that the PLT entry is canonical we must clear the redirection to the Igot and add a GOT entry. As we've changed the symbol type to STT_FUNC future got generating references will naturally use this GOT entry."

lld/test/ELF/gnu-ifunc-canon.s
2 ↗(On Diff #184460)

I found it difficult to track the inputs from the outputs. For example it is easy to lose the type of reference from %t1.o, %t2.o and %t3.o. For example %t1.o has a RO pc-relative reference, %t2.o has a RO absolute reference and %t3.o has a RW absolute + offset reference. Maybe worth using %t-ro-pcrel.o etc.

Alternatively a comment above each ld.lld invocation with a description of the inputs, for example non-got-generating reference, expect canonical PLT.

I've not had a chance to check through each case yet. Will do that tomorrow and will post if I find any problems.

pcc marked 5 inline comments as done.Feb 5 2019, 7:03 PM

Thanks for the review. I've tested this using statically linked executables on Android on both AArch64 (with ifuncs in hwasan) and ARM (with ifuncs in libc.a).

@emaste do you know how I can test this on FreeBSD?

lld/test/ELF/gnu-ifunc-canon.s
2 ↗(On Diff #184460)

I've renamed the files.

pcc updated this revision to Diff 185470.Feb 5 2019, 7:04 PM
pcc marked an inline comment as done.
  • Address review comments

Thanks for the update. My Arm tests were fine as well. This all looks good to me, I don't have any more comments at the moment.

pcc added a comment.Feb 7 2019, 11:26 AM

I tested this patch by linking FreeBSD userspace with it and running their test suite [1], and there were no regressions. However, it appears that only a handful of userspace functions are using ifunc. The kernel uses more, but I couldn't test this on the kernel because they use a custom flag -z ifunc-noplt to link their kernel and this flag isn't implemented in upstream lld. In order to test this on the kernel the patch implementing the flag would need to be rebased on top of this change, and that's probably beyond the limit of what's reasonable to test here.

[1] https://www.freebsd.org/cgi/man.cgi?query=tests&apropos=0&sektion=7&manpath=FreeBSD+11-current&format=html

ruiu added a reviewer: emaste.Feb 7 2019, 5:25 PM

Ed,

Did -z ifunc-noplt work well for FreeBSD kernel? If it works well, maybe we should merge it to the mainline lld tree?

emaste added a subscriber: markj.Feb 7 2019, 5:39 PM
emaste added a comment.Feb 7 2019, 5:43 PM

Ed,

Did -z ifunc-noplt work well for FreeBSD kernel? If it works well, maybe we should merge it to the mainline lld tree?

Yes it works well for us - the plan was for @markj to rebase and upstream it but it got preempted by other work; we'll bump the priority in light of this patch. Mark do you think we could test with this patch + -z ifunc-noplt in the near future?

markj added a comment.Feb 8 2019, 9:23 AM

Ed,

Did -z ifunc-noplt work well for FreeBSD kernel? If it works well, maybe we should merge it to the mainline lld tree?

Yes it works well for us - the plan was for @markj to rebase and upstream it but it got preempted by other work; we'll bump the priority in light of this patch. Mark do you think we could test with this patch + -z ifunc-noplt in the near future?

Yes, I will submit a patch to phabricator in the next couple of weeks. In the meantime I'll try and get this patch tested with -z ifunc-noplt before Monday.

markj added a comment.Feb 10 2019, 8:27 PM
In D57371#1389410, @pcc wrote:

I tested this patch by linking FreeBSD userspace with it and running their test suite [1], and there were no regressions. However, it appears that only a handful of userspace functions are using ifunc. The kernel uses more, but I couldn't test this on the kernel because they use a custom flag -z ifunc-noplt to link their kernel and this flag isn't implemented in upstream lld. In order to test this on the kernel the patch implementing the flag would need to be rebased on top of this change, and that's probably beyond the limit of what's reasonable to test here.

For the purpose of testing it is possible to remove this flag without ill effect.

I built lld from git and built a kernel with this diff applied and -zifunc-noplt removed from the kernel; didn't have any problems booting. I ported the -zifunc-noplt patch and was able to boot with that flag enabled too. Note that the kernel doesn't have any non-preemptible ifuncs, though.

pcc added a comment.Feb 10 2019, 8:43 PM
In D57371#1389410, @pcc wrote:

I tested this patch by linking FreeBSD userspace with it and running their test suite [1], and there were no regressions. However, it appears that only a handful of userspace functions are using ifunc. The kernel uses more, but I couldn't test this on the kernel because they use a custom flag -z ifunc-noplt to link their kernel and this flag isn't implemented in upstream lld. In order to test this on the kernel the patch implementing the flag would need to be rebased on top of this change, and that's probably beyond the limit of what's reasonable to test here.

For the purpose of testing it is possible to remove this flag without ill effect.

I built lld from git and built a kernel with this diff applied and -zifunc-noplt removed from the kernel; didn't have any problems booting. I ported the -zifunc-noplt patch and was able to boot with that flag enabled too.

Good to hear, thanks.

Note that the kernel doesn't have any non-preemptible ifuncs, though.

Isn't the FreeBSD kernel an ET_EXEC? That should mean that all of its symbols are non-preemptible, right?

markj added a comment.Feb 10 2019, 8:47 PM
In D57371#1392265, @pcc wrote:

Note that the kernel doesn't have any non-preemptible ifuncs, though.

Isn't the FreeBSD kernel an ET_EXEC? That should mean that all of its symbols are non-preemptible, right?

Indeed, you are right.

ruiu accepted this revision.Feb 13 2019, 11:32 AM

LGTM

Even though I believe what this patch does is correct, it seems really complicated to me, but that's not a fault of this patch. This part of lld code is mind-boggling. At this point, it is perhaps easier to rewrite it entirely than incrementally improving the code quality of this file. I don't think I would do this soon, but that's probably what we should keep in mind.

This revision is now accepted and ready to land.Feb 13 2019, 11:32 AM
pcc marked an inline comment as done.Feb 13 2019, 1:44 PM

LGTM

Even though I believe what this patch does is correct, it seems really complicated to me, but that's not a fault of this patch. This part of lld code is mind-boggling. At this point, it is perhaps easier to rewrite it entirely than incrementally improving the code quality of this file. I don't think I would do this soon, but that's probably what we should keep in mind.

Agreed that this needs a rewrite at some point. My biggest concern was that this patch causes relocations to be processed using different code paths depending on the order in which they are encountered (hence why the test checks both orders); fixing that will probably require a fundamental change here.

lld/ELF/Relocations.cpp
1209 ↗(On Diff #185470)

Reading through the patch again I noticed that this ought to be R.Type not R.Sym->Type. I'll fix it in the commit but I can't seem to construct a test case exposing the change in behaviour.

This revision was automatically updated to reflect the committed changes.