This is an archive of the discontinued LLVM Phabricator instance.

[PPC32] Emit R_PPC_PLTREL24 for calls to dso_local ifunc
ClosedPublic

Authored by MaskRay on Dec 17 2019, 10:51 PM.

Details

Summary
static void *ifunc(void) __attribute__((ifunc("resolver")));
void foo() { ifunc(); }

The relocation produced by the ifunc() call:

  1. gcc -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
  2. gcc -msecure-plt -PIE => R_PPC_PLTREL24 r_addend=0x8000
  3. clang -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
  4. clang -msecure-plt -fPIE => R_PPC_REL24

4 is incorrect. The R_PPC_REL24 needs a call stub due to ifunc. If this
relocation is mixed with other R_PPC_PLTREL24(r_addend=0x8000) in a
function, both GNU ld and lld (after D71621 fix) may produce a wrong
result.

This patch fixes 4 to use R_PPC_PLTREL24, which matches GCC.
Both GNU ld and lld (after D71621) will be happy.

Diff Detail

Event Timeline

MaskRay created this revision.Dec 17 2019, 10:51 PM

In terms of making a dso_local ifunc work this behavior seems logical to me: I'm skeptical though because I'm not sure we typically know when we are calling an ifunc.
From the gcc docs on ifunc attribute

The exported header file declaring the function the user calls would contain:
extern void *memcpy (void *, const void *, size_t);
allowing the user to call memcpy as a regular function, unaware of the actual implementation

Maybe non-preemptable ifuncs are different in that they must be declared as ifuncs explicitly before use, but if not I think the compiler/linker/loader need to work in tandem to support calling an ifunc where the call site does not know the implementation is an STT_IFUNC as opposed to an STT_FUNC. If thats the case, doing something different at the callsite based on if the callee being an ifunc would be the wrong approach to making this work (even though it seems its the approach taken by gcc and ld.bfd).

MaskRay added a comment.EditedDec 18 2019, 9:48 AM

In terms of making a dso_local ifunc work this behavior seems logical to me: I'm skeptical though because I'm not sure we typically know when we are calling an ifunc.
From the gcc docs on ifunc attribute

The exported header file declaring the function the user calls would contain:
extern void *memcpy (void *, const void *, size_t);
allowing the user to call memcpy as a regular function, unaware of the actual implementation

Maybe non-preemptable ifuncs are different in that they must be declared as ifuncs explicitly before use, but if not I think the compiler/linker/loader need to work in tandem to support calling an ifunc where the call site does not know the implementation is an STT_IFUNC as opposed to an STT_FUNC. If thats the case, doing something different at the callsite based on if the callee being an ifunc would be the wrong approach to making this work (even though it seems its the approach taken by gcc and ld.bfd).

I had wondered whether we should do the following instead, but I figure out that this may be a PPC32 specific problem.

--- a/llvm/lib/Target/TargetMachine.cpp
+++ b/llvm/lib/Target/TargetMachine.cpp
@@ -97,3 +97,3 @@ bool TargetMachine::shouldAssumeDSOLocal(const Module &M,
   if (GV && GV->isDSOLocal())
-    return true;
+    return !isa<GlobalIFunc>(GV);

On PPC32, the problem is that for R_PPC_REL24 and R_PPC_PLTREL24(r_addend=0x8000), r30 is expected to point to different places (_GLOBAL_OFFSET_TABLE_ for R_PPC_REL24, .got2+addend for R_PPC_PLTREL24). If a function calls functions with different relocation types and both callees need a call stub, a single r30 value cannot make both call stubs happy.

In llvm MC and GNU as, STT_FUNC can be upgraded to STT_IFUNC. Ideally in C/C++ code users don't have to annotate a function as __attribute__((ifunc(...))) and they can annotate it in (inline) assembly with .type ifunc,@gnu_indirect_function.
It looks like GCC and ABI makers made a mistake. I think R_PPC_LOCAL24PC was a mistake. The jump instruction is relocated with either R_PPC_LOCAL24PC, R_PPC_REL24 or R_PPC_PLTREL24 depending on the -fno-pic/-fpic/-fPIC and -msecure-plt/-mbss-plt. I am glad that clang does not generate the useless R_PPC_LOCAL24PC for regular symbols (it generates it for _GLOBAL_OFFSET_TABLE_). (I should note that PPC64 does fix the problem. R_PPC_REL24, R_PPC_PLTREL24 and R_PPC_LOCAL24PC are unified. Choosing the relocation type according to the preemptivity property is wrong.)

I think non-preemptible ifuncs work correctly on EM_386, EM_X86_64, EM_ARM, and EM_AARCH64. They don't need the GlobalIFunc special case in TargetMachine::shouldAssumeDSOLocal or its call sites to work well. However, I may have missed something. I have also noticed that ifunc is not well tested. Applying the previous patch does not cause any test updates. If we do find other targets may need a similar GlobalIFunc special case, we can check whether it makes more sense to be placed in shouldAssumeDSOLocal.

MaskRay updated this revision to Diff 234573.Dec 18 2019, 10:26 AM

Rebase on D70570

-relocation-model=static should use R_PPC_REL24

🤡🤡🤡(Any chance this is reviewed before I watch star wars? (I felt that if I did not say, due to vacation schedules, this might took a lot longer..))

sfertile accepted this revision as: sfertile.Dec 20 2019, 10:50 AM

🤡🤡🤡(Any chance this is reviewed before I watch star wars? (I felt that if I did not say, due to vacation schedules, this might took a lot longer..))

Yep, sorry for taking so long. It seems the fact that upgrading a function to an ifunc isn't always safe for the 32-bit PowerPC ABI is soemthing we just have to live with. I am good with this patch as is, with 2 minor comments:

  1. Is it possible to diagnose when a call to an ifunc uses R_PPC_LOCAL24PC or R_PPC_REL24 but should have used a R_PPC_PLTREL instead? It would be really nice to issue a Error with a descriptive diagnostic at link time for this.
  1. We should consider not marking ifunc as dso_local when generating the IR. At least for ppc32, but I think conceptually it might make sense in general.

P.S. Hope you enjoy Star Wars!

This revision is now accepted and ready to land.Dec 20 2019, 10:50 AM

🤡🤡🤡(Any chance this is reviewed before I watch star wars? (I felt that if I did not say, due to vacation schedules, this might took a lot longer..))

Yep, sorry for taking so long. It seems the fact that upgrading a function to an ifunc isn't always safe for the 32-bit PowerPC ABI is soemthing we just have to live with. I am good with this patch as is, with 2 minor comments:

  1. Is it possible to diagnose when a call to an ifunc uses R_PPC_LOCAL24PC or R_PPC_REL24 but should have used a R_PPC_PLTREL instead? It would be really nice to issue a Error with a descriptive diagnostic at link time for this.

This looks like a linker suggestion, right? Technically lld can do that... But GNU ld silently generates incorrect code, and clang does not generate R_PPC_LOCAL24PC (except _GLOBAL_OFFSET_TABLE which is hidden and unrelated to ifunc). I think the usefulness of the detection may not be that much. I guess we can probably live with this deficiency...

  1. We should consider not marking ifunc as dso_local when generating the IR. At least for ppc32, but I think conceptually it might make sense in general.

This will be an alternative. I have a weak preference for making this general. Since this seems to work on all other targets (and ppc64), the scope of this hack should probably be kept as narrow as possible. It seems this place is the most appropriate place before we found more issues.

Thanks!

MaskRay updated this revision to Diff 234937.Dec 20 2019, 11:31 AM

Rename ifunc-dsolocal.ll as ifunc.ll because we also test non-dsolocal cases.

This revision was automatically updated to reflect the committed changes.