Page MenuHomePhabricator

[PPC64] toc-indirect to toc-relative relaxation.
Needs ReviewPublic

Authored by sfertile on Nov 19 2018, 12:52 PM.



When accessing a global variable which is not defined in the translation-unit being compiled the compilers on ppc64 will generate a toc-entry for the global and a got-indirect access using that toc-entry (as if it were a .got entry)

For example the following C code

extern int aGlobal;

int foo(void) {
  return aGlobal;

would get translated to:

        addis 3, 2, .LC0@toc@ha 
        ld 3, .LC0@toc@l(3)
        lwa 3, 0(3)


        .section        .toc,"aw",@progbits
        .tc aGlobal[TC],aGlobal

Where .LC0@toc is the offset from the TOC base-pointer to the label .LC0, and .tc aGlobal[TC],aGlobal creates an entry in the .toc section to store the address of aGlobal.
The first 2 instructions build the address of the toc-entry and load the address of aGlobal into r3, while the following lwa instruction loads the value of the global into r3.

If the global being accessed is a non-preemptable definition and the offset from the TOC pointer to the definition can be materialized with 2 instructions then we can relax the first 2 instructions to add that offset to the TOC base pointer rather then loading the address out of the .toc.

Diff Detail

rLLD LLVM Linker

Event Timeline

sfertile created this revision.Nov 19 2018, 12:52 PM

Zaara pointed out I am missing the full context, I'll re post the patch with the context fixed shortly.

sfertile updated this revision to Diff 174677.Nov 19 2018, 1:23 PM

Added full context

syzaara added inline comments.Nov 20 2018, 6:45 AM

Instead of 58, we can use the enum DFormOpcd which has LD defined to 58.


Expeceted -> Expected


Maybe use Target->GotEntrySize instead of 8.


entires -> entries


relxation -> relaxation

MaskRay added inline comments.Nov 21 2018, 1:28 PM

Instr &= 0x03FFFFFF;


Offset / 8 /* Toc Entry size */ -> Index


Maybe just return None;

MaskRay added inline comments.Nov 21 2018, 1:41 PM

They are put here probably because of ARM/AArch64 precedents: getAArch64UndefinedRelativeWeakVA getARMStaticBase etc.

If TOC is PPC/PPC64 specific, the static function tryRelaxToc and many of its callees may be renamed to include the substring PPC.


If the TOC (Table of Contents) section is PPC/PPC64 specific this should probably be named R_PPC_RELAX_TOC. R_AARCH64_RELAX_TLS_GD_TO_IE_PAGE_PC is a precedent.

sfertile updated this revision to Diff 175064.Nov 22 2018, 10:44 AM

Addressed initial review comments.

sfertile marked 9 inline comments as done.Nov 22 2018, 10:45 AM
sfertile added inline comments.

Thanks, I wasn't aware of 'None'.

ruiu added inline comments.Nov 26 2018, 10:56 AM

Error messages should start with a lowercase letter.

I think you should use error() instead of llvm_unreachable() because this condition can be triggered by a user. llvm_unreachable should be used only to report an internal error or a bug.


I think Instr = (Instr & 0x3FFFFFF) | 0x38000000; is a bit more straightforward.


writeInstrFromHalf16(Loc, (Instr & 0x3FFFFFF) | 0x38000000);

is perhaps better.


This is an unusual formatting style in lld. Please put break before }.


U -> u


ElfTy -> ELFT


Can you return std::pair<Defined *, int64_t> instead of Optional<std::pair<Defined *, int64_t>>? It looks like you can simply return {nullptr, 0} to indicate a null value.


Replace auto with the actual type.




Please do not use auto in lld unless its actual type is obvious from right-hand side.

grimar added a subscriber: grimar.Nov 27 2018, 1:34 AM
grimar added inline comments.

Seems the intention was to name it 'relaxTocPPC64'?


dyn_cast -> cast as null return value is not expected here.


Defined* -> Defined

sfertile updated this revision to Diff 176625.Dec 4 2018, 7:13 AM
sfertile marked an inline comment as done.

Addressed further review comments.

sfertile marked 11 inline comments as done.Dec 4 2018, 7:17 AM
sfertile added inline comments.

I named it relaxToPPC64 because the function is looking up the symbol we will potentially relax to. I failed to realize that its unfortunately close to relaxToc which would also make sense as a name :)

grimar added a comment.Dec 4 2018, 7:48 AM

I did not have a chance to debug it, but my comment is below.


I think we usually reach here (relocateAlloc) when we already know will we do the relaxation or not.
I.e. for example on x86_64 for R_RELAX_GOT_PC we just call Target->relaxGot because check in the adjustRelaxExpr the conditions for relaxing. And here then we either have R_RELAX_GOT_PC which name contains RELAX and means we *will do* the relaxation or we would have another R_* expression if we are not going to relax.

So I think you should not have a R_PPC64_RELAX_TOC expression here if you are not going to relax. I think you should check the conditions somewhere earlier. Also, the platform-specific code can live into own file (like ELF\Arch\PPC64.cpp) then. I think you can put all the functions you added there.

sfertile marked an inline comment as done.Dec 5 2018, 8:44 AM

Related to @grimar comment, I am also playing around with doing some of the checking early and mapping toc based relocations that are not-relaxable to got based relocations during scanRelocs. Ideally I'd like to be able to add a separate step for PPC64 once all relocations have been scanned that will iterate over the .toc sections to handle large-code model where any symbols that can't be safely relaxed get added to the .got and all remaining toc based relocations get relaxed. This should hopefully allow me to get rid of the .toc sections in the final binaries, but I'm not familiar enough with large-code model, or gcc and xlc codegen to know if this is feasible yet.


The got-indirect to toc-relative relaxation is unique in this regard because we can't decide if we can relax until the data segment has been fully built. During scanRelocs we could see if the indirection is to a defined symbol, but that is only enough to know that the relocation is potentially relax-able. We still need the offset from the TOC base pointer to the symbols definition to fit in a signed 32-bit offset (which isn't guaranteed in the large code model), and scanRelocs is too early to determine that. I could add the relax checking earlier than here but that would involve doing the same scan over the .toc sections relocations and replacing those that are indeed relaxable. I decided not to do that in the spirit of keeping it simple.

I like the idea of moving the platform specific code into PPC64.cpp, and will update the patch with that change, but I think here is the right place to perform the relaxation.

sfertile updated this revision to Diff 179574.Thu, Dec 27, 1:17 PM

Moved the toc relaxation logic into the PPC64 target.

Few comments/suggestions.


Warnings and error messages should start from the lowercase.


The same.


You can write shorter:

return {dyn_cast_or_null<Defined>(&IndirectSym), getAddend<ELFT>(Rela)};

Why not DefSym->Section->Name != ".toc" ?


return {};


return {};


So maybe you should never reach here if !Config->TocOptimize?
Seems you can do something like the following in getRelExpr:

case R_PPC64_TOC16_HA:
case R_PPC64_TOC16_LO_DS:
  return Config->TocOptimize ? R_PPC64_RELAX_TOC : R_XXXX;

RelaxTo.first/RelaxTo.second are a bit hard to follow and read. Lets do:

Defined *D;
int64_t Addend;
std::tie(D, Addend) = relaxTo(Rel);

OK, thanks for the explanation.


I think Returns false otherwise part is an obvious continuation and hence excessive.


You can use


Can you remove the excessive indentations please?

sfertile updated this revision to Diff 179765.Mon, Dec 31, 9:19 AM

Addressed review comments

sfertile marked 10 inline comments as done.Mon, Dec 31, 9:23 AM
sfertile updated this revision to Diff 179766.Mon, Dec 31, 9:26 AM

Forgot to update formatting in one of the tests.

sfertile marked 2 inline comments as done.Mon, Dec 31, 9:27 AM
ruiu added inline comments.Wed, Jan 2, 3:55 PM

Isn't this an error?



sfertile marked an inline comment as done.Thu, Jan 3, 12:14 PM
sfertile added inline comments.

Ideally it would be, however I found a gcc bug where it emits a toc entry without a corresponding relocation. This causes the look ups for any relocations after the skipped one to fail. I am still trying to cut the problem down to something smaller so I can understand it better (and open a bug report to get it fixed), but it seems that the rest of the codegen is valid (ie nothing tries to read or write to the toc entry missing the relocation). I don't think the problem is prevalent enough to warrant changing to a slower scan of the relocations in the case we do find a mismatch, and because the rest of the codegen is 'good' I don't think we should emit an error either.

sfertile marked an inline comment as done.Mon, Jan 7, 7:12 AM
sfertile added inline comments.

I've got some clarification from the ppc64 gcc/binutils developers and although the case I found was a bug , both gcc and xlc will emit values that don't need relocations into the .toc section in certian circumstances. I'll have to fix the lookup logic to deal with the fact I can't assume every .toc entry has a relocation.

sfertile marked 2 inline comments as not done.Mon, Jan 7, 7:45 AM