Page MenuHomePhabricator

[ELF][PPC64] Support long branch thunks with addends

Authored by MaskRay on Dec 2 2019, 6:15 PM.



Fixes PPC64 part of PR40438

// clang -target ppc64le -c
// .text.unlikely may be placed in a separate output section (via -z keep-text-section-prefix)
// The distance between bar in .text.unlikely and foo in .text may be larger than 32MiB.
static void foo() {}
__attribute__((section(".text.unlikely"))) static int bar() { foo(); return 0; }
__attribute__((used)) static int dummy = bar();

This patch makes such thunks with addends work for PPC64.

AArch64: .text -> __AArch64ADRPThunk_ (adrp x16, ...; add x16, x16, ...; br x16) -> target
PPC64: .text -> __long_branch_ (addis 12, 2, ...; ld 12, ...(12); mtctr 12; bctr) -> target

AArch64 can leverage ADRP to jump to the target directly, but PPC64
needs to load an address from .branch_lt . Before Power ISA v3.0, the
PC-relative ADDPCIS was not available. .branch_lt was invented to work
around the limitation.

Symbol::ppc64BranchltIndex is replaced by
PPC64LongBranchTargetSection::entry_index which take addends into

The tests are rewritten: ppc64-long-branch.s (-no-pie) and
ppc64-long-branch-pi.s (-pie and -shared).

Diff Detail

Event Timeline

MaskRay created this revision.Dec 2 2019, 6:15 PM
MaskRay edited the summary of this revision. (Show Details)Dec 2 2019, 6:15 PM
MaskRay updated this revision to Diff 231817.Dec 2 2019, 8:55 PM

Improve tests (add backward bl)

lkail added a subscriber: lkail.Dec 2 2019, 9:24 PM

The change looks reasonable to me but I'll defer to the PPC experts for the PPC specific changes, as I'm not that familiar with it.

Sorry for the wait, I was on vacation. I'm going over the related pr, D70637 and this patch right now.

sfertile accepted this revision.Dec 5 2019, 8:11 AM

I have a couple minor questions, but it LGTM.

Should there be an error when an addend is used and the call target is a function symbol with a non-zero offset between global and local entries?


On trunk I still see sym.ppc64BranchltIndex = old.ppc64BranchltIndex; here. I take it deleting that is meant to be part of this patch.


Why sym->getVA() + addend as opposed to sym->getVA(addend)?


Odd, I though we would always create a .got section with the tocbase as the first entry. And the ppc64-long-branch.s doesn't seem to need it.

This revision is now accepted and ready to land.Dec 5 2019, 8:11 AM
MaskRay marked 5 inline comments as done.EditedDec 5 2019, 10:00 AM

.got is not always created. .got will be created when there is at least .TOC. related relocation:

// Relocations.cpp:1308
               expr)) {>hasGotOffRel = true;

Secondly, it works arounds a small issue that will not be fixed by this patch. If .rela.dyn has no entries before thunk creation.

      forEachRelSec(scanRelocations<ELFT>) // Most dynamic relocations are added in this pass.
      // The synthetic section .rela.dyn is removed from its parent (OutputSection .rela.dyn) because it has no entries.
        // Add a dynamic relocation to mainPart->relaDyn

This can affect all targets that add dynamic relocations in ThunkCreator::createThunks. Currently only PPC64 is affected, because only PPC64 needs .branch_lt to work around the lack of PC-relative addressing before Power ISA v3.0.

Fortunately, this pass ordering problem usually does not matter in practice. PPC64PILongBranchThunk is used by config->isPic cases. In PIC cases, there are almost assuredly some dynamic relocations here and there.

I created a wontfix bug at . It is not clear if it can be fixed, or if there is an easy fix. I have closed it because a fix is probably not worthwhile on its own.


I intend to delete this line in a separate change.

replaceWithDefined is used by canonical PLT and copy relocations, which implies that the symbol is preemptable. ppc64BranchltIndex is only used by non-preemptable cases. So the line is not needed.

## Force creation of .got
.section .data
.quad .TOC.@tocbase

in ppc64-long-branch-pi.s serves two purposes. I've put more information in the main comment (which is editable).

MaskRay updated this revision to Diff 232378.Dec 5 2019, 10:04 AM
MaskRay marked 2 inline comments as done.

getVA() + addend => getVA(addend)

This revision was automatically updated to reflect the committed changes.