[PPC64] global dynamic to initial exec relaxation
ClosedPublic

Authored by sfertile on Jun 12 2018, 12:55 PM.

Details

Summary

Patch adds support for relaxing the global-dynamic tls sequence to initial-exec tls sequence.

the relaxation performs the following transformation:
addis r3, r2, x@got@tlsgd@ha --> addis r3, r2, x@got@tprel@ha
addi r3, r3, x@got@tlsgd@l --> ld r3, x@got@tprel@l(r3)
bl __tls_get_addr(x@tlsgd) --> nop
nop --> add r3, r3, r13

and instead of emitting a DTPMOD64/DTPREL64 pair for x, we emit a single R_PPC64_TPREL64.

Diff Detail

Repository
rLLD LLVM Linker
sfertile created this revision.Jun 12 2018, 12:55 PM
ruiu added a comment.Jun 12 2018, 1:16 PM

Can you give me a pointer to the spec describing these relaxations?

ELF/Arch/PPC64.cpp
411

Add a blank line

413

nit:? remove {}

ELF/InputSection.cpp
783

Remove an extra space character at beginning of the sentence.

786

Wrong indentation.

ELF/Relocations.h
70

Did you really have to add these two new types? I'm not claiming that they are unnecessary, I'm just wondering, so could you please explain why the existing types don't work for you?

Can you give me a pointer to the spec describing these relaxations?

Yep, the abi has a section describing each of the tls relaxations: http://openpowerfoundation.org/wp-content/uploads/resources/leabi/content/dbdoclet.50655241_61284.html

ELF/Relocations.h
70

For R_RELAX_TLS_GD_TO_IE_GOT_OFF I don't have to add it. I could use the _ABS version and subtract the TocBase from the result in the PPC target code. Either way I have to adjust the value passed to relaxTlsGdToIe (by the TocOffset if passed the GotOffset and by TocBase if passed the syms VA) so there isn't much diffrence between them. I picked adding the new RelExpr simply for clarity since the instructions in the relaxed sequence use a got-offset and not the syms VA.

For R_TLSGD_HINT I think I can actually use R_TLSDESC_CALL instead. When I initially looked at R_TLSDESC_CALL I thought there was something it generated that was different from what I needed, but going over it again it looks like its a perfect fit.

Can you give me a pointer to the spec describing these relaxations?

Yep, the abi has a section describing each of the tls relaxations: http://openpowerfoundation.org/wp-content/uploads/resources/leabi/content/dbdoclet.50655241_61284.html

Relatedly can you expand on the comments with a reference to the part of the ABI with the associated relaxation?

-eric

sfertile updated this revision to Diff 151226.Jun 13 2018, 12:37 PM

Updated formatting, removed the 2 new RelExprs since they aren't needed, expanded comments on the relaxation specifics.

I also fixed an issue where I had assumed the input register to the load was r3, which isn't necessary. Any register could be used for the intermediate value.

syzaara added inline comments.Jun 13 2018, 12:45 PM
ELF/Arch/PPC64.cpp
440

I think it would be more clear if we called relocateOne with the replacement relocation here.

ruiu added inline comments.Jun 13 2018, 1:55 PM
ELF/Arch/PPC64.cpp
421–425

I wonder if this comment actually explains what this function is doing. This function handles three different relocation types, but this comment seems to cover only one type.

446

Perhaps 0x7fff0000 is easier to understand than 31<<16.

452

Wrong indentation -- can you always apply clang-format?

sfertile added inline comments.Jun 13 2018, 4:18 PM
ELF/Arch/PPC64.cpp
421–425

Part 1 is the addis/addi sequence which covers the R_PPC64_GOT_TLSGD16_HA and R_PPC64_GOT_TLSGD16_LO relocations on the first 2 instructions. Even though we handle them separately and they won't nessicariily be adjacent in the source it really only makes sense to talk about the pair together.

Part 2 and 3 cover the remaining R_PPC64_TLSGD relocation which affects the last 2 instructions in the sequence.
(A call that crosses dso boundaries has to be followed by a nop on PPC which is why we can handle both instructions with a single hint relocation).

440

Now I wish I hadn't jumped the gun on abandoning the R_RELAX_TLS_GD_TO_IE_GOT_OFF RelExpr :).

I like your idea relocateOne(Loc, R_PPC64_TPREL16_HA/LO_DS, Val) makes it clear that the relocation is handled the same as a normal TPREL16. Since we are getting the syms VA though, it would have to be relocateOne(Loc, R_PPC64_TPREL16_HA/LO_DS, Val - InX::got->getVA()) which I think kills the clarity ...

I think we should pick one of the following:

  1. Use a RelExpr of R_RELAX_TLS_GD_TO_IE_GOT_OFF so we get just the got offset of the symbol, then call relocateOne which handles adjusting by the toc-offset and performing the relocation.
  1. Stick to the R_RELAX_TLS_GD_TO_IE_GOT_ABS RelExpr, adjust by the TocBase and relocate here (as the patch is currently doing).

I would much prefer 1, as long as everyone is ok with adding the new RelExpr.

446

I picked this format to try to convey its a 5-bit mask and shifted over to bits 16-21. When I'm reading if I see 0x1F0000 I'll typically have to stop and count the 0's to know where the mask is. Does 0x1f << 16 work for you?

452

yes, sorry about that. I've been lazy with clang-format for a while because my vim commands for it were broken. I've fixed them and will format the rest of this patch.

sfertile updated this revision to Diff 152748.Jun 25 2018, 12:32 PM
  • clang formatted patch
  • Switched the relax RelExpr to one that calculates a got-offset.
  • used relocateOne to make it clearer what we were relaxing to.
  • updated register mask to use hex instead of decimal.
ruiu accepted this revision.Jun 25 2018, 11:27 PM

LGTM

This revision is now accepted and ready to land.Jun 25 2018, 11:27 PM
This revision was automatically updated to reflect the committed changes.