This is an archive of the discontinued LLVM Phabricator instance.

[PPC64] Add TLS initial exec to local exec relaxation
ClosedPublic

Authored by syzaara on Jun 12 2018, 1:01 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

syzaara created this revision.Jun 12 2018, 1:01 PM
ruiu added inline comments.Jun 12 2018, 1:12 PM
lld/ELF/Arch/PPC64.cpp
164 ↗(On Diff #151002)

In lld we do not take the approach that "every constant should be defined as variable" but this is perhaps a bit cross the line. Don't you have constants for these values?

syzaara updated this revision to Diff 151192.Jun 13 2018, 10:15 AM
ruiu added inline comments.Jun 13 2018, 3:23 PM
lld/ELF/Arch/PPC64.cpp
25 ↗(On Diff #151192)

nit: add a blank line before any multi-line class, struct, enum, etc definitions/declarations.

It needs a comment. Is this a instruction number?

183 ↗(On Diff #151192)

format?

184–185 ↗(On Diff #151192)

Can you please clang-format?

202–204 ↗(On Diff #151192)

Which does this mean, we expect the three sequential instructions or one of them?

lld/ELF/Relocations.h
47 ↗(On Diff #151192)

I don't know if you really need to add a new type. IE→LE relaxation has already implemented for some other architectures. I wonder if you can do the same thing as others.

lld/test/ELF/ppc64-tls-ie-le.s
1 ↗(On Diff #151192)

How did you create this file? It seems a bit too long. Can you reduce it by hand? Also, splitting to smaller test files might improve maintainability.

syzaara added inline comments.Jun 14 2018, 9:53 AM
lld/ELF/Arch/PPC64.cpp
202–204 ↗(On Diff #151192)

These may not be sequential but are all part of the initial exec sequence and we expect each one of them to be converted into a new instruction.
The instructions:

addis r9, r2, x@got@tprel@ha
ld r9, x@got@tprel@l(r9)

are building the offset of x relative to the thread pointer by reading it from the got.
Then,

add r9, r9, x@tls

completes the address by replacing x@tls with r13 which is the thread pointer.

We will do the following replacements as started by the V2 ABI:

addis r9, r2, x@got@tprel@ha  changes into nop
ld r9, x@got@tprel@l(r9) changes into addis r9, r13, x@tprel@ha
add r9, r9, x@tls changes into addi r9, r9, x@tprel@l

This new sequence now calculates the offset of x relative to the thread pointer rather than reading it from the got.

lld/test/ELF/ppc64-tls-ie-le.s
1 ↗(On Diff #151192)

I created this file by hand. It is composed of multiple small tests that are all related so I think it would be better if we kept them in one file. The small tests are meant to test the same relocation relaxation which has multiple options for the instruction:

lbzx -> lbz
lhzx -> lhz
lwzx -> lwz
ldx -> ld
stbx -> stb
sthx -> sth
stwx -> stw
stdx -> std
add -> addi
syzaara added inline comments.Jun 14 2018, 12:59 PM
lld/ELF/Relocations.h
47 ↗(On Diff #151192)

I added this new type because I wanted a relocation which would get ignored similar to how R_HINT and R_NONE are, except for when handling TLS relaxations. I added R_TLS_IE_HINT as the RelExp for R_PPC64_TLS.
R_PPC64_TLS is not suppose to be evaluated into anything, it just gets replaced by r13. Currently we have R_PPC64_TLS mapped to the RelExpr R_HINT. However, for handling TLS relaxations we need to processes the R_PPC64_TLS relocation so that we can perform the relaxation of:

add r9, r9, x@tls
into
addi r9, r9, x@tprel@l

Leaving it as R_HINT causes us to skip calling relaxTlsGdToLe with R_PPC64_TLS and we are unable to do the above change since there is an early exit in scanReloc for R_HINT.

If I use the existing relocation types used for initial exec, like R_GOT_OFF, I would be evaluating it in getRelocTargetVA, just to never use it.

Adding this new type allows me to skip getRelocTargetVA for R_PPC64_TLS, while being able to handle it in the TLS relaxations.

syzaara updated this revision to Diff 151705.Jun 18 2018, 6:46 AM
sfertile added inline comments.Jun 27 2018, 8:55 AM
lld/ELF/Relocations.h
47 ↗(On Diff #151705)

Maybe R_TLSIE_HINT instead to match R_TLSLD_HINT added in other patch.
Also its in the wrong place, since these are sorted alphabetically.

syzaara updated this revision to Diff 154139.Jul 4 2018, 10:43 AM
ruiu added inline comments.Jul 5 2018, 9:54 AM
lld/ELF/Arch/PPC64.cpp
207 ↗(On Diff #154139)

nit: error messages should start with a lowercase letter.

238 ↗(On Diff #154139)

nit: remove {} since we don't need to create a new scope for local variable for this case.

249 ↗(On Diff #154139)

unsigned -> uint32_t

251 ↗(On Diff #154139)

U -> u

252–253 ↗(On Diff #154139)

unsigned -> uint32_t.

lld/ELF/Relocations.h
83 ↗(On Diff #154139)

Does this compile? R_TLSIE_HINT is added but other files refer R_TLS_IE_HINT.

syzaara updated this revision to Diff 154286.Jul 5 2018, 12:35 PM
MaskRay added a subscriber: MaskRay.EditedJul 17 2018, 10:54 AM

Just a question. Does this patch support the optimization that fills in the GOT slot when a group of initial-exec relocations are known to be link-time constants?

__attribute__((tls_model("initial-exec")))
static __thread DTLS dtls;
48:   00 00 62 3c     addis   r3,r2,0
                      48: R_PPC64_GOT_TPREL16_HA      _ZN11__sanitizerL4dtlsE
4c:   00 00 63 e8     ld      r3,0(r3)
                      4c: R_PPC64_GOT_TPREL16_LO_DS   _ZN11__sanitizerL4dtlsE
50:   14 6a 83 7c     add     r4,r3,r13
                      50: R_PPC64_TLS _ZN11__sanitizerL4dtlsE

https://github.com/llvm-mirror/lld/tree/master/ELF/Relocations.cpp#L747

template <class ELFT> static void addGotEntry(Symbol &Sym) {
...
  bool IsLinkTimeConstant =
      !Sym.IsPreemptible && (!Config->Pic || isAbsolute(Sym));
  /// if R_PPC64_GOT_TPREL16_HA R_PPC64_GOT_TPREL16_LO_DS are link-time constants, they can be filled but `Target->GotRel` should not be used here as it is a 64-bit value.
  /// Target->GotRel is R_PPC64_GLOB_DAT but I think it should not be used.
  if (IsLinkTimeConstant) {
    InX::Got->Relocations.push_back({Expr, Target->GotRel, Off, 0, &Sym});
    return;
  }

Just a question. Does this patch support the optimization that fills in the GOT slot when a group of initial-exec relocations are known to be link-time constants?

__attribute__((tls_model("initial-exec")))
static __thread DTLS dtls;
48:   00 00 62 3c     addis   r3,r2,0
                      48: R_PPC64_GOT_TPREL16_HA      _ZN11__sanitizerL4dtlsE
4c:   00 00 63 e8     ld      r3,0(r3)
                      4c: R_PPC64_GOT_TPREL16_LO_DS   _ZN11__sanitizerL4dtlsE
50:   14 6a 83 7c     add     r4,r3,r13
                      50: R_PPC64_TLS _ZN11__sanitizerL4dtlsE

https://github.com/llvm-mirror/lld/tree/master/ELF/Relocations.cpp#L747

template <class ELFT> static void addGotEntry(Symbol &Sym) {
...
  bool IsLinkTimeConstant =
      !Sym.IsPreemptible && (!Config->Pic || isAbsolute(Sym));
  /// if R_PPC64_GOT_TPREL16_HA R_PPC64_GOT_TPREL16_LO_DS are link-time constants, they can be filled but `Target->GotRel` should not be used here as it is a 64-bit value.
  /// Target->GotRel is R_PPC64_GLOB_DAT but I think it should not be used.
  if (IsLinkTimeConstant) {
    InX::Got->Relocations.push_back({Expr, Target->GotRel, Off, 0, &Sym});
    return;
  }

For the example of:
attribute((tls_model("initial-exec")))
static __thread DTLS dtls;

We handle the TLS relocations for R_PPC64_GOT_TPREL16_HA, R_PPC64_GOT_TPREL16_LO_DS, R_PPC64_TLS before we check for needsGot.
When handling these relocations, they are optimized into the local exec model. This model does not require any GOT slots, so there isn't anything to optimize for the GOT slots.

sfertile added inline comments.Jul 23 2018, 10:49 AM
lld/ELF/Arch/PPC64.cpp
26 ↗(On Diff #154286)

Do we need to relax vector load/stores as well?

222 ↗(On Diff #154286)

A brief description of what x@tls means would be useful here.

225 ↗(On Diff #154286)

getDFormOp already lists all the instructions we want to convert form X-form to D-form. We can probably condense this comment to not list them all explicitly.

The last instruction in the initial exec sequence has multiple variations that need to be handled. If we are building an address it will use an add instruction, if we are accessing memory it will  use any of the X-form indexed load or store instructions
257 ↗(On Diff #154286)

We should have a default case with an llvm_unreachable, and do we need to handle R_PPC64_GOT_TPREL16_DS and R_PPC64_GOT_TPREL16_HI?

lld/ELF/InputSection.cpp
515 ↗(On Diff #154286)

Sort alphabetically.

syzaara updated this revision to Diff 157109.Jul 24 2018, 12:59 PM
ruiu added a comment.Jul 24 2018, 2:42 PM

Generally looking good.

lld/ELF/Arch/PPC64.cpp
227 ↗(On Diff #157109)

By the last instruction, do you mean add in the above code? If so, please write it so.

240 ↗(On Diff #157109)

nit: you can remove ()

249 ↗(On Diff #157109)

remove space between Op and "(".

syzaara updated this revision to Diff 157273.Jul 25 2018, 8:10 AM
ruiu accepted this revision.Jul 25 2018, 10:45 AM

LGTM with this change.

lld/ELF/Arch/PPC64.cpp
228–229 ↗(On Diff #157273)

Incomplete sentence? Please fix.

This revision is now accepted and ready to land.Jul 25 2018, 10:45 AM
This revision was automatically updated to reflect the committed changes.