Page MenuHomePhabricator

[lld][ELF][SPARC] Implement and fixup GOT/PLT relocations
Needs ReviewPublic

Authored by LemonBoy on May 23 2021, 8:52 AM.

Details

Reviewers
jrtc27
MaskRay
Summary
  • Support R_SPARC_GOT13 relocation, completing the support for 32bit PIC model.
  • Support the R_SPARC_GOTDATA_OP_ family of relocations and its optimizations.
  • Reserve a GOT entry as required by the ABI.
  • Emit RELA entries as required by the ABI.
  • Avoid the use of .got.plt.
  • Apply the R_SPARC_JMP_SLOT relocation on PLT entries.

Diff Detail

Event Timeline

LemonBoy created this revision.May 23 2021, 8:52 AM
LemonBoy requested review of this revision.May 23 2021, 8:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 23 2021, 8:52 AM

Please stop putting the rela change into tangentially-related revisions.

Commits should be something like:

  1. Fix lld to use rela
  2. Finish GOT support
  3. Add IE TLS
  4. Add LD TLS
  5. Add GD TLS
LemonBoy updated this revision to Diff 347254.May 23 2021, 9:53 AM
LemonBoy edited the summary of this revision. (Show Details)

Depends on D102986.

I dislike that we are fixing these stuff increasingly with so many patches.

See the history of Arch/PPC.cpp - most stuff are implemented with very few commits.

MaskRay requested changes to this revision.May 23 2021, 10:14 AM
This revision now requires changes to proceed.May 23 2021, 10:14 AM

I dislike that we are fixing these stuff increasingly with so many patches.

See the history of Arch/PPC.cpp - most stuff are implemented with very few commits.

I'm getting a bit tired of splitting and merging patches, can we please decide on what the best course of action is and end this suffering?

I dislike that we are fixing these stuff increasingly with so many patches.

See the history of Arch/PPC.cpp - most stuff are implemented with very few commits.

I'm getting a bit tired of splitting and merging patches, can we please decide on what the best course of action is and end this suffering?

Hm, MaskRay makes a good point. I guess my real complaint when I stop and think about it is not combining multiple things into one commit, but having a commit that says "do X" when it's really "do X and Y". One commit that's "fix everything", or two commits "fix everything except TLS" and "finish TLS", seems fine to me, so long as the descriptions are correct and don't sneak in changes that are unrelated to the commit _message_.

MaskRay added a comment.EditedMay 23 2021, 10:39 AM

I think the problem is the existing port is too broken. Aside from that, some relocation types are clearly not tested. Some GOT/PLT relocations are likely not properly tested.
The bar for the initial check-in was too low I'd say.

I think for a mature port, fixing things incrementally make for good commit history. But I cannot feel similarly for sparcv9 which apparently misses many things.
Personally I like the test organization of ppc32-* the most. Not too many tests (x86_64 tends to have the problem that there are too many files) but sufficient coverage is provided.
By looking at the history, you can see that I ensured GOT/PLT and every related things are correct in one commit and we never need to fix the implemented stuff.

MaskRay added a comment.EditedMay 23 2021, 10:42 AM

I think it is useful to set up a minimal goal. When you fix/implement GOT/PLT, make sure some basic programs with GOT/PLT are really working.

If we do "fix GOT a" "fot GOT b" "fix PLT c", it is very likely that a subsequent commit may update an earlier commit. This is less elegant.

So what's the state of this port? When I implemented ppc32 support and added riscv support, I made sure glibc/musl ppc32/riscv dynamically linking programs work.

Some (candidate) milestones: (1) support -fno-pic -no-pie -nostdlib link (2) support dynamic linking (many stuff are here: RELATIVE/COPY/GOT/PLT/...) (3) support tls (4) support extra things that glibc static links require

I think it is useful to set up a minimal goal. When you fix/implement GOT/PLT, make sure some basic programs with GOT/PLT are really working.

If we do "fix GOT a" "fot GOT b" "fix PLT c", it is very likely that a subsequent commit may update an earlier commit. This is less elegant.

So what's the state of this port? When I implemented ppc32 support and added riscv support, I made sure glibc/musl ppc32/riscv dynamically linking programs work.

Some (candidate) milestones: (1) support -fno-pic -no-pie -nostdlib link (2) support dynamic linking (many stuff are here: RELATIVE/COPY/GOT/PLT/...) (3) support tls (4) support extra things that glibc static links require

The port is good enough to be able to successfully link the whole Zig compiler and its test suite.
This set of patches is part of my effort to let it build and run a whole Zig program that's dynamically linked against glibc, I'm progressively fixing lld as needed.

My current roadmap is:

  • Implement the remaining TLS models {IE, GD, LD}
  • Implement the GOTDATA relocations.

Some fixup commits to fix what was broken/missing are perhaps inelegant but necessary.

LemonBoy updated this revision to Diff 352997.Jun 18 2021, 6:46 AM
LemonBoy edited the summary of this revision. (Show Details)

Add back the RELA bit, no explicit test is needed as the existing ones are already covering that bit.

Is dynamic linking fully functional?

Is dynamic linking fully functional?

Sort of. I managed to link and run (under Qemu) a dynamically-linked PIE executable after applying a few more touch-ups, the idea is to turn this patchset into one targeting GOT/PLT support, as you suggested.
No updated patch was uploaded because I've hit a problem with (I think) how LLD handles GOT entries and how SPARC GOTDATA relocations are supposed to work. At the moment the plan is to avoid implementing the GOTDATA optimizations and treat them as regular GOT-relative relocations instead.
Below you can find the set of relocations emitted by GCC for a small hello-world program, the six GOTDATA triplets reference six different objects in the .rodata sections ... by means of a STT_SECTION reference with non-zero addend. Since all the six relocations point to the same symbol LLD generates a single GOT entry (every call to isInGot returns true) and that generates binaries that are crashing at runtime.

As you can see the code sequence for this kind of relocation expects a single GOT entry for each relocation w/ different addends:

sethi %gdop_hix22(symbol), %l1
xor   %l1, %gdop_lox10(symbol), %l1
ld    [%l7 + %l1], %l2, %gdop(symbol)

I'm open to suggestions on how to overcome this problem, of course relaxing this relocation kind solves the problem at hand (even though that opens another can of worms, RelExpr already has 64 elements and RelExprMaskBuilder requires some work to accommodate more entries) but I wonder if that'd be only papering over a bigger problem.

Section (3) .rela.text {
  0x4 R_SPARC_PC22 _GLOBAL_OFFSET_TABLE_ 0xFFFFFFFFFFFFFFF8
  0x8 R_SPARC_PC10 _GLOBAL_OFFSET_TABLE_ 0xFFFFFFFFFFFFFFFC
  0xC R_SPARC_WPLT30 __sparc_get_pc_thunk.l7 0x0
  0x24 R_SPARC_GOTDATA_OP_HIX22 .rodata 0x0
  0x28 R_SPARC_GOTDATA_OP_LOX10 .rodata 0x0
  0x2C R_SPARC_GOTDATA_OP .rodata 0x0
  0x34 R_SPARC_WPLT30 puts 0x0
  0x3C R_SPARC_GOTDATA_OP_HIX22 .rodata 0x10
  0x40 R_SPARC_GOTDATA_OP_LOX10 .rodata 0x10
  0x44 R_SPARC_GOTDATA_OP .rodata 0x10
  0x4C R_SPARC_GOTDATA_OP_HIX22 .rodata 0x18
  0x50 R_SPARC_GOTDATA_OP_LOX10 .rodata 0x18
  0x54 R_SPARC_GOTDATA_OP .rodata 0x18
  0x5C R_SPARC_WPLT30 fopen 0x0
  0x74 R_SPARC_GOTDATA_OP_HIX22 .rodata 0x28
  0x78 R_SPARC_GOTDATA_OP_LOX10 .rodata 0x28
  0x7C R_SPARC_GOTDATA_OP .rodata 0x28
  0x84 R_SPARC_WPLT30 puts 0x0
  0x98 R_SPARC_GOTDATA_OP_HIX22 .rodata 0x30
  0x9C R_SPARC_GOTDATA_OP_LOX10 .rodata 0x30
  0xA0 R_SPARC_GOTDATA_OP .rodata 0x30
  0xA8 R_SPARC_WPLT30 fwrite 0x0
  0xB4 R_SPARC_WPLT30 fclose 0x0
  0xBC R_SPARC_GOTDATA_OP_HIX22 .rodata 0x38
  0xC0 R_SPARC_GOTDATA_OP_LOX10 .rodata 0x38
  0xC4 R_SPARC_GOTDATA_OP .rodata 0x38
  0xCC R_SPARC_WPLT30 puts 0x0
}

Ok, so you need R_RELAX_GOT_OFF to match R_RELAX_GOT_PC?

Ok, so you need R_RELAX_GOT_OFF to match R_RELAX_GOT_PC?

A relocation that behaves as R_GOT_OFF or, when it's possible to relax it, as a R_GOTREL.

jrtc27 added a comment.EditedJun 20 2021, 10:03 AM

Ok, so you need R_RELAX_GOT_OFF to match R_RELAX_GOT_PC?

A relocation that behaves as R_GOT_OFF or, when it's possible to relax it, as a R_GOTREL.

Yes, the R_GOT_OFF version of what R_RELAX_GOT_PC is to R_GOT_PC (e.g. movq foo@GOTPCREL(%rip), %rax -> leaq foo(%rip), %rax, where the latter means leaq foo@PCREL(%rip) but that's implied by the %rip base).

Ok, so you need R_RELAX_GOT_OFF to match R_RELAX_GOT_PC?

A relocation that behaves as R_GOT_OFF or, when it's possible to relax it, as a R_GOTREL.

Yes, the R_GOT_OFF version of what R_RELAX_GOT_PC is to R_GOT_PC (e.g. movq foo@GOTPCREL(%rip), %rax -> leaq foo(%rip), %rax, where the latter means leaq foo@PCREL(%rip) but that's implied by the %rip base).

Oh. Yes, that's the idea. At the moment I'm treating all of them as R_GOT_OFF but, as explained above, that's not working for every binary.

LemonBoy updated this revision to Diff 353366.Jun 21 2021, 7:41 AM
LemonBoy retitled this revision from [lld][ELF][SPARC] Fix GOT-relative relocations to [lld][ELF][SPARC] Implement and fixup GOT/PLT relocations.
LemonBoy edited the summary of this revision. (Show Details)

Here's the updated patch, I'm now able to link and run several small programs linked against glibc (in both PIC/PIE mode), shared libraries and some basic examples w/ ifuncs.

The use of uint128_t is a hack that works pretty well, suggestions are welcome.

Once D102575 I'll be able to commit some tests.

32-bit architectures don't support int128_t.

Downstream we turned it into a std::pair<uint64_t, uint64_t> (https://github.com/CTSRD-CHERI/llvm-project/blob/de58a2a25c32471b4b3c2aec77fee624e24120cd/lld/ELF/Relocations.cpp#L136-L179), though something more general where it's parameterised on the number of words might be nicer rather than hand-rolling a pair version.

Downstream we turned it into a std::pair<uint64_t, uint64_t> (https://github.com/CTSRD-CHERI/llvm-project/blob/de58a2a25c32471b4b3c2aec77fee624e24120cd/lld/ELF/Relocations.cpp#L136-L179), though something more general where it's parameterised on the number of words might be nicer rather than hand-rolling a pair version.

Interesting, I started implementing it in terms of an array of uint64_t elements.
Using a std::pair looks good to me, it's quite unlikely for RelExpr to become bigger than 128 (hopefully). Do you plan on upstreaming this change?