This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho][NFC] refactor relocation handling
ClosedPublic

Authored by gkm on Jan 21 2021, 2:39 AM.

Details

Reviewers
int3
Group Reviewers
Restricted Project
Commits
rG3a9d2f1488f0: [lld-macho][NFC] refactor relocation handling
Summary

Add per-reloc-type attribute bits and migrate code from per-target file into target independent code, driven by reloc attributes.

Many cleanups

Diff Detail

Event Timeline

gkm requested review of this revision.Jan 21 2021, 2:39 AM
gkm created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 21 2021, 2:39 AM
int3 added a subscriber: int3.Jan 21 2021, 9:46 AM

I like this enum-bit-based design :)

lld/MachO/InputFiles.cpp
238–259

ideally we would only add functionality in the diff where it gets tested, but I guess diff splitting is kinda hairy at times... can we at least note in the commit message that this gets tested in D88629?

241

no need for llvm::

253

we should put these as default field initializers in the struct definition. And perhaps we can also add a RelocationInfoType::INVALID enum value instead of using 0xff

258

seems unnecessary

lld/MachO/OutputSegment.cpp
49–50 ↗(On Diff #318137)

this change seems out of place in this diff

lld/MachO/Target.h
38–39

why do we need to define these if they are invalid?

96

naming nit 1: I think the usual terminology is "relax", not "morph". LLD-ELF uses this at least.
nit 2: I guess "load" and "add" are the instructions on ARM, but the equivalent ones on x86 are "mov" and "lea", which makes this naming a bit confusing, especially since "lea" = *load* effective address. How about relaxGotLoad()?

100

nit: add newline after multi-line function definition

int3 added inline comments.Jan 21 2021, 12:45 PM
lld/MachO/Target.h
50–51

I think ld64 (and other bits of macOS source) use "TLV" in place of "TLS". Pointers to TLVs are abbreviated as TLVPs.

gkm marked 9 inline comments as done.Jan 28 2021, 1:29 PM
gkm added inline comments.
lld/MachO/InputFiles.cpp
238–259

X86_64_RELOC_SUBTRACTOR exists, so it can and should be tested in this diff. I added a test.

lld/MachO/Target.h
38–39

It's a minor expediency for checking validity of reloc length: 1 << r.length vs. 1 << (r.length - 2)

50–51

That's a real distinction, and apparently needs more commentary to avoid confusion:
TLV = Thread-Local Variable and pertains to the datum and its symbol.
TLS = Thread-Local Storage and pertains to the section that contains the TLVs.

gkm updated this revision to Diff 319955.Jan 28 2021, 1:49 PM
gkm marked 3 inline comments as done.
  • revise according to review feedback
int3 accepted this revision.Jan 28 2021, 11:56 PM
int3 added inline comments.
lld/MachO/Target.cpp
38

how do you feel about naming this method just hasAttr? Having "RelocAttrBit" in both the method and the enum name seems a bit repetitive. Plus the "bit" part is a bit (heh) of an implementation detail.

lld/MachO/Target.h
50–51

I understood the distinction... I'm saying that TLVP seems like a more appropriate name for the former and TLV for the latter. In particular, in the SectionType enum, we have

/// S_THREAD_LOCAL_VARIABLES - Section with thread local variable
/// structure data.
S_THREAD_LOCAL_VARIABLES = 0x13u,
/// S_THREAD_LOCAL_VARIABLE_POINTERS - Section with pointers to thread
/// local structures.
S_THREAD_LOCAL_VARIABLE_POINTERS = 0x14u,

Based on the above, it seems fitting to me that a reloc which can be present in a S_THREAD_LOCAL_VARIABLES section has attribute TLV. And similarly TLVP seems like a fitting name for a reloc which can point to an address in a S_THREAD_LOCAL_VARIABLES section.

I'm aware that "TLS" is commonly used in the sense that you are describing, but from what I can tell neither ld64 nor dyld use that abbreviation, so it would be nice to hew to that.

This revision is now accepted and ready to land.Jan 28 2021, 11:56 PM
gkm marked 2 inline comments as done.Jan 31 2021, 7:00 PM

I also completed the TODO(gkm): hoist into target-independent code driven by RelocAttrBits refactors for prepareSymbolRelocation() and resolveSymbolVA()

lld/MachO/Target.h
50–51

Very good. I realized we only need one bit, which I named TLV. Thread-local Variable Pointer relocs are TLV|LOAD, and Thread-local Variable relocs are TLV (no LOAD).

BTW, LOAD is a new name for what I had been calling MORPH, and then RELAX.

gkm updated this revision to Diff 320389.Jan 31 2021, 7:04 PM
gkm marked an inline comment as done.
  • revise according to review feedback
  • rewrite prepareSymbolRelocation() and resolveSymbolVA() as reloc-attribute-driven rather than target-specific
int3 added inline comments.Feb 1 2021, 9:18 AM
lld/MachO/Target.h
50–51

sounds good :)

gkm added a subscriber: smeenai.Feb 1 2021, 4:11 PM

@int3 @smeenai
Regarding test failures x64 debian > libarcher.races ... These look unrelated to anything I have done in this diff, but looks can be deceiving.

How can I verify? The slow way is that I can run the entire premerge check on CentOS8 and hope that’s sufficiently close to the Debian environment. Is there some sort of health meter for tests so that I can see if these tests are broken for contemporaneous diffs, or main?

gkm updated this revision to Diff 320632.Feb 1 2021, 4:34 PM
  • rebase to see if seemingly-unrelated test failures resolve.
This revision was landed with ongoing or failed builds.Feb 2 2021, 9:55 AM
This revision was automatically updated to reflect the committed changes.
int3 added inline comments.Feb 2 2021, 12:33 PM
lld/MachO/Writer.cpp
412–413

I think this shouldn't have been part of the diff

uabelho added inline comments.
lld/MachO/Arch/X86_64.cpp
66

gcc warns about this always being false since type is unsigned.
Can it be removed?

Similarly with the assert above, I suppose type >= 0 is always true and it can be removed.