This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Support X86_64_RELOC_SIGNED_{1,2,4}
ClosedPublic

Authored by MaskRay on May 3 2020, 9:40 PM.

Details

Summary

We currently only support extern relocations.
X86_64_RELOC_SIGNED_{1,2,4} are like X86_64_RELOC_SIGNED, but with the
implicit addend fixed to 1, 2, and 4, respectively.
See the comment in lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp RecordX86_64Relocation.

Diff Detail

Event Timeline

MaskRay created this revision.May 3 2020, 9:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 3 2020, 9:40 PM
MaskRay updated this revision to Diff 261738.May 3 2020, 9:46 PM

Forgot to add llvm-objdump test

int3 added a comment.May 4 2020, 12:56 PM

Thanks for putting this up!

See the comment in lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp RecordX86_64Relocation.

The comment there confuses me a little:

// The Darwin x86_64 relocation format has a problem where it cannot
// encode an address (L<foo> + <constant>) which is outside the atom
// containing L<foo>. Generally, this shouldn't occur but it does
// happen when we have a RIPrel instruction with data following the
// relocation entry (e.g., movb $012, L0(%rip)). Even with the PCrel
// adjustment Darwin x86_64 uses, the offset is still negative and the
// linker has no way to recognize this.

It says "the linker has no way to recognize this", but it appears that the implicit addend already allows the linker to recognize it. Any idea why it also needs a special relocation type? Is it redundant / a mistake in the implementation? ("However, the specification or implementation of this seems to be incomplete..." seems to indicate that I guess.)

lld/test/MachO/x86-64-reloc-signed.s
18

nit: X86_64_RELOC_SIGNED_4 (missing underscore)

MaskRay added a comment.EditedMay 4 2020, 1:54 PM

Thanks for putting this up!

See the comment in lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp RecordX86_64Relocation.

The comment there confuses me a little:

// The Darwin x86_64 relocation format has a problem where it cannot
// encode an address (L<foo> + <constant>) which is outside the atom
// containing L<foo>. Generally, this shouldn't occur but it does
// happen when we have a RIPrel instruction with data following the
// relocation entry (e.g., movb $012, L0(%rip)). Even with the PCrel
// adjustment Darwin x86_64 uses, the offset is still negative and the
// linker has no way to recognize this.

It says "the linker has no way to recognize this", but it appears that the implicit addend already allows the linker to recognize it. Any idea why it also needs a special relocation type? Is it redundant / a mistake in the implementation? ("However, the specification or implementation of this seems to be incomplete..." seems to indicate that I guess.)

I guess X86_64_RELOC_SIGNED_1 can be used to assume the implicit addend is 1, i.e. it does not need to read the implicit addend at all (like ELF RELA). In binutils-gdb source code, the comment also implies the addends:

/* Same as BFD_RELOC_32_PCREL but with an implicit -1 addend.  */
  BFD_RELOC_MACH_O_X86_64_PCREL32_1,

/* Same as BFD_RELOC_32_PCREL but with an implicit -2 addend.  */
  BFD_RELOC_MACH_O_X86_64_PCREL32_2,

/* Same as BFD_RELOC_32_PCREL but with an implicit -4 addend.  */
  BFD_RELOC_MACH_O_X86_64_PCREL32_4,

I played with X86MachObjectWriter.cpp a bit. We can simply treat X86_64_RELOC_SIGNED_* as X86_64_RELOC_SIGNED.

MaskRay updated this revision to Diff 261924.May 4 2020, 1:58 PM

Fix relocation type names

I think it makes a difference for non-extern relocations.

Our relocation computation for pcrel relocations (the only type we support right now) is currently VA of target address + implicit addend - VA of relocation. We want the end result to be VA of target address + implicit addend - address of next instruction, since PC-relative relocations on x86-64 are relative to the start of the next instruction, and we accomplish this by subtracting 4 in X86_64::relocateOne. That's assumes the relocation is at the end of the instruction though, which isn't correct for X86_64_RELOC_SIGNED_{1,2,4}.

We're getting away with this right now because for extern relocations, the implicit addend negates this. E.g. for an X86_64_RELOC_SIGNED_1 (where there's 1 byte after the relocation), our computed offset would normally have been too high by 1 (since we'd be computing target VA - (address of next instruction - 1)), but the implicit addend is -1, so this works out. For non-extern relocations though, the implicit addend doesn't appear to have this adjustment, so we'll need to take the relocation type into account there.

Given that we only support extern relocations right now, this LGTM. @int3, are you good with this going in first and adjusting D79211 after?

An interesting aside: if you change the _s+2 to _s+1, the object file just contains a regular X86_64_RELOC_SIGNED. I wonder if that can cause any issues.

lld/test/MachO/x86-64-reloc-signed.s
18

(same nit for the two below)

int3 accepted this revision.May 4 2020, 2:05 PM

lgtm. I'll rebase my section relocation diff on this.

This revision is now accepted and ready to land.May 4 2020, 2:05 PM
lld/test/MachO/x86-64-reloc-signed.s
18

btw, is there a guarantee that MC will emit a relocation of this particular type ? (and the same question might be relevant for other similar diffs). One way to (at least) verify it is to use llvm-objdump or llvm-readobj to check that the list of relocations contains exactly what's expected here.

smeenai added inline comments.May 4 2020, 2:44 PM
lld/test/MachO/x86-64-reloc-signed.s
18

Good point, and yeah, that seems worthwhile to check.

MaskRay marked 3 inline comments as done.May 4 2020, 3:11 PM
MaskRay added inline comments.
lld/test/MachO/x86-64-reloc-signed.s
18

I did want to test initial relocations (the opposite of outstanding relocations; some ELF ppc64-* tests do this). Weirdly, llvm-readobj -r says the relocations are not ordered by offset (in ELF this is not guaranteed but the order is nice for a variety of reasons):

Section __text {
  0x3D 1 2 1 X86_64_RELOC_SIGNED 0 _s
  0x26 1 2 1 X86_64_RELOC_BRANCH 0 _f
  0x20 1 2 1 X86_64_RELOC_SIGNED_1 0 _s
  0x1A 1 2 1 X86_64_RELOC_BRANCH 0 _f
  0x11 1 2 1 X86_64_RELOC_SIGNED_2 0 _s
  0xB 1 2 1 X86_64_RELOC_BRANCH 0 _f
  0x2 1 2 1 X86_64_RELOC_SIGNED_4 0 _s
}

I'd hope this to be fixed first because otherwise we can't interleave instructions and CHECK lines for initial relocations in their order.

lgtm. I'll rebase my section relocation diff on this.

Thanks! Will rebase and commit to make things move faster. I do think relocation emitter is not ideal and we probably should fix that. Likewise, yaml2obj/obj2yaml may need to improved just to make future lld-macho tests nicer...

MaskRay edited the summary of this revision. (Show Details)May 4 2020, 3:14 PM
This revision was automatically updated to reflect the committed changes.
int3 added a comment.May 5 2020, 4:24 PM

For non-extern relocations though, the implicit addend doesn't appear to have this adjustment, so we'll need to take the relocation type into account there.

Just experimented locally, it appears that non-extern relocations have this adjustment too...

thakis added a subscriber: thakis.May 15 2020, 1:41 PM
thakis added inline comments.
lld/MachO/Arch/X86_64.cpp
53

Is this correct? Looking at lld/lib/ReaderWriter/MachO/ArchHandler_x86_64.cpp, it has this handy table:

ripRel32,              /// ex: movq _foo(%rip), %rax
ripRel32Minus1,        /// ex: movb $0x12, _foo(%rip)
ripRel32Minus2,        /// ex: movw $0x1234, _foo(%rip)
ripRel32Minus4,        /// ex: movl $0x12345678, _foo(%rip)
ripRel32Anon,          /// ex: movq L1(%rip), %rax
ripRel32Minus1Anon,    /// ex: movb $0x12, L1(%rip)
ripRel32Minus2Anon,    /// ex: movw $0x1234, L1(%rip)
ripRel32Minus4Anon,    /// ex: movw $0x12345678, L1(%rip)

and it maps SIGNED(_1/2/4) to these and then it does relocations with offsets for the offset relocation types.

MaskRay marked an inline comment as done.May 18 2020, 3:35 PM
MaskRay added inline comments.
lld/MachO/Arch/X86_64.cpp
53

This matches my observation of ld64 and X86MachObjectWriter.cpp

Reading the implicit addend back may seem weird but I believe X86_64_RELOC_SIGNED_1 requires the implicit addend to be 1.

MaskRay marked an inline comment as done.May 18 2020, 3:42 PM
MaskRay added inline comments.
lld/MachO/Arch/X86_64.cpp
53

Sorry, I mean the implicit addend is -1 (not 1).

f: c6 05 ff ff ff ff 45          movb    $69, -1(%rip)  # 15 <__text+0x15>
         0000000000000011:  X86_64_RELOC_SIGNED_1        _s-1