This is an archive of the discontinued LLVM Phabricator instance.

MCJIT MachO debugging support
ClosedPublic

Authored by loladiro on Jun 3 2014, 9:31 AM.

Details

Reviewers
lhames
Summary

Adds the necessary hooks to MCJIT to be able to debug generated MachO files using the GDB JIT debugging interface. Corresponding patch to LLDB in a minute.

Diff Detail

Event Timeline

loladiro updated this revision to Diff 10054.Jun 3 2014, 9:31 AM
loladiro retitled this revision from to MCJIT MachO debugging support.
loladiro updated this object.
loladiro edited the test plan for this revision. (Show Details)
loladiro added a subscriber: Unknown Object (MLST).
echristo added a subscriber: echristo.

Lang has touched this code last...

lhames edited edge metadata.Jun 16 2014, 1:29 PM

Hi Keno,

The updatesSectionAddress implementation is a bit funky, but I don't have any better ideas for how to do it.

Regarding the comment:
TODO: Refactor RuntimeDyld.cpp, so we at least don't nede the symbol
thing anymore.

Any chance you could expand on how you see this being refactored in the future? Could you file a bug for it and add a reference to the TODO?

Otherwise this looks good to me.

  • Lang.

Hi Lang,

thanks for taking a look. When I originally wrote this code, I had an idea on how to do it, but I don't think it actually works the way I had hoped, especially because MachO makes the assumption on the original object file lay out in other places as well. I think I'm just to push a new version that removes that comment.

Keno

In D4005#7, @loladiro wrote:

...I think I'm just to push a new version that removes that comment.

That's ok. In that case though, could you tweak the implementation of updateSectionAddress so that it can be safely called multiple times? I believe LLDB sometimes does that (possibly redundantly - with the same "new" address each time).
As currently written, you'll overwrite the "old" section address the second time through.

Cheers,
Lang.

Hi Keno,

I think all you need, in updateSectionAddress, is a guard along the lines of:

if (OldSectionAddrList[Sec.getRawDataRefImpl().d.a] == 0)
  OldSectionAddrList[Sec.getRawDataRefImpl().d.a] = oldAddr;

That way the original section address will be preserved (so internal relocations will still be computed correctly), but successive moves should also update the symbol and section addresses in the object image. Does that sound right to you?

Cheers,
Lang.

Yes, that's right. I was busy yesterday, but I was actually just rebuilding now with almost that exact patch (I made it an assertion failure to call the function twice with different addresses though).

loladiro updated this revision to Diff 10603.Jun 18 2014, 9:23 PM
loladiro edited edge metadata.

Addressed the comments.

Is it a problem if updateSectionAddress is called more than once with different addresses? I agree it's odd, but as long as the sections aren't finalized in between I think it's ok?

The other thing that occurred to me is that getOldSectionAddr will return '0' if the section hasn't been moved, which isn't right. I think that OldSectionAddrList should be initialized with the initial section addr when the MachOObjectImage is constructed. The updateSectionAddress method shouldn't need to touch OldSectionAddrList at all.

These two issues are connected: Since we need to populate OldSectionAddrList up-front, we can't use it to test whether we've moved a section before, so the test guarding the assertion won't work. If you want to disallow moving sections more than once you'll need to keep an "alreadyMoved" set too.

Yeah, I guess it's ok to move a section again. Populating the list up-front sounds like a good idea to me.

loladiro updated this revision to Diff 10718.Jun 20 2014, 7:16 PM

Now initializing the Array up-front.

Anything else I should do here?

Format it with clang-format :)

loladiro updated this revision to Diff 10802.Jun 24 2014, 2:26 PM

Small fix and formatting.

Missed lib/Object/MachOObjectFile.cpp :)

-eric

I ran it through clang-format-diff.py. The files themselves are not entirely formatted acording to clang-format, so I didn't want to introduce a large set of unrelated changes. Not sure why that wasn't reformatted by clang-format-diff.py.

Definitely a bug in clang-format-diff since the opening brace should be on the line before.

loladiro updated this revision to Diff 10808.Jun 24 2014, 3:02 PM

One more small reformat.

LGTM, I'll leave it up to Lang to review some of the grotty mach-o
details I've likely forgotten :)

-eric

One more nit since I noticed:

+ if (is64) {
+ ((MachO::section_64 *)data)->addr = Addr;
+ } else {
+ ((MachO::section *)data)->addr = Addr;
+ }

No need for the braces here.

I'm kindof dubious about this whole approach, but this matches what
the ELF stuff is doing so...

-eric

lhames accepted this revision.Jun 24 2014, 4:25 PM
lhames edited edge metadata.

LGTM too.

There's a serious caveat attached to this, which is that I think the idea of re-writing object files in-place is fundamentally broken. This will probably work for MCJIT because it emits its object files directly into memory. For RuntimeDyld, which may memmap its object files, this almost certainly won't work. We should aim to have a nicer interface between the JIT and the debugger. This patch doesn't make things any worse than they already were though, so I don't want to hold it up.

Thanks again for working on this Keno!

This revision is now accepted and ready to land.Jun 24 2014, 4:25 PM

Thanks, Lang. Note that I don't have commit access, so somebody else will have to commit (I can fix the one remaining style nit in a moment). I agree that we should think about a different interface to the debugger. For now I'm happy that this works though. Maybe the mailing list would be a better place to discuss what such an interface would look like.

Committed in r211652. Thanks Keno!

loladiro closed this revision.Jun 30 2014, 12:52 PM