Page MenuHomePhabricator

Make it easier to use DwarfContext with MCJIT

Authored by loladiro on Jan 13 2015, 4:15 PM.



This supersedes, hopefully properly dealing with the JIT case and also adds an actual test case. DwarfContext was basically already usable for the JIT (and back when we were overwriting ELF files it actually worked out of the box by accident), but in order to resolve relocations correctly it needs to know the load address of the section. Rather than trying to get this out of the ObjectFile or requiring the user to create a new ObjectFile just to get some debug info, this adds the capability to pass in that info directly. As part of this I separated out part of the LoadedObjectInfo struct from RuntimeDyld, since it is now required add a higher layer. For now I put that in ObjectFile.h, but I would be happy to receive suggestions for more appropriate places.

Diff Detail


Event Timeline

loladiro updated this revision to Diff 18120.Jan 13 2015, 4:15 PM
loladiro retitled this revision from to Make it easier to use DwarfContext with MCJIT and add RelocVisitor support for Darwin.
loladiro updated this object.
loladiro edited the test plan for this revision. (Show Details)
loladiro added reviewers: lhames, echristo.
loladiro set the repository for this revision to rL LLVM.
loladiro added a subscriber: Unknown Object (MLST).
loladiro updated this revision to Diff 18122.Jan 13 2015, 4:17 PM

Accidentally forgot newly added test cases in previous diff.

rafael added inline comments.
569 ↗(On Diff #18122)

Not directly, but It is valid to write

.long .text

and it is represented as a relocation to a symbol, and the symbol has type STT_SECTION.

Would it be desirable to check for that here? If not, please expand the comment a bit.

249 ↗(On Diff #18122)

Please return an ErrorOr<uint8_t>.

278 ↗(On Diff #18122)

As you can see it is way too easy to ignore a std::error_code return :-)

1040 ↗(On Diff #18122)

Someone familiar with COFF should check this.

1010 ↗(On Diff #18122)

Actually, if this is never going to fail, just return a uint8_t.

lhames edited edge metadata.Jan 15 2015, 2:50 PM

Hi Keno,

Is the plan here just to get debugging working for MachO objects?

My intention was that people wanting to interface between the debugger and the JIT should use LoadedObjectInfo::getObjectForDebug(), and that should return a new file that's appropriate for parsing with DWARFContext.

For ELF files, LoadedELFObjectInfo::getObjectForDebug() is doing the same thing that RuntimeDyldELF used to and overwriting the vmaddrs in the ObjectFile, but it's duplicating the ObjectFile first to ensure that that's safe.

For MachO files my plan was to do the same thing, using your old patch for MachO debugging. We'd just need to hook that up in LoadedMachOObjectInfo.

  • Lang.

Hi Lang,

No, this is not for the debugger. This is for the client itself to be able to parse the DWARF information. This is the last piece I need to put MCJIT back on par functionality wise with the old jit (was was able to give you line table information).


Sorry - that was a bit vague of me. I should have said that I intend for getObjectForDebug to work with DWARFContext, not just the debugger.

Is creating a new object file too much overhead? We haven't gone this far in LoadedELFObjectFile yet, but rather than duplicating the original object file, I think it should be reasonable to build a new object file that just contains a symbol table and load commands.

I'm not hugely opposed to the direction of this patch. It's actually very close to what I pitched to echristo when I first cleaned up RuntimeDyld's memory ownership model. At the time though, he convinced me that it would be unpleasant to thread the optional "object interpretation metadata" through the interfaces, hence the decision to just form a new self-contained object file.

Well, I see three reasons for this direction over the other:

  1. We can avoid relocation the debug sections. And in fact, since all this is done lazily, we only ever need to process the sections that we actually need (quite a win from a memory and performance standpoint).
  2. I was never really happy with the way I ended up solving the debug object problem for MachO. A lot of things in MachO refer to the original layout of the object file and fixing them all up is a pain. It can be done, but it always feels like a hack to me. In contrast, other than the admittedly ugly need to pass the LoadedObjectInfo to the DwarfContext, this seems rather sane to me.
  3. I think it improves reusability of the code. I think at some point the whole DWARF library will need to be split into a separate library (it's already duplicated between LLVM and lldb) and in other contexts people might not like to create such a debug object. Even with this patch, I imagine it might be quite possible to create a LoadedObjectInfo subclass for a different JIT and use the DWARF parser with it (admittedly this is probably a minor point).

Also note that this patch isn't actually that big. It's a little inflated by adding test cases, exposing this functionality through llvm-rtdyld and adding the missing ObjectFile methods for handling MachO relocations.

loladiro updated this revision to Diff 18303.Jan 16 2015, 8:35 AM
loladiro edited edge metadata.

Addresses @rafael's review comments. Added implementations of getRelocationSection for ELF and COFF, which aren't strictly necessary, since it's equivalent on ELF/COFF to get the symbols' section, which is what we do anyway. Nevertheless, better to have the API complete in case somebody wants to use it for a different use case.

friss added a subscriber: friss.Jan 21 2015, 10:31 AM

Your patch seems to touch a few different and pretty independent parts on the compiler/tools. It looks like this could be split up quite a bit. Just commenting on the libDebugInfo code:

228–240 ↗(On Diff #18303)

I'd suggest to submit that in a separate patch too. It can be tested quite simply by verifying that llvm-dwarfdump doesn't emit any relocation processing "errors" anymore on x86_64 macho object files.

557–559 ↗(On Diff #18303)

The { should be on the same line as the if. If you don't use the LoadAddress variable out of the if, you can move its declaration inside the condition.

561 ↗(On Diff #18303)

Wouldn't it be cleaner to have a method inside the LoadedObjectInfo that directly returns a StringRef (or ArrayRef) to the section data?

633–637 ↗(On Diff #18303)

No need for the variable, just put the condition in the if. We only care about debug sections here, can this ever be true for the debug sections? Will the JIT load them and relocate them?

680–701 ↗(On Diff #18303)

You introduce section lookups by name for every relocation. Isn't there a more efficient interface? Also, I think I would prefer the code for the LoadedObjectInfo case to be in its own if() rather than mixed with the standard relocation processing (So that it's more self-evident that it's only adding a new code path).

608–610 ↗(On Diff #18303)

This part of the change should be submitted separately with appropriate test coverage.

Not suggesting to do it for this patch, but looking more globally, I wonder if this function should really be returning results that span multiple sequences or if we shouldn't simply modify it to only look in the first sequence containing the start address.
Or maybe we need to change DWARFContext::getLineInfoForAddressRange() to interpret the results differently. Right now, if your result spans multiple sequences, it'll return line mappings for the EndSequence rows which could be very misleading.

I'll split up the patch for further in depth review, but I wanted to have a chance to have everybody see how everything plays together. In particular, I'd like @echristo to comment whether the general direction of this patch seems reasonable. If that's the case, I'll split the patch into three parts (The debug info changes, the LoadedObjectInfo integration with DwarfContext and the Darwin support for RelocVisitor).

561 ↗(On Diff #18303)

That's a good idea.

633–637 ↗(On Diff #18303)

Yes, the JIT will load and relocate debug info in some cases.

680–701 ↗(On Diff #18303)

Currently there's only the lookup by name. I'm not sure if the RuntimeDyld SectionIDs have any relation to the section order in the object file. @lhames?

608–610 ↗(On Diff #18303)

Will do. I had just noticed it while testing this.

vtjnash added a subscriber: vtjnash.Mar 6 2015, 8:25 PM
loladiro updated this revision to Diff 21445.Mar 7 2015, 11:58 PM
loladiro retitled this revision from Make it easier to use DwarfContext with MCJIT and add RelocVisitor support for Darwin to Make it easier to use DwarfContext with MCJIT.
loladiro updated this object.

This rebases the current diff and splits out the MachO RelocVisitor changes into

loladiro updated this revision to Diff 25369.May 8 2015, 2:13 PM
loladiro edited the test plan for this revision. (Show Details)

@echristo @lhames As discussed on IRC, rebased to make it independent of the earlier MachO patch (because that one is not easily testable without this).

The RuntimeDyld side of this looks good to me.

@echristo Can you take a look?

Could somebody please take a look at the RelocVisitor part of this? This patch shouldn't break anything and it's very useful when working with MCJIT.

echristo edited edge metadata.May 21 2015, 11:44 AM

Added some comments, looks like bits overlap with the other patch. Some more inline comments as well. I'll take another look after you get this stuff.


147–156 ↗(On Diff #25369)

Needs doxygen docs for the functions.

557–562 ↗(On Diff #25369)


679–704 ↗(On Diff #25369)

Can you comment this out? It looks a little odd and it's hard to get straight.

loladiro updated this revision to Diff 26269.May 21 2015, 12:56 PM
loladiro edited edge metadata.

Updated with more comments. I also flattened the if nest. Hopefully that should be easier to understand now.

Few more inline comments.

147 ↗(On Diff #26269)

Sentences end with a period. Also we have \brief support turned on so you don't need to use \brief anymore. (And in the other places :)

173 ↗(On Diff #26269)

clone is entirely unused.

685–687 ↗(On Diff #26269)

LLVM style is that single statement blocks don't get braces if possible.

700–702 ↗(On Diff #26269)


261–269 ↗(On Diff #26269)

Explanatory comment here. (It looks obvious, but I like them in general. :)

loladiro added inline comments.May 21 2015, 1:15 PM
147 ↗(On Diff #26269)

Will fix.

173 ↗(On Diff #26269)

It's supposed to be available for the client to be able to obtain a copy of the LoadedObjectInfo, since the one it gets from MCJIT is passed by reference. We could probably change that one to a shared_ptr, but I didn't want to put that API breakage into the same commit.

loladiro updated this revision to Diff 26274.May 21 2015, 1:24 PM

More comments and style updates.

echristo added inline comments.May 21 2015, 1:31 PM
173 ↗(On Diff #26269)

I guess, it might be nicer to make this clear. I'm not a fan of requiring people to clone it etc. Maybe I'm just not seeing where it's going to be used off the top of my head, and can be added in the future if we have a user of the functionality?

loladiro added inline comments.May 21 2015, 1:35 PM
173 ↗(On Diff #26269)

I added this because I need it in Julia. The MCJIT listener interface is

NotifyObjectEmitted(const ObjectFile &Object, const RuntimeDyld::LoadedObjectInfo &L)

but I need to store the LoadedObjectInfo for later (to query the DWARFContext), so I added the clone method as a workaround. Better alternatives would be most appreciated of course.

echristo accepted this revision.May 21 2015, 2:01 PM
echristo edited edge metadata.
echristo added inline comments.
173 ↗(On Diff #26269)

Not my favorite set of interfaces, no, but I don't have anything else in mind at the moment and hopefully someone else can come up with something better.

This revision is now accepted and ready to land.May 21 2015, 2:01 PM
This revision was automatically updated to reflect the committed changes.
dblaikie added inline comments.
173 ↗(On Diff #26269)

This is a bit problematic as the code is entirely untested...

For example - I've fixed the API to use unique_ptr, which may break your usage & I have little idea if my change is correct.

Please add some kind of test coverage in-tree or it'd be best to remove this API.

grimar added a subscriber: grimar.May 20 2017, 3:33 AM

This change introduced getLoadedSectionContents(), but it is not used anywhere in LLVM code I think. Do we need it ?