This supersedes http://reviews.llvm.org/D4010, hopefully properly dealing with the JIT case and also adds an actual test case. DwarfContext was basically already usable for the JIT (and back when we were overwriting ELF files it actually worked out of the box by accident), but in order to resolve relocations correctly it needs to know the load address of the section. Rather than trying to get this out of the ObjectFile or requiring the user to create a new ObjectFile just to get some debug info, this adds the capability to pass in that info directly. As part of this I separated out part of the LoadedObjectInfo struct from RuntimeDyld, since it is now required add a higher layer. For now I put that in ObjectFile.h, but I would be happy to receive suggestions for more appropriate places.
Details
Diff Detail
Event Timeline
include/llvm/Object/ELFObjectFile.h | ||
---|---|---|
595 | Not directly, but It is valid to write .long .text and it is represented as a relocation to a symbol, and the symbol has type STT_SECTION. Would it be desirable to check for that here? If not, please expand the comment a bit. | |
include/llvm/Object/MachO.h | ||
249 | Please return an ErrorOr<uint8_t>. | |
include/llvm/Object/RelocVisitor.h | ||
278 ↗ | (On Diff #18122) | As you can see it is way too easy to ignore a std::error_code return :-) |
lib/Object/COFFObjectFile.cpp | ||
1047 | Someone familiar with COFF should check this. | |
lib/Object/MachOObjectFile.cpp | ||
1045 | Actually, if this is never going to fail, just return a uint8_t. |
Hi Keno,
Is the plan here just to get debugging working for MachO objects?
My intention was that people wanting to interface between the debugger and the JIT should use LoadedObjectInfo::getObjectForDebug(), and that should return a new file that's appropriate for parsing with DWARFContext.
For ELF files, LoadedELFObjectInfo::getObjectForDebug() is doing the same thing that RuntimeDyldELF used to and overwriting the vmaddrs in the ObjectFile, but it's duplicating the ObjectFile first to ensure that that's safe.
For MachO files my plan was to do the same thing, using your old patch for MachO debugging. We'd just need to hook that up in LoadedMachOObjectInfo.
- Lang.
Hi Lang,
No, this is not for the debugger. This is for the client itself to be able to parse the DWARF information. This is the last piece I need to put MCJIT back on par functionality wise with the old jit (was was able to give you line table information).
Keno
Sorry - that was a bit vague of me. I should have said that I intend for getObjectForDebug to work with DWARFContext, not just the debugger.
Is creating a new object file too much overhead? We haven't gone this far in LoadedELFObjectFile yet, but rather than duplicating the original object file, I think it should be reasonable to build a new object file that just contains a symbol table and load commands.
I'm not hugely opposed to the direction of this patch. It's actually very close to what I pitched to echristo when I first cleaned up RuntimeDyld's memory ownership model. At the time though, he convinced me that it would be unpleasant to thread the optional "object interpretation metadata" through the interfaces, hence the decision to just form a new self-contained object file.
Well, I see three reasons for this direction over the other:
- We can avoid relocation the debug sections. And in fact, since all this is done lazily, we only ever need to process the sections that we actually need (quite a win from a memory and performance standpoint).
- I was never really happy with the way I ended up solving the debug object problem for MachO. A lot of things in MachO refer to the original layout of the object file and fixing them all up is a pain. It can be done, but it always feels like a hack to me. In contrast, other than the admittedly ugly need to pass the LoadedObjectInfo to the DwarfContext, this seems rather sane to me.
- I think it improves reusability of the code. I think at some point the whole DWARF library will need to be split into a separate library (it's already duplicated between LLVM and lldb) and in other contexts people might not like to create such a debug object. Even with this patch, I imagine it might be quite possible to create a LoadedObjectInfo subclass for a different JIT and use the DWARF parser with it (admittedly this is probably a minor point).
Also note that this patch isn't actually that big. It's a little inflated by adding test cases, exposing this functionality through llvm-rtdyld and adding the missing ObjectFile methods for handling MachO relocations.
Addresses @rafael's review comments. Added implementations of getRelocationSection for ELF and COFF, which aren't strictly necessary, since it's equivalent on ELF/COFF to get the symbols' section, which is what we do anyway. Nevertheless, better to have the API complete in case somebody wants to use it for a different use case.
Your patch seems to touch a few different and pretty independent parts on the compiler/tools. It looks like this could be split up quite a bit. Just commenting on the libDebugInfo code:
include/llvm/Object/RelocVisitor.h | ||
---|---|---|
228–240 ↗ | (On Diff #18303) | I'd suggest to submit that in a separate patch too. It can be tested quite simply by verifying that llvm-dwarfdump doesn't emit any relocation processing "errors" anymore on x86_64 macho object files. |
lib/DebugInfo/DWARFContext.cpp | ||
557–559 ↗ | (On Diff #18303) | The { should be on the same line as the if. If you don't use the LoadAddress variable out of the if, you can move its declaration inside the condition. |
561 ↗ | (On Diff #18303) | Wouldn't it be cleaner to have a method inside the LoadedObjectInfo that directly returns a StringRef (or ArrayRef) to the section data? |
633–637 ↗ | (On Diff #18303) | No need for the variable, just put the condition in the if. We only care about debug sections here, can this ever be true for the debug sections? Will the JIT load them and relocate them? |
680–701 ↗ | (On Diff #18303) | You introduce section lookups by name for every relocation. Isn't there a more efficient interface? Also, I think I would prefer the code for the LoadedObjectInfo case to be in its own if() rather than mixed with the standard relocation processing (So that it's more self-evident that it's only adding a new code path). |
lib/DebugInfo/DWARFDebugLine.cpp | ||
608–610 ↗ | (On Diff #18303) | This part of the change should be submitted separately with appropriate test coverage. Not suggesting to do it for this patch, but looking more globally, I wonder if this function should really be returning results that span multiple sequences or if we shouldn't simply modify it to only look in the first sequence containing the start address. |
I'll split up the patch for further in depth review, but I wanted to have a chance to have everybody see how everything plays together. In particular, I'd like @echristo to comment whether the general direction of this patch seems reasonable. If that's the case, I'll split the patch into three parts (The debug info changes, the LoadedObjectInfo integration with DwarfContext and the Darwin support for RelocVisitor).
lib/DebugInfo/DWARFContext.cpp | ||
---|---|---|
561 ↗ | (On Diff #18303) | That's a good idea. |
633–637 ↗ | (On Diff #18303) | Yes, the JIT will load and relocate debug info in some cases. |
680–701 ↗ | (On Diff #18303) | Currently there's only the lookup by name. I'm not sure if the RuntimeDyld SectionIDs have any relation to the section order in the object file. @lhames? |
lib/DebugInfo/DWARFDebugLine.cpp | ||
608–610 ↗ | (On Diff #18303) | Will do. I had just noticed it while testing this. |
This rebases the current diff and splits out the MachO RelocVisitor changes into http://reviews.llvm.org/D8148.
Could somebody please take a look at the RelocVisitor part of this? This patch shouldn't break anything and it's very useful when working with MCJIT.
Added some comments, looks like bits overlap with the other patch. Some more inline comments as well. I'll take another look after you get this stuff.
-eric
include/llvm/DebugInfo/DIContext.h | ||
---|---|---|
147–156 ↗ | (On Diff #25369) | Needs doxygen docs for the functions. |
lib/DebugInfo/DWARF/DWARFContext.cpp | ||
557–562 | Style. | |
679–704 | Can you comment this out? It looks a little odd and it's hard to get straight. |
Updated with more comments. I also flattened the if nest. Hopefully that should be easier to understand now.
Few more inline comments.
include/llvm/DebugInfo/DIContext.h | ||
---|---|---|
147 ↗ | (On Diff #26269) | Sentences end with a period. Also we have \brief support turned on so you don't need to use \brief anymore. (And in the other places :) |
173 ↗ | (On Diff #26269) | clone is entirely unused. |
lib/DebugInfo/DWARF/DWARFContext.cpp | ||
685–687 | LLVM style is that single statement blocks don't get braces if possible. | |
700–702 | Ditto. | |
tools/llvm-rtdyld/llvm-rtdyld.cpp | ||
261–269 | Explanatory comment here. (It looks obvious, but I like them in general. :) |
include/llvm/DebugInfo/DIContext.h | ||
---|---|---|
147 ↗ | (On Diff #26269) | Will fix. |
173 ↗ | (On Diff #26269) | It's supposed to be available for the client to be able to obtain a copy of the LoadedObjectInfo, since the one it gets from MCJIT is passed by reference. We could probably change that one to a shared_ptr, but I didn't want to put that API breakage into the same commit. |
include/llvm/DebugInfo/DIContext.h | ||
---|---|---|
173 ↗ | (On Diff #26269) | I guess, it might be nicer to make this clear. I'm not a fan of requiring people to clone it etc. Maybe I'm just not seeing where it's going to be used off the top of my head, and can be added in the future if we have a user of the functionality? |
include/llvm/DebugInfo/DIContext.h | ||
---|---|---|
173 ↗ | (On Diff #26269) | I added this because I need it in Julia. The MCJIT listener interface is NotifyObjectEmitted(const ObjectFile &Object, const RuntimeDyld::LoadedObjectInfo &L) but I need to store the LoadedObjectInfo for later (to query the DWARFContext), so I added the clone method as a workaround. Better alternatives would be most appreciated of course. |
include/llvm/DebugInfo/DIContext.h | ||
---|---|---|
173 ↗ | (On Diff #26269) | Not my favorite set of interfaces, no, but I don't have anything else in mind at the moment and hopefully someone else can come up with something better. |
include/llvm/DebugInfo/DIContext.h | ||
---|---|---|
173 ↗ | (On Diff #26269) | This is a bit problematic as the code is entirely untested... For example - I've fixed the API to use unique_ptr, which may break your usage & I have little idea if my change is correct. Please add some kind of test coverage in-tree or it'd be best to remove this API. |
This change introduced getLoadedSectionContents(), but it is not used anywhere in LLVM code I think. Do we need it ?
Not directly, but It is valid to write
.long .text
and it is represented as a relocation to a symbol, and the symbol has type STT_SECTION.
Would it be desirable to check for that here? If not, please expand the comment a bit.