Make Module::GetSectionList output consistent
AbandonedPublic

Authored by labath on Feb 6 2018, 4:50 AM.

Details

Summary

The result of Module::GetSectionList depended on whether the symbol
vendor has been loaded (which would augment the section list with the
extra sections that have been found by the vendor).

Although this bit was documented in the function header, this makes a
quirky api, as other Module functions have no requirement to load the
vendor explicitly -- instead they will do it on demand. In practice,
what this meant is that the user would nearly always get the augmented
section list (because by the time he requested it, it's likely something
would have loaded the vendor already), *except* if all that he was doing
was loading a module and immediately dumping out the sections like
lldb-test does.

labath created this revision.Feb 6 2018, 4:50 AM
labath added inline comments.Feb 6 2018, 4:53 AM
tools/lldb-test/lldb-test.cpp
75

The expliciting symbol filespec setting was short-circuiting the regular search process. I removed it as there was no test depending on this yet.

clayborg added inline comments.Feb 12 2018, 7:32 AM
source/Core/Module.cpp
1289

should we pass "obj_file" down into the SymbolVendor::CreateSections(...) call for the case where the symbol vendor is using the same ObjectFile that it can just return?

labath added inline comments.Feb 14 2018, 5:49 AM
source/Core/Module.cpp
1289

I don't think an extra parameter is needed to achieve this, as this is something that the symbol vendor should already know about based on how it was constructed.

And after looking up the implementation, it seems this is already how it works right now: SymbolVendorELF (the only non-trivial implementation of this function) is only constructed if it manages to find debug info in an alternative symbol file (there's an if (llvm::sys::fs::equivalent(candidate_file_spec.GetPath(), module_file_spec.GetPath())) check in Symbols::LocateExecutableSymbolFile). If we fail to construct a SymbolVendorELF then a default SymbolVendor instance is created in SymbolVendor::FindPlugin (and that one has a no-op implementation of this function).

Greg, with my last comment in mind, how do you feel about this patch?

davide accepted this revision.Feb 21 2018, 9:25 AM

LGTM.

source/Core/Module.cpp
1289

I agree with Pavel's reasoning here.

This revision is now accepted and ready to land.Feb 21 2018, 9:25 AM

Back from vacation, sorry for the delay. Only one question on who is actually responsible for creating the sections. I vote to let SymbolVendor::CreateSection() always do the work. Let me know what you think.

source/Core/Module.cpp
1287–1288

If we are going to let the symbol vendor do it's thing, then we should probably not manually call ObjectFile::CreateSections. The idea behind SymbolVendor is that _it_ knows how best to do things for one or more object files. Can we move all logic for parsing sections into the SymbolVendor::CreateSections()?

labath added inline comments.Feb 22 2018, 10:06 AM
source/Core/Module.cpp
1287–1288

I like that idea. I'm going to take a look to see how that could be done.

labath updated this revision to Diff 135971.Feb 26 2018, 2:01 PM

This version moves the logic for the building of unified section list into the
symbol vendor plugin.

The idea was to keep the construction of section lists of individual object
files inside the ObjectFile object, and then have the SymbolVendor do the
merging. This was easy to achieve for the SymbolVendorELF, and I'm quite happy
about how things look like there.

However, I had a lot of trouble with making this work for the MachO
objectfile+symbolvendor combo, because there the section building is
intertwined with the load command parsing and the building of the unified
sections list. In particular, the parse function seems to be doing some dodgy
things like taking a section from the "unified" list and inserting it into
per-object-file list. So, here I had to leave the building of the unified list
in the ObjectFile class, and the SymbolVendor just orchestrates the functions
to be called in the right order (for this I needed to cast the ObjectFile into
ObjectFileMachO, which seemingly adds a new dependency to the symbol vendor,
but this is not really true, as the vendor was already checking the type of the
object file via the GetPluginName() function). It would be great if someone
with more MachO knowledge could look at the CreateSections function and split
it up into more manageable chunks.

The other sub-optimal design decision I had to make was due to the mutual
recursion of the Module, SymbolVendor and SymbolFile classes: SymbolVendor
creates a SymbolFile class, which expect that the Module's section list is
already constructed, but SymbolVendor is the one constructing the list. This
means that I couldn't just construct the SymbolVendor and have the Module ask
it for the section list, but I had to make the vendor directly update the
section list within the module (before constructing the symbolfile class).

On the plus side, I really like that I was able to remove the
"update_module_section_list" argument from the ObjectFile::GetSectionList
function, which was very dodgy as well, because whether the module's section
list was updated, depended on the value of the argument of the *first* call of
that function.

All in all, I think patch cleans up the unified section list handling, but only
by a small bit. Let me know what you think.

labath updated this revision to Diff 135975.Feb 26 2018, 2:07 PM

Remove the changes to source/Plugins/SymbolVendor/CMakeLists.txt that snuck in
(I was experimenting with enabling the plugin on non-mac systems, but realized
that's not possible right now).

labath requested review of this revision.Mar 2 2018, 1:04 PM

Explicitly requesting re-review to make sure this is on your radar.

We have a new patch that <D44041> would benefit from this, but if this is not ready to go in yet (it's a turned into a fairly big refactor), I'll probably add a workaround to lldb-test to make that patch testable.

clayborg accepted this revision.Mar 2 2018, 2:24 PM

Looks fine, a bit hard to tell exactly what is going on so I will accept as long as the following things are still true:

  • each lldb_private::ObjectFile has its own section list that perfectly mirrors exactly what is in that object file and that file alone
  • The lldb_private::Module hands out a unified section list that is populated by the symbol vendor where it uses one or more object files to create the unified section list
This revision is now accepted and ready to land.Mar 2 2018, 2:24 PM
labath added a comment.Mar 2 2018, 2:46 PM

I can try to split the patch up a bit if you think it will make things easier. I'm not sure how much I will be able to do that, as the patch is not that big, it just needs to touch a bunch of files for the changed interfaces.

Looks fine, a bit hard to tell exactly what is going on so I will accept as long as the following things are still true:

  • each lldb_private::ObjectFile has its own section list that perfectly mirrors exactly what is in that object file and that file alone

This part is not completely true. It is true for ObjectFileELF/COFF/JIT, and is enforced by the fact that the ObjectFile::GetSections does not even have access to the unified section list. However, it is *not* true for ObjectFileMachO (but that is not because of this patch). This is the problem I've had when trying to refactor this (and it's the reason the ObjectFileMachO<->SymbolVendorMacOSX interface is a bit weird). It seems that the ObjectFileMachO sometimes takes a section from the unified list and adds it to it's own list (or at least it appears to be doing that). I don't know enough about MachO to understand why is it doing that, or how to fix it. I've highlighted the places in the code where I think this is happening. Any suggestions would be welcome here.

  • The lldb_private::Module hands out a unified section list that is populated by the symbol vendor where it uses one or more object files to create the unified section list

This part is true.

source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
1415–1416

Here we are grabbing a section from the unified section list.

1562

And here it gets added to the object file's section list.

I can try to split the patch up a bit if you think it will make things easier. I'm not sure how much I will be able to do that, as the patch is not that big, it just needs to touch a bunch of files for the changed interfaces.

Looks fine, a bit hard to tell exactly what is going on so I will accept as long as the following things are still true:

  • each lldb_private::ObjectFile has its own section list that perfectly mirrors exactly what is in that object file and that file alone

This part is not completely true. It is true for ObjectFileELF/COFF/JIT, and is enforced by the fact that the ObjectFile::GetSections does not even have access to the unified section list. However, it is *not* true for ObjectFileMachO (but that is not because of this patch). This is the problem I've had when trying to refactor this (and it's the reason the ObjectFileMachO<->SymbolVendorMacOSX interface is a bit weird). It seems that the ObjectFileMachO sometimes takes a section from the unified list and adds it to it's own list (or at least it appears to be doing that). I don't know enough about MachO to understand why is it doing that, or how to fix it. I've highlighted the places in the code where I think this is happening. Any suggestions would be welcome here.

The dSYM file is a mach-o file that contains symbols only, It is because the dSYM file (stand alone debug info file) has all of the section definitions from the main executable, but no section content for everything except the DWARF debug info. The DWARF only exists in the dSYM file. The dSYM file also has all of the symbols as it gets made before the executable is stripped. Many symbols refer to a section by section index, so that is why we must have the section definitions in the dSYM file. So the dSYM wants the sections from the main executable so it can access text and data if needed since symbol refer to these and someone might ask a symbol from a dSYM file what its instructions are. So the dSYM uses the real sections from the main executable file instead of its own.

  • The lldb_private::Module hands out a unified section list that is populated by the symbol vendor where it uses one or more object files to create the unified section list

This part is true.

labath added a comment.Mar 2 2018, 3:39 PM

The dSYM file is a mach-o file that contains symbols only, It is because the dSYM file (stand alone debug info file) has all of the section definitions from the main executable, but no section content for everything except the DWARF debug info. The DWARF only exists in the dSYM file. The dSYM file also has all of the symbols as it gets made before the executable is stripped. Many symbols refer to a section by section index, so that is why we must have the section definitions in the dSYM file. So the dSYM wants the sections from the main executable so it can access text and data if needed since symbol refer to these and someone might ask a symbol from a dSYM file what its instructions are. So the dSYM uses the real sections from the main executable file instead of its own.

I see, thanks for the explanation. The code makes more sense now.

But now I have another question. Is this an intentional design decision that we want to preserve; or just how things happen to work now, and should be cleaned up in the future (to achieve the "ObjectFile contains only own sections" invariant)?

I'm asking this, because if it's the former then this patch probably needs to be redone, as it tries to enforce that invariant (and then needs to jump through hoops to enable the ObjectFileMachO use case). OTOH, if it's the latter, then I think this is a step in the right direction, as it makes it obvious that ObjectFileMachO is doing something that it shouldn't (and at the same time it makes it harder for other ObjectFiles to do the same).

BTW, the same situation (symbol file containing empty section definitions for code&data) can happen on elf targets as well (if the symbol file is produced with strip --only-keep-debug), but everything seems to be working fine, presumably because all consumers are accessing the sections through the unified section list.

The dSYM file is a mach-o file that contains symbols only, It is because the dSYM file (stand alone debug info file) has all of the section definitions from the main executable, but no section content for everything except the DWARF debug info. The DWARF only exists in the dSYM file. The dSYM file also has all of the symbols as it gets made before the executable is stripped. Many symbols refer to a section by section index, so that is why we must have the section definitions in the dSYM file. So the dSYM wants the sections from the main executable so it can access text and data if needed since symbol refer to these and someone might ask a symbol from a dSYM file what its instructions are. So the dSYM uses the real sections from the main executable file instead of its own.

I see, thanks for the explanation. The code makes more sense now.

But now I have another question. Is this an intentional design decision that we want to preserve; or just how things happen to work now, and should be cleaned up in the future (to achieve the "ObjectFile contains only own sections" invariant)?

Intentional, but it is up to the SymbolVendor subclass to do all of this. We want the data from the dSYM to be useful and since the dSYM really is a companion debug info file, it needs info from elsewhere to be complete.

I'm asking this, because if it's the former then this patch probably needs to be redone, as it tries to enforce that invariant (and then needs to jump through hoops to enable the ObjectFileMachO use case). OTOH, if it's the latter, then I think this is a step in the right direction, as it makes it obvious that ObjectFileMachO is doing something that it shouldn't (and at the same time it makes it harder for other ObjectFiles to do the same).

It should be doing it. Again, the symbol vendor is the one that is trying to make everything fit together and "just work".

BTW, the same situation (symbol file containing empty section definitions for code&data) can happen on elf targets as well (if the symbol file is produced with strip --only-keep-debug), but everything seems to be working fine, presumably because all consumers are accessing the sections through the unified section list.

Exactly. I see any good debug info format not needing to duplicate things that might be in another file. The symbol vendor under the covers, creates a unified experience for all consumers of the Module that contains the symbol vendor, so it is something that we want to encourage and allow. So ObjectFileMachO isn't doing anything wrong. Feel free to update the design if needed?

AS for the ELF example where only debug info is around, it might not be running into issues if it doesn't contain a symbol table. If it does contain a symbol table, hopefully it is using the unified section table, otherwise if might have symbol's whose addresses have sections from the stripped ELF file and it might not have the section contents that it needs to back it. In that case we might fail to disassemble the symbol's code.

labath added a comment.Mar 2 2018, 4:38 PM

AS for the ELF example where only debug info is around, it might not be running into issues if it doesn't contain a symbol table. If it does contain a symbol table, hopefully it is using the unified section table, otherwise if might have symbol's whose addresses have sections from the stripped ELF file and it might not have the section contents that it needs to back it. In that case we might fail to disassemble the symbol's code.

I made a trivial experiment:

g++ a.cc
objcopy a.out --only-keep-debug a.sym #a.sym contains a symtab
strip a.out # a.out does not contain a symtab
(lldb) target create "a.out"
Current executable set to 'a.out' (x86_64).
(lldb) b main
Breakpoint 1: no locations (pending).
WARNING:  Unable to resolve breakpoint to any actual locations.
(lldb) target symbols add a.sym
symbol file '/tmp/X/a.sym' has been added to '/tmp/X/a.out'
1 location added to breakpoint 1
#symbol resolution seems fine
(lldb) disassemble -n main
a.out`main:
a.out[0x69b] <+0>:  pushq  %rbp
a.out[0x69c] <+1>:  movq   %rsp, %rbp
a.out[0x69f] <+4>:  callq  0x690                     ; foo()
a.out[0x6a4] <+9>:  addl   $0x2a, %eax
a.out[0x6a7] <+12>: popq   %rbp
a.out[0x6a8] <+13>: retq
# so does disassembling

The part that is not clear to me is, if the dsym object file should absorb the sections from the main object file, then what is the purpose of the "unified section list" in the module? I can see why we need a unified list, and I think it's a good idea, but having an ObjectFile containing the merged list seems to defeat it. From a separation of concerns POV, it would be much clearer if each ObjectFile contained only it's own sections, and then some other entity (Module) held the combined data(sections) of the individual object files. Then, most clients should operate on the unified view of the module, but if anyone ever had a reason, he could always look directly at a specific ObjectFile, and see what's contained there.

Among other things, this could be very useful for lldb-server. lldb-server needs only lightweight access to the data in the object file -- it does not need the Module class and everything it pulls in (SymbolVendor, SymbolFile, ...). If we could make the ObjectFile class completely standalone, we can avoid pulling all this stuff in and make the code more modular. It would also help with the migration to llvm's Object parsing library, as it knows nothing about unified sections. If we had a standalone ObjectFile implementation, then it would be easy to swap it out for llvm's and still keep the "section unification" code intact as it would be in a different layer.

So, could this be a direction we can move in? I can put this patch on ice and try to make the ObjectFileMachO standalone instead. I don't think it should be that hard, given that this is how things already work for elf. I'm hoping it will be a matter of changing some consumers to use module->GetSectionList() instead of objfile->GetSectionList()...

labath abandoned this revision.Mar 3 2018, 1:52 PM

I don't think this approach is good while ObjectFileMachO is messing with sections of other object files. At this point I think, I've actually learned enough about MachO to understand what the code is doing there, so I'm going to see if I can make it do what I want.

AS for the ELF example where only debug info is around, it might not be running into issues if it doesn't contain a symbol table. If it does contain a symbol table, hopefully it is using the unified section table, otherwise if might have symbol's whose addresses have sections from the stripped ELF file and it might not have the section contents that it needs to back it. In that case we might fail to disassemble the symbol's code.

I made a trivial experiment:

g++ a.cc
objcopy a.out --only-keep-debug a.sym #a.sym contains a symtab
strip a.out # a.out does not contain a symtab
(lldb) target create "a.out"
Current executable set to 'a.out' (x86_64).
(lldb) b main
Breakpoint 1: no locations (pending).
WARNING:  Unable to resolve breakpoint to any actual locations.
(lldb) target symbols add a.sym
symbol file '/tmp/X/a.sym' has been added to '/tmp/X/a.out'
1 location added to breakpoint 1
#symbol resolution seems fine
(lldb) disassemble -n main
a.out`main:
a.out[0x69b] <+0>:  pushq  %rbp
a.out[0x69c] <+1>:  movq   %rsp, %rbp
a.out[0x69f] <+4>:  callq  0x690                     ; foo()
a.out[0x6a4] <+9>:  addl   $0x2a, %eax
a.out[0x6a7] <+12>: popq   %rbp
a.out[0x6a8] <+13>: retq
# so does disassembling

The part that is not clear to me is, if the dsym object file should absorb the sections from the main object file, then what is the purpose of the "unified section list" in the module? I can see why we need a unified list, and I think it's a good idea, but having an ObjectFile containing the merged list seems to defeat it. From a separation of concerns POV, it would be much clearer if each ObjectFile contained only it's own sections, and then some other entity (Module) held the combined data(sections) of the individual object files. Then, most clients should operate on the unified view of the module, but if anyone ever had a reason, he could always look directly at a specific ObjectFile, and see what's contained there.

The issue is the object file must create symbols that uses the section from the unified section list. The easiest way to accomplish that was to have the object file just use the right section and not have to worry about any other code having to know anything about the relationship between a debuig info file and the main executable file.

Among other things, this could be very useful for lldb-server. lldb-server needs only lightweight access to the data in the object file -- it does not need the Module class and everything it pulls in (SymbolVendor, SymbolFile, ...). If we could make the ObjectFile class completely standalone, we can avoid pulling all this stuff in and make the code more modular. It would also help with the migration to llvm's Object parsing library, as it knows nothing about unified sections. If we had a standalone ObjectFile implementation, then it would be easy to swap it out for llvm's and still keep the "section unification" code intact as it would be in a different layer.

So, could this be a direction we can move in? I can put this patch on ice and try to make the ObjectFileMachO standalone instead. I don't think it should be that hard, given that this is how things already work for elf. I'm hoping it will be a matter of changing some consumers to use module->GetSectionList() instead of objfile->GetSectionList()...

Again, we need any objects coming out of the ObjectFile to have the correct sections. Not sure how we would accomplish that with a solution where the dSYM file uses it own useless sections.

labath added a comment.Mar 6 2018, 5:51 AM

Again, we need any objects coming out of the ObjectFile to have the correct sections. Not sure how we would accomplish that with a solution where the dSYM file uses it own useless sections.

I think we have different definitions of "correct" sections. I agree that the nulled out sections in the dSYM file are (nearly) useless, but I think they are also correct, because that's exactly what's in that object file (and I think a single ObjectFile instance should represent information from a single object file -- nothing more, nothing less). If we need (and we do need) to have a view of combined information from multiple object files, then there should be a separate entity that does that (I think that's what the unified section list is supposed to be). Then, anything that needs to operate on the unified view should be taught to operate on the unified view and not on the individual object files.

Now, I have no idea how hard is it to achieve this, but the fact that this is already how things work for elf gives me hope that it will not be that hard.

Btw, do you know how much I can rely on the test suite to catch any issues here? I've tried simply commenting out the line where we add the section from the unified list into the dsym file, and all tests still passed. Is there any particular scenario I should look at in more detail?