Add support for type units (.debug_types) to LLDB in a way that is compatible with DWARF 5
Needs Review · Public

Authored by clayborg on Apr 18 2017, 8:25 AM.

Details

Summary

This patch adds support for type units and for parsing the .debug_types section, which allows LLDB to debug binaries built with "-gdwarf-4 -fdebug-types-section". This patch takes into account how DWARF 5 will represent type units: all unit headers will contain extra fields, and type units will live in the .debug_info section.

Normally this change would be very invasive because we would need a way for DIEs in the .debug_types section to have unique IDs (lldb::user_id_t). This patch takes advantage of the fact that debug info in the .debug_types section can't point to any other DWARF directly: there are only CU-relative references and type signatures that might point to other debug info, and those are indirected through the type signature. This means the offsets of the DIEs in the .debug_types section are not important. Given all of this, these are the important takeaways from this patch:

  • Modify DWARFCompileUnit to contain extra fields: type signature and type offset. This is how DWARF 5 does it, so we will need to do this anyway in the near future.
  • Slide the offsets of all DIEs in .debug_types by the size of the .debug_info section. This gives every DIE a unique offset and allows this patch to "just work" with the existing DWARF parser.
  • Anyone extracting attribute values must use DWARFDIE::GetData() or DWARFCompileUnit::GetData() to make sure they are extracting from the right data (.debug_info or .debug_types).
  • LLDB testing adds a new "_dwarf_debug_types" variant (like the existing _dwarf, _dwarf_dsym, and _dwo variants) on platforms that support .debug_types.

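To make the offset-sliding bullet concrete, the scheme can be sketched like this (illustrative Python with made-up names, not the actual LLDB code):

```python
# Sketch of the offset-sliding scheme: DIE offsets in .debug_types are
# shifted by the size of .debug_info, so every DIE gets a unique offset
# in one flat numbering space. Names here are illustrative only.

def unified_die_offset(section_offset, in_debug_types, debug_info_size):
    """Map a section-relative DIE offset into the unified offset space."""
    if in_debug_types:
        return debug_info_size + section_offset
    return section_offset

def section_relative_offset(unified_offset, debug_info_size):
    """Recover (section, section-relative offset) from a unified offset."""
    if unified_offset >= debug_info_size:
        return (".debug_types", unified_offset - debug_info_size)
    return (".debug_info", unified_offset)
```

Because the unified offsets are unique, an lldb::user_id_t derived from them stays unique across both sections, which is what lets the existing parser work unchanged.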
I would love to get this in. Anything is better than what LLDB does now when it debugs binaries with .debug_types: crash a lot. So we should get this in and iterate on it.

Diff Detail

davide added inline comments. Feb 26 2018, 9:52 AM
source/Plugins/SymbolFile/DWARF/DIERef.cpp
51–68

Why? This needs to be explained in a comment.

source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
66

any reason why you can't use LLVM adt?

This revision now requires changes to proceed. Feb 26 2018, 9:52 AM

This commit has no tests. It should have many.

It has full testing. Please read the commit and notice there is a new debug info flavor that is added, so of course it is tested.

It's very big, so it could be split in several pieces.

This is one thing that goes together. It isn't that big. It adds support for .debug_types into the ObjectFile and parses the information. How would you propose breaking this up?

I'd really appreciate if you can take the time to do so. For now, some comments.

I believe I did take the time to test this properly. Let me know if you still think otherwise.

packages/Python/lldbsuite/test/lldbtest.py
704

It is removing spurious whitespace, and there are other changes in this file. Most editors these days will remove spurious spaces, and we should strive to remove them where possible, no?

1765–1766

Again, removing spurious whitespaces?

packages/Python/lldbsuite/test/make/Makefile.rules
197

Ok, I can remove this.

packages/Python/lldbsuite/test/plugins/builder_base.py
204

All other "def buildXXX" functions do this, so I am following the convention.

packages/Python/lldbsuite/test/test_categories.py
27

So what is the suggestion here?

source/Plugins/SymbolFile/DWARF/DIERef.cpp
51–68

Will do

source/Plugins/SymbolFile/DWARF/DWARFDataExtractor.h
51

It doesn't. I checked the docs that show up in the doxygen in http://lldb.llvm.org/cpp_reference/html/index.html and these trailing comments don't hose it up.

source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
121

I am copying the first loop.

source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
66

No reason. IMHO STL is fine unless there is a real reason llvm adt is better?

So, I tried this out on my machine, and got just one failure:

======================================================================
FAIL: test_with_run_command_dwarf_type_units (TestInlinedBreakpoints.InlinedBreakpointsTestCase)
   Test 'b basic_types.cpp:176' does break (where int.cpp includes basic_type.cpp).
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1827, in dwarf_type_units_test_method
    return attrvalue(self)
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/functionalities/breakpoint/inlined_breakpoints/TestInlinedBreakpoints.py", line 24, in test_with_run_command
    self.inlined_breakpoints()
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/functionalities/breakpoint/inlined_breakpoints/TestInlinedBreakpoints.py", line 46, in inlined_breakpoints
    self, "basic_type.cpp", self.line, num_expected_locations=0)
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/lldbutil.py", line 376, in run_break_set_by_file_and_line
    num_locations=num_expected_locations)
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/lldbutil.py", line 572, in check_breakpoint_result
    out_num_locations))
AssertionError: False is not True : Expecting 0 locations, got 1.

However, I am not so sure about the proliferation of debug info variants that we seem to be having. Right now we have two outstanding patches wanting to add a debug variant, which would bring the total number of variants per test to 6. I'm not sure this is a tractable way forward.
IIUC, clang only puts "non-trivial" types in type units. I'm not sure how many of our tests even define classes/structs (i.e., for how many tests this debug info variant would actually be testing something new).

source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.h
49

It seems weird to have an IsTypeUnit method on a compile unit, when a dwarf compile unit is clearly not a type unit. Would some of the refactors that Jan is doing (I haven't looked at those too closely as it's not really my area, but I know he is adding a DwarfUnit class) help with this?

emaste added inline comments. Feb 26 2018, 10:36 AM
packages/Python/lldbsuite/test/make/Makefile.rules
197

FreeBSD still defaults to DWARF2 at present.

So, I tried this out on my machine, and got just one failure:

======================================================================
FAIL: test_with_run_command_dwarf_type_units (TestInlinedBreakpoints.InlinedBreakpointsTestCase)
   Test 'b basic_types.cpp:176' does break (where int.cpp includes basic_type.cpp).
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1827, in dwarf_type_units_test_method
    return attrvalue(self)
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/functionalities/breakpoint/inlined_breakpoints/TestInlinedBreakpoints.py", line 24, in test_with_run_command
    self.inlined_breakpoints()
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/functionalities/breakpoint/inlined_breakpoints/TestInlinedBreakpoints.py", line 46, in inlined_breakpoints
    self, "basic_type.cpp", self.line, num_expected_locations=0)
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/lldbutil.py", line 376, in run_break_set_by_file_and_line
    num_locations=num_expected_locations)
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/lldbutil.py", line 572, in check_breakpoint_result
    out_num_locations))
AssertionError: False is not True : Expecting 0 locations, got 1.

I saw this too. When I ran _any_ of the variants manually, they all failed to set any breakpoints. Not sure this test is testing what it claims to be testing.

However, I am not so sure about the proliferation of debug info variants that we seem to be having. Right now we have two outstanding patches wanting to add a debug variant, which would bring the total number of variants per test to 6. I'm not sure this is a tractable way forward.
IIUC, clang only puts "non-trivial" types in type units. I'm not sure how many of our tests even define classes/structs (i.e., for how many tests this debug info variant would actually be testing something new).

We could check, after the build, if the binary for the test even has a .debug_types section, and if not, abort? That way we don't have to mark up the test in any way (like "this test has structures or enums"), but we could ensure we test with .debug_types when needed.

So, I tried this out on my machine, and got just one failure:

======================================================================
FAIL: test_with_run_command_dwarf_type_units (TestInlinedBreakpoints.InlinedBreakpointsTestCase)
   Test 'b basic_types.cpp:176' does break (where int.cpp includes basic_type.cpp).
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1827, in dwarf_type_units_test_method
    return attrvalue(self)
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/functionalities/breakpoint/inlined_breakpoints/TestInlinedBreakpoints.py", line 24, in test_with_run_command
    self.inlined_breakpoints()
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/functionalities/breakpoint/inlined_breakpoints/TestInlinedBreakpoints.py", line 46, in inlined_breakpoints
    self, "basic_type.cpp", self.line, num_expected_locations=0)
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/lldbutil.py", line 376, in run_break_set_by_file_and_line
    num_locations=num_expected_locations)
  File "/usr/local/google/home/labath/ll/lldb/packages/Python/lldbsuite/test/lldbutil.py", line 572, in check_breakpoint_result
    out_num_locations))
AssertionError: False is not True : Expecting 0 locations, got 1.

I saw this too. When I ran _any_ of the variants manually, they all failed to set any breakpoints. Not sure this test is testing what it claims to be testing.

Actually, I can repro this and will look into it.

However, I am not so sure about the proliferation of debug info variants that we seem to be having. Right now we have two outstanding patches wanting to add a debug variant, which would bring the total number of variants per test to 6. I'm not sure this is a tractable way forward.
IIUC, clang only puts "non-trivial" types in type units. I'm not sure how many of our tests even define classes/structs (i.e., for how many tests this debug info variant would actually be testing something new).

We could check, after the build, if the binary for the test even has a .debug_types section, and if not, abort? That way we don't have to mark up the test in any way (like "this test has structures or enums"), but we could ensure we test with .debug_types when needed.

That would help somewhat, but that's still a very "shotgun" approach to testing this. What I would try to do (and I think this is what Davide had in mind as well) is to identify the cases that are interesting from the type unit perspective (a single class in a type unit, two classes in type units referencing each other, a class referencing an enum, etc. (I'm not really sure about everything that goes into a type unit)) and then write specific tests for that (or reuse existing ones). For example, plenty of tests do breakpoint setting or unwinding, and those probably don't have to be tested separately with type units (or maybe one test that checks that setting a breakpoint on a method in a type unit is sufficient).

Right now, which tests run all variants represents more history than policy. Running all variants is the default and opting out is manual; while I at least kick out obvious tests that don't really need to run all variants when I touch them, I don't think we have done a careful audit. I think the general idea of running variants is good, but we really should reduce the number of tests we run this way to the ones that actually test things that might break that way.

Note, I have (though very very occasionally) seen a dsymutil bug that broke line tables. So it would be good to keep a few "set a breakpoint and run to it" tests just to sanity check this. But most tests that just run somewhere and print a backtrace or view an integer variable or such-like do not need to run all variants.

However, I am not so sure about the proliferation of debug info variants that we seem to be having. Right now we have two outstanding patches wanting to add a debug variant, which would bring the total number of variants per test to 6. I'm not sure this is a tractable way forward.
IIUC, clang only puts "non-trivial" types in type units. I'm not sure how many of our tests even define classes/structs (i.e., for how many tests this debug info variant would actually be testing something new).

We could check, after the build, if the binary for the test even has a .debug_types section, and if not, abort? That way we don't have to mark up the test in any way (like "this test has structures or enums"), but we could ensure we test with .debug_types when needed.

That would help somewhat, but that's still a very "shotgun" approach to testing this. What I would try to do (and I think this is what Davide had in mind as well) is to identify the cases that are interesting from the type unit perspective (a single class in a type unit, two classes in type units referencing each other, a class referencing an enum, etc. (I'm not really sure about everything that goes into a type unit)) and then write specific tests for that (or reuse existing ones). For example, plenty of tests do breakpoint setting or unwinding, and those probably don't have to be tested separately with type units (or maybe one test that checks that setting a breakpoint on a method in a type unit is sufficient).

Precisely. I'm afraid what's proposed here is too coarse to be useful.

Note, I have (though very very occasionally) seen a dsymutil bug that broke line tables. So it would be good to keep a few "set a breakpoint and run to it" tests just to sanity check this. But most tests that just run somewhere and print a backtrace or view an integer variable or such-like do not need to run all variants.

The thing here is that I'm not sure everything can be captured by a single "debug info variant" dimension. The current dimensions kinda make sense, in that they alter the way pretty much all of the debug info is accessed (we can think of "dwarf" as the basic one; "dsym" and "dwo" are similar in that they place everything in a separate file; I'm not sure about gmodules, but that's because I have no idea how it works). However, I'm not sure the same goes for type units. For one, they are not completely mutually exclusive with the existing variants, so you can have type units + regular dwarf and type units + dwo (maybe even type units + dsym?). But obviously we can't add another "type unit" dimension and do a Cartesian product...
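To illustrate the combinatorics (hypothetical variant names; as noted above, some combinations may not even be valid):

```python
import itertools

# Sketch of why a full Cartesian product of test dimensions is untenable:
# every new independent dimension doubles (or worse) the number of
# variants each test must run under. Variant names are illustrative.
base_flavors = ["dwarf", "dsym", "dwo", "gmodules"]
with_type_units = [False, True]

variants = [
    flavor + ("_debug_types" if tu else "")
    for flavor, tu in itertools.product(base_flavors, with_type_units)
]
# 4 base flavors x 2 type-unit settings = 8 variants per test.
```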

I am afraid of the opposite: we test what we think we need to test, and our simple test cases don't adequately test the feature we are adding. I can certainly limit the testing to very simple test cases, with one test for a class and one for an enum, but that wouldn't have caught the issues I ran into with static variables in classes, which I fixed before submitting this. My point is that it is hard to figure out what a compiler or debugger might hose up without testing all possible cases. Please chime in with opinions, as I will go with the flow on this.

Note, I have (though very very occasionally) seen a dsymutil bug that broke line tables. So it would be good to keep a few "set a breakpoint and run to it" tests just to sanity check this. But most tests that just run somewhere and print a backtrace or view an integer variable or such-like do not need to run all variants.

The thing here is that I'm not sure everything can be captured by a single "debug info variant" dimension. The current dimensions kinda make sense, in that they alter the way pretty much all of the debug info is accessed (we can think of "dwarf" as the basic one; "dsym" and "dwo" are similar in that they place everything in a separate file; I'm not sure about gmodules, but that's because I have no idea how it works). However, I'm not sure the same goes for type units. For one, they are not completely mutually exclusive with the existing variants, so you can have type units + regular dwarf and type units + dwo (maybe even type units + dsym?). But obviously we can't add another "type unit" dimension and do a Cartesian product...

If you saw how horrible type units are in their implementation, you might change your mind. If one has a positive view of type units, one might think "I get one type and one type only, and that can only mean good things". In reality this feature's implementation is really messy, and there are many areas that can screw up.

For instance, if you have a class that has static members, you will get a nice full type in the type unit that defines everything in the class. But with type units you can't refer to anything within the type unit itself other than the top-level type (the struct/union/class or enum). So you now need to declare a DW_TAG_variable that points to the static member of the class so you can give it an address. What happens? The type is duplicated in each compile unit that uses it: a skeleton, incomplete version of the class is represented in the compile unit DWARF (not in a type unit). The top-level DW_TAG_structure_type or DW_TAG_class_type has a DW_AT_signature that points to the type in the type unit, but the incomplete struct/class has _just_ the static members inside of it (again, in the compile unit, not in the type unit). Then you have a DW_TAG_variable that points to the DW_TAG_member within the incomplete struct or class, because there is no way to point to anything within the type if the type is in a type unit.

So I believe that we do need to fully test types and run .debug_types as a separate variant. Each compiler might also do things a bit differently, as there is nothing in the .debug_types specification that says this is how things need to be. And that is just one thing I found while debugging .debug_types... I am sure there are more skeletons in the closet. If we want to support this feature, we need to test it. I come to this conclusion after years of dealing with many different "this DWARF does this cool new way to do things" situations that have bitten us in the past and broken things.
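The split described above can be modeled roughly as follows (a toy sketch with made-up names and signature values, not real DWARF tooling or LLDB API):

```python
# Toy model of the layout described above: the full class definition lives
# in a type unit keyed by its 8-byte type signature, while each compile
# unit carries an incomplete skeleton that refers to it via DW_AT_signature
# and re-declares only the static members, so that DW_TAG_variable DIEs in
# the compile unit can point at them. All names/values are illustrative.

type_units = {
    0x52A4D1F0CAFEF00D: {  # hypothetical type signature
        "tag": "DW_TAG_class_type",
        "name": "Foo",
        "members": ["m_instance_var", "s_static_var"],  # full definition
    },
}

compile_unit = {
    "skeleton_types": {
        "Foo": {
            "tag": "DW_TAG_class_type",
            "DW_AT_signature": 0x52A4D1F0CAFEF00D,
            # Only the static members are repeated here, so a
            # DW_TAG_variable in this CU can reference them.
            "members": ["s_static_var"],
        },
    },
}

def resolve_full_type(skeleton):
    """Follow DW_AT_signature from a skeleton DIE to the full type."""
    return type_units[skeleton["DW_AT_signature"]]
```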

vsk added a subscriber: vsk. Feb 26 2018, 1:22 PM

I'm concerned about the time it takes to run the test suite. If we can get decent test coverage of this patch without adding a new debug info flavor, that'd be ideal.

packages/Python/lldbsuite/test/lldbtest.py
704

No, we generally avoid making changes that are unrelated to the patch, as they make patch review and subsequent merges harder.

source/Plugins/SymbolFile/DWARF/DWARFDataExtractor.h
58

Please clang-format your diffs.

clayborg updated this revision to Diff 135977. Feb 26 2018, 2:13 PM

clang format
fixed comments
removed white space
fixed functionalities/breakpoint/inlined_breakpoints/TestInlinedBreakpoints.py failures

aprantl added inline comments. Feb 26 2018, 2:52 PM
packages/Python/lldbsuite/test/test_categories.py
27

I would mention type units in the comment rather than -fdebug-types-section.

source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
97–98

We probably should convert this to an early exit.

103–104

Perhaps add a comment explaining what is being done here?

118

I find this hard to follow. What do you think about:

while (debug_types_data.ValidOffset(offset)) {
  cu_sp = DWARFCompileUnit::Extract(m_dwarf2Data, debug_types_data,
                                    &offset, true);
  if (!cu_sp)
    return;
  m_type_sig_to_cu_index[cu_sp->GetTypeSignature()] =
      m_compile_units.size();
  m_compile_units.push_back(cu_sp);
  offset = cu_sp->GetNextCompileUnitOffset();
}

?

Also, shouldn't the error be propagated?

clayborg updated this revision to Diff 135999. Feb 26 2018, 3:27 PM

Address Adrian's comments.

clayborg marked 4 inline comments as done. Feb 26 2018, 3:28 PM

I am afraid of the opposite: we test what we think we need to test, and our simple test cases don't adequately test the feature we are adding. I can certainly limit the testing to very simple test cases, with one test for a class and one for an enum, but that wouldn't have caught the issues I ran into with static variables in classes, which I fixed before submitting this. My point is that it is hard to figure out what a compiler or debugger might hose up without testing all possible cases. Please chime in with opinions, as I will go with the flow on this.

I don't think anyone here is suggesting to test less. The question we are asking is whether running all 1000 or so tests doesn't just give a false sense of security (and make the testing process longer). A specially crafted test can trigger more corner cases and make it easier to debug future failures than a generic test. If the case of a static variable in a class is interesting, then maybe a test where you have two static variables from the same class defined in two different compilation units (and maybe referenced from a third one) is interesting as well. I'm pretty sure we don't have a test like that right now. Another interesting case, which would not be tested in the "separate flavour" mode, is the mixed-flags case: have part of your module compiled with type units, part without (and maybe a third part with type units and fission).

Running the entire dotest test suite in -fdebug-types-section is certainly a good way to catch problems, but it's not the way to write regression tests. FWIW, the way I plan to test the new accelerator tables when I get to the lldb part is to run dotest with the new flags locally during development, use the failures to identify interesting test vectors, and then write regular tests to trigger these.

Running the entire dotest test suite in -fdebug-types-section is certainly a good way to catch problems, but it's not the way to write regression tests.

Is there testing in place that is more serious/thorough than the normal regression testing? It might be reasonable to do the full cartesian set at a slower pace. For example LLVM has the concept of "expensive checks" and there are a small number of bots out there that enable them. But it's not something anybody runs normally.

I am afraid of the opposite: we test what we think we need to test, and our simple test cases don't adequately test the feature we are adding. I can certainly limit the testing to very simple test cases, with one test for a class and one for an enum, but that wouldn't have caught the issues I ran into with static variables in classes, which I fixed before submitting this. My point is that it is hard to figure out what a compiler or debugger might hose up without testing all possible cases. Please chime in with opinions, as I will go with the flow on this.

I don't think anyone here is suggesting to test less. The question we are asking is whether running all 1000 or so tests doesn't just give a false sense of security (and make the testing process longer). A specially crafted test can trigger more corner cases and make it easier to debug future failures than a generic test. If the case of a static variable in a class is interesting, then maybe a test where you have two static variables from the same class defined in two different compilation units (and maybe referenced from a third one) is interesting as well. I'm pretty sure we don't have a test like that right now. Another interesting case, which would not be tested in the "separate flavour" mode, is the mixed-flags case: have part of your module compiled with type units, part without (and maybe a third part with type units and fission).

Running the entire dotest test suite in -fdebug-types-section is certainly a good way to catch problems, but it's not the way to write regression tests. FWIW, the way I plan to test the new accelerator tables when I get to the lldb part is to run dotest with the new flags locally during development, use the failures to identify interesting test vectors, and then write regular tests to trigger these.

For the accelerator tables you could always just manually index the DWARF and compare the results to the new accelerator tables as a unit test. No further testing needed?

Would everyone be ok with a solution where we leave the .debug_types tests in as a flavor, but we check the binary after creating it to ensure it has a .debug_types section? If it doesn't, then we skip the test (or pass it?), and if it does, then we go ahead and test it. The only overhead that is added then is some building. Many of the tests don't have types that go into the .debug_types section, so this would be an easy way for us to test .debug_types when it is present, and just build and skip when it isn't. No maintenance needed on any tests; we would just detect and enable as needed.
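A sketch of what that post-build check could look like (hypothetical helper; a real implementation would run readelf or objdump on the built binary, or use LLDB's own ObjectFile APIs, rather than parse a string):

```python
import re

# Sketch of the proposed post-build check: inspect the binary's section
# list and skip the .debug_types variant when the section is absent.
# Here we parse readelf-style `-S` output passed in as a string; shelling
# out to the real tool is left to the actual test harness.

def section_names(readelf_output):
    # Section lines look like: "  [ 9] .debug_types  PROGBITS  ..."
    return re.findall(r"\[\s*\d+\]\s+(\S+)", readelf_output)

def should_run_debug_types_variant(readelf_output):
    return ".debug_types" in section_names(readelf_output)
```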

Ping for my last comment

labath added a comment. Mar 1 2018, 3:10 PM

Sorry, I've been waiting to give others (Davide?) a chance to express their opinion. I personally don't think having a new debug info flavour is a good idea. Tests written specifically to test this functionality will be easier to maintain and debug when they break. And continuing to add new flavours for every compiler option is not a viable long-term strategy.

davide requested changes to this revision. Mar 1 2018, 3:48 PM

As already pointed out, I think this feature should be rethought and have more focused testing. We can have a meeting/discussion about this, and I need to think about it more.
But what you're proposing is basically a sledgehammer and I don't feel confident about having the patch committed in its current state.

This revision now requires changes to proceed. Mar 1 2018, 3:48 PM

I think we're being a little hasty here. Greg's last suggestion is worth investigation -- how many tests would actually be run for this new variant?

Greg, maybe you could remove the make clean targets from packages/Python/lldbsuite/test/plugins/builder_base.py (I think that'll do it), do a testsuite run to create all the test binaries, and then find out how many tests are actually going to be run here?

Pavel is right that the explosion of debug variations will quickly become untenable, even with bots, and I'm not sure what a good solution to that is. Running the debug information parsers against the wide variety of programs we include in the testsuite is a good way to test them. This is a problem we should discuss, but if the practical impact of adding the extra testing for .debug_types is that we run 50 tests through an additional variant, I think it is premature to draw the line here and say this must be solved now.

I personally don't think having a new debug info flavour is a good idea. Tests written specifically to test this functionality will be easier to maintain and debug when they break. And continuing to add new flavours for every compiler option is not a viable long-term strategy.

Here I disagree. From GDB testsuite experience, I find wider testing always better, as it will better catch bugs even in unrelated new functionality implemented in the future. One can never guess all the combinations in advance; if one could, one would need no regression testing at all. There are more than enough CPU cycles available for testing. What I find to be a serious problem, rather, is racy (unreliable) test results, which waste human effort investigating them; with more parallel testing (make -j) and virtualized infrastructure, racy results happen even more often. As an example, one racy testcase I identified is:

packages/Python/lldbsuite/test/functionalities/signal/handle-segv/TestHandleSegv.py
line 33: AssertionError: 4 != 5 (eStateLaunching != eStateStopped)

I think at least these test are racy:

BreakpointAfterJoinTestCase-test
CreateDuringStepTestCase-test_step_inst_with
HandleSegvTestCase-test_inferior_handle_sigsegv
ReturnValueTestCase-test_vector_values
TestTargetSymbolsBuildidSymlink-test_target_symbols_buildid
ThreadSpecificBreakPlusConditionTestCase-test_python
ThreadSpecificBreakTestCase-test_python
ThreadStateTestCase-test_process_interrupt

I personally don't think having a new debug info flavour is a good idea. Tests written specifically to test this functionality will be easier to maintain and debug when they break. And continuing to add new flavours for every compiler option is not a viable long-term strategy.

Here I disagree. From GDB testsuite experience, I find wider testing always better, as it will better catch bugs even in unrelated new functionality implemented in the future. One can never guess all the combinations in advance; if one could, one would need no regression testing at all. There are more than enough CPU cycles available for testing. What I find to be a serious problem, rather, is racy (unreliable) test results, which waste human effort investigating them; with more parallel testing (make -j) and virtualized infrastructure, racy results happen even more often. As an example, one racy testcase I identified is:

packages/Python/lldbsuite/test/functionalities/signal/handle-segv/TestHandleSegv.py
line 33: AssertionError: 4 != 5 (eStateLaunching != eStateStopped)

I think at least these test are racy:

BreakpointAfterJoinTestCase-test
 CreateDuringStepTestCase-test_step_inst_with
 HandleSegvTestCase-test_inferior_handle_sigsegv
 ReturnValueTestCase-test_vector_values
 TestTargetSymbolsBuildidSymlink-test_target_symbols_buildid
 ThreadSpecificBreakPlusConditionTestCase-test_python
 ThreadSpecificBreakTestCase-test_python
 ThreadStateTestCase-test_process_interrupt

I'm not necessarily convinced having another column in the matrix is something that can substitute for targeted testing.
If you're talking about something complementary, I think we might consider it, as a follow-up.
Also, thanks for bringing the problem of the racy tests up. Adding another row in the matrix increases the chances of test failures. I think we might consider addressing the unreliability before making such a big change. Do you have a way of reproducing? [or even better, investigating]

Our bots have been green for a while (we tried to investigate known issues http://lab.llvm.org:8080/green/view/LLDB/) but we can only address what we know :)

Do you have a way of reproducing?

It just happens for me each time - on Fedora 27 x86_64, on a 16-core (32HT) 2-node NUMA machine with the env var MAKEFLAGS=-j32 (and tuned-adm throughput-performance, if it matters). Comparing multiple runs shows some tests flip between {Expected,Unexpected,}{Success,Failure,Error} without changing anything.

[or even better, investigating]

When the testcase is then run separately, it of course succeeds, as the machine has no load at that time. Maybe one could make some race-reproducer helper (such as GDB's read1 was), but I should analyse more LLDB testsuite races to understand the common racy issues in the LLDB testsuite.

Do you have a way of reproducing?

It just happens for me each time - on Fedora 27 x86_64, on a 16-core (32HT) 2-node NUMA machine with the env var MAKEFLAGS=-j32 (and tuned-adm throughput-performance, if it matters). Comparing multiple runs shows some tests flip between {Expected,Unexpected,}{Success,Failure,Error} without changing anything.

[or even better, investigating]

When the testcase is then run separately, it of course succeeds, as the machine has no load at that time. Maybe one could make some race-reproducer helper (such as GDB's read1 was), but I should analyse more LLDB testsuite races to understand the common racy issues in the LLDB testsuite.

That sounds like a great helper. Any chance you're interested in working on that? I don't have access to a machine beefy enough to repro the failures, I'm afraid.

Any chance you're interested in working on that?

Yes, it is in the short-term or rather mid-term plan.

I personally don't think having a new debug info flavour is a good idea. Tests written specifically to test this functionality will be easier to maintain and debug when they break. And keeping adding new flavours for every compiler option is not a viable long term strategy.

Here I disagree. From GDB testsuite experience I find wider testing always better, as it will better catch bugs even in unrelated new functionality implemented in the future. One can never guess all the combinations in advance - if one could, then one would need no regression testing at all. There are more than enough CPU cycles available for testing. What I find to be a serious problem is racy (unreliable) test results, which waste human effort investigating them, and with more parallel testing (make -j) and virtualized infrastructure the racy results happen even more often. As an example, one racy testcase I identified is:

packages/Python/lldbsuite/test/functionalities/signal/handle-segv/TestHandleSegv.py
line 33: AssertionError: 4 != 5 (eStateLaunching != eStateStopped)

I think at least these test are racy:

BreakpointAfterJoinTestCase-test
 CreateDuringStepTestCase-test_step_inst_with
 HandleSegvTestCase-test_inferior_handle_sigsegv
 ReturnValueTestCase-test_vector_values
 TestTargetSymbolsBuildidSymlink-test_target_symbols_buildid
 ThreadSpecificBreakPlusConditionTestCase-test_python
 ThreadSpecificBreakTestCase-test_python
 ThreadStateTestCase-test_process_interrupt

I'm not necessarily convinced having another column in the matrix is something that can substitute for targeted testing.

If I had done targeted testing, I wouldn't have caught the breakpoint issue that came up with this patch. So I agree with jankratochvil that it is best to test this all the way by adding another row. There is no way we can guess what certain compilers will do and all the different permutations this can cause. Not sure where ".debug_types" lands in terms of popularity, but I would guess it comes in above other DWARF flavors (dwo, dwz). Can I at least land this and then we can work on future improvements to the rows required? This is currently only enabled on Linux, where we tend to have fast buildbots. Darwin compilers don't support this with Mach-O due to the 255-section limit that Mach-O suffers from.

jankratochvil added inline comments.Apr 9 2018, 11:27 AM
source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
597

I do not like this DWARFDataExtractor::m_start modification; it sort of corrupts the DataExtractor, and various operations stop working then - such as DWARFDataExtractor::GetByteSize(). The DWZ patch makes the current dw_offset_t a virtual (remapped) offset and introduces a new physical file section offset which is looked up for data extraction. The file offset is represented as DWARFFileOffset in D40474; instead of bool m_is_dwz; there could be some enum { DEBUG_INFO, DEBUG_TYPES, DWZ_DEBUG_INFO } m_where;.

Is there a link to some documentation that explains what DWZ does? I found a bunch of blurbs and man pages, but nothing useful. Nothing is found when I look for "DWZ" in either the DWARF 4 or DWARF 5 spec.

source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
597

This means that this diff doesn't affect all of the other DWARF code. Nothing in .debug_types will refer to anything else (no DW_FORM_ref_addr, nor any external references). So this trick allows us to treat .debug_info as if .debug_types were appended to its end, since nothing in .debug_types refers to any DIE outside of its type unit. This also mirrors what will actually happen with DWARF 5, when all of the data is contained inside of the .debug_info section. This allows each DIE to have a unique "ID". Any other change requires a lot of changes to the DWARF parser and logic. So I actually like this feature. We can fix GetByteSize() if needed.

Basically every object in DWARF right now must be able to make a unique 64-bit unsigned integer ID in some way that lets us get back to that info and partially parse more. These are handed out as lldb::user_id_t values for types, functions and more. Each flavor of DWARF encodes what it wants into this ID. For normal DWARF it is just the absolute offset within .debug_info. With .debug_types we just add the size of .debug_info to the ID. For DWARF in .o files on Darwin, we encode the compile unit index into the top 32 bits and the DIE offset into the lower 32; DWO does something similar, just as DWZ will need to. DWARFFileOffset doesn't mean much if there are multiple files.

We have many competing type uniquing/debug info size reduction strategies being employed here. I can't believe we have DWO, DWZ, and debug types... But we have to make them all work. We can't just use the absolute file offset because DWO uses external files where the file offsets could be the same in the external .o files... Not sure how this works with DWZ or what the best option is. I will read up on DWZ so I can propose some viable options. But each new flavor of the day that gets added to the DWARF parser adds a bunch of logic and edge cases. If two technologies (DWZ + DWO, DWZ + debug_types, etc.) are used together, we need to ensure they can coexist.

For DWZ files, it seems types can be contained in external DWARF files and are referred to using new forms. The right way to fix this in LLDB is to load each of these external files as a separate DWZ file (no changes needed to anything) and then teach the SymbolFileDWARF that refers to these files to import the types. If we have 24 shared libraries that use a single DWZ file, we shouldn't be loading it more than once and inlining the DWARF; we should be using the AST importer to import the type from that file into the AST for the current DWARF file. We have done this with -gmodules already, so DWZ shouldn't be that different. Let me know if I am missing anything.

we should be using the AST importer to import the type from that file into the AST for the current DWARF file. We have done this with -gmodules already, so DWZ shouldn't be that different.

Is the AST usable for imported declarations of DW_TAG_variable, DW_TAG_subprogram, DW_TAG_enumeration_type + DW_TAG_enumerators, etc.? Still, an imported DW_TAG_partial_unit cannot contain any code or data addresses, which may make it easier for the AST.

source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
597

Any other change requires a lot of changes to the DWARF parser and logic. So I actually like this feature.

I agree it is a fine quick&dirty hack. That said, if my DWZ support gets accepted later anyway, this .debug_types feature could be implemented cleanly on top of its framework (as a regular DIE remapping, which is required for DWZ anyway).

If two technologies (DWZ + DWO, DWZ + debug_types, etc) are used together, we need to ensure they can.

DWZ + DWO does not make sense to me. I haven't tried to use DWZ on DWO, but DWZ typically finds common DWARF subtrees across CUs, so it would not find much.

DWZ + debug_types is explicitly supported by the DWZ tool, although that is IMO for compatibility only; DWZ can make the common type references slightly smaller than debug_types. I will surely need to implement debug_types support in my DWZ-for-LLDB patchset later.

BTW the current DWZ tool supports only DWARF-4 and the DWZ tags are non-standard GNU extensions. Therefore DW_FORM_GNU_ref_alt in existing DWZ files is DW_FORM_ref_alt in the proposal you found. I am not aware of any real documentation for this DWARF-4 GNU extension. But DWARF-5 further modified the proposal so it became DW_FORM_ref_sup4+DW_FORM_ref_sup8 in the DWARF-5 standard.

My DWZ-for-LLDB patchset implements the DWARF-4 non-standard GNU extensions as it is used in RHEL-7 (and possibly RHEL-8, not sure yet) which will be supported at least till 2024. It is also present in all recent Fedoras. Currently the DWZ tool has not yet been ported to DWARF-5, Mark Wielaard from Red Hat plans to do so.

jankratochvil added inline comments.Apr 10 2018, 2:46 PM
source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
597

For DWARF in .o files on Darwin, we encode the compile unit index into the top 32 bits and the DIE offset into the lower

BTW that prevents implementing DWARF64, which is why I wanted to avoid the 32/32 split. Currently Fedora/RHEL already contains files with a .debug_info size near the 32-bit limit:

http://ftp.muni.cz/pub/linux/fedora/linux/development/rawhide/Everything/x86_64/debug/tree/Packages/q/qt5-qtwebkit-debuginfo-5.212.0-0.20.alpha2.fc29.x86_64.rpm
/usr/lib/debug/usr/lib64/libQt5WebKit.so.5.212.0-5.212.0-0.20.alpha2.fc29.x86_64.debug
.debug_info size = 0x9bff1b08 = 2.4GiB

Although that is because neither DWZ (it would run out of memory) nor .debug_types (it is expected DWZ will handle the duplicities) is applied, so all CUs are separate, and even if the section were >4GB there would still be no need for a 64-bit DW_FORM_ref_addr. Still, we are already close to the 32-bit limit, and shifting the split to, for example, 40/24 would impose a limit on the number of CUs, etc.

clayborg updated this revision to Diff 144358.Apr 27 2018, 9:08 AM

Update patch to work with changes from:

https://reviews.llvm.org/D45170

I am looking at making the LLVM parser handle type units across v4/v5, and the trick of pretending .debug_types is pasted onto the end of .debug_info seems like a pretty convenient fiction. It unifies DIE handling, and still allows the dumper to distinguish which bits are in which section, which is traditionally how the dumper UI thinks about DWARF. So as long as you and Jan are on the same page there, I'm happy to follow suit.

source/Plugins/SymbolFile/DWARF/DIERef.cpp
53

... the type is

Ping. I would like to get this in. As for testing, the full matrix should be run on .debug_types, as it is the main form of type uniquing defined by the DWARF spec. Any format that is fully accepted and written into the DWARF spec should be fully tested to ensure everything works. .debug_types is only enabled on Linux right now, and our buildbots are fast on these systems.

labath added a comment.May 4 2018, 5:06 AM

Maybe we could start by splitting off the ObjectFile recognition code for the debug_types section into a separate patch. That should be sufficiently non-controversial, and it would reduce the number of files touched by this patch a bit.

Maybe we could start by splitting off the ObjectFile recognition code for the debug_types section into a separate patch. That should be sufficiently non-controversial, and it would reduce the number of files touched by this patch a bit.

Sure thing. Anything to get this moving.

Maybe we could start by splitting off the ObjectFile recognition code for the debug_types section into a separate patch. That should be sufficiently non-controversial, and it would reduce the number of files touched by this patch a bit.

I broke out the object file parsing code:

https://reviews.llvm.org/D46529

labath added inline comments.May 8 2018, 3:40 AM
source/Plugins/SymbolFile/DWARF/DWARFDataExtractor.h
35–58

This looks like a thing we could focus on next. Sliding the contents of the data buffer seems like a useful fiction, but the way it is implemented now, it relies on undefined behavior (out-of-range pointers, possibly even wrapping the address space) and it violates the invariants that we have so far been assuming about data extractors (namely, that for a DataExtractor, all offsets from 0 to GetByteSize()-1 are dereferenceable). For example, for a "slid" data extractor:

  • ValidOffset(0) will return true, even though that is clearly not the case
  • Append(data2) will try to access data at *m_start and probably crash
  • so will Checksum()

Since there seems to be a consensus (at least, nobody has proposed an alternative solution) that this is the way we should treat type units, I think it makes sense to spend some time to make sure we built it on solid foundations. To me, this sliding seems like a fundamental change in the nature of the data extractors, and so I think it should be done in the base DataExtractor class. So I'd propose to move this into a separate patch, where we can make sure that all other DataExtractor APIs do something sensible even if the extractor has been "slid".

clayborg updated this revision to Diff 145969.May 9 2018, 11:29 AM

Broke up with patch further by submitting:

[PATCH] D46529: Add support to object files for accessing the .debug_types section
https://reviews.llvm.org/D46529

[PATCH] D46606: General cleanup to minimize the .debug_types patch
https://reviews.llvm.org/D46606

This makes this patch easier to read and focused only on the .debug_types support.

The patch looks much better now, but I think we still need to discuss the data extractor sliding issue, as right now that's very hacky.

So, I've been attacking the type-units-in-.debug_info problem in LLVM, and what I've done (so far) in the LLVM parser is to factor out a DWARFUnitHeader class and make DWARFUnit derive from that. This makes it feasible to parse a unit header before deciding what kind of unit we are looking at, which is necessary when .debug_info contains both type and compile units. I haven't posted that patch yet as it's an NFC prep for more work getting to the point of actually allowing mix-n-match units in .debug_info sections. But I'll post that patch today so people can at least look at it.

Mainly Greg and I should not take diverging approaches when we can converge instead.

The patch looks much better now, but I think we still need to discuss the data extractor sliding issue, as right now that's very hacky.

It makes things just work with .debug_info as it pretends that .debug_types is just appended to the end of .debug_info (offset wise). The info in .debug_types has no DW_FORM_ref_addr or any references outside of the current type unit, so the info in .debug_types is position independent. I would like to keep this confined to DWARFDataExtractor if possible, as it is hacky and won't work in the general case. Feel free to suggest any other solution that doesn't require checking whether the data is from .debug_types and adding an offset all over the code, but I couldn't think of one.

I really need to get this in as well. It is a very important feature and I am trying to do the right thing and get it into open source so we don't end up keeping the patch locally. So any help on expediting any comments and iterations would be great.

I can't comment on the Xcode and python parts. The C++ all looks plausible.
I think the "sliding" trick will be trickier in the LLVM parser, as it has to handle .o files (which might have multiple instances of .debug_types in v4, and multiple instances of .debug_info in v5) but is conceptually nice. The sliding should be kept to DWARFDataExtractor and not pushed down into DataExtractor, IMO, because the problem is (I think) pretty specific to DWARF.

The thing here is that I don't see how it can be done in a non-hacky way in DWARFDataExtractor while DWARFDataExtractor remains a (subclass of) DataExtractor. And it is a hack, because as soon as you slide the extractor, a number of its functions become booby-trapped (they explode as soon as you call them) or just return nonsensical results. And I'd like us to implement it in a way that is not a hack, on any level.

I see several ways of achieving this (in decreasing order of intrusiveness):

  1. Do this in the DataExtractor class. This would require us to change (extend) the definition of the class. Right now, I would define this class as "a view of a range of bytes with a bunch of utility functions for accessing them". After this change it would become something like "a view of a range of bytes with a bunch of utility functions for accessing them, where the accessor functions automatically apply a bias to all offsets". That's precisely the kind of change you are making now; the difference is that we would state this up-front rather than doing it behind the DataExtractor's back and hoping it works. In practice, this would be as simple as adding another offset_t m_bias; field to the class and going through its methods to make sure they do something reasonable with it. "Something reasonable" would generally mean "subtract it from the input offset", but there are a couple of functions that would need to do something different (e.g., ValidOffset() would return true for values from m_bias to m_bias + (m_end-m_start) - 1, because those are the only valid offsets, etc.). This is my preferred solution because it provides us with a consistent DataExtractor interface (and in terms of code churn, I don't think it's harder to implement than any of the other solutions). It's true that dwarf type units will probably be the only user of these "slid" extractors, but I don't think that matters, as their functionality can be defined in a clear and self-consistent manner without any references to dwarf.
  2. Add the m_bias field to the DWARFDataExtractor class, but stop the class from publicly inheriting from DataExtractor (it can inherit privately, or have it as a member). The result of this is the same as the first option - we get to go through the DataExtractor interface and make sure all its functions do something sensible. The difference is that we may not need to go through all the methods, only those that are used for dwarf parsing; and the sliding trick can then be kept dwarf-specific.
  3. Keep DWARFDataExtractor as-is, and add a new SlidingDWARFExtractor class instead (which would then do the m_bias trick and so on); then convert all places that need this to use the new class instead. This is a variation on the second option. I don't see it as particularly interesting, as the DWARFDataExtractor is already a very small class, but it would still provide us self-consistent APIs everywhere.

I can also see some kind of a solution involving a ConcatenatingDataExtractor, which would contain two or more DataExtractors as its backing store and then pretend they represent one continuous address space from 0 to sum(sizes) (no sliding necessary). This would achieve what you wanted at a higher level. In a way it would be nicer, because there would be a single canonical bytestream for each symbol file and you wouldn't have to ask each DIE for its own extractor, but I expect it to be a bit harder to implement, and it may have a runtime cost.

The next solution which does not require sliding is to modify your previous patch D46606 and insert some kind of offset-fixup code there. It seems that in every place you call the new GetData() function, you have an offset around (which makes sense, as the reason you are calling GetData() is that you want to parse something). We could modify that function to take a virtual (slid) offset as a parameter and return an offset which is suitable for use with the returned extractor. So, the usage would become something like std::tie(extractor, real_offset) = die.GetData(virtual_offset);. I'm not sure how nice this will end up in practice, but it's a possibility.

Let me know what you think of these. I'm open to other options as well. The thing I'd like to avoid is an implementation that relies on putting objects in an invalid state or invoking undefined behavior.

I haven't given Pavel's comments as much thought as they deserve, but here are some of mine.

It's a worthwhile point that a class ought to represent some useful and reasonably well defined abstraction. DataExtractor has one, and DWARFDataExtractor builds on that (at least in the LLVM parser) by handling relocations; the most reasonable way to handle multiple buffers would be with multiple extractors, building on top of what's there rather than hacking the insides.

One aspect I am looking at is that the LLVM parser needs to handle .o files, which can have multiple .debug_types (in v4) or .debug_info (in v5) sections, because the units are broken up into COMDATs. So, the dumper needs to be able to handle an arbitrary number of either kind of section, where LLDB needs at most one of each section. One way of dealing with all this is to have a vector or deque of sections, where each section has its DWARFDataExtractor and knows its own "base offset", which would be the sum of the sizes of the preceding sections. I don't know how awkward it might be to turn this into a meta-extractor that hides the details from consumers, but then again, I think it probably would not be all that hard to have consumers be aware that there can be multiple lumps of DWARF and get the offsets right on their own.

After reading Pavel's and Paul's comments, I will come up with a solution that allows multiple sections to be combined into one large section. That way it will be all legal and easy to extend. If we have just .debug_info, then we can use a base class that just calls directly through to DWARFDataExtractor. If we have more than one section that needs to act as one section, I will make a new class that can put N sections together. I'll post a patch on its own that implements this with tests, and we can iterate on that and make it work for all of our needs, then switch this patch over to use it. Sound good?

Go for it. It sounds like an abstraction that hides the multiple-section nature of the beast will particularly help LLDB; on the LLVM side the dumper will want to remain a bit more cognizant of the underlying sections. But we can surely leverage ideas from each other, which will ultimately help with converging on the Grand Unified Parser.

Sounds like a plan.

Just a ping: that ConcatenatingDataExtractor is still not coded, not even as some draft patch? Just checking to prevent duplicating some existing work.
FYI, providing a rebase with a few simple conflicts resolved. I had to drop the conflicting lldb.xcodeproj/project.pbxproj changes there.

Just a ping: that ConcatenatingDataExtractor is still not coded, not even as some draft patch? Just checking to prevent duplicating some existing work.
FYI, providing a rebase with a few simple conflicts resolved. I had to drop the conflicting lldb.xcodeproj/project.pbxproj changes there.

I have stalled on this due to performance. Anything I do will ruin the performance of the 99% of DWARF consumers for the sake of .debug_types, which is rarely used by our clients. Right now we have one data extractor with only a virtual destructor.

Just to clarify - I guess DWARFConcatenatingDataExtractor should replace all uses of DWARFDataExtractor in the DWARF/ subdirectory, whenever lldb::offset_t offset is used, right?
As the other option would be to remap just lldb::user_id_t but that would not solve much.
BTW my D40474 patch: DWZ uses DWARF references in both the main file's .debug_info and the DWZ common file's .debug_info, so at least one of them always needs its read-in DWARF offsets adjusted anyway in DWARFFormValue::Reference(). Then I needed adjustments also in DWARFDebugInfo::GetCompileUnitAranges(), as I had to place the DWZ common file at offset 0 so that it can be shared between multiple DWARF files using it; therefore the .debug_info of the main files (the users of the DWZ common file) needs to be shifted.
Also, the DWZ common file is a separate DWARFDebugInfo, so there needs to be some dispatcher (DWARFDebugInfo::DWZRedirect() in my D40474 DWZ patch) for it anyway, which is why I do not find DWARFConcatenatingDataExtractor much of a win for DWZ.
OTOH I do not think the DWARF parsing performance matters too much once .debug_names (or the Apple index, which I haven't/cannot test) is used - why should it?

I have stalled on this due to performance. Anything I do will ruin the performance of 99% of the DWARF consumers for .debug_types which is rarely used by our clients.

I have tried to implement it with acceptable performance as D51578.

BTW I would welcome some direction forward (for our internal yearly team meeting) on how to unblock this patch (and, by extension, DWZ D40474). I can code some other of the solutions described above by Pavel Labath, or further improve the performance of D51578 (although then I can come up with more DWARF performance improvements, at least re-introducing atomics to aid my mutexes from D40470).