This is an archive of the discontinued LLVM Phabricator instance.

[LLDB][DWARF] Fix duplicate TypeSP in type list
ClosedPublic

Authored by zequanwu on Dec 7 2021, 4:46 PM.

Details

Summary

The call graph is SymbolFileDWARF::ParseType -> DWARFASTParserClang::ParseTypeFromDWARF -> DWARFASTParserClang::UpdateSymbolContextScopeForType.
This patch removed the insertion of type to type list at UpdateSymbolContextScopeForType, since SymbolFileDWARF::ParseType always inserts the type returned by ParseTypeFromDWARF into the type list.

Diff Detail

Event Timeline

zequanwu requested review of this revision.Dec 7 2021, 4:46 PM
zequanwu created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptDec 7 2021, 4:46 PM

Thanks for fixing this. I assume that the type is already being added to the type list somewhere else -- it'd be nice to say, for future archaeologists' sake, where that actually happens.

But my main question is really about the test.

lldb/test/Shell/SymbolFile/DWARF/x86/find-basic-type.cpp
40

Does this actually check that the type is not emitted more than once?

This is the reason why the other checks have the "Found <X> types" assertion above. We currently don't have such output from --find=none, but feel free to add something.

It might also be better to make this a separate test, as the output also includes things like int -- it would start to get messy if you included all of that here, and this test was about testing something different anyway.

zequanwu updated this revision to Diff 392932.EditedDec 8 2021, 2:02 PM

I found that is actually a problem in TypeList. We need to uniquify Type in
TypeList. Otherwise, ParseTypes will add the same types twice because it
parses types in a dfs way without keeping track of what has been parsed.

  • Use an additional set of Type* to check if it exists or not.
  • Print number of types in TypeList.
zequanwu retitled this revision from [LLDB][DWARF] Fix duplicate TypeSP in type list to [LLDB] Uniquify Type in type list..Dec 8 2021, 2:03 PM
zequanwu edited the summary of this revision. (Show Details)
shafik added a comment.Dec 9 2021, 7:17 PM

I would also like to know where this duplicate insertion is happening. Can you walk us through the steps that lead to the duplicate entries? thank you!

I would also like to know where this duplicate insertion is happening. Can you walk us through the steps that lead to the duplicate entries? thank you!

Firstly, I found that SymbolFileDWARF::ParseType -> DWARFASTParserClang::ParseTypeFromDWARF -> DWARFASTParserClang::UpdateSymbolContextScopeForType. It's inserting the same type at both places (SymbolFileDWARF::ParseType and DWARFASTParserClang::UpdateSymbolContextScopeForType ). In my first revision, I just attempted to remove the insertion in UpdateSymbolContextScopeForType. Then I found that when running lldb-test symbols [binary], it's inserting same types at ParseTypes. For following dwarf info, when ParseType was called for DW_TAG_subroutine_type, it will also call ParseType for the type DW_TAG_base_type inside it. When the top-level ParseTypes iterates to 0x0000010f, the type will be inserted twice:

0x0000010a:   DW_TAG_subroutine_type
                DW_AT_type      (0x0000010f "int")

0x0000010f:   DW_TAG_base_type
                DW_AT_name      ("int")
                DW_AT_encoding  (DW_ATE_signed)
                DW_AT_byte_size (0x04)

So, I figured it might be better to deduplicate the types inside TypeList when insertion happens, though ParseTypes is only used at lldb-test.

I don't know whether the types should be uniqued at this level (obviously, they should be uniqued somewhere), but these are the thoughts that spring to mind:

  • if this is a problem for lldb-test, then it would be preferable to come up with a solution that does not negatively impact the performance and memory consumption of the production code
  • it seems like this patch essentially implements its own copy of llvm::SetVector
  • It's unfortunate to have set-like semanticts in TypeList, when we already have a type called TypeSet.
lldb/test/Shell/SymbolFile/DWARF/x86/dump-types.cpp
3–13 ↗(On Diff #392932)

It doesn't seem necessary to run all three accelerator flavours for this kind of a test.

zequanwu updated this revision to Diff 395543.Dec 20 2021, 4:20 PM

Revert back to first diff.

I don't know whether the types should be uniqued at this level (obviously, they should be uniqued somewhere), but these are the thoughts that spring to mind:

  • if this is a problem for lldb-test, then it would be preferable to come up with a solution that does not negatively impact the performance and memory consumption of the production code

I reverted it back to first diff. But I can't use lldb-test for testing purpose, since there is a problem for lldb-test that inserting duplicate types into type list (https://reviews.llvm.org/D115308?vs=on&id=392599#3188183).

lldb/test/Shell/SymbolFile/DWARF/x86/find-basic-type.cpp
40

It actually still emits types more than once using lldb-test.

zequanwu retitled this revision from [LLDB] Uniquify Type in type list. to [LLDB][DWARF] Fix duplicate TypeSP in type list.Dec 20 2021, 4:53 PM
zequanwu edited the summary of this revision. (Show Details)
labath accepted this revision.Dec 22 2021, 4:13 AM

Since you're simply removing code, I don't think that we need to insist on new test cases -- not breaking existing ones should suffice. That said, if you know of a way to test this via the image dump symfile command, I would definitely encourage you to do so.

And yes, we need to figure out what to do with lldb-test. Maybe we should just rewrite ParseTypes (used only in tests) to do the parsing in a different way...

This revision is now accepted and ready to land.Dec 22 2021, 4:13 AM
This revision was automatically updated to reflect the committed changes.