This is an archive of the discontinued LLVM Phabricator instance.

[llvm-pdbutil] Add options to only dump symbol record at specified offset and its parents or children with spcified depth.
ClosedPublic

Authored by zequanwu on Apr 22 2022, 6:46 PM.

Details

Summary

Right now, if we want to dump symbol at specified offset, we need to use grep.
And it can only show surrounding symbols in layout (not in lexical scope sense).

This adds similar options to dump command as llvm-dwarfdump to allow users
to dump symbol record at specified offset and its parents or children with
spcified depth.

--symbol-offset= must be used with --modi to dump only one symbol at given
offset.

--show-parents/--show-children must be used with --symbol-offset to
dump all symbols that are parents/children of the symbol at given offset.

--parent-recurse-depth/--children-recurse-depth must be used with
--show-parents/--show-children to specify the max up/down depth.

Diff Detail

Event Timeline

zequanwu created this revision.Apr 22 2022, 6:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2022, 6:46 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
zequanwu requested review of this revision.Apr 22 2022, 6:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2022, 6:46 PM
rnk added inline comments.Apr 25 2022, 2:58 PM
llvm/include/llvm/DebugInfo/CodeView/CVSymbolVisitor.h
33

Rather than having these fields, I think it might make more sense to have a new method, visitSymbolStreamFiltered(Syms, FilterOptions), which implements this new behavior. It would be slightly more functional and less stateful. WDYT?

I'm imagining a simple struct combining all the options, in case we want to add more later:

struct FilterOptions {
  Optional<uint32_t> SymbolOffset;
  Optional<uint32_t> ParentRecursiveDepth;
  Optional<uint32_t> ChildRecursiveDepth;
};
llvm/lib/DebugInfo/CodeView/CVSymbolVisitor.cpp
104

I think you need to "pop off" a parent record every time you see a closing scope. Consider this sequence of records, where ( is open, ) is close, * is the desired record: (()(*))

In your existing test case, this would correspond with the output of a second inline call site that isn't part of the first.

145

I don't think these scope end offsets are available in the typical object file. Try running llvm-pdbutil dump with these filtration options on a single object file instead of a PDB, I think you'll find that the end offset is zero.

146

This would crash if the PDB file is corrupt, please make it handle this case gracefully. I think in dumping code, we should be as defensive as possible because dumpers are often used on new or non-conforming inputs that trigger linker bugs.

152

To try to simplify things and avoid OOB error checking, I would try to restructure this into a single loop. Always maintain a stack of open parent record offsets. Pop them off when you reach closing scopes. When you reach the record you are looking for, print the open scopes.

llvm/test/tools/llvm-pdbutil/symbol-offset.test
37

Please include the C++ source used to produce the input YAML file to help illustrate the expected structure.

zequanwu updated this revision to Diff 425295.Apr 26 2022, 1:45 PM
zequanwu marked 3 inline comments as done.

Address comments.

zequanwu added inline comments.Apr 26 2022, 1:45 PM
llvm/lib/DebugInfo/CodeView/CVSymbolVisitor.cpp
104

An opening scope is only added in to ParentSymbols when the scope range [begin, end] encompass the desired offset.
In the case of (()(*)), the first inner scope`() will not be added into ParentSymbols`.

Added a test case for (()(*)).

145

Yes, all offsets and parent/end are 0. So, the filter options are only used in DumpOutputStyle::dumpModuleSymsForPdb, and silently ignored when dumping obj file.

146

Add VarStreamArray::isOffsetValid to check if an offset is valid, and use it to check offset before dereference.

llvm/test/tools/llvm-pdbutil/symbol-offset.test
37

I deleted the cpp file, so I rewrote new one.

rnk accepted this revision.Apr 27 2022, 2:11 PM

Looks good to me, just one style comment.

llvm/lib/DebugInfo/CodeView/CVSymbolVisitor.cpp
117

Can you reorder the else if with the folllowing else, so that the conditions appear in the same order that they are executed? Something like:

if (BeginOffset < SymbolOffset) {
  ... // accumulate parent offsets
} else if (BeginOffset == SymbolOffset) {
  ... // print all parent and desired records
} else if (BeginOffset <= SymEndOffset) {
  ... // handle child recs
} else {
  ... // Handle parents' ends.
}
This revision is now accepted and ready to land.Apr 27 2022, 2:11 PM
zequanwu marked an inline comment as done.Apr 27 2022, 2:37 PM
This revision was landed with ongoing or failed builds.Apr 27 2022, 2:37 PM
This revision was automatically updated to reflect the committed changes.

I've reverted this because symbol-offset.test fails under MSAN:

[ 1] ; RUN: llvm-pdbutil yaml2pdb %p/Inputs/symbol-offset.yaml --pdb=%t.pdb [FAIL]
llvm-pdbutil yaml2pdb <REDACTED>/llvm/test/tools/llvm-pdbutil/Inputs/symbol-offset.yaml --pdb=<REDACTED>/tmp/symbol-offset.test/symbol-offset.test.tmp.pdb
==9283==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55f975e5eb91 in __libcpp_tls_set <REDACTED>/include/c++/v1/__threading_support:428:12
    #1 0x55f975e5eb91 in set_pointer <REDACTED>/include/c++/v1/thread:196:5
    #2 0x55f975e5eb91 in void* std::__msan::__thread_proxy<std::__msan::tuple<std::__msan::unique_ptr<std::__msan::__thread_struct, std::__msan::default_delete<std::__msan::__thread_struct> >, llvm::parallel::detail::(anonymous namespace)::ThreadPoolExecutor::ThreadPoolExecutor(llvm::ThreadPoolStrategy)::'lambda'()::operator()() const::'lambda'()> >(void*) <REDACTED>/include/c++/v1/thread:285:27
    #3 0x7f74a1e55b54 in start_thread (<REDACTED>/libpthread.so.0+0xbb54) (BuildId: 64752de50ebd1a108f4b3f8d0d7e1a13)
    #4 0x7f74a1dc9f7e in clone (<REDACTED>/libc.so.6+0x13cf7e) (BuildId: 7cfed7708e5ab7fcb286b373de21ee76)

I've reverted this because symbol-offset.test fails under MSAN:

[ 1] ; RUN: llvm-pdbutil yaml2pdb %p/Inputs/symbol-offset.yaml --pdb=%t.pdb [FAIL]
llvm-pdbutil yaml2pdb <REDACTED>/llvm/test/tools/llvm-pdbutil/Inputs/symbol-offset.yaml --pdb=<REDACTED>/tmp/symbol-offset.test/symbol-offset.test.tmp.pdb
==9283==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55f975e5eb91 in __libcpp_tls_set <REDACTED>/include/c++/v1/__threading_support:428:12
    #1 0x55f975e5eb91 in set_pointer <REDACTED>/include/c++/v1/thread:196:5
    #2 0x55f975e5eb91 in void* std::__msan::__thread_proxy<std::__msan::tuple<std::__msan::unique_ptr<std::__msan::__thread_struct, std::__msan::default_delete<std::__msan::__thread_struct> >, llvm::parallel::detail::(anonymous namespace)::ThreadPoolExecutor::ThreadPoolExecutor(llvm::ThreadPoolStrategy)::'lambda'()::operator()() const::'lambda'()> >(void*) <REDACTED>/include/c++/v1/thread:285:27
    #3 0x7f74a1e55b54 in start_thread (<REDACTED>/libpthread.so.0+0xbb54) (BuildId: 64752de50ebd1a108f4b3f8d0d7e1a13)
    #4 0x7f74a1dc9f7e in clone (<REDACTED>/libc.so.6+0x13cf7e) (BuildId: 7cfed7708e5ab7fcb286b373de21ee76)

This stack doesn't show files changed in this patch. How do I interpret it?

Sorry, I haven't investigated.