This is an archive of the discontinued LLVM Phabricator instance.

[pdb] Parse module line info
ClosedPublic

Authored by zturner on Jun 2 2016, 4:04 PM.

Details

Summary

Rather than just dumping binary blocks for line info, we now pretty print individual line and column records.

To facilitate this, a couple of changes had to be made:

  1. ModuleSubstream got moved from DebugInfo/PDB to DebugInfo/CodeView, and various codeview related types are defined there. It turns out DebugInfo/CodeView/Line.h already defines many of these structures, but this is really old code that is not endian aware, doesn't interact well with StreamInterface and not very helpful for getting stuff out of a PDB. Eventually we should migrate the old readobj COFFDumper code to these new structures, or at least merge their functionality somehow.
  1. A ModuleSubstream visitor is introduced. Depending on where your module substream array comes from, different subsets of record types can be expected. We are already hand parsing these substream arrays in many places especially in COFFDumper.cpp. In the future we can migrate these paths to the visitor as well, which should reduce a lot of code in COFFDumper.cpp.

Diff Detail

Event Timeline

zturner updated this revision to Diff 59471.Jun 2 2016, 4:04 PM
zturner retitled this revision from to [pdb] Parse module line info.
zturner updated this object.
zturner added reviewers: ruiu, rnk, majnemer, amccarth.
zturner added a subscriber: llvm-commits.
ruiu added inline comments.Jun 2 2016, 4:18 PM
include/llvm/DebugInfo/CodeView/CodeView.h
16 ↗(On Diff #59471)

Instead of adding an #include to this header, can you include it into .cpp files that actually need Endian.h?

include/llvm/DebugInfo/CodeView/ModuleSubstream.h
22

Thank you for these comments!

41–42

What is this?

include/llvm/DebugInfo/CodeView/ModuleSubstreamVisitor.h
23

What does this correspond to?

39

I don't think you need !!.

lib/DebugInfo/CodeView/ModuleSubstream.cpp
12

Remove this blank line.

majnemer added inline comments.Jun 2 2016, 4:23 PM
tools/llvm-pdbdump/llvm-pdbdump.cpp
543–549

Instead of doing this bitmashing yourself, why not construct a LineInfo from N.Flags?

amccarth edited edge metadata.Jun 2 2016, 4:33 PM

DebugInfo/CodeView/ModuleSubstreamVisitor.cpp probably shouldn't be empty (or it shouldn't be added).

include/llvm/DebugInfo/CodeView/CodeView.h
16 ↗(On Diff #59471)

Nothing in this file actually uses the endian types (as far as I can see). I'd include Endian.h in llvm/DebugInfo/CodeView/ModuleSubstream.h and ModuleSubstreamVisitor.h.

include/llvm/DebugInfo/CodeView/StreamArray.h
55

Should this be explicit?

213

this->Index to distinguish from the Index parameter? Ug. Can you give the parameter a distinct name?

include/llvm/DebugInfo/CodeView/StreamReader.h
56

Why U and not Extractor?

tools/llvm-pdbdump/llvm-pdbdump.cpp
579

Should(could) L be a const auto &?

zturner updated this revision to Diff 59478.Jun 2 2016, 4:39 PM
zturner edited edge metadata.

Updated based on suggestions

majnemer added inline comments.Jun 2 2016, 4:45 PM
tools/llvm-pdbdump/llvm-pdbdump.cpp
545

I'd use the logic in tools/llvm-readobj/COFFDumper.cpp:940:

if (LI.isAlwaysStepInto())
  W.printString("StepInto", StringRef("Always"));
else if (LI.isNeverStepInto())
  W.printString("StepInto", StringRef("Never"));
else
  W.printNumber("LineNumberStart", LI.getStartLine());

You might have to update fewer tests down the road if you choose the same output format that llvm-readobj uses.

zturner marked 3 inline comments as done.Jun 2 2016, 4:45 PM

I'll address the remainder of the comments in a followup

include/llvm/DebugInfo/CodeView/ModuleSubstream.h
42–43

Added comments

include/llvm/DebugInfo/CodeView/ModuleSubstreamVisitor.h
24

This doesn't correspond to any CodeView types. It's just the ValueType of the array. i.e. when you iterate the array you will be iterating over a sequence of these. The value type can be whatever you want (just depends how you write the extractor), so here I chose to make it something that provides access to the 2 underlying embedded arrays and the offset field.

include/llvm/DebugInfo/CodeView/StreamReader.h
56

Because you don't ever have to think about what to pass for it, and it's not used in the function except for passthrough. You're never explicitly parameterizing the function, it's always deduced based on the type of the VarStreamArray you pass in.

lib/DebugInfo/CodeView/ModuleSubstream.cpp
13

Don't we usually put a blank line between the TU's "primary" header and the rest of the headers? I always do, wonder if I've been doing it wrong.

zturner updated this revision to Diff 59479.Jun 2 2016, 4:52 PM

Address remaining suggestions

majnemer accepted this revision.Jun 2 2016, 5:37 PM
majnemer edited edge metadata.

LGTM

include/llvm/DebugInfo/CodeView/ModuleSubstreamVisitor.h
43

What if BlockSize is smaller than LineFileBlockHeader?

include/llvm/DebugInfo/CodeView/StreamArray.h
91–96

Should we have in-class member initializers for these? Seems like it might be less error prone/more compact.

213

Should it be an error for Array.size() to be -1?

This revision is now accepted and ready to land.Jun 2 2016, 5:38 PM
zturner added inline comments.Jun 2 2016, 5:47 PM
include/llvm/DebugInfo/CodeView/ModuleSubstreamVisitor.h
43

That would indicate a corrupt record. So yea, I should check for that.

include/llvm/DebugInfo/CodeView/StreamArray.h
213

If it were easy to make it work, then we should, but to do that we'd probably need to increase the size of the iterator, and I'm not sure it's worth it. I think the only way that can ever happen is if your StreamRef is 2^32-1 bytes long, and I don't know that it's even possible to have a stream that big (I recall there are constants somewhere in the cv header files that determine how big a stream can be based on the block size and whether large blocks are enabled). I can look into it a bit more when I get home, but this would be a pretty exceptional circumstance.

zturner closed this revision.Jun 2 2016, 8:35 PM

size is unsigned, so it could happen if you had a stream which was exactly
2^32 bytes long, and it was an array of uint8's. :-/ Not that that would
ever happen. I think Rui has a patch up which changes this a little, so
I'll wait until that goes in.

lib/DebugInfo/PDB/CMakeLists.txt