This is an archive of the discontinued LLVM Phabricator instance.

[llvm-pdbutil] Output the symbol offset when dumping symbols
Closed, Public

Authored by zturner on Jun 30 2017, 1:35 PM.

Details

Summary
Type records have a unique type index, but symbol records do
not.  Instead, symbol records refer to other symbol records
by referencing their offset in the symbol stream.  In a sense
this is the analogue of the TypeIndex, but we are not printing
it in the dumper.  Printing it not only gives us more useful
information when manually investigating the contents of a PDB,
but also allows us to write better tests by enabling us to
verify that fields that reference other symbol records do
so correctly.
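
As a rough illustration of the kind of check that printed offsets make possible, here is a minimal sketch that verifies every parent/end reference lands on the start of some record. The `DumpedSymbol` struct and `referencesAreValid` function are hypothetical names invented for this example and are not part of llvm-pdbutil. The sketch also assumes that 0 means "no reference", which only makes sense for a stream whose first record does not start at offset 0.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical, simplified view of one dumped symbol record.
struct DumpedSymbol {
  uint32_t Offset; // where the record starts in the symbol stream
  uint32_t Parent; // offset of the enclosing scope record, 0 if none (assumption)
  uint32_t End;    // offset of the matching end record, 0 if none (assumption)
};

// Returns true if every nonzero Parent/End field equals the offset of
// some record in the list.
bool referencesAreValid(const std::vector<DumpedSymbol> &Symbols) {
  std::set<uint32_t> Known;
  for (const DumpedSymbol &S : Symbols)
    Known.insert(S.Offset);

  for (const DumpedSymbol &S : Symbols) {
    if (S.Parent != 0 && Known.count(S.Parent) == 0)
      return false;
    if (S.End != 0 && Known.count(S.End) == 0)
      return false;
  }
  return true;
}
```

A test that matches on the dumped offsets does the same thing by hand: it pins the offset printed for one record and then expects that same number in the referencing field of another.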

Diff Detail

Repository
rL LLVM

Event Timeline

zturner created this revision. Jun 30 2017, 1:35 PM
rnk added inline comments. Jun 30 2017, 2:04 PM
lld/test/COFF/pdb-comdat.test
44 ↗(On Diff #104924)

I'd prefer to see offsets in hex.

llvm/tools/llvm-pdbutil/DumpOutputStyle.cpp
874 ↗(On Diff #104924)

Not 4? I guess it won't matter, the publics stream never has scope records.

llvm/tools/llvm-pdbutil/MinimalSymbolDumper.cpp
673 ↗(On Diff #104924)

Looks like this was left behind after search & replace & format.

zturner added inline comments. Jun 30 2017, 2:13 PM
lld/test/COFF/pdb-comdat.test
44 ↗(On Diff #104924)

I'm of two minds here. I wanted them to look visually distinct from type indices, and there's also the issue of the size fields. Someone should be able to look at the size field (which is also printed) and easily compute the next offset. It's harder to do hex math mentally than decimal math, so this way someone can easily see that Offset(N+1) = Offset(N) + Size(N).
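
The relation Offset(N+1) = Offset(N) + Size(N) can be sketched as a running sum. This is only an illustration: `computeOffsets` is a hypothetical helper, and it assumes that each printed size is the total number of bytes the record occupies in the stream, padding included.

```cpp
#include <cstdint>
#include <vector>

// Given the size of each record and the offset of the first one,
// return the offset at which each record starts.
// Assumption: Size(N) covers everything up to the start of record N+1.
std::vector<uint32_t> computeOffsets(const std::vector<uint32_t> &Sizes,
                                     uint32_t FirstOffset) {
  std::vector<uint32_t> Offsets;
  Offsets.reserve(Sizes.size());
  uint32_t Current = FirstOffset;
  for (uint32_t Size : Sizes) {
    Offsets.push_back(Current);
    Current += Size; // Offset(N+1) = Offset(N) + Size(N)
  }
  return Offsets;
}
```

For example, sizes 20, 36, 12 starting at offset 4 give offsets 4, 24, 60, which is easy to confirm by eye when both columns are printed in decimal.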

llvm/tools/llvm-pdbutil/DumpOutputStyle.cpp
874 ↗(On Diff #104924)

Nope, it's actually 0 here. Publics has a bunch of header information and then a field which says "the symbol records are over here somewhere in another stream". And in that stream, the symbols are at offset 0.

I guess the conceptual difference is that the public symbol record stream is literally nothing but symbol records, whereas the module debug stream (which is where module symbols come from) contains an embedded substream of symbol records, so they put a 4-byte identifier before it that says "symbols are up next!". Not really necessary, but that's about the only thing I can think of.
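
The difference in starting offset described here could be captured as follows. The enum and function names are hypothetical and chosen only for this sketch. The values 0 and 4 are the ones given in this thread.

```cpp
#include <cstdint>

// Hypothetical tag for where a run of symbol records comes from.
enum class SymbolSource {
  PublicSymbolRecordStream, // nothing but symbol records
  ModuleDebugStream         // symbol substream preceded by a 4-byte identifier
};

// Offset of the first symbol record within its stream, per the
// explanation above.
uint32_t initialSymbolOffset(SymbolSource Source) {
  switch (Source) {
  case SymbolSource::PublicSymbolRecordStream:
    return 0;
  case SymbolSource::ModuleDebugStream:
    return 4; // skip the 4-byte identifier
  }
  return 0; // unreachable for valid enum values
}
```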

rnk added inline comments. Jun 30 2017, 2:31 PM
lld/test/COFF/pdb-comdat.test
44 ↗(On Diff #104924)

Hard to do mental math in hex? Get gud.

I agree, though, size, offset, parent, and end should all be in the same base. It might as well be decimal for now.

This revision was automatically updated to reflect the committed changes.