This is an archive of the discontinued LLVM Phabricator instance.

pdbdump: Print "Publics" stream.
Closed Public

Authored by ruiu on May 13 2016, 1:43 PM.

Details

Summary

The Publics stream appears to contain information about public symbols.
It actually contains a serialized hash table along with fixed-size
headers. This patch is not complete: it scans only to the end of
the stream and dumps the header information. I'll write code to
deserialize the hash table later.

Diff Detail

Repository
rL LLVM

Event Timeline

ruiu updated this revision to Diff 57239. May 13 2016, 1:43 PM
ruiu retitled this revision to pdbdump: Print "Publics" stream.
ruiu updated this object.
ruiu added a reviewer: zturner.
ruiu added a subscriber: llvm-commits.
zturner edited edge metadata. May 13 2016, 1:49 PM

In Microsoft's code, is the hash table an NMT or an NMTNI? We can parse both of those using the NameMap and NameHashTable classes in DebugInfoPDB. I forget which is which, but NameHashTable starts with a very obvious signature, 0xEFFEEFFE, so you should be able to tell immediately which one it is.
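
A minimal sketch of the signature check described above, assuming the signature is stored as a little-endian 32-bit value at the very start of the data. The helper names (readLE32, startsWithNameHashTableSignature) are hypothetical and are not part of DebugInfoPDB; only the value 0xEFFEEFFE comes from the comment.

```cpp
#include <cstddef>
#include <cstdint>

// Signature value quoted in the review comment.
constexpr std::uint32_t kNameHashTableSignature = 0xEFFEEFFE;

// Hypothetical helper: decode a little-endian uint32 from the first four
// bytes of Data. Returns false if fewer than four bytes are available.
inline bool readLE32(const std::uint8_t *Data, std::size_t Size,
                     std::uint32_t &Out) {
  if (Size < 4)
    return false;
  Out = static_cast<std::uint32_t>(Data[0]) |
        (static_cast<std::uint32_t>(Data[1]) << 8) |
        (static_cast<std::uint32_t>(Data[2]) << 16) |
        (static_cast<std::uint32_t>(Data[3]) << 24);
  return true;
}

// Hypothetical helper: true if the buffer begins with the signature.
inline bool startsWithNameHashTableSignature(const std::uint8_t *Data,
                                             std::size_t Size) {
  std::uint32_t Value = 0;
  return readLE32(Data, Size, Value) && Value == kNameHashTableSignature;
}
```

A check like this only distinguishes tables that carry the signature; a table without a header, as discussed in the reply below, would simply fail it.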

ruiu added a comment. May 13 2016, 1:55 PM

I don't think it is either an NMT or an NMTNI. The hash table is embedded in the stream and has no header. Microsoft's gsi.cpp has code to read it.

zturner added inline comments. May 13 2016, 2:01 PM
include/llvm/DebugInfo/PDB/Raw/DbiStream.h
36 ↗(On Diff #57239)

Now that we know what this actually is, can you change it to a better name? Like getPublicSymbolStreamIndex()?

lib/DebugInfo/PDB/Raw/DbiStream.cpp
187 ↗(On Diff #57239)

Same here, can you change the field name in Header to be PublicSymbolStreamIndex?

lib/DebugInfo/PDB/Raw/PublicsStream.cpp
47 ↗(On Diff #57239)

The name gsi suggests that this structure is used for more than just public symbols. I bet it stands for Global Symbol Info. There is another stream representing global symbols, and my guess is that they have the exact same format. Based on that, maybe we can call this entire class something more generic, like just SymbolStream. What do you think?

64 ↗(On Diff #57239)

Interesting, this is very similar to the other hash table, whose version is 0xEFFEEFFE. I wonder what the difference is.

72–75 ↗(On Diff #57239)

Do you know what the purpose of CRef is here? In TpiStream.cpp we have a very similar structure called EmbeddedBuff. In that case, the Offset points to a location in the stream, and the second field indicates the length of the buffer. If this is the same thing, perhaps it's worth raising that structure to a higher level so both can reuse it.
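
As an illustration of the offset-plus-length pair described above, here is a rough sketch of a shared structure with a bounds check. The names OffsetAndLength and fitsInStream are assumptions made for this example; they are not the actual EmbeddedBuf definition in TpiStream.cpp, and nothing here implies that CRef has this meaning.

```cpp
#include <cstdint>

// Hypothetical shared structure: a location in a stream plus a byte count.
struct OffsetAndLength {
  std::uint32_t Offset; // where the buffer starts within the stream
  std::uint32_t Length; // how many bytes the buffer occupies
};

// Returns true if the described buffer lies entirely inside a stream of
// StreamSize bytes. Written to avoid overflow in Offset + Length.
inline bool fitsInStream(const OffsetAndLength &Ref,
                         std::uint64_t StreamSize) {
  return Ref.Offset <= StreamSize && Ref.Length <= StreamSize - Ref.Offset;
}
```

Keeping the check next to the structure means every stream that reuses it validates the pair the same way.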

93–96 ↗(On Diff #57239)

I think this line isn't necessary, because the condition will be checked automatically when you try to read the HeaderInfo, and again when you try to read the GSIHashHeader.
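
The reasoning above can be sketched with a reader that checks the remaining size on every read, which makes a separate up-front size check redundant. BoundedReader, FirstHeader, and SecondHeader are hypothetical stand-ins invented for this example; they are not the stream reader or the header types used in the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical reader over a byte buffer. Every read verifies that enough
// bytes remain, so callers do not need a separate size check beforehand.
class BoundedReader {
  const std::uint8_t *Data;
  std::size_t Size;
  std::size_t Pos = 0;

public:
  BoundedReader(const std::uint8_t *D, std::size_t S) : Data(D), Size(S) {}

  // Copies sizeof(T) bytes into Out and advances. On failure, returns
  // false and leaves the read position unchanged.
  template <typename T> bool readObject(T &Out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "readObject requires a trivially copyable type");
    if (Size - Pos < sizeof(T))
      return false;
    std::memcpy(&Out, Data + Pos, sizeof(T));
    Pos += sizeof(T);
    return true;
  }

  std::size_t bytesRemaining() const { return Size - Pos; }
};

// Hypothetical fixed-size headers, standing in for the two headers that
// are read back to back.
struct FirstHeader {
  std::uint32_t A;
  std::uint32_t B;
};
struct SecondHeader {
  std::uint32_t Magic;
  std::uint32_t Version;
};
```

With a reader of this shape, a truncated stream is reported by whichever read first runs out of bytes.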

ruiu added inline comments. May 13 2016, 2:08 PM
include/llvm/DebugInfo/PDB/Raw/DbiStream.h
36 ↗(On Diff #57239)

Will do.

lib/DebugInfo/PDB/Raw/DbiStream.cpp
187 ↗(On Diff #57239)

Will do.

lib/DebugInfo/PDB/Raw/PublicsStream.cpp
47 ↗(On Diff #57239)

I'm not sure yet whether the format is the same, so I want to keep it as it is for now. It won't be too late to move it to a common file once we find that the format is shared.

64 ↗(On Diff #57239)

I have no idea. In Microsoft code, this magic number is declared as GSIHashSCImpvV70.

72–75 ↗(On Diff #57239)

This should be related to a hash table, but I don't understand the exact meaning yet. I just took the name from the reference.

93–96 ↗(On Diff #57239)

Sure, then I'll remove this piece of code.

ruiu updated this revision to Diff 57252. May 13 2016, 2:14 PM
ruiu edited edge metadata.
  • Updated as per zturner's comments.
zturner accepted this revision. May 13 2016, 2:23 PM
zturner edited edge metadata.
This revision is now accepted and ready to land. May 13 2016, 2:23 PM
This revision was automatically updated to reflect the committed changes.