This is an archive of the discontinued LLVM Phabricator instance.

Add support for writing TPI hash values
ClosedPublic

Authored by zturner on Sep 8 2016, 7:40 PM.

Details

Summary

This only adds support for writing hash values of the record types we currently understand. For all other record types, we write a hash value of 0. A quick look at a PDB generated by Microsoft confirms that they have actual values for every single recofd type, so we will likely need to do some more work to figure out how to hash these other record types. It's probably as trivial as just applying one of these hashing functions to the entire record contents, mostly just need to try it and see.

Diff Detail

Repository
rL LLVM

Event Timeline

zturner updated this revision to Diff 70774.Sep 8 2016, 7:40 PM
zturner retitled this revision from to Add support for writing TPI hash values.
zturner updated this object.
zturner added reviewers: ruiu, rnk, amccarth.
zturner added a subscriber: llvm-commits.
majnemer added inline comments.
include/llvm/DebugInfo/PDB/Raw/TpiRecordHashVisitor.h
26 ↗(On Diff #70774)

TpiRecordHashUpdateVisitor() = default; ?

56–59 ↗(On Diff #70774)

Is this clang-formatted?

lib/DebugInfo/PDB/Raw/TpiStream.cpp
105–107 ↗(On Diff #70774)

Couldn't you just do std::vector<ulittle32_t> HashValueList(HashValueList.begin(), HashValueList.end());

test/DebugInfo/PDB/pdbdump-readwrite.test
13–18 ↗(On Diff #70774)

Is this diff correct?

zturner added inline comments.Sep 8 2016, 8:38 PM
include/llvm/DebugInfo/PDB/Raw/TpiRecordHashVisitor.h
56–59 ↗(On Diff #70774)

It's clang-formatted. I agree this looks odd, I don't know why it did it formatted it that way. I'll run it through clang-format again just to make sure.

lib/DebugInfo/PDB/Raw/TpiStream.cpp
105–107 ↗(On Diff #70774)

I tried but it didn't work because HashValueList.begin() isn't a normal iterator, it's a FixedStreamArrayIterator<>. There might be some magic to get it to work, but I don't know what the specific requirements are for an iterator to be usable in that context.

test/DebugInfo/PDB/pdbdump-readwrite.test
13–18 ↗(On Diff #70774)

Yes, I removed these intentionally. Previously we were re-creating the stream directory and stream size list exactly, even though we weren't filling all the streams with any data. I removed the code to do that because it makes it difficult / impossible when we need to allocate a non-special stream. Specifically, in this patch we go and allocate space for the hash value stream. Which stream was that from the stream table? If we're trying to map the MSF layout exactly to the same stream indices etc, then we would need to look at the yaml, figure out what stream was the hash value stream, and assign the same index to it. The logic starts getting complicated for no tangible benefit.

So instead, I decided that I would only reserve special streams, and reserve other dynamic streams as needed. See all the code in llvm-pdbdump.cpp that was deleted. Because of all this, the directory, number of streams, etc is no longer going to match up. Someday in the future when we can round-trip a PDB byte for byte, we can add these back.

rnk added inline comments.Sep 14 2016, 11:04 AM
include/llvm/DebugInfo/PDB/Raw/TpiRecordHashVisitor.h
24 ↗(On Diff #70774)

Five words in a class name feels like too much. At least, it's not consistent with the rest of LLVM. How do these names sound?
TpiHashing.h
TpiHashUpdater
TpiHashVerifier

The first class computes the hash, the second one verifies an existing hash.

I don't feel like Record or Visitor are helping readability that much.

lib/DebugInfo/PDB/Raw/TpiRecordHashVisitor.cpp
44 ↗(On Diff #70774)

Typo on "Souce"

zturner updated this revision to Diff 71424.Sep 14 2016, 1:46 PM

Updated with suggestions from rnk

rnk accepted this revision.Sep 14 2016, 2:20 PM
rnk edited edge metadata.

lgtm thanks!

This revision is now accepted and ready to land.Sep 14 2016, 2:20 PM
This revision was automatically updated to reflect the committed changes.