This is an archive of the discontinued LLVM Phabricator instance.

[PDB] Fix hash function used to write /src/headerblock
ClosedPublic

Authored by thakis on Apr 29 2019, 1:08 PM.

Details

Summary

lld-link used to write PDB files that DIA couldn't recover natvis
files from if:

  • The global strings table was > 64kiB
  • There were at least 3 natvis files

The cause was that the hash function for the /src/headerblock stream
was incorrect: It needs to be truncated to 16 bit.

If the global strings table was <= 64kiB, truncating to 16 bit is a
no-op, so this wasn't needed for small programs.

If there are only 1 or 2 natvis files, then the growth strategy in
HashTable::grow() would mean the hash table would have 2 buckets (for 1
natvis file) or 4 buckets (for 4 natvis files), and since the hash
function is used modulo number of buckets, and since 2 and 4 divide
0x10000, the missing % 0x10000 is a no-op there too. For 3 natvis
files, the hash table grows to 6 buckets, which has a factor that's not
common with 0x10000 and the difference starts to matter.

Fixes PR41626.

Diff Detail

Repository
rL LLVM

Event Timeline

thakis created this revision.Apr 29 2019, 1:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2019, 1:08 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
zturner accepted this revision.Apr 29 2019, 1:15 PM
zturner added inline comments.
llvm/lib/DebugInfo/PDB/Native/PDBStringTableBuilder.cpp
29–35 ↗(On Diff #197173)

IIUC, this shouldn't be specific to /src/headerblock right? Because this StringTableHashTraits is used for the entire string table. This means the comment is overly specific as it's referring to /src/headerblock from code that ultimately doesn't really have anything to do with /src/headerblock.

It also means that changing this will probably affect lots of other things that are subtle and hard to test. So be on the lookout for other weird failures.

This revision is now accepted and ready to land.Apr 29 2019, 1:15 PM
rnk accepted this revision.Apr 29 2019, 1:19 PM

lgtm

Initially I thought "we should change the return type of pdb::hashStringV1, but after studying the pdb code more, I see that sometimes the full 32-bit string hash is used, and sometimes only the 16-bit hash is used. I guess we just have to be careful and truncate wherever it seems necessary.

thakis marked an inline comment as done.Apr 29 2019, 3:13 PM
thakis added inline comments.
llvm/lib/DebugInfo/PDB/Native/PDBStringTableBuilder.cpp
29–35 ↗(On Diff #197173)

From what I understand, InjectedSourceTable is the only thing using this StringTableHashTraits: http://llvm-cs.pcc.me.uk/include/llvm/DebugInfo/PDB/Native/PDBStringTableBuilder.h/rStringTableHashTraits

Is that not the case?

This revision was automatically updated to reflect the committed changes.
zturner added inline comments.Apr 29 2019, 4:18 PM
llvm/lib/DebugInfo/PDB/Native/PDBStringTableBuilder.cpp
29–35 ↗(On Diff #197173)

If that's the case, then there's nothing to worry about. Thanks for double checking though.