This is an archive of the discontinued LLVM Phabricator instance.

Write a hash of the binary as the PE Debug Directory Timestamp
ClosedPublic

Authored by zturner on Mar 1 2018, 3:48 PM.

Details

Summary

Previously we were writing 0 for this field. This breaks symbol servers, and the value needs to be something, although it doesn't necessarily need to be a timestamp. I chose to use a hash of the binary here, calculated while the value of this field is still 0. Then, the 0 is replaced with the hash.

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

zturner created this revision.Mar 1 2018, 3:48 PM
ruiu accepted this revision.Mar 1 2018, 4:09 PM

LGTM

This revision is now accepted and ready to land.Mar 1 2018, 4:09 PM
smeenai added inline comments.Mar 1 2018, 5:50 PM
lld/COFF/Writer.cpp
1045 ↗(On Diff #136628)

built it? Did you mean timestamp? (Or build id, but that doesn't seem right here.)

zturner updated this revision to Diff 136659.Mar 1 2018, 6:05 PM

It turns out I wasn't actually doing enough. The value that symbol servers *really* care about is the time stamp in the top-level coff file header. However, since link.exe sets this time stamp and the time stamp of al debug directories to matching values, we do the same here.

With my first version of the patch, symbol servers were still not working correctly, but after this version that writes the time stamp to the COFF file header, it is now working correctly.

Apparently this *still* isn't quite right. The scenario presented to me was this:

  1. User does a clean build, hash X gets written into PDB, hash Y gets written into EXE.
  2. User archives EXE and PDB on symbol store. EXE is in some directory X, PDB is in some directory Y.
  3. User adds some blank lines and rebuilds. EXE doesn't change but PDB does.
  4. User re-archives. EXE is unchanged so doesn't get updated on the symbol store, but PDB does. PDB now gets archived in directory Z.

So now there is only one EXE, Y, but there are two PDBs, X and Z. If you have the new source code, you will run the EXE, it will find it under directory Y in the sym store, map to the old PDB X, and the source won't match.

I think the solution to this is:

a) Hash the PDB first. Call it X, write this value into the PDB.
b) In the debug directories, write X for the time stamp.
c) Now hash the executable after you have written X in the debug directories, call this Y.
d) Write Y into the coff file header time stamp.

"Hash the PDB"

I thought hashing the PDB itself was considered undesirable for speed reasons (since PDBs can get pretty large)? Or is xxHash64 fast enough that it doesn't matter?

"Hash the PDB"

I thought hashing the PDB itself was considered undesirable for speed reasons (since PDBs can get pretty large)? Or is xxHash64 fast enough that it doesn't matter?

I haven't timed it (probably should). But, if we want reproducible builds, I dont' know of a better option.

"Hash the PDB"

I thought hashing the PDB itself was considered undesirable for speed reasons (since PDBs can get pretty large)? Or is xxHash64 fast enough that it doesn't matter?

I haven't timed it (probably should). But, if we want reproducible builds, I dont' know of a better option.

If it turns out that the hashing is really expensive, perhaps there could be an option to just use an actual timestamp or something, for people who care about link speed more than reproducibility?

Do you know how link.exe handles this? I know they're doing some sort of hashing for their timestamps, but I thought it was just based on the output binary, not its PDB.

thakis added a subscriber: thakis.Mar 2 2018, 7:23 AM

Apparently this *still* isn't quite right. The scenario presented to me was this:

  1. User does a clean build, hash X gets written into PDB, hash Y gets written into EXE.
  2. User archives EXE and PDB on symbol store. EXE is in some directory X, PDB is in some directory Y.
  3. User adds some blank lines and rebuilds. EXE doesn't change but PDB does.
  4. User re-archives. EXE is unchanged so doesn't get updated on the symbol store, but PDB does. PDB now gets archived in directory Z.

So now there is only one EXE, Y, but there are two PDBs, X and Z. If you have the new source code, you will run the EXE, it will find it under directory Y in the sym store, map to the old PDB X, and the source won't match.

I don't understand this scenario. If the pdb changes, doesn't the RSDS header in the exe linking to the binary have to change as well? And then the hash will be updated, and all is good? https://blogs.msdn.microsoft.com/oldnewthing/20180103-00/?p=97705 sounds like link.exe really only hashes the binary.

thakis added a comment.Mar 2 2018, 7:24 AM

Apparently this *still* isn't quite right. The scenario presented to me was this:

  1. User does a clean build, hash X gets written into PDB, hash Y gets written into EXE.
  2. User archives EXE and PDB on symbol store. EXE is in some directory X, PDB is in some directory Y.
  3. User adds some blank lines and rebuilds. EXE doesn't change but PDB does.
  4. User re-archives. EXE is unchanged so doesn't get updated on the symbol store, but PDB does. PDB now gets archived in directory Z.

So now there is only one EXE, Y, but there are two PDBs, X and Z. If you have the new source code, you will run the EXE, it will find it under directory Y in the sym store, map to the old PDB X, and the source won't match.

I don't understand this scenario. If the pdb changes, doesn't the RSDS header in the exe linking to the binary have to change as well?

Sorry, the RSDS header in the exe linking it to the _pdb_...

zturner updated this revision to Diff 136780.Mar 2 2018, 9:52 AM

Attempt N+1 at making this right.

Now we hash the binary *after* setting the guid and age. This should address the problem of debug info changes with no binary changes, because when that happens we increment the age. By hashing *after* setting the guid and age, we ensure that even if only the debug info changes, the hash will also change.

zturner updated this revision to Diff 136790.Mar 2 2018, 10:24 AM

Man, this is hard. There are so many weird edge cases.

Previous version of the patch would continue to write 0 for the timestamp if we weren't generating debug info. But, we really should be writing the time stamp in the coff file header even for non debug info builds. So this version does that.

I'm going to spend some time coming up with a set of tests for all of these edge cases, but throwing this up here in the meantime because I'm probably still doing something wrong.

This revision was automatically updated to reflect the committed changes.
zturner added a subscriber: rnk.Mar 8 2018, 4:45 AM

Thanks, I'll revert until I can get a fix in.