Previously we were writing 0 for this field. This breaks symbol servers, and the value needs to be something, although it doesn't necessarily need to be a timestamp. I chose to use a hash of the binary here, calculated while the value of this field is still 0. Then, the 0 is replaced with the hash.
Details
Diff Detail
- Repository
- rLLD LLVM Linker
Event Timeline
lld/COFF/Writer.cpp | ||
---|---|---|
1045 ↗ | (On Diff #136628) | built it? Did you mean timestamp? (Or build id, but that doesn't seem right here.) |
It turns out I wasn't actually doing enough. The value that symbol servers *really* care about is the time stamp in the top-level coff file header. However, since link.exe sets this time stamp and the time stamp of al debug directories to matching values, we do the same here.
With my first version of the patch, symbol servers were still not working correctly, but after this version that writes the time stamp to the COFF file header, it is now working correctly.
Apparently this *still* isn't quite right. The scenario presented to me was this:
- User does a clean build, hash X gets written into PDB, hash Y gets written into EXE.
- User archives EXE and PDB on symbol store. EXE is in some directory X, PDB is in some directory Y.
- User adds some blank lines and rebuilds. EXE doesn't change but PDB does.
- User re-archives. EXE is unchanged so doesn't get updated on the symbol store, but PDB does. PDB now gets archived in directory Z.
So now there is only one EXE, Y, but there are two PDBs, X and Z. If you have the new source code, you will run the EXE, it will find it under directory Y in the sym store, map to the old PDB X, and the source won't match.
I think the solution to this is:
a) Hash the PDB first. Call it X, write this value into the PDB.
b) In the debug directories, write X for the time stamp.
c) Now hash the executable after you have written X in the debug directories, call this Y.
d) Write Y into the coff file header time stamp.
"Hash the PDB"
I thought hashing the PDB itself was considered undesirable for speed reasons (since PDBs can get pretty large)? Or is xxHash64 fast enough that it doesn't matter?
I haven't timed it (probably should). But, if we want reproducible builds, I dont' know of a better option.
If it turns out that the hashing is really expensive, perhaps there could be an option to just use an actual timestamp or something, for people who care about link speed more than reproducibility?
Do you know how link.exe handles this? I know they're doing some sort of hashing for their timestamps, but I thought it was just based on the output binary, not its PDB.
I don't understand this scenario. If the pdb changes, doesn't the RSDS header in the exe linking to the binary have to change as well? And then the hash will be updated, and all is good? https://blogs.msdn.microsoft.com/oldnewthing/20180103-00/?p=97705 sounds like link.exe really only hashes the binary.
Attempt N+1 at making this right.
Now we hash the binary *after* setting the guid and age. This should address the problem of debug info changes with no binary changes, because when that happens we increment the age. By hashing *after* setting the guid and age, we ensure that even if only the debug info changes, the hash will also change.
Man, this is hard. There are so many weird edge cases.
Previous version of the patch would continue to write 0 for the timestamp if we weren't generating debug info. But, we really should be writing the time stamp in the coff file header even for non debug info builds. So this version does that.
I'm going to spend some time coming up with a set of tests for all of these edge cases, but throwing this up here in the meantime because I'm probably still doing something wrong.
This patch is breaking a few COFF tests: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/14971/steps/check-lld%20msan/logs/stdio.
Please take a look.