This is an archive of the discontinued LLVM Phabricator instance.

Add /DEBUG:GHASH option to LLD to speed up COFF linking
ClosedPublic

Authored by zturner on Dec 7 2017, 12:47 PM.

Details

Summary

Use of this flag causes LLD to try to use the .debug$H for each input object file in which it is present. The original code path (e.g. /DEBUG) is unchanged and still uses non-cryptographic, non-tree hashes that are computed entirely by the linker.

Some preliminary timings shows modest performance gains on real world programs. For example, on my machine linking clang with /DEBUG takes about 23 seconds, and with /DEBUG:GHASH takes about 17 seconds. So this is a 1.35x speedup. That said, this implementation is completely unoptimized. I haven't profiled to find out what the current bottlenecks are, and it's possible we are doing something terribly simple that could lead to further gains.

Note that this implementation does not compute missing hashes in parallel. That's one avenue for exploration, but I put some debugging metrics in, and found that when linking clang, we only computed about 31,000 SHA1 hashes serially. So, there is not likely a significant performance win to be had from computing missing hashes in parallel, and it also suggests there is not a big win to be had from post-processing system libraries to produce a .debug$H section. Just having a large enough application that the number of application defined types dominates the number of system defined types is sufficient to get good performance.

On the other hand, if we don't compute hashes in parallel, it does lead to poor worst case performance. I also tried using /DEBUG:GHASH when none of the object files contained a .debug$H section. In that case, we were twice as slow as /DEBUG. This is consistent with my earlier findings that swapping the hash algorithm from CityHash to SHA1 and makign no other changes led to a 2x slowdown.

Diff Detail

Event Timeline

zturner created this revision.Dec 7 2017, 12:47 PM
ruiu added inline comments.Dec 7 2017, 1:59 PM
lld/COFF/Config.h
92

We are using the same name for the command line option and the variable name, so it should be DebugGHashes.

lld/COFF/PDB.cpp
77–81

Are these classes expensive to instantiate? If the cost of instantiation is negligible, I'd just create four instances as members of this class.

176

Please add a function comment saying that this is .debug$H is a clang extension and this function checks if that is available.

182–190
return Header->Magic == COFF::DEBUG_HASHES_SECTION_MAGIC &&
       Header->Version == 0 &&
       Header->HashAlgorithm == uint16_t(GlobalTypeHashAlg::SHA1) &&
       DebugH.size() % 20 == 0;

is probably a bit more straightforward.

196

You can omit llvm::.

197

auto -> ArrayRef<uint8_t>

262

auto -> Optional<ArrayRef<uint8_t>>

llvm/lib/DebugInfo/CodeView/TypeStreamMerger.cpp
103

Why so many parentheses?

zturner added inline comments.Dec 7 2017, 2:05 PM
lld/COFF/PDB.cpp
262

I agree with your previous comment about changing auto, but this one is a bit longer, and when initialized inside the condition of an if statement, I think it's pretty idiomatic to use auto. So I would actually prefer to leave this one as is, but if you feel strongly I can still change it.

ruiu added inline comments.Dec 7 2017, 2:08 PM
lld/COFF/PDB.cpp
262

In lld even if it is a bit too long, we usually write the actual type name instead of auto, so I think I'd prefer doing the same thing here for consistency. Note that there are few exceptions to that policy, e.g. we write auto Err instead of std::error_code Err.

zturner updated this revision to Diff 126588.Dec 12 2017, 11:34 AM

Updated with suggestions.

rnk accepted this revision.Dec 13 2017, 1:23 PM

I'm happy with this, but let @ruiu sign off.

lld/COFF/PDB.cpp
195

s/llvm::None/None/

lld/test/COFF/Inputs/pdb-hashes-1.yaml
37

Remove the SectionData from .debug$S and .debug$T (also, we should fix obj2yaml already =P)

This revision is now accepted and ready to land.Dec 13 2017, 1:23 PM
ruiu accepted this revision.Dec 13 2017, 1:40 PM

LGTM

This revision was automatically updated to reflect the committed changes.