Hashing large constant blobs results in long compile times - hash only up to 1024 bytes.
Diff Detail
Unit Tests
Event Timeline
llvm/lib/Support/StringMap.cpp | ||
---|---|---|
87 | 1024 was chosen arbitrarily and different numbers may give a better tradeoff between collisions and hash time. |
It may make sense to first/also switch to xxHash (already included here https://github.com/llvm/llvm-project/blob/295861514e0d1e48df2918b630dd692ac27ee0de/llvm/include/llvm/Support/xxhash.h#L45), which should be much faster than DJB, especially for large keys.
OTOH, it appears that some tests depend on StringMap iteration order, so updating them as a result of changing the hash function may be painful. For example, https://reviews.llvm.org/D97396 (which attempted to change the seed value) was reverted for that reason.
Seems problematic if the strings happen to have common prefixes - not sure this is the right direction.
While it might be more work, might be worth considering changing hash function as @erikdesjardins mentioned, if considerations are made to help folks migrate/deal with the fallout of order changing (maybe even worth building in defensive features to reduce order dependence, like Abseil uses for its hash containers - though I'd have thought reverse iteration mode would've helped flush most of this out?)
1024 was chosen arbitrarily and different numbers may give a better tradeoff between collisions and hash time.