This is for https://bugs.llvm.org//show_bug.cgi?id=37029,
which was about the experiment of using hash_value for splitting strings.
For short strings, hash_value eventually falls back to hash_short:
https://github.com/llvm-mirror/llvm/blob/master/include/llvm/ADT/Hashing.h#L453
I think we can use it instead of xxHash64 in `splitStrings`, since it also
returns a uint64_t and seems to show really good results.
It hashes only part of the string, but that seems to be OK here.
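To illustrate what hashing only part of a string means, here is a minimal sketch. `partialHash` is a hypothetical stand-in written for this example, not LLVM's hash_short; its constants and mixing steps are arbitrary. It mixes the length with at most the first 8 and last 8 bytes, so strings that differ only in the middle collide. That is presumably acceptable as long as the table that consumes the hash still compares full strings on a match (an assumption, not something verified here).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Hypothetical stand-in, NOT llvm's hash_short: mixes the length with at most
// the first 8 and last 8 bytes of the input and ignores everything in between.
inline uint64_t partialHash(std::string_view s, uint64_t seed) {
  uint64_t head = 0, tail = 0;
  const size_t n = s.size() < 8 ? s.size() : 8;
  if (n != 0) {
    std::memcpy(&head, s.data(), n);                // first n bytes
    std::memcpy(&tail, s.data() + s.size() - n, n); // last n bytes
  }
  // Arbitrary odd multipliers and xor-shifts, chosen only for illustration.
  uint64_t h = seed ^ (s.size() * 0x9E3779B97F4A7C15ULL);
  h = (h ^ head) * 0xFF51AFD7ED558CCDULL;
  h ^= h >> 33;
  h = (h ^ tail) * 0xC4CEB9FE1A85EC53ULL;
  h ^= h >> 33;
  return h;
}
```

With this sketch, "aaaaaaaaXbbbbbbbb" and "aaaaaaaaYbbbbbbbb" hash to the same value for any seed, because the differing byte is never read; the cost is an extra full comparison, not a wrong result.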
I used Scylla to profile the changes and benchmarked a few algorithms.
splitNonStrings did not show up in the profile,
so I experimented only with changing splitStrings. Results are below:
* Default (xxHash64):
                                                     CPU(%)    CPU(ms)
  - lld.exe                                          100.00%   4254
    + lld::elf::MergeInputSection::splitStrings      21.86%    930
* With use of hash_value:
  - lld.exe                                          100.00%   4001
    + lld::elf::MergeInputSection::splitStrings      18.25%    730
* With use of hashGnu:
  - lld.exe                                          100.00%   4469
    + lld::elf::MergeInputSection::splitStrings      25.60%    1144
* With use of hashSysV:
  - lld.exe (PID: 5716)                              100.00%   5080
    + lld::elf::MergeInputSection::splitStrings      33.40%    1711
* This patch:
  - lld.exe (PID: 9192)                              100.00%   3866
    + lld::elf::MergeInputSection::splitStrings      13.24%    512
So this change improves total CPU time by about 10% (4254/3866) for Scylla
and makes splitStrings about 80% faster (930/512).
(Note that these are the times the profiler shows; I have not yet tried to
benchmark it in a regular way.)
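The percentages above follow directly from the profiler's CPU(ms) column. A small sketch of that arithmetic, with `speedup` and `percentFaster` as hypothetical helper names introduced only for this example:

```cpp
#include <cassert>

// Ratio of baseline time to new time; 1.10 means the new code needs
// about 10% less time per unit of work than the baseline.
inline double speedup(double baselineMs, double newMs) {
  assert(newMs > 0 && "new time must be positive");
  return baselineMs / newMs;
}

// The same ratio expressed as "N% faster".
inline double percentFaster(double baselineMs, double newMs) {
  return (speedup(baselineMs, newMs) - 1.0) * 100.0;
}

// Figures from the profile above (xxHash64 baseline vs. this patch):
//   percentFaster(4254, 3866)  -> about 10  (total lld.exe CPU time)
//   percentFaster(930, 512)    -> about 82  (splitStrings only)
```

These are ratios of single profiler runs, so they carry whatever noise the profiler has.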
The seed value used was taken from:
https://github.com/llvm-mirror/llvm/blob/master/include/llvm/ADT/Hashing.h#L328
It is equal to the default seed used by hash_value.
This is interesting, but if this is effective, we should do that in StringRef::hash so that it speeds up everyone's code, shouldn't we?