I created this patch with the hope that computing a hash value while
calculating the string length is more efficient than doing the two in
separate passes. Unfortunately I couldn't find any performance improvement
on my machine with this patch, but you might find it interesting, so
I'm sharing it.
Details
- Reviewers
grimar • espindola
Diff Detail
- Build Status
Buildable 17482 Build 17482: arc lint + arc unit
Event Timeline
Note that strlen_xxHash64 was actually faster than strlen() and xxHash64() for a microbenchmark. I didn't observe noticeable difference in real benchmarks with this patch though.
It is possible that xxhash is just too slow for use in a hash table. The experiment I did for pr37029 using hash_combine was still using strlen.
llvm/lib/Support/xxhash.cpp | ||
---|---|---|
182 | I remember that when I experimented with all such stuff, at some point I also had the Hash out parameter passed by reference like H here. |
llvm/lib/Support/xxhash.cpp | ||
---|---|---|
149 | I tried to build this patch with MSVS, but it complains about the absence of this function. I had to use the following instead: int psnip_builtin_ffsl(long v) { unsigned long r; if (_BitScanForward(&r, (unsigned long)v)) { return (int)(r + 1); } return 0; } (took it from https://github.com/nemequ/portable-snippets/tree/master/builtin). But it does not work correctly it seems. I am testing on lld-speed-test\lld-speed-test\scylla, const char *End = Q + Remaining; size_t Len = End - P; handle it wrong and code ends up with |
llvm/lib/Support/xxhash.cpp | ||
---|---|---|
148 | btw, if you pass P=="ab\0" to this method, then R[0] will read the rest 5 bytes of memory from out of bounds, no? |
llvm/lib/Support/xxhash.cpp | ||
---|---|---|
148 | That's intentional. | |
149 | I'm not very familiar with these intrinsics of MSVC, but I believe you forgot to use the 64 bit version. Use this. static inline int my_ffsl(uint64_t v) { uint64r_t R; _BitScanForward64(&R, V); return R; } Note that this should be compiled to a single BSF instruction on x86-64. |
llvm/lib/Support/xxhash.cpp | ||
---|---|---|
148 | It would be wrong to keep it in lib/Support with that probably I think. | |
149 | Yes, thanks. The final code I used was: static inline int psnip_builtin_ffsl(uint64_t v) { unsigned long R; _BitScanForward64(&R, v); return R + 1; } And I was able to profile the patch finally. It was faster for me than the original code but
Yes, it was: ... Remaining = (psnip_builtin_ffsl(X) >> 3) - 1; 00007FF6F219C604 bsf rcx,r8 00007FF6F219C608 inc ecx 00007FF6F219C60A sar ecx,3 00007FF6F219C60D dec ecx } |
btw, if you pass P=="ab\0" to this method,
then R[0] will read the rest 5 bytes of memory from out of bounds, no?