The previous algorithm processed one character at a time, which is very painful on a modern CPU. Replace it with xxHash64, which both already exists in the codebase and is fairly fast.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
include/llvm/ADT/StringExtras.h | ||
---|---|---|
128 ↗ | (On Diff #96641) | Rename Result Seed. |
Thanks for working on this! In the future, please set llvm-commits as a subscriber on differential revisions, not as a reviewer. Also, please upload patches with context (e.g "git diff -U1000").
IMO, tests which break when the string hash function is changed are suspicious. Why is a dependence on a specific hash function being tested? It might make sense to fix the tests in prep commits (ideally no test updates would be needed with this patch). I'll look into fixing the llvm-{cov,profdata} issues.
include/llvm/ADT/StringExtras.h | ||
---|---|---|
130 ↗ | (On Diff #96641) | Would xxh32 run significantly faster/slower than xxh64? |
The list is both a subscriber and a reviewer. Next time I'll only make it a subscriber.
I assumed I had no context because Phabricator couldn't map my patch to the repository. Your comment makes me think it doesn't even try (ReviewBoard does this automatically). I didn't realize I'm supposed to supply the context!
IMO, tests which break when the string hash function is changed are suspicious. Why is a dependence on a specific hash function being tested? It might make sense to fix the tests in prep commits (ideally no test updates would be needed with this patch). I'll look into fixing the llvm-{cov,profdata} issues.
That was my first thought too - I was worried the hash value was encoded in the output some how. But I think this stems from the use of StringMap and StringSet, which place into buckets by hash value. So it isn't that anything interesting changed, just that the order of iteration changed. None of the interesting contents changed, AFAICT.
include/llvm/ADT/StringExtras.h | ||
---|---|---|
130 ↗ | (On Diff #96641) | It isn't already in the llvm code base. If we're going that route of adding a hash function, there are probably even better choices. |
Right, it looks like that's the only thing that's changed. My worry is that if we need to change the hash again, we'd have to go through another batch of test updates. It might be easier to convince people to not print things out in an order determined by a hash function now.
include/llvm/ADT/StringExtras.h | ||
---|---|---|
130 ↗ | (On Diff #96641) | Ok. I think it's fine to switch to xxh64 for now. |
test/DebugInfo/X86/gnu-public-names.ll | ||
218 ↗ | (On Diff #96643) | I'm not familiar with this test, but it's a bit concerning that more checks get added than removed. @aprantl, what are your thoughts on going forward with the test updates vs. making the output of llvm-dwarfdump not depend on the string hash function? |
@echristo: Is there anything besides LLVM tests that depends on llvm-dwarfdump's output ordering for the .gnu_pubnames section?
This is fine with me.
test/DebugInfo/X86/gnu-public-names.ll | ||
---|---|---|
218 ↗ | (On Diff #96643) | Agreed. |
include/llvm/ADT/StringExtras.h | ||
---|---|---|
158–159 ↗ | (On Diff #96769) | nit: in the LLVM coding style you want to put { at end of the previous line. |
Addressed coding style Issue.
Can someone commit this for me? I don't have access. Thank you!
Scott,
I cannot submit this patch because there are a few tests failing with your patch. Here is a log. https://gist.github.com/rui314/ef9cc0845131eb6e5e1e00912aefa2b5
Can you take a look?
Looks like this patch broke several bots.
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/8639/steps/test/logs/stdio
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/7423/steps/test_lld/logs/stdio
Can you take a look? In the meantime, I'll revert this patch.
Fixed clang-tools-extra build issues, but doesn't fix Mac lld issues (http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/7423/steps/test_lld/logs/stdio) as I do not have access to a Mac. Can someone look into it? It's probably just an identifier ordering issue.
Is it necessary to have a mac to run the mach-o tests? At any rate, changing the 'CHECK'/'DCHECK' lines to 'CHECK-DAG'/'DCHECK-DAG' should do the trick.
AFAIK the test only failed on the mac buildbot. I'm of two minds on CHECK vs CHECK-DAG; it makes the test immune to hash order issues, but in some cases I believe it can make the test too permissive, since in the end it just makes sure the lines are present, not that they are grouped together in any meaningful way.
That's a fair point. My thinking was that since the output being tested relies on the hash function to for ordering, the test shouldn't be checking for a particular order. But, instead of making the tests more permissive, we could simply sort things before printing them.
I'll get much better performance by just replacing lldb::ConstString with a lock free hashtable utilizing its own hash function.
I can't really comment on your lldb specific solution as I'm not familiar with that code, but I think this would be a valuable improvement for llvm independently.
Thanks!
Davide