ConcurrentHashTable - is a resizeable concurrent hashtable.
The range of resizings is limited up to x2^32. The hashtable allows only concurrent insertions.
Concurrent hashtable is necessary for the D96035 patch.
Differential D132455
[ADT] add ConcurrentHashtable class. avl on Aug 23 2022, 2:30 AM. Authored by
Details ConcurrentHashTable - is a resizeable concurrent hashtable. Concurrent hashtable is necessary for the D96035 patch.
Diff Detail
Unit Tests Event TimelineComment Actions Swift has a ConcurrentMap. IIRC it uses atomics and no locks. Comment Actions The comment says that Swift's ConcurrentMap is a binary tree, which is usually slower than hashmap. Another thing is that it does not support rebalancing, while this patch supports rehashing. Though I did not compare this patch with Swift's ConcurrentMap. Comment Actions Performance results for this patch https://reviews.llvm.org/file/data/a36xmbx43mp35xx5rznf/PHID-FILE-lncmoijytnb3gn4gjrii/Performance.pdf (Collected by utility from : D132548) Comment Actions added possibility to specify allocator as template parameter, Comment Actions @dexonsmith & co working on the CAS have also proposed a thread safe hash table of sorts ( https://reviews.llvm.org/D133715 )- it's a bit more esoteric/specialized, but I wonder if the use cases overlap enough to be able to unify them? Comment Actions I won’t have time to take a look myself for a couple of weeks, but adding other interested parties. Certainly sounds like there’s crossover! The data structure in the other patch supports concurrent insertion and look-up and uses atomics rather than locks. It does not support iteration, although that could be implemented. It does not directly support arbitrary keys, but could be used to implement a more general map; the client is expected to do hashing and decide what to do with collisions. Likewise, it does not support erase, but the client could use a tombstone. Not sure if your use case requires those operations, or if the overhead would be worth it. Comment Actions David, thank you for pointing this another patch. It would be good to have a unified solution. Comment Actions This hashtable(D132455) is implemented for https://reviews.llvm.org/D96035 patch. So far, I created a table comparing patches: ------------------------------------------------------------------------ | HashMappedTrie | ConcurrentHashtable | ------------------------------------------------------------------------ thread-safe | yes | yes | ------------------------------------------------------------------------ range of resizings | not limited | x2^32 | | | can be increased | ------------------------------------------------------------------------ key/data pairs | no | yes | ------------------------------------------------------------------------ lock-free? | yes | no | | | uses mutexes for locking | ------------------------------------------------------------------------ insertions | yes | yes | ------------------------------------------------------------------------ lookups | yes | no | | | can be easily added | ------------------------------------------------------------------------ deletions | no | no | | | can be easily added | ------------------------------------------------------------------------ iterations | no | no | | can be easily added | an ineffective non-thread| | |safe solution could be done| ------------------------------------------------------------------------ hash collisions | yes | no collisions | | should be handled | | | by client | | ------------------------------------------------------------------------ I did some first-look performance comparisons of the patches(using ------------------------------------------------------------------------------ | HashMappedTrie | ConcurrentHashtable | ------------------------------------------------------------------------------ --num-threads 1 | time: 62sec | time: 30sec | --initial-size 100000000 | memory: 13.2G | memory: 16.1G | ------------------------------------------------------------------------------ --num-threads 1 | time: 62sec | time: 34sec | --initial-size 100 | memory: 13.2G | memory: 18.1G | ------------------------------------------------------------------------------ --num-threads 16 | time: 38sec | time: 3.5sec | --initial-size 100000000 | memory: 13.2G | memory: 16.1G | ------------------------------------------------------------------------------ --num-threads 16 | time: 38sec | time: 7.3sec | --initial-size 100 | memory: 13.2G | memory: 18.1G | ------------------------------------------------------------------------------ Comment Actions Thanks for doing the comparison with https://reviews.llvm.org/D133715. There are definitely some crossovers can help both data structure, like I also have a patch for a thread-safe-allocator: https://reviews.llvm.org/D133713. HashMappedTrie can definitely store key/data pair. For example, InMemoryCAS (https://reviews.llvm.org/D133716) is storing data as in HashMappedTrie to implement a CAS. There are also definitely more differences, like one data structure is table-like and the other is a tree-like. Not sure if it is worth that we unify the interface so you can switch between but we can consider that. I think the biggest difference is how the collision is handled. For a CAS implementation that intended for caching, a hash collision cannot be accepted. I guess it is possible to extend HashMappedTrie to support collision and have a mode to error on collision, but it is not free (but might not be too costly). I am super interested in how you do the comparison. Can you post a patch for the code you have for that? The HashMappedTrie hasn't been really tuned for performance/memory usage, it would be interesting to use your tool to do some investigation (for example, the NumRootBits and NumSubtrieBits can be tuned for memory/performance). Comment Actions
Sure. I will update https://reviews.llvm.org/D132548 to support HashMappedTrie in a couple of days. Comment Actions
@steven_wu I`ve updated D132548 to have a possibility to measure HashMappedTrie. It depends on D125979, D132455, D133715. /usr/bin/time -f " %E %M " ./check-hashtable --data-set random --num-threads 1 --table-kind hash-mapped-trie --aggregate-data Any changes/suggestions are welcomed :-) Comment Actions Thanks! It works great. The only downside is that different memory allocator is used in different implementation so the number is not directly comparable but could reflect the performance for simple use cases. It is already good to look at the scaling factor for implementations (size and threads). For example, I see the default configuration for HashMappedTrie scales to about 4 threads, then it just goes really bad (because the root bits is only 6 so high contention is expected in the beginning). Comment Actions refactored. Comment Actions ping. @JDevlieghere @aprantl Do you think it is better to move this ConcurrentHashtable into the DWARFLinkerParallel folder? Comment Actions @aprantl @JDevlieghere @dblaikie @MaskRay Would you mind to take a look at this review, please? The performance comparison for this hash table for reading strings from DWARF info from clang binary(done by utility from D132548): --num-threads 16 time memory 1. llvm-concurrent-hashmap: 0.82 sec 3.1G 2. lldb-const-string: 0.98 sec 3.1G --num-threads 1 time memory 1. llvm-concurrent-hashmap: 5.7 sec 3.1G 2. lldb-const-string: 7.1 sec 3.1G 3. llvm-string-map: 5.7 sec 3.1G The advantages comparing to lldb-const-string implementation:
Comment Actions I'm fine either way. If someone has concerns about the implementation, it's probably less contentious to land it in the DWARFLinkerParallel first, but so far I've not heard any objections. If we do want to use this from LLDB, then we'll need it to be in ADT eventually, though I'd like to do a comparison with Steven's HashMappedTrie for LLDB's real world usage. I read through the code and most of it makes sense to me, but I wouldn't mind if someone that deal with data structures on a daily basis has another look. I'll mark this as accepted but would ask you to let it sit here for a few more days for others to take a look.
Comment Actions I have the utility to do the comparision - https://reviews.llvm.org/D132548 The problem is that HashMappedTrie does not resolve hash collisions. it is assumed that such collisions would be resolved by the client of HashMappedTrie. I am not sure what is the good way to resolve such collisions. Thus, I do not know how to make a "fair" comparision at the moment.
Comment Actions Alexey, not all platforms support thread local storage. Can you wrap the tests up in #ifdef LLVM_ENABLE_THREADS #else #endif You should also use LLVM_THREAD_LOCAL instead of using thread_local. Thanks Comment Actions @SeanP Thank you for catching! It looks like it is not necessary to insert #ifdef LLVM_ENABLE_THREADS. Just using LLVM_THREAD_LOCAL should be enough. Please, consider https://reviews.llvm.org/D147649 Comment Actions I stumbled across a potential integer overflow in this code. A proposed fix is posted as D158117. |