This is an archive of the discontinued LLVM Phabricator instance.

[ADT] add ConcurrentHashtable class.
ClosedPublic

Authored by avl on Aug 23 2022, 2:30 AM.

Details

Summary

ConcurrentHashTable is a resizable concurrent hashtable.
The resizing range is limited to x2^32. The hashtable supports only concurrent insertions.

A concurrent hashtable is needed for the D96035 patch.

Diff Detail

Event Timeline

avl created this revision. Aug 23 2022, 2:30 AM
Herald added a project: Restricted Project. Aug 23 2022, 2:30 AM
avl requested review of this revision. Aug 23 2022, 2:30 AM
avl added a comment. Aug 23 2022, 12:08 PM

The comment says that Swift's ConcurrentMap is a binary tree, which is usually slower than a hashmap. Another difference is that it does not support rebalancing, while this patch supports rehashing. That said, I did not benchmark this patch against Swift's ConcurrentMap.

lkail added a subscriber: lkail. Aug 24 2022, 6:32 AM
avl updated this revision to Diff 484377. Dec 20 2022, 2:11 PM

added the ability to specify the allocator as a template parameter,
added the ability to change the range of resizings,
set the default resizing range to x2^32

avl edited the summary of this revision. Dec 20 2022, 2:12 PM
avl updated this revision to Diff 484524. Dec 21 2022, 4:01 AM

removed the dependency on <experimental/random>

@dexonsmith & co working on the CAS have also proposed a thread-safe hash table of sorts (https://reviews.llvm.org/D133715) - it's a bit more esoteric/specialized, but I wonder if the use cases overlap enough to be able to unify them?

avl updated this revision to Diff 485614. Dec 29 2022, 7:29 AM

fixed the Windows build.

I won’t have time to take a look myself for a couple of weeks, but adding other interested parties.

Certainly sounds like there’s crossover! The data structure in the other patch supports concurrent insertion and look-up and uses atomics rather than locks. It does not support iteration, although that could be implemented. It does not directly support arbitrary keys, but could be used to implement a more general map; the client is expected to do hashing and decide what to do with collisions. Likewise, it does not support erase, but the client could use a tombstone. Not sure if your use case requires those operations, or if the overhead would be worth it.
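
For the erase-via-tombstone idea mentioned above, a minimal sketch of how a client might layer deletion on top of an insert-only concurrent map. This is not the API of either patch; MapT, its lookup(), and CacheValue are hypothetical stand-ins:

#include <atomic>

// Value stored in the insert-only map. The tombstone flag is the only
// mutable part, so flipping it needs no table-level synchronization.
struct CacheValue {
  int Payload = 0;
  std::atomic<bool> Tombstone{false};
};

// MapT is any insert-only concurrent map from KeyT to CacheValue*.
template <typename MapT, typename KeyT> class ErasableView {
  MapT &Map;

public:
  explicit ErasableView(MapT &M) : Map(M) {}

  // A tombstoned entry is reported as absent.
  CacheValue *lookup(const KeyT &K) {
    CacheValue *V = Map.lookup(K);
    if (V && !V->Tombstone.load(std::memory_order_acquire))
      return V;
    return nullptr;
  }

  // "Erase" marks the slot dead; the memory stays in the table.
  void erase(const KeyT &K) {
    if (CacheValue *V = Map.lookup(K))
      V->Tombstone.store(true, std::memory_order_release);
  }
};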

avl added a comment. Dec 29 2022, 11:06 AM

David, thank you for pointing out this other patch. It would be good to have a unified solution.

avl added a comment. Dec 29 2022, 11:25 AM

This hashtable (D132455) was implemented for the https://reviews.llvm.org/D96035 patch.
The main requirement is the ability to store aggregate key/data pairs
(i.e., the key is kept separate from the data) in parallel. Another requirement is to know
whether the data was inserted by the current call or by a previous one. It should use a memory pool
(like BumpPtrAllocator).
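
To make those requirements concrete, a minimal usage sketch assuming the interface proposed in this patch: insert() returns the stored entry together with a flag saying whether this call created it, and entries are allocated from a user-provided pool. The StringEntry type is hypothetical, and the exact signatures may differ from the header:

#include "llvm/ADT/ConcurrentHashtable.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Allocator.h"
#include <cstdint>
#include <cstring>
#include <new>
#include <utility>

using namespace llvm;

// Hypothetical aggregate entry: the key is kept separate from the data.
struct StringEntry {
  StringRef Key;
  uint64_t Data = 0;

  StringRef getKey() const { return Key; }

  // The table calls this to create a new entry out of the memory pool.
  static StringEntry *create(StringRef Key, BumpPtrAllocator &Allocator) {
    char *Str = Allocator.Allocate<char>(Key.size());
    std::memcpy(Str, Key.data(), Key.size());
    return new (Allocator.Allocate<StringEntry>())
        StringEntry{StringRef(Str, Key.size()), 0};
  }
};

void insertExample(
    ConcurrentHashTableByPtr<StringRef, StringEntry, BumpPtrAllocator> &Table) {
  // Safe to call from many threads at once.
  std::pair<StringEntry *, bool> Res = Table.insert("hello");
  if (Res.second)
    Res.first->Data = 42; // this call inserted the entry; fill in the data
  // otherwise the entry was created by a previous insert
}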

So far, I created a table comparing patches:

------------------------------------------------------------------------
                    |   HashMappedTrie     |    ConcurrentHashtable    |
------------------------------------------------------------------------
    thread-safe     |         yes          |           yes             |
------------------------------------------------------------------------
 range of resizings |     not limited      |          x2^32            |
                    |                      |     can be increased      |
------------------------------------------------------------------------
    key/data pairs  |         no           |           yes             |
------------------------------------------------------------------------
     lock-free?     |         yes          |           no              |
                    |                      | uses mutexes for locking  |
------------------------------------------------------------------------
     insertions     |         yes          |           yes             |
------------------------------------------------------------------------
      lookups       |         yes          |           no              |
                    |                      |   can be easily added     |
------------------------------------------------------------------------
     deletions      |          no          |           no              |
                    |                      |   can be easily added     |
------------------------------------------------------------------------
     iterations     |          no          |           no              |
                    | can be easily added  | an inefficient non-thread-|
                    |                      | safe solution is possible |
------------------------------------------------------------------------
   hash collisions  |    yes, should be    |      no collisions        |
                    |  handled by client   |                           |
------------------------------------------------------------------------

I did some first-look performance comparisons of the patches (using
this utility: https://reviews.llvm.org/D132548).
The numbers might be inaccurate if I used HashMappedTrie incorrectly.
One difference: I did not set any initial size for HashMappedTrie,
while initial sizes for ConcurrentHashtable were set. Also, only the key
was stored for HashMappedTrie, while the key and data pair were stored for
ConcurrentHashtable. All runs insert 100000000 strings converted from
the corresponding integers.
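
The measurement loop is roughly of the following shape (a simplified sketch, not the actual D132548 utility; the table type and the interleaved partitioning are illustrative):

#include <chrono>
#include <cstdint>
#include <string>
#include <thread>
#include <vector>

// Insert NumStrings keys ("0", "1", "2", ...) into Table from NumThreads
// threads and return the elapsed wall-clock time in seconds.
template <typename TableT>
double timeInserts(TableT &Table, uint64_t NumStrings, unsigned NumThreads) {
  auto Start = std::chrono::steady_clock::now();
  std::vector<std::thread> Workers;
  for (unsigned W = 0; W < NumThreads; ++W)
    Workers.emplace_back([&Table, W, NumStrings, NumThreads] {
      // Each worker takes an interleaved slice of the key space.
      for (uint64_t I = W; I < NumStrings; I += NumThreads)
        Table.insert(std::to_string(I));
    });
  for (std::thread &Worker : Workers)
    Worker.join();
  std::chrono::duration<double> Elapsed =
      std::chrono::steady_clock::now() - Start;
  return Elapsed.count();
}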

-----------------------------------------------------------------------------
                          |    HashMappedTrie    |    ConcurrentHashtable    |
-----------------------------------------------------------------------------
 --num-threads 1          | time:       62sec    | time:           30sec     |
 --initial-size 100000000 | memory:     13.2G    | memory:         16.1G     |
-----------------------------------------------------------------------------
 --num-threads 1          | time:       62sec    | time:           34sec     |
 --initial-size       100 | memory:     13.2G    | memory:         18.1G     |
-----------------------------------------------------------------------------
 --num-threads 16         | time:       38sec    | time:          3.5sec     |
 --initial-size 100000000 | memory:     13.2G    | memory:         16.1G     |
-----------------------------------------------------------------------------
 --num-threads 16         | time:       38sec    | time:          7.3sec     |
 --initial-size       100 | memory:     13.2G    | memory:         18.1G     |
-----------------------------------------------------------------------------

Thanks for doing the comparison with https://reviews.llvm.org/D133715.

There are definitely some crossovers that can help both data structures; for example, I also have a patch for a thread-safe allocator: https://reviews.llvm.org/D133713. HashMappedTrie can definitely store key/data pairs. For example, InMemoryCAS (https://reviews.llvm.org/D133716) stores data in a HashMappedTrie to implement a CAS.

There are also definitely more differences, like one data structure being table-like and the other tree-like. I am not sure it is worth unifying the interfaces so you can switch between them, but we can consider that. I think the biggest difference is how collisions are handled. For a CAS implementation intended for caching, a hash collision cannot be accepted. I guess it is possible to extend HashMappedTrie to support collisions and have a mode that errors on collision, but it is not free (though it might not be too costly).
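
To make the error-on-collision idea concrete, here is a sketch of the check a client (or an extended HashMappedTrie) would need when an entry is found under the same hash; the names are hypothetical:

#include <string>

enum class InsertOutcome { Inserted, AlreadyPresent, HashCollision };

// Called when an existing entry is found under the same hash value.
// For a cache/CAS, two distinct keys with equal hashes must be a hard
// error rather than silently sharing an entry.
InsertOutcome classifyHashHit(const std::string &ExistingKey,
                              const std::string &NewKey) {
  if (ExistingKey == NewKey)
    return InsertOutcome::AlreadyPresent; // same content: reuse the entry
  return InsertOutcome::HashCollision;    // unacceptable for caching
}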

I am super interested in how you did the comparison. Can you post a patch for the code you have for that? The HashMappedTrie hasn't really been tuned for performance/memory usage; it would be interesting to use your tool to do some investigation (for example, NumRootBits and NumSubtrieBits can be tuned for memory/performance).

avl added a comment. Jan 3 2023, 10:23 AM

Sure. I will update https://reviews.llvm.org/D132548 to support HashMappedTrie in a couple of days.

avl added a comment. Jan 4 2023, 9:15 AM

@steven_wu I've updated D132548 to make it possible to measure HashMappedTrie. It depends on D125979, D132455, D133715.
In its current state the patch does not set NumRootBits or NumSubtrieBits, though such options might be added.
If the Intel Threading Building Blocks hashmap is not necessary, USE_ITBB should be unset.
If the libcuckoo hashmap is not necessary, USE_LIBCUCKOO should be unset.
Command line to run the tool:

/usr/bin/time -f " %E %M " ./check-hashtable --data-set random --num-threads 1 --table-kind hash-mapped-trie --aggregate-data

Any changes/suggestions are welcome :-)

Thanks! It works great. The only downside is that a different memory allocator is used in each implementation, so the numbers are not directly comparable, but they should reflect the performance for simple use cases.

It is already useful to look at how the implementations scale (with size and threads). For example, I see that the default configuration of HashMappedTrie scales to about 4 threads and then degrades badly (because the root has only 6 bits, so high contention on the root is expected).

avl updated this revision to Diff 486795. Jan 6 2023, 3:13 AM

improved dumping, did several refactorings, improved resizing code.

avl updated this revision to Diff 489390. Jan 15 2023, 12:23 PM

refactored.
simplified the implementation (removed the version for integral data).
implemented a couple of space optimizations.

avl edited the summary of this revision. Jan 26 2023, 7:54 AM
avl added a comment. Feb 6 2023, 3:48 AM

ping.

@JDevlieghere @aprantl Do you think it is better to move this ConcurrentHashtable into the DWARFLinkerParallel folder?

avl updated this revision to Diff 500444. Feb 25 2023, 12:27 PM

rebased, refactored, and simplified.

avl updated this revision to Diff 500530. Feb 26 2023, 2:40 AM

fixed build.

avl added a comment. Feb 26 2023, 8:19 AM

@aprantl @JDevlieghere @dblaikie @MaskRay Would you mind taking a look at this review, please?

The performance comparison for this hash table reading strings from the DWARF info of a clang binary (done with the utility from D132548):

--num-threads 16

                             time      memory
1. llvm-concurrent-hashmap: 0.82 sec     3.1G
2. lldb-const-string:       0.98 sec     3.1G


--num-threads 1

                             time      memory
1. llvm-concurrent-hashmap: 5.7 sec     3.1G
2. lldb-const-string:       7.1 sec     3.1G
3. llvm-string-map:         5.7 sec     3.1G

The advantages compared to the lldb-const-string implementation:

  1. ConcurrentHashTableByPtr is general. It can be used not only for strings.
  2. ConcurrentHashTableByPtr has slightly better performance numbers.
  3. ConcurrentHashTableByPtr is more scalable.
JDevlieghere accepted this revision. Mar 15 2023, 11:38 AM

I'm fine either way. If someone has concerns about the implementation, it's probably less contentious to land it in the DWARFLinkerParallel first, but so far I've not heard any objections. If we do want to use this from LLDB, then we'll need it to be in ADT eventually, though I'd like to do a comparison with Steven's HashMappedTrie for LLDB's real world usage.

I read through the code and most of it makes sense to me, but I wouldn't mind if someone who deals with data structures on a daily basis had another look. I'll mark this as accepted but would ask you to let it sit here for a few more days so others can take a look.

llvm/include/llvm/ADT/ConcurrentHashtable.h
298

What's the benefit of rehashing at 90% capacity? It seems like this is always going to leave a few empty slots in the table. I understand you always need to have one free slot because you rehash after insertion, but it seems like you could rehash once you've exhausted the bucket?
This revision is now accepted and ready to land. Mar 15 2023, 11:38 AM
avl added a comment. Mar 16 2023, 5:32 AM

I have a utility to do the comparison: https://reviews.llvm.org/D132548. The problem is that HashMappedTrie does not resolve hash collisions; it is assumed that such collisions would be resolved by the client of HashMappedTrie. I am not sure what a good way to resolve such collisions is. Thus, I do not know how to make a "fair" comparison at the moment.

llvm/include/llvm/ADT/ConcurrentHashtable.h
298

When the hashtable is nearly 100% full, it needs to scan too many entries while searching for a free slot. The worst case: if a bucket has 1000000 slots and already holds 999999 entries, an insertion might need to enumerate all 999999 of them, which is slow.
If a bucket has 1000000 slots and is only 90% full, the number of entries that need to be enumerated is much smaller.

So wasting 10% of memory buys roughly a 20% performance improvement. The exact value 0.9 was found experimentally. It would probably be good to make this value configurable (for cases when memory is more important).
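
In other words, the policy is roughly the following (a sketch, not the patch's exact code; RehashThreshold is a stand-in name):

#include <cstddef>

// Found experimentally, per the discussion above.
constexpr double RehashThreshold = 0.9;

// With open addressing, the expected probe length grows roughly like
// 1 / (1 - load factor): at 90% occupancy an insertion scans on the
// order of ten slots, while just below 100% it can scan almost the
// whole bucket. Growing early trades ~10% memory for short probes.
bool shouldRehash(size_t NumEntries, size_t Capacity) {
  return NumEntries >= static_cast<size_t>(Capacity * RehashThreshold);
}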

avl updated this revision to Diff 507499. Mar 22 2023, 2:11 PM

addressed comments.

avl added a comment. Mar 23 2023, 6:34 AM

Thank you for the review!

This revision was automatically updated to reflect the committed changes.
SeanP added a subscriber: SeanP. Apr 5 2023, 10:59 AM

Alexey, not all platforms support thread-local storage. Can you wrap the tests in

#ifdef LLVM_ENABLE_THREADS

#else

#endif

You should also use LLVM_THREAD_LOCAL instead of using thread_local.

Thanks

avl added a comment. Apr 5 2023, 11:55 AM

@SeanP Thank you for catching this! It looks like it is not necessary to add #ifdef LLVM_ENABLE_THREADS; just using LLVM_THREAD_LOCAL should be enough. Please consider https://reviews.llvm.org/D147649
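
For reference, the change amounts to the following (LLVM_THREAD_LOCAL is defined in llvm/Support/Compiler.h; it expands to thread_local, or a compiler-specific equivalent, where TLS is usable, and to nothing when LLVM is built without thread support, leaving a plain global; the variable here is illustrative):

#include "llvm/Support/Compiler.h"

// Before: fails to build on platforms without thread-local storage.
// static thread_local unsigned CurThreadIdx = 0;

// After: portable across LLVM's supported configurations.
static LLVM_THREAD_LOCAL unsigned CurThreadIdx = 0;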

andrew.w.kaylor added a subscriber: andrew.w.kaylor.

I stumbled across a potential integer overflow in this code. A proposed fix is posted as D158117.
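
For illustration only (see D158117 for the actual fix): the typical hazard in a doubling table is that the next-size computation wraps around. A saturating growth step avoids that; MaxSize and the function name below are hypothetical:

#include <cstdint>

// Grow by doubling, but clamp at MaxSize instead of letting
// CurSize * 2 wrap around to a small value.
uint64_t nextTableSize(uint64_t CurSize, uint64_t MaxSize) {
  if (CurSize >= MaxSize / 2)
    return MaxSize;
  return CurSize * 2;
}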