This is an archive of the discontinued LLVM Phabricator instance.

Paths

llvm/include/llvm/ADT/TrieRawHashMap.h
llvm/lib/Support/CMakeLists.txt
llvm/lib/Support/TrieHashIndexGenerator.h
llvm/lib/Support/TrieRawHashMap.cpp
llvm/unittests/ADT/CMakeLists.txt
llvm/unittests/ADT/TrieRawHashMapTest.cpp

Differential D133715

[ADT] Add TrieRawHashMap
Needs Review · Public

Authored by steven_wu on Sep 12 2022, 10:41 AM.

Details

Reviewers

rnk
dblaikie
benlangmuir
dexonsmith

Summary

Implement TrieRawHashMap, which stores objects in a trie based on the
hash of the object.

The user must supply the hashing function and guarantee that the hash is
unique for every object inserted. Hash collisions are not supported.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
340 ms	x64 debian > Profile-x86_64.Linux::counter_promo_for.c
340 ms	x64 debian > Profile-x86_64.Linux::counter_promo_while.c

Event Timeline

steven_wu created this revision. Sep 12 2022, 10:41 AM

Herald added a project: Restricted Project. Sep 12 2022, 10:41 AM

Herald added subscribers: ributzka, arphaman, hiraditya, mgorny.

steven_wu requested review of this revision. Sep 12 2022, 10:41 AM

Herald added a project: Restricted Project. Sep 12 2022, 10:41 AM

steven_wu added a child revision: D133716: [CAS] Add LLVMCAS library with InMemoryCAS implementation. Sep 12 2022, 10:43 AM

Harbormaster completed remote builds in B186195: Diff 459513. Sep 12 2022, 11:05 AM

akyrtzi added a subscriber: akyrtzi. Sep 12 2022, 12:54 PM

steven_wu added parent revisions: D133714: [ADT] Introduce LazyAtomicPointer, D133713: [Support] Introduce ThreadSafeAllocator. Sep 19 2022, 3:21 PM

steven_wu removed a child revision: D133716: [CAS] Add LLVMCAS library with InMemoryCAS implementation.

steven_wu added a child revision: D133716: [CAS] Add LLVMCAS library with InMemoryCAS implementation.

aganea added a subscriber: aganea. Sep 27 2022, 8:14 AM

steven_wu updated this revision to Diff 481059. Dec 7 2022, 2:08 PM

Update missing decl due to header cleanup in other files.

Not sure I'll be the most detailed reviewer here, if anyone else wants to chime in.

I have a general understanding of what a trie is, but at my current level of detail looking at this code it's not entirely clear to me what this particular data structure is - maybe some more comments both in the source and in the patch description? Simple example usage (in a test and/or in those comments/patch description) might be helpful.

llvm/include/llvm/ADT/HashMappedTrie.h
25 ↗	(On Diff #481059)	I think we can use static constexpr implicitly inline constants now, rather than enums?
34–36 ↗	(On Diff #481059)	Guess these could be constexpr variable templates rather than functions? But probably not a big deal either way.
63 ↗	(On Diff #481059)
156–160 ↗	(On Diff #481059)	This sort of feels a bit confusing - why are we hashing a hash?
llvm/unittests/ADT/HashMappedTrieTest.cpp
71–75 ↗	(On Diff #481059)	Seems unfortunate to do significant amounts of testing via the stringification of an object - any chance this could be done in a more API-centric way?
199 ↗	(On Diff #481059)	This is specifically to make it non-trivially destructible? should this record the execution of the dtor and check that the right number (or even instance) of destructions occur?

[Resigning as reviewer, since I was the original author, but feel free to ping me explicitly if I can be helpful for something and I'm not volunteering anything...]

In D133715#3980058, @dblaikie wrote:

Not sure I'll be the most detailed reviewer here, if anyone else wants to chime in.

I have a general understanding of what a trie is, but at my current level of detail looking at this code it's not entirely clear to me what this particular data structure is - maybe some more comments both in the source and in the patch description? Simple example usage (in a test and/or in those comments/patch description) might be helpful.

Yeah, probably more docs here would be good. Some notes that Steven might be able to use as a starting point (although feel free to ignore this and do your own thing!):

At a high level, this is a set/map data structure that supports fast thread-safe insertion/lookup and does not support erase. (IIRC, the interface is similar to a std::map, where the key is a hash?)

A hash-mapped trie is a tree of arrays where the prefix of the hash is consumed as an "index" into the array at each level of the tree. IIRC, this data structure as implemented:

is lock-free and thread-safe
supports "insert" and "lookup"
does NOT detect a hash collision; first insertion wins; you want a strong-enough hash for your data set that there are no collisions, or handle collisions in the value_type somehow
does NOT support "erase"; IIRC, that'd be hard to do in a thread-safe way, but I didn't think hard about it
does NOT support "iteration"; IIRC, it'd be easy to implement iteration that exposed the internal layout
array sizes are configurable (root array can be a different size from sub-trie arrays)
each slot in an array is either empty, a sub-trie, or a value
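
To make "first insertion wins" concrete, here is a minimal sketch of one slot published with a compare-and-swap. It is an illustration of the general lock-free idea only, not the code in this patch; `Slot` and `insertOnce` are hypothetical names, and the real slots also have to distinguish sub-tries from values.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical single slot. A real trie node would hold an array of these.
struct Slot {
  std::atomic<int *> Value{nullptr};
};

// Try to publish Candidate in the slot. Returns the pointer that ends up
// stored: Candidate if this call won, otherwise the earlier winner.
inline int *insertOnce(Slot &S, int *Candidate) {
  int *Expected = nullptr;
  if (S.Value.compare_exchange_strong(Expected, Candidate,
                                      std::memory_order_acq_rel,
                                      std::memory_order_acquire))
    return Candidate;
  // The exchange failed, so Expected was updated to the stored pointer.
  return Expected;
}
```

A second caller that loses the exchange simply gets the existing pointer back, which is why no lock is needed for this step.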

Insertion (and lookup) works basically like the following (skipping over details that make the lock-free thread safety work):

1. start with the root trie's array
2. convert a prefix of the hash into an index into the current array
3. if the slot is a value with the same hash, return the existing value
4. if the slot is a value with a different hash, create a new sub-trie that contains the existing value, put the sub-trie in the slot, and continue with step (5)
5. if the slot is a sub-trie, descend and go back to step (2) with the unused suffix of the hash on the sub-trie's array
6. if the slot is empty, insert and return the new value
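
The steps above can be sketched in single-threaded form as follows. This is only an illustration under simplifying assumptions: a toy 16-bit hash, 4 bits per level for both root and sub-tries, and no atomics. The names (`sketch::Trie`, `Entry`, `indexAt`) are hypothetical and are not the API in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

namespace sketch {

constexpr unsigned HashBits = 16;    // Toy hash width (assumption).
constexpr unsigned BitsPerLevel = 4; // Same width at every level (assumption).

struct Node;
struct Entry {
  std::uint16_t Hash;
  std::string Data;
};
// A slot is empty (both null), a value, or a sub-trie.
struct Slot {
  std::unique_ptr<Entry> Value;
  std::unique_ptr<Node> Sub;
};
struct Node {
  std::array<Slot, (1u << BitsPerLevel)> Slots;
};

// Index for a level: consume BitsPerLevel bits, most significant first.
inline unsigned indexAt(std::uint16_t Hash, unsigned Level) {
  unsigned Shift = HashBits - BitsPerLevel * (Level + 1);
  return (Hash >> Shift) & ((1u << BitsPerLevel) - 1);
}

struct Trie {
  Node Root;

  // Returns the stored entry and whether this call inserted it.
  std::pair<const Entry *, bool> insert(std::uint16_t Hash, std::string Data) {
    Node *Cur = &Root; // Step 1.
    for (unsigned Level = 0;; ++Level) {
      assert(Level * BitsPerLevel < HashBits && "ran out of hash bits");
      Slot &S = Cur->Slots[indexAt(Hash, Level)]; // Step 2.
      if (S.Value && S.Value->Hash == Hash)       // Step 3: exact match.
        return {S.Value.get(), false};
      if (S.Value) { // Step 4: sink the existing value one level down.
        auto Sub = std::make_unique<Node>();
        unsigned Idx = indexAt(S.Value->Hash, Level + 1);
        Sub->Slots[Idx].Value = std::move(S.Value);
        S.Sub = std::move(Sub);
      }
      if (S.Sub) { // Step 5: descend and repeat with the next bits.
        Cur = S.Sub.get();
        continue;
      }
      // Step 6: empty slot.
      S.Value = std::make_unique<Entry>(Entry{Hash, std::move(Data)});
      return {S.Value.get(), true};
    }
  }

  // Lookup follows the same path without modifying anything.
  const Entry *find(std::uint16_t Hash) const {
    const Node *Cur = &Root;
    for (unsigned Level = 0; Cur; ++Level) {
      const Slot &S = Cur->Slots[indexAt(Hash, Level)];
      if (S.Value)
        return S.Value->Hash == Hash ? S.Value.get() : nullptr;
      Cur = S.Sub.get();
    }
    return nullptr;
  }
};

} // namespace sketch
```

Note that the exact-match check in step 3 runs before the sink in step 4, so re-inserting an existing hash never creates a new sub-trie.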

llvm/include/llvm/ADT/HashMappedTrie.h
156–160 ↗	(On Diff #481059)	This is creating a `std::array<>` out of an `ArrayRef`. `std::array`'s constructors don't make this easy, so a wrapper function is helpful. Probably `copyHash` is a better name, or maybe there's a better way to do this!
llvm/unittests/ADT/HashMappedTrieTest.cpp
71–75 ↗	(On Diff #481059)	This is a good point. It was hard to test this way, and it's probably hard to maintain the tests. The motivating thought was to avoid adding APIs that exposed the layout so that clients couldn't inspect/rely on it (i.e., there are no APIs for iterating through all values). But maybe that is unnecessarily restrictive. IIRC, it would be fairly easy to change `Trie::pointer` to a recursive iterator, or to expose an iterator that wrapped it, or something. `operator++` might be quite slow and/or be thread-unsafe, but maybe it's worth opening up just for the testing benefit. (I'm happy either way; just chiming in in case I have more context about motivation for the currently-sparse API.)
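
For the `copyHash` idea mentioned in the 156–160 comment above, a wrapper along these lines would do the job. This is a sketch only: it takes a pointer and a size as a stand-in for `ArrayRef` so that it stays self-contained, and the signature is an assumption rather than what the patch uses.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copy exactly N bytes from a non-owning view into an owning std::array.
// (Data, Size) stands in for an ArrayRef<uint8_t> here.
template <std::size_t N>
std::array<std::uint8_t, N> copyHash(const std::uint8_t *Data,
                                     std::size_t Size) {
  assert(Size == N && "hash has the wrong width");
  (void)Size; // Only used by the assert above.
  std::array<std::uint8_t, N> Hash;
  std::copy(Data, Data + N, Hash.begin());
  return Hash;
}
```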

alexander-shaposhnikov added a subscriber: alexander-shaposhnikov. Dec 7 2022, 6:20 PM

alexander-shaposhnikov added inline comments.

llvm/include/llvm/ADT/HashMappedTrie.h
322 ↗	(On Diff #481059)	constexpr

Harbormaster completed remote builds in B201811: Diff 481059. Dec 8 2022, 1:16 AM

Oh, this is all really useful info I didn't understand from the code - I'd assumed it was a user-visible trie (eg: something you could insert entries into, then walk the trie) - if the trie-ness is purely an implementation detail, maybe it shouldn't be tested/dumped? (like we don't test the layout of hash buckets in DenseMap tests) if it does need to be tested, maybe having a test friend in some way.

As for naming, given the functionality, I'd consider putting Trie earlier in the name rather than later, same as DenseMap - it's not a trie data structure, it's a Set data structure implemented using a trie. So TrieSet. Though adding all the features might be a bit of a mouthful - ThreadSafeHashedTrieSet?... Not sure what the right answer is here (two hard problems in computer science, amirite?).

Is it a set or a map?

Not sure how to capture all the other fairly esoteric requirements (no removal, needs perfect hashing, lock free/thread safe, - I guess that's most of/all the user-visible major requirements (as you say, non-iterable is just a minor artifact of the current implementation, not a major defining feature of the interface)). PerfectHashMap? (if we were going verbose, ThreadSafePerfectHashMap... )

In D133715#3993178, @dblaikie wrote:

Is it a set or a map?

The interface is a map, just that the key type is restricted to look like a hash, and the client is expected to ensure the keys are well-distributed.

Not sure how to capture all the other fairly esoteric requirements (no removal, needs perfect hashing, lock free/thread safe, - I guess that's most of/all the user-visible major requirements (as you say, non-iterable is just a minor artifact of the current implementation, not a major defining feature of the interface)). PerfectHashMap? (if we were going verbose, ThreadSafePerfectHashMap... )

Both names make sense to me. "This is a map from a perfect hash to the data."

Issues I see with those names:

I worry it's not obvious from either name *when* you'd want to use this. (The answer is, I think, you want this if you are building a set shared between multiple threads, you expect lots of concurrent lookup/insertion, and you want fast insertion/lookup nevertheless.)
It might sound like the data structure does hashing. (It doesn't. The client is expected to provide the hash.)
It steals the turf for a more general map built on top of this. More later.

Regardless of the name, maybe the programmer's guide could use an explainer.

Note that there's an example data structure called ThreadSafeHashMappedTrieSet buried in the unit tests, which has an interface that hides the hash. Could be useful for some clients.

Maybe it would make sense to lift up? Name could be PerfectHashSet
You might want to build a map that also handles the hashing; name could potentially be PerfectHashMap

Maybe this could be PerfectHashSetImpl?

Has the "guts" for implementing a client-friendly PerfectHashSet or PerfectHashMap?
... but can be used directly if you want to manage the hashing yourself?

llvm/unittests/ADT/HashMappedTrieTest.cpp
241–245 ↗	(On Diff #481059)	FYI, here, buried in the unit tests (demoted due to lack of immediate use cases), this is a set data structure built on top of the trie. If this were lifted up again for actual use, as opposed to just a test, you'd want to template it on `HasherT` and then use HashBuilder (from llvm/include/llvm/Support/HashBuilder.h) to hash the value.

In D133715#3993270, @dexonsmith wrote:

In D133715#3993178, @dblaikie wrote:

Is it a set or a map?

The interface is a map, just that the key type is restricted to look like a hash, and the client is expected to ensure the keys are well-distributed.

Hmm, right.

Not sure how to capture all the other fairly esoteric requirements (no removal, needs perfect hashing, lock free/thread safe, - I guess that's most of/all the user-visible major requirements (as you say, non-iterable is just a minor artifact of the current implementation, not a major defining feature of the interface)). PerfectHashMap? (if we were going verbose, ThreadSafePerfectHashMap... )

Both names make sense to me. "This is a map from a perfect hash to the data."

Issues I see with those names:

I worry it's not obvious from either name *when* you'd want to use this. (The answer is, I think, you want this if you are building a set shared between multiple threads, you expect lots of concurrent lookup/insertion, and you want fast insertion/lookup nevertheless.)

I think given the number of things we're trying to capture in the name, and the nuance here - not sure we can capture that in the name (even if that was the only thing we wanted to capture in the name I can't think of a punchy name - InsertOnlyFastConcurrentHashMap? :/)

It might sound like the data structure does hashing. (It doesn't. The client is expected to provide the hash.)

Yeah, fair - I guess maybe that tends towards ConcurrentPerfectHashMap (open to Concurrent or ThreadSafe)

It steals the turf for a more general map built on top of this. More later.

Regardless of the name, maybe the programmer's guide could use an explainer.

Might be this is esoteric enough not to be worth Programmer's Manual space, but certainly some good doc comments - but either/both I'm OK with.

Note that there's an example data structure called ThreadSafeHashMappedTrieSet buried in the unit tests, which has an interface that hides the hash. Could be useful for some clients.

Maybe it would make sense to lift up? Name could be PerfectHashSet

You might want to build a map that also handles the hashing; name could potentially be PerfectHashMap

Maybe this could be PerfectHashSetImpl?

Has the "guts" for implementing a client-friendly PerfectHashSet or PerfectHashMap?

... but can be used directly if you want to manage the hashing yourself?

I take it the current use cases you have in mind/lined up in the CAS use this directly? Maybe a Raw prefix? RawPerfectHashSet, RawConcurrentHashSet, etc... some combination/choice of those sort of things?

llvm/unittests/ADT/HashMappedTrieTest.cpp
241–245 ↗	(On Diff #481059)	Hmm, yeah, this looks like overkill for the test - given the data structure as-is, we don't need any hash algorithm here - the test, I would expect, would use unique array values/"hash" values hardcoded/directly. If you've got a use for this data structure, it could come along in a subsequent patch?

In D133715#3993311, @dblaikie wrote:

Note that there's an example data structure called ThreadSafeHashMappedTrieSet buried in the unit tests, which has an interface that hides the hash. Could be useful for some clients.

Maybe it would make sense to lift up? Name could be PerfectHashSet

You might want to build a map that also handles the hashing; name could potentially be PerfectHashMap

Maybe this could be PerfectHashSetImpl?

Has the "guts" for implementing a client-friendly PerfectHashSet or PerfectHashMap?

... but can be used directly if you want to manage the hashing yourself?

I take it the current use cases you have in mind/lined up in the CAS use this directly? Maybe a Raw prefix? RawPerfectHashSet, RawConcurrentHashSet, etc... some combination/choice of those sort of things?

@steven_wu can confirm, but that's what I remember. (There was an early branch that used the hash-included-client-friendly set, but the use case for it was refactored IIRC.)

To summarize, seems like the patch needs:

a new name (RawConcurrentHashSet SGTM);
some header docs explaining how to use it / design motivation;
a test support friend class that can inspect/test the trie layout without relying on stringification;
unit tests rewritten to use the friend (and probably dropping the example higher-level data structure).

In D133715#3993416, @dexonsmith wrote:

In D133715#3993311, @dblaikie wrote:

Note that there's an example data structure called ThreadSafeHashMappedTrieSet buried in the unit tests, which has an interface that hides the hash. Could be useful for some clients.

Maybe it would make sense to lift up? Name could be PerfectHashSet

You might want to build a map that also handles the hashing; name could potentially be PerfectHashMap

Maybe this could be PerfectHashSetImpl?

Has the "guts" for implementing a client-friendly PerfectHashSet or PerfectHashMap?

... but can be used directly if you want to manage the hashing yourself?

I take it the current use cases you have in mind/lined up in the CAS use this directly? Maybe a Raw prefix? RawPerfectHashSet, RawConcurrentHashSet, etc... some combination/choice of those sort of things?

@steven_wu can confirm, but that's what I remember. (There was an early branch that used the hash-included-client-friendly set, but the use case for it was refactored IIRC.)

To summarize, seems like the patch needs:

a new name (RawConcurrentHashSet SGTM);

some header docs explaining how to use it / design motivation;

a test support friend class that can inspect/test the trie layout without relying on stringification;

This bit I'm not so sure about - we don't test the bucketing behavior of DenseHashMap so far as I know (or other similar implementation details of the various other hashing data structures - (since they're the ones with fairly complicated implementation details, they seem the best comparison)), for instance, we only test its interface - why would we do differently for this data structure?

unit tests rewritten to use the friend (and probably dropping the example higher-level data structure).

In D133715#3993489, @dblaikie wrote:

a test support friend class that can inspect/test the trie layout without relying on stringification;

This bit I'm not so sure about - we don't test the bucketing behavior of DenseHashMap so far as I know (or other similar implementation details of the various other hashing data structures - (since they're the ones with fairly complicated implementation details, they seem the best comparison)), for instance, we only test its interface - why would we do differently for this data structure?

IMO the layout is more complex than a flat array with quadratic probing. The logic for equal and/or nearly-equal hashes is a bit subtle, and there were hard-to-reason about bugs originally. The layout-stringification was necessary to understand what was going wrong, and the tests that use it help ensure future refactorings don't get it wrong.

I don't remember the bugs, but two examples of subtleties:

On an exact match you don't want to "sink" the existing entry down a level to a new sub-trie (you need to detect "exact match" before sinking). Getting this wrong will affect performance but not otherwise be user-visible.
The deepest sub-trie might be a different/smaller size than the others because there are only a few bits left-over, and the handling needs to be right. It's simpler to check for correct layout directly than to guess about what user-visible effects there might be for errors.
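
To illustrate the second point, here is a small sketch of how many hash bits each level consumes when the hash width is not a multiple of the sub-trie width. `bitsPerLevel` is a hypothetical helper written for this note, not something in the patch, and the numbers in the usage line below are arbitrary.

```cpp
#include <cassert>
#include <vector>

// Number of hash bits consumed at each level, root first, assuming every
// bit of the hash may be needed to separate two entries.
inline std::vector<unsigned> bitsPerLevel(unsigned HashBits, unsigned RootBits,
                                          unsigned SubTrieBits) {
  assert(RootBits > 0 && SubTrieBits > 0 && RootBits <= HashBits);
  std::vector<unsigned> Levels{RootBits};
  unsigned Remaining = HashBits - RootBits;
  while (Remaining > 0) {
    // The last level only gets whatever bits are left over.
    unsigned Bits = Remaining < SubTrieBits ? Remaining : SubTrieBits;
    Levels.push_back(Bits);
    Remaining -= Bits;
  }
  return Levels;
}
```

For example, `bitsPerLevel(32, 6, 4)` yields one 6-bit root, six 4-bit levels, and a final 2-bit level, so the deepest array has 4 slots instead of 16.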

I'd be a bit uneasy with the layout tests being dropped altogether.

Maybe an alternative to testing the layout directly would be to add a verification member function that iterated through the data structure and ensured everything was self-consistent (else crash? else return false?). Then the tests could call the member function after a series of insertions that might trigger a "bad" layout.
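
A verification function of that kind might look roughly like the sketch below: walk every node and check that each value sits on the path its hash dictates. It assumes the same toy layout as before (16-bit hash, 4 bits per level) and uses hypothetical names; it returns false rather than crashing, which is one of the two options floated above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <memory>

namespace verify_sketch {

constexpr unsigned HashBits = 16;    // Toy hash width (assumption).
constexpr unsigned BitsPerLevel = 4; // Same width at every level (assumption).

struct Node;
struct Slot {
  bool HasValue = false;
  std::uint16_t Hash = 0;    // Only meaningful when HasValue is true.
  std::unique_ptr<Node> Sub; // Non-null when the slot holds a sub-trie.
};
struct Node {
  std::array<Slot, (1u << BitsPerLevel)> Slots;
};

// True if every value below N sits on the path its hash dictates.
// Prefix holds the index bits consumed on the way down to N.
inline bool isValid(const Node &N, unsigned Level = 0,
                    std::uint16_t Prefix = 0) {
  unsigned Shift = HashBits - BitsPerLevel * (Level + 1);
  for (unsigned I = 0; I != N.Slots.size(); ++I) {
    const Slot &S = N.Slots[I];
    auto Path = static_cast<std::uint16_t>((Prefix << BitsPerLevel) | I);
    if (S.HasValue && S.Sub)
      return false; // A slot holds at most one thing.
    if (S.HasValue && (S.Hash >> Shift) != Path)
      return false; // Value stored under the wrong prefix.
    if (S.Sub && (Shift == 0 || !isValid(*S.Sub, Level + 1, Path)))
      return false; // Sub-trie with no bits left, or invalid below.
  }
  return true;
}

} // namespace verify_sketch
```

A test would call this after each series of insertions that is meant to hit a corner case. It checks self-consistency only, so it cannot confirm that a test actually produced the layout it intended to exercise.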

Fair enough - if it's sufficient to have a verify operation (maybe "assertValid" - so, yeah, crash when not valid) I'd go with that, but given the argument you've made, if you think verifying the specific structure is significantly more valuable than that, I'd be OK with some private/test-friended introspection API.

IMO it's worth it; @steven_wu, if you disagree, I could live with assertValid(). (BTW, I remembered another justification. The test case inputs/setups are a bit subtle; checking the layout ensures you've covered the corner case you think you've covered.)

I prefer to keep those tests, as they still provide value and also insight into the underlying implementation. If someone gets around to writing a new and better implementation, we can drop them then.

(avoiding quoting a lot)

I think the stringification's especially/sort of an unfortunate way to test this - that stringified version probably isn't how a user would want to dump the data structure (they'd want to see a user-centric view of the data structure, not the implementation details). Is it practical to at least move to friending the test from the data structure and have the test poke around in the internals to inspect the structure & assert/expect/check various things about it?

eh, maybe string's easier to read in expect diagnostics anyway... just seems a bit awkward/circuitous/unfortunate :/ (I guess the stringification could move into the test code, implemented via such a friend relationship)

In D133715#3996103, @dblaikie wrote:

eh, maybe string's easier to read in expect diagnostics anyway... just seems a bit awkward/circuitous/unfortunate :/ (I guess the stringification could move into the test code, implemented via such a friend relationship)

I do remember the string being easy to read in expect diagnostics, FWIW.

What about renaming the methods to printLayout() and dumpLayout()? Then print() and dump() would at least still be available for something user-centric.

Workable, though I'd still rather it be moved to the test file if it's not too inconvenient (with some friendship, probably). Avoid muddying the implementation with test-only features.

dblaikie mentioned this in D132455: [ADT] add ConcurrentHashtable class.. Dec 28 2022, 7:20 AM

tschuett added a subscriber: tschuett. Jan 2 2023, 12:30 PM

tschuett added inline comments.

llvm/lib/Support/HashMappedTrieIndexGenerator.h
13 ↗	(On Diff #481059)	Please use `std::optional`.

In D133715#3996263, @dblaikie wrote:

In D133715#3996194, @dexonsmith wrote:

In D133715#3996103, @dblaikie wrote:

eh, maybe string's easier to read in expect diagnostics anyway... just seems a bit awkward/circuitous/unfortunate :/ (I guess the stringification could move into the test code, implemented via such a friend relationship)

I do remember the string being easy to read in expect diagnostics, FWIW.

What about renaming the methods to printLayout() and dumpLayout()? Then print() and dump() would at least still be available for something user-centric.

Workable, though I'd still rather it be moved to the test file if it's not too inconvenient (with some friendship, probably). Avoid muddying the implementation with test-only features.

I find stringification quite useful for debugging or inspecting the state of the CAS. Our branch (to be upstreamed later) has the OnDiskCAS, and we have a tool that can dump the CAS state (mostly the Trie information) in string format when pointed at the on-disk CAS. My point is that even though it is a testing/debugging tool, it is not limited to unit tests. I am OK with either name.

But I think we can expose some private methods for debugging to avoid string comparison in the unit tests, and hopefully that is more readable.

llvm/unittests/ADT/HashMappedTrieTest.cpp
241–245 ↗	(On Diff #481059)	Agree. I don't think the actual data structure is useful outside the context of this unit test. I will simplify it.

Address review feedback and rebase the patch using std::optional

Still thinking about how to unit test without string comparison. The only way I can think of is to implement iteration on HashMappedTrie so that we have an API to check the state of a SubTrie. That would expose more types to the user, which would make the API more complicated.

Also, for the discussion of names, do we have other candidates to replace "HashMappedTrie" that are better and clearer?

steven_wu marked 9 inline comments as done.Jan 3 2023, 1:01 PM

steven_wu added inline comments.

llvm/unittests/ADT/HashMappedTrieTest.cpp
199 ↗	(On Diff #481059)	The test makes sure that no recursion happens when destroying a large trie with objects inside (with or without destructors). The order doesn't matter. I added a comment to clarify.

Harbormaster completed remote builds in B205511: Diff 486055.Jan 3 2023, 2:15 PM

Add some documentation about HashMappedTrie

Harbormaster completed remote builds in B205546: Diff 486102.Jan 3 2023, 5:56 PM

avl added a child revision: D132548: [WIP][ADT] Utility for comparision of hashtables implementation..Jan 4 2023, 8:56 AM

Update tests to avoid string comparison.

It is not easy to add an iterator that can traverse the trie in a certain order, but in most test cases we only care about the root node and the last sub-trie allocated. Those can be cheaply located from the allocation chain alone, even though it is currently not possible to construct the hash prefix just by walking the allocation chain.

Fix typo in the comments

Harbormaster completed remote builds in B205787: Diff 486411.Jan 4 2023, 5:29 PM

Allow checking prefix of the sub-trie

Ping. All feedback has been addressed.

Additional notes: I dug a bit deeper into the benchmark from https://reviews.llvm.org/D132548, as it shows bad scaling during the parallel insertion tests (it stops scaling at 8 threads). That is actually caused by our ThreadSafeBumpPtrAllocator, which currently takes a lock. We can improve it in the future, since our use case doesn't expect heavy parallel insertions. However, it is quite obvious that we should tune RootBits and SubTrieBits. Increasing RootBits can significantly decrease the contention. A better strategy might be to start with something like 10 bits, then 4 bits, 2 bits, and 1 bit. Shrinking the number of bits can lead to better memory usage, since slot usage in the deep nodes is very low.
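To make the RootBits/SubTrieBits trade-off concrete, here is a minimal sketch of how a hash could be split into per-level slot indices when each level consumes a different number of bits. The helper names `levelSlots` and `slotIndices`, and the most-significant-bit-first order, are assumptions for illustration only; this is not the patch's index generator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of slots in a trie node that consumes NumBits bits of the hash.
inline std::size_t levelSlots(unsigned NumBits) {
  return std::size_t(1) << NumBits;
}

// Split the leading bits of Hash into one slot index per level, reading bits
// most-significant first (an assumption). Levels[i] is the bit width of
// level i; the widths must fit in the hash.
inline std::vector<std::size_t>
slotIndices(const std::vector<std::uint8_t> &Hash,
            const std::vector<unsigned> &Levels) {
  std::vector<std::size_t> Indices;
  std::size_t BitPos = 0;
  for (unsigned NumBits : Levels) {
    assert(BitPos + NumBits <= Hash.size() * 8 && "ran out of hash bits");
    std::size_t Index = 0;
    for (unsigned I = 0; I != NumBits; ++I, ++BitPos) {
      // Pick bit number BitPos of the hash, counting from the top of byte 0.
      unsigned Bit = (Hash[BitPos / 8] >> (7 - BitPos % 8)) & 1u;
      Index = (Index << 1) | Bit;
    }
    Indices.push_back(Index);
  }
  return Indices;
}
```

Under these assumptions a 10-bit root has `levelSlots(10)` = 1024 slots and a 6-bit root has 64, while a 1-bit level deep in the trie has only 2 slots, which is where the memory saving for sparsely used deep nodes would come from.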

Harbormaster completed remote builds in B205955: Diff 486632.Jan 5 2023, 12:31 PM

In D133715#4023734, @steven_wu wrote:

In D133715#3996263, @dblaikie wrote:

In D133715#3996194, @dexonsmith wrote:

In D133715#3996103, @dblaikie wrote:

eh, maybe string's easier to read in expect diagnostics anyway... just seems a bit awkward/circuitous/unfortunate :/ (I guess the stringification could move into the test code, implemented via such a friend relationship)

I do remember the string being easy to read in expect diagnostics, FWIW.

What about renaming the methods to printLayout() and dumpLayout()? Then print() and dump() would at least still be available for something user-centric.

Workable, though I'd still rather it be moved to the test file if it's not too inconvenient (with some friendship, probably). Avoid muddying the implementation with test-only features.

I find stringification quite useful for debugging or inspecting the state of the CAS. Our branch (to be upstreamed later) has the OnDiskCAS, and we have a tool that can dump the CAS state (mostly the trie information) in string format when pointed at the on-disk CAS. My point is that even if it is a testing/debugging tool, it is not limited to unit tests. I am OK with either name.

The ability to dump the data structure I'm all for - like DenseMap, etc, but I think that dumping should be the user-visible state (it's a map, the trie part is an implementation detail that ideally shouldn't be exposed in the dump developers use regularly - presumably it's not necessary except when debugging the data structure itself, which should be rare (we don't usually need to look at DenseMap's internal bucketing, for instance)).

But I think we can expose some private methods for debugging to avoid string comparison in the unit tests, and hopefully that is more readable.

In D133715#4023992, @steven_wu wrote:

Address review feedback and rebase the patch using std::optional

Still thinking about how to unit test without string comparison. The only way I can think of is to implement iteration on HashMappedTrie so that we have an API to check the state of a SubTrie. That would expose more types to the user, which would make the API more complicated.

Can the unit test be a friend to the data structure and use the data structure's internal APIs?

Also, for the discussion of names, do we have other candidates to replace "HashMappedTrie" that are better and clearer?

I'd really like to avoid "trie" in the name, though I guess it's as much an implementation detail as "Dense" is in DenseMap, so might be deserving to be in the name.

I think this is a map, so the name should probably end in 'map' not in 'trie'. I'd guess TrieHashMap - it's a type of hash map that uses a trie? Maybe make a shortlist of names to consider, etc?

In D133715#4029442, @steven_wu wrote:

Ping. All feedback has been addressed.

Additional notes: I dug a bit deeper into the benchmark from https://reviews.llvm.org/D132548, as it shows bad scaling during the parallel insertion tests (it stops scaling at 8 threads). That is actually caused by our ThreadSafeBumpPtrAllocator, which currently takes a lock. We can improve it in the future, since our use case doesn't expect heavy parallel insertions. However, it is quite obvious that we should tune RootBits and SubTrieBits. Increasing RootBits can significantly decrease the contention. A better strategy might be to start with something like 10 bits, then 4 bits, 2 bits, and 1 bit. Shrinking the number of bits can lead to better memory usage, since slot usage in the deep nodes is very low.

Could ThreadSafeBumpPtrAllocator be made lock-free? I think at least it would be possible to implement one that only locked when a new allocation is needed, instead of every time the ptr is bumped as now. (I’ll think about it a bit.)

Note that in the CAS use case it’s ideally true that most insertions are duplicates and don’t need to call the allocator at all. This is why we’ve been able to get away with a lock on each allocation.

The ability to dump the data structure I'm all for - like DenseMap, etc, but I think that dumping should be the user-visible state (it's a map, the trie part is an implementation detail that ideally shouldn't be exposed in the dump developers use regularly - presumably it's not necessary except when debugging the data structure itself, which should be rare (we don't usually need to look at DenseMap's internal bucketing, for instance)).

Correct. The current dump method is not designed to dump user-visible state. The goal is to dump how the hash/key is stored in the trie, and it doesn't even dump any user-stored value. To dump user-visible state, as one would for DenseMap, it needs to support iteration first, but HashMappedTrie currently doesn't support iteration (and the CAS/ObjectStore it implements cannot have iteration for security reasons). The only useful dump method that can be implemented without iteration is what it has now. I am OK with removing the dump method from this patch, since I have already rewritten the tests without using dump. The string dump of the trie structure can be useful for debugging a full CAS in the future. Since the CAS cannot have iteration, a separate tool that can dump the entire trie structure can be helpful for looking inside the CAS. In that case, dump is a public API that needs to be used by this CAS inspection tool, not a private method that is exposed to tests only.

Can the unit test be a friend to the data structure and use the data structure's internal APIs?

That is already the case now. I found a way to write the unit tests with no string comparison and to extract data from the trie without supporting real iteration. Supporting full iteration is not trivial, since it would involve either a very complicated iterator, adding back edges to the trie, or both. Currently, it has a very non-trivial way to iterate the sub-tries in an order that is not really interesting to users, and the way it finds the hash prefix is quite expensive too. Thus, this is only useful for unit tests, which use private APIs that are friended to the tests only. A real iteration needs to do a tree walk from the root and store the information it acquires along the way (like the dump method).

I'd really like to avoid "trie" in the name, though I guess it's as much an implementation detail as "Dense" is in DenseMap, so might be deserving to be in the name.

I think this is a map, so the name should probably end in 'map' not in 'trie'. I'd guess TrieHashMap - it's a type of hash map that uses a trie? Maybe make a shortlist of names to consider, etc?

I don't have candidates. TrieHashMap sounds good to me.

In D133715#4029841, @dexonsmith wrote:

In D133715#4029442, @steven_wu wrote:

Ping. All feedback has been addressed.

Additional notes: I dug a bit deeper into the benchmark from https://reviews.llvm.org/D132548, as it shows bad scaling during the parallel insertion tests (it stops scaling at 8 threads). That is actually caused by our ThreadSafeBumpPtrAllocator, which currently takes a lock. We can improve it in the future, since our use case doesn't expect heavy parallel insertions. However, it is quite obvious that we should tune RootBits and SubTrieBits. Increasing RootBits can significantly decrease the contention. A better strategy might be to start with something like 10 bits, then 4 bits, 2 bits, and 1 bit. Shrinking the number of bits can lead to better memory usage, since slot usage in the deep nodes is very low.

Could ThreadSafeBumpPtrAllocator be made lock-free? I think at least it would be possible to implement one that only locked when a new allocation is needed, instead of every time the ptr is bumped as now. (I’ll think about it a bit.)

Note that in the CAS use case it’s ideally true that most insertions are duplicates and don’t need to call the allocator at all. This is why we’ve been able to get away with a lock on each allocation.

Yes, the FIXME for a ThreadSafeBumpPtrAllocator is still there. Currently, I don't think it is urgent to fix. Nobody is expected to use the trie as a high-performance thread-safe set/map.

In D133715#4029879, @steven_wu wrote:

In D133715#4029841, @dexonsmith wrote:

In D133715#4029442, @steven_wu wrote:

Ping. All feedback has been addressed.

Additional notes: I dug a bit deeper into the benchmark from https://reviews.llvm.org/D132548, as it shows bad scaling during the parallel insertion tests (it stops scaling at 8 threads). That is actually caused by our ThreadSafeBumpPtrAllocator, which currently takes a lock. We can improve it in the future, since our use case doesn't expect heavy parallel insertions. However, it is quite obvious that we should tune RootBits and SubTrieBits. Increasing RootBits can significantly decrease the contention. A better strategy might be to start with something like 10 bits, then 4 bits, 2 bits, and 1 bit. Shrinking the number of bits can lead to better memory usage, since slot usage in the deep nodes is very low.

Could ThreadSafeBumpPtrAllocator be made lock-free? I think at least it would be possible to implement one that only locked when a new allocation is needed, instead of every time the ptr is bumped as now. (I’ll think about it a bit.)

Note that in the CAS use case it’s ideally true that most insertions are duplicates and don’t need to call the allocator at all. This is why we’ve been able to get away with a lock on each allocation.

Here's a sketch that I think mostly works?

// Block for bump allocations.
struct BumpBlock {
  std::atomic<char *> Ptr;
  std::atomic<BumpBlock *> Next;
  char Bytes[4084];
  BumpBlock() : Ptr{Bytes}, Next{nullptr} {}

  // Compute new "Next", try to bump if there's space, else return nullptr.
  void *tryAllocate(size_t N, size_t Align = 1);
};

// Tail-allocated data for "big" allocations.
struct BumpSeparate {
  std::atomic<BumpSeparate *> Next;
  // tail-allocated data of appropriate size and alignment, using malloc....
};

// Allocator.
struct BumpAllocator {
  std::atomic<BumpBlock *> CurrentBlock;
  std::atomic<BumpSeparate *> LastAllocSeparately;

  // Delete everything since there's no ownership here...
  ~BumpAllocator();

  void *allocate(size_t N, size_t Align = 1) {
    if (N > 2048)
      return allocateSeparately(N, Align);

    BumpBlock *B = CurrentBlock;
    std::unique_ptr<BumpBlock> New;
    void *NewAlloc = nullptr;
    while (true) {
      if (LLVM_LIKELY(B))
        if (void *Alloc = B->tryAllocate(N, Align))
          return Alloc;

      if (!New) {
        New = std::make_unique<BumpBlock>();
        NewAlloc = New->tryAllocate(N, Align);
        assert(NewAlloc && "Empty block doesn't have space!");
      }
      if (!CurrentBlock.compare_exchange_weak(B, New.get()))
        continue;

      // New was saved in CurrentBlock. Fix its "Next" pointer and release it
      // so it's not deallocated.
      New->Next = B;
      New.release();
      return NewAlloc;
    }
  }

private:
  // Maintain a list of "big" allocations, similar to above.
  void *allocateSeparately(size_t N, size_t Align);
};

Not saying it needs to block this review...

But having a fast concurrent BumpPtrAllocator would be independently useful, and I'd suggest optimizing the allocator before bloating the default trie size.
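The sketch above declares `BumpBlock::tryAllocate` but does not define it. One way it might work is a compare-exchange loop on the bump pointer, shown below as a simplified, self-contained sketch. The `Size` constant, the member order, and the omission of the `Next` link are assumptions made for illustration, and `Align` is assumed to be a power of two.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for the BumpBlock in the sketch above (no Next link).
struct BumpBlock {
  static constexpr std::size_t Size = 4096; // hypothetical block size
  alignas(std::max_align_t) char Bytes[Size];
  std::atomic<char *> Ptr{Bytes};

  // Try to carve N bytes aligned to Align out of this block. Returns nullptr
  // when the block has too little space left. No lock is taken: a thread that
  // loses the race reloads the bump pointer and retries.
  void *tryAllocate(std::size_t N, std::size_t Align = 1) {
    char *Old = Ptr.load();
    while (true) {
      auto Addr = reinterpret_cast<std::uintptr_t>(Old);
      std::uintptr_t Aligned =
          (Addr + Align - 1) & ~(static_cast<std::uintptr_t>(Align) - 1);
      std::size_t Padding = Aligned - Addr;
      std::size_t Free = Size - static_cast<std::size_t>(Old - Bytes);
      if (Padding > Free || N > Free - Padding)
        return nullptr;
      char *Start = Old + Padding;
      // On failure compare_exchange_weak stores the current value in Old.
      if (Ptr.compare_exchange_weak(Old, Start + N))
        return Start;
    }
  }
};
```

With this shape, the only synchronization on the fast path is one compare-exchange per allocation; the slower path (installing a new block) stays in the allocator's `allocate` loop.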

Yes, the FIXME for a ThreadSafeBumpPtrAllocator is still there. Currently, I don't think it is urgent to fix. Nobody is expected to use the trie as a high-performance thread-safe set/map.

The immediate use case is as a high performance thread-safe data store. In the CAS use case we're expecting a lot of duplicate insertions which don't happen to hit the allocator, but I'm not sure that's really been checked/measured.

In D133715#4029870, @steven_wu wrote:

The ability to dump the data structure I'm all for - like DenseMap, etc, but I think that dumping should be the user-visible state (it's a map, the trie part is an implementation detail that ideally shouldn't be exposed in the dump developers use regularly - presumably it's not necessary except when debugging the data structure itself, which should be rare (we don't usually need to look at DenseMap's internal bucketing, for instance)).

Correct. The current dump method is not designed to dump user-visible state. The goal is to dump how the hash/key is stored in the trie, and it doesn't even dump any user-stored value. To dump user-visible state, as one would for DenseMap, it needs to support iteration first, but HashMappedTrie currently doesn't support iteration (and the CAS/ObjectStore it implements cannot have iteration for security reasons).

Not sure I follow why it'd need to support iteration to support dumping of user-centric state. Supporting iteration (using an iterator abstraction) is a fair bit more complicated than walking the data structure directly to dump out its contents, I'd think.

The only useful dump method that can be implemented without iteration is what it has now. I am OK with removing the dump method from this patch, since I have already rewritten the tests without using dump. The string dump of the trie structure can be useful for debugging a full CAS in the future. Since the CAS cannot have iteration, a separate tool that can dump the entire trie structure can be helpful for looking inside the CAS. In that case, dump is a public API that needs to be used by this CAS inspection tool, not a private method that is exposed to tests only.

Yeah, I'm generally OK with a public dump method that gives a user-centric result, like DenseMap's dump... oh, my mistake, DenseMap doesn't have a dump method. I guess we rely on debugger pretty printers for that (& as the person who wrote the gdb one at least, it's not great - because it's hard to walk the user-visible state correctly/skipping internal implementation details - so it does end up printing the buckets, etc).

But, sure, doesn't need to be in this patch - I guess if we don't have it for other data structures, seems fair/reasonable/OK that we don't have it for this one.

Can the unit test be a friend to the data structure and use the data structure's internal APIs?

That is already the case now. I found a way to write the unit tests with no string comparison and to extract data from the trie without supporting real iteration. Supporting full iteration is not trivial, since it would involve either a very complicated iterator, adding back edges to the trie, or both. Currently, it has a very non-trivial way to iterate the sub-tries in an order that is not really interesting to users, and the way it finds the hash prefix is quite expensive too. Thus, this is only useful for unit tests, which use private APIs that are friended to the tests only. A real iteration needs to do a tree walk from the root and store the information it acquires along the way (like the dump method).

Ah, OK, great.

I'd really like to avoid "trie" in the name, though I guess it's as much an implementation detail as "Dense" is in DenseMap, so might be deserving to be in the name.

I think this is a map, so the name should probably end in 'map' not in 'trie'. I'd guess TrieHashMap - it's a type of hash map that uses a trie? Maybe make a shortlist of names to consider, etc?

I don't have candidates. TrieHashMap sounds good to me.

Sounds good to me, then.

llvm/unittests/ADT/HashMappedTrieTest.cpp
18 ↗	(On Diff #486632)	Rather than having to indirect everything through this class helper - I think it's possible to name the class of the test fixture itself, and then you could friend that, so the test fixture would have direct access? Might be simpler that way.

But having a fast concurrent BumpPtrAllocator would be independently useful, and I'd suggest optimizing the allocator before bloating the default trie size.

+1 for fast concurrent ThreadSafeBumpPtrAllocator.

What do you think about the following alternative implementation?

class ThreadSafeBumpPtrAllocator {
  ThreadSafeBumpPtrAllocator() {
    size_t ThreadsNum = ThreadPoolStrategy.compute_thread_count();
    allocators.resize(ThreadsNum);
  }
  
  void* Allocate (size_t Num) {
      size_t AllocatorIdx = getThreadIdx();
      
      return allocators[AllocatorIdx].Allocate(Num);
  }

  std::vector<BumpPtrAllocator> allocators;
};

static thread_local size_t ThreadIdx;

size_t getThreadIdx() {
  return ThreadIdx;
}

This implementation uses the fact that ThreadPoolExecutor creates a fixed number
of threads (ThreadPoolStrategy.compute_thread_count()) and keeps them until it is destructed.
ThreadPoolExecutor can initialise the thread-local field ThreadIdx to the proper thread index.
getThreadIdx() could then return the index of the thread inside ThreadPoolExecutor.Threads.
ThreadSafeBumpPtrAllocator keeps a separate allocator for each thread, so each thread would
always use its own allocator. There is no need for locks or compare-and-swap operations, and there are no races.
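The per-thread idea can be tried out without any LLVM dependency. The following is a minimal sketch using only the standard library; `Arena`, `PerThreadAllocator` and `runWorkers` are hypothetical names, `Arena` is a stand-in for a single-threaded bump allocator, and the worker loop stands in for a thread pool that assigns each thread its index once.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Stand-in for a single-threaded bump allocator (hypothetical).
class Arena {
  std::vector<std::unique_ptr<char[]>> Chunks;

public:
  void *allocate(std::size_t N) {
    Chunks.push_back(std::make_unique<char[]>(N));
    return Chunks.back().get();
  }
  std::size_t numAllocations() const { return Chunks.size(); }
};

// Index of the current worker; the pool sets it once per thread.
thread_local std::size_t ThreadIdx = 0;

// One arena per worker, so allocation needs no lock and no atomics.
class PerThreadAllocator {
  std::vector<Arena> Arenas;

public:
  explicit PerThreadAllocator(std::size_t NumThreads) : Arenas(NumThreads) {}
  void *allocate(std::size_t N) { return Arenas[ThreadIdx].allocate(N); }
  std::size_t numAllocations(std::size_t Idx) const {
    return Arenas[Idx].numAllocations();
  }
};

// Start NumThreads workers; each sets its index and allocates PerThread times.
inline void runWorkers(PerThreadAllocator &A, std::size_t NumThreads,
                       std::size_t PerThread) {
  std::vector<std::thread> Workers;
  for (std::size_t I = 0; I != NumThreads; ++I)
    Workers.emplace_back([&A, I, PerThread] {
      ThreadIdx = I;
      for (std::size_t J = 0; J != PerThread; ++J)
        A.allocate(16);
    });
  for (std::thread &W : Workers)
    W.join();
}
```

The sketch only works if the number of threads is fixed up front and every allocating thread has a unique index, which is the coupling to the thread pool that this design relies on.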

@steven_wu To have some interface compatibility, and possibly to make HashMappedTrie usable for the use case from D96035,
would it be possible to add a boolean value to the result indicating whether the data has just been inserted?

std::pair<pointer, bool> insert(const_pointer Hint, value_type &&HashedData)

The use case for this:

std::pair<pointer, bool> Result = HashMappedTrie.insert(Data);
if (Result.second) {
  // initialize Result.first data
  Result.first->field = data;
}
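The requested contract is the same one the standard associative containers already use for insertion. As an analogy only (this uses `std::unordered_map`, not the trie), the sketch below initializes the payload only when the key was newly inserted; `Payload` and `internString` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct Payload {
  int UseCount = 0;
  bool Initialized = false;
};

// Return the payload for Key, running the one-time initialization only when
// the insertion actually created a new entry.
inline Payload &internString(std::unordered_map<std::string, Payload> &Map,
                             const std::string &Key) {
  auto Result = Map.try_emplace(Key); // std::pair<iterator, bool>
  if (Result.second)
    Result.first->second.Initialized = true;
  ++Result.first->second.UseCount;
  return Result.first->second;
}
```

Returning the flag lets the caller skip a separate lookup before deciding whether to initialize the stored value.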

In D133715#4031396, @avl wrote:
But having a fast concurrent BumpPtrAllocator would be independently useful, and I'd suggest optimizing the allocator before bloating the default trie size.

+1 for fast concurrent ThreadSafeBumpPtrAllocator.

What do you think about the following alternative implementation?
class ThreadSafeBumpPtrAllocator {
  ThreadSafeBumpPtrAllocator() {
    size_t ThreadsNum = ThreadPoolStrategy.compute_thread_count();
    allocators.resize(ThreadsNum);
  }
  
  void* Allocate (size_t Num) {
      size_t AllocatorIdx = getThreadIdx();
      
      return allocators[AllocatorIdx].Allocate(Num);
  }

  std::vector<BumpPtrAllocator> allocators;
};

static thread_local size_t ThreadIdx;

size_t getThreadIdx() {
  return ThreadIdx;
}
This implementation uses the fact that ThreadPoolExecutor creates a fixed number
of threads (ThreadPoolStrategy.compute_thread_count()) and keeps them until it is destructed.
ThreadPoolExecutor can initialise the thread-local field ThreadIdx to the proper thread index.
getThreadIdx() could then return the index of the thread inside ThreadPoolExecutor.Threads.
ThreadSafeBumpPtrAllocator keeps a separate allocator for each thread, so each thread would
always use its own allocator. There is no need for locks or compare-and-swap operations, and there are no races.

Let's move the discussion of ThreadSafeAllocator to https://reviews.llvm.org/D133713 since this patch just uses it and the implementation is over there.

This data structure was designed to be used by a CAS, so ideally the CAS should not be tied to the number of threads that will be spawned, or rely on the thread ID.
You can have a thread-local allocator that allocates the value to be stored; you just need to do the allocation in insertLazy with your own allocator and manage its lifetime.

In D133715#4031396, @avl wrote:
But having a fast concurrent BumpPtrAllocator would be independently useful, and I'd suggest optimizing the allocator before bloating the default trie size.

+1 for fast concurrent ThreadSafeBumpPtrAllocator.

What do you think about the following alternative implementation?
class ThreadSafeBumpPtrAllocator {
  ThreadSafeBumpPtrAllocator() {
    size_t ThreadsNum = ThreadPoolStrategy.compute_thread_count();
    allocators.resize(ThreadsNum);
  }
  
  void* Allocate (size_t Num) {
      size_t AllocatorIdx = getThreadIdx();
      
      return allocators[AllocatorIdx].Allocate(Num);
  }

  std::vector<BumpPtrAllocator> allocators;
};

static thread_local size_t ThreadIdx;

size_t getThreadIdx() {
  return ThreadIdx;
}
This implementation uses the fact that ThreadPoolExecutor creates a fixed number
of threads (ThreadPoolStrategy.compute_thread_count()) and keeps them until it is destructed.
ThreadPoolExecutor can initialise the thread-local field ThreadIdx to the proper thread index.
getThreadIdx() could then return the index of the thread inside ThreadPoolExecutor.Threads.
ThreadSafeBumpPtrAllocator keeps a separate allocator for each thread, so each thread would
always use its own allocator. There is no need for locks or compare-and-swap operations, and there are no races.

+1; seems worth experimenting with (downside is you have at least as many allocations as active threads, but maybe that’s fine); IIRC C++11 thread-local initialization is slow on Darwin so we might want __thread here, or maybe the static makes it fast; +1 to Steven’s comment this should move to the other review (also IMO this could be an incremental improvement that lands later).

@dexonsmith @benlangmuir @akyrtzi
Does TrieHashMap sound good to you?

Locally, I updated the code to use std::pair as the return value to make it more map-like, and removed the dump method. I will update the patch once we have agreed on the name.

I guess one concern with TrieHashMap is that if this is the lower level implementation, and someone might implement a more map-like API on top of this, we might not want to take the "better" name for the data structure that'll be less directly used?

Could prefix with "Raw" or maybe TrieRawHashMap? (since it's the hashing part that's particularly "raw" - relying on the hash being unique, etc)

In D133715#4032581, @dblaikie wrote:

I guess one concern with TrieHashMap is that if this is the lower level implementation, and someone might implement a more map-like API on top of this, we might not want to take the "better" name for the data structure that'll be less directly used?

Could prefix with "Raw" or maybe TrieRawHashMap? (since it's the hashing part that's particularly "raw" - relying on the hash being unique, etc)

Either of those “raw” names SGTM.

Maybe RawHashTrieMap? It reads better when Raw is in the front, and it contains hash-trie and trie-map, which are both terms describing data structures similar to this but this is much simpler, thus raw.

llvm/unittests/ADT/HashMappedTrieTest.cpp
18 ↗	(On Diff #486632)	I didn't see this comment. Can you elaborate on how this works? Since friendship isn't inherited and `TEST_F` tests are subclasses of the test fixture, the best I can think of is to create forwarding methods in the test fixture, and then it is not that different from what it is now.

In D133715#4032713, @steven_wu wrote:

Maybe RawHashTrieMap? It reads better when Raw is in the front, and it contains hash-trie and trie-map, which are both terms describing data structures similar to this but this is much simpler, thus raw.

I think from a client perspective, this is some sort of variant of a HashMap, so ending with that makes sense to me. It’s an implementation detail that it’s a trie so that seems better in the prefix.

dexonsmith added inline comments.Jan 6 2023, 6:06 PM

llvm/unittests/ADT/HashMappedTrieTest.cpp
18 ↗	(On Diff #486632)	I think:

namespace llvm::testing { class HashMappedTrieTest; }
namespace llvm {
class HashMappedTrie {
  friend class testing::HashMappedTrieTest;
};
} // namespace llvm

And then use the `llvm::testing` namespace to implement the test.
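A self-contained sketch of this friend-fixture arrangement, without gtest, may help show where forwarding is still needed. All names here (`demo`, `Counter`, `CounterTest`, `CounterTest_Basic`) are hypothetical; `CounterTest_Basic` plays the role of the class that `TEST_F` would generate, which derives from the fixture and therefore does not inherit its friendship.

```cpp
#include <cassert>
#include <cstddef>

namespace demo::testing {
class CounterTest; // forward declaration of the fixture
}

namespace demo {
class Counter {
  // Grant access to the fixture only; classes derived from it get none.
  friend class testing::CounterTest;
  std::size_t NumIncrements = 0;

public:
  void increment() { ++NumIncrements; }
};
} // namespace demo

namespace demo::testing {
// Plays the role of the test fixture.
class CounterTest {
protected:
  Counter C;
  // Forwarding accessor: reads private state on behalf of derived tests.
  std::size_t numIncrements() const { return C.NumIncrements; }
};

// Plays the role of a TEST_F-generated class. It cannot touch
// Counter::NumIncrements directly, so it goes through the fixture.
class CounterTest_Basic : public CounterTest {
public:
  bool run() {
    C.increment();
    C.increment();
    return numIncrements() == 2;
  }
};
} // namespace demo::testing
```

Only the fixture needs to be named in the library header, at the cost of one forwarding accessor per piece of private state the tests inspect.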

Rename to TrieRawHashMap.

Update tests to use test fixture.

steven_wu retitled this revision from [ADT] Add HashMappedTrie to [ADT] Add TrieRawHashMap.Jan 9 2023, 9:56 AM

steven_wu edited the summary of this revision. (Show Details)

Harbormaster completed remote builds in B206579: Diff 487489.Jan 9 2023, 12:00 PM

dblaikie added inline comments.Jan 10 2023, 12:32 PM

llvm/unittests/ADT/HashMappedTrieTest.cpp
18 ↗	(On Diff #486632)	https://stackoverflow.com/questions/2396370/how-to-make-google-test-classes-friends-with-my-classes seems to discuss how to friend a fixture directly, though may require gtest support/inclusion in the header.
llvm/unittests/ADT/TrieRawHashMapTest.cpp
294	This seems like a fair bit of text to wrap the trie - what's the value/extra test coverage this is providing? (sorry I'm not following too clearly) I'd have hoped/expected this API could be tested more directly, without needing this wrapper and/or without needing to involve something as non-trivial as std::string (instead using simple user defined types in the test)?

steven_wu added inline comments.Jan 10 2023, 2:11 PM

llvm/unittests/ADT/HashMappedTrieTest.cpp
18 ↗	(On Diff #486632)	I am not sure about adding a friend class in a library header that references tests (and one for each test added). What do you think about the current implementation? There is an extra forwarding layer in the test helper, but it's not too bad.
llvm/unittests/ADT/TrieRawHashMapTest.cpp
294	This is a simplified version of the original test from the original patch. The original patch implemented a generic set data structure in this test file and tested how the trie can be used like a set. Now it is simplified to a string set with a fake hash algorithm, because there is currently no value in putting TrieSet in ADT.

dblaikie added inline comments.Jan 10 2023, 4:28 PM

llvm/unittests/ADT/HashMappedTrieTest.cpp
18 ↗	(On Diff #486632)	It'd only be for the one test fixture here (HashMappedTrieTest), I think. Feels like it'd be nice to avoid the indirection through the `HashMappedTrieTestHelper`.
llvm/unittests/ADT/TrieRawHashMapTest.cpp
294	What extra test coverage does it provide? How bad/what would be the tradeoff of testing that functionality more directly without the mock?

steven_wu added inline comments.Jan 13 2023, 10:00 AM

llvm/unittests/ADT/HashMappedTrieTest.cpp
18 ↗	(On Diff #486632)	Sorry, I still don't quite get it. Currently `TrieRawHashMapTestHelper` is the only test fixture, but I have to put the indirection in there unless I want to friend every single test (currently 3) in `llvm/ADT/TrieRawHashMap.h` using `FRIEND_TEST`. Am I missing something?
llvm/unittests/ADT/TrieRawHashMapTest.cpp
294	The extra test coverage it provides is the ability to store non-POD data in the trie, where `insertLazy` is tested for lazy construction. I guess we could also test it with a plain uint64_t. The alternative might be to just put `ThreadSafeHashMappedTrieSet` from the original patch into a header in ADT, but there isn't a use case for that yet. So maybe testing insertLazy with uint64_t will make this cleaner.

Update unittest:

Simplify the test of StringSet by explicitly calling insertLazy on TrieHashMap.
Unify all the tests under the same template test fixture. Since we create this indirection, might as well use it to the full potential.

Harbormaster completed remote builds in B207706: Diff 489091.Jan 13 2023, 1:25 PM

Update after makeArrayRef deprecation

Harbormaster completed remote builds in B208276: Diff 489861.Jan 17 2023, 11:18 AM

avl mentioned this in D142318: [Support] Add PerThreadBumpPtrAllocator class..Jan 22 2023, 3:12 PM

Revision Contents

Path	Size
llvm/include/llvm/ADT/TrieRawHashMap.h	404 lines
llvm/lib/Support/CMakeLists.txt	1 line
llvm/lib/Support/TrieHashIndexGenerator.h	89 lines
llvm/lib/Support/TrieRawHashMap.cpp	494 lines
llvm/unittests/ADT/CMakeLists.txt	1 line
llvm/unittests/ADT/TrieRawHashMapTest.cpp	347 lines

Diff 489861

llvm/include/llvm/ADT/TrieRawHashMap.h

This file was added.

				//===- TrieRawHashMap.h ------------------------------------------ C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_ADT_TRIERAWHASHMAP_H
				#define LLVM_ADT_TRIERAWHASHMAP_H

				#include "llvm/ADT/ArrayRef.h"
				#include "llvm/ADT/StringRef.h"
				#include "llvm/Support/Casting.h"
				#include <atomic>
				#include <optional>

				namespace llvm {

				class raw_ostream;

				/// TrieRawHashMap - is a lock-free thread-safe trie that can be used to
				/// store/index data based on a hash value. It can be customized to work with
				/// any hash algorithm or store any data.
				///
				/// Data structure:
				/// Data node stored in the Trie contains both hash and data:
				/// struct {
				/// HashT Hash;
				/// DataT Data;
				/// };
				///
				/// Data is stored/indexed via a prefix tree, where each node in the tree can be
				/// either the root, a sub-trie or a data node. Assuming a 4-bit hash and two
				/// data objects {0001, A} and {0100, B}, it can be stored in a trie
				/// (assuming Root has 2 bits, SubTrie has 1 bit):
				/// +--------+
				/// |Root[00]| -> {0001, A}
				/// |    [01]| -> {0100, B}
				/// |    [10]| (empty)
				/// |    [11]| (empty)
				/// +--------+
				///
				/// Inserting a new object {0010, C} will result in:
				/// +--------+    +----------+
				/// |Root[00]| -> |SubTrie[0]| -> {0001, A}
				/// |        |    |       [1]| -> {0010, C}
				/// |        |    +----------+
				/// |    [01]| -> {0100, B}
				/// |    [10]| (empty)
				/// |    [11]| (empty)
				/// +--------+
				/// Note object A is sunk down to a sub-trie during the insertion. All the
				/// nodes are inserted through compare-exchange to ensure thread-safety and
				/// lock-freedom.
				///
				/// To find an object in the trie, walk the tree with prefix of the hash until
				/// the data node is found. Then the hash is compared with the hash stored in
				/// the data node to see if it is the same object.
				///
				/// Hash collision is not allowed so it is recommended to use trie with a
				/// "strong" hashing algorithm. A well-distributed hash can also result in
				/// better performance and memory usage.
				///
				/// It currently does not support iteration and deletion.

				/// Base class for a lock-free thread-safe hash-mapped trie.
				class ThreadSafeTrieRawHashMapBase {
				public:
				static constexpr size_t TrieContentBaseSize = 4;
				static constexpr size_t DefaultNumRootBits = 6;
				static constexpr size_t DefaultNumSubtrieBits = 4;

				private:
				template <class T> struct AllocValueType {
				char Base[TrieContentBaseSize];
				std::aligned_union_t<sizeof(T), T> Content;
				};

				protected:
				template <class T>
				static constexpr size_t DefaultContentAllocSize = sizeof(AllocValueType<T>);

				template <class T>
				static constexpr size_t DefaultContentAllocAlign = alignof(AllocValueType<T>);

				template <class T>
				static constexpr size_t DefaultContentOffset =
				offsetof(AllocValueType<T>, Content);

				public:
				void operator delete(void *Ptr) { ::free(Ptr); }

				LLVM_DUMP_METHOD void dump() const;
				void print(raw_ostream &OS) const;

				protected:
				/// Result of a lookup. Suitable for an insertion hint. Maybe could be
				/// expanded into an iterator of sorts, but likely not useful (visiting
				/// everything in the trie should probably be done some way other than
				/// through an iterator pattern).
				class PointerBase {
				protected:
				void *get() const { return I == -2u ? P : nullptr; }

				public:
				PointerBase() noexcept = default;
				PointerBase(PointerBase &&) = default;
				PointerBase(const PointerBase &) = default;
				PointerBase &operator=(PointerBase &&) = default;
				PointerBase &operator=(const PointerBase &) = default;

				private:
				friend class ThreadSafeTrieRawHashMapBase;
				explicit PointerBase(void *Content) : P(Content), I(-2u) {}
				PointerBase(void *P, unsigned I, unsigned B) : P(P), I(I), B(B) {}

				bool isHint() const { return I != -1u && I != -2u; }

				void *P = nullptr;
				unsigned I = -1u;
				unsigned B = 0;
				};

				/// Find the stored content with hash.
				PointerBase find(ArrayRef<uint8_t> Hash) const;

				/// Insert and return the stored content.
				/// If the hash is already in the trie, it returns false.
				std::pair<PointerBase, bool>
				insert(PointerBase Hint, ArrayRef<uint8_t> Hash,
				function_ref<const uint8_t *(void *Mem, ArrayRef<uint8_t> Hash)>
				Constructor);

				ThreadSafeTrieRawHashMapBase() = delete;

				ThreadSafeTrieRawHashMapBase(
				size_t ContentAllocSize, size_t ContentAllocAlign, size_t ContentOffset,
				std::optional<size_t> NumRootBits = std::nullopt,
				std::optional<size_t> NumSubtrieBits = std::nullopt);

				/// Destructor, which asserts if there's anything to do. Subclasses should
				/// call \a destroyImpl().
				///
				/// \pre \a destroyImpl() was already called.
				~ThreadSafeTrieRawHashMapBase();
				void destroyImpl(function_ref<void(void *ValueMem)> Destructor);

				ThreadSafeTrieRawHashMapBase(ThreadSafeTrieRawHashMapBase &&RHS);

				// Move assignment can be implemented in a thread-safe way if NumRootBits and
				// NumSubtrieBits are stored inside the Root.
				ThreadSafeTrieRawHashMapBase &
				operator=(ThreadSafeTrieRawHashMapBase &&RHS) = delete;

				// No copy.
				ThreadSafeTrieRawHashMapBase(const ThreadSafeTrieRawHashMapBase &) = delete;
				ThreadSafeTrieRawHashMapBase &
				operator=(const ThreadSafeTrieRawHashMapBase &) = delete;

				// Debug functions. Implementation details and not guaranteed to be
				// thread-safe.
				PointerBase getRoot() const;
				unsigned getStartBit(PointerBase P) const;
				unsigned getNumBits(PointerBase P) const;
				unsigned getNumSlotUsed(PointerBase P) const;
				std::string getTriePrefixAsString(PointerBase P) const;
				unsigned getNumTries() const;
				// Visit next trie in the allocation chain.
				PointerBase getNextTrie(PointerBase P) const;

				private:
				friend class TrieRawHashMapTestHelper;
				const unsigned short ContentAllocSize;
				const unsigned short ContentAllocAlign;
				const unsigned short ContentOffset;
				unsigned short NumRootBits;
				unsigned short NumSubtrieBits;
				struct ImplType;
				// ImplPtr is owned by ThreadSafeTrieRawHashMapBase and needs to be freed in
				// destroyImpl.
				std::atomic<ImplType *> ImplPtr;
				ImplType &getOrCreateImpl();
				ImplType *getImpl() const;
				};

				/// Lock-free thread-safe hash-mapped trie.
				template <class T, size_t NumHashBytes>
				class ThreadSafeTrieRawHashMap : public ThreadSafeTrieRawHashMapBase {
				public:
				using HashT = std::array<uint8_t, NumHashBytes>;

				class LazyValueConstructor;
				struct value_type {
				const HashT Hash;
				T Data;

				value_type(value_type &&) = default;
				value_type(const value_type &) = default;

				value_type(ArrayRef<uint8_t> Hash, const T &Data)
				: Hash(copyHash(Hash)), Data(Data) {}
				value_type(ArrayRef<uint8_t> Hash, T &&Data)
				: Hash(copyHash(Hash)), Data(std::move(Data)) {}

				private:
				friend class LazyValueConstructor;

				struct EmplaceTag {};
				template <class... ArgsT>
				value_type(ArrayRef<uint8_t> Hash, EmplaceTag, ArgsT &&...Args)
				: Hash(copyHash(Hash)), Data(std::forward<ArgsT>(Args)...) {}

				static HashT copyHash(ArrayRef<uint8_t> HashRef) {
				HashT Hash;
				std::copy(HashRef.begin(), HashRef.end(), Hash.data());
				return Hash;
				}
				};

				using ThreadSafeTrieRawHashMapBase::operator delete;
				using HashType = HashT;

				using ThreadSafeTrieRawHashMapBase::dump;
				using ThreadSafeTrieRawHashMapBase::print;

				private:
				template <class ValueT> class PointerImpl : PointerBase {
				friend class ThreadSafeTrieRawHashMap;

				ValueT *get() const {
				if (void *B = PointerBase::get())
				return reinterpret_cast<ValueT *>(B);
				return nullptr;
				}

				public:
				ValueT &operator*() const {
				assert(get());
				return *get();
				}
				ValueT *operator->() const {
				assert(get());
				return get();
				}
				explicit operator bool() const { return get(); }

				PointerImpl() = default;
				PointerImpl(PointerImpl &&) = default;
				PointerImpl(const PointerImpl &) = default;
				PointerImpl &operator=(PointerImpl &&) = default;
				PointerImpl &operator=(const PointerImpl &) = default;

				protected:
				PointerImpl(PointerBase Result) : PointerBase(Result) {}
				};

				public:
				class pointer;
				class const_pointer;
				class pointer : public PointerImpl<value_type> {
				friend class ThreadSafeTrieRawHashMap;
				friend class const_pointer;

				public:
				pointer() = default;
				pointer(pointer &&) = default;
				pointer(const pointer &) = default;
				pointer &operator=(pointer &&) = default;
				pointer &operator=(const pointer &) = default;

				private:
				pointer(PointerBase Result) : pointer::PointerImpl(Result) {}
				};

				class const_pointer : public PointerImpl<const value_type> {
				friend class ThreadSafeTrieRawHashMap;

				public:
				const_pointer() = default;
				const_pointer(const_pointer &&) = default;
				const_pointer(const const_pointer &) = default;
				const_pointer &operator=(const_pointer &&) = default;
				const_pointer &operator=(const const_pointer &) = default;

				const_pointer(const pointer &P) : const_pointer::PointerImpl(P) {}

				private:
				const_pointer(PointerBase Result) : const_pointer::PointerImpl(Result) {}
				};

				class LazyValueConstructor {
				public:
				value_type &operator()(T &&RHS) {
				assert(Mem && "Constructor already called, or moved away");
				return assign(::new (Mem) value_type(Hash, std::move(RHS)));
				}
				value_type &operator()(const T &RHS) {
				assert(Mem && "Constructor already called, or moved away");
				return assign(::new (Mem) value_type(Hash, RHS));
				}
				template <class... ArgsT> value_type &emplace(ArgsT &&...Args) {
				assert(Mem && "Constructor already called, or moved away");
				return assign(::new (Mem)
				value_type(Hash, typename value_type::EmplaceTag{},
				std::forward<ArgsT>(Args)...));
				}

				LazyValueConstructor(LazyValueConstructor &&RHS)
				: Mem(RHS.Mem), Result(RHS.Result), Hash(RHS.Hash) {
				RHS.Mem = nullptr; // Moved away, cannot call.
				}
				~LazyValueConstructor() { assert(!Mem && "Constructor never called!"); }

				private:
				value_type &assign(value_type *V) {
				Mem = nullptr;
				Result = V;
				return *V;
				}
				friend class ThreadSafeTrieRawHashMap;
				LazyValueConstructor() = delete;
				LazyValueConstructor(void *Mem, value_type *&Result, ArrayRef<uint8_t> Hash)
				: Mem(Mem), Result(Result), Hash(Hash) {
				assert(Hash.size() == sizeof(HashT) && "Invalid hash");
				assert(Mem && "Invalid memory for construction");
				}
				void *Mem;
				value_type *&Result;
				ArrayRef<uint8_t> Hash;
				};

				/// Insert with a hint. Default-constructed hint will work, but it's
				/// recommended to start with a lookup to avoid overhead in object creation
				/// if it already exists.
				/// Return false if the hash is already in the map.
				std::pair<pointer, bool>
				insertLazy(const_pointer Hint, ArrayRef<uint8_t> Hash,
				function_ref<void(LazyValueConstructor)> OnConstruct) {
				auto Result = ThreadSafeTrieRawHashMapBase::insert(
				Hint, Hash, [&](void *Mem, ArrayRef<uint8_t> Hash) {
				value_type *Result = nullptr;
				OnConstruct(LazyValueConstructor(Mem, Result, Hash));
				return Result->Hash.data();
				});
				return {pointer(Result.first), Result.second};
				}

				std::pair<pointer, bool>
				insertLazy(ArrayRef<uint8_t> Hash,
				function_ref<void(LazyValueConstructor)> OnConstruct) {
				return insertLazy(const_pointer(), Hash, OnConstruct);
				}

				std::pair<pointer, bool> insert(const_pointer Hint, value_type &&HashedData) {
				return insertLazy(Hint, HashedData.Hash, [&](LazyValueConstructor C) {
				C(std::move(HashedData.Data));
				});
				}

				std::pair<pointer, bool> insert(const_pointer Hint,
				const value_type &HashedData) {
				return insertLazy(Hint, HashedData.Hash,
				[&](LazyValueConstructor C) { C(HashedData.Data); });
				}

				pointer find(ArrayRef<uint8_t> Hash) {
				assert(Hash.size() == std::tuple_size<HashT>::value);
				return ThreadSafeTrieRawHashMapBase::find(Hash);
				}

				const_pointer find(ArrayRef<uint8_t> Hash) const {
				assert(Hash.size() == std::tuple_size<HashT>::value);
				return ThreadSafeTrieRawHashMapBase::find(Hash);
				}

				ThreadSafeTrieRawHashMap(std::optional<size_t> NumRootBits = std::nullopt,
				std::optional<size_t> NumSubtrieBits = std::nullopt)
				: ThreadSafeTrieRawHashMapBase(DefaultContentAllocSize<value_type>,
				DefaultContentAllocAlign<value_type>,
				DefaultContentOffset<value_type>,
				NumRootBits, NumSubtrieBits) {}

				~ThreadSafeTrieRawHashMap() {
				if constexpr (std::is_trivially_destructible<value_type>::value)
				this->destroyImpl(nullptr);
				else
				this->destroyImpl(
				[](void *P) { static_cast<value_type *>(P)->~value_type(); });
				}

				// Move constructor okay.
				ThreadSafeTrieRawHashMap(ThreadSafeTrieRawHashMap &&) = default;

				// No move assignment or any copy.
				ThreadSafeTrieRawHashMap &operator=(ThreadSafeTrieRawHashMap &&) = delete;
				ThreadSafeTrieRawHashMap(const ThreadSafeTrieRawHashMap &) = delete;
				ThreadSafeTrieRawHashMap &
				operator=(const ThreadSafeTrieRawHashMap &) = delete;
				};

				} // namespace llvm

				#endif // LLVM_ADT_TRIERAWHASHMAP_H
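
The sinking behaviour described in the header comment can be illustrated with a single-threaded toy. The sketch below is not the patch's implementation: `ToyNode`, `toyIndex`, `makeSubtrie`, `toyInsert`, and `toyFind` are hypothetical names, the hash is a single 4-bit value, and there are no atomics or compare-exchange. It only replays the `{0001, A}`, `{0100, B}`, `{0010, C}` example with a 2-bit root and 1-bit subtries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy node: either a leaf {Hash, Data} or a subtrie with slots.
struct ToyNode {
  bool IsSubtrie = false;
  uint8_t Hash = 0;                  // Leaf only.
  char Data = 0;                     // Leaf only.
  unsigned StartBit = 0, NumBits = 0; // Subtrie only.
  std::vector<std::unique_ptr<ToyNode>> Slots;
};

constexpr unsigned HashBits = 4, RootBits = 2, SubtrieBits = 1;

// Read NumBits bits of a 4-bit hash starting at StartBit, most significant first.
inline unsigned toyIndex(uint8_t Hash, unsigned StartBit, unsigned NumBits) {
  return (Hash >> (HashBits - StartBit - NumBits)) & ((1u << NumBits) - 1u);
}

inline std::unique_ptr<ToyNode> makeSubtrie(unsigned StartBit, unsigned NumBits) {
  auto S = std::make_unique<ToyNode>();
  S->IsSubtrie = true;
  S->StartBit = StartBit;
  S->NumBits = NumBits;
  S->Slots.resize(std::size_t(1) << NumBits);
  return S;
}

// Returns false if the hash is already present (mirrors the documented contract).
inline bool toyInsert(ToyNode &Root, uint8_t Hash, char Data) {
  ToyNode *S = &Root;
  for (;;) {
    std::unique_ptr<ToyNode> &Slot =
        S->Slots[toyIndex(Hash, S->StartBit, S->NumBits)];
    if (!Slot) { // Empty slot: store the new leaf here.
      Slot = std::make_unique<ToyNode>();
      Slot->Hash = Hash;
      Slot->Data = Data;
      return true;
    }
    if (Slot->IsSubtrie) { // Descend one level.
      S = Slot.get();
      continue;
    }
    if (Slot->Hash == Hash)
      return false;
    // Different hash in the same slot: sink the existing leaf into a new subtrie.
    unsigned NextStart = S->StartBit + S->NumBits;
    assert(NextStart < HashBits && "distinct hashes must differ in some bit");
    auto Sub = makeSubtrie(NextStart, SubtrieBits);
    Sub->Slots[toyIndex(Slot->Hash, NextStart, SubtrieBits)] = std::move(Slot);
    Slot = std::move(Sub);
    S = Slot.get(); // Retry the insertion one level down.
  }
}

// Walk by hash prefix, then compare the full hash stored in the leaf.
inline const ToyNode *toyFind(const ToyNode &Root, uint8_t Hash) {
  const ToyNode *S = &Root;
  for (;;) {
    const ToyNode *N = S->Slots[toyIndex(Hash, S->StartBit, S->NumBits)].get();
    if (!N)
      return nullptr;
    if (!N->IsSubtrie)
      return N->Hash == Hash ? N : nullptr;
    S = N;
  }
}
```

Starting from `makeSubtrie(0, RootBits)` and inserting A, B, and C in that order leaves A in slot 0 and C in slot 1 of a new subtrie under `Root[00]`, which is the layout drawn in the header comment.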

llvm/lib/Support/CMakeLists.txt

add_llvm_component_library(LLVMSupport
SuffixTree.cpp		SuffixTree.cpp
SymbolRemappingReader.cpp		SymbolRemappingReader.cpp
SystemUtils.cpp		SystemUtils.cpp
TarWriter.cpp		TarWriter.cpp
ThreadPool.cpp		ThreadPool.cpp
TimeProfiler.cpp		TimeProfiler.cpp
Timer.cpp		Timer.cpp
ToolOutputFile.cpp		ToolOutputFile.cpp
		TrieRawHashMap.cpp
TrigramIndex.cpp		TrigramIndex.cpp
Twine.cpp		Twine.cpp
TypeSize.cpp		TypeSize.cpp
Unicode.cpp		Unicode.cpp
UnicodeCaseFold.cpp		UnicodeCaseFold.cpp
UnicodeNameToCodepoint.cpp		UnicodeNameToCodepoint.cpp
UnicodeNameToCodepointGenerated.cpp		UnicodeNameToCodepointGenerated.cpp
VersionTuple.cpp		VersionTuple.cpp

llvm/lib/Support/TrieHashIndexGenerator.h

This file was added.

				//===- TrieHashIndexGenerator.h ---------------------------------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_LIB_SUPPORT_TRIEHASHINDEXGENERATOR_H
				#define LLVM_LIB_SUPPORT_TRIEHASHINDEXGENERATOR_H

				#include "llvm/ADT/ArrayRef.h"
				#include <optional>

				namespace llvm {

				struct IndexGenerator {
				size_t NumRootBits;
				size_t NumSubtrieBits;
				ArrayRef<uint8_t> Bytes;
				std::optional<size_t> StartBit = std::nullopt;

				size_t getNumBits() const {
				assert(StartBit);
				size_t TotalNumBits = Bytes.size() * 8;
				assert(*StartBit <= TotalNumBits);
				return std::min(*StartBit ? NumSubtrieBits : NumRootBits,
				TotalNumBits - *StartBit);
				}
				size_t next() {
				size_t Index;
				if (!StartBit) {
				StartBit = 0;
				Index = getIndex(Bytes, *StartBit, NumRootBits);
				} else {
				*StartBit += *StartBit ? NumSubtrieBits : NumRootBits;
				assert((*StartBit - NumRootBits) % NumSubtrieBits == 0);
				Index = getIndex(Bytes, *StartBit, NumSubtrieBits);
				}
				return Index;
				}

				size_t hint(unsigned Index, unsigned Bit) {
				assert(Index >= 0);
				assert(Bit < Bytes.size() * 8);
				assert(Bit == 0 || (Bit - NumRootBits) % NumSubtrieBits == 0);
				StartBit = Bit;
				return Index;
				}

				size_t getCollidingBits(ArrayRef<uint8_t> CollidingBits) const {
				assert(StartBit);
				return getIndex(CollidingBits, *StartBit, NumSubtrieBits);
				}

				static size_t getIndex(ArrayRef<uint8_t> Bytes, size_t StartBit,
				size_t NumBits) {
				assert(StartBit < Bytes.size() * 8);

				Bytes = Bytes.drop_front(StartBit / 8u);
				StartBit %= 8u;
				size_t Index = 0;
				for (uint8_t Byte : Bytes) {
				size_t ByteStart = 0, ByteEnd = 8;
				if (StartBit) {
				ByteStart = StartBit;
				Byte &= (1u << (8 - StartBit)) - 1u;
				StartBit = 0;
				}
				size_t CurrentNumBits = ByteEnd - ByteStart;
				if (CurrentNumBits > NumBits) {
				Byte >>= CurrentNumBits - NumBits;
				CurrentNumBits = NumBits;
				}
				Index <<= CurrentNumBits;
				Index |= Byte & ((1u << CurrentNumBits) - 1u);

				assert(NumBits >= CurrentNumBits);
				NumBits -= CurrentNumBits;
				if (!NumBits)
				break;
				}
				return Index;
				}
				};

				} // namespace llvm

				#endif // LLVM_LIB_SUPPORT_TRIEHASHINDEXGENERATOR_H
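
To make the bit slicing in `IndexGenerator::getIndex` easier to follow, here is a simplified standalone sketch that reads the same bit range one bit at a time. It assumes most-significant-bit-first order within each byte and clamps at the end of the hash, which is how the loop above appears to behave. `extractBits` and `indexPath` are hypothetical helpers, and `std::vector<uint8_t>` stands in for `ArrayRef<uint8_t>`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: read up to NumBits bits starting at StartBit,
// most significant bit first, stopping early at the end of the hash.
inline std::size_t extractBits(const std::vector<uint8_t> &Bytes,
                               std::size_t StartBit, std::size_t NumBits) {
  assert(StartBit < Bytes.size() * 8);
  std::size_t End = std::min(StartBit + NumBits, Bytes.size() * 8);
  std::size_t Index = 0;
  for (std::size_t Bit = StartBit; Bit < End; ++Bit) {
    unsigned Shift = 7 - Bit % 8; // Bit 0 is the top bit of byte 0.
    Index = (Index << 1) | ((Bytes[Bit / 8] >> Shift) & 1u);
  }
  return Index;
}

// Hypothetical helper: the slot indexes visited when the whole hash is
// consumed, using NumRootBits for the first level and NumSubtrieBits after.
inline std::vector<std::size_t> indexPath(const std::vector<uint8_t> &Bytes,
                                          std::size_t NumRootBits,
                                          std::size_t NumSubtrieBits) {
  assert(NumRootBits > 0 && NumSubtrieBits > 0);
  std::vector<std::size_t> Path;
  std::size_t Start = 0, Width = NumRootBits;
  while (Start < Bytes.size() * 8) {
    Path.push_back(extractBits(Bytes, Start, Width));
    Start += Width;
    Width = NumSubtrieBits;
  }
  return Path;
}
```

For the two-byte hash `{0xAB, 0xCD}` with the default 6 root bits and 4 subtrie bits, the path is 42, 15, 3, 1: the last step only has two bits left to read.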

llvm/lib/Support/TrieRawHashMap.cpp

This file was added.

				//===- TrieRawHashMap.cpp -------------------------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/TrieRawHashMap.h"
				#include "TrieHashIndexGenerator.h"
				#include "llvm/ADT/LazyAtomicPointer.h"
				#include "llvm/Support/Allocator.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/ThreadSafeAllocator.h"
				#include "llvm/Support/raw_ostream.h"
				#include <memory>

				using namespace llvm;

				namespace {
				struct TrieNode {
				const bool IsSubtrie = false;

				TrieNode(bool IsSubtrie) : IsSubtrie(IsSubtrie) {}

				static void *operator new(size_t Size) { return ::malloc(Size); }
				void operator delete(void *Ptr) { ::free(Ptr); }
				};

				struct TrieContent final : public TrieNode {
				const uint8_t ContentOffset;
				const uint8_t HashSize;
				const uint8_t HashOffset;

				void *getValuePointer() const {
				auto Content = reinterpret_cast<const uint8_t *>(this) + ContentOffset;
				return const_cast<uint8_t *>(Content);
				}

				ArrayRef<uint8_t> getHash() const {
				auto *Begin = reinterpret_cast<const uint8_t *>(this) + HashOffset;
				return ArrayRef(Begin, Begin + HashSize);
				}

				TrieContent(size_t ContentOffset, size_t HashSize, size_t HashOffset)
				: TrieNode(/*IsSubtrie=*/false), ContentOffset(ContentOffset),
				HashSize(HashSize), HashOffset(HashOffset) {}
				};
				static_assert(sizeof(TrieContent) ==
				ThreadSafeTrieRawHashMapBase::TrieContentBaseSize,
				"Check header assumption!");

				class TrieSubtrie final : public TrieNode {
				public:
				TrieNode *get(size_t I) const { return Slots[I].load(); }

				TrieSubtrie *
				sink(size_t I, TrieContent &Content, size_t NumSubtrieBits, size_t NewI,
				function_ref<TrieSubtrie *(std::unique_ptr<TrieSubtrie>)> Saver);

				static std::unique_ptr<TrieSubtrie> create(size_t StartBit, size_t NumBits);

				explicit TrieSubtrie(size_t StartBit, size_t NumBits);

				private:
				// FIXME: Use a bitset to speed up access:
				//
				// std::array<std::atomic<uint64_t>, NumSlots/64> IsSet;
				//
				// This will avoid needing to visit sparsely filled slots in
				// \a ThreadSafeTrieRawHashMapBase::destroyImpl() when there's a non-trivial
				// destructor.
				//
				// It would also greatly speed up iteration, if we add that some day, and
				// allow get() to return one level sooner.
				//
				// This would be the algorithm for updating IsSet (after updating Slots):
				//
				// std::atomic<uint64_t> &Bits = IsSet[I.High];
				// const uint64_t NewBit = 1ULL << I.Low;
				// uint64_t Old = 0;
				// while (!Bits.compare_exchange_weak(Old, Old | NewBit))
				// ;

				// For debugging.
				unsigned StartBit = 0;
				unsigned NumBits = 0;
				friend class llvm::ThreadSafeTrieRawHashMapBase;

				public:
				/// Linked list for ownership of tries. The pointer is owned by TrieSubtrie.
				std::atomic<TrieSubtrie *> Next;

				/// The (co-allocated) slots of the subtrie.
				MutableArrayRef<LazyAtomicPointer<TrieNode>> Slots;
				};
				} // end namespace

				namespace llvm {
				template <> struct isa_impl<TrieContent, TrieNode> {
				static inline bool doit(const TrieNode &TN) { return !TN.IsSubtrie; }
				};
				template <> struct isa_impl<TrieSubtrie, TrieNode> {
				static inline bool doit(const TrieNode &TN) { return TN.IsSubtrie; }
				};
				} // end namespace llvm

				static size_t getTrieTailSize(size_t StartBit, size_t NumBits) {
				assert(NumBits < 20 && "Tries should have fewer than ~1M slots");
				return sizeof(TrieNode *) * (1u << NumBits);
				}

				std::unique_ptr<TrieSubtrie> TrieSubtrie::create(size_t StartBit,
				size_t NumBits) {
				size_t Size = sizeof(TrieSubtrie) + getTrieTailSize(StartBit, NumBits);
				void *Memory = ::malloc(Size);
				TrieSubtrie *S = ::new (Memory) TrieSubtrie(StartBit, NumBits);
				return std::unique_ptr<TrieSubtrie>(S);
				}

				TrieSubtrie::TrieSubtrie(size_t StartBit, size_t NumBits)
				: TrieNode(true), StartBit(StartBit), NumBits(NumBits), Next(nullptr),
				Slots(reinterpret_cast<LazyAtomicPointer<TrieNode> *>(
				reinterpret_cast<char *>(this) + sizeof(TrieSubtrie)),
				(1u << NumBits)) {
				for (auto I = Slots.begin(), E = Slots.end(); I != E; ++I)
				new (I) LazyAtomicPointer<TrieNode>(nullptr);

				static_assert(
				std::is_trivially_destructible<LazyAtomicPointer<TrieNode>>::value,
				"Expected no work in destructor for TrieNode");
				}

				TrieSubtrie *TrieSubtrie::sink(
				size_t I, TrieContent &Content, size_t NumSubtrieBits, size_t NewI,
				function_ref<TrieSubtrie *(std::unique_ptr<TrieSubtrie>)> Saver) {
				assert(NumSubtrieBits > 0);
				std::unique_ptr<TrieSubtrie> S = create(StartBit + NumBits, NumSubtrieBits);

				assert(NewI < S->Slots.size());
				S->Slots[NewI].store(&Content);

				TrieNode *ExistingNode = &Content;
				assert(I < Slots.size());
				if (Slots[I].compare_exchange_strong(ExistingNode, S.get()))
				return Saver(std::move(S));

				// Another thread created a subtrie already. Return it and let "S" be
				// destructed.
				return cast<TrieSubtrie>(ExistingNode);
				}

				struct ThreadSafeTrieRawHashMapBase::ImplType {
				static ImplType *create(size_t StartBit, size_t NumBits) {
				size_t Size = sizeof(ImplType) + getTrieTailSize(StartBit, NumBits);
				void *Memory = ::malloc(Size);
				return ::new (Memory) ImplType(StartBit, NumBits);
				}

				TrieSubtrie *save(std::unique_ptr<TrieSubtrie> S) {
				assert(!S->Next && "Expected S to be a freshly-constructed leaf");

				TrieSubtrie *CurrentHead = nullptr;
				// Add ownership of "S" to front of the list, so that Root -> S ->
				// Root.Next. This works by repeatedly setting S->Next to a candidate value
				// of Root.Next (initially nullptr), then setting Root.Next to S once the
				// candidate matches reality.
				while (!Root.Next.compare_exchange_weak(CurrentHead, S.get()))
				S->Next.exchange(CurrentHead);

				// Ownership transferred to subtrie.
				return S.release();
				}

				static void *operator new(size_t Size) { return ::malloc(Size); }
				void operator delete(void *Ptr) { ::free(Ptr); }

				/// FIXME: This should take a function that allocates and constructs the
				/// content lazily (taking the hash as a separate parameter), in case of
				/// collision.
				ThreadSafeAllocator<BumpPtrAllocator> ContentAlloc;
				TrieSubtrie Root; // Must be last! Tail-allocated.

				private:
				ImplType(size_t StartBit, size_t NumBits) : Root(StartBit, NumBits) {}
				};

				ThreadSafeTrieRawHashMapBase::ImplType &
				ThreadSafeTrieRawHashMapBase::getOrCreateImpl() {
				if (ImplType *Impl = ImplPtr.load())
				return *Impl;

				// Create a new ImplType and store it if another thread doesn't do so first.
				// If another thread wins this one is destroyed locally.
				std::unique_ptr<ImplType> Impl(ImplType::create(0, NumRootBits));
				ImplType *ExistingImpl = nullptr;
				if (ImplPtr.compare_exchange_strong(ExistingImpl, Impl.get()))
				return *Impl.release();

				return *ExistingImpl;
				}

				ThreadSafeTrieRawHashMapBase::PointerBase
				ThreadSafeTrieRawHashMapBase::find(ArrayRef<uint8_t> Hash) const {
				assert(!Hash.empty() && "Uninitialized hash");

				ImplType *Impl = ImplPtr.load();
				if (!Impl)
				return PointerBase();

				TrieSubtrie *S = &Impl->Root;
				IndexGenerator IndexGen{NumRootBits, NumSubtrieBits, Hash};
				size_t Index = IndexGen.next();
				for (;;) {
				// Look up the slot for the current index.
				TrieNode *Existing = S->get(Index);
				if (!Existing)
				return PointerBase(S, Index, *IndexGen.StartBit);

				// Check for an exact match.
				if (auto *ExistingContent = dyn_cast<TrieContent>(Existing))
				return ExistingContent->getHash() == Hash
				? PointerBase(ExistingContent->getValuePointer())
				: PointerBase(S, Index, *IndexGen.StartBit);

				Index = IndexGen.next();
				S = cast<TrieSubtrie>(Existing);
				}
				}

				std::pair<ThreadSafeTrieRawHashMapBase::PointerBase, bool>
				ThreadSafeTrieRawHashMapBase::insert(
				PointerBase Hint, ArrayRef<uint8_t> Hash,
				function_ref<const uint8_t *(void *Mem, ArrayRef<uint8_t> Hash)>
				Constructor) {
				assert(!Hash.empty() && "Uninitialized hash");

				ImplType &Impl = getOrCreateImpl();
				TrieSubtrie *S = &Impl.Root;
				IndexGenerator IndexGen{NumRootBits, NumSubtrieBits, Hash};
				size_t Index;
				if (Hint.isHint()) {
				S = static_cast<TrieSubtrie *>(Hint.P);
				Index = IndexGen.hint(Hint.I, Hint.B);
				} else {
				Index = IndexGen.next();
				}

				for (;;) {
				// Load the node from the slot, allocating and calling the constructor if
				// the slot is empty.
				bool Generated = false;
				TrieNode &Existing = S->Slots[Index].loadOrGenerate([&]() {
				Generated = true;

				// Construct the value itself at the tail.
				uint8_t *Memory = reinterpret_cast<uint8_t *>(
				Impl.ContentAlloc.Allocate(ContentAllocSize, ContentAllocAlign));
				const uint8_t *HashStorage = Constructor(Memory + ContentOffset, Hash);

				// Construct the TrieContent header, passing in the offset to the hash.
				TrieContent *Content = ::new (Memory)
				TrieContent(ContentOffset, Hash.size(), HashStorage - Memory);
				assert(Hash == Content->getHash() && "Hash not properly initialized");
				return Content;
				});
				// If we just generated it, return it!
				if (Generated)
				return {PointerBase(cast<TrieContent>(Existing).getValuePointer()), true};

				if (isa<TrieSubtrie>(Existing)) {
				S = &cast<TrieSubtrie>(Existing);
				Index = IndexGen.next();
				continue;
				}

				// Return the existing content if it's an exact match!
				auto &ExistingContent = cast<TrieContent>(Existing);
				if (ExistingContent.getHash() == Hash)
				return {PointerBase(ExistingContent.getValuePointer()), false};

				// Sink the existing content as long as the indexes match.
				for (;;) {
				size_t NextIndex = IndexGen.next();
				size_t NewIndexForExistingContent =
				IndexGen.getCollidingBits(ExistingContent.getHash());
				S = S->sink(Index, ExistingContent, IndexGen.getNumBits(),
				NewIndexForExistingContent,
				[&Impl](std::unique_ptr<TrieSubtrie> S) {
				return Impl.save(std::move(S));
				});
				Index = NextIndex;

				// Found the difference.
				if (NextIndex != NewIndexForExistingContent)
				break;
				}
				}
				}

				static void printHexDigit(raw_ostream &OS, uint8_t Digit) {
				if (Digit < 10)
				OS << char(Digit + '0');
				else
				OS << char(Digit - 10 + 'a');
				}

				static void printPrefix(raw_ostream &OS, StringRef Prefix) {
				while (Prefix.size() >= 4) {
				uint8_t Digit;
				bool ErrorParsingBinary = Prefix.take_front(4).getAsInteger(2, Digit);
				assert(!ErrorParsingBinary);
				(void)ErrorParsingBinary;
				printHexDigit(OS, Digit);
				Prefix = Prefix.drop_front(4);
				}
				if (!Prefix.empty())
				OS << "[" << Prefix << "]";
				}

				ThreadSafeTrieRawHashMapBase::ThreadSafeTrieRawHashMapBase(
				size_t ContentAllocSize, size_t ContentAllocAlign, size_t ContentOffset,
				std::optional<size_t> NumRootBits, std::optional<size_t> NumSubtrieBits)
				: ContentAllocSize(ContentAllocSize), ContentAllocAlign(ContentAllocAlign),
				ContentOffset(ContentOffset),
				NumRootBits(NumRootBits ? *NumRootBits : DefaultNumRootBits),
				NumSubtrieBits(NumSubtrieBits ? *NumSubtrieBits : DefaultNumSubtrieBits),
				ImplPtr(nullptr) {
				assert((!NumRootBits || *NumRootBits < 20) &&
				"Root should have fewer than ~1M slots");
				assert((!NumSubtrieBits || *NumSubtrieBits < 10) &&
				"Subtries should have fewer than ~1K slots");
				}

				ThreadSafeTrieRawHashMapBase::ThreadSafeTrieRawHashMapBase(
				ThreadSafeTrieRawHashMapBase &&RHS)
				: ContentAllocSize(RHS.ContentAllocSize),
				ContentAllocAlign(RHS.ContentAllocAlign),
				ContentOffset(RHS.ContentOffset), NumRootBits(RHS.NumRootBits),
				NumSubtrieBits(RHS.NumSubtrieBits) {
				// Steal the root from RHS.
				ImplPtr = RHS.ImplPtr.exchange(nullptr);
				}

				ThreadSafeTrieRawHashMapBase::~ThreadSafeTrieRawHashMapBase() {
				assert(!ImplPtr.load() && "Expected subclass to call destroyImpl()");
				}

				void ThreadSafeTrieRawHashMapBase::destroyImpl(
				function_ref<void(void *)> Destructor) {
				std::unique_ptr<ImplType> Impl(ImplPtr.exchange(nullptr));
				if (!Impl)
				return;

				// Destroy content nodes throughout trie. Avoid destroying any subtries since
				// we need TrieNode::classof() to find the content nodes.
				//
				// FIXME: Once we have bitsets (see FIXME in TrieSubtrie class), use them
				// to facilitate sparse iteration here.
				if (Destructor)
				for (TrieSubtrie *Trie = &Impl->Root; Trie; Trie = Trie->Next.load())
				for (auto &Slot : Trie->Slots)
				if (auto *Content = dyn_cast_or_null<TrieContent>(Slot.load()))
				Destructor(Content->getValuePointer());

				// Destroy the subtries. Incidentally, this destroys them in the reverse order
				// of saving.
				TrieSubtrie *Trie = Impl->Root.Next;
				while (Trie) {
				TrieSubtrie *Next = Trie->Next.exchange(nullptr);
				delete Trie;
				Trie = Next;
				}
				}

				ThreadSafeTrieRawHashMapBase::PointerBase
				ThreadSafeTrieRawHashMapBase::getRoot() const {
				ImplType *Impl = ImplPtr.load();
				if (!Impl)
				return PointerBase();
				return PointerBase(&Impl->Root);
				}

				unsigned ThreadSafeTrieRawHashMapBase::getStartBit(
				ThreadSafeTrieRawHashMapBase::PointerBase P) const {
				assert(!P.isHint() && "Not a valid trie");
				if (!P.P)
				return 0;
				if (auto *S = dyn_cast<TrieSubtrie>((TrieNode *)P.P))
				return S->StartBit;
				return 0;
				}

				unsigned ThreadSafeTrieRawHashMapBase::getNumBits(
				ThreadSafeTrieRawHashMapBase::PointerBase P) const {
				assert(!P.isHint() && "Not a valid trie");
				if (!P.P)
				return 0;
				if (auto *S = dyn_cast<TrieSubtrie>((TrieNode *)P.P))
				return S->NumBits;
				return 0;
				}

				unsigned ThreadSafeTrieRawHashMapBase::getNumSlotUsed(
				ThreadSafeTrieRawHashMapBase::PointerBase P) const {
				assert(!P.isHint() && "Not a valid trie");
				if (!P.P)
				return 0;
				auto *S = dyn_cast<TrieSubtrie>((TrieNode *)P.P);
				if (!S)
				return 0;
				unsigned Num = 0;
				for (unsigned I = 0, E = S->Slots.size(); I < E; ++I)
				if (auto *E = S->Slots[I].load())
				++Num;
				return Num;
				}

				std::string ThreadSafeTrieRawHashMapBase::getTriePrefixAsString(
				ThreadSafeTrieRawHashMapBase::PointerBase P) const {
				assert(!P.isHint() && "Not a valid trie");
				if (!P.P)
				return std::string();

				auto *S = dyn_cast<TrieSubtrie>((TrieNode *)P.P);
				if (!S || !S->IsSubtrie)
				return std::string();

				// Find a TrieContent node that stores a hash. Search depth-first, following
				// the first used slot until a TrieContent node is found.
				TrieSubtrie *Current = S;
				TrieContent *Node = nullptr;
				while (Current) {
				TrieSubtrie *Next = nullptr;
				// find first used slot in the trie.
				for (unsigned I = 0, E = Current->Slots.size(); I < E; ++I) {
				auto *S = Current->get(I);
				if (!S)
				continue;

				if (auto *Content = dyn_cast<TrieContent>(S))
				Node = Content;
				else if (auto *Sub = dyn_cast<TrieSubtrie>(S))
				Next = Sub;
				break;
				}

				// Found the node.
				if (Node)
				break;

				// Continue to the next level if the node is not found.
				Current = Next;
				}

				assert(Node && "malformed trie, cannot find TrieContent on leaf node");
				// The prefix for the current trie is the first `StartBit` of the content
				// stored underneath this subtrie.
				std::string Bits;
				for (unsigned I = 0, E = S->StartBit; I < E; ++I) {
				unsigned Index = I / 8;
				unsigned Offset = 7 - I % 8;
				Bits.push_back('0' + ((Node->getHash()[Index] >> Offset) & 1));
				}

				std::string Str;
				raw_string_ostream SS(Str);
				printPrefix(SS, Bits);
				return SS.str();
				}

				unsigned ThreadSafeTrieRawHashMapBase::getNumTries() const {
				ImplType *Impl = ImplPtr.load();
				if (!Impl)
				return 0;
				unsigned Num = 0;
				for (TrieSubtrie *Trie = &Impl->Root; Trie; Trie = Trie->Next.load())
				++Num;
				return Num;
				}

				ThreadSafeTrieRawHashMapBase::PointerBase
				ThreadSafeTrieRawHashMapBase::getNextTrie(
				ThreadSafeTrieRawHashMapBase::PointerBase P) const {
				assert(!P.isHint() && "Not a valid trie");
				if (!P.P)
				return PointerBase();
				auto *S = dyn_cast<TrieSubtrie>((TrieNode *)P.P);
				if (!S)
				return PointerBase();
				if (auto *E = S->Next.load())
				return PointerBase(E);
				return PointerBase();
				}
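
The ownership list maintained by `ImplType::save` is a lock-free push-front onto a singly linked list. A minimal standalone sketch of that retry loop follows. `ListNode`, `pushFront`, and `listLength` are hypothetical names, and `store` is used where the patch calls `exchange`. The accompanying check is single-threaded, so it shows the list shape but not behaviour under contention.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical node type standing in for TrieSubtrie's Next link.
struct ListNode {
  int Value;
  std::atomic<ListNode *> Next{nullptr};
  explicit ListNode(int V) : Value(V) {}
};

// Push N onto the front of the list rooted at Head without locking.
// Candidate starts as nullptr. Each failed compare-exchange refreshes it with
// the head that was actually observed, which is then recorded in N->Next
// before retrying.
inline void pushFront(std::atomic<ListNode *> &Head, ListNode *N) {
  assert(N && !N->Next.load() && "expected a freshly constructed node");
  ListNode *Candidate = nullptr;
  while (!Head.compare_exchange_weak(Candidate, N))
    N->Next.store(Candidate);
}

// Count the nodes reachable from Head.
inline std::size_t listLength(const std::atomic<ListNode *> &Head) {
  std::size_t Num = 0;
  for (ListNode *N = Head.load(); N; N = N->Next.load())
    ++Num;
  return Num;
}
```

Pushing nodes 1, 2, 3 in that order yields the list 3 -> 2 -> 1. Because the most recently saved node is always at the head, teardown visits nodes in reverse order of saving, matching the comment in `destroyImpl`.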

llvm/unittests/ADT/CMakeLists.txt

add_llvm_unittest(ADTTests
SparseSetTest.cpp		SparseSetTest.cpp
StatisticTest.cpp		StatisticTest.cpp
StringExtrasTest.cpp		StringExtrasTest.cpp
StringMapTest.cpp		StringMapTest.cpp
StringRefTest.cpp		StringRefTest.cpp
StringSetTest.cpp		StringSetTest.cpp
StringSwitchTest.cpp		StringSwitchTest.cpp
TinyPtrVectorTest.cpp		TinyPtrVectorTest.cpp
		TrieRawHashMapTest.cpp
TwineTest.cpp		TwineTest.cpp
TypeSwitchTest.cpp		TypeSwitchTest.cpp
TypeTraitsTest.cpp		TypeTraitsTest.cpp
)		)

target_link_libraries(ADTTests PRIVATE LLVMTestingSupport)		target_link_libraries(ADTTests PRIVATE LLVMTestingSupport)

add_dependencies(ADTTests intrinsics_gen)		add_dependencies(ADTTests intrinsics_gen)

llvm/unittests/ADT/TrieRawHashMapTest.cpp

This file was added.

				//===- TrieRawHashMapTest.cpp ---------------------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/TrieRawHashMap.h"
				#include "llvm/ADT/Twine.h"
				#include "llvm/Support/Endian.h"
				#include "llvm/Support/SHA1.h"
				#include "gtest/gtest.h"

				using namespace llvm;

				namespace llvm {
				class TrieRawHashMapTestHelper {
				public:
				TrieRawHashMapTestHelper() = default;

				void setTrie(ThreadSafeTrieRawHashMapBase *T) { Trie = T; }

				ThreadSafeTrieRawHashMapBase::PointerBase getRoot() const {
				return Trie->getRoot();
				}
				unsigned getStartBit(ThreadSafeTrieRawHashMapBase::PointerBase P) const {
				return Trie->getStartBit(P);
				}
				unsigned getNumBits(ThreadSafeTrieRawHashMapBase::PointerBase P) const {
				return Trie->getNumBits(P);
				}
				unsigned getNumSlotUsed(ThreadSafeTrieRawHashMapBase::PointerBase P) const {
				return Trie->getNumSlotUsed(P);
				}
				unsigned getNumTries() const { return Trie->getNumTries(); }
				std::string
				getTriePrefixAsString(ThreadSafeTrieRawHashMapBase::PointerBase P) const {
				return Trie->getTriePrefixAsString(P);
				}
				ThreadSafeTrieRawHashMapBase::PointerBase
				getNextTrie(ThreadSafeTrieRawHashMapBase::PointerBase P) const {
				return Trie->getNextTrie(P);
				}

				private:
				ThreadSafeTrieRawHashMapBase *Trie = nullptr;
				};
				} // namespace llvm

				namespace {
				template <typename DataType, size_t HashSize>
				class SimpleTrieHashMapTest : public TrieRawHashMapTestHelper,
				public ::testing::Test {
				public:
				using NumType = DataType;
				using HashType = std::array<uint8_t, HashSize>;
				using TrieType = ThreadSafeTrieRawHashMap<DataType, sizeof(HashType)>;

				TrieType &createTrie(size_t RootBits, size_t SubtrieBits) {
				auto &Ret = Trie.emplace(RootBits, SubtrieBits);
				TrieRawHashMapTestHelper::setTrie(&Ret);
				return Ret;
				}

				void destroyTrie() {
				Trie.reset();
				}

				~SimpleTrieHashMapTest() {
				if (Trie)
				Trie.reset();
				}

				// Use the number itself as hash to test the pathological case.
				static HashType hash(uint64_t Num) {
				uint64_t HashN = llvm::support::endian::byte_swap(Num, llvm::support::big);
				HashType Hash;
				memcpy(&Hash[0], &HashN, sizeof(HashType));
				return Hash;
				};

				private:
				std::optional<TrieType> Trie;
				};

				using SmallNodeTrieTest = SimpleTrieHashMapTest<uint64_t, sizeof(uint64_t)>;

				TEST_F(SmallNodeTrieTest, TrieAllocation) {
				NumType Numbers[] = {
				0x0, std::numeric_limits<NumType>::max(), 0x1, 0x2,
				0x3, std::numeric_limits<NumType>::max() - 1u,
				};

				unsigned ExpectedTries[] = {
				1, // Allocate Root.
				1, // Both on the root.
				64, // 0 and 1 sink all the way down.
				64, // no new allocation needed.
				65, // need a new node between 2 and 3.
				65 + 63, // 63 new allocations to sink two big numbers all the way.
				};

				const char *ExpectedPrefix[] = {
				"", // Root.
				"", // Root.
				"000000000000000[000]",
				"000000000000000[000]",
				"000000000000000[001]",
				"fffffffffffffff[111]",
				};

				// Use root and subtrie sizes of 1 so this gets sunk quite deep.
				auto &Trie = createTrie(/*RootBits=*/1, /*SubtrieBits=*/1);

				for (unsigned I = 0; I < 6; ++I) {
				// Lookup first to exercise hint code for deep tries.
				TrieType::pointer Lookup = Trie.find(hash(Numbers[I]));
				EXPECT_FALSE(Lookup);

				Trie.insert(Lookup, TrieType::value_type(hash(Numbers[I]), Numbers[I]));
				EXPECT_EQ(getNumTries(), ExpectedTries[I]);
				EXPECT_EQ(getTriePrefixAsString(getNextTrie(getRoot())), ExpectedPrefix[I]);
				}
				}

				TEST_F(SmallNodeTrieTest, TrieStructure) {
				NumType Numbers[] = {
				// Three numbers that will nest deeply to test (1) sinking subtries and
				// (2) deep, non-trivial hints.
				std::numeric_limits<NumType>::max(),
				std::numeric_limits<NumType>::max() - 2u,
				std::numeric_limits<NumType>::max() - 3u,
				// One number to stay at the top-level.
				0x37,
				};

				// Use root and subtrie sizes of 1 so this gets sunk quite deep.
				auto &Trie = createTrie(/*RootBits=*/1, /*SubtrieBits=*/1);

				for (NumType N : Numbers) {
				// Lookup first to exercise hint code for deep tries.
				TrieType::pointer Lookup = Trie.find(hash(N));
				EXPECT_FALSE(Lookup);

				Trie.insert(Lookup, TrieType::value_type(hash(N), N));
				}
				for (NumType N : Numbers) {
				TrieType::pointer Lookup = Trie.find(hash(N));
				EXPECT_TRUE(Lookup);
				if (!Lookup)
				continue;
				EXPECT_EQ(hash(N), Lookup->Hash);
				EXPECT_EQ(N, Lookup->Data);

				// Confirm a subsequent insertion fails to overwrite by trying to insert a
				// bad value.
				auto Result = Trie.insert(Lookup, TrieType::value_type(hash(N), N - 1));
				EXPECT_FALSE(Result.second);
				EXPECT_EQ(N, Result.first->Data);
				}

				// Check the trie so we can confirm the structure is correct. Each subtrie
				// should have 2 slots. The root's index=0 should have the content for
				// 0x37 directly, and index=1 should be a linked-list of subtries, finally
				// ending with content for (max-2) and (max-3).
				//
				// Note: This structure is not exhaustive (too expensive to update tests),
				// but it does test that the dump format is somewhat readable and that the
				// basic structure is correct.
				//
				// Note: This test requires that the trie reads bytes starting from index 0
				// of the array of uint8_t, and then reads each byte's bits from high to low.

				// Check the Trie.
				// We should have allocated a total of 64 subtries for a 64-bit hash.
				ASSERT_EQ(getNumTries(), 64u);
				// Check the root trie. Two slots and both are used.
				ASSERT_EQ(getNumSlotUsed(getRoot()), 2u);
				// Check last subtrie.
				// Last allocated trie is the next node in the allocation chain.
				auto LastAlloctedSubTrie = getNextTrie(getRoot());
				ASSERT_EQ(getTriePrefixAsString(LastAlloctedSubTrie),
				"fffffffffffffff[110]");
				ASSERT_EQ(getStartBit(LastAlloctedSubTrie), 63u);
				ASSERT_EQ(getNumBits(LastAlloctedSubTrie), 1u);
				ASSERT_EQ(getNumSlotUsed(LastAlloctedSubTrie), 2u);
				}

				TEST_F(SmallNodeTrieTest, TrieStructureSmallFinalSubtrie) {
				NumType Numbers[] = {
				// Three numbers that will nest deeply to test (1) sinking subtries and
				// (2) deep, non-trivial hints.
				std::numeric_limits<NumType>::max(),
				std::numeric_limits<NumType>::max() - 2u,
				std::numeric_limits<NumType>::max() - 3u,
				// One number to stay at the top-level.
				0x37,
				};

				// Use subtrie size of 5 to avoid hitting 64 evenly, making the final subtrie
				// small.
				auto &Trie = createTrie(/*RootBits=*/8, /*SubtrieBits=*/5);

				for (NumType N : Numbers) {
				// Lookup first to exercise hint code for deep tries.
				TrieType::pointer Lookup = Trie.find(hash(N));
				EXPECT_FALSE(Lookup);

				Trie.insert(Lookup, TrieType::value_type(hash(N), N));
				}
				for (NumType N : Numbers) {
				TrieType::pointer Lookup = Trie.find(hash(N));
				EXPECT_TRUE(Lookup);
				if (!Lookup)
				continue;
				EXPECT_EQ(hash(N), Lookup->Hash);
				EXPECT_EQ(N, Lookup->Data);

				// Confirm a subsequent insertion fails to overwrite by trying to insert a
				// bad value.
				auto Result = Trie.insert(Lookup, TrieType::value_type(hash(N), N - 1));
				EXPECT_FALSE(Result.second);
				EXPECT_EQ(N, Result.first->Data);
				}

				// Check the trie so we can confirm the structure is correct. The root
				// should have 2^8=256 slots, most subtries should have 2^5=32 slots, and the
				// deepest subtrie should have 2^1=2 slots (since (64-8)mod(5)=1).
				// The root's index=0 should have the content for 0x37 directly, and
				// index=255 should be a linked-list of subtries, finally ending with
				// content for (max-2) and (max-3).
				//
				// Note: This structure is not exhaustive (too expensive to update tests),
				// but it does test that the dump format is somewhat readable and that the
				// basic structure is correct.
				//
				// Note: This test requires that the trie reads bytes starting from index 0
				// of the array of uint8_t, and then reads each byte's bits from high to low.

				// Check the Trie.
				// 64-bit hash = 8 + 5 * 11 + 1, so 1 root, 11 5-bit subtries and 1 last-level
				// subtrie, 13 total.
				ASSERT_EQ(getNumTries(), 13u);
				// Check the root trie. It has 256 slots, of which two are used.
				ASSERT_EQ(getNumSlotUsed(getRoot()), 2u);
				// Check last subtrie.
				// Last allocated trie is the next node in the allocation chain.
				auto LastAlloctedSubTrie = getNextTrie(getRoot());
				ASSERT_EQ(getTriePrefixAsString(LastAlloctedSubTrie),
				"fffffffffffffff[110]");
				ASSERT_EQ(getStartBit(LastAlloctedSubTrie), 63u);
				ASSERT_EQ(getNumBits(LastAlloctedSubTrie), 1u);
				ASSERT_EQ(getNumSlotUsed(LastAlloctedSubTrie), 2u);
				}

				TEST_F(SmallNodeTrieTest, TrieDestructionLoop) {
				// Test destroying large Trie. Make sure there is no recursion that can
				// overflow the stack.

				// Limit the tries to 2 slots (1 bit) to generate subtries at a higher rate.
				auto &Trie = createTrie(/*NumRootBits=*/1, /*NumSubtrieBits=*/1);

				// Fill them up. Pick a MaxN high enough to cause a stack overflow in debug
				// builds.
				static constexpr uint64_t MaxN = 100000;
				for (uint64_t N = 0; N != MaxN; ++N) {
				HashType Hash = hash(N);
				Trie.insert(TrieType::pointer(), TrieType::value_type(Hash, NumType{N}));
				}

				// Destroy the trie. If destruction is recursive and MaxN is high enough,
				// this will fail.
				destroyTrie();
				}

				struct NumWithDestructorT {
				uint64_t Num;
				~NumWithDestructorT() {}
				};

				using NodeWithDestructorTrieTest =
				SimpleTrieHashMapTest<NumWithDestructorT, sizeof(uint64_t)>;

				TEST_F(NodeWithDestructorTrieTest, TrieDestructionLoop) {
				// Test destroying large Trie. Make sure there is no recursion that can
				// overflow the stack.

				// Limit the tries to 2 slots (1 bit) to generate subtries at a higher rate.
				auto &Trie = createTrie(/*NumRootBits=*/1, /*NumSubtrieBits=*/1);

				// Fill them up. Pick a MaxN high enough to cause a stack overflow in debug
				// builds.
				static constexpr uint64_t MaxN = 100000;
				dblaikie (Unsubmitted, Not Done): This seems like a fair bit of text to wrap the trie - what's the value/extra test coverage this is providing? (Sorry, I'm not following too clearly.) I'd have hoped/expected this API could be tested more directly, without needing this wrapper and/or without needing to involve something as non-trivial as std::string (instead using simple user-defined types in the test)?
				steven_wu (Author, Unsubmitted, Done): This is a simplified version of the test from the original patch. The original patch implemented a generic set data structure in this test file and tested how the trie can be used like a set. Now it is simplified to a string set with a fake hash algorithm, because there is currently no value in putting TrieSet in ADT.
				dblaikie (Unsubmitted, Not Done): What extra test coverage does it provide? How bad/what would be the tradeoff of testing that functionality more directly without the mock?
				steven_wu (Author, Unsubmitted, Done): The extra test coverage it provides is the ability to store non-POD data in the trie, where `insertLazy` is tested for lazy construction. I guess we could also test it with the naive uint64_t. The alternative might be to just put `ThreadSafeHashMappedTrieSet` from the original patch into a header in ADT, but there isn't a use case for that yet. So maybe testing insertLazy with uint64_t will make this cleaner.
				for (uint64_t N = 0; N != MaxN; ++N) {
				HashType Hash = hash(N);
				Trie.insert(TrieType::pointer(), TrieType::value_type(Hash, NumType{N}));
				}

				// Destroy the trie. If destruction is recursive and MaxN is high enough,
				// this will fail.
				destroyTrie();
				}

				using NumStrNodeTrieTest = SimpleTrieHashMapTest<std::string, sizeof(uint64_t)>;

				TEST_F(NumStrNodeTrieTest, TrieInsertLazy) {
				for (unsigned RootBits : {2, 3, 6, 10}) {
				for (unsigned SubtrieBits : {2, 3, 4}) {
				auto &Trie = createTrie(RootBits, SubtrieBits);
				for (int I = 0, E = 1000; I != E; ++I) {
				TrieType::pointer Lookup;
				HashType H = hash(I);
				if (I & 1)
				Lookup = Trie.find(H);

				auto insertNum = [&](uint64_t Num) {
				std::string S = Twine(I).str();
				auto Hash = hash(Num);
				return Trie.insertLazy(Hash, [&](TrieType::LazyValueConstructor C) {
				C(std::move(S));
				});
				};
				auto S1 = insertNum(I);
				// The address of the Data should be the same.
				EXPECT_EQ(&S1.first->Data, &insertNum(I).first->Data);

				auto insertStr = [&](std::string S) {
				int Num = std::stoi(S);
				return insertNum(Num);
				};
				std::string S2 = S1.first->Data;
				// The address of the Data should be the same.
				EXPECT_EQ(&S1.first->Data, &insertStr(S2).first->Data);
				}
				for (int I = 0, E = 1000; I != E; ++I) {
				std::string S = Twine(I).str();
				TrieType::pointer Lookup = Trie.find(hash(I));
				EXPECT_TRUE(Lookup);
				if (!Lookup)
				continue;
				EXPECT_EQ(S, Lookup->Data);
				}
				}
				}
				}
				} // end anonymous namespace
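
The subtrie counts asserted in the tests above follow from two facts stated in the test comments: the hash is the number stored big-endian, and the trie consumes bits from byte 0, high bit first. The sketch below reproduces that arithmetic without any LLVM dependency. The helper names (`bigEndianHash`, `sharedPrefixBits`, `maxTrieLevels`) are hypothetical and exist only for illustration; the level formula is an assumption inferred from the `64 = 8 + 5 * 11 + 1` comment, not taken from the implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Hash64 = std::array<uint8_t, 8>;

// Stand-in for the test's hash(): store Num big-endian, so byte 0 holds the
// most significant bits.
inline Hash64 bigEndianHash(uint64_t Num) {
  Hash64 H{};
  for (std::size_t I = 0; I < H.size(); ++I)
    H[I] = static_cast<uint8_t>(Num >> (8 * (H.size() - 1 - I)));
  return H;
}

// Number of leading bits two hashes share, reading bytes in index order and
// each byte's bits from high to low.
inline unsigned sharedPrefixBits(const Hash64 &A, const Hash64 &B) {
  for (unsigned I = 0; I < 64; ++I) {
    unsigned BitA = (A[I / 8] >> (7 - I % 8)) & 1u;
    unsigned BitB = (B[I / 8] >> (7 - I % 8)) & 1u;
    if (BitA != BitB)
      return I;
  }
  return 64;
}

// Assumed model: levels on the path to a fully sunk leaf are the root plus
// enough subtries to cover the remaining bits (the last one may be narrower).
inline unsigned maxTrieLevels(unsigned HashBits, unsigned RootBits,
                              unsigned SubtrieBits) {
  assert(RootBits > 0 && SubtrieBits > 0 && RootBits <= HashBits);
  unsigned Remaining = HashBits - RootBits;
  return 1 + (Remaining + SubtrieBits - 1) / SubtrieBits;
}
```

With this model, `sharedPrefixBits(bigEndianHash(0), bigEndianHash(1))` is 63, so with 1-bit levels the two values can only be separated in the subtrie that starts at bit 63. Likewise `maxTrieLevels(64, 1, 1)` gives 64 and `maxTrieLevels(64, 8, 5)` gives 13, which agree with the `getNumTries()` values asserted in `TrieStructure` and `TrieStructureSmallFinalSubtrie`.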