This is an archive of the discontinued LLVM Phabricator instance.

PDB HashTable: Move TraitsT from class parameter to the methods that need it
ClosedPublic

Authored by thakis on Jul 12 2019, 8:21 AM.

Details

Summary

The traits object is only used by a few methods. Deserializing a hash
table and walking it is possible without the traits object, so it
shouldn't be required to build a dummy object for that use case.

The TraitsT object used to be a function template parameter before
r327647, this restores it to that state.

This makes it clear that the traits object isn't needed at all in 1 of
the current 3 uses of HashTable (and I am going to add another use that
doesn't need it), and that the default PdbHashTraits isn't used outside
of tests.

While here, also re-enable 3 checks in the test that were commented out
(which requires making HashTableInternals templated and giving FooBar
an operator==).

No intended behavior change.

Diff Detail

Event Timeline

thakis created this revision.Jul 12 2019, 8:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2019, 8:21 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
rnk added a comment.Jul 12 2019, 11:45 AM

The traits object is only used by a few methods. Deserializing a hash
table and walking it is possible without the traits object, so it
shouldn't be required to build a dummy object for that use case.

The traits are needed to do any actual hashing. I think I have a bit of a preference for the code as it is before your change. Now all the hash lookup operations have to take an extra parameter, and the caller is responsible for managing the lifetime of a new object that probably should exactly match the lifetime of the hash table.

What do you think of having a traits-less HashTableView or HashTableBase and having HashTable inherit from it? Would that solve your use case?

In D64640#1583265, @rnk wrote:

The traits object is only used by a few methods. Deserializing a hash
table and walking it is possible without the traits object, so it
shouldn't be required to build a dummy object for that use case.

The traits are needed to do any actual hashing.

I believe that's not really true: The has key is always an uint32_t, and assuming that lookupKeyToStorageKey() and storageKeyToLookupKey() are inverses of each other (which currently is true for StringTableHashTraits; for NamedStreamMapTraits I'm not yet sure what the semantics for two streams with the same name are. I believe it's not allowed – passing two identical /natvis: params to lld seems to confuse everyone at least -- and then it'd be true there too). So I currently think (but could be wrong, I'm not sure yet) that the hashtable hash function is overly complicated and should always work just on the uint32_t key and there should be a convenience function to convert a string key to an uint32_t. But that's for another change.

I think I have a bit of a preference for the code as it is before your change. Now all the hash lookup operations have to take an extra parameter, and the caller is responsible for managing the lifetime of a new object that probably should exactly match the lifetime of the hash table.

The callers did manage the traits object before as well. Maybe that wasn't necessary, but see PDBFileBuilder.h and llvm/include/llvm/DebugInfo/PDB/Native/NamedStreamMap.h on the lhs – the traits objects were already member variables. There are 3 distinct call sites to the lookup operations from non-test code.

What do you think of having a traits-less HashTableView or HashTableBase and having HashTable inherit from it? Would that solve your use case?

https://reviews.llvm.org/D64428?id=208735 used to do this. It was imo much messier (see the "drat" comments for some unresolved issues with that approach, but even with them resolved it's pretty messy). I then left the hash table as is and added another traits class that was silently unused in https://reviews.llvm.org/D64428?id=209475 but didn't like that either (see "xxx blah" for an annoying issue there – a function that's silently unused can't be implemented nicely). I think this change is by far the nicest.

rnk accepted this revision.Jul 12 2019, 4:03 PM
In D64640#1583265, @rnk wrote:

The traits object is only used by a few methods. Deserializing a hash
table and walking it is possible without the traits object, so it
shouldn't be required to build a dummy object for that use case.

The traits are needed to do any actual hashing.

I believe that's not really true: The has key is always an uint32_t, and assuming that lookupKeyToStorageKey() and storageKeyToLookupKey() are inverses of each other (which currently is true for StringTableHashTraits; for NamedStreamMapTraits I'm not yet sure what the semantics for two streams with the same name are. I believe it's not allowed – passing two identical /natvis: params to lld seems to confuse everyone at least -- and then it'd be true there too). So I currently think (but could be wrong, I'm not sure yet) that the hashtable hash function is overly complicated and should always work just on the uint32_t key and there should be a convenience function to convert a string key to an uint32_t. But that's for another change.

Hey, I like the sound of that: who needs traits templates, we can just use functions. Insofar as this is a step in that direction, I'm on board.

I think I have a bit of a preference for the code as it is before your change. Now all the hash lookup operations have to take an extra parameter, and the caller is responsible for managing the lifetime of a new object that probably should exactly match the lifetime of the hash table.

The callers did manage the traits object before as well. Maybe that wasn't necessary, but see PDBFileBuilder.h and llvm/include/llvm/DebugInfo/PDB/Native/NamedStreamMap.h on the lhs – the traits objects were already member variables. There are 3 distinct call sites to the lookup operations from non-test code.

What do you think of having a traits-less HashTableView or HashTableBase and having HashTable inherit from it? Would that solve your use case?

https://reviews.llvm.org/D64428?id=208735 used to do this. It was imo much messier (see the "drat" comments for some unresolved issues with that approach, but even with them resolved it's pretty messy). I then left the hash table as is and added another traits class that was silently unused in https://reviews.llvm.org/D64428?id=209475 but didn't like that either (see "xxx blah" for an annoying issue there – a function that's silently unused can't be implemented nicely). I think this change is by far the nicest.

I took a look at the linked stuff, but mostly I'm convinced that you've thought about these other approaches and I'll trust that you've made the right judgement here, looks good to me.

This revision is now accepted and ready to land.Jul 12 2019, 4:03 PM
This revision was automatically updated to reflect the committed changes.