SmallSetVector used to use find and insert from from the DenseSet (using DenseMapInfo) to determine equality. Some recent changes (https://reviews.llvm.org/D152497) added a call to std::find, resulting in the use of operator==, which changes semantic behavior of the set. This change changes it back to use DenseMapInfo<value_type>::isEqual to determine equality without requiring operator== be implemented on the type.
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/include/llvm/ADT/SetVector.h | ||
---|---|---|
271–272 | We can probably reduce this entire function down to a call to contains since that returns a boolean and this only returns 0 or 1. Thoughts? |
Adding onto what @efriedma said, if you do want to use hash-value equality I would suggest having value equality as well, chained using &&. Given that the "small" functionality is meant to prevent us from having to compute a hash value in the first place, I would also suggest you to measure this change on the compile time tracker, although I don't expect it to make the performance better especially because this is quite a performance-sensitive part of code...
llvm/include/llvm/ADT/SetVector.h | ||
---|---|---|
271–272 | That makes sense to me, the reason I didn't do that was to keep the diff as insertions-only and the name count also confused me. |
Sorry, should have used the the in the DenseMapInfo<T>::isEqual.
So for background, the issue that I'm running into right now is two-fold, this container used to accept types that didn't necessarily have equality (operator==) (except as defined by the DenseMapInfo), which made it very useful for storing unique non-equatable things like non-canonical types in a type system. This change means that that code doesn't compile anymore. Second, the DenseMapInfo<T>::isEqual and DenseMapInfo<T>::getHashValue from the DenseMapInfo used before aren't necessarily the same as T::operator==, which can result in different outcomes.
At least as far as I can tell, DenseMapInfo<T>::isEqual in find_if should result in the same result as before. Is there an issue with using going back to DenseMapInfo<T>::isEqual over T::operator==?
SetVector is templated over the set type, so using DenseMapInfo is not great.
My 2c here is that if you are implementing DenseMapInfo for a type, it needs to match operator== behavior. The only differences it should have from operator== is potentially different handling for empty / tombstone keys.
If you want to use a DenseMap with a different comparison criterion (or over a type that is not inherently comparable), then you should not implement DenseMapInfo for that type, but instead provide a separate DenseMapInfo implementation.
For example, see https://github.com/llvm/llvm-project/blob/c51c607dfc6388050ed1381a32089cd08a36ff9a/clang/include/clang/Sema/Weak.h#L37, which defines DenseMapInfoByAliasOnly for a type that is intentionally not comparable. You can see how it gets use (incidentally inside a SetVector) here: https://github.com/llvm/llvm-project/blob/c51c607dfc6388050ed1381a32089cd08a36ff9a/clang/include/clang/Sema/Sema.h#L1128 This sounds like it would be the proper thing to do in your case.
However, you can also just disable the optimization by passing N = 0 to SetVector (which is the default if you're not going through SmallSetVector). Using SetVector<T, SmallVector<T, N>, SmallDenseSet<T, N>> should give you exactly the same behavior as previously.
Updating to use the DenseMapInfo implementation of isEqual instead of the hash value to ensure we're using the same semantics that DenseMap/DenseSet use to determine equality.
Adding test that fails to build with current implementation that used to build with original SetVector insert and contains implementation.
We can probably reduce this entire function down to a call to contains since that returns a boolean and this only returns 0 or 1. Thoughts?