This is an archive of the discontinued LLVM Phabricator instance.

Implement P0919R3
ClosedPublic

Authored by ldionne on Sep 4 2020, 3:45 PM.

Details

Summary

Implement heterogenous lookup for unordered containers including the refinement from P1690R1

Diff Detail

Event Timeline

rarutyun created this revision.Sep 4 2020, 3:45 PM
rarutyun requested review of this revision.Sep 4 2020, 3:45 PM
rarutyun updated this revision to Diff 297199.Oct 9 2020, 5:16 AM
rarutyun retitled this revision from Take P0919R3 for implementation to Implement P0919R3.
rarutyun edited the summary of this revision. (Show Details)
rarutyun updated this revision to Diff 297208.Oct 9 2020, 5:35 AM
rarutyun added inline comments.Oct 9 2020, 6:20 AM
libcxx/include/unordered_map
1357

I may do it without requires for all the places with just using the common approach with enable_if since not all compilers support concepts/constraints even if _LIBCPP_STD_VER > 17

ldionne added a reviewer: Restricted Project.Oct 9 2020, 6:22 AM
ldionne set the repository for this revision to rG LLVM Github Monorepo.
ldionne requested changes to this revision.Oct 9 2020, 8:15 AM
ldionne added subscribers: zoecarver, curdeius.

Thanks for the patch! This is looking pretty good, with some minor comments.

Misc comment: I think we're missing the __cpp_lib_generic_unordered_lookup feature test macro?

Perhaps my mind is playing tricks to me, but I have the feeling someone else had been working on this before. @zoecarver @curdeius perhaps? In all cases, please do update this revision as we will take it. If someone else had been working on this, we could steal some tests from that other revision if relevant.

libcxx/include/unordered_map
473

Can you use this pattern instead?

template <typename _K2, typename = enable_if_t<__is_transparent<_Hash, _K2>::value && __is_transparent<_Pred, _K2>::value>>
size_t ...

It usually leads to easier-to-read code since the return type isn't hidden at the end of enable_if_t.

1357

Yes, please use enable_if instead of requires.

libcxx/test/std/containers/unord/unord.map/contains.pass.cpp
17 ↗(On Diff #297208)

Can you please create a separate contains.transparent.pass.cpp test? Same for other methods.

64 ↗(On Diff #297208)

You can use using XX = ... here to make the line wrapping less weird. (also in other tests)

libcxx/test/std/containers/unord/unord.map/count.pass.cpp
24 ↗(On Diff #297208)

This include can (and should) be removed.

libcxx/test/support/is_transparent.h
102

Could you make this maintain a pointer to an int stored externally instead of having a static data member? This will make it easier to port the test to constexpr in the future. I've had to fix a lot of those while working on constexpr allocators and vector :)

This revision now requires changes to proceed.Oct 9 2020, 8:15 AM

Perhaps my mind is playing tricks to me, but I have the feeling someone else had been working on this before. @zoecarver @curdeius perhaps? In all cases, please do update this revision as we will take it. If someone else had been working on this, we could steal some tests from that other revision if relevant.

Your mind is not playing tricks on you. I implemented this in D59886.

It looks like the main difference between the patches (aside from tests) is that mine relies on a static assertion to prevent "incorrect" use of the new function template overloads whereas yours uses enable_if. I think yours is correct because the standard says, "shall not participate in overload resolution".

I'm OK closing the other patch in favor of this one (feel free to look at it or take tests from it if you'd like).

Perhaps my mind is playing tricks to me, but I have the feeling someone else had been working on this before. @zoecarver @curdeius perhaps? In all cases, please do update this revision as we will take it. If someone else had been working on this, we could steal some tests from that other revision if relevant.

Your mind is not playing tricks on you. I implemented this in D59886.

It looks like the main difference between the patches (aside from tests) is that mine relies on a static assertion to prevent "incorrect" use of the new function template overloads whereas yours uses enable_if. I think yours is correct because the standard says, "shall not participate in overload resolution".

I'm OK closing the other patch in favor of this one (feel free to look at it or take tests from it if you'd like).

I'm sorry for the duplicate work :-(. It looks like the other patch stopped getting attention at some point.

@rarutyun Please look at the tests in D59886 and take anything that increases the coverage.

Hi @zoecarver. I am also sorry about duplicating the work. I looked at the cxx2a_status.html and saw that it is not taken. I created the review with just having the update of cxx2a_status.html with marking that feature as In progress. Unfortunately, I didn't have information who should be invited to the review. I knew only Marshall and added him. Seems nobody was informed I took this part of work. As you can see the review was created on September the 5th. After that it was very hard to find the time to finish the feature implementation. Finally I found the bandwidth and implemented all of this stuff for a couple of days.

Thanks to agree to close your patch. I hope the process will become clearer to me and I hope we will avoid such situations in future. This is my first patch to libc++ and I definitely want to deliver it.

Yes, you are right the correct behavior is just do not allow the methods to participate in overload resolution. static_assert doesn't provide the desired behavior. Instead, user may see the compile-time error in places where it (error) should not happen.

@ldionne, sure I will look at D59886 and try to figure out what tests I can take from there. And regarding __cpp_lib_generic_unordered_lookup feature macro question. At least, I didn't add it. If it should be added let me figure out how to do that.

libcxx/include/unordered_map
473

Personally I don't like the approach with dummy argument since it adds possibility to use interface in the wrong way. Furthermore, It doesn't work in all the cases. I definitely can do that for those internal interfaces and I don't see the problems with that. But if you allow me I would prefer to have the enable_if_t as the return type for public interfaces (like find, contains, etc.) to do not have such a dummy parameter on the user level. If prevents incorrect usage and makes code completion hints in IDEs clearer.

libcxx/test/std/containers/unord/unord.map/contains.pass.cpp
17 ↗(On Diff #297208)

Sure.

64 ↗(On Diff #297208)

Yes, I also had that in mind. That would make things easier. I decided to finish the patch before this improvement. But I agree. Thanks for the suggestion. I am also considering to add common template alias with the template-template parameter as the first place to just pass unordered map or set. I want to look how if would be in the code and decide if such a common alias is acceptable or not

libcxx/test/std/containers/unord/unord.map/count.pass.cpp
24 ↗(On Diff #297208)

Right. Thanks. I noticed this after submitting a patch.

libcxx/test/support/is_transparent.h
102

I think I understand the idea. Sure, I can make that.

I am also sorry about duplicating the work. I looked at the cxx2a_status.html and saw that it is not taken.

No worries. I probably should have updated the status.

I hope the process will become clearer to me and I hope we will avoid such situations in future.

I'm not sure we have a very good process right now. Especially for those without commit access. I know not everyone agrees with this idea but, I am still holding out hope for a unified platform for issues, status, and patches. It's at least a year out but maybe one day we'll be able to keep track of what papers need to be implemented on GitHub then people can easily assign those issues to themselves and link them to PRs.

Also, if you search for the paper number on phab you can see if anyone's already got a patch up.

And regarding __cpp_lib_generic_unordered_lookup feature macro question. At least, I didn't add it. If it should be added let me figure out how to do that.

Just add a #define __cpp_lib_generic_unordered_lookup 202010L in <version>.

zoecarver added inline comments.Oct 9 2020, 2:38 PM
libcxx/include/unordered_map
473

Where possible, I suggest using _EnableIf. We can optimize that internally to improve compile times.

And regarding __cpp_lib_generic_unordered_lookup feature macro question. At least, I didn't add it. If it should be added let me figure out how to do that.

Just add a #define __cpp_lib_generic_unordered_lookup 202010L in <version>.

Oops, that's incorrect! Instead, look for __cpp_lib_generic_unordered_lookup in libcxx/utils/generate_feature_test_macro_components.py. We already have that feature test macro in the script, but it's disabled. Enable it and run the script, you'll see the magic happen!

Also, please rebase on top of master -- I updated the generate_feature_test_macro_components.py with one fix that'll make your life easier.

libcxx/include/unordered_map
473

Ok to leave as-is. As Zoe suggests, you can use _EnableIf instead.

libcxx/test/std/containers/unord/unord.map/contains.pass.cpp
64 ↗(On Diff #297208)

It's acceptable if it makes the code easier to follow!

libcxx/test/support/is_transparent.h
102

Basically, it would look like this:

struct Comp2Int { // convertible from int
  Comp2Int(int* conversions_count) : i_(0), conversions_count_(conversions_count) {}
  Comp2Int(int i, int* conversions_count) : i_(i), conversions_count_(conversions_count) { ++*conversions_count_; }
  int get_conversions_count() { return *conversions_count_; }
  operator int() const noexcept { return i_; }

private:
  int* conversions_count_;
  int i_;
};

reset_conversions_count should not be relevant anymore.

Then, from the tests, you create a local int conversions = 0 and pass the address of that to Comp2Int.

rarutyun updated this revision to Diff 299055.Oct 19 2020, 7:56 AM
rarutyun marked an inline comment as not done.
rarutyun marked 4 inline comments as done and an inline comment as not done.Oct 19 2020, 8:22 AM
rarutyun added inline comments.
libcxx/include/unordered_map
473

I applied EnableIf pattern with dummy template parameter on the private interfaces and used _EnableIf with return type for public methods.

libcxx/test/support/is_transparent.h
102

I actually thought about that in a different manner. But what I had in mind also doesn't help with the constexpr use-case I guess.

With what you suggest I don't understand how to make Comp2Int class implicitly constructible from int. I just pass one int value to the find or count (or whatever) interface. This int argument won't match the constructor with two arguments.

rarutyun marked an inline comment as done.Oct 19 2020, 10:15 AM

@rarutyun Please look at the tests in D59886 and take anything that increases the coverage.

Hi @ldionne. I have looked at the D59886 review to find any other relevant tests. It implements the P0919R3 without the refinement P1690R1. My review does both.

So when looked at the tests I didn't find any relevant among those because at my point of view they don't bring anything new comparing with tests I wrote.

ldionne requested changes to this revision.Oct 21 2020, 7:39 AM
ldionne added inline comments.
libcxx/test/support/is_transparent.h
102

Ah, yeah, that makes it impossible. If you're using int.

However, what we can do is:

template <typename T>
struct CountConversions {
  CountConversions(T value, int* counter) : ... { }
  operator T() const { ++*conversions_; return value_; }

private:
  T value_;
  int* conversions_;
};

template <typename T>
struct ComparableTo {
  ComparableTo() = default;
  ComparableTo(T value) : value_(value) { }

  friend bool operator==(T const& lhs, ComparableTo const& rhs) {
    return lhs == rhs.value_;
  }

  friend bool operator==(ComparableTo const& lhs, T const& rhs) {
    return lhs.value_ == rhs;
  }

  friend bool operator==(ComparableTo const& lhs, ComparableTo const& rhs) {
    return lhs.value_ == rhs.value_;
  }

private:
  T value_;
};

Then, use:

std::unordered_set<ComparableTo<int>, transparent_hash, std::equal_to<>> set = {
  ComparableTo<int>(1),
  ComparableTo<int>(2),
  ComparableTo<int>(3)
};

int conversions = 0;
assert(c.contains(CountConversions(1, &conversions)));
assert(c.contains(CountConversions(2, &conversions)));
assert(conversions == 0);
assert(!c.contains(CountConversions(4, &conversions)));
assert(conversions == 0);

I haven't tested this, but I think it should work.

107

Is this necessary for testing, or only to implement the comparison operators below and the hashing above? If the latter, I would much rather remove this as it makes things harder to understand -- this type focuses on being comparable with int, and constructible from int.

Edit: Please try using the definitions I've provided above and let me know if that works -- if it does, I think it would be simpler and wouldn't require a conversion operator for the type stored inside the container.

libcxx/test/support/test_transparent_unordered.h
21

Now I understand what you meant in your previous comment. I don't feel like these aliases buy us a lot of readability -- I would prefer removing them.

Also, they seem to be reversed (unord_set_type = UnorderedMap<...> seems wrong).

This revision now requires changes to proceed.Oct 21 2020, 7:39 AM
ckennelly added inline comments.
libcxx/www/cxx2a_status.html
204

Does this implementation take this paper into account?

It reworked the design of heterogeneous lookup from P919.

rarutyun added inline comments.Oct 27 2020, 10:03 AM
libcxx/www/cxx2a_status.html
204

Yes, it does. I wrote about that in some comments. Thanks for noticing. I didn't find it previously when I tried. Will also mark it as completed.

rarutyun updated this revision to Diff 301234.Oct 28 2020, 4:18 AM
rarutyun marked 5 inline comments as done.Oct 28 2020, 4:42 AM
rarutyun added inline comments.
libcxx/test/support/is_transparent.h
102

No, it doesn't. You still have conversions in the scenario above. If you create the unordered_set<ComparableTo<int>> then the T = int then the hidden friends have int as well. When the heterogeneous API is called it calls operator== that causes CountConversions to be converted to int and it makes conversions_ to be incremented. Of course the test fails.

I think that if we create ComparableTo<CountConversions<int>> that might actually work but you need to teach the operator== to work with such types. But the biggest problem is you should create int conversions variable in places where you want to initialize the container but you don't need it, it's just dummy parameter to initialize the container of ComparableTo<CountConversions<int>>.

Having all of that in mind I made a step further and seems like I found the way how to test the containers with passing external reference counter. I uploaded the patch with that approach. It allows to avoid dummy int conversions definition when container is initialized. It also tests the constructor of KeyType, which IMO better fits the testing expectations rather then conversion operator of heterogeneous Key.

107

Yes, it was necessary to implement hash above and allow this type to be used in std::hash<int>. Due to all of these wrappers I've added we cannot use std::hash<int> anyway. But we still need to calculate the hash for both ComparableTo and CounterWrapper. So, I added get() method to both. The other approach is friend but I don't think that it's better

libcxx/test/support/test_transparent_unordered.h
21

Yes, thanks for noticing. It's just name swapping. I probably agree that they don't bring much readability but what the improve is maintainability. I was needed to rewrite the tests on the new approach. Those aliases helped me to do that easier. I would prefer to leave them. They also help to change KeyEqual and Hash types without listing others.

If you insist I may remove them but my opinion that they bring value.

rarutyun marked 2 inline comments as done.Oct 29 2020, 3:07 PM

@ldionne any thoughts on the current state of the review?

ldionne commandeered this revision.Nov 10 2020, 6:00 AM
ldionne edited reviewers, added: rarutyun; removed: ldionne.

I think this is pretty much good to go, however there's a specific way I'd like to structure the test harness to make the approach clearer. I'll commandeer this because it's a lot easier than explaining exactly what I mean, but LMK if you're OK with my approach. It's very very close to yours.

If the update looks good to you, please provide the Author Name <email@domain> for the commit.

ldionne updated this revision to Diff 304162.Nov 10 2020, 6:01 AM

Minor changes to the test harness

ldionne accepted this revision as: Restricted Project.Nov 10 2020, 6:02 AM

I just noticed two small nits.

libcxx/www/cxx2a_status.html
123

Can you change this <td>Complete</td><td></td> to <td>Complete</td><td>12.0</td> so it contains the proper release version?

204

Can you change this <td>Complete</td><td></td> to <td>Complete</td><td>12.0</td> so it contains the proper release version?

I think this is pretty much good to go, however there's a specific way I'd like to structure the test harness to make the approach clearer. I'll commandeer this because it's a lot easier than explaining exactly what I mean, but LMK if you're OK with my approach. It's very very close to yours.

From my perspective the recent changes use the very close approach. So I don't mind regarding to leave it as is. I would like to add the few comments, but just for the information. I don't expect them to be applied

  • I think that LookupType is better name than SearchedType. But I can live with that:)
  • I don't mind with the friend approach. I did it myself but decided to change in the last moment.
  • I don't understand why you changed operator() for hash_impl to take concrete types instead of T&&. If you think that it's better I don't my but I appreciate the explanation.

If the update looks good to you, please provide the Author Name <email@domain> for the commit.

Yes, it looks good to me. Ruslan Arutyunyan <ruslan.arutyunyan@intel.com>

By the way currently the tests failed on GCC with friend approach.

This is the small reproducer with Conformance Viewer: https://godbolt.org/z/enxMxK. As you can see both GCC and MSVC cannot compile the example.

I prepared the small change with adding the proper release version but I cannot upload the new diff. I guess that's because I am not author anymore @ldionne, is it possible to make me author again? I didn't find the way to do it myself, I guess I don't have permissions for that.

I prepared the small change with adding the proper release version but I cannot upload the new diff. I guess that's because I am not author anymore @ldionne, is it possible to make me author again? I didn't find the way to do it myself, I guess I don't have permissions for that.

You should have been able to commandeer the revision again using the Phabricator UI. I don't think I have any special permission that allows me to do this but wouldn't let you.

[...]

  • I don't understand why you changed operator() for hash_impl to take concrete types instead of T&&. If you think that it's better I don't my but I appreciate the explanation.

I used concrete types because I thought it made the code easier to read, and also it ensures that hash gets called with exactly these two types.

Since these changes are very simple, I just made them and I'll be committing this. I also fixed the GCC issue and verified it locally in a Docker image matching the CI bots.

Thanks for your contribution, and also thanks @Mordante for the find!

This revision was not accepted when it landed; it landed in state Needs Review.Nov 11 2020, 2:45 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.

is this really P0919R3?
Looks to me more like P0919R0

Using libcxx 12, the example from P0919r3

struct string_hash {
  using transparent_key_equal = std::equal_to<>;  // Pred to use
  using hash_type = std::hash<std::string_view>;  // just a helper local type
  size_t operator()(std::string_view txt) const   { return hash_type{}(txt); }
  size_t operator()(const std::string& txt) const { return hash_type{}(txt); }
  size_t operator()(const char* txt) const        { return hash_type{}(txt); }
};

std::unordered_map<std::string, int, string_hash> map = /* ... */;
map.find("abc");
map.find("def"sv);

doesn't compile. The example from P0919r0

struct string_hash {
  using is_transparent = void;                    // I confirm I know what I am doing
  using hash_type = std::hash<std::string_view>;  // just a helper local type
  size_t operator()(std::string_view txt) const   { return hash_type{}(txt); }
  size_t operator()(const std::string& txt) const { return hash_type{}(txt); }
  size_t operator()(const char* txt) const        { return hash_type{}(txt); }
};

std::unordered_map<std::string, int, string_hash, std::equal_to<>> map = /* ... */;
map.find("abc");
map.find("def"sv);

compiles fine

is this really P0919R3?
Looks to me more like P0919R0

Using libcxx 12, the example from P0919r3

struct string_hash {
  using transparent_key_equal = std::equal_to<>;  // Pred to use
  using hash_type = std::hash<std::string_view>;  // just a helper local type
  size_t operator()(std::string_view txt) const   { return hash_type{}(txt); }
  size_t operator()(const std::string& txt) const { return hash_type{}(txt); }
  size_t operator()(const char* txt) const        { return hash_type{}(txt); }
};

std::unordered_map<std::string, int, string_hash> map = /* ... */;
map.find("abc");
map.find("def"sv);

doesn't compile. The example from P0919r0

struct string_hash {
  using is_transparent = void;                    // I confirm I know what I am doing
  using hash_type = std::hash<std::string_view>;  // just a helper local type
  size_t operator()(std::string_view txt) const   { return hash_type{}(txt); }
  size_t operator()(const std::string& txt) const { return hash_type{}(txt); }
  size_t operator()(const char* txt) const        { return hash_type{}(txt); }
};

std::unordered_map<std::string, int, string_hash, std::equal_to<>> map = /* ... */;
map.find("abc");
map.find("def"sv);

compiles fine

P0919r3 is not implemented solely. There was another paper [[ https://wg21.link/P1690 | P1690 ]]that is refinement of P0919. They both were merged into the standard and thus, they both should be implemented. transparent_key_equal is no more in the standard. Instead, we have hash that might contain is_transparent and key_equal that also might contain is_transparent.

There is a little bit inconvenience because user cannot just take one paper and understand the design and the API that eventually was merged into the working draft. There might be a lot of papers for one feature but that forces users dig deeper into the C++ standard itself.

Hope, this answer is helpful.

makes perfect sense

Thanks for the clarification!

Via Web