This is an archive of the discontinued LLVM Phabricator instance.

[libc++][P1115][C++20] Improving the Return Value of Erase-Like Algorithms II: Free erase/erase if.
ClosedPublic

Authored by curdeius on Mar 10 2020, 4:20 AM.

Details

Summary

This patch adds return type to std::erase and std::erase_if functions.

Also:

  • Update __cpp_lib_erase_if to 202002L.
  • Fix synopsis in unordered_map.
  • Fix generate_feature_test_macro_components.py script.

Diff Detail

Event Timeline

curdeius created this revision.Mar 10 2020, 4:20 AM
Herald added a project: Restricted Project. · View Herald Transcript
curdeius edited the summary of this revision. (Show Details)Mar 10 2020, 4:22 AM
curdeius added reviewers: mclow.lists, ldionne.
curdeius marked an inline comment as done.Mar 10 2020, 4:29 AM
curdeius added inline comments.
libcxx/include/deque
3028–3032

For deque, string and vector, it might be nice to refactor this repetitive code (e.g. to a __libcpp_erase_container function, analogue to __libcpp_erase_if_container).
However, I don't know in which header it should be put as these 3 headers have no good candidate being a common include.
Note that __libcpp_erase_if_container is in <functional> and that doesn't pose any problems for maps/sets as they already include it.
Vector and string already include <__functional_base> but deque doesn't, so I'm not sure what should be the best solution. Creating a new helper header? Leaving it unrefactored?

Possible implementation of the helper:

template <class _Container>
inline typename _Container::size_type
__libcpp_erase_container(_Container& __c,
                         typename _Container::iterator __new_end) {
  typename _Container::size_type __erased_count =
      std::distance(__new_end, __c.end());
  __c.erase(__new_end, __c.end());
  return __erased_count;
}
curdeius updated this revision to Diff 249323.Mar 10 2020, 5:07 AM

Fixed forgotten return type tests.

curdeius updated this revision to Diff 249324.Mar 10 2020, 5:10 AM

Fix string.

curdeius marked an inline comment as not done.Mar 10 2020, 5:12 AM
curdeius updated this revision to Diff 249338.Mar 10 2020, 5:51 AM

Fix tests with custom allocators. Fix assert in map test.

curdeius updated this revision to Diff 249344.Mar 10 2020, 5:58 AM

Fix __cpp_lib_erase_if value.

The pre-merge build results show this build as failed because of failing clang-format linter tests (https://reviews.llvm.org/harbormaster/build/52549/, https://results.llvm-merge-guard.org/amd64_debian_testing_clang8-4604/summary.html).
However, these "failures" are all misindentations of preprocessor directives in generated files in libcxx/test/std/language.support/support.limits/support.limits.general/.
That's out of scope of this patch, but isn't the real fix modifying .clang-format config file to indent PP directives (or having a local copy of .clang-format in tests)?

EricWF requested changes to this revision.Mar 11 2020, 8:23 PM

I think there is a bug in the paper, but I think the return type should be difference_type not size_type.
The paper says the return value is equivalent to:

auto r = distance(a, b);
return r;

And std::distance returns difference_type.

What do you think?

libcxx/include/functional
3087

We need to avoid calling an overloaded comma operator here.

This revision now requires changes to proceed.Mar 11 2020, 8:23 PM
Herald added a reviewer: Restricted Project. · View Herald TranscriptMar 11 2020, 8:23 PM

It would be nice if we could remove the calls to std::difference and replace them with a count variable so that we only have to iterate once. It may require adding a __libcpp_ erase function, though.

curdeius updated this revision to Diff 249862.Mar 12 2020, 2:23 AM

Avoid using overloaded comma operator.

curdeius updated this revision to Diff 249863.Mar 12 2020, 2:25 AM
curdeius marked 2 inline comments as done.

Fix typo.

It would be nice if we could remove the calls to std::difference and replace them with a count variable so that we only have to iterate once. It may require adding a __libcpp_ erase function, though.

Where do you see that the iteration takes place twice? std::distance on LegacyRandomAccessIterators is an O(1) operation. And all of basic_string, deque and vector iterators belong to this category.
What do you mean by adding a count variable? remove and remove_if functions do not provide us with a count.

I think there is a bug in the paper, but I think the return type should be difference_type not size_type.

I'm not sure if it's a bug. I think that a coherence with other APIs was intended. Apart the additions in P0646, there is already pre-C++11 size_type map::erase(key_type) as well as size_type set::erase(key_type).

Otherwise, to be coherent with __libcpp_erase_if_container and avoid casting difference_type to size_type, I'd suggest returning old_size - size() instead of using distance.

libcxx/include/functional
3087

Done.

curdeius updated this revision to Diff 249883.Mar 12 2020, 3:40 AM

Avoid using std::distance.

curdeius marked an inline comment as done.Mar 12 2020, 3:44 AM

Sorry, this slipped through my inbox somehow.

Where do you see that the iteration takes place twice? std::distance on LegacyRandomAccessIterators is an O(1) operation. And all of basic_string, deque and vector iterators belong to this category. What do you mean by adding a count variable? remove and remove_if functions do not provide us with a count.

Here's the patch I made to implement the first paper [1]. @mclow.lists gave me some great review comments, which apply to this patch as well. Essentially, we never want to traverse the whole range twice (i.e. remove_if and distance). It looks like that's not a problem anymore, though.

That being said, you make a good point. In all those cases, x.distance isn't a traversal of the container. Make sure that you're using the method, not the std::distance function, though.

[1]: https://reviews.llvm.org/D58332

libcxx/include/functional
3084

This seems like a weird place to have this function. Is there another header that would work better?

I know you didn't make this change but, I figured I'd ask anyway.

3085

Instead of calling size, maybe make this a count that is incremented throughout the loop? I don't think it matters much but, it might make it a bit more general.

libcxx/include/string
4397

Why not use __libcpp_erase_if_container?

libcxx/include/vector
3405

Also could use __libcpp_erase_if_container, no?

curdeius marked 4 inline comments as done.Apr 10 2020, 4:53 AM

Hopefully I answered your questions, @zoecarver.

libcxx/include/functional
3084

Well, this function is used in map, set, unordered_map and unordered_set headers. The common includes of these are: __config, __node_handle, functional and version. IMO, __config and version are really not the place for this. functional seems okayish. I'm unsure about __node_handle.
I'd rather leave it as is.

I'd rather be reluctant as to creating another __header for this particular function, but I don't know what is the policy of using additional headers in libc++.

3085

I have no opinion on this. Both seem ok.

libcxx/include/string
4397

I think that __libcpp_erase_if_container would be inefficient for vector-like containers, as you would call erase n times (n = count of removed elements), hence moving left all elements after the erased one n times.

libcxx/include/vector
3405

Ditto.

ldionne accepted this revision.May 1 2020, 9:20 AM

Thanks for your patience. Regarding the code duplication, I think the only way to go is to split up libc++ into tinier headers. Trying to find a "clever" place to put common code is just a waste of time.

This revision was not accepted when it landed; it landed in state Needs Review.May 2 2020, 5:16 AM
This revision was automatically updated to reflect the committed changes.

Does anyone of you know why libcxx ARM buildbots fail after changing include headers?

All of them fail (http://lab.llvm.org:8011/console) with something similar to (http://lab.llvm.org:8011/builders/libcxx-libcxxabi-libunwind-armv7-linux/builds/1314/steps/test.libcxx/logs/FAIL%3A libc%2B%2B%3A%3Astds_include.sh.cpp):

/home/tcwg-buildslave/worker/libcxx-libcxxabi-libunwind-armv7-linux/llvm/libcxx/test/libcxx/modules/stds_include.sh.cpp:37:2: fatal error: file '/home/tcwg-buildslave/worker/libcxx-libcxxabi-libunwind-armv7-linux/llvm/libcxx/include/version' has been modified since the module file '/tmp/org.llvm.clang.11827/ModuleCache/8KDYU3/std-1WP0A9P.pcm' was built
#include <vector>
 ^
/home/tcwg-buildslave/worker/libcxx-libcxxabi-libunwind-armv7-linux/llvm/libcxx/test/libcxx/modules/stds_include.sh.cpp:37:2: note: please rebuild precompiled header '/tmp/org.llvm.clang.11827/ModuleCache/8KDYU3/std-1WP0A9P.pcm'
1 error generated.

It happens whenever an include is modified and it seems that the precompiled headers are not updated accordingly. Do you know if there's a fix? Do you know who's the maintainer of these bots?

@curdeius It's an issue with the bots, they need to be fixed. @ldionne let the maintainers know so, I don't think there's anything else to do.

@curdeius It's an issue with the bots, they need to be fixed. @ldionne let the maintainers know so, I don't think there's anything else to do.

I actually think it's a module cache invalidation bug. I think @EricWF has more context?