This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Fix unwrapping ranges with different iterators and sentinels
ClosedPublic

Authored by philnik on Jul 2 2022, 6:43 AM.

Diff Detail

Event Timeline

philnik created this revision.Jul 2 2022, 6:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 2 2022, 6:43 AM
Herald added a subscriber: mgorny. · View Herald Transcript
philnik requested review of this revision.Jul 2 2022, 6:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 2 2022, 6:43 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
huixie90 added inline comments.Jul 2 2022, 2:22 PM
libcxx/include/__algorithm/copy.h
91–96

is it ok pass nullptr to __rewrap_iter. The implementation below seems that __first_un.second can be nullptr

libcxx/include/__algorithm/unwrap_range.h
15

unused?

33

random_access_iterator implies copy_constructible
(forward + are copyable)

35

not sure about the variable name __sent. because there is a template parameter called _Sent, at the first glance, I thought __sent is of type _Sent because of the name. but actually, it is of type _Iter

40

I'd be really cautious about nullptr, which is comparable with pointers. if any of the logic goes wrong, it could result in a runtime UB. Is it possible to avoid nullptr?

philnik updated this revision to Diff 441938.Jul 3 2022, 3:54 AM
philnik marked 4 inline comments as done.
  • Address comments
philnik added inline comments.Jul 3 2022, 3:54 AM
libcxx/include/__algorithm/unwrap_range.h
40

How exactly would anything go wrong at runtime? This results in pair<pair<_Iter, nullptr_t>, _Sent>. The nullptr_t should only ever be passed back to __rewrap_iter. If that weren't the case there would be type mismatches all over the place. (i.e. __rewrap_iter would just return the unwrapped pointer) But anyways, just to make it really hard to break I'll change it to a tag type, since there isn't any downside to do so.

philnik updated this revision to Diff 441939.Jul 3 2022, 3:56 AM

Upload correct diff

I really would like to encourage you to add a bit more comments to new code. Historically we haven't done that a lot, but IMO that makes some older parts of the code harder to understand.

libcxx/include/__algorithm/copy.h
13

Please also add this to the top level algorithm header.

72

Can we use fewer autos here. It's quite hard to understand the types without looking at other places.

74

__last has been moved from at line 70. It there a test that fails, if not please add a test.

libcxx/include/__algorithm/unwrap_range.h
31

Please don't use the automatically deduced auto.

35

Based on the code in __algorithm/copy.h with all .first.first and the like. I would prefer to use a small struct with good names for the types instead of a pair. I think that will aid the readability of the code in __algorithm/copy.h.

huixie90 accepted this revision.Jul 3 2022, 7:09 AM

LGTM.

Regarding testing, I am wondering if the tests cover those cases:
iterators that model c++20's random_access_range but only model c++17's InputIterator (this is very common though)
iterators that model c++17's RandomAccessIterator but only model C++20's input_or_output_iterator (or input_iterator) (this is not very common and I guess we need to define member typedef iterator_category and iterator_concept to make one)

Since the tests in D128611 and D128983 depend on this fix and currently commented with TODO. Depending the order of which patch land first, we need to update those tests

libcxx/include/__algorithm/unwrap_range.h
40

yeah I agree changing it to a tag type is better, at least more obvious to the readers. It is unclear to me what __rewrap_iter would do for nullptr. Since it is a pair, I would guess it might compare the first with second (since pointers are comparable with nullptr_t). With the tag type it is obvious that it is not going to be compared with first

philnik added inline comments.Jul 3 2022, 9:19 AM
libcxx/include/__algorithm/copy.h
13

Why should we do that? It's an implementation detail, and adding the include to the top level header just exposes that implementation detail (even for modules users).

74

See D129044.

libcxx/include/__algorithm/unwrap_range.h
31

I can see you point about readability in general, but in this case I don't think it helps in any way. It would essentially be pair<decltype(std::__unwrap_iter(std::move(std::declval<_Iter>()))), decltype(std::__unwrap_iter(std::move(std::declval<_Iter>())).first)> which is just a really long way to write "Find it out yourself". (The std::moves might be unnecessary, but removing them wouldn't make it any better)

35

I've got

template <class _OrigIter, class _Iter>
struct _UnwrappedRange {
  _OrigIter __rewrap;
  _Iter __first;
  _Iter __last;
};

right now, but that loses the information that you can only rewrap __first. That might be acceptable, Idk. I also really dislike __rewrap. WDYT?

Mordante added inline comments.Jul 5 2022, 9:31 AM
libcxx/include/__algorithm/copy.h
13

Are other header not allowed to use this? unwrap_iter is also in <algorithm> so IMO this should be there too.

libcxx/include/__algorithm/unwrap_range.h
31

I agree that's also not nice to read. But I think we should give some thought about whether we can do better. I feel this entire unwrapping looks quite hard to understand. The original unwrap code already took some effort to understand, but I feel it gets worse with ranges.

I will give it some more thought.

35

Maybe rename __rewrap to __base?
When it's not possible to rewrap __last we can add some comment why it's not possible.

philnik updated this revision to Diff 444034.Jul 12 2022, 11:54 AM

Rebased and updated to new style

ldionne requested changes to this revision.Jul 21 2022, 8:18 AM

Please ping me on Discord once you have implemented all comments just so I can do another quick pass, and we can ship this in time for LLVM 15.

libcxx/.clang-format
71

Shouldn't be here.

libcxx/include/__algorithm/unwrap_range.h
45

std::move?

48

This would avoid a copy?

53

Please leave at least some sort of documentation explaining what this file and group of functions do. And in particular, it would be useful to explain why they are useful/needed.

59

Would it make sense to std::move here?

74

I had completely missed this because it's a large-ish block.

77

std::move

82

std::move

libcxx/include/__iterator/reverse_iterator.h
335–339

I think this diff doesn't belong here.

This revision now requires changes to proceed.Jul 21 2022, 8:18 AM
philnik updated this revision to Diff 447421.Jul 25 2022, 11:40 AM
philnik marked 16 inline comments as done.
  • Address comments
ldionne accepted this revision.Jul 25 2022, 11:56 AM
ldionne added inline comments.
libcxx/include/__algorithm/unwrap_range.h
29
This revision is now accepted and ready to land.Jul 25 2022, 11:56 AM
philnik updated this revision to Diff 447663.Jul 26 2022, 6:14 AM
philnik marked an inline comment as done.
  • Address comments
libcxx/include/__algorithm/copy.h
13

Other headers are allowed to use this, but they should include __algorithm/unwrap_range.h and not algorithm.

libcxx/include/__algorithm/unwrap_range.h
48

It turns out that this causes the overloads to be ambiguous. I'll try to improve it later, since this code probably still needs sum debugging.

philnik updated this revision to Diff 447769.Jul 26 2022, 11:13 AM
  • Try to fix CI
philnik updated this revision to Diff 448022.Jul 27 2022, 6:02 AM
  • Try to fix CI
philnik updated this revision to Diff 448170.Jul 27 2022, 2:53 PM
  • Next try