This is an archive of the discontinued LLVM Phabricator instance.

ADT: Fix const-correctness of iterator adaptors
ClosedPublic

Authored by dexonsmith on Nov 3 2021, 5:41 PM.

Details

Summary

This fixes const-correctness of iterator adaptors, dropping non-const
overloads for operator*().

Iterators, like the pointers that they generalize, have two types of
const.

The const qualifier on members indicates whether the iterator itself
can be changed. This is analagous to int *const.

The const qualifier on return values of operator*(), operator[](),
and operator->() controls whether the the pointed-to value can be
changed. This is analogous to const int *.

Since operator*() does not (in principle) change the iterator, then
there should only be one definition, which is const-qualified. E.g.,
iterators wrapping int* should look like:

int *operator*() const; // always const-qualified, no overloads

ba7a6b314fd14bb2c9ff5d3f4fe2b6525514cada changed iterator_adaptor_base
away from this to work around bugs in other iterator adaptors. That was
already reverted. This patch adds back its test, which combined
llvm::enumerate() and llvm::make_filter_range(), adds a test for
iterator_adaptor_base itself, and cleans up the const-ness of the
other iterator adaptors.

This also updates the documented requirements for
iterator_facade_base:

/// OLD:
///   - const T &operator*() const;
///   - T &operator*();

/// New:
///   - T &operator*() const;

In a future commit we might also clean up iterator_facade's overloads
of operator->() and operator[](). These already (correctly) return
non-const proxies regardless of the iterator's const qualifier.

Diff Detail

Event Timeline

dexonsmith requested review of this revision.Nov 3 2021, 5:41 PM
dexonsmith created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2021, 5:41 PM
dexonsmith added inline comments.Nov 3 2021, 5:45 PM
llvm/unittests/ADT/IteratorTest.cpp
104

I expected this test to fail (and the proof below not to compile) after https://reviews.llvm.org/D112981, but locally it seemed fine. It appears that const ReferenceT is equivalent to int& when ReferenceT==int&... which confuses me a bit, but so be it.

124

This returned false after https://reviews.llvm.org/D112981 and the proof below failed to compile.

Thanks for all the fixes!
@dblaikie is a better reviewer than I am for this, I'm easily lost in the layers of iterators wrapping iterators...

llvm/unittests/ADT/IteratorTest.cpp
104

Right this always confuses me as well, this is "reference is a const reference, but not a reference to const", so the "const" applies to the reference, but a reference is always "const" (as in int *const).
This is why we have std::remove_reference_t in generic code right?

dexonsmith updated this revision to Diff 384626.Nov 3 2021, 5:59 PM
dexonsmith edited the summary of this revision. (Show Details)

Squashed in https://reviews.llvm.org/D112981 since it was reverted (effectively just picks up its test).

dexonsmith marked an inline comment as done.Nov 3 2021, 6:07 PM
dexonsmith added inline comments.
llvm/unittests/ADT/IteratorTest.cpp
104

Right, right; thanks!

In which case, https://reviews.llvm.org/D112981 only adds const to by-value returns (which are super rare, outside of proxies).

dexonsmith marked an inline comment as done.Nov 3 2021, 6:57 PM

@dblaikie / others, I have some doubts after doing a deeper audit of custom iterators in LLVM.

There are lots of custom iterators that *only* define a non-const qualified operator*(). In some cases, it'd make sense to make a member mutable, or change the return type to always be const-qualified. But perhaps not in every case?

Another option, which would be consistent with iterator_facade_base's implementations of [] and ->, would be to define both here but not adjust the return type:

ReferenceT operator*() { return *I; } // Note: return has no `const`
ReferenceT operator*() const { return *I; }

Other adaptors could also overload operator*(), but also would avoid adding const to the return.

This would:

  • Conform to the ReferenceT defined in the template parameters (avoids my main concern from https://reviews.llvm.org/D112981).
  • Support adapting iterators where operator*() is non-const.
  • Error on const-qualified dereferences if the adapted iterator doesn't return ReferenceT for that.

WDYT?

There are lots of custom iterators that *only* define a non-const qualified operator*(). In some cases, it'd make sense to make a member mutable, or change the return type to always be const-qualified. But perhaps not in every case?

I think that covers all cases I can think of, and seems like the right thing - but if it means lots of cleanup, and just keeping/making two overloads (const/non-const) that both return non-const ref (well, return ref, whatever that is - const ref for const_iterators (that themselves may or may not be const... you know what I mean)) is significantly cheaper right now - I'd be OK with that. Seems like it'd make things incrementally better at least - leave a comment aobut it being a stop-gap, not a complete solution, maybe some comments about the functionality or patterns that are deprecated/urge people towards just writing the one const op*?

There are lots of custom iterators that *only* define a non-const qualified operator*(). In some cases, it'd make sense to make a member mutable, or change the return type to always be const-qualified. But perhaps not in every case?

I think that covers all cases I can think of, and seems like the right thing - but if it means lots of cleanup, and just keeping/making two overloads (const/non-const) that both return non-const ref (well, return ref, whatever that is - const ref for const_iterators (that themselves may or may not be const... you know what I mean)) is significantly cheaper right now - I'd be OK with that. Seems like it'd make things incrementally better at least - leave a comment aobut it being a stop-gap, not a complete solution, maybe some comments about the functionality or patterns that are deprecated/urge people towards just writing the one const op*?

I think I'd rather commit as-is with just one operator*() const if you agree it's the right answer. (I found the counter-examples by doing a grep for operator*() {, but they aren't being used in iterator adaptors right now... so I don't think we need a stop-gap.)

Think there's value in adding a test that a second operator non-const-qualified operator isn't added as a place to add a comment, something like:

// There should only be one `operator*()` and it should be const-qualified.
// Rather than adding overloads, fields in adapted iterators should be made
// mutable or ReferenceT should be `const`-qualified.
auto x = &decltype(make_filter_range(...).begin())::operator*;
auto x = &decltype(make_map_range(...).begin())::operator*;
auto x = &decltype(reverse(...).begin())::operator*;
...

?

dblaikie accepted this revision.Nov 10 2021, 4:09 PM

There are lots of custom iterators that *only* define a non-const qualified operator*(). In some cases, it'd make sense to make a member mutable, or change the return type to always be const-qualified. But perhaps not in every case?

I think that covers all cases I can think of, and seems like the right thing - but if it means lots of cleanup, and just keeping/making two overloads (const/non-const) that both return non-const ref (well, return ref, whatever that is - const ref for const_iterators (that themselves may or may not be const... you know what I mean)) is significantly cheaper right now - I'd be OK with that. Seems like it'd make things incrementally better at least - leave a comment aobut it being a stop-gap, not a complete solution, maybe some comments about the functionality or patterns that are deprecated/urge people towards just writing the one const op*?

I think I'd rather commit as-is with just one operator*() const if you agree it's the right answer. (I found the counter-examples by doing a grep for operator*() {, but they aren't being used in iterator adaptors right now... so I don't think we need a stop-gap.)

Yep, I'm on board!

Think there's value in adding a test that a second operator non-const-qualified operator isn't added as a place to add a comment, something like:

// There should only be one `operator*()` and it should be const-qualified.
// Rather than adding overloads, fields in adapted iterators should be made
// mutable or ReferenceT should be `const`-qualified.
auto x = &decltype(make_filter_range(...).begin())::operator*;
auto x = &decltype(make_map_range(...).begin())::operator*;
auto x = &decltype(reverse(...).begin())::operator*;
...

?

Can we put that in the iterator adapter base somehow, so we don't have to make it a statement about each use of iterator adapters?

This revision is now accepted and ready to land.Nov 10 2021, 4:09 PM
This revision was automatically updated to reflect the committed changes.

Thanks for the review! Landed in 1b651be0465de70cfa22ce4f715d3501a4dcffc1.

Think there's value in adding a test that a second operator non-const-qualified operator isn't added as a place to add a comment, something like:

// There should only be one `operator*()` and it should be const-qualified.
// Rather than adding overloads, fields in adapted iterators should be made
// mutable or ReferenceT should be `const`-qualified.
auto x = &decltype(make_filter_range(...).begin())::operator*;
auto x = &decltype(make_map_range(...).begin())::operator*;
auto x = &decltype(reverse(...).begin())::operator*;
...

?

Can we put that in the iterator adapter base somehow, so we don't have to make it a statement about each use of iterator adapters?

I tried adding something like static_assert(&DerivedT::operator*, "") to iterator_adaptor_base but at the point of evaluation (DerivedT's parent class), DerivedT doesn't have an operator* yet. For now I've left this alone.

(BTW, I have some follow-ups to clean up operator-> and operator[].)