Page MenuHomePhabricator

Align mapped_iterator::reference type with mapped_iterator::operator*() return value.
Needs ReviewPublic

Authored by ikelarev on May 11 2020, 12:59 PM.

Details

Summary

This change is to provide FuncReturnTy as a reference type to iterator_adaptor_base class. It fixes a compilation error when
mapped_iterator is wrapped into another iterator object (e.g. reverse_iterator) and its operator*() tries to return a reference type.
Actually mapped_iterator could not provide a references to the values it iterates on, which leads to the compilation error.

Example 1:

std::vector<int> V;
auto It = map_iterator(V.begin(), [](int) { return 0; });
decltype(It)::reference R = *It; // error C2440: 'initializing': cannot convert 
                                 // from 'FuncReturnTy' to 'int &' with FuncReturnTy=int

Example 2:

std::vector<int> V;
auto R1 = map_range(V, [](int) { return 0; });
auto R2 = reverse(R1);
*R1.begin(); // OK
*R2.begin(); // error C2440: 'return': cannot convert from
             // 'FuncReturnTy' to 'int &' with FuncReturnTy=int

Diff Detail

Event Timeline

ikelarev created this revision.May 11 2020, 12:59 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2020, 12:59 PM

I don't think it'd technically meet the iterator requirements to try to reverse a mapped iterator that returns by value - anything other than an Input/OutputIterator has to return a reference to the value, I think? ( http://eel.is/c++draft/forward.iterators#1.3 ) So perhaps map_range should check if the functor returns by value, then the iterator type should be Input, not anything more powerful that it doesn't actually conform to?

ikelarev updated this revision to Diff 263291.May 11 2020, 3:05 PM

I'm not sure if this addresses my concern - it doesn't chance the iterator category, which seems like the main problem (if the functor returns by value, I think the iterator category must be downgraded to "input")

If I'm understanding the proposed change correctly - this is saying "if the return type of the functor is the same as the value type of the underlying iterator - then the pointer type can be transparent, otherwise it's "pointer to FuncReturnTy" - but that seems incorrect/like it introduces a new bug. Even if the return type is the same as the value type, the fact that the functor's return value is a temporary in the "op*" means returning a pointer or reference to it would be incorrect/leave a dangling pointer, right?

ikelarev edited the summary of this revision. (Show Details)May 11 2020, 4:01 PM

David, thank you for your comments! Actually I didn't try to change pointer type and difference type of the iterator, I left them exactly equal to default values of iterator_adaptor_base. I just wanted to fix reference type of the iterator. I added another example (without map_range/reverse) to make clearer what I mean. Shouldn't operator* return exactly iterator's reference type here?

std::vector<int> V;
auto It = map_iterator(V.begin(), [](int) { return 0; });
decltype(It)::reference R = *It; // error C2440: 'initializing': cannot convert 
                                 // from 'FuncReturnTy' to 'int &' with FuncReturnTy=int
ikelarev updated this revision to Diff 263305.May 11 2020, 4:21 PM

David, I agree with you regarding the pointer type, it really seems incorrect too. I modified my patch to replace it with FuncReturnTy as well.

ikelarev edited the summary of this revision. (Show Details)May 11 2020, 4:30 PM

I don't think this is correct either - the mapping function might return by reference - in which case changing it to return by value would probably be not what the user wants. Also, the iterator requirements state that for a forward iterator (or any refinement of that) the reference type must be "T&" (http://eel.is/c++draft/forward.iterators#1.3), so this would not conform to those requirements & I think the right/necessary thing to do is to downgrade the iterator category to "input iterator" in the case where the functor returns by value.

@dblaikie are you saying that a mapped_iterator that returns by value should never be used by a reverse_iterator?

@dblaikie are you saying that a mapped_iterator that returns by value should never be used by a reverse_iterator?

That's my hypothesis/understanding, yes. I'm open to other ideas/counterpoints.

@dblaikie are you saying that a mapped_iterator that returns by value should never be used by a reverse_iterator?

That's my hypothesis/understanding, yes. I'm open to other ideas/counterpoints.

An alternative: A mapped_iterator that invokes the functor on increment, and stores the result in the iterator itself. That way its op* can return a reference to that object, etc. Then you satisfy the other iterator requirements.

So, I guess, we could make that happen implicitly - if the functor returns by value we switch to an iterator implementation that stashes that value in a member, otherwise if it returns by reference we can assume it has storage elsewhere and use the existing implementation.

ikelarev updated this revision to Diff 263470.May 12 2020, 10:40 AM

Thanks, David, I believe I see your point. I added a separated implementation of mapped_iterator for the case when functor returns non-reference value. As a reference implementation I used existing pointer_iterator class, which has mutable Ptr member updated in operator*() call.

I think it might be more obvious if it's implemented as two specializations, rather than one generic/one specialized - since they're kind of "equal" here (so it's not obvious in the current code that the primary template actually only applies when IsReference is true) - maybe add a comment before the boolean literal in the specializations to mention what it represents (so you don't have to go check the primary template declaration to see what the boolean is about)?

Should we call "F" every time the iterator is dereferenced? Or should it be done on increment... guess it can't be done on increment because you don't know at that point whether the new iterator is valid for dereferencing. If "V" is an Optional, then you could set it to None on increment, initialize it in op* if and only if it's None. That way it's only called once for the value, which might be important? Any idea what Boost or other iterators do here - there might be some useful prior art to learn from/model on?

(some testing would be good for this - unit tests)

ikelarev updated this revision to Diff 263587.May 12 2020, 6:41 PM

@dblaikie
Hi David! I've implemented the class as two specializations of the template and added a test, could you please take a look?

@dblaikie
Hi David! I've implemented the class as two specializations of the template and added a test, could you please take a look?

Any chance you've looked into prior art (I assume Boost has something like this) with regards to when/how often the functor should be called?

Any chance you've looked into prior art (I assume Boost has something like this) with regards to when/how often the functor should be called?

David, as I can see, in boost transform_iterator calls the functor directly every time operator* is called.

More detailed, transform_iterator does not impelement operator* itself, it just have dereference() method which returns m_f(*this->base()) value. operator* is implemented in iterator_facade class, the base class of iterator_adaptor class which is base class of transform_iterator. iterator_facade class operator* implementation just returns iterator_core_access::dereference(this->derived()) value, where iterator_core_access::dereference method (declared in auxiliary iterator_core_access class) just calls f.dereference() method.

I used this source codes: https://github.com/boostorg/iterator/tree/develop/include/boost/iterator

Hmm - starting to think I might be unqualified to properly review this & may've given you a bad direction to go down and/or this isn't possible to properly support at least given my understanding of iterator requirements.

The example code you're trying to make work (reverse of a mapped iterator) I think is still invoking UB with this new implementation. Inside reverse_iterator, it's doing something that amounts to Iter tmp = current; return *--tmp; (to quote cppreference.com) which, in the case of a mapped iterator over a value-returning functor, would return a reference to a member of tmp, that reference would be dangling in the caller, so in your test case, the initialization of Val1 and Val2 should be UB, because it's dereferencing a dangling reference.

(this is essentially summed up on cppreference by the comment "std::reverse_iterator does not work with iterators that return a reference to a member object (so-called "stashing iterators"). An example of stashing iterator is std::filesystem::path::iterator." - https://en.cppreference.com/w/cpp/iterator/reverse_iterator - though I'm not sure how that requirement is stated by the C++ standard)

Have I misunderstood? Do you think this is supported/supportable in some way that ensures the iterators conform to their requirements as per the C++ standard?

@dblaikie
Yes, David, you are right. reverse_iterator should not be used with such implementation. Talking about boost implementation I noticed this thing:

        // By default, dereferencing the iterator yields the same as
        // the function.
        typedef typename ia_dflt_help<
            Reference
#ifdef BOOST_RESULT_OF_USE_TR1
          , result_of<const UnaryFunc(typename std::iterator_traits<Iterator>::reference)>
#else
          , result_of<const UnaryFunc&(typename std::iterator_traits<Iterator>::reference)>
#endif
        >::type reference;

In other words, for the function returning by value boost's transform_iterator::operator* will also return by value without any reference. Like I proposed at the very beginning - explicitly specify "reference" type as the resulting type of the functor. Yes, this solution does not 100% conform to iterator requirements, but it's simple and does not bring any UB into the code. And boost uses this approach.

Boost documentation says the same: "If Reference is use_default then the reference member of transform_iterator is result_of<const UnaryFunction(iterator_traits<Iterator>::reference)>::type. Otherwise, reference is Reference" (https://www.boost.org/doc/libs/1_73_0/libs/iterator/doc/transform_iterator.html).

template <class UnaryFunction,
          class Iterator,
          class Reference = use_default,
          class Value = use_default>
class transform_iterator
{
public:
  typedef /* see below */ value_type;
  typedef /* see below */ reference;
...

Boost documentation says the same: "If Reference is use_default then the reference member of transform_iterator is result_of<const UnaryFunction(iterator_traits<Iterator>::reference)>::type. Otherwise, reference is Reference" (https://www.boost.org/doc/libs/1_73_0/libs/iterator/doc/transform_iterator.html).

template <class UnaryFunction,
          class Iterator,
          class Reference = use_default,
          class Value = use_default>
class transform_iterator
{
public:
  typedef /* see below */ value_type;
  typedef /* see below */ reference;
...

Hrm - seems your original proposed change would be a bit more nuanced than Boost's solution, if I'm understanding things correctly - boost's solution is to use FuncReturnTy as the reference type unconditionally/as the default (so if your functor returns by value, you have a reference type that's a value, etc - if it returns by reference, you've got a reference, etc). Your proposed solution is to always use a value type as the reference type and to use the underlying iterator's pointer type if the FuncReturnTy (without reference - so, the value type essentially?) matches the value type of the underlying iterator?

Perhaps just do what Boost's doing here - reference type based on the functor's return type? So if you return a subobject of the underlying object by reference, it can continue to be a reference, but if you compute some value from the underlying object and return by value, it's a value.

(@rsmith @EricWF as C++ librarians in case they have some particular insight on this quirky corner)