Page MenuHomePhabricator

[ADT] Add a range-based version of std::move
ClosedPublic

Authored by njames93 on Jul 15 2020, 1:21 PM.

Details

Summary

Adds a range-based version of std::move, the version that moves a range, not the one that creates r-value references.

Diff Detail

Event Timeline

njames93 created this revision.Jul 15 2020, 1:21 PM
Herald added a project: Restricted Project. ยท View Herald TranscriptJul 15 2020, 1:21 PM

Could you add a test case?

njames93 updated this revision to Diff 278391.Jul 16 2020, 12:54 AM

Added unit test.

I am happy to see this change, as I could use it inside my clang-tidy check as well!

llvm/unittests/ADT/STLExtrasTest.cpp
582

We could implement a dummy class here in the test code with an explicit move and copy ctors and assignment operators to not depend on the behaviour of std::string. That's because if small-string optimization is employed in the implementation of std::string, it is perfectly valid, and thus possible that the 'valid but unspecified' state of the moved-from string is not empty.

dblaikie added inline comments.Jul 17 2020, 12:18 PM
llvm/unittests/ADT/STLExtrasTest.cpp
584

Moved-from strings aren't guaranteed to be empty (they're in a "valid but unspecified" state - for small strings it's possible they still contain the original value) - probably using a specific type that counts moves/copies/destructions might help.

Range<Movable, Copyable> further up in this file provides that - but perhaps could use a refactor Maybe pull out most of the functionality (except the begin/end function) into a "LifetimeTracker" class - and then use that in this test?

njames93 updated this revision to Diff 278897.Jul 17 2020, 1:57 PM

Removed std::string from test case in favour of custom moveable class.

njames93 marked 4 inline comments as done.Jul 17 2020, 2:12 PM
njames93 added inline comments.
llvm/unittests/ADT/STLExtrasTest.cpp
582

Thats a good point, this ought to do the job regardless of implementation specifics.

584

I just created a moveable class, seemed simplest and safest way.

gamesh411 added inline comments.Jul 17 2020, 2:25 PM
llvm/unittests/ADT/STLExtrasTest.cpp
10

Why is Twine necessary? Maybe just an unintended addition?

ychen added a subscriber: ychen.Jul 17 2020, 2:29 PM
ychen added inline comments.
llvm/include/llvm/ADT/STLExtras.h
1542

Why std::forward is not used?

njames93 marked 5 inline comments as done.Jul 17 2020, 3:38 PM
njames93 added inline comments.
llvm/include/llvm/ADT/STLExtras.h
1542

Forwarding the Range or the Output iterator. Forwarding the range to adl_begin and adl_end is not a good idea as if you pass an rvalue e.g. llvm::move(std::move(vec), output), You're gonna have a bad time, I guess the output iterator could be forwarded, not sure how much would be gained from that.

llvm/unittests/ADT/STLExtrasTest.cpp
10

Must be a clangd IWYU inclusion, I never added it.

njames93 updated this revision to Diff 278916.Jul 17 2020, 3:38 PM
njames93 marked an inline comment as done.

Remove unneeded include

gamesh411 accepted this revision.Jul 17 2020, 4:00 PM
gamesh411 marked an inline comment as done.

LGTM!

llvm/include/llvm/ADT/STLExtras.h
1542

IMO, if return std::move(adl_begin(std::forward<R>(Range)), adl_end(std::forward<R>(Range)), Out); was meant, then forwarding to 2 calls is not really wise, as even if function argument evaluation is undefined but not overlapping (as from c++17 I think), one of the move-s will happen before the other, and the other move would be a use-after-move error.

Forwarding Out is out of the question as it is not a universal reference, and moving the result out would be a potentially unnecessary move (RVO with 2 stage overload resolution comes into mind), and nothing would be gained by explicitly moving the return value of the std::move algorithm.

Finally, iterators themselves are implemented with value semantics ingrained, designed to be cheap to copy.

All in all, I would only see merit in forwarding if Range was used exactly once, but even then perfect forwarding can interfere with overload resolution and type deduction, which is not something you want in such a generic piece of code.

This revision is now accepted and ready to land.Jul 17 2020, 4:00 PM
ychen added inline comments.Jul 17 2020, 4:12 PM
llvm/include/llvm/ADT/STLExtras.h
1542

I meant forwarding Range. We don't have to call forwarding twice. We could call it once. The point is the Range is of universal reference type (which means llvm::move(std::move(vec), output) should be a valid case), there should a reason not to forwarding. This is more about the type-safety range than efficiency.

ychen added inline comments.Jul 17 2020, 4:16 PM
llvm/include/llvm/ADT/STLExtras.h
1542

Actually, calling std::forward twice is a little bit confusing but not wrong.

gamesh411 added inline comments.Jul 17 2020, 4:24 PM
llvm/include/llvm/ADT/STLExtras.h
1542

I am really no expert or authority, but you could be right if forwarding twice would somehow guarantee that both rvalue references refer to something that is 'safe to use' so to speak. It would go against my intuition, but again you might be right. However, even if this is the case, it should be done in every algorithm (like in llvm::copy above), and that alone would warrant that this change of applying perfect forwarding to all algorithms a separate patch.

ychen added inline comments.Jul 17 2020, 4:32 PM
llvm/include/llvm/ADT/STLExtras.h
1542

I wouldn't be 100% sure, but this is my intuition. I just realized that @dblaikie authored many other similar ones as you mentioned. Probably he got better idea.

gamesh411 added inline comments.Jul 17 2020, 4:35 PM
llvm/include/llvm/ADT/STLExtras.h
1542

I would be really interested in a throughout explanation as well :)

dblaikie added inline comments.Jul 17 2020, 4:42 PM
llvm/include/llvm/ADT/STLExtras.h
1542

Not a fantastically better idea, unfortunately - I've had similar thoughts to most sides of this debate.

I don't know of any container that offers, for instance, moving iterators when you retrieve begin/end from the range - so it's "optimistic" at best to have those std::forward calls in for some such hypothetical situation.

ychen added inline comments.Jul 17 2020, 4:43 PM
llvm/include/llvm/ADT/STLExtras.h
1542

Digging a little bit more. I think adl_detail::adl_begin(ContainerTy &&container) should be adl_detail::adl_begin(ContainerTy &container) since std::begin does not have a universal reference version and that makes sense. With this, many other clients using adl_begin()/adl_end() need not use universal reference type.

dblaikie added inline comments.Jul 17 2020, 4:58 PM
llvm/include/llvm/ADT/STLExtras.h
1542

Perhaps give it a go - but I believe that'd break use with const containers.

Essentially for the same reason this code fails to compile:

template<typename T>
void f1(T&);
int main() {
  f1(3); // No matching function for call to 'f1'  test.cpp:2:6: note: candidate function [with T = int] not viable: expects an l-value for 1st argument 
}
ychen added inline comments.Jul 17 2020, 5:06 PM
llvm/include/llvm/ADT/STLExtras.h
1542

add a const l-value reference version should do it. std::begin has this version.

template<typename T>
void f1(const T&);

dblaikie added inline comments.Jul 17 2020, 5:08 PM
llvm/include/llvm/ADT/STLExtras.h
1542

What benefit would be gained by using overloads, rather than perfect forwarding?

gamesh411 added inline comments.Jul 17 2020, 5:35 PM
llvm/include/llvm/ADT/STLExtras.h
1542

This is painful... If you want to support both non-const and const iterators coming out of adl_begin *without perfect forwarding*, you are forced to do both overloads (the const-ref and the non-const-ref).
The added benefit would be, that you don't have to concern yourself with temporary containers. The downside would be that you have to maintain both overloads for every algorithm. The rationale behind not supporting pure rvalue (temporary) containers is that any iterators coming out of them would be invalidated by the time the algorithm returns, which is not practical. Const references would still allow this weird use-case, but then again there are algorithms which don't necessarily return an iterator (for_each). Those could be used with even a temporary.
I would be most definitely on the perfect forwarding side, if it could be guaranteed that forwarding to 2 functions is safe. This is safe if nowhere along the call stack is a move called on Range. if adl_begin *only ever* calls begin *and* begin will *never* be overloaded on the instance variable being rvalue (like it_t begin() && {} inside the container, and doing something fancy inside), *or* if a free function begin is chosen that doesn't move the value (a move could occur in this case even if the chosen overload takes the param by value). Those conditions are most likely to hold now and even in the future, but the problem is you are depending on the implementation to be correct with no way of ensuring it is on the library level.

dblaikie added inline comments.Jul 17 2020, 6:06 PM
llvm/include/llvm/ADT/STLExtras.h
1542

Writing const and non-const overloads of all of these seems a bit of a pain that I'd rather avoid, with pretty low-risk of downsides - and I'm generally pretty risk averse.

If a container does implement an rvalue overload for begin/end - I mean, what would that mean exactly? (if it provided only the rudimentary guarantee of "valid but unspecified state" - I could more reasonably imagine a container with && begin/end overloads that don't produce a "valid but unspecified state" but instead specifically produce move iterators & do so correctly when you call both begin and end)

So maybe we do need the std::forward - to cope with temporary containers, even if they aren't being moved-from? Anyone care to test that?

My test, at least:

#include <utility>   
void f1(const int&) = delete;  
void f1(int&);  
void f1(const float&);  
void f1(float&) = delete;  
template<typename T>  
void t1(T &&t) {  
  f1(t);  
}  
int main() {  
  int i;  
  t1(i);  // compiles
  const float f = 3.0;
  t1(f);  // compiles
  t1(3.0f) // fails to compile without std::forward
}

Pretty narrow-use-case, would only come up if you had non-member begin/end that took a const R&? We don't really have a lot of non-member begin/end anyway (probably none? nearly?) - but seems like a good thing to generalize over.

ychen added inline comments.Jul 17 2020, 6:11 PM
llvm/include/llvm/ADT/STLExtras.h
1542

The benefit is to let API not take rvalue when it does not mean to. But in this case, this requires, as @gamesh411 said, duplicating the const and non-const L-value version for more than 10 methods.

So the universal reference here is to reduce the number of overloads. So probably we should remove the std::forwarding along the stack to indicate that the rvalue does not matter.

It is also probably worthwhile to add a comment about this in general.

dblaikie added inline comments.Jul 17 2020, 6:22 PM
llvm/include/llvm/ADT/STLExtras.h
1542

I think it should take rvalues, though, as in:

llvm::move(build_container(), std::back_inserter(v));

Or other such examples - in fact move seems like the poster child for algorithms you'd like to pass an rvalue to. Even if you don't have rvalue begin/end overloads.

The std::forward does seem potentially necessary to support that use case, again, even when there are no rvalue begin/end overloads - as in the f1/t1 example above, where a const ref and non-const ref overloads aren't handled correctly in the presence of a temporary passed through the template indirection. Even when none of the f1 overloads traffic in rvalue parameters themselves.

ychen added inline comments.Jul 17 2020, 6:37 PM
llvm/include/llvm/ADT/STLExtras.h
1542

hmmm, interesting, yeah, for this to work, we need std::forward to keep the constness.

ychen added inline comments.Jul 17 2020, 7:38 PM
llvm/include/llvm/ADT/STLExtras.h
1542

Could not resist to try it out: https://gcc.godbolt.org/z/T3hs5G (with std::forward removed along the stack)

rvalue should becomes lvalue down the stack so that adl_begin/end return non-const iterator to do the move.

dblaikie added inline comments.Jul 20 2020, 2:18 PM
llvm/include/llvm/ADT/STLExtras.h
1542

Not sure I follow - in any case, test cases (for this patch) would be good, to demonstrate what is/isn't needed.

njames93 marked an inline comment as done.Jul 20 2020, 2:48 PM
njames93 added inline comments.
llvm/include/llvm/ADT/STLExtras.h
1542

Test cases for what exactly??

llvm::move(build_container(), iter);
llvm::move(std::move(container), iter);
dblaikie added inline comments.Jul 20 2020, 9:22 PM
llvm/include/llvm/ADT/STLExtras.h
1542

Test cases for what exactly??

llvm::move(build_container(), iter);
llvm::move(std::move(container), iter);

Sure, things like that - ^ do both of those work with this patch as-is? I guess one or both of these require the std::forwards to be added?

njames93 updated this revision to Diff 279476.Jul 21 2020, 4:03 AM

Added more test cases

njames93 marked an inline comment as done.Jul 21 2020, 4:13 AM
njames93 added inline comments.
llvm/unittests/ADT/STLExtrasTest.cpp
613

When I put std::forward on the range, this call failed to compile as the copy constructor of Foo has been deleted. So for some reason when the container is forwarded to adl_begin and adl_end, it results in trying to create a copy of the container first.

This call though doesn't actually move V2, its contents have been moved, but the items are still in there (size() == ItemCount).
That one has me a little perplexed. Should say nothing strange showed up with ASAN even when I changed to using heap allocation instead of the SmallVector inline storage.

To further investigate the moving story, I also created a small example on Godbolt:
https://godbolt.org/z/5qMf1o

Note that the example linked works as expected even though perfect forwarding is employed everywhere. This is because the begin and is implemented by taking const reference.
One interesting thing to see if you would take A instances by value in begin and end functions, a move would occur and the use-after-move would be demonstrated.
Now, I don't think this is a very likely case, still, this is the best I could come up with, that is hypothetically dangerous.

ychen added inline comments.Jul 21 2020, 11:20 AM
llvm/unittests/ADT/STLExtrasTest.cpp
613

Without std::forward in llvm::move, std::begin/end in adl_begin/end would get a lvalue(yes, the type are still rvalue ref), so non-const version of std::begin/end is called, so elements could be moved.

If std::forward is in llvm::move, it would be perfect forwarded to std::begin/end, trigger its const version which return const iterator that could not be moved.

All in all, the universal reference here is confusing, and it is not meant for perfect forwarding, just a shortcut for overloading const and non-const reference of parameters.

gamesh411 added inline comments.Jul 21 2020, 11:59 AM
llvm/unittests/ADT/STLExtrasTest.cpp
613

Good point! I have also been bitten by std::move algorithm falling back silently to copy-ing if somehow a const-iterator is given to it (and the copy ctor of the underlying element is not deleted). I agree that the main reason for universal ref here is to avoid overloading! ๐Ÿ‘

dblaikie accepted this revision.Jul 23 2020, 7:25 PM

Yeah, please commit this as-is, without the std::forward calls, and including the test cases you've provided that help exercise/demonstrate the necessity.

llvm/unittests/ADT/STLExtrasTest.cpp
613

I don't think it's trying to create a copy of the container, so far as I can understand from the error. Here's the error I get:

/usr/local/google/home/blaikie/dev/llvm/src/llvm/include/llvm/ADT/SmallVector.h:249:33: error: call to deleted constructor of 'Foo'
    ::new ((void*) this->end()) T(Elt);
                                ^ ~~~
/usr/local/google/home/blaikie/install/bin/../lib/gcc/x86_64-pc-linux-gnu/10.0.0/../../../../include/c++/10.0.0/bits/stl_iterator.h:532:13: note: in instantiation of member function 'llvm::SmallVectorTemplateBase<Foo, false>::push_back' requested here
        container->push_back(__value);
                   ^
/usr/local/google/home/blaikie/install/bin/../lib/gcc/x86_64-pc-linux-gnu/10.0.0/../../../../include/c++/10.0.0/bits/stl_algobase.h:426:18: note: in instantiation of member function 'std::back_insert_iterator<llvm::SmallVector<Foo, 4>>::operator=' requested here
              *__result = std::move(*__first);
                        ^
/usr/local/google/home/blaikie/install/bin/../lib/gcc/x86_64-pc-linux-gnu/10.0.0/../../../../include/c++/10.0.0/bits/stl_algobase.h:499:22: note: in instantiation of function template specialization 'std::__copy_move<true, false, std::random_access_iterator_tag>::__copy_m<const Foo *, std::back_insert_iterator<llvm::SmallVector<Foo, 4>>>' requested here
                              _Category>::__copy_m(__first, __last, __result);
                                          ^
/usr/local/google/home/blaikie/install/bin/../lib/gcc/x86_64-pc-linux-gnu/10.0.0/../../../../include/c++/10.0.0/bits/stl_algobase.h:533:19: note: in instantiation of function template specialization 'std::__copy_move_a2<true, const Foo *, std::back_insert_iterator<llvm::SmallVector<Foo, 4>>>' requested here
    { return std::__copy_move_a2<_IsMove>(__first, __last, __result); }
                  ^
/usr/local/google/home/blaikie/install/bin/../lib/gcc/x86_64-pc-linux-gnu/10.0.0/../../../../include/c++/10.0.0/bits/stl_algobase.h:541:8: note: in instantiation of function template specialization 'std::__copy_move_a1<true, const Foo *, std::back_insert_iterator<llvm::SmallVector<Foo, 4>>>' requested here
                std::__copy_move_a1<_IsMove>(std::__niter_base(__first),
                     ^
/usr/local/google/home/blaikie/install/bin/../lib/gcc/x86_64-pc-linux-gnu/10.0.0/../../../../include/c++/10.0.0/bits/stl_algobase.h:628:19: note: in instantiation of function template specialization 'std::__copy_move_a<true, const Foo *, std::back_insert_iterator<llvm::SmallVector<Foo, 4>>>' requested here
      return std::__copy_move_a<true>(std::__miter_base(__first),
                  ^
/usr/local/google/home/blaikie/dev/llvm/src/llvm/include/llvm/ADT/STLExtras.h:1542:15: note: in instantiation of function template specialization 'std::move<const Foo *, std::back_insert_iterator<llvm::SmallVector<Foo, 4>>>' requested here 
  return std::move(adl_begin(std::forward<R>(Range)), adl_end(std::forward<R>(Range)), Out);
              ^
/usr/local/google/home/blaikie/dev/llvm/src/llvm/unittests/ADT/STLExtrasTest.cpp:613:9: note: in instantiation of function template specialization 'llvm::move<llvm::SmallVector<Foo, 4>, std::back_insert_iterator<llvm::SmallVector<Foo, 4>>>' requested here
  llvm::move(std::move(V2), std::back_inserter(V3));
        ^
/usr/local/google/home/blaikie/dev/llvm/src/llvm/unittests/ADT/STLExtrasTest.cpp:578:5: note: 'Foo' has been explicitly marked deleted here
    Foo(const Foo &) = delete;
    ^

I think the issue is that adl_begin(std::forward<R>(Container)) is giving an rvalue to adl_begin, which goes down, etc, etc, then std::begin which is const ref/non-const-ref overloaded ends up calling the const ref version.

On that basis, perhaps not using forward on calls to adl_begin/adl_end might actually be the right thing to do - means rvalues turn into lvalues (which they are, in that context - it was an rvalue to the original function, but an lvalue when asking for its begin/end).

(& sorry for the large sidetrack - and thanks for everyone talking it through, testing it out, etc to figure out the issues around std::forward)

This revision was automatically updated to reflect the committed changes.