Page MenuHomePhabricator

[libc++] Implement P0288R9 (move_only_function)
Changes PlannedPublic

Authored by philnik on Jul 27 2022, 5:57 AM.


Group Reviewers
Restricted Project

Diff Detail

Event Timeline

philnik created this revision.Jul 27 2022, 5:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2022, 5:57 AM
Herald added a subscriber: mgorny. · View Herald Transcript
philnik requested review of this revision.Jul 27 2022, 5:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2022, 5:57 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
philnik planned changes to this revision.Jul 28 2022, 11:36 AM

Thanks for looking into this :) std::move_only_function is very high on the list of my favorite new C++23 features!


Is it more efficient to only set __status_ = NotEngaged? Afaict, this would overwrite only 1 instead of 48 bytes

(here and a couple of other places which set __storage_ = {}


redundant return before else?


std::bit_cast<__ReturnT (*)(...)> instead of std::bit_cast<void (*)(...)>?

Also, it seems there are no test cases which instantiate move_only_function with a return type != void?


I think this should also take _LIBCPP_MOVE_ONLY_FUNCTION_NOEXCEPT into account when casting to the function pointer.

As written now, for move_only_function<void () noexcept>, this operator() will contain a call to a non-noexcept function. The compiler will hence generate additional exception handling code. Also, this blocks tail call optimization (


I think this can be implemented more efficiently by folding this switch with the indirect function call we are doing here anyway.

Proposed changes

Move this switch into __function_wrappers_impl::__call, taking _Store::_Status as a template parameter:

template<_Store::_Status status>
__call(_Store* store, _ArgTypes&&... __args) noexcept(_LIBCPP_MOVE_ONLY_FUNCTION_NOEXCEPT) {
    auto __doCall = [&](void* __functor) {
        return std::invoke(static_cast<_Functor _LIBCPP_MOVE_ONLY_FUNCTION_INV_QUALS>(*static_cast<_Functor*>(__functor)),
    using enum _Store::_Status;
    if constexpr (status == _NotEngaged) {
      _LIBCPP_ASSERT(false, "Tried to call move_only_function which doesn't hold a callable object");
    }  else if constexpr (status == _TriviallyDestructible) {
        return __doCall(&__storage_.__trivially_destructible_.__data_);
    }  else if constexpr (status == _TriviallyRelocatable) {
        return __doCall(&__storage_.__trivially_relocatable_.__data_);
    }  else if constexpr (status == _Heap) {
        return __doCall(&__storage_.__large_.__data_);

In __construct, the __storage.__large_.__call would then be assigned similar to

auto& __data   = __storage_.__large_;
__data.__call_ = std::bit_cast<void*>(&_FuncWraps::__call<_Store::_Status::_Heap>);

Because the __call_ member is always at the same position inside the __move_only_function_storage, we can then implement operator() as a simple

_LIBCPP_HIDE_FROM_ABI _ReturnT operator()(_ArgTypes... __args) _LIBCPP_MOVE_ONLY_FUNCTION_CV_REF
  return std::bit_cast<ReturnT (*)(_LIBCPP_MOVE_ONLY_FUNCTION_CV  _Store*, _ArgTypes...)>(__storage_.__large_.__call_)(
      &__storage, std::forward<_ArgTypes>(__args)...);

Why I think this is a good idea

Replacing the switch with a if constexpr chain guarantees that this switch is completely optimized away.

Futhermore for usages like

void setCallback(std::move_only_function<void()> cb);

void setCallbackWithGreetings(bool bye, const char* whom) {
     takesSomeCallback([=]() {
          std::cout << (bye ? "Goodbye " : "Hello ") << whom << "\n";

the compiler can inline the lambda body into the __function_wrappers_impl::__call. The compiler can then fold the offset computation for __storage_.__large_.__data_ with the offset computations for whom and bye. With the previous implementation of operator() this would not have been possible since the __storage_.__large_.__data_ happens before the indirect call, while the whom & bye offset computations happened inside the indirect call.

avogelsgesang added inline comments.Jul 29 2022, 9:00 PM

__functor should probably be the last argument.

This function forwards all its arguments except for __functor.
Removing an argument from the front of the argument list means that all parameters must be "shifted" between the registers/on the stack.
If instead functor is the last argument, no arguments need to be moved around.


This is also beneficial for operator(): operator() has to add the functor to the argument list. Appending an argument is trivial, but prepending is expensive...


Not sure if this idea is blessed by the standard or undefined behavior, but:

After changing operator() to append instead of prepend the pointer to the functor, this could be rewritten to std::bit_cast<void*>(static_cast<UnrefFunc>(_func));, i.e. we could directly store the function pointer inside __call_ instead of a pointer to the _FuncWraps::__call dispatch function.

The operator() would call this function pointer with one additional argument. But at on the x64 calling conventions, it is fine to call a function with additional parameters. Additional parameters are simply ignored...

avogelsgesang added inline comments.Aug 2 2022, 4:02 AM

nevermind - on 2nd thought: this code is only removing the first argument if __function is a function pointer. If it is, e.g., a lambda then the first argument, i.e. the this pointer, will be replaced but all other arguments stay in their original position.

Similarly, operator() does not actually prepend an argument, but just replaces the first argument, i.e. the this pointer


nevermind - see other comment why this makes no sense

lichray requested changes to this revision.Aug 10 2022, 12:17 PM
lichray added a subscriber: lichray.
lichray added inline comments.

We should aim for a macro-free implementation to benefit users.


In his CppCon 2017 talk, Louis @ldionne showed that embedding a vtable in the object storage is not an optimization compared to using a vptr even if we ignore the fact that embedded vtable uses more space. Also, mo-func of 6 x pointers in size may be too large. It's even larger than Swift's existential container (3-pointers + metadata + vptr).


Similarly, operator() does not actually prepend an argument, but just replaces the first argument, i.e. the this pointer


  1. _ArgTypes&& - should consider passing objects by copy rather than references as an optimization in certain cases.
  2. Please make sure that we don't have a design that works like the following: when move_only_function is initialized from a pointer to function, we need to dereference __functor first to reach the function pointer, and then make a call via that function pointer.

It seems that some logic is embedded in the mo-func body, rather than dispatched. __destroy_ is not sufficiently used.


Please make sure the type is ABI-compatible after adopting It is on the way to C++26.


We should test against behaviors, not guts. If the difference is not observable according to the standard, consider observing some extension such as a more strict noexcept.