This is an archive of the discontinued LLVM Phabricator instance.

Differential D37677

[libc++] implement future synchronization using atomic_flag
Changes Planned · Public

Authored by dennis.luxen on Sep 11 2017, 1:59 AM.

Details

Reviewers

EricWF
mclow.lists
cfe-commits

Summary

This task is listed in TODO.txt. The implementation replaces the mutex with a spin lock based on atomic_flag. The spin lock itself is implemented as a nested class in the protected section of the associated state (__assoc_sub_state).

Diff Detail

Build Status

Buildable 10068
Build 10068: arc lint + arc unit

Event Timeline

dennis.luxen created this revision. Sep 11 2017, 1:59 AM

Harbormaster completed remote builds in B10068: Diff 114545. Sep 11 2017, 2:01 AM

Is there a benchmark where this demonstrates some performance improvement? I fear that the switch to condition_variable_any will swamp any performance gains from the switch to a spin lock.

Also, the spin lock is being held during allocating operations (the exception throws and at_thread_exit code). That's a little long to be holding a spin lock.
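The switch mentioned here follows from the interface: `std::condition_variable` only waits on a `std::unique_lock<std::mutex>`, so any other lock type needs `std::condition_variable_any`. A minimal sketch of that combination, assuming a stand-alone `spin_lock` and an illustrative `ready_flag` type (both names are made up for this example and are not libc++ code):

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Illustrative spin lock with the same lock()/unlock() shape as in the diff.
class spin_lock {
    std::atomic_flag locked_ = ATOMIC_FLAG_INIT;
public:
    void lock() { while (locked_.test_and_set(std::memory_order_acquire)) { } }
    void unlock() { locked_.clear(std::memory_order_release); }
};

// Hypothetical one-shot state, loosely modelled on the `ready` bit.
struct ready_flag {
    spin_lock mut;
    std::condition_variable_any cv;  // accepts any lock with lock()/unlock()
    bool ready = false;

    void make_ready() {
        std::lock_guard<spin_lock> lk(mut);
        ready = true;
        cv.notify_all();
    }

    void wait() {
        std::unique_lock<spin_lock> lk(mut);
        cv.wait(lk, [this] { return ready; });  // releases mut while blocked
        assert(lk.owns_lock());  // wait() reacquires the lock before returning
    }
};

// Returns true once the waiter has observed the flag set by another thread.
inline bool wait_for_producer() {
    ready_flag state;
    std::thread producer([&state] { state.make_ready(); });
    state.wait();
    producer.join();
    return state.ready;
}
```

Implementations of `condition_variable_any` typically carry an internal mutex of their own, which is the extra cost this comment is concerned about.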

In D37677#866362, @bcraig wrote:

Is there a benchmark where this demonstrates some performance improvement? I fear that the switch to condition_variable_any will swamp any performance gains from the switch to a spin lock.

Also, the spin lock is being held during allocating operations (the exception throws and at_thread_exit code). That's a little long to be holding a spin lock.

Thanks @bcraig, I will devise a benchmark and report back with its numbers for X86_64.
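As a starting point only, an uncontended lock/unlock micro-benchmark could look like the sketch below. It is not the author's benchmark: `spin_lock`, `timing` and `time_uncontended` are names invented for this example, and the loop measures neither contention nor the cost of `condition_variable_any`, both of which the review asks about.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>

// Illustrative spin lock with the same lock()/unlock() shape as in the diff.
class spin_lock {
    std::atomic_flag locked_ = ATOMIC_FLAG_INIT;
public:
    void lock() { while (locked_.test_and_set(std::memory_order_acquire)) { } }
    void unlock() { locked_.clear(std::memory_order_release); }
};

// Result of one run: elapsed wall time and number of lock acquisitions.
struct timing {
    long long nanoseconds;
    std::uint64_t acquisitions;
};

// Acquires and releases Lock `iterations` times on a single thread.
template <class Lock>
timing time_uncontended(std::uint64_t iterations) {
    assert(iterations > 0);
    Lock lock;
    volatile std::uint64_t counter = 0;  // volatile keeps the loop body alive
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        std::lock_guard<Lock> guard(lock);
        counter = counter + 1;
    }
    const auto stop = std::chrono::steady_clock::now();
    timing result;
    result.nanoseconds = static_cast<long long>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count());
    result.acquisitions = counter;
    return result;
}
```

Calling `time_uncontended<std::mutex>(n)` and `time_uncontended<spin_lock>(n)` with the same `n` gives two numbers to compare; a meaningful result would also need repeated runs and a contended variant.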

TODO.txt says "future should use <atomic> for synchronization." I would have interpreted this as meaning that the line

unsigned __state_;

should become

std::atomic<unsigned> __state_;

and appropriate loads and stores used so that __is_ready() can be checked without having to take the mutex first. OTOH, I don't actually see how that would help: if it's not ready, you probably want to take a unique_lock so you can wait, and if it *is* ready, you probably want to take a lock so that you can get the value out.
Atomics might allow functions like __set_future_attached() to stop taking the lock as well. But again I'm not sure what the benefit would be; the cost is obviously "risk of subtle bugs and maintenance nightmare."
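The interpretation described above can be sketched with a miniature state class. This is an assumption-laden illustration, not libc++ code: `mini_state` and its members are hypothetical, and only the `__constructed` and `ready` bit values are taken from the diff.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical miniature of the associated state; only two bits are modelled.
class mini_state {
    mutable std::mutex mut_;
    std::atomic<unsigned> state_{0};
    int value_ = 0;
public:
    enum : unsigned { constructed = 1, ready = 4 };

    // Readiness check without the mutex: the acquire load pairs with the
    // release in set_value().
    bool is_ready() const {
        return (state_.load(std::memory_order_acquire) & ready) != 0;
    }

    void set_value(int v) {
        std::lock_guard<std::mutex> lk(mut_);
        value_ = v;
        state_.fetch_or(constructed | ready, std::memory_order_release);
    }

    // Reading the value still takes the lock, as the comment anticipates.
    int get() const {
        std::lock_guard<std::mutex> lk(mut_);
        assert(is_ready());
        return value_;
    }
};
```

The sketch shows why the benefit is unclear: the lock-free path only answers "is it ready?", while waiting for the value or fetching it goes through the mutex anyway.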

This current patch just swaps out std::mutex for a std::mutex-alike class that claims to be faster for uncontested accesses. Definitely safer than my interpretation. :) If this patch actually helps, then I would offer that the class could be provided as a reusable class std::__spin_lock in the <mutex> header instead of being hidden inside __assoc_shared_state.

In D37677#866743, @Quuxplusone wrote:

This current patch just swaps out std::mutex for a std::mutex-alike class that claims to be faster for uncontested accesses. Definitely safer than my interpretation. :) If this patch actually helps, then I would offer that the class could be provided as a reusable class std::__spin_lock in the <mutex> header instead of being hidden inside __assoc_shared_state.

I think the bar for accepting this should be significantly faster, and not just a little faster. Spinlocks don't behave as well as mutexes in abnormal conditions. Spinlocks are more likely to cause priority inversion. They are more likely to cause throughput issues when there is a lot of contention, as the spinlock'd thread will consume a full time slice before relinquishing a CPU. On Windows, CRITICAL_SECTION and SRWLOCK become electrified during process termination to avoid indefinite hangs. We shouldn't give all of that up for a minor perf gain. We might give it up for a large perf gain though.
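To make the time-slice concern concrete, one common mitigation is to spin a bounded number of times and then yield. The sketch below is a hypothetical variant and not part of the patch; `yielding_spin_lock`, `contended_count` and the bound of 64 are assumptions chosen for illustration. Yielding does nothing for the priority-inversion or process-termination issues raised here.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical variant, not part of the patch: spin briefly, then yield.
class yielding_spin_lock {
    std::atomic_flag locked_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        const int spin_limit = 64;  // arbitrary illustrative bound
        int spins = 0;
        while (locked_.test_and_set(std::memory_order_acquire)) {
            if (++spins >= spin_limit) {
                std::this_thread::yield();  // let the lock holder run
                spins = 0;
            }
        }
    }
    void unlock() { locked_.clear(std::memory_order_release); }
};

// Increments a plain counter from several threads under the lock.
inline long contended_count(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    yielding_spin_lock mut;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&mut, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<yielding_spin_lock> lk(mut);
                ++counter;
            }
        });
    }
    for (std::thread& th : pool) th.join();
    return counter;
}
```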

I agree with the general consensus that we should only make this change if it's significantly faster, and only after we have a test that demonstrates this.

Unfortunately I don't recall exactly why I wrote that TODO in the first place, but I'm sure I meant changing __state_, and not the lock. I suspect it had to do with http://llvm.org/PR24692 .

include/future
538	It seems this change is ABI breaking, since `mutex` and `__spin_lock` don't have the same layout. The change will need to be guarded behind a `_LIBCPP_ABI_FOO` macro. See `__config` for examples.
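A guard of the kind requested might be shaped like the sketch below. `EXAMPLE_ABI_FUTURE_SPIN_LOCK` is a made-up name standing in for the `_LIBCPP_ABI_FOO` placeholder in the comment, and `spin_lock`, `state_lock` and `sub_state` are illustrative types; none of them are real libc++ identifiers.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <type_traits>

// Hypothetical stand-in for the nested __spin_lock class from the diff.
class spin_lock {
    std::atomic_flag locked_ = ATOMIC_FLAG_INIT;
public:
    void lock() { while (locked_.test_and_set(std::memory_order_acquire)) { } }
    void unlock() { locked_.clear(std::memory_order_release); }
};

// Made-up macro name; a real patch would use a macro defined in <__config>.
#if defined(EXAMPLE_ABI_FUTURE_SPIN_LOCK)
typedef spin_lock state_lock;   // new layout, opt-in only
#else
typedef std::mutex state_lock;  // existing layout stays the default
#endif

// Illustrative state object whose layout depends on the selected lock type.
struct sub_state {
    mutable state_lock mut_;
    unsigned state_ = 0;
};

// True when the default, layout-preserving lock type is in use.
constexpr bool uses_stable_layout() {
    return std::is_same<state_lock, std::mutex>::value;
}
```

Because the two lock types differ in size and layout, any code compiled against one definition of `sub_state` cannot safely share objects with code compiled against the other, which is why the new layout has to be opt-in.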

This revision now requires changes to proceed. Sep 12 2017, 3:50 PM

In D37677#868851, @EricWF wrote:

I agree with the general consensus that we should only make this change if it's significantly faster, and only after we have a test that demonstrates this.

Unfortunately I don't recall exactly why I wrote that TODO in the first place, but I'm sure I meant changing __state_, and not the lock. I suspect it had to do with http://llvm.org/PR24692 .

I talked to Marshall during CppCon and he mentioned that the remark to use atomic was related to using std::call_once. I will rework this patch and use the defines from __config to mark it as an ABI breaking change.

Revision Contents

Path	Size
include/future	39 lines
src/future.cpp	16 lines

Diff 114545

include/future

[361 lines not shown]

*/		*/

#include <__config>		#include <__config>
#include <system_error>		#include <system_error>
#include <memory>		#include <memory>
#include <chrono>		#include <chrono>
#include <exception>		#include <exception>
#include <mutex>		#include <atomic>
		#include <condition_variable>
#include <thread>		#include <thread>

#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)		#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
#pragma GCC system_header		#pragma GCC system_header
#endif		#endif

#ifdef _LIBCPP_HAS_NO_THREADS		#ifdef _LIBCPP_HAS_NO_THREADS
#error <future> is not supported on this single threaded system		#error <future> is not supported on this single threaded system
[149 lines not shown]
#endif		#endif
}		}

class _LIBCPP_TYPE_VIS _LIBCPP_AVAILABILITY_FUTURE __assoc_sub_state		class _LIBCPP_TYPE_VIS _LIBCPP_AVAILABILITY_FUTURE __assoc_sub_state
: public __shared_count		: public __shared_count
{		{
protected:		protected:
exception_ptr __exception_;		exception_ptr __exception_;
mutable mutex __mut_;
mutable condition_variable __cv_;		mutable class __spin_lock
		{
		atomic_flag __locked_ = ATOMIC_FLAG_INIT ;
		public:
		void lock() {
		while (__locked_.test_and_set(memory_order_acquire)) { ; }
		}
		void unlock() {
		__locked_.clear(memory_order_release);
		}
		} __mut_;

		mutable condition_variable_any __cv_;
unsigned __state_;		unsigned __state_;

virtual void __on_zero_shared() _NOEXCEPT;		virtual void __on_zero_shared() _NOEXCEPT;
void __sub_wait(unique_lock<mutex>& __lk);		void __sub_wait(unique_lock<__spin_lock>& __lk);
public:		public:
enum		enum
{		{
__constructed = 1,		__constructed = 1,
__future_attached = 2,		__future_attached = 2,
ready = 4,		ready = 4,
deferred = 8		deferred = 8
};		};

_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
__assoc_sub_state() : __state_(0) {}		__assoc_sub_state() : __state_(0) {}

_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
bool __has_value() const		bool __has_value() const
{return (__state_ & __constructed) || (__exception_ != nullptr);}		{return (__state_ & __constructed) || (__exception_ != nullptr);}

_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
void __set_future_attached()		void __set_future_attached()
{		{
lock_guard<mutex> __lk(__mut_);		lock_guard<__spin_lock> __lk(__mut_);
__state_ |= __future_attached;		__state_ |= __future_attached;
}		}
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
bool __has_future_attached() const {return (__state_ & __future_attached) != 0;}		bool __has_future_attached() const {return (__state_ & __future_attached) != 0;}

_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
void __set_deferred() {__state_ |= deferred;}		void __set_deferred() {__state_ |= deferred;}

[21 lines not shown]

virtual void __execute();		virtual void __execute();
};		};

template <class _Clock, class _Duration>		template <class _Clock, class _Duration>
future_status		future_status
__assoc_sub_state::wait_until(const chrono::time_point<_Clock, _Duration>& __abs_time) const		__assoc_sub_state::wait_until(const chrono::time_point<_Clock, _Duration>& __abs_time) const
{		{
unique_lock<mutex> __lk(__mut_);		unique_lock<__spin_lock> __lk(__mut_);
if (__state_ & deferred)		if (__state_ & deferred)
return future_status::deferred;		return future_status::deferred;
while (!(__state_ & ready) && _Clock::now() < __abs_time)		while (!(__state_ & ready) && _Clock::now() < __abs_time)
__cv_.wait_until(__lk, __abs_time);		__cv_.wait_until(__lk, __abs_time);
if (__state_ & ready)		if (__state_ & ready)
return future_status::ready;		return future_status::ready;
return future_status::timeout;		return future_status::timeout;
}		}
[50 lines not shown]
_LIBCPP_AVAILABILITY_FUTURE		_LIBCPP_AVAILABILITY_FUTURE
void		void
#ifndef _LIBCPP_HAS_NO_RVALUE_REFERENCES		#ifndef _LIBCPP_HAS_NO_RVALUE_REFERENCES
__assoc_state<_Rp>::set_value(_Arg&& __arg)		__assoc_state<_Rp>::set_value(_Arg&& __arg)
#else		#else
__assoc_state<_Rp>::set_value(_Arg& __arg)		__assoc_state<_Rp>::set_value(_Arg& __arg)
#endif		#endif
{		{
unique_lock<mutex> __lk(this->__mut_);		unique_lock<__spin_lock> __lk(this->__mut_);
if (this->__has_value())		if (this->__has_value())
__throw_future_error(future_errc::promise_already_satisfied);		__throw_future_error(future_errc::promise_already_satisfied);
::new(&__value_) _Rp(_VSTD::forward<_Arg>(__arg));		::new(&__value_) _Rp(_VSTD::forward<_Arg>(__arg));
this->__state_ |= base::__constructed | base::ready;		this->__state_ |= base::__constructed | base::ready;
__cv_.notify_all();		__cv_.notify_all();
}		}

template <class _Rp>		template <class _Rp>
template <class _Arg>		template <class _Arg>
void		void
#ifndef _LIBCPP_HAS_NO_RVALUE_REFERENCES		#ifndef _LIBCPP_HAS_NO_RVALUE_REFERENCES
__assoc_state<_Rp>::set_value_at_thread_exit(_Arg&& __arg)		__assoc_state<_Rp>::set_value_at_thread_exit(_Arg&& __arg)
#else		#else
__assoc_state<_Rp>::set_value_at_thread_exit(_Arg& __arg)		__assoc_state<_Rp>::set_value_at_thread_exit(_Arg& __arg)
#endif		#endif
{		{
unique_lock<mutex> __lk(this->__mut_);		unique_lock<__spin_lock> __lk(this->__mut_);
if (this->__has_value())		if (this->__has_value())
__throw_future_error(future_errc::promise_already_satisfied);		__throw_future_error(future_errc::promise_already_satisfied);
::new(&__value_) _Rp(_VSTD::forward<_Arg>(__arg));		::new(&__value_) _Rp(_VSTD::forward<_Arg>(__arg));
this->__state_ |= base::__constructed;		this->__state_ |= base::__constructed;
__thread_local_data()->__make_ready_at_thread_exit(this);		__thread_local_data()->__make_ready_at_thread_exit(this);
}		}

template <class _Rp>		template <class _Rp>
_Rp		_Rp
__assoc_state<_Rp>::move()		__assoc_state<_Rp>::move()
{		{
unique_lock<mutex> __lk(this->__mut_);		unique_lock<__spin_lock> __lk(this->__mut_);
this->__sub_wait(__lk);		this->__sub_wait(__lk);
if (this->__exception_ != nullptr)		if (this->__exception_ != nullptr)
rethrow_exception(this->__exception_);		rethrow_exception(this->__exception_);
return _VSTD::move(*reinterpret_cast<_Rp*>(&__value_));		return _VSTD::move(*reinterpret_cast<_Rp*>(&__value_));
}		}

template <class _Rp>		template <class _Rp>
typename add_lvalue_reference<_Rp>::type		typename add_lvalue_reference<_Rp>::type
__assoc_state<_Rp>::copy()		__assoc_state<_Rp>::copy()
{		{
unique_lock<mutex> __lk(this->__mut_);		unique_lock<__spin_lock> __lk(this->__mut_);
this->__sub_wait(__lk);		this->__sub_wait(__lk);
if (this->__exception_ != nullptr)		if (this->__exception_ != nullptr)
rethrow_exception(this->__exception_);		rethrow_exception(this->__exception_);
return *reinterpret_cast<_Rp*>(&__value_);		return *reinterpret_cast<_Rp*>(&__value_);
}		}

template <class _Rp>		template <class _Rp>
class _LIBCPP_AVAILABILITY_FUTURE __assoc_state<_Rp&>		class _LIBCPP_AVAILABILITY_FUTURE __assoc_state<_Rp&>
[19 lines not shown]
{		{
delete this;		delete this;
}		}

template <class _Rp>		template <class _Rp>
void		void
__assoc_state<_Rp&>::set_value(_Rp& __arg)		__assoc_state<_Rp&>::set_value(_Rp& __arg)
{		{
unique_lock<mutex> __lk(this->__mut_);		unique_lock<__spin_lock> __lk(this->__mut_);
if (this->__has_value())		if (this->__has_value())
__throw_future_error(future_errc::promise_already_satisfied);		__throw_future_error(future_errc::promise_already_satisfied);
__value_ = _VSTD::addressof(__arg);		__value_ = _VSTD::addressof(__arg);
this->__state_ |= base::__constructed | base::ready;		this->__state_ |= base::__constructed | base::ready;
__cv_.notify_all();		__cv_.notify_all();
}		}

template <class _Rp>		template <class _Rp>
void		void
__assoc_state<_Rp&>::set_value_at_thread_exit(_Rp& __arg)		__assoc_state<_Rp&>::set_value_at_thread_exit(_Rp& __arg)
{		{
unique_lock<mutex> __lk(this->__mut_);		unique_lock<__spin_lock> __lk(this->__mut_);
if (this->__has_value())		if (this->__has_value())
__throw_future_error(future_errc::promise_already_satisfied);		__throw_future_error(future_errc::promise_already_satisfied);
__value_ = _VSTD::addressof(__arg);		__value_ = _VSTD::addressof(__arg);
this->__state_ |= base::__constructed;		this->__state_ |= base::__constructed;
__thread_local_data()->__make_ready_at_thread_exit(this);		__thread_local_data()->__make_ready_at_thread_exit(this);
}		}

template <class _Rp>		template <class _Rp>
_Rp&		_Rp&
__assoc_state<_Rp&>::copy()		__assoc_state<_Rp&>::copy()
{		{
unique_lock<mutex> __lk(this->__mut_);		unique_lock<__spin_lock> __lk(this->__mut_);
this->__sub_wait(__lk);		this->__sub_wait(__lk);
if (this->__exception_ != nullptr)		if (this->__exception_ != nullptr)
rethrow_exception(this->__exception_);		rethrow_exception(this->__exception_);
return *__value_;		return *__value_;
}		}

template <class _Rp, class _Alloc>		template <class _Rp, class _Alloc>
class _LIBCPP_AVAILABILITY_FUTURE __assoc_state_alloc		class _LIBCPP_AVAILABILITY_FUTURE __assoc_state_alloc
[1,843 lines not shown]

src/future.cpp

	[85 lines not shown]
	__assoc_sub_state::__on_zero_shared() _NOEXCEPT			__assoc_sub_state::__on_zero_shared() _NOEXCEPT
	{			{
	delete this;			delete this;
	}			}

	void			void
	__assoc_sub_state::set_value()			__assoc_sub_state::set_value()
	{			{
	unique_lock<mutex> __lk(__mut_);			unique_lock<__spin_lock> __lk(__mut_);
	#ifndef _LIBCPP_NO_EXCEPTIONS			#ifndef _LIBCPP_NO_EXCEPTIONS
	if (__has_value())			if (__has_value())
	throw future_error(make_error_code(future_errc::promise_already_satisfied));			throw future_error(make_error_code(future_errc::promise_already_satisfied));
	#endif			#endif
	__state_ |= __constructed | ready;			__state_ |= __constructed | ready;
	__cv_.notify_all();			__cv_.notify_all();
	}			}

	void			void
	__assoc_sub_state::set_value_at_thread_exit()			__assoc_sub_state::set_value_at_thread_exit()
	{			{
	unique_lock<mutex> __lk(__mut_);			unique_lock<__spin_lock> __lk(__mut_);
	#ifndef _LIBCPP_NO_EXCEPTIONS			#ifndef _LIBCPP_NO_EXCEPTIONS
	if (__has_value())			if (__has_value())
	throw future_error(make_error_code(future_errc::promise_already_satisfied));			throw future_error(make_error_code(future_errc::promise_already_satisfied));
	#endif			#endif
	__state_ |= __constructed;			__state_ |= __constructed;
	__thread_local_data()->__make_ready_at_thread_exit(this);			__thread_local_data()->__make_ready_at_thread_exit(this);
	}			}

	void			void
	__assoc_sub_state::set_exception(exception_ptr __p)			__assoc_sub_state::set_exception(exception_ptr __p)
	{			{
	unique_lock<mutex> __lk(__mut_);			unique_lock<__spin_lock> __lk(__mut_);
	#ifndef _LIBCPP_NO_EXCEPTIONS			#ifndef _LIBCPP_NO_EXCEPTIONS
	if (__has_value())			if (__has_value())
	throw future_error(make_error_code(future_errc::promise_already_satisfied));			throw future_error(make_error_code(future_errc::promise_already_satisfied));
	#endif			#endif
	__exception_ = __p;			__exception_ = __p;
	__state_ |= ready;			__state_ |= ready;
	__cv_.notify_all();			__cv_.notify_all();
	}			}

	void			void
	__assoc_sub_state::set_exception_at_thread_exit(exception_ptr __p)			__assoc_sub_state::set_exception_at_thread_exit(exception_ptr __p)
	{			{
	unique_lock<mutex> __lk(__mut_);			unique_lock<__spin_lock> __lk(__mut_);
	#ifndef _LIBCPP_NO_EXCEPTIONS			#ifndef _LIBCPP_NO_EXCEPTIONS
	if (__has_value())			if (__has_value())
	throw future_error(make_error_code(future_errc::promise_already_satisfied));			throw future_error(make_error_code(future_errc::promise_already_satisfied));
	#endif			#endif
	__exception_ = __p;			__exception_ = __p;
	__thread_local_data()->__make_ready_at_thread_exit(this);			__thread_local_data()->__make_ready_at_thread_exit(this);
	}			}

	void			void
	__assoc_sub_state::__make_ready()			__assoc_sub_state::__make_ready()
	{			{
	unique_lock<mutex> __lk(__mut_);			unique_lock<__spin_lock> __lk(__mut_);
	__state_ |= ready;			__state_ |= ready;
	__cv_.notify_all();			__cv_.notify_all();
	}			}

	void			void
	__assoc_sub_state::copy()			__assoc_sub_state::copy()
	{			{
	unique_lock<mutex> __lk(__mut_);			unique_lock<__spin_lock> __lk(__mut_);
	__sub_wait(__lk);			__sub_wait(__lk);
	if (__exception_ != nullptr)			if (__exception_ != nullptr)
	rethrow_exception(__exception_);			rethrow_exception(__exception_);
	}			}

	void			void
	__assoc_sub_state::wait()			__assoc_sub_state::wait()
	{			{
	unique_lock<mutex> __lk(__mut_);			unique_lock<__spin_lock> __lk(__mut_);
	__sub_wait(__lk);			__sub_wait(__lk);
	}			}

	void			void
	__assoc_sub_state::__sub_wait(unique_lock<mutex>& __lk)			__assoc_sub_state::__sub_wait(unique_lock<__spin_lock>& __lk)
	{			{
	if (!__is_ready())			if (!__is_ready())
	{			{
	if (__state_ & static_cast<unsigned>(deferred))			if (__state_ & static_cast<unsigned>(deferred))
	{			{
	__state_ &= ~static_cast<unsigned>(deferred);			__state_ &= ~static_cast<unsigned>(deferred);
	__lk.unlock();			__lk.unlock();
	__execute();			__execute();
	[130 lines not shown]