This is an archive of the discontinued LLVM Phabricator instance.

provide Win32 native threading
ClosedPublic

Authored by compnerd on Jan 2 2017, 3:36 PM.

Details

Summary

Add an implementation for the Win32 threading model as a backing API for
the internal c++ threading interfaces. This uses the Fls* family for
the TLS (which has the support for adding termination callbacks),
CRITICAL_SECTIONs for the recursive mutex, and Slim Reader/Writer locks
(SRW locks) for non-recursive mutexes. These APIs should all be
available on Vista or newer.

Diff Detail

Event Timeline

compnerd updated this revision to Diff 82821.Jan 2 2017, 3:36 PM
compnerd retitled this revision from to provide Win32 native threading.
compnerd updated this object.
compnerd added reviewers: EricWF, mclow.lists.
compnerd set the repository for this revision to rL LLVM.
compnerd added subscribers: rnk, majnemer, smeenai, kastiglione.

slim reader-writer locks are faster than critical sections, I'd recommend your implementation switch to those.

Also, why do you use Fls instead of Tls?

EricWF edited edge metadata.Jan 2 2017, 4:38 PM

Added inline comments.

The main issue is that this patch needs to use the same <__threading_support> forward declarations as every other supported threading API; Only new definitions should be provided.

include/__threading_support
36

Isn't checking _LIBCPP_HAS_THREAD_API_WIN32 enough?

49

The forward declarations of the __libcpp_ threading wrapper should be shared between all API's. Please don't add your own forward declarations for Windows.

src/algorithm.cpp
51 ↗(On Diff #82821)

This makes me sad because not having constant initialization manifests itself as a static initialization order fiasco in libc++. If we do go this route please file a libc++ bug stating that Win32 mutex's do not provide constant initialization.

I think this should check _LIBCPP_HAS_THREAD_API_WIN32?

src/memory.cpp
158 ↗(On Diff #82821)

I think this should check _LIBCPP_HAS_THREAD_API_WIN32?

src/mutex.cpp
198 ↗(On Diff #82821)

I think this should check _LIBCPP_HAS_THREAD_API_WIN32?

@majnemer Im using the Fls* APIs since they provide the thread termination callback, which Tls* doesn't. Good point about the SRW. Those are newer, but, we can always provide a fallback later.

@majnemer Im using the Fls* APIs since they provide the thread termination callback, which Tls* doesn't. Good point about the SRW. Those are newer, but, we can always provide a fallback later.

I don't believe they are newer than the condition variable API so all should be fine.

compnerd marked 3 inline comments as done.Jan 2 2017, 5:29 PM

The SRW initializers avoid the initializer changes, so its one less set of changes that is necessary.

include/__threading_support
36

No, because the structure of the header is such that it requires redeclarations :-(.

compnerd updated this revision to Diff 82826.Jan 2 2017, 6:13 PM
compnerd edited edge metadata.

Use SRW locks, rebase for avoid redeclaration of interfaces, remove static initialization check removals.

majnemer added inline comments.Jan 2 2017, 6:32 PM
include/__threading_support
300–305

I don't think you can use slim rw locks for recursive locks. I think we will need to use CRITICAL_SECTION for those. std::recursive_mutex can't be used with std::condition_variable AFAIK so all you need (I think) is recursive versions of __libcpp_mutex_...

Recursive locks should be used far less frequently which makes it valuable, IMO, to use slim rw locks for the non-recursive mutex implementation.

355

I don't think it should be __libcpp_condvar_timedwait's problem. __libcpp_condvar_timedwait wraps pthread_cond_timedwait on POSIX platforms and the caller of __libcpp_condvar_wait properly handles spurious wakeups. The caller of __libcpp_condvar_timedwait probably should be audited.

compnerd added inline comments.Jan 2 2017, 8:41 PM
include/__threading_support
300–305

You are absolutely right. That was something that I looked at originally and went with the CS. However, the overhead of a tagged struct is 5 or 9 bytes (sizeof(void *) + 1) bytes (ignoring padding for MS ABI). Going with that should give the benefits of always being able to properly initialize the CS instead of the kludge.

compnerd added inline comments.Jan 2 2017, 8:41 PM
include/__threading_support
355

SG; seems that there is a single user in condition_variable.cpp

compnerd updated this revision to Diff 82834.Jan 2 2017, 8:44 PM

switch between a CRITICAL_SECTION and SRWLOCK

EricWF added a comment.Jan 2 2017, 8:52 PM

Could you upload this patch with more context?

majnemer added inline comments.Jan 2 2017, 8:54 PM
include/__threading_support
300–305

Er, isn't the overhead much more than that? IIRC, CRITICAL_SECTION is quite large. You'd be making all the users of std::mutex pay for the space of std::recursive_mutex...

smeenai added inline comments.Jan 2 2017, 8:55 PM
include/__threading_support
83

Why not a tagged union?

smeenai added inline comments.Jan 2 2017, 9:03 PM
include/__threading_support
300–305

Yeah, CRITICAL_SECTION is 24 bytes vs 4 bytes for a SRWLOCK.

EricWF added inline comments.Jan 2 2017, 10:39 PM
include/__threading_support
44

Do these definitions have any affect when <Windows.h> has already been included?
Also are these definitions required before including the header, or merely beneficial? If they are required this will make the <Windows.h> header a pain to use with modules.

306

The initial __libcpp_foo declarations specify the correct linkage and visibility attributes. Please don't decorate the definitions.

compnerd updated this revision to Diff 82843.Jan 2 2017, 11:18 PM

update for separation of mutex and recursive_mutex.

compnerd updated this revision to Diff 82846.Jan 2 2017, 11:24 PM

add more context

compnerd added inline comments.Jan 2 2017, 11:26 PM
include/__threading_support
44

No, they dont effect it once it has been included. They are beneficial since they reduce the amount of stuff that gets included (including things which, at least when I last checked, can cause clang to choke).

49

Yeah, restructed that. There is now the dependent patch for the type-erased mutex handling.

EricWF added inline comments.Jan 3 2017, 12:33 AM
include/__threading_support
44

And can users re-include <Windows.h> afterwards in the same TU and get all of the symbols?

kimgr added a subscriber: kimgr.Jan 3 2017, 1:41 AM

Re Eric's windows.h concern.

include/__threading_support
44

I don't think so.

We've recently switched to defining these two symbols in our build system, and I think that's basically the only way to make this work in a project composed of headers from various authors. I think you're right that libc++ should not define them.

awson added a subscriber: awson.Jan 3 2017, 12:24 PM

You don't need #define VC_EXTRA_LEAN since:

  1. VC_EXTRA_LEAN is a mistype, only VC_EXTRALEAN has some meaning in Windows ecosystem;
  2. VC_EXTRALEAN is used *only* in MFC headers which obviously aren't used in clang codebase.

Thus it would be better to remove this altogether, and (perhaps) replace with the relevant #define WIN32_LEAN_AND_MEAN.

halyavin added inline comments.
include/__threading_support
441

We can have integer overflow on cast to DWORD here and get unexpectedly small wait for large timeouts.

478

For other reviewers: CC means Calling Convention.

compnerd updated this revision to Diff 82954.Jan 3 2017, 2:35 PM

Remove incorrect use of VC_EXTRALEAN. Fix signature for __libcpp_set_thread_specific. Provide CC adjustment thunks for the thread creation. Add an assertion for sleep timeout truncation. Implement exec-once interfaces.

halyavin added inline comments.Jan 4 2017, 1:49 AM
include/__threading_support
433

In posix, pthread_cond_timedwait uses absolute time but we use relative time here.

435

We need to include <chrono> to use types and functions from this namespace.

439

You have done it accidentally right. I realized there is a second problem here - if timeout equals to INFINITE, the thread will wait forever. INFINITE equals to (DWORD)-1, so the strict sign in the assert is required. But for sanity sake _VSTD::numeric_limits<DWORD>::max() should be replaced with INFINITE.

461

We need underscores for parameters and init_routine.

472

Underscores for parameters are missing.

502

Trampolines will never be inlined. Should we put them in support *.cpp instead? The downside is new public symbols which can't be changed without breaking backward compatibility. The upside is that we will have only one copy of each trampoline. What do you think?

506

Do we need #include <cstdlib> for free() and malloc() now? Can we use new/delete instead?

BTW, What is the purpose of _VSTD? I think it is used to prevent argument-dependent lookup and so is not needed for free and malloc.

507

Should we even try to pass thread exit code, given that sizeof(unsigned int) < sizeof(void*) on 64-bit system? std::thread doesn't support thread exit code anyway.

rnk added inline comments.Jan 4 2017, 9:18 AM
include/__threading_support
44

I don't think libc++ should not include windows.h in a public header. I'd rather write our own __threading_support_win.h that re-prototypes everything we need from windows.h, and then we can have some test in libc++ that validates that there are no mismatches when including windows.h before libc++ <mutex>. If we do this, we should sink as much win32 API usage as possible out of headers to reduce our duplication.

It's worth pointing out VS 2015's thread implementation also hides its win32 API usage.

502

Considering that libc++ already has a __thread_proxy trampoline, let's just give it the right CC and get rid of this trampoline.

568

Yeah, we need to fix that. We should find a way to make __thread_specific_ptr::__at_thread_exit(void*) have the right convention out of the box, rather than thunking. Something like:

#define __LIBCPP_TLS_CALLBACK_CC WINAPI
.. // else
#define __LIBCPP_TLS_CALLBACK_CC
... // <thread>
    static void __LIBCPP_TLS_CALLBACK_CC __at_thread_exit(void*);
compnerd updated this revision to Diff 83174.Jan 4 2017, 7:14 PM
compnerd updated this object.
compnerd marked 6 inline comments as done.Jan 4 2017, 7:27 PM
compnerd added inline comments.
include/__threading_support
502

I think I prefer @rnk's solution here. Lets try to use the __thread_proxy to hide the thunk.

507

Im not sure what the cleanest way to address this is.

compnerd updated this revision to Diff 83178.Jan 4 2017, 7:28 PM

Address a number of review comments.

compnerd marked an inline comment as done.Jan 4 2017, 7:45 PM
compnerd added inline comments.
include/__threading_support
433

Good catch. I guess we have to convert the timespec to a relative time before using it. Slightly annoying, but thankfully, we can fallback on chrono :-).

compnerd updated this revision to Diff 83182.Jan 4 2017, 7:45 PM
compnerd marked an inline comment as done.

Fix __libcpp_condvar_timedwait parameter usage (absolute vs relative time)

EricWF accepted this revision.Jan 4 2017, 10:00 PM
EricWF edited edge metadata.

This LGTM. I'm sure we can flush out any bugs once we get the tests running.

This revision is now accepted and ready to land.Jan 4 2017, 10:00 PM
halyavin added inline comments.Jan 4 2017, 11:35 PM
include/__threading_support
456

Since negative timeouts can't be avoided, we must make sure that timeout_ms.count() is at least zero.

458

It is >= INFINITE. _LIBCPP_ASSERT has 2 arguments and supports error message out of the box.

compnerd marked an inline comment as done.Jan 5 2017, 10:10 AM
compnerd added inline comments.
include/__threading_support
456

Good point. I suppose that the assert takes care of that though.

458

Shouldnt this be, timeout_ms.count() > 0 which implicitly ensures that it is not INFINITE (-1)?

halyavin added inline comments.Jan 5 2017, 12:46 PM
include/__threading_support
456

Negative timeout_ms.count() is a normal case, we shouldn't fail on assertion. The reason is that program can always be slow enough for current timestamp (system_clock::now()) to pass any fixed point in time (__ts/abstime).

458

I just checked in the Windows SDK, INFINITE is defined as 0xFFFFFFFF and negative timeouts are legal anyway.

EricWF requested changes to this revision.Jan 6 2017, 12:20 PM
EricWF edited edge metadata.

This needs to be rebased following the <__threading_support> cleanup in r291275.

This revision now requires changes to proceed.Jan 6 2017, 12:20 PM
compnerd updated this revision to Diff 83430.Jan 6 2017, 2:21 PM
compnerd edited edge metadata.
compnerd marked an inline comment as done.

rebase, address negative timeouts, fix thunking for __libcpp_tls_create

compnerd updated this revision to Diff 83432.Jan 6 2017, 2:36 PM
compnerd edited edge metadata.
compnerd removed rL LLVM as the repository for this revision.

rebase

LGTM.

include/__threading_support
459

There is still problem with large timeout but let us fix it later.

EricWF added a comment.Jan 6 2017, 3:06 PM

I'm seeing the following build errors on Windows:

C:\Users\Eric\workspace\libcxx\include\__threading_support(521,3):  warning: cannot delete expression with pointer-to-'void' type 'void *' [-Wdelete-incomplete]
  delete __data;
  ^      ~~~~~~
C:\Users\Eric\workspace\libcxx\include\__threading_support(530,18):  error: assigning to 'void *(*)(void *)' from incompatible type 'void (*)(void *)': different return type ('void *' vs 'void')
  data->__func = __func;
                 ^~~~~~
C:\Users\Eric\workspace\libcxx\include\__threading_support(572,5):  error: functions that differ only in their return type cannot be overloaded
int __libcpp_thread_yield()
~~~ ^
C:\Users\Eric\workspace\libcxx\include\__threading_support(178,6):  note: previous declaration is here
void __libcpp_thread_yield();
~~~~ ^
C:\Users\Eric\workspace\libcxx\include\__threading_support(589,5):  error: functions that differ only in their return type cannot be overloaded
int __libcpp_tls_get(__libcpp_tls_key __key)
~~~ ^
C:\Users\Eric\workspace\libcxx\include\__threading_support(185,7):  note: previous declaration is here
void *__libcpp_tls_get(__libcpp_tls_key __key);
~~~~~~^
C:\Users\Eric\workspace\libcxx\include\__threading_support(591,10):  error: cannot initialize return object of type 'int' with an rvalue of type 'PVOID' (aka 'void *')
  return FlsGetValue(__key);
compnerd updated this revision to Diff 83453.Jan 6 2017, 4:01 PM
compnerd set the repository for this revision to rL LLVM.
EricWF added a comment.Jan 6 2017, 4:29 PM

Just tried the updated patch and I still get one build error:

C:\Users\Eric\workspace\libcxx\src\thread.cpp(135,16):  error: use of undeclared identifier 'nanosleep'
        while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
EricWF added inline comments.Jan 6 2017, 4:34 PM
include/__threading_support
29

I think we agreed that we cannot use this macro.

426

DO NOT ANNOTATE THESE DEFINITIONS!! The forward declarations are already properly declared.

majnemer added inline comments.Jan 6 2017, 4:37 PM
include/__threading_support
29–30

Can we do as Reid suggests and not expose users to windows.h?

EricWF added inline comments.Jan 6 2017, 4:42 PM
include/__threading_support
580

The _LIBCPP_TLS_DESTRUCTOR_CC needs to appear an the initial declaration as well.

EricWF added inline comments.Jan 6 2017, 4:53 PM
include/__threading_support
30

Can we do as Reid suggests and not expose users to windows.h?

I was about to ask the same question. These includes are dragging in the __deallocate macro and I would love to avoid that.

compnerd updated this revision to Diff 83469.Jan 6 2017, 5:08 PM
compnerd removed rL LLVM as the repository for this revision.
compnerd marked 2 inline comments as done.

remove WIN32_LEAN_AND_MEAN, fix decoration, remove inline decorations

smeenai added inline comments.Jan 6 2017, 5:23 PM
include/__threading_support
30

I feel like we would end up with a lot of duplication if we went down this route, since this is using a fair amount of Windows APIs. @rnk suggested having a test for prototype mismatches, but even with those checks there could be a high maintenance burden to the duplication.

Was the main objection to WIN32_LEAN_AND_MEAN that it would be problematic for modules? If we're including windows.h, it seems strictly preferable to include it with WIN32_LEAN_AND_MEAN than without, since we'll pull in a lot less that way. Including windows.h without WIN32_LEAN_AND_MEAN can also interact with other headers badly sometimes, e.g. [winsock2.h](https://msdn.microsoft.com/en-us/library/windows/desktop/ms737629%28v=vs.85%29.aspx).

EricWF added inline comments.Jan 6 2017, 5:23 PM
include/__threading_support
30

It seems that dragging in the __deallocate macro is inevitable :-(

I submitted a patch to work around __deallocate here: https://reviews.llvm.org/D28426

EricWF added inline comments.Jan 6 2017, 5:30 PM
include/__threading_support
30

Was the main objection to WIN32_LEAN_AND_MEAN that it would be problematic for modules? If we're including windows.h, it seems strictly preferable to include it with WIN32_LEAN_AND_MEAN than without, since we'll pull in a lot less that way. Including windows.h without WIN32_LEAN_AND_MEAN can also interact with other headers badly sometimes, e.g. winsock2.h.

The objection is that it breaks user code. For example:

#include <thread>
#include <Windows.h> // Windows.h already included as lean and mean.

typedef NonLeanAndMeanSymbol foo; // ERROR NonLeanAndMeanSymbol not defined
smeenai added inline comments.Jan 6 2017, 5:34 PM
include/__threading_support
30

But without the WIN32_LEAN_AND_MEAN, we're gonna break

#include <thread>
#include <winsock2.h>

(you could fix this by reordering the includes, which would also fix your example, or by defining WIN32_LEAN_AND_MEAN yourself, but it doesn't seem great either)

EricWF added inline comments.Jan 6 2017, 5:55 PM
include/__threading_support
30

I would much rather break that code. The fact that Windows.h doesn't play nice with other Windows headers is a problem for Windows not libc++.

EricWF added inline comments.Jan 6 2017, 5:57 PM
include/__threading_support
30

Although I think avoiding the Windows.h include all together would be better (if possible). However I think that can be fixed after committing this.

EricWF accepted this revision.Jan 6 2017, 7:09 PM
EricWF edited edge metadata.

LGTM. I just successfully built this on Windows.

This revision is now accepted and ready to land.Jan 6 2017, 7:09 PM
compnerd closed this revision.Jan 6 2017, 7:18 PM

SVN r291333

include/__threading_support
29–30

I think that I would rather do that in a follow up. This will require a fair amount of duplication.