This is an archive of the discontinued LLVM Phabricator instance.

[sanitizer] Don't reuse the main thread in ThreadRegistry
ClosedPublic

Authored by kubamracek on Apr 29 2016, 6:49 AM.

Details

Summary

There is a hard-to-reproduce crash happening on OS X that involves terminating the main thread (dispatch_main does that, see discussion at http://reviews.llvm.org/D18496) and later reusing the main thread's ThreadContext. This patch disables reuse of the main thread. I believe this problem exists only on OS X, because on other systems the main thread cannot be terminated without exiting the process.
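
A minimal sketch of the idea, not the actual patch: it assumes that contexts of finished threads pass through a fixed-size FIFO quarantine before they are handed out again, and that the main thread has tid 0. `ToyThreadRegistry`, `ToyThreadContext`, and their members are hypothetical names used only for illustration; they are not the sanitizer runtime's API.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical toy model of a thread registry; not the sanitizer runtime.
struct ToyThreadContext {
  int tid = 0;
  int reuse_count = 0;
  bool alive = false;
};

class ToyThreadRegistry {
 public:
  explicit ToyThreadRegistry(std::size_t quarantine_size)
      : quarantine_size_(quarantine_size) {}

  // Hands out a context for a new thread, preferring a recycled one.
  ToyThreadContext *CreateThread() {
    ToyThreadContext *ctx = nullptr;
    if (!free_list_.empty()) {
      ctx = free_list_.front();
      free_list_.pop_front();
      ++ctx->reuse_count;
    } else {
      // std::deque::push_back keeps pointers to existing elements valid.
      contexts_.push_back(ToyThreadContext());
      ctx = &contexts_.back();
      ctx->tid = static_cast<int>(contexts_.size()) - 1;
    }
    ctx->alive = true;
    return ctx;
  }

  // Marks a thread as finished and quarantines its context for later reuse.
  void FinishThread(ToyThreadContext *ctx) {
    assert(ctx != nullptr);
    ctx->alive = false;
    if (ctx->tid == 0)
      return;  // Assumption: tid 0 is the main thread; never recycle it.
    quarantine_.push_back(ctx);
    if (quarantine_.size() > quarantine_size_) {
      // The oldest quarantined context becomes available for reuse.
      free_list_.push_back(quarantine_.front());
      quarantine_.pop_front();
    }
  }

  std::size_t NumContexts() const { return contexts_.size(); }

 private:
  std::size_t quarantine_size_;
  std::deque<ToyThreadContext> contexts_;       // owns every context
  std::deque<ToyThreadContext *> quarantine_;   // finished, not yet reusable
  std::deque<ToyThreadContext *> free_list_;    // ready to be reused
};
```

In this model, worker contexts are recycled once the quarantine overflows, while the context with tid 0 is never handed out again, no matter how many threads are created after the main thread has finished.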

Diff Detail

Repository
rL LLVM

Event Timeline

kubamracek updated this revision to Diff 55591. Apr 29 2016, 6:49 AM
kubamracek retitled this revision to [sanitizer] Don't reuse the main thread in ThreadRegistry.
kubamracek updated this object.
kubamracek added reviewers: dvyukov, samsonov, glider, kcc.
kubamracek added a project: Restricted Project.
dvyukov edited edge metadata. May 1 2016, 11:25 PM

Why is it difficult to reproduce? The quarantine is a FIFO queue of size 16, IIRC. So if you create/destroy 16 threads, the next thread creation will reuse the oldest thread context. Or does the crash not always happen after reuse?
I am hinting at a test.

The crash happens with GCD worker threads. The only way I know of to wait for a worker thread to be destroyed is to use long sleep()s, and even then it's non-deterministic and the actual delays differ between OS versions.

If you can make the test crash once in 100 runs that's still better than no test. If we have a regression, the failure will be detected eventually (on bots or in manual test runs).

The test would need to sleep for a long time (~30 seconds total). Is that okay?

No, it is not OK. What does the test look like?
Note that to get thread id reuse, you don't need to use GCD worker threads: you can create 16 normal threads, ensure that they have started, and join them.
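
The create/start/join pattern suggested here could be sketched roughly as follows. This is an illustration only: `StartAndJoinThreads` is a hypothetical helper, it uses `std::thread` rather than raw pthreads, and whether running it actually causes a thread context to be reused depends on the runtime's quarantine behavior.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: starts `n` threads, keeps each one alive until all of
// them have started, then joins them. Returns how many threads started.
int StartAndJoinThreads(int n) {
  assert(n > 0);
  std::atomic<int> started(0);
  std::vector<std::thread> threads;
  threads.reserve(static_cast<std::size_t>(n));
  for (int i = 0; i < n; ++i) {
    threads.emplace_back([&started, n] {
      started.fetch_add(1);
      // Stay alive until every sibling thread has started as well.
      while (started.load() < n)
        std::this_thread::yield();
    });
  }
  for (std::thread &t : threads)
    t.join();
  return started.load();
}
```

Calling `StartAndJoinThreads(16)` creates and destroys 16 ordinary threads, which matches the quarantine size mentioned earlier in the thread.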

The crash only happens when the main thread is reused as a worker thread. It's hard to trigger that, since you can't control the creation and termination of worker threads. I don't know how throwing in some regular pthreads would help.

dvyukov accepted this revision. May 2 2016, 3:13 AM
dvyukov edited edge metadata.

OK, let's leave it as is.

This revision is now accepted and ready to land. May 2 2016, 3:13 AM
This revision was automatically updated to reflect the committed changes.