This is an archive of the discontinued LLVM Phabricator instance.

Fix deadlock due to thread list locking in 'bt all' with obj-c
ClosedPublic

Authored by fjricci on Mar 10 2016, 5:57 PM.

Details

Summary

The gdb-remote async thread cannot modify thread state while the main thread
holds a lock on the state. Don't use locking thread iteration for bt all.

Specifically, the deadlock manifests when lldb attempts to JIT code to
symbolicate objective c while backtracing. As part of this code path,
SetPrivateState() is called on an async thread. This async thread will
block waiting for the thread_list lock held by the main thread in
CommandObjectIterateOverThreads. The main thread will also block on the
async thread during DoResume (although with a timeout), leading to a
deadlock. Due to the timeout, the deadlock is not immediately apparent,
but the inferior will be left in an invalid state after the bt all completes,
and objective-c symbols will not be successfully resolved in the backtrace.

Diff Detail

Event Timeline

fjricci updated this revision to Diff 50386.Mar 10 2016, 5:57 PM
fjricci retitled this revision from to Fix deadlock due to thread list locking in 'bt all' with obj-c.
fjricci updated this object.
fjricci added subscribers: lldb-commits, sas.
fjricci updated this revision to Diff 50387.Mar 10 2016, 6:00 PM

Remove duplication of index variable

jingham requested changes to this revision.Mar 10 2016, 6:27 PM
jingham edited edge metadata.

Yeah, I'm beginning to wonder whether we're doing ourselves a favor by putting off computing the ObjC runtime symbols till some time that's bound to be more inconvenient than right when we've stopped, but that's beyond the scope of this patch.

For this patch, there's no guarantee that running the JIT'ed code won't cause a thread to exit, so that by the time you got around calling HandleOneThread in this loop it had no real backing . The Thread your ThreadSP is keeping alive should have been Clear()ed in that case, so IsValid will return false. It would be good to put an explicit check for that in this loop, and if the thread is no longer valid, report some useful message ("thread disappeared while computing the backtraces.") Actually, that check should be done before the HandleOneThread in the the other branch of the if as well.

This revision now requires changes to proceed.Mar 10 2016, 6:27 PM

Another even safer way to do this is to take the thread list, turn it into a vector of TID's, then iterate over the TID's looking them up one at a time as you go. That way you don't have to worry about your ThreadSP going stale. You could even change HandleOneThread to take a tid, so you could centralize the lookup.

Oh, yeah, and note the other branch of the if holds the thread list mutex the whole time. So you'll need to change that to get the list of threads, then drop the mutex & iterate over them or you'll get into the same problem if somebody does:

(lldb) thread backtrace 1 2 3 4 5

or whatever.

fjricci updated this revision to Diff 50454.Mar 11 2016, 11:24 AM
fjricci edited edge metadata.

Use vector of tids for iteration, rather than iterating over unlocked threadlist

jingham accepted this revision.Mar 11 2016, 11:28 AM
jingham edited edge metadata.

That looks good, thanks.

This revision is now accepted and ready to land.Mar 11 2016, 11:28 AM
clayborg accepted this revision.Mar 11 2016, 11:44 AM
clayborg edited edge metadata.

Looks good.

sas closed this revision.Mar 17 2016, 11:59 AM

Committed as r263735.