This is an archive of the discontinued LLVM Phabricator instance.

Don't hold the StackFrame mutex while getting a ValueObject Dynamic value
ClosedPublic

Authored by jingham on Jul 25 2022, 3:11 PM.

Details

Summary

We've noticed occasional hangs in the test TestGuiExpandThreadsTree.py. They are caused by a lock inversion between the StackFrameList and StackFrame mutexes.

One thread has the StackFrame mutex, and is trying to acquire the StackFrameList mutex for that thread:

* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x000000018fffaafc libsystem_kernel.dylib`__psynch_mutexwait + 8
    frame #1: 0x0000000190034358 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 84
    frame #2: 0x0000000190031cbc libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 248
    frame #3: 0x000000018ff86b7c libc++.1.dylib`std::__1::recursive_mutex::lock() + 16
    frame #4: 0x000000010dc85c0c liblldb.15.0.0git.dylib`lldb_private::StackFrameList::GetFrameWithStackID(lldb_private::StackID const&) + 72
    frame #5: 0x000000010dcbad44 liblldb.15.0.0git.dylib`lldb_private::Thread::GetFrameWithStackID(lldb_private::StackID const&) + 64
    frame #6: 0x000000010dc3cd94 liblldb.15.0.0git.dylib`lldb_private::ExecutionContextRef::GetFrameSP() const + 84
    frame #7: 0x000000010dc3ca3c liblldb.15.0.0git.dylib`lldb_private::ExecutionContext::ExecutionContext(lldb_private::ExecutionContextRef const&) + 244
    frame #8: 0x000000010db314a8 liblldb.15.0.0git.dylib`lldb_private::ValueObject::CalculateDynamicValue(lldb::DynamicValueType) + 84
    frame #9: 0x000000010db31564 liblldb.15.0.0git.dylib`lldb_private::ValueObject::GetDynamicValue(lldb::DynamicValueType) + 108
    frame #10: 0x000000010dc7f4ac liblldb.15.0.0git.dylib`lldb_private::StackFrame::GetValueObjectForFrameVariable(std::__1::shared_ptr<lldb_private::Variable> const&, lldb::DynamicValueType) + 204

   ^^^^^^  frame #10 is the frame that acquires the StackFrame mutex  ^^^^^^^^^^

    frame #11: 0x000000010daffdc0 liblldb.15.0.0git.dylib`FrameVariablesWindowDelegate::WindowDelegateDraw(curses::Window&, bool) + 216
    frame #12: 0x000000010daeb208 liblldb.15.0.0git.dylib`curses::Window::Draw(bool) + 52
    frame #13: 0x000000010daeb23c liblldb.15.0.0git.dylib`curses::Window::Draw(bool) + 104
    frame #14: 0x000000010daea3f8 liblldb.15.0.0git.dylib`curses::Application::Run(lldb_private::Debugger&) + 172
    frame #15: 0x000000010daea338 liblldb.15.0.0git.dylib`lldb_private::IOHandlerCursesGUI::Run() + 28
    frame #16: 0x000000010dac8abc liblldb.15.0.0git.dylib`lldb_private::Debugger::RunIOHandlers() + 140
    frame #17: 0x000000010dbbe210 liblldb.15.0.0git.dylib`lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) + 156
    frame #18: 0x000000010d90f780 liblldb.15.0.0git.dylib`lldb::SBDebugger::RunCommandInterpreter(bool, bool) + 168
    frame #19: 0x0000000104058938 lldb`Driver::MainLoop() + 2776
    frame #20: 0x000000010405959c lldb`main + 2456
    frame #21: 0x000000021b293c14 dyld`start + 2372

The other has the StackFrameList mutex and is trying to get the StackFrame mutex held above.

  thread #2, name = 'lldb.debugger.event-handler'
    frame #0: 0x000000018fffaafc libsystem_kernel.dylib`__psynch_mutexwait + 8
    frame #1: 0x0000000190034358 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 84
    frame #2: 0x0000000190031cbc libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 248
    frame #3: 0x000000018ff86b7c libc++.1.dylib`std::__1::recursive_mutex::lock() + 16
    frame #4: 0x000000010dc7c630 liblldb.15.0.0git.dylib`lldb_private::StackFrame::GetStackID() + 32

^^^^ frame #4 is trying to get the StackFrame mutex held by thread 1

    frame #5: 0x000000010dc85c34 liblldb.15.0.0git.dylib`lldb_private::StackFrameList::GetFrameWithStackID(lldb_private::StackID const&) + 112

^^^^ frame #5 is the frame that acquires the StackFrameList mutex  ^^^^^^^^

    frame #6: 0x000000010dcbad44 liblldb.15.0.0git.dylib`lldb_private::Thread::GetFrameWithStackID(lldb_private::StackID const&) + 64
    frame #7: 0x000000010dc3cd94 liblldb.15.0.0git.dylib`lldb_private::ExecutionContextRef::GetFrameSP() const + 84
    frame #8: 0x000000010dc3d04c liblldb.15.0.0git.dylib`lldb_private::ExecutionContext::ExecutionContext(lldb_private::ExecutionContextRef const*, bool) + 528
    frame #9: 0x000000010db3435c liblldb.15.0.0git.dylib`lldb_private::ValueObject::EvaluationPoint::SyncWithProcessState(bool) + 52
    frame #10: 0x000000010db2ae28 liblldb.15.0.0git.dylib`lldb_private::ValueObject::UpdateValueIfNeeded(bool) + 120
    frame #11: 0x000000010db2e678 liblldb.15.0.0git.dylib`lldb_private::ValueObject::GetValueAsCString(lldb_private::TypeFormatImpl const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&) + 36
    frame #12: 0x000000010db2e840 liblldb.15.0.0git.dylib`lldb_private::ValueObject::GetValueAsCString() + 300
    frame #13: 0x000000010db2f1b8 liblldb.15.0.0git.dylib`lldb_private::ValueObject::DumpPrintableRepresentation(lldb_private::Stream&, lldb_private::ValueObject::ValueObjectRepresentationStyle, lldb::Format, lldb_private::ValueObject::PrintableRepresentationSpecialCases, bool) + 1092
    frame #14: 0x000000010dae0af0 liblldb.15.0.0git.dylib`lldb_private::FormatEntity::Format(lldb_private::FormatEntity::Entry const&, lldb_private::Stream&, lldb_private::SymbolContext const*, lldb_private::ExecutionContext const*, lldb_private::Address const*, lldb_private::ValueObject*, bool, bool) + 9724
    frame #15: 0x000000010daded2c liblldb.15.0.0git.dylib`lldb_private::FormatEntity::Format(lldb_private::FormatEntity::Entry const&, lldb_private::Stream&, lldb_private::SymbolContext const*, lldb_private::ExecutionContext const*, lldb_private::Address const*, lldb_private::ValueObject*, bool, bool) + 2104
    frame #16: 0x000000010daded2c liblldb.15.0.0git.dylib`lldb_private::FormatEntity::Format(lldb_private::FormatEntity::Entry const&, lldb_private::Stream&, lldb_private::SymbolContext const*, lldb_private::ExecutionContext const*, lldb_private::Address const*, lldb_private::ValueObject*, bool, bool) + 2104
    frame #17: 0x000000010dadecdc liblldb.15.0.0git.dylib`lldb_private::FormatEntity::Format(lldb_private::FormatEntity::Entry const&, lldb_private::Stream&, lldb_private::SymbolContext const*, lldb_private::ExecutionContext const*, lldb_private::Address const*, lldb_private::ValueObject*, bool, bool) + 2024
    frame #18: 0x000000010dc81d54 liblldb.15.0.0git.dylib`lldb_private::StackFrame::DumpUsingSettingsFormat(lldb_private::Stream*, bool, char const*) + 244
    frame #19: 0x000000010dc82418 liblldb.15.0.0git.dylib`lldb_private::StackFrame::GetStatus(lldb_private::Stream&, bool, bool, bool, char const*) + 92
    frame #20: 0x000000010dc86488 liblldb.15.0.0git.dylib`lldb_private::StackFrameList::GetStatus(lldb_private::Stream&, unsigned int, unsigned int, bool, unsigned int, bool, char const*) + 580
    frame #21: 0x000000010dcb8e24 liblldb.15.0.0git.dylib`lldb_private::Thread::GetStatus(lldb_private::Stream&, unsigned int, unsigned int, unsigned int, bool, bool) + 964
    frame #22: 0x000000010dacb6d4 liblldb.15.0.0git.dylib`lldb_private::Debugger::HandleThreadEvent(std::__1::shared_ptr<lldb_private::Event> const&) + 184
    frame #23: 0x000000010dacba48 liblldb.15.0.0git.dylib`lldb_private::Debugger::DefaultEventHandler() + 556
    frame #24: 0x000000010dace1f4 liblldb.15.0.0git.dylib`std::__1::__function::__func<lldb_private::Debugger::StartEventHandlerThread()::$_3, std::__1::allocator<lldb_private::Debugger::StartEventHandlerThread()::$_3>, void* ()>::operator()() + 16
    frame #25: 0x000000010db8cf60 liblldb.15.0.0git.dylib`lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) + 100
    frame #26: 0x000000010e833e18 liblldb.15.0.0git.dylib`lldb_private::HostThreadMacOSX::ThreadCreateTrampoline(void*) + 32
    frame #27: 0x0000000190037280 libsystem_pthread.dylib`_pthread_start + 148

This is happening because StackFrame::GetValueObjectForFrameVariable holds the StackFrame mutex over the call to ValueObject::GetDynamicValue, which is not right. It's not necessary because the ValueObject we've made doesn't rely directly on its StackFrame, it depends on the StackID instead, and isn't modifying the StackFrame either. And because it can cause lock inversion against the common pattern of getting the StackFrameList, and then asking it for some StackFrame, doing so is dangerous as well.

Diff Detail

Event Timeline

jingham created this revision.Jul 25 2022, 3:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2022, 3:11 PM
jingham requested review of this revision.Jul 25 2022, 3:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2022, 3:11 PM
jingham edited the summary of this revision. (Show Details)Jul 25 2022, 3:18 PM
jingham edited the summary of this revision. (Show Details)
clayborg accepted this revision.Jul 25 2022, 4:46 PM
This revision is now accepted and ready to land.Jul 25 2022, 4:46 PM