This is an archive of the discontinued LLVM Phabricator instance.

Add a Debugger interruption mechanism in parallel to the Command Interpreter interruption
ClosedPublic

Authored by jingham on Mar 1 2023, 6:31 PM.

Details

Summary

I put up an RFC for debugger interruption a while back. This is my version of that RFC.

The implementation is based on the way lldb is often used in UI's (as opposed to the simple use in lldb's command driver). In those cases, the lldb client often includes a thread running the LLDB command interpreter so that they can provide an "LLDB Console" window where the user can run commands, and then on other threads they call various SB API's that are used to fill the Threads, Locals, etc. windows.

So it seemed the most useful model was to separate interrupting into "Interrupting commands servicing the Console" and "Interrupting work the client is doing on its own behalf." For instance if the client is filling the Locals view, but then the user issues "continue" in the console, the client would want to interrupt filling the Locals view (e.g. interrupt SBFrame.FindVariables()) so that it can continue right away, but not interrupt the user's "continue" command. Similarly, if the user types some command that's producing lots of output, they want to interrupt that but not stop whatever work the client was doing at the same time.

Another question is how do we tell "Console" window commands from Client commands? Since we provide a nice API that packages up the "run a command interpreter" behavior, it seemed reasonable to have commands run by the thread servicing RunCommandInterpreter be the "Interpreter" commands, and all other threads belong to the client.

This is also convenient because we can programmatically determine which flag we should check when we check the interruption, which allows us to provide a unified API for checking whether an interrupt was requested.

Also, the CommandInterpreter does its job Command by Command, so there's a natural boundary on which to take down the interrupt request. However the client program is really the only agent that can know when it wants to start & stop interruption, so instead of a single InterruptCommand with automatic boundaries, I added explicit raise & lower interrupt flag API's.

One slight ugliness is that we already have SBCommandInterpreter::WasInterrupted(). I could have just left that as the one interface to query for interruption. But long term, I think it makes more sense to think about this as the Debugger figuring out who to interrupt, so I think an SBDebugger::InterruptRequested API is better. Even though I can't remove the old API, I still added the new one.

One other detail you will see in the tests. The current behavior for command interpreter interruption is that if a command is interrupted, we discard the output it has accumulated so far and just return the result "Interrupted..." That's why I had to set the ImmediateOutput file in the test. I'm on the fence about this behavior, so I opted not to change it for now.

Diff Detail

Event Timeline

jingham created this revision.Mar 1 2023, 6:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2023, 6:31 PM
jingham requested review of this revision.Mar 1 2023, 6:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2023, 6:31 PM
clayborg added inline comments.Mar 2 2023, 9:27 PM
lldb/include/lldb/Core/Debugger.h
398–399
406

Saying "Decrement the interrupt requested _flag_" seems weird. Is this a flag or an interrupt count? Maybe saying "Decrement the interrupt request count of non zero" might be a bit more clear?

labath added a comment.Mar 6 2023, 6:56 AM

I kinda like this.

lldb/include/lldb/Core/Debugger.h
565

const on a return value rarely makes sense. OTOH, const on the argument usually does. Did you want this the other way around?

649

I think this is unused

650

can we make this non-recursive?

lldb/source/API/SBDebugger.cpp
1709

bad formatting

lldb/test/API/python_api/was_interrupted/TestDebuggerInterruption.py
40

This had me confused for a while, because "thread local data" has a very specific meaning. Could you rename this to something else?

113

I am confused as to what this lock's purpose is. I can see that it's taken on one thread and released on another (which is in itself an unusual thing to do, even though python permits it), but I still don't understand what does it achieve. Lock are generally used for protecting a certain object. If the goal is to control the interleaving of two threads, then maybe a sequence of barriers would be better. Something like:

#thread 1
work1()
t1_done.wait()
t2_done.wait()
work3()
t1_done.wait()
# etc.. odd-numbered work happens on this thread

#thread 2
t1_done.wait()
work2()
t2_done.wait()
t1_done.wait()
work4()
# etc.. even-numbered work happens here
jingham updated this revision to Diff 503136.EditedMar 7 2023, 12:56 PM

Converted the test from using a lock to synchronize the command and test threads to using barriers.
Changed the CommandInterpreter to print partial results when commands are interrupted.
I added two more places where we check for interruption:

  1. SBFrame::GetVariables
  2. StackFrameList::GetFramesUpTo.

I'm still not sure how to actually test these without racing against the operation completing before you get around to the interrupt.

Fixed other review comments.

jingham marked 5 inline comments as done.Mar 7 2023, 1:02 PM
jingham added inline comments.
lldb/test/API/python_api/was_interrupted/TestDebuggerInterruption.py
40

I did intend to use thread local data here - that's why LockContainer derives from threading.local.

The problem is that I have to use the event & lock in the Python command I will run either directly or under RunCommandInterpreter, but I don't have way to explicitly pass python objects to a command when invoking it. I could put the data in globals in the interruptible module. However, since the test variants could be running in parallel this didn't seem like a great idea. So instead I pass the test to the CommandRunner that's going to create my test thread for me, and once it's created its thread, it puts the lock & event in the thread local data, and that's where WelcomeCommand picks it up from. This is probably not 100% necessary, but seemed cleaner to me.

113

Here's a version using barriers exclusively.

jingham updated this revision to Diff 503138.Mar 7 2023, 1:04 PM

Forgot to address Pavel's const comment.

jingham added inline comments.Mar 7 2023, 1:06 PM
lldb/include/lldb/Core/Debugger.h
565

Neither of these is really const. The incoming thread will get set to be the IOHandler for the debugger which is not really const. Not sure what I was thinking here.

jingham updated this revision to Diff 503143.Mar 7 2023, 1:18 PM

Update the test comments for the new test method.

Small nits

lldb/include/lldb/API/SBCommandInterpreter.h
251–253

Some info on the return value would be useful here.

lldb/include/lldb/API/SBDebugger.h
203

Could this be marked const if the mutex was marked mutable?

lldb/include/lldb/Core/Debugger.h
398–399

+1. Also you mention the use of "flag" here but talked about counter before which is inconsistent.

401–402

This method takes no argument message.

clayborg added inline comments.Mar 7 2023, 2:21 PM
lldb/include/lldb/API/SBDebugger.h
203

We don't tend to mark things in the public API as const because it does nothing when you have a pointer, unique pointer or shared pointer as the only member variable. I guess it would stop you from clearing the pointer, but the "const" does nothing to stop the mutation of the underlying object, and it would be weird to add "const" to all methods just to stop each method from modifying the contents of the member variable.

jingham updated this revision to Diff 503171.Mar 7 2023, 3:57 PM

Fix up some comments.

jingham marked 5 inline comments as done.Mar 7 2023, 3:58 PM

polite ping... I think I've addressed all the concerns folks raised.

JDevlieghere added inline comments.Mar 15 2023, 10:14 AM
lldb/source/Core/Debugger.cpp
1252–1253

Nit: else-after-return.

2011–2015

This doesn't look right. You're taking a reference to the old thread, and then immediately change it below. If you want to return the old thread, old_host should be a copy.

bulbazord added inline comments.Mar 15 2023, 11:29 AM
lldb/source/Core/Debugger.cpp
2011–2015

+1. Does it make sense to use a shared_ptr or a unique_ptr for HostThreads? If so, we could use those.

JDevlieghere added inline comments.Mar 15 2023, 11:45 AM
lldb/source/Core/Debugger.cpp
2011–2015

I think the HostThread already kind of serves as a wrapper. I think the solution is to either just make a plain copy or call Release() and assign it to the old_host temporary.

jingham updated this revision to Diff 505645.Mar 15 2023, 3:58 PM

Address Jonas' review comment.

This revision is now accepted and ready to land.Mar 15 2023, 4:43 PM