Page MenuHomePhabricator

Don't push null ExecutionContext on CommandInterpreter exectx stack
Needs ReviewPublic

Authored by jasonmolenda on Oct 6 2021, 12:57 AM.

Details

Summary

Part of Tatyana's patch in https://reviews.llvm.org/D92164 added a stack of ExecutionContexts to the CommandInterpreter (pushed with OverrideExecutionContext, popped with RestoreExecutionContext) and when we need to retrieve the currently selected ExecutionContext via CommandInterpeter:: GetExecutionContext(), if the stack has an entry, we return the exe_ctx from the top of the stack, else we fall back to the Debugger::GetSelectedExecutionContext() which gives you an exe_ctx with the currently selected Target.

This patch is fixing the fact that the lldb command driver during startup, pushes the current exe_ctx on the stack here,

-> 2874	    OverrideExecutionContext(m_debugger.GetSelectedExecutionContext());
   2875	  auto finalize = llvm::make_scope_exit([this]() {
   2876	    RestoreExecutionContext();
0 0x00010d9fc80d    LLDB`lldb_private::CommandInterpreter::IOHandlerInputComplete + 285  CommandInterpreter.cpp:2874
1 0x00010d81573d    LLDB`lldb_private::IOHandlerEditline::Run + 349  IOHandler.cpp:582
2 0x00010d7ccd71    LLDB`lldb_private::Debugger::RunIOHandlers + 81  Debugger.cpp:879
3 0x00010d9fe0d4    LLDB`lldb_private::CommandInterpreter::RunCommandInterpreter + 212  CommandInterpreter.cpp:3126
4 0x00010d268370    LLDB`lldb::SBDebugger::RunCommandInterpreter + 544  SBDebugger.cpp:1238
5 0x00010854b8a2    lldb`Driver::MainLoop + 3298  Driver.cpp:678

There is no Target at this point, so we have an ExecutionContext with a nullptr for a target (and all other fields).

This Target won't get popped off the stack, it remains there as the first entry. When loading a corefile, which loads a binary and dSYM, and the dSYM has Python code in it, and the SB API tries to set some settings via SBCommandInterpreter::HandleCommand, if those commands need the current Target to set something, they will silently fail to do anything. Target settings that are set in the template target will work, but those that only take effect in the current target are ignored. It can be a little tricky to spot.

Complicating the test case writing, API tests don't have this same "push a nullptr target on to the stack" issue, so they won't reproduce the problem. I started writing an API test and I kept it in this patch because it might find regressions in the future. Jonas helped rewrite the test into a shell test that uses the driver program to show the issue and confirm the patch fixes it.

I can imagine we might iterate on exactly how I avoid pushing a nullptr ExecutionContext, but I think the basic safeguard is pretty straightforward - it's a pretty unusual situation to have no selected Targets, this only happens during early startup I suspect. Passing an arg to the make_scope_exit to avoid popping the stack seems especially unpretty. I wasn't worried about popping too many elements off the stack, the code will ignore that case, but I was worried that it might be possible to have a real ExecutionContext on the stack, then a nullptr ExecutionContext is pushed, and I didn't want to pop the real exectx later on.

Diff Detail

Event Timeline

jasonmolenda created this revision.Oct 6 2021, 12:57 AM
jasonmolenda requested review of this revision.Oct 6 2021, 12:57 AM

Another possible approach would be to allow an ExecutionContext with a nullptr Target to push on to the stack, but then in CommandInterpeter::GetExecutionContext skip over those until we find one with a Target, or fall back to the Debugger::GetSelectedExecutionContext like we do with an empty stack.

Added inline comment

lldb/source/Interpreter/CommandInterpreter.cpp
2842

The intention of overriding the context here with the currently selected one was to have the same context during HandleCommand and GetProcessOutput. However, this logic looks wrong to me now. HandleCommand may change the context and GetProcessOutput should use the new one. I think that you can safely remove this piece of code (untill IOHandlerInputComplete doesn't allow overriding execution context).

ted added a subscriber: ted.Nov 3 2021, 7:16 AM

@jasonmolenda I discovered the same issue in another way - create a python command and load it with "command script import". Here is test.py:

def __lldb_init_module(debugger, dict):
    debugger.HandleCommand(
        'command script add -f test.test test')
    print("test command loaded")

def test(debugger, register, result, dict):
    result.Clear()
    res = lldb.SBCommandReturnObject()
    command = 'target create /bin/ls'
    debugger.GetCommandInterpreter().HandleCommand(command, res)
    command = 'target modules search-paths add . ' + '/tmp'
    debugger.GetCommandInterpreter().HandleCommand(command, res)
    output = res.GetOutput()
    print(output)

    print('test done')

The 2nd HandleCommand uses the execution context of the original command, which doesn't have a target, so the search-paths add fails with no target.

(lldb) command script import test
test command loaded
(lldb) test
Current executable set to '/bin/ls' (x86_64).

test command called
error: error: invalid target, create a target using the 'target create' command

That should be fairly easy to turn into a Shell test, with "REQUIRES: python", because the error happens before the search-path is parsed, so it doesn't have to be valid.

I think we should make simple cases like this work for sure. But OTOH, if you are writing a script that you might use anywhere non-trivial, SBCommandInterpreter.HandleCommand(char *, SBError) is NOT a safe thing to call. You could be in an lldb session with more than one Target, and while your command is running, the other target could have hit a breakpoint that it intended to stop at, and so it becomes the currently selected target.

I'm not sure there's a good way to enforce this, but if you have an execution context in mind when running a command with HandleCommand, you really need to use the version of HandleCommand that takes an execution context, rather than just hoping that when the command is run your execution context is still the currently selected one.

jingham added a comment.EditedNov 3 2021, 10:21 AM

This needs some more design, I think. The obvious solution to Ted's problem is that target create should make the new target the currently selected one. Then the second HandleCommand would have a selected target, and the second command would work. But that doesn't seem right to me when you consider that there might be more than one process at a time in a target, which we definitely want to support so that you can do things like coordinate debugging from a server process into a client, etc...

For instance, suppose I have two targets A & B, and I'm stopped at a breakpoint in target B, issuing commands in the command interpreter. Meanwhile, target A hits a breakpoint with a callback that creates a new target (C) and attaches that Target C to some other process, then continues without stopping. If we do have the HandleCommand for "target create" select the current target, then without anything that I can discern happening, Target C becomes the selected target, and all my commands start going to it. That can't be right.

For the Driver, this is not so complicated. If some one issues a target create command directly in the Driver, then it makes sense to select that target. That's clearly the user's intent.

But in scripting contexts, I don't think we should be switching around the selected target. Instead, commands run in the script interpreter should be explicit about what execution context they are working on. If I could deprecate one API it would be the HandleCommand with no exe_ctx...

So probably the only safe thing to do is to ensure if the command doesn't specify a target in the execution context, we pick the currently selected one. IIUC, that's the proposal here, so I think this is all good. But we want to be careful not to allow implicit behaviors like this to change the currently selected target/thread/frame.