This is an archive of the discontinued LLVM Phabricator instance.

Make CommandInterpreter's execution context the same as debugger's one.
ClosedPublic

Authored by tatyana-krasnukha on Nov 26 2020, 2:56 AM.

Details

Summary

Currently, the interpreter's context is not updated until a command is executed. This has resulted in the behavior of SB-interface functions and some commands depend on previous user actions. The interpreter's context can stay uninitialized, point to a currently selected target, or point to one of the previously selected targets.

This patch removes any usages of CommandInterpreter::UpdateExecutionContext. CommandInterpreter::HandleCommand* functions still may temporarily override context, but now they always restore it before exiting.

Added test reproduces one of the issues. Without this fix, the last assertion fails because the interpreter's execution context is empty until running "target list", so, the value of the global property was updated instead of the process's local instance.

Diff Detail

Event Timeline

tatyana-krasnukha requested review of this revision.Nov 26 2020, 2:56 AM

Removed refactoring to make the changes clearer.

JDevlieghere added inline comments.Nov 30 2020, 11:46 AM
lldb/packages/Python/lldbsuite/test/python_api/debugger/main.cpp
1 ↗(On Diff #308325)

Test source files shouldn't have the header.

lldb/source/API/SBDebugger.cpp
1315 ↗(On Diff #308325)

Does this append or override the error message? If it overrides you might as well us LLDB_LOG directly which supports formatv.

jingham requested changes to this revision.Nov 30 2020, 4:52 PM

I don't see any reason for, and lots of reasons against having more than one source of truth for a Debugger's "Currently Selected ExecutionContext". In particular, I can't see any good uses of the Debugger and the CommandInterpreter being able to have different "currently selected targets/threads/frames". For instance, I think people would generally be surprised if calling SBDebugger.SetSelectedTarget didn't also change the default target that subsequent command interpreter commands use.

This patch solves the problem that the CommandInterpreter's version of this truth is out of sync with the Debugger's notion by ignoring the CommandInterpreter's version. That seems to me the wrong way to solve the problem.

Rather, we should make sure that everyone is looking to the same source. I'm not sure it makes sense for the CommandInterpreter to be the one holding the "Currently Selected Execution Context". Maybe that should be kept in the Debugger, and the Command Interpreter could then call Debugger::GetSelectedExecutionContext?

This revision now requires changes to proceed.Nov 30 2020, 4:52 PM
tatyana-krasnukha retitled this revision from Make SBDebugger internal variable getter and setter not use CommandInterpreter's context to Make CommandInterpreter's execution context the same as debugger's one..
tatyana-krasnukha edited the summary of this revision. (Show Details)

Addressed comments

This all looks fine except I'm not sure you can have a single override context. The lldb command line is only a bit recursive, but you can have sequences like:

(lldb) command source file_that_contains_a_step

Step hits a breakpoint which has commands

One of those commands is a Python command that calls SBCommandInterpreter::HandleCommandsFromFile - passing in some other SBExecutionContext

IIUC, running the breakpoint command will push one override before running the command. Then HandleCommandsFromFile will push another, and you'll won't get back to the first one.

And one of the weaknesses of lldb right now is that commands can't nest as well as they should. For instance, you really should be able to hit a breakpoint, and have the breakpoint command able to do a couple of next's of whatever, even though the weakness of the lldb command interpreter precludes that at present.

So both for somewhat esoteric reasons like that above, and to future proof this change, I think you have to use a stack for the overrides/restore, not a single element.

Thanks for pointing to the nested command problem! Replaced pointer to the execution context with the stack of contexts.

jingham accepted this revision.Dec 11 2020, 10:31 AM

LGTM - thanks for doing this!

This revision is now accepted and ready to land.Dec 11 2020, 10:31 AM

Sorry for not keeping track of this patch -- I'm not very active these days.

Nevertheless, I did notice that this patch introduced some kind of nondeterminism/raciness/flakyness to the "finish" command. Specifically, it causes the command (as used in TestGuiBasicDebug.py) to fail occasionally (~20% of the time). Since this patch landed (in build 3660, there has been a constant stream of builds (build 3667, build 3674, build 3678, ...) where this test is failing. The test has been stable up to that point.

The breakage can be also by running the test commands without the gui mode:

$ bin/lldb lldb-test-build.noindex/commands/gui/basicdebug/TestGuiBasicDebug.test_gui/a.out -o 'br set -f main.c -p "// Break here"' -o r
(lldb) target create "lldb-test-build.noindex/commands/gui/basicdebug/TestGuiBasicDebug.test_gui/a.out"
Current executable set to '/home/pavelo/ll/build/opt/lldb-test-build.noindex/commands/gui/basicdebug/TestGuiBasicDebug.test_gui/a.out' (x86_64).
(lldb) br set -f main.c -p "// Break here"
Breakpoint 1: where = a.out`main + 22 at main.c:4:3, address = 0x0000000000401116
(lldb) r
Process 32346 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000000000401116 a.out`main(argc=1, argv=0x00007fffffffdb58) at main.c:4:3
   1   	extern int func();
   2   	
   3   	int main(int argc, char **argv) {
-> 4   	  func(); // Break here
   5   	  func(); // Second
   6   	  return 0;
   7   	}

Process 32346 launched: '/home/pavelo/ll/build/opt/lldb-test-build.noindex/commands/gui/basicdebug/TestGuiBasicDebug.test_gui/a.out' (x86_64)
(lldb) s
Process 32346 stopped
* thread #1, name = 'a.out', stop reason = step in
    frame #0: 0x0000000000401134 a.out`func at func.c:2:3
   1   	int func() {
-> 2   	  return 1; // In function
   3   	}
(lldb) fin
Process 32346 exited with status = 0 (0x00000000) 

I've reverted this patch to get the build green. I think it should be fairly easy to reproduce this problem (the command line approach fails 100% consistently for me), but if you're having trouble reproducing it, let me know, and I can send some logs or something. (The logs seem to indicate that the "step out" thread plan does not dequeue itself after steping out, but I guess the root cause is somewhere else.)

Fixed CommandInterpreter::GetProcessOutput to avoid deadlock in Windows process plugin. This should also fix the problem @labath described above.

Pavel, could you please check whether it works for you?

With this version of the patch, I am unable to reproduce the issue using the approach I described in the previous comment. However, it still reproduces when issuing the equivalent commands via the "gui" (either "by hand" or by running TestGuiBasicDebug.py).

With this version of the patch, I am unable to reproduce the issue using the approach I described in the previous comment. However, it still reproduces when issuing the equivalent commands via the "gui" (either "by hand" or by running TestGuiBasicDebug.py).

Could you send logs, please? I cannot reproduce the issue on available platforms.

I've included two logs, showing failing and successful runs (the successful run was captured without this patch). Interesting things happen around line 1057, where the "step out" plan decides we should not stop after stepping out. I'm not sure what's the relation to this patch, but I suppose it could have something to do with it not (always) fetching the correct frame (from which it should step out) from the execution context.

tatyana-krasnukha edited the summary of this revision. (Show Details)

It turns out that the Debugger recalculated the selected stack frame without taking the Process's run lock. I replaced the whole context evaluation with creating ExecutionContextRef which does these things right.

@labath, I would appreciate it if you check that the latest version works for you too (I ran TestGuiBasicDebug.py 100 times on Linux and it succeeded all the time).

Removed test since the same case was added by D95761.

werat added a subscriber: werat.Feb 2 2021, 2:05 AM

Removed test since the same case was added by D95761.

I think your test case is even better, the one in D95761 doesn't try to run commands.

I think your test case is even better, the one in D95761 doesn't try to run commands.

Ok, then I'll replace it if you don't mind.

I reverted D95761 (see Jim's comment), so you can just re-add your old test. Thanks!