This is an archive of the discontinued LLVM Phabricator instance.

Implement interactive command interruption
ClosedPublic

Authored by lemo on Sep 15 2017, 1:00 PM.

Details

Summary

The core of this change is the new CommandInterpreter::m_command_state, which models the state transitions for interactive commands, including an "interrupted" state transition.

In general, command interruption requires cooperation from the code executing the command, which needs to poll for interruption requests through CommandInterpreter::WasInterrupted().

CommandInterpreter::PrintCommandOutput() implements an optionally interruptible printing of the command output, which for large outputs was likely the longest blocking part

(ex. target modules dump symtab on a complex binary could take 10+ minutes)

Differential Revision: https://reviews.llvm.org/D37923

Diff Detail

Repository
rL LLVM

Event Timeline

lemo created this revision.Sep 15 2017, 1:00 PM
amccarth edited edge metadata.Sep 15 2017, 1:16 PM

I haven't looked at the whole patch yet, but it seems the SIGINT fix is well isolated. That should probably be in a separate patch.

lemo updated this revision to Diff 115476.Sep 15 2017, 1:33 PM

Split the SIGINT handles fixes into a stanalone patch

amccarth added inline comments.Sep 15 2017, 1:48 PM
source/Commands/CommandObjectTarget.cpp
2058 ↗(On Diff #115463)

Many LLVM developers prefer to omit the braces when the body of the control-flow statement is a single statement.

source/Interpreter/CommandInterpreter.cpp
2713 ↗(On Diff #115463)

Should we be that trusting of a caller? In a non-debug build, if a caller doesn't end the (non-empty) string with '\n', this'll just run past the end of the buffer. Did I miss something that guarantees the caller won't make a mistake? Would it be too expensive to play it safe?

2720 ↗(On Diff #115463)

This assert should precede the line before it.

tools/driver/Driver.cpp
1182 ↗(On Diff #115463)

I'm not sure why these two ifs aren't one, as in:

if (g_driver && !g_interrupt_sent.test_and_set())
1189 ↗(On Diff #115463)

Can you add a comment explaining why this uses _exit rather than exit? It's not obvious to me.

tools/lldb-mi/MIDriverMain.cpp
71 ↗(On Diff #115463)

I think this concurrency fix for SIGINT would be better in a separate patch. I understand how it's related to the rest of this patch, but LLVM folks tend to prefer small, incremental patches.

lemo added inline comments.Sep 15 2017, 2:03 PM
source/Commands/CommandObjectTarget.cpp
2058 ↗(On Diff #115463)

So do the hackers: https://blog.codecentric.de/en/2014/02/curly-braces :) I too prefer to omit braces in small test snippets, but in production code it's not worth the risk of making a silly mistake.

source/Interpreter/CommandInterpreter.cpp
2720 ↗(On Diff #115463)

Pedantically, it should be both before and after (and for ultimate paranoid mode, asserting that Write returns <= than the passed in value)

But the asserts looks for the really nasty case where "size -= chunk_size" overflows.

tools/driver/Driver.cpp
1189 ↗(On Diff #115463)

Explained in the SIGINT patch: exit() is not signal-safe (http://pubs.opengroup.org/onlinepubs/000095399/functions/exit.html)

Calling it from a signal handler can result in all kind of nasty issues, in particular exit() does call a lot of stuff, both runtime and user code (ex. atexit functions)

tools/lldb-mi/MIDriverMain.cpp
71 ↗(On Diff #115463)

Agreed, I already split this change into separate patches (I wasn't sure if people prefer to review two small changes vs a single one with more context)

clayborg added inline comments.
source/Commands/CommandObjectTarget.cpp
2056–2058 ↗(On Diff #115476)

Remove braces

2087–2089 ↗(On Diff #115476)

Remove braces

2155–2157 ↗(On Diff #115476)

Remove braces

2179–2181 ↗(On Diff #115476)

Remove braces

2254–2256 ↗(On Diff #115476)

Remove braces

2278–2280 ↗(On Diff #115476)

Remove braces

2348–2350 ↗(On Diff #115476)

Remove braces

source/Interpreter/CommandInterpreter.cpp
2682 ↗(On Diff #115476)

lldb_assert

2687 ↗(On Diff #115476)

lldb_assert

2708–2710 ↗(On Diff #115476)

Since we are using "llvm::StringRef" here, why not use its split functionality? Something like:

bool done = false;
while (!done) {
  auto pair = str.split('\n');
  auto len = pair.first.size();
  done = pair.second.empty();
  // Include newline if we are not done
  if (!done)
    ++len; 
  stream.Write(pair.first.data(), 
}

This approach also avoids the issue amccarth mentioned below about not ending with a newline. It is also quite a bit simpler to follow.

2728 ↗(On Diff #115476)

llvm::StringRef can contain NULLs right? Maybe use

stream.Write(data, size);
lemo added inline comments.Sep 15 2017, 2:17 PM
source/Interpreter/CommandInterpreter.cpp
2708–2710 ↗(On Diff #115476)

I'll give it a try, thanks for the suggestion.

2728 ↗(On Diff #115476)

The original code (line 2714) was using PutCString(), so this path is just preserving the original functionality (which also suggests that the output is not expected to have embedded nuls)

2713 ↗(On Diff #115463)

There's no assumption that the string ends with \n, see the loop condition: chunk_size < size. The assert is just to validate that we don't have embedded NULs.

lemo marked 9 inline comments as done.Sep 15 2017, 2:22 PM
lemo updated this revision to Diff 115516.Sep 15 2017, 3:43 PM
lemo edited the summary of this revision. (Show Details)

Incorporating CR feedback.

lemo added inline comments.Sep 18 2017, 2:33 PM
source/Interpreter/CommandInterpreter.cpp
2708–2710 ↗(On Diff #115476)

I forgot to reply to this: I tried using str.split() as suggested, with a small tweak to avoid reading past the end of the first slice (++len to include the '\n'):

while (!str.empty() && !WasInterrupted()) {
  auto pair = str.split('\n');
  stream.Write(pair.first.data(), pair.first.size());
  if (str.size() != pair.first.size())
    stream.PutChar('\n');
  str = pair.second;
}
if (!str.empty()) {
  stream.Printf("\n... Interrupted.\n");
}

While this version is shorter there I see a number of small problems with it:

  1. It requires a close look at the StringRef::split() interface (ex. done = pair.second.empty() is actually not the correct check since it will eat the final \n)
  2. While conceptually straightforward, the StringRef::split() interface is not ideal in this case - see #1
  3. This version assumes that stream.Write() actually writes everything (ie. not handling the return size)
  4. In order to avoid reading past the end of the first slice we need to do a separate PutChar() for each \n

So in the end I prefer the original version, which although a bit more verbose, tracks the invariants clearly, is potentially faster and handles strings regardless if they are terminated with \n or not.

lemo added inline comments.Sep 18 2017, 2:38 PM
source/Interpreter/CommandInterpreter.cpp
2708–2710 ↗(On Diff #115476)

Thoughts?

zturner added inline comments.Sep 18 2017, 2:43 PM
source/Interpreter/CommandInterpreter.cpp
2708–2710 ↗(On Diff #115476)

I haven't followed the rest of this too closely, so forgive me if this is a dumb question, but it looks like by the time we reach this function, the command is already complete. Why would you want to interrupt it if it's already done? Couldn't you just print the entire output in one call to stream.Write() at this point?

I think Zach might be correct. If we already gathered the data, why not just output it all? Lets answer this first before returning to which implementation to use when parsing it up into lines and dumping it. Seems a shame to throw any data away.

jingham edited edge metadata.EditedSep 18 2017, 3:17 PM

In the original discussion of this feature (it was on the mailing list), it sounded like the specific case that motivated Leonard in adding this feature was when there's a command that's in the process of generating tons of output, and you want to interrupt the tons of output. Not the work, just the output. So the fact that is discards output was part of the design.

And in general, interruption should be as responsive as you can make it. When I tell lldb to "Stop right now" it really should do as close as possible to "stop right now". Not, :-"excuse me I have 10 more pages of output to dump". It's not uncommon to dump some info, see what you want and not need to see the rest. Interrupt in that case is expressing that "don't show me any more" so we should honor that.

ok. Then back to the "can we use StringRef"? We might be able to check if the second.data() is NULL? It might be NULL if the string doesn't end with a newline. It it does, it might be empty but contain a pointer to the '\0' byte?

lemo added a comment.Sep 19 2017, 10:33 AM

Ping? Are there any more open questions or unresolved comments or is this good to be checked in?

Thanks!

We should have a test. The test would need to use pexpect or something similar. If anyone has any pointer on how to make a test for this, please chime in. I was thinking just a pexpect based test.

zturner edited edge metadata.Sep 19 2017, 10:38 AM

Give me a few more hours, if there's a way to make this work with line_iterator I'd really prefer that since I think it improves readability. Can you confirm that if you were able to write:

auto begin = line_iterator(str, /* skip_empty_lines =*/ false);
auto end = line_iterator();
while (begin != end && !WasInterrupted()) {
  stream.Write(*begin);
  if (++begin != end)
    stream.Write("\n");
}

That this would be equivalent?

We should have a test. The test would need to use pexpect or something similar. If anyone has any pointer on how to make a test for this, please chime in. I was thinking just a pexpect based test.

This kind of thing is perfect for a unit test, but I'm not sure how easy that would be with the current design. You'd basically do something like:

struct MockStream {
  explicit MockStream(CommandInterpreter &Interpreter, int InterruptAfter) 
    : CommandInterpreter(Interpreter), InterruptAfter(InterruptAfter) {}

  CommandInterpreter &Interpreter;
  const int InterruptAfter;
  int Lines = 0;
  std::string Buffer;

  void Write(StringRef S) {
    ++Lines;
    if (Lines >= InterruptAfter) {
      Interpreter.Interrupt();
      return;
    }
    Buffer += S;
  }
};

TEST_F(CommandInterruption) {
  CommandInterpreter Interpreter;
  MockStream Stream(Interpreter, 3);
  Interpreter.PrintCommandOutput(Stream, "a\nb\nc\nd\ne\nf\n");
  EXPECT_EQ(Stream.Lines == 3);
  EXPECT_EQ(Stream.Buffer == "a\nb\nc\n");
}
lemo updated this revision to Diff 116074.Sep 20 2017, 2:36 PM
  1. Added SB APIs (SBCommandInterpreter::WasInterrupted()). This allows python commands to query for interrupt requests.
  1. I talked offline with Zach and decided that the line_iterator would require more tinkering to get it to work and it's not worth blocking the change for it.
  1. I looked into adding a test for the SB APIs, but it would not really test much of the real path - which targets the _interactive_ interruption. Also, while I appreciate the ideal of testing everything I'm afraid that the net value of a test here might be negative (testing for async interruption would introduce some level of non determinism in the test suite and would require extra logic that's only used for testing). If anyone has any specific guidance on how to create a good test please let me know!
zturner accepted this revision.Sep 20 2017, 3:39 PM

seems fine to me, I agree with the point about non-determinism, due to the nature of signals and the fact this is only intended to provide a best-effort interruption.

This revision is now accepted and ready to land.Sep 20 2017, 3:39 PM
jingham accepted this revision.Sep 20 2017, 3:51 PM

I don't quite understand the comment about signals adding indeterminacy. No signal delivery is required to test this part. The lldb driver has a sigint handler that calls SBDebugger::DispatchInputInterrupt. But since you aren't testing whether SIGINT actually calls DispatchInputInterrupt, you can just call it directly. I was suggesting executing a command (with SBCommandInterpreter::ExecuteCommand) for a Python command that cycles calling WasInterrupted. Then you would make another thread in Python and have that thread wait a bit then call DispatchInputInterrupt, If "wait a bit" bothers you then you could have the python based command you are interrupting signal the interrupting thread before going into its loop. I don't see how this would result in indeterminacy, And it would be testing exactly what you want to test: does calling DispatchInputInterrupt cause a command in flight to be interrupted.

But this change is fine. Check it in and I'll add a test when I get a chance.

amccarth accepted this revision.Sep 20 2017, 4:24 PM

LGTM.

But just a thought: Is it worth doing all the work to scan for line endings for the interruption points? What if, instead, it just printed the next _n_ characters on each iteration until the entire buffer has been printed. Sure, sometimes an interruption will split a line, and sometimes it won't. I'm not sure that's important for interactive usage. It would mean less fiddly code, faster output (because you don't have to scan every character), and a zillion short lines will print as fast as a smaller number of longer lines that represents the same volume of text.

This revision was automatically updated to reflect the committed changes.
lemo updated this revision to Diff 117900.Oct 5 2017, 2:49 PM
lemo edited the summary of this revision. (Show Details)

Updating the original changes to handle reentrant calls to CommandInterpreter::IOHandlerInputComplete().

  • This fixes bug #34758
  • Also enhanced the existing test case for "command source" to cover this regression
  • I considered changing SBCommandReturnObject::PutOutput() to handle output interruptions too, but CommandReturnObjects are decoupled from CommandInterpreter instances, so there's no easy way to query for interruption in the current scheme.
lemo reopened this revision.Oct 5 2017, 2:53 PM
This revision is now accepted and ready to land.Oct 5 2017, 2:53 PM
lemo requested review of this revision.Oct 5 2017, 2:53 PM
lemo edited edge metadata.
jingham accepted this revision.Oct 5 2017, 3:41 PM

That looks fine.

This revision is now accepted and ready to land.Oct 5 2017, 3:41 PM
amccarth accepted this revision.Oct 5 2017, 3:54 PM

LGTM

This revision was automatically updated to reflect the committed changes.