This is an archive of the discontinued LLVM Phabricator instance.

[Reproducers] Add command provider
ClosedPublic

Authored by JDevlieghere on Feb 22 2019, 5:29 PM.

Details

Summary

Add a provider to the command interpreter. Essentially this writes all the commands to a file which is used during replay as input to the command interpreter.

Diff Detail

Event Timeline

JDevlieghere created this revision.Feb 22 2019, 5:29 PM
aprantl added inline comments.Feb 25 2019, 12:42 PM
lldb/include/lldb/Interpreter/CommandInterpreter.h
31

Doxygen comments?

625

///

lldb/source/Interpreter/CommandInterpreter.cpp
121

Any toplevel Doxygen comments that would be useful to add here?

132

At least from the outside it's not quite obvious what this function does. Could you perhaps add a high-level comment?

JDevlieghere marked 4 inline comments as done.

Add comments

aprantl accepted this revision.Feb 25 2019, 2:15 PM
This revision is now accepted and ready to land.Feb 25 2019, 2:15 PM
  • Add testcase.
  • Don't log sourced commands twice.

For some reason I thought that we no longer needed to differentiate between commands being sourced from a file. I remember an earlier discussing with Pavel on this topic where we considered doing this at a lower level, but there we face the same issue, without an easy way to differentiate.

I am sorry that I won't have much time to review this in the next couple of weeks, but I don't think this is a good direction here. I don't see how this will interact with the SB API recorder, specifically with things like SBCommandInterpreter::HandleCommand, and ::HandleCommandsFromFile. The thing I would expect to see is that SB recorder captures the input of those commands (for a somewhat broad interpretation of "capture") during recording, and then substitute this during replay. That way the CommandInterpreter class would not need (almost?) any modifications.

With this approach (shoving all commands into a single stream in the CommandInterpreter) it becomes impossible to replay the API calls above. If you want to proceed with this, then go ahead (it's your feature), but I believe you'll run into some problems down the line.

I am sorry that I won't have much time to review this in the next couple of weeks, but I don't think this is a good direction here. I don't see how this will interact with the SB API recorder, specifically with things like SBCommandInterpreter::HandleCommand, and ::HandleCommandsFromFile. The thing I would expect to see is that SB recorder captures the input of those commands (for a somewhat broad interpretation of "capture") during recording, and then substitute this during replay. That way the CommandInterpreter class would not need (almost?) any modifications.

It’s been a while but wasn’t this exactly what you proposed in the other differential? How would you capture commands that are entered interactively (through RunCommqndInterpreter)?

Anyway, I don’t believe this is a concern. The provider here only capture what’s entered interactively, hence the flag. Replaying the API call should work exactly as expected. I’ll double check later today.

With this approach (shoving all commands into a single stream in the CommandInterpreter) it becomes impossible to replay the API calls above. If you want to proceed with this, then go ahead (it's your feature), but I believe you'll run into some problems down the line.

I am sorry that I won't have much time to review this in the next couple of weeks, but I don't think this is a good direction here. I don't see how this will interact with the SB API recorder, specifically with things like SBCommandInterpreter::HandleCommand, and ::HandleCommandsFromFile. The thing I would expect to see is that SB recorder captures the input of those commands (for a somewhat broad interpretation of "capture") during recording, and then substitute this during replay. That way the CommandInterpreter class would not need (almost?) any modifications.

It’s been a while but wasn’t this exactly what you proposed in the other differential? How would you capture commands that are entered interactively (through RunCommqndInterpreter)?

Anyway, I don’t believe this is a concern. The provider here only capture what’s entered interactively, hence the flag. Replaying the API call should work exactly as expected. I’ll double check later today.

I have to admit I haven't looked at this in detail, but the thing I'm missing here is the connection between commands and API calls. If we take RunCommandInterpreter, for instance, you can see that the lldb driver invokes this function three times. How do you ensure the "right" commands get replayed as a part of the API call?

If you look at the driver more closely, you'll see that each call to RunCommandInterpreter is preceeded by a call to SetInputFileHandle. I don't have this idea fully baked, but the way I'd try to approach this is to have each SetInputFileHandle create a new buffer where the commands will be stored in. Then as the commands are being processed (in RunCommandInterpreter), they would be added into this buffer. Then, when replaying you would know that you only should replay the commands from the given buffer.

I'd also probably try to capture these commands at a slightly lower level, because I am hoping that this will allow us to get rid of the add_to_reproducer flag. Ideally, this should fall out naturally due to the different source the commands are coming from -- the commands executed through the HandleCommand API would be captured at the SB boundary, and the "interactive" commands would be captured by whoever invokes the command interpreter in interactive mode.

(I have no idea how easy it is to achieve this, but that's how I'd approach this.)

Pavel made a good point that with the previous implementation, the first call to RunCommandInterpreter would replay every recorded commands. This is indeed incorrect, because it's possible and likely that the state of the debugger has changed between different runs of the commands interpreter.

Now every call to RunCommandInterpreter gets its own buffer. During replay, we will change the input file handler before invocation of "RunCommandInterpreter". This works because this function is only called through the SB layer.

I'm not convinced doing the recorded at a lower level has any benefits. I investigated this route before and the IOHandler seems basically the same things as the command interpreter. We would still need to make the distinction between things that should and shouldn't be recorded (e.g. sourcing a file vs every command in the file). This would be a lot harder to do there, because we have less information.

Pavel made a good point that with the previous implementation, the first call to RunCommandInterpreter would replay every recorded commands. This is indeed incorrect, because it's possible and likely that the state of the debugger has changed between different runs of the commands interpreter.

Now every call to RunCommandInterpreter gets its own buffer. During replay, we will change the input file handler before invocation of "RunCommandInterpreter". This works because this function is only called through the SB layer.

I'm not convinced doing the recorded at a lower level has any benefits. I investigated this route before and the IOHandler seems basically the same things as the command interpreter. We would still need to make the distinction between things that should and shouldn't be recorded (e.g. sourcing a file vs every command in the file). This would be a lot harder to do there, because we have less information.

In my imagination, this "information" would come down from SBDebugger::SetInputFileHandle, together with the FILE* we actually read the commands from. So, a really crude prototype could be something like:

SBDebugger::SetInputFileHandle(FILE *in) {
if (recording) m_debugger->SetInputFileHandle(in, /*new argument!!!*/ new FileShadowRecorder(...));
else if (replaying) m_debugger->SetInputFileHandle(GetRecordedFile(...), nullptr);
}

The FILE* would trickle down into where you read the commands (this could be the IOHandler, or it could be the CommandInterpreter object), where you would do something like:

auto stuff = read_stuff(in);
if (shadow_recorder)
  shadow_recorder->record_stuff(stuff);

Now when you're sourcing an external file, you just pass in a nullptr for the recorder when you're setting the input handle for the command interpreter.

So, in essence the shadow recorder would kind of serve the same purpose as your add_to_reproducer flag, but IMO this would be better because it would come straight from the source (SetInputFileHandle) and you wouldn't need to rely on comments like "This works because this function is only called through the SB layer". Another benefit would be that this would work out-of-the-box in case you have multiple SBDebugger objects around, each with it's own command interpreter (right now this wouldn't work because the StartNewBuffer thingies would step on each others toes). I don't know when or if you plan to support that, but right now that tells me that this is a better design.

lldb/source/Interpreter/CommandInterpreter.cpp
144

This is incorrect usage of the Twine class. The temporary Twine objects will be destroyed before you get a chance to stringify them.

Thanks Pavel. I've updated the patch with your suggestion. I agree it's a lot better :-)

I implemented the logic in the Debugger rather than the SBDebugger because I think the latter should be a thin wrapper, but let me know if you had a particular reason for this.

labath added a comment.Mar 1 2019, 5:31 AM

Thanks Pavel. I've updated the patch with your suggestion. I agree it's a lot better :-)

I implemented the logic in the Debugger rather than the SBDebugger because I think the latter should be a thin wrapper, but let me know if you had a particular reason for this.

Thanks Jonas. This looks even simpler than I anticipated. :)

While I agree that SB layer should be as thin as possible, I would say this deserves an exception. My reasoning behind that is that this command capture mechanism is kind of an extended arm of the SB recorder. Like, if we had an oracle, and were able to "instantly" capture the contents of the FILE* at the SB layer, we would just do that, and avoid this shadowing dance altogether. Unfortunately, we cannot see into the future, so we have to have this recorder "shadow" follow the input FILE* whereever it goes and capture the input as it is being read. The cleanest way to me seems to be to have that shadow start tracking the value as soon as it crosses the SB boundary.

Or, looking at it from a different angle, if somebody other than the SBDebugger calls Debugger::SetInputFileHandle, then we probably don't want to capture that input because it did not come from the outside world. (I have no idea why anybody would want to do that, but it does not seem completely out of the question.). This FILE* then most likely came from the filesystem, in which case it would be captured by the FileSystem recorder. Or if it did come through the SB API, but through a different function, then it should have been shadowed as soon it entered that function.

Apart from this I have some inline comments about error handling, but overall, I am very happy with how this is turning out.

lldb/include/lldb/Utility/Reproducer.h
120 ↗(On Diff #188813)

Remove operator bool and the embedded error code. Instead have a static factory function which checks the error and only returns a DataRecorder if the file was created correctly. I'm thinking of something like

static std::unique_ptr<DataRecorder> /*or Expected<...> ?*/ Create(FileSpec f) { 
  std::error_code ec;
  auto rec = make_unique<DataRecorder>(f, ec);
  if (!ec)
    return rec;
  return nullptr; /*or the error code*/
}
lldb/source/Core/Debugger.cpp
933 ↗(On Diff #188813)

Why not store this as a field? This way you still can have problems if two debugger objects are active concurrently.

lldb/source/Utility/Reproducer.cpp
228–229 ↗(On Diff #188813)

So what happens if creating the file fails here? We abort when the assert in Debugger::GetInputRecorder fails? If we're going to crash, it's probably better to do it here, as that's closer to where the actual failure happened (alternatively, we could log a message and exit slightly more gracefully; or log a message and continue without recording).

JDevlieghere marked 3 inline comments as done.
  • Moved the logic into SBDebugger.
  • Created DataRecorder factory.
  • Stored DataRecorder as a field in Debugger.
labath accepted this revision.Mar 1 2019, 12:48 PM

I have a couple of more comments, including some things I missed on the previous pass, but I don't want to hold this up any more. Feel free to commit after taking the last batch into consideration.

lldb/include/lldb/Utility/Reproducer.h
116 ↗(On Diff #188932)

The error_code will need to be a reference argument for this to have any effect.

lldb/source/API/SBDebugger.cpp
63 ↗(On Diff #188932)

Use a factory function here too?

74 ↗(On Diff #188932)

It might be nice to log these errors even if we don't plan to do anything about them.

lldb/source/Core/Debugger.cpp
887 ↗(On Diff #188932)

Sticking with the "shadow" idea, I think it would be better if the input recorder is set as an extra argument to this function (instead of as a separate call), because they should always be changed together. Setting one without the other is almost certainly a mistake.

Addressed in commit. Thank you Pavel!

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2019, 4:19 PM