This is an archive of the discontinued LLVM Phabricator instance.

Add StructuredData plugin type; showcase with new DarwinLog feature
ClosedPublic

Authored by tfiala on Jul 29 2016, 2:19 PM.

Details

Reviewers
clayborg
jingham
Summary

This change adds a new plugin-type called StructuredDataPlugin. These are detailed in depth in docs/structured_data/StructuredDataPlugins.md.

The DarwinLog feature is detailed in docs/structured_data/DarwinLog.md.

Highlights:

  • StructuredData plugins are bound to a process.
  • StructuredData features are advertised by the Process-derived class.
  • ProcessGDBRemote sends off a query packet to ask about available features and wires up via Process if any plugins handle the feature.
  • StructuredData producers (the process monitors) can send data asychronously via new $J packets, eligible to be sent while the target process is running. These packets are routed back to the plugins on the LLDB client side via the Process instance.
  • DarwinLog makes use of this facility, and enables gathering and filtering new Fall 2016+ stream-based logging on Apple targets. The plugin to consume this is StructuredDataDarwinLog.cpp.

Other minor changes:

  • Options and OptionsGroups now get constructed and store a pointer rather than reference to the CommandInterpreter. This facilitates reuse of the options parser without requiring the CommandInterpreter.
  • debugserver will now log its internal logging through os_log() if os_log() is available on the platform and the LLDB_USE_OS_LOG=1 Xcode flag or define is set. This defaults to off except for the BuildAndIntegration configuration since it would otherwise fail to build on OS X 10.11 and earlier.

This change has been tested on:
macOS 10.12 Beta 3 + Xcode 8.0 Beta 3 (Xcode build)
macOS 10.11.5 + Xcode 7.3.1 (Xcode build)
macOS 10.12 Beta 3 (cmake build)
Ubuntu 16.04 x86_64 (cmake build)

The architecture of this change was substantially reviewed by Greg Clayton during implementation.

Diff Detail

Event Timeline

tfiala retitled this revision from to Add StructuredData plugin type; showcase with new DarwinLog feature.Jul 29 2016, 2:19 PM
tfiala updated this object.
tfiala added reviewers: jingham, clayborg.
tfiala updated this revision to Diff 66172.Jul 29 2016, 2:19 PM
tfiala added a subscriber: Restricted Project.
tfiala added inline comments.Jul 29 2016, 2:37 PM
include/lldb/Interpreter/Options.h
315 ↗(On Diff #66172)

It is really needed. (Address-related options parsing need to do symbolication/address lookup via the target/process).

TODO: remove this comment.

I talked to Jim Ingham about this. I'm going to replace the std::regex usage with the LLDB regex classes so that we don't end up using two different flavors of regex.

On the debugserver side, I'll end up needing to call the underlying regex library methods directly.

The other piece Jim and I discussed was the sending over of the StructuredData configuration data in the $QConfigure{FeatureName} packet from the lldb client to the gdb-remote process monitor. That needs to be properly escaped since I'm sending over regex content that can have backslashes that are going to need escaping.

tfiala updated this revision to Diff 66218.Jul 30 2016, 1:43 AM

Changes:

  • Converts std::regex to lldb RegularExpression.
  • Implements RegularExpression-like code in debugserver.
  • Sends configuration data to the debug monitor in gdb-remote escaped format. This prevents the protocol from eating some of the characters that show up in regexes.
  • Fix the totally broken binary decoder in debugserver. It would replace all decoded characters with the first character in the sequence so long as it wasn't an escape.
  • Improve error display to user when enabling DarwinLog fails with a configuration error. Now configuration errors actually show up on the error stream for the debugger.
  • Other minor bug fixes and test fixes.
tfiala marked an inline comment as done.EditedAug 8 2016, 8:18 AM

Changes:

  • Converts std::regex to lldb RegularExpression.
  • Implements RegularExpression-like code in debugserver.
  • Sends configuration data to the debug monitor in gdb-remote escaped format. This prevents the protocol from eating some of the characters that show up in regexes.
  • Fix the totally broken binary decoder in debugserver. It would replace all decoded characters with the first character in the sequence so long as it wasn't an escape.
  • Improve error display to user when enabling DarwinLog fails with a configuration error. Now configuration errors actually show up on the error stream for the debugger.
  • Other minor bug fixes and test fixes.

I was out on vacation last week. I'm back now. Jim had some more comments on the concept of changing options over to having optional access to the interpreter. I am going to split that part off into a separate change, put that up for review, and adjust this patch for that change.

tfiala added a comment.EditedAug 8 2016, 11:30 AM

I was out on vacation last week. I'm back now. Jim had some more comments on the concept of changing options over to having optional access to the interpreter. I am going to split that part off into a separate change, put that up for review, and adjust this patch for that change.

That communication was as follows:

>8

[Jim]
One big part of this change was re-jiggering the Options so you could use them to parse your Structured Data setup options. That seems a worthy goal, but in the result you make a CommandObject's CommandOptions have to worry about not having a CommandInterpreter. That feels wrong. But it might be wrong in a different way than just "I know I always have one of these" type wrong...

So far as I can see, we need a CommandInterpreter for options in two places. One for the CommandOptions class in CommandObjectBreakpoint. The other place we use this is to run validators in Args::ParseArgs. In both cases we're only using the selected execution context, and the selected platform. It seems like it would be more straightforward to pass these in to ParseArgs, and not rely on the Options having a command interpreter at all.

This is especially true because it is a little wrong to be getting the Execution Context from the interpreter in parsing arguments for a command. We've been trying to get away from assuming that every command gets run with the ExecutionContext of the currently selected target, thread, etc... That forces us to go through changing the selected thread when we want to run stop hooks on all threads, or breakpoint commands on all the threads that hit breakpoints. We've tried to replace this with explicitly passing in the Execution Context. So this change would further that aim as well.

That's more change, but I think it is closer to right.

[Todd]
So are you thinking remove the interpreter from commands entirely, and pass in an execution context (which could be null) to the various functions on Options?

[Jim]
If possible, yes, remove the interpreter from the Options & CommandOptions.

The way we get option parsing to pick up the right Execution Context at present is a little gross. HandleCommand gets called with an ExecutionContext * (the override_context). Right away we call CommandInterpreter::UpdateExecutionContext which stores this override context away in the command interpreter. Then in a few places - processing backticks, and the argument parsing bits that need an execution context fetch the interpreter from the option, and then get the ExecutionContext from there. Seems like if we could pass the ExecutionContext directly to the few places where it was needed, then the flow would be much more explicit.

That may be more change than you want to do at this point, however.

>8

[Todd]
I will do this change, but I am going to break it into a separate stand-alone change and get that reviewed and landed first.

The change to decouple Option parsing from CommandInterpreter went up for review here:
https://reviews.llvm.org/D23416

I'll refresh this patch after that.

tfiala updated this revision to Diff 68199.Aug 16 2016, 8:58 AM

This is ready to go now.

Changes from previous:

  • Removed decoupling of Options from CommandInterpreter. I did this in a separate review and commit, and I did it in a cleaner way (with ExecutionContexts).
  • Added more flags to the DarwinLog enable command.
  • Fixed a race in startup of the DarwinLog instrumentation when launching a process.
clayborg requested changes to this revision.Aug 16 2016, 10:12 AM
clayborg edited edge metadata.

Do we need and extra copy of JSON.h and JSON.cpp? It would be great if we can shared one between LLDB and debugserver.

docs/structured_data/StructuredDataPlugins.md
34–36

This is the case for GDB, but it won't be the case for other plug-ins. We should remove the fact that async packets can only arrive when the process is running.

38–42

Lets not mention any gdb-remote specific stuff in this document. Maybe mention how this would get plumbed through to the plug-in in terms of which virtual function will be called in the base class.

44–45

Maybe mention which virtual function gets called here?

47–50

Maybe say something like:

IDE clients might use this feature to receive information about the process as it is running to monitor memory usage, CPU usage, and logging.
65–67

I would mention this without calling out a specific plug-in.

75–76

No mention of ProcessGDBRemote?

88–94

No mention of ProcessGDBRemote? Move this to the DarwinLog docs.

include/lldb/Core/Broadcaster.h
274–279

We should probably make these bits not be the process bits and just pick examples like "eBroadcastBitOne". We shouldn't have to update this header file any time the process bits are updated.

include/lldb/Target/StructuredDataPlugin.h
28–36

I would try to agnostify this info so it doesn't mention any plug-in. We should move this text to the ProcessGDBRemote or into the DarwinLog plugin.

47

Maybe mention use cases like an IDE getting CPU usage, memory usage, etc.

source/API/SBDebugger.cpp
495–497

Remove this TODO and fix this to happen correctly in your plug-ins.

source/Core/Event.cpp
391

Are other flavor strings camel case?

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp
386
m_qSupported_response.clear()
source/Target/Platform.cpp
1318–1343

Make sure we document the "muck with launch info" feature correctly. We should say something like "your plug-in needs to know if it is enabled, ..."

This revision now requires changes to proceed.Aug 16 2016, 10:12 AM

Additional feedback from Jim Ingham:

  • Where are the SBAPI calls to retrieve the structured data from the new process-level structured data event?

[A. Ah, yes, I neglected that. I will add in a follow-up patch. Also, I will convert at least some of the tests over to using that API instead of pexpect].

  • The new Thread Plan:
    • Remove guess comments in text, they are correct.
    • Move the thread plan into source/Target. It is generally useful.
  • Consider (as a change separate from this) a 'thread step-out -F {step-out-callback}' command that allows a use to call arbitrary code after the step-out completes. That could be based on the new thread plan from this change.
jingham requested changes to this revision.Aug 16 2016, 12:36 PM
jingham edited edge metadata.

The idea seems great. I don't see how an SB client would be able to access this data? That's an oversight, the question is whether it is yours or mine?

Other little comments in-line.

docs/structured_data/StructuredDataPlugins.md
17

Maybe 'For instance, the type name of the DarwinLog feature is "DarwinLog".' would read more smoothly.

20–27

I'd remove the first sentence. It doesn't add anything, and obscures the flow which is: process monitor advertises features, process bargains with Plugins for support of the feature and installs the plugins found.

56

that -> when

64–65

When does this happen w.r.t. the publication of supported structured data features by the process monitor? This description makes it sound like plugins that might not actually be supported by the process monitor you are going to attach to get to muck with the launch info, which is a little odd.

include/lldb/Target/Process.h
3484–3485

This one doesn't have docs. Was that intentional.

include/lldb/Target/StructuredDataPlugin.h
118–122

How does this handle errors? The display below takes an error stream, for instance. Should this take an error object as a parameter?

154–157

If a plugin doesn't support enabling & disabling, then it will be always on, right? If so, perhaps this should have a default implementation that returns true.

195–196

This sounds like this initializes the debugger somehow. What does this actually do? RegisterWithDebugger or something like that?

source/Plugins/StructuredData/DarwinLog/ThreadPlanCallOnFunctionExit.cpp
116–117 ↗(On Diff #68199)

You never run the target on your own since your plan will never be on the top of the stack when the process restarts. You will always either have the StepOut plan in front of you, or you will be done. So it doesn't actually matter what you return here.

source/Plugins/StructuredData/DarwinLog/ThreadPlanCallOnFunctionExit.h
1–11 ↗(On Diff #68199)

This should go in source/Target, it's a generally useful plan. You might even be able to add this functionality to the extant step out plan? That's not required, after all this isn't much code, but it might make it easier to do things like extend "thread step-out" to add a completion function...

tfiala marked 12 inline comments as done.Aug 16 2016, 1:47 PM

First adjustment, based on Greg's feedback.

I'm not sure what to do with the broadcast bit comment. See question in my response.

I'll be putting up the patch with the adjustments covered here in a few moments.

include/lldb/Core/Broadcaster.h
274–279

I'm not sure I understand this. How would we know that eBroadcastBitOne is supposed to be a structured data event?

source/Core/Event.cpp
391

Yes, EventDataBytes in the same file uses camel case:

const ConstString &
EventDataBytes::GetFlavorString ()
{
    static ConstString g_flavor ("EventDataBytes");
    return g_flavor;
}

and

const ConstString &
Process::ProcessEventData::GetFlavorString ()
{
    static ConstString g_flavor ("Process::ProcessEventData");
    return g_flavor;
}

That highlights that perhaps the structured data should be called "Process::EventDataStructuredData", and perhaps it should really appear in Process.{h,cpp}.

source/Target/Platform.cpp
1318–1343

The RegisterPlugin call is now fully documented, including a description of responsibilities around the operation of the optional StructuredDataFilterLaunchInfo callback.

tfiala updated this revision to Diff 68249.Aug 16 2016, 1:50 PM
tfiala edited edge metadata.
tfiala marked an inline comment as done.

Here's the patch with fixes for round one of Greg's feedback. These are primarily documentation changes.

tfiala marked 3 inline comments as done.Aug 16 2016, 4:38 PM

Responses and/or fixes for Jim's comments coming up in a patch momentarily.

docs/structured_data/StructuredDataPlugins.md
57

I think this change got edited out.

65–66

That's the challenge. I've added docs to the plugin registration method for StructuredDataPlugins that adds notes on the responsibilities of the optional launch info filter callback.

include/lldb/Target/Process.h
3484–3485

No, oversight. Thanks!

include/lldb/Target/StructuredDataPlugin.h
119–123

I had intended that the plugin could determine which stream it wanted to use to display output. Not so much to encode an actual error.

This is going to change, though, as I add the SBStructuredData class and the methods on it per our discussion on that.

In the old scheme, it probably made sense to just not include the stderr stream. I was being over-generous to the plugin, but the complexity addition doesn't seem to make sense for it.

155–158

Good point. Sounds fine.

196–197

I was using the same name we typically give the methods that initialize a plugin for use with a Debugger instance.

What this method actually does is makes sure that the hook points (multi-word command for the base 'plugin structured-data') is created before any single plugin child class tries to attach commands to it.

I'll try a different name here that conveys that better.

I've renamed it to:

static void
InitializeBasePluginForDebugger(Debugger &debugger);

and I documented it.

tfiala updated this revision to Diff 68283.Aug 16 2016, 4:41 PM
tfiala edited edge metadata.

Diff for previous comment.

The last change on the books to be made is to expose the StructuredData event through the SBAPI so that clients can actually react to structured data events that are raised. I'll be working on that next.

tfiala updated this revision to Diff 68592.Aug 18 2016, 12:26 PM
  • Implemented the SBAPI side of the structured data plugin system, allowing retrieval of structured data and plugin-based pretty-printing of the data.
  • Converted the basic test to be event-based.
  • Fixed wrong usage of a broadcast method that was skipping placing the structured data event in a listener queue if there was another structured data event in the queue. (I was using the unique-style broadcast that I mimicked from somewhere else, but that is the wrong semantics here.)

This is ready for a final review.

jingham requested changes to this revision.Aug 18 2016, 2:19 PM
jingham edited edge metadata.

One comment inline about naming.

include/lldb/API/SBStructuredData.h
42–44 ↗(On Diff #68592)

In most of the other SB API classes that produce a text version of themselves for presenting to the a user (e.g. for SBAddress, SBBreakpoint, etc) we use GetDescription as the name. Is this different enough from that to warrant a new name?

This revision now requires changes to proceed.Aug 18 2016, 2:19 PM

One comment inline about naming.

include/lldb/API/SBStructuredData.h
42–44 ↗(On Diff #68592)

I was just using the name that I thought Greg said when we were talking about it in my office the other day.

I can switch it, although in the case of log messages, where there is formatting that is applied based on enabled options, I think this is potentially misleading. The whole reason for living of a log message is to display it. In that regard, I don't necessarily think this is the same as a typical GetDescription() method.

But I'm happy to modify it if, given my caveats above, we still think that's appropriate.

tfiala updated this revision to Diff 68627.Aug 18 2016, 3:42 PM
tfiala edited edge metadata.
  • Added test base class to do event-based testing of DarwinLog support.
  • Converted the regex-based message content tests to use the event-based test base class.
  • Same for the basic DarwinLog test.

Jim,

Okay I totally agree on the GetDescription name. I was neglecting to think that this applies to all StructuredData plugins. In that case, there is no argument to be made about a logging-focused method name.

I'm adjusting that now.

tfiala updated this revision to Diff 68639.Aug 18 2016, 5:52 PM
tfiala edited edge metadata.

Changes:

  • Adjust for Jim's comment: renamed PrettyPrint(...) => GetDescription(...)
  • Adjust for previous comment from Greg: remove StructuredDataPlugin::ShouldPrint(), and have the Debugger instance always do a GetDescription(...) and print it.

Hopefully the last change!

jingham accepted this revision.Aug 18 2016, 5:54 PM
jingham edited edge metadata.

Looks good.

Committed as r279198.

Waiting to see builders to see what happens.

Something is not building now on Linux. It's been a couple weeks since I built it there and there have been enough changes since then.

I've reverted it out in r279200.

I'll get the Linux build fixed here shortly and reapply.

Take 2, r279202.

Fixes:

  • Missing cmake subdirectory link for source/Plugins/StructuredData.
  • Return statement for default implementation of virtual was lost at some point, now restored.

There's a gtest broken on this. I don't see it breaking on macOS, but I think it's because nobody is keeping the gtests updated there.

Fix coming shortly...

Fixed up the gdb-remote gtest here:
r279203

I didn't see this because:

  1. The Xcode lldb-gtest target has not been updated to include newer gtests.
  2. The Ubuntu build process doesn't run the gtests as part of a normal build.

I need to adjust the Xcode build more - it looks like there is a depedency on libxml that didn't exist before. The macOS lldb-gtest target is broken temporarily while I fix that, but the Linux one should be fine now.

tfiala accepted this revision.Aug 18 2016, 10:06 PM
tfiala added a reviewer: tfiala.

r279203 fixed the gtests for the Ubuntu CI.

r279205 fixed the macOS running of gtests. The newly added Process gdb-remote gtests use some mock features that try to use RTTI. Our lldb-core explicitly builds without RTTI (via -fno-rtti), so I had to disable that for the newly-added gdb-remote gtests. I also needed to add libxml2 to the linkage.

Calling this good now.

tfiala removed a reviewer: tfiala.Aug 18 2016, 10:07 PM
clayborg accepted this revision.Aug 19 2016, 8:42 AM
clayborg edited edge metadata.
This revision is now accepted and ready to land.Aug 19 2016, 8:42 AM
tfiala closed this revision.Aug 19 2016, 9:25 AM

This all went in last night. The commits were noted in previous comments.