Page MenuHomePhabricator

Add a "Trace" plug-in to LLDB to add process trace support in stages.
ClosedPublic

Authored by wallace on Aug 10 2020, 9:32 PM.

Details

Summary

This is the first in a series of patches that will adds a new processor trace plug-in to LLDB.

The idea for this first patch to to add the plug-in interface with simple commands for the trace files that can "load" and "dump" the trace information. We can test the functionality and ensure people are happy with the way things are done and how things are organized before moving on to adding more functionality.

Processor trace information can be view in a few different ways:

  • post mortem where a trace is saved off that can be viewed later in the debugger
  • gathered while a process is running and allow the user to step back in time (with no variables, memory or registers) to see how each thread arrived at where it is currently stopped.

This patch attempts to start with the first solution of loading a trace file after the fact. The idea is that we will use a JSON file to load the trace information. JSON allows us to specify information about the trace like:

  • plug-in name in LLDB
  • path to trace file
  • shared library load information so we can re-create a target and symbolicate the information in the trace
  • any other info that the trace plug-in will need to be able to successfully parse the trace information
    • cpu type
    • version info
    • ???

A new "trace" command was added at the top level of the LLDB commmands:

  • "trace load"
  • "trace dump"

I did this because if we load trace information we don't need to have a process and we might end up creating a new target for the trace information that will become active. If anyone has any input on where this would be better suited, please let me know. Walter Erquinigo will end up filling in the Intel PT specific plug-in so that it works and is tested once we can agree that the direction of this patch is the correct one, so please feel free to chime in with ideas on comments!

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

A large part of this patch is concerned with parsing which worries me from a maintenance perspective. Did you consider using Yaml I/O? While I'm not a particularly big fan of the format, the benefits of being able to (de)serialize any class by implementing the appropriate traits are quite significant. We already have traits implemented for a bunch of utility classes, such as ArchSpec and FileSpec which we could reuse for this. I know changing the format would be invasive, but I think it might be worth it in the long term.

So the nice thing about using StructuredData is it can be in any format: JSON, XML, Apple property list, YAML etc. It seems like the functions that were added to ArchSpec and FileSpec for the YAML I/O could be converted to use StructuredData and then any of the formats would work.

A large part of this patch is concerned with parsing which worries me from a maintenance perspective. Did you consider using Yaml I/O? While I'm not a particularly big fan of the format, the benefits of being able to (de)serialize any class by implementing the appropriate traits are quite significant. We already have traits implemented for a bunch of utility classes, such as ArchSpec and FileSpec which we could reuse for this. I know changing the format would be invasive, but I think it might be worth it in the long term.

So the nice thing about using StructuredData is it can be in any format: JSON, XML, Apple property list, YAML etc. It seems like the functions that were added to ArchSpec and FileSpec for the YAML I/O could be converted to use StructuredData and then any of the formats would work.

If we invest in creating a library like Yaml IO for StructuredData, which allows you to declaratively specify how a class gets (de)serialized, then I'm all for using it here and for the reproducers. I think that would be really, really valuable. Given the way that structured data works and how similar it is to how these formats look, I'm also fairly confident it can be done. Still it would be a significant amount of work. But barring that I don't see a benefit from switching from YAML I/O to StructuredData, at least not for the reproducers and not really for this either. Unless we need to be able to support all these formats? Like I said, I don't really care about YAML. What I do care about is that I don't want to be in the business of writing a parser for every data structure I need to serialize. That's really where Yaml I/O shines, even though it has other drawbacks, like the traits themselves being verbose and the parser being far from fast.

wallace added a comment.EditedAug 26 2020, 1:54 PM

Given that this feature is intended to be used by non debugger developers, I want to stick to using JSON as a default input format, as it's more ubiquitous than YAML and I prefer to make this decision based on the user experience than on the technical solution. I like the YAML parser though, but converting JSON to YAML for parsing might throw away relevant error messaging.

wallace updated this revision to Diff 288145.EditedAug 26 2020, 6:01 PM
  • As error messaging is important when parsing, I create a new set of functions in StructuredData for parsing that return an llvm::Expected. This allowed me to move the helper functions for parsing from the Trace file to StructuredData, so that in the future other contributors can

use those methods when needed.

  • I modified the schema to allow for the "traceFile" entry at several levels.
  • "systemPath" was also added for the module entry
  • I didn't use the PlaceholderObject yet, as Greg suggested. That might involve a few other changes and I need to see if that class is sufficient for decoding intel-pt traces. However, as soon as I'm working on the decoding part, I'll move to PlaceholderObject, as it will be nice to avoid failing hard if a module is not available.
  • I simplified the code here and there as well.
  • I'm still using strings for addresses, like "0xFFFFFFFF" or "12312412". After checking some recommendations on stack overflow, using strings seems to be a good practice for having compatibility with JS, which has 54-bit sized integers. I imagine that these trace files might be send over the network and JS could potentially be used then.

Given that this feature is intended to be used by non debugger developers, I want to stick to using JSON as a default input format, as it's more ubiquitous than YAML and I prefer to make this decision based on the user experience than on the technical solution.

Alright, fair enough I guess. In that case I would prefer to split off the parser from the Trace data structure.

I like the YAML parser though, but converting JSON to YAML for parsing might throw away relevant error messaging.

I don't think anyone suggested this.

One more thing about the parser. I really like the usage of llvm::Error/Excepted/Optional in this path btw. However, these functions take a non-const ref to classes like Process, Thread, etc. It seems like they might be left in a potentially inconsistent state when we error out. Is that something to worry about or are we guaranteed to discard them in the caller if an error occurred? Could they for instance return an Expected<Debugger> instead?

lldb/include/lldb/Target/Trace.h
87

StringRef

106

StringRef

128

StringRef

151

StringRef

lldb/include/lldb/Utility/StructuredData.h
196 ↗(On Diff #288145)

StringRef

lldb/source/Plugins/Trace/intel-pt/CMakeLists.txt
4

Move this up to where this directory is included. e.g.

if (LLDB_BUILD_INTEL_PT)
  add_subdirectory(intel-pt)
endif()

That way a change to the CMake file won't trigger a reconfig if LLDB_BUILD_INTEL_PT is false.

23

Why not pass this in the first place? PATHS will be checked *in addition* to the default locations.

27

If you REQUIRED to find_library CMake will take care of the error reporting.

lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.h
23

Redundant.

55

I think it would be nice to separate the plugin from the parser.

lldb/source/Target/Trace.cpp
28

StringRef

51

StringRef

147

StringRef should always be passed by value.

lldb/test/API/commands/trace/intelpt-trace/trace.json
3

It seems like the key here should be trace rather than "plugin"?

Thanks for the review! Those are very useful comments. So, I'll split the parsing out from the Trace object. Regarding a possible inconsistent state, you are right. That could happen. The targets, modules and processes used in the parsing are all created there, and it should be enough to only "commit" those targets if there has been no error. I'll make the necessary updates.

If the format is going to be json, why not use llvm::json from llvm/Support/JSON.h? That's what we been migrating most of the existing stuff to already, so going using that for new code makes perfect sense.

clayborg requested changes to this revision.Aug 27 2020, 12:50 PM

If the format is going to be json, why not use llvm::json from llvm/Support/JSON.h? That's what we been migrating most of the existing stuff to already, so going using that for new code makes perfect sense.

I am fine with this if we are ok always using JSON. StructuredData, when it parses JSON, is backed by llvm/Support/JSON.h, so we are already using it. If we don't need the flexibility to use other structured data formats we can do this.

lldb/include/lldb/Target/Trace.h
158

I know I suggested we might need a trace file for all process, one per process, or one per thread in the JSON schema. The question is do we need to add support for this right away? Might be easier to go with one trace file for now unless IntelPT already supports each of these?

lldb/source/Target/Trace.cpp
55

Can we inline the plug-ins schema right into this stream instead of putting the "<<<plugin_schema>>>" text in here?

This revision now requires changes to proceed.Aug 27 2020, 12:50 PM
wallace updated this revision to Diff 288496.Aug 27 2020, 5:53 PM

Addressed comments

  • Now using StringRef correctly whenever possible.
  • Revert back to using "traceFile" only inside the thread section. That's how intel-pt works at the moment and, as Greg suggested, we can change the schema in the future if needed.
  • Now the global schema construction inlines the plug-in schema.
  • Simplified the CMake logic
  • Separated the parser from the plugin. Now new plugins need to derive their own Trace implementation along with their own Parser.
  • I'm still using StructuredData for the parsing. There's a chance that once we release this feature users will want support for other formats besides JSON, so for now I prefer to ask for JSON input but parse in a format-agnostic way. If eventually no one needs any other format, we can switch to JSON-only parsing.
  • Also, now the Targets created during parsing are preserved only if the parsing succeeded. Otherwise, the Debugger object is left unchanged.
  • I'm still using StructuredData for the parsing. There's a chance that once we release this feature users will want support for other formats besides JSON, so for now I prefer to ask for JSON input but parse in a format-agnostic way. If eventually no one needs any other format, we can switch to JSON-only parsing.

That argument goes both ways. We could say that if anyone wants to add a non-json format, he can change the code to accept his preferred format. Overall, I am very sceptical of the StructuredData's ability to abstract diverse formats in a meaningful way. I doubt that a natural XML representation of this data would map to the same abstract format as this json and that it could be parsed by the same code -- simply because xml conventions are different, and xml has features that json doesn't (attributes, for one).

And since this file is just a descriptor, and actual plugin-specific data is contained in other files, I don't see why it's format couldn't be fixed.

lldb/test/API/commands/trace/intelpt-trace/trace.json
12

What if one of the processes is 64-bit and the other 32? It sounds like this should be a property of a particular process.

You changed my mind. I'll move to Support/JSON.h

lldb/test/API/commands/trace/intelpt-trace/trace.json
12

That's a very good point. Will do so

wallace updated this revision to Diff 288739.Aug 28 2020, 5:53 PM

Addressed all comments.

  • Wwitched to using llvm::json instead of StructuredData. Fortunately, the logic is pretty much the same.
  • Moved the "triple" field to the process level

A few simple changes from inlined comments. Other than that looks good. Should get another OK from others. Also not sure if the JSON stuff in llvm needs to be done separately?

lldb/include/lldb/Target/Trace.h
116

would std::unique_ptr<> might be better? Are we actually handing out multiple instances of this and sharing it anywhere?

lldb/include/lldb/lldb-private-interfaces.h
14

We should avoid this import and use forward declaration here below the includes:

namespace llvm {
  namespace json {
    class Object;
  }
}
lldb/source/Commands/Options.td
205–207

revert whitespace only changes.

768

revert whitespace only changes.

wallace updated this revision to Diff 288788.Aug 29 2020, 10:25 AM
  • Removed whitespace only changes.
  • Simplified the way a SettingsParser is created, now there's no need to passinga a pointer around.
  • Fixed the forward declaration in lldb-private-interfaces.

This should address all comments so far except for the llvm::json changes. I'm okay with creating another patch with those changes if the reviewers prefer that. They are simple, though.

wallace marked 22 inline comments as done.Aug 29 2020, 10:28 AM
clayborg accepted this revision.Aug 31 2020, 2:48 PM

LGTM now. Probably should get an ok from one more person.

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTSettingsParser.cpp
2

Remove "//" after filename

wallace updated this revision to Diff 289285.Sep 1 2020, 2:33 PM

fix nit. Waiting for an additional approval

aprantl added inline comments.Sep 1 2020, 2:40 PM
lldb/source/Commands/CommandObjectTrace.h
2

Can you add -*- C++ -*- markers the the .h files you are adding? Otherwise tools (like phabricator) will think it's a C header.

wallace updated this revision to Diff 289301.Sep 1 2020, 3:55 PM

fix header c++ declaration

I'm pretty sure the json changes should get a separate review. The json owner (let's call him @sammccall) might have a different idea on how to do this thing...

wallace updated this revision to Diff 290603.Sep 8 2020, 4:11 PM
  • Moved the JSON changes to another diff.
  • Addressed all the comments so far.

As a side note, I'll work on a RFC for the LLDB mailing list to get feedback on the overall design of the feature, as @labath suggested. However, Greg and I want to get these patches in, as we need this functionality soon. We can easily make modifications whenever suggestions come, following an iterative approach to development.

clayborg accepted this revision.Sep 8 2020, 5:01 PM

This looks good to me. We will need to wait until the JSON changes from https://reviews.llvm.org/D87335 to make it in though in case there are any modifications to that patch.

Walter will put up the RFC for trace plug-ins and if that differs from this patch much we can modify this to match. I don't see a need to hold up the patch since this won't affect anyone as it is new functionality that doesn't affect normal debugging or other workflows.

wallace updated this revision to Diff 292922.Sep 18 2020, 4:17 PM

Based on the discussion from D87335, I'm moving all the JSON utilities as non-member functions to the settings parser file. Eventually this can be refactored if that patch moves forward.

There aren't any other pending issues.

wallace edited the summary of this revision. (Show Details)Sep 18 2020, 4:17 PM
This revision was not accepted when it landed; it landed in state Needs Review.Sep 21 2020, 5:13 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
MaskRay added inline comments.
lldb/source/Target/TraceSettingsParser.cpp
123

This requires sstream. Fixed.

128

The replacement here looks strange.

<regex> implementations tend to be large, slow and buggy. It'd be best if <regex> can be avoided. (llvm has llvm/Support/Regex.h)

wallace marked an inline comment as done.Sep 21 2020, 8:05 PM
wallace added inline comments.
lldb/source/Target/TraceSettingsParser.cpp
128

will do!

labath added a comment.Oct 1 2020, 7:11 AM

I have two high-level remarks/questions about this:

  • I am surprised that it was not necessary to create a special process plugin for this purpose. I have a feeling one will be necessary sooner or later because of the need to customize the step/continue/etc. flows. Currently, this will probably produce very bizarre if one tries to execute those commands. The reason I'm bringing this up is because of the Target::GetTrace method being added in the next patch. If there is a trace-specific process class, then it might make sense for this class to hold the trace object instead of adding the GetTrace method on every Target object out there even though most targets will have that null. I don't know if that will really be the case, but I think it's something worth keeping in mind as you work on the subsequent patches.
  • I am puzzled by the TraceXXX vs. TraceXXXSettingsParser duality. The two classes are very tightly coupled, so it's not clear to me what is the advantage of separating it out this way (one could just move all the relevant methods directly into the Trace class. What's the reason for this design?
wallace marked an inline comment as done.Thu, Oct 1, 11:57 AM
wallace added subscribers: myler, aprantl, Chirag and 13 others.
  • I am surprised that it was not necessary to create a special process

plugin for this purpose. I have a feeling one will be necessary sooner or
later because of the need to customize the step/continue/etc. flows.
Currently, this will probably produce very bizarre if one tries to execute
those commands. The reason I'm bringing this up is because of the
Target::GetTrace method being added in the next patch. If there is a
trace-specific process class, then it might make sense for this class to
hold the trace object instead of adding the GetTrace method on every Target
object out there even though most targets will have that null. I don't know
if that will really be the case, but I think it's something worth keeping
in mind as you work on the subsequent patches.

Very good remark. Probably we'll end up doing that when we start
implementing reverse debugging. The tricky thing we'll need to solve in the
future is seamlessly transition between a trace and a live process. For
example, when reverse debugging a live process, you might be stopped at a
breakpoint in the trace, then you do "continue", it reaches the end of the
trace, and then it resumes the live process. I still haven't decided what
we'll propose for this, but probably we'll have to change a lot of the
current code to make that happen nicely.

  • I am puzzled by the TraceXXX vs. TraceXXXSettingsParser duality. The

two classes are very tightly coupled, so it's not clear to me what is the
advantage of separating it out this way (one could just move all the
relevant methods directly into the Trace class. What's the reason for this
design?

Well, there's definitely coupling, but there are two reasons. One is that
the Trace of a live process won't need any parsing. Parsing is only used
when loading a trace from a json file. That makes parsing one of the two
main ways to create a Trace. And besides, I think that the single
responsibility principle is good to follow. The Parser class does the
parsing, and the Trace class holds an actual trace.

Il giorno gio 1 ott 2020 alle ore 07:12 Pavel Labath via Phabricator <
reviews@reviews.llvm.org> ha scritto:

labath added a comment.

I have two high-level remarks/questions about this:

  • I am surprised that it was not necessary to create a special process

plugin for this purpose. I have a feeling one will be necessary sooner or
later because of the need to customize the step/continue/etc. flows.
Currently, this will probably produce very bizarre if one tries to execute
those commands. The reason I'm bringing this up is because of the
Target::GetTrace method being added in the next patch. If there is a
trace-specific process class, then it might make sense for this class to
hold the trace object instead of adding the GetTrace method on every Target
object out there even though most targets will have that null. I don't know
if that will really be the case, but I think it's something worth keeping
in mind as you work on the subsequent patches.

  • I am puzzled by the TraceXXX vs. TraceXXXSettingsParser duality. The two

classes are very tightly coupled, so it's not clear to me what is the
advantage of separating it out this way (one could just move all the
relevant methods directly into the Trace class. What's the reason for this
design?

Repository:

rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION

https://reviews.llvm.org/D85705/new/

https://reviews.llvm.org/D85705

clayborg added a comment.EditedThu, Oct 1, 12:16 PM

I have two high-level remarks/questions about this:

  • I am surprised that it was not necessary to create a special process plugin for this purpose. I have a feeling one will be necessary sooner or later because of the need to customize the step/continue/etc. flows. Currently, this will probably produce very bizarre if one tries to execute those commands. The reason I'm bringing this up is because of the Target::GetTrace method being added in the next patch. If there is a trace-specific process class, then it might make sense for this class to hold the trace object instead of adding the GetTrace method on every Target object out there even though most targets will have that null. I don't know if that will really be the case, but I think it's something worth keeping in mind as you work on the subsequent patches.

Eventually we should probably have a HistoryProcess class, much like the HistoryThread. This new class could then be used as a base class for the ProcessMachCore and ProcessELFCore and ProcessMinidump. This base class can then stub out the flow control (continue, step, stop etc) and all other virtual methods that are stubbed out in these classes. It can also add accessors for adding HistoryThread objects if needed. So we should probably move this way at some point.

When we have a live process, we will use the current process when we can, but the process and target will need to coordinate to be able to know if we have reversed stepped into the past so that the flow control (forward/reverse continue, forward/reverse step) can do the right things.

I totally agree with you, I didn't think about creating custom callbacks

P. I'll refactor the code accordingly. Thanks, man

Il giorno ven 2 ott 2020 alle ore 01:51 Pavel Labath <pavel@labath.sk> ha
scritto: