This is an archive of the discontinued LLVM Phabricator instance.

Add a "Trace" plug-in to LLDB to add process trace support in stages.
ClosedPublic

Authored by wallace on Aug 10 2020, 9:32 PM.

Details

Summary

This is the first in a series of patches that will adds a new processor trace plug-in to LLDB.

The idea for this first patch to to add the plug-in interface with simple commands for the trace files that can "load" and "dump" the trace information. We can test the functionality and ensure people are happy with the way things are done and how things are organized before moving on to adding more functionality.

Processor trace information can be view in a few different ways:

  • post mortem where a trace is saved off that can be viewed later in the debugger
  • gathered while a process is running and allow the user to step back in time (with no variables, memory or registers) to see how each thread arrived at where it is currently stopped.

This patch attempts to start with the first solution of loading a trace file after the fact. The idea is that we will use a JSON file to load the trace information. JSON allows us to specify information about the trace like:

  • plug-in name in LLDB
  • path to trace file
  • shared library load information so we can re-create a target and symbolicate the information in the trace
  • any other info that the trace plug-in will need to be able to successfully parse the trace information
    • cpu type
    • version info
    • ???

A new "trace" command was added at the top level of the LLDB commmands:

  • "trace load"
  • "trace dump"

I did this because if we load trace information we don't need to have a process and we might end up creating a new target for the trace information that will become active. If anyone has any input on where this would be better suited, please let me know. Walter Erquinigo will end up filling in the Intel PT specific plug-in so that it works and is tested once we can agree that the direction of this patch is the correct one, so please feel free to chime in with ideas on comments!

Diff Detail

Event Timeline

clayborg created this revision.Aug 10 2020, 9:32 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2020, 9:32 PM
Herald added subscribers: dang, mgorny. · View Herald Transcript
clayborg requested review of this revision.Aug 10 2020, 9:32 PM

Unresolved issues in this patch:

  • For "trace load", I get the plugin for the JSON file by matching it up with the "name" field in the JSON, but I don't store the "trace_sp" anywhere. We will need to store it with the target that we create, or for later commands add it to a target that is stopped when the trace data is loaded via the process interface (through lldb-server is the current thinking for this).
  • "trace dump" does nothing for now, but this is what we can use to test that "trace load" worked and was able to create a target.

This patch keeps things really simple for now so we can checkin small testable patches.

clayborg retitled this revision from First patch that adds processor trace. to Add a "Trace" plug-in to LLDB to add process trace support in stages..Aug 10 2020, 9:37 PM
davide added a reviewer: vsk.Aug 11 2020, 11:04 AM

The idea for the JSON file is that it must have a "name" (or maybe "plug-in"?) to identify which trace plug-in and a "trace-file" at the very least:

{
  "name": "intel-pt", 
  "trace-file": "/tmp/trace-info" 
}

The "name" field would match the name of the trace plug-in in LLDB.

For intel PT they need info about the CPU. Some intel PT libraries have a structure like:

// A cpu vendor.
enum pt_cpu_vendor {
    pcv_unknown,
    pcv_intel
};

struct pt_cpu {
  enum pt_cpu_vendor vendor;
  uint16_t family;
  uint8_t model;
  uint8_t stepping;
};

So the trace JSON file for intel-pt might look like:

{ 
  "name": "intel-pt", 
  "trace-file": "/tmp/a",
  "pt_cpu": {
    "vendor": "intel",
    "family": 10,
    "model": 2,
    "stepping": 1
  }
}
vsk added a comment.Aug 11 2020, 11:51 AM

This looks very cool, thanks @clayborg! I think using JSON to describe the trace data (what kind of trace is this, what's in it, etc.) sounds reasonable.

For "trace load", I get the plugin for the JSON file by matching it up with the "name" field in the JSON, but I don't store the "trace_sp" anywhere. We will need to store it with the target that we create, or for later commands add it to a target that is stopped when the trace data is loaded via the process interface (through lldb-server is the current thinking for this).

Have you considered what might happen if there are multiple targets covered by a single trace? Strawman proposal: would it make sense to register the trace with a Debugger instance? This can be a list of traces if it makes sense to support debugging more than one trace at a time.

"trace dump" does nothing for now, but this is what we can use to test that "trace load" worked and was able to create a target.

It'd be great to have some test for this, even if all 'trace load' does at this point is print an error about bad JSON input.

lldb/include/lldb/Target/Trace.h
28

nit: 'may have', or omit the 'may'?

lldb/source/Commands/CommandObjectTrace.cpp
2

nit: formatting seems a bit off here?

lldb/source/Commands/Options.td
207

nit: seems like an accidental change here?

754

ditto

In D85705#2211073, @vsk wrote:

This looks very cool, thanks @clayborg! I think using JSON to describe the trace data (what kind of trace is this, what's in it, etc.) sounds reasonable.

For "trace load", I get the plugin for the JSON file by matching it up with the "name" field in the JSON, but I don't store the "trace_sp" anywhere. We will need to store it with the target that we create, or for later commands add it to a target that is stopped when the trace data is loaded via the process interface (through lldb-server is the current thinking for this).

Have you considered what might happen if there are multiple targets covered by a single trace? Strawman proposal: would it make sense to register the trace with a Debugger instance? This can be a list of traces if it makes sense to support debugging more than one trace at a time.

I hadn't thought of that but that does make sense!

We can work this all into the JSON format. We should actually make a schema for the common parts of information that should be represented in the JSON and also allow each plug-in to supply a schema for the parts that is requires in the JSON.

Some ideas that this information should contain:

  • array of process infos dictionaries
  • process info dictionary
    • pid
    • array of shared library dictionaries
  • shared library dictionary:
    • original path
    • UUID if available
    • MD5 of file if no UUID
    • load location
    • optional URL to download

And LLDB will easily be able to load up N targets with everything setup correctly.

"trace dump" does nothing for now, but this is what we can use to test that "trace load" worked and was able to create a target.

It'd be great to have some test for this, even if all 'trace load' does at this point is print an error about bad JSON input.

I agree! If we like this format, Walter Erquinigo can commandeer this revision and fill in actual Intel PT guts and have this work. The idea is I am going to get the infrastructure in place and once we work this out, I will let Walter take over the patch and actually fill it in with real Intel PT stuff and test it fully.

lldb/source/Commands/Options.td
207

yeah, my editor removes trailing spaces.

vsk added a comment.Aug 11 2020, 3:35 PM
In D85705#2211073, @vsk wrote:

This looks very cool, thanks @clayborg! I think using JSON to describe the trace data (what kind of trace is this, what's in it, etc.) sounds reasonable.

For "trace load", I get the plugin for the JSON file by matching it up with the "name" field in the JSON, but I don't store the "trace_sp" anywhere. We will need to store it with the target that we create, or for later commands add it to a target that is stopped when the trace data is loaded via the process interface (through lldb-server is the current thinking for this).

Have you considered what might happen if there are multiple targets covered by a single trace? Strawman proposal: would it make sense to register the trace with a Debugger instance? This can be a list of traces if it makes sense to support debugging more than one trace at a time.

I hadn't thought of that but that does make sense!

We can work this all into the JSON format. We should actually make a schema for the common parts of information that should be represented in the JSON and also allow each plug-in to supply a schema for the parts that is requires in the JSON.

Some ideas that this information should contain:

  • array of process infos dictionaries
  • process info dictionary
    • pid
    • array of shared library dictionaries
  • shared library dictionary:
    • original path
    • UUID if available
    • MD5 of file if no UUID
    • load location
    • optional URL to download

And LLDB will easily be able to load up N targets with everything setup correctly.

I'm generally on board with this. On Darwin we tend not to have the original path to a DSO available, so I'd argue that that ought to be optional as well. Really the only mandatory stuff for debugging purposes is the load address and some kind of UUID.

"trace dump" does nothing for now, but this is what we can use to test that "trace load" worked and was able to create a target.

It'd be great to have some test for this, even if all 'trace load' does at this point is print an error about bad JSON input.

I agree! If we like this format, Walter Erquinigo can commandeer this revision and fill in actual Intel PT guts and have this work. The idea is I am going to get the infrastructure in place and once we work this out, I will let Walter take over the patch and actually fill it in with real Intel PT stuff and test it fully.

clayborg updated this revision to Diff 285177.Aug 12 2020, 1:50 PM

Fix inline comment issues from vsk.

I only added a few non-substantial inline comments.

lldb/include/lldb/Target/Trace.h
39

I'm just curious: What's the purpose of this? Is the destructor = 0 in the base class and we want to not have to write this line in Trace implementations?

lldb/source/Commands/CommandObjectTrace.cpp
88

Would it be possible to early-exit-ify this function? Perhaps by using a lambda for the result.AppendErrorWithFormat error return part?

lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.h
25

FYI: The Doxygen way of writing this is

/// Static Functions.
/// \{
  static void Initialize();
  static void Terminate();
/// \}

Thanks for doing this. There are some basic bits to make this that I was unaware of, so this saves a lot of time for me :)
I'll take over this patch next week and fill it with intel-pt stuff, I already have a good amount of code that is quite mergeable with this patch.

wallace commandeered this revision.Aug 21 2020, 12:10 PM
wallace edited reviewers, added: clayborg; removed: wallace.
wallace updated this revision to Diff 287079.EditedAug 21 2020, 12:18 PM
  • Added libipt as a dependency to build the new intel pt plugin. It's merely a copy paste of what the old intel pt plugin does.
  • Added a test with a sample trace.json definition file. It includes a simple a.out object file, the source and an intel pt trace that goes from main to the final return 0. The test checks that the plugin does the parsing of the structured data correctly and creates the target, process and modules correctly.
  • Added the necessary parsing functionalities in Trace.cpp to make sense of the structured data.
  • Added a simple parsing of the intel pt-specific information in the structured data.

I didn't implement Dump nor Decoding, as in this case the test is quite comprehensive regarding Load and I want to keep the diff small. I'll implement the other features later.

clayborg requested changes to this revision.Aug 21 2020, 1:02 PM

Lots of little things regarding the encoding and format of the JSON.

lldb/source/Target/Trace.cpp
38

Dump schema here as part of the error?

54

dump schema when we have an error?

66

dump schema?

106–107

rename "filePath" to just "path"?

110

Should load address be encoded as an integer to begin with? I know it is less readable as an integer since we can't use hex numbers It would simplify the logic here. If we want to use strings, we should make a helper function that decodes an address from a key's value in a dictionary so we can re-use elsewhere.

110–111

does JSON typically use camel case? Should this be "load-address"?

117–120

We shouldn't assume that "file_path" is relative. This code should be something like:

FileSpec file_spec(file_path);
if (file_spec.IsRelative())
  file_spec.PrependPathComponent(m_info_dir);
128

We should store a target triple as a mandatory key/value pair in the top level trace JSON file and access via a getter. Then we should also fill out a ModuleSpec with a path, UUID (optional) and architecture to make sure we don't end up getting the wrong file:

ModuleSpec module_spec;
module_spec.GetFileSpec() = file_spec;
module_spec.GetArchitecture() = Trace::GetArchitecture(); // This would be a new accessor that will hand out a architecture for the "triple" setting in the JSON trace settings
StringRef uuid_str;
if (module->GetValueForKeyAsString("uuid", uuid_str))
  module_spec.GetUUID().SetFromStringRef(uuid_str);
lldb::ModuleSP module_sp = target_sp->GetOrCreateModule(module_spec, false /* notify */, &error);

We wouldn't want to accidentally load "/usr/lib/libc.so" on a different machine with the wrong architecture since "libc.so" can exist on many systems.

132

Remove this since we create the module above already in the target in my inline comment code changes.

182

This should be more generic like:

Trace::ParseSettings(...)

We will add more top level key/value pairs that follow a structured data schema which we should make available through a command eventually, so lets not limit this to just processes. We might add architecture or target triple, and much more in the future.

199

Maybe we should confine all plug-in specific settings to a key/value pair where the key is "arguments" or "plug-in-arguments" and the value is a dictionary. This will help ensure that no plug-in specific settings can ever conflict.

{ 
  "type": "intel-pt",
  "processes": [...],
  ....  (other settings that are for this "lldb_private::Trace::ParseSettings(...)")
  "arguments": {
      ... (plug-in specific arguments go here)
  }
}

Or maybe a structure like:

{
  "plug-in": {
    "name": "intel-pt",
    "arguments: {
      ... (TraceIntelPT plug-in specific arguments go here)
    }
  }
  "processes": [...],
  "triple": "armv7-apple-ios",
  ....  (other settings that are for this "lldb_private::Trace::ParseSettings(...)")
}
This revision now requires changes to proceed.Aug 21 2020, 1:02 PM
wallace marked 12 inline comments as done.Aug 24 2020, 4:16 PM
wallace added inline comments.
lldb/source/Target/Trace.cpp
110

The way this is implemented is that it's expecting a string number in any radix. You can pass "123123" or "0xFFAABBFF" for example. You cannot pass it directly as a simple number because JSON doesn't support 64-bit integers

110–111

yes, it's camel case typically

wallace updated this revision to Diff 287518.Aug 24 2020, 4:19 PM
  • Addressed comments.
  • Added schemas for both the base class and the implementation plugins.
  • Added two tests checking the error messages along with the schema when parsing failed.
  • Made some general clean up of the code
wallace updated this revision to Diff 287527.Aug 24 2020, 5:18 PM
  • Added a command that shows the schema of a given trace plugin.
  • Added a test for such command

A large part of this patch is concerned with parsing which worries me from a maintenance perspective. Did you consider using Yaml I/O? While I'm not a particularly big fan of the format, the benefits of being able to (de)serialize any class by implementing the appropriate traits are quite significant. We already have traits implemented for a bunch of utility classes, such as ArchSpec and FileSpec which we could reuse for this. I know changing the format would be invasive, but I think it might be worth it in the long term.

That's a very good idea! I'll revisit this patch then. I was not fond of doing the manual parsing but I hadn't found a more automated way.

clayborg requested changes to this revision.Aug 25 2020, 3:55 PM

Looking better. The main thing we need to do it modify StructuredData classes a bit by moving a lot of the static functions we are using here into the appropriate classes. See inlined comments.

lldb/source/Target/Trace.cpp
60

Do we want an optional "trace-file" here in case trace files can contain all data for all processes?

63

Do we want an optional "trace-file" here in case trace files can contain all data for a single process?

67

Do all trace files have one file per thread? Should this be "trace-file" instead of "traceFile"?

Might make sense to make this optional and also include an optional traceFile at the process level and possibly one at the root level? See inline comments above

72

This should probably be optional as we won't always have the file copied into the trace directory?

Do we want a "system-path" for the original path here?

73

"load-address" instead of "loadAddress" and should be an integer instead of a string.

115–124

This entire function should go into the StructuredData::Array class for re-use

126–134

This entire function should go into StructuredData::Dictionary so other code can re-use this.

140

Seems like we should modify GetValueForKeyAsInteger to take a pointer to an error as the 3rd parameter here? Then that can return an appropriate error in case there is a "tid" entry, but it isn't an integer.

152

Seems like we should modify GetValueForKeyAsArray to take an pointer to an error as the 3rd parameter here? Then that can return an appropriate error in case there is a "threads" entry, but it isn't an array.

157

This functionality should be moved into StructuredData::Array::ExtractItemAsArray(...)

169

"load-address" might be a better identifier and it is also weird to store an address as a string. Should be an integer.

194–195

This will need to be able to fail gracefully here. We might not always have the module file on disk. We might need to create a Placeholder module like the ProcessMinidump does.

This revision now requires changes to proceed.Aug 25 2020, 3:55 PM

After speaking with Walter a bit, key names in the JSON should be camel case. Many languages (Swift and JavaScript) can auto import JSON and create objects and use the key names as member variable names, and if they are camelCase, then no name conversion needs to happen.

A large part of this patch is concerned with parsing which worries me from a maintenance perspective. Did you consider using Yaml I/O? While I'm not a particularly big fan of the format, the benefits of being able to (de)serialize any class by implementing the appropriate traits are quite significant. We already have traits implemented for a bunch of utility classes, such as ArchSpec and FileSpec which we could reuse for this. I know changing the format would be invasive, but I think it might be worth it in the long term.

So the nice thing about using StructuredData is it can be in any format: JSON, XML, Apple property list, YAML etc. It seems like the functions that were added to ArchSpec and FileSpec for the YAML I/O could be converted to use StructuredData and then any of the formats would work.

A large part of this patch is concerned with parsing which worries me from a maintenance perspective. Did you consider using Yaml I/O? While I'm not a particularly big fan of the format, the benefits of being able to (de)serialize any class by implementing the appropriate traits are quite significant. We already have traits implemented for a bunch of utility classes, such as ArchSpec and FileSpec which we could reuse for this. I know changing the format would be invasive, but I think it might be worth it in the long term.

So the nice thing about using StructuredData is it can be in any format: JSON, XML, Apple property list, YAML etc. It seems like the functions that were added to ArchSpec and FileSpec for the YAML I/O could be converted to use StructuredData and then any of the formats would work.

If we invest in creating a library like Yaml IO for StructuredData, which allows you to declaratively specify how a class gets (de)serialized, then I'm all for using it here and for the reproducers. I think that would be really, really valuable. Given the way that structured data works and how similar it is to how these formats look, I'm also fairly confident it can be done. Still it would be a significant amount of work. But barring that I don't see a benefit from switching from YAML I/O to StructuredData, at least not for the reproducers and not really for this either. Unless we need to be able to support all these formats? Like I said, I don't really care about YAML. What I do care about is that I don't want to be in the business of writing a parser for every data structure I need to serialize. That's really where Yaml I/O shines, even though it has other drawbacks, like the traits themselves being verbose and the parser being far from fast.

wallace added a comment.EditedAug 26 2020, 1:54 PM

Given that this feature is intended to be used by non debugger developers, I want to stick to using JSON as a default input format, as it's more ubiquitous than YAML and I prefer to make this decision based on the user experience than on the technical solution. I like the YAML parser though, but converting JSON to YAML for parsing might throw away relevant error messaging.

wallace updated this revision to Diff 288145.EditedAug 26 2020, 6:01 PM
  • As error messaging is important when parsing, I create a new set of functions in StructuredData for parsing that return an llvm::Expected. This allowed me to move the helper functions for parsing from the Trace file to StructuredData, so that in the future other contributors can

use those methods when needed.

  • I modified the schema to allow for the "traceFile" entry at several levels.
  • "systemPath" was also added for the module entry
  • I didn't use the PlaceholderObject yet, as Greg suggested. That might involve a few other changes and I need to see if that class is sufficient for decoding intel-pt traces. However, as soon as I'm working on the decoding part, I'll move to PlaceholderObject, as it will be nice to avoid failing hard if a module is not available.
  • I simplified the code here and there as well.
  • I'm still using strings for addresses, like "0xFFFFFFFF" or "12312412". After checking some recommendations on stack overflow, using strings seems to be a good practice for having compatibility with JS, which has 54-bit sized integers. I imagine that these trace files might be send over the network and JS could potentially be used then.

Given that this feature is intended to be used by non debugger developers, I want to stick to using JSON as a default input format, as it's more ubiquitous than YAML and I prefer to make this decision based on the user experience than on the technical solution.

Alright, fair enough I guess. In that case I would prefer to split off the parser from the Trace data structure.

I like the YAML parser though, but converting JSON to YAML for parsing might throw away relevant error messaging.

I don't think anyone suggested this.

One more thing about the parser. I really like the usage of llvm::Error/Excepted/Optional in this path btw. However, these functions take a non-const ref to classes like Process, Thread, etc. It seems like they might be left in a potentially inconsistent state when we error out. Is that something to worry about or are we guaranteed to discard them in the caller if an error occurred? Could they for instance return an Expected<Debugger> instead?

lldb/include/lldb/Target/Trace.h
87

StringRef

106

StringRef

128

StringRef

151

StringRef

lldb/include/lldb/Utility/StructuredData.h
196 ↗(On Diff #288145)

StringRef

lldb/source/Plugins/Trace/intel-pt/CMakeLists.txt
4

Move this up to where this directory is included. e.g.

if (LLDB_BUILD_INTEL_PT)
  add_subdirectory(intel-pt)
endif()

That way a change to the CMake file won't trigger a reconfig if LLDB_BUILD_INTEL_PT is false.

23

Why not pass this in the first place? PATHS will be checked *in addition* to the default locations.

27

If you REQUIRED to find_library CMake will take care of the error reporting.

lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.h
23

Redundant.

55

I think it would be nice to separate the plugin from the parser.

lldb/source/Target/Trace.cpp
28

StringRef

51

StringRef

147

StringRef should always be passed by value.

lldb/test/API/commands/trace/intelpt-trace/trace.json
2 ↗(On Diff #288145)

It seems like the key here should be trace rather than "plugin"?

Thanks for the review! Those are very useful comments. So, I'll split the parsing out from the Trace object. Regarding a possible inconsistent state, you are right. That could happen. The targets, modules and processes used in the parsing are all created there, and it should be enough to only "commit" those targets if there has been no error. I'll make the necessary updates.

If the format is going to be json, why not use llvm::json from llvm/Support/JSON.h? That's what we been migrating most of the existing stuff to already, so going using that for new code makes perfect sense.

clayborg requested changes to this revision.Aug 27 2020, 12:50 PM

If the format is going to be json, why not use llvm::json from llvm/Support/JSON.h? That's what we been migrating most of the existing stuff to already, so going using that for new code makes perfect sense.

I am fine with this if we are ok always using JSON. StructuredData, when it parses JSON, is backed by llvm/Support/JSON.h, so we are already using it. If we don't need the flexibility to use other structured data formats we can do this.

lldb/include/lldb/Target/Trace.h
158

I know I suggested we might need a trace file for all process, one per process, or one per thread in the JSON schema. The question is do we need to add support for this right away? Might be easier to go with one trace file for now unless IntelPT already supports each of these?

lldb/source/Target/Trace.cpp
55

Can we inline the plug-ins schema right into this stream instead of putting the "<<<plugin_schema>>>" text in here?

This revision now requires changes to proceed.Aug 27 2020, 12:50 PM
wallace updated this revision to Diff 288496.Aug 27 2020, 5:53 PM

Addressed comments

  • Now using StringRef correctly whenever possible.
  • Revert back to using "traceFile" only inside the thread section. That's how intel-pt works at the moment and, as Greg suggested, we can change the schema in the future if needed.
  • Now the global schema construction inlines the plug-in schema.
  • Simplified the CMake logic
  • Separated the parser from the plugin. Now new plugins need to derive their own Trace implementation along with their own Parser.
  • I'm still using StructuredData for the parsing. There's a chance that once we release this feature users will want support for other formats besides JSON, so for now I prefer to ask for JSON input but parse in a format-agnostic way. If eventually no one needs any other format, we can switch to JSON-only parsing.
  • Also, now the Targets created during parsing are preserved only if the parsing succeeded. Otherwise, the Debugger object is left unchanged.
  • I'm still using StructuredData for the parsing. There's a chance that once we release this feature users will want support for other formats besides JSON, so for now I prefer to ask for JSON input but parse in a format-agnostic way. If eventually no one needs any other format, we can switch to JSON-only parsing.

That argument goes both ways. We could say that if anyone wants to add a non-json format, he can change the code to accept his preferred format. Overall, I am very sceptical of the StructuredData's ability to abstract diverse formats in a meaningful way. I doubt that a natural XML representation of this data would map to the same abstract format as this json and that it could be parsed by the same code -- simply because xml conventions are different, and xml has features that json doesn't (attributes, for one).

And since this file is just a descriptor, and actual plugin-specific data is contained in other files, I don't see why it's format couldn't be fixed.

lldb/test/API/commands/trace/intelpt-trace/trace.json
11 ↗(On Diff #288496)

What if one of the processes is 64-bit and the other 32? It sounds like this should be a property of a particular process.

You changed my mind. I'll move to Support/JSON.h

lldb/test/API/commands/trace/intelpt-trace/trace.json
11 ↗(On Diff #288496)

That's a very good point. Will do so

wallace updated this revision to Diff 288739.Aug 28 2020, 5:53 PM

Addressed all comments.

  • Wwitched to using llvm::json instead of StructuredData. Fortunately, the logic is pretty much the same.
  • Moved the "triple" field to the process level

A few simple changes from inlined comments. Other than that looks good. Should get another OK from others. Also not sure if the JSON stuff in llvm needs to be done separately?

lldb/include/lldb/Target/Trace.h
116

would std::unique_ptr<> might be better? Are we actually handing out multiple instances of this and sharing it anywhere?

lldb/include/lldb/lldb-private-interfaces.h
14

We should avoid this import and use forward declaration here below the includes:

namespace llvm {
  namespace json {
    class Object;
  }
}
lldb/source/Commands/Options.td
206–207

revert whitespace only changes.

754

revert whitespace only changes.

wallace updated this revision to Diff 288788.Aug 29 2020, 10:25 AM
  • Removed whitespace only changes.
  • Simplified the way a SettingsParser is created, now there's no need to passinga a pointer around.
  • Fixed the forward declaration in lldb-private-interfaces.

This should address all comments so far except for the llvm::json changes. I'm okay with creating another patch with those changes if the reviewers prefer that. They are simple, though.

wallace marked 22 inline comments as done.Aug 29 2020, 10:28 AM
clayborg accepted this revision.Aug 31 2020, 2:48 PM

LGTM now. Probably should get an ok from one more person.

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTSettingsParser.cpp
1 ↗(On Diff #288788)

Remove "//" after filename

wallace updated this revision to Diff 289285.Sep 1 2020, 2:33 PM

fix nit. Waiting for an additional approval

aprantl added inline comments.Sep 1 2020, 2:40 PM
lldb/source/Commands/CommandObjectTrace.h
2

Can you add -*- C++ -*- markers the the .h files you are adding? Otherwise tools (like phabricator) will think it's a C header.

wallace updated this revision to Diff 289301.Sep 1 2020, 3:55 PM

fix header c++ declaration

I'm pretty sure the json changes should get a separate review. The json owner (let's call him @sammccall) might have a different idea on how to do this thing...

wallace updated this revision to Diff 290603.Sep 8 2020, 4:11 PM
  • Moved the JSON changes to another diff.
  • Addressed all the comments so far.

As a side note, I'll work on a RFC for the LLDB mailing list to get feedback on the overall design of the feature, as @labath suggested. However, Greg and I want to get these patches in, as we need this functionality soon. We can easily make modifications whenever suggestions come, following an iterative approach to development.

clayborg accepted this revision.Sep 8 2020, 5:01 PM

This looks good to me. We will need to wait until the JSON changes from https://reviews.llvm.org/D87335 to make it in though in case there are any modifications to that patch.

Walter will put up the RFC for trace plug-ins and if that differs from this patch much we can modify this to match. I don't see a need to hold up the patch since this won't affect anyone as it is new functionality that doesn't affect normal debugging or other workflows.

wallace updated this revision to Diff 292922.Sep 18 2020, 4:17 PM

Based on the discussion from D87335, I'm moving all the JSON utilities as non-member functions to the settings parser file. Eventually this can be refactored if that patch moves forward.

There aren't any other pending issues.

wallace edited the summary of this revision. (Show Details)Sep 18 2020, 4:17 PM
This revision was not accepted when it landed; it landed in state Needs Review.Sep 21 2020, 5:13 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
MaskRay added inline comments.
lldb/source/Target/TraceSettingsParser.cpp
123 ↗(On Diff #293299)

This requires sstream. Fixed.

128 ↗(On Diff #293299)

The replacement here looks strange.

<regex> implementations tend to be large, slow and buggy. It'd be best if <regex> can be avoided. (llvm has llvm/Support/Regex.h)

wallace marked an inline comment as done.Sep 21 2020, 8:05 PM
wallace added inline comments.
lldb/source/Target/TraceSettingsParser.cpp
128 ↗(On Diff #293299)

will do!

labath added a comment.Oct 1 2020, 7:11 AM

I have two high-level remarks/questions about this:

  • I am surprised that it was not necessary to create a special process plugin for this purpose. I have a feeling one will be necessary sooner or later because of the need to customize the step/continue/etc. flows. Currently, this will probably produce very bizarre if one tries to execute those commands. The reason I'm bringing this up is because of the Target::GetTrace method being added in the next patch. If there is a trace-specific process class, then it might make sense for this class to hold the trace object instead of adding the GetTrace method on every Target object out there even though most targets will have that null. I don't know if that will really be the case, but I think it's something worth keeping in mind as you work on the subsequent patches.
  • I am puzzled by the TraceXXX vs. TraceXXXSettingsParser duality. The two classes are very tightly coupled, so it's not clear to me what is the advantage of separating it out this way (one could just move all the relevant methods directly into the Trace class. What's the reason for this design?
wallace marked an inline comment as done.Oct 1 2020, 11:57 AM
wallace added subscribers: myler, aprantl, Chirag and 13 others.
  • I am surprised that it was not necessary to create a special process

plugin for this purpose. I have a feeling one will be necessary sooner or
later because of the need to customize the step/continue/etc. flows.
Currently, this will probably produce very bizarre if one tries to execute
those commands. The reason I'm bringing this up is because of the
Target::GetTrace method being added in the next patch. If there is a
trace-specific process class, then it might make sense for this class to
hold the trace object instead of adding the GetTrace method on every Target
object out there even though most targets will have that null. I don't know
if that will really be the case, but I think it's something worth keeping
in mind as you work on the subsequent patches.

Very good remark. Probably we'll end up doing that when we start
implementing reverse debugging. The tricky thing we'll need to solve in the
future is seamlessly transition between a trace and a live process. For
example, when reverse debugging a live process, you might be stopped at a
breakpoint in the trace, then you do "continue", it reaches the end of the
trace, and then it resumes the live process. I still haven't decided what
we'll propose for this, but probably we'll have to change a lot of the
current code to make that happen nicely.

  • I am puzzled by the TraceXXX vs. TraceXXXSettingsParser duality. The

two classes are very tightly coupled, so it's not clear to me what is the
advantage of separating it out this way (one could just move all the
relevant methods directly into the Trace class. What's the reason for this
design?

Well, there's definitely coupling, but there are two reasons. One is that
the Trace of a live process won't need any parsing. Parsing is only used
when loading a trace from a json file. That makes parsing one of the two
main ways to create a Trace. And besides, I think that the single
responsibility principle is good to follow. The Parser class does the
parsing, and the Trace class holds an actual trace.

Il giorno gio 1 ott 2020 alle ore 07:12 Pavel Labath via Phabricator <
reviews@reviews.llvm.org> ha scritto:

labath added a comment.

I have two high-level remarks/questions about this:

  • I am surprised that it was not necessary to create a special process

plugin for this purpose. I have a feeling one will be necessary sooner or
later because of the need to customize the step/continue/etc. flows.
Currently, this will probably produce very bizarre if one tries to execute
those commands. The reason I'm bringing this up is because of the
Target::GetTrace method being added in the next patch. If there is a
trace-specific process class, then it might make sense for this class to
hold the trace object instead of adding the GetTrace method on every Target
object out there even though most targets will have that null. I don't know
if that will really be the case, but I think it's something worth keeping
in mind as you work on the subsequent patches.

  • I am puzzled by the TraceXXX vs. TraceXXXSettingsParser duality. The two

classes are very tightly coupled, so it's not clear to me what is the
advantage of separating it out this way (one could just move all the
relevant methods directly into the Trace class. What's the reason for this
design?

Repository:

rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION

https://reviews.llvm.org/D85705/new/

https://reviews.llvm.org/D85705

clayborg added a comment.EditedOct 1 2020, 12:16 PM

I have two high-level remarks/questions about this:

  • I am surprised that it was not necessary to create a special process plugin for this purpose. I have a feeling one will be necessary sooner or later because of the need to customize the step/continue/etc. flows. Currently, this will probably produce very bizarre if one tries to execute those commands. The reason I'm bringing this up is because of the Target::GetTrace method being added in the next patch. If there is a trace-specific process class, then it might make sense for this class to hold the trace object instead of adding the GetTrace method on every Target object out there even though most targets will have that null. I don't know if that will really be the case, but I think it's something worth keeping in mind as you work on the subsequent patches.

Eventually we should probably have a HistoryProcess class, much like the HistoryThread. This new class could then be used as a base class for the ProcessMachCore and ProcessELFCore and ProcessMinidump. This base class can then stub out the flow control (continue, step, stop etc) and all other virtual methods that are stubbed out in these classes. It can also add accessors for adding HistoryThread objects if needed. So we should probably move this way at some point.

When we have a live process, we will use the current process when we can, but the process and target will need to coordinate to be able to know if we have reversed stepped into the past so that the flow control (forward/reverse continue, forward/reverse step) can do the right things.

I totally agree with you, I didn't think about creating custom callbacks

P. I'll refactor the code accordingly. Thanks, man

Il giorno ven 2 ott 2020 alle ore 01:51 Pavel Labath <pavel@labath.sk> ha
scritto: