This is an archive of the discontinued LLVM Phabricator instance.

[trace] Scaffold "thread trace dump instructions"
ClosedPublic

Authored by wallace on Oct 2 2020, 6:17 PM.

Details

Summary

Depends on D88841

As per the discussion in the RFC, we'll implement both

thread trace dump [instructions | functions]

This is the first step in implementing the "instructions" dumping command.

It includes:

  • A minimal ProcessTrace plugin for representing processes from a trace file. I noticed that it was a required step to mimic how core-based processes are initialized, e.g. ProcessElfCore and ProcessMinidump. I haven't had the need to create ThreadTrace yet, though. So far HistoryThread seems good enough.
  • The command handling itself in CommandObjectThread, which outputs a placeholder text instead of the actual instructions. I'll do that part in the next diff.
  • Tests

Depends on D88841.

Diff Detail

Event Timeline

wallace created this revision.Oct 2 2020, 6:17 PM
Herald added a project: Restricted Project. · View Herald Transcript
wallace requested review of this revision.Oct 2 2020, 6:17 PM
wallace edited the summary of this revision. (Show Details)Oct 2 2020, 6:18 PM
wallace edited the summary of this revision. (Show Details)Oct 2 2020, 7:10 PM
clayborg requested changes to this revision.Oct 3 2020, 12:15 AM

See inline comments about "thread trace dump instructions" as I am not quite sure what --offset means.

After we load a trace file, what is the default "offset" value? Is there a kind of "trace data position" that can be moved around? Does the offset default to the start of all trace instructions for a thread? The end of the trace data?

lldb/include/lldb/Target/Target.h
1122–1137

This should probably go on the lldb_private::Thread now if we are only going to ever dump information for a thread at a time.

lldb/source/Commands/Options.td
1011

This option text might need to be a bit more descriptive. What if the user types:

(lldb) thread trace dump --count 22

What would that display? 22 instructions forward from the current point we are at? If we load a trace file, will the position be at the end by default or at the beginning? I would think we would selec the end. Would this go 22 instructions back? That doesn't make much sense if we use in conjunction with the --offset.

1014–1015

So what happens if the user just types:

(lldb) thread trace dump

If the trace position offset defaults to zero, and the current position is the end of the data by default, would this display nothing? Everything? Or would it display the stats for this thread? like:

(lldb) thread trace dump
thread #1: tid = 1234, 4096 trace branches available, position = end
thread #2: tid = 2345, 2345 trace branches available, position = end
lldb/source/Plugins/Process/Trace/ProcessTrace.cpp
41–44

You must return "plugin_specified_by_name" here so that this plug-in only gets selected if it was specified by name. We don't want to have this ProcessTrace plug-in get selected by any real debug sessions. So this should be:

bool ProcessTrace::CanDebug(lldb::TargetSP target_sp,
                            bool plugin_specified_by_name) {
  return plugin_specified_by_name;
}

That way we can request this process plug-in by name.

If this is set to true, then when a target is shopping for a process plug-in by iterating through all installed Process plug-ins, if this function gets called before the ProcessGDBRemote or ProcessWindows it will take precedence over these other plug-ins. Typically a process plug-in will look at the target triple and look for an architecture, os, or vendor that matches and also look at the target's platform ("windows" for ProcessWindows) to determine if it can be used for debugging. In our case we should only be able to get a ProcessTrace instance back if we ask for it by name, it won't care about the target's triple, platform or anything else as it could be anything.

63–66

Probably shouldn't use the LoadCore functionality, see inline comments.

lldb/source/Plugins/Process/Trace/ProcessTrace.h
40

Seems weird to use LoadCore, see other inline comments.

lldb/source/Target/Target.cpp
3004–3013

move to lldb_private::Thread?

lldb/source/Target/TraceSettingsParser.cpp
68–73 ↗(On Diff #295947)

We might need a ThreadTrace instead of a HistoryThread, and it might be a good idea to do that now. Soon we will end up wanting to step around and continue and reverse continue then we will need to muck with the thread and set its stop reason to things like "hit breakpoint" and change the stack frames.

114–115 ↗(On Diff #295947)

We will need to make sure that this only happens when doing "trace load", and not when we do "process trace start" as we will have a real process at that point. We should talk about architecture as I have some ideas to make things work smoothly, but that can wait until we start to try and do more forward/reverse run/step stuff.

128–132 ↗(On Diff #295947)

Seems weird to use LoadCore here as this isn't a core file. Since we loaded the Process plug-in by name, you can cast the a Process * to be a ProcessTrace and call a custom method on it if we need to do custom setup. DidAttach might be better to override instead of LoadCore?

lldb/test/API/commands/trace/TestTraceDumpInstructions.py
41

What will this do by default? I would assume it would dump all instructions starting from the beginning and dump all the way to the end. Where is the default "offset" for these instructions right after a trace is loaded? Is it set to the end of the trace instructions? If so, I am not sure what I would expect "thread trace dump instructions" to do. Again I would assume it would dump them all? Can you elaborate on what you are thinking here?

45

use the long options as this makes it much clearer what is happening.

Also what does --offset mean? Is it an offset in bytes within the trace data? Is it the number of branches from the start of the all of the trace data? Number of branches from the end of the trace data? Will this need to be specialized for each trace plug-in?

49–50

Add a test for an invalid thread index id?

This revision now requires changes to proceed.Oct 3 2020, 12:15 AM
wallace updated this revision to Diff 296106.Oct 5 2020, 1:08 AM

Address comments:

  • Created a basic ThreadTrace class instead of using HistoryThread. I'll modify this class later when I add the decoding functionality and refactor the json file parsing logic.
  • Renamed offset to start_position, and added a clearer documentation about it. We'll have a current_position in each thread, and we'll print instructions in reverse order chronologically starting at current_position, unless start_position is specified. This matches gdb's implementation and works well with repeat commands, e.g. the user types "thread trace dump instructions" repeatedly showing a continous list of instructions from most recent to oldest.
  • Now using DidAttach instead of LoadCore for setting the newly created procesess state to stopped.
  • Moved the dump implementation from Target to Thread.
  • Fixed the CanDebug function.
  • Added a test for "thread trace dump instructions <invalid_thread_index>" and now using the long options instead of the short ones in the tests.

Some notes:

  • I think that TraceSettingsParser should be used only when constructing a trace from a json file. In the case of a live process, we can skip it and directly create the trace objects accordingly. So the json parser can assume that the process is defunct.
  • I haven't thought of making "thread trace dump" do anything. I'm thinking of having

thread trace info (for general information, including trace size, for example)
thread trace dump instructions
thread trace dump functions

labath added a comment.Oct 5 2020, 5:06 AM

Before going into the code details, could you give a high-level overview of the intended interactions between the Trace and ProcessTrace classes?

I am thinking about how to organize this so that we don't have too many criss-cross dependencies. My main question is whether it's expected that each "trace" plugin will also create its own ProcessTrace subclass.

lldb/source/Target/TraceSettingsParser.cpp
13 ↗(On Diff #296106)

This is a layering violation. We shouldn't have non-plugin code depend on plugin code. (I know there are some existing usages like that, but let's not add new ones -- people have been working for a long time to remove these.)

wallace planned changes to this revision.Oct 5 2020, 9:59 AM
wallace updated this revision to Diff 296254.EditedOct 5 2020, 12:17 PM

Rebased on top of https://reviews.llvm.org/D88841

  • Comments on the architecture of classes are in that diff
  • I was able to get rid of cross-plug-in dependencies except for RegisterContextHistory. The class is very simple, so I could move it to lldb/Core. What do you think?
clayborg requested changes to this revision.Oct 5 2020, 2:56 PM
clayborg added inline comments.
lldb/source/Commands/CommandObjectThread.cpp
2234–2243

This command does seem hard to abstract over all flavors of trace. Would it be better to have the trace plug-ins vend a command objects for "thread trace dump"? The idea would be something like "hey trace plug-in, do you have any commands you support for the 'thread trace dump' multiword command?". It could say "yes, I have one called 'instructions' and here is the CommandObjectSP

2248

We should override the repeat command method here:

const char *GetRepeatCommand(Args &current_command_args, uint32_t index) override;

This way if you type:

(lldb) thread trace dump instruction --offset 0 --count 32

And then hit enter, the next command it should run is:

(lldb) thread trace dump instruction --offset 32 --count 32

That you can keep dumping more instructions by just hitting ENTER.

lldb/source/Commands/Options.td
1008

Do we add a "--summary" option to do this command so that when this option is specified would dump the number of instructions that any threads from the might have? Like:

(lldb) thread trace dump instructions --summary
thread #1 has 4096 instructions
thread #2 has 234 instructions
1017–1020

Is there a "current position" that is maintained? Or is the current position where TraceThread currently is stopped?

lldb/source/Plugins/Process/Trace/ProcessTrace.h
19

So one issue is how do we eventually deal with debugging a live process that enables tracing. In that case we already have a real process class: ProcessGDBRemote most likely. We should avoid putting anything custom that is required from a process in this ProcessTrace class for when we actually have a real process class already. If we need to add anything, we will need to have virtual functions on the lldb_private::Process class that can call through to the Trace plug-in via its virtual functions as well to implement any functionality we might need.

Is this class solely going to be used for "trace load"?

lldb/test/API/commands/trace/TestTraceDumpInstructions.py
47

need to test the repeat command stuff here if we add support for:

const char *GetRepeatCommand(Args &current_command_args, uint32_t index) override {
This revision now requires changes to proceed.Oct 5 2020, 2:56 PM
labath added a comment.Oct 6 2020, 6:01 AM
  • Comments on the architecture of classes are in that diff

Where would that be? I couldn't find anything that would answer my question from the previous comment?

The reason I want to know that is that if each TraceXXX "plugin" is supposed to have/use/need a ProcessTraceXXX plugin, then I'd rather organize them such that they are closer together. OTOH, if ProcessTrace is supposed to work with all kinds of traces, then the current design makes perfect sense (though I am having trouble imagining how would that work).

  • I was able to get rid of cross-plug-in dependencies except for RegisterContextHistory. The class is very simple, so I could move it to lldb/Core. What do you think?

IIUC, the only place which uses it the PT trace plugin. That is ok. Some plugin-to-plugin dependencies (though not cycles, ideally) are fine, particularly for something called "utility". That said, I don't think that moving this class to the core libraries would be a particularly bad choice either...

labath added a comment.Oct 6 2020, 6:22 AM
  • Comments on the architecture of classes are in that diff

Where would that be? I couldn't find anything that would answer my question from the previous comment?

The reason I want to know that is that if each TraceXXX "plugin" is supposed to have/use/need a ProcessTraceXXX plugin, then I'd rather organize them such that they are closer together. OTOH, if ProcessTrace is supposed to work with all kinds of traces, then the current design makes perfect sense (though I am having trouble imagining how would that work).

I've now read the description of D88841, which kind of answers that. The reason I am not sure of this design is that I fear this will require a fairly big API surface between ProcessTrace and Trace classes in order to implement the more complicated commands like reverse-step. It also adds a bunch of dependency restrictions:

  • Trace class (being a core class) should not know anything about ProcessTrace (a plugin) -- this bit could be removed if we make ProcessTrace a non-plugin
  • ProcessTrace (being a generic plugin serving multiple traces) should not know anything about any specific trace plugin

OTOH, if ProcessTraceXXX is an internal affair of the TraceXXX plugin, then the two classes can freely cooperate (within reason), and can use any interfaces they see fit. And the TraceYYY plugin can use a completely different interface to communicate with its process class.

However, I don't think your design is bad, if you can pull it off, so feel free to carry on like you were planning. I'd just encourage you to, as you're working on it, to think about whether things can be made simpler with an different organization of things.

tatyana-krasnukha added inline comments.
lldb/source/Plugins/Process/Trace/ProcessTrace.h
19

One option is to implement btrace request in the ProcessGDBRemote and make remote stubs support it.

I'm also interested in live tracing for a custom process plugin which obtains instruction history in its own way. So, it would be good if a real process/thread provides data to the tracing plug-in.

wallace added inline comments.Oct 6 2020, 4:59 PM
lldb/source/Plugins/Process/Trace/ProcessTrace.h
19

Thanks @tatyana-krasnukha! I think I'll end up doing what you said. And definitely, step 2 of this project is to trace a live process.

19

Summarizing an offline conversation we had, we'd implement in this class only what is required for tracing a defunct process (i.e. from a json file). Otherwise, any trace logic should be in the main Process class, so that all trace plugins could reuse that logic.

Also, during a live debugging session, we don't want to create a new Process object to hold the trace. We'll try to keep a single process to keep the architecture simple.

wallace updated this revision to Diff 296866.EditedOct 7 2020, 10:25 PM
wallace edited the summary of this revision. (Show Details)

Addressed comments, which includes:

  • Implemented the requested repeat command logic, including tests
  • Now showing the total number of instructions as part of the dump command, which is something like

    thread #1: tid = 3842849, total instructions = 1000 instruction 0 instruction 1 instruction 2 ...
  • Moved some logic to the parent diff D88841, leaving this diff simpler.

Some notes:

  • In my current design, TraceProcess will be shared by all trace plug-ins when loading json trace sessions. For live processes, we'll simply use the actual Process instance used by the process, which means that most of the trace logic regarding processes will end up being in the base Process.
  • In my next diff I'll include the trace decoding functionality, which will remove the placeholder text printed when invoking the dump command.
  • As discussed offline with Greg, each thread will hold a pointer to the currently stopped instruction, and dumping instructions, printing backtraces, e.g. will use that as starting point.
clayborg accepted this revision.Oct 8 2020, 1:28 PM

LGTM. Pavel please chime in if you have any issues.

This revision is now accepted and ready to land.Oct 8 2020, 1:28 PM
wallace marked 17 inline comments as done.Oct 8 2020, 11:25 PM
wallace updated this revision to Diff 297127.Oct 8 2020, 11:43 PM

Rebase and made some cosmetic fixes.

labath added a comment.Oct 9 2020, 1:55 AM

I was waiting for the dependant patch to take shape. This looks ok to me -- just one small design question.

lldb/include/lldb/Target/Target.h
1120

const TraceSP & ?

lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.cpp
61

auto trace_instance = std::make_shared<TraceIntelPT>(...)

63

SetTrace(trace_instance)

lldb/source/Target/Thread.cpp
1617–1627 ↗(On Diff #297127)

Given the intended design of having one process (and thread) class serve multiple trace types, can this function do anything useful besides calling into the Trace object to perform trace-specific dumping ?

And if it cannot, couldn't we just skip it have have callers call the relevant Trace method directly?

wallace added inline comments.Oct 9 2020, 5:36 PM
lldb/source/Target/Thread.cpp
1617–1627 ↗(On Diff #297127)

That's a very good point! There's really no need to have this method here

wallace marked 3 inline comments as done.Oct 9 2020, 5:59 PM
wallace added inline comments.
lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.cpp
61

I can't do this because the constructor is private (specifically to std::shared_ptr).
I want to have this method as the main way to create instances because I need to use shared_ptr of this instance to set up everything correctly, and I can't do it from the constructor itself.
I've checked ways to make std::make_shared<TraceIntelPT>(...), but it looks like too much code that outweighs the benefit of the fancy allocator.

wallace updated this revision to Diff 297373.Oct 9 2020, 6:18 PM
  • Moved the thread dumping logic to Trace.h.
  • Did the small fixes that were requested.
  • Left a comment regarding the shared_ptr instantiation.
labath accepted this revision.Oct 12 2020, 12:54 AM
labath added inline comments.
lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.cpp
61

Ah, cool. Sorry for the noise.

This revision was landed with ongoing or failed builds.Oct 12 2020, 12:08 PM
This revision was automatically updated to reflect the committed changes.