This is an archive of the discontinued LLVM Phabricator instance.

[trace][intelpt] Support system-wide tracing [14] - Decode per cpu
ClosedPublic

Authored by wallace on May 25 2022, 10:03 AM.

Details

Summary

This is the final functional patch to support intel pt decoding per cpu.
It works by doing the following:

  • First, all context switches are split by tid and sorted in order. This produces a list of continuous executes per thread per core.
  • Then, all intel pt subtraces are split by PSB boundaries and assigned to individual thread continuous executions on the same core by doing simple TSC-based comparisons.
  • With this, we have, per thread, a sorted list of continuous executions each one with a list of intel pt subtraces. Up to this point, this is really fast because no instructions were actually decoded.
  • Then, each thread can be decoded by traversing their continuous executions and intel pt subtraces. An advantage of having these continuous executions is that we can identify if a continuous exexecution doesn't have intel pt data, and thus has a gap in it. We can later to more sofisticated comparisons to identify if within a continuous execution there are gaps.

I'm adding a test as well.

Diff Detail

Event Timeline

wallace created this revision.May 25 2022, 10:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 25 2022, 10:03 AM
wallace requested review of this revision.May 25 2022, 10:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 25 2022, 10:03 AM
wallace updated this revision to Diff 434081.Jun 3 2022, 11:31 AM
wallace edited the summary of this revision. (Show Details)

finish diff

wallace updated this revision to Diff 434093.Jun 3 2022, 11:50 AM

update test files

jj10306 requested changes to this revision.Jun 12 2022, 12:22 PM
jj10306 added inline comments.
lldb/include/lldb/Utility/TraceIntelPTGDBRemotePackets.h
83–85

why get rid of chronos here?

lldb/source/Plugins/Trace/intel-pt/LibiptDecoder.cpp
22–38

nit: Is there a reason these classes/structs are not declared in the header file?

100–102

is this what you intended to say here?

353

Why is this the case?

lldb/source/Plugins/Trace/intel-pt/LibiptDecoder.h
21–24

docs

47–53

docs

lldb/source/Plugins/Trace/intel-pt/PerfContextSwitchDecoder.cpp
16–82

why not put these declarations in the header file?

228

what's this doing here?

240

does this need to be a lambda? iiuc this is only called once (at the end of this function), so it seems like this could just be placed inline instead of being in a lambda

lldb/source/Plugins/Trace/intel-pt/PerfContextSwitchDecoder.h
2

do the structures/logic contained in this file (and its .cpp) belong with the other Perf logic of LLDB or should does this belong with the intelpt specific code?

125
lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp
43
64

I think naming this subtraces would help as it makes it makes the distinction between the subtraces and thread executions more clear while reading the code below.

84
93

perhaps I'm misunderstanding the intention of this counter, but won't this be incremented for all subtraces that don't belong to the specific ThreadContinousExecution currently being operated on instead of being incremented for the subtraces that don't belong to any ThreadContinousExecution?

121

Should this be named differently since this is doing much more than decoding the context switches (it splits the intel pt traces and correlates them with the context switch executions)?

lldb/source/Target/Trace.cpp
415
This revision now requires changes to proceed.Jun 12 2022, 12:22 PM
wallace added inline comments.Jun 14 2022, 3:26 PM
lldb/include/lldb/Utility/TraceIntelPTGDBRemotePackets.h
83–85

because chronos uses signed integers and we lose precision by using it

lldb/source/Plugins/Trace/intel-pt/LibiptDecoder.cpp
22–38

they are not used by anyone else, so just that

100–102

hehe yes

lldb/source/Plugins/Trace/intel-pt/LibiptDecoder.h
21–24

+1

47–53

+1

lldb/source/Plugins/Trace/intel-pt/PerfContextSwitchDecoder.cpp
16–82

they are not used externally, so I rather not expose them until we really need to

228

lol, forgive this poor man

240

I'm using this lambda trick to capture any possible error returned by the complex logic in it and then append some text to this error. This would be equivalent to moving the lambda to its own function. I'm fine with creating another function for this case, but the lambda seems fine to me

lldb/source/Plugins/Trace/intel-pt/PerfContextSwitchDecoder.h
2

the other Perf.h file is lldb-server only, so they can't be together nor merged :(

125

+1

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTMultiCoreDecoder.cpp
43

owch, thank you

64

+1

84

thanks

93

not really. In this case it is any ThreadContinousExecution. You need to notice a few things:

  • the intel pt subtraces (or executions), are sorted by time
  • on_new_thread_execution is invoked sorted by time

when we process a new thread execution, we look for the intel pt subtraces that we haven't processed yet that happened no later than our current execution. If we find a subtrace that happened before the current execution, then that means that we don't have any thread execution for this subtrace, regardless of which thread it was

121

+1

lldb/source/Target/Trace.cpp
415

i've gotten rid of the set

wallace requested review of this revision.Jun 15 2022, 9:28 AM
jj10306 accepted this revision.Jun 15 2022, 9:36 AM
This revision is now accepted and ready to land.Jun 15 2022, 9:36 AM
This revision was landed with ongoing or failed builds.Jun 16 2022, 11:23 AM
This revision was automatically updated to reflect the committed changes.