This is an archive of the discontinued LLVM Phabricator instance.

[trace][intelpt] Support system-wide tracing [12] - Support multi-core trace load and save
ClosedPublic

Authored by wallace on May 19 2022, 1:34 PM.

Details

Summary

This diff is massive, but it's because it connects the client with lldb-server
and also ensures that the postmortem case works.

  • Flatten the postmortem trace schema. The reason is that the schema has become quite complex due to the new multicore case, which defeats the original purpose of having a schema that could work for every trace plug-in. At this point, it's better that each trace plug-in defines it's own full schema. This means that the only common field is "type".
    • Because of this new approach, I merged the "common" trace load and saving functionalities into the IntelPT one. This simplified the code quite a bit. If we eventually implement another trace plug-in, we can see then what we could reuse.
    • The new schema, which is flattened, has now better comments and is parsed better. A change I did was to disallow hex addresses, because they are a bit error prone. I'm asking now to print the address in decimal.
    • Renamed "intel" to "GenuineIntel" in the schema because that's what you see in /proc/cpuinfo.
  • Implemented reading the context switch trace data buffer. I had to do

some refactors to do that cleanly.

    • A major change that I did here was to simplify the perf_event circular buffer reading logic. It was too complex. Maybe the original Intel author had something different in mind.
  • Implemented all the necessary bits to read trace.json files with per-core data.
  • Implemented all the necessary bits to save to disk per-core trace session.
  • Added a test that ensures that parsing and saving to disk works.

Diff Detail

Event Timeline

wallace created this revision.May 19 2022, 1:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 19 2022, 1:34 PM
Herald added a subscriber: mgorny. · View Herald Transcript
wallace requested review of this revision.May 19 2022, 1:34 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 19 2022, 1:34 PM
wallace updated this revision to Diff 430797.May 19 2022, 1:46 PM
wallace edited the summary of this revision. (Show Details)

nit

jj10306 requested changes to this revision.May 31 2022, 6:28 PM

feedback-v1

lldb/include/lldb/Target/Trace.h
520

Would this work instead? I noticed that the several other maps around this code also use unordered_map and string, maybe we could update those too at some point.

538

same as comment above related to using LLVM types rather than std types wherever possible.

lldb/source/Plugins/Process/Linux/IntelPTMultiCoreTrace.cpp
193–209

Expected<Optional<...>> feels weirddddd. Are both of these "layers" needed?
I noticed this in a couple places but I'm just leaving a single comment here.

lldb/source/Plugins/Process/Linux/Perf.cpp
191

Be careful with the "strong" language here. The AUX head only wraps around when the AUX buffer is mmaped with read-only, if it is mmapped with write perms, then the AUX head behaves like the data head (it doesn't auto wrap)

See this snipped of kernel code, the overwrite flag is set based on the mmap PROT and that overwrite flag is passed to the intelpt code.
https://github.com/torvalds/linux/blob/df0cc57e/kernel/events/ring_buffer.c#L670-L733

This revision now requires changes to proceed.May 31 2022, 6:28 PM

feedback-v2

lldb/source/Plugins/Process/Linux/Perf.cpp
183

Do we need the offset/size? These arguments make the reading logic much more involved versus reading the whole buffer.
I understand that this is a more "general" method, but iiuc currently we only read the entire buffer. I guess in the future somebody could want the flexibility of providing an offset/size, but I don't see the need currently.
wdyt?
Same comment applies to the AUX buffer read method

206–212

What happens if an offset larger then the size of the buffer is provided? should that offset be wrapped or not?
What happens if a size larger than the size of the buffer is provided? should we only copy up to buffer size bytes or should we do multiple copies of the buffer to satisfy the request?

As I mentioned above, imo the ambiguity/complexity these arguments introduce is not worth the value add (which currently is none since we are always copying the whole buffer).

If we do stick with these args, we should make it very clear what happens in the cases I mentioned above and/or put restrictions on the ranges of these values (ie they must always be less than the buffer size)

lldb/source/Plugins/Trace/common/ThreadPostMortemTrace.h
47

why is this now optional?

lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.h
169
176–177

do we need the threads explicitly or does traced_processes implicitly contain this info?

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTJSONStructs.cpp
53

if this is none, does this still create a key value pair?

88–90

nit: use what De Morgan taught us (;
Having to add a ! for each new addition to the predicate is very error prone and would be a pain to debug if you accidentally forgot to add the negation.

109–110

same comment as above related to De Morgan's law

131–133

same comment as above related to De Morgan's law

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTJSONStructs.h
32–42

If the intention here is for system wide tracing to not provide threads while single thread tracing does then should we change this like so?
(i know you're currently working on the thread detection so potentially this will be addressed then?)

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTSessionFileParser.cpp
31

take a reference to the underlying object directly?

33

what happens if system_path is invalid? or is it already guaranteed to exist at this point (ie did we already check this somewhere?)?

44–45

curious, what is the debugger able to provide when the uuid is provided versus when it isn't?

71

take a reference to the underlying object directly?

227

use make_shared instead?

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTSessionFileParser.h
55–57

docs?

64–69

docs?

90
lldb/source/Plugins/Trace/intel-pt/TraceIntelPTSessionSaver.cpp
33–40

docs?

38

to better defend against if the size of data's element's changes?

65

is the string construction necessary?

93

nit: what's the difference between this and using the NormalizePath method you added? I know that method is private but could that potentially be made public or global so we can use that here?

131–132

same comment as above related to potentially using the ResolvePath method?

141

nit: we need to discuss this offline to ensure we are following the same structure as currently I have

cpus/
  <cpuId>/
    ipt.data
    context_switch.data

whereas this is a little different (ie using "cores" instead of "cpus" and not including the sub directory for each core id.

188

<directory> is the directory that the trace.json file will live at, correct?

feeback-v3 - completed review

lldb/docs/use/intel_pt.rst
184–185

Consider adding a section on the perfTscConversion parameters while we're editing this file as I don't currently see that in this file. Also, it doesn't appear the "cores" section is here either

lldb/source/Target/Trace.cpp
30–32

do we need this struct or can we just use a string directly since the struct only has a single member?

134
135

why is this named thread_data, isn't this the size of the core's buffer?

325
lldb/unittests/Process/Linux/PerfTests.cpp
9

isn't this needed since readTsc() uses intel specific assembly instructions?

wallace added inline comments.Jun 14 2022, 12:14 AM
lldb/include/lldb/Target/Trace.h
520

great idea! I'll use ConstString instead of string. That will work. ConstString is like StringRef but uses a shared pool for storage.

538

+1

lldb/source/Plugins/Process/Linux/IntelPTMultiCoreTrace.cpp
193–209

I think they are fine because in this case the collector is looking for who might own the data being fetched, and each object that implements TryGetBinaryData can return 3 possible outputs:

  • None -> it doesn't have the data
  • the sought value
  • an error -> it should have the data but something went wrong

so this is fine.

lldb/source/Plugins/Process/Linux/Perf.cpp
183

After giving many thoughts to this, I agree with you. If we ever need to stream the data, we can implement it then, but it's not needed now.

191

+1

206–212

yep, i got rid of all of this

lldb/source/Plugins/Trace/common/ThreadPostMortemTrace.h
47

because now trace buffers are optional for threads. The trace buffer might be at the core level

lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.h
169

+1

176–177

we don't really need, but it's useful because this way we don't need to cast ThreadSP to ThreadPostMortemTraceSP, which might break in some situations. So I'm making this constructor being harder to use to that it's only used by the bundle parser and not used by anyone else incorrectly

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTJSONStructs.cpp
53

yes, with null as value, but I'm changing it to

if (thread.trace_buffer)
  obj["traceBuffer"] = *thread.trace_buffer;

for simplicity of the output

88–90

i like this!

109–110

+1

131–133

+1

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTJSONStructs.h
32–42

as discussed offline, we'll keep the threads section because it will be useful for handling named threads

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTSessionFileParser.cpp
31

good idea

33

we are assuming it exists. But we actually really check it when we try to fetch its data.

44–45

if the UUID is provided and doesn't match the actual binary's UUID from the elf build section, then LLDB returns an error

71

+1

227

can't because the constructor is private and not visible to make_shared

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTSessionFileParser.h
55–57

+1

64–69

+1

90

+1

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTSessionSaver.cpp
33–40

+1

38

i made this simpler

if (!data.empty())
  out_fs.write(reinterpret_cast<const char *>(&data[0]), data.size());
65

i've changed it to

os << formatv("{0:2}", trace_session_json).str();.

we need to explicitly convert it to string somehow because std::ofstream doesn't understand formatv

93

i got rid of this, it was not needed

131–132

+1

141

as we discussed offline, the structure itself doesn't matter. LLDB can use a structure and dynolog a different one

188

yes

lldb/source/Target/Trace.cpp
30–32

the structure is useful using fromJSON, which gives nice error messages

134

+1

135

silly me

325

+1

lldb/unittests/Process/Linux/PerfTests.cpp
9

ahh good catch

wallace requested review of this revision.Jun 15 2022, 9:27 AM
jj10306 accepted this revision.Jun 15 2022, 9:35 AM
This revision is now accepted and ready to land.Jun 15 2022, 9:35 AM
lldb/source/Plugins/Trace/common/TraceSessionFileParser.cpp