This is an archive of the discontinued LLVM Phabricator instance.

[trace] Make events first class items in the trace cursor and rework errors
ClosedPublic

Authored by wallace on Jun 25 2022, 12:40 AM.

Details

Summary

We want to include events with metadata, like context switches, and this
requires the API to handle events with payloads (e.g. information about
such context switches). Besides this, we want to support multiple
similar events between two consecutive instructions, like multiple
context switches. However, the current implementation is not good for this because
we are defining events as bitmask enums associated with specific
instructions. Thus, we need to decouple instructions from events and
make events actual items in the trace, just like instructions and
errors.

  • Add accessors in the TraceCursor to know if an item is an event or not
  • Modify from the TraceDumper all the way to DecodedThread to support event items. Fortunately this simplified many things
  • Renamed the paused event to disabled.
  • Improved the tsc handling logic. I was using an API for getting the tsc from libipt, but that was an overkill that should be used when not processing events manually, but as we are already processing events, we can more easily get the tscs.
  • As part of this refactor, I also fixed and long stating issue, which is that some non decoding errors were being inserted in the decoded thread. I changed this so that TraceIntelPT::Decode returns an error if the decoder couldn't be set up properly. Then, errors within a trace are actual anomalies found in between instrutions.

All test pass

Diff Detail

Event Timeline

wallace created this revision.Jun 25 2022, 12:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 25 2022, 12:40 AM
Herald added a subscriber: mgorny. · View Herald Transcript
wallace requested review of this revision.Jun 25 2022, 12:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 25 2022, 12:40 AM
wallace edited the summary of this revision. (Show Details)Jun 25 2022, 12:41 AM
jj10306 requested changes to this revision.Jun 28 2022, 9:31 AM

review- part 1

lldb/include/lldb/Target/Trace.h
171

Do we want to keep the cursor as a UP or take this refactor before creating the public bindings to make it a SP? IMO a UP might make sense because I can't really think of many cases where you would want multiple users of the same cursor.
wdyt?

lldb/include/lldb/Target/TraceCursor.h
44–46

do you want to include pointers to the methods you can use to check/get events similar to what you did for errors above?

232–246

For the getters, what are your thoughts on returning an optional instead of using designated value/variants (ie nullptr, LLDB_INVALID_INSTRUCTION, eTraceEventNone`) to indicate that the current item does not match what Get method is being used?

Along the same lines, but a slightly bigger change:
With the introduction of Events as first class citizens of the trace cursor API, it may make sense to introduce the notion of of a trace "item" or something similar that encapsulates (instructions, events, errors, etc). I'm thinking of a tagged union type structure (ie a Rust enum) that enforces the correctness of the access to the underlying item.
wdyt?

267

Now that we are introducing the notion of an Event? wdyt about combining Events and Counters since a counter value in a trace really is just a type of event? In this case, Counter could just become a variant of TraceEvent. This may make more sense because even in the case of TSCs in an IntelPT trace because, strictly speaking, these aren't associated with instructions, correct? Instead the TSC values are emitted with PSBs and then we "attribute" these values to the nearest instruction, right?

lldb/include/lldb/Target/TraceDumper.h
59

oh looks like you already introduced a TraceItem structure!
Similar to my comment above related to introducing a trace item type structure for the trace cursor, would it make sense to have a separate type for each of the different "items". With the way the TraceItem struct is currently, a lot of this data is mutually exclusive, hence the many optional fields, correct?

lldb/source/Plugins/Trace/intel-pt/DecodedThread.h
137–138

Why not use an optional here instead?

209–221

Continuing the theme of my comments related to the notion of a TraceItem - would it be possible to unite the notion of a trace item? currently there is an Item enum here, a item struct in the Dumper and I feel there is a need for a Item at the trace cursor level as well.
wdyt?

214

why is this needed?

225–230

I found this a bit confusing. Are the values of the vectors the indexes into the DenseMaps below (m_errors and m_events) or are these the actual error/event values?
How does this vector relate to the two vectors below (instruction_size/classes since now this is items and not just instructions?

This revision now requires changes to proceed.Jun 28 2022, 9:31 AM
wallace added inline comments.Jun 28 2022, 8:02 PM
lldb/include/lldb/Target/Trace.h
171

It seems that we can use unique_ptrs! SBMemoryRegionInfoList is an example of a SB wrapper of a unique_ptr. It seems that python reuses the same reference of the SB object with all the variables that use it. On the other hand, if you use c++, the unique_ptr would be enforced strictly

lldb/include/lldb/Target/TraceCursor.h
44–46

good idea!

232–246

check the new version of the code. I'm now using a tagged union for the low level storage, and then using an enum for the item kinds in the TraceCursor. I think this version introduces more safety and it's easier to handle the low level storage, not to mention that we now use less memory!

267

I really like your idea, but that's not that trivial, so let's try to come up with a nice design. What about the following?

  • We create a new event TraceEventHWClockTick. This will help use identify quickly anytime we see a different TSC. That way we don't need to keep asking every instruction for its TSC even when it doesn't change. The new name also makes it more generic.
  • In order to access the TSC value, the user could invoke uint64_t GetEventValueHWClockTick(). I'd imagine that each event type would have its own data accessor.
  • We create another method optional<uint64_t> GetEventValueLastHWClockTick(), which returns the most recent value for this event type that happened on or before the cursor's current point. I wouldn't make this kind of functionality generic for all event kinds because for some it might not make sense at all. Anyway, the reason is the following code

In the dumper, we would like to support this flow:

 cursor->GoTo(starting_id)
 hw_timestamp = cursor->GetEventValueLastHWClockTick();

 for (; cursor->HasValue(); cursor->Next()) {
   if (cursor->GetItemKind() == eTraceItemKindEvent && cursor->GetEventKind() == eTraceEventHWClockTick)
     hw_timestamp = cursor->GetEventValueHWClockTick();

   cout << "current clock (tsc) " << hw_timestamp << endl;
}

so, basically we need a way to know the starting tsc for a given instruction id, and then to identify changes.

Later, we can implement a new method nanoseconds GetEstimatedWallClockTime(), which delegates to the plug-in the job to estimate the wall clock time for a given instruction item. This estimation makes more sense as an attribute of each item.

What do you think?

lldb/include/lldb/Target/TraceDumper.h
59

I think it's fine to have a very dumb TraceItem in the Dumper, because of its simplicity. However, when dealing with the API, it's better to avoid returning data structures. Remember that because of python support, we need to create a unique_ptr or shared_ptr for every data structure that we return, which can potentially add a really big overhead if we need to create a data structure for every trace item. On the other hand, if we do most of the queries using functions, we can avoid that overhead and allow fast trace iteration even from python. So being specific, I want to avoid creating a SBTraceItem, or SBTraceInstruction, etc, except for very exceptional cases, like a possible SBTraceContextSwitchEvent, which is something so rare in the trace that would add no real traversal overhead

lldb/source/Plugins/Trace/intel-pt/DecodedThread.h
137–138

we are going to get rid of this after @persona0220 's work

209–221

i'm reworking this with a new tagged union

214

see above

225–230

see above

wallace updated this revision to Diff 440827.Jun 28 2022, 8:03 PM
  • the trace cursor api now exposes an enum of item kinds for safety
  • the low level storage has been reworked and it's now simpler
  • other minor changes

Let's see what @jj10306 thinks of the way I'm proposing to expose hw clock ticks.

jj10306 accepted this revision.Jun 29 2022, 6:00 AM

This is awesome work - the code is much more understandable, so thanks for doing this!

lgtm overall, just left a couple final minor comments. I think we can continue to iterate/improve this design, so we can discuss those improvements for future diffs.
Also, let me know what you think about how to support the two different types of disable events - this doesn't need to be done in this diff but it would be extremely useful as we begin to explore kernel tracing since this will most likely require us to understand any differences between user only, kernel only, or user + kernel tracing.

lldb/include/lldb/lldb-enumerations.h
1162–1167

nice

lldb/source/Plugins/Trace/intel-pt/DecodedThread.cpp
51–53

with this current implementation, it's the caller's responsibility to check the Item's type before calling this otherwise they will get a garbage load address if the item is actually an event for example.
Would it be better to enforce the validity of the Get call at this lowest level of storage by checking that the type matches what is expected and return None or something like LLDB_INVALID_ADDRESS if there's a mismatch?
same idea applies below for the other getters

65–67

nit: consider using emplace since this returns a ref to the item. or just create a constructor for the TraceItemKind that you can pass the kind to so you can set the kind in the emplace call and then there's no need for a reference to the newly inserted object

another suggestion would be to just make this method return a reference to the newly created TraceItemStorage since all the callers end up using the index to get access to the newly added item, so returning the ref would remove the need for those "redundant" lookups

lldb/source/Plugins/Trace/intel-pt/DecodedThread.h
230

why not use the enum as the type?

249

why is this "most"?

250

nice, this is much more clear now (:

lldb/source/Plugins/Trace/intel-pt/LibiptDecoder.cpp
159–167

oooh, having a separate event for these two disabled cases would be provide great insight for us when trying to understand how tracing is enabled/disabled for user vs kernel mode (ie is it disabled by the hardware due to CPL filtering (ptev_disabled) or by the kernel/user (ptev_asyn_disabled)) - this is super cool and will be very useful!

not sure how we could cleanly expose this though since these different types of disables are pretty intelpt specific, but the notion of the traceevent is exposed at the top-level trace-agnostic tracecursor, right?
wdyt?

This revision is now accepted and ready to land.Jun 29 2022, 6:00 AM
wallace updated this revision to Diff 441037.Jun 29 2022, 8:53 AM
  • prevent padding in the trace item data