Page MenuHomePhabricator

[intel-pt] Implement a basic test case

Authored by wallace on Mar 30 2020, 5:31 PM.



Depends on D76872.

There was no test for the Intel PT support on LLDB, so I'm creating one, which
will help making progress on solid grounds.

The test is skipped if the Intel PT plugin library is not built.

Diff Detail

Event Timeline

wallace created this revision.Mar 30 2020, 5:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2020, 5:31 PM
wallace updated this revision to Diff 253748.Mar 30 2020, 5:34 PM

Added a stop command invocation in the test

I am worried if this test will be flaky on loaded machines. Not sure how we can ever guarantee we will see processor traces with our stuff in it if the machine is busy running many tests or even doing other things.


can we guarantee we will see any of these on a fully loaded machine running many tests simultaneously? Maybe we need to settle for the header of the output only to know that it tried to display something?

It's nice to see this code getting some use. I was starting to think we should delete it...




better avoid referencing functions from the system library... makes the test more hermetic


What exactly is the case you're worried about? I'm not very familiar with how all this works, but I would think that the kernel trace buffer for this is application specific, and is automatically switched off when the os schedules a different process (anything else would be a security breach). If that is true, then we should have pretty good control over what goes into the buffer, and we can ensure that it is: (a) big enough; and/or (b) application does not execute too much code and overflows it (not calling rand would help us get a reasonable upper bound on that).

(Nonetheless it would be good to run some stress tests to verify this is stable.)


We're not putting license headers on tests.

(Do these get automatically created by some IDEs or something? Can they be configured not to do that?)

wallace marked 2 inline comments as done.Mar 31 2020, 5:37 PM
wallace added inline comments.

This is how it works: by default Intel PT has to be enabled on each logical CPU, where it traces everything, regardless of which thread if running. The kernel has the ability to switch Intel PT on and off on each logical CPU whenever there's a thread context switch, and this kind of filtering is what this LLDB plugin is using.

For some context, that kind of filtering is expensive because of the constant enabling/disabling of Intel PT and could incur in up to 5% total CPU cost according to Intel. Another drawback of this approach is that threads spawned by an already traced thread are not traced by default. The user has to enable filtered tracing explicitly for these threads.

A faster approach is to enable Intel PT on all CPUs without filtering. That leads to a ~2% total CPU cost according to some tests some colleagues and I ran. However, this needs having a secondary trace of context switches to be able to attribute Intel PT packets to individual threads. We are not following that approach in this plugin because of the added complexity. However, I plan to make this plugin flexible enough to be able to load Intel PT traces collected by other mechanisms which can do global tracing correctly.

Lastly, the PT Trace unit in the cpu writes PT packets on memory without interrupting the CPU itself nor the kernel. The only case in which packets couldn't be written is when the BUS is completely full. However, this is extremely rare and the CPU would retry later.

That being said, I see no reason why the trace wouldn't be collected at that point. Just in case I'll add a 0.1 ms wait time for the CPU to have enough time to send all the packets, which should be more than enough.


I just copy pasted it from another test

wallace updated this revision to Diff 254065.Mar 31 2020, 5:56 PM

Addressed comments

Also ran the test in parallel to a 'stress -c 100 -i 100' invocation, and it
passed correctly

wallace updated this revision to Diff 254067.Mar 31 2020, 6:04 PM


wallace updated this revision to Diff 254068.Mar 31 2020, 6:05 PM

Noise, for some reason I can't run clang-format as part of arc lint on this device.
I'm running it manually anyway

Harbormaster failed remote builds in B51233: Diff 254067!
Harbormaster failed remote builds in B51234: Diff 254068!
labath accepted this revision.Apr 1 2020, 2:19 AM
labath added inline comments.

Ok, that sounds good to me, though I'd be very surprised if 100ms makes any difference -- I've seen flaky tests with much bigger delays.


Which one? I don't see any test like that in the lldb repo. Was it from the swift fork or something?

This revision is now accepted and ready to land.Apr 1 2020, 2:19 AM
wallace marked an inline comment as done.Apr 1 2020, 10:16 AM
wallace added inline comments.

I copied it from here
I'll remove the header from that file as well

This revision was automatically updated to reflect the committed changes.