This is an archive of the discontinued LLVM Phabricator instance.

[RFC] Instrumenting Clang/LLVM with Perfetto
Needs Review · Public

Authored by Nathan-Huckleberry on Jul 1 2020, 1:34 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

Instrumenting Clang/LLVM with Perfetto

Overview

Perfetto is an event-based tracer designed to replace chrome://tracing. It
allows for fine-grained control over trace data and is currently in use by
Chrome and Android.

Instrumentation of Clang with Perfetto would give nicely formatted traces
that are easily shareable by link. Compile time regression bugs could be
filed with Perfetto links that clearly show the regression.

Perfetto exposes a C++ library that allows arbitrary applications to record
app-specific events. Trace events can be added to Clang by calling macros
exposed by Perfetto.

The trace events are sent to an in-process tracing service and are kept in
memory until the trace is written to disk. The trace is written as a protobuf
and can be opened by the Perfetto trace processor (https://ui.perfetto.dev/).

The Perfetto trace processor lets you visualize traces as flame graphs.
The view can be navigated with the "WASD" keys. The processor also has a
built-in query engine; queries are run by pressing CTRL+ENTER.

The benefits of Perfetto:

  • Shareable Perfetto links
    • Traces can be easily shared without sending the trace file
  • Traces can be easily aggregated with UNIX cat
  • Fine-grained Tracing Control
    • Trace events can span across function boundaries (Start a trace in one function, end it in another)
    • Finer granularity than function level that you would see with Linux perf
  • Less tracing overhead
    • Trace events are buffered in memory, not sent directly to disk
    • Perfetto macros are optimized to prevent overhead
  • Smaller trace sizes
    • Strings and other reused data are interned
    • Traces are stored as protobufs instead of JSON
    • ~3x smaller than -ftime-trace traces
  • SQL queries for traces
    • The Perfetto UI has a query language built in for data aggregation
  • Works on Linux/MacOS/Windows
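
The "Smaller trace sizes" item above mentions interning. As a rough illustration of why it helps, the sketch below replaces repeated event names with small integer ids, so the text of each name is stored only once. The InternTable class, the internedSize helper, and the fixed 4-byte id are hypothetical and chosen for illustration; this is not Perfetto's actual encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical illustration of string interning; not Perfetto's code.
class InternTable {
public:
  // Returns a small integer id for `s`, assigning a new id on first sight.
  // `isNew` tells the caller whether the full string must be emitted.
  std::uint32_t intern(const std::string &s, bool &isNew) {
    auto [it, inserted] =
        ids_.try_emplace(s, static_cast<std::uint32_t>(ids_.size() + 1));
    isNew = inserted;
    return it->second;
  }

private:
  std::unordered_map<std::string, std::uint32_t> ids_;
};

// Counts the bytes a toy writer would emit for a list of event names:
// a 4-byte id per event, plus the name text the first time it appears.
std::size_t internedSize(const std::vector<std::string> &names) {
  InternTable table;
  std::size_t bytes = 0;
  for (const std::string &name : names) {
    bool isNew = false;
    table.intern(name, isNew);
    bytes += sizeof(std::uint32_t); // every event stores the id
    if (isNew)
      bytes += name.size();         // the text is stored only once
  }
  return bytes;
}
```

In this toy model, three events named "ParseClass" cost 22 bytes (three ids plus one copy of the 10-character name) instead of 30 bytes for three full copies of the name. The saving grows with the number of repeats.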

Example Trace

This is an example trace on a Linux kernel source file.
https://ui.perfetto.dev/#!/?s=c7942d5118f3ccfe16f46d166b05a66d077eb61ef8e22184a7d7dfe87ba8ea

This is an example trace on the entire Linux kernel.
https://ui.perfetto.dev/#!/?s=10556b46b46aba46188a51478102a6ce21a9c767c218afa5b8429eac4cb9d4
Recorded with:

make CC="clang-9" KCFLAGS="-perfetto" -j72
find /tmp -name "*pftrace" -exec cat {} \; > trace.pftrace
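
Joining the per-file traces with cat relies on a property of the protobuf wire format: concatenating two serialized messages yields a valid message whose repeated fields are appended. The sketch below illustrates this with a toy encoder and decoder, assuming a trace is a sequence of length-delimited packet records under field number 1. The function names are hypothetical and the payloads are plain strings rather than real trace packets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Appends `value` as a protobuf base-128 varint.
void appendVarint(std::string &out, std::uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<char>(value));
}

// Appends one length-delimited record for field number 1 (tag byte 0x0A),
// which is how an element of a repeated sub-message field looks on the wire.
void appendPacket(std::string &out, const std::string &payload) {
  out.push_back(0x0A); // (field 1 << 3) | wire type 2
  appendVarint(out, payload.size());
  out += payload;
}

// Splits a buffer back into payloads; returns false on malformed input.
bool readPackets(const std::string &buf, std::vector<std::string> &packets) {
  std::size_t pos = 0;
  while (pos < buf.size()) {
    if (buf[pos++] != 0x0A)
      return false;
    std::uint64_t len = 0;
    int shift = 0;
    while (true) {
      if (pos >= buf.size() || shift > 63)
        return false;
      auto byte = static_cast<unsigned char>(buf[pos++]);
      len |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
      if (!(byte & 0x80))
        break;
      shift += 7;
    }
    if (len > buf.size() - pos)
      return false;
    packets.push_back(buf.substr(pos, len));
    pos += len;
  }
  return true;
}

// Two "files" written independently, then joined byte-for-byte as cat would.
std::vector<std::string> concatDemo() {
  std::string fileA, fileB;
  appendPacket(fileA, "tu1-event");
  appendPacket(fileB, "tu2-event-a");
  appendPacket(fileB, "tu2-event-b");
  std::vector<std::string> packets;
  bool ok = readPackets(fileA + fileB, packets);
  assert(ok);
  return packets;
}
```

Because each record carries its own tag and length, the reader never needs to know where one original file ended and the next began.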

Current Implementation

These changes are behind a CMake flag (-DPERFETTO). When building Clang with
the CMake flag enabled, the Perfetto GitHub repository is cloned into the
build folder and linked against any code that uses Perfetto macros.

The -ftime-trace and Perfetto trace events have been combined into one
macro that expands to trace events for both. The behavior of -ftime-trace
is unchanged.

To run a Perfetto trace, pass the flag -perfetto to Clang (built with
-DPERFETTO). The trace output file follows the convention set by
-ftime-trace and uses the filename passed to -o to determine the trace
filename.

For example:
clang -perfetto -c foo.c -o foo.o
would generate foo.pftrace.
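
The naming convention can be sketched as a simple extension swap on the -o argument. The helper below is hypothetical (the patch's actual logic is not shown here) and only mirrors the foo.o to foo.pftrace example above. It assumes C++17 for std::filesystem.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical helper: derives the trace path from the -o argument by
// replacing its extension with ".pftrace".
std::string tracePathFor(const std::string &outputPath) {
  std::filesystem::path p(outputPath);
  p.replace_extension(".pftrace");
  return p.string();
}
```

With this sketch, tracePathFor("foo.o") yields "foo.pftrace", so the trace lands next to the object file.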

Tracing documentation

LLVM_TRACE_BEGIN(name, detail)
Begins a tracing slice if Perfetto or -ftime-trace is enabled.
  • name : constexpr String
    • The label displayed in the tracing UI.
  • detail : StringRef
    • Additional detail to attach to the trace slice. This expands to a
      lambda and is evaluated lazily, only if Perfetto or -ftime-trace is
      enabled.

LLVM_TRACE_END()
Ends the most recently started slice.

LLVM_TRACE_SCOPE(name, detail)
Begins a tracing slice and initializes an anonymous struct if Perfetto or
-ftime-trace is enabled. When the struct goes out of scope, the tracing
slice ends.
  • name : constexpr String
    • The label displayed in the tracing UI.
  • detail : StringRef
    • Additional detail to attach to the trace slice. This expands to a
      lambda and is evaluated lazily, only if Perfetto or -ftime-trace is
      enabled.
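
To make the lazy-detail and scope behavior concrete, here is a minimal, self-contained sketch of macros with the same shape. Everything in it is hypothetical: the DEMO_TRACE_* names, the demo namespace, and the in-memory event list stand in for the real macros and the Perfetto/-ftime-trace backends, which are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace demo {
bool tracingEnabled = false;     // stands in for -perfetto / -ftime-trace
std::vector<std::string> events; // in-memory buffer of recorded events
int detailEvaluations = 0;       // counts how often a detail expression ran

// Begins a slice; the `detail` callable is invoked only when tracing is on.
template <typename DetailFn> void begin(const char *name, DetailFn detail) {
  if (!tracingEnabled)
    return;
  events.push_back(std::string("B:") + name + ":" + detail());
}

// Ends the most recently started slice.
void end() {
  if (tracingEnabled)
    events.push_back("E");
}

// RAII guard: begins a slice on construction, ends it on destruction.
struct Scope {
  template <typename DetailFn> Scope(const char *name, DetailFn detail) {
    begin(name, detail);
  }
  ~Scope() { end(); }
};
} // namespace demo

// The macros wrap `detail` in a lambda so the expression is not evaluated
// unless tracing is enabled.
#define DEMO_TRACE_BEGIN(name, detail)                                         \
  demo::begin(name, [&] { return std::string(detail); })
#define DEMO_TRACE_END() demo::end()
#define DEMO_CONCAT_IMPL(a, b) a##b
#define DEMO_CONCAT(a, b) DEMO_CONCAT_IMPL(a, b)
#define DEMO_TRACE_SCOPE(name, detail)                                         \
  demo::Scope DEMO_CONCAT(demoScope_, __LINE__)(                               \
      name, [&] { return std::string(detail); })

// A deliberately "expensive" detail expression, to show lazy evaluation.
std::string expensiveDetail() {
  ++demo::detailEvaluations;
  return "foo.c";
}

void compileUnit() {
  DEMO_TRACE_SCOPE("Frontend", expensiveDetail()); // ends at function exit
  DEMO_TRACE_BEGIN("Parse", "top-level");
  DEMO_TRACE_END();
}

// Runs compileUnit() with tracing on or off and returns the event count.
std::size_t runDemo(bool enabled) {
  demo::tracingEnabled = enabled;
  demo::events.clear();
  demo::detailEvaluations = 0;
  compileUnit();
  return demo::events.size();
}
```

With tracing off, compileUnit() records nothing and expensiveDetail() is never called. With tracing on, it records two begin events followed by two end events, the last one emitted by the scope guard's destructor.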

Perfetto Documentation: https://perfetto.dev/

FAQs

Why not use Linux Perf?
Perfetto's event-based model allows for much finer-grained control over
the trace. Linux Perf also has the following drawbacks:

  • Linux Perf is only available on Linux.
  • Visualizing Perf data requires post-processing with separate tools.
  • Perf requires kernel-version-specific dependencies.

Why not use -ftime-trace?
Perfetto has almost the same functionality as -ftime-trace, but with a
few added benefits.

  • Shareable links.
  • Traces can be aggregated easily with UNIX cat.
  • The query engine for trace analysis.
  • The Perfetto UI is browser-agnostic and could be used the same way as godbolt.
  • The resulting trace files are ~3x smaller.
    • A trace of the Linux kernel is 50MB with Perfetto and 139MB with -ftime-trace.

Extra Notes

Perfetto also has a system-mode that interacts with Linux ftrace. It can
record things like process scheduling, syscalls, memory usage and CPU
usage.

Known Issues

When -no-integrated-as is enabled, traces are written to /tmp/. This is a
bug in the current implementation of -ftime-trace. Once the Perfetto
change is applied, the same bug affects Perfetto traces.

Diff Detail

Event Timeline

Herald added projects: Restricted Project, Restricted Project. · Jul 1 2020, 1:34 PM
Nathan-Huckleberry edited the summary of this revision. · Jul 1 2020, 1:36 PM
Nathan-Huckleberry edited the summary of this revision. · Jul 1 2020, 1:47 PM
Nathan-Huckleberry edited the summary of this revision. · Jul 1 2020, 1:52 PM

New files are missing standard license header blurb.
Also probably missing some tests.

clang/lib/Driver/Driver.cpp
3754

This appears unrelated to the patch.

clang/tools/clang-shlib/CMakeLists.txt
40

Could use more words. What issues?

llvm/cmake/modules/AddPerfetto.cmake
9

I have concerns about this.
It really should use a system-provided version via find_package().
At worst, the sources should be bundled into the tree, as is already done in some rare cases.

clang/lib/Driver/Driver.cpp
3754

This might be related to the requirement that Perfetto trace everything in-process, though I think this is also a pre-existing bug that can be precommitted.

llvm/cmake/modules/AddPerfetto.cmake
9

Not that I'm very good with CMake, but this seems to suggest that ExternalProject_Add may not compose well with find_package: https://stackoverflow.com/questions/6351609/cmake-linking-to-library-downloaded-from-externalproject-add

lebedev.ri added inline comments. · Jul 16 2020, 1:43 PM
llvm/cmake/modules/AddPerfetto.cmake
9

What I am saying is that any code that fetches anything from the internet during cmake/build time is just plain broken.
Perfetto should be provided by a system package, and we should link to it just like we link to zlib/etc.

llvm/cmake/modules/AddPerfetto.cmake
9

That's how GTest is fetched. See: llvm/utils/benchmark/cmake/HandleGTest.cmake.

lebedev.ri added inline comments. · Jul 20 2020, 11:19 PM
llvm/cmake/modules/AddPerfetto.cmake
9

That's how GTest is fetched.

No, it's not. That cmake file is never ever executed by LLVM's cmake.
LLVM's gtest is bundled in llvm/utils/unittest/googletest,
much like googlebenchmark is bundled in llvm/utils/benchmark,
etc.

Nathan-Huckleberry edited the summary of this revision. · Jul 27 2020, 12:02 PM
  • Attach translation unit names to trace processes
Mordante added inline comments.
clang/include/clang/Basic/CodeGenOptions.def
244

Seems this line should be one lower to keep the comment above together.

clang/include/clang/Frontend/FrontendOptions.h
251

Is it intended to have the same comment as TimeTrace?