This is an archive of the discontinued LLVM Phabricator instance.

Non-debuginfo JITLink perf jitdump support
ClosedPublic

Authored by pchintalapudi on Mar 15 2023, 2:34 PM.

Details

Summary

This patch ports PerfJITEventListener to a JITLink plugin, but adds unwind record support and drops debuginfo support temporarily. Debuginfo can be enabled in the future by providing a way to obtain a DWARFContext from a LinkGraph.

See D146060 for an experimental implementation that adds debuginfo parsing.

Diff Detail

Event Timeline

pchintalapudi created this revision.Mar 15 2023, 2:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2023, 2:34 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
pchintalapudi requested review of this revision.Mar 15 2023, 2:34 PM
Herald added a project: Restricted Project. · View Herald Transcript

Add JITLink perf support without debuginfo record

This extracts the non-debuginfo changes from D146060 into its own revision, as debuginfo is more complicated currently.

pchintalapudi retitled this revision from Non-debuginfo perf support to Non-debuginfo JITLink perf jitdump support.Mar 15 2023, 2:41 PM
pchintalapudi edited the summary of this revision. (Show Details)
pchintalapudi added a reviewer: lhames.
pchintalapudi added subscribers: Restricted Project, vchuravy.
pchintalapudi edited the summary of this revision. (Show Details)

Fix Windows compilation

This looks great -- thanks very much @pchintalapudi!

I love that you've gone for the full cross-process implementation -- that's amazing. Have you tested it in a cross-process setup with llvm-jitlink-executor? Is it really working out-of-process already?

I have a few requests, mostly cosmetic (though the licensing info should go in before this lands), but other than that LGTM.

llvm/include/llvm/ExecutionEngine/Orc/PerfSupportPlugin.h
2

Should be padded out to 80 cols with dashes either side of the name/description.

40–45

Do the perf JIT APIs support unloading code? If so these should have a // TODO to fill them in later. (I don't think they're needed for the initial commit).

136–141

In SimplePackedSerialization argument lists and tuples are guaranteed to have the same serialization, so there's a convenience method on the SPS tuple list that you can use here:

static size_t size(const PerfJITRecordPrefix &Val) {
  return SPSPerfJITRecordPrefix::AsArgList::size(
    static_cast<uint32_t>(Val.Id), Val.TotalSize, Val.Timestamp);
}
143–153

Likewise here:

static bool deserialize(SPSInputBuffer &IB, PerfJITRecordPrefix &Val) {
    uint32_t Id;
    if (SPSPerfJITRecordPrefix::AsArgList::deserialize(IB, Id, Val.TotalSize, Val.Timestamp)) {
      Val.Id = Id;
      return true;
    }
    return false;
}
154–158

And here:

static bool serialize(SPSOutputBuffer &OB, const PerfJITRecordPrefix &Val) {
  return SPSPerfJITRecordPrefix:AsArgList::
        serialize(OB, static_cast<uint32_t>(Val.Id), Val.TotalSize, Val.Timestamp);
}
166

Same advice as above here -- the methods below can probably be shortened by using SPSPerfJITCodeLoadRecord::AsArgList.

208

More potential AsArgList simplifications here.

243

More potential AsArgList simplifications here.

279

More potential AsArgList simplifications here.

325

More potential AsArgList simplifications here.

llvm/include/llvm/ExecutionEngine/Orc/TargetProcess/JITLoaderPerf.h
2

Looks like this is missing the file name / description line.

//===------- JITLoaderPerf.h - Register profiler objects ------*- C++ -*-===//
llvm/lib/ExecutionEngine/Orc/PerfSupportPlugin.cpp
2

Needs LLVM licensing header.

23

There's already a LinkGraph::dump(raw_ostream&) method. Could that be used instead of this?

Alternatively if the former is dumping too verbosely could we clean it up? Or add this to LinkGraph as dumpCompact?

48–87

This shoul

83–84

It looks like the header is a known size -- you might be better off using a BinaryStreamWriter here.

101–114

This is being used to get the timestamp in the host process. Should we be timestamping the deserialized records on the executor side instead?

llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderPerf.cpp
2

Needs LLVM licensing header.

We should also have a file-level TODO to move this into the ORC runtime once we can.

Address review comments

Do the perf JIT APIs support unloading code? If so these should have a // TODO to fill them in later. (I don't think they're needed for the initial commit).

I don't see any records for code unloading in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/jitdump-specification.txt , and PerfJITEventListener didn't have support for code unloading either, so I'm inclined to believe it doesn't exist right now.

Diff against origin, address review comments

Add license to JITLoaderPerf

Diff against origin

  • Missed a license

Cascade writer errors upward

lhames accepted this revision.Mar 20 2023, 6:43 PM

Do the perf JIT APIs support unloading code? If so these should have a // TODO to fill them in later. (I don't think they're needed for the initial commit).

I don't see any records for code unloading in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/jitdump-specification.txt , and PerfJITEventListener didn't have support for code unloading either, so I'm inclined to believe it doesn't exist right now.

Thanks for checking, and for addressing all the review feedback. LGTM!

I really appreciate your work on this -- it's very exciting to see profiling support making its way into ORC. Thanks Prem!

  • Lang.
This revision is now accepted and ready to land.Mar 20 2023, 6:43 PM

Fix bugs in unwind record creation

Rebase on main

Diff against main

  • Switch to lookupAndRecordAddrs

Just testing this on Linux and Darwin now.

On Linux it looks good, but I've switch llvm-jitlink's -perf-support option to false by default since it's writing unconditionally to the filesystem, which would be surprising default behavior.

On Darwin I'm seeing unused variable warnings due to the #ifdef __linux__ conditionals. Would it be reasonable to just conditionalize inclusion of that whole file in the TargetProcess library? When it's not included the registration addr will just be unavailable and PerfSupportPlugin::Create will error out based on that. Alternatively we could make most of JITLoaderPerf.cpp conditional on #ifdef __linux__ and just have non-linux versions of registerJITLoaderPerfImpl, registerJITLoaderPerfStartImpl, and registerJITLoaderPerfEndImpl that return descriptive errors.

@pchintalapudi -- what do you think?

Also I've just noticed that JITLoaderPerf.cpp (in OrcTargetProcess) depends directly on llvm/ExecutionEngine/Orc/TargetProcess/JITLoaderPerf.h from Orc. That's a layering violation -- the shared types should be moved up into a header in llvm/ExecutionEngine/Orc/Shared.

Let me know if you want me to take care of that.

Address review comments

lhames accepted this revision.Apr 17 2023, 7:49 PM

LGTM. Thanks very much @pchintalapudi!

This revision was landed with ongoing or failed builds.Apr 18 2023, 2:16 PM
This revision was automatically updated to reflect the committed changes.
void added a subscriber: void.Apr 18 2023, 2:34 PM

I think this change is causing build failures. Could you take a look?

thakis added a subscriber: thakis.Apr 18 2023, 2:37 PM

This breaks the build on Mac and win (eg http://45.33.8.238/macm1/58953/step_4.txt) (Linux is fine)

Please take a look and revert for now if it takes a while to fix.

It also breaks Android: https://lab.llvm.org/buildbot/#/builders/77/builds/26352/steps/8/logs/stdio

FAILED: lib/ExecutionEngine/Orc/TargetProcess/CMakeFiles/LLVMOrcTargetProcess.dir/JITLoaderPerf.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes /usr/bin/ccache /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_LIBCPP_ENABLE_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/lib/ExecutionEngine/Orc/TargetProcess -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/ExecutionEngine/Orc/TargetProcess -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm_build64/include -I/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/include -gmlt -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -std=c++17 -MD -MT lib/ExecutionEngine/Orc/TargetProcess/CMakeFiles/LLVMOrcTargetProcess.dir/JITLoaderPerf.cpp.o -MF lib/ExecutionEngine/Orc/TargetProcess/CMakeFiles/LLVMOrcTargetProcess.dir/JITLoaderPerf.cpp.o.d -o lib/ExecutionEngine/Orc/TargetProcess/CMakeFiles/LLVMOrcTargetProcess.dir/JITLoaderPerf.cpp.o -c /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderPerf.cpp
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderPerf.cpp:349:12: error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move]
    return std::move(err);
           ^
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderPerf.cpp:349:12: note: remove std::move call here
    return std::move(err);
           ^~~~~~~~~~   ~
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderPerf.cpp:378:12: error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move]
    return std::move(err);
           ^
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm-project/llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderPerf.cpp:378:12: note: remove std::move call here
    return std::move(err);
           ^~~~~~~~~~   ~
2 errors generated.

Reverted for now.

Looks like a few warnings triggering errors on -Werror bots. There are the moves above, plus:

C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\lib\ExecutionEngine\Orc\PerfSupportPlugin.cpp(285,44): warning: lambda capture 'MR' is not used [-Wunused-lambda-capture]
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\lib\ExecutionEngine\Orc\PerfSupportPlugin.cpp(88,24): warning: unused function 'timespec_to_ns' [-Wunused-function]
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\llvm\lib\ExecutionEngine\Orc\PerfSupportPlugin.cpp(93,24): warning: unused function 'perf_get_timestamp' [-Wunused-function]

Oh, and the execution test will need a # REQUIRES: native && target-x86_64 line to prevent it running on other architectures (e.g. https://lab.llvm.org/buildbot#builders/230/builds/11974).

Side note: I should probably add nice error message to llvm-jitlink for "you're trying to run code from a different architecture", rather than just jumping straight into it. :)

vchuravy reopened this revision.May 8 2023, 12:51 PM
This revision is now accepted and ready to land.May 8 2023, 12:51 PM
This revision was automatically updated to reflect the committed changes.

Can you please fix or revert https://lab.llvm.org/buildbot/#/builders/5/builds/36753

I guess JIT does not play well with sanitizers, maybe just set UNSUPPORTED: for whatever is msan feature