This is an archive of the discontinued LLVM Phabricator instance.

[llvm-exegesis] Introduce Subprocess Executor Mode
ClosedPublic

Authored by aidengrossman on May 20 2023, 3:38 AM.

Details

Summary

This patch introduces the subprocess executor mode. Currently, this new
mode doesn't do anything fancy, just executing the same code that the
inprocess executor would do, but within a subprocess. This sets up the
ability to add in many more memory-related features in the future.

Diff Detail

Event Timeline

aidengrossman created this revision.May 20 2023, 3:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 20 2023, 3:38 AM
aidengrossman requested review of this revision.May 20 2023, 3:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 20 2023, 3:38 AM
courbet added inline comments.May 24 2023, 12:41 AM
llvm/test/tools/llvm-exegesis/X86/latency/subprocess-segfault.s
8

is this guaranteed to segfault ? maybe use 0 ?

llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
307

Please add a simple class description.

351

Add a comment here: // Note: terminates the child process..

384

[nit] create a temporary variable for this to avoid repetition below ? These macros are not exactly self-explanatory.

409

[[noreturn]] ?

440

[nit] createSubProcessAndRunBenchmark for consistency ?

aidengrossman marked 6 inline comments as done.

Address reviewer feedback.

courbet accepted this revision.May 24 2023, 1:31 AM
courbet added inline comments.
llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
310

with

This revision is now accepted and ready to land.May 24 2023, 1:31 AM
aidengrossman marked an inline comment as done.

Rebase, address reviewer feedback.

Add preprocessor directives around libpfm specific defines.

Fix build failure on Debian stable due to not having the pidfd_getfd system call.

This revision was automatically updated to reflect the committed changes.

(FWIW this was re-committed in rG4d618b52f6e05e41d35f56653cb36bf7d4dc794e, without mentioning this Differential Revision...)

@aidengrossman AFAICT this requires Linux 5.6 to pass tests on x86. Can you please reconsider this change without prior notice? Some long-term distros don't have this yet (I care about EL8, which has Linux 4.18)

aidengrossman added a comment.EditedJun 22 2023, 10:21 AM

Sorry about the missing differential revision. arc patch refused to work (of course) and I forgot to add the differential revision to the commit message in the recommit.

I'll work on a patch to get tests passing on systems with a pre-5.6 kernel. I don't think any of the buildbots have libpfm/perf counters together so none of these tests end up executing in the main CI infrastructure (maybe something we should fix?). I might work on making this more kernel version agnostic at some point, but currently I believe we're the only downstream consumer of this functionality within an already somewhat esoteric tool.

aidengrossman reopened this revision.Jun 25 2023, 4:10 PM
This revision is now accepted and ready to land.Jun 25 2023, 4:10 PM
This revision was automatically updated to reflect the committed changes.

@Hahnfeld I just recommitted this with an additional lit check to ensure that the tests don't run in environments where they're not supported so you shouldn't observe any additional test failures. Please let me know if you do and I'll work on getting them fixed.

For what it's worth, I'm also probably going to write a follow up patch at some point to get rid of the usage of these two system calls as I found out recently that my new place of employment has most of its HPC infrastructure on RHEL8 which is Kernel 4.18 so most of the platform specific test running should be gone at some point.

@Hahnfeld I just recommitted this with an additional lit check to ensure that the tests don't run in environments where they're not supported so you shouldn't observe any additional test failures. Please let me know if you do and I'll work on getting them fixed.

Thanks, works perfectly fine now!

For what it's worth, I'm also probably going to write a follow up patch at some point to get rid of the usage of these two system calls as I found out recently that my new place of employment has most of its HPC infrastructure on RHEL8 which is Kernel 4.18 so most of the platform specific test running should be gone at some point.

Yes, RHEL8 and its clones have that older kernel, long-term support also always means that they are moving slower...

I met an issue that ptrace.h missing PTRACE_SEIZE enum value. Anybody know the reason?

abidh added a subscriber: abidh.Jul 5 2023, 11:05 AM

I met an issue that ptrace.h missing PTRACE_SEIZE enum value. Anybody know the reason?

We have seen this error too. In our case, it was due to using an old sysroot. The PTRACE_SEIZE was introduced in Linux 3.4.

Sorry for the late response. As @abidh mentions, PTRACE_SEIZE was introduced in Linux 3.4. Given how old 3.4 is, I don't think it is too common of a configuration, but if it is a significant blocker to someone I can put up a patch to get things compiling. I probably need to upstream my debugging flag anyways.

zixuan-wu added a comment.EditedJul 5 2023, 7:09 PM

In case such requirement of linux kernel version, should we put the linux kernel requirement in the hardware requirement section of get started page? https://llvm.org/docs/GettingStarted.html#hardware

@chandlerc

Having the issues with PTRACE_SEIZE on CentOS 6 (will probably need to build LLVM where for around a year). Don't use llvm-exegesis there, but it's hard to cleanly disable it from build, since tests depend on it.

abidh added a comment.Jul 6 2023, 1:51 AM

Sorry for the late response. As @abidh mentions, PTRACE_SEIZE was introduced in Linux 3.4. Given how old 3.4 is, I don't think it is too common of a configuration, but if it is a significant blocker to someone I can put up a patch to get things compiling. I probably need to upstream my debugging flag anyways.

That will be greatly appreciated. It is a blocker for us and it is not easy to cleanly disable llvm-exegesis from the build as @danilaml also pointed out.

75b5541fe5d096d115d213cb8dc8a10cf3d11fbd should fix the issue as PTRACE_ATTACH was introduced a lot earlier from what I understand. Let me know if there are any other issues.