This patch adds in support for using memory annotations in the
subprocess execution mode.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Rebase, cleanup tests, fix some function calls in unittests that I forgot to update after adjusting function signatures.
llvm/tools/llvm-exegesis/lib/Assembler.cpp | ||
---|---|---|
51–54 | If we're always going to call these 4 in a sequence, let's have a single point of entry in ExegesisTarget |
Update preprocessor directives to prevent build failures on Linux platforms that don't have libpfm.
llvm/tools/llvm-exegesis/lib/Assembler.cpp | ||
---|---|---|
70 | This might clobber the registers we just set up above. |
llvm/tools/llvm-exegesis/lib/Assembler.cpp | ||
---|---|---|
70 | Yes. I have a TODO listed in https://reviews.llvm.org/D151023 for this function to push the current values of RAX, RDI, and RSI to the stack to save them and then pop them back after the operation. I was planning on doing this as a follow up after this landed once I actually needed to use those registers, but if this is a blocker I can get this fixed before landing. |
llvm/tools/llvm-exegesis/lib/Assembler.cpp | ||
---|---|---|
234 | Ah, yep. I made it either/or here as it doesn't support passing the RDI (or different depending on ABI) scratch memory register, but there are definitely other uses for passing registers as live ins. Should be fixed in the next update. Thanks for the catch! |
Ignore the "in the next update" part of the last inline comment. I submitted it at the wrong point. @courbet This should be ready for another round of review when you get a chance. Thank you for all your patience with reviews on this stack!
llvm/test/tools/llvm-exegesis/X86/memory-annotations-livein.test | ||
---|---|---|
5 ↗ | (On Diff #532569) | Can you add a comment to explain what the test checks ? Something like: # This test check that we can use a combination of memory and register definitions. Same for the other test. |
The tests aren't working on kernel 4.18 - are they missing a REQUIRES: exegesis-can-execute-in-subprocess?
Ah, yep. Sorry for not catching that. Hopefully 864787990091f61e0b2cdcc13bdcd1e3677a334f fixes that for you.
It seems that you lost Differential Revision: in the commit. If you use arc (https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-command-line), arc amend should give you the reviewed-by information from @courbet
Yes. I meant to add it on when cherry-picking from the branch containing these changes in my fork but forgot to add it in. I was doing it manually though and didn't realize arc amend would just do that. Thanks for the tip and sorry for the confusion there.
@aidengrossman the two new lit tests are consistently failing on my machine, which is Ubuntu 22.04.2 but with kernel version 6.3.7. I get:
$ /home/jayfoad2/llvm-release/bin/llvm-exegesis -mtriple=x86_64-unknown-unknown -mode=latency -snippets-file=/home/jayfoad2/git/llvm-project/llvm/test/tools/llvm-exegesis/X86/latency/memory-annotations.s -execution-mode=subprocess --- mode: latency key: instructions: - 'MOV64ri32 RAX i_0x2000' - 'MOV64rm RDI RAX i_0x1 %noreg i_0x0 %noreg' config: '' register_initial_values: [] cpu_name: znver4 llvm_triple: x86_64-unknown-unknown num_repetitions: 10000 measurements: [] error: 'The benchmarking subprocess sent unexpected signal: Segmentation fault' info: '' assembled_snippet: 415541544989FC4989F548BF0000000000000000488D350000000048C1EE0C48C1E60C4881EE0010000048B80B000000000000000F054C8D05000000004C89E74C01C748C1EF0C48C1E70C4881C70010000048BE00F0FFFFFF7F00004829FE48B80B000000000000000F0548BF00E0FFFFFF7F000048BE001000000000000048BA030000000000000049BA01001000000000004D89E849B9000000000000000048B809000000000000000F0548BF002000000000000048BE001000000000000048BA030000000000000049BA010010000000000049B804E0FFFFFF7F0000458B0049B9000000000000000048B809000000000000000F0548BC00F0FFFFFF7F000050575648BF00E0FFFFFF7F00008B3F48BE032400000000000048B810000000000000000F055E585848C7C000200000488B3848C7C000200000488B3848C7C000200000488B3848C7C000200000488B3848BF00E0FFFFFF7F00008B3F48BE012400000000000048B810000000000000000F0548BF000000000000000048B83C000000000000000F05415C415DC3 ...
I've tried running this under lldb but I can't catch the segfault, even if I use settings set target.process.follow-fork-mode child.
Any ideas what might be going wrong or how to debug it?
I'm not sure what could be going wrong. I saw segmentation fault failures on one of the builders but after I added a check to the lit config to ensure that the subprocess mode was actually supported it went away. It might have something to do with security policies since this patch does some things that probably seem quite abnormal in regards to the process memory, but I might be completely wrong there with that assumption.
To debug you need to modify the code slightly as this patch uses ptrace, so lldb isn't able to grab the child process because there's already another process tracing it. You'll need to comment out the ptrace calls and then you should be able to get into the child process. I had a flag to do that on one of my development branches that maybe I should upstream at some point.
Once you have that setup, I've found it easiest to debug by breaking on a line right before the function call into the MCJITed test/test harness. Then I would single step through the assembly to see what was happening.
However, I understand this might be a pain to debug, so if you want me to disable the tests or add something to the lit config to ensure that these tests only run if memory annotations don't cause seg-faults while we can work on reproducing it without being too interruptive, I'm fine with doing that.
llvm-exegesis -mode=latency -snippets-file=memory-annotations-livein.s -execution-mode=subprocess has this 'The benchmarking subprocess sent unexpected signal: Segmentation fault' for me as well, but I haven't looked closely.
(sudo apt install libpfm4-dev so that LLVM_ENABLE_LIBPFM links the dependency)
strace on the JITted part of the child process shows:
munmap(NULL, 140344954515456) = 0 munmap(0x7fa49b29d000, 392533778432) = 0 mmap(0x7fffffffe000, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED_NOREPLACE, 8, 0) = 0x7fffffffe000 mmap(0x2000, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED_NOREPLACE, 9, 0) = -1 EPERM (Operation not permitted) ioctl(7, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP|0x2) = 0 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x2000} --- +++ killed by SIGSEGV (core dumped) +++
i.e. the access to address 0x2000 fails because the mmap call failed with EPERM.
There is some discussion here about why you might get the "nonsensical" error EPERM if you try to map an address lower than mmap_min_addr: https://lore.kernel.org/all/20230418214009.1142926-1-Liam.Howlett@oracle.com/T/#u
And sure enough my system has mmap_min_addr set higher than 0x2000:
$ sysctl vm.mmap_min_addr vm.mmap_min_addr = 65536
This setting comes from /etc/sysctl.d/10-zeropage.conf which seems to be an Ubuntu-specific change to the Debain procps package.
I have verified that sed -i s/8192/65536/ test/tools/llvm-exegesis/X86/latency/memory-annotations*.s fixes the tests for me.
Interesting. I'm running Ubuntu 22.04 as well and running sysctl vm.mmap_min_addr for me also gives 65536, but I was doing all of my development inside of a privileged container which I assume was able to allocate memory at an address lower than that maybe due to permissions differences. It fails for me when I do a build outside a container. Thank you so much for debugging this! I'll get a patch up changing the addresses soon and hopefully that fixes the test failures.
46f42e2ee59ac5ff7b153648e30273e499f7b7e3 should fix the issue. Please let me know if there are any further issues.
Interesting. I'm running Ubuntu 22.04 as well and running sysctl vm.mmap_min_addr for me also gives 65536, but I was doing all of my development inside of a privileged container which I assume was able to allocate memory at an address lower than that maybe due to permissions differences.
Yeah, I saw something about the limit not being enforced for processes with CAP_SYS_RAWIO, so maybe that was the difference.
Anyway your fix works for me - thanks!
If we're always going to call these 4 in a sequence, let's have a single point of entry in ExegesisTarget