Include the complete list of threads of all running processes
in the FreeBSDKernel plugin. This makes it possible to inspect
the states (including partial register dumps from PCB) of all kernel
and userspace threads at the time of the crash, or at the time /dev/mem
was first read.
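
As a rough illustration of the enumeration described above, the sketch below models the kernel's process list in plain Python. It assumes the FreeBSD layout in which `allproc` heads a list of `struct proc` entries linked through `p_list`, with each process keeping its threads on `p_threads`. The `Proc` and `Thread` classes and the `all_threads` helper are hypothetical stand-ins with made-up sample data; the plugin itself reads these structures out of the core image rather than from Python objects.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Thread:
    # Stand-in for the fields of struct thread used in this sketch.
    tid: int   # models td_tid
    name: str  # models td_name


@dataclass
class Proc:
    # Stand-in for struct proc; `next` models p_list.le_next.
    pid: int   # models p_pid
    comm: str  # models p_comm
    threads: List[Thread] = field(default_factory=list)  # models p_threads
    next: Optional["Proc"] = None


def all_threads(allproc: Optional[Proc]) -> Iterator[Tuple[Proc, Thread]]:
    """Yield (process, thread) pairs for every thread on the process list."""
    proc = allproc
    while proc is not None:
        for thread in proc.threads:
            yield proc, thread
        proc = proc.next


# One kernel process with two threads, followed by one userspace process.
init = Proc(pid=1, comm="init", threads=[Thread(100002, "init")])
kernel = Proc(
    pid=0,
    comm="kernel",
    threads=[Thread(100000, "swapper"), Thread(100010, "softirq_0")],
    next=init,
)

listing = [(p.pid, t.tid, t.name) for p, t in all_threads(kernel)]
```

The outer loop follows the process list and the inner loop visits each process's threads, so kernel and userspace threads end up in a single flat list.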
Event Timeline
I think I've managed to figure out what the code is doing, but it wouldn't hurt to include a comment giving a high-level overview of it and/or to split it into smaller functions with helpful names.
Can we expect to have a test for this?
lldb/source/Plugins/Process/FreeBSDKernel/ProcessFreeBSDKernel.cpp | ||
---|---|---|
203 | sizeof(thread_name) | |
221 | is this going to be unique (for the entire system)? | |
lldb/source/Plugins/Process/FreeBSDKernel/ThreadFreeBSDKernel.cpp | ||
29 | std::move |
lldb/source/Plugins/Process/FreeBSDKernel/ProcessFreeBSDKernel.cpp | ||
---|---|---|
221 | Yes, I have specifically verified that. |
Update the code per suggestions, update the first test case.
TODO: update the remaining tests, document the code better
Update all tests.
Phab doesn't like git binary deltas for some reason, so I've uploaded the patch with --no-binary.
New file sizes:
-rw-r--r-- 1 mgorny mgorny 120K 12-29 15:36 lldb/test/API/functionalities/postmortem/FreeBSDKernel/vmcore-amd64-full.bz2
-rw-r--r-- 1 mgorny mgorny 132K 12-29 12:45 lldb/test/API/functionalities/postmortem/FreeBSDKernel/vmcore-amd64-minidump.bz2
-rw-r--r-- 1 mgorny mgorny 37K 12-29 15:45 lldb/test/API/functionalities/postmortem/FreeBSDKernel/vmcore-arm64-minidump.bz2
-rw-r--r-- 1 mgorny mgorny 136K 12-29 15:38 lldb/test/API/functionalities/postmortem/FreeBSDKernel/vmcore-i386-minidump.bz2
Can we make the number of threads smaller? Perhaps by patching the linked lists to skip the uninteresting ones? 600 threads is way too much, all we really need is one thread from each of the three categories. Then you can meaningfully assert the names and registers of the existing threads (since their layout is the same, there's probably no point in asserting all registers, but it'd be good to check one or two, just to make sure the starting offset is right).
That would be a lot of work, and it would need to be repeated whenever the test data needs to be regenerated. How about leaving the threads as-is but just testing the few selected ones from the list?
That would deal with the code coverage, but it still leaves us with a fairly large core file, and a lot of uninteresting tests obscuring the output.
How do you generate these core files? Do you actually create a fresh core dump, or do you just recompute the "interesting" portions from a master file you have around? If you make the changes to the master core file, then they would get automatically picked up during the recomputation.
Alternatively, maybe there is a way to capture a core file without so many processes. Either killing off everything before the core file is written, or by making sure the other processes are never started (something like init=/bin/bash on linux)?
I do recompute them from master copies (which I finally need to upload somewhere) but the recomputation is really dumb and unaware of the file format. I suppose I could just hack LLDB to stop after grabbing the first N threads but… the first non-dump on-CPU thread is no. 200. Finishing on that will probably make the core smaller but not sure how much smaller. I'll try in a minute.
> Alternatively, maybe there is a way to capture a core file without so many processes. Either killing off everything before the core file is written, or by making sure the other processes are never started (something like init=/bin/bash on linux)?
The vast majority of threads are kernel threads. At a quick glance, only about 20 threads look like userspace.
Ok, so limiting to 200-ish threads halves the data size:
-rw-r--r-- 1 mgorny mgorny 76K 12-30 14:16 /tmp/vmcore.bz2
I am aware of how the recomputation works. My point is that if you hack the "master" copy (quotes because you probably still want to keep the original master around) then you only need to redo the hacks when you regenerate the master. At that point you'll also need to update all the test expectations, so it doesn't seem like too much extra work. (In fact, I wouldn't be surprised if we just end up adding a new core file instead of updating the old one).
> I suppose I could just hack LLDB to stop after grabbing the first N threads but… the first non-dump on-CPU thread is no. 200. Finishing on that will probably make the core smaller but not sure how much smaller. I'll try in a minute.
You may still need to do some hex editing to avoid ending the list with a dangling pointer...
> > Alternatively, maybe there is a way to capture a core file without so many processes. Either killing off everything before the core file is written, or by making sure the other processes are never started (something like init=/bin/bash on linux)?
>
> The vast majority of threads are kernel threads. At a quick glance, only about 20 threads look like userspace.
yikes. And I thought linux was bad..
Skip repeating thread name when it's equal to comm. Include thread status in name field (i.e. 'crashed', 'on CPU n').
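
The naming rule in this update can be sketched roughly as follows. This is an illustration only: the helper name `describe_thread`, its parameters, and the exact separator and suffix format are assumptions, not the strings the plugin produces. Only the two rules come from the comment above: omit the thread name when it equals `comm`, and append the status.

```python
from typing import Optional


def describe_thread(comm: str, td_name: str,
                    on_cpu: Optional[int], crashed: bool) -> str:
    """Build a display name for a thread (hypothetical format)."""
    name = comm
    # Skip the per-thread name when it merely repeats the process name.
    if td_name and td_name != comm:
        name += "/" + td_name
    # Append the status: the crashed thread takes priority over 'on CPU n'.
    if crashed:
        name += " (crashed)"
    elif on_cpu is not None:
        name += f" (on CPU {on_cpu})"
    return name


plain = describe_thread("init", "init", None, False)
running = describe_thread("kernel", "swapper", 0, False)
dumping = describe_thread("sysctl", "sysctl", 1, True)
```

Here `on_cpu` is `None` for a thread that was not running on any CPU; how the kernel encodes that case is deliberately left out of the sketch.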
Update the test tooling to strip "non-interesting" processes from allproc and to grab bt/regs from the first three threads. Update amd64 test to include only interesting processes (making the core much smaller), and to test backtrace and regs on one thread of each kind.
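
The idea of stripping processes from allproc can be modelled as unlinking nodes from a linked list while keeping the list properly terminated, which also addresses the earlier concern about dangling pointers. The sketch below is a simplified model, not the actual test tooling: `Node`, `strip_uninteresting`, and the `keep` predicate are hypothetical, and the kernel's back-pointer (`le_prev`) is ignored.

```python
from typing import Callable, List, Optional


class Node:
    """Stand-in for a process on the list; `next` models p_list.le_next."""

    def __init__(self, comm: str, next: Optional["Node"] = None):
        self.comm = comm
        self.next = next


def strip_uninteresting(head: Optional[Node],
                        keep: Callable[[Node], bool]) -> Optional[Node]:
    """Relink the list so that only nodes accepted by `keep` remain."""
    new_head: Optional[Node] = None
    tail: Optional[Node] = None
    node = head
    while node is not None:
        following = node.next  # remember the successor before relinking
        if keep(node):
            if tail is None:
                new_head = node
            else:
                tail.next = node
            tail = node
        node = following
    if tail is not None:
        tail.next = None  # terminate the list instead of leaving a stale link
    return new_head


def names(head: Optional[Node]) -> List[str]:
    """Collect the comm of every node, front to back."""
    out = []
    while head is not None:
        out.append(head.comm)
        head = head.next
    return out


procs = Node("kernel", Node("idle", Node("init", Node("cron"))))
kept = strip_uninteresting(procs, lambda n: n.comm in {"kernel", "init"})
kept_names = names(kept)
```

Because the last kept node's `next` is reset explicitly, the stripped list never points at a process that is no longer present.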
Update all tests and the test documentation.
New sizes:
-rw-r--r-- 1 mgorny mgorny 13K 12-31 15:27 vmcore-amd64-full.bz2
-rw-r--r-- 1 mgorny mgorny 18K 12-31 13:14 vmcore-amd64-minidump.bz2
-rw-r--r-- 1 mgorny mgorny 12K 12-31 16:02 vmcore-arm64-minidump.bz2
-rw-r--r-- 1 mgorny mgorny 14K 12-31 16:31 vmcore-i386-minidump.bz2
Thanks for doing this.
Interesting solution to the problem. I'm not sure how long the diff files will remain applicable, but I would've accepted this even without them so I think all is fine.
lldb/test/API/functionalities/postmortem/FreeBSDKernel/TestFreeBSDKernelVMCore.py | ||
---|---|---|
76–77 | If you want to be even more fancy :) |
lldb/test/API/functionalities/postmortem/FreeBSDKernel/TestFreeBSDKernelVMCore.py | ||
---|---|---|
76–77 | Neat. I didn't know threads are iterable like this. Unfortunately, swig makes it kinda hard to inspect stuff in a really Pythonic way. |
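
The suggested code itself is not preserved in this thread, so the following is only a sketch of what iterating over threads might look like. It relies on `SBProcess` being iterable over its `SBThread` objects and on the `GetThreadID()` and `GetName()` accessors; `summarize_threads` is a hypothetical helper, and `FakeThread` and `FakeProcess` are stand-ins so the snippet runs without the `lldb` module.

```python
def summarize_threads(process):
    """Return (thread id, name) for each thread yielded by the process."""
    return [(thread.GetThreadID(), thread.GetName()) for thread in process]


class FakeThread:
    """Stand-in exposing the two SBThread accessors used above."""

    def __init__(self, tid, name):
        self._tid = tid
        self._name = name

    def GetThreadID(self):
        return self._tid

    def GetName(self):
        return self._name


class FakeProcess:
    """Stand-in that is iterable over its threads, like SBProcess."""

    def __init__(self, threads):
        self._threads = list(threads)

    def __iter__(self):
        return iter(self._threads)


process = FakeProcess([FakeThread(1, "sysctl (crashed)"),
                       FakeThread(100003, "idle: cpu0 (on CPU 0)")])
summary = summarize_threads(process)
```

In a real test the same helper would be passed the `SBProcess` obtained from loading the core file; the thread names above are made-up sample values.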