This is my WIP on fixing multithreaded process support.
So far includes support for reporting new and exited threads, and resuming/stepping individual threads.
Paths
| Differential D64647
<WIP> [lldb] [Process/NetBSD] Multithread support AbandonedPublic Authored by mgorny on Jul 12 2019, 10:23 AM.
Details
Summary This is my WIP on fixing multithreaded process support. So far includes support for reporting new and exited threads, and resuming/stepping individual threads.
Diff Detail Event Timelinemgorny retitled this revision from [lldb] [Process/NetBSD] Implement per-thread executation control to [lldb] [Process/NetBSD] Implement per-thread execution control.Jul 12 2019, 10:27 AM Comment Actions I think it looks OK. there are some nits that could be optimized in future or handled additionally.. but for now it should be fine. Comment Actions Shouldn't there be some tests that come along with this patch?
mgorny marked an inline comment as done. Comment Actions
I was actually hoping that the test suite already covers what needs to be covered, and buildbot would tell me which tests were fixed ;-). Comment Actions
That is possible, though it would be good to know that before landing the patch. But I would expect that at least some aspect of this functionality is not covered by existing tests (e.g. the handling of multiple concurrent signals).
Comment Actions Something that we do not cover here is that once a tracee reports a signal (like someone poked it with SIGUSR1) and we want to pass it over to the tracee, we will reset siginfo. This scenario should be covered by a test and we should handle it properly.. The solution in NetBSD for passing over signals without changing siginfo, is to not calling PT_SET_SIGINFO as the default one will be kept by the kernel. Optionally pick old siginfo with PT_GET_SIGINFO and optionally change destination lwp.
Comment Actions
How is 'properly'? Should we reject resuming with a signal when there's already another signal pending?
Comment Actions
We need to pass the same signal (with unchanged siginfo) to the tracee. The easiest way is to not changing it, however as we can emit a signal over to tracee specifying LWP. we can go for the sequence of PT_GET_SIGINFO, change lwp, PT_SET_SIGINFO, PT_CONTINUE with a signal.
Comment Actions It seems that the patch is wrong with the current logic, and this is somehow blurry because of Linux implementation limitations. FWIU, eStateSuspended will never happen and instead we should keep threads without explicit action suspsended. It's unclear yet whether this is desirable long-term, or if we should change the logic to explicitly indicate state for each thread. Comment Actions
Yes, eStateSuspended will never happen on linux. It wouldn't be too hard to make use of that state -- we could set unresumed threads to "suspened" in NPL::Resume, and then when we go to stop all threads (NPL::StopRunningThreads), we would just avoid sending signals to suspended threads and just silently set them back to "stopped". However, I don't see a reason to do that now, as it would just add code for jumping between the states. However, if it makes your life easier, I think you should be able to use eStateSuspended inside NetBSD code for anything you want. Nobody will inspect the state of threads while the process is running, so that could just be an implementation detail. In fact I suspect the nobody will inspect the thread state whatsoever because once the process is stopped, all threads are assumed to be stopped too. Comment Actions Not sure if we're taking about the same thing. I meant the status in resume actions. I think it'd be logical if 'no action' meant 'resume in same state as prior to stopping the process', and explicit eStateS* meant 'keep this thread stopped after resuming'. Comment Actions
Ah, you're right, we weren't talking about the same thing. I see what you mean now... I can see how explicit modelling of suspension in resume actions would make sense. And it would definitely map better to how netbsd controls threads. OTOH, I think the current semantics (no action => keep stopped) makes sense too, and it maps better to how the gdb protocol represents resume actions. So either way, you have to do the conversion somewhere... mgorny mentioned this in D65555: [lldb] [Process/NetBSD] Enable reporting of new and exited threads [WIP]. mgorny marked 5 inline comments as done. mgorny retitled this revision from [lldb] [Process/NetBSD] Implement per-thread execution control to <WIP> [lldb] [Process/NetBSD] Multithread support. Comment Actions So I've made some progress on this. However, I'm stuck and I'd use some help going forward. I've made a trivial two-threaded program: #include <stdio.h> #include <assert.h> #include <unistd.h> #include <pthread.h> void* thread_func(void* x) { printf("thread 1\n"); int i; for (i = 0; i < 100; i++) { printf("t2 %d\n", i); sleep(3); } return NULL; } int main() { pthread_t t; printf("step 1\n"); assert(!pthread_create(&t, NULL, thread_func, NULL)); printf("step 2\n"); int i; for (i = 0; i < 100; i++) { printf("t1 %d\n", i); sleep(3); } assert(!pthread_join(t, NULL)); printf("step 3\n"); return 0; } Now, if I start it, then issue process interrupt, then thread continue 1, everything works as expected. However, if I interrupt it again and then thread continue 2 (i.e. try to change the thread running), nothing happens. According to the gdb-remote log, vCont is received but nothing is sent back (though normally stdout should happen). If I send any other command, lldb-server dies: 1568906517.952658892 < 16> read packet: $vCont;c:0002#a4 1568906874.978140354 < 1> read packet: lldb-server exiting... According to the posix log, Resume is working as expected. 1568906976.934143305 NativeProcessNetBSD.cpp:Resume no action specified for pid 12124 tid 1 1568906976.934377670 NativeProcessNetBSD.cpp:Resume processing resume action state suspended signal 2147483647 for pid 12124 tid 1 1568906976.934622765 NativeProcessNetBSD.cpp:Resume processing resume action state running signal 2147483647 for pid 12124 tid 2 I suspect I'm not setting some state correctly but I so far haven't managed to figure out which. Comment Actions Nevermind that long report, it actually turned out to be a bug in NetBSD kernel that resulted in the thread not being resumed.
Revision Contents
Diff 227295 lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentBreakpointDelayBreakpointOneSignal.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentBreakpointOneDelayBreakpointThreads.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentCrashWithBreak.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentCrashWithSignal.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentDelaySignalBreak.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentDelayedCrashWithBreakpointSignal.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentManyCrash.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentSignalBreak.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentTwoBreakpointThreads.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentTwoBreakpointsOneDelaySignal.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/concurrent_events/TestConcurrentTwoBreakpointsOneSignal.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/exit_during_break/TestExitDuringBreak.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/exit_during_step/TestExitDuringStep.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/num_threads/TestNumThreads.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/step_out/TestThreadStepOut.py
lldb/packages/Python/lldbsuite/test/functionalities/thread/thread_exit/TestThreadExit.py
lldb/packages/Python/lldbsuite/test/python_api/lldbutil/iter/TestLLDBIterator.py
lldb/source/Plugins/Process/NetBSD/NativeProcessNetBSD.h
lldb/source/Plugins/Process/NetBSD/NativeProcessNetBSD.cpp
lldb/source/Plugins/Process/NetBSD/NativeThreadNetBSD.h
lldb/source/Plugins/Process/NetBSD/NativeThreadNetBSD.cpp
|
Is this "passing multiple signals simultaneously", or "passing multiple *distinct* signals simultaneously". (E.g,. thread 1 gets a SIGALRM, thread 2 gets SIGIO, etc.).