This is an archive of the discontinued LLVM Phabricator instance.

[lldb] [server] Support for multiprocess extension
ClosedPublic

Authored by mgorny on Mar 12 2021, 12:04 AM.

Details

Summary

Add a minimal support for the multiprocess extension in lldb-server.
The server indicates support for it via qSupported, and accepts
thread-ids containing a PID. However, the server still supports
debugging only one inferior, so any other PID value results in an error.

Diff Detail

Event Timeline

mgorny created this revision.Mar 12 2021, 12:04 AM
mgorny requested review of this revision.Mar 12 2021, 12:04 AM
mgorny updated this revision to Diff 330225.Mar 12 2021, 6:27 AM
mgorny edited the summary of this revision. (Show Details)

Updated to include multiprocess in qSupported, and proposed how to handle it server-side on the example of H packet. Basically, accept the correct PID of current process, reject anything else (including 0 and -1 that do not make sense for H).

Some initial remarks:

  • the parsing function is clearly gdb-related, so it should go into StringExtractorGDBRemote.h, and not its base class
  • the universalness of the function makes its interface pretty complicated, and bloats the call sites. Its probably good for the lowest layer, but maybe we could make things a lot cleaner if we created some more specialized functions on top of that. For example, most of the code (packets) statically knows whether it can accept -1 or 0 for process or thread ids, so we would benefit from those checks being done in a central place. in particular, until we really start supporting multiple processes, all of the packets will be doing something like if (pid != m_process->pid()) return error(), so we could have a version that accepts a value that *must* be present as the pid field. That would also allow us to mostly abstract away the protocol differences since the caller could just deal with the tid, knowing that the pid (if present) has already been checked.
  • I wouldn't add multiprocess to the qSupported packets just yet. Since this is accepting the extended syntax regardless of whether the other side claimed to support it (there's no hard in doing that, right?), we could flip the switch after all packets have been ported.
lldb/source/Utility/StringExtractor.cpp
375 ↗(On Diff #330225)

I think this would benefit from using more StringRefs (this class is ancient, so it reads very C like), similar to the GetNameColonValue function above. Maybe something like:

llvm::StringRef view = StringRef(m_packet).substr(m_index);
view = view.ltrim();
if (view.consume_front("p")) {
  view = view.ltrim();
  if (view.consume_front("-1") ...
    
}
mgorny updated this revision to Diff 330432.Mar 13 2021, 1:36 AM
mgorny marked an inline comment as done.

Updated utility function to use StringRef and reformatted. Further changes incoming.

  • I wouldn't add multiprocess to the qSupported packets just yet. Since this is accepting the extended syntax regardless of whether the other side claimed to support it (there's no hard in doing that, right?), we could flip the switch after all packets have been ported.

Sure, I have this split into more commits locally, I'll keep `qSupported' last.

mgorny updated this revision to Diff 330435.Mar 13 2021, 2:01 AM

Add a GDBRemoteCommunicationServerLLGS::ReadTid() helper that does appropriate PID verification and constructs error packets.

mgorny updated this revision to Diff 330441.Mar 13 2021, 4:18 AM

Added support for 0 and -1 values in ReadTid() helper, controlled via allow_any and allow_all parameters.

I'm still bugged by the GetPidTid interface -- the pair of optionals is very complicated. What if we had this function take a default_pid argument (which would be equal to the currently selected process), and it returned that as the process id, in case the packet does not contain one? Then the interface could be something like Optional<pair<pid_t, tid_t>> GetPidTid(pid_t default_pid) (because tid automatically defaults to -1), which I think is more intuitive. In case we really need to differentiate between a pid not being present and an implicitly chosen pid, then the interface could be like Optional<pair<Optional<pid_t>, tid_t>> GetPidTid(), which still seems /slightly/ better as its easier to recognise invalid input. WDYT?

Regarding the ReadTid function, I think that sending the (error) response packet from within that is a bit too much. I think it would be better if that returned an Error/Expected and left the sending to the caller.

I'm still bugged by the GetPidTid interface -- the pair of optionals is very complicated. What if we had this function take a default_pid argument (which would be equal to the currently selected process), and it returned that as the process id, in case the packet does not contain one? Then the interface could be something like Optional<pair<pid_t, tid_t>> GetPidTid(pid_t default_pid) (because tid automatically defaults to -1), which I think is more intuitive. In case we really need to differentiate between a pid not being present and an implicitly chosen pid, then the interface could be like Optional<pair<Optional<pid_t>, tid_t>> GetPidTid(), which still seems /slightly/ better as its easier to recognise invalid input. WDYT?

I've originally wanted to precisely convey whatever was in the packet since it was a generic class, and I actually thought nested Optionals are going to be worse. But thinking about it, you're probably right — the spec explicitly says to assume tid=-1 when only pid is provided, so I guess we can use a single Optional and just assume a default pid. However, I think I'm going to go a step further and eliminate pair as well, in favor of llvm::Optional<lldb::tid_t> GetPidTid(lldb:pid_t& pid), with pid being modified if specified or otherwise left in its current (i.e. default) value.

Or scratch that. What you propose is cleaner and requires less changes to tests ;-).

mgorny updated this revision to Diff 330671.Mar 15 2021, 8:31 AM
mgorny edited the summary of this revision. (Show Details)

Updated GetPidTid() to use llvm::Optional<std::pair<lldb::pid_t, lldb::tid_t>> return type with a default PID approach as suggested by @labath.

Updated ReadTid() to use llvm::Expected<lldb::tid_t>. Create a new GDBRemoteError class to pass gdb-remote $E codes through cleanly.

mgorny updated this revision to Diff 330674.Mar 15 2021, 8:35 AM

clang-format.

mgorny updated this revision to Diff 331220.Mar 17 2021, 4:38 AM
mgorny edited the summary of this revision. (Show Details)

I've just discovered that LLDB_INVALID_*_ID isn't -1/UINT_MAX as I thought, so I've had to introduce additional constants. That said, I'm not convinced about the current 0/-1 API. Maybe it'd make sense to handle them directly in ReadTid() and return some (current?) thread ID for 0, and LLDB_INVALID_*_ID for -1?

Added vCont support along with tests.

Create a new GDBRemoteError class to pass gdb-remote $E codes through cleanly.

The error codes we use right now are completely meaningless, so don't bother preserving them. I don't think we should be introducing a separate error class on their account, and I'm particularly unhappy about how this class has insinuated itself into the Status object.

I've just discovered that LLDB_INVALID_*_ID isn't -1/UINT_MAX as I thought, so I've had to introduce additional constants. That said, I'm not convinced about the current 0/-1 API. Maybe it'd make sense to handle them directly in ReadTid() and return some (current?) thread ID for 0, and LLDB_INVALID_*_ID for -1?

I think that decision which thread to use for the "any" thread should not be handled at such a low level, as it can have real impact on how the inferior behaves. I think using constants for this is fine. Although I applaud the avoidance of macros, I think the constants should also use non-macro nomenclature. Maybe just put the names inside the StringExtractorGdbRemote class, and call them AnyThread (etc.) ? Also, I think that these days we can just use constexpr variables instead of the enum hack.

lldb/source/Utility/StringExtractorGDBRemote.cpp
628

This should be something like view.consumeInteger(16, pid). At that point you can stop keeping m_index in sync, and just update it at the very end (maybe even via llvm::scope_exit).

lldb/unittests/Utility/StringExtractorGDBRemoteTest.cpp
9–11

For empty fixtures, you can just drop this, along with the _F in the TEST_F macro.

Create a new GDBRemoteError class to pass gdb-remote $E codes through cleanly.

The error codes we use right now are completely meaningless, so don't bother preserving them. I don't think we should be introducing a separate error class on their account, and I'm particularly unhappy about how this class has insinuated itself into the Status object.

Well, I found the ability to use different codes useful for debugging tests but I guess it doesn't matter much. So a plain StringError then?

lldb/source/Utility/StringExtractorGDBRemote.cpp
628

Do you have some specific method of updating it mind? Comparing size before and after?

mgorny updated this revision to Diff 331327.Mar 17 2021, 11:16 AM
mgorny marked 2 inline comments as done.
  • switched thread-id parsing to use view.consumeInteger() and replaced m_index manipulations with a single update at the very end
  • added additional test for parsing multiple thread-ids in a single packet (to verify that m_index is correctly updated)
  • moved thread-id constants to static constexpr inside the class
  • replaced GDBError with plain StringErrors
mgorny retitled this revision from [lldb] Support for multiprocess extension [WIP] to [lldb] Support for multiprocess extension.Mar 19 2021, 3:34 AM
mgorny edited the summary of this revision. (Show Details)
labath accepted this revision.Mar 22 2021, 3:55 AM

This should be fine, assuming the following statement is true: "all thread id's that we're passing from server to client are in the form of some lldb-specific extension to the gdb-remote protocol". If that is not the case, then we should also update the client to work with the new format.

Create a new GDBRemoteError class to pass gdb-remote $E codes through cleanly.

The error codes we use right now are completely meaningless, so don't bother preserving them. I don't think we should be introducing a separate error class on their account, and I'm particularly unhappy about how this class has insinuated itself into the Status object.

Well, I found the ability to use different codes useful for debugging tests but I guess it doesn't matter much. So a plain StringError then?

Yeah. Note we also have the ability to send the error messages through the protocol nowadays. I'm not sure if the lldb-server test suite enables them, but we could certainly change that.

lldb/source/Utility/StringExtractorGDBRemote.cpp
628

StringExtractor::GetNameColonValue does it by subtracting the .data() pointers, but this could be even better...

lldb/unittests/Utility/StringExtractorGDBRemoteTest.cpp
143–146

IIRC gdb docs say this is invalid, so I'd reject it immediately at this level.

This revision is now accepted and ready to land.Mar 22 2021, 3:55 AM

This should be fine, assuming the following statement is true: "all thread id's that we're passing from server to client are in the form of some lldb-specific extension to the gdb-remote protocol". If that is not the case, then we should also update the client to work with the new format.

That statement is somewhat confusing. Do I understand correctly that I should update the client to handle process IDs in all the places where GDB protocol permits server to send PIDs? I suppose that makes sense.

lldb/unittests/Utility/StringExtractorGDBRemoteTest.cpp
143–146

Will do.

mgorny updated this revision to Diff 332255.Mar 22 2021, 5:16 AM
mgorny edited the summary of this revision. (Show Details)

Move 0/-1 value checking down to StringExtractorGDBRemote. Reject 0 unconditionally, since we do not use it.

TODO: client support

This should be fine, assuming the following statement is true: "all thread id's that we're passing from server to client are in the form of some lldb-specific extension to the gdb-remote protocol". If that is not the case, then we should also update the client to work with the new format.

That statement is somewhat confusing. Do I understand correctly that I should update the client to handle process IDs in all the places where GDB protocol permits server to send PIDs? I suppose that makes sense.

Yeah, that's what I meant. The having "multiprocess" in qSupported means that we support the relevant syntax, and I don't want the client to lie about that. If it turns out to be particularly problematic, we can discuss that. Possibly by including the extension only in the server response?

mgorny updated this revision to Diff 332322.Mar 22 2021, 9:19 AM

Added client support for PIDs.

mgorny updated this revision to Diff 332324.Mar 22 2021, 9:21 AM

try uploading again

mgorny added inline comments.Mar 22 2021, 9:24 AM
lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp
741 ↗(On Diff #332324)

I've noticed a few places where we take U32 instead of U64, so fixed that as well.

2795 ↗(On Diff #332324)

@labath, any clue if we could make this work some other way? Calling GetCurrentProcessID() causes a crash for some tests here:

Fatal Python error: Segmentation fault

Thread 0x00007f918d203640 (most recent call first):
  File "/home/mgorny/git/llvm-project/lldb/test/API/functionalities/gdb_remote_client/gdbclientutils.py", line 470 in _handlePacket
  File "/home/mgorny/git/llvm-project/lldb/test/API/functionalities/gdb_remote_client/gdbclientutils.py", line 380 in _receive
  File "/home/mgorny/git/llvm-project/lldb/test/API/functionalities/gdb_remote_client/gdbclientutils.py", line 364 in _run
  File "/usr/lib/python3.10/threading.py", line 910 in run
  File "/usr/lib/python3.10/threading.py", line 972 in _bootstrap_inner
  File "/usr/lib/python3.10/threading.py", line 930 in _bootstrap

Current thread 0x00007f9196327740 (most recent call first):
  File "/home/mgorny/git/llvm-project/build.gentoo/lib/python3.10/site-packages/lldb/__init__.py", line 10250 in ConnectRemote
  File "/home/mgorny/git/llvm-project/lldb/test/API/functionalities/gdb_remote_client/gdbclientutils.py", line 526 in connect
  File "/home/mgorny/git/llvm-project/lldb/test/API/functionalities/gdb_remote_client/TestRecognizeBreakpoint.py", line 103 in test
  File "/home/mgorny/git/llvm-project/lldb/third_party/Python/module/unittest2/unittest2/case.py", line 413 in runMethod
  File "/home/mgorny/git/llvm-project/lldb/third_party/Python/module/unittest2/unittest2/case.py", line 383 in run
  File "/home/mgorny/git/llvm-project/lldb/third_party/Python/module/unittest2/unittest2/case.py", line 458 in __call__
  File "/home/mgorny/git/llvm-project/lldb/third_party/Python/module/unittest2/unittest2/suite.py", line 117 in _wrapped_run
  File "/home/mgorny/git/llvm-project/lldb/third_party/Python/module/unittest2/unittest2/suite.py", line 115 in _wrapped_run
  File "/home/mgorny/git/llvm-project/lldb/third_party/Python/module/unittest2/unittest2/suite.py", line 85 in run
  File "/home/mgorny/git/llvm-project/lldb/third_party/Python/module/unittest2/unittest2/suite.py", line 66 in __call__
  File "/home/mgorny/git/llvm-project/lldb/third_party/Python/module/unittest2/unittest2/runner.py", line 165 in run
  File "/home/mgorny/git/llvm-project/lldb/packages/Python/lldbsuite/test/dotest.py", line 997 in run_suite
  File "/home/mgorny/git/llvm-project/lldb/test/API/dotest.py", line 7 in <module>

Extension modules: lldb._lldb (total: 1)
mgorny requested review of this revision.Mar 22 2021, 4:12 PM

I think it'd be better to commit this in two patches after all. The first one could include the server bits (and the multiprocess+ server thingy) -- I believe this is good to go already. The second patch/review could deal with the thread/process id business on the client (and finally enable multiprocess+ client-side).

lldb/include/lldb/Utility/StringExtractorGDBRemote.h
197

These also need a definition in the .cpp file (debug builds fail without that).

lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp
741 ↗(On Diff #332324)

Maybe just commit that separately (no review needed).

2795 ↗(On Diff #332324)

This backtrace is useless, as it only shows python threads (and the crash happens in a native one), but if you run the test in a debugger, the problem becomes obvious -- mutual recursion between GetCurrentThreadIDs and GetCurrentProcessID. To fix that, we need to break it.

I'd probably to in by creating a new function, which implements the core of q[fs]ThreadInfo functionality (for instance, it could return something like (vector<pair<pid, tid>> with INVALID_PID for unknown), and then call it from both of these functions to do their thing.

Ok, I'm going to use this review to push the server part, then commit the uint64 client change separately, then create a new diff for client.

If you find a few more minutes, the server-side fork/vfork patch is ready for review (and independent of this one).

mgorny updated this revision to Diff 334131.Mar 30 2021, 6:09 AM
mgorny marked 2 inline comments as done.
mgorny retitled this revision from [lldb] Support for multiprocess extension to [lldb] [server] Support for multiprocess extension.

Limited to server changes. Added definitions for constexpr vars.

This revision was not accepted when it landed; it landed in state Needs Review.Mar 30 2021, 6:09 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2021, 6:09 AM

Looks like this broke the Windows lldb buildbot:

https://lab.llvm.org/buildbot/#/builders/83/builds/5173

Thanks for pinging me. It seems that all Hg tests fail on Windows, so just added xfail to these 3 as well.

lldb/include/lldb/Utility/StringExtractorGDBRemote.h
197

I've reproduced that and fixed it.