This is an archive of the discontinued LLVM Phabricator instance.

Implement vAttachWait in lldb-server
Needs ReviewPublic

Authored by rlovelett on Jun 7 2018, 5:57 AM.

Details

Event Timeline

rlovelett created this revision.Jun 7 2018, 5:57 AM
rlovelett updated this revision to Diff 150313.Jun 7 2018, 6:24 AM

Updated the comment in GDBRemoteCommunicationServerLLGS.h

rlovelett added inline comments.Jun 7 2018, 6:31 AM
source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
342

Based on looking through how Host.cpp builds the executable it seems that it puts the whole path in there. I made the decision that the input is trying to match the process name only so that it should use NameMatch::EndsWith. This seems to work but it might not be the most robust solution.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.h
86
  • I was unsure of the default parameter for waitfor_interval. This might be against the style of the project cause I have not seen in anywhere else in the project.
  • I copied the @return of AttachToProcess since after getting the pid it just calls that method.
labath added a comment.Jun 7 2018, 7:57 AM

Thanks for the patch. I have a bunch of comments, but none of them are really serious. Now we just need to figure out the testing...

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
342

If you want to match the file name exactly, you can take the list returned by FindProcesses and then filter it based on an exact match of the filename part.

347–349

Could you please LLDB_LOG for new code. This would be something like LLDB_LOG(log, "waiting for '{0}'", waitfor_process_name).

357

Should we maybe abort in case of multiple matches ?

360–363

The clearing and resetting makes this more confusing than necessary. Could you use a temporary variable for the candidate pid and then only set waitfor_pid once it has been checked?

371

If you use LLDB_LOG, this will actually work. Right now you are passing a c++ class through varargs printf -- bad idea.

3016

Add return SendErrorResponse(error) here.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.h
83

The argument type should be llvm::StringRef (and you can probably shorten the name somewhat). Since you're not using the second argument in any meaningful way, you can just drop it (a 1 second sleep time seems like a reasonable interval to me).

clayborg added inline comments.Jun 7 2018, 9:00 AM
source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
342

EndsWith is fine, just makes sure to filter extra stuff. Say if you are looking for "foo", make make sure /path/to/barfoo" doesn't match, but "path/to/foo" or "foo" does

357

We should and return an error that specifies the multiple processes that match so the user could choose to "attach <pid>". So the error message should return a message that contains all pids that now match.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.h
83

I might say to go with a lower sleep interval. If you are waiting for a process to launch, you might want to debug the C++ global constructor chain and you will want to catch this as soon as possible. Might be nice to leave this here and also add a new option to "process attach" like "--wait-interval" which is some number of microseconds or milliseconds and then pass that through to here? I would say that we would want to default say 1 ms?

rlovelett updated this revision to Diff 150526.Jun 8 2018, 9:01 AM
rlovelett marked 5 inline comments as done.

Address parts of the code-review:

  • std::string -> llvm::StringRef
  • log->Printf -> LLDB_LOG
  • Added SendErrorResponse return in Handle_vAttachWait
  • Removed waitfor_interval from AttachWaitProcess
rlovelett added inline comments.Jun 8 2018, 9:02 AM
source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.h
83

Might be nice to leave this here and also add a new option to "process attach" like "--wait-interval" which is some number of microseconds or milliseconds and then pass that through to here?

  • I think this is a really good suggestion. I personally would really like to have this be configurable in that way. As far as I can tell, this would change the behavior between macOS and Linux (and probably the other platforms). In my mind, this be better as a follow on PR that adds that capability for this and other platforms. For now, my suggestion is to leave it as a hard-coded 1 second. What do you think?
  • Just like this change, I would not really know how to go about adding this, specifically to the GDB packet. I am interested in learning but there might be more questions in a new thread.

This is a good start, We probably need a test. Pavel: can you outline what Ryan would need to do to add a test for this?

This is a good start, We probably need a test. Pavel: can you outline what Ryan would need to do to add a test for this?

We spoke about the tests on the lldb-dev thread. I gave a rough outline there. Ryan, if you have any specific questions, let me know.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
347–349

the if(log) is already hidden in the LLDB_LOG macro. You don't need to type it out yourself.

370

The format arguments are zero-based (so, found pid {0})

Could somebody please give a concise description of what's needed to get this landed? We ran into this problem today and I would be happy to take this over the goal line. Is it just the comments given that haven't been addressed?

aprantl added inline comments.
source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
336

Please never hard-code timeouts. Timeouts frequently need to be adapted for specific use-cases, for example to debug lldb, or to make asanified buildbots work, which are running orders of magnitude slower.

We have several global timeout constants available, just pick the most appropriate (cf., https://reviews.llvm.org/D60340)

Thanks, Adrian. One other question: was https://bugs.swift.org/browse/SR-3380 closed improperly?

Thanks, Adrian. One other question: was https://bugs.swift.org/browse/SR-3380 closed improperly?

I'll have to defer that question to someone who actually knows something about ptrace on Linux ;-)

Could somebody please give a concise description of what's needed to get this landed? We ran into this problem today and I would be happy to take this over the goal line. Is it just the comments given that haven't been addressed?

I think the biggest issue is the lack of a test case. Besides that, I could make some additional comments on the implementation (and the patch could probably use a reabase), but all of those seems fairly straight-forward.

Thanks, Adrian. One other question: was https://bugs.swift.org/browse/SR-3380 closed improperly?

I'll have to defer that question to someone who actually knows something about ptrace on Linux ;-)

It sounds like closing that as "invalid" was right, since the bug was about general attaching, and that does "work" on lldb. The only thing which doesn't work is waiting for a process to appear.

One gotcha there is that if you have "yama" enabled (which is the default on a lot of systems), then you cannot just automatically attach to any process. You either need to:

  • disable yama
  • become root
  • arrange so that the process is your child
  • have the process be willing to be attached (call prctl(PR_SET_PTRACER)

Neither of those is really in-scope for lldb or swift.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
336

I'm not sure if we actually have a global constant of this. This isn't really a timeout, but just the length of the polling interval, so it's value should not impact correctness, only set a tradeoff between responsiveness and burning the cpu needlessly.

1 second seems a relatively reasonable value for that, though if we wanted to be really fancy we could try some exponential scaling, where we start with a couple of ~100ms waits, and evetually settle in say 5--10 second range...

aprantl added inline comments.Apr 22 2020, 9:37 AM
source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
336

Okay, that makes sense! I have a knee-jerk response every time I see std::chrono:: in a patch ...