This is an archive of the discontinued LLVM Phabricator instance.

Fix race condition when launching and attaching.This is a modified version of a previous patch that was reverted: https://reviews.llvm.org/D119797This version only waits for the process to stop when using "launchCommands" or "attachCommands"...
ClosedPublic

Authored by clayborg on Mar 1 2022, 11:07 AM.

Details

Summary

...and doesn't play with the async mode when doing normal launch/attach.

We discovered that when using "launchCommands" or "attachCommands" that there was an issue where these commands were not being run synchronously. There were further problems in this case where we would get thread events for the process that was just launched or attached before the IDE was ready, which is after "configurationDone" was sent to lldb-vscode.

This fix introduces the ability to wait for the process to stop after "launchCommands" or "attachCommands" are run to ensure that we have a stopped process point that is ready for the debug session to proceed. We spin up the thread that listens for process events before we start the launch or attach, but we don't want stop events being delivered through the DAP protocol until the "configurationDone" packet is received. We now always ignore the stop event with a stop ID of 1, which is the first stop. All normal launch and attach scenarios use the synchronous mode, and "launchCommands and "attachCommands" run an array of LLDB commands in async mode.

This should make our launch with "launchCommands" and attach with "attachCommands" avoid a race condition when the process is being launched or attached.

Diff Detail

Event Timeline

clayborg requested review of this revision.Mar 1 2022, 11:07 AM
clayborg created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2022, 11:07 AM
wallace accepted this revision.Mar 1 2022, 11:10 AM

I'm assuming this passes all tests locally

This revision is now accepted and ready to land.Mar 1 2022, 11:10 AM
clayborg retitled this revision from Fix race condition when launching and attaching. This is a modified version of a previous patch that was reverted: https://reviews.llvm.org/D119797 This version only waits for the process to stop when using "launchCommands" or "attachCommands"... to Fix race condition when launching and attaching.This is a modified version of a previous patch that was reverted: https://reviews.llvm.org/D119797This version only waits for the process to stop when using "launchCommands" or "attachCommands"....Mar 1 2022, 11:44 AM
clayborg added a reviewer: yinghuitan.

I'm not exactly exactly thrilled by the sleep-based implementation, but otherwise, the patch seems fine.

lldb/tools/lldb-vscode/VSCode.cpp
551

steady_clock would be better here (no time travel there)

Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2022, 12:03 AM
yinghuitan added inline comments.Mar 2 2022, 10:21 AM
lldb/tools/lldb-vscode/VSCode.cpp
532–533

The same: update the comment to reflect latest state.

551

Agreed.

lldb/tools/lldb-vscode/VSCode.h
250–252

This comment is not true in this diff anymore. We only use async mode in launchCommands/attachCommands.

lldb/tools/lldb-vscode/lldb-vscode.cpp
459

This assumes there is only one stop event sent before request_configuarationDone. This is true in our detected situation but I like the other logic of not sending any stop event before request_configuarationDone. That is more accurately aligned with synchronous launch/attach mode withotu relying on one stop event assumption for some future unknown situation.

clayborg updated this revision to Diff 412539.Mar 2 2022, 2:10 PM

Fixes:

  • use a "configuration_done_sent" boolean to track when the configurationDone packet has already been sent and stop all process events from being sent before this happens. This allows "launchCommands" and "attachCommands" to have more than one stop if desired before we sync up and make sure the process is stopped after running these commands
  • fixed comments to reflect the fact that we are only synchronizing with the process if "attachCommands" or "launchCommands" are used
  • use steady clock
yinghuitan accepted this revision.Mar 2 2022, 2:38 PM
This revision was automatically updated to reflect the committed changes.

Greg, this is still causing the TestVSCode_attach.py test to fail on lldb-x86_64-debian (it would probably fail elsewhere as well, but the test is already disabled everywhere else). Judging by the error message (attach failed (process exited during launch or attach)), it definitely seems related to this patch. Can you do something about it?

Greg, this is still causing the TestVSCode_attach.py test to fail on lldb-x86_64-debian (it would probably fail elsewhere as well, but the test is already disabled everywhere else). Judging by the error message (attach failed (process exited during launch or attach)), it definitely seems related to this patch. Can you do something about it?

I will check into this, yes

I think I found the root cause of the issue in linux and it might fix this test for many other platforms:

commit c41c57468949036a06695f06a97a8e14e8b252bd (HEAD -> main)
Author: Greg Clayton <gclayton@fb.com>
Date: Mon Mar 7 11:31:21 2022 -0800

Fix buildbots after https://reviews.llvm.org/D120755.

This improves this test a lot because before when using the "attachCommands" to run the following commands:

(lldb) target create -d /path/to/a.out
(lldb) process launch

This was racy as it wasn't stopping the program at the entry point, and the process might run to completion before we can even debug it. With the recent changes to the "attachCommands" we were waiting for the process to stop, but the process might be exited already, and that _should_ have caused the attach to fail since there was no process to attach to. By adding "--stop-at-entry" to the process launch, we ensure this should be less racy and give us a valid process to attach to.

The problem seemed to be we had a test case that was kind of bogus that was running attach commands with:

"target create ..." followed by "process launch". There was no synchronization before, but now that is. The problem is that the "process launch" was run, but the process could run to the exit stage before we synchronized with it. So it was correctly saying that the attach failed, because we weren't able to actually attach to the process as it had run to the exit point by the time we tried to attach. Fix this by adding "--stop-at-entry" to the "process launch" command which will ensure it does stop and it validly attached.

I verified that my linux server was failing the attach in the same way and it is now fixed with the Buildbot patch