On macOS, lldb has an option that can be added to SBLaunchInfo to create a new Terminal window and run the inferior in it. lldb uses some AppleScript to open the new window and run the process there. It runs darwin-debug in the window to set up the architecture, current working directory, and environment variables before launching the actual inferior process. lldb opens a socket on the local filesystem and passes its name to darwin-debug; darwin-debug sends its pid back to lldb over the socket so that lldb can attach to the inferior process once it has been exec'ed. Because AppleScript sits between lldb and the inferior, the normal way we control processes at startup does not work. We want to attach to the inferior binary once it has started and is sitting at its entry point in dyld_start, stopped, waiting for lldb to connect.
Today, darwin-debug sends its pid over the socket, then parses and constructs the environment variables to be passed to the inferior, then calls posix_spawn to start the inferior. lldb gets the pid, then calls WaitForProcessToSIGSTOP, which is intended to poll repeatedly for 5 seconds to detect when the inferior process has stopped in dyld_start. Unfortunately this function has a few bugs: the main loop never runs; the return value from proc_pidinfo is handled incorrectly, so none of the intended code will ever run; and the condition it looks for (pbi_status==SSTOP) does not indicate that the inferior has been suspended.
The only way to detect that the inferior has been started and is sitting suspended at dyld_start is to use mach calls (task_info()), which requires that lldb call task_for_pid on the inferior, and lldb doesn't have permission to do that. I have not yet found a kernel API that reports the suspend count without requiring the task port.
The bug being fixed is that, when there are a lot of environment variables, lldb attaches before the inferior has started (darwin-debug is still processing the env vars when lldb attaches to it), and the UI layer is left unclear about what is going on.
This patch changes darwin-debug so that it sends its pid just before it calls posix_spawn. It changes lldb to wait up to 5 seconds to receive the pid, and then to sleep an extra 0.1 seconds before it tries to attach to the inferior (on top of the time it takes to construct the attach packet and send it to debugserver, and for debugserver to decode that packet and try to attach).
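The new ordering can be sketched in plain POSIX C. This is an illustration under stated assumptions, not the code in this patch: the function names are hypothetical, the pid is assumed to travel as decimal text, a socketpair stands in for the named socket, and the child simply exits at the point where darwin-debug would call posix_spawn.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Sleep for the given number of milliseconds. */
static void sleep_ms(int ms) {
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Launcher side (stand-in for darwin-debug): report our own pid.
   The decimal-text wire format is an assumption of this sketch. */
static int send_pid_as_text(int fd) {
    char buf[32];
    int len = snprintf(buf, sizeof buf, "%d", (int)getpid());
    return write(fd, buf, (size_t)len) == (ssize_t)len ? 0 : -1;
}

/* Debugger side: wait up to timeout_ms for the pid to arrive.
   Returns the pid, or -1 on timeout or error. */
static pid_t wait_for_pid(int fd, int timeout_ms) {
    struct pollfd pfd;
    char buf[32] = {0};
    int rc;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);
    if (rc <= 0)
        return -1;
    if (read(fd, buf, sizeof buf - 1) <= 0)
        return -1;
    return (pid_t)strtol(buf, NULL, 10);
}

/* Runs both sides. Returns 1 if the debugger side received the child's
   pid within timeout_ms, 0 if it did not, -1 on setup failure. */
static int demo_handshake(int setup_delay_ms, int timeout_ms) {
    int sv[2];
    pid_t child, reported;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    child = fork();
    if (child < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (child == 0) {
        close(sv[0]);
        sleep_ms(setup_delay_ms);  /* stand-in for env var processing */
        send_pid_as_text(sv[1]);   /* last step before the launch call */
        _exit(0);                  /* the real launcher would posix_spawn here */
    }
    close(sv[1]);
    reported = wait_for_pid(sv[0], timeout_ms);
    close(sv[0]);
    if (reported > 0)
        sleep_ms(100);             /* grace period before attaching */
    waitpid(child, NULL, 0);
    return reported == child ? 1 : 0;
}
```

Calling demo_handshake(50, 5000) models the normal case; a timeout shorter than the setup delay models lldb giving up. The 100 ms grace period only narrows the window between the pid being sent and the inferior actually being launched; it does not close it.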
Fred suggested using the closing of the socket between lldb and darwin-debug as a way of telling when the exec() is actually happening, but in my testing the Read methods in lldb don't detect the socket being closed via close-on-exec, and I didn't dig much further into this. It's straightforward to send the pid just before we call posix_spawn, which is close enough.
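For reference, the mechanism behind Fred's suggestion can be shown with plain POSIX calls. This is a sketch under assumptions, not lldb code: it uses fork/execlp and a raw read() rather than darwin-debug's posix_spawn call and lldb's Read methods, the function name is hypothetical, and it does not explain why the lldb test failed. A descriptor marked FD_CLOEXEC is closed when the process image is replaced, and the peer's read() returns 0 once every copy of that descriptor is closed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent sees EOF on its end of the socket within
   timeout_ms of starting the child, 0 if not, -1 on setup failure. */
static int peer_closed_on_exec(int timeout_ms) {
    int sv[2];
    int rc, saw_eof = 0;
    struct pollfd pfd;
    pid_t child;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    child = fork();
    if (child < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (child == 0) {
        close(sv[0]);
        /* Mark our end close-on-exec so it goes away at the exec itself. */
        fcntl(sv[1], F_SETFD, FD_CLOEXEC);
        /* The exec'ed program outlives the timeout, so an EOF seen by the
           parent comes from the exec, not from process exit. */
        execlp("sleep", "sleep", "2", (char *)NULL);
        _exit(127);  /* exec failed; exiting also closes the socket */
    }

    /* The parent must drop its own copy of the child's end, otherwise
       the socket never reaches EOF. */
    close(sv[1]);

    pfd.fd = sv[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);
    if (rc > 0) {
        char byte;
        saw_eof = (read(sv[0], &byte, 1) == 0);  /* 0 bytes means EOF */
    }

    close(sv[0]);
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return saw_eof;
}
```

Note that the EOF only arrives if no other process still holds a copy of the descriptor.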
rdar://problem/29760580
This seems racy still?