This is an archive of the discontinued LLVM Phabricator instance.

Fix debugserver translation check
ClosedPublic

Authored by aperez on May 2 2022, 4:31 PM.

Details

Summary

Currently, debugserver checks whether it was launched in
translation. The intent was to return an error when an x86_64
debugserver attempts to control an arm64/arm64e process.
However, the check also fires when users attach to an x86_64
process, so debugserver exits before it attempts to hand off
control to the translated debugserver at
/Library/Apple/usr/libexec/oah/debugserver.

This diff delays the debugserver translation check until after
determining whether to hand off control to
/Library/Apple/usr/libexec/oah/debugserver. Only when the
process is not translated and thus has not been handed off do we
check if the debugserver is translated, erroring out in that case.
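
The reordering described above can be illustrated with a minimal sketch. This is not the debugserver source (which is C++); the names `Action`, `decide`, and `handoff_allowed` are hypothetical and exist only for this illustration. The `handoff_allowed` flag is an assumption that models the LLDB_DEBUGSERVER_PATH case reported in the manual tests in this thread, where no handoff takes place.

```python
from enum import Enum


class Action(Enum):
    """Possible outcomes of the attach decision (hypothetical names)."""
    HAND_OFF = "hand off to /Library/Apple/usr/libexec/oah/debugserver"
    ERROR = "error: debugserver is running in translation"
    ATTACH = "debug the process directly"


def decide(inferior_is_translated: bool,
           debugserver_is_translated: bool,
           handoff_allowed: bool = True) -> Action:
    """Sketch of the check order after this diff (an assumption, not the real code)."""
    # Step 1: decide on the handoff first. A translated (x86_64) inferior
    # goes to the translated debugserver, however this debugserver runs.
    if inferior_is_translated and handoff_allowed:
        return Action.HAND_OFF
    # Step 2: only when no handoff happened do we check whether this
    # debugserver itself is translated, and error out if it is.
    if debugserver_is_translated:
        return Action.ERROR
    return Action.ATTACH
```

Before this diff, the check in step 2 effectively ran first, so an x86_64 debugserver errored out even for x86_64 inferiors that would otherwise have been handed off.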

Event Timeline

aperez created this revision. May 2 2022, 4:31 PM
Herald added a project: Restricted Project. May 2 2022, 4:31 PM
aperez edited the summary of this revision. May 2 2022, 4:32 PM
aperez added a project: Restricted Project.
aperez published this revision for review. May 2 2022, 5:01 PM

I did not add unit tests because I would need to debug an x86_64 binary using an x86_64 debugserver on arm64, and was not sure if the test infrastructure allowed for that.

I did run a few manual tests which I'm including in this comment. I compiled and used a Darwin-x86_64 debugserver with the patch from this diff applied. x86_64 processes are now handed off to the translated debugserver:

$ lldb main_x86_64
(lldb) target create "main_x86_64"
Current executable set to '/Users/alexandreperez/temp/main_x86_64' (x86_64).
(lldb) b main
Breakpoint 1: where = main_x86_64`main + 11 at main.cpp:2:3, address = 0x0000000100003fab
(lldb) r
Process 80635 launched: '/Users/alexandreperez/temp/main_x86_64' (x86_64)
Process 80635 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100003fab main_x86_64`main at main.cpp:2:3
   1   	int main() {
-> 2   	  return 0;
   3   	}
(lldb)

Verified that the arm64 binary fails to attach using the patched Darwin-x86_64 debugserver, as expected:

$ lldb main_arm64
(lldb) target create "main_arm64"
Current executable set to '/Users/alexandreperez/temp/main_arm64' (arm64).
(lldb) r
error: process exited with status -1 (debugserver is x86_64 binary running in translation, attach failed.)
(lldb)

Using the LLDB_DEBUGSERVER_PATH environment variable prevents the handoff to the translated debugserver, as expected:

$ LLDB_DEBUGSERVER_PATH="/path/to/patched/debugserver" lldb main_x86_64
(lldb) target create "main_x86_64"
Current executable set to '/Users/alexandreperez/temp/main_x86_64' (x86_64).
(lldb) r
error: process exited with status -1 (debugserver is x86_64 binary running in translation, attach failed.)
(lldb)

Ah, I see what you're doing. You've got a shell running x86_64 or something (maybe started lldb as x86_64) so everything that is spawned from that is x86_64 -- debugserver and the inferior process -- and so you hit this. Let me look more closely later tonight, but I didn't take that scenario into account. Normally when people are running debugserver as x86_64, it's because they unintentionally ran a parent as x86_64 and are paying an unintended perf hit across the debug session; part of this error reporting was to say "hey, you probably didn't mean to do this".

(tbh in every case I've seen, it's someone who checked "Run in Rosetta" on the Xcode.app bundle with Finder, and is running their entire IDE as x86_64....)

aperez added a comment. May 2 2022, 6:48 PM

Ah, I see what you're doing. You've got a shell running x86_64 or something (maybe started lldb as x86_64) so everything that is spawned from that is x86_64 -- debugserver and the inferior process -- and so you hit this. Let me look more closely later tonight, but I didn't take that scenario into account. Normally when people are running debugserver as x86_64, it's because they unintentionally ran a parent as x86_64 and are paying an unintended perf hit across the debug session; part of this error reporting was to say "hey, you probably didn't mean to do this".

In our case, we are currently distributing an x86_64-only version of the toolchain (including lldb and debugserver) across many machines. Some of these machines are M1 laptops, and we started observing failures to attach on them once 0c443e92d3b9 was included.

jasonmolenda accepted this revision. May 3 2022, 1:31 PM

LGTM. I see how your workflow is set up. Yeah, you return an error if someone tries to run debugserver in translation (and the inferior isn't x86_64) - that's what we need to do, since a translated debugserver won't be able to debug arm64 processes.

This revision is now accepted and ready to land. May 3 2022, 1:31 PM
This revision was automatically updated to reflect the committed changes.