This is an archive of the discontinued LLVM Phabricator instance.

[Driver] Expand the executable path in the target create output
ClosedPublic

Authored by JDevlieghere on Aug 1 2019, 3:19 PM.

Details

Summary

Resolve the path in the target create output. This is nice when passing relative paths to the lldb command line driver.

$ lldb ./binary
(lldb) target create "./binary"
Current executable set to '/absolute/path/to/binary' (x86_64).

This change only affects the target create output and does not change the debugger's behavior. It doesn't resolve symbolic links so it won't cause confusing when debugging something like clang++ that's symlinked to clang.

Diff Detail

Event Timeline

JDevlieghere created this revision.Aug 1 2019, 3:19 PM

If you just want the path and don't need the FileSpec, you can call the static SBFileSpec::ResolvePath.

IIRC both GetPath and ResolvePath return the number of characters it would take to actually resolve the path. 128 is actually pretty short, so you should check the result and malloc a buffer of the return + 1 if 128 ends up not being enough.

The details of how our FS capture works have unfortunately, swapped out of my memory, but... isn't this the sort of thing that should already be handled by the filesystem capture/replay machinery? It sounds to me like it could/should remember what the CWD at the time of capture was, and then, when replaying, point the fake CWD into the right place in the captured filesystem image. This way, the relative paths should resolve the same way as they originally did.

This patch just moves the place where this VFS definiciency manifests itself (*), so that it does not pose a problem for this particular use case. However, I'd be surprised if this is the only relative path that is being resolved inconsistently, and I think a more general solution would be more appropriate.

(*) What I mean by that, is that the SBFileSpec::ResolvePath will still end up returning a different value during replay than what it did originally. However, because the data leaves the SB boundary, and then crosses it back in through SBStream::Printf, we lose track of the fact that this is the same data, and we capture it as a constant. Before this patch, the data was always under the SB boundary, and so we were assuming that the data flow will be identical during record&replay, which it wasn't, and that is the real bug, I think.

The details of how our FS capture works have unfortunately, swapped out of my memory, but... isn't this the sort of thing that should already be handled by the filesystem capture/replay machinery? It sounds to me like it could/should remember what the CWD at the time of capture was, and then, when replaying, point the fake CWD into the right place in the captured filesystem image. This way, the relative paths should resolve the same way as they originally did.

This functionality is not currently present in the VFS, but I've created a patch to support it: https://reviews.llvm.org/D65677

This patch just moves the place where this VFS definiciency manifests itself (*), so that it does not pose a problem for this particular use case. However, I'd be surprised if this is the only relative path that is being resolved inconsistently, and I think a more general solution would be more appropriate.

(*) What I mean by that, is that the SBFileSpec::ResolvePath will still end up returning a different value during replay than what it did originally. However, because the data leaves the SB boundary, and then crosses it back in through SBStream::Printf, we lose track of the fact that this is the same data, and we capture it as a constant. Before this patch, the data was always under the SB boundary, and so we were assuming that the data flow will be identical during record&replay, which it wasn't, and that is the real bug, I think.

clayborg added a comment.EditedAug 2 2019, 3:21 PM

Also be careful if the user uses a symlink to not resolve the link. If a user tries to debug clang++:

$ ls -AFlG /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang*
-rwxr-xr-x  1 root  wheel  81666656 Jul 12 21:48 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang*
lrwxr-xr-x  1 root  wheel         5 Jul 25 07:57 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++@ -> clang

We don't want to resolve it to "clang" or it will change the behavior for any program that checks argv[0] and doesn't something different depending on what that values is.

Also be careful if the user uses a symlink to not resolve the link. If a user tries to debug clang++:

$ ls -AFlG /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang*
-rwxr-xr-x  1 root  wheel  81666656 Jul 12 21:48 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang*
lrwxr-xr-x  1 root  wheel         5 Jul 25 07:57 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++@ -> clang

We don't want to resolve it to "clang" or it will change the behavior for any program that checks argv[0] and doesn't something different depending on what that values is.

That's a good point, and something I hadn't considered. If my memory serves me right ResolvePath doesn't call realpath, but I'd have to double check for sure.

One thing I liked about this is that when you're debugging something in your PATH it will show you the absolute path, like /bin/ls instead of just ls.

Also be careful if the user uses a symlink to not resolve the link. If a user tries to debug clang++:

$ ls -AFlG /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang*
-rwxr-xr-x  1 root  wheel  81666656 Jul 12 21:48 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang*
lrwxr-xr-x  1 root  wheel         5 Jul 25 07:57 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++@ -> clang

We don't want to resolve it to "clang" or it will change the behavior for any program that checks argv[0] and doesn't something different depending on what that values is.

That's a good point, and something I hadn't considered. If my memory serves me right ResolvePath doesn't call realpath, but I'd have to double check for sure.

Please do!

One thing I liked about this is that when you're debugging something in your PATH it will show you the absolute path, like /bin/ls instead of just ls.

This can be nice, but people also have come to expect interesting things from the "argv[0]" over the years. This code is fragile and we do have tests, but we should make sure we have plenty of tests to make sure we don't regress. clang and clang++ should be a good example to use as clang won't enable C++ by default if argv[0] isn't clang++

labath added a comment.Aug 5 2019, 9:53 AM

So what if the user doesn't do lldb ./foo but instead does file ./foo at the lldb comand prompt? Then, you'll be back at square one as far as reproducers are concerned...

or lldb -o "file ./foo" for that matter. I've seen people who are unaware of -O do that when they want to execute stuff before the target is set...

So what if the user doesn't do lldb ./foo but instead does file ./foo at the lldb comand prompt? Then, you'll be back at square one as far as reproducers are concerned...

We definitely need the VFS change for the reproducers, no argument there. My earlier comment was more about the change in general, orthogonal to the reproducers.

labath added a comment.Aug 6 2019, 2:52 AM

We definitely need the VFS change for the reproducers, no argument there. My earlier comment was more about the change in general, orthogonal to the reproducers.

Ah, I see. Thanks for explaining.

As for the expansion, I think a better way to handle that would be to change the output of the "target create" to show to resolved path, independent of the actual input:

$ bin/lldb ls
(lldb) target create "ls"
Current executable set to 'ls' (x86_64).

We could change the last line to say "/bin/ls" without affecting the argv[0] of the target program. That way, you'd get the resolved path no matter how the user specifies the target to debug...

JDevlieghere retitled this revision from [Driver] Expand the target in the driver. to [Driver] Expand the executable path in the target create output.
JDevlieghere edited the summary of this revision. (Show Details)
labath accepted this revision.Aug 6 2019, 10:38 AM

I don't know if anyone has strong opinions on how we should print this, but I think I slightly prefer the seeing the full path.

This revision is now accepted and ready to land.Aug 6 2019, 10:38 AM

Given that we do some work to find executables (like search along the path for "ls" -> /bin/ls") which work might not end up with the result you expected, printing the full path seems useful.

I am fine with full path as long as argv[0] still behaves correctly.

clayborg accepted this revision.Aug 6 2019, 1:20 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 7 2019, 9:20 AM