This is an archive of the discontinued LLVM Phabricator instance.

sys::path: Detect posix paths starting with ~ as absolute
AbandonedPublic

Authored by aprantl on Nov 11 2021, 2:31 PM.

Details

Summary

Turns out there is more than one way a POSIX path can be absolute. Paths starting with ~, such as ~user or ~/.git, are also absolute paths, or at least not relative paths.

Without this, llvm-dwarfdump will prepend the working directory to paths starting with ~, resulting in paths like ~user/~user.
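
The failure mode described in the summary can be sketched in a few lines of self-contained C++. This is only an illustration: isAbsolutePosix and resolveAgainst are hypothetical helpers written for this example, not the llvm::sys::path or llvm-dwarfdump code, and the sketch assumes the only absolute-path rule is "starts with /".

```cpp
#include <string>

// Simplified stand-in for a POSIX-style absolute-path check (assumption:
// only a leading '/' makes a path absolute).
bool isAbsolutePosix(const std::string &Path) {
  return !Path.empty() && Path.front() == '/';
}

// Hypothetical helper: join a directory and a path, leaving paths that are
// already considered absolute untouched.
std::string resolveAgainst(const std::string &Dir, const std::string &Path) {
  if (isAbsolutePosix(Path))
    return Path;
  return Dir + "/" + Path;
}
```

Under these assumptions, resolveAgainst("~user", "~user/a.cc") yields "~user/~user/a.cc", because the second argument is not recognized as absolute.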

Event Timeline

aprantl created this revision. Nov 11 2021, 2:31 PM
aprantl requested review of this revision. Nov 11 2021, 2:31 PM

Hmm. My main question is whether it should be considered absolute in a Windows path... We do expand tildes in the native() function into the actual home directory, so in such a context, it is an absolute path there too.

I don't believe we should be treating these paths as absolute at such a deep level. ~ is purely a userspace artifact used by shells and programs wishing to provide similar UI convenience. It is not recognized by any system calls, nor by any filesystem library that I am aware of. Normally most programs do not even see it, as it will be expanded by the shell. However, if you convince the shell not to expand it, the programs will happily operate on it, as ~ is a valid filename on most filesystems (not that I recommend using it):

$ touch ~/a.cc
$ touch '~/a.cc'
touch: cannot touch '~/a.cc': No such file or directory
$ clang '~/a.cc'
clang-12: error: no such file or directory: '~/a.cc'
clang-12: error: no input files
$ clang ./~/a.cc
clang-12: error: no such file or directory: './~/a.cc'
clang-12: error: no input files
$ clang ~/a.cc
/usr/bin/x86_64-pc-linux-gnu-ld: /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../lib64/crt1.o: in function `_start':
(.text+0x20): undefined reference to `main'
clang-12: error: linker command failed with exit code 1 (use -v to see invocation)
$ mkdir '~'
$ touch '~/a.cc'
$ ls '~'
a.cc

By adding our own special handling of ~ we would be messing with that mechanism. The case of ./~/a.cc is particularly amusing because then sys::path::remove_leading_dotslash will be turning a relative path into an absolute one.
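
The ./~/a.cc case can be made concrete with a small sketch. Both functions below are assumptions made for illustration: stripLeadingDotSlash is a simplified stand-in for sys::path::remove_leading_dotslash (POSIX separators only), and isAbsoluteWithTilde is a hypothetical version of the check this revision proposes, not the actual patch.

```cpp
#include <string>

// Simplified stand-in for sys::path::remove_leading_dotslash: drops leading
// "./" components and any extra '/' that follows them.
std::string stripLeadingDotSlash(std::string Path) {
  while (Path.size() > 2 && Path[0] == '.' && Path[1] == '/') {
    Path.erase(0, 2);
    while (!Path.empty() && Path[0] == '/')
      Path.erase(0, 1);
  }
  return Path;
}

// Hypothetical tilde-aware check: treats a leading '/' or '~' as absolute.
bool isAbsoluteWithTilde(const std::string &Path) {
  return !Path.empty() && (Path.front() == '/' || Path.front() == '~');
}
```

With these definitions, "./~/a.cc" is relative, but stripLeadingDotSlash("./~/a.cc") returns "~/a.cc", which the tilde-aware check then classifies as absolute, even though both strings name the same file in a directory literally called ~.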

I agree that there are contexts where we would want to treat ~ specially, but this should be opt-in only and based on the context the path is coming from. Maybe DW_AT_comp_dir could be one of those contexts. Strictly speaking, it wouldn't be correct, but we do funny things with DWARF paths anyway. IIRC our DWARF code already has some funky logic to detect absolute Windows paths on non-Windows hosts. Maybe that could be extended to handle ~ as well?
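
One way the opt-in idea might look, purely as a sketch: the caller states explicitly whether a leading ~ should count as absolute, and the default keeps ~ as an ordinary filename character. TildePolicy and isAbsoluteInContext are hypothetical names invented for this example; nothing like them is claimed to exist in LLVM.

```cpp
#include <string>

// Hypothetical policy flag; callers that know their paths come from a
// tilde-aware context (for example, debug info) would opt in explicitly.
enum class TildePolicy { Literal, TreatAsAbsolute };

// Hypothetical context-sensitive check (POSIX-style paths only).
bool isAbsoluteInContext(const std::string &Path,
                         TildePolicy Policy = TildePolicy::Literal) {
  if (Path.empty())
    return false;
  if (Path.front() == '/')
    return true;
  // A leading '~' is only special when the caller asked for it.
  return Policy == TildePolicy::TreatAsAbsolute && Path.front() == '~';
}
```

A general-purpose caller would use isAbsoluteInContext(Path) and keep today's behaviour, while a DWARF path resolver could pass TildePolicy::TreatAsAbsolute for values read from DW_AT_comp_dir.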

dexonsmith requested changes to this revision. Nov 12 2021, 3:08 PM

I agree with @labath; we shouldn't change sys::path APIs to have implicit knowledge of shell-specific behaviours (or to guess at what they might be).

llvm/test/tools/llvm-dwarfdump/AArch64/tilde.s
4

It seems potentially like a bug in whatever generated the DWARF that this location appears un-expanded; or, potentially, it is a response to an interesting user request, such as specifying -fdebug-prefix-map="some/path=~user".

102

Maybe this testcase could be stripped down to skip the d56b171ee965eba9ba30f4a479a9f2e1703105cf?

This revision now requires changes to proceed. Nov 12 2021, 3:08 PM
aprantl abandoned this revision. Nov 12 2021, 3:47 PM

Thanks for the feedback!

llvm/test/tools/llvm-dwarfdump/AArch64/tilde.s
4

This is literally how this testcase was created, yes.