This is an archive of the discontinued LLVM Phabricator instance.

Support: add sys::path::canonical
AbandonedPublic

Authored by arphaman on Jul 31 2014, 1:10 PM.

Details

Summary

This patch implements the the sys::path::canonical function, which converts a path to its canonical form.

This function is needed for Clang's FileManager::getCanonicalName (there's a FIXME there) and the new
coverage mapping system.

Diff Detail

Event Timeline

arphaman updated this revision to Diff 12079.Jul 31 2014, 1:10 PM
arphaman updated this revision to Diff 12081.
arphaman retitled this revision from to Support: add sys::path::canonical.
arphaman updated this object.
arphaman edited the test plan for this revision. (Show Details)
arphaman added reviewers: bogner, dexonsmith, bob.wilson.
arphaman set the repository for this revision to rL LLVM.
arphaman added a subscriber: Unknown Object (MLST).

Fix comment typos

rnk added a subscriber: rnk.Jul 31 2014, 4:58 PM

If this is implemented correctly as Duncan describes, we can probably use it to solve http://llvm.org/bugs/show_bug.cgi?id=20440.

I guess than it would make sense to move it to sys::fs and use realpath internally for unix.
For windows GetFullPathName can be used.

rnk added a comment.Jul 31 2014, 5:16 PM

Doug Gregor attempted to use realpath years ago, but it wasn't portable. I would find out why realpath got reverted before using it again.

Well, his revert comment was:

Rip out realpath() support. It's expensive, and often a bad idea, and I have another way to achieve the same goal.

I'm not sure that portability is an issue here.

ikonst added a subscriber: ikonst.EditedAug 1 2014, 8:54 AM

As someone who was looking into http://llvm.org/bugs/show_bug.cgi?id=20440, some comments:

  • This naive canonization will break Unix path behavior with symlinks. realpath does the right thing on Un*x. I suspect (although didn't verify) that this is what makes it slow.
  • Win32 directory symlinks don't behave like Unix symlinks, so this will do the right thing on Win32.
  • That looks like a whole lot of work for path manipulation. All this copying and vectors. Why not do string processing, e.g. "Got a ".."? Go back to the last slash/backslash." I thought performance is of importance here. You can see my code: https://github.com/ikonst/LongPathFix/blob/master/LongPathFixHook/Path.cpp -- I suppose there's a way to do it without using the venerable C string functions.
  • This comment says "in place", but the work isn't done "in place" nor is the result "in place".

Now, as to 20440:

  • We can keep doing include path concatenations as usual (growing it even to thousands of characters in length) and when time comes for CreateFileWs, we'll pass a canonized \??\ path, same as I implemented here in the form of an API hook: https://github.com/ikonst/LongPathFix
  • I don't think this canonizing function quite cuts it for \\?\. You need to handle absolute paths, root paths, relative paths, UNC paths. It's rather Win32 specific.
  • The beautiful thing is that 20440 isn't an issue on Un*x since open accepts long paths naturally.
arphaman abandoned this revision.Aug 1 2014, 9:44 AM

I'm abandoning this patch because there is another way to deal with my problem for code coverage,
and thus I think that I would also like to keep the non-canonical names in the coverage data.