This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Fix conversion from Windows UNC paths to file URI format.
ClosedPublic

Authored by ilya-golovenko on Jul 20 2020, 7:31 AM.

Details

Summary

The fix improves handling of Windows UNC paths to align with Appendix E. Nonstandard Syntax Variations of RFC 8089.

Before this fix it was difficult to use Windows UNC paths in compile_commands.json database as such paths were converted to file URIs using 'file:////auth/share/file.cpp' notation instead of recommended 'file://auth/share/file.cpp'.

As an example, VS.Code cannot understand file URIs with 4 starting slashes, thus such features as go-to-definition, jump-to-file, hover tooltip, etc. stop working. This also applicable to files which reside on Windows network-mapped drives because clangd internally resolves file paths to real paths in some cases and such paths get resolved to UNC paths.

Diff Detail

Event Timeline

ilya-golovenko created this revision.Jul 20 2020, 7:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2020, 7:31 AM
ilya-golovenko edited the summary of this revision. (Show Details)Jul 20 2020, 7:35 AM
ilya-golovenko added reviewers: sammccall, kadircet.

Thanks for doing this! And sorry about the shaky windows support...

(There are potentially other lurking issues due to filenames being used as keys internally, particularly case-insensitivity issues...)

clang-tools-extra/clangd/URI.cpp
29

the UNC paths are also basically a windows thing, can we have hasWindowsDriveLetter and isWindowsNetworkPath (or isWindowsUNCPath)?

34

conventionally LLVM tools accept both slashes on windows and / on other platforms - i.e. we should probably treat //foo/bar as a valid network path on windows unless there's a good reason not to.

So Path.size() > 2 && llvm::sys::path::is_separator(Path[0]) && llvm::sys::path::is_separator(Path[1]), I guess

51–56

this raises the question: what if both conditions are true?

file://abc/c:/foo

\\abc\c: seems much more plausible than the current \\abcc:, so I think this should be else if.

52

These comments could be extended a little and be even more useful:

Windows UNC paths e.g. \\server\share <=> file://server/share

68

FWIW, I think we were previously parsing drive letters even on unix, and will no longer do so.
I don't think this is a real problem though.

clang-tools-extra/clangd/unittests/URITests.cpp
149

Could we add a test that the old four-slash version can still be resolved to a path?

It probably worked by accident - we probably weren't thinking about UNC paths at all. But someone still might depend on it now. I don't think this patch breaks it, just want to check.

ilya-golovenko marked an inline comment as done.

Address code review comments.

ilya-golovenko marked 2 inline comments as done.Jul 20 2020, 9:06 AM
ilya-golovenko added inline comments.
clang-tools-extra/clangd/URI.cpp
29

This kind of network paths are also supported on Unix/Linux system, e.g. //hostname/path/file.txt and isNetworkPath will handle those as well. For example, samba and samba client support such paths. RFC 8089 calls them "non-local files" with unspecified type of protocol to access the file, i.e. it is not necessary a UNC path. Does it make sense to continue supporting Linux/Unix version of network path?

34

I tried to prevent treating paths like /\path or \/path as network paths. Maybe the following variant will work:

bool isNetworkPath(llvm::StringRef Path) {
 return Path.size() > 2 && Path[0] == Path[1] && llvm::sys::path::is_separator(Path[0]);
}

Thanks for doing this! And sorry about the shaky windows support...

(There are potentially other lurking issues due to filenames being used as keys internally, particularly case-insensitivity issues...)

I'm glad to be useful. And thank you much for the time you spend reviewing the code. I really appreciate it!

sammccall accepted this revision.Jul 20 2020, 10:08 AM

Thanks! Do you have commit access, or should I land this for you?

clang-tools-extra/clangd/URI.cpp
29

Ah, I wasn't aware. This seems fine to me.

This revision is now accepted and ready to land.Jul 20 2020, 10:08 AM

Thanks! Do you have commit access, or should I land this for you?

No, I don't have commit access. Could you please commit on my behalf?

Ilya Golovenko <ilya.golovenko@huawei.com>

And thanks again for the review.

@sammccall sorry for bothering you, could you please land this for me?

Ilya Golovenko <ilya.golovenko@huawei.com>

This revision was automatically updated to reflect the committed changes.

@kbobyrev Kirill, thank you for landing this for me! Unfortunately @walrus is my not-used-anymore account...
By the way, do you know is it possible to delete my unused account @walrus to avoid such confusions in future?

@kbobyrev Kirill, thank you for landing this for me! Unfortunately @walrus is my not-used-anymore account...
By the way, do you know is it possible to delete my unused account @walrus to avoid such confusions in future?

Not that I'm aware of :( I have way too many accounts myself, I'd be interested in that, too. I think not having your email associated with it anymore would be good.

Hopefully, reviews move to GitHub at some point and this will not be necessary.

This comment was removed by walrus.

@kbobyrev Kirill, thank you for landing this for me! Unfortunately @walrus is my not-used-anymore account...
By the way, do you know is it possible to delete my unused account @walrus to avoid such confusions in future?

Not that I'm aware of :( I have way too many accounts myself, I'd be interested in that, too. I think not having your email associated with it anymore would be good.

Hopefully, reviews move to GitHub at some point and this will not be necessary.

I asked because according to policies all my commits should be attributed to Huawei (as this work is done here) and should be committed using my work email: ilya.golovenko@huawei.com, not with personal ilya.golovenko@gmail.com

@kbobyrev Kirill, thank you for landing this for me! Unfortunately @walrus is my not-used-anymore account...
By the way, do you know is it possible to delete my unused account @walrus to avoid such confusions in future?

Not that I'm aware of :( I have way too many accounts myself, I'd be interested in that, too. I think not having your email associated with it anymore would be good.

Hopefully, reviews move to GitHub at some point and this will not be necessary.

I asked because according to policies all my commits should be attributed to Huawei (as this work is done here) and should be committed using my work email: ilya.golovenko@huawei.com, not with personal ilya.golovenko@gmail.com

Uh, I see, sorry for that. I'm afraid we can not rewrite the commit history to change the email for the one already submitted, but I think the commits are attributed to the primary email for your account, so simply changing it should do the trick.