This is an archive of the discontinued LLVM Phabricator instance.

[clangd][remote] Add Windows paths support
ClosedPublic

Authored by ArcsinX on Oct 16 2020, 2:48 AM.

Details

Summary

Without this patch 6 marshalling tests fail on Windows.
This patch contains the following changes:

  • Allow paths with Windows slashes (convert to the POSIX style instead of assertion)
  • Add support for URI with Windows path.
  • Change the value of the second parameter of several llvm::sys::path::convert_to_slash() calls: we should use windows instead of posix to ensure UNIX slashes in the path.
  • Port RemoteMarshallingTest::IncludeHeaderURI test to Windows.

Diff Detail

Event Timeline

ArcsinX created this revision.Oct 16 2020, 2:48 AM
ArcsinX requested review of this revision.Oct 16 2020, 2:48 AM
kbobyrev requested changes to this revision.Oct 16 2020, 4:00 AM

Hi, thanks for flagging the issue! Could you please give more information where exactly the tests fail (e.g. logs with assertions)? This way I could understand where exactly the failure happens.

I can understand what you're trying to do but the patch introduces the behaviour which is unintended as of right now. The convention is to use POSIX paths everywhere in the Marshalling code since the end result will be a URI anyway. The patch changes it to Windows slashes which is orthogonal to this direction.

You're right about llvm::Expected<clangd::FuzzyFindRequest> Marshaller::fromProtobuf needing to convert to the native path though since there aren't any URIs in the end. Also, I think we should convert roots to the POSIX instead of asserting it.

I'm really curious what failures beyond FuzzyFindRequest are on Windows and I'd be happy to fix those but unfortunately I don't have a Windows machine to run the tests myself :(

This revision now requires changes to proceed.Oct 16 2020, 4:00 AM
ArcsinX added a comment.EditedOct 16 2020, 4:56 AM

Hi, thanks for flagging the issue! Could you please give more information where exactly the tests fail (e.g. logs with assertions)? This way I could understand where exactly the failure happens.

With current implementation I could not pass Windows path (e.g. C:\a\b\c) to Marshaller constructor because of assert(RemoteIndexRoot == llvm::sys::path::convert_to_slash(RemoteIndexRoot));, because on Windows llvm::sys::path::convert_to_slash(RemoteIndexRoot) converts C:\a\b\c to C:/a/b/c

Another problem is that Windows URI paths are not supported without this patch (e.g. URI=file:///X:/path => Body=/X:/path, and /X:/path does not start with X:\path)

The third problem, that without conversion local Windows paths to native slashes, mixed-slash paths appears at path::append() calls and we again face the problem when some paths does not start with given prefix (but it should)

I can understand what you're trying to do but the patch introduces the behaviour which is unintended as of right now. The convention is to use POSIX paths everywhere in the Marshalling code since the end result will be a URI anyway. The patch changes it to Windows slashes which is orthogonal to this direction.

Idea is to use native slashes (Windows slashes on Windows) *for local paths only* and UNIX slashes inside the wire.

Just to clarify why I changed llvm::sys::path::convert_to_slash(Result, llvm::sys::path::Style::posix) to llvm::sys::path::convert_to_slash(Result, llvm::sys::path::Style::windows):

  • llvm::sys::path::convert_to_slash(Result, llvm::sys::path::Style::posix) no-op on any platform and does nothing with '\' in Windows paths.
  • llvm::sys::path::convert_to_slash(Result, llvm::sys::path::Style::windows) replaces '\' with '/' on any platform.

Logs:

D:\work\llvm-project\build>ninja check-clangd
[6/7] Running the Clangd regression tests
llvm-lit.py: D:\work\llvm-project\build\bin\..\..\llvm\utils\lit\lit\llvm\config.py:347: note: using clang: d:\work\llvm-project\build\bin\clang.exe
-- Testing: 905 tests, 8 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70
FAIL: Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.IncludeHeaderURIs (665 of 905)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.IncludeHeaderURIs' FAILED ********************
Note: Google Test filter = RemoteMarshallingTest.IncludeHeaderURIs
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from RemoteMarshallingTest
[ RUN      ] RemoteMarshallingTest.IncludeHeaderURIs
D:\work\llvm-project\clang-tools-extra\clangd\unittests\remote\MarshallingTests.cpp(256): error: Value of: bool(Serialized)
  Actual: false
Expected: true
[  FAILED  ] RemoteMarshallingTest.IncludeHeaderURIs (1 ms)
[----------] 1 test from RemoteMarshallingTest (1 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] RemoteMarshallingTest.IncludeHeaderURIs

 1 FAILED TEST

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70
FAIL: Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.RefSerialization (666 of 905)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.RefSerialization' FAILED ********************
Note: Google Test filter = RemoteMarshallingTest.RefSerialization
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from RemoteMarshallingTest
[ RUN      ] RemoteMarshallingTest.RefSerialization
D:\work\llvm-project\clang-tools-extra\clangd\unittests\remote\MarshallingTests.cpp(228): error: Value of: bool(Serialized)
  Actual: false
Expected: true
[  FAILED  ] RemoteMarshallingTest.RefSerialization (0 ms)
[----------] 1 test from RemoteMarshallingTest (0 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (0 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] RemoteMarshallingTest.RefSerialization

 1 FAILED TEST

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70
FAIL: Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.URIToRelativePathTranslation (673 of 905)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.URIToRelativePathTranslation' FAILED ********************
Note: Google Test filter = RemoteMarshallingTest.URIToRelativePathTranslation
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from RemoteMarshallingTest
[ RUN      ] RemoteMarshallingTest.URIToRelativePathTranslation
D:\work\llvm-project\clang-tools-extra\clangd\unittests\remote\MarshallingTests.cpp(436): error: Value of: bool(RelativePath)
  Actual: false
Expected: true
[  FAILED  ] RemoteMarshallingTest.URIToRelativePathTranslation (1 ms)
[----------] 1 test from RemoteMarshallingTest (1 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] RemoteMarshallingTest.URIToRelativePathTranslation

 1 FAILED TEST

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70
FAIL: Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.RelationsSerializion (675 of 905)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.RelationsSerializion' FAILED ********************
Note: Google Test filter = RemoteMarshallingTest.RelationsSerializion
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from RemoteMarshallingTest
[ RUN      ] RemoteMarshallingTest.RelationsSerializion
D:\work\llvm-project\clang-tools-extra\clangd\unittests\remote\MarshallingTests.cpp(407): error: Value of: bool(Serialized)
  Actual: false
Expected: true
[  FAILED  ] RemoteMarshallingTest.RelationsSerializion (0 ms)
[----------] 1 test from RemoteMarshallingTest (0 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] RemoteMarshallingTest.RelationsSerializion

 1 FAILED TEST

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70
FAIL: Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.URITranslation (676 of 905)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.URITranslation' FAILED ********************
Note: Google Test filter = RemoteMarshallingTest.URITranslation
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from RemoteMarshallingTest
[ RUN      ] RemoteMarshallingTest.URITranslation
D:\work\llvm-project\clang-tools-extra\clangd\unittests\remote\MarshallingTests.cpp(101): error: Value of: bool(Serialized)
  Actual: false
Expected: true
[  FAILED  ] RemoteMarshallingTest.URITranslation (1 ms)
[----------] 1 test from RemoteMarshallingTest (1 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] RemoteMarshallingTest.URITranslation

 1 FAILED TEST

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70
FAIL: Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.SymbolSerialization (677 of 905)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.SymbolSerialization' FAILED ********************
Note: Google Test filter = RemoteMarshallingTest.SymbolSerialization
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from RemoteMarshallingTest
[ RUN      ] RemoteMarshallingTest.SymbolSerialization
D:\work\llvm-project\clang-tools-extra\clangd\unittests\remote\MarshallingTests.cpp(154): error: Value of: bool(Serialized)
  Actual: false
Expected: true
[  FAILED  ] RemoteMarshallingTest.SymbolSerialization (1 ms)
[----------] 1 test from RemoteMarshallingTest (1 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] RemoteMarshallingTest.SymbolSerialization

 1 FAILED TEST

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
********************
Failed Tests (6):
  Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.IncludeHeaderURIs
  Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.RefSerialization
  Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.RelationsSerializion
  Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.SymbolSerialization
  Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.URIToRelativePathTranslation
  Clangd Unit Tests :: ./ClangdTests.exe/RemoteMarshallingTest.URITranslation


Testing Time: 38.88s
  Unsupported:   7
  Passed     : 892
  Failed     :   6
FAILED: tools/clang/tools/extra/clangd/test/CMakeFiles/check-clangd
cmd.exe /C "cd /D D:\work\llvm-project\build\tools\clang\tools\extra\clangd\test && C:\Users\p00538580\AppData\Local\Programs\Python\Python38-32\python.exe D:/work/llvm-project/build/./bin/llvm-lit.py -sv D:/work/llvm-project/build/tools/clang/tools/extra/clangd/test/../unittests D:/work/llvm-project/build/tools/clang/tools/extra/clangd/test"
ninja: build stopped: subcommand failed.

Hi, thanks for flagging the issue! Could you please give more information where exactly the tests fail (e.g. logs with assertions)? This way I could understand where exactly the failure happens.

With current implementation I could not pass Windows path (e.g. C:\a\b\c) to Marshaller constructor because of assert(RemoteIndexRoot == llvm::sys::path::convert_to_slash(RemoteIndexRoot));, because on Windows llvm::sys::path::convert_to_slash(RemoteIndexRoot) converts C:\a\b\c to C:/a/b/c

Yes, I mentioned it above. The solution would be to just convert to the POSIX path instead of asserting it.

Another problem is that Windows URI paths are not supported without this patch (e.g. URI=file:///X:/path => Body=/X:/path, and /X:/path does not start with X:\path)

This is an interesting problem and I think this should be covered by the URI code that we have but I should check. This might have been triggered by a different issue.

The third problem, that without conversion local Windows paths to native slashes, mixed-slash paths appears at path::append() calls and we again face the problem when some paths does not start with given prefix (but it should)

Yes, I mentioned it above. This is not an actual problem if all the paths within marshalling are POSIX style. There is a problem with

I can understand what you're trying to do but the patch introduces the behaviour which is unintended as of right now. The convention is to use POSIX paths everywhere in the Marshalling code since the end result will be a URI anyway. The patch changes it to Windows slashes which is orthogonal to this direction.

Idea is to use native slashes (Windows slashes on Windows) *for local paths only* and UNIX slashes inside the wire.

Just to clarify why I changed llvm::sys::path::convert_to_slash(Result, llvm::sys::path::Style::posix) to llvm::sys::path::convert_to_slash(Result, llvm::sys::path::Style::windows):

  • llvm::sys::path::convert_to_slash(Result, llvm::sys::path::Style::posix) no-op on any platform and does nothing with '\' in Windows paths.

Yeah you're right, I missed these convert_to_slash semantics and that should be changed.

  • llvm::sys::path::convert_to_slash(Result, llvm::sys::path::Style::windows) replaces '\' with '/' on any platform.

We don't have actual filesystem paths anywhere outside FuzzyFindRequest, every other clangd struct stores URIs rather than paths. And when the URI is created it doesn't matter whether this path would be a Windows or POSIX, a URI is portable. Hence, the only place with native paths should be the FuzzyFindRequest translation.

Hi, thanks for flagging the issue! Could you please give more information where exactly the tests fail (e.g. logs with assertions)? This way I could understand where exactly the failure happens.

With current implementation I could not pass Windows path (e.g. C:\a\b\c) to Marshaller constructor because of assert(RemoteIndexRoot == llvm::sys::path::convert_to_slash(RemoteIndexRoot));, because on Windows llvm::sys::path::convert_to_slash(RemoteIndexRoot) converts C:\a\b\c to C:/a/b/c

Yes, I mentioned it above. The solution would be to just convert to the POSIX path instead of asserting it.

Ok, got your point. I will think about this.
But for me it's hard to see the reason why we need to store local paths in POSIX style instead of native. From my point of view, we need to ensure that paths inside the wire are in POSIX style and nothing more. E.g. clangd converts paths extracted from LSP to native style.

Another problem is that Windows URI paths are not supported without this patch (e.g. URI=file:///X:/path => Body=/X:/path, and /X:/path does not start with X:\path)

This is an interesting problem and I think this should be covered by the URI code that we have but I should check. This might have been triggered by a different issue.

https://github.com/llvm/llvm-project/blob/master/clang-tools-extra/clangd/URI.cpp#L51

The reason why I put this all together is that single change leads to the same amount of tests failures (test just fails by other reason).

ArcsinX updated this revision to Diff 298869.Oct 18 2020, 3:55 AM

Convert RemoteIndexRoot and LocalIndexRoot to the POSIX style.

The solution would be to just convert to the POSIX path instead of asserting it.

I have updated the patch according to this advice.
It seems to me it looks more consistent now, thank you.

kbobyrev requested changes to this revision.Oct 19 2020, 2:40 AM

The solution would be to just convert to the POSIX path instead of asserting it.

I have updated the patch according to this advice.
It seems to me it looks more consistent now, thank you.

Thank you for noticing it and taking action! The patch looks good, there are few comments that I left but it should be good to go afterwards.

clang-tools-extra/clangd/index/remote/marshalling/Marshalling.cpp
53

Please explicitly spell out StringRef here, otherwise visually it is appealing to put const auto there but it would be a StringRef which is no obvious.

59

nit: maybe it's time to change type of RemoteIndexRoot field to llvm::SmallString<256> and use !this->RemoteIndexRoot.endswith(PosixSeparator) instead of additional variable. Not really important for this patch but I should probably do it anyway if it's not changed in this patch.

99

Please add a comment explaining why it is needed here: these paths are not converted to URIs and have to be specific to the server platform in order for the query to work correctly.

335

nit: .str().str() is a bit unfortunate, I have had some discussions about this in the past and I guess explicit conversion via std::string(...) looks a bit cleaner.

clang-tools-extra/clangd/unittests/remote/MarshallingTests.cpp
242

Why remove this test?

This revision now requires changes to proceed.Oct 19 2020, 2:40 AM
ArcsinX added inline comments.Oct 19 2020, 2:55 AM
clang-tools-extra/clangd/unittests/remote/MarshallingTests.cpp
242

As far as I understand, idea of this code is to use absolute path (i.e. /usr/local/user/home/project/Header.h) to be relative to "/". I do not know how to make this on Windows.

Also, Marshaller ProtobufMarshaller(convert_to_slash("/"), convert_to_slash("/")); leads to assertions on Windows:

    assert(llvm::sys::path::is_absolute(RemoteIndexRoot));
...
    assert(llvm::sys::path::is_absolute(LocalIndexRoot));
ArcsinX added inline comments.Oct 19 2020, 3:25 AM
clang-tools-extra/clangd/index/remote/marshalling/Marshalling.cpp
59

But llvm::sys::path::convert_to_slash() returns std::string.
Could you give me an advice how to copy std::string into llvm::SmallString<256> here?

E.g. the following code looks strange for me

  llvm::Optional<llvm::SmallString<256>> RemoteIndexRoot;

....

    this->RemoteIndexRoot = llvm::SmallString<256>(llvm::sys::path::convert_to_slash(
        RemoteIndexRoot, llvm::sys::path::Style::windows));
kbobyrev added inline comments.Oct 19 2020, 4:17 AM
clang-tools-extra/clangd/index/remote/marshalling/Marshalling.cpp
59

Hmm, yeah right nevermind, this should be OK for this patch, I'll deal with this later.

clang-tools-extra/clangd/unittests/remote/MarshallingTests.cpp
242

Ah, I see now, makes sense! Yeah, okay then in this case this should be something like testPath("project/Header.h") which will be absolut path (what we want) and also relative to testPath("") (test path root).

ArcsinX updated this revision to Diff 299017.Oct 19 2020, 5:20 AM

auto => StringRef
add comment for llvm::sys::path::native() call
Result.str().str() => std::string(Result)
add back removed test

ArcsinX marked 2 inline comments as done.Oct 19 2020, 5:27 AM
ArcsinX added inline comments.
clang-tools-extra/clangd/index/remote/marshalling/Marshalling.cpp
59

So, I have made no changes here

99

Add more comments in try to explain, that Path is a relative path here and it could be with any slashes, so convert it to native to make sure it's compatible with the current system.

clang-tools-extra/clangd/unittests/remote/MarshallingTests.cpp
242

Thanks for advice, fixed

kbobyrev requested changes to this revision.Oct 19 2020, 5:55 AM

Also, please hit "Done" on the comments you resolved so that it's easier to track which ones are fixed, otherwise it's hard to follow :(

The patch looks good now, almost ready to go: comments are not really clear right now, I suggested possible alternatives.

clang-tools-extra/clangd/index/remote/marshalling/Marshalling.cpp
96

We normally don't escape variable names a-la Markdown.

Right now it's no very clear what it means, maybe just Construct full path using local prefix. or optionally just remove this one.

100

Maybe just FuzzyFindRequest requires proximity paths to have platform-native format in order for SymbolIndex to process the query correctly..

This revision now requires changes to proceed.Oct 19 2020, 5:55 AM
ArcsinX updated this revision to Diff 299034.Oct 19 2020, 6:16 AM

Comments fixes

ArcsinX marked 7 inline comments as done.Oct 19 2020, 6:19 AM
kbobyrev accepted this revision.Oct 19 2020, 7:22 AM

Thank you for the patch! Ready to land it now.

Please update the patch message before commiting though:

Without this patch 7 marshalling tests fails on Windows.
This patch contains the following changes:
Allow paths with Windows slashes.

This is not correct anymore, right?

Add support for URI with Windows path.
Change the value of the second parameter of several llvm::sys::path::convert_to_slash() calls: We should use windows instead of posix to ensure UNIX slashes in the path.
Remove part of RemoteMarshallingTest::IncludeHeaderURI test which could not be ported on Windows.

This should not be the case anymor.

This revision is now accepted and ready to land.Oct 19 2020, 7:22 AM
ArcsinX edited the summary of this revision. (Show Details)Oct 19 2020, 8:09 AM

Thank you for the patch! Ready to land it now.

Thank you for review!

Please update the patch message before commiting though:

Without this patch 7 marshalling tests fails on Windows.

7 => 6

This patch contains the following changes:
Allow paths with Windows slashes.

This is not correct anymore, right?

We allow paths with Windows slashes now (convert to the POSIX style instead of assertion). I have updated description about this.

Add support for URI with Windows path.
Change the value of the second parameter of several llvm::sys::path::convert_to_slash() calls: We should use windows instead of posix to ensure UNIX slashes in the path.
Remove part of RemoteMarshallingTest::IncludeHeaderURI test which could not be ported on Windows.

This should not be the case anymor.

I have changed: "Remove" => "Port to Windows"

This revision was landed with ongoing or failed builds.Oct 20 2020, 3:05 AM
This revision was automatically updated to reflect the committed changes.