This is an archive of the discontinued LLVM Phabricator instance.

[clangd] LSP extension to switch between source/header file
ClosedPublic

Authored by Nebiroth on Aug 1 2017, 7:26 AM.

Details

Summary

Small extension to LSP to allow clients to use clangd to switch between C header files and source files.
Final version will use the completed clangd indexer to use the index of symbols to be able to switch from header to source file when the file names don't match.

Diff Detail

Repository
rL LLVM

Event Timeline

Nebiroth created this revision.Aug 1 2017, 7:26 AM
malaperle requested changes to this revision.Aug 1 2017, 7:41 AM
malaperle added inline comments.
clangd/ClangdServer.cpp
292 ↗(On Diff #109110)

this won't work for other extensions, we need to make this more generic. I believe you had a list of typical source files extensions and header extensions before?

294 ↗(On Diff #109110)

we need to try if the file exists otherwise try other typical header extensions

296 ↗(On Diff #109110)

needs to be more generic and handle more typical header extensions

test/clangd/hover.test
1 ↗(On Diff #109110)

this is from another patch

This revision now requires changes to proceed.Aug 1 2017, 7:41 AM

In the commit message, you could add that we will use and index of symbols later to be able to switch from header to source file when the file names don't match. This is more or less the justification of why we want this in Clangd and not on the client.

arphaman added inline comments.Aug 1 2017, 7:48 AM
clangd/ClangdServer.cpp
292 ↗(On Diff #109110)

LLVM has a wonderful sys::path library for path manipulations. You could rewrite these checks like this:

if (llvm::sys::path::extension(path) == ".cpp") {
 SmallString<128> NewPath = path;
 llvm::sys::path::replace_extension(NewPath, "h");
 return "\"" + NewPath.str() + "\"";
}
ilya-biryukov requested changes to this revision.Aug 1 2017, 9:12 AM
ilya-biryukov added a reviewer: ilya-biryukov.
ilya-biryukov added a subscriber: ilya-biryukov.

Also, +1 to the comments about file extensions, we have to cover at least .c, .cc and .cpp for source files, .h and .hpp for header files.

clangd/ClangdLSPServer.cpp
228 ↗(On Diff #109110)

We should probably use Uri here.

clangd/ClangdServer.h
185 ↗(On Diff #109110)

Please use descriptive typedefs for paths: Path (equivalent to std::string) for return type and PathRef (equivalent to StringRef) for parameter type.

185 ↗(On Diff #109110)

Maybe add a small comment for this function to indicate it's a fairly trivial helper?

clangd/ProtocolHandlers.cpp
260 ↗(On Diff #109110)

I don't see this method in the LSP spec.
Is this some kind of extension?

Could you add some comments on that with pointers to proposal/explanation of where this extension is used?

Nebiroth edited the summary of this revision. (Show Details)Aug 1 2017, 10:59 AM
Nebiroth updated this revision to Diff 109566.Aug 3 2017, 8:27 AM
Nebiroth edited edge metadata.
Nebiroth marked 7 inline comments as done.Aug 3 2017, 8:27 AM

[clangd] LSP extension to switch between source/header file

ilya-biryukov requested changes to this revision.Aug 4 2017, 2:05 AM

This is probably not a final version, but I feel my comments should be helpful anyway.
Also, we're missing testcases now.

clangd/ClangdLSPServer.cpp
225 ↗(On Diff #109566)

We use Params.uri.file, not Params.uri.uri when passing paths to clangd.
Also no need for extra local var: LangServer.Server.switchSourceHeader(Params.uri.file); should work.

225 ↗(On Diff #109566)

Naming: local vars should start with capital letter.

clangd/ClangdServer.cpp
295 ↗(On Diff #109566)

We should check all extensions in both upper-case and lower-case, not just .c and .C
(ideally, we could use case-insensitive comparisons).

302 ↗(On Diff #109566)

path is already a StringRef, no need to convert it.
std::string is also implicitly convertible to StringRef, you could pass std::string to every function that accepts a StringRef

307 ↗(On Diff #109566)

Maybe replace with:

if (std::find(DEFAULT_SOURCE_EXTESIONS, /*end iterator*/, llvm::sys::path::extensions(path)) {
  isSourceFile = true;
  foundExtension = true;
}
323 ↗(On Diff #109566)

Maybe use std::find here too?

339 ↗(On Diff #109566)

Don't need temp var here, there's StringRef::toVector method to convert StringRef to SmallString.

341 ↗(On Diff #109566)

Naming: EXTENSIONS_ARRAY is a local var, use UpperCamelCase.

Maybe use ArrayRef<std::string> for ExtensionsArray instead?

360 ↗(On Diff #109566)

Please use vfs::FileSystem instance, provided by FSProvider.getTaggedFileSystem() here for file accesses.

367 ↗(On Diff #109566)

Don't add quotes here, Uri constructor and Uri::unparse is responsible for doing that.

clangd/ClangdServer.h
186 ↗(On Diff #109566)

Naming: parameter names must be capitalized

185 ↗(On Diff #109110)

Maybe also change wording to indicate that this function does not mutate any state and is indeed trivial, i.e.:
Helper function that returns a path to the corresponding source file when given a header file and vice versa.

Code style: please end all comments with full stops.

clangd/ProtocolHandlers.cpp
260 ↗(On Diff #109110)

This comment was not addressed, probably marked as 'Done' by accident.

This revision now requires changes to proceed.Aug 4 2017, 2:05 AM
malaperle added inline comments.Aug 4 2017, 6:03 AM
clangd/ProtocolHandlers.cpp
260 ↗(On Diff #109110)

Is this some kind of extension?

It's briefly mentioned in the summary but could use some more detail.
The first client to use it is Theia but I wonder if it would be easy to add it to the VS Code extension, as an example.

Nebiroth updated this revision to Diff 111368.Aug 16 2017, 9:35 AM
Nebiroth edited edge metadata.
Nebiroth marked 8 inline comments as done.

Fixed diff comments.

ilya-biryukov added inline comments.Aug 17 2017, 3:22 AM
clangd/ClangdServer.cpp
295 ↗(On Diff #109566)

It turns out there's a very simple way to compare case-insetively. StringRef(str).compare_lower(str2).
Could we use that instead of extension duplicates?

302 ↗(On Diff #109566)

I don't think this comment was addressed. Why do we need pathDataRef variable?

Nebiroth updated this revision to Diff 111675.Aug 18 2017, 8:31 AM

Fixed more diff comments.

arphaman added inline comments.Aug 18 2017, 9:06 AM
clangd/ClangdServer.cpp
296 ↗(On Diff #111675)

You can use LLVM's function array_lengthof here and on the next line instead.

302 ↗(On Diff #111675)

It might be better to use a StringSet instead of an array of strings and use one lowercase/uppercase lookup:

llvm::StringSet<> DEFAULT_SOURCE_EXTENSIONS[] = {".cpp", ".c", ".cc", ".cxx", ".c++", ".m", ".mm"};
// Handles both lower and uppercase extensions and source/header files in one if/else construct:
if (DEFAULT_SOURCE_EXTENSIONS.count(llvm::sys::path::extension(path).lower())) {
  ...
  isSourceFile = true;
} else if (DEFAULT_HEADER_EXTENSIONS.count(llvm::sys::path::extension(path).lower())) {
  ...
  isSourceFile = false;
}

The function in ClangdServer seems to be getting more complicated and more complicated, mostly because of mixing up std::string, StringRef, SmallString.
Here's a slightly revamped implementation of your function, hope you'll find it (or parts of it) useful to simplify your implementation and address remaining comments.

// Note that the returned value is now llvm::Optional to clearly indicate no matching file was found.
llvm::Optional<Path> ClangdServer::switchSourceHeader(PathRef Path) {
  const StringRef SourceExtensions[] = {/*...*/; // only lower-case extensions
  const StringRef HeaderExtensions[] = {/*...*/}; // ...

  StringRef PathExt = llvm::sys::path::extension(Path);
  
  // Lookup in a list of known extensions.
  auto SourceIter = std::find(SourceExtensions, /*end iterator*/, PathExt, [](StringRef LHS, StringRef RHS) { return LHS.equals_lower(RHS); });
  bool IsSource = SourceIter != /*end iterator*/;
  
  // ...
  bool IsHeader = ...;

  // We can only switch between extensions known extensions.
  if (!IsSource && !IsHeader)
    return llvm::None;

  // Array to lookup extensions for the switch. An opposite of where original extension was found.
  ArrayRef<StringRef> NewExts = IsSource ? HeaderExtensions : SourceExtensions;

  // Storage for the new path.
  SmallString<128> NewPath;
  Path.toVector(NewPath);

  // Instance of vfs::FileSystem, used for file existence checks.
  auto FS = FSProvider.getTaggedFileSystem(Path).Value;

  // Loop through switched extension candidates.
  for (StringRef NewExt : NewExts) { 
    llvm::sys::path::replace_extension(NewPath, NewExt)
    if (FS->exists(NewPath))
      return NewPath.str().str(); // First str() to convert from SmallString to StringRef, second to convert from StringRef to std::string

    // Also check NewExt in upper-case, just in case.
    llvm::sys::path::replace_extension(NewPath, NewExt.upper())
    if (FS->exists(NewPath))
      return NewPath.str().str();
  }

  return llvm::None;
}
clangd/ClangdServer.cpp
404 ↗(On Diff #111675)

No need to create a URI here, this should be handled outside ClangdServer, just return a path with replaced extension.

Nebiroth updated this revision to Diff 114046.Sep 6 2017, 12:23 PM

Refactored switchSourceHeader function help from ilya
Added helper function to check for uppercase on current file.

Nebiroth updated this revision to Diff 114048.Sep 6 2017, 12:32 PM

Remove unintentional file addition

Updating D36150: [clangd] LSP extension to switch between source/header file

ll#

Nebiroth updated this revision to Diff 114054.Sep 6 2017, 12:42 PM

Some more code cleanup.

ilya-biryukov requested changes to this revision.Sep 8 2017, 1:46 AM

Just a few more comments. Should be easy to fix.
Could you also please clang-format the code?

clangd/ClangdLSPServer.cpp
225 ↗(On Diff #114054)

What if Result does not contain a value?

225 ↗(On Diff #114054)

Use *Result instead of introducing an extra variable?

229 ↗(On Diff #114054)

We should convert paths here. Convert to URI and use unparse.

Uri ResultUri = Uri::fromFile(*Result);
//....
        R"(,"result":)" + Uri::unparse(ResultUri) + R"(})");
//...
clangd/ClangdServer.cpp
325 ↗(On Diff #114054)

Instead of an checkUpperCaseExtensions and two iterations use a single find_if and equals_lower:

// Checks for both uppercase and lowercase matches in a single loop over extensions array.
std::find_if(std::begin(SourceExtensions), std::end(SourceExtensions), [](PathRef SourceExt) {
  return SourceExt.equals_lower(PathExt);
});
363 ↗(On Diff #114054)

Don't add "file://" or quoting here, simply return NewPath.

clangd/ClangdServer.h
186 ↗(On Diff #114054)

Any reason not to put it into anonymous namespace inside .cpp file?

clangd/ProtocolHandlers.cpp
214 ↗(On Diff #114054)

Code style: we don't use braces around small single-statement control structures

This revision now requires changes to proceed.Sep 8 2017, 1:46 AM
Nebiroth updated this revision to Diff 114424.Sep 8 2017, 1:16 PM
Nebiroth edited edge metadata.
Nebiroth marked 7 inline comments as done.

Ran clang-format on modified files.
Minor refactoring.

ilya-biryukov requested changes to this revision.EditedSep 11 2017, 3:28 AM

Overall the code looks good, thanks for the clean-ups.

Only a few minor style comments and a major one about return value when no file is matching.
Please add tests for the new feature.

clangd/ClangdLSPServer.cpp
227 ↗(On Diff #114424)

Maybe remove this empty line?

231 ↗(On Diff #114424)

Replace with more concise form: if (Result)

234 ↗(On Diff #114424)

Running unparse on an instance created via URI::fromFile("") will result in "file:///".

Have you considered returning a list of paths as a result from switchHeaderSource?
That way you could capture the "no file matched" case with an empty list.
Later when an implementation will look into index, that would allow to return multiple source files for a header file, which also happens in some C++ projects.

Returning an empty string is also an option we could start with.

clangd/ClangdServer.cpp
22 ↗(On Diff #114424)

Do we really need this using namespace or it can be deleted?

345 ↗(On Diff #114424)

Code style: we don't use braces around small single-statement control structures

clangd/ClangdServer.h
185 ↗(On Diff #114424)

Could you please clang-format this file too?
Comment line length is clearly over the limit.

This revision now requires changes to proceed.Sep 11 2017, 3:28 AM
Nebiroth marked 5 inline comments as done.Sep 11 2017, 12:04 PM
Nebiroth added inline comments.
clangd/ClangdLSPServer.cpp
234 ↗(On Diff #114424)

I think I will start by returning an empty string for now. Returning a list of paths sounds like a good idea once an indexer is implemented, but that would require changing some parts of the code like find_if which returns only the first instance of a match.

Nebiroth updated this revision to Diff 114652.Sep 11 2017, 12:29 PM
Nebiroth edited edge metadata.

Return value when no file is found is now an empty string instead of file://
Minor code style changes and cleanup.
Ran clang-format on ClangdServer.h

ilya-biryukov requested changes to this revision.Sep 12 2017, 7:19 AM

Overall looks good, but still needs at least a few tests.

This revision now requires changes to proceed.Sep 12 2017, 7:19 AM

Overall looks good, but still needs at least a few tests.

I have a test for this commit that uses included source and header files, would that be okay to do or should I generate files dynamically? If so, how can I accomplish this?

Overall looks good, but still needs at least a few tests.

I have a test for this commit that uses included source and header files, would that be okay to do or should I generate files dynamically? If so, how can I accomplish this?

You are right, unfortunately we don't have a good story for multi-file tests yet.
Let's add a unit-test for the implementation in ClangdServer. This can be easily done by using MockFSProvider. You may find some examples in clang-tools-extra/unittest/clangd/ClangdTests.cpp

Nebiroth updated this revision to Diff 116038.Sep 20 2017, 11:21 AM
Nebiroth edited edge metadata.

Added unit test.

ilya-biryukov accepted this revision.EditedSep 21 2017, 6:57 PM

Looks good modulo a few NITS.
Let me know if you need help landing this. (In case you don't have commit access, I can land it for you)

unittests/clangd/ClangdTests.cpp
910 ↗(On Diff #116038)

Code style: local variables are UpperCamelCase

916 ↗(On Diff #116038)

Code style: local variables are UpperCamelCase

967 ↗(On Diff #116038)

The text is over the line limit.
Could you please clang-format every submission?

Nebiroth updated this revision to Diff 116387.Sep 22 2017, 12:35 PM
Nebiroth marked 2 inline comments as done.

Rebased on latest version.
Corrected code style issues in test file.

ilya-biryukov accepted this revision.Sep 22 2017, 2:32 PM

Looks good.
Do you want me to submit this patch for you?

Looks good.
Do you want me to submit this patch for you?

Yes please.

This revision was automatically updated to reflect the committed changes.

I submitted the patch. Thanks a lot William and Ilya!

Sorry for delays
@malaperle, thanks for submitting the patch.