Page MenuHomePhabricator

[clangd] Code hover for Clangd
ClosedPublic

Authored by Nebiroth on Jul 26 2017, 8:24 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Nebiroth updated this revision to Diff 127413.Dec 18 2017, 2:13 PM
Nebiroth marked 3 inline comments as done.
findHover properly returns an error
Removed many cases of bad merging
Separated getHover in two functions
Minor indenting and NIT fixes
sammccall added inline comments.
clangd/ClangdUnit.cpp
941 ↗(On Diff #127413)

Doesn't this slow down parsing by a lot? I thought @ilya-biryukov had seen that it was responsible for big performance losses in code complete.

ISTM we might prefer to look up the comment from the index by USR, and just live without it if we can't find it.

(Apologies if this has been discussed)

clangd/ClangdUnit.h
1 ↗(On Diff #127413)

Please don't cram more stuff into clangdunit that's not directly related to building ASTs and managing their lifetimes. Instead, can we pull as much as possible out into Documentation.h or similar?

Note not Hover.h as there are other use cases, like we should use this logic for the code completion item documentation too.

test/clangd/hover.test
1 ↗(On Diff #127413)

Could I ask you to do primary/exhaustive testing of this with unit tests, and just a basic smoke test with lit?
Then it's possible to read the tests, understand whether they're correct, and understand the failure messages.

I'm trying to pull more of the features in this direction - see CodeCompleteTests.cpp for an example of how this can work.

We've had quite a few cases of bugs that had tests, but nobody noticed that the tests were wrong because it's just not possible to read them.

simark updated this revision to Diff 133367.Feb 7 2018, 8:59 PM
simark added a subscriber: simark.

New version

Here's a new version of the patch, quite reworked. I scaled down the scope a
little bit to make it simpler (see commit log). We'll add more features later
on.

Thanks for picking this up!
I have just one major comment related to how we generate the hover text.

Current implementation tries to get the actual source code for every declaration and then patch it up to look nice on in the output.
It is quite a large piece of code (spanning most of the patch) and tries to mix and match source code and output from the AST when there's not enough information in the AST (e.g. tryFixupIndentation and stripOpeningBracket). This approach is really complex as we'll inevitably try to parse "parts" of C++ and figure out how to deal with macros, etc.
Worse than that, I would expect this code to grow uncontrollably with time and be very hard to maintain.
Moreover, in presence of macros it arguably gives non-useful results. For example, consider this:

#define DEFINITIONS int foo(); int bar(); int baz();

// somewhere else
DEFINITIONS

// somewhere else
int test() {
  foo(); // <-- hover currently doesn't work here. And even if it did, showing a line with just DEFINITIONS is not that useful.
}

I suggest we move to a different approach of pretty-printing the relevant parts of the AST. It is already implemented in clang, handles all the cases in the AST (and will be updated along when AST is changed), shows useful information in presence of macros and is much easier to implement.
The version of getHoverContents using this is just a few lines of code:

static std::string getHoverContents(const Decl *D) {
  if (TemplateDecl *TD = D->getDescribedTemplate())
    D = TD; // we want to see the template bits.

  std::string Result;
  llvm::raw_string_ostream OS(Result);

  PrintingPolicy Policy(D->getASTContext().getLangOpts());
  Policy.TerseOutput = true;
  D->print(OS, Policy);

  OS.flush();
  return Result;
}

It doesn't add the "Defined in ..." piece, but illustrates the APIs we can use. We should use the same APIs for the scope as well, avoiding fallbacks to manual printing if we can.
If there's something missing/wrong in the upstream pretty printers, it's fine and we can fix them (e.g., the thing that I've immediately noticed is that namespaces are always printed with curly braces).

clangd/ClangdLSPServer.cpp
325 ↗(On Diff #133367)

NIT: remove the empty line at the start of function

clangd/Protocol.cpp
324 ↗(On Diff #133367)

NIT: use nullptr instead of NULL
(irrelevant if we use StringRef, see other comment below)

331 ↗(On Diff #133367)
  • StringRef is a nicer abstraction, maybe use it instead of const char*?
  • Maybe move declaration of KindStr closer to its usage?
  • Maybe extract a helper function to mark the code path for invalid kinds as unreachable? I.e.
static StringRef toTextKind(MarkupKind Kind) {
  switch (Kind) {
    case MarkupKind::PlainText:
      return "plaintext";
    case MarkupKind::Markdown:
      return "markdown";
  }
  llvm_unreachable("Invalid MarkupKind");
}
clangd/XRefs.cpp
326 ↗(On Diff #133367)

Same suggestion as before. Could we extract a helper function to mark invalid enum values unreachable?

552 ↗(On Diff #133367)

This assert seems rather unlikely and just makes the code more complex. Let's assume it's true and remove it (along with the subsequent if statement)

560 ↗(On Diff #133367)

This assert seems rather unlikely and just makes the code more complex. Let's assume it's true and remove it (along with the subsequent if statement)

test/clangd/hover.test
2 ↗(On Diff #133367)

We have a more readable format for the lit-tests now. Could we update this test to use it?
(see other tests in test/clangd for example)

unittests/clangd/XRefsTests.cpp
231 ↗(On Diff #133367)

NIT: remove empty line at the start of the function

233 ↗(On Diff #133367)

NIT: Use StringRef instead of const char*

simark added a comment.Feb 8 2018, 1:04 PM

Thanks for picking this up!
I have just one major comment related to how we generate the hover text.

Current implementation tries to get the actual source code for every declaration and then patch it up to look nice on in the output.
It is quite a large piece of code (spanning most of the patch) and tries to mix and match source code and output from the AST when there's not enough information in the AST (e.g. tryFixupIndentation and stripOpeningBracket). This approach is really complex as we'll inevitably try to parse "parts" of C++ and figure out how to deal with macros, etc.
Worse than that, I would expect this code to grow uncontrollably with time and be very hard to maintain.
Moreover, in presence of macros it arguably gives non-useful results. For example, consider this:

#define DEFINITIONS int foo(); int bar(); int baz();

// somewhere else
DEFINITIONS

// somewhere else
int test() {
  foo(); // <-- hover currently doesn't work here. And even if it did, showing a line with just DEFINITIONS is not that useful.
}

I suggest we move to a different approach of pretty-printing the relevant parts of the AST. It is already implemented in clang, handles all the cases in the AST (and will be updated along when AST is changed), shows useful information in presence of macros and is much easier to implement.
The version of getHoverContents using this is just a few lines of code:

static std::string getHoverContents(const Decl *D) {
  if (TemplateDecl *TD = D->getDescribedTemplate())
    D = TD; // we want to see the template bits.

  std::string Result;
  llvm::raw_string_ostream OS(Result);

  PrintingPolicy Policy(D->getASTContext().getLangOpts());
  Policy.TerseOutput = true;
  D->print(OS, Policy);

  OS.flush();
  return Result;
}

It doesn't add the "Defined in ..." piece, but illustrates the APIs we can use. We should use the same APIs for the scope as well, avoiding fallbacks to manual printing if we can.
If there's something missing/wrong in the upstream pretty printers, it's fine and we can fix them (e.g., the thing that I've immediately noticed is that namespaces are always printed with curly braces).

I thought it would be nice to show in the hover the declaration formatted as it is written in the source file, because that's how the developer is used to read it. But on the other hand, I completely agree that trying to do string formatting ourselves is opening a can of worms. I'll try the approach you suggest.

simark added inline comments.Feb 8 2018, 2:04 PM
clangd/ClangdLSPServer.cpp
325 ↗(On Diff #133367)

Done.

clangd/Protocol.cpp
324 ↗(On Diff #133367)

Ack.

331 ↗(On Diff #133367)

Done.

clangd/XRefs.cpp
326 ↗(On Diff #133367)

Done.

552 ↗(On Diff #133367)

Done...

560 ↗(On Diff #133367)

... and done.

test/clangd/hover.test
2 ↗(On Diff #133367)

Indeed the new format looks much more readable. I'll modify this one to look like completion.test.

unittests/clangd/XRefsTests.cpp
231 ↗(On Diff #133367)

Done.

233 ↗(On Diff #133367)

Done.

simark added a comment.EditedFeb 9 2018, 6:39 AM

Another example where pretty-printing the AST gives better results:

int x, y = 5, z = 6;

Hover the z will now show int z = 6, before it would have shown int x, y = 5, z = 6. I think new version is better because it only shows the variable we care about.

Is there something similar we can do when printing macros, or for them the approach of getting the text from the source file is still the best thing we have?

simark added a comment.EditedFeb 9 2018, 7:08 AM

The only problem I see is that when hovering a function/struct name, it now prints the whole function/struct definition. When talking with @malaperle, he told me that you had discussed it before and we should not have the definition in the hover, just the prototype for a function and the name (e.g. "struct foo") for a struct. Glancing quickly at the PrintingPolicy, I don't see a way to do that. Do you have any idea?

simark added a comment.Feb 9 2018, 7:32 AM

The only problem I see is that when hovering a function/struct name, it now prints the whole function/struct definition. When talking with @malaperle, he told me that you had discussed it before and we should not have the definition in the hover, just the prototype for a function and the name (e.g. "struct foo") for a struct. Glancing quickly at the PrintingPolicy, I don't see a way to do that. Do you have any idea?

Arrg, Sorry, I just re-read your comment and saw that you used TerseOutput which does that. I did not catch that when transcribing the code (I did not want to copy paste to understand the code better, but got bitten).

The only problem I see is that when hovering a function/struct name, it now prints the whole function/struct definition. When talking with @malaperle, he told me that you had discussed it before and we should not have the definition in the hover, just the prototype for a function and the name (e.g. "struct foo") for a struct. Glancing quickly at the PrintingPolicy, I don't see a way to do that. Do you have any idea?

Arrg, Sorry, I just re-read your comment and saw that you used TerseOutput which does that. I did not catch that when transcribing the code (I did not want to copy paste to understand the code better, but got bitten).

Yeah, the TerseOutput bit was important :-)

The only problem I see is that when hovering a function/struct name, it now prints the whole function/struct definition. When talking with @malaperle, he told me that you had discussed it before and we should not have the definition in the hover, just the prototype for a function and the name (e.g. "struct foo") for a struct. Glancing quickly at the PrintingPolicy, I don't see a way to do that. Do you have any idea?

I'm not aware of the code that pretty-prints macros. There's a code that dumps debug output, but it's not gonna help in that case.
Alternatively, we could simply print the macro name and not print the definition. That's super-simple and is in line with hovers for the AST decls. Let's do that in the initial version.

For macros using the source code is also less of a problem, so we may try to do it. C++ doesn't allow to generate a macro inside another macro, so we don't run into the same problems we may hit with the AST. And the overall syntax is much simpler than for AST, so we can hope to get string manipulation right with reasonable effort. (Simply reindenting should do the trick).

I'm not aware of the code that pretty-prints macros. There's a code that dumps debug output, but it's not gonna help in that case.
Alternatively, we could simply print the macro name and not print the definition. That's super-simple and is in line with hovers for the AST decls. Let's do that in the initial version.

Ok, I'm considering printing "#define MACRO". Just printing the name would not be very useful, at least printing "#define" gives the information that the hovered entity is defined as a macro (and not a function or variable).

Is there a way to get the macro name from the MacroInfo object? I couldn't find any, so the solution I'm considering is to make DeclarationAndMacrosFinder::takeMacroInfos return an std::vector<std::pair<StringRef, const MacroInfo *>>, where the first member of the pair is the macro name. It would come from IdentifierInfo->getName() in DeclarationAndMacrosFinder::finish. Does that make sense, or is there a simpler way?

Is there a way to get the macro name from the MacroInfo object? I couldn't find any, so the solution I'm considering is to make DeclarationAndMacrosFinder::takeMacroInfos return an std::vector<std::pair<StringRef, const MacroInfo *>>, where the first member of the pair is the macro name. It would come from IdentifierInfo->getName() in DeclarationAndMacrosFinder::finish. Does that make sense, or is there a simpler way?

I don't think there's a way to get macro name from MacroInfo. pair<Name, MacroInfo*> sounds good to me, I'd probably even use a named struct here: struct MacroDecl { StringRef Name; const MacroInfo &Info; }

simark updated this revision to Diff 134306.Feb 14 2018, 1:45 PM

Generate Hover by pretty-printing the AST

This new version of the patch generates the hover by pretty-printing the AST.
The unit tests are updated accordingly. There are some inconsistencies in how
things are printed. For example, I find it a bit annoying that we print empty
curly braces after class names (e.g. "class Foo {}"). Also, namespace and
enums are printed with a newline between the two braces. These are things that
could get fixed by doing changes to the clang AST printer.

The hover.test test now uses a more readable format.

simark added inline comments.Feb 14 2018, 2:36 PM
unittests/clangd/XRefsTests.cpp
561 ↗(On Diff #134306)

Note that I used .str() here to make the output of failing tests readable and useful. By default, gtest tries to print StringRef as if it was a container. I tried to make add a PrintTo function to specify how it should be printed, but couldn't get it to work (to have it called by the compiler), so I settled for this.

Just a few last remarks and this is good to go.
Should I land it for you after the last comments are fixed?

clangd/XRefs.cpp
354 ↗(On Diff #134306)

We should call flush() before returning Name here. The raw_string_ostream is buffered.

373 ↗(On Diff #134306)

NIT: use llvm::None here instead of {}

394 ↗(On Diff #134306)

Accidental leftover from previous code?

unittests/clangd/XRefsTests.cpp
262 ↗(On Diff #134306)

NIT: LLVM uses UpperCamelCase for field names.

561 ↗(On Diff #134306)

Thanks for spotting that.
We have a fix for that in LLVM's gtest extensions (D43330).
str() can now be removed.

simark added inline comments.Feb 15 2018, 8:29 AM
clangd/XRefs.cpp
354 ↗(On Diff #134306)

I replaced it with Stream.str().

373 ↗(On Diff #134306)

Done.

394 ↗(On Diff #134306)

Oops, thanks.

unittests/clangd/XRefsTests.cpp
262 ↗(On Diff #134306)

Ok, I fixed all the structs this patch adds.

561 ↗(On Diff #134306)

Removed. Even if this patch is merged before D43330 it's not a big deal, since it only affects debug output of failing test cases.

simark updated this revision to Diff 134437.Feb 15 2018, 8:33 AM

New version

  • Rebase on master, change findHover to have a callback-based interface
  • Fix nits from previous review comments

Just a few last remarks and this is good to go.
Should I land it for you after the last comments are fixed?

I just received my commit access, so if you are ok with it, I would try to push it once I get your final OK.

Only the naming of fields left.

Also, please make sure to run git-clang-format on the code before submitting to make sure it's formatted properly.

clangd/ClangdLSPServer.cpp
331 ↗(On Diff #134437)

NIT: use return replyError to be consistent with other methods.

clangd/XRefs.cpp
354 ↗(On Diff #134306)

Also works, albeit less efficient (will probably do a copy where move is enough).

unittests/clangd/XRefsTests.cpp
262 ↗(On Diff #134306)

Argh, sorry that I didn't mention it before, but we have an exception for structs in Procotol.h, they follow the spelling in the LSP spec (i.e. should be lowerCamelCase).

Only the naming of fields left.

Also, please make sure to run git-clang-format on the code before submitting to make sure it's formatted properly.

Ahh, I was running clang-format by hand, didn't know about git-clang-format (and obviously forgot some files). Thanks.

clangd/ClangdLSPServer.cpp
331 ↗(On Diff #134437)

Ok.

clangd/XRefs.cpp
354 ↗(On Diff #134306)

Ah, didn't think about that. I'll change it to flush + return Name.

unittests/clangd/XRefsTests.cpp
262 ↗(On Diff #134306)

Haha, no problem, changing them back is one keystroke away.

simark updated this revision to Diff 134440.Feb 15 2018, 8:54 AM

Fix some nits

  • Revert case change of struct fields in Protocol.h
  • Change return formatting in onHover
  • Use flush() + return Name in NamedDeclQualifiedName and TypeDeclToString
ilya-biryukov accepted this revision.Feb 15 2018, 9:18 AM

Thanks for fixing all the NITs!

simark updated this revision to Diff 134452.Feb 15 2018, 10:17 AM

Fix field names in XRefsTests.cpp

I realized I forgot to add some fixups in the previous version, the field names
in XRefsTests.cpp were not updated and therefore it did not build.

malaperle updated this revision to Diff 134708.Feb 16 2018, 1:19 PM

Add a missing std::move, fix some formatting.

malaperle accepted this revision.Feb 16 2018, 1:39 PM

Works well and code looks good. There were only minor tweaks since the last approval. I'll land this since there are some issues with Simon's svn account.

This revision is now accepted and ready to land.Feb 16 2018, 1:39 PM
Closed by commit rL325395: [clangd] Implement textDocument/hover (authored by malaperle, committed by ). · Explain Why
This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.