This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Implementation of textDocument/documentSymbol
ClosedPublic

Authored by malaperle on Jun 6 2018, 2:11 PM.

Details

Summary

An AST-based approach is used to retrieve the document symbols rather than an
in-memory index query. The index is not an ideal fit to achieve this because of
the file-centric query being done here whereas the index is suited for
project-wide queries. Document symbols also includes more symbols and need to
keep the order as seen in the file.

Signed-off-by: Marc-Andre Laperle <marc-andre.laperle@ericsson.com>

Event Timeline

malaperle created this revision.Jun 6 2018, 2:11 PM
malaperle added inline comments.Jun 6 2018, 2:41 PM
unittests/clangd/FindSymbolsTests.cpp
40

I updated the other tests to use this in https://reviews.llvm.org/D47847

tomgr added a subscriber: tomgr.Jun 21 2018, 5:14 AM
malaperle planned changes to this revision.Jun 26 2018, 9:19 PM

I found some issues while testing, I will investigate before review.

This looks great! Would like to get your thoughts on reusing/not reusing SymbolCollector.

clangd/FindSymbols.cpp
181

I guess the alternative here is to use SymbolCollector and then convert Symbol->SymbolInformation (which we should already have for workspace/symbol).

I also guess there's some divergence in behavior, which is why you went this route :-)
Mostly in filtering? (I'd think that for e.g. printing, we'd want to be consistent)

Do you think it's worth adding enough customization to SymbolCollector to make it usable here? Even if it was making shouldFilterDecl a std::function, there'd be some value in unifying the rest. WDYT?

clangd/SourceCode.cpp
194

the common case when tryGetRealPathName() is empty seems to be when it's a file that was part of the preamble we're reusing.
Does this fallback tend to give the same answer in that case? (If so, great! I know some other places we should reuse this function!)

unittests/clangd/FindSymbolsTests.cpp
447

in a perfect world, I think the template definitions might be printed Tmpl<T>, Tmpl<int>. Not sure about Tmpl::x vs Tmpl<T>::x though. WDYT?

(Not necessarily worth doing this patch, keep it simple)

sammccall added inline comments.Jun 28 2018, 1:16 AM
clangd/SourceCode.cpp
194

Sorry, I just noticed this is only being moved. Please disregard (and thangs for moving it somewhere common).
Note that D48687 modifies this function, let's make sure not to lose anything in the merge.

malaperle added inline comments.Jun 29 2018, 1:22 PM
clangd/FindSymbols.cpp
181

I put a bit of the justification in the summary, perhaps you missed it or maybe did you think it was not enough of a justification?

An AST-based approach is used to retrieve the document symbols rather than an
in-memory index query. The index is not an ideal fit to achieve this because of
the file-centric query being done here whereas the index is suited for
project-wide queries. Document symbols also includes more symbols and need to
keep the order as seen in the file.

It's not a clear cut decision but I felt that there was more diverging parts than common and that it would be detrimental to SymbolCollector in terms of added complexity.
Differences:

  • We need a different way to filter (as you suggested)
  • Don't need anything about completion
  • Don't need to distinguish declaration vs definition or canonical declaration
  • Don't need a map<USR, Symbol>, we just need them in same order as they appear and SymbolCollector should be free to store them in whichever order for purposes of proving a project-wide collection of symbols
  • Don't want to track references
  • DocumentSymbols needs symbols in main files

Common things:

  • Both need to generate Position/Uri from a SourceLocation. But they can both call sourceLocToPosition.
  • Both need to generate a symbol name from Decl&. But they can both call AST.h's printQualifiedName (I fixed this in a new version).
  • Both use/extend a IndexDataConsumer. Not worth sharing the same code path just to not have to create a class definition IMO.

Let me know if that makes sense to you, otherwise I can try to make another version that uses SymbolCollector.

unittests/clangd/FindSymbolsTests.cpp
447

I'm thinking Tmpl<class T>, Tmpl<int>, so that we can more easily distinguish between a primary and specialization, especially in partial specializations and when the generic type might not be just a single letter. And Tmpl<class T>::x. Maybe this is too much :)

malaperle updated this revision to Diff 153611.Jun 29 2018, 9:16 PM

Fix handling of externs, definition vs declaration and call more common code.

sammccall accepted this revision.Jul 5 2018, 12:07 AM

Sorry, I thought I'd accepted this already!

clangd/ClangdServer.cpp
471

hover -> documentSymbols

clangd/FindSymbols.cpp
188

OS is unused

203

nit: rename to shouldIncludeSymbol or something? "filter" is slightly ambiguous, and a negation. (We've made this change in SymbolCollector recently)

229

nit: isWrittenInMainFile(NameLoc)

234

remove

256

nit: for readability, could we declare a local SymbolInformation above, fill in the fields by name, and then move it into Symbols?

a) it's easier to see that the right values go into the right slots, b), it's easier to search (find references!) for how fields are populated.

268

applying the filter is fine, but note that SystemSymbolFilter only applies to included symbols in -isystem headers (I only just noticed that myself). So the comment is a little misleading, I would drop it.

This revision is now accepted and ready to land.Jul 5 2018, 12:07 AM

(Reasoning for not using SymbolCollector totally makes sense, thanks for the breakdown!)

malaperle updated this revision to Diff 154280.Jul 5 2018, 12:01 PM
malaperle marked 7 inline comments as done.

Address comments.

Thanks a lot for the great comments (as always)!

This revision was automatically updated to reflect the committed changes.