This is an archive of the discontinued LLVM Phabricator instance.

[clangd] automatically index STL
AbandonedPublic

Authored by kuhnel on Jan 8 2021, 3:51 AM.

Details

Reviewers
sammccall
Summary

This addresses issue 403.

Goal
Provide code completion for entire STL, even if it was not yet used in the project and thus is not contained in any index.

Design:
Generate a header file that includes all of STL. Then inject this file into the background indexer.

Pros:

  • relatively easy to implement, reusing background index for the hard work

Cons:

  • need to write to a file
  • file needs to be close to project to find compilation database
  • not sure when file can be deleted after background indexer finishes
  • need to check with every opened file if we still need to perform the STL indexing

Notes:
to enable the feature run clangd with the new flag --index-stl.

Diff Detail

Event Timeline

kuhnel created this revision.Jan 8 2021, 3:51 AM
kuhnel requested review of this revision.Jan 8 2021, 3:51 AM
kuhnel edited the summary of this revision. (Show Details)Jan 8 2021, 4:07 AM
kuhnel added a reviewer: sammccall.
kuhnel added a comment.Jan 8 2021, 4:10 AM

@sammccall here the first implementation we talked about yesterday. This works well on my testing poject that has a compile_commands.json. It does not work without one.

Also I'm not really happy with the design as I haven't really found a good place to store the generated header file.

Should we look into implementing a separate Index to only handle the STL?

(not sure if you were looking for comments yet, but i was just passing by and it was a small-ish patch, so couldn't resist :D)

clang-tools-extra/clangd/index/Background.cpp
449

In theory BackgroundIndex only reads headers from the FS, so we can provide an in-memory buffer as the main file itself. prepareCompilerInstance in Compiler.h (and used by BackgroundIndex::index does that exactly).

It might be better to just have a separate endpoint in BackgroundIndex (as indexSTLHeaders) that is invoked once at construction time that enqueues indexing of this phantom file.

WDYT?

clang-tools-extra/clangd/tool/ClangdMain.cpp
527

do we really need to hide this behind a flag ? it sounds like a quite useful feature to me with minimal risk of regressions and unlikely to make anyone upset. in the end we are not doing anything heuristically and just throwing clang on some STL headers, that'll probably be included within the TUs eventually.

there's always the risk of crashing while parsing some STL headers, but i don't think it is any different than user just including the header manually.

the biggest problem i can see is people using custom STL headers, but hopefully compile flags interpolation logic should be able to infer the relevant location for those. and in the cases it fails we either index the default STL and suggest people some symbols their implementation might lack, or fail to find STL at all and print some logs for the missing includes.

kuhnel added a comment.Jan 8 2021, 4:28 AM

Yes, I was looking for feedback on this one as I wasn't happy with my design.

clang-tools-extra/clangd/index/Background.cpp
449

Agree to both. I'll take a look.

clang-tools-extra/clangd/tool/ClangdMain.cpp
527

Back when I was developing embedded software, we couldn't use the STL for various reasons (object size, no exceptions, no dynamic memory, ...) and had our own, proprietary base library. In such a scenario it would be annoying if clangd would be proposing things I could not use in the project.

So we should have a way for users to disable this. I'm fine if we switch it on by default and offer a --disable-stl-index flag instead.

How does this approach work with differing language standards. For example with an old implementation designed for c++14, string_view header won't exist. If the implementation is designed to work with c++17, including that header in c++14 mode will probably be ifdef... Error.

How does this approach work with differing language standards. For example with an old implementation designed for c++14, string_view header won't exist. If the implementation is designed to work with c++17, including that header in c++14 mode will probably be ifdef... Error.

As I mentioned in my comments for putting this behind a flag, in such scenarios we should just hit some "missing includes" while indexing the phantom file, so all should be fine.

clang-tools-extra/clangd/tool/ClangdMain.cpp
527

I see. It still feels like those developers would be less inclined to accept autocompletion of std::vector compared to xyz::vector and also our ranking should be doing a good job of promoting xyz::vector due to its high number of usage, compared to std::vector.

I am mostly unhappy about the flag as it needs propagation to many components. It might be better to have it as a config option, which requires a lot less plumbing (at least for bg-index), if you think even the down-ranking of std symbols wouldn't be a good experience.

How does this approach work with differing language standards. For example with an old implementation designed for c++14, string_view header won't exist. If the implementation is designed to work with c++17, including that header in c++14 mode will probably be ifdef... Error.

Yeah, this seems at least annoying. There's some spectrum of:

  • build indexes per language version and dynamically switch between them based on the file
  • make this variable in some global way (e.g. a flag, or first file opened chooses the language version)
  • language version is always (e.g.) c++14, hopefully it's better than nothing

Similar to the no-stdlib question and the multiple-languages question, this points to a separate index layer being a stronger design to allow dynamically swapping it (though worse can be better, too).

clang-tools-extra/clangd/ClangdServer.h
161

nit: "the standard library"

for two reasons:

  • "STL" is no longer the correct name, even for C++
  • languages other than C++ have standard libraries (C in particular) that we could also control with this flag
clang-tools-extra/clangd/index/Background.cpp
451

separate from the issue of virtual vs real file, it's not clear to me why we *want* this file to be close to the project rather than in some completely anonymous location.

(I have a few guesses, but...)

clang-tools-extra/clangd/index/Background.h
160

this can happen 0+ times (roughly it happens once every time an enumerable CDB is discovered).

Instead, we'd want it to happen exactly once, or possibly 0-1 times if we're going to be language-sensitive.

clang-tools-extra/clangd/tool/ClangdMain.cpp
527

Yeah, this is a good point.

Having this configurable would be valuable, but the useful granularity for such an option is per-file (Config) rather than a global clangd flag, as it's codebase-specific.

Unfortunately that doesn't really square with having it in the background index.

nridge added a subscriber: nridge.Jan 10 2021, 4:29 PM
kuhnel abandoned this revision.Jan 22 2021, 5:58 AM

thx for the review and feedback!

I guess I'll re-design the feature and add a new indexing layer instead.
However that might take a while, so I'll abandon this revision and start a new one later on.