This is an archive of the discontinued LLVM Phabricator instance.


Differential D115232

[clangd] Indexing of standard library
Closed · Public

Authored by sammccall on Dec 7 2021, 3:27 AM.

Details

Reviewers

kadircet

Commits

rGecaa4d9662c9: [clangd] Indexing of standard library

Summary

This provides a nice "warm start" with all headers indexed, not just
those included so far.

The standard library is indexed after a preamble is parsed, using that
file's configuration. The result is pushed into the dynamic index.
If we later see a higher language version, we reindex it.

It's configurable as Index.StandardLibrary, off by default for now.

Based on D105177 by @kuhnel

Fixes https://github.com/clangd/clangd/issues/618
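
The "reindex if we later see a higher language version" rule from the summary can be sketched roughly as below. This is a simplified model and not the patch's `StdLibSet`: the `LangVersion` enum, the `StdLibTracker` class, and `shouldIndex` are hypothetical names chosen for illustration (only the name `isBest` echoes a call visible in the diff, and its semantics here are assumed).

```cpp
#include <atomic>

// Hypothetical ordering of language versions; higher values are newer.
enum class LangVersion : int {
  None = 0, Cxx98 = 1, Cxx11 = 2, Cxx14 = 3, Cxx17 = 4, Cxx20 = 5
};

// Illustrative tracker: remembers the newest version for which indexing
// has been requested so far.
class StdLibTracker {
public:
  // Returns true if the caller should (re)index the standard library,
  // i.e. V is newer than anything requested before.
  bool shouldIndex(LangVersion V) {
    int New = static_cast<int>(V);
    int Old = Best.load();
    while (Old < New) {
      // On failure, Old is refreshed with the current value and the
      // loop condition is re-checked.
      if (Best.compare_exchange_weak(Old, New))
        return true;
    }
    return false;
  }

  // True if V is still at least as new as the newest version requested.
  // A finished indexing task can check this before publishing, so that a
  // slow older-version run does not overwrite a newer one.
  bool isBest(LangVersion V) const {
    return static_cast<int>(V) >= Best.load();
  }

private:
  std::atomic<int> Best{static_cast<int>(LangVersion::None)};
};
```

With this model, a second file compiled at the same or a lower version triggers no extra work, while the first file seen at a higher version triggers one reindex.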

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sammccall created this revision. Dec 7 2021, 3:27 AM

Herald added subscribers: usaxena95, kadircet, arphaman and 3 others. Dec 7 2021, 3:27 AM

sammccall requested review of this revision. Dec 7 2021, 3:27 AM

Herald added a project: Restricted Project. Dec 7 2021, 3:27 AM

Herald added subscribers: cfe-commits, MaskRay, ilya-biryukov.

Harbormaster completed remote builds in B137860: Diff 392332. Dec 7 2021, 3:46 AM

remove parts split into other patches

Harbormaster completed remote builds in B138033: Diff 392582. Dec 7 2021, 4:42 PM

Tests, polish, comments

Herald added a subscriber: kristof.beyls. Dec 7 2021, 5:40 PM

sammccall added a reviewer: kadircet. Dec 7 2021, 5:40 PM

Harbormaster completed remote builds in B138048: Diff 392605. Dec 7 2021, 6:20 PM

nridge added a subscriber: nridge. Dec 8 2021, 9:54 PM

sammccall mentioned this in D115768: [clangd] Proof of concept: indexing after the preamble is built. Dec 15 2021, 1:56 AM

rZhBoYao added a subscriber: rZhBoYao. Jan 12 2022, 4:23 PM

thanks LG, mostly nits and a couple of questions

clang-tools-extra/clangd/ClangdServer.cpp
90	I suppose this should be rare and hence won't bite us in practice, but it might be worth having a comment mentioning that this creates tasks with no barriers.
clang-tools-extra/clangd/index/StdLib.cpp
45	`llvm_unreachable` instead?
50	nit: this feels a little bit hard to read, what about: if(2b) return 2b; if(20) return 20; ... return 98;
57	same here
93	maybe move second half of this comment into `buildUmbrella` ?
122	s/our our/to our/
132	drop static, and move into previous anonymous namespace ?
143	s/Name/Header/ ?
161	`StandardHeaders` are always in verbatim format, but are we sure that holds for the `IncludeHeader`? I suppose that should be the case since we map known path suffixes back to a verbatim umbrella header, I just wonder how exhaustive the list is. (it might be nice to manually check that no implementation detail headers fall through the cracks)
178	seems like debugging artifact, either drop or put in `#ifndef NDEBUG`
205	why log if we think we can't hit this case?
217	why are we doing this exactly? once we override the same file multiple times, I believe the last one takes precedence. it's definitely safer to clear the remapped files here rather than relying on that fact, but I am afraid of losing other mappings, possibly set outside of clangd's knowledge (e.g. through flags like `-remap-file`?)
234	what does `the location` refer to here? I think we should also stress the fact that even when indexing the same file, we have a good chance of seeing different symbols due to PP directives (and different std versions)
238	s/containers/include graph
257	i'd make this part of the next log
300	maybe move this check to the top
314	why do we resolve the symlinks ?
324	any reason for going with `Noncached` version? (clangd doesn't set one up today, but not relying on that would be nice if we don't have a particular concern here)
330	s/bust/must
clang-tools-extra/clangd/index/StdLib.h
67	maybe drop the optional and bail out in indexing when `Paths` are empty ?
69	s/a built/built an/
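
The early-return shape suggested above for StdLib.cpp line 50 ("if(2b) return 2b; if(20) return 20; ... return 98;") might look roughly like this. It is only a sketch: `LangFlags` and `standardVersion` are hypothetical stand-ins rather than clang's real `LangOptions` or the code under review, and the value 23 is used here as an assumed placeholder for "2b".

```cpp
// Hypothetical stand-in for the per-standard language flags.
struct LangFlags {
  bool CPlusPlus11 = false;
  bool CPlusPlus14 = false;
  bool CPlusPlus17 = false;
  bool CPlusPlus20 = false;
  bool CPlusPlus2b = false;
};

// Returns the newest enabled standard. Checks run newest-first and return
// early, so the result is right even if several flags are set at once.
int standardVersion(const LangFlags &LO) {
  if (LO.CPlusPlus2b)
    return 23; // assumed placeholder value for "2b"
  if (LO.CPlusPlus20)
    return 20;
  if (LO.CPlusPlus17)
    return 17;
  if (LO.CPlusPlus14)
    return 14;
  if (LO.CPlusPlus11)
    return 11;
  return 98;
}
```

Compared with a nested conditional expression, each line here can be read on its own and the fallback value sits visibly at the end.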

address review comments

Herald added a project: Restricted Project. Mar 30 2022, 1:57 PM

(sorry about the long delay, would still love to merge this)

clang-tools-extra/clangd/index/StdLib.cpp
93	I think that removes the context of why we're #including them.
161	Right, I looked at these manually and the headers (and symbols) we were dropping seemed reasonable. Note that we're not filtering symbols in this loop, just building a list of blessed files to add to the StdLibLocation. So we only drop symbols that: (a) aren't recognized by the indexer as being in the standard library; (b) aren't in the standard library directory; (c) don't share a file with anything recognized as being in the standard library.
178	This is useful for debugging, and fits well with the other dlog()s in this file, I'd like to check it in. I was going to call you paranoid, being sure this would get compiled out, but indeed not: https://godbolt.org/z/hKWhro3Mv (gcc manages it). Added #ifndef NDEBUG.
205	Because I'm not certain, and would much rather get a user bug report with this log line in it than without! (Assert because I'd most rather find out before release of course)
217	We map dirty buffers ourselves. Conceivably, it may be part of the standard library itself that was remapped to some dirty buffer content! We're reusing the CompilerInvocation from building a preamble, where we remap the main-file buffer to the dirty contents. (We do this in prepareCompilerInstance, but the PPOptions are shared). This isn't part of the "compilerinvocation-as-proxy-for-build-flags" that we're trying to index. I don't think it's a realistic possibility that anyone would rely on `-Xclang -remap-file` to find the standard library (note that it's a cc1 flag, not a public one...).
234	"what does the location refer to here?" It refers to the StdLibLocation Loc, made that explicit. "I think we should also stress the fact that even when indexing the same file, we have a good chance of seeing different symbols due to PP directives (and different std versions)" Different than what? Do you mean 'why might different calls to indexStandardLibrary see different symbols' from the same file?
257	Can you say why? I generally like one thought per line. Scanning vertically through familiar lines, it's easy to miss something unfamiliar tacked onto the end. This message should be rare, and log lines aren't precious. (I reordered them, which seems a bit more natural)
314	Oops, because I read the documentation of getCanonicalPath() instead of the implementation, and they diverged in https://github.com/llvm/llvm-project/commit/dd67793c0c549638cd93cad1142d324b979ba682 :-D Ultimately we're forming URIs to lexically compare with the ones emitted by the indexer, which uses getCanonicalPath(). Today getCanonicalPath() wants a FileEntry and we don't have one, but I think there's no fundamental reason for that, I can fix it. (I'll do it as a separate patch, for now it's just calling getCanonicalPath with the wrong arguments)
324	Because the cached version is more complicated with no benefits: (a) any entanglement between the IO of the preamble indexing and that of the translation unit that happened to trigger it seems like a complicated idea, that's worth understanding before doing; (b) as you say, we don't actually install a statcache, so there's no concrete benefit; (c) in fact there exists no caching implementation of FileSystemStatCache, so the idea that we might be able to implement that interface and gain benefits is extremely speculative.
clang-tools-extra/clangd/index/StdLib.h
67	Why? This would definitely be using an empty vector as a sentinel value: 2 paths -> index; 1 path -> index; 0 paths -> don't index. And it's not as if "probe for a standard library" is the main point of this function so the interpretation of the return value is obvious - that's only one of three criteria. None seems to be a clearer way to communicate this than {}, and performance doesn't seem to be an issue here.
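
The argument above for keeping the optional return can be illustrated with a small sketch. Everything here is hypothetical: `Location`, `locate`, and the three conditions are invented for illustration and are not the criteria the real function uses. The point is only that "nothing to index" has several independent causes, and an explicit empty optional communicates that more clearly than an empty path list.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result: where a standard library was found.
struct Location {
  std::vector<std::string> Paths;
};

// Illustrative decision function with several unrelated reasons to skip.
// Returning std::nullopt for all of them avoids overloading "empty Paths"
// as a sentinel that callers would have to know about.
std::optional<Location> locate(bool FeatureEnabled, bool AlreadyIndexed,
                               std::vector<std::string> SearchPaths) {
  if (!FeatureEnabled)
    return std::nullopt; // feature is turned off
  if (AlreadyIndexed)
    return std::nullopt; // nothing new to do
  if (SearchPaths.empty())
    return std::nullopt; // no standard library found
  return Location{std::move(SearchPaths)};
}
```

A caller then writes `if (auto Loc = locate(...))` and never needs to inspect `Paths` to decide whether to index.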

Revert StdLibLocation to realpath, document why

revert to previous version of realpath code

clang-tools-extra/clangd/index/StdLib.cpp
314	Actually, never mind, the code is correct and I'd just forgotten why :-) Added a comment to StdLibLocation. getCanonicalPath() does actually resolve symlinks and so on: it asks the FileManager for the directory entry and grabs its "canonical name", which is just FS->getRealPath behind a cache. So the strings are going to match the indexer after all. It'd be possible to modify getCanonicalPath() so we can call it here, but I don't think it helps. Calling it with (path, filemanager) is inconvenient for the (many) existing callsites, so it'd have to be a new overload just for this case. And the FileManager caching we'd gain doesn't matter here. I can still do it if you like, though. (Also, relevant to your interests, realpath is probably taking care of case mismatches too!)

Harbormaster completed remote builds in B157050: Diff 419265. Mar 31 2022, 3:00 AM

This had a "LG" comment above... want to take another pass?
(Not urgent, just checking)

sorry for the long turn around here, LGTM. let's ship it!

clang-tools-extra/clangd/index/StdLib.cpp
234	"Different than what? Do you mean 'why might different calls to indexStandardLibrary see different symbols' from the same file?" yes, i meant compared to previous runs. but i don't think it's as relevant here. i believe i was thinking about caching indexing status across runs and using that cache to implement filefilter, so that we don't index the same file twice (as we normally do in bgindex).
257	i was rather implying to add it as a `(in)complete` field into the current log line you have. usually when clangd is printing lots of logs across threads it might be hard to correlate these. hence having them printed as a single log would help.
314	"So the strings are going to match the indexer after all." thanks, this makes sense. "It'd be possible to modify getCanonicalPath() so we can call it here, but I don't think it helps. Calling it with (path, filemanager) is inconvenient for the (many) existing callsites, so it'd have to be a new overload just for this case. And the FileManager caching we'd gain doesn't matter here. I can still do it if you like, though." No need. We can take a look at that if the logic is likely to change (or get more complicated) in the future.
clang-tools-extra/clangd/index/StdLib.h
67	okay, makes sense.

This revision is now accepted and ready to land. May 16 2022, 8:42 AM

sammccall marked 8 inline comments as done. May 17 2022, 1:11 AM

Address comments
Add end-to-end test
Move ownership of AsyncTaskRunner to allow blockUntilIdle() in test
Fix bugs caught by end-to-end-test

This revision was landed with ongoing or failed builds. May 17 2022, 7:51 AM

Closed by commit rGecaa4d9662c9: [clangd] Indexing of standard library (authored by sammccall).

This revision was automatically updated to reflect the committed changes.

sammccall added a commit: rGecaa4d9662c9: [clangd] Indexing of standard library.

sammccall added a reverting change: rG76ddbb1ca747: Revert "[clangd] Indexing of standard library". May 17 2022, 8:17 AM

Harbormaster completed remote builds in B164897: Diff 430062. May 17 2022, 8:39 AM

sammccall reopened this revision. May 17 2022, 11:11 AM

This revision is now accepted and ready to land. May 17 2022, 11:11 AM

fix HasSubstr matcher type issue

Harbormaster completed remote builds in B164943: Diff 430130. May 17 2022, 11:57 AM

Hmm, the test keeps crashing on the GN bot: http://45.33.8.238/win/58316/step_9.txt
Unfortunately the stacktrace is not symbolized, and I'm not seeing this elsewhere (e.g. premerge bot).

@thakis, any idea why unittests no longer manage to symbolize stack traces on crash on the windows bot? I believe this used to work...

In D115232#3520461, @sammccall wrote:

> Hmm, the test keeps crashing on the GN bot: http://45.33.8.238/win/58316/step_9.txt
> Unfortunately the stacktrace is not symbolized, and I'm not seeing this elsewhere (e.g. premerge bot).
>
> @thakis, any idea why unittests no longer manage to symbolize stack traces on crash on the windows bot? I believe this used to work...

I do not know. Maybe related to the "run many unit tests in a single process" lit change from a month ago?

Anyways, looks like this relanded and broke tests yet again. Maybe find a win box before relanding the next time?

In D115232#3522514, @thakis wrote:

> I do not know. Maybe related to the "run many unit tests in a single process" lit change from a month ago?

I suspected that, and verified locally that:

llvm-symbolizer on PATH still works
LLVM_SYMBOLIZER_PATH env variable didn't work, but I fixed it in 1236b66a98197109ed40141329d6056dfbe25967 along with this reland, still no dice.

> Anyways, looks like this relanded and broke tests yet again. Maybe find a win box before relanding the next time?

I found one, but the test doesn't crash (nor on the premerge bots).

I'll revert again, but I have no idea how to proceed. Only this bot and llvm-avr-linux show the failure, and neither of them have a working symbolizer.

In D115232#3522571, @sammccall wrote:

> I suspected that, and verified locally that:
>
> llvm-symbolizer on PATH still works
> LLVM_SYMBOLIZER_PATH env variable didn't work, but I fixed it in 1236b66a98197109ed40141329d6056dfbe25967 along with this reland, still no dice.

(My bot neither has llvm-symbolizer on path, nor sets LLVM_SYMBOLIZER_PATH fwiw.)

> I found one, but the test doesn't crash (nor on the premerge bots).
>
> I'll revert again, but I have no idea how to proceed. Only this bot and llvm-avr-linux show the failure, and neither of them have a working symbolizer.

I wouldn't be super surprised if this is related to windows and delayed template parsing. I haven't found any bots on llvm's official waterfall that run ClangdTests.exe – maybe that's why there aren't more bots finding this?

I'd recommend building and running the test on a win box and see if it repros locally.

I managed to get a stack trace from a bot (by leaving the broken commit up for longer this time).

In D115232#3522598, @thakis wrote:

> (My bot neither has llvm-symbolizer on path, nor sets LLVM_SYMBOLIZER_PATH fwiw.)

Is this something you could add? I'd much rather revert quickly after seeing a problem than wait for the slower official bots to catch it, but the stacktrace is pretty key.

> I wouldn't be super surprised if this is related to windows and delayed template parsing.

It seems possible. The other failure I'd seen was the avr-linux bot (not windows).
However it definitely passes in some configurations: both the premerge tests and my local build.

> I haven't found any bots on llvm's official waterfall that run ClangdTests.exe – maybe that's why there aren't more bots finding this?

clang-x64-windows-msvc does, but it's slow.

> I'd recommend building and running the test on a win box and see if it repros locally.

Again, I have done this, and cannot reproduce (on 19.29.30038.1).

Now that I have some idea where the crash is, I can try some blind fixes, though.

Landed finally as 03ea140b3a285c9a4400ee007b1790b110cbf984

bnbarham added a subscriber: bnbarham. Dec 19 2022, 4:23 PM

bnbarham added inline comments.

clang-tools-extra/clangd/ClangdServer.cpp
1008–1010	@sammccall shouldn't we also be waiting for this to finish when `ClangdServer` is destroyed? IIUC right now both the `FileIndex` itself (stored in `ClangdServer`) and the actual `UpdateIndexCallbacks` (stored in `TUScheduler`) can be freed while `indexStdlib` is running asynchronously, resulting in a use-after-free on e.g. `FIndex->updatePreamble(std::move(IF))`. I was confused as to why this wasn't happening in the tests, but these lines would explain it 😅 Adding an `IndexTasks->wait()` to `~ClangdServer` fixes the crash I'm seeing in the sourcekit-lsp tests (see https://github.com/apple/llvm-project/pull/5837), though ideally we (sourcekit-lsp) wouldn't be running any indexing at all. As far as I can tell there's no way to turn off dynamic indexing now though, except for `StandardLibrary` indexing through the config file (but not from clangd args)?

sammccall added inline comments. Dec 21 2022, 10:09 AM

clang-tools-extra/clangd/ClangdServer.cpp
1008–1010	Thanks for flagging this! We almost have the sequencing we need in ~ClangdServer: (a) when we fall off the end of ~ClangdServer it destroys all its members; (b) `ClangdServer::IndexTasks` is declared after `FIndex`, so is destroyed first; (c) ~AsyncTaskRunner calls `wait()`. But the task we schedule on `IndexTasks` captures a ref to `UpdateIndexCallbacks`, which is owned by the `TUScheduler`, which we explicitly destroy at the beginning of `~ClangdServer`. However I think your patch is also not quite correct: we can wait for the tasks to be empty, but then the TUScheduler could fill it up again before we destroy TUScheduler. Options include adding an explicit stop() to TUScheduler, changing TUScheduler to not (exclusively) own UpdateIndexCallbacks, or have the task not capture the callbacks by reference. I'll try the latter first, which seems least invasive.
"ideally we (sourcekit-lsp) wouldn't be running any indexing at all. As far as I can tell there's no way to turn off dynamic indexing now though, except for StandardLibrary indexing through the config file (but not from clangd args)?" Clangd won't provide any top-level/namespace-level completions at all without dynamic index (index of preambles), and various other features won't work (docs on hover, include-fixer, type/call-hierarchy). We dropped support for disabling this at some point, as it didn't really seem usable and made features more complex if we tried to accommodate it. At a technical level it would be possible to disable I think, but I'd be really surprised if completion worked well, or if a language server without completion was useful.
"`StandardLibrary` indexing through the config file (but not from clangd args)" We've tried to move away from flags for options that are interesting to users, as config files are more flexible, more forgiving on errors, and allow different settings per-project in a consistent way. (We don't own the editors, so cross-editor consistency is important to being able to support users at all...) I can see how requiring config to be materialized on disk could be inconvenient for IDEs though. I think we could add a general-purpose `--config-inline=<YAML/JSON goes here>` flag, and/or the ability to set config over LSP (this can be dynamic, accordingly bigger design space that might be hard to get right).
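
The shutdown sequencing discussed above, and the option of not capturing the callbacks object by reference, can be modelled with a small sketch. This is not the actual fix (that went into D140486) and none of these types are clangd's: `TaskRunner`, `Index`, `Callbacks`, and `Server` are hypothetical stand-ins, and the runner executes queued tasks inside `wait()` rather than on threads, purely to keep the example deterministic.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a task runner whose destructor drains pending tasks.
// Assumption: tasks run inside wait() instead of on background threads.
class TaskRunner {
public:
  ~TaskRunner() { wait(); }
  void runAsync(std::function<void()> Task) {
    Pending.push_back(std::move(Task));
  }
  void wait() {
    std::vector<std::function<void()>> ToRun;
    ToRun.swap(Pending);
    for (auto &Task : ToRun)
      Task();
  }

private:
  std::vector<std::function<void()>> Pending;
};

// Stand-in for the index that tasks write into.
struct Index {
  std::vector<std::string> Entries;
};

// Stand-in for the callbacks object that is destroyed early on shutdown.
struct Callbacks {
  Index *Idx;
  TaskRunner *Tasks;
  void schedule(std::string Name) {
    // Copy the pointer the task needs instead of capturing `this`, so the
    // task stays valid even if this Callbacks object is destroyed first.
    Index *Target = Idx;
    Tasks->runAsync([Target, Name = std::move(Name)] {
      Target->Entries.push_back(Name);
    });
  }
};

struct Server {
  Index Idx;                      // declared first, so destroyed last
  TaskRunner Tasks;               // declared after Idx: drained before Idx dies
  std::unique_ptr<Callbacks> CBs; // destroyed explicitly, before the tasks run

  Server() : CBs(std::make_unique<Callbacks>(Callbacks{&Idx, &Tasks})) {}
  ~Server() {
    CBs.reset(); // pending tasks no longer depend on *CBs
  }              // members are then destroyed in reverse declaration order
};
```

Members are destroyed in reverse declaration order, so `Tasks` is drained while `Idx` is still alive. The only object that goes away early is `Callbacks`, which is why the task copies what it needs rather than holding a reference to it.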

sammccall mentioned this in D140486: [clangd] Fix crashing race in ClangdServer shutdown with stdlib indexing. Dec 21 2022, 10:16 AM

bnbarham added inline comments. Dec 21 2022, 10:36 AM

clang-tools-extra/clangd/ClangdServer.cpp
1008–1010	Ah, I didn't actually check `AsyncTaskRunner`. Makes sense it would wait though :). Thanks for looking into this in detail! "or have the task not capture the callbacks by reference. I'll try the latter first, which seems least invasive." This + moving `FIndex` after `IndexTasks` seems reasonable to me. "Clangd won't provide any top-level/namespace-level completions at all without dynamic index (index of preambles), and various other features won't work (docs on hover, include-fixer, type/call-hierarchy)." That's good to know - I assume this extends to indexing the stdlib as well, i.e. the stdlib would be missing from top/namespace level completion if not indexed? Does the dynamic index grow with every opened file, or is it just the currently opened file? If disabling breaks everything it's not something we'd want to do, we just don't need it for find refs/etc.

sammccall added inline comments. Dec 21 2022, 11:02 AM

clang-tools-extra/clangd/ClangdServer.cpp
1008–1010	"This + moving FIndex after IndexTasks seems reasonable to me." I sent D140486. FIndex should be before IndexTasks in order to outlive it, unless I'm missing something. "I assume this extends to indexing the stdlib as well, ie. the stdlib would be missing from top/namespace level completion if not indexed?" Any parts of the standard library that you haven't (transitively) included from an open file would be missing. In particular, if you start up clangd, open a blank file, and type `std::` you'll get nothing. "Does the dynamic index grow with every opened file, or is it just the currently opened file?" It contains symbols for all the transitively included headers of every file you've had open. So it does grow for each opened file, but in practice usually only by a little bit: if you've opened three files usually >90% of the headers visible from the fourth file are already in the index.

bnbarham added inline comments. Dec 21 2022, 11:09 AM

clang-tools-extra/clangd/ClangdServer.cpp
1008–1010	"I sent D140486. FIndex should be before IndexTasks in order to outlive it, unless I'm missing something." Thanks! No, that's right. And it is already so 👍. And thanks for the extra information too.

sammccall mentioned this in rGb494f67f6796: [clangd] Fix crashing race in ClangdServer shutdown with stdlib indexing. Dec 21 2022, 2:36 PM

Revision Contents

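Path                              Size

clang-tools-extra/clangd/
  CMakeLists.txt                  1 line
  ClangdServer.h                  1 line
  ClangdServer.cpp                45 lines
  Config.h                        5 lines
  ConfigCompile.cpp               5 lines
  ConfigFragment.h                3 lines
  ConfigYAML.cpp                  4 lines
  TUScheduler.h                   4 lines
  TUScheduler.cpp                 7 lines
  index/
    FileIndex.h                   1 line
    FileIndex.cpp                 16 lines
    StdLib.h                      110 lines
    StdLib.cpp                    363 lines
    SymbolOrigin.h                1 line
    SymbolOrigin.cpp              2 lines
  unittests/
    CMakeLists.txt                1 line
    StdLibTests.cpp               162 lines
    TUSchedulerTests.cpp          3 lines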

Diff 430130

clang-tools-extra/clangd/CMakeLists.txt

@@ 113 lines hidden @@ add_clang_library(clangDaemon
   index/Index.cpp
   index/IndexAction.cpp
   index/MemIndex.cpp
   index/Merge.cpp
   index/ProjectAware.cpp
   index/Ref.cpp
   index/Relation.cpp
   index/Serialization.cpp
+  index/StdLib.cpp
   index/Symbol.cpp
   index/SymbolCollector.cpp
   index/SymbolID.cpp
   index/SymbolLocation.cpp
   index/SymbolOrigin.cpp
   index/YAMLSerialization.cpp

   index/dex/Dex.cpp
@@ 77 lines hidden @@

clang-tools-extra/clangd/ClangdServer.h

@@ 421 lines hidden @@ private:
   bool PreambleParseForwardingFunctions = false;

   // GUARDED_BY(CachedCompletionFuzzyFindRequestMutex)
   llvm::StringMap<llvm::Optional<FuzzyFindRequest>>
       CachedCompletionFuzzyFindRequestByFile;
   mutable std::mutex CachedCompletionFuzzyFindRequestMutex;

   llvm::Optional<std::string> WorkspaceRoot;
+  llvm::Optional<AsyncTaskRunner> IndexTasks; // for stdlib indexing.
   llvm::Optional<TUScheduler> WorkScheduler;
   // Invalidation policy used for actions that we assume are "transient".
   TUScheduler::ASTActionInvalidation Transient;

   // Store of the current versions of the open documents.
   // Only written from the main thread (despite being threadsafe).
   DraftStore DraftMgr;

   std::unique_ptr<ThreadsafeFS> DirtyFS;
 };

 } // namespace clangd
 } // namespace clang

 #endif

clang-tools-extra/clangd/ClangdServer.cpp

@@ 20 lines hidden @@
 #include "SemanticHighlighting.h"
 #include "SemanticSelection.h"
 #include "SourceCode.h"
 #include "TUScheduler.h"
 #include "XRefs.h"
 #include "index/CanonicalIncludes.h"
 #include "index/FileIndex.h"
 #include "index/Merge.h"
+#include "index/StdLib.h"
 #include "refactor/Rename.h"
 #include "refactor/Tweak.h"
 #include "support/Cancellation.h"
 #include "support/Logger.h"
 #include "support/MemoryTree.h"
 #include "support/ThreadsafeFS.h"
 #include "support/Trace.h"
 #include "clang/Format/Format.h"
@@ 17 lines hidden @@

 namespace clang {
 namespace clangd {
 namespace {

 // Update the FileIndex with new ASTs and plumb the diagnostics responses.
 struct UpdateIndexCallbacks : public ParsingCallbacks {
   UpdateIndexCallbacks(FileIndex *FIndex,
-                       ClangdServer::Callbacks *ServerCallbacks)
-      : FIndex(FIndex), ServerCallbacks(ServerCallbacks) {}
+                       ClangdServer::Callbacks *ServerCallbacks,
+                       const ThreadsafeFS &TFS, AsyncTaskRunner *Tasks)
+      : FIndex(FIndex), ServerCallbacks(ServerCallbacks), TFS(TFS),
+        Tasks(Tasks) {}

-  void onPreambleAST(PathRef Path, llvm::StringRef Version, ASTContext &Ctx,
+  void onPreambleAST(PathRef Path, llvm::StringRef Version,
+                     const CompilerInvocation &CI, ASTContext &Ctx,
                      Preprocessor &PP,
                      const CanonicalIncludes &CanonIncludes) override {
+    // If this preamble uses a standard library we haven't seen yet, index it.
+    if (FIndex)
+      if (auto Loc = Stdlib.add(*CI.getLangOpts(), PP.getHeaderSearchInfo()))
+        indexStdlib(CI, std::move(*Loc));
+
     if (FIndex)
       FIndex->updatePreamble(Path, Version, Ctx, PP, CanonIncludes);
   }

+  void indexStdlib(const CompilerInvocation &CI, StdLibLocation Loc) {
+    auto Task = [this, LO(*CI.getLangOpts()), Loc(std::move(Loc)),
+                 CI(std::make_unique<CompilerInvocation>(CI))]() mutable {
+      IndexFileIn IF;
+      IF.Symbols = indexStandardLibrary(std::move(CI), Loc, TFS);
+      if (Stdlib.isBest(LO))
+        FIndex->updatePreamble(std::move(IF));
+    };
+    if (Tasks)
+      // This doesn't have a semaphore to enforce -j, but it's rare.
+      Tasks->runAsync("IndexStdlib", std::move(Task));
+    else
+      Task();
+  }
+
   void onMainAST(PathRef Path, ParsedAST &AST, PublishFn Publish) override {
     if (FIndex)
       FIndex->updateMain(Path, AST);

     assert(AST.getDiagnostics().hasValue() &&
            "We issue callback only with fresh preambles");
     std::vector<Diag> Diagnostics = *AST.getDiagnostics();
     if (ServerCallbacks)
@@ 18 lines hidden @@ struct UpdateIndexCallbacks : public ParsingCallbacks {
   void onPreamblePublished(PathRef File) override {
     if (ServerCallbacks)
       ServerCallbacks->onSemanticsMaybeChanged(File);
   }

 private:
   FileIndex *FIndex;
   ClangdServer::Callbacks *ServerCallbacks;
+  const ThreadsafeFS &TFS;
+  StdLibSet Stdlib;
+  AsyncTaskRunner *Tasks;
 };

 class DraftStoreFS : public ThreadsafeFS {
 public:
   DraftStoreFS(const ThreadsafeFS &Base, const DraftStore &Drafts)
       : Base(Base), DirtyFiles(Drafts) {}

 private:
@@ 35 lines hidden @@ : FeatureModules(Opts.FeatureModules), CDB(CDB), TFS(TFS),
       DynamicIdx(Opts.BuildDynamicSymbolIndex ? new FileIndex() : nullptr),
       ClangTidyProvider(Opts.ClangTidyProvider),
       UseDirtyHeaders(Opts.UseDirtyHeaders),
       PreambleParseForwardingFunctions(Opts.PreambleParseForwardingFunctions),
       WorkspaceRoot(Opts.WorkspaceRoot),
       Transient(Opts.ImplicitCancellation ? TUScheduler::InvalidateOnUpdate
                                           : TUScheduler::NoInvalidation),
       DirtyFS(std::make_unique<DraftStoreFS>(TFS, DraftMgr)) {
+  if (Opts.AsyncThreadsCount != 0)
+    IndexTasks.emplace();
   // Pass a callback into `WorkScheduler` to extract symbols from a newly
   // parsed file and rebuild the file index synchronously each time an AST
   // is parsed.
-  WorkScheduler.emplace(
-      CDB, TUScheduler::Options(Opts),
-      std::make_unique<UpdateIndexCallbacks>(DynamicIdx.get(), Callbacks));
+  WorkScheduler.emplace(CDB, TUScheduler::Options(Opts),
+                        std::make_unique<UpdateIndexCallbacks>(
+                            DynamicIdx.get(), Callbacks, TFS,
+                            IndexTasks ? IndexTasks.getPointer() : nullptr));
   // Adds an index to the stack, at higher priority than existing indexes.
   auto AddIndex = [&](SymbolIndex *Idx) {
     if (this->Index != nullptr) {
       MergedIdx.push_back(std::make_unique<MergedIndex>(Idx, this->Index));
       this->Index = MergedIdx.back().get();
     } else {
       this->Index = Idx;
     }
@@ 799 lines hidden @@
 ClangdServer::blockUntilIdleForTest(llvm::Optional<double> TimeoutSeconds) {
   // Order is important here: we don't want to block on A and then B,
   // if B might schedule work on A.

   // Nothing else can schedule work on TUScheduler, because it's not threadsafe
   // and we're blocking the main thread.
   if (!WorkScheduler->blockUntilIdle(timeoutSeconds(TimeoutSeconds)))
return false;		return false;
		// TUScheduler is the only thing that starts background indexing work.
		if (IndexTasks && !IndexTasks->wait(timeoutSeconds(TimeoutSeconds)))
		return false;
bnbarham: @sammccall shouldn't we also be waiting for this to finish when `ClangdServer` is destroyed? IIUC right now both the `FileIndex` itself (stored in `ClangdServer`) and the actual `UpdateIndexCallbacks` (stored in `TUScheduler`) can be freed while `indexStdlib` is running asynchronously, resulting in a use-after-free on e.g. `FIndex->updatePreamble(std::move(IF))`. I was confused as to why this wasn't happening in the tests, but these lines would explain it 😅

Adding an `IndexTasks->wait()` to `~ClangdServer` fixes the crash I'm seeing in the sourcekit-lsp tests (see https://github.com/apple/llvm-project/pull/5837), though ideally we (sourcekit-lsp) wouldn't be running any indexing at all. As far as I can tell there's no way to turn off dynamic indexing now though, except for `StandardLibrary` indexing through the config file (but not from clangd args)?

sammccall (Author): Thanks for flagging this!

We almost have the sequencing we need in ~ClangdServer:
- when we fall off the end of ~ClangdServer it destroys all its members
- `ClangdServer::IndexTasks` is declared after `FIndex`, so is destroyed first
- ~AsyncTaskRunner calls `wait()`

But the task we schedule on `IndexTasks` captures a ref to `UpdateIndexCallbacks`, which is owned by the `TUScheduler`, which we explicitly destroy at the beginning of `~ClangdServer`.

However I think your patch is also not quite correct: we can wait for the tasks to be empty, but then the TUScheduler could fill it up again before we destroy TUScheduler.

Options include adding an explicit stop() to TUScheduler, changing TUScheduler to not (exclusively) own UpdateIndexCallbacks, or have the task not capture the callbacks by reference. I'll try the latter first, which seems least invasive.

> ideally we (sourcekit-lsp) wouldn't be running any indexing at all. As far as I can tell there's no way to turn off dynamic indexing now though, except for StandardLibrary indexing through the config file (but not from clangd args)?

Clangd won't provide any top-level/namespace-level completions at all without dynamic index (index of preambles), and various other features won't work (docs on hover, include-fixer, type/call-hierarchy). We dropped support for disabling this at some point, as it didn't really seem usable and made features more complex if we tried to accommodate it. At a technical level it would be possible to disable I think, but I'd be really surprised if completion worked well, or if a language server without completion was useful.

> `StandardLibrary` indexing through the config file (but not from clangd args)

We've tried to move away from flags for options that are interesting to users, as config files are more flexible, more forgiving on errors, and allow different settings per-project in a consistent way. (We don't own the editors, so cross-editor consistency is important to being able to support users at all...)

I can see how requiring config to be materialized on disk could be inconvenient for IDEs though. I think we could add a general-purpose `--config-inline=<YAML/JSON goes here>` flag, and/or the ability to set config over LSP (this can be dynamic, accordingly bigger design space that might be hard to get right).

bnbarham: Ah, I didn't actually check `AsyncTaskRunner`. Makes sense it would wait though :). Thanks for looking into this in detail!

> or have the task not capture the callbacks by reference. I'll try the latter first, which seems least invasive.

This + moving `FIndex` after `IndexTasks` seems reasonable to me.

> Clangd won't provide any top-level/namespace-level completions at all without dynamic index (index of preambles), and various other features won't work (docs on hover, include-fixer, type/call-hierarchy).

That's good to know - I assume this extends to indexing the stdlib as well, i.e. the stdlib would be missing from top/namespace level completion if not indexed?

Does the dynamic index grow with every opened file, or is it just the currently opened file? If disabling breaks everything it's not something we'd want to do, we just don't need it for find refs/etc.

sammccall (Author):
> This + moving FIndex after IndexTasks seems reasonable to me.

I sent D140486. FIndex should be before IndexTasks in order to outlive it, unless I'm missing something.

> I assume this extends to indexing the stdlib as well, ie. the stdlib would be missing from top/namespace level completion if not indexed?

Any parts of the standard library that you haven't (transitively) included from an open file would be missing. In particular, if you start up clangd, open a blank file, and type `std::` you'll get nothing.

> Does the dynamic index grow with every opened file, or is it just the currently opened file?

It contains symbols for all the transitively included headers of every file you've had open. So it does grow for each opened file, but in practice usually only by a little bit: if you've opened three files usually >90% of the headers visible from the fourth file are already in the index.

bnbarham:
> I sent D140486. FIndex should be before IndexTasks in order to outlive it, unless I'm missing something.

Thanks! No, that's right. And it is already so 👍. And thanks for the extra information too.
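The lifetime argument in this thread rests on C++ destroying non-static data members in reverse declaration order. A minimal standalone sketch of that rule follows. The names `Index`, `TaskRunner`, and `Server` are hypothetical stand-ins rather than the clangd classes, and the runner simply drains its queue in the destructor instead of joining real threads:

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for FileIndex: records events in a shared log.
struct Index {
  std::vector<std::string> &Log;
  explicit Index(std::vector<std::string> &Log) : Log(Log) {}
  ~Index() { Log.push_back("index destroyed"); }
  void update() { Log.push_back("index updated"); }
};

// Hypothetical stand-in for AsyncTaskRunner. Assumption: tasks are queued and
// run in wait(), which the destructor calls, instead of running on threads.
class TaskRunner {
  std::vector<std::function<void()>> Pending;

public:
  void runAsync(std::function<void()> Task) {
    Pending.push_back(std::move(Task));
  }
  void wait() {
    for (auto &Task : Pending)
      Task();
    Pending.clear();
  }
  ~TaskRunner() { wait(); }
};

struct Server {
  Index Idx;        // declared first, so destroyed last
  TaskRunner Tasks; // declared last, so destroyed first (drains its tasks)
  explicit Server(std::vector<std::string> &Log) : Idx(Log) {}
};

std::vector<std::string> runDemo() {
  std::vector<std::string> Log;
  {
    Server S(Log);
    // The task captures a pointer to the index only, not to any object
    // that might be destroyed before the runner is drained.
    Index *I = &S.Idx;
    S.Tasks.runAsync([I] { I->update(); });
  } // ~Server: Tasks finishes its work before Idx is destroyed.
  return Log;
}
```

Swapping the two member declarations in `Server` would destroy the index before the pending task runs, which is the use-after-free shape discussed above.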

// Unfortunately we don't have strict topological order between the rest of		// Unfortunately we don't have strict topological order between the rest of
// the components. E.g. CDB broadcast triggers background indexing.		// the components. E.g. CDB broadcast triggers background indexing.
// This queries the CDB which may discover new work if disk has changed.		// This queries the CDB which may discover new work if disk has changed.
//		//
// So try each one a few times in a loop.		// So try each one a few times in a loop.
// If there are no tricky interactions then all after the first are no-ops.		// If there are no tricky interactions then all after the first are no-ops.
// Then on the last iteration, verify they're idle without waiting.		// Then on the last iteration, verify they're idle without waiting.
Show All 28 Lines

clang-tools-extra/clangd/Config.h

Show First 20 Lines • Show All 75 Lines • ▼ Show 20 Lines	struct ExternalIndexSpec {
/// This is one of:		/// This is one of:
/// - Address of a clangd-index-server, in the form of "ip:port".		/// - Address of a clangd-index-server, in the form of "ip:port".
/// - Absolute path to an index produced by clangd-indexer.		/// - Absolute path to an index produced by clangd-indexer.
std::string Location;		std::string Location;
/// Absolute path to source root this index is associated with, uses		/// Absolute path to source root this index is associated with, uses
/// forward-slashes.		/// forward-slashes.
std::string MountPoint;		std::string MountPoint;
};		};
/// Controls background-index behavior.		/// Controls index behavior.
struct {		struct {
/// Whether this TU should be indexed.		/// Whether this TU should be background-indexed.
BackgroundPolicy Background = BackgroundPolicy::Build;		BackgroundPolicy Background = BackgroundPolicy::Build;
ExternalIndexSpec External;		ExternalIndexSpec External;
		bool StandardLibrary = false;
} Index;		} Index;

enum UnusedIncludesPolicy { Strict, None };		enum UnusedIncludesPolicy { Strict, None };
/// Controls warnings and errors when parsing code.		/// Controls warnings and errors when parsing code.
struct {		struct {
bool SuppressAll = false;		bool SuppressAll = false;
llvm::StringSet<> Suppress;		llvm::StringSet<> Suppress;

▲ Show 20 Lines • Show All 72 Lines • Show Last 20 Lines

clang-tools-extra/clangd/ConfigCompile.cpp

Show First 20 Lines • Show All 326 Lines • ▼ Show 20 Lines	if (F.Background) {
.map("Build", Config::BackgroundPolicy::Build)		.map("Build", Config::BackgroundPolicy::Build)
.map("Skip", Config::BackgroundPolicy::Skip)		.map("Skip", Config::BackgroundPolicy::Skip)
.value())		.value())
Out.Apply.push_back(		Out.Apply.push_back(
[Val](const Params &, Config &C) { C.Index.Background = *Val; });		[Val](const Params &, Config &C) { C.Index.Background = *Val; });
}		}
if (F.External)		if (F.External)
compile(std::move(**F.External), F.External->Range);		compile(std::move(**F.External), F.External->Range);
		if (F.StandardLibrary)
		Out.Apply.push_back(
		[Val(**F.StandardLibrary)](const Params &, Config &C) {
		C.Index.StandardLibrary = Val;
		});
}		}

void compile(Fragment::IndexBlock::ExternalBlock &&External,		void compile(Fragment::IndexBlock::ExternalBlock &&External,
llvm::SMRange BlockRange) {		llvm::SMRange BlockRange) {
if (External.Server && !Trusted) {		if (External.Server && !Trusted) {
diag(Error,		diag(Error,
"Remote index may not be specified by untrusted configuration. "		"Remote index may not be specified by untrusted configuration. "
"Copy this into user config to use it.",		"Copy this into user config to use it.",
▲ Show 20 Lines • Show All 280 Lines • Show Last 20 Lines

clang-tools-extra/clangd/ConfigFragment.h

Show First 20 Lines • Show All 193 Lines • ▼ Show 20 Lines	struct ExternalBlock {
/// `123.1.1.1:13337`.		/// `123.1.1.1:13337`.
llvm::Optional<Located<std::string>> Server;		llvm::Optional<Located<std::string>> Server;
/// Source root governed by this index. Default is the directory		/// Source root governed by this index. Default is the directory
/// associated with the config fragment. Absolute in case of user config		/// associated with the config fragment. Absolute in case of user config
/// and relative otherwise. Should always use forward-slashes.		/// and relative otherwise. Should always use forward-slashes.
llvm::Optional<Located<std::string>> MountPoint;		llvm::Optional<Located<std::string>> MountPoint;
};		};
llvm::Optional<Located<ExternalBlock>> External;		llvm::Optional<Located<ExternalBlock>> External;
		// Whether the standard library visible from this file should be indexed.
		// This makes all standard library symbols available, included or not.
		llvm::Optional<Located<bool>> StandardLibrary;
};		};
IndexBlock Index;		IndexBlock Index;

/// Controls behavior of diagnostics (errors and warnings).		/// Controls behavior of diagnostics (errors and warnings).
struct DiagnosticsBlock {		struct DiagnosticsBlock {
/// Diagnostic codes that should be suppressed.		/// Diagnostic codes that should be suppressed.
///		///
/// Valid values are:		/// Valid values are:
▲ Show 20 Lines • Show All 104 Lines • Show Last 20 Lines

clang-tools-extra/clangd/ConfigYAML.cpp

Show First 20 Lines • Show All 178 Lines • ▼ Show 20 Lines	Dict.handle("External", [&](Node &N) {
parse(External, scalarValue(N, "External").getValue());		parse(External, scalarValue(N, "External").getValue());
} else {		} else {
error("External must be either a scalar or a mapping.", N);		error("External must be either a scalar or a mapping.", N);
return;		return;
}		}
F.External.emplace(std::move(External));		F.External.emplace(std::move(External));
F.External->Range = N.getSourceRange();		F.External->Range = N.getSourceRange();
});		});
		Dict.handle("StandardLibrary", [&](Node &N) {
		if (auto StandardLibrary = boolValue(N, "StandardLibrary"))
		F.StandardLibrary = *StandardLibrary;
		});
Dict.parse(N);		Dict.parse(N);
}		}

void parse(Fragment::IndexBlock::ExternalBlock &F,		void parse(Fragment::IndexBlock::ExternalBlock &F,
Located<std::string> ExternalVal) {		Located<std::string> ExternalVal) {
if (!llvm::StringRef(*ExternalVal).equals_insensitive("none")) {		if (!llvm::StringRef(*ExternalVal).equals_insensitive("none")) {
error("Only scalar value supported for External is 'None'",		error("Only scalar value supported for External is 'None'",
ExternalVal.Range);		ExternalVal.Range);
▲ Show 20 Lines • Show All 253 Lines • Show Last 20 Lines

clang-tools-extra/clangd/TUScheduler.h

	Show First 20 Lines • Show All 127 Lines • ▼ Show 20 Lines
	class ParsingCallbacks {			class ParsingCallbacks {
	public:			public:
	virtual ~ParsingCallbacks() = default;			virtual ~ParsingCallbacks() = default;

	/// Called on the AST that was built for emitting the preamble. The built AST			/// Called on the AST that was built for emitting the preamble. The built AST
	/// contains only AST nodes from the #include directives at the start of the			/// contains only AST nodes from the #include directives at the start of the
	/// file. AST node in the current file should be observed on onMainAST call.			/// file. AST node in the current file should be observed on onMainAST call.
	virtual void onPreambleAST(PathRef Path, llvm::StringRef Version,			virtual void onPreambleAST(PathRef Path, llvm::StringRef Version,
	ASTContext &Ctx, Preprocessor &PP,			const CompilerInvocation &CI, ASTContext &Ctx,
	const CanonicalIncludes &) {}			Preprocessor &PP, const CanonicalIncludes &) {}

	/// The argument function is run under the critical section guarding against			/// The argument function is run under the critical section guarding against
	/// races when closing the files.			/// races when closing the files.
	using PublishFn = llvm::function_ref<void(llvm::function_ref<void()>)>;			using PublishFn = llvm::function_ref<void(llvm::function_ref<void()>)>;
	/// Called on the AST built for the file itself. Note that preamble AST nodes			/// Called on the AST built for the file itself. Note that preamble AST nodes
	/// are not deserialized and should be processed in the onPreambleAST call			/// are not deserialized and should be processed in the onPreambleAST call
	/// instead.			/// instead.
	/// The \p AST always contains all AST nodes for the main file itself, and			/// The \p AST always contains all AST nodes for the main file itself, and
	▲ Show 20 Lines • Show All 206 Lines • Show Last 20 Lines

clang-tools-extra/clangd/TUScheduler.cpp

Show First 20 Lines • Show All 1,007 Lines • ▼ Show 20 Lines	ThreadCrashReporter ScopedReporter([&Inputs]() {
llvm::errs() << "Signalled while building preamble\n";		llvm::errs() << "Signalled while building preamble\n";
crashDumpParseInputs(llvm::errs(), Inputs);		crashDumpParseInputs(llvm::errs(), Inputs);
});		});

PreambleBuildStats Stats;		PreambleBuildStats Stats;
bool IsFirstPreamble = !LatestBuild;		bool IsFirstPreamble = !LatestBuild;
LatestBuild = clang::clangd::buildPreamble(		LatestBuild = clang::clangd::buildPreamble(
FileName, *Req.CI, Inputs, StoreInMemory,		FileName, *Req.CI, Inputs, StoreInMemory,
[this, Version(Inputs.Version)](ASTContext &Ctx, Preprocessor &PP,		[&](ASTContext &Ctx, Preprocessor &PP,
const CanonicalIncludes &CanonIncludes) {		const CanonicalIncludes &CanonIncludes) {
Callbacks.onPreambleAST(FileName, Version, Ctx, PP, CanonIncludes);		Callbacks.onPreambleAST(FileName, Inputs.Version, *Req.CI, Ctx, PP,
		CanonIncludes);
},		},
&Stats);		&Stats);
if (!LatestBuild)		if (!LatestBuild)
return;		return;
reportPreambleBuild(Stats, IsFirstPreamble);		reportPreambleBuild(Stats, IsFirstPreamble);
if (isReliable(LatestBuild->CompileCommand))		if (isReliable(LatestBuild->CompileCommand))
HeaderIncluders.update(FileName, LatestBuild->Includes.allHeaders());		HeaderIncluders.update(FileName, LatestBuild->Includes.allHeaders());
}		}
▲ Show 20 Lines • Show All 761 Lines • Show Last 20 Lines

clang-tools-extra/clangd/index/FileIndex.h

	Show First 20 Lines • Show All 108 Lines • ▼ Show 20 Lines
	class FileIndex : public MergedIndex {			class FileIndex : public MergedIndex {
	public:			public:
	FileIndex();			FileIndex();

	/// Update preamble symbols of file \p Path with all declarations in \p AST			/// Update preamble symbols of file \p Path with all declarations in \p AST
	/// and macros in \p PP.			/// and macros in \p PP.
	void updatePreamble(PathRef Path, llvm::StringRef Version, ASTContext &AST,			void updatePreamble(PathRef Path, llvm::StringRef Version, ASTContext &AST,
	Preprocessor &PP, const CanonicalIncludes &Includes);			Preprocessor &PP, const CanonicalIncludes &Includes);
				void updatePreamble(IndexFileIn);

	/// Update symbols and references from main file \p Path with			/// Update symbols and references from main file \p Path with
	/// `indexMainDecls`.			/// `indexMainDecls`.
	void updateMain(PathRef Path, ParsedAST &AST);			void updateMain(PathRef Path, ParsedAST &AST);

	void profile(MemoryTree &MT) const;			void profile(MemoryTree &MT) const;

	private:			private:
	▲ Show 20 Lines • Show All 82 Lines • Show Last 20 Lines

clang-tools-extra/clangd/index/FileIndex.cpp

Show First 20 Lines • Show All 419 Lines • ▼ Show 20 Lines

FileIndex::FileIndex()		FileIndex::FileIndex()
: MergedIndex(&MainFileIndex, &PreambleIndex),		: MergedIndex(&MainFileIndex, &PreambleIndex),
PreambleSymbols(IndexContents::Symbols | IndexContents::Relations),		PreambleSymbols(IndexContents::Symbols | IndexContents::Relations),
PreambleIndex(std::make_unique<MemIndex>()),		PreambleIndex(std::make_unique<MemIndex>()),
MainFileSymbols(IndexContents::All),		MainFileSymbols(IndexContents::All),
MainFileIndex(std::make_unique<MemIndex>()) {}		MainFileIndex(std::make_unique<MemIndex>()) {}

void FileIndex::updatePreamble(PathRef Path, llvm::StringRef Version,		void FileIndex::updatePreamble(IndexFileIn IF) {
ASTContext &AST, Preprocessor &PP,
const CanonicalIncludes &Includes) {
IndexFileIn IF;
std::tie(IF.Symbols, std::ignore, IF.Relations) =
indexHeaderSymbols(Version, AST, PP, Includes);
FileShardedIndex ShardedIndex(std::move(IF));		FileShardedIndex ShardedIndex(std::move(IF));
for (auto Uri : ShardedIndex.getAllSources()) {		for (auto Uri : ShardedIndex.getAllSources()) {
auto IF = ShardedIndex.getShard(Uri);		auto IF = ShardedIndex.getShard(Uri);
// We are using the key received from ShardedIndex, so it should always		// We are using the key received from ShardedIndex, so it should always
// exist.		// exist.
assert(IF);		assert(IF);
PreambleSymbols.update(		PreambleSymbols.update(
Uri, std::make_unique<SymbolSlab>(std::move(*IF->Symbols)),		Uri, std::make_unique<SymbolSlab>(std::move(*IF->Symbols)),
Show All 14 Lines	auto NewIndex = PreambleSymbols.buildIndex(
PreambleIndex.reset(std::move(NewIndex));		PreambleIndex.reset(std::move(NewIndex));
vlog(		vlog(
"Build dynamic index for header symbols with estimated memory usage of "		"Build dynamic index for header symbols with estimated memory usage of "
"{0} bytes",		"{0} bytes",
PreambleIndex.estimateMemoryUsage());		PreambleIndex.estimateMemoryUsage());
}		}
}		}

		void FileIndex::updatePreamble(PathRef Path, llvm::StringRef Version,
		ASTContext &AST, Preprocessor &PP,
		const CanonicalIncludes &Includes) {
		IndexFileIn IF;
		std::tie(IF.Symbols, std::ignore, IF.Relations) =
		indexHeaderSymbols(Version, AST, PP, Includes);
		updatePreamble(std::move(IF));
		}

void FileIndex::updateMain(PathRef Path, ParsedAST &AST) {		void FileIndex::updateMain(PathRef Path, ParsedAST &AST) {
auto Contents = indexMainDecls(AST);		auto Contents = indexMainDecls(AST);
MainFileSymbols.update(		MainFileSymbols.update(
URI::create(Path).toString(),		URI::create(Path).toString(),
std::make_unique<SymbolSlab>(std::move(std::get<0>(Contents))),		std::make_unique<SymbolSlab>(std::move(std::get<0>(Contents))),
std::make_unique<RefSlab>(std::move(std::get<1>(Contents))),		std::make_unique<RefSlab>(std::move(std::get<1>(Contents))),
std::make_unique<RelationSlab>(std::move(std::get<2>(Contents))),		std::make_unique<RelationSlab>(std::move(std::get<2>(Contents))),
/*CountReferences=*/true);		/*CountReferences=*/true);
Show All 30 Lines

clang-tools-extra/clangd/index/StdLib.h

This file was added.

				//===--- StdLib.h - Index the C and C++ standard library ---------*- C++-*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				// Eagerly indexing the standard library gives a much friendlier "warm start"
				// with working code completion in a standalone file or small project.
				//
				// We act as if we saw a file which included the whole standard library:
				// #include <array>
				// #include <bitset>
				// #include <chrono>
				// ...
				// We index this TU and feed the result into the dynamic index.
				//
				// This happens within the context of some particular open file, and we reuse
				// its CompilerInvocation. Matching its include path, LangOpts etc ensures that
				// we see the standard library and configuration that matches the project.
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_STDLIB_H
				#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_STDLIB_H

				#include "index/Symbol.h"
				#include "support/ThreadsafeFS.h"
				#include "llvm/ADT/StringRef.h"
				#include <string>

				namespace clang {
				class CompilerInvocation;
				class LangOptions;
				class HeaderSearch;
				namespace clangd {

				// The filesystem location where a standard library was found.
				//
				// This is the directory containing <vector> or <stdio.h>.
				// It's used to ensure we only index files that are in the standard library.
				//
				// The paths are canonicalized (FS "real path" with symlinks resolved).
				// This allows them to be easily compared against paths the indexer returns.
				struct StdLibLocation {
				llvm::SmallVector<std::string> Paths;
				};

				// Tracks the state of standard library indexing within a particular index.
				//
				// In general, we don't want to index the standard library multiple times.
				// In most cases, this class just acts as a flag to ensure we only do it once.
				//
				// However, if we first open a C++11 file, and then a C++20 file, we do
				// want the index to be upgraded to include the extra symbols.
				// Similarly, the C and C++ standard library can coexist.
				class StdLibSet {
				std::atomic<int> Best[2] = {{-1}, {-1}};

				public:
				// Determines if we should index the standard library in a configuration.
				//
				// This is true if:
				// - standard library indexing is enabled for the file
				// - the language version is higher than any previous add() for the language
				// - the standard library headers exist on the search path
				// Returns the location where the standard library was found.
				//
kadircet: maybe drop the optional and bail out in indexing when `Paths` are empty?

sammccall (Author): Why? This would definitely be using an empty vector as a sentinel value:
- 2 paths -> index
- 1 path -> index
- 0 paths -> don't index

And it's not as if "probe for a standard library" is the main point of this function so the interpretation of the return value is obvious - that's only one of three criteria.

None seems to be a clearer way to communicate this than {}, and performance doesn't seem to be an issue here.

kadircet: okay, makes sense.
				// This function is threadsafe.
				llvm::Optional<StdLibLocation> add(const LangOptions &, const HeaderSearch &);
kadircet: s/a built/built an/

				// Indicates whether a built index should be used.
				// It should not be used if a newer version has subsequently been added.
				//
				// Intended pattern is:
				// if (add()) {
				// symbols = indexStandardLibrary();
				// if (isBest())
				// index.update(symbols);
				// }
				//
				// This is still technically racy: we could return true here, then another
				// thread could add->index->update a better library before we can update.
				// We'd then overwrite it with the older version.
				// However, it's very unlikely: indexing takes a long time.
				bool isBest(const LangOptions &) const;
				};

				// Index a standard library and return the discovered symbols.
				//
				// The compiler invocation should describe the file whose config we're reusing.
				// We overwrite its virtual buffer with a lot of #include statements.
				SymbolSlab indexStandardLibrary(std::unique_ptr<CompilerInvocation> Invocation,
				const StdLibLocation &Loc,
				const ThreadsafeFS &TFS);

				// Variant that allows the umbrella header source to be specified.
				// Exposed for testing.
				SymbolSlab indexStandardLibrary(llvm::StringRef HeaderSources,
				std::unique_ptr<CompilerInvocation> CI,
				const StdLibLocation &Loc,
				const ThreadsafeFS &TFS);

				// Generate header containing #includes for all standard library headers.
				// Exposed for testing.
				llvm::StringRef getStdlibUmbrellaHeader(const LangOptions &);

				} // namespace clangd
				} // namespace clang

				#endif // LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_STDLIB_H
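
The `StdLibSet` comments above describe a flag that only ever moves to a higher language version, checked again before the index is updated. One way such a tracker could be written is a compare-exchange loop over an atomic integer. This is a sketch under assumptions, not the code from this patch: `BestVersion` is a hypothetical name, it handles a single language, and versions are plain integers.

```cpp
#include <atomic>

// Hypothetical simplification of the tracker described in StdLib.h:
// one language only, versions modelled as plain integers.
class BestVersion {
  std::atomic<int> Best{-1};

public:
  // Returns true if Version is higher than anything recorded so far,
  // i.e. the caller should go and index this version.
  bool add(int Version) {
    int Old = Best.load(std::memory_order_acquire);
    while (Old < Version) {
      if (Best.compare_exchange_weak(Old, Version, std::memory_order_acq_rel))
        return true;
      // On failure Old holds the current value; the loop condition re-checks
      // whether another thread already recorded an equal or higher version.
    }
    return false;
  }

  // Returns true if no higher version has been added since.
  bool isBest(int Version) const {
    return Best.load(std::memory_order_acquire) == Version;
  }
};
```

As the header comment notes, a check like `isBest()` followed by an index update is still racy in principle; the sketch only makes the "raise the maximum" step atomic.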

clang-tools-extra/clangd/index/StdLib.cpp

This file was added.

				//===-- StdLib.cpp ---------------------------------------------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				#include "StdLib.h"
				#include <fstream>
				#include <memory>
				#include <string>
				#include <vector>

				#include "Compiler.h"
				#include "Config.h"
				#include "SymbolCollector.h"
				#include "index/IndexAction.h"
				#include "support/Logger.h"
				#include "support/ThreadsafeFS.h"
				#include "support/Trace.h"
				#include "clang/Basic/LangOptions.h"
				#include "clang/Frontend/CompilerInvocation.h"
				#include "clang/Lex/PreprocessorOptions.h"
				#include "llvm/ADT/IntrusiveRefCntPtr.h"
				#include "llvm/ADT/None.h"
				#include "llvm/ADT/StringRef.h"
				#include "llvm/Support/MemoryBuffer.h"
				#include "llvm/Support/Path.h"

				namespace clang {
				namespace clangd {
				namespace {

				enum Lang { C, CXX };

				Lang langFromOpts(const LangOptions &LO) { return LO.CPlusPlus ? CXX : C; }
				llvm::StringLiteral mandatoryHeader(Lang L) {
				switch (L) {
				case C:
				return "stdio.h";
				case CXX:
				return "vector";
				}
				llvm_unreachable("unhandled Lang");
				}
kadircet: `llvm_unreachable` instead?

				LangStandard::Kind standardFromOpts(const LangOptions &LO) {
				if (LO.CPlusPlus) {
				if (LO.CPlusPlus2b)
				return LangStandard::lang_cxx2b;
kadircet: nit: this feels a little bit hard to read, what about:
    if(2b) return 2b;
    if(20) return 20;
    ...
    return 98;
				if (LO.CPlusPlus20)
				return LangStandard::lang_cxx20;
				if (LO.CPlusPlus17)
				return LangStandard::lang_cxx17;
				if (LO.CPlusPlus14)
				return LangStandard::lang_cxx14;
				if (LO.CPlusPlus11)
kadircet: same here
				return LangStandard::lang_cxx11;
				return LangStandard::lang_cxx98;
				}
				if (LO.C2x)
				return LangStandard::lang_c2x;
				// C17 has no new features, so treat {C11,C17} as C17.
				if (LO.C11)
				return LangStandard::lang_c17;
				return LangStandard::lang_c99;
				}

				std::string buildUmbrella(llvm::StringLiteral Mandatory,
				std::vector<llvm::StringLiteral> Headers) {
				std::string Result;
				llvm::raw_string_ostream OS(Result);

				// We __has_include guard all our #includes to avoid errors when using older
				// stdlib version that don't have headers for the newest language standards.
				// But make sure we get some error if things are totally broken.
				OS << llvm::formatv(
				"#if !__has_include(<{0}>)\n"
				"#error Mandatory header <{0}> not found in standard library!\n"
				"#endif\n",
				Mandatory);

				llvm::sort(Headers.begin(), Headers.end());
				auto Last = std::unique(Headers.begin(), Headers.end());
				for (auto Header = Headers.begin(); Header != Last; ++Header) {
				OS << llvm::formatv("#if __has_include({0})\n"
				"#include {0}\n"
				"#endif\n",
				*Header);
				}
				OS.flush();
				return Result;
				}
kadircet: maybe move second half of this comment into `buildUmbrella`?

sammccall (Author): I think that removes the context of why we're #including them.
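
To make the shape of the generated umbrella text concrete, here is a dependency-free sketch of the same idea: a hard error if the mandatory header is missing, then `__has_include`-guarded includes for a sorted, deduplicated header list. `buildUmbrellaSketch` is a hypothetical name, and it assumes headers are passed already bracketed (for example `<vector>`), which is how the stringified `Header` argument of the `SYMBOL` macro would appear.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical standalone variant of the umbrella builder using only the
// standard library. Headers are expected in bracketed form, e.g. "<vector>".
std::string buildUmbrellaSketch(const std::string &Mandatory,
                                std::vector<std::string> Headers) {
  std::string Result;
  // Fail loudly if even the mandatory header cannot be found.
  Result += "#if !__has_include(<" + Mandatory + ">)\n";
  Result += "#error Mandatory header <" + Mandatory +
            "> not found in standard library!\n";
  Result += "#endif\n";

  // Sort and drop duplicates: many symbols map to the same header.
  std::sort(Headers.begin(), Headers.end());
  Headers.erase(std::unique(Headers.begin(), Headers.end()), Headers.end());

  // Guard each include so headers missing from older libraries are skipped.
  for (const std::string &Header : Headers) {
    Result += "#if __has_include(" + Header + ")\n";
    Result += "#include " + Header + "\n";
    Result += "#endif\n";
  }
  return Result;
}
```

Calling `buildUmbrellaSketch("vector", {"<vector>", "<array>", "<vector>"})` yields the mandatory-header check followed by one guarded include each for `<array>` and `<vector>`, in that order.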

				} // namespace

				llvm::StringRef getStdlibUmbrellaHeader(const LangOptions &LO) {
				// The umbrella header is the same for all versions of each language.
				// Headers that are unsupported in old lang versions are usually guarded by
				// #if. Some headers may be not present in old stdlib versions, the umbrella
				// header guards with __has_include for this purpose.
				Lang L = langFromOpts(LO);
				switch (L) {
				case CXX:
				static std::string *UmbrellaCXX =
				new std::string(buildUmbrella(mandatoryHeader(L), {
				#define SYMBOL(Name, NameSpace, Header) #Header,
				#include "clang/Tooling/Inclusions/StdSymbolMap.inc"
				#undef SYMBOL
				}));
				return *UmbrellaCXX;
				case C:
				static std::string *UmbrellaC =
				new std::string(buildUmbrella(mandatoryHeader(L), {
				#define SYMBOL(Name, NameSpace, Header) #Header,
				#include "clang/Tooling/Inclusions/CSymbolMap.inc"
				#undef SYMBOL
				}));
				return *UmbrellaC;
				}
				}

				kadircet: s/our our/to our/
				namespace {

				// Including the standard library leaks unwanted transitively included symbols.
				//
				// We want to drop these, they're a bit tricky to identify:
				// - we don't want to limit to symbols on our list, as our list has only
				// top-level symbols (and there may be legitimate stdlib extensions).
				// - we can't limit to only symbols defined in known stdlib headers, as stdlib
				// internal structure is murky
				// - we can't strictly require symbols to come from a particular path, e.g.
				kadircet: drop static, and move into previous anonymous namespace?
				// libstdc++ is mostly under /usr/include/c++/10/...
				// but std::ctype_base is under /usr/include/<platform>/c++/10/...
				// We require the symbol to come from a header that is either from
				// the standard library path (as identified by the location of <vector>), or
				// another header that defines a symbol from our stdlib list.
				SymbolSlab filter(SymbolSlab Slab, const StdLibLocation &Loc) {
				SymbolSlab::Builder Result;

				static auto &StandardHeaders = *[] {
				auto *Set = new llvm::DenseSet<llvm::StringRef>();
				for (llvm::StringRef Header : {
				kadircet: s/Name/Header/?
				#define SYMBOL(Name, NameSpace, Header) #Header,
				#include "clang/Tooling/Inclusions/CSymbolMap.inc"
				#include "clang/Tooling/Inclusions/StdSymbolMap.inc"
				#undef SYMBOL
				})
				Set->insert(Header);
				return Set;
				}();

				// Form prefixes like file:///usr/include/c++/10/
				// These can be trivially prefix-compared with URIs in the indexed symbols.
				llvm::SmallVector<std::string> StdLibURIPrefixes;
				for (const auto &Path : Loc.Paths) {
				StdLibURIPrefixes.push_back(URI::create(Path).toString());
				if (StdLibURIPrefixes.back().back() != '/')
				StdLibURIPrefixes.back().push_back('/');
				}
				// For each header URI, is it either prefixed by StdLibURIPrefixes or
				kadircet: `StandardHeaders` are always in verbatim format, but are we sure that holds for the `IncludeHeader`? I suppose that should be the case since we map known path suffixes back to a verbatim umbrella header, I just wonder how exhaustive the list is. (It might be nice to manually check that no implementation-detail headers fall through the cracks.)
				sammccall: Right, I looked at these manually and the headers (and symbols) we were dropping seemed reasonable. Note that we're not filtering symbols in this loop, just building a list of blessed files to add to the StdLibLocation. So we only drop symbols that:
				- aren't recognized by the indexer as being in the standard library
				- aren't in the standard library directory
				- don't share a file with anything recognized as being in the standard library
				// owner of a symbol whose insertable header is in StandardHeaders?
				// Pointer key because strings in a SymbolSlab are interned.
				llvm::DenseMap<const char *, bool> GoodHeader;
				for (const Symbol &S : Slab) {
				if (!S.IncludeHeaders.empty() &&
				StandardHeaders.contains(S.IncludeHeaders.front().IncludeHeader)) {
				GoodHeader[S.CanonicalDeclaration.FileURI] = true;
				GoodHeader[S.Definition.FileURI] = true;
				continue;
				}
				for (const char *URI :
				{S.CanonicalDeclaration.FileURI, S.Definition.FileURI}) {
				auto R = GoodHeader.try_emplace(URI, false);
				if (R.second) {
				R.first->second = llvm::any_of(
				StdLibURIPrefixes,
				[&, URIStr(llvm::StringRef(URI))](const std::string &Prefix) {
				kadircet: seems like debugging artifact, either drop or put in `#ifndef NDEBUG`
				sammccall: This is useful for debugging, and fits well with the other dlog()s in this file, I'd like to check it in. I was going to call you paranoid, being sure this would get compiled out, but indeed not: https://godbolt.org/z/hKWhro3Mv (gcc manages it). Added #ifndef NDEBUG.
				return URIStr.startswith(Prefix);
				});
				}
				}
				}
				#ifndef NDEBUG
				for (const auto &Good : GoodHeader)
				if (Good.second && *Good.first)
				dlog("Stdlib header: {0}", Good.first);
				#endif
				// Empty URIs aren't considered good. (Definition can be blank).
				auto IsGoodHeader = [&](const char *C) { return *C && GoodHeader.lookup(C); };

				for (const Symbol &S : Slab) {
				if (!(IsGoodHeader(S.CanonicalDeclaration.FileURI) ||
				IsGoodHeader(S.Definition.FileURI))) {
				dlog("Ignoring wrong-header symbol {0}{1} in {2}", S.Scope, S.Name,
				S.CanonicalDeclaration.FileURI);
				continue;
				}
				Result.insert(S);
				}

				return std::move(Result).build();
				}

				} // namespace
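
The two-pass "blessing" logic of `filter` above can be sketched without clangd's types. This is a simplified illustration under stated assumptions: `SketchSymbol` and `filterSketch` are hypothetical names, each symbol has a single file URI instead of separate declaration and definition locations, and `std::string` keys replace the interned `const char *` keys of the real `SymbolSlab`.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified symbol: one location and one insertable header.
struct SketchSymbol {
  std::string Name;
  std::string FileURI;       // where the symbol is declared
  std::string IncludeHeader; // e.g. "<vector>"; empty if unknown
};

std::vector<SketchSymbol>
filterSketch(const std::vector<SketchSymbol> &Symbols,
             std::vector<std::string> Prefixes,
             const std::set<std::string> &StandardHeaders) {
  // A trailing slash keeps ".../c++/10" from matching ".../c++/100/...".
  for (std::string &P : Prefixes)
    if (P.empty() || P.back() != '/')
      P.push_back('/');

  // Pass 1: decide which files are good. A file is blessed if it holds a
  // symbol with a recognized standard header, or lies under a stdlib prefix.
  std::unordered_map<std::string, bool> GoodFile;
  for (const SketchSymbol &S : Symbols) {
    if (StandardHeaders.count(S.IncludeHeader)) {
      GoodFile[S.FileURI] = true;
      continue;
    }
    auto R = GoodFile.insert({S.FileURI, false});
    if (R.second) // first time we see this file: run the prefix check once
      R.first->second = std::any_of(
          Prefixes.begin(), Prefixes.end(), [&](const std::string &P) {
            return S.FileURI.compare(0, P.size(), P) == 0;
          });
  }

  // Pass 2: keep only symbols that live in a good, non-empty file.
  std::vector<SketchSymbol> Result;
  for (const SketchSymbol &S : Symbols)
    if (!S.FileURI.empty() && GoodFile[S.FileURI])
      Result.push_back(S);
  return Result;
}
```

The two passes matter: a file may only become blessed by a symbol that appears late in the slab, so no symbol can be dropped until every symbol has been inspected.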
				kadircet: why log if we think we can't hit this case?
				sammccall: Because I'm not certain, and would much rather get a user bug report with this log line in it than without! (Assert because I'd most rather find out before release of course)

				SymbolSlab indexStandardLibrary(llvm::StringRef HeaderSources,
				std::unique_ptr<CompilerInvocation> CI,
				const StdLibLocation &Loc,
				const ThreadsafeFS &TFS) {
				if (CI->getFrontendOpts().Inputs.size() != 1 ||
				!CI->getPreprocessorOpts().ImplicitPCHInclude.empty()) {
				elog("Indexing standard library failed: bad CompilerInvocation");
				assert(false && "indexing stdlib with a dubious CompilerInvocation!");
				return SymbolSlab();
				}
				const FrontendInputFile &Input = CI->getFrontendOpts().Inputs.front();
				kadircet: why are we doing this exactly? once we override the same file multiple times, I believe the last one takes precedence. it's definitely safer to clear the remapped files here rather than relying on that fact, but I am afraid of losing other mappings, possibly set outside of clangd's knowledge (e.g. through flags like `-remap-file`?)
				sammccall: We map dirty buffers ourselves. Conceivably, it may be part of the standard library itself that was remapped to some dirty buffer content! We're reusing the CompilerInvocation from building a preamble, where we remap the main-file buffer to the dirty contents. (We do this in prepareCompilerInstance, but the PPOptions are shared). This isn't part of the "compilerinvocation-as-proxy-for-build-flags" that we're trying to index. I don't think it's a realistic possibility that anyone would rely on `-Xclang -remap-file` to find the standard library (note that it's a cc1 flag, not a public one...).
				trace::Span Tracer("StandardLibraryIndex");
				LangStandard::Kind LangStd = standardFromOpts(*CI->getLangOpts());
				log("Indexing {0} standard library in the context of {1}",
				LangStandard::getLangStandardForKind(LangStd).getName(), Input.getFile());

				SymbolSlab Symbols;
				IgnoreDiagnostics IgnoreDiags;
				// CompilerInvocation is taken from elsewhere, and may map a dirty buffer.
				CI->getPreprocessorOpts().clearRemappedFiles();
				auto Clang = prepareCompilerInstance(
				std::move(CI), /*Preamble=*/nullptr,
				llvm::MemoryBuffer::getMemBuffer(HeaderSources, Input.getFile()),
				TFS.view(/*CWD=*/llvm::None), IgnoreDiags);
				if (!Clang) {
				elog("Standard Library Index: Couldn't build compiler instance");
				return Symbols;
				}
				kadircet: what does `the location` refer to here? I think we should also stress the fact that even when indexing the same file, we have a good chance of seeing different symbols due to PP directives (and different std versions)
				sammccall:
				> what does the location refer to here?
				It refers to the StdLibLocation Loc, made that explicit.
				> I think we should also stress the fact that even when indexing the same file, we have a good chance of seeing different symbols due to PP directives (and different std versions)
				Different than what? Do you mean "why might different calls to indexStandardLibrary see different symbols" from the same file?
				kadircet:
				> Different than what? Do you mean "why might different calls to indexStandardLibrary see different symbols" from the same file?
				yes, i meant compared to previous runs. but i don't think it's as relevant here. i believe i was thinking about caching indexing status across runs and using that cache to implement filefilter, so that we don't index the same file twice (as we normally do in bgindex).

				SymbolCollector::Options IndexOpts;
				IndexOpts.Origin = SymbolOrigin::StdLib;
				IndexOpts.CollectMainFileSymbols = false;
				kadircet: s/containers/include graph
				IndexOpts.CollectMainFileRefs = false;
				IndexOpts.CollectMacro = true;
				IndexOpts.StoreAllDocumentation = true;
				// Sadly we can't use IndexOpts.FileFilter to restrict indexing scope.
				// Files from outside the StdLibLocation may define true std symbols anyway.
				// We end up "blessing" such headers, and can only do that by indexing
				// everything first.

				// Refs, relations, include graph in the stdlib mostly aren't useful.
				auto Action = createStaticIndexingAction(
				IndexOpts, [&](SymbolSlab S) { Symbols = std::move(S); }, nullptr,
				nullptr, nullptr);

				if (!Action->BeginSourceFile(*Clang, Input)) {
				elog("Standard Library Index: BeginSourceFile() failed");
				return Symbols;
				}

				if (llvm::Error Err = Action->Execute()) {
				kadircet: i'd make this part of the next log
				sammccall: Can you say why? I generally like one thought per line. Scanning vertically through familiar lines, it's easy to miss something unfamiliar tacked onto the end. This message should be rare, and log lines aren't precious. (I reordered them, which seems a bit more natural)
				kadircet: i was rather implying to add it as a `(in)complete` field into the current log line you have. usually when clangd is printing lots of logs across threads it might be hard to correlate these. hence having them printed as a single log would help.
				elog("Standard Library Index: Execute failed: {0}", std::move(Err));
				return Symbols;
				}

				Action->EndSourceFile();

				unsigned SymbolsBeforeFilter = Symbols.size();
				Symbols = filter(std::move(Symbols), Loc);
				bool Errors = Clang->hasDiagnostics() &&
				Clang->getDiagnostics().hasUncompilableErrorOccurred();
				log("Indexed {0} standard library{3}: {1} symbols, {2} filtered",
				LangStandard::getLangStandardForKind(LangStd).getName(), Symbols.size(),
				SymbolsBeforeFilter - Symbols.size(),
				Errors ? " (incomplete due to errors)" : "");
				SPAN_ATTACH(Tracer, "symbols", int(Symbols.size()));
				return Symbols;
				}

				SymbolSlab indexStandardLibrary(std::unique_ptr<CompilerInvocation> Invocation,
				const StdLibLocation &Loc,
				const ThreadsafeFS &TFS) {
				return indexStandardLibrary(
				getStdlibUmbrellaHeader(*Invocation->getLangOpts()),
				std::move(Invocation), Loc, TFS);
				}

				bool StdLibSet::isBest(const LangOptions &LO) const {
				return standardFromOpts(LO) >=
				Best[langFromOpts(LO)].load(std::memory_order_acquire);
				}

				llvm::Optional<StdLibLocation> StdLibSet::add(const LangOptions &LO,
				const HeaderSearch &HS) {
				Lang L = langFromOpts(LO);
				int OldVersion = Best[L].load(std::memory_order_acquire);
				int NewVersion = standardFromOpts(LO);
				dlog("Index stdlib? {0}",
				LangStandard::getLangStandardForKind(standardFromOpts(LO)).getName());

				if (!Config::current().Index.StandardLibrary) {
				dlog("No: disabled in config");
				return llvm::None;
				}
				kadircet: maybe move this check to the top

				if (NewVersion <= OldVersion) {
				dlog("No: have {0}, {1}>={2}",
				LangStandard::getLangStandardForKind(
				static_cast<LangStandard::Kind>(NewVersion))
				.getName(),
				OldVersion, NewVersion);
				return llvm::None;
				}

				// We'd like to index a standard library here if there is one.
				// Check for the existence of <vector> on the search path.
				// We could cache this, but we only get here repeatedly when there's no
				// stdlib, and even then only once per preamble build.
				kadircet: why do we resolve the symlinks?
				sammccall: Oops, because I read the documentation of getCanonicalPath() instead of the implementation, and they diverged in https://github.com/llvm/llvm-project/commit/dd67793c0c549638cd93cad1142d324b979ba682 :-D Ultimately we're forming URIs to lexically compare with the ones emitted by the indexer, which uses getCanonicalPath(). Today getCanonicalPath() wants a FileEntry and we don't have one, but I think there's no fundamental reason for that, I can fix it. (I'll do it as a separate patch, for now it's just calling getCanonicalPath with the wrong arguments)
				sammccall: Actually, nevermind, the code is correct and I'd just forgotten why :-) Added a comment to StdLibLocation. getCanonicalPath() does actually resolve symlinks and so on: it asks the FileManager for the directory entry and grabs its "canonical name", which is just FS->getRealPath behind a cache. So the strings are going to match the indexer after all. It'd be possible to modify getCanonicalPath() so we can call it here, but I don't think it helps. Calling it with (path, filemanager) is inconvenient for the (many) existing callsites, so it'd have to be a new overload just for this case. And the FileManager caching we'd gain doesn't matter here. I can still do it if you like, though. (Also, relevant to your interests, realpath is probably taking care of case mismatches too!)
				kadircet:
				> So the strings are going to match the indexer after all.
				thanks, this makes sense.
				> It'd be possible to modify getCanonicalPath() so we can call it here, but I don't think it helps. Calling it with (path, filemanager) is inconvenient for the (many) existing callsites, so it'd have to be a new overload just for this case. And the FileManager caching we'd gain doesn't matter here. I can still do it if you like, though.
				No need. We can take a look at that if the logic is likely to change (or get more complicated) in the future.
				llvm::StringLiteral ProbeHeader = mandatoryHeader(L);
				llvm::SmallString<256> Path; // Scratch space.
				llvm::SmallVector<std::string> SearchPaths;
				auto RecordHeaderPath = [&](llvm::StringRef HeaderPath) {
				llvm::StringRef DirPath = llvm::sys::path::parent_path(HeaderPath);
				if (!HS.getFileMgr().getVirtualFileSystem().getRealPath(DirPath, Path))
				SearchPaths.emplace_back(Path);
				};
				for (const auto &DL :
				llvm::make_range(HS.search_dir_begin(), HS.search_dir_end())) {
				kadircet: any reason for going with `Noncached` version? (clangd doesn't set one up today, but not relying on that would be nice if we don't have a particular concern here)
				sammccall: Because the cached version is more complicated with no benefits:
				- any entanglement between the IO of the preamble indexing and that of the translation unit that happened to trigger it seems like a complicated idea, that's worth understanding before doing
				- as you say, we don't actually install a statcache, so there's no concrete benefit
				- in fact there exists no caching implementation of FileSystemStatCache, so the idea that we might be able to implement that interface and gain benefits is extremely speculative
				switch (DL.getLookupType()) {
				case DirectoryLookup::LT_NormalDir: {
				Path = DL.getDir()->getName();
				llvm::sys::path::append(Path, ProbeHeader);
				llvm::vfs::Status Stat;
				if (!HS.getFileMgr().getNoncachedStatValue(Path, Stat) &&
				kadircet: s/bust/must
				Stat.isRegularFile())
				RecordHeaderPath(Path);
				break;
				}
				case DirectoryLookup::LT_Framework:
				// stdlib can't be a framework (framework includes must have a slash)
				continue;
				case DirectoryLookup::LT_HeaderMap:
				llvm::StringRef Target =
				DL.getHeaderMap()->lookupFilename(ProbeHeader, Path);
				if (!Target.empty())
				RecordHeaderPath(Target);
				break;
				}
				}
				if (SearchPaths.empty())
				return llvm::None;

				dlog("Found standard library in {0}", llvm::join(SearchPaths, ", "));

				while (!Best[L].compare_exchange_weak(OldVersion, NewVersion,
				std::memory_order_acq_rel))
				if (OldVersion >= NewVersion) {
				dlog("No: lost the race");
				return llvm::None; // Another thread won the race while we were checking.
				}

				dlog("Yes, index stdlib!");
				return StdLibLocation{std::move(SearchPaths)};
				}

				} // namespace clangd
				} // namespace clang
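
The race handling at the end of `StdLibSet::add` is a "raise to maximum" compare-and-swap loop: many preamble threads may find a standard library at once, but only the one that actually raises the stored version should go on to index. A minimal sketch of that idea follows. `claimUpgrade` is a hypothetical helper, not a function in the patch, and it folds the early `NewVersion <= OldVersion` check into the loop condition.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: try to raise Best to NewVersion.
// Returns true only for the caller that performed the upgrade.
bool claimUpgrade(std::atomic<int> &Best, int NewVersion) {
  int Old = Best.load(std::memory_order_acquire);
  // compare_exchange_weak may fail spuriously, or because another thread
  // stored first. Either way Old is refreshed with the current value, so
  // the loop condition re-checks whether an upgrade is still needed.
  while (Old < NewVersion) {
    if (Best.compare_exchange_weak(Old, NewVersion,
                                   std::memory_order_acq_rel))
      return true;
  }
  return false; // Someone else already recorded an equal or newer version.
}
```

With this shape, two threads racing with the same version produce exactly one `true`, which matches the "lost the race" branch in the code above.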

clang-tools-extra/clangd/index/SymbolOrigin.h

enum class SymbolOrigin : uint16_t {		enum class SymbolOrigin : uint16_t {
Open = 1 << 1, // From the dynamic index of open files.		Open = 1 << 1, // From the dynamic index of open files.
Static = 1 << 2, // From a static, externally-built index.		Static = 1 << 2, // From a static, externally-built index.
Merge = 1 << 3, // A non-trivial index merge was performed.		Merge = 1 << 3, // A non-trivial index merge was performed.
Identifier = 1 << 4, // Raw identifiers in file.		Identifier = 1 << 4, // Raw identifiers in file.
Remote = 1 << 5, // Remote index.		Remote = 1 << 5, // Remote index.
Preamble = 1 << 6, // From the dynamic index of preambles.		Preamble = 1 << 6, // From the dynamic index of preambles.
// 7 reserved		// 7 reserved
Background = 1 << 8, // From the automatic project index.		Background = 1 << 8, // From the automatic project index.
		StdLib = 1 << 9, // Standard library index.
};		};

inline SymbolOrigin operator|(SymbolOrigin A, SymbolOrigin B) {		inline SymbolOrigin operator|(SymbolOrigin A, SymbolOrigin B) {
return static_cast<SymbolOrigin>(static_cast<uint16_t>(A) |		return static_cast<SymbolOrigin>(static_cast<uint16_t>(A) |
static_cast<uint16_t>(B));		static_cast<uint16_t>(B));
}		}
inline SymbolOrigin &operator|=(SymbolOrigin &A, SymbolOrigin B) {		inline SymbolOrigin &operator|=(SymbolOrigin &A, SymbolOrigin B) {
return A = A | B;		return A = A | B;

clang-tools-extra/clangd/index/SymbolOrigin.cpp

	//===--- SymbolOrigin.cpp ----------------------------------------- C++--===//			//===--- SymbolOrigin.cpp ----------------------------------------- C++--===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "SymbolOrigin.h"			#include "SymbolOrigin.h"

	namespace clang {			namespace clang {
	namespace clangd {			namespace clangd {

	llvm::raw_ostream &operator<<(llvm::raw_ostream &OS, SymbolOrigin O) {			llvm::raw_ostream &operator<<(llvm::raw_ostream &OS, SymbolOrigin O) {
	if (O == SymbolOrigin::Unknown)			if (O == SymbolOrigin::Unknown)
	return OS << "unknown";			return OS << "unknown";
	constexpr static char Sigils[] = "AOSMIRP7B9012345";			constexpr static char Sigils[] = "AOSMIRP7BL012345";
	for (unsigned I = 0; I < sizeof(Sigils); ++I)			for (unsigned I = 0; I < sizeof(Sigils); ++I)
	if (static_cast<uint16_t>(O) & 1u << I)			if (static_cast<uint16_t>(O) & 1u << I)
	OS << Sigils[I];			OS << Sigils[I];
	return OS;			return OS;
	}			}

	} // namespace clangd			} // namespace clangd
	} // namespace clang			} // namespace clang
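
The sigil string maps each bit of the origin mask to one character, so the new `StdLib = 1 << 9` bit prints as `L` instead of the placeholder `9`. A standalone sketch of that mapping follows. `originSigils` is a hypothetical helper that takes the raw `uint16_t` mask rather than `SymbolOrigin`, and it assumes `Unknown` is the zero value, which the collapsed part of the header does not show here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper mirroring operator<<: one sigil per set bit.
// Assumes a mask of 0 corresponds to SymbolOrigin::Unknown.
std::string originSigils(std::uint16_t Bits) {
  if (Bits == 0)
    return "unknown";
  static constexpr char Sigils[] = "AOSMIRP7BL012345";
  std::string Out;
  // Visit the 16 sigil characters only, skipping the trailing NUL.
  for (unsigned I = 0; I + 1 < sizeof(Sigils); ++I)
    if (Bits & (1u << I))
      Out.push_back(Sigils[I]);
  return Out;
}
```

For example, a symbol seen in both an open file (bit 1) and the standard library index (bit 9) would print as `OL`.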

clang-tools-extra/clangd/unittests/CMakeLists.txt

add_unittest(ClangdUnitTests ClangdTests		add_unittest(ClangdUnitTests ClangdTests
QualityTests.cpp		QualityTests.cpp
RenameTests.cpp		RenameTests.cpp
RIFFTests.cpp		RIFFTests.cpp
SelectionTests.cpp		SelectionTests.cpp
SemanticHighlightingTests.cpp		SemanticHighlightingTests.cpp
SemanticSelectionTests.cpp		SemanticSelectionTests.cpp
SerializationTests.cpp		SerializationTests.cpp
SourceCodeTests.cpp		SourceCodeTests.cpp
		StdLibTests.cpp
SymbolCollectorTests.cpp		SymbolCollectorTests.cpp
SymbolInfoTests.cpp		SymbolInfoTests.cpp
SyncAPI.cpp		SyncAPI.cpp
TUSchedulerTests.cpp		TUSchedulerTests.cpp
TestFS.cpp		TestFS.cpp
TestIndex.cpp		TestIndex.cpp
TestTU.cpp		TestTU.cpp
TestWorkspace.cpp		TestWorkspace.cpp

clang-tools-extra/clangd/unittests/StdLibTests.cpp

This file was added.

				//===-- StdLibTests.cpp ------------------------------------------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "Annotations.h"
				#include "ClangdServer.h"
				#include "CodeComplete.h"
				#include "Compiler.h"
				#include "Config.h"
				#include "SyncAPI.h"
				#include "TestFS.h"
				#include "index/StdLib.h"
				#include "clang/Basic/LangOptions.h"
				#include "clang/Basic/SourceManager.h"
				#include "gmock/gmock.h"
				#include "gtest/gtest.h"
				#include <memory>

				using namespace testing;

				namespace clang {
				namespace clangd {
				namespace {

				// Check that the generated header source contains the usual standard library headers.
				TEST(StdLibTests, getStdlibUmbrellaHeader) {
				LangOptions LO;
				LO.CPlusPlus = true;

				auto CXX = getStdlibUmbrellaHeader(LO).str();
				EXPECT_THAT(CXX, HasSubstr("#include <string>"));
				EXPECT_THAT(CXX, HasSubstr("#include <cstdio>"));
				EXPECT_THAT(CXX, Not(HasSubstr("#include <stdio.h>")));

				LO.CPlusPlus = false;
				auto C = getStdlibUmbrellaHeader(LO).str();
				EXPECT_THAT(C, Not(HasSubstr("#include <string>")));
				EXPECT_THAT(C, Not(HasSubstr("#include <cstdio>")));
				EXPECT_THAT(C, HasSubstr("#include <stdio.h>"));
				}

				MATCHER_P(Named, Name, "") { return arg.Name == Name; }

				// Build an index, and check if it contains the right symbols.
				TEST(StdLibTests, indexStandardLibrary) {
				MockFS FS;
				FS.Files["std/foo.h"] = R"cpp(
				#include <platform_stuff.h>
				#if __cplusplus >= 201703L
				int foo17();
				#elif __cplusplus >= 201402L
				int foo14();
				#else
				bool foo98();
				#endif
				)cpp";
				FS.Files["nonstd/platform_stuff.h"] = "int magic = 42;";

				ParseInputs OriginalInputs;
				OriginalInputs.TFS = &FS;
				OriginalInputs.CompileCommand.Filename = testPath("main.cc");
				OriginalInputs.CompileCommand.CommandLine = {"clang++", testPath("main.cc"),
				"-isystemstd/",
				"-isystemnonstd/", "-std=c++14"};
				OriginalInputs.CompileCommand.Directory = testRoot();
				IgnoreDiagnostics Diags;
				auto CI = buildCompilerInvocation(OriginalInputs, Diags);
				ASSERT_TRUE(CI);

				StdLibLocation Loc;
				Loc.Paths.push_back(testPath("std/"));

				auto Symbols =
				indexStandardLibrary("#include <foo.h>", std::move(CI), Loc, FS);
				EXPECT_THAT(Symbols, ElementsAre(Named("foo14")));
				}

				TEST(StdLibTests, StdLibSet) {
				StdLibSet Set;
				MockFS FS;
				FS.Files["std/_"] = "";
				FS.Files["libc/_"] = "";

				auto Add = [&](const LangOptions &LO,
				std::vector<llvm::StringRef> SearchPath) {
				SourceManagerForFile SM("scratch", "");
				SM.get().getFileManager().setVirtualFileSystem(FS.view(llvm::None));
				HeaderSearch HS(/*HSOpts=*/nullptr, SM.get(), SM.get().getDiagnostics(), LO,
				/*Target=*/nullptr);
				for (auto P : SearchPath)
				HS.AddSearchPath(
				DirectoryLookup(
				cantFail(SM.get().getFileManager().getDirectoryRef(testPath(P))),
				SrcMgr::C_System, /*isFramework=*/false),
				true);
				return Set.add(LO, HS);
				};

				Config Cfg;
				Cfg.Index.StandardLibrary = false;
				WithContextValue Disabled(Config::Key, std::move(Cfg));

				LangOptions LO;
				LO.CPlusPlus = true;
				EXPECT_FALSE(Add(LO, {"std"})) << "Disabled in config";

				Cfg = Config();
				Cfg.Index.StandardLibrary = true;
				WithContextValue Enabled(Config::Key, std::move(Cfg));

				EXPECT_FALSE(Add(LO, {"std"})) << "No <vector> found";
				FS.Files["std/vector"] = "class vector;";
				EXPECT_TRUE(Add(LO, {"std"})) << "Indexing as C++98";
				EXPECT_FALSE(Add(LO, {"std"})) << "Don't reindex";
				LO.CPlusPlus11 = true;
				EXPECT_TRUE(Add(LO, {"std"})) << "Indexing as C++11";
				LO.CPlusPlus = false;
				EXPECT_FALSE(Add(LO, {"libc"})) << "No <stdio.h>";
				FS.Files["libc/stdio.h"] = true;
				EXPECT_TRUE(Add(LO, {"libc"})) << "Indexing as C";
				}

				MATCHER_P(StdlibSymbol, Name, "") {
				return arg.Name == Name && arg.Includes.size() == 1 &&
				llvm::StringRef(arg.Includes.front().Header).startswith("<");
				}

				TEST(StdLibTests, EndToEnd) {
				Config Cfg;
				Cfg.Index.StandardLibrary = true;
				WithContextValue Enabled(Config::Key, std::move(Cfg));

				MockFS FS;
				FS.Files["stdlib/vector"] =
				"namespace std { template <class> class vector; }";
				FS.Files["stdlib/list"] =
				" namespace std { template <typename T> class list; }";
				MockCompilationDatabase CDB;
				CDB.ExtraClangFlags.push_back("-isystem" + testPath("stdlib"));
				ClangdServer::Options Opts = ClangdServer::optsForTest();
				Opts.BuildDynamicSymbolIndex = true; // also used for stdlib index
				ClangdServer Server(CDB, FS, Opts);

				Annotations A("std::^");

				Server.addDocument(testPath("foo.cc"), A.code());
				ASSERT_TRUE(Server.blockUntilIdleForTest());
				clangd::CodeCompleteOptions CCOpts;
				auto Completions =
				cantFail(runCodeComplete(Server, testPath("foo.cc"), A.point(), CCOpts));
				EXPECT_THAT(
				Completions.Completions,
				UnorderedElementsAre(StdlibSymbol("list"), StdlibSymbol("vector")));
				}

				} // namespace
				} // namespace clangd
				} // namespace clang

clang-tools-extra/clangd/unittests/TUSchedulerTests.cpp


	TEST_F(TUSchedulerTests, AsyncPreambleThread) {			TEST_F(TUSchedulerTests, AsyncPreambleThread) {
	// Blocks preamble thread while building preamble with \p BlockVersion until			// Blocks preamble thread while building preamble with \p BlockVersion until
	// \p N is notified.			// \p N is notified.
	class BlockPreambleThread : public ParsingCallbacks {			class BlockPreambleThread : public ParsingCallbacks {
	public:			public:
	BlockPreambleThread(llvm::StringRef BlockVersion, Notification &N)			BlockPreambleThread(llvm::StringRef BlockVersion, Notification &N)
	: BlockVersion(BlockVersion), N(N) {}			: BlockVersion(BlockVersion), N(N) {}
	void onPreambleAST(PathRef Path, llvm::StringRef Version, ASTContext &Ctx,			void onPreambleAST(PathRef Path, llvm::StringRef Version,
				const CompilerInvocation &, ASTContext &Ctx,
	Preprocessor &, const CanonicalIncludes &) override {			Preprocessor &, const CanonicalIncludes &) override {
	if (Version == BlockVersion)			if (Version == BlockVersion)
	N.wait();			N.wait();
	}			}

	private:			private:
	llvm::StringRef BlockVersion;			llvm::StringRef BlockVersion;
	Notification &N;			Notification &N;

Diff 430130