This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

- clang-tools-extra/trunk/
  - clangd/
    - CMakeLists.txt
    - index/dex/
      - DexIndex.h
      - DexIndex.cpp
      - Token.h
  - unittests/clangd/
    - CMakeLists.txt
    - DexIndexTests.cpp
    - IndexTests.cpp
    - TestIndex.h
    - TestIndex.cpp

Differential D50337

[clangd] DexIndex implementation prototype
ClosedPublic

Authored by kbobyrev on Aug 6 2018, 8:16 AM.

Details

Reviewers

ioeric
ilya-biryukov

Commits

rG870aaf296396: [clangd] DexIndex implementation prototype
rCTE340175: [clangd] DexIndex implementation prototype
rL340175: [clangd] DexIndex implementation prototype

Summary

This patch is a proof-of-concept Dex index implementation. It has several flaws that prevent it from replacing the static MemIndex yet:

- Short queries: queries shorter than 3 characters cannot be handled. One way to solve this is to generate trigrams of smaller size and store such incomplete trigrams in the index structure.
- Speed measurements: manually editing files in Vim and requesting autocompletion gives the impression that performance is at least comparable with the current static index, but having actual numbers is important because we don't want to hurt users by rolling out slow code. Eric (@ioeric) suggested that we should only replace MemIndex once we have evidence that this is not a performance regression. An approach that is likely to succeed here is to wait until a benchmark library is available in the LLVM core repository, which is something I have suggested on the LLVM mailing lists, received positive feedback on, and started working on. I will add a dependency as soon as the suggested patch is out for review (currently there is at least one complication, which is being addressed by https://github.com/google/benchmark/pull/649). Key performance improvements for iterators are sorting by cost and the limit iterator.
- Quality measurements: the boosting iterator and the two-phase lookup stage are not implemented yet. Without them, the quality is likely to be worse than what the current implementation yields. Measuring quality is tricky, but another suggestion from the offline discussion was that the drop-in replacement should only happen after boosting iterators (and the subsequent query enhancement) are implemented.

The proposed changes do not affect Clangd functionality or performance: DexIndex is only used in unit tests and not in production code.
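
To make the short-query limitation concrete, here is a minimal sketch of trigram generation that also emits an "incomplete" trigram for queries shorter than three characters. The function name `sketchQueryTrigrams` and the `'$'` padding character are assumptions made for illustration; this is not the generator used by the patch, which may differ in how it pads and which characters it skips.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not the clangd API): returns the lowercase trigrams of
// Query, ignoring non-alphanumeric characters such as '_'.
// A non-empty query with fewer than three usable characters yields a single
// "incomplete" trigram padded with '$' (the padding is this sketch's choice).
std::vector<std::string> sketchQueryTrigrams(std::string_view Query) {
  std::string Lower;
  for (char C : Query) {
    const auto U = static_cast<unsigned char>(C);
    if (std::isalnum(U))
      Lower.push_back(static_cast<char>(std::tolower(U)));
  }

  std::vector<std::string> Result;
  if (Lower.empty())
    return Result;
  if (Lower.size() < 3) {
    Lower.resize(3, '$'); // e.g. "a" becomes "a$$", "ab" becomes "ab$"
    Result.push_back(Lower);
    return Result;
  }
  // Slide a window of width three over the normalized query.
  for (std::size_t I = 0; I + 3 <= Lower.size(); ++I)
    Result.push_back(Lower.substr(I, 3));
  return Result;
}
```

With such padding, a one- or two-character query still maps to a token that can be looked up in the index, so it no longer needs a separate code path.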

Diff Detail

Repository: rL LLVM

Event Timeline

kbobyrev created this revision. Aug 6 2018, 8:16 AM

Herald added subscribers: arphaman, mgrang, jkorous and 2 others. Aug 6 2018, 8:16 AM

The patch is currently in preview mode; I have to make a few changes:

- Improve the testing infrastructure. One possible way would be to use exactly the same code MemIndex currently does, as DexIndex is meant to be a drop-in replacement. An existing obstacle is that queries shorter than 3 characters are not handled, but that is not hard to fix.
- Document the DexIndex implementation and think about how to abstract out the very similar code shared with MemIndex. The proposed implementation is rather straightforward, but a few pieces are identical to MemIndex, which causes some code duplication.

Don't resize retrieved symbols vector, simply let callback process at most MaxCandidateCount items.

kbobyrev planned changes to this revision. Aug 7 2018, 1:26 AM

As discussed offline, incomplete trigrams (unigrams and bigrams generation) should be a blocker for this patch, because otherwise it isn't functional. Once incomplete trigrams are in, MemIndex tests can be reused for DexIndex to ensure stability.

Continue implementing Proof of Concept Dex-based static index replacement.

This diff adds short query processing. The current solution does not use the iterators framework yet (unlike general queries) and is subject to change. As discussed offline, this implementation should lean towards simplicity and usability rather than premature optimization.

The patch is not ready for a comprehensive review yet; these are a few points that should be addressed before the review is live:

- Code duplication should be reduced as much as possible. DexIndex is likely to become far more sophisticated than MemIndex in the future, hence it does not simply inherit from or reuse MemIndex. This is also why (as discussed offline) code duplication in unit tests is not that bad, keeping in mind that the functionality and implementation of the two index types will diverge. However, it is better to abstract out as much as possible, provided the implementation does not become less flexible and no cross-dependencies are introduced in the process.
- Slightly clean up the unit tests (IndexHelpers.(h|cpp) is not a very good name for the new file used by both the MemIndex and DexIndex testing frameworks; code duplication is also a slight concern).

kbobyrev planned changes to this revision. Aug 7 2018, 8:23 AM

Minor code cleanup. This is now a fully functional symbol index.

I have reflected my concerns and uncertainties in FIXMEs, please indicate if you think there's something to improve in this patch. In general, I believe it is ready for a review.

ioeric added inline comments. Aug 8 2018, 7:01 AM

clang-tools-extra/clangd/index/MemIndex.h
45 ↗	(On Diff #159658)	I think this FIXME still applies here. This can probably go away when we completely get rid of MemIndex.
clang-tools-extra/clangd/index/dex/DexIndex.cpp
31 ↗	(On Diff #159658)	Couldn't we also build the inverted index outside of the critical section? As this blocks ongoing index requests, we should do as little work as possible in the CS.
40 ↗	(On Diff #159658)	nit: `const auto * Sym`
57 ↗	(On Diff #159658)	nit: Initialize to false
102 ↗	(On Diff #159658)	Could you document the approach you are taking to handle short queries? It seems that this can be very expensive, for example, if all matching symbols are at the end of the DenseMap. Short queries happen very often, so we need to make sure the index handles them efficiently.
102 ↗	(On Diff #159658)	Did you mean to iterate on `Symbols` which is sorted by quality?
clang-tools-extra/clangd/index/dex/DexIndex.h
8 ↗	(On Diff #159658)	I think this file could use some high-level documentation.
27 ↗	(On Diff #159658)	There is some assumption about `Symbols` (like `MemIndex::build`). Please add documentation.
31 ↗	(On Diff #159658)	Do we need this for this patch? If not, I'd suggest leaving it out and revisit when we actually need it (e.g. when replacing MemIndex); otherwise, I think what we want is basically a function that takes a `SymbolSlab` and returns `std::shared_ptr<std::vector<const Symbol *>>`. This can probably live in Index.h as you suggested.
47 ↗	(On Diff #159658)	Unlike `MemIndex` where this is used as an actual index, here it's simply a lookup table, IIUC? Maybe just `SymbolsByID` or `lookupTable`?
49 ↗	(On Diff #159658)	This can use some comments. Could be useful for people who are not familiar with inverted index.
clang-tools-extra/clangd/index/dex/Token.h
84 ↗	(On Diff #159658)	Please document this function.
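
The suggestion above about keeping the critical section small can be sketched as follows, assuming a toy index keyed by plain strings: the inverted index (token to posting list) is built without holding the lock, and only the final move happens under it. `TinyIndex`, `build` and `postings` are hypothetical names used for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature index; the names are illustrative, not clangd's.
class TinyIndex {
public:
  // DocTokens[D] lists the tokens that describe document D.
  // The token -> document-IDs map is built without holding the lock and then
  // moved in under the lock, so readers are blocked only for the move.
  void build(const std::vector<std::vector<std::string>> &DocTokens) {
    std::map<std::string, std::vector<unsigned>> Fresh;
    for (unsigned Doc = 0; Doc < DocTokens.size(); ++Doc)
      for (const std::string &Tok : DocTokens[Doc])
        Fresh[Tok].push_back(Doc); // IDs are appended in ascending order

    std::lock_guard<std::mutex> Lock(Mutex);
    Inverted = std::move(Fresh);
  }

  // Returns a copy of the posting list for Tok (empty if Tok is unknown).
  std::vector<unsigned> postings(const std::string &Tok) const {
    std::lock_guard<std::mutex> Lock(Mutex);
    auto It = Inverted.find(Tok);
    return It == Inverted.end() ? std::vector<unsigned>() : It->second;
  }

private:
  mutable std::mutex Mutex;
  std::map<std::string, std::vector<unsigned>> Inverted; // guarded by Mutex
};
```

For example, building from two documents that share the scope token `"ns::"` gives a posting list containing both document IDs for that token, which is the sense in which an inverted index maps a search token to everything it characterizes.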

ioeric added inline comments. Aug 8 2018, 7:01 AM

clang-tools-extra/clangd/index/dex/DexIndex.cpp
67 ↗	(On Diff #159658)	I'm not quite sure what this FIXME means. What code do you want to share between them? But we do want to refactor the code a bit to make it easier to follow.
92 ↗	(On Diff #159658)	Can we avoid creating scope iterators in the first place if they are not going to be used?
98 ↗	(On Diff #159658)	It seems that the lack of limiting iterators can make the query very inefficient. Maybe we should implement the limiting iterators before getting the index in, in case people start using it before it's ready?
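
As a rough illustration of why limiting matters, the sketch below consumes at most a fixed number of items from a sorted posting list and leaves the rest unvisited. `ListCursor` and `consumeAtMost` are hypothetical stand-ins written for this example; they do not reproduce the iterator interface from `Iterator.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using DocID = unsigned;

// Hypothetical cursor over a sorted posting list. The referenced vector must
// outlive the cursor.
class ListCursor {
public:
  explicit ListCursor(const std::vector<DocID> &Docs) : Docs(Docs) {}

  bool reachedEnd() const { return Pos == Docs.size(); }

  DocID peek() const {
    assert(!reachedEnd() && "peek() past the end");
    return Docs[Pos];
  }

  void advance() {
    assert(!reachedEnd() && "advance() past the end");
    ++Pos;
  }

private:
  const std::vector<DocID> &Docs;
  std::size_t Pos = 0;
};

// Collects at most Limit document IDs; anything beyond the limit is never
// touched, which bounds the work done per query.
std::vector<DocID> consumeAtMost(ListCursor &It, std::size_t Limit) {
  std::vector<DocID> Result;
  while (Result.size() < Limit && !It.reachedEnd()) {
    Result.push_back(It.peek());
    It.advance();
  }
  return Result;
}
```

Because posting lists are stored in descending order of symbol quality, stopping early in this way drops the lowest-quality candidates first.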

kbobyrev added a parent revision: D50500: [clangd] Allow consuming limited number of items. Aug 9 2018, 2:44 AM

Address a round of comments. Also put FIXMEs where appropriate for the future changes.

ioeric added inline comments. Aug 10 2018, 3:02 AM

clang-tools-extra/clangd/index/dex/DexIndex.cpp
29 ↗	(On Diff #159908)	Calculating `quality` on each comparison can also get expensive. I think we could store the quality.
37 ↗	(On Diff #159908)	nit: use `try_emplace` to save one lookup?
62 ↗	(On Diff #159908)	I think we should let the helpers grab the lock. Some preparatory work doesn't require the lock. FWIW, I think the separation of short and long code paths is heading in the wrong direction. It's also pretty hard to find a very clean abstraction here. For example, there are some overlaps between the two functions, and `useCallback` seems a bit awkward. As all this would probably go away after D50517 anyway, I think we could try to get that patch landed and incorporate it into this patch? If you prefer to get this patch in first, please add a `FIXME` somewhere to make it clear that the divergence is temporary.
82 ↗	(On Diff #159908)	nit: avoid `auto` here, as the type of `Score` is not obvious.
84 ↗	(On Diff #159908)	Put this `FIXME` on the for-loop level as iterating all symbols is the problem. And I think the `FIXME` could simply be FIXME(...): This can be very expensive. We should first filter symbols by scopes (with scope iterators). We can leave out the details/options as they are not interesting for most readers (as they are mostly concerns about scope filtering). In general, we should try to keep comments in documentation brief to keep the code shorter and easier to read.
94 ↗	(On Diff #159908)	nit: `if (Score)`
95 ↗	(On Diff #159908)	The code here is trivial, so the comment seems redundant.
102 ↗	(On Diff #159908)	For clarity, `- (Score) quality`. Please also comment on why we negate the number here?
175 ↗	(On Diff #159908)	Shouldn't `Scores` already be sorted for both short query and long query?
clang-tools-extra/clangd/index/dex/DexIndex.h
12 ↗	(On Diff #159908)	nit: we don't really need to mention MemIndex here as it's likely to be replaced soon, and the comment will be outdated then. It's okay to mention why `Dex` is cool though :)
60 ↗	(On Diff #159908)	nit: The comment about implementation details should go with the implementation. Same below.
65 ↗	(On Diff #159908)	`Req.Query.size() < 3`?
clang-tools-extra/clangd/index/dex/Token.h
84 ↗	(On Diff #159658)	nit: I'd try to avoid tying documentation to the current state as it can easily get outdated. If you want to include future changes, consider making it more explicit. For example: Returns the tokens which are given symbol's characteristics. For example, ... trigrams and scopes. FIXME: support more tokens types: - path proximity - ...
88 ↗	(On Diff #159908)	Dependency on `Index.h` from `Token.h` seems strange. I think this function should probably belong in `DexIndex.cpp`? Do we expect this to be used outside of `DexIndex`?

Address most comments.

Store symbol qualities (so that it's not computed each time when requested which might be expensive). Use operator[] to construct the value for inverted index when key is not inserted yet.
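
A minimal sketch of the idea of storing qualities, assuming a toy symbol type: each quality is computed once and cached, and the comparator only performs lookups. `Sym`, `qualityOf` and `sortByQuality` are hypothetical names; the scoring function here is a placeholder and not the real quality heuristic.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

struct Sym {
  std::string Name;
  unsigned References; // stand-in input for a quality signal
};

// Hypothetical scoring function; the real quality() heuristic differs.
float qualityOf(const Sym &S) { return static_cast<float>(S.References); }

// Computes each quality exactly once, then sorts the pointers in descending
// order of quality. The comparator never recomputes a score.
void sortByQuality(std::vector<const Sym *> &Syms) {
  std::unordered_map<const Sym *, float> Quality;
  for (const Sym *S : Syms)
    Quality.emplace(S, qualityOf(*S));

  std::sort(Syms.begin(), Syms.end(), [&](const Sym *L, const Sym *R) {
    return Quality.at(L) > Quality.at(R);
  });
}
```

Sorting performs on the order of N log N comparisons, so caching turns that many scoring calls into N calls plus cheap map lookups.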

kbobyrev mentioned this in D50576: [clangd] Allow consumption of DocIDs without overhead. Aug 10 2018, 11:35 AM

As discussed offline, I should update the patch to reflect changes accepted in https://reviews.llvm.org/D50517.

Don't separate the logic for "long" and "short" queries: D50517 (rCTE339548) introduced incomplete trigrams which can be used for "short" queries, too.

In D50337#1198914, @kbobyrev wrote:

Don't separate the logic for "long" and "short" queries: D50517 (rCTE339548) introduced incomplete trigrams which can be used for "short" queries, too.

Have you forgotten to upload the new revision? :)

Sorry, the last diff was the old one. Should be correct now.

ioeric added inline comments. Aug 16 2018, 7:54 AM

clang-tools-extra/clangd/index/dex/DexIndex.cpp
25 ↗	(On Diff #161017)	Why is this enforced? `fuzzyFind` doesn't say anything about callback order. Also `useCallback` seems to be the right abstraction to me; different requests can have different callback behaviors. I think we could simply inline the code here.
99 ↗	(On Diff #161017)	Any reason we are not doing the two-stage scoring now? Retrieving while scoring with more expensive scoring seems to be diverging from the expected design. I think we can retrieve a relatively large number of symbols (e.g. a fixed large number or `100*MaxCandidateCount`?) before re-scoring them with more expensive scorers (e.g. fuzzy match), as consuming only `Req.MaxCandidateCount` symbols from the iterator tree can easily leave out good candidates (e.g. those that would've gotten good fuzzy match scores).
110 ↗	(On Diff #161017)	It seems that the trigram generation could be done outside of the critical section?
111 ↗	(On Diff #161017)	I think we could pull some helper functions here to make the code a bit easier to follow e.g. `createTrigramIterators(TrigramTokens)`, `createScopeIterators(scopes)`...
clang-tools-extra/clangd/index/dex/DexIndex.h
50 ↗	(On Diff #161017)	Why virtual?
clang-tools-extra/unittests/clangd/TestIndexOperations.h
1 ↗	(On Diff #161017)	As this file contains helper functions for testing indexes, I'd suggest calling this `TestIndex.h` like `TestTU.h`.

Address a round of comments.

ioeric added inline comments. Aug 17 2018, 6:27 AM

clang-tools-extra/clangd/index/dex/DexIndex.cpp
86 ↗	(On Diff #161202)	`createTrigramIterators` and `createScopeIterators` use `InvertedIndex`, so they should be called in the critical section. Maybe:
/// ....
/// Requires: Called from a critical section of `InvertedIndex`.
std::vector<std::unique_ptr<Iterator>> createTrigramIterators(
    const llvm::DenseMap<Token, PostingList> &InvertedIndex, TrigramTokens);
102 ↗	(On Diff #161202)	Maybe add a `FIXME` about finding the best pre-scoring retrieval threshold. I'm not sure if `100*MaxCandidateCount` would work well in practice. For example, if the `MaxCandidateCount` is small, e.g. `5`, then the retrieval threshold would only be 500, which still seems to be too small.
110 ↗	(On Diff #161202)	I don't think this assertion is well justified. I think we should just skip if the fuzzy match fails.
118 ↗	(On Diff #161202)	I think we should use a priority queue so that we don't need to store/sort all retrieved symbols.
clang-tools-extra/unittests/clangd/DexIndexTests.cpp
367 ↗	(On Diff #161202)	This seems to be testing `mergeIndex(...)` which is not relevant in this patch?
388 ↗	(On Diff #161202)	Now that we are handling both short and long queries. I think we can address this FIXME in this patch?
540 ↗	(On Diff #161202)	Again, `mergeIndex(...)` is not interesting here.
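
The priority-queue suggestion above can be sketched as a bounded heap that never holds more than K candidates. Note that this sketch uses `std::greater` to obtain a min-heap, which is a different way to reach the same effect as negating the score in a default max-heap; `Scored` and `topK` are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

using Scored = std::pair<float, std::string>; // (score, symbol name)

// Keeps the K highest-scoring candidates using a min-heap of size at most K,
// so memory stays O(K) no matter how many candidates are offered.
// Returns the survivors ordered from best to worst.
std::vector<Scored> topK(const std::vector<Scored> &Candidates, std::size_t K) {
  std::priority_queue<Scored, std::vector<Scored>, std::greater<Scored>> Heap;
  for (const Scored &C : Candidates) {
    Heap.push(C);
    if (Heap.size() > K)
      Heap.pop(); // drops the current lowest score
  }

  // The heap yields the worst survivor first, so fill the result backwards.
  std::vector<Scored> Result(Heap.size());
  for (std::size_t I = Result.size(); I > 0; --I) {
    Result[I - 1] = Heap.top();
    Heap.pop();
  }
  return Result;
}
```

Compared with storing and sorting every retrieved symbol, this keeps only the current best K in memory and does a logarithmic amount of work per candidate.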

Address another round of comments.

Almost LG! Just a few more nits.

clang-tools-extra/clangd/index/dex/DexIndex.cpp
87 ↗	(On Diff #161235)	nit: move `SymbolDocIDs` and `Top` closer to where they're used.
97 ↗	(On Diff #161235)	I think we should let `createScopeIterator` handle empty scope list case; it can return an empty list anyway.
109 ↗	(On Diff #161235)	This is not a proper place to set `More`. It's already handled below.
112 ↗	(On Diff #161235)	nit: use range-based for loop?
113 ↗	(On Diff #161235)	nit: Maybe take `const auto &Sym`?
119 ↗	(On Diff #161235)	nit: `- (Score) ...` for readability.
127 ↗	(On Diff #161235)	Can we simply iterate without `pop()`?
150 ↗	(On Diff #161235)	This assumes that `createTrigramIterator` and `createScopeIterator` are already guarded by the mutex, which is implicit. I think we can make it clearer by making these local helpers that take in `InvertedIndex` with the requirement that the lock has been acquired.
clang-tools-extra/unittests/clangd/DexIndexTests.cpp
366 ↗	(On Diff #161235)	What tests do we want? If it's related to the changes in this patch, we should add it now. Tests shouldn't be `FIXME` :)

kbobyrev added a child revision: D50897: [clangd] Allow using experimental Dex index. Aug 17 2018, 7:17 AM

Address another round of comments.

ioeric added inline comments. Aug 17 2018, 8:17 AM

clang-tools-extra/clangd/index/dex/DexIndex.cpp
97 ↗	(On Diff #161252)	As discussed offline, triggering an assertion seems to be pretty bad behavior. Although the trigram generation, as you suggested, always produces more than one token, we should try to get rid of this FIXME by introducing the true iterator as proposed here.
121 ↗	(On Diff #161252)	Still, we shouldn't set `More` here.
clang-tools-extra/clangd/index/dex/DexIndex.h
58 ↗	(On Diff #161252)	nit: add /*GUARDED_BY(Mutex)*/
clang-tools-extra/unittests/clangd/DexIndexTests.cpp
366 ↗	(On Diff #161252)	Please add tests with empty `Query`.

Address all the comments, except the one about True iterators.

I should create another patch with True iterator to address the last comment.

clang-tools-extra/clangd/index/dex/DexIndex.cpp
97 ↗	(On Diff #161235)	Yes, but it returns an iterator now and `OrIterator` (just like any other iterator) has to have non-empty list of children.

Use TRUE iterator to ensure validity of the query processing.
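
A minimal sketch of what a "match everything" iterator could look like, assuming document IDs are the integers in `[0, Size)`. `TrueCursor` and `drain` are hypothetical names written for this example and are not the implementation from D50955.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using DocID = unsigned;

// Hypothetical stand-in for a TRUE iterator: it enumerates every document ID
// in [0, Size) in ascending order, so it can serve as the query root when no
// trigram or scope token from the request is present in the index.
class TrueCursor {
public:
  explicit TrueCursor(DocID Size) : Size(Size) {}

  bool reachedEnd() const { return Index >= Size; }

  DocID peek() const {
    assert(!reachedEnd() && "peek() past the end");
    return Index;
  }

  void advance() {
    assert(!reachedEnd() && "advance() past the end");
    ++Index;
  }

  // Moves forward to the first ID that is >= ID, or to the end.
  // Never moves backwards.
  void advanceTo(DocID ID) { Index = std::min(std::max(Index, ID), Size); }

private:
  DocID Size;
  DocID Index = 0;
};

// Collects every remaining ID from the cursor.
std::vector<DocID> drain(TrueCursor It) {
  std::vector<DocID> Result;
  for (; !It.reachedEnd(); It.advance())
    Result.push_back(It.peek());
  return Result;
}
```

With such an iterator, an empty query with no scopes simply enumerates all symbols instead of tripping an assertion about an empty list of children.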

Herald added a subscriber: kadircet. Aug 20 2018, 1:30 AM

kbobyrev added a parent revision: D50955: [clangd] Implement TRUE Iterator. Aug 20 2018, 1:31 AM

LG. Last few nits and then good to go.

clang-tools-extra/clangd/index/dex/DexIndex.cpp
97 ↗	(On Diff #161432)	Check `!TrigramIterators.empty()`?
128 ↗	(On Diff #161432)	nit: `if (!Score)`
149 ↗	(On Diff #161432)	This also needs lock.
clang-tools-extra/unittests/clangd/TestIndex.h
23 ↗	(On Diff #161432)	Add comment about what `SymbolID` is?
25 ↗	(On Diff #161432)	Please add documentation.
46 ↗	(On Diff #161432)	Please add documentation.

This revision is now accepted and ready to land. Aug 20 2018, 5:25 AM

Address post-LGTM comments.

Closed by commit rL340175: [clangd] DexIndex implementation prototype (authored by omtcyfz). Aug 20 2018, 7:40 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: llvm-commits. Aug 20 2018, 7:40 AM

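Revision Contents

Path	Size
clang-tools-extra/trunk/clangd/CMakeLists.txt	1 line
clang-tools-extra/trunk/clangd/index/dex/DexIndex.h	76 lines
clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp	167 lines
clang-tools-extra/trunk/clangd/index/dex/Token.h	2 lines
clang-tools-extra/trunk/unittests/clangd/CMakeLists.txt	1 line
clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp	179 lines
clang-tools-extra/trunk/unittests/clangd/IndexTests.cpp	84 lines
clang-tools-extra/trunk/unittests/clangd/TestIndex.h	64 lines
clang-tools-extra/trunk/unittests/clangd/TestIndex.cpp	83 lines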

Diff 161481

clang-tools-extra/trunk/clangd/CMakeLists.txt

Show All 37 Lines	add_clang_library(clangDaemon
index/CanonicalIncludes.cpp		index/CanonicalIncludes.cpp
index/FileIndex.cpp		index/FileIndex.cpp
index/Index.cpp		index/Index.cpp
index/MemIndex.cpp		index/MemIndex.cpp
index/Merge.cpp		index/Merge.cpp
index/SymbolCollector.cpp		index/SymbolCollector.cpp
index/SymbolYAML.cpp		index/SymbolYAML.cpp

		index/dex/DexIndex.cpp
index/dex/Iterator.cpp		index/dex/Iterator.cpp
index/dex/Trigram.cpp		index/dex/Trigram.cpp

LINK_LIBS		LINK_LIBS
clangAST		clangAST
clangASTMatchers		clangASTMatchers
clangBasic		clangBasic
clangDriver		clangDriver
Show All 19 Lines

clang-tools-extra/trunk/clangd/index/dex/DexIndex.h

				//===--- DexIndex.h - Dex Symbol Index Implementation -----------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This defines Dex - a symbol index implementation based on query iterators
				// over symbol tokens, such as fuzzy matching trigrams, scopes, types, etc.
				// While consuming more memory and having longer build stage due to
				// preprocessing, Dex will have substantially lower latency. It will also allow
				// efficient symbol searching which is crucial for operations like code
				// completion, and can be very important for a number of different code
				// transformations which will be eventually supported by Clangd.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_DEX_DEXINDEX_H
				#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_DEX_DEXINDEX_H

				#include "../Index.h"
				#include "../MemIndex.h"
				#include "Iterator.h"
				#include "Token.h"
				#include "Trigram.h"
				#include <mutex>

				namespace clang {
				namespace clangd {
				namespace dex {

				/// In-memory Dex trigram-based index implementation.
				// FIXME(kbobyrev): Introduce serialization and deserialization of the symbol
				// index so that it can be loaded from the disk. Since static index is not
				// changed frequently, it's safe to assume that it has to be built only once
				// (when the clangd process starts). Therefore, it can be easier to store built
				// index on disk and then load it if available.
				class DexIndex : public SymbolIndex {
				public:
				/// \brief (Re-)Build index for `Symbols`. All symbol pointers must remain
				/// accessible as long as `Symbols` is kept alive.
				void build(std::shared_ptr<std::vector<const Symbol *>> Symbols);

				bool
				fuzzyFind(const FuzzyFindRequest &Req,
				llvm::function_ref<void(const Symbol &)> Callback) const override;

				void lookup(const LookupRequest &Req,
				llvm::function_ref<void(const Symbol &)> Callback) const override;

				void findOccurrences(const OccurrencesRequest &Req,
				llvm::function_ref<void(const SymbolOccurrence &)>
				Callback) const override;

				private:
				mutable std::mutex Mutex;

				std::shared_ptr<std::vector<const Symbol *>> Symbols /*GUARDED_BY(Mutex)*/;
				llvm::DenseMap<SymbolID, const Symbol *> LookupTable /*GUARDED_BY(Mutex)*/;
				llvm::DenseMap<const Symbol *, float> SymbolQuality /*GUARDED_BY(Mutex)*/;
				// Inverted index is a mapping from the search token to the posting list,
				// which contains all items which can be characterized by such search token.
				// For example, if the search token is scope "std::", the corresponding
				// posting list would contain all indices of symbols defined in namespace std.
				// Inverted index is used to retrieve posting lists which are processed during
				// the fuzzyFind process.
				llvm::DenseMap<Token, PostingList> InvertedIndex /*GUARDED_BY(Mutex)*/;
				};

				} // namespace dex
				} // namespace clangd
				} // namespace clang

				#endif

clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp

				//===--- DexIndex.cpp - Dex Symbol Index Implementation ---------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#include "DexIndex.h"
				#include "../../FuzzyMatch.h"
				#include "../../Logger.h"
				#include <algorithm>
				#include <queue>

				namespace clang {
				namespace clangd {
				namespace dex {

				namespace {

				// Returns the tokens which are given symbol's characteristics. Currently, the
				// generated tokens only contain fuzzy matching trigrams and symbol's scope,
				// but in the future this will also return path proximity tokens and other
				// types of tokens such as symbol type (if applicable).
				// Returns the tokens which are given symbols's characteristics. For example,
				// trigrams and scopes.
				// FIXME(kbobyrev): Support more token types:
				// * Path proximity
				// * Types
				std::vector<Token> generateSearchTokens(const Symbol &Sym) {
				std::vector<Token> Result = generateIdentifierTrigrams(Sym.Name);
				Result.push_back(Token(Token::Kind::Scope, Sym.Scope));
				return Result;
				}

				} // namespace

				void DexIndex::build(std::shared_ptr<std::vector<const Symbol *>> Syms) {
				llvm::DenseMap<SymbolID, const Symbol *> TempLookupTable;
				llvm::DenseMap<const Symbol *, float> TempSymbolQuality;
				for (const Symbol *Sym : *Syms) {
				TempLookupTable[Sym->ID] = Sym;
				TempSymbolQuality[Sym] = quality(*Sym);
				}

				// Symbols are sorted by symbol qualities so that items in the posting lists
				// are stored in the descending order of symbol quality.
				std::sort(begin(*Syms), end(*Syms),
				[&](const Symbol *LHS, const Symbol *RHS) {
				return TempSymbolQuality[LHS] > TempSymbolQuality[RHS];
				});
				llvm::DenseMap<Token, PostingList> TempInvertedIndex;
				// Populate TempInvertedIndex with posting lists for index symbols.
				for (DocID SymbolRank = 0; SymbolRank < Syms->size(); ++SymbolRank) {
				const auto *Sym = (*Syms)[SymbolRank];
				for (const auto &Token : generateSearchTokens(*Sym))
				TempInvertedIndex[Token].push_back(SymbolRank);
				}

				{
				std::lock_guard<std::mutex> Lock(Mutex);

				// Replace outdated index with the new one.
				LookupTable = std::move(TempLookupTable);
				Symbols = std::move(Syms);
				InvertedIndex = std::move(TempInvertedIndex);
				SymbolQuality = std::move(TempSymbolQuality);
				}
				}

				/// Constructs iterators over tokens extracted from the query and exhausts it
				/// while applying Callback to each symbol in the order of decreasing quality
				/// of the matched symbols.
				bool DexIndex::fuzzyFind(
				const FuzzyFindRequest &Req,
				llvm::function_ref<void(const Symbol &)> Callback) const {
				assert(!StringRef(Req.Query).contains("::") &&
				"There must be no :: in query.");
				FuzzyMatcher Filter(Req.Query);
				bool More = false;

				std::vector<std::unique_ptr<Iterator>> TopLevelChildren;
				const auto TrigramTokens = generateIdentifierTrigrams(Req.Query);

				{
				std::lock_guard<std::mutex> Lock(Mutex);

				// Generate query trigrams and construct AND iterator over all query
				// trigrams.
				std::vector<std::unique_ptr<Iterator>> TrigramIterators;
				for (const auto &Trigram : TrigramTokens) {
				const auto It = InvertedIndex.find(Trigram);
				if (It != InvertedIndex.end())
				TrigramIterators.push_back(create(It->second));
				}
				if (!TrigramIterators.empty())
				TopLevelChildren.push_back(createAnd(move(TrigramIterators)));

				// Generate scope tokens for search query.
				std::vector<std::unique_ptr<Iterator>> ScopeIterators;
				for (const auto &Scope : Req.Scopes) {
				const auto It = InvertedIndex.find(Token(Token::Kind::Scope, Scope));
				if (It != InvertedIndex.end())
				ScopeIterators.push_back(create(It->second));
				}
				// Add OR iterator for scopes if there are any Scope Iterators.
				if (!ScopeIterators.empty())
				TopLevelChildren.push_back(createOr(move(ScopeIterators)));

				// Use TRUE iterator if both trigrams and scopes from the query are not
				// present in the symbol index.
				auto QueryIterator = TopLevelChildren.empty()
				? createTrue(Symbols->size())
				: createAnd(move(TopLevelChildren));
				// Retrieve more items than it was requested: some of the items with high
				// final score might not be retrieved otherwise.
				// FIXME(kbobyrev): Pre-scoring retrieval threshold should be adjusted as
				// using 100x of the requested number might not be good in practice, e.g.
				// when the requested number of items is small.
				const unsigned ItemsToRetrieve = 100 * Req.MaxCandidateCount;
				std::vector<DocID> SymbolDocIDs = consume(*QueryIterator, ItemsToRetrieve);

				// Retrieve top Req.MaxCandidateCount items.
				std::priority_queue<std::pair<float, const Symbol *>> Top;
				for (const auto &SymbolDocID : SymbolDocIDs) {
				const auto *Sym = (*Symbols)[SymbolDocID];
				const llvm::Optional<float> Score = Filter.match(Sym->Name);
				if (!Score)
				continue;
				// Multiply score by a negative factor so that Top stores items with the
				// highest actual score.
				Top.emplace(-(*Score) * SymbolQuality.find(Sym)->second, Sym);
				if (Top.size() > Req.MaxCandidateCount) {
				More = true;
				Top.pop();
				}
				}

				// Apply callback to the top Req.MaxCandidateCount items.
				for (; !Top.empty(); Top.pop())
				Callback(*Top.top().second);
				}

				return More;
				}

				void DexIndex::lookup(const LookupRequest &Req,
				llvm::function_ref<void(const Symbol &)> Callback) const {
				std::lock_guard<std::mutex> Lock(Mutex);
				for (const auto &ID : Req.IDs) {
				auto I = LookupTable.find(ID);
				if (I != LookupTable.end())
				Callback(*I->second);
				}
				}


				void DexIndex::findOccurrences(
				const OccurrencesRequest &Req,
				llvm::function_ref<void(const SymbolOccurrence &)> Callback) const {
				log("findOccurrences is not implemented.");
				}

				} // namespace dex
				} // namespace clangd
				} // namespace clang
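
The query built in `fuzzyFind` above has the shape "all query trigrams AND (any requested scope)". As a rough, eager illustration of that shape, the sketch below combines sorted posting lists with standard algorithms. This is a deliberate simplification: the iterators in the patch are lazy and can stop early, whereas `intersect` and `unite` (hypothetical names) materialize full results.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// A posting list in this sketch is a sorted vector of unique document IDs.
using PostingList = std::vector<unsigned>;

// AND: documents present in both lists.
PostingList intersect(const PostingList &A, const PostingList &B) {
  PostingList Out;
  std::set_intersection(A.begin(), A.end(), B.begin(), B.end(),
                        std::back_inserter(Out));
  return Out;
}

// OR: documents present in at least one list.
PostingList unite(const PostingList &A, const PostingList &B) {
  PostingList Out;
  std::set_union(A.begin(), A.end(), B.begin(), B.end(),
                 std::back_inserter(Out));
  return Out;
}
```

For example, if a trigram matches documents `{0, 2, 3, 5}` and two requested scopes match `{0, 1, 2}` and `{5}`, then `intersect(Trigram, unite(ScopeA, ScopeB))` yields `{0, 2, 5}`: the documents that contain the trigram and live in either scope.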

clang-tools-extra/trunk/clangd/index/dex/Token.h

	Show All 16 Lines
	// * Trigram "out"			// * Trigram "out"
	// * Type "std::ostream"			// * Type "std::ostream"
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_DEX_TOKEN_H			#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_DEX_TOKEN_H
	#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_DEX_TOKEN_H			#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_DEX_TOKEN_H

				#include "../Index.h"
	#include "llvm/ADT/DenseMap.h"			#include "llvm/ADT/DenseMap.h"
	#include "llvm/Support/raw_ostream.h"			#include "llvm/Support/raw_ostream.h"

	#include <string>			#include <string>
	#include <vector>			#include <vector>

	namespace clang {			namespace clang {
	namespace clangd {			namespace clangd {
	namespace dex {			namespace dex {

	/// A Token represents an attribute of a symbol, such as a particular trigram			/// A Token represents an attribute of a symbol, such as a particular trigram
	▲ Show 20 Lines • Show All 77 Lines • Show Last 20 Lines

clang-tools-extra/trunk/unittests/clangd/CMakeLists.txt

Show All 24 Lines	add_extra_unittest(ClangdTests
HeadersTests.cpp		HeadersTests.cpp
IndexTests.cpp		IndexTests.cpp
QualityTests.cpp		QualityTests.cpp
SourceCodeTests.cpp		SourceCodeTests.cpp
SymbolCollectorTests.cpp		SymbolCollectorTests.cpp
SyncAPI.cpp		SyncAPI.cpp
TUSchedulerTests.cpp		TUSchedulerTests.cpp
TestFS.cpp		TestFS.cpp
		TestIndex.cpp
TestTU.cpp		TestTU.cpp
ThreadingTests.cpp		ThreadingTests.cpp
TraceTests.cpp		TraceTests.cpp
URITests.cpp		URITests.cpp
XRefsTests.cpp		XRefsTests.cpp
)		)

target_link_libraries(ClangdTests		target_link_libraries(ClangdTests
Show All 15 Lines

clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp

//===-- DexIndexTests.cpp ----------------------------- C++ ------------===//		//===-- DexIndexTests.cpp ----------------------------- C++ ------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		#include "TestIndex.h"
		#include "index/Index.h"
		#include "index/Merge.h"
		#include "index/dex/DexIndex.h"
#include "index/dex/Iterator.h"		#include "index/dex/Iterator.h"
#include "index/dex/Token.h"		#include "index/dex/Token.h"
#include "index/dex/Trigram.h"		#include "index/dex/Trigram.h"
#include "llvm/Support/ScopedPrinter.h"		#include "llvm/Support/ScopedPrinter.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "gmock/gmock.h"		#include "gmock/gmock.h"
#include "gtest/gtest.h"		#include "gtest/gtest.h"
#include <string>		#include <string>
#include <vector>		#include <vector>

		using ::testing::ElementsAre;
		using ::testing::UnorderedElementsAre;

namespace clang {		namespace clang {
namespace clangd {		namespace clangd {
namespace dex {		namespace dex {
		namespace {
using ::testing::ElementsAre;

TEST(DexIndexIterators, DocumentIterator) {		TEST(DexIndexIterators, DocumentIterator) {
const PostingList L = {4, 7, 8, 20, 42, 100};		const PostingList L = {4, 7, 8, 20, 42, 100};
auto DocIterator = create(L);		auto DocIterator = create(L);

EXPECT_EQ(DocIterator->peek(), 4U);		EXPECT_EQ(DocIterator->peek(), 4U);
EXPECT_FALSE(DocIterator->reachedEnd());		EXPECT_FALSE(DocIterator->reachedEnd());

[... 321 lines not shown ...]	TEST(DexIndexTrigrams, QueryTrigrams) {

EXPECT_THAT(generateQueryTrigrams("IsOK"), trigramsAre({"iso", "sok"}));		EXPECT_THAT(generateQueryTrigrams("IsOK"), trigramsAre({"iso", "sok"}));

EXPECT_THAT(generateQueryTrigrams("abc_defGhij__klm"),		EXPECT_THAT(generateQueryTrigrams("abc_defGhij__klm"),
trigramsAre({"abc", "bcd", "cde", "def", "efg", "fgh", "ghi",		trigramsAre({"abc", "bcd", "cde", "def", "efg", "fgh", "ghi",
"hij", "ijk", "jkl", "klm"}));		"hij", "ijk", "jkl", "klm"}));
}		}

		TEST(DexIndex, Lookup) {
		DexIndex I;
		I.build(generateSymbols({"ns::abc", "ns::xyz"}));
		EXPECT_THAT(lookup(I, SymbolID("ns::abc")), UnorderedElementsAre("ns::abc"));
		EXPECT_THAT(lookup(I, {SymbolID("ns::abc"), SymbolID("ns::xyz")}),
		UnorderedElementsAre("ns::abc", "ns::xyz"));
		EXPECT_THAT(lookup(I, {SymbolID("ns::nonono"), SymbolID("ns::xyz")}),
		UnorderedElementsAre("ns::xyz"));
		EXPECT_THAT(lookup(I, SymbolID("ns::nonono")), UnorderedElementsAre());
		}

		TEST(DexIndex, FuzzyFind) {
		DexIndex Index;
		Index.build(generateSymbols({"ns::ABC", "ns::BCD", "::ABC", "ns::nested::ABC",
		"other::ABC", "other::A"}));
		FuzzyFindRequest Req;
		Req.Query = "ABC";
		Req.Scopes = {"ns::"};
		EXPECT_THAT(match(Index, Req), UnorderedElementsAre("ns::ABC"));
		Req.Scopes = {"ns::", "ns::nested::"};
		EXPECT_THAT(match(Index, Req),
		UnorderedElementsAre("ns::ABC", "ns::nested::ABC"));
		Req.Query = "A";
		Req.Scopes = {"other::"};
		EXPECT_THAT(match(Index, Req),
		UnorderedElementsAre("other::A", "other::ABC"));
		Req.Query = "";
		Req.Scopes = {};
		EXPECT_THAT(match(Index, Req),
		UnorderedElementsAre("ns::ABC", "ns::BCD", "::ABC",
		"ns::nested::ABC", "other::ABC",
		"other::A"));
		}

		TEST(DexIndexTest, FuzzyMatchQ) {
		DexIndex I;
		I.build(
		generateSymbols({"LaughingOutLoud", "LionPopulation", "LittleOldLady"}));
		FuzzyFindRequest Req;
		Req.Query = "lol";
		Req.MaxCandidateCount = 2;
		EXPECT_THAT(match(I, Req),
		UnorderedElementsAre("LaughingOutLoud", "LittleOldLady"));
		}

		TEST(DexIndexTest, DexIndexSymbolsRecycled) {
		DexIndex I;
		std::weak_ptr<SlabAndPointers> Symbols;
		I.build(generateNumSymbols(0, 10, &Symbols));
		FuzzyFindRequest Req;
		Req.Query = "7";
		EXPECT_THAT(match(I, Req), UnorderedElementsAre("7"));

		EXPECT_FALSE(Symbols.expired());
		// Release old symbols.
		I.build(generateNumSymbols(0, 0));
		EXPECT_TRUE(Symbols.expired());
		}

		// FIXME(kbobyrev): This test is different for DexIndex and MemIndex: while
		// MemIndex manages response deduplication, DexIndex simply returns all matched
		// symbols which means there might be equivalent symbols in the response.
		// Before drop-in replacement of MemIndex with DexIndex happens, FileIndex
		// should handle deduplication instead.
		TEST(DexIndexTest, DexIndexDeduplicate) {
		auto Symbols = generateNumSymbols(0, 10);

		// Inject duplicates.
		auto Sym = symbol("7");
		Symbols->push_back(&Sym);
		Symbols->push_back(&Sym);
		Symbols->push_back(&Sym);

		FuzzyFindRequest Req;
		Req.Query = "7";
		DexIndex I;
		I.build(std::move(Symbols));
		auto Matches = match(I, Req);
		EXPECT_EQ(Matches.size(), 4u);
		}

		TEST(DexIndexTest, DexIndexLimitedNumMatches) {
		DexIndex I;
		I.build(generateNumSymbols(0, 100));
		FuzzyFindRequest Req;
		Req.Query = "5";
		Req.MaxCandidateCount = 3;
		bool Incomplete;
		auto Matches = match(I, Req, &Incomplete);
		EXPECT_EQ(Matches.size(), Req.MaxCandidateCount);
		EXPECT_TRUE(Incomplete);
		}

		TEST(DexIndexTest, FuzzyMatch) {
		DexIndex I;
		I.build(
		generateSymbols({"LaughingOutLoud", "LionPopulation", "LittleOldLady"}));
		FuzzyFindRequest Req;
		Req.Query = "lol";
		Req.MaxCandidateCount = 2;
		EXPECT_THAT(match(I, Req),
		UnorderedElementsAre("LaughingOutLoud", "LittleOldLady"));
		}

		TEST(DexIndexTest, MatchQualifiedNamesWithoutSpecificScope) {
		DexIndex I;
		I.build(generateSymbols({"a::y1", "b::y2", "y3"}));
		FuzzyFindRequest Req;
		Req.Query = "y";
		EXPECT_THAT(match(I, Req), UnorderedElementsAre("a::y1", "b::y2", "y3"));
		}

		TEST(DexIndexTest, MatchQualifiedNamesWithGlobalScope) {
		DexIndex I;
		I.build(generateSymbols({"a::y1", "b::y2", "y3"}));
		FuzzyFindRequest Req;
		Req.Query = "y";
		Req.Scopes = {""};
		EXPECT_THAT(match(I, Req), UnorderedElementsAre("y3"));
		}

		TEST(DexIndexTest, MatchQualifiedNamesWithOneScope) {
		DexIndex I;
		I.build(generateSymbols({"a::y1", "a::y2", "a::x", "b::y2", "y3"}));
		FuzzyFindRequest Req;
		Req.Query = "y";
		Req.Scopes = {"a::"};
		EXPECT_THAT(match(I, Req), UnorderedElementsAre("a::y1", "a::y2"));
		}

		TEST(DexIndexTest, MatchQualifiedNamesWithMultipleScopes) {
		DexIndex I;
		I.build(generateSymbols({"a::y1", "a::y2", "a::x", "b::y3", "y3"}));
		FuzzyFindRequest Req;
		Req.Query = "y";
		Req.Scopes = {"a::", "b::"};
		EXPECT_THAT(match(I, Req), UnorderedElementsAre("a::y1", "a::y2", "b::y3"));
		}

		TEST(DexIndexTest, NoMatchNestedScopes) {
		DexIndex I;
		I.build(generateSymbols({"a::y1", "a::b::y2"}));
		FuzzyFindRequest Req;
		Req.Query = "y";
		Req.Scopes = {"a::"};
		EXPECT_THAT(match(I, Req), UnorderedElementsAre("a::y1"));
		}

		TEST(DexIndexTest, IgnoreCases) {
		DexIndex I;
		I.build(generateSymbols({"ns::ABC", "ns::abc"}));
		FuzzyFindRequest Req;
		Req.Query = "AB";
		Req.Scopes = {"ns::"};
		EXPECT_THAT(match(I, Req), UnorderedElementsAre("ns::ABC", "ns::abc"));
		}

		TEST(DexIndexTest, Lookup) {
		DexIndex I;
		I.build(generateSymbols({"ns::abc", "ns::xyz"}));
		EXPECT_THAT(lookup(I, SymbolID("ns::abc")), UnorderedElementsAre("ns::abc"));
		EXPECT_THAT(lookup(I, {SymbolID("ns::abc"), SymbolID("ns::xyz")}),
		UnorderedElementsAre("ns::abc", "ns::xyz"));
		EXPECT_THAT(lookup(I, {SymbolID("ns::nonono"), SymbolID("ns::xyz")}),
		UnorderedElementsAre("ns::xyz"));
		EXPECT_THAT(lookup(I, SymbolID("ns::nonono")), UnorderedElementsAre());
		}

		} // namespace
} // namespace dex		} // namespace dex
} // namespace clangd		} // namespace clangd
} // namespace clang		} // namespace clang
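
The `QueryTrigrams` expectations above can be reproduced with a very small normalization step. The following is only a sketch under stated assumptions, not the patch's `generateQueryTrigrams()` from `index/dex/Trigram.h`: the helper name `sketchQueryTrigrams` is hypothetical, and it assumes that lowercasing the query, dropping non-alphanumeric characters, and sliding a three-character window is enough for these two inputs. The real implementation may treat other inputs differently.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper (not part of the patch): lowercases the query, drops
// every non-alphanumeric character, and collects each window of three
// consecutive characters from the normalized string.
std::set<std::string> sketchQueryTrigrams(const std::string &Query) {
  std::string Normalized;
  for (unsigned char C : Query)
    if (std::isalnum(C))
      Normalized.push_back(static_cast<char>(std::tolower(C)));

  std::set<std::string> Trigrams;
  for (std::size_t I = 0; I + 3 <= Normalized.size(); ++I)
    Trigrams.insert(Normalized.substr(I, 3));
  return Trigrams;
}

// Mirrors the two inputs used in TEST(DexIndexTrigrams, QueryTrigrams).
void checkSketch() {
  const std::set<std::string> IsOK = {"iso", "sok"};
  assert(sketchQueryTrigrams("IsOK") == IsOK);

  const std::set<std::string> Long = {"abc", "bcd", "cde", "def", "efg", "fgh",
                                      "ghi", "hij", "ijk", "jkl", "klm"};
  assert(sketchQueryTrigrams("abc_defGhij__klm") == Long);
}
```

Under this sketch, queries shorter than three characters produce no trigrams at all, which corresponds to the limitation on short queries noted in the summary.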

clang-tools-extra/trunk/unittests/clangd/IndexTests.cpp

//===-- IndexTests.cpp -------------------------------- C++ ------------===//		//===-- IndexTests.cpp -------------------------------- C++ ------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		#include "TestIndex.h"
#include "index/Index.h"		#include "index/Index.h"
#include "index/MemIndex.h"		#include "index/MemIndex.h"
#include "index/Merge.h"		#include "index/Merge.h"
#include "gmock/gmock.h"		#include "gmock/gmock.h"
#include "gtest/gtest.h"		#include "gtest/gtest.h"

using testing::UnorderedElementsAre;
using testing::Pointee;		using testing::Pointee;
		using testing::UnorderedElementsAre;

namespace clang {		namespace clang {
namespace clangd {		namespace clangd {
namespace {		namespace {

Symbol symbol(llvm::StringRef QName) {
Symbol Sym;
Sym.ID = SymbolID(QName.str());
size_t Pos = QName.rfind("::");
if (Pos == llvm::StringRef::npos) {
Sym.Name = QName;
Sym.Scope = "";
} else {
Sym.Name = QName.substr(Pos + 2);
Sym.Scope = QName.substr(0, Pos + 2);
}
return Sym;
}

MATCHER_P(Named, N, "") { return arg.Name == N; }		MATCHER_P(Named, N, "") { return arg.Name == N; }

TEST(SymbolSlab, FindAndIterate) {		TEST(SymbolSlab, FindAndIterate) {
SymbolSlab::Builder B;		SymbolSlab::Builder B;
B.insert(symbol("Z"));		B.insert(symbol("Z"));
B.insert(symbol("Y"));		B.insert(symbol("Y"));
B.insert(symbol("X"));		B.insert(symbol("X"));
EXPECT_EQ(nullptr, B.find(SymbolID("W")));		EXPECT_EQ(nullptr, B.find(SymbolID("W")));
for (const char *Sym : {"X", "Y", "Z"})		for (const char *Sym : {"X", "Y", "Z"})
EXPECT_THAT(B.find(SymbolID(Sym)), Pointee(Named(Sym)));		EXPECT_THAT(B.find(SymbolID(Sym)), Pointee(Named(Sym)));

SymbolSlab S = std::move(B).build();		SymbolSlab S = std::move(B).build();
EXPECT_THAT(S, UnorderedElementsAre(Named("X"), Named("Y"), Named("Z")));		EXPECT_THAT(S, UnorderedElementsAre(Named("X"), Named("Y"), Named("Z")));
EXPECT_EQ(S.end(), S.find(SymbolID("W")));		EXPECT_EQ(S.end(), S.find(SymbolID("W")));
for (const char *Sym : {"X", "Y", "Z"})		for (const char *Sym : {"X", "Y", "Z"})
EXPECT_THAT(*S.find(SymbolID(Sym)), Named(Sym));		EXPECT_THAT(*S.find(SymbolID(Sym)), Named(Sym));
}		}

struct SlabAndPointers {
SymbolSlab Slab;
std::vector<const Symbol *> Pointers;
};

// Create a slab of symbols with the given qualified names as both IDs and
// names. The lifetime of the slab is managed by the returned shared pointer.
// If \p WeakSymbols is provided, it will be pointed to the managed object in
// the returned shared pointer.
std::shared_ptr<std::vector<const Symbol *>>
generateSymbols(std::vector<std::string> QualifiedNames,
std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr) {
SymbolSlab::Builder Slab;
for (llvm::StringRef QName : QualifiedNames)
Slab.insert(symbol(QName));

auto Storage = std::make_shared<SlabAndPointers>();
Storage->Slab = std::move(Slab).build();
for (const auto &Sym : Storage->Slab)
Storage->Pointers.push_back(&Sym);
if (WeakSymbols)
*WeakSymbols = Storage;
auto *Pointers = &Storage->Pointers;
return {std::move(Storage), Pointers};
}

// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
// to the `generateSymbols` above.
std::shared_ptr<std::vector<const Symbol *>>
generateNumSymbols(int Begin, int End,
std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr) {
std::vector<std::string> Names;
for (int i = Begin; i <= End; i++)
Names.push_back(std::to_string(i));
return generateSymbols(Names, WeakSymbols);
}

std::string getQualifiedName(const Symbol &Sym) {
return (Sym.Scope + Sym.Name).str();
}

std::vector<std::string> match(const SymbolIndex &I,
const FuzzyFindRequest &Req,
bool *Incomplete = nullptr) {
std::vector<std::string> Matches;
bool IsIncomplete = I.fuzzyFind(Req, [&](const Symbol &Sym) {
Matches.push_back(getQualifiedName(Sym));
});
if (Incomplete)
*Incomplete = IsIncomplete;
return Matches;
}

TEST(MemIndexTest, MemIndexSymbolsRecycled) {		TEST(MemIndexTest, MemIndexSymbolsRecycled) {
MemIndex I;		MemIndex I;
std::weak_ptr<SlabAndPointers> Symbols;		std::weak_ptr<SlabAndPointers> Symbols;
I.build(generateNumSymbols(0, 10, &Symbols));		I.build(generateNumSymbols(0, 10, &Symbols));
FuzzyFindRequest Req;		FuzzyFindRequest Req;
Req.Query = "7";		Req.Query = "7";
EXPECT_THAT(match(I, Req), UnorderedElementsAre("7"));		EXPECT_THAT(match(I, Req), UnorderedElementsAre("7"));

[... 91 lines not shown ...]	TEST(MemIndexTest, IgnoreCases) {
MemIndex I;		MemIndex I;
I.build(generateSymbols({"ns::ABC", "ns::abc"}));		I.build(generateSymbols({"ns::ABC", "ns::abc"}));
FuzzyFindRequest Req;		FuzzyFindRequest Req;
Req.Query = "AB";		Req.Query = "AB";
Req.Scopes = {"ns::"};		Req.Scopes = {"ns::"};
EXPECT_THAT(match(I, Req), UnorderedElementsAre("ns::ABC", "ns::abc"));		EXPECT_THAT(match(I, Req), UnorderedElementsAre("ns::ABC", "ns::abc"));
}		}

// Returns qualified names of symbols with any of IDs in the index.
std::vector<std::string> lookup(const SymbolIndex &I,
llvm::ArrayRef<SymbolID> IDs) {
LookupRequest Req;
Req.IDs.insert(IDs.begin(), IDs.end());
std::vector<std::string> Results;
I.lookup(Req, [&](const Symbol &Sym) {
Results.push_back(getQualifiedName(Sym));
});
return Results;
}

TEST(MemIndexTest, Lookup) {		TEST(MemIndexTest, Lookup) {
MemIndex I;		MemIndex I;
I.build(generateSymbols({"ns::abc", "ns::xyz"}));		I.build(generateSymbols({"ns::abc", "ns::xyz"}));
EXPECT_THAT(lookup(I, SymbolID("ns::abc")), UnorderedElementsAre("ns::abc"));		EXPECT_THAT(lookup(I, SymbolID("ns::abc")), UnorderedElementsAre("ns::abc"));
EXPECT_THAT(lookup(I, {SymbolID("ns::abc"), SymbolID("ns::xyz")}),		EXPECT_THAT(lookup(I, {SymbolID("ns::abc"), SymbolID("ns::xyz")}),
UnorderedElementsAre("ns::abc", "ns::xyz"));		UnorderedElementsAre("ns::abc", "ns::xyz"));
EXPECT_THAT(lookup(I, {SymbolID("ns::nonono"), SymbolID("ns::xyz")}),		EXPECT_THAT(lookup(I, {SymbolID("ns::nonono"), SymbolID("ns::xyz")}),
UnorderedElementsAre("ns::xyz"));		UnorderedElementsAre("ns::xyz"));
[... 29 lines not shown ...]	TEST(MergeIndexTest, FuzzyFind) {
Req.Scopes = {"ns::"};		Req.Scopes = {"ns::"};
EXPECT_THAT(match(*mergeIndex(&I, &J), Req),		EXPECT_THAT(match(*mergeIndex(&I, &J), Req),
UnorderedElementsAre("ns::A", "ns::B", "ns::C"));		UnorderedElementsAre("ns::A", "ns::B", "ns::C"));
}		}

TEST(MergeTest, Merge) {		TEST(MergeTest, Merge) {
Symbol L, R;		Symbol L, R;
L.ID = R.ID = SymbolID("hello");		L.ID = R.ID = SymbolID("hello");
L.Name = R.Name = "Foo"; // same in both		L.Name = R.Name = "Foo"; // same in both
L.CanonicalDeclaration.FileURI = "file:///left.h"; // differs		L.CanonicalDeclaration.FileURI = "file:///left.h"; // differs
R.CanonicalDeclaration.FileURI = "file:///right.h";		R.CanonicalDeclaration.FileURI = "file:///right.h";
L.References = 1;		L.References = 1;
R.References = 2;		R.References = 2;
L.Signature = "()"; // present in left only		L.Signature = "()"; // present in left only
R.CompletionSnippetSuffix = "{$1:0}"; // present in right only		R.CompletionSnippetSuffix = "{$1:0}"; // present in right only
Symbol::Details DetL, DetR;		Symbol::Details DetL, DetR;
DetL.ReturnType = "DetL";		DetL.ReturnType = "DetL";
[... 46 lines not shown ...]

clang-tools-extra/trunk/unittests/clangd/TestIndex.h

				//===-- TestIndex.h ---------------------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
				#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H

				#include "index/Index.h"
				#include "index/Merge.h"
				#include "index/dex/DexIndex.h"
				#include "index/dex/Iterator.h"
				#include "index/dex/Token.h"
				#include "index/dex/Trigram.h"

				namespace clang {
				namespace clangd {

				// Creates Symbol instance and sets SymbolID to given QualifiedName.
				Symbol symbol(llvm::StringRef QName);

				// Bundles symbol pointers with the actual symbol slab the pointers refer to in
				// order to ensure that the slab isn't destroyed while it's used by an index.
				struct SlabAndPointers {
				SymbolSlab Slab;
				std::vector<const Symbol *> Pointers;
				};

				// Create a slab of symbols with the given qualified names as both IDs and
				// names. The lifetime of the slab is managed by the returned shared pointer.
				// If \p WeakSymbols is provided, it will be pointed to the managed object in
				// the returned shared pointer.
				std::shared_ptr<std::vector<const Symbol *>>
				generateSymbols(std::vector<std::string> QualifiedNames,
				std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);

				// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
				// to the `generateSymbols` above.
				std::shared_ptr<std::vector<const Symbol *>>
				generateNumSymbols(int Begin, int End,
				std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);

				// Returns fully-qualified name out of given symbol.
				std::string getQualifiedName(const Symbol &Sym);

				// Performs fuzzy matching-based symbol lookup given a query and an index.
				// Incomplete is set true if more items than requested can be retrieved, false
				// otherwise.
				std::vector<std::string> match(const SymbolIndex &I,
				const FuzzyFindRequest &Req,
				bool *Incomplete = nullptr);

				// Returns qualified names of symbols with any of IDs in the index.
				std::vector<std::string> lookup(const SymbolIndex &I,
				llvm::ArrayRef<SymbolID> IDs);

				} // namespace clangd
				} // namespace clang

				#endif
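
The lifetime guarantee described for `generateSymbols` relies on the aliasing constructor of `std::shared_ptr`: the returned pointer refers to the `Pointers` member but shares ownership of the whole `SlabAndPointers` object. Below is a minimal standalone sketch of that pattern. `Storage`, `PointerList`, `makePointers`, and `checkLifetime` are hypothetical stand-ins that use `std::string` instead of `Symbol`, so this illustrates the technique rather than the patch's code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using PointerList = std::vector<const std::string *>;

// Hypothetical stand-in for SlabAndPointers: the owned data plus pointers
// into that data.
struct Storage {
  std::vector<std::string> Slab;
  PointerList Pointers;
};

// Returns a shared_ptr to the Pointers member that keeps the entire Storage
// object alive. If Weak is provided, it observes the Storage object.
std::shared_ptr<PointerList>
makePointers(std::vector<std::string> Names,
             std::weak_ptr<Storage> *Weak = nullptr) {
  auto Owner = std::make_shared<Storage>();
  Owner->Slab = std::move(Names);
  for (const std::string &Name : Owner->Slab)
    Owner->Pointers.push_back(&Name);
  if (Weak)
    *Weak = Owner;
  PointerList *Pointers = &Owner->Pointers;
  // Aliasing constructor: shares ownership with Owner, but get() returns
  // Pointers.
  return std::shared_ptr<PointerList>(Owner, Pointers);
}

// Same shape as the *SymbolsRecycled tests: the storage stays alive while the
// returned pointer exists and is released once it is reset.
void checkLifetime() {
  std::weak_ptr<Storage> Weak;
  std::shared_ptr<PointerList> P = makePointers({"a", "b"}, &Weak);
  assert(P->size() == 2);
  assert(!Weak.expired());
  P.reset();
  assert(Weak.expired());
}
```

This is the property the `*SymbolsRecycled` tests check through `std::weak_ptr<SlabAndPointers>`: once the index drops its last reference to the pointer vector, the slab is destroyed with it.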

clang-tools-extra/trunk/unittests/clangd/TestIndex.cpp

				//===-- TestIndex.cpp -------------------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#include "TestIndex.h"

				namespace clang {
				namespace clangd {

				Symbol symbol(llvm::StringRef QName) {
				Symbol Sym;
				Sym.ID = SymbolID(QName.str());
				size_t Pos = QName.rfind("::");
				if (Pos == llvm::StringRef::npos) {
				Sym.Name = QName;
				Sym.Scope = "";
				} else {
				Sym.Name = QName.substr(Pos + 2);
				Sym.Scope = QName.substr(0, Pos + 2);
				}
				return Sym;
				}

				std::shared_ptr<std::vector<const Symbol *>>
				generateSymbols(std::vector<std::string> QualifiedNames,
				std::weak_ptr<SlabAndPointers> *WeakSymbols) {
				SymbolSlab::Builder Slab;
				for (llvm::StringRef QName : QualifiedNames)
				Slab.insert(symbol(QName));

				auto Storage = std::make_shared<SlabAndPointers>();
				Storage->Slab = std::move(Slab).build();
				for (const auto &Sym : Storage->Slab)
				Storage->Pointers.push_back(&Sym);
				if (WeakSymbols)
				*WeakSymbols = Storage;
				auto *Pointers = &Storage->Pointers;
				return {std::move(Storage), Pointers};
				}

				std::shared_ptr<std::vector<const Symbol *>>
				generateNumSymbols(int Begin, int End,
				std::weak_ptr<SlabAndPointers> *WeakSymbols) {
				std::vector<std::string> Names;
				for (int i = Begin; i <= End; i++)
				Names.push_back(std::to_string(i));
				return generateSymbols(Names, WeakSymbols);
				}

				std::string getQualifiedName(const Symbol &Sym) {
				return (Sym.Scope + Sym.Name).str();
				}

				std::vector<std::string> match(const SymbolIndex &I,
				const FuzzyFindRequest &Req, bool *Incomplete) {
				std::vector<std::string> Matches;
				bool IsIncomplete = I.fuzzyFind(Req, [&](const Symbol &Sym) {
				Matches.push_back(clang::clangd::getQualifiedName(Sym));
				});
				if (Incomplete)
				*Incomplete = IsIncomplete;
				return Matches;
				}

				// Returns qualified names of symbols with any of IDs in the index.
				std::vector<std::string> lookup(const SymbolIndex &I,
				llvm::ArrayRef<SymbolID> IDs) {
				LookupRequest Req;
				Req.IDs.insert(IDs.begin(), IDs.end());
				std::vector<std::string> Results;
				I.lookup(Req, [&](const Symbol &Sym) {
				Results.push_back(getQualifiedName(Sym));
				});
				return Results;
				}

				} // namespace clangd
				} // namespace clang