Page MenuHomePhabricator

[clangd] Index API and implementations for relations

Authored by nridge on Jun 3 2019, 9:58 PM.


Diff Detail


Event Timeline

nridge created this revision.Jun 3 2019, 9:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2019, 9:58 PM

This isn't quite ready for a thorough review, but I have a question for now: in BackgroundIndex::update, there is a step where we partition symbols and references into files.

For relations, should we include a copy in both the file containing the definition of the subject, and (if different) the file containing the definition of the object?

For relations, should we include a copy in both the file containing the definition of the subject, and (if different) the file containing the definition of the object?

The point of sharding is to reduce duplication, so I would rather store it only in subject's definition location.

nridge updated this revision to Diff 203072.Jun 4 2019, 10:10 PM

Fill in BackgroundIndex and Dex implementations

nridge retitled this revision from [WIP] [clangd] Index API and implementations for relations to [clangd] Index API and implementations for relations.Jun 4 2019, 10:10 PM
nridge edited the summary of this revision. (Show Details)

Ok, this is probably ready for a first round of review now.

I know I should add new test cases to at least DexTests and BackgroundIndexTests, I'll do this in the next version of the patch.

kadircet added inline comments.Jun 6 2019, 10:13 AM
293 ↗(On Diff #203072)

nit: rename to SymbolIDToFile?

329 ↗(On Diff #203072)

nit: braces

201 ↗(On Diff #203072)

redundant scope

225 ↗(On Diff #203072)

no need to pass RelationSlabs in here AllRelations are just value types.

250 ↗(On Diff #203072)

I think we need to pass relationslab in here. Since we might miss relations like the following that are outside the main file:

class A {};
class B : public A {};

Would be glad if you could prove me right/wrong with a unittest as well.

107 ↗(On Diff #203072)

We might need additional options, like limiting number of results. Could you instead accept a RelsRequest object? You can check RefsRequest for a sample

22 ↗(On Diff #203072)

+ Relations.bytes()

24 ↗(On Diff #203072)

no need to put Relations into Data they are just values, right?

94 ↗(On Diff #203072)

nit: braces

29 ↗(On Diff #203072)

why not just store densemap<<s,p>,arrayref<o>> ?

126 ↗(On Diff #203072)

avoiding deduplicates is not enough, we also wouldn't want to report stale relations coming from static index.

Could you check the logic in MergedIndex::refs, and hopefully refactor it into a class to share between these two?

28 ↗(On Diff #203072)


269 ↗(On Diff #203072)

nit: braces

110 ↗(On Diff #203072)

let's store densemap<<s,p>, arrayref<o>> here as well

nridge updated this revision to Diff 203495.Jun 6 2019, 10:28 PM
nridge marked 16 inline comments as done.

Addressed most review comments, also added some tests

nridge added inline comments.Jun 6 2019, 10:28 PM
107 ↗(On Diff #203072)

I called it RelationsRequest to avoid visual confusion with RefsRequest.

29 ↗(On Diff #203072)

I used std::vector<o> because it wasn't clear where the array would live otherwise.

126 ↗(On Diff #203072)

The description in MergedIndex::refs() says:

// We don't want duplicated refs from the static/dynamic indexes,
// and we can't reliably deduplicate them because offsets may differ slightly.
// We consider the dynamic index authoritative and report all its refs,
// and only report static index refs from other files.

It seems to me that the problem of "can't reliably deduplicate them because offsets may differ slightly" doesn't apply to relations, as there are no offsets involved. So, is there still a need to have additional logic here?

If so, what would the logic be -- don't return an object O1 from the static index if we've returned an object O2 from the dynamic index defined in the same file as O1? (Note that to implement this, we'd have to additionally look up each SymbolID we return in the symbol table, as we don't otherwise know what file the object is located in.)

28 ↗(On Diff #203072)

Size is only the backing data size, so if we don't include the relations in the backing data, we shouldn't include Rels.size(), right?

kadircet marked an inline comment as done.Jun 7 2019, 6:01 AM
kadircet added inline comments.
202 ↗(On Diff #203495)

nit: braces

107 ↗(On Diff #203072)

yeah that's a good call thanks!

76 ↗(On Diff #203495)

let's have a limit as well

139 ↗(On Diff #203495)

thinking about this callback, I don't think there is any user that will make use of the SymbolIDs without querying for a Symbol.

So to reduce this boilerplate let's rather move this lookup into this API(all SymbolIndex implementations provide lookup anyway so they all have a way to convert a SymbolID into a Symbol) and change the callback to accept a const Symbol& instead of a SymbolID

126 ↗(On Diff #203072)

yes I was referring to second part actually, but you are right we have no good way of distinguishing stale information in this case. Let's leave it as it is, but add a comment stating that we might return stale relations from static index.

28 ↗(On Diff #203072)

ok, but then let's make sure we include size of the densemap in estimateMemoryUsage manually.

220 ↗(On Diff #203495)

could we instead check only relation is A_CC isbaseof B_CC ?

nridge updated this revision to Diff 204189.Jun 11 2019, 4:29 PM
nridge marked 9 inline comments as done.

Address remaining review comments

250 ↗(On Diff #203072)

You are right! Fixed, and added a test to FileIndexTests.

LGTM, except the batch query support.

97 ↗(On Diff #204189)

we should have only a couple call sites to this function. Let's rather change those instead of introducing a new helper.

77 ↗(On Diff #204189)

sorry for missing it in previous iteration. but this should also enable batch queries. let's make this one a llvm::DenseMap<SymbolID>.
And in the callback we should output both the SymbolID of the Subject and Object.

We have a network based index(internally) and it uses this batch query optimization to get rid of network latency.

113 ↗(On Diff #204189)

I think the old comment is better. No need to expose internals.

244 ↗(On Diff #203495)

could you revert these changes?

268 ↗(On Diff #204189)

can you also add a span

trace::Span Tracer("Dex relations");
nridge marked an inline comment as done.Jun 12 2019, 8:15 PM
nridge added inline comments.
77 ↗(On Diff #204189)

What is an example use case for a batch query? In a typical LSP editor, the user can only request a type hierarchy for one class at a time.

kadircet added inline comments.Jun 13 2019, 1:43 AM
77 ↗(On Diff #204189)

yes, that's not really applicable to TypeHierarchy, but we are rather introducing another endpoint to an existing API(SymbolIndex), which will be used by a lot more clients later on.

The first example that comes to my mind is, fetching all overrides of a class' virtual methods for example.

nridge updated this revision to Diff 204691.Jun 13 2019, 9:12 PM
nridge marked 7 inline comments as done.

Address latest review comments

nridge added inline comments.Jun 13 2019, 9:14 PM
244 ↗(On Diff #203495)

Reverted. Not sure how they were introduced in the first place... rebase error perhaps.

kadircet accepted this revision.Jun 14 2019, 1:05 AM

LGTM, thanks!

This revision is now accepted and ready to land.Jun 14 2019, 1:05 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2019, 7:23 PM
hintonda added a subscriber: hintonda.EditedJun 15 2019, 3:07 PM

This seems to have broken gcc 5.4 builds -- for example see

I believe the following patch should fix the problem:

diff --git a/clang-tools-extra/clangd/index/FileIndex.cpp b/clang-tools-extra/clangd/index/FileIndex.cpp
index a90c51b0990..802e4c4a396 100644
--- a/clang-tools-extra/clangd/index/FileIndex.cpp
+++ b/clang-tools-extra/clangd/index/FileIndex.cpp
@@ -72,7 +72,7 @@ static SlabTuple indexSymbols(ASTContext &AST, std::shared_ptr<Preprocessor> PP,
        "  relations slab: {7} relations, {8} bytes",
        FileName, IsIndexMainAST, Syms.size(), Syms.bytes(), Refs.size(),
        Refs.numRefs(), Refs.bytes(), Relations.size(), Relations.bytes());
-  return {std::move(Syms), std::move(Refs), std::move(Relations)};
+  return std::make_tuple(std::move(Syms), std::move(Refs), std::move(Relations));

 SlabTuple indexMainDecls(ParsedAST &AST) {

Fixed in r363504.

Thanks for fixing that!