This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Implement FileFilter for the indexer
AbandonedPublic

Authored by kbobyrev on Aug 6 2020, 5:08 AM.

Details

Reviewers
kadircet
Summary

Tests failing so far: BackgroundIndexTests.IndexTwoFiles,
BackgroundIndexTests.ShardStorageLoad.

Diff Detail

Event Timeline

kbobyrev created this revision.Aug 6 2020, 5:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 6 2020, 5:08 AM
kbobyrev requested review of this revision.Aug 6 2020, 5:08 AM
kbobyrev planned changes to this revision.Aug 6 2020, 5:12 AM

@sammccall This is the change I was talking about during the standup: the code looks legit, but BackgroundIndexTests.IndexTwoFiles fails because with the following code path in Background.cpp's FileFilter:

const auto *F = SM.getFileEntryForID(FID);
if (!F)
  return false; // Skip invalid files.
hokein added a subscriber: hokein.Aug 6 2020, 7:54 AM
hokein added inline comments.
clang-tools-extra/clangd/index/SymbolCollector.cpp
643

A drive-by comment from D84811: the file granularity vs symbol granularity is tricky here.

Note that a *full* symbol (with declaration, definition, etc) may be formed from different files (.h, .cc), thinking of a following case:

// foo.h
void func();

// user.cc
#include "foo.h"

// foo.cc
#include "foo.h"
void func() {}

if our indexer indexes user.cc first, then foo.h is considered indexed, later when indexing foo.cc, we will skip the func symbol. so the symbol foo will not have definition.

kbobyrev added inline comments.Aug 6 2020, 7:59 AM
clang-tools-extra/clangd/index/SymbolCollector.cpp
643

Oh, you're right, good catch! That's why the post-filtering would probably work but maybe won't be as fancy :(

Thank you for mentioning it!

kbobyrev marked an inline comment as not done.Aug 6 2020, 8:02 AM

@sammccall This is the change I was talking about during the standup: the code looks legit, but BackgroundIndexTests.IndexTwoFiles fails because with the following code path in Background.cpp's FileFilter:

What location is invalid, and why?

clang-tools-extra/clangd/index/SymbolCollector.cpp
643

This is indeed happening in the wrong place, I think, but postfiltering isn't the only solution.

The FileFilter refers to what files you want to index, not which symbols. So I think you want to make the decision in handle*Occurrence, based on the location of the occurrence, not the location of the declaration it refers to.

By the time you get to addDefinition/addDeclaration it really is too late, the caller has already decided that we want to index this symbol and which declaration should be used. (BTW, weakening addDeclaration to return nullptr sometimes seems wrong for the same reason).

This is all very confusing, and we should rewrite the indexer to make it more explicit how data is combined, and less stateful. (More like a mapreduce)

kbobyrev updated this revision to Diff 290256.Sep 7 2020, 5:33 AM

"Fail" early.

kbobyrev updated this revision to Diff 290453.Sep 8 2020, 4:58 AM

Also check if deinition comes from unwwanted file (e.g. omit forward declarations).

kbobyrev planned changes to this revision.Sep 8 2020, 4:59 AM
kbobyrev edited reviewers, added: kadircet; removed: sammccall.
kbobyrev added a subscriber: sammccall.

WIP. Need to get around background indexing tricks to make it work.

Failed Tests (3):
  Clangd Unit Tests :: ./ClangdTests/BackgroundIndexTest.Config
  Clangd Unit Tests :: ./ClangdTests/BackgroundIndexTest.IndexTwoFiles
  Clangd Unit Tests :: ./ClangdTests/BackgroundIndexTest.ShardStorageLoad
kbobyrev abandoned this revision.May 19 2022, 5:51 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 19 2022, 5:51 AM