This is an archive of the discontinued LLVM Phabricator instance.


Differential D123289

[clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups
ClosedPublic

Authored by kadircet on Apr 7 2022, 1:42 AM.


Details

Reviewers

sammccall

Summary

- Inline SymbolID hashing to header
- Don't collect references for symbols without a SymbolID
- Store referenced symbols, rather than separately storing decls and macros.
- Don't defer ref collection to end of translation unit
- Perform const_cast when updating reference counts (~0.5% saving)
- Introduce caching for getSymbolID in SymbolCollector. (~30% saving)
- Don't modify symbolslab if there's no definition location
- Don't lex the whole file to deduce spelled tokens, just lex the relevant piece (~8%)

Overall this achieves ~38% reduction in time spent inside SymbolCollector compared to baseline (on my machine :)).
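
The getSymbolID caching mentioned in the summary is plain memoization keyed by declaration pointer. A minimal sketch of the idea outside clangd, assuming a hypothetical `Decl` stand-in, a hypothetical `computeID` in place of the real `getSymbolID`, and `std::unordered_map` in place of `llvm::DenseMap` (none of these names come from the patch):

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for clang::Decl; it only carries a string to hash.
struct Decl {
  std::string USR;
};

// Counts how often the expensive path runs (for demonstration only).
static int ComputeCalls = 0;

// Hypothetical stand-in for getSymbolID(): pretend the hashing is expensive.
static std::uint64_t computeID(const Decl &D) {
  ++ComputeCalls;
  return std::hash<std::string>{}(D.USR);
}

class IDCache {
public:
  // Returns the ID for D, computing it only the first time D is seen.
  std::uint64_t get(const Decl *D) {
    assert(D && "cache is keyed by non-null declaration pointers");
    // try_emplace(D) value-initializes the mapped ID when D is new, so no
    // temporary value has to be constructed and moved into the map.
    auto Result = Cache.try_emplace(D);
    if (Result.second)
      Result.first->second = computeID(*D);
    return Result.first->second;
  }

private:
  std::unordered_map<const Decl *, std::uint64_t> Cache;
};
```

Repeated lookups of the same pointer then cost one hash-map probe instead of a fresh ID computation, which is presumably where the reported saving comes from, since the same declarations are referenced many times within a TU.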

I'd expect the last optimization to affect the dynamic index a lot more. I was testing with clangd-indexer on the clangd subfolder of LLVM. As clangd-indexer indexes a whole TU at once, we see almost every token from every source included in the TU (hence lexing full files vs. lexing just the referenced tokens costs almost the same). During dynamic indexing we mostly index main-file symbols, but we would still touch the files defining/declaring those symbols and lex complete files for nothing, rather than just the token location.

The last optimization is also a functional change (added test): previously we used raw tokens from syntax::tokenize, which didn't canonicalize trigraphs/newlines in identifiers, whereas Lexer::getSpelling canonicalizes them.
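
To see why comparing raw token text against a name can fail, consider an identifier split by a backslash-newline. A minimal sketch, assuming a hypothetical helper `foldEscapedNewlines` that mimics only one part of the cleanup described above (it deliberately ignores trigraphs and `\r\n` line endings, and is not the clang implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: drops every backslash-newline pair, i.e. the line
// splices that a raw view of the source buffer still contains.
static std::string foldEscapedNewlines(std::string_view Raw) {
  std::string Clean;
  Clean.reserve(Raw.size());
  for (std::size_t I = 0; I < Raw.size(); ++I) {
    if (Raw[I] == '\\' && I + 1 < Raw.size() && Raw[I + 1] == '\n') {
      ++I; // skip the newline as well
      continue;
    }
    Clean.push_back(Raw[I]);
  }
  // Folding only ever removes characters.
  assert(Clean.size() <= Raw.size());
  return Clean;
}
```

With this, the raw text `fo\` followed by a newline and `o` folds to `foo`, so a comparison against the declared name succeeds only after canonicalization.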

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

kadircet created this revision. Apr 7 2022, 1:42 AM

Herald added a project: Restricted Project. Apr 7 2022, 1:42 AM

Herald added subscribers: usaxena95, arphaman.

kadircet requested review of this revision. Apr 7 2022, 1:42 AM

Herald added subscribers: cfe-commits, MaskRay, ilya-biryukov.

Harbormaster completed remote builds in B158425: Diff 421127. Apr 7 2022, 2:08 AM

kadircet added a reviewer: sammccall. Apr 7 2022, 10:11 AM

Neat!

I think you've broken the FilesToTokenCache by moving it into the loop, so I'd expect further wins from fixing that.
(If you don't see any, something seems wrong: syntax::tokenize should be called a fair bit and should be pretty expensive!)

clang-tools-extra/clangd/index/SymbolCollector.cpp
608	there's a big block of code here that's checking if the reference was spelled or not, pull out a function?
610	Um, is this cache meant to be a member? It's pretty well guaranteed to be empty on the next line :-)
714–718	What does a non-spelled macro reference look like?
815	no need for this to be a lambda anymore, a plain loop seems fine
816–817	if you have timing set up, try making this `const_cast<Symbol*>(S)->References++`. Reinserting into a symbol slab is pretty expensive I think, and this is just an awkward gap in the API. We could fix the API or live with const_cast if it matters.
972	nit: just try_emplace(D). The rest of the arguments get forwarded to the V constructor, so you're calling an unnecessary move constructor here. (Shouldn't matter here because SymbolID is trivial, but it can)
clang-tools-extra/clangd/index/SymbolCollector.h
180	hash_code(SymbolID) is not defined in the header, but the hash function is trivial and could be inlined everywhere. Might be worth exposing. (not really related to this patch, but you have some setup for benchmarking at the moment...)
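
The const_cast suggestion on lines 816–817 can be illustrated with a toy model. A minimal sketch, assuming hypothetical and heavily simplified `Symbol` and `Builder` types (the real `SymbolSlab::Builder` interns strings, which is why it hands out const symbols):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for clangd's Symbol.
struct Symbol {
  int ID = 0;
  std::string Name;
  unsigned References = 0;
};

// Hypothetical stand-in for SymbolSlab::Builder.
class Builder {
public:
  // Copies the whole symbol, including its strings, into the builder.
  void insert(const Symbol &S) { Symbols[S.ID] = S; }
  // Hands out const pointers so callers cannot touch the stored strings.
  const Symbol *find(int ID) const {
    auto It = Symbols.find(ID);
    return It == Symbols.end() ? nullptr : &It->second;
  }

private:
  std::unordered_map<int, Symbol> Symbols;
};

// Old approach: copy the symbol, bump the counter, reinsert the copy.
static void bumpByReinsert(Builder &B, int ID) {
  if (const Symbol *S = B.find(ID)) {
    Symbol Inc = *S;
    ++Inc.References;
    B.insert(Inc);
  }
}

// Suggested approach: bump the counter in place. This is well-defined only
// because the stored Symbol object itself was not declared const.
static void bumpInPlace(Builder &B, int ID) {
  if (const Symbol *S = B.find(ID))
    ++const_cast<Symbol *>(S)->References;
}
```

Both functions leave the builder in the same state; the in-place variant simply skips the copy and the second lookup, and it only touches a plain integer rather than any interned data.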

This revision is now accepted and ready to land. Apr 7 2022, 10:54 AM

Address comments and more cleanups

kadircet edited the summary of this revision. Apr 11 2022, 1:41 AM

Harbormaster completed remote builds in B158955: Diff 421847. Apr 11 2022, 1:54 AM

Get rid of leftovers and update comments

clang-tools-extra/clangd/index/SymbolCollector.cpp
714–718	I don't fully understand it either (hence decided to keep it as-is), but my initial guess is nested macro expansions, e.g. `#define FOO(X) X`, `#define BAR(X) FOO(X)`, and then `BAR(int x);`. I suppose we assume there's a reference to FOO at the expansion of BAR here today. But I am not sure if `libIndex` will actually emit a macro reference for `FOO` here.

sammccall accepted this revision. Apr 11 2022, 2:33 AM

sammccall added inline comments.

clang-tools-extra/clangd/index/SymbolCollector.cpp
189	You've changed this from tokenizing the file with a cache. If I'm reading your benchmark spreadsheet right, this is ~3% speedup. I'm not sure this is significant (I imagine not), but you don't actually have to run the lexer here: since you already know what the string is going to be, it's enough to grab the buffer pointer, check that it starts with the text, and check that the next character is not an identifier-continuer.
714–718	oops nevermind, you're only moving this comment from elsewhere

Harbormaster completed remote builds in B158961: Diff 421855. Apr 11 2022, 2:45 AM

kadircet added inline comments. Apr 11 2022, 2:47 AM

clang-tools-extra/clangd/index/SymbolCollector.cpp
189	right, I actually did that at first, but it implies keeping the behaviour around unclean tokens broken, and there wasn't much of a win (delta was in the noise). I think because the expensive part is actually figuring out the fileid, and the lexing call can do it cheaply right now as it benefits from the single element cache.
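
The lexer-free check discussed in this thread (grab the buffer, check the prefix, check the following character) might look roughly like the following. A minimal sketch, assuming hypothetical helpers `isIdentifierContinue` and `isSpelledAt` operating on a plain `std::string_view` rather than on a `SourceManager` buffer, and handling ASCII identifiers only:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// True for characters that may continue an identifier (ASCII only here).
static bool isIdentifierContinue(char C) {
  return C == '_' || std::isalnum(static_cast<unsigned char>(C)) != 0;
}

// Hypothetical helper: checks whether Name is written verbatim at Offset in
// Buffer and is not merely the prefix of a longer identifier.
static bool isSpelledAt(std::string_view Buffer, std::size_t Offset,
                        std::string_view Name) {
  assert(!Name.empty() && "an empty name cannot be spelled");
  if (Offset > Buffer.size())
    return false;
  std::string_view Rest = Buffer.substr(Offset);
  if (Rest.substr(0, Name.size()) != Name)
    return false;
  // The token ends here only if the next character cannot extend it.
  return Rest.size() == Name.size() ||
         !isIdentifierContinue(Rest[Name.size()]);
}
```

As the reply notes, a check like this keeps the behaviour around unclean tokens broken: an identifier containing a backslash-newline does not match its name byte for byte, so the sketch reports it as not spelled.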

(This is ready to land, right?)

Landed in 001e88ac83b5c3a4d4f4e61480953ebcabc82b88

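Revision Contents

Path	Size
clang-tools-extra/clangd/index/SymbolCollector.h	18 lines
clang-tools-extra/clangd/index/SymbolCollector.cpp	189 lines
clang-tools-extra/clangd/index/SymbolID.h	14 lines
clang-tools-extra/clangd/index/SymbolID.cpp	9 lines
clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp	23 lines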

Diff 421855

clang-tools-extra/clangd/index/SymbolCollector.h

//===--- SymbolCollector.h ---------------------------------------*- C++-*-===//		//===--- SymbolCollector.h ---------------------------------------*- C++-*-===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLCOLLECTOR_H		#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLCOLLECTOR_H
#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLCOLLECTOR_H		#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLCOLLECTOR_H

#include "index/CanonicalIncludes.h"
#include "CollectMacros.h"		#include "CollectMacros.h"
		#include "index/CanonicalIncludes.h"
#include "index/Ref.h"		#include "index/Ref.h"
#include "index/Relation.h"		#include "index/Relation.h"
#include "index/Symbol.h"		#include "index/Symbol.h"
		#include "index/SymbolID.h"
#include "index/SymbolOrigin.h"		#include "index/SymbolOrigin.h"
#include "clang/AST/ASTContext.h"		#include "clang/AST/ASTContext.h"
#include "clang/AST/Decl.h"		#include "clang/AST/Decl.h"
#include "clang/Basic/SourceLocation.h"		#include "clang/Basic/SourceLocation.h"
#include "clang/Basic/SourceManager.h"		#include "clang/Basic/SourceManager.h"
#include "clang/Index/IndexDataConsumer.h"		#include "clang/Index/IndexDataConsumer.h"
#include "clang/Index/IndexSymbol.h"		#include "clang/Index/IndexSymbol.h"
#include "clang/Sema/CodeCompleteConsumer.h"		#include "clang/Sema/CodeCompleteConsumer.h"
		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include <functional>		#include <functional>

namespace clang {		namespace clang {
namespace clangd {		namespace clangd {

/// Collect declarations (symbols) from an AST.		/// Collect declarations (symbols) from an AST.
/// It collects most declarations except:		/// It collects most declarations except:
[105 lines elided]	private:
void addDefinition(const NamedDecl &, const Symbol &DeclSymbol);		void addDefinition(const NamedDecl &, const Symbol &DeclSymbol);
void processRelations(const NamedDecl &ND, const SymbolID &ID,		void processRelations(const NamedDecl &ND, const SymbolID &ID,
ArrayRef<index::SymbolRelation> Relations);		ArrayRef<index::SymbolRelation> Relations);

llvm::Optional<SymbolLocation> getTokenLocation(SourceLocation TokLoc);		llvm::Optional<SymbolLocation> getTokenLocation(SourceLocation TokLoc);

llvm::Optional<std::string> getIncludeHeader(const Symbol &S, FileID);		llvm::Optional<std::string> getIncludeHeader(const Symbol &S, FileID);

		SymbolID getSymbolIDCached(const Decl *D);
		SymbolID getSymbolIDCached(const llvm::StringRef MacroName,
		const MacroInfo *MI, const SourceManager &SM);

// All Symbols collected from the AST.		// All Symbols collected from the AST.
SymbolSlab::Builder Symbols;		SymbolSlab::Builder Symbols;
// File IDs for Symbol.IncludeHeaders.		// File IDs for Symbol.IncludeHeaders.
// The final spelling is calculated in finish().		// The final spelling is calculated in finish().
llvm::DenseMap<SymbolID, FileID> IncludeFiles;		llvm::DenseMap<SymbolID, FileID> IncludeFiles;
void setIncludeLocation(const Symbol &S, SourceLocation);		void setIncludeLocation(const Symbol &S, SourceLocation);
// Indexed macros, to be erased if they turned out to be include guards.		// Indexed macros, to be erased if they turned out to be include guards.
llvm::DenseSet<const IdentifierInfo *> IndexedMacros;		llvm::DenseSet<const IdentifierInfo *> IndexedMacros;
// All refs collected from the AST. It includes:		// All refs collected from the AST. It includes:
// 1) symbols declared in the preamble and referenced from the main file (		// 1) symbols declared in the preamble and referenced from the main file (
// which is not a header), or		// which is not a header), or
// 2) symbols declared and referenced from the main file (which is a header)		// 2) symbols declared and referenced from the main file (which is a header)
RefSlab::Builder Refs;		RefSlab::Builder Refs;
// All relations collected from the AST.		// All relations collected from the AST.
RelationSlab::Builder Relations;		RelationSlab::Builder Relations;
ASTContext *ASTCtx;		ASTContext *ASTCtx;
Preprocessor *PP = nullptr;		Preprocessor *PP = nullptr;
std::shared_ptr<GlobalCodeCompletionAllocator> CompletionAllocator;		std::shared_ptr<GlobalCodeCompletionAllocator> CompletionAllocator;
std::unique_ptr<CodeCompletionTUInfo> CompletionTUInfo;		std::unique_ptr<CodeCompletionTUInfo> CompletionTUInfo;
Options Opts;		Options Opts;
struct SymbolRef {		struct SymbolRef {
SourceLocation Loc;		SourceLocation Loc;
		FileID FID;
index::SymbolRoleSet Roles;		index::SymbolRoleSet Roles;
const Decl *Container;		const Decl *Container;
		bool Spelled;
};		};
		void addRef(SymbolID ID, const SymbolRef &SR);
// Symbols referenced from the current TU, flushed on finish().		// Symbols referenced from the current TU, flushed on finish().
llvm::DenseSet<const NamedDecl *> ReferencedDecls;		llvm::DenseSet<SymbolID> ReferencedSymbols;
llvm::DenseSet<const IdentifierInfo *> ReferencedMacros;
llvm::DenseMap<const NamedDecl *, std::vector<SymbolRef>> DeclRefs;
llvm::DenseMap<SymbolID, std::vector<SymbolRef>> MacroRefs;
// Maps canonical declaration provided by clang to canonical declaration for		// Maps canonical declaration provided by clang to canonical declaration for
// an index symbol, if clangd prefers a different declaration than that		// an index symbol, if clangd prefers a different declaration than that
// provided by clang. For example, friend declaration might be considered		// provided by clang. For example, friend declaration might be considered
// canonical by clang but should not be considered canonical in the index		// canonical by clang but should not be considered canonical in the index
// unless it's a definition.		// unless it's a definition.
llvm::DenseMap<const Decl *, const Decl *> CanonicalDecls;		llvm::DenseMap<const Decl *, const Decl *> CanonicalDecls;
// Cache whether to index a file or not.		// Cache whether to index a file or not.
llvm::DenseMap<FileID, bool> FilesToIndexCache;		llvm::DenseMap<FileID, bool> FilesToIndexCache;
// Encapsulates calculations and caches around header paths, which headers		// Encapsulates calculations and caches around header paths, which headers
// to insert for which symbol, etc.		// to insert for which symbol, etc.
class HeaderFileURICache;		class HeaderFileURICache;
std::unique_ptr<HeaderFileURICache> HeaderFileURIs;		std::unique_ptr<HeaderFileURICache> HeaderFileURIs;
		llvm::DenseMap<const Decl *, SymbolID> DeclToIDCache;
		llvm::DenseMap<const MacroInfo *, SymbolID> MacroToIDCache;
};		};

} // namespace clangd		} // namespace clangd
} // namespace clang		} // namespace clang

#endif		#endif

clang-tools-extra/clangd/index/SymbolCollector.cpp

[15 lines elided]
#include "index/CanonicalIncludes.h"		#include "index/CanonicalIncludes.h"
#include "index/Relation.h"		#include "index/Relation.h"
#include "index/SymbolID.h"		#include "index/SymbolID.h"
#include "index/SymbolLocation.h"		#include "index/SymbolLocation.h"
#include "clang/AST/Decl.h"		#include "clang/AST/Decl.h"
#include "clang/AST/DeclBase.h"		#include "clang/AST/DeclBase.h"
#include "clang/AST/DeclObjC.h"		#include "clang/AST/DeclObjC.h"
#include "clang/AST/DeclTemplate.h"		#include "clang/AST/DeclTemplate.h"
		#include "clang/AST/DeclarationName.h"
		#include "clang/Basic/LangOptions.h"
#include "clang/Basic/SourceLocation.h"		#include "clang/Basic/SourceLocation.h"
#include "clang/Basic/SourceManager.h"		#include "clang/Basic/SourceManager.h"
#include "clang/Index/IndexSymbol.h"		#include "clang/Index/IndexSymbol.h"
#include "clang/Lex/Preprocessor.h"		#include "clang/Lex/Preprocessor.h"
#include "clang/Tooling/Syntax/Tokens.h"		#include "clang/Lex/Token.h"
		#include "llvm/ADT/ArrayRef.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/FileSystem.h"		#include "llvm/Support/FileSystem.h"
#include "llvm/Support/Path.h"		#include "llvm/Support/Path.h"

namespace clang {		namespace clang {
namespace clangd {		namespace clangd {
namespace {		namespace {

[129 lines elided]	if (ND && SymbolCollector::shouldCollectSymbol(*ND, ND->getASTContext(),
Opts, true)) {		Opts, true)) {
break;		break;
}		}
Enclosing = dyn_cast_or_null<Decl>(Enclosing->getDeclContext());		Enclosing = dyn_cast_or_null<Decl>(Enclosing->getDeclContext());
}		}
return Enclosing;		return Enclosing;
}		}

		// Check if there is an exact spelling of \p ND at \p Loc.
		bool isSpelled(SourceLocation Loc, const NamedDecl &ND) {
		auto Name = ND.getDeclName();
		const auto NameKind = Name.getNameKind();
		if (NameKind != DeclarationName::Identifier &&
		NameKind != DeclarationName::CXXConstructorName)
		return false;
		const auto &AST = ND.getASTContext();
		const auto &SM = AST.getSourceManager();
		const auto &LO = AST.getLangOpts();
		clang::Token Tok;
		if (clang::Lexer::getRawToken(Loc, Tok, SM, LO))
		return false;
		auto StrName = Name.getAsString();
		return clang::Lexer::getSpelling(Tok, SM, LO) == StrName;
		}
} // namespace		} // namespace

// Encapsulates decisions about how to record header paths in the index,		// Encapsulates decisions about how to record header paths in the index,
// including filename normalization, URI conversion etc.		// including filename normalization, URI conversion etc.
// Expensive checks are cached internally.		// Expensive checks are cached internally.
class SymbolCollector::HeaderFileURICache {		class SymbolCollector::HeaderFileURICache {
struct FrameworkUmbrellaSpelling {		struct FrameworkUmbrellaSpelling {
// Spelling for the public umbrella header, e.g. <Foundation/Foundation.h>		// Spelling for the public umbrella header, e.g. <Foundation/Foundation.h>
[357 lines elided]	if (const auto *CID = dyn_cast<ObjCCategoryImplDecl>(D)) {
DeclIsCanonical = true;		DeclIsCanonical = true;
if (const auto *CD = CID->getCategoryDecl())		if (const auto *CD = CID->getCategoryDecl())
D = CD;		D = CD;
}		}
const NamedDecl *ND = dyn_cast<NamedDecl>(D);		const NamedDecl *ND = dyn_cast<NamedDecl>(D);
if (!ND)		if (!ND)
return true;		return true;

		auto ID = getSymbolIDCached(ND);
		if (!ID)
		return true;

// Mark D as referenced if this is a reference coming from the main file.		// Mark D as referenced if this is a reference coming from the main file.
// D may not be an interesting symbol, but it's cheaper to check at the end.		// D may not be an interesting symbol, but it's cheaper to check at the end.
auto &SM = ASTCtx->getSourceManager();		auto &SM = ASTCtx->getSourceManager();
if (Opts.CountReferences &&		if (Opts.CountReferences &&
(Roles & static_cast<unsigned>(index::SymbolRole::Reference)) &&		(Roles & static_cast<unsigned>(index::SymbolRole::Reference)) &&
SM.getFileID(SM.getSpellingLoc(Loc)) == SM.getMainFileID())		SM.getFileID(SM.getSpellingLoc(Loc)) == SM.getMainFileID())
ReferencedDecls.insert(ND);		ReferencedSymbols.insert(ID);

auto ID = getSymbolID(ND);
if (!ID)
return true;

// ND is the canonical (i.e. first) declaration. If it's in the main file		// ND is the canonical (i.e. first) declaration. If it's in the main file
// (which is not a header), then no public declaration was visible, so assume		// (which is not a header), then no public declaration was visible, so assume
// it's main-file only.		// it's main-file only.
bool IsMainFileOnly =		bool IsMainFileOnly =
SM.isWrittenInMainFile(SM.getExpansionLoc(ND->getBeginLoc())) &&		SM.isWrittenInMainFile(SM.getExpansionLoc(ND->getBeginLoc())) &&
!isHeaderFile(SM.getFileEntryForID(SM.getMainFileID())->getName(),		!isHeaderFile(SM.getFileEntryForID(SM.getMainFileID())->getName(),
ASTCtx->getLangOpts());		ASTCtx->getLangOpts());
// In C, printf is a redecl of an implicit builtin! So check OrigD instead.		// In C, printf is a redecl of an implicit builtin! So check OrigD instead.
if (ASTNode.OrigD->isImplicit() ||		if (ASTNode.OrigD->isImplicit() ||
!shouldCollectSymbol(*ND, *ASTCtx, Opts, IsMainFileOnly))		!shouldCollectSymbol(*ND, *ASTCtx, Opts, IsMainFileOnly))
return true;		return true;

// Note: we need to process relations for all decl occurrences, including		// Note: we need to process relations for all decl occurrences, including
// refs, because the indexing code only populates relations for specific		// refs, because the indexing code only populates relations for specific
// occurrences. For example, RelationBaseOf is only populated for the		// occurrences. For example, RelationBaseOf is only populated for the
// occurrence inside the base-specifier.		// occurrence inside the base-specifier.
processRelations(*ND, ID, Relations);		processRelations(*ND, ID, Relations);

bool CollectRef = static_cast<bool>(Opts.RefFilter & toRefKind(Roles));		bool CollectRef = static_cast<bool>(Opts.RefFilter & toRefKind(Roles));
bool IsOnlyRef =
!(Roles & (static_cast<unsigned>(index::SymbolRole::Declaration) |
static_cast<unsigned>(index::SymbolRole::Definition)));

if (IsOnlyRef && !CollectRef)
return true;

// Unlike other fields, e.g. Symbols (which use spelling locations), we use		// Unlike other fields, e.g. Symbols (which use spelling locations), we use
// file locations for references (as it aligns the behavior of clangd's		// file locations for references (as it aligns the behavior of clangd's
// AST-based xref).		// AST-based xref).
// FIXME: we should try to use the file locations for other fields.		// FIXME: we should try to use the file locations for other fields.
if (CollectRef &&		if (CollectRef &&
(!IsMainFileOnly || Opts.CollectMainFileRefs ||		(!IsMainFileOnly || Opts.CollectMainFileRefs ||
ND->isExternallyVisible()) &&		ND->isExternallyVisible()) &&
!isa<NamespaceDecl>(ND) &&		!isa<NamespaceDecl>(ND)) {
(Opts.RefsInHeaders ||		auto FileLoc = SM.getFileLoc(Loc);
SM.getFileID(SM.getFileLoc(Loc)) == SM.getMainFileID()))		auto FID = SM.getFileID(FileLoc);
DeclRefs[ND].push_back(SymbolRef{SM.getFileLoc(Loc), Roles,		if (Opts.RefsInHeaders || FID == SM.getMainFileID()) {
getRefContainer(ASTNode.Parent, Opts)});		addRef(ID, SymbolRef{FileLoc, FID, Roles,
		getRefContainer(ASTNode.Parent, Opts),
		isSpelled(FileLoc, *ND)});
		}
		}
// Don't continue indexing if this is a mere reference.		// Don't continue indexing if this is a mere reference.
if (IsOnlyRef)		if (!(Roles & (static_cast<unsigned>(index::SymbolRole::Declaration) |
		static_cast<unsigned>(index::SymbolRole::Definition))))
return true;		return true;

// FIXME: ObjCPropertyDecl are not properly indexed here:		// FIXME: ObjCPropertyDecl are not properly indexed here:
// - ObjCPropertyDecl may have an OrigD of ObjCPropertyImplDecl, which is		// - ObjCPropertyDecl may have an OrigD of ObjCPropertyImplDecl, which is
// not a NamedDecl.		// not a NamedDecl.
auto *OriginalDecl = dyn_cast<NamedDecl>(ASTNode.OrigD);		auto *OriginalDecl = dyn_cast<NamedDecl>(ASTNode.OrigD);
if (!OriginalDecl)		if (!OriginalDecl)
return true;		return true;
[69 lines elided]	bool SymbolCollector::handleMacroOccurrence(const IdentifierInfo *Name,

const auto &SM = PP->getSourceManager();		const auto &SM = PP->getSourceManager();
auto DefLoc = MI->getDefinitionLoc();		auto DefLoc = MI->getDefinitionLoc();
// Also avoid storing predefined macros like __DBL_MIN__.		// Also avoid storing predefined macros like __DBL_MIN__.
if (SM.isWrittenInBuiltinFile(DefLoc) ||		if (SM.isWrittenInBuiltinFile(DefLoc) ||
Name->getName() == "__GCC_HAVE_DWARF2_CFI_ASM")		Name->getName() == "__GCC_HAVE_DWARF2_CFI_ASM")
return true;		return true;

auto ID = getSymbolID(Name->getName(), MI, SM);		auto ID = getSymbolIDCached(Name->getName(), MI, SM);
if (!ID)		if (!ID)
return true;		return true;

auto SpellingLoc = SM.getSpellingLoc(Loc);		auto SpellingLoc = SM.getSpellingLoc(Loc);
bool IsMainFileOnly =		bool IsMainFileOnly =
SM.isInMainFile(SM.getExpansionLoc(DefLoc)) &&		SM.isInMainFile(SM.getExpansionLoc(DefLoc)) &&
!isHeaderFile(SM.getFileEntryForID(SM.getMainFileID())->getName(),		!isHeaderFile(SM.getFileEntryForID(SM.getMainFileID())->getName(),
ASTCtx->getLangOpts());		ASTCtx->getLangOpts());
// Do not store references to main-file macros.		// Do not store references to main-file macros.
if ((static_cast<unsigned>(Opts.RefFilter) & Roles) && !IsMainFileOnly &&		if ((static_cast<unsigned>(Opts.RefFilter) & Roles) && !IsMainFileOnly &&
(Opts.RefsInHeaders || SM.getFileID(SpellingLoc) == SM.getMainFileID()))		(Opts.RefsInHeaders || SM.getFileID(SpellingLoc) == SM.getMainFileID())) {
// FIXME: Populate container information for macro references.		// FIXME: Populate container information for macro references.
MacroRefs[ID].push_back({Loc, Roles, /*Container=*/nullptr});		// FIXME: All MacroRefs are marked as Spelled now, but this should be
		// checked.
		addRef(ID, SymbolRef{Loc, SM.getFileID(Loc), Roles, /*Container=*/nullptr,
		/*Spelled=*/true});
		}

// Collect symbols.		// Collect symbols.
if (!Opts.CollectMacro)		if (!Opts.CollectMacro)
return true;		return true;

// Skip main-file macros if we are not collecting them.		// Skip main-file macros if we are not collecting them.
if (IsMainFileOnly && !Opts.CollectMainFileSymbols)		if (IsMainFileOnly && !Opts.CollectMainFileSymbols)
return false;		return false;

// Mark the macro as referenced if this is a reference coming from the main		// Mark the macro as referenced if this is a reference coming from the main
// file. The macro may not be an interesting symbol, but it's cheaper to check		// file. The macro may not be an interesting symbol, but it's cheaper to check
// at the end.		// at the end.
if (Opts.CountReferences &&		if (Opts.CountReferences &&
(Roles & static_cast<unsigned>(index::SymbolRole::Reference)) &&		(Roles & static_cast<unsigned>(index::SymbolRole::Reference)) &&
SM.getFileID(SpellingLoc) == SM.getMainFileID())		SM.getFileID(SpellingLoc) == SM.getMainFileID())
ReferencedMacros.insert(Name);		ReferencedSymbols.insert(ID);

// Don't continue indexing if this is a mere reference.		// Don't continue indexing if this is a mere reference.
// FIXME: remove macro with ID if it is undefined.		// FIXME: remove macro with ID if it is undefined.
if (!(Roles & static_cast<unsigned>(index::SymbolRole::Declaration) ||		if (!(Roles & static_cast<unsigned>(index::SymbolRole::Declaration) ||
Roles & static_cast<unsigned>(index::SymbolRole::Definition)))		Roles & static_cast<unsigned>(index::SymbolRole::Definition)))
return true;		return true;

// Only collect one instance in case there are multiple.		// Only collect one instance in case there are multiple.
[33 lines elided]	void SymbolCollector::processRelations(
const NamedDecl &ND, const SymbolID &ID,		const NamedDecl &ND, const SymbolID &ID,
ArrayRef<index::SymbolRelation> Relations) {		ArrayRef<index::SymbolRelation> Relations) {
for (const auto &R : Relations) {		for (const auto &R : Relations) {
auto RKind = indexableRelation(R);		auto RKind = indexableRelation(R);
if (!RKind)		if (!RKind)
continue;		continue;
const Decl *Object = R.RelatedSymbol;		const Decl *Object = R.RelatedSymbol;

auto ObjectID = getSymbolID(Object);		auto ObjectID = getSymbolIDCached(Object);
if (!ObjectID)		if (!ObjectID)
continue;		continue;

// Record the relation.		// Record the relation.
// TODO: There may be cases where the object decl is not indexed for some		// TODO: There may be cases where the object decl is not indexed for some
// reason. Those cases should probably be removed in due course, but for		// reason. Those cases should probably be removed in due course, but for
// now there are two possible ways to handle it:		// now there are two possible ways to handle it:
// (A) Avoid storing the relation in such cases.		// (A) Avoid storing the relation in such cases.
[14 lines elided]	if (shouldCollectIncludePath(S.SymInfo.Kind))
// Use the expansion location to get the #include header since this is		// Use the expansion location to get the #include header since this is
// where the symbol is exposed.		// where the symbol is exposed.
IncludeFiles[S.ID] =		IncludeFiles[S.ID] =
PP->getSourceManager().getDecomposedExpansionLoc(Loc).first;		PP->getSourceManager().getDecomposedExpansionLoc(Loc).first;
}		}

void SymbolCollector::finish() {		void SymbolCollector::finish() {
// At the end of the TU, add 1 to the refcount of all referenced symbols.		// At the end of the TU, add 1 to the refcount of all referenced symbols.
auto IncRef = [this](const SymbolID &ID) {		for (const auto &ID : ReferencedSymbols) {
if (const auto *S = Symbols.find(ID)) {		if (const auto *S = Symbols.find(ID)) {
Symbol Inc = *S;		// SymbolSlab::Builder returns const symbols because strings are interned
++Inc.References;		// and modifying returned symbols without inserting again wouldn't go
Symbols.insert(Inc);		// well. const_cast is safe here as we're modifying a data owned by the
}		// Symbol. This reduces time spent in SymbolCollector by ~1%.
};		++const_cast<Symbol *>(S)->References;
for (const NamedDecl *ND : ReferencedDecls) {
if (auto ID = getSymbolID(ND)) {
IncRef(ID);
}		}
}		}
if (Opts.CollectMacro) {		if (Opts.CollectMacro) {
assert(PP);		assert(PP);
// First, drop header guards. We can't identify these until EOF.		// First, drop header guards. We can't identify these until EOF.
for (const IdentifierInfo *II : IndexedMacros) {		for (const IdentifierInfo *II : IndexedMacros) {
if (const auto *MI = PP->getMacroDefinition(II).getMacroInfo())		if (const auto *MI = PP->getMacroDefinition(II).getMacroInfo())
if (auto ID = getSymbolID(II->getName(), MI, PP->getSourceManager()))		if (auto ID =
		getSymbolIDCached(II->getName(), MI, PP->getSourceManager()))
if (MI->isUsedForHeaderGuard())		if (MI->isUsedForHeaderGuard())
Symbols.erase(ID);		Symbols.erase(ID);
}		}
// Now increment refcounts.
for (const IdentifierInfo *II : ReferencedMacros) {
if (const auto *MI = PP->getMacroDefinition(II).getMacroInfo())
if (auto ID = getSymbolID(II->getName(), MI, PP->getSourceManager()))
IncRef(ID);
}
}		}
// Fill in IncludeHeaders.		// Fill in IncludeHeaders.
// We delay this until end of TU so header guards are all resolved.		// We delay this until end of TU so header guards are all resolved.
llvm::SmallString<128> QName;		llvm::SmallString<128> QName;
for (const auto &Entry : IncludeFiles) {		for (const auto &Entry : IncludeFiles) {
if (const Symbol *S = Symbols.find(Entry.first)) {		if (const Symbol *S = Symbols.find(Entry.first)) {
llvm::StringRef IncludeHeader;		llvm::StringRef IncludeHeader;
// Look for an overridden include header for this symbol specifically.		// Look for an overridden include header for this symbol specifically.
[17 lines not shown]
if (!IncludeHeader.empty()) {		if (!IncludeHeader.empty()) {
Symbol NewSym = *S;		Symbol NewSym = *S;
NewSym.IncludeHeaders.push_back({IncludeHeader, 1});		NewSym.IncludeHeaders.push_back({IncludeHeader, 1});
Symbols.insert(NewSym);		Symbols.insert(NewSym);
}		}
}		}
}		}

const auto &SM = ASTCtx->getSourceManager();		ReferencedSymbols.clear();
auto CollectRef = [&](SymbolID ID, const SymbolRef &LocAndRole,
bool Spelled = false) {
auto FileID = SM.getFileID(LocAndRole.Loc);
// FIXME: use the result to filter out references.
shouldIndexFile(FileID);
if (const auto *FE = SM.getFileEntryForID(FileID)) {
auto Range = getTokenRange(LocAndRole.Loc, SM, ASTCtx->getLangOpts());
Ref R;
R.Location.Start = Range.first;
R.Location.End = Range.second;
R.Location.FileURI = HeaderFileURIs->toURI(FE).c_str();
R.Kind = toRefKind(LocAndRole.Roles, Spelled);
R.Container = getSymbolID(LocAndRole.Container);
Refs.insert(ID, R);
}
};
// Populate Refs slab from MacroRefs.
// FIXME: All MacroRefs are marked as Spelled now, but this should be checked.
for (const auto &IDAndRefs : MacroRefs)
for (const auto &LocAndRole : IDAndRefs.second)
CollectRef(IDAndRefs.first, LocAndRole, /*Spelled=*/true);
// Populate Refs slab from DeclRefs.
llvm::DenseMap<FileID, std::vector<syntax::Token>> FilesToTokensCache;
for (auto &DeclAndRef : DeclRefs) {
if (auto ID = getSymbolID(DeclAndRef.first)) {
for (auto &LocAndRole : DeclAndRef.second) {
const auto FileID = SM.getFileID(LocAndRole.Loc);
// FIXME: It's better to use TokenBuffer by passing spelled tokens from
// the caller of SymbolCollector.
if (!FilesToTokensCache.count(FileID))
FilesToTokensCache[FileID] =
syntax::tokenize(FileID, SM, ASTCtx->getLangOpts());
llvm::ArrayRef<syntax::Token> Tokens = FilesToTokensCache[FileID];
// Check if the referenced symbol is spelled exactly the same way the
// corresponding NamedDecl is. If it is, mark this reference as spelled.
const auto *IdentifierToken =
spelledIdentifierTouching(LocAndRole.Loc, Tokens);
DeclarationName Name = DeclAndRef.first->getDeclName();
const auto NameKind = Name.getNameKind();
bool IsTargetKind = NameKind == DeclarationName::Identifier ||
NameKind == DeclarationName::CXXConstructorName;
bool Spelled = IdentifierToken && IsTargetKind &&
Name.getAsString() == IdentifierToken->text(SM);
CollectRef(ID, LocAndRole, Spelled);
}
}
}

ReferencedDecls.clear();
ReferencedMacros.clear();
DeclRefs.clear();
IncludeFiles.clear();		IncludeFiles.clear();
}		}

const Symbol *SymbolCollector::addDeclaration(const NamedDecl &ND, SymbolID ID,		const Symbol *SymbolCollector::addDeclaration(const NamedDecl &ND, SymbolID ID,
bool IsMainFileOnly) {		bool IsMainFileOnly) {
auto &Ctx = ND.getASTContext();		auto &Ctx = ND.getASTContext();
auto &SM = Ctx.getSourceManager();		auto &SM = Ctx.getSourceManager();

[63 lines not shown]
setIncludeLocation(S, ND.getLocation());		setIncludeLocation(S, ND.getLocation());
return Symbols.find(S.ID);		return Symbols.find(S.ID);
}		}

void SymbolCollector::addDefinition(const NamedDecl &ND,		void SymbolCollector::addDefinition(const NamedDecl &ND,
const Symbol &DeclSym) {		const Symbol &DeclSym) {
if (DeclSym.Definition)		if (DeclSym.Definition)
return;		return;
		const auto &SM = ND.getASTContext().getSourceManager();
		auto Loc = nameLocation(ND, SM);
		shouldIndexFile(SM.getFileID(Loc));
		auto DefLoc = getTokenLocation(Loc);
// If we saw some forward declaration, we end up copying the symbol.		// If we saw some forward declaration, we end up copying the symbol.
// This is not ideal, but avoids duplicating the "is this a definition" check		// This is not ideal, but avoids duplicating the "is this a definition" check
// in clang::index. We should only see one definition.		// in clang::index. We should only see one definition.
		if (!DefLoc)
		return;
Symbol S = DeclSym;		Symbol S = DeclSym;
const auto &SM = ND.getASTContext().getSourceManager();
auto Loc = nameLocation(ND, SM);
// FIXME: use the result to filter out symbols.		// FIXME: use the result to filter out symbols.
shouldIndexFile(SM.getFileID(Loc));
if (auto DefLoc = getTokenLocation(Loc))
S.Definition = *DefLoc;		S.Definition = *DefLoc;
Symbols.insert(S);		Symbols.insert(S);
}		}

bool SymbolCollector::shouldIndexFile(FileID FID) {		bool SymbolCollector::shouldIndexFile(FileID FID) {
if (!Opts.FileFilter)		if (!Opts.FileFilter)
return true;		return true;
auto I = FilesToIndexCache.try_emplace(FID);		auto I = FilesToIndexCache.try_emplace(FID);
if (I.second)		if (I.second)
I.first->second = Opts.FileFilter(ASTCtx->getSourceManager(), FID);		I.first->second = Opts.FileFilter(ASTCtx->getSourceManager(), FID);
return I.first->second;		return I.first->second;
}		}

		void SymbolCollector::addRef(SymbolID ID, const SymbolRef &SR) {
		const auto &SM = ASTCtx->getSourceManager();
sammccall: nit: just try_emplace(D). The rest of the arguments get forwarded to the V constructor, so you're calling an unnecessary move constructor here. (Shouldn't matter here because SymbolID is trivial, but it can)
		// FIXME: use the result to filter out references.
		shouldIndexFile(SR.FID);
		if (const auto *FE = SM.getFileEntryForID(SR.FID)) {
		auto Range = getTokenRange(SR.Loc, SM, ASTCtx->getLangOpts());
		Ref R;
		R.Location.Start = Range.first;
		R.Location.End = Range.second;
		R.Location.FileURI = HeaderFileURIs->toURI(FE).c_str();
		R.Kind = toRefKind(SR.Roles, SR.Spelled);
		R.Container = getSymbolIDCached(SR.Container);
		Refs.insert(ID, R);
		}
		}

		SymbolID SymbolCollector::getSymbolIDCached(const Decl *D) {
		auto It = DeclToIDCache.try_emplace(D, SymbolID{});
		if (It.second)
		It.first->second = getSymbolID(D);
		return It.first->second;
		}

		SymbolID SymbolCollector::getSymbolIDCached(const llvm::StringRef MacroName,
		const MacroInfo *MI,
		const SourceManager &SM) {
		auto It = MacroToIDCache.try_emplace(MI, SymbolID{});
		if (It.second)
		It.first->second = getSymbolID(MacroName, MI, SM);
		return It.first->second;
		}
} // namespace clangd		} // namespace clangd
} // namespace clang		} // namespace clang
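
Note on the two getSymbolIDCached overloads above: both follow the same memoization pattern, which is to insert a default value for the key and run the expensive computation only when the insertion actually happened. The following is a minimal standalone sketch of that pattern, not the clangd implementation. FakeDecl, FakeID, IDCache, and computeID are hypothetical names, and std::unordered_map stands in for llvm::DenseMap. It uses the single-argument try_emplace form suggested in the review comment.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for illustration; not clangd's Decl or SymbolID.
struct FakeDecl {
  std::string USR;
};
using FakeID = std::uint64_t;

class IDCache {
public:
  // Returns the ID for D, computing it only on the first request.
  FakeID get(const FakeDecl *D) {
    assert(D && "this sketch does not cache null declarations");
    // try_emplace(D) value-initializes the mapped ID in place if D is new,
    // so no temporary value has to be constructed and moved in.
    auto It = Cache.try_emplace(D);
    if (It.second)
      It.first->second = computeID(*D);
    return It.first->second;
  }

  // Number of times the expensive computation actually ran.
  int misses() const { return Misses; }

private:
  // Stand-in for the costly USR generation and hashing step.
  FakeID computeID(const FakeDecl &D) {
    ++Misses;
    return std::hash<std::string>{}(D.USR);
  }

  std::unordered_map<const FakeDecl *, FakeID> Cache;
  int Misses = 0;
};
```

Because the cache is keyed by pointer, it is only valid while the pointed-to objects stay alive, so a cache like this presumably has to be scoped to the lifetime of the AST it indexes.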

clang-tools-extra/clangd/index/SymbolID.h

	//===--- SymbolID.h ----------------------------------------------*- C++-*-===//			//===--- SymbolID.h ----------------------------------------------*- C++-*-===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLID_H			#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLID_H
	#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLID_H			#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLID_H

	#include "llvm/ADT/Hashing.h"			#include "llvm/ADT/Hashing.h"
	#include "llvm/ADT/StringRef.h"			#include "llvm/ADT/StringRef.h"
	#include "llvm/Support/Error.h"			#include "llvm/Support/Error.h"
	#include "llvm/Support/raw_ostream.h"			#include "llvm/Support/raw_ostream.h"
	#include <array>			#include <array>
				#include <cstddef>
	#include <cstdint>			#include <cstdint>
	#include <string>			#include <string>

	namespace clang {			namespace clang {
	namespace clangd {			namespace clangd {

	// The class identifies a particular C++ symbol (class, function, method, etc).			// The class identifies a particular C++ symbol (class, function, method, etc).
	//			//
	// As USRs (Unified Symbol Resolution) could be large, especially for functions			// As USRs (Unified Symbol Resolution) could be large, especially for functions
	// with long type arguments, SymbolID is using truncated SHA1(USR) values to			// with long type arguments, SymbolID is using truncated SHA1(USR) values to
	// guarantee the uniqueness of symbols while using a relatively small amount of			// guarantee the uniqueness of symbols while using a relatively small amount of
	// memory (vs storing USRs directly).			// memory (vs storing USRs directly).
	//			//
	// SymbolID can be used as key in the symbol indexes to lookup the symbol.			// SymbolID can be used as key in the symbol indexes to lookup the symbol.
	class SymbolID {			class SymbolID {
	public:			public:
	SymbolID() = default;			SymbolID() = default;
	explicit SymbolID(llvm::StringRef USR);			explicit SymbolID(llvm::StringRef USR);

	bool operator==(const SymbolID &Sym) const {			bool operator==(const SymbolID &Sym) const {
	return HashValue == Sym.HashValue;			return HashValue == Sym.HashValue;
	}			}
	bool operator!=(const SymbolID &Sym) const {			bool operator!=(const SymbolID &Sym) const { return !(*this == Sym); }
	return !(*this == Sym);
	}
	bool operator<(const SymbolID &Sym) const {			bool operator<(const SymbolID &Sym) const {
	return HashValue < Sym.HashValue;			return HashValue < Sym.HashValue;
	}			}

	// The stored hash is truncated to RawSize bytes.			// The stored hash is truncated to RawSize bytes.
	// This trades off memory against the number of symbols we can handle.			// This trades off memory against the number of symbols we can handle.
	constexpr static size_t RawSize = 8;			constexpr static size_t RawSize = 8;
	llvm::StringRef raw() const;			llvm::StringRef raw() const;
	static SymbolID fromRaw(llvm::StringRef);			static SymbolID fromRaw(llvm::StringRef);

	// Returns a hex encoded string.			// Returns a hex encoded string.
	std::string str() const;			std::string str() const;
	static llvm::Expected<SymbolID> fromStr(llvm::StringRef);			static llvm::Expected<SymbolID> fromStr(llvm::StringRef);

	bool isNull() const { return *this == SymbolID(); }			bool isNull() const { return *this == SymbolID(); }
	explicit operator bool() const { return !isNull(); }			explicit operator bool() const { return !isNull(); }

	private:			private:
	std::array<uint8_t, RawSize> HashValue{};			std::array<uint8_t, RawSize> HashValue{};
	};			};

	llvm::hash_code hash_value(const SymbolID &ID);			inline llvm::hash_code hash_value(const SymbolID &ID) {
				// We already have a good hash, just return the first bytes.
				static_assert(sizeof(size_t) <= SymbolID::RawSize,
				"size_t longer than SHA1!");
				size_t Result;
				memcpy(&Result, ID.raw().data(), sizeof(size_t));
				return llvm::hash_code(Result);
				}

	// Write SymbolID into the given stream. SymbolID is encoded as ID.str().			// Write SymbolID into the given stream. SymbolID is encoded as ID.str().
	llvm::raw_ostream &operator<<(llvm::raw_ostream &OS, const SymbolID &ID);			llvm::raw_ostream &operator<<(llvm::raw_ostream &OS, const SymbolID &ID);

	} // namespace clangd			} // namespace clangd
	} // namespace clang			} // namespace clang

	namespace llvm {			namespace llvm {
	[21 lines not shown]

clang-tools-extra/clangd/index/SymbolID.cpp

[40 lines not shown]
return error("Bad hex ID");		return error("Bad hex ID");
return fromRaw(llvm::fromHex(Str));		return fromRaw(llvm::fromHex(Str));
}		}

llvm::raw_ostream &operator<<(llvm::raw_ostream &OS, const SymbolID &ID) {		llvm::raw_ostream &operator<<(llvm::raw_ostream &OS, const SymbolID &ID) {
return OS << llvm::toHex(ID.raw());		return OS << llvm::toHex(ID.raw());
}		}

llvm::hash_code hash_value(const SymbolID &ID) {
// We already have a good hash, just return the first bytes.
static_assert(sizeof(size_t) <= SymbolID::RawSize,
"size_t longer than SHA1!");
size_t Result;
memcpy(&Result, ID.raw().data(), sizeof(size_t));
return llvm::hash_code(Result);
}

} // namespace clangd		} // namespace clangd
} // namespace clang		} // namespace clang
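
Note on hash_value, which this change moves from SymbolID.cpp into the header: the stored bytes are already a truncated SHA1 digest, so the hash is simply the first sizeof(size_t) bytes reinterpreted as an integer. A minimal standalone sketch of that idea follows. RawHash and hashFromPrefix are hypothetical names, and a plain std::array stands in for SymbolID. The sketch includes <cstring>, which is the standard header that declares std::memcpy.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Mirrors SymbolID::RawSize from the diff; the alias itself is hypothetical.
constexpr std::size_t RawSize = 8;
using RawHash = std::array<std::uint8_t, RawSize>;

// Reuses the leading bytes of an already well-mixed digest as the hash.
inline std::size_t hashFromPrefix(const RawHash &Raw) {
  static_assert(sizeof(std::size_t) <= RawSize,
                "size_t longer than the stored digest");
  std::size_t Result;
  // memcpy avoids the alignment and aliasing problems of a pointer cast.
  std::memcpy(&Result, Raw.data(), sizeof(Result));
  return Result;
}
```

Defining such a function inline in a header lets the compiler see its body at every call site, which is a plausible reason for the move given how often hash maps keyed by SymbolID are probed.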

clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp

	[980 lines not shown]
	}			}

	TEST_F(SymbolCollectorTest, SpelledReferences) {			TEST_F(SymbolCollectorTest, SpelledReferences) {
	struct {			struct {
	llvm::StringRef Header;			llvm::StringRef Header;
	llvm::StringRef Main;			llvm::StringRef Main;
	llvm::StringRef TargetSymbolName;			llvm::StringRef TargetSymbolName;
	} TestCases[] = {			} TestCases[] = {
	{			{
Lint: Pre-merge checks: clang-format: please reformat the code
	R"cpp(			R"cpp(
	struct Foo;			struct Foo;
	#define MACRO Foo			#define MACRO Foo
	)cpp",			)cpp",
	R"cpp(			R"cpp(
	struct $spelled[[Foo]] {			struct $spelled[[Foo]] {
	$spelled[[Foo]]();			$spelled[[Foo]]();
	~$spelled[[Foo]]();			~$spelled[[Foo]]();
	};			};
	$spelled[[Foo]] Variable1;			$spelled[[Foo]] Variable1;
	$implicit[[MACRO]] Variable2;			$implicit[[MACRO]] Variable2;
	)cpp",			)cpp",
	"Foo",			"Foo",
	},			},
	{			{
	R"cpp(			R"cpp(
	class Foo {			class Foo {
	public:			public:
	Foo() = default;			Foo() = default;
	};			};
	)cpp",			)cpp",
	R"cpp(			R"cpp(
	void f() { Foo $implicit[[f]]; f = $spelled[[Foo]]();}			void f() { Foo $implicit[[f]]; f = $spelled[[Foo]]();}
	)cpp",			)cpp",
	"Foo::Foo" /// constructor.			"Foo::Foo" /// constructor.
	},			},
				{ // Unclean identifiers
				R"cpp(
				struct Foo {};
				)cpp",
				R"cpp(
				$spelled[[Fo\
				o]] f{};
				)cpp",
				"Foo",
				},
	};			};
	CollectorOpts.RefFilter = RefKind::All;			CollectorOpts.RefFilter = RefKind::All;
	CollectorOpts.RefsInHeaders = false;			CollectorOpts.RefsInHeaders = false;
	for (const auto& T : TestCases) {			for (const auto& T : TestCases) {
				SCOPED_TRACE(T.Header + "\n---\n" + T.Main);
	Annotations Header(T.Header);			Annotations Header(T.Header);
	Annotations Main(T.Main);			Annotations Main(T.Main);
	// Reset the file system.			// Reset the file system.
	InMemoryFileSystem = new llvm::vfs::InMemoryFileSystem;			InMemoryFileSystem = new llvm::vfs::InMemoryFileSystem;
	runSymbolCollector(Header.code(), Main.code());			runSymbolCollector(Header.code(), Main.code());

	const auto SpelledRanges = Main.ranges("spelled");			const auto SpelledRanges = Main.ranges("spelled");
	const auto ImplicitRanges = Main.ranges("implicit");			const auto ImplicitRanges = Main.ranges("implicit");
	RefSlab::Builder SpelledSlabBuilder, ImplicitSlabBuilder;			RefSlab::Builder SpelledSlabBuilder, ImplicitSlabBuilder;
	const auto TargetID = findSymbol(Symbols, T.TargetSymbolName).ID;			const auto TargetID = findSymbol(Symbols, T.TargetSymbolName).ID;
	for (const auto &SymbolAndRefs : Refs) {			for (const auto &SymbolAndRefs : Refs) {
	const auto ID = SymbolAndRefs.first;			const auto ID = SymbolAndRefs.first;
	if (ID != TargetID)			if (ID != TargetID)
	continue;			continue;
	for (const auto &Ref : SymbolAndRefs.second)			for (const auto &Ref : SymbolAndRefs.second)
	if ((Ref.Kind & RefKind::Spelled) != RefKind::Unknown)			if ((Ref.Kind & RefKind::Spelled) != RefKind::Unknown)
	SpelledSlabBuilder.insert(ID, Ref);			SpelledSlabBuilder.insert(ID, Ref);
	else			else
	ImplicitSlabBuilder.insert(ID, Ref);			ImplicitSlabBuilder.insert(ID, Ref);
	}			}
	const auto SpelledRefs = std::move(SpelledSlabBuilder).build(),			const auto SpelledRefs = std::move(SpelledSlabBuilder).build(),
	ImplicitRefs = std::move(ImplicitSlabBuilder).build();			ImplicitRefs = std::move(ImplicitSlabBuilder).build();
				EXPECT_EQ(SpelledRanges.empty(), SpelledRefs.empty());
				EXPECT_EQ(ImplicitRanges.empty(), ImplicitRefs.empty());
				if (!SpelledRanges.empty())
	EXPECT_THAT(SpelledRefs,			EXPECT_THAT(SpelledRefs,
	Contains(Pair(TargetID, haveRanges(SpelledRanges))));			Contains(Pair(TargetID, haveRanges(SpelledRanges))));
				if (!ImplicitRanges.empty())
	EXPECT_THAT(ImplicitRefs,			EXPECT_THAT(ImplicitRefs,
	Contains(Pair(TargetID, haveRanges(ImplicitRanges))));			Contains(Pair(TargetID, haveRanges(ImplicitRanges))));
	}			}
	}			}
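
Note on the "Unclean identifiers" case in SpelledReferences above: the reference is written as Fo, backslash, newline, o, and it should still count as spelled because the canonical spelling is Foo. The sketch below shows only the line-splice part of that canonicalization, under simplifying assumptions. removeLineSplices is a hypothetical helper, not a clang API, and it ignores trigraphs and whitespace between the backslash and the newline, both of which the real lexer has to consider.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Drops backslash-newline line splices ("\\\n" and "\\\r\n") from a raw
// token spelling. Simplified: trigraphs and trailing spaces are not handled.
inline std::string removeLineSplices(std::string_view Raw) {
  std::string Clean;
  Clean.reserve(Raw.size());
  for (std::size_t I = 0; I < Raw.size(); ++I) {
    if (Raw[I] == '\\') {
      std::size_t Next = I + 1;
      if (Next < Raw.size() && Raw[Next] == '\r')
        ++Next;
      if (Next < Raw.size() && Raw[Next] == '\n') {
        I = Next; // Skip the whole splice; the loop increment moves past it.
        continue;
      }
    }
    Clean.push_back(Raw[I]);
  }
  return Clean;
}
```

Comparing the cleaned spelling against the declaration name is what allows a reference split across lines to match, whereas comparing raw token text would not.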

	TEST_F(SymbolCollectorTest, NameReferences) {			TEST_F(SymbolCollectorTest, NameReferences) {
	CollectorOpts.RefFilter = RefKind::All;			CollectorOpts.RefFilter = RefKind::All;
	CollectorOpts.RefsInHeaders = true;			CollectorOpts.RefsInHeaders = true;
	Annotations Header(R"(			Annotations Header(R"(
	class [[Foo]] {			class [[Foo]] {
	[891 lines not shown]