This is an archive of the discontinued LLVM Phabricator instance.


Differential D45513

[clangd] Add line and column number to the index symbol.
ClosedPublic

Authored by hokein on Apr 11 2018, 2:19 AM.

Details

Reviewers

sammccall

Commits

rG545c02a7109c: [clangd] Add line and column number to the index symbol.
rL329997: [clangd] Add line and column number to the index symbol.
rCTE329997: [clangd] Add line and column number to the index symbol.

Summary

LSP uses line and column to express symbol positions, so clangd has to convert
file offsets to line and column when sending results back to the LSP client.
That conversion is costly, especially for workspace symbol search: we have to
read the file content from disk (if it isn't already loaded in memory).

Saving this information in the index will make clangd's life easier.
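
To make the cost concrete, here is a minimal sketch of the offset-to-position conversion that the summary wants to avoid at query time. It is not clangd code: `Position` and `offsetToPosition` are hypothetical names, and columns are counted in bytes for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the (Line, Column) pair discussed in this review.
struct Position {
  uint32_t Line = 0;   // 0-based
  uint32_t Column = 0; // 0-based, counted in bytes in this sketch
};

// Converts a byte offset into a 0-based (line, column) pair by scanning the
// file content from the beginning. The scan is linear in Offset and needs the
// whole buffer in memory.
Position offsetToPosition(const std::string &Content, std::size_t Offset) {
  assert(Offset <= Content.size() && "offset past end of buffer");
  Position Pos;
  for (std::size_t I = 0; I < Offset; ++I) {
    if (Content[I] == '\n') {
      ++Pos.Line;
      Pos.Column = 0;
    } else {
      ++Pos.Column;
    }
  }
  return Pos;
}
```

Because the function needs `Content`, answering a workspace symbol query this way means loading every file that contains a match. Storing line and column in the index removes that step.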

Diff Detail

Repository: rCTE Clang Tools Extra

Event Timeline

hokein created this revision. Apr 11 2018, 2:19 AM

Herald added subscribers: MaskRay, ioeric, jkorous-apple and 2 others. Apr 11 2018, 2:19 AM

sammccall added inline comments. Apr 11 2018, 2:55 AM

clangd/index/Index.h
27	There are 4 of these per symbol, if we can keep line + character to 32 bits we save 16 bytes per symbol. That looks like it might still be ~10% of non-string size. WDYT? (12 bits for column and 20 bits for line seems like a reasonable guess)
28	These comments seem pretty obvious, I think `int Line=0, Character=0 // zero-based` is enough, but up to you. I would like a comment explaining why we're storing these redundant representations though. e.g. `// Storing Line/Character allows us to build LSP responses without reading the file content.`
32	Column? LSP calls this "character" but this is nonstandard and I find it very confusing with offset.
39	I don't think we should remove them, we'll just have the same problem in reverse. Could Position have line/col/offset, and have Position Start, End?
clangd/index/SymbolCollector.cpp
205	I don't think this works, tokens can be split across lines. I believe you want to compute NameLoc.locWithOffset(TokenLength) and then decompose that into line/col. (getLocForEndOfToken, confusingly, is different)
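
For reference, the packing suggested for line 27 (20 bits of line, 12 bits of column) could look roughly like the sketch below. `PackedPosition` and its clamping behaviour are assumptions made for illustration; nothing like this is confirmed to be part of the patch. With four positions per symbol, shrinking each from two 32-bit fields (8 bytes) to 4 bytes gives the 16 bytes per symbol mentioned in the comment.

```cpp
#include <cstdint>

// Hypothetical layout: the high 20 bits hold the line, the low 12 the column.
constexpr uint32_t ColumnBits = 12;
constexpr uint32_t MaxColumn = (1u << ColumnBits) - 1; // 4095
constexpr uint32_t MaxLine = (1u << 20) - 1;           // 1048575

class PackedPosition {
public:
  // Values that do not fit are clamped to the largest representable value.
  PackedPosition(uint32_t Line, uint32_t Column)
      : Bits(((Line > MaxLine ? MaxLine : Line) << ColumnBits) |
             (Column > MaxColumn ? MaxColumn : Column)) {}

  uint32_t line() const { return Bits >> ColumnBits; }
  uint32_t column() const { return Bits & MaxColumn; }

private:
  uint32_t Bits;
};

static_assert(sizeof(PackedPosition) == 4, "line and column share 32 bits");
```

The trade-off is that lines longer than 4095 columns and files longer than 1048575 lines can no longer be represented exactly.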

sammccall mentioned this in D44882: [clangd] Implementation of workspace/symbol request. Apr 11 2018, 6:04 AM

Address review comments.

Harbormaster completed remote builds in B17014: Diff 142170. Apr 12 2018, 7:21 AM

hokein added inline comments. Apr 12 2018, 7:21 AM

clangd/index/Index.h
39	As discussed offline, we decided to remove them, as we don't have a real use case for them (we could add them back when needed). I removed all references to them in clangd, and will remove the fields themselves in a separate patch.
clangd/index/SymbolCollector.cpp
205	Good catch. Done, added a test.
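
The multi-line token problem raised on line 205 can be illustrated without any clang APIs. The sketch below uses hypothetical helpers (`naiveEnd`, `walkedEnd`) and a hypothetical `Position` struct. It compares adding the token length to the start column with walking the raw token text, which gives the same result as decomposing the end offset into line and column.

```cpp
#include <cstdint>
#include <string>

// Hypothetical 0-based position, mirroring the struct discussed in the review.
struct Position {
  uint32_t Line = 0;
  uint32_t Column = 0;
};

// Naive end position: assumes the token sits on a single line, so only the
// column moves. This is wrong when the token contains a backslash-newline.
Position naiveEnd(Position Start, const std::string &RawToken) {
  Position End = Start;
  End.Column += static_cast<uint32_t>(RawToken.size());
  return End;
}

// End position obtained by walking the raw token text: every newline moves to
// the next line and resets the column.
Position walkedEnd(Position Start, const std::string &RawToken) {
  Position End = Start;
  for (char C : RawToken) {
    if (C == '\n') {
      ++End.Line;
      End.Column = 0;
    } else {
      ++End.Column;
    }
  }
  return End;
}
```

For a name spelled as `fo`, a backslash, a newline, and `o`, starting at line 0, column 5, `naiveEnd` yields (0, 10) while `walkedEnd` yields (1, 1).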

sammccall added inline comments. Apr 12 2018, 7:27 AM

clangd/index/Index.h
27	after some offline discussion, I no longer think this is a good idea. We should strive to keep memory usage reasonable with the current naive index implementation, but "giant vector of Symbol structs in memory" isn't a case to micro-optimize for in the long run.
32	We should document what this is an offset into: bytes, utf-16 code units, or unicode codepoints. (Or even grid offsets - glyphs and doublewidth are a thing) Given that we intend to send it over LSP without reading the source, only utf-16 code units is really correct. Unicode codepoints is "nicer" and will give correct results in the BMP, while bytes will be correct for ASCII only. I'd vote for making this utf-16 code units. It's OK if the code populating it doesn't get this right (confuses bytes and code units) but add a fixme.
39	After offline discussion: we don't actually plan to do math on these ever, we just send them to LSP clients. So removing sounds fine. We can add later if there are clear use cases.
45	Not sure about the abbreviation - maybe just start/end? Since offsets are going away.
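
The candidate units for `Column` discussed on line 32 differ only on non-ASCII text. As a rough sketch, assuming the line is valid UTF-8, the UTF-16 column for a given byte column can be computed as follows. `utf16Column` is a hypothetical helper, not a clangd function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the number of UTF-16 code units needed to encode the first
// ByteColumn bytes of Line. Assumes Line is valid UTF-8 and that ByteColumn
// falls on a character boundary.
std::size_t utf16Column(const std::string &Line, std::size_t ByteColumn) {
  assert(ByteColumn <= Line.size() && "column past end of line");
  std::size_t Units = 0;
  for (std::size_t I = 0; I < ByteColumn; ++I) {
    unsigned char C = static_cast<unsigned char>(Line[I]);
    if ((C & 0xC0) == 0x80)
      continue; // Continuation byte: already counted with its lead byte.
    // A 4-byte sequence encodes a code point outside the BMP, which takes a
    // surrogate pair (two code units) in UTF-16. Everything else takes one.
    Units += (C >= 0xF0) ? 2 : 1;
  }
  return Units;
}
```

For the line `// 😹x`, the byte column of `x` is 7, its code point column is 4, and its UTF-16 column is 5, since the emoji lies outside the BMP and needs a surrogate pair.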

LG, with:

- consider reverting the bitpacking stuff
- comment about utf-16
- clang-format :)

clangd/index/SymbolCollector.cpp
202	nit: SourceManager is 1-based (or returns 1-based data here). SourceLocation uses 0-based offsets, not 1-based line/column.

This revision is now accepted and ready to land. Apr 12 2018, 8:26 AM

Update the patch, address remaining issues.

clangd/index/Index.h
32	Done. Added FIXME.

Harbormaster completed remote builds in B17023: Diff 142195. Apr 12 2018, 8:51 AM

MaskRay added inline comments. Apr 12 2018, 2:51 PM

clangd/index/Index.h
32	I'd vote for Unicode code points. I haven't looked into this closely. But UTF-8 vs UTF-16 vs Unicode code points should not be a big issue here. Unless you use emojis or some uncommon characters, the usage of UTF-16 code units in LSP does not have a lot of harm. `// 😹😹👻 anything weird hidden in line comments can be ignored because they don't affect offset calculation`. And Microsoft might change the spec one day: https://github.com/Microsoft/language-server-protocol/issues/376

Just my 2 cents. Calculation of line/character for each occurrence may not take a lot of computation. cquery/ccls computes the

clangd/index/Index.h
39	Yes. `StartOffset` and `EndOffset` should be removed some day. a line->offset mapping should be sufficient for documents that have stale index.
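
A line-to-offset mapping of the kind mentioned for line 39 can be built with one pass over the file. The sketch below is an illustration only: `computeLineStarts` and `lineColumnToOffset` are hypothetical standalone helpers, and columns are treated as byte counts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Records the byte offset at which each line begins. Line 0 starts at 0 and
// every '\n' starts a new line at the following byte.
std::vector<std::size_t> computeLineStarts(const std::string &Content) {
  std::vector<std::size_t> Starts;
  Starts.push_back(0);
  for (std::size_t I = 0; I < Content.size(); ++I)
    if (Content[I] == '\n')
      Starts.push_back(I + 1);
  return Starts;
}

// Maps a 0-based (line, column) pair back to a byte offset using the table.
std::size_t lineColumnToOffset(const std::vector<std::size_t> &LineStarts,
                               uint32_t Line, uint32_t Column) {
  assert(Line < LineStarts.size() && "line past end of file");
  return LineStarts[Line] + Column;
}
```

With such a table, an offset is only computed when a caller actually needs one, so the index itself can store just line and column.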

hokein added inline comments. Apr 13 2018, 1:10 AM

clangd/index/Index.h
32	Yeah, It'd be nicer if LSP spec is changed to use UTF-8. The `Column` is intended to align with `Character` of LSP's `Position`, let's keep it as it is at the moment.

Closed by commit rCTE329997: [clangd] Add line and column number to the index symbol. (authored by hokein). Apr 13 2018, 1:33 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

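Path                                        Size
clangd/index/Index.h                        14 lines
clangd/index/Index.cpp                      3 lines
clangd/index/SymbolCollector.cpp            21 lines
clangd/index/SymbolYAML.cpp                 11 lines
unittests/clangd/Annotations.h              4 lines
unittests/clangd/Annotations.cpp            10 lines
unittests/clangd/SymbolCollectorTests.cpp   78 lines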

Diff 142349

clangd/index/Index.h

 #include "llvm/ADT/StringExtras.h"
 #include <array>
 #include <string>

 namespace clang {
 namespace clangd {

 struct SymbolLocation {
+  // Specify a position (Line, Column) of symbol. Using Line/Column allows us to
+  // build LSP responses without reading the file content.
+  struct Position {
+    uint32_t Line = 0; // 0-based
+    // Using UTF-16 code units.
+    uint32_t Column = 0; // 0-based
+  };
+
   // The URI of the source file where a symbol occurs.
   llvm::StringRef FileURI;
   // The 0-based offsets of the symbol from the beginning of the source file,
   // using half-open range, [StartOffset, EndOffset).
+  // DO NOT use these fields, as they will be removed immediately.
+  // FIXME(hokein): remove these fields in favor of Position.
   unsigned StartOffset = 0;
   unsigned EndOffset = 0;

+  /// The symbol range, using half-open range [Start, End).
+  Position Start;
+  Position End;
+
   operator bool() const { return !FileURI.empty(); }
 };
 llvm::raw_ostream &operator<<(llvm::raw_ostream &, const SymbolLocation &);

 // The class identifies a particular C++ symbol (class, function, method, etc).
 //
 // As USRs (Unified Symbol Resolution) could be large, especially for functions
 // with long type arguments, SymbolID is using 160-bits SHA1(USR) values to

clangd/index/Index.cpp

 namespace clang {
 namespace clangd {
 using namespace llvm;

 raw_ostream &operator<<(raw_ostream &OS, const SymbolLocation &L) {
   if (!L)
     return OS << "(none)";
-  return OS << L.FileURI << "[" << L.StartOffset << "-" << L.EndOffset << ")";
+  return OS << L.FileURI << "[" << L.Start.Line << ":" << L.Start.Column << "-"
+            << L.End.Line << ":" << L.End.Column << ")";
 }

 SymbolID::SymbolID(StringRef USR)
     : HashValue(SHA1::hash(arrayRefFromStringRef(USR))) {}

 raw_ostream &operator<<(raw_ostream &OS, const SymbolID &ID) {
   OS << toHex(toStringRef(ID.HashValue));
   return OS;

clangd/index/SymbolCollector.cpp

@@ llvm::Optional<SymbolLocation> getSymbolLocation(
     const clang::LangOptions &LangOpts, std::string &FileURIStorage) {
   SourceLocation NameLoc = findNameLoc(&D);
   auto U = toURI(SM, SM.getFilename(NameLoc), Opts);
   if (!U)
     return llvm::None;
   FileURIStorage = std::move(*U);
   SymbolLocation Result;
   Result.FileURI = FileURIStorage;
-  Result.StartOffset = SM.getFileOffset(NameLoc);
-  Result.EndOffset = Result.StartOffset + clang::Lexer::MeasureTokenLength(
-                                              NameLoc, SM, LangOpts);
+  auto TokenLength = clang::Lexer::MeasureTokenLength(NameLoc, SM, LangOpts);
+
+  auto CreatePosition = [&SM](SourceLocation Loc) {
+    auto FileIdAndOffset = SM.getDecomposedLoc(Loc);
+    auto FileId = FileIdAndOffset.first;
+    auto Offset = FileIdAndOffset.second;
+    SymbolLocation::Position Pos;
+    // Position is 0-based while SourceManager is 1-based.
+    Pos.Line = SM.getLineNumber(FileId, Offset) - 1;
+    // FIXME: Use UTF-16 code units, not UTF-8 bytes.
+    Pos.Column = SM.getColumnNumber(FileId, Offset) - 1;
+    return Pos;
+  };
+
+  Result.Start = CreatePosition(NameLoc);
+  auto EndLoc = NameLoc.getLocWithOffset(TokenLength);
+  Result.End = CreatePosition(EndLoc);
+
   return std::move(Result);
 }

 // Checks whether \p ND is a definition of a TagDecl (class/struct/enum/union)
 // in a header file, in which case clangd would prefer to use ND as a canonical
 // declaration.
 // FIXME: handle symbol types that are not TagDecl (e.g. functions), if using
 // the first seen declaration as canonical declaration is not a good enough

clangd/index/SymbolYAML.cpp

@@ SymbolID denormalize(IO&) {
     SymbolID ID;
     HexString >> ID;
     return ID;
   }

   std::string HexString;
 };

+template <> struct MappingTraits<SymbolLocation::Position> {
+  static void mapping(IO &IO, SymbolLocation::Position &Value) {
+    IO.mapRequired("Line", Value.Line);
+    IO.mapRequired("Column", Value.Column);
+  }
+};
+
 template <> struct MappingTraits<SymbolLocation> {
   static void mapping(IO &IO, SymbolLocation &Value) {
-    IO.mapRequired("StartOffset", Value.StartOffset);
-    IO.mapRequired("EndOffset", Value.EndOffset);
     IO.mapRequired("FileURI", Value.FileURI);
+    IO.mapRequired("Start", Value.Start);
+    IO.mapRequired("End", Value.End);
   }
 };

 template <> struct MappingTraits<SymbolInfo> {
   static void mapping(IO &io, SymbolInfo &SymInfo) {
     // FIXME: expose other fields?
     io.mapRequired("Kind", SymInfo.Kind);
     io.mapRequired("Lang", SymInfo.Lang);

unittests/clangd/Annotations.h

@@ public:
   std::vector<Position> points(llvm::StringRef Name = "") const;

   // Returns the location of the range marked by [[ ]] (or $name[[ ]]).
   // Crashes if there isn't exactly one.
   Range range(llvm::StringRef Name = "") const;
   // Returns the location of all ranges marked by [[ ]] (or $name[[ ]]).
   std::vector<Range> ranges(llvm::StringRef Name = "") const;

-  // The same to `range` method, but returns range in offsets [start, end).
-  std::pair<std::size_t, std::size_t>
-  offsetRange(llvm::StringRef Name = "") const;
-
 private:
   std::string Code;
   llvm::StringMap<llvm::SmallVector<Position, 1>> Points;
   llvm::StringMap<llvm::SmallVector<Range, 1>> Ranges;
 };

 } // namespace clangd
 } // namespace clang
 #endif

unittests/clangd/Annotations.cpp

@@ require(I != Ranges.end() && I->getValue().size() == 1,
           "expected exactly one range", Code);
   return I->getValue()[0];
 }
 std::vector<Range> Annotations::ranges(llvm::StringRef Name) const {
   auto R = Ranges.lookup(Name);
   return {R.begin(), R.end()};
 }

-std::pair<std::size_t, std::size_t>
-Annotations::offsetRange(llvm::StringRef Name) const {
-  auto R = range(Name);
-  llvm::Expected<size_t> Start = positionToOffset(Code, R.start);
-  llvm::Expected<size_t> End = positionToOffset(Code, R.end);
-  assert(Start);
-  assert(End);
-  return {*Start, *End};
-}
-
 } // namespace clangd
 } // namespace clang

unittests/clangd/SymbolCollectorTests.cpp

@@ MATCHER_P(Snippet, S, "") {
   return arg.CompletionSnippetInsertText == S;
 }
 MATCHER_P(QName, Name, "") { return (arg.Scope + arg.Name).str() == Name; }
 MATCHER_P(DeclURI, P, "") { return arg.CanonicalDeclaration.FileURI == P; }
 MATCHER_P(DefURI, P, "") { return arg.Definition.FileURI == P; }
 MATCHER_P(IncludeHeader, P, "") {
   return arg.Detail && arg.Detail->IncludeHeader == P;
 }
-MATCHER_P(DeclRange, Offsets, "") {
-  return arg.CanonicalDeclaration.StartOffset == Offsets.first &&
-         arg.CanonicalDeclaration.EndOffset == Offsets.second;
-}
-MATCHER_P(DefRange, Offsets, "") {
-  return arg.Definition.StartOffset == Offsets.first &&
-         arg.Definition.EndOffset == Offsets.second;
+MATCHER_P(DeclRange, Pos, "") {
+  return std::tie(arg.CanonicalDeclaration.Start.Line,
+                  arg.CanonicalDeclaration.Start.Column,
+                  arg.CanonicalDeclaration.End.Line,
+                  arg.CanonicalDeclaration.End.Column) ==
+         std::tie(Pos.start.line, Pos.start.character, Pos.end.line,
+                  Pos.end.character);
+}
+MATCHER_P(DefRange, Pos, "") {
+  return std::tie(arg.Definition.Start.Line,
+                  arg.Definition.Start.Column, arg.Definition.End.Line,
+                  arg.Definition.End.Column) ==
+         std::tie(Pos.start.line, Pos.start.character, Pos.end.line,
+                  Pos.end.character);
 }
 MATCHER_P(Refs, R, "") { return int(arg.References) == R; }

 namespace clang {
 namespace clangd {

 namespace {
 class SymbolIndexActionFactory : public tooling::FrontendActionFactory {
@@ Annotations Header(R"(
     // Template is indexed, specialization and instantiation is not.
     template <class T> struct [[Tmpl]] {T x = 0;};
     template <> struct Tmpl<int> {};
     extern template struct Tmpl<float>;
     template struct Tmpl<double>;
   )");
   runSymbolCollector(Header.code(), /*Main=*/"");
   EXPECT_THAT(Symbols, UnorderedElementsAreArray({AllOf(
-                           QName("Tmpl"), DeclRange(Header.offsetRange()))}));
+                           QName("Tmpl"), DeclRange(Header.range()))}));
 }

 TEST_F(SymbolCollectorTest, Locations) {
   Annotations Header(R"cpp(
     // Declared in header, defined in main.
     extern int $xdecl[[X]];
     class $clsdecl[[Cls]];
     void $printdecl[[print]]();

     // Declared in header, defined nowhere.
     extern int $zdecl[[Z]];
+
+    void $foodecl[[fo\
+o]]();
   )cpp");
   Annotations Main(R"cpp(
     int $xdef[[X]] = 42;
     class $clsdef[[Cls]] {};
     void $printdef[[print]]() {}

     // Declared/defined in main only.
     int Y;
   )cpp");
   runSymbolCollector(Header.code(), Main.code());
   EXPECT_THAT(
       Symbols,
       UnorderedElementsAre(
-          AllOf(QName("X"), DeclRange(Header.offsetRange("xdecl")),
-                DefRange(Main.offsetRange("xdef"))),
-          AllOf(QName("Cls"), DeclRange(Header.offsetRange("clsdecl")),
-                DefRange(Main.offsetRange("clsdef"))),
-          AllOf(QName("print"), DeclRange(Header.offsetRange("printdecl")),
-                DefRange(Main.offsetRange("printdef"))),
-          AllOf(QName("Z"), DeclRange(Header.offsetRange("zdecl")))));
+          AllOf(QName("X"), DeclRange(Header.range("xdecl")),
+                DefRange(Main.range("xdef"))),
+          AllOf(QName("Cls"), DeclRange(Header.range("clsdecl")),
+                DefRange(Main.range("clsdef"))),
+          AllOf(QName("print"), DeclRange(Header.range("printdecl")),
+                DefRange(Main.range("printdef"))),
+          AllOf(QName("Z"), DeclRange(Header.range("zdecl"))),
+          AllOf(QName("foo"), DeclRange(Header.range("foodecl")))
+          ));
 }

 TEST_F(SymbolCollectorTest, References) {
   const std::string Header = R"(
     class W;
     class X {};
     class Y;
     class Z {}; // not used anywhere
@@ Annotations Header(R"(

     FF2();
   )");

   runSymbolCollector(Header.code(), /*Main=*/"");
   EXPECT_THAT(
       Symbols,
       UnorderedElementsAre(
-          AllOf(QName("abc_Test"), DeclRange(Header.offsetRange("expansion")),
+          AllOf(QName("abc_Test"), DeclRange(Header.range("expansion")),
                 DeclURI(TestHeaderURI)),
-          AllOf(QName("Test"), DeclRange(Header.offsetRange("spelling")),
+          AllOf(QName("Test"), DeclRange(Header.range("spelling")),
                 DeclURI(TestHeaderURI))));
 }

 TEST_F(SymbolCollectorTest, SymbolFormedByCLI) {
   Annotations Header(R"(
     #ifdef NAME
     class $expansion[[NAME]] {};
     #endif
   )");

   runSymbolCollector(Header.code(), /*Main=*/"",
                      /*ExtraArgs=*/{"-DNAME=name"});
   EXPECT_THAT(Symbols,
               UnorderedElementsAre(AllOf(
-                  QName("name"), DeclRange(Header.offsetRange("expansion")),
+                  QName("name"),
+                  DeclRange(Header.range("expansion")),
                   DeclURI(TestHeaderURI))));
 }

 TEST_F(SymbolCollectorTest, IgnoreSymbolsInMainFile) {
   const std::string Header = R"(
     class Foo {};
     void f1();
     inline void f2() {}
@@
 ---
 ID: 057557CEBF6E6B2DD437FBF60CC58F352D1DF856
 Name: 'Foo1'
 Scope: 'clang::'
 SymInfo:
   Kind: Function
   Lang: Cpp
 CanonicalDeclaration:
-  StartOffset: 0
-  EndOffset: 1
   FileURI: file:///path/foo.h
+  Start:
+    Line: 1
+    Column: 0
+  End:
+    Line: 1
+    Column: 1
 CompletionLabel: 'Foo1-label'
 CompletionFilterText: 'filter'
 CompletionPlainInsertText: 'plain'
 Detail:
   Documentation: 'Foo doc'
   CompletionDetail: 'int'
 ...
 )";
   const std::string YAML2 = R"(
 ---
 ID: 057557CEBF6E6B2DD437FBF60CC58F352D1DF858
 Name: 'Foo2'
 Scope: 'clang::'
 SymInfo:
   Kind: Function
   Lang: Cpp
 CanonicalDeclaration:
-  StartOffset: 10
-  EndOffset: 12
   FileURI: file:///path/bar.h
+  Start:
+    Line: 1
+    Column: 0
+  End:
+    Line: 1
+    Column: 1
 CompletionLabel: 'Foo2-label'
 CompletionFilterText: 'filter'
 CompletionPlainInsertText: 'plain'
 CompletionSnippetInsertText: 'snippet'
 ...
 )";

   auto Symbols1 = SymbolsFromYAML(YAML1);

   EXPECT_THAT(Symbols1,
               UnorderedElementsAre(AllOf(
                   QName("clang::Foo1"), Labeled("Foo1-label"), Doc("Foo doc"),
                   Detail("int"), DeclURI("file:///path/foo.h"))));
   auto Symbols2 = SymbolsFromYAML(YAML2);
   EXPECT_THAT(Symbols2, UnorderedElementsAre(AllOf(
                             QName("clang::Foo2"), Labeled("Foo2-label"),
                             Not(HasDetail()), DeclURI("file:///path/bar.h"))));
@@ Annotations Header(R"(
     class $cdecl[[C]] {};
     struct $sdecl[[S]] {};
     union $udecl[[U]] {int x; bool y;};
   )");
   runSymbolCollector(Header.code(), /*Main=*/"");
   EXPECT_THAT(Symbols,
               UnorderedElementsAre(
                   AllOf(QName("C"), DeclURI(TestHeaderURI),
-                        DeclRange(Header.offsetRange("cdecl")),
+                        DeclRange(Header.range("cdecl")),
                         IncludeHeader(TestHeaderURI), DefURI(TestHeaderURI),
-                        DefRange(Header.offsetRange("cdecl"))),
+                        DefRange(Header.range("cdecl"))),
                   AllOf(QName("S"), DeclURI(TestHeaderURI),
-                        DeclRange(Header.offsetRange("sdecl")),
+                        DeclRange(Header.range("sdecl")),
                         IncludeHeader(TestHeaderURI), DefURI(TestHeaderURI),
-                        DefRange(Header.offsetRange("sdecl"))),
+                        DefRange(Header.range("sdecl"))),
                   AllOf(QName("U"), DeclURI(TestHeaderURI),
-                        DeclRange(Header.offsetRange("udecl")),
+                        DeclRange(Header.range("udecl")),
                         IncludeHeader(TestHeaderURI), DefURI(TestHeaderURI),
-                        DefRange(Header.offsetRange("udecl")))));
+                        DefRange(Header.range("udecl")))));
 }

 TEST_F(SymbolCollectorTest, ClassForwardDeclarationIsCanonical) {
   CollectorOpts.CollectIncludePath = true;
   runSymbolCollector(/*Header=*/"class X;", /*Main=*/"class X {};");
   EXPECT_THAT(Symbols, UnorderedElementsAre(AllOf(
                            QName("X"), DeclURI(TestHeaderURI),
                            IncludeHeader(TestHeaderURI), DefURI(TestFileURI))));
 }

 } // namespace
 } // namespace clangd
 } // namespace clang