Diff 250885

clang-tools-extra/clangd/XRefs.cpp

[...]
bool tokenSurvivedPreprocessing(SourceLocation Loc,		bool tokenSurvivedPreprocessing(SourceLocation Loc,
const syntax::TokenBuffer &TB) {		const syntax::TokenBuffer &TB) {
auto WordExpandedTokens =		auto WordExpandedTokens =
TB.expandedTokens(TB.sourceManager().getMacroArgExpandedLocation(Loc));		TB.expandedTokens(TB.sourceManager().getMacroArgExpandedLocation(Loc));
return !WordExpandedTokens.empty();		return !WordExpandedTokens.empty();
}		}

} // namespace		} // namespace

std::vector<LocatedSymbol>		std::vector<LocatedSymbol>
		sammccall: I don't think we need this function, which just maps one enum onto another - just use the token kind directly?
locateSymbolNamedTextuallyAt(ParsedAST &AST, const SymbolIndex *Index,		locateSymbolNamedTextuallyAt(ParsedAST &AST, const SymbolIndex *Index,
SourceLocation Loc,		SourceLocation Loc,
const std::string &MainFilePath) {		const std::string &MainFilePath) {
const auto &SM = AST.getSourceManager();		const auto &SM = AST.getSourceManager();

		auto Tokens = syntax::spelledTokensTouching(Loc, AST.getTokens());
		kadircet: you can rather use `AST.getTokens().spelledTokenAt(Loc)` to get preprocessed token, and pass that into getTokenFlavor rather than a SourceLocation.
		sammccall: for a bit more context: running Lexer::getRawToken runs a raw lex that only sees the piece of text you point at. If you're inside a huge comment, it won't know that. Using AST.getTokens() uses the results of an earlier raw lex of the whole file.
		nridge (author): Hmm, I've tried this and `spelledTokenAt()` seems to return null for comment tokens.
		nridge (author): It looks like there are two reasons for this:
		* The lexer that is producing the token buffer is using `inKeepCommentMode() == false`
		* `spelledTokenAt()` only returns a result if you give it the location at the beginning of the token, not something in the middle
		if (Tokens.size() != 1)
		sammccall: this means you're not going to resolve `foo` in `a.^foo` (you're touching two tokens). The cleanest thing seems to be to use the word you've identified: iterate over the `spelledTokensTouching(WordStart)` and accept the one where `tok.range(SM).touches(WordOffset + Word.size())`
		return {};
		syntax::Token Tok = Tokens[0];

		// Only consider comment and identifier tokens.
		if (!(Tok.kind() == tok::TokenKind::comment ||
		Tok.kind() == tok::TokenKind::identifier))
		return {};

// Get the raw word at the specified location.		// Get the raw word at the specified location.
unsigned Pos;		unsigned Pos;
FileID File;		FileID File;
std::tie(File, Pos) = SM.getDecomposedLoc(Loc);		std::tie(File, Pos) = SM.getDecomposedLoc(Loc);
llvm::StringRef Code = SM.getBufferData(File);		llvm::StringRef Code = SM.getBufferData(File);
llvm::StringRef Word = wordTouching(Code, Pos);		llvm::StringRef Word = wordTouching(Code, Pos);
if (Word.empty())		if (Word.empty())
return {};		return {};
unsigned WordOffset = Word.data() - Code.data();		unsigned WordOffset = Word.data() - Code.data();
SourceLocation WordStart = SM.getComposedLoc(File, WordOffset);		SourceLocation WordStart = SM.getComposedLoc(File, WordOffset);

// Do not consider tokens that survived preprocessing.		// If this is an identifier token, do not consider it if it survived
// We are erring on the safe side here, as a user may expect to get		// preprocessing. We are erring on the safe side here, as a user may expect to
// accurate (as opposed to textual-heuristic) results for such tokens.		// get accurate (as opposed to textual-heuristic) results for such tokens.
// FIXME: Relax this for dependent code.		// FIXME: Relax this for dependent code.
if (tokenSurvivedPreprocessing(WordStart, AST.getTokens()))		if (Tok.kind() == tok::TokenKind::identifier &&
		nridge (author): Whoops, meant to use `WordStart` rather than `Loc` here.
		tokenSurvivedPreprocessing(WordStart, AST.getTokens())) {
return {};		return {};
		}

// Additionally filter for signals that the word is likely to be an		// Additionally filter for signals that the word is likely to be an
// identifier. This avoids triggering on e.g. random words in a comment.		// identifier. This avoids triggering on e.g. random words in a comment.
if (!isLikelyToBeIdentifier(Word))		if (!isLikelyToBeIdentifier(Word))
return {};		return {};

// Look up the selected word in the index.		// Look up the selected word in the index.
FuzzyFindRequest Req;		FuzzyFindRequest Req;
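
The token-selection rule suggested in the inline comment on `Tokens.size() != 1` can be sketched in isolation. This is a minimal model, not clangd code: `ToyToken`, `countTouching`, and `tokenForWord` are hypothetical names, and `touches` is assumed to be inclusive at both ends of the token range, mirroring the behaviour the comment relies on.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a spelled token: byte offsets [Begin, End) plus text.
struct ToyToken {
  unsigned Begin;
  unsigned End;
  std::string Text;
  // Inclusive at both ends, so a cursor sitting right after a token still
  // "touches" it. This is why `a.^foo` touches both `.` and `foo`.
  bool touches(unsigned Offset) const {
    return Begin <= Offset && Offset <= End;
  }
};

// Counts the tokens touching Offset (two when the offset is on a boundary
// between adjacent tokens).
unsigned countTouching(const std::vector<ToyToken> &Tokens, unsigned Offset) {
  unsigned N = 0;
  for (const ToyToken &T : Tokens)
    if (T.touches(Offset))
      ++N;
  return N;
}

// Among the tokens touching the start of the word, accepts the one that also
// touches the end of the word. Returns nullptr if no token matches.
const ToyToken *tokenForWord(const std::vector<ToyToken> &Tokens,
                             unsigned WordOffset, unsigned WordSize) {
  for (const ToyToken &T : Tokens)
    if (T.touches(WordOffset) && T.touches(WordOffset + WordSize))
      return &T;
  return nullptr;
}
```

For the text `a.foo` with tokens `a` at [0,1), `.` at [1,2) and `foo` at [2,5), offset 2 touches two tokens, so a "exactly one token" check bails out, while `tokenForWord(Tokens, 2, 3)` selects `foo`.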
[...]

clang-tools-extra/clangd/unittests/XRefsTests.cpp

[...]

TEST(LocateSymbol, Textual) {		TEST(LocateSymbol, Textual) {
const char *Tests[] = {		const char *Tests[] = {
R"cpp(// Comment		R"cpp(// Comment
struct [[MyClass]] {};		struct [[MyClass]] {};
// Comment mentioning M^yClass		// Comment mentioning M^yClass
)cpp",		)cpp",
R"cpp(// String		R"cpp(// String
struct [[MyClass]] {};		struct MyClass {};
		// Not triggered for string literal tokens.
const char* s = "String literal mentioning M^yClass";		const char* s = "String literal mentioning M^yClass";
)cpp",		)cpp",
R"cpp(// Ifdef'ed out code		R"cpp(// Ifdef'ed out code
struct [[MyClass]] {};		struct [[MyClass]] {};
#ifdef WALDO		#ifdef WALDO
M^yClass var;		M^yClass var;
#endif		#endif
)cpp",		)cpp",
[... 35 lines not shown; context: for (const char *Test : Tests) {]

if (!WantDecl) {		if (!WantDecl) {
EXPECT_THAT(Results, IsEmpty()) << Test;		EXPECT_THAT(Results, IsEmpty()) << Test;
} else {		} else {
ASSERT_THAT(Results, ::testing::SizeIs(1)) << Test;		ASSERT_THAT(Results, ::testing::SizeIs(1)) << Test;
EXPECT_EQ(Results[0].PreferredDeclaration.range, *WantDecl) << Test;		EXPECT_EQ(Results[0].PreferredDeclaration.range, *WantDecl) << Test;
}		}
}		}
}		} // namespace

TEST(LocateSymbol, Ambiguous) {		TEST(LocateSymbol, Ambiguous) {
auto T = Annotations(R"cpp(		auto T = Annotations(R"cpp(
struct Foo {		struct Foo {
Foo();		Foo();
Foo(Foo&&);		Foo(Foo&&);
$ConstructorLoc[[Foo]](const char*);		$ConstructorLoc[[Foo]](const char*);
};		};
[...]

clang/lib/Tooling/Syntax/Tokens.cpp

[...] context: std::vector<syntax::Token> syntax::tokenize(const FileRange &FR,
};		};

auto SrcBuffer = SM.getBufferData(FR.file());		auto SrcBuffer = SM.getBufferData(FR.file());
Lexer L(SM.getLocForStartOfFile(FR.file()), LO, SrcBuffer.data(),		Lexer L(SM.getLocForStartOfFile(FR.file()), LO, SrcBuffer.data(),
SrcBuffer.data() + FR.beginOffset(),		SrcBuffer.data() + FR.beginOffset(),
// We can't make BufEnd point to FR.endOffset, as Lexer requires a		// We can't make BufEnd point to FR.endOffset, as Lexer requires a
// null terminated buffer.		// null terminated buffer.
SrcBuffer.data() + SrcBuffer.size());		SrcBuffer.data() + SrcBuffer.size());
		L.SetCommentRetentionState(true);
		sammccall: Yikes, I didn't remember TokenBuffer doesn't currently record comment tokens.
		I'm afraid this isn't a trivial change and would definitely need tests to verify it doesn't interfere with translating between spelled/expanded tokens (I'm pretty sure there are tests that comments aren't retained now in TokensTest.cpp, part of SyntaxTests).
		Given that, the shorter route for this patch would be to blacklist string literals rather than whitelisting comments + identifiers.

clang::Token T;		clang::Token T;
while (!L.LexFromRawLexer(T) && L.getCurrentBufferOffset() < FR.endOffset())		while (!L.LexFromRawLexer(T) && L.getCurrentBufferOffset() < FR.endOffset())
AddToken(T);		AddToken(T);
// LexFromRawLexer returns true when it parses the last token of the file, add		// LexFromRawLexer returns true when it parses the last token of the file, add
// it iff it starts within the range we are interested in.		// it iff it starts within the range we are interested in.
if (SM.getFileOffset(T.getLocation()) < FR.endOffset())		if (SM.getFileOffset(T.getLocation()) < FR.endOffset())
AddToken(T);		AddToken(T);
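
The trade-off raised in the comment on `SetCommentRetentionState` can be illustrated with a toy model. This is a sketch under stated assumptions, not the patch itself: `ToyKind`, `ToyTok`, `tokenAt`, `allowedByAllowList`, and `allowedByDenyList` are hypothetical names, and the token list is assumed to contain no comment tokens, as the discussion says of the current TokenBuffer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical token kinds for the model.
enum class ToyKind { Identifier, StringLiteral, Comment, Other };

// Hypothetical token: byte offsets [Begin, End) and a kind.
struct ToyTok {
  unsigned Begin;
  unsigned End;
  ToyKind Kind;
};

// Returns the token covering Offset, or nullptr if the buffer has none there
// (for example inside a comment when comments are not retained).
const ToyTok *tokenAt(const std::vector<ToyTok> &Toks, unsigned Offset) {
  for (const ToyTok &T : Toks)
    if (T.Begin <= Offset && Offset < T.End)
      return &T;
  return nullptr;
}

// Allow-list policy: proceed only for comment or identifier tokens.
// With no comment tokens in the buffer, words in comments are rejected.
bool allowedByAllowList(const std::vector<ToyTok> &Toks, unsigned Offset) {
  const ToyTok *T = tokenAt(Toks, Offset);
  return T && (T->Kind == ToyKind::Comment || T->Kind == ToyKind::Identifier);
}

// Deny-list policy: proceed unless the offset is inside a string literal.
// A missing token (such as an unrecorded comment) is still accepted.
bool allowedByDenyList(const std::vector<ToyTok> &Toks, unsigned Offset) {
  const ToyTok *T = tokenAt(Toks, Offset);
  return !(T && T->Kind == ToyKind::StringLiteral);
}
```

In this model the deny-list needs no change to how the buffer is lexed: an offset inside a comment has no token and is accepted, while an offset inside a string literal is rejected by both policies.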
[...]

This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Do not trigger go-to-def textual fallback inside string literals
Closed, Public
