This is an archive of the discontinued LLVM Phabricator instance.

Differential D70512

[clangd] Rethink how SelectionTree deals with macros and #includes.
ClosedPublic

Authored by sammccall on Nov 20 2019, 2:26 PM.

Details

Reviewers

hokein

Commits

rG19daa21f841a: [clangd] Rethink how SelectionTree deals with macros and #includes.

Summary

The exclusive-claim model is successful at resolving conflicts over tokens
between parent/child or siblings. However it's not the right model for selecting
AST nodes produced by a macro expansion, which can produce several independent
nodes that are equally associated with the macro invocation.
Additionally, any model that only uses the endpoints in a range can fail when
a macro invocation occurs inside the node.

To address this, we use the existing TokenBuffer in more depth. SourceRanges
are translated into an array of expanded tokens we can iterate over, and in
principle process token by token (in practice, batching related tokens helps).

For regular tokens (and macro-arg expansions) we claim the tokens as before,
but for tokens from macro bodies we merely check whether the macro name was
selected. Thus tokens in macro bodies are selected by selecting the macro name.
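The difference between claiming and peeking can be sketched as follows. This is a minimal illustration, not the patch's code: `TokenClaims` is a hypothetical, simplified stand-in for `SelectedTokens`, tokens are identified by index rather than by file offset, and selectedness is reduced to a boolean.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the patch's SelectedTokens class:
// each main-file token is either selected or not, and may be claimed once.
class TokenClaims {
public:
  explicit TokenClaims(std::vector<bool> Sel)
      : Selected(std::move(Sel)), Claimed(Selected.size(), false) {}

  // Exclusive: only the first caller to claim a selected token gets a hit.
  // This is the behaviour wanted for regular tokens and macro arguments.
  bool claim(std::size_t I) {
    assert(I < Selected.size());
    if (Claimed[I])
      return false;
    Claimed[I] = true;
    return Selected[I];
  }

  // Non-exclusive: any number of callers may observe the same token.
  // This is the behaviour wanted for a macro name, since several AST nodes
  // may come from one macro body.
  bool peek(std::size_t I) const {
    assert(I < Selected.size());
    return Selected[I];
  }

private:
  std::vector<bool> Selected;
  std::vector<bool> Claimed;
};
```

With this shape, two sibling nodes produced by the same macro body can both peek at the macro name and both count as selected, whereas a parent and child competing for an ordinary token still resolve to a single owner.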

The aggregation of Selection is now more principled as we need to be able to
call claim()/peek() an arbitrary number of times.

One side-effect of iterating over the expanded tokens is that (usually) nothing
claims preprocessor tokens like macro names and directives.
Rather than fixing this I just left them unclaimed, and removed support for
determining the selectedness of TUDecl.
(That was originally implemented in 90a5bf92ff97b1, but doesn't seem to be very
important or worth the complexity any longer).

The expandedTokens(SourceLocation) helper could be added locally, but seems to
make sense on TokenBuffer.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sammccall created this revision. Nov 20 2019, 2:26 PM

Herald added a project: Restricted Project. Nov 20 2019, 2:26 PM

Herald added subscribers: cfe-commits, usaxena95, kadircet and 3 others.

This fixes:

(I've amended the commit message)

Build result: pass - 60204 tests passed, 0 failed and 732 were skipped.
Log files: console-log.txt, CMakeCache.txt

Harbormaster completed remote builds in B41271: Diff 230336. Nov 20 2019, 3:02 PM

hokein added inline comments. Nov 25 2019, 4:01 AM

clang-tools-extra/clangd/Selection.cpp
49	I might be missing something, I don't understand why we set Partial if `Result != New`.
446	what's S for this case?
464	nit: maybe use [early exits](https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code) here?
496	Not sure what the rational behavior is, but I think for the following case we just have the TUDecl in the selection tree; maybe use the whole macro range?
	#define M() 123
	int d = M(^); // now we only have the TUDecl in the selection tree.
clang-tools-extra/clangd/unittests/SelectionTests.cpp
139	could we have a testcase to cover the "tokens expanded from another #include file" code path?
clang/lib/Tooling/Syntax/Tokens.cpp
125	nit: for code readability, I'd use `llvm::ArrayRef<syntax::Token>::iterator` type here.
126	I think `partition_point` requires that `ExpandedTokens` is partitioned, but I didn't find any documentation of this guarantee in the code. It would be nice to have this in a comment (probably around `ExpandedTokens`).
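
For context on the `partition_point` comment: the binary search is only valid if every element satisfying the predicate precedes every element that does not, which holds when tokens are sorted by offset. A minimal sketch using `std::partition_point` over plain offsets (the function `tokensInRange` and the use of bare `unsigned` offsets are assumptions for illustration, not the `TokenBuffer` API):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the half-open index range of tokens whose start offsets lie in
// [First, Last]. Offsets must be sorted; that is what makes the range
// partitioned with respect to both predicates below.
std::pair<std::size_t, std::size_t>
tokensInRange(const std::vector<unsigned> &Offsets, unsigned First,
              unsigned Last) {
  assert(std::is_sorted(Offsets.begin(), Offsets.end()) &&
         "partition_point needs a partitioned (here: sorted) range");
  auto B = std::partition_point(Offsets.begin(), Offsets.end(),
                                [&](unsigned O) { return O < First; });
  auto E = std::partition_point(B, Offsets.end(),
                                [&](unsigned O) { return O <= Last; });
  return std::make_pair(static_cast<std::size_t>(B - Offsets.begin()),
                        static_cast<std::size_t>(E - Offsets.begin()));
}
```

If the input were not sorted, the result would be unspecified rather than an error, which is why documenting the invariant next to the data is useful.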

Address review comments.

Build result: fail - 60299 tests passed, 1 failed and 732 were skipped.

failed: Clangd Unit Tests._/ClangdTests/SelectionTest.PathologicalPreprocessor

Log files: console-log.txt, CMakeCache.txt

Harbormaster failed remote builds in B41463: Diff 230948! Nov 25 2019, 11:41 AM

No, this patch is busted, and the tests were too simple to catch it.

The issue is that with no exclusivity check, given `{{{ MACRO }}}`, all the enclosing blocks get to count themselves selected.

We need an exclusivity check on the expanded token stream first. This will yield a sequence of slices of tokens not yet claimed. Then each slice gets partitioned by FileID, each partition gets mapped to spelled tokens and checked as in this patch.

I suspect the exclusivity/claiming at the spelled token level is no longer needed.
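
The proposed scheme (exclusive claiming on the expanded token stream, then partitioning each unclaimed slice by FileID) might look roughly like this. It is a sketch under stated assumptions: `Tok`, `Batch`, and `claimBatches` are hypothetical names, FileIDs are plain integers, and the mapping of each batch back to spelled tokens is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expanded token.
struct Tok {
  int FileID;   // which file or macro expansion produced the token
  bool Claimed; // already owned by some AST node?
};

using Batch = std::pair<std::size_t, std::size_t>; // half-open [first, second)

// Claims every not-yet-claimed token in [Begin, End) and returns the claimed
// tokens as maximal runs that share a FileID. Tokens that an earlier caller
// (e.g. a child node) claimed are skipped, which restores exclusivity.
std::vector<Batch> claimBatches(std::vector<Tok> &Toks, std::size_t Begin,
                                std::size_t End) {
  assert(Begin <= End && End <= Toks.size());
  std::vector<Batch> Batches;
  std::size_t I = Begin;
  while (I < End) {
    if (Toks[I].Claimed) {
      ++I;
      continue;
    }
    const int FID = Toks[I].FileID;
    std::size_t J = I;
    while (J < End && !Toks[J].Claimed && Toks[J].FileID == FID)
      Toks[J++].Claimed = true;
    Batches.emplace_back(I, J);
    I = J;
  }
  return Batches;
}
```

In the `{{{ MACRO }}}` situation, the innermost node claims the macro's expanded tokens first, so each enclosing block sees only its own braces when it later asks for the same range.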

clang-tools-extra/clangd/Selection.cpp
49	Added a brief comment and an assertion. Intuitive explanation: Consider `Complete` = black, `Unselected` = white, `Partial` = grey. White + White -> white, Black + Black -> black, every other mixture is grey. (But honestly I wrote up the state transition table first and then extracted the logic by staring at it)
446	S is the function parameter, the implication is that it's the range of the vardecl in the example. Expanded the comment a bit.
496	This is definitely better, but isn't trivial to do, and isn't a very important case. Added a FIXME rather than clutter/delay this patch with it.
clang/lib/Tooling/Syntax/Tokens.cpp
125	I think that hurts readability on two counts: (1) it obscures the actual type: a pointer is concrete and familiar, so it's easier to realize that e.g. `>` is well-defined here; (2) it makes it harder to understand `return {Begin, End}`, which relies on the fact that the actual type here is Token*.
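
The black/white/grey explanation given for line 49 can be written out as a small standalone sketch. It mirrors the `update()` helper and `NoTokens` sentinel from the diff, but `Sel` and `aggregate` are illustrative names introduced here, and the assertion is only a guess at the kind of check the author mentions adding.

```cpp
#include <cassert>
#include <initializer_list>

// Stand-in for SelectionTree::Selection plus the NoTokens sentinel.
enum class Sel : unsigned char { Unselected, Partial, Complete, NoTokens };

// Folds the selectedness of one more token into the running result.
void update(Sel &Result, Sel New) {
  assert(New != Sel::NoTokens && "a token always has a concrete state");
  if (Result == Sel::NoTokens)
    Result = New; // first token seen: adopt its colour
  else if (Result != New)
    Result = Sel::Partial; // any mixture of different colours is grey
  // Otherwise Result == New: white + white or black + black, unchanged.
}

// Convenience wrapper: aggregates a whole sequence of token states, folding
// "no tokens at all" into Unselected as pop() does in the diff.
Sel aggregate(std::initializer_list<Sel> Tokens) {
  Sel Result = Sel::NoTokens;
  for (Sel S : Tokens)
    update(Result, S);
  return Result == Sel::NoTokens ? Sel::Unselected : Result;
}
```

Because the fold only compares the running state with the new one, it can be applied any number of times in any order, which is what lets `claim()` and `peek()` be called repeatedly for one node.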

Rewrote patch with a better approach (claiming expanded tokens rather than spelled).
Added more tests, including one showing a problem with multiple arg expansion.

Build result: pass - 60302 tests passed, 0 failed and 732 were skipped.

Log files: console-log.txt, CMakeCache.txt

Harbormaster completed remote builds in B41549: Diff 231206. Nov 27 2019, 3:36 AM

Thanks, the patch looks good! Please also update the patch description.

clang-tools-extra/clangd/Selection.cpp
81	nit: this comment no longer seems to reflect the code.
144	nit: remove the redundant start.
clang-tools-extra/clangd/unittests/SelectionTests.cpp
467	I think cases like the ones below are broken:
	#define greater(x, y) x > y? x : y
	#define abs(x) x > 0 ? x : -x
	Selecting the first element for a macro arg seems good to me.

This revision is now accepted and ready to land. Nov 27 2019, 7:05 AM

Closed by commit rG19daa21f841a: [clangd] Rethink how SelectionTree deals with macros and #includes. (authored by sammccall). Nov 29 2019, 6:26 AM

This revision was automatically updated to reflect the committed changes.

sammccall marked 2 inline comments as done.

hover.test is failing on Mac: http://45.33.8.238/mac/3308/step_7.txt

Please take a look, and revert if it takes a while.

Reverted as 905b002c139f039a32ab9bf1fad63d745d12423f

@thakis any information available about that bot configuration?
I'm not seeing any failures on "official" ones, I can't reproduce locally, and the failures don't make a lot of sense to me (lots of off-by-ones everywhere maybe?)

That particular bot does GN builds, but these test failures repro in the cmake build for me on both macs I tried. (Remember that the official mac bots are on greendragon, not on buildbot -- I'd guess it shows there too.)

Mac is happy now, but it fails to build on Windows: http://45.33.8.238/win/3327/step_4.txt

Revision Contents

Path

Size

clang-tools-extra/

clangd/

Selection.h

2 lines

Selection.cpp

137 lines

unittests/

SelectionTests.cpp

15 lines

TweakTests.cpp

26 lines

clang/

include/

clang/

Tooling/

Syntax/

Tokens.h

4 lines

lib/

Tooling/

Syntax/

Tokens.cpp

16 lines

unittests/

Tooling/

Syntax/

TokensTest.cpp

15 lines

Diff 230336

clang-tools-extra/clangd/Selection.h

(70 lines not shown)	SelectionTree(ASTContext &AST, const syntax::TokenBuffer &Tokens,
unsigned Offset);		unsigned Offset);
// Creates a selection tree for the given range in the main file.		// Creates a selection tree for the given range in the main file.
// The range includes bytes [Start, End).		// The range includes bytes [Start, End).
// If Start == End, uses the same heuristics as SelectionTree(AST, Start).		// If Start == End, uses the same heuristics as SelectionTree(AST, Start).
SelectionTree(ASTContext &AST, const syntax::TokenBuffer &Tokens,		SelectionTree(ASTContext &AST, const syntax::TokenBuffer &Tokens,
unsigned Start, unsigned End);		unsigned Start, unsigned End);

// Describes to what extent an AST node is covered by the selection.		// Describes to what extent an AST node is covered by the selection.
enum Selection {		enum Selection : unsigned char {
// The AST node owns no characters covered by the selection.		// The AST node owns no characters covered by the selection.
// Note that characters owned by children don't count:		// Note that characters owned by children don't count:
// if (x == 0) scream();		// if (x == 0) scream();
// ^^^^^^		// ^^^^^^
// The IfStmt would be Unselected because all the selected characters are		// The IfStmt would be Unselected because all the selected characters are
// associated with its children.		// associated with its children.
// (Invisible nodes like ImplicitCastExpr are always unselected).		// (Invisible nodes like ImplicitCastExpr are always unselected).
Unselected,		Unselected,
(51 lines not shown)

clang-tools-extra/clangd/Selection.cpp

(28 lines not shown)
#include <string>		#include <string>

namespace clang {		namespace clang {
namespace clangd {		namespace clangd {
namespace {		namespace {
using Node = SelectionTree::Node;		using Node = SelectionTree::Node;
using ast_type_traits::DynTypedNode;		using ast_type_traits::DynTypedNode;

		// Sentinel value for the selectedness of a node where we've seen no tokens yet.
		// Once we see a token, treated as fully-selected (so far). If we never see a
		// token, it folds into Unselected. These are never exposed publicly,
		constexpr SelectionTree::Selection NoTokens =
		static_cast<SelectionTree::Selection>(
		static_cast<unsigned char>(SelectionTree::Complete + 1));

		// Nodes start start with NoTokens, and then use this function to aggregate
		// the selectedness as more tokens are found.
		void update(SelectionTree::Selection &Result, SelectionTree::Selection New) {
		if (Result == NoTokens)
		Result = New;
		else if (Result != New)
		Result = SelectionTree::Partial;
		}

// Identifies which tokens are selected, and evaluates claims of source ranges		// Identifies which tokens are selected, and evaluates claims of source ranges
// by AST nodes. Tokens may be claimed only once: first-come, first-served.		// by AST nodes. Tokens may be claimed only once: first-come, first-served.
class SelectedTokens {		class SelectedTokens {
public:		public:
SelectedTokens(llvm::ArrayRef<syntax::Token> Spelled, const SourceManager &SM,		SelectedTokens(llvm::ArrayRef<syntax::Token> Spelled, const SourceManager &SM,
unsigned SelBegin, unsigned SelEnd)		unsigned SelBegin, unsigned SelEnd)
: SelBegin(SelBegin), SelEnd(SelEnd) {		: SelBegin(SelBegin), SelEnd(SelEnd) {
// Extract bounds and selected-ness for all tokens spelled in the file.		// Extract bounds and selected-ness for all tokens spelled in the file.
(12 lines not shown)	for (const auto& Tok : Spelled) {
S.Selected = SelectionTree::Complete;		S.Selected = SelectionTree::Complete;
else if (S.EndOffset > SelBegin && S.StartOffset < SelEnd)		else if (S.EndOffset > SelBegin && S.StartOffset < SelEnd)
S.Selected = SelectionTree::Partial;		S.Selected = SelectionTree::Partial;
else		else
S.Selected = SelectionTree::Unselected;		S.Selected = SelectionTree::Unselected;
S.Claimed = false;		S.Claimed = false;
}		}
}		}

// Associates any tokens overlapping [Begin, End) with an AST node.		// Associates any tokens overlapping [Begin, End) with an AST node.
// Tokens that were already claimed by another AST node are not claimed again.		// Tokens that were already claimed by another AST node are not claimed again.
// Updates Result if the node is selected in the sense of SelectionTree.		// Updates Result if the node is selected in the sense of SelectionTree.
void claim(unsigned Begin, unsigned End, SelectionTree::Selection &Result) {		void claim(unsigned Begin, unsigned End, SelectionTree::Selection &Result) {
assert(Begin <= End);		assert(Begin <= End);

// Fast-path for missing the selection entirely.		// Fast-path for missing the selection entirely.
if (Begin >= SelEnd || End <= SelBegin)		if (Begin >= SelEnd || End <= SelBegin)
(23 lines not shown)	for (auto It = Start; It != Tokens.end() && It->StartOffset < End; ++It) {
PartialSelection |= (It->Selected == SelectionTree::Partial);		PartialSelection |= (It->Selected == SelectionTree::Partial);
}		}
} else {		} else {
// If the node covers an unselected token, it's not completely selected.		// If the node covers an unselected token, it's not completely selected.
PartialSelection = true;		PartialSelection = true;
}		}
}		}

// If some tokens were previously claimed (Result != Unselected), we may		if (ClaimedAnyToken)
// upgrade from Partial->Complete, even if no new tokens were claimed.		update(Result, PartialSelection ? SelectionTree::Partial
// Important for [[int a]].
if (ClaimedAnyToken || Result) {
Result = std::max(Result, PartialSelection ? SelectionTree::Partial
: SelectionTree::Complete);		: SelectionTree::Complete);
}		}

		// Checks whether the token at Offset is selected, and updates Result.
		// Does not claim the token.
		void peek(unsigned Offset, SelectionTree::Selection &Result) const {
		// Find the token, if it exists.
		auto I = llvm::partition_point(
		Tokens, [&](const TokInfo &Tok) { return Tok.EndOffset <= Offset; });
		if (I == Tokens.end() || I->StartOffset != Offset)
		return;
		update(Result, I->Selected);
}		}

private:		private:
struct TokInfo {		struct TokInfo {
unsigned StartOffset;		unsigned StartOffset;
unsigned EndOffset;		unsigned EndOffset;
SelectionTree::Selection Selected;		SelectionTree::Selection Selected;
bool Claimed;		bool Claimed;
bool operator<(const TokInfo &Other) const {		bool operator<(const TokInfo &Other) const {
return StartOffset < Other.StartOffset;		return StartOffset < Other.StartOffset;
}		}
};		};
std::vector<TokInfo> Tokens;		std::vector<TokInfo> Tokens;
unsigned SelBegin, SelEnd;		unsigned SelBegin, SelEnd;
};		};

// Show the type of a node for debugging.		// Show the type of a node for debugging.
void printNodeKind(llvm::raw_ostream &OS, const DynTypedNode &N) {		void printNodeKind(llvm::raw_ostream &OS, const DynTypedNode &N) {
(60 lines not shown)	public:
static std::deque<Node> collect(ASTContext &AST,		static std::deque<Node> collect(ASTContext &AST,
const syntax::TokenBuffer &Tokens,		const syntax::TokenBuffer &Tokens,
const PrintingPolicy &PP, unsigned Begin,		const PrintingPolicy &PP, unsigned Begin,
unsigned End, FileID File) {		unsigned End, FileID File) {
SelectionVisitor V(AST, Tokens, PP, Begin, End, File);		SelectionVisitor V(AST, Tokens, PP, Begin, End, File);
V.TraverseAST(AST);		V.TraverseAST(AST);
assert(V.Stack.size() == 1 && "Unpaired push/pop?");		assert(V.Stack.size() == 1 && "Unpaired push/pop?");
assert(V.Stack.top() == &V.Nodes.front());		assert(V.Stack.top() == &V.Nodes.front());
// We selected TUDecl if tokens were unclaimed (or the file is empty).
SelectionTree::Selection UnclaimedTokens = SelectionTree::Unselected;
V.Claimed.claim(Begin, End, UnclaimedTokens);
if (UnclaimedTokens || V.Nodes.size() == 1) {
StringRef FileContent = AST.getSourceManager().getBufferData(File);
// Don't require the trailing newlines to be selected.
bool SelectedAll = Begin == 0 && End >= FileContent.rtrim().size();
V.Stack.top()->Selected =
SelectedAll ? SelectionTree::Complete : SelectionTree::Partial;
}
return std::move(V.Nodes);		return std::move(V.Nodes);
}		}

// We traverse all "well-behaved" nodes the same way:		// We traverse all "well-behaved" nodes the same way:
// - push the node onto the stack		// - push the node onto the stack
// - traverse its children recursively		// - traverse its children recursively
// - pop it from the stack		// - pop it from the stack
// - hit testing: is intersection(node, selection) - union(children) empty?		// - hit testing: is intersection(node, selection) - union(children) empty?
(68 lines not shown)	private:

SelectionVisitor(ASTContext &AST, const syntax::TokenBuffer &Tokens,		SelectionVisitor(ASTContext &AST, const syntax::TokenBuffer &Tokens,
const PrintingPolicy &PP, unsigned SelBegin, unsigned SelEnd,		const PrintingPolicy &PP, unsigned SelBegin, unsigned SelEnd,
FileID SelFile)		FileID SelFile)
: SM(AST.getSourceManager()), LangOpts(AST.getLangOpts()),		: SM(AST.getSourceManager()), LangOpts(AST.getLangOpts()),
#ifndef NDEBUG		#ifndef NDEBUG
PrintPolicy(PP),		PrintPolicy(PP),
#endif		#endif
		Tokens(Tokens),
Claimed(Tokens.spelledTokens(SelFile), SM, SelBegin, SelEnd),		Claimed(Tokens.spelledTokens(SelFile), SM, SelBegin, SelEnd),
SelFile(SelFile),		SelFile(SelFile),
SelBeginTokenStart(SM.getFileOffset(Lexer::GetBeginningOfToken(		SelBeginTokenStart(SM.getFileOffset(Lexer::GetBeginningOfToken(
SM.getComposedLoc(SelFile, SelBegin), SM, LangOpts))),		SM.getComposedLoc(SelFile, SelBegin), SM, LangOpts))),
SelEnd(SelEnd) {		SelEnd(SelEnd) {
// Ensure we have a node for the TU decl, regardless of traversal scope.		// Ensure we have a node for the TU decl, regardless of traversal scope.
Nodes.emplace_back();		Nodes.emplace_back();
Nodes.back().ASTNode = DynTypedNode::create(*AST.getTranslationUnitDecl());		Nodes.back().ASTNode = DynTypedNode::create(*AST.getTranslationUnitDecl());
(72 lines not shown)	#endif
// Pushes a node onto the ancestor stack. Pairs with pop().		// Pushes a node onto the ancestor stack. Pairs with pop().
// Performs early hit detection for some nodes (on the earlySourceRange).		// Performs early hit detection for some nodes (on the earlySourceRange).
void push(DynTypedNode Node) {		void push(DynTypedNode Node) {
SourceRange Early = earlySourceRange(Node);		SourceRange Early = earlySourceRange(Node);
dlog("{1}push: {0}", printNodeToString(Node, PrintPolicy), indent());		dlog("{1}push: {0}", printNodeToString(Node, PrintPolicy), indent());
Nodes.emplace_back();		Nodes.emplace_back();
Nodes.back().ASTNode = std::move(Node);		Nodes.back().ASTNode = std::move(Node);
Nodes.back().Parent = Stack.top();		Nodes.back().Parent = Stack.top();
		Nodes.back().Selected = NoTokens;
Stack.push(&Nodes.back());		Stack.push(&Nodes.back());
claimRange(Early, Nodes.back().Selected);		claimRange(Early, Nodes.back().Selected);
// Early hit detection never selects the whole node.
if (Nodes.back().Selected)
Nodes.back().Selected = SelectionTree::Partial;
}		}

// Pops a node off the ancestor stack, and finalizes it. Pairs with push().		// Pops a node off the ancestor stack, and finalizes it. Pairs with push().
// Performs primary hit detection.		// Performs primary hit detection.
void pop() {		void pop() {
Node &N = *Stack.top();		Node &N = *Stack.top();
dlog("{1}pop: {0}", printNodeToString(N.ASTNode, PrintPolicy), indent(-1));		dlog("{1}pop: {0}", printNodeToString(N.ASTNode, PrintPolicy), indent(-1));
claimRange(N.ASTNode.getSourceRange(), N.Selected);		claimRange(N.ASTNode.getSourceRange(), N.Selected);
		if (N.Selected == NoTokens)
		N.Selected = SelectionTree::Unselected;
if (N.Selected || !N.Children.empty()) {		if (N.Selected || !N.Children.empty()) {
// Attach to the tree.		// Attach to the tree.
N.Parent->Children.push_back(&N);		N.Parent->Children.push_back(&N);
} else {		} else {
// Neither N any children are selected, it doesn't belong in the tree.		// Neither N any children are selected, it doesn't belong in the tree.
assert(&N == &Nodes.back());		assert(&N == &Nodes.back());
Nodes.pop_back();		Nodes.pop_back();
}		}
(20 lines not shown)	#endif

// Perform hit-testing of a complete Node against the selection.		// Perform hit-testing of a complete Node against the selection.
// This runs for every node in the AST, and must be fast in common cases.		// This runs for every node in the AST, and must be fast in common cases.
// This is usually called from pop(), so we can take children into account.		// This is usually called from pop(), so we can take children into account.
// The existing state of Result is relevant (early/late claims can interact).		// The existing state of Result is relevant (early/late claims can interact).
void claimRange(SourceRange S, SelectionTree::Selection &Result) {		void claimRange(SourceRange S, SelectionTree::Selection &Result) {
if (!S.isValid())		if (!S.isValid())
return;		return;
// toHalfOpenFileRange() allows selection of constructs in macro args. e.g:
// #define LOOP_FOREVER(Body) for(;;) { Body }		// We need to iterate over all the expanded tokens that are part of S.
// void IncrementLots(int &x) {		// Consider the macro expansion FLAG(x) -> int x = 0;
// LOOP_FOREVER( ++x; )		// Neither S.getBegin() nor S.getEnd() are arg expansions, but x is.
// }		// The selection FLAG([[x]]) must partially select the VarDecl.
// Selecting "++x" or "x" will do the right thing.		llvm::ArrayRef<syntax::Token> Remaining = Tokens.expandedTokens(S);
auto Range = toHalfOpenFileRange(SM, LangOpts, S);		while (!Remaining.empty()) {
assert(Range && "We should be able to get the File Range");		// Take consecutive tokens from the same context together for efficiency.
dlog("{1}claimRange: {0}", Range->printToString(SM), indent());		FileID FID = SM.getFileID(Remaining.front().location());
auto B = SM.getDecomposedLoc(Range->getBegin());		auto Batch = Remaining.take_while([&](const syntax::Token &T) {
auto E = SM.getDecomposedLoc(Range->getEnd());		return SM.getFileID(T.location()) == FID;
// Otherwise, nodes in macro expansions can't be selected.		});
if (B.first != SelFile || E.first != SelFile)		assert(!Batch.empty());
return;		Remaining = Remaining.drop_front(Batch.size());
// Attempt to claim the remaining range. If there's nothing to claim, only
// children were selected.		// There are several possible categories of FileID depending on how the
Claimed.claim(B.second, E.second, Result);		// preprocessor was used to generate these tokens:
		// main file, #included file, macro args, macro bodies.
		// We need to identify the main-file tokens that represent Batch, and
		// determine whether we want to exclusively claim them. Regular tokens
		// represent one AST construct, but a macro invocation can represent many.
		if (FID == SelFile) {
		// Tokens written directly in the main file.
		// Claim the token exclusively for this node.
		Claimed.claim(SM.getFileOffset(Batch.front().location()),
		SM.getFileOffset(Batch.back().location()) +
		Batch.back().length(),
		Result);
		} else if (Batch.front().location().isFileID()) {
		// Tokens in another file #included into the main file.
		// Check if the #include is selected, but don't claim it exclusively.
		for (SourceLocation Loc = Batch.front().location(); Loc.isValid();
		Loc = SM.getIncludeLoc(SM.getFileID(Loc))) {
		if (SM.getFileID(Loc) == SelFile) {
		Claimed.peek(SM.getFileOffset(Loc), Result);
		break;
		}
		}
		} else {
		SourceLocation ArgStart =
		SM.getTopMacroCallerLoc(Batch.front().location());
		if (SM.getFileID(ArgStart) == SelFile) {
		// Tokens that were passed as a macro argument.
		// Claim the token exclusively for this node.
		// FIXME: this prevents selecting both occurrences of args used twice.
		SourceLocation ArgEnd =
		SM.getTopMacroCallerLoc(Batch.back().location());
		Claimed.claim(SM.getFileOffset(ArgStart),
		SM.getFileOffset(ArgEnd) + Batch.back().length(),
		Result);
		} else {
		// A non-argument macro expansion.
		// Check if the macro name is selected, don't claim it exclusively.
		auto Expansion = SM.getDecomposedExpansionLoc(S.getBegin());
		if (Expansion.first == SelFile)
		Claimed.peek(Expansion.second, Result);
		}
		}
		}
if (Result)		if (Result)
dlog("{1}hit selection: {0}",		dlog("{1}hit selection: {0}", S.printToString(SM), indent());
SourceRange(SM.getComposedLoc(B.first, B.second),
SM.getComposedLoc(E.first, E.second))
.printToString(SM),
indent());
}		}

std::string indent(int Offset = 0) {		std::string indent(int Offset = 0) {
// Cast for signed arithmetic.		// Cast for signed arithmetic.
int Amount = int(Stack.size()) + Offset;		int Amount = int(Stack.size()) + Offset;
assert(Amount >= 0);		assert(Amount >= 0);
return std::string(Amount, ' ');		return std::string(Amount, ' ');
}		}

SourceManager &SM;		SourceManager &SM;
const LangOptions &LangOpts;		const LangOptions &LangOpts;
#ifndef NDEBUG		#ifndef NDEBUG
const PrintingPolicy &PrintPolicy;		const PrintingPolicy &PrintPolicy;
#endif		#endif
std::stack<Node *> Stack;		std::stack<Node *> Stack;
		const syntax::TokenBuffer &Tokens;
SelectedTokens Claimed;		SelectedTokens Claimed;
std::deque<Node> Nodes; // Stable pointers as we add more nodes.		std::deque<Node> Nodes; // Stable pointers as we add more nodes.
FileID SelFile;		FileID SelFile;
// If the selection start slices a token in half, the beginning of that token.		// If the selection start slices a token in half, the beginning of that token.
// This is useful for checking whether the end of a token range overlaps		// This is useful for checking whether the end of a token range overlaps
// the selection: range.end < SelBeginTokenStart is equivalent to		// the selection: range.end < SelBeginTokenStart is equivalent to
// range.end + measureToken(range.end) < SelBegin (assuming range.end points		// range.end + measureToken(range.end) < SelBegin (assuming range.end points
// to a token), and it saves a lex every time.		// to a token), and it saves a lex every time.
(109 lines not shown)

clang-tools-extra/clangd/unittests/SelectionTests.cpp

(130 lines not shown)	Case Cases[] = {
{		{
R"cpp(		R"cpp(
void foo() { [[if (1^11) { return; } else {^ }]] }		void foo() { [[if (1^11) { return; } else {^ }]] }
)cpp",		)cpp",
"IfStmt",		"IfStmt",
},		},
{		{
R"cpp(		R"cpp(
		int x(int);
		#define M(foo) x(foo)
		int a = 42;
		int b = M([[^a]]);
		)cpp",
		"DeclRefExpr",
		},
		{
		R"cpp(
void foo();		void foo();
#define CALL_FUNCTION(X) X()		#define CALL_FUNCTION(X) X()
void bar() { CALL_FUNCTION([[f^o^o]]); }		void bar() { CALL_FUNCTION([[f^o^o]]); }
)cpp",		)cpp",
"DeclRefExpr",		"DeclRefExpr",
},		},
{		{
R"cpp(		R"cpp(
(234 lines not shown)	const char *Cases[] = {
}		}
)cpp",		)cpp",
R"cpp(		R"cpp(
template <class T>		template <class T>
struct unique_ptr {};		struct unique_ptr {};
void foo(^$C[[unique_ptr<$C[[unique_ptr<$C[[int]]>]]>]]^ a) {}		void foo(^$C[[unique_ptr<$C[[unique_ptr<$C[[int]]>]]>]]^ a) {}
)cpp",		)cpp",
R"cpp(int a = [[5 >^> 1]];)cpp",		R"cpp(int a = [[5 >^> 1]];)cpp",
R"cpp([[		R"cpp(
#define ECHO(X) X		#define ECHO(X) X
ECHO(EC^HO([[$C[[int]]) EC^HO(a]]));		ECHO(EC^HO($C[[int]]) EC^HO(a));
]])cpp",		)cpp",
R"cpp( $C[[^$C[[int]] a^]]; )cpp",		R"cpp( $C[[^$C[[int]] a^]]; )cpp",
R"cpp( $C[[^$C[[int]] a = $C[[5]]^]]; )cpp",		R"cpp( $C[[^$C[[int]] a = $C[[5]]^]]; )cpp",
};		};
for (const char *C : Cases) {		for (const char *C : Cases) {
Annotations Test(C);		Annotations Test(C);
auto AST = TestTU::withCode(Test.code()).build();		auto AST = TestTU::withCode(Test.code()).build();
auto T = makeSelectionTree(C, AST);		auto T = makeSelectionTree(C, AST);

(43 lines not shown)	TEST(SelectionTest, Implicit) {
EXPECT_EQ(Str, &Str->Parent->Parent->ignoreImplicit())		EXPECT_EQ(Str, &Str->Parent->Parent->ignoreImplicit())
<< "Didn't unwrap " << nodeKind(&Str->Parent->Parent->ignoreImplicit());		<< "Didn't unwrap " << nodeKind(&Str->Parent->Parent->ignoreImplicit());

EXPECT_EQ("CXXConstructExpr", nodeKind(&Str->outerImplicit()));		EXPECT_EQ("CXXConstructExpr", nodeKind(&Str->outerImplicit()));
}		}

} // namespace		} // namespace
} // namespace clangd		} // namespace clangd
} // namespace clang		} // namespace clang

clang-tools-extra/clangd/unittests/TweakTests.cpp

(263 lines not shown)	void f(int b = [[1]]) {
goto label;		goto label;
label:		label:
a = [[1]];		a = [[1]];
}		}
)cpp";		)cpp";
EXPECT_UNAVAILABLE(UnavailableCases);		EXPECT_UNAVAILABLE(UnavailableCases);

// vector of pairs of input and output strings		// vector of pairs of input and output strings
const std::vector<std::pair<llvm::StringLiteral, llvm::StringLiteral>>		const std::vector<std::pair<std::string, std::string>>
InputOutputs = {		InputOutputs = {
// extraction from variable declaration/assignment		// extraction from variable declaration/assignment
{R"cpp(void varDecl() {		{R"cpp(void varDecl() {
int a = 5 * (4 + (3 [[- 1)]]);		int a = 5 * (4 + (3 [[- 1)]]);
})cpp",		})cpp",
R"cpp(void varDecl() {		R"cpp(void varDecl() {
auto dummy = (3 - 1); int a = 5 * (4 + dummy);		auto dummy = (3 - 1); int a = 5 * (4 + dummy);
})cpp"},		})cpp"},
Show All 35 Lines	const std::vector<std::pair<std::string, std::string>>
auto dummy = PLUS(1+a); int y = dummy;		auto dummy = PLUS(1+a); int y = dummy;
})cpp"},		})cpp"},
// ensure InsertionPoint isn't inside a macro		// ensure InsertionPoint isn't inside a macro
{R"cpp(#define LOOP(x) while (1) {a = x;}		{R"cpp(#define LOOP(x) while (1) {a = x;}
void f(int a) {		void f(int a) {
if(1)		if(1)
LOOP(5 + [[3]])		LOOP(5 + [[3]])
})cpp",		})cpp",
/*FIXME: It should be extracted like this. SelectionTree needs to be
* fixed for macros.
R"cpp(#define LOOP(x) while (1) {a = x;}		R"cpp(#define LOOP(x) while (1) {a = x;}
void f(int a) {		void f(int a) {
auto dummy = 3; if(1)		auto dummy = 3; if(1)
LOOP(5 + dummy)		LOOP(5 + dummy)
})cpp"},*/
R"cpp(#define LOOP(x) while (1) {a = x;}
void f(int a) {
auto dummy = LOOP(5 + 3); if(1)
dummy
})cpp"},		})cpp"},
{R"cpp(#define LOOP(x) do {x;} while(1);		{R"cpp(#define LOOP(x) do {x;} while(1);
void f(int a) {		void f(int a) {
if(1)		if(1)
LOOP(5 + [[3]])		LOOP(5 + [[3]])
})cpp",		})cpp",
R"cpp(#define LOOP(x) do {x;} while(1);		R"cpp(#define LOOP(x) do {x;} while(1);
void f(int a) {		void f(int a) {
▲ Show 20 Lines • Show All 296 Lines • ▼ Show 20 Lines	void f(const int c) {
std::string TemplateFailInput = R"cpp(		std::string TemplateFailInput = R"cpp(
template<typename T>		template<typename T>
void f() {		void f() {
[[int x;]]		[[int x;]]
}		}
)cpp";		)cpp";
EXPECT_EQ(apply(TemplateFailInput), "unavailable");		EXPECT_EQ(apply(TemplateFailInput), "unavailable");

// FIXME: This should be extractable after selectionTree works correctly for		std::string MacroInput = R"cpp(
// macros (currently it doesn't select anything for the following case)
std::string MacroFailInput = R"cpp(
#define F(BODY) void f() { BODY }		#define F(BODY) void f() { BODY }
F ([[int x = 0;]])		F ([[int x = 0;]])
)cpp";		)cpp";
EXPECT_EQ(apply(MacroFailInput), "unavailable");		std::string MacroOutput = R"cpp(
		#define F(BODY) void f() { BODY }
		void extracted() {
		int x = 0;
		}
		F (extracted();)
		)cpp";
		EXPECT_EQ(apply(MacroInput), MacroOutput);

// Shouldn't crash.		// Shouldn't crash.
EXPECT_EQ(apply("void f([[int a]]);"), "unavailable");		EXPECT_EQ(apply("void f([[int a]]);"), "unavailable");
// Don't extract if we select the entire function body (CompoundStmt).		// Don't extract if we select the entire function body (CompoundStmt).
std::string CompoundFailInput = R"cpp(		std::string CompoundFailInput = R"cpp(
void f() [[{		void f() [[{
int a;		int a;
}]]		}]]

clang/include/clang/Tooling/Syntax/Tokens.h

Show First 20 Lines • Show All 176 Lines • ▼ Show 20 Lines	public:
/// point to one of these tokens.		/// point to one of these tokens.
/// FIXME: figure out how to handle token splitting, e.g. '>>' can be split		/// FIXME: figure out how to handle token splitting, e.g. '>>' can be split
/// into two '>' tokens by the parser. However, TokenBuffer currently		/// into two '>' tokens by the parser. However, TokenBuffer currently
/// keeps it as a single '>>' token.		/// keeps it as a single '>>' token.
llvm::ArrayRef<syntax::Token> expandedTokens() const {		llvm::ArrayRef<syntax::Token> expandedTokens() const {
return ExpandedTokens;		return ExpandedTokens;
}		}

		/// Returns the subrange of expandedTokens() corresponding to the closed
		/// token range R.
		llvm::ArrayRef<syntax::Token> expandedTokens(SourceRange R) const;

/// Find the subrange of spelled tokens that produced the corresponding \p		/// Find the subrange of spelled tokens that produced the corresponding \p
/// Expanded tokens.		/// Expanded tokens.
///		///
/// EXPECTS: \p Expanded is a subrange of expandedTokens().		/// EXPECTS: \p Expanded is a subrange of expandedTokens().
///		///
/// Will fail if the expanded tokens do not correspond to a		/// Will fail if the expanded tokens do not correspond to a
/// sequence of spelled tokens. E.g. for the following example:		/// sequence of spelled tokens. E.g. for the following example:
///		///

clang/lib/Tooling/Syntax/Tokens.cpp

Show First 20 Lines • Show All 113 Lines • ▼ Show 20 Lines	llvm::StringRef FileRange::text(const SourceManager &SM) const {
StringRef Text = SM.getBufferData(File, &Invalid);		StringRef Text = SM.getBufferData(File, &Invalid);
if (Invalid)		if (Invalid)
return "";		return "";
assert(Begin <= Text.size());		assert(Begin <= Text.size());
assert(End <= Text.size());		assert(End <= Text.size());
return Text.substr(Begin, length());		return Text.substr(Begin, length());
}		}

		llvm::ArrayRef<syntax::Token> TokenBuffer::expandedTokens(SourceRange R) const {
		if (R.isInvalid())
		return {};
		const Token *Begin =
		hokein (inline comment, not done): nit: for code readability, I'd use the `llvm::ArrayRef<syntax::Token>::iterator` type here.
		sammccall (author, done): I think that hurts readability on two counts:
		- it obscures the actual type: a pointer is concrete and familiar, so it's easier to realize that e.g. `>` is well-defined here
		- it makes it harder to understand `return {Begin, End}`, which relies on the fact that the actual type here is Token*
		llvm::partition_point(expandedTokens(), [&](const syntax::Token &T) {
		hokein (inline comment, done): I think partition_point requires that `ExpandedTokens` is partitioned, but I didn't find any documentation of this guarantee in the code. It would be nice to state it in a comment (probably near `ExpandedTokens`).
		return SourceMgr->isBeforeInTranslationUnit(T.location(), R.getBegin());
		});
		const Token *End =
		llvm::partition_point(expandedTokens(), [&](const syntax::Token &T) {
		return !SourceMgr->isBeforeInTranslationUnit(R.getEnd(), T.location());
		});
		if (Begin > End)
		return {};
		return {Begin, End};
		}
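
The two partition_point calls above amount to a pair of binary searches over a sequence that is sorted by location, which is the precondition hokein asks to have documented. The following is a minimal standalone sketch of the same closed-range lookup, not the code from this patch: `Tok`, its integer `Offset`, and `tokensInClosedRange` are hypothetical stand-ins for `syntax::Token`, source locations, and `TokenBuffer::expandedTokens(SourceRange)`, and plain `<` replaces `isBeforeInTranslationUnit`.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical stand-in for syntax::Token: only the start offset matters here.
struct Tok {
  int Offset;
};

// Returns a half-open pointer pair [Begin, End) covering every token whose
// offset lies in the closed range [First, Last].
// Precondition (assumed, as in the review discussion): Toks is sorted by
// Offset, so both predicates below partition the sequence.
std::pair<const Tok *, const Tok *>
tokensInClosedRange(const std::vector<Tok> &Toks, int First, int Last) {
  const Tok *DataBegin = Toks.data();
  const Tok *DataEnd = Toks.data() + Toks.size();
  // First token that does not start before the range begins.
  const Tok *Begin = std::partition_point(
      DataBegin, DataEnd, [&](const Tok &T) { return T.Offset < First; });
  // First token that starts after the last token of the range.
  const Tok *End = std::partition_point(
      DataBegin, DataEnd, [&](const Tok &T) { return !(Last < T.Offset); });
  // An inverted range yields Begin > End; report it as empty.
  if (Begin > End)
    return {DataBegin, DataBegin};
  return {Begin, End};
}
```

Because the results are raw pointers into contiguous storage, comparing them with `>` and returning them as a pair is well-defined, which is the point sammccall makes in the reply above.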

std::pair<const syntax::Token *, const TokenBuffer::Mapping *>		std::pair<const syntax::Token *, const TokenBuffer::Mapping *>
TokenBuffer::spelledForExpandedToken(const syntax::Token *Expanded) const {		TokenBuffer::spelledForExpandedToken(const syntax::Token *Expanded) const {
assert(Expanded);		assert(Expanded);
assert(ExpandedTokens.data() <= Expanded &&		assert(ExpandedTokens.data() <= Expanded &&
Expanded < ExpandedTokens.data() + ExpandedTokens.size());		Expanded < ExpandedTokens.data() + ExpandedTokens.size());

auto FileIt = Files.find(		auto FileIt = Files.find(
SourceMgr->getFileID(SourceMgr->getExpansionLoc(Expanded->location())));		SourceMgr->getFileID(SourceMgr->getExpansionLoc(Expanded->location())));

clang/unittests/Tooling/Syntax/TokensTest.cpp

#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/Support/FormatVariadic.h"		#include "llvm/Support/FormatVariadic.h"
#include "llvm/Support/MemoryBuffer.h"		#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/VirtualFileSystem.h"		#include "llvm/Support/VirtualFileSystem.h"
#include "llvm/Support/raw_os_ostream.h"		#include "llvm/Support/raw_os_ostream.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "llvm/Testing/Support/Annotations.h"		#include "llvm/Testing/Support/Annotations.h"
#include "llvm/Testing/Support/SupportHelpers.h"		#include "llvm/Testing/Support/SupportHelpers.h"
		#include "gmock/gmock.h"
#include <cassert>		#include <cassert>
#include <cstdlib>		#include <cstdlib>
#include <gmock/gmock.h>		#include <gmock/gmock.h>
#include <gtest/gtest.h>		#include <gtest/gtest.h>
#include <memory>		#include <memory>
#include <ostream>		#include <ostream>
#include <string>		#include <string>

▲ Show 20 Lines • Show All 607 Lines • ▼ Show 20 Lines	recordTokens(R"cpp(
ID(1)		ID(1)
#pragma lalala		#pragma lalala
not_mapped		not_mapped
)cpp");		)cpp");
EXPECT_THAT(Buffer.spelledForExpanded(findExpanded("not_mapped")),		EXPECT_THAT(Buffer.spelledForExpanded(findExpanded("not_mapped")),
ValueIs(SameRange(findSpelled("not_mapped"))));		ValueIs(SameRange(findSpelled("not_mapped"))));
}		}

		TEST_F(TokenBufferTest, ExpandedTokensForRange) {
		recordTokens(R"cpp(
		#define SIGN(X) X##_washere
		A SIGN(B) C SIGN(D) E SIGN(F) G
		)cpp");

		SourceRange R(findExpanded("C").front().location(),
		findExpanded("F_washere").front().location());
		// Sanity check: expanded and spelled tokens are stored separately.
		EXPECT_THAT(Buffer.expandedTokens(R),
		SameRange(findExpanded("C D_washere E F_washere")));
		EXPECT_THAT(Buffer.expandedTokens(SourceRange()), testing::IsEmpty());
		}

TEST_F(TokenBufferTest, ExpansionStartingAt) {		TEST_F(TokenBufferTest, ExpansionStartingAt) {
// Object-like macro expansions.		// Object-like macro expansions.
recordTokens(R"cpp(		recordTokens(R"cpp(
#define FOO 3+4		#define FOO 3+4
int a = FOO 1;		int a = FOO 1;
int b = FOO 2;		int b = FOO 2;
)cpp");		)cpp");

