This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Handle go-to-definition in macro invocations where the target appears in the expansion multiple times
ClosedPublic

Authored by nridge on Dec 31 2019, 2:22 PM.

Diff Detail

Event Timeline

nridge created this revision.Dec 31 2019, 2:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 31 2019, 2:22 PM

Unit tests: pass. 61162 tests passed, 0 failed and 728 were skipped.

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

I like the goal here, a few high-level comments:

  • it seems unfortunate to solve this mostly in xrefs rather than mostly in SelectionTree. All the other clients potentially want this behavior.
  • the allSelectedNodes API on SelectionTree feels non-orthogonal: it's only meaningful here because the input happened to be a point and thus touches a single token and therefore node. If the input was a range, then this API doesn't help, because it loses the sense of "topmost" node.
  • maybe the function we want walks down the tree as if finding the common ancestor, but may descend to return the roots of multiple subtrees if they are located in distinct expansions of the same macro arg
  • or maybe we just want to treat non-first expansions as not-selected? That would about this issue I think

(BTW, Haojian is out the first few weeks of Jan. I'm still technically OOO, back later this week)

nridge edited the summary of this revision. (Show Details)Dec 31 2019, 3:15 PM
nridge edited reviewers, added: sammccall; removed: hokein.
nridge planned changes to this revision.Jan 9 2020, 2:27 PM
  • the allSelectedNodes API on SelectionTree feels non-orthogonal: it's only meaningful here because the input happened to be a point and thus touches a single token and therefore node. If the input was a range, then this API doesn't help, because it loses the sense of "topmost" node.

Good point!

  • maybe the function we want walks down the tree as if finding the common ancestor, but may descend to return the roots of multiple subtrees if they are located in distinct expansions of the same macro arg
  • or maybe we just want to treat non-first expansions as not-selected? That would about this issue I think

I'm thinking of going with the second suggestion, because it doesn't require changes to the SelectionTree API, and therefore addresses this concern:

  • it seems unfortunate to solve this mostly in xrefs rather than mostly in SelectionTree. All the other clients potentially want this behavior.

without touching all the clients of SelectionTree.

nridge updated this revision to Diff 238094.Jan 14 2020, 1:51 PM

Revise the approach to treat non-first expansions as not-selected

Unit tests: pass. 61848 tests passed, 0 failed and 781 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Thanks for fixing this. I think it could maybe use a further tweak, LMK if I'm missing something as my intuition around macros is pretty bad.

clang-tools-extra/clangd/Selection.cpp
289

Given the following program:

#define SQUARE(x) x * x;
int four = [[SQUARE(2)]];

We're going to now report the binary operator and one of the operands as selected and not the other, which doesn't seem desirable.

I think we want to accept macro-selected || arg-selected, so probably doing the current "non-argument macro expansion" first unconditionally or factoring it out into a function.

This will change the behavior of int four = [[SQUARE]](2) to consider the literal children selected too, I think this is fine.

289

I don't think it's a good idea to add hidden state and side-effects to testChunk() - it breaks a lot of assumptions that help reason about the code, and using mutable hides the violation of them.
(And a possible source of bugs - this is first in traversal order rather than first in source order - these are mostly but IIRC not always the same).

Instead I think you can do this statelessly: from the top-level spelling location, walk down with SM.getMacroArgExpandedLocation until you hit the target FileID (this is the first-expansion of first-expansion of first-expansion...) or the FileID stops changing (you've reached the innermost macro invocation, and your target location was on a different branch).

clang-tools-extra/clangd/Selection.h
66–69

Worth mentioning the new behavior here, IMO. e.g.

When a macro argument is specifically selected, only its first expansion is selected in the AST.
(Returning a selection forest is unreasonably difficult for callers to handle correctly).
nridge marked 4 inline comments as done.Feb 27 2020, 2:10 PM
nridge added inline comments.
clang-tools-extra/clangd/Selection.cpp
289

I agree that adding state is not great. I thought it was icky as I was writing it, I just couldn't think of an alternative. Thank you for suggesting one!

I implemented what you suggested, and it seems to work. I did want to ask a clarifying question to make sure I understand correctly: when an argument occurs multiple times in a macro exapnsion, the occurrences will have distinct FileIDs (as opposed just different offsets in the same macro FileID)?

nridge updated this revision to Diff 247106.Feb 27 2020, 2:11 PM
nridge marked an inline comment as done.

Address review comments

sammccall accepted this revision.Mar 2 2020, 4:31 AM

Awesome, thanks for fixing this!

clang-tools-extra/clangd/Selection.cpp
166

nit: return false directly?

287

when false is the fallthrough to handling as if part of the macro body deliberate?

Thinking about it I suppose either that or returning NoTokens works, but please document it, e.g. } else { /* fall through and treat as part of the macro body */}

289

That's right. My mental model is:

  • a TU is a sequence of expanded (i.e. post-PP) tokens
  • a FileID always corresponds to a single subrange of these expanded tokens...
  • ...with holes in it for child FileIDs. Parents/children nest properly.

One of the things I really like about the syntax::TokenBuffer API is that it exposes this structure.

This revision is now accepted and ready to land.Mar 2 2020, 4:31 AM
nridge marked an inline comment as done.Mar 2 2020, 7:03 AM
nridge added inline comments.
clang-tools-extra/clangd/Selection.cpp
287

It is deliberate. In fact, it is intended to address this previous comment. Please let me know if I've misunderstood and the solution doesn't match the request.

I'll add a comment as you suggested.

sammccall added inline comments.Mar 2 2020, 7:19 AM
clang-tools-extra/clangd/Selection.cpp
287

Yeah I think the behavior is great, as is the implementation :-) just a little surprising.

This revision was automatically updated to reflect the committed changes.
nridge marked 4 inline comments as done.