This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Rethink how SelectionTree deals with macros and #includes.
ClosedPublic

Authored by sammccall on Nov 20 2019, 2:26 PM.

Details

Summary

The exclusive-claim model is successful at resolving conflicts over tokens
between parent/child or siblings. However it's not the right model for selecting
AST nodes produced by a macro expansion, which can produce several independent
nodes that are equally associated with the macro invocation.
Additionally, any model that only uses the endpoints in a range can fail when
a macro invocation occurs inside the node.

To address this, we use the existing TokenBuffer in more depth. SourceRanges
are translated into an array of expanded tokens we can iterate over, and in
principle process token by token (in practice, batching related tokens helps).

For regular tokens (and macro-arg expansions) we claim the tokens as before,
but for tokens from macro bodies we merely check whether the macro name was
selected. Thus tokens in macro bodies are selected by selecting the macro name.

The aggregation of Selection is now more principled as we need to be able to
call claim()/peek() an arbitrary number of times.

One side-effect of iterating over the expanded tokens is that (usually) nothing
claims preprocessor tokens like macro names and directives.
Rather than fixing this I just left them unclaimed, and removed support for
determining the selectedness of TUDecl.
(That was originally implemented in 90a5bf92ff97b1, but doesn't seem to be very
important or worth the complexity any longer).

The expandedTokens(SourceLocation) helper could be added locally, but seems to
make sense on TokenBuffer.

Diff Detail

Event Timeline

sammccall created this revision.Nov 20 2019, 2:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 20 2019, 2:26 PM

Build result: pass - 60204 tests passed, 0 failed and 732 were skipped.
Log files: console-log.txt, CMakeCache.txt

hokein added inline comments.Nov 25 2019, 4:01 AM
clang-tools-extra/clangd/Selection.cpp
49

I might be missing something, I don't understand why we set Partial if Result != New.

446

what's S for this case?

464
496

not sure what's the rational behavior, but I think for the following case, we just have the TUDecl in the selection tree, maybe use the whole macro range?

#define M() 123

int d = M(^); // now we only have the TUDecl in the selection tree.
clang-tools-extra/clangd/unittests/SelectionTests.cpp
139

could we have a testcase to cover the "tokens expanded from another #include file" code path?

clang/lib/Tooling/Syntax/Tokens.cpp
125

nit: for code readability, I'd use llvm::ArrayRef<syntax::Token>::iterator type here.

126

I think the parition_point requires the ExpandedTokens is partitioned, but I didn't found any documentation about this guarantee in the code, would be nice to have this in the comment (probably around the ExpandedTokens).

sammccall marked 10 inline comments as done.

Address review comments.

Build result: fail - 60299 tests passed, 1 failed and 732 were skipped.

failed: Clangd Unit Tests._/ClangdTests/SelectionTest.PathologicalPreprocessor

Log files: console-log.txt, CMakeCache.txt

sammccall planned changes to this revision.Nov 25 2019, 11:50 PM

No, this patch is busted, and the tests were too simple go catch it.

The issue is that with no exclusivity check, given {{{ MACRO }}} all the enclosing blocks get to count themselves selected.

We need an exclusivity check on the expanded token stream first. This will yield a sequence of slices of tokens not yet claimed. Then each slice gets partitioned by FileID, each partition gets mapped to spelled tokens and checked as in this patch.

I suspect the exclusivity/claiming at the spelled token level is no longer needed.

clang-tools-extra/clangd/Selection.cpp
49

Added a brief comment and an assertion.
Intuitive explanation: Consider Complete = black, Unselected = white, Partial = grey.
White + White -> white, Black + Black -> black, every other mixture is grey.

(But honestly I wrote up the state transition table first and then extracted the logic by staring at it)

446

S is the function parameter, the implication is that it's the range of the vardecl in the example. Expanded the comment a bit.

496

This is definitely better, but isn't trivial to do, and isn't a very important case.
Added a FIXME rather than clutter/delay this patch with it.

clang/lib/Tooling/Syntax/Tokens.cpp
125

I think that hurts readability on two counts:

  • it obscures the actual type: a pointer is concrete and familiar, so it's easier to realize that e.g. > is well-defined here
  • it makes it harder to understand return {Begin, End} which relies on the fact that the actual type here is Token*
sammccall updated this revision to Diff 231206.Nov 27 2019, 3:26 AM

Rewrote patch with a better approach (claiming expanded tokens rather than spelled).
Added more tests, including one showing a problem with multiple arg expansion.

Build result: pass - 60302 tests passed, 0 failed and 732 were skipped.

Log files: console-log.txt, CMakeCache.txt

hokein accepted this revision.Nov 27 2019, 7:05 AM

Thanks, the patch looks good! please also update the patch description.

clang-tools-extra/clangd/Selection.cpp
81

nit: this comment seems not reflect to the code now.

144

nit: remove the redundant start.

clang-tools-extra/clangd/unittests/SelectionTests.cpp
467

I think cases like below are broken:

#define greater(x, y) x > y? x : y
#define abs(x) x > 0 ? x : -x

Selecting the first element for a macro arg seems good to me.

This revision is now accepted and ready to land.Nov 27 2019, 7:05 AM
This revision was automatically updated to reflect the committed changes.
sammccall marked 2 inline comments as done.
thakis added a subscriber: thakis.Nov 29 2019, 9:57 AM

hover.test is failing on Mac: http://45.33.8.238/mac/3308/step_7.txt

Please take a look, and revert if it takes a while.

Reverted as 905b002c139f039a32ab9bf1fad63d745d12423f

@thakis any information available about that bot configuration?
I'm not seeing any failures on "official" ones, I can't reproduce locally, and the failures don't make a lot of sense to me (lots of off-by-ones everywhere maybe?)

That particular bot does GN builds, but these test failures repro in the cmake build for me on both macs I tried. (Remember that the official mac bots are on greendragon, not on buildbot -- I'd guess it shows there too.)

Mac is happy now, but it fails to build on Windows: http://45.33.8.238/win/3327/step_4.txt