This is an archive of the discontinued LLVM Phabricator instance.

[clangd][Hover] Don't use Decl if it is not related with tokens under cursor.
AbandonedPublic

Authored by ArcsinX on Jul 9 2020, 2:22 PM.

Details

Reviewers
sammccall
Summary

This patch fixes redundant hover with information about Decl which is not under cursor.

Diff Detail

Event Timeline

ArcsinX created this revision.Jul 9 2020, 2:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 9 2020, 2:22 PM

thanks for doing this! but i believe these look like bugs that should be addressed in other components, rather than worked around in hover.

the first example is likely a bug in selection tree, file location corresponds to a macro body that's never expanded, which resides inside a namespace (who's declaration contains everything within it's braces), hence selection tree attributes the selection to the namespace because there's no fine-grained children around the cursor.

the second one is likely a bug in ast-traversal logic, i suppose the function doesn't get a TypeSourceInfo when it is invalid, hence we don't get to traverse its parameters(children) and therefore assume all the declaration belongs to function itself.


As for your workaround in hover, it is surprising that no other tests has regressed, but in theory D->getLocation only corresponds to the main token in a declaration (e.g. name of a function), and hover can refer to declarations whose names are away from the cursor, e.g. if you are on (closing) parens of a function/ctor call.

@kadircet Thanks for your reply.

Think that you are right, because the same problem appears in GTD (with the same test cases). And as for GTD the similar workaround also does not break any regression tests except override/final.

Yeah SelectionTree is the right place to solve this but it's a harder problem there.

In general we *don't* only want to trigger on the single token returned by getLocation().
The strategy SelectionTree uses is to attribute tokens to an outer node unless they're associated with an inner node or ignored.

Your test cases show two problems with this strategy:

  • we ignore comments and semicolons, but not preprocessor directives (or disabled preprocessor regions). I think we can fix this by asking TokenBuffer if a spelled token is part of a region that maps to no (PP-)expanded tokens.
  • in the case of syntax errors, tokens are emitted by the preprocessor but then no AST node is built from them, and we associate them with the parent by default. Maybe we don't need to fix this case?

Your test cases show two problems with this strategy:

  • we ignore comments and semicolons, but not preprocessor directives (or disabled preprocessor regions). I think we can fix this by asking TokenBuffer if a spelled token is part of a region that maps to no (PP-)expanded tokens.

I have tried this locally. seems it breaks SelectionTest.IncludedFile test.

Your test cases show two problems with this strategy:

  • we ignore comments and semicolons, but not preprocessor directives (or disabled preprocessor regions). I think we can fix this by asking TokenBuffer if a spelled token is part of a region that maps to no (PP-)expanded tokens.

I have tried this locally. seems it breaks SelectionTest.IncludedFile test.

Yeah that makes sense, I guess it just says nothing is selected in that case?
Selecting the while is probably marginally better for that exact case, but selecting nothing seems fine to me.

@kadircet any concerns with that "regression"?

Your test cases show two problems with this strategy:

  • we ignore comments and semicolons, but not preprocessor directives (or disabled preprocessor regions). I think we can fix this by asking TokenBuffer if a spelled token is part of a region that maps to no (PP-)expanded tokens.

I have tried this locally. seems it breaks SelectionTest.IncludedFile test.

Yeah that makes sense, I guess it just says nothing is selected in that case?

Yes, and test crashes at T.commonAncestor() (which is nullptr) dereference. So can we also add ASSERT_TRUE(T.commonAncestor()); into several tests?

Your test cases show two problems with this strategy:

  • we ignore comments and semicolons, but not preprocessor directives (or disabled preprocessor regions). I think we can fix this by asking TokenBuffer if a spelled token is part of a region that maps to no (PP-)expanded tokens.

I have tried this locally. seems it breaks SelectionTest.IncludedFile test.

Yeah that makes sense, I guess it just says nothing is selected in that case?

Yes, and test crashes at T.commonAncestor() (which is nullptr) dereference. So can we also add ASSERT_TRUE(T.commonAncestor()); into several tests?

Right, that particular case should become EXPECT_EQ(..., nullptr).

Technically the other tests should probably have an ASSERT, because otherwise if the code is broken and returns nullptr we have UB and the test could spuriously pass.
In practice I've never seen a test spuriously pass this way, let alone on all buildbots. And these kinds of assumptions are hard to keep out of the tests. So I'm not particularly concerned about it.

Yeah that makes sense, I guess it just says nothing is selected in that case?
Selecting the while is probably marginally better for that exact case, but selecting nothing seems fine to me.

@kadircet any concerns with that "regression"?

I think I am fine with selecting nothing in that case, selecting some part of the AST that's not visible(even as hint, it is coming from another file could've been anything right?) to developer doesn't seem so important.
It is also inconsistent, we show the while stmt only on filename and not on any other part of the directive, moreover if you put a function decl instead of a whilestmt, we again don't have a selection even on the filename :(
So i think that might not even be a regression.

ArcsinX updated this revision to Diff 278440.Jul 16 2020, 5:20 AM

Prevent tokens selection in disabled preprocessor sections.

sammccall added inline comments.Jul 16 2020, 6:37 AM
clang-tools-extra/clangd/Selection.cpp
227 ↗(On Diff #278440)

I think this is more work than we'd like to do in this loop (the whole file could be selected!) and it's also not obviously correct - it relies on the assumption that any token that doesn't expand to something by itself, isn't a macro argument or macro name, is part of an ignored region. (That may or may not be correct, but it's a nontrivial assumption to make here).

I think the API we need is something like vector<TokenBuffer::Expansion> TokenBuffer::expansionsOverlapping(ArrayRef<Token> Spelled).
This requires a couple of binary searches on the TokenBuffer side to compute, and on this side we can just look at the spelled token ranges for empty expansions and mark them in a bitmap.

I'm happy to try to put together a TokenBuffer patch if this is too much of a yak-shave, though.

(Sorry, I know that this is a lot of goalpost-shifting: I think this is now being solved at the right layer and with the right behavior, and it's just a question of finding a clear+fast implementation)

ArcsinX added inline comments.Jul 16 2020, 2:02 PM
clang-tools-extra/clangd/Selection.cpp
227 ↗(On Diff #278440)

it relies on the assumption that any token that doesn't expand to something by itself, isn't a macro argument or macro name, is part of an ignored region. (That may or may not be correct, but it's a nontrivial assumption to make here).

Example #define FOO BAR.
Expanded tokens:
Spelled tokens: #, define, FOO, BAR

I think we could ignore all except FOO in this case. Also I found similar check in XRefs.cpp bool tokenSpelledAt(SourceLocation SpellingLoc, const syntax::TokenBuffer &TB)

How to judge should we skip token or not if we do not have expanded tokens for it? Do you have any advice here?

sammccall added inline comments.Jul 17 2020, 12:43 AM
clang-tools-extra/clangd/Selection.cpp
227 ↗(On Diff #278440)

I'm not following, because I think we can ignore *all* those tokens. We care about macro names when the macros are *used* (tokens expanded from macro bodies are associated from the macro name at the expansion point). But not where they're defined.

Also I found similar check in XRefs.cpp

Yes, there are a few restrictions there though:

  • it's only used for identifiers, so there are fewer cases to consider
  • performance doesn't matter at all, as it runs in response to an explicit action and runs on ~2 tokens

How to judge should we skip token or not if we do not have expanded tokens for it? Do you have any advice here?

I think we should just always skip in this case, let me try this out...

Tried this out in D84012/D84009. Works pretty well, and I think the API is a useful and natural addition to TokenBuffer.
Maybe this is too much complexity though?

(Mostly here i'm worrying about the case when the user hits "select all" in their editor and we get a code action request or so - it seems wasteful to query the token buffer for every token in the file separately. But maybe this is in the noise in any case).

Tried this out in D84012/D84009. Works pretty well, and I think the API is a useful and natural addition to TokenBuffer.

For my test cases it works well, so I think this problem is fixed.
Should I abandon this revision?

Tried this out in D84012/D84009. Works pretty well, and I think the API is a useful and natural addition to TokenBuffer.

For my test cases it works well, so I think this problem is fixed.
Should I abandon this revision?

I've landed those two patches, so I think so.
Thanks for raising this and working through it, I hope I didn't step on your toes to much here; this was a gap in TokenBuffer that I'd been hoping to have an excuse to fill at some point.

ArcsinX abandoned this revision.Jul 20 2020, 5:59 AM