Page MenuHomePhabricator

[clangd] Add a textual fallback for go-to-definition
Changes PlannedPublic

Authored by nridge on Jan 16 2020, 1:32 PM.

Details

Reviewers
sammccall
Summary

This facilitates performing go-to-definition inside comments, strings,
invalid code, and dependent contexts where AST-based heuristics are
insufficient.

Fixes https://github.com/clangd/clangd/issues/241

Diff Detail

Event Timeline

nridge created this revision.Jan 16 2020, 1:32 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2020, 1:32 PM

Unit tests: pass. 61850 tests passed, 0 failed and 781 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

nridge marked an inline comment as done.Jan 21 2020, 8:40 AM
nridge added inline comments.
clang-tools-extra/clangd/XRefs.cpp
203

(There should be a return here, will fix locally.)

I've tried this out locally and it's fun! As suspected on the bug though, IMO it's far from accurate enough. Examples from clangd/Compiler.cpp:

  • it triggers on almost every word, even words that plainly don't refer to any decl like format [[lazily]], in case vlog is off. This means that e.g. (in VSCode) the underline on ctrl-hover gives no/misleading signal. It also means that missing your target now jumps you somewhere random instead of doing nothing.
  • when it works properly, the correct result usually mixed with incorrect results (e.g. createInvocationFromCommandLine sets [[DisableFree]]).
  • it doesn't work for some symbols - ones that are not indexable (e.g. RemappedFileBuffers will handle the lifetime of the [[Buffer]] pointer, gives a variety of wrong results)

So while I want to stress this is really cool, it doesn't feel reliable on any dimension: you can't trust clangd on whether the word is an actual reference, you can't trust any particular result, and you can't trust the correct result is in the set.

Some suggestions:

  • only trigger when there's *some* positive signal for the word.
    • Markup like quotes/backticks/brackets/\p
    • weird case like lowerCamel, UpperCamel, CAPS, mid-sentence Capitalization, under_scores.
    • use of the word as a token in nearby code (very close if very short, anywhere in file if longer)
    • (maybe you want to support ns::Qualifiers?)
  • post-filter aggressively - only return exact name matches (I think including case).
  • call fuzzyFind directly and set ProximityPath as well as the enclosing scopes from lexing. For extra strictness consider AnyScope=false
  • if you get more than 3 results, and none from current file, maybe don't return anything, as confidence is too low. Or try a stricter query...
  • handle the most common case of non-indexable symbols (local symbols) by running the query against the closest occurrence of the token in code.
nridge planned changes to this revision.EditedJan 21 2020, 12:32 PM

Thanks for taking a look!

  • it triggers on almost every word, even words that plainly don't refer to any decl like format [[lazily]], in case vlog is off. This means that e.g. (in VSCode) the underline on ctrl-hover gives no/misleading signal. It also means that missing your target now jumps you somewhere random instead of doing nothing.

Heh, I didn't realize VSCode had this feature. I do agree that it changes the tradeoffs a bit, as it means go-to-definition can be invoked in a context where there isn't an explicit signal from the user that they think there's a target there.

The other points you make are completely fair too. I will revise and take your suggestions into account.

I'll aim to start by factoring in enough of your suggestions to reduce the noise to an acceptable level for an initial landing, and leave some of the others for follow-up enhancements.