This is an archive of the discontinued LLVM Phabricator instance.

[clangd] FuzzyMatch: forbid tail-tail matches after a miss: [pat] !~ "panther"
ClosedPublic

Authored by sammccall on Jun 8 2018, 9:25 AM.

Details

Summary

This is a small code change but vastly reduces noise in code completion results.
The intent of allowing this was to let [sc] ~ "strncpy" and [strcpy] ~ "strncpy"
however the benefits for unsegmented names aren't IMO worth the costs.

Test cases should be representative of the changes here.

Diff Detail

Repository
rL LLVM

Event Timeline

sammccall created this revision.Jun 8 2018, 9:25 AM
sammccall updated this revision to Diff 150531.Jun 8 2018, 9:48 AM

Simplify logic/make more consistent. Couple of extra test cases.

Very nice! I tried "std" and got much less (unimportant) results. I see something a bit weird with "getStandardResourceDir" but it might be VSCode. Here, I guess it's the "d" in Dir that matches but what's odd is that VS Code will highlight the first "d", i.e. in "standard". But it doesn't look like there is any way to control this in the LSP, is there?

Very nice! I tried "std" and got much less (unimportant) results. I see something a bit weird with "getStandardResourceDir" but it might be VSCode. Here, I guess it's the "d" in Dir that matches but what's odd is that VS Code will highlight the first "d", i.e. in "standard". But it doesn't look like there is any way to control this in the LSP, is there?

That's right. VSCode applies its fuzzymatch again client-side to determine which characters to highlight, and it chooses the wrong (IMO) "d" here. We wouldn't match [std] against "getStandardResourceZir".
(Worse, they use the same algorithm to rerank our results if client-side filtering applies...)

Very nice! I tried "std" and got much less (unimportant) results. I see something a bit weird with "getStandardResourceDir" but it might be VSCode. Here, I guess it's the "d" in Dir that matches but what's odd is that VS Code will highlight the first "d", i.e. in "standard". But it doesn't look like there is any way to control this in the LSP, is there?

That's right. VSCode applies its fuzzymatch again client-side to determine which characters to highlight, and it chooses the wrong (IMO) "d" here. We wouldn't match [std] against "getStandardResourceZir".
(Worse, they use the same algorithm to rerank our results if client-side filtering applies...)

Thanks for the confirmation. In that case, I think the approach of this patch is quite good! Although it would be better is someone familiar with code completion could review the details :)

Fix completion tests. (Simplify a bit - no need to re-test all the nuances of
fuzzy matching there).

@ilya-biryukov ping. I'm excited about this change :-)

ilya-biryukov accepted this revision.Jun 13 2018, 1:41 AM

LGTM with a few nits. And comment requests for things that were unclear when reading the code.

clangd/FuzzyMatch.cpp
269 ↗(On Diff #151104)

NIT: maybe split into two declarations?

auto MatchMatchScore = ...;
auto MissMatchScore = ....;
clangd/FuzzyMatch.h
56 ↗(On Diff #151104)

Maybe add a small doc comment for Action? Found myself figuring out what it is while reading this patch.

79 ↗(On Diff #151104)

Maybe add a doc comment that this is the union of CharType of all chars in the word?
Seems obvious now, but was also a bit confusing when reading the code.

This revision is now accepted and ready to land.Jun 13 2018, 1:41 AM
This revision was automatically updated to reflect the committed changes.
sammccall marked 3 inline comments as done.