This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Extract scoring/ranking logic, and shave yaks.
ClosedPublic

Authored by sammccall on May 7 2018, 6:49 AM.

Details

Summary

Code completion scoring was embedded in CodeComplete.cpp, which is bad:

  • awkward to test. The mechanisms (extracting info from index/sema) can be unit-tested well, the policy (scoring) should be quantitatively measured. Neither was easily possible, and debugging was hard. The intermediate signal struct makes this easier.
  • hard to reuse. This is a bug in workspaceSymbols: it just presents the results in the index order, which is not sorted in practice, it needs to rank them! Also, index implementations care about scoring (both query-dependent and independent) in order to truncate result lists appropriately.

The main yak shaved here is the build() function that had 3 variants across
unit tests is unified in TestTU.h (rather than adding a 4th variant).

Diff Detail

Repository
rL LLVM

Event Timeline

sammccall created this revision.May 7 2018, 6:49 AM

The change makes both testing and scoring code better.
Even though those are largely independent changes, perfectly happy to review them together.

I mostly have NITs here, overall the changes LG.

clangd/CodeComplete.cpp
36 ↗(On Diff #145458)

Is this used by DEBUG() macro?
It would be nice to have a short comment saying why we need this define.

909 ↗(On Diff #145458)

NIT: Maybe use float here instead of auto? Would remove the need to look at evaluate for anyone reading the code for the first time.

clangd/Quality.cpp
83 ↗(On Diff #145458)

This function does not seem to be exposed outside this C++ file. Maybe add static?

clangd/Quality.h
45 ↗(On Diff #145458)

Maybe use doxygen-style comments to be consistent with the rest of LLVM?

66 ↗(On Diff #145458)

Maybe rename to Inaccessible? It seems to be closer to what this bool means in C++, if I'm reading the code correctly.
Or add a comment explaining what "unavailable" means?

80 ↗(On Diff #145458)

As you mentioned in the change description, moving TopN and sortText to a separate file might be a good idea since they don't depend on various clangd-specific bits.
But I'm perfectly happy to leave it as is and do this later if needed.

119 ↗(On Diff #145458)

Maybe mention that it's used for LSP sortText? To give a bit more context on why we need this function.

unittests/clangd/TestTU.cpp
56 ↗(On Diff #145458)

Maybe add braces for the loop body here? It seems to be long enough that it could actually, arguably, improve readability.
We could some nesting by inverting the condition too:
if (QName != ...) continue;

62 ↗(On Diff #145458)

Maybe use FAIL() instead of ADD_FAILURE() followed by llvm_unreachable()?
llvm_unreachable may produce surprising behaviors, i.e. miscompiles in opt mode etc, IIUC.
Both failing or returning a first matching symbol while also signalling an error seem like good alternatives here.

unittests/clangd/TestTU.h
28 ↗(On Diff #145458)

I really like this helper, now that we can reuse the code between different tests!
It took me some time to get the semantics of this constructor, though.
I suggest to have a few constructor functions with more descriptive name (my names are not great, but should give the idea):

static TestTU FromSourceFile(StringRef Code);
static TestTU FromHeaderFile(StringRef Code);
static TestTU WithImplicitInclude(StringRef Source, StringRef IncludedHeader);
sammccall updated this revision to Diff 145478.May 7 2018, 9:01 AM
sammccall marked 9 inline comments as done.

Address comments

sammccall added inline comments.May 7 2018, 9:02 AM
clangd/CodeComplete.cpp
909 ↗(On Diff #145458)

Done. I was worried we might switch to double one day, seems unlikely.

clangd/Quality.h
66 ↗(On Diff #145458)

So it's Unavailable || Inaccessible, where neither is all that well-defined :-)
I renamed to Forbidden to avoid conflation with either, and added examples as a comment.

80 ↗(On Diff #145458)

Yeah, I'd rather punt on this for now to avoid creating too many tiny files and a random standard library supplement. If TopN is pulled out, llvm/Support might be a better place for it.

119 ↗(On Diff #145458)

Oops, I wrote the comment but put it in the cpp file by mistake...

unittests/clangd/TestTU.cpp
62 ↗(On Diff #145458)

FAIL() is the same as ADD_FAILURE() but it causes the enclosing function to immediately return - it can only be used from void functions.

llvm_unreachable may produce surprising behaviors, i.e. miscompiles in opt mode etc, IIUC

Right. After offline discussion, we don't care about asserts-disabled test behavior if the asserts-enabled failure is sensible, so just replaced with an assert.

unittests/clangd/TestTU.h
28 ↗(On Diff #145458)

Done, Just added withCode and withHeaderCode for now, and anyone who wants something more complicated can set the fields directly.
(The names mirror the struct fields)

The new uploaded diff has lots of unrelated changes to clang-tidy, clang-move, etc...
Looking at commits, it seems arc diff was called with the wrong base commit...
Could you please reupload the change?

clangd/Quality.h
45 ↗(On Diff #145458)

Some changes are missing?
File still uses 2-slash instead of 3-slash comments.

66 ↗(On Diff #145458)

Thanks! Forbidden with a comment LG.

unittests/clangd/TestTU.h
28 ↗(On Diff #145458)

LG, thanks.

sammccall updated this revision to Diff 146584.May 14 2018, 5:32 AM

fix diffbase?

sammccall updated this revision to Diff 146591.May 14 2018, 6:33 AM
sammccall marked an inline comment as done.

Doxygen

This revision is now accepted and ready to land.May 15 2018, 7:16 AM
This revision was automatically updated to reflect the committed changes.