This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Add type boosting in code completion
ClosedPublic

Authored by ilya-biryukov on Sep 19 2018, 12:17 PM.

Event Timeline

ilya-biryukov created this revision.Sep 19 2018, 12:17 PM
ilya-biryukov planned changes to this revision.Sep 19 2018, 12:17 PM

Ranking-related code should be moved to Quality.cpp

  • Rebase and update wrt to dependent changes
sammccall added inline comments.Nov 20 2018, 7:34 AM
clangd/Quality.cpp
373

is 2 really enough?

clangd/Quality.h
98

you've inserted in the middle of the file proximity stuff :-)

98

Generally we'd put both context/symbol types as the signal here, rather than just whether they match, unless it's prohibitive. They'd get populated manually, and by merge() overloads, respectively.

ilya-biryukov marked an inline comment as done.
  • Increase the multiplier
  • Move the member out of the file proximity group
clangd/Quality.cpp
373

Changed to 5. We can tweak it later if that turns out to be too big.

clangd/Quality.h
98

I did it to avoid inefficiencies of:

  1. copying the context type for each completion item,
  2. copying the symbol type for each of the indexed items.

My understanding is that (1) can be avoided by storing a reference to a type, since it outlives the signals.
But we'll still have to do a copy for (2), right?

That's probably not a bottleneck anyway, but keeping it a bool flag for now and happy to change it to your liking.

sammccall added inline comments.Nov 23 2018, 6:16 AM
clangd/CodeComplete.cpp
1307–1308

Here would be a good place to include the expected type in a log message

1521

This is too verbose for vlog... could be dlog but I'm not sure it's actually needed in the end, we have code completion signal dumps. You should update Quality.cpp so the new signal is included :-)

clangd/Quality.h
98

Agree with your analysis.

I think our goal here should be to have a model with small # of per-symbol types, and small size of each type, so the per-symbol copy would be cheap. Sequencing is tricky - we don't have that optimization yet. Can we replay a session with some code completion and see if it shows up in a profiler?

(Perfect would be one type per symbol, types are 8-byte hashes. Unfortunately I think pointer->base conversions mean symbols must admit multiple tokens. But I have some ideas...)

ilya-biryukov marked 4 inline comments as done.Nov 26 2018, 3:16 AM
ilya-biryukov added inline comments.
clangd/Quality.h
98

After discussing offline, we ended up with a bunch of flags to indicate whether:

  1. a symbol or a sema decl had a type,
  2. a context had a type,
  3. those types were the same.

Let me know if that look ok. The bit that I don't like is that they're computed ad-hoc outside the merge function, which is a bit ugly.

  • Address comments, add more flags to signals
sammccall accepted this revision.Nov 26 2018, 6:42 AM

You may want to add a FIXME in SymbolIndex to include opaque type in fuzzyfind request.

clangd/CodeComplete.cpp
1504

|= true is more traditionally known as = true :-)
(and below)

clangd/Quality.cpp
418

Consider combining these: "Type matched preferred: {0} (Context type: {1}, Symbol type: {2}"

This revision is now accepted and ready to land.Nov 26 2018, 6:42 AM
ilya-biryukov marked 2 inline comments as done.
  • Address comments
  • Add a FIXME to FuzzyFindRequest
This revision was automatically updated to reflect the committed changes.