This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Refactor code completion signal's utility properties.
ClosedPublic

Authored by usaxena95 on May 6 2020, 9:22 AM.

Details

Reviewers
sammccall
Summary

Current implementation of heuristic-based scoring function also contains
computation of derived signals (e.g. whether name contains a word from
context, computing file distances, scope distances.)
This is an attempt to separate out the logic for computation of derived
signals from the scoring function.
This will allow us to have a clean API for scoring functions that will
take only concrete code completion signals as input.

Diff Detail

Event Timeline

usaxena95 created this revision.May 6 2020, 9:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 6 2020, 9:22 AM
usaxena95 updated this revision to Diff 262852.May 8 2020, 3:40 AM
  • Added DerivedSignals struct containing all the derived signals.
  • Added NameMatchesContext and proximtiy signals to this struct.
  • We need to call computeDerivedSignals() before calling evaluate() if we set non-concrete utilites (e.g. ContextWords and Name).
  • This is logically equivalent to the previous version (both when the utilites are explicitly set and when the default signals are used).
  • The utilites are not marked as null in computeDerivedSignals. This is due to 2 reasons:
    • Current scoring function checks whether ScopeProximityMatch is set or not to decide whether to multiply with scopeProxitiyScore. Possible solutions:
      • Have scopeProxitiyScore as a derived signal itself.
      • Or have a different derived signal HasScopeProximityMatch.
    • Having these utilities available for debug purposes is great. We can try to compute other derived signals (e.g. ContextMatchesName) and test out it's value without even adding them concretely to clangd. Once their value is justified, we can add it to Quality/Relevance signals.
sammccall added inline comments.May 11 2020, 1:09 AM
clang-tools-extra/clangd/FindSymbols.cpp
112 ↗(On Diff #262852)

why this change?

clang-tools-extra/clangd/Quality.h
139

Why is it better to group the fields acconding to how they're used in the scoring function, rather than by what they mean?
(I find the new grouping harder to follow)

163

Can we make this Optional, so we can verify it gets computed? In fact, does it need to be a member at all, or can it just be transiently created while calling evaluate?

165

why must this be called explicitly rather than being computed by Evaluate?

clang-tools-extra/clangd/Quality.h.rej
1 ↗(On Diff #262852)

Bad merge?

usaxena95 updated this revision to Diff 293706.Sep 23 2020, 5:41 AM
usaxena95 marked 4 inline comments as done.

Addressed comments.

clang-tools-extra/clangd/Quality.h
139

I intended to separate out the concrete signals from properties/utilities used to calculate other derived signals.

I agree the previous grouping made it makes it easier to follow the meaning of these. So reverted it.

165

Now evaluate() calls this.

sammccall accepted this revision.Sep 23 2020, 6:04 AM
This revision is now accepted and ready to land.Sep 23 2020, 6:04 AM
usaxena95 edited the summary of this revision. (Show Details)Sep 23 2020, 7:10 AM