This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Compute the inactive code range for semantic highlighting.
ClosedPublic

Authored by hokein on Aug 10 2020, 3:04 AM.

Details

Summary

A screenshot in VSCode (with the LSP's v3.16 semanticToken enabled)

Diff Detail

Event Timeline

hokein created this revision.Aug 10 2020, 3:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2020, 3:04 AM
hokein requested review of this revision.Aug 10 2020, 3:04 AM
hokein edited the summary of this revision. (Show Details)Aug 10 2020, 4:11 AM
nridge added a subscriber: nridge.Aug 10 2020, 3:32 PM

Do you mean "inactive" (instead of "interactive") in the commit message?

hokein retitled this revision from [clangd] Compute the interactive code range for semantic highlighting. to [clangd] Compute the inactive code range for semantic highlighting..Aug 11 2020, 12:20 AM

Do you mean "inactive" (instead of "interactive") in the commit message?

oh, right. Fixed the typo.

Yup, this is a nice improvement, though there are further things we could do.

As discussed offline, we could consider mapping "disabled" to an attribute but we can't really consume that well in VSCode (or any other editor yet) so let's leave it.

clang-tools-extra/clangd/SemanticHighlighting.cpp
243

A little more direct and avoiding the special case:

StringRef LineText = MainCode.drop_front(StartOfLine).take_until([](char C) { return C == '\n'; });
...
{Position{Line, 0}, Position{Line, lspLength(LineText)}}
247

this may overlap existing tokens.
The spec is silent on this, it may pose a challenge to clients and makes for a complex model for little benefit. I think we should avoid overlapping tokens.

For now this can actually happen, as we consider #ifndef FOO to be disabled if FOO is defined, but also a usage of the macro FOO.
I'd suggest erasing these (collect their indexes and delete them in a second pass).

In future, I think we probably want to reduce the scope of the disabled regions to not include the directive itself, then this can become an assert.

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
83

AIUI the extra complexity here is to accommodate overlapping ranges, which we can likely get rid of.

hokein updated this revision to Diff 284715.Aug 11 2020, 7:31 AM
hokein marked 3 inline comments as done.

address review comments

sammccall accepted this revision.Aug 21 2020, 3:53 AM

I think the code could be a bit clearer, but up to you. The tests are good so feel free to submit any version you're happy with.

clang-tools-extra/clangd/SemanticHighlighting.cpp
247

Hmm, the role of NonConflicting is pretty confusing here: it has an invariant preserved as we construct it, then we violate it by dumping a bunch of ranges into it, then try to restore it.
And the correctness of the restoring seems to rely on some pretty subtle facts (that the inactive region always fully contains and is sorted *before* the regions it overlaps).

What about:

// Merge token stream with "inactive line" markers.
vector WithInactiveLines;
auto It = NonConflicting.begin();
for (Range &R : Ranges) {
  for (int Line : R) {
    // Copy tokens before the inactive line
    for (; It != NonConflicting.end() && It->position.begin.line < Line)
      WithInactiveLines.push_back(std::move(*It));
    // Add a token for the inactive line itself.
    WithInactiveLines.push_back(createInactiveLineToken(MainCode, Line));
    // Skip any other tokens on the inactive line
    while (It != NonConflicting.end() && It->position.begin.line == Line)
      ++It;
  }
}
// Copy tokens after the last inactive line
for (; It != NonOverlapping.end(); ++It)
  WithInactiveLines.push_back(std::move(*It));

The only real assumption here is "a token is associated with a line", which is pretty fundamental anyway.

This revision is now accepted and ready to land.Aug 21 2020, 3:53 AM
hokein updated this revision to Diff 287405.Aug 24 2020, 8:19 AM
hokein marked an inline comment as done.

address comment.