This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Duplicate lines of semantic highlightings sent removed.
ClosedPublic

Authored by jvikstrom on Jul 10 2019, 4:29 AM.

Details

Summary

Added a function for diffing highlightings and removing duplicate lines. It also returns empty lines if the IDE should remove previous highlighting tokens on the line. Integrated with state into the highlighting generation flow. Only works correctly as long as there are no multiline tokens.

Diff Detail

Repository
rL LLVM

Event Timeline

jvikstrom created this revision.Jul 10 2019, 4:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 10 2019, 4:29 AM

Thanks for bringing this up and implementing it.

I haven't looked into the details of the patch, just some high-level comments:

  • the scope of the patch seems a bit unclear to me, what's the problem we are trying to solve?
  • the patch looks like calculating the diff/delta highlightings (pre highlightings vs. new highlightings) from a server-only perspective, which seems incorrect. At least we should make sure that LSP client and server share the same understanding of source diff change, otherwise client may render the delta information in a wrong way. To do that, I guess we may start from the DidChangeTextDocument notification (where the client sends source changes to server);
  • I'm not sure that we should start implement this at the moment -- I don't see that we will get much performance gain, we save some traffic cost between LSP client and server, but at the same time we are spending more CPU resource to compute highlighting diff in server side. Maybe there are some other clever ways (like traversing the decls in source diff change);

We probably need more investigations, like trying the current implementation on some real-project files in a real client (e.g. theia) to figure out and understand whether the latency is a real issue.

sammccall added inline comments.Jul 11 2019, 3:19 AM
clang-tools-extra/clangd/SemanticHighlighting.cpp
172 ↗(On Diff #208919)

I think this is just Lhs.R.start < Lhs.R.end

(if it's not, we should add the operator<)

is it enforced somewhere that you don't have two highlights at the same spot?

190 ↗(On Diff #208919)

holding the lock while doing all the diffing seems dubious

You could reasonably put the cache as a map<file, highlights> in ClangdServer, then you don't have to deal with not having the cache for the current file.

You'd need to lock the map itself, but note that no races on updating individual entries are possible because onMainAST is called synchronously from the (per-file) worker thread.

207 ↗(On Diff #208919)

Whatever you do about storage, please pull out the diff(old, new) function so you can unit test it.

207 ↗(On Diff #208919)

(llvm uses unsigned for indices. It's a terrible idea, but it's used fairly consistently...)

209 ↗(On Diff #208919)

I can believe this is correct, but it's hard to verify as it's a bit monolithic and... index-arithmetic-y.

could you factor out something like:

using TokRange = ArrayRef<HighlightingToken>;

// The highlights for the current line.
TokRange OldLine = {PrevHighlightings.data(), 0}, NewLine = {Highlightings.data(), 0};
// iterate over lines until we run out of data
for (unsigned Line = 0; OldLine.end() < PrevHighlightings.end() || NewLine.end() < PrevHighlightings.end(); ++Line) {
  // Get the current line highlights
  OldLine = getLineRange(PrevHighlightings, Line, OldLine);
  NewLine = getLineRange(Highlightings, Line, NewLine);
  if (OldLine != NewLine)
    emitLine(NewLine, Line);
}

// get the highlights for Line, which follows PrevLine
TokRange getLineRange(TokRange AllTokens, unsigned Line, TokRange PrevLine) {
  return makeArrayRef(PrevLine.end(), AllTokens.end()).take_while( Tok => Tok.R.start.line == Line);
}
jvikstrom updated this revision to Diff 209225.Jul 11 2019, 8:14 AM
jvikstrom marked 5 inline comments as done.

Made diffing function shorter, added multiple previous highlighting entries.

jvikstrom updated this revision to Diff 209228.Jul 11 2019, 8:22 AM

Removed unused code snippets.

jvikstrom marked an inline comment as done.Jul 11 2019, 8:23 AM
jvikstrom added inline comments.
clang-tools-extra/clangd/SemanticHighlighting.cpp
172 ↗(On Diff #208919)

It's not enforced anywhere, it depends on how the HighlightingTokenCollector is implemented, it should not currently take any multiples of tokens at the same spot.

Even if there are tokens at the same location it should still satisfy Compare though?

190 ↗(On Diff #208919)

I did not know onMainAST was called synchronously per file.
I feel like after a while a lot of memory is going to be consumed by keeping ASTs for files that were closed long ago in the cache (say when you've opened 1000 files in a session, all those files will have one Highlightings entry here)
Could solve that with LRU (just keep a set of pair<long long, std::string> (unix time of adding/editing to cache and filename)... would actually have to be a struct, would need to make the equal operator check the filename only)

But maybe that isn't a big concern/a concern at all?

based on the offline discussion, now I understand the patch better (thanks).

some comments on the interface.

clang-tools-extra/clangd/ClangdLSPServer.h
60 ↗(On Diff #209228)

how about creating a new structure LineHighlightingTokens to represent the line-based tokens?

clang-tools-extra/clangd/ClangdServer.cpp
82 ↗(On Diff #209228)

If we make it as a pointer, the bool SemanticHighlighting is not needed (just follow the what the other field FIndex does).

clang-tools-extra/clangd/SemanticHighlighting.cpp
255 ↗(On Diff #209228)

we could get the canonical path from onMainAST callback (the first parameter PathRef Path).

clang-tools-extra/clangd/SemanticHighlighting.h
60 ↗(On Diff #209228)

nit: llvm::StringMap.

I think this map is storing all file highlightings, maybe call it FileToHighlightings?

74 ↗(On Diff #209228)

this method does two things,

  1. calculate the diff between old and new
  2. replace the old highlighting with new highlightings

but the name just reflects 1).

I think we just narrow it to 2) only:

// calculate the new highlightings, and return the old one.
std::vector<HighlightingToken> updateFile(PathRef Path, std::vector<HighlightingToken> NewHighlightings);

// to get diff, in `onMainAST`
diffHighlightings(differ.updateFile(Path, NewHighlightings), NewHighlightings);
82 ↗(On Diff #209228)

this utility method doesn't use any internal member of the class, we could pull it out or make it static.

As discussed offline, it's a little scary that the highlight state for diffing lives in a separate map from TUScheduler's workers, but it's hard to fix without introducing lots of abstractions or layering violations.

It does seem like a good idea to move the diffing up to ClangdLSPServer, as it's fairly bound to protocol details. It needs to be reset in didClose (and maybe didOpen too, for safety - I think there's a race when a document is closed, then tokens are delivered...).

I'd tend to put the map/mutex directly in ClangdLSPServer rather than encapsulating it in the Differ object (again, because it's a protocol/lifetime detail rather than diffing logic itself) but up to you.

jvikstrom updated this revision to Diff 210079.Jul 16 2019, 6:10 AM

Moved highlighting state to LSP layer. Removed class holding state. Addressed comments.

the code looks clearer now!

clang-tools-extra/clangd/ClangdLSPServer.cpp
609 ↗(On Diff #210079)

nit: I would create a new {} block merely for the hightlightings.

1110 ↗(On Diff #210079)

this seems unsafe, we get a reference of the map value, we might access it without the mutex being guarded.

std::vector<HighlightingToken> Old;
{
    std::lock_guard<std::mutex> Lock(HighlightingsMutex);
    Old = std::move(FileToHighlightings[File]);
}
1124 ↗(On Diff #210079)

FileToPrevHighlightings[File] = std::move(Highlightings);

clang-tools-extra/clangd/ClangdLSPServer.h
140 ↗(On Diff #210079)

nit: I'd drop Prev.

clang-tools-extra/clangd/SemanticHighlighting.cpp
229 ↗(On Diff #210079)

nit: I'm not sure I understand the comment.

233 ↗(On Diff #210079)

nit: this implies that OldLine and AllTokens must ref to the same container, could you refine the OldLine as a start Iterator?

255 ↗(On Diff #210079)

IIUC, we are initializing an empty ArrayRef, if so, how about using NewLine(Highlightings.data(), /*length*/0)? NewLine(Highlightings.data(), Highlightings.data()) is a bit weird, it took me a while to understand the purpose.

257 ↗(On Diff #210079)

nit: split it into a new statement.

259 ↗(On Diff #210079)

we don't change Current and Prev below, how about re-using Highlightings and PrevHighlightings?

clang-tools-extra/clangd/SemanticHighlighting.h
71 ↗(On Diff #210079)

I believe Assumes the highlightings are sorted by the tokens' ranges. is a requirement for this function?

If so, mark it more explicitly in the comment, like

///
/// REQUIRED: OldHighlightings and NewHighlightings are sorted.
74 ↗(On Diff #210079)

nit: I'd use name the parameters symmetrically, NewHighlightings and OldHighlightings.

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
32 ↗(On Diff #210079)

we don't need ParsedAST now I think.

265 ↗(On Diff #210079)

this is a complicated structure, could you add some comments describing the test strategy here?

jvikstrom marked 15 inline comments as done.

Address comments.

clang-tools-extra/clangd/ClangdLSPServer.cpp
1110 ↗(On Diff #210079)

Aren't the parsing callbacks thread safe in the same file? I think @sammccall said so above.

Although actually, the map might reallocate and move around the memory if things are inserted/deleted invalidating the reference maybe (I have no idea of how StringMap memory allocation works).

So I guess it doesn't hurt to be safe and copy it.

clang-tools-extra/clangd/SemanticHighlighting.cpp
255 ↗(On Diff #210079)

I couldn't do that because the ArrayRef(const T *data, size_t length) and ArrayRef(const T *begin, const T *end) were ambiguous. Added a cast to cast 0 to a size_t which solved it.

hokein added inline comments.Jul 17 2019, 1:53 AM
clang-tools-extra/clangd/SemanticHighlighting.cpp
255 ↗(On Diff #210079)

you could also do it like NewLine(Highlightings.data(), /*length*/0UL);, which saves you a static_cast<>.

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
281 ↗(On Diff #210253)

so the empty lines are stored separately, which is not easy to figure out from the testing code snippet. I think we should make the empty lines more obvious in the code snippet (maybe use an Empty annotations?) , and explicitly verify the empty lines.

What do you think refining the test like below, just annotate the diff tokens? I find it is easy to spot the diff tokens.

struct Testcase {
   Code Before;
   Code After;
};

std::vector<TestCase> cases = {
   { 
       R"cpp(
             int a;
             int b; 
             int c;
      )cpp",
      R"cpp(
            int a;
            $Empty[[]] // int b
            int $Variable[[C]];
      )cpp"
  }
}

oldHighlightings = getSemanticHighlightings(OldAST);
newHighlightings = getSemanticHighlightings(NewAST);
diffHighlightings = diff...;
// verify the diffHighlightings has the expected empty lines ("Empty" annotations).
// verify the diffHighlightings has the expected highlightings (the regular annotations);
jvikstrom updated this revision to Diff 210282.Jul 17 2019, 3:10 AM
jvikstrom marked 4 inline comments as done.

Rewrote test to be more readable.

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
281 ↗(On Diff #210253)

Moved everything into the checkDiffedHighlights as well.

hokein added inline comments.Jul 17 2019, 3:58 AM
clang-tools-extra/clangd/SemanticHighlighting.cpp
275 ↗(On Diff #210282)

nit: remove the redundant ;.

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
35 ↗(On Diff #210282)

could we split it into three functions?

  • getExpectedHighlightings
  • getActualHighlightings
  • getExpectedEmptyLines
70 ↗(On Diff #210282)

nit: OrgCode => OldCode, use llvm::StringRef

98 ↗(On Diff #210282)

I think the tokens are also sorted, as we pass two sorted lists of highlightings to diffHighlightings.

282 ↗(On Diff #210282)

nit: the code indentation is strange

283 ↗(On Diff #210282)

The annotations in the OldCode are not used -- you only use the actual highlightings in the checkDiffedHighlights.

jvikstrom updated this revision to Diff 210297.Jul 17 2019, 5:48 AM
jvikstrom marked 6 inline comments as done.

Address comments.

hokein added inline comments.Jul 18 2019, 3:04 AM
clang-tools-extra/clangd/SemanticHighlighting.cpp
119 ↗(On Diff #210297)

This means if we won't emit most of symbols if the AST is ill-formed? I'm not sure whether we should include it in this patch, it seems out of scope. could we leave it out in this patch?

clang-tools-extra/clangd/SemanticHighlighting.h
68 ↗(On Diff #210297)

The comment seems a bit out of date (as the code is updated during review), and it should mention the diff strategy of this function.

// Return a line-by-line diff between two highlightings.
//  - if the tokens on a line are the same in both hightlightings, we omit this line
//  - if a line exists in NewHighligtings but not in OldHighligtings, we emit the tokens on this line
//  - if a line not exists in NewHighligtings but in OldHighligtings, we emit an empty line (to tell client not highlighting this line)
clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
53 ↗(On Diff #210297)

nit: let's sort the tokens here to make the result deterministic.

59 ↗(On Diff #210297)

nit: you could use for-range loop to simplify the code.

for (const auto& EmptyRange : EmptyRanges)
   EmptyLines.push_back(EmptyRange.start.line);
75 ↗(On Diff #210297)

nit: since OldActualTokens and NewActualTokens are used only once in diffHighlightings, I'd inline them there (instead of creating two new variables).

88 ↗(On Diff #210297)

if we do sort in getExpectedTokens, we don't need this here.

274 ↗(On Diff #210297)

nit: Use an explicit struct, then you wouldn't need a comment.

struct {
   llvm::StringRef OldCode;
   llvm::StringRef NewCode;
} TestCases[] = {
   ...
}
275 ↗(On Diff #210297)

could we add one more case?

// Before
int a;
int b;
int c;

// After
int a;
int $Variable[[new_b]];
int c;
327 ↗(On Diff #210297)

nit: const auto&

jvikstrom updated this revision to Diff 210527.Jul 18 2019, 4:32 AM
jvikstrom marked 9 inline comments as done.

Address comments.

hokein accepted this revision.Jul 18 2019, 4:41 AM

Thanks! Looks good from my side.

@ilya-biryukov will be nice if you could take a second look on the patch. We plan to land it before the release cut today.

clang-tools-extra/clangd/ClangdLSPServer.cpp
1115 ↗(On Diff #210527)

use std::move() to save a copy here. we'd drop the old highlighting anyway (replace it with new highlightings).

This revision is now accepted and ready to land.Jul 18 2019, 4:41 AM

one missing point -- could you please update the patch description?

jvikstrom edited the summary of this revision. (Show Details)Jul 18 2019, 4:59 AM

A few suggestions from me, I haven't looked into the diffing algorithm and the tests yet.

clang-tools-extra/clangd/ClangdLSPServer.cpp
458 ↗(On Diff #210527)

Should can't we handle this on didClose instead?

1115 ↗(On Diff #210527)

NIT: maybe std::move() into Old?

If we had exceptions, that would be complicated, but we don't have them

1124 ↗(On Diff #210527)

To follow the same pattern as diagnostics, could we store before calling publish...?

clang-tools-extra/clangd/SemanticHighlighting.cpp
276 ↗(On Diff #210527)

Do you have any ideas on how we can fix this?

I would simply split those tokens into multiple tokens of the same kind at newlines boundaries, but maybe there are other ways to handle this.

In any case, could you put a suggested way to fix this (in case someone will want to pick it up, they'll have a starting point)

clang-tools-extra/clangd/SemanticHighlighting.h
79 ↗(On Diff #210527)

Are we locked into the line-based diff implementation?
It works nicely while editing inside the same line, but seem to do a bad job on a common case of inserting/removing lines.

Does the protocol have a way to communicate this cleanly?

jvikstrom marked 6 inline comments as done.Jul 18 2019, 7:53 AM
jvikstrom added inline comments.
clang-tools-extra/clangd/ClangdLSPServer.cpp
458 ↗(On Diff #210527)

We are removing in didClose but the problem is that there is a race condition (I think).

If someone does some edits and closes the document right after, the highlightings for the final edit might finish being generated after the FileToHighlightings have earsed the highlightings for the file. So next time when opening the file we will have those final highlightings that were generated for the last edit in the map.
I don't really know how we could handle this in didClose without having another map and with an open/closed bit and checking that every time we generate new highlightings. But we'd still have to set the bit to be open every didOpen so I don't really see the benefit.

However I've head that ASTWorked is synchronous for a single file, is that the case for the didClose call as well? Because in that case there is no race condition.

clang-tools-extra/clangd/SemanticHighlighting.cpp
276 ↗(On Diff #210527)

Yeah, I would split the tokens into multiple lines as well. Or enforce that a token is single line and handle it in addToken in the collector. (i.e. change the HighlightingToken struct)

It's a bit annoying to solve because we'd have to get the line lengths of every line that the multiline length token covers when doing that.

clang-tools-extra/clangd/SemanticHighlighting.h
79 ↗(On Diff #210527)

We are looked into some kind of line based diff implementation as the protocol sends token line by line.
There should be away of solving the inserting/removing lines, but I'll have a look at that in a follow up.
Theia seems to do a good job of moving tokens to where they should be automatically when inserting a new line though. I want to be able to see how vscode handles it first as well though before.

ilya-biryukov added inline comments.Jul 18 2019, 7:53 AM
clang-tools-extra/clangd/SemanticHighlighting.cpp
274 ↗(On Diff #210527)

NIT: maybe rename to New and Old, the suffix of the name could be easily inferred from the variable type (clangd has hover/go-to-definition to find the type quickly, the code is rather small so its should always be visible on the screen too).

276 ↗(On Diff #210527)

NIT: add assertions that highlightings are sorted.

286 ↗(On Diff #210527)

NIT: maybe mention diff in the name somehow. I was confused at first, as I thought it's all highlightings grouped by line, not the diff.

292 ↗(On Diff #210527)

NIT: maybe just inline NewEnd and OldEnd to save a few lines?

294 ↗(On Diff #210527)

Could we skip directly to the next interesting line?
The simplest way to do this I could come up with is this:

auto NextLine = [&]() {
  int NextNew = NewLine.end() != NewHighlightings.end() ? NewLine.end()->start.line : numeric_limits<int>::max();
  int NextOld = ...;
  return std::min(NextNew, NextOld);
}

for (int Line = 0; ...; Line = NextLine()) {
}
jvikstrom updated this revision to Diff 210567.Jul 18 2019, 7:54 AM
jvikstrom edited the summary of this revision. (Show Details)

Added additional note to FIXME on how to solve and also move(ing) Old.

ilya-biryukov added inline comments.Jul 18 2019, 8:04 AM
clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
307 ↗(On Diff #210527)

Can we test this in a more direct manner by specifying:

  1. annotated input for old highlightings,
  2. annotated input for new highlightings,
  3. the expected diff?

The resulting tests don't have to be real C++ then, allowing to express what we're testing in a more direct manner.

{/*Old*/ "$Variable[[a]]", /*New*/ "$Class[[a]]", /*Diff*/ "{{/*line */0, "$Class[[a]]"}}

It also seems that the contents of the lines could even be checked automatically (they should be equal to the corresponding line from /*New*/, right?), that leaves us with even simpler inputs:

{/*Old*/ "$Variable[[a]]", /*New*/ "$Class[[a]]", /*DiffLines*/ "{0}}
391 ↗(On Diff #210527)

NIT: remove braces

ilya-biryukov marked an inline comment as done.Jul 18 2019, 8:17 AM
ilya-biryukov added inline comments.
clang-tools-extra/clangd/ClangdLSPServer.cpp
458 ↗(On Diff #210527)

You are correct, there is actually a race condition. We worked hard to eliminate it for diagnostics, but highlightings are going through a different code path in ASTWorker, not guarded by DiagsMu and ReportDiagnostics.

And, unfortunately, I don't think this guard here prevents it in all cases. In particular, there is still a possibility (albeit very low, I guess) that the old highlightings you are trying to remove here are still being computed. If they are reported after this erase() runs, we will end up with inconsistent highlightings.

Ideally, we would guard the diagnostics callback with the same mutex, but given our current layering it seems like a hard problem, will need to think what's the simplest way to fix this.

clang-tools-extra/clangd/SemanticHighlighting.h
79 ↗(On Diff #210527)

It seems there no compact way to send interesting diffs then, although the clients can do a reasonably good job of moving the older version of highlightings on changes until the server sends them a new version.

The diff-based approach only seems to be helping with changes on a single line.

Which is ok, thanks for the explanation.

jvikstrom updated this revision to Diff 210600.Jul 18 2019, 9:22 AM
jvikstrom marked 7 inline comments as done.

Address comments and fixed bug where we'd send diffs outside of the file.

jvikstrom updated this revision to Diff 210601.Jul 18 2019, 9:23 AM

Remove braces.

Harbormaster completed remote builds in B35277: Diff 210601.
jvikstrom updated this revision to Diff 210606.Jul 18 2019, 9:40 AM

Fixed test case that was actually (kinda) broken even though it didn't fail.

(The last two edits in semantic-highlighting test. They removed the = 2 part and did not insert an extra ; which produced invalid code. It was still fine because the highlightings that were supposed to be generated were still generated. This just makes the code not be invalid)

jvikstrom updated this revision to Diff 210613.Jul 18 2019, 9:50 AM

Removed one lock from onSemanticHighlighting.

jvikstrom added inline comments.Jul 18 2019, 10:34 AM
clang-tools-extra/clangd/SemanticHighlighting.cpp
292 ↗(On Diff #210527)

Wouldn't that violate code style: https://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop ?
Or maybe that doesn't matter? (I have no preference either way, just picked this way as I could remeber reading from the styleguide)

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
307 ↗(On Diff #210527)

That's a great idea on how to write these tests.

hokein added inline comments.Jul 19 2019, 2:28 AM
clang-tools-extra/clangd/ClangdLSPServer.cpp
458 ↗(On Diff #210527)

The race condition of highlighting sounds bad (especially when a user opens a large file and closes it immediately, then the highlighting is finished and we emit it to the client). No need to fix it in this patch, just add a FIXME.

Can we use the same mechanism for Diagnostic to guard the highlighting here? or use the DiagsMu and ReportDiagnostics to guard the callback.onMainAST() as well (which is probably not a good idea)...

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
307 ↗(On Diff #210527)

hmm, I have a different opinion here (I'd prefer the previous one), let's chat.

ilya-biryukov added inline comments.Jul 19 2019, 6:39 AM
clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
307 ↗(On Diff #210613)

@hokein rightfully pointed out that mentioning all changed lines makes the tests unreadable.
An alternative idea we all came up with is to force people to put ^ on each of the changed lines inside the NewCode, i.e.

{/*Before*/ R"(
  $Var[[a]]
  $Func[[b]]
"),
 /*After*/ R"(
  $Var[[a]]
 ^$Func[[b]]
)"} // and no list of lines is needed!

Could we do that here?

One interesting case that we can't test this way to removing lines from the end of the file. But for that particular case, could we just write a separate test case?

The fix for a race condition on remove has landed in rL366577, this revision would need a small update after it.

jvikstrom updated this revision to Diff 211890.Jul 26 2019, 1:21 AM
jvikstrom marked 2 inline comments as done.

Made tests more readable. Updated to work with rL366577.

The fix for a race condition on remove has landed in rL366577, this revision would need a small update after it.

Fixed to work with that patch.
Should the diffing be moved somewhere else that is not inside the publish function though? That would require moving the state outside the LSP server though and handle the onClose/onOpen events somewhere else though.
Maybe it's ok to keep the diffing in the publish lock?

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
307 ↗(On Diff #210613)

We don't want to send diffs that are beyond the end of the file. There is a testcase for that as well (the count of newlines in the new code is sent to the differ as the number of lines in the file).

The fix for a race condition on remove has landed in rL366577, this revision would need a small update after it.

Fixed to work with that patch.
Should the diffing be moved somewhere else that is not inside the publish function though? That would require moving the state outside the LSP server though and handle the onClose/onOpen events somewhere else though.
Maybe it's ok to keep the diffing in the publish lock?

Yeah, let's keep it in publish for now. Diffing in there does not change complexity of the operations.
If it ever turns out that the constant factor is too high, we can figure out ways to fight this.

Mostly final NITs from me, but there is an important one about removing the erase call on didOpen.

clang-tools-extra/clangd/ClangdLSPServer.cpp
467 ↗(On Diff #211890)

Now that the patch landed, this is obsolete. Could you remove?

clang-tools-extra/clangd/ClangdServer.cpp
80 ↗(On Diff #211890)

NIT: use unsigned instead of unsigned int

clang-tools-extra/clangd/ClangdServer.h
56 ↗(On Diff #211890)

NIT: could you document NLines

clang-tools-extra/clangd/SemanticHighlighting.cpp
294 ↗(On Diff #211890)

Could we try to find a better name for this? Having Line and NextLine() which represent line numbers and NewLine which represents highlightings, creates some confusion.

310 ↗(On Diff #211890)

Line != intmax is redundant (NLines is <= intmax by definition).
Maybe remove it altogether to simplify the loop condition?

clang-tools-extra/clangd/SemanticHighlighting.h
80 ↗(On Diff #211890)

Could you document what NLines is? Specifically, whether it's the number of lines for New or Old.

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
60 ↗(On Diff #211890)

NIT: add documentation on how this should be used
Most importantly, the fact that we need to put ^ on all changed lines should be mentioned.

66 ↗(On Diff #211890)

NIT: use llvm::DenseMap

a few more comments from my side.

clang-tools-extra/clangd/SemanticHighlighting.cpp
309 ↗(On Diff #211890)

nit: I believe theia is crashing because of this? could we file a bug to theia? I think it would be nice for client to be robust on this case.

clang-tools-extra/clangd/SemanticHighlighting.h
70 ↗(On Diff #211890)

nit: add , before this

73 ↗(On Diff #211890)

could you refine this sentence, I can't parse it.

74 ↗(On Diff #211890)

and here as well.

80 ↗(On Diff #211890)

could we use a more describe name for NLines? maybe MaxLines

jvikstrom updated this revision to Diff 212301.Jul 30 2019, 2:26 AM
jvikstrom marked 14 inline comments as done.

Address comments.

jvikstrom added inline comments.Jul 30 2019, 2:33 AM
clang-tools-extra/clangd/SemanticHighlighting.cpp
294 ↗(On Diff #211890)

I renamed the Line and NextLine() instead. Could also rename NewLine to something like NewLineHighlightings but I feel like it almost becomes to verbose at that point.

309 ↗(On Diff #211890)

I seem to be misremembering, when I first started testing in theia I think I was crashing the client (due to a broken base64 encoding function, will take a look and see what was actually happening, can't quite remember)
It actually seems like theia just ignores any out of bounds highlightings.

So this could be removed, it will just sometimes return highlightings that are outside of the file and hopefully any future clients would handle that as well.

hokein added inline comments.Jul 30 2019, 4:51 AM
clang-tools-extra/clangd/ClangdServer.cpp
81 ↗(On Diff #212301)

from the comment of the getLineNumber, this is not cheap to call this method. We may use SM.getBufferData(MainFileID).count('\n') to count the line numbers.

// This requires building and caching a table of line offsets for the
/// MemoryBuffer, so this is not cheap: use only when about to emit a
/// diagnostic.
clang-tools-extra/clangd/ClangdServer.h
55 ↗(On Diff #212301)

nit: drop the needed ..., this is an interface, don't mention the details of subclass (diffs are subclass implementation details).

Maybe name it "NumLines"?

jvikstrom updated this revision to Diff 212522.Jul 31 2019, 1:13 AM
jvikstrom marked 2 inline comments as done.

Stopped using expensive getLineNumber function.

ilya-biryukov added inline comments.Jul 31 2019, 1:17 AM
clang-tools-extra/clangd/ClangdLSPServer.cpp
1125 ↗(On Diff #212301)

NIT: avoid copying (from Highlightings into the map) under a lock, make the copy outside the lock instead

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
408 ↗(On Diff #212301)

Could you also add a separate test that checks diffs when the number of lines is getting smaller (i.e. taking NLines into account)

ilya-biryukov added inline comments.Jul 31 2019, 1:18 AM
clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
408 ↗(On Diff #212301)

I would expect this to be better handled outside checkDiffedHighlights to avoid over-generalizing this function. But up to you.

jvikstrom updated this revision to Diff 212528.Jul 31 2019, 1:39 AM
jvikstrom marked an inline comment as done.

Copy outside lock.

jvikstrom added inline comments.Jul 31 2019, 1:40 AM
clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
408 ↗(On Diff #212301)

That's what this test does:

{
                    R"(
        $Class[[A]]
        $Variable[[A]]
        $Class[[A]]
        $Variable[[A]]
      )",
                    R"(
        $Class[[A]]
        $Variable[[A]]
      )"},

But do you mean I should do a testcase like:

{
                    R"(
        $Class[[A]]
        $Variable[[A]]
        $Class[[A]]
        $Variable[[A]]
      )",
                    R"(
        $Class[[A]]
        $Variable[[A]]
        $Class[[A]]
        $Variable[[A]]
      )"},

And just set NLines = 2 instead when doing the diffing?

hokein added inline comments.Jul 31 2019, 1:43 AM
clang-tools-extra/clangd/ClangdLSPServer.cpp
1125 ↗(On Diff #212301)

I think we can even avoid copy, since we only use the Highlightings for calculating the diff.

Maybe just

{
    std::lock_guard<std::mutex> Lock(HighlightingsMutex);
    Old = std::move(FileToHighlightings[File]);
}
// calculate the diff.
{  
     std::lock_guard<std::mutex> Lock(HighlightingsMutex);
     FileToHighlightings[File] = std::move(Highlightings);
}
jvikstrom marked an inline comment as done.Jul 31 2019, 8:20 AM
jvikstrom added inline comments.
clang-tools-extra/clangd/ClangdLSPServer.cpp
1125 ↗(On Diff #212301)

I think I changed this to only lock once (and copy instead) at the same time me and @ilya-biryukov were talking about the race condition (?)

ilya-biryukov accepted this revision.Jul 31 2019, 9:01 AM
ilya-biryukov marked an inline comment as done.

LGTM from my side

clang-tools-extra/clangd/ClangdLSPServer.cpp
1125 ↗(On Diff #212301)

It means there's a window where nobody can access the highlightings for this file. Which is probably fine, we really do use this only for computing the diff and nothing else.

But doing the copy shouldn't matter here either, so leaving as is also looks ok from my side.

If you decide to avoid an extra copy, please add comments to FileToHighlightings (e.g. only used to compute diffs, must not be used outside onHighlightingsReady and didRemove).
It is declared really close to FixItsMap and looks very similar, but the latter is used in completely different ways; that could lead to confusion.

Don't remember exact details about race conditions, but we should be good now that it's called inside Publish()

clang-tools-extra/clangd/unittests/SemanticHighlightingTests.cpp
408 ↗(On Diff #212301)

Ah, the test actually needs the set of changed lines to be exactly the same as marked lines.
In that case, all is good, I've just missed this detail.

hokein accepted this revision.Aug 1 2019, 12:52 AM
hokein added inline comments.
clang-tools-extra/clangd/ClangdLSPServer.cpp
1125 ↗(On Diff #212301)

I'm fine with the current state, avoiding the copy cost would make the code here a bit mess...

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 1 2019, 1:08 AM