This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Fix duplicate highlighting tokens appearing in initializer lists
ClosedPublic

Authored by jvikstrom on Jul 12 2019, 7:08 AM.

Details

Summary

The RecursiveASTVisitor sometimes visits exprs in initializer lists twice. Added deduplication to prevent duplicate highlighting tokens from appearing. Done by sorting and a linear search.

Diff Detail

Repository
rL LLVM

Event Timeline

jvikstrom created this revision.Jul 12 2019, 7:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2019, 7:08 AM
hokein added inline comments.Jul 12 2019, 2:12 PM
clang-tools-extra/clangd/SemanticHighlighting.cpp
36 ↗(On Diff #209479)

nit: we could write it like

llvm::sort(Tokens, [](const HighlightingToken &Lhs, const HighlightingToken &Rhs) {
   return std::tie(Lhs.Kind, Lhs.R) < std::tie(Rhs.Kind, Rhs.R);
});
auto Last = std::unique(Tokens.begin(), Tokens.end());
Tokens.erase(Last, Tokens.end());
jvikstrom marked an inline comment as done.Jul 15 2019, 1:30 AM
jvikstrom added inline comments.
clang-tools-extra/clangd/SemanticHighlighting.cpp
36 ↗(On Diff #209479)

About the comparator function though. I added an < operator because https://reviews.llvm.org/D64475 also sorts the tokens. However, could I just remove the sort in the main diffHighlightings method and just rely on the highlightings to be sorted in the first place from this patch?

Instead we'd do the sorting like:

llvm::sort(Tokens, [](const HighlightingToken &Lhs, const HighlightingToken &Rhs) {
   return std::tie(Lhs.R,Lhs.Kind) < std::tie(Rhs.R,Rhs.Kind);
});

so we get it sorted by their position first and Kind second.

jvikstrom updated this revision to Diff 209773.Jul 15 2019, 1:38 AM

Remove operator< and use std::unique.

jvikstrom updated this revision to Diff 209774.Jul 15 2019, 1:39 AM

Readded newline that was removed accidentaly.

jvikstrom updated this revision to Diff 209775.Jul 15 2019, 1:42 AM

Removed return type hint for lambda.

Have we tried figuring out why RecursiveASTVisitor visits the argument lists twice? Is that an expected behavior?

clang-tools-extra/clangd/SemanticHighlighting.cpp
37 ↗(On Diff #209775)

NIT: in LLVM style, these should be called LHS and RHS as they are abbreviations.
However, in clangd we typically use L and R for that in those cases, I would suggest sticking to that.

(One-letter names are harder to confuse with each other)

jvikstrom updated this revision to Diff 209852.Jul 15 2019, 7:13 AM
jvikstrom marked an inline comment as done.

Address comment.

Have we tried figuring out why RecursiveASTVisitor visits the argument lists twice? Is that an expected behavior?

The comment for the function that traverses initialization lists says this:

This method is called once for each pair of syntactic and semantic
InitListExpr, and it traverses the subtrees defined by the two forms. This
may cause some of the children to be visited twice, if they appear both in
the syntactic and the semantic form.

So it seems to be expected. (and looking at the documentation for InitListExpr it seems to be difficult to change RecursiveASTVisitor to visit every sub expr once)

ilya-biryukov accepted this revision.EditedJul 15 2019, 7:52 AM

LGTM (I think @hokein 's comments were addressed too)

So it seems to be expected. (and looking at the documentation for InitListExpr it seems to be difficult to change RecursiveASTVisitor to visit every sub expr once)

Ah, this particular piece of behavior seems really surprising to me.
We should definitely only traverse syntactic form if shouldVisitImplicitCode() == false, which should help us avoid this corner case here. But that's a topic for a separate patch, please land this to unbreak semantic highlighting.

This revision is now accepted and ready to land.Jul 15 2019, 7:52 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2019, 8:08 AM