The RecursiveASTVisitor sometimes visits exprs in initializer lists twice. Added deduplication to prevent duplicate highlighting tokens from appearing. Done by sorting and a linear search.
Details
- Reviewers
- hokein - sammccall - ilya-biryukov 
- Commits
- rG4e34a85aa2e7: [clangd] Fix duplicate highlighting tokens appearing in initializer lists.
 rCTE366070: [clangd] Fix duplicate highlighting tokens appearing in initializer lists.
 rL366070: [clangd] Fix duplicate highlighting tokens appearing in initializer lists.
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
- Buildable 35001 - Build 35000: arc lint + arc unit 
Event Timeline
| clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
|---|---|---|
| 36 | nit: we could write it like llvm::sort(Tokens, [](const HighlightingToken &Lhs, const HighlightingToken &Rhs) {
   return std::tie(Lhs.Kind, Lhs.R) < std::tie(Rhs.Kind, Rhs.R);
});
auto Last = std::unique(Tokens.begin(), Tokens.end());
Tokens.erase(Last, Tokens.end()); | |
| clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
|---|---|---|
| 36 | About the comparator function though. I added an < operator because https://reviews.llvm.org/D64475 also sorts the tokens. However, could I just remove the sort in the main diffHighlightings method and just rely on the highlightings to be sorted in the first place from this patch? Instead we'd do the sorting like: llvm::sort(Tokens, [](const HighlightingToken &Lhs, const HighlightingToken &Rhs) {
   return std::tie(Lhs.R,Lhs.Kind) < std::tie(Rhs.R,Rhs.Kind);
});so we get it sorted by their position first and Kind second. | |
Have we tried figuring out why RecursiveASTVisitor visits the argument lists twice? Is that an expected behavior?
| clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
|---|---|---|
| 37 | NIT: in LLVM style, these should be called LHS and RHS as they are abbreviations. (One-letter names are harder to confuse with each other) | |
The comment for the function that traverses initialization lists says this:
This method is called once for each pair of syntactic and semantic
InitListExpr, and it traverses the subtrees defined by the two forms. This
may cause some of the children to be visited twice, if they appear both in
the syntactic and the semantic form.
So it seems to be expected. (and looking at the documentation for InitListExpr it seems to be difficult to change RecursiveASTVisitor to visit every sub expr once)
LGTM (I think @hokein 's comments were addressed too)
So it seems to be expected. (and looking at the documentation for InitListExpr it seems to be difficult to change RecursiveASTVisitor to visit every sub expr once)
Ah, this particular piece of behavior seems really surprising to me.
We should definitely only traverse syntactic form if shouldVisitImplicitCode() == false, which should help us avoid this corner case here. But that's a topic for a separate patch, please land this to unbreak semantic highlighting.
nit: we could write it like
llvm::sort(Tokens, [](const HighlightingToken &Lhs, const HighlightingToken &Rhs) { return std::tie(Lhs.Kind, Lhs.R) < std::tie(Rhs.Kind, Rhs.R); }); auto Last = std::unique(Tokens.begin(), Tokens.end()); Tokens.erase(Last, Tokens.end());