Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 40843 Build 40980: arc lint + arc unit
Event Timeline
This patch adds server-side support for greying out code in inactive preprocessor branches (issue #132).
I didn't write test yet. I wanted to post the patch for feedback first, to see if the general approach is ok.
For reference, I also posted a proof-of-concept client impl in D67537, which I used to test/validate this patch.
Rather than a separate method with parallel implementation, this seems very closely related to the syntax highlighting feature.
The minimal way to model this (no new protocol) would be for each disabled line, to add one token spanning the whole line with a TextMate scope like comment.disabled or meta.disabled.
There's no technical reason tokens can't overlap in the protocol, though it's possible editors will choose to turn off local heuristic highlighting on anything that's touched by an LSP-provided token.
Failing that, I'd suggest encoding a list of line-styles on SemanticHighlightingInformation, that should be combined with any tokens on that line.
The updated patch formulates this as a new semantic highlighting kind.
The tokens created for inactive regions are one per line, with the character offset and length being zero; the idea is that the client will handle this highlighting specially (see D67537) and highlight the entire line.
Note, highlighting the entire line is different than highlighting just the portion of the line that contains text. We want the grey background highlight to extend to the entire row in the editor, even over columns beyond the end of the text line (as accomplished with vscode's isWholeLine=true option in DecorationRenderOptions).
I don't love special-casing the scope in the client, but this does avoid having to extend the protocol itself which would be considerably more work.
Could you add tests for this?
clang-tools-extra/clangd/Compiler.cpp | ||
---|---|---|
66 ↗ | (On Diff #222336) | I'm not sure how does this flag impact the size of Preamble/AST, @ilya-biryukov any thoughts? |
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
164 | I think the current implementation only collects the skipped preprocessing ranges after preamble region in the main file. We probably need to store these ranges in PreambleData to make the ranges in the preamble region work, no need to do it in this patch, but we'd better have a test reflecting this fact. #ifdef ActiveCOde // inactive code here #endif #include "foo.h" // preamble ends here namespace ns { // .. } | |
173 | This seems too couple with VSCode client, I would prefer to calculate the range of the line and return to the client. Is there any big differences in VSCode between highlighting with the isWholeLine and highlighting with the range of the line? | |
clang-tools-extra/clangd/SemanticHighlighting.h | ||
45 | This is a different kind group, I would put it after the Macro, we'd need to update the LastKind. The name seems too specific, how about "UnreachableCode"? |
Certainly, I just wanted to discuss the general approach first, as it will affect what the tests look like.
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
---|---|---|
173 | I took some screenshots to illustrate to difference. Highlighting only to the end of the line of text: Highlighting the whole line: I think the first one looks pretty bad, and is inconsistent with existing practice. Note also that the suggestion is not to special-case the VSCode client specifically; it's to special-case one particular highlighting, which any client can implement. If this special-casing is really unpalatable, we could instead try this suggestion by @sammccall:
I guess one consideration when evaluating these options is, do we expect to use that "list of line-styles" for anything else in the future? I can't think of anything at the moment, but perhaps there are other uses for it. If not, we could do something slightly simpler, and add a single isInactive flag to SemanticHighlightingInformation. |
clang-tools-extra/clangd/Compiler.cpp | ||
---|---|---|
66 ↗ | (On Diff #222336) | Have no idea, but why do we need this in the first place? |
I got the name from "$cquery/publishInactiveRegions" :) But I don't particularly care what we call it.
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
---|---|---|
161 | I think this comment could be clearer, e.g. // Add tokens indicating lines skipped by the preprocessor. | |
173 | Three approaches seem feasible here:
3 is clearly better than 2 I think, it's more explicit. I don't have a strong opinion of 1 vs 3, but if going with 1 I think it's a good idea to measure the line as Haojian says, so we at least get a basic version of the feature if the client doesn't know about line styles.
Preprocessor directives maybe? (Though these are easy enough for clients to highlight with regex) |
clang-tools-extra/clangd/Compiler.cpp | ||
---|---|---|
66 ↗ | (On Diff #222336) | Yes, PPCallbacks::SourceRangeSkipped works in place of using DetailedRecord. Thank you for the suggestion. |
clang-tools-extra/clangd/SemanticHighlighting.h | ||
45 | I changed it to "InactiveCode". ("Unreachable" seemed like the wrong word, it brings to mind control flow.) |
How would one even measure the line length? SourceManager doesn't sem to have a method like getLineLength() or similar.
If you look at functions like offsetToPosition in SourceCode.h, basically you get the buffer as a string (contains utf-8), find the offset you're interested in, and then zoom around looking for \n. Once you have a substring, calling lspLength() on it will give you the length in UTF-16 or whatever LSP is speaking at the moment.
Yes, there is no existing API for that, I think you'd need to get the source code from the SM at the specific offset, and do the \n counting.
clang-tools-extra/clangd/ParsedAST.h | ||
---|---|---|
100 | Instead of adding new member and methods in ParsedAST, I think we can do it in CollectMainFileMacros (adding a new field SkippRanges in MainFileMacros), then we can get the skipped ranges for preamble region within the main file for free. | |
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
173 | I can't say whether highlighting the line is better than highlighting the range of the line text, but below is the how the inactive TS code is highlighted in VSCode (only the range of text), I personally prefer this style. |
One thing that may be worth considering as well, is that if the client prefers to highlight the text of the line only, it can calculate the length of the line itself. In VSCode for instance, the line lengths are readily available; I imagine other editors are similar since they need that information for many purposes.
So I don't think clients will/should prefer that - for best rendering they should know this is a line highlight.
I think this comes down to how line highlights are represented in the protocol:
- by a separate field: no need to send line length
- by a special token bounds (e.g. [0,0)): no need to send line length
- by a special scope: sending line length is a nice-to-have as it provides graceful degradation for clients that don't understand this extension
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
---|---|---|
173 | I think that's an argument for making sure clients clearly distinguish between regular tokens and marking lines: overlapping tokens don't compose well, but we can easily say lines and token styles should compose. (That particular style is not for me, but it doesn't matter) |
I have actually seen clients that just make the text gray and it looks pretty nice (ReSharper for Visual Studio and IntelliJ IDEA definitely do that).
It also lets them consistently highlight part of the line (e.g. dead expressions or statements can be marked in gray even if they are on the same line).
So I wouldn't be so sure, at least some clients might prefer to do exclusively range (as opposed to line-based) highlightings.
PS Sorry if this was discussed before and I'm just missing context.
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
---|---|---|
173 | I find this a convincing argument for using line styles, thanks. Especially since, at some point in the future, I would like us to be able to produce regular token highlightings for inactive code, much like in that TS code screenshot. |
I don't think there is a conflict between this style, and using line highlightings: if the style being applied is a foreground color (gray text) as opposed to a background color, then it will look the same regardless of whether you apply it as a line style, or as a character range style.
It also lets them consistently highlight part of the line (e.g. dead expressions or statements can be marked in gray even if they are on the same line).
Highlighting part of a line is not applicable to inactive preprocessor branches in C++.
If we later introduce highlightings for dead expressions or statements, we can of course use character ranges and not lines for them.
Note that the opposite is true - inactive preprocessor branches can be expressed as range-based highlightings.
If we later introduce highlightings for dead expressions or statements, we can of course use character ranges and not lines for them.
Then we'll have two modes of operation - line-based and range-based highlightings. Clients will have to support both, we'll have code dealing with both.
The alternative is having only range-based highlightings and using them everywhere. That's the main reason I would advocate for range-based.
Just noticed the next version of LSP added diagnostic tags for things like "unused field" or "dead code":
https://microsoft.github.io/language-server-protocol/specifications/specification-3-15, search for DiagnosticTag.
So I guess we won't need ever add range-based highlightings for dead code (and similar), we'll use diagnostics instead.
Concerns about having two kinds of highlightings to handle on the clients are still there, though.
Except that if we want to allow background styling, the client would need to special-case the highlighting scope to know to apply the background style to the entire line.
Ok, I updated the patch to convey the line highlight separately from the token highlights.
I did use the simplified approach I mentioned, where I use a single isInactive flag. If you'd prefer the more general approach where we send an array of line styles, I can do that, though my personal preference would be to wait until we have an actual use case before generalizing.
There are still no tests yet, as I'm still not sure whether we have agreement on the approach. If you like the approach, please let me know, and I'll follow up with tests.
Thanks, I'm fine with the current approach, feel free to add unittests.
clang-tools-extra/clangd/Protocol.h | ||
---|---|---|
1212 | could you add some documentation describing this is a clangd extension? | |
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
480 | it took me a while to understand this code, If the NewLine is IsInactive, it just contains a single token whose range is [0, 0), can't we just? if (NewLine.back().Tokens.empty()) continue; bool InactiveLine = NewLine.back().Tokens.front().Kind == InactiveCode; assert(InactiveLine && NewLine.back().Tokens.size() == 1 && "IncativeCode must have a single token"); DiffedLines.back().IsInactive = true; | |
clang-tools-extra/clangd/SemanticHighlighting.h | ||
46 | we should document this Kind as it is different than other kinds, it is for line-style I believe? |
Addressed some nits and got existing tests to pass
Will follow up with new tests in the next update
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
---|---|---|
480 | An inactive line can contain token highlightings as well. For example, we highlight macro references in the condition of an #ifdef, and that line is also inactive if the condition is false. Clients can merge the line style (which is typically a background color) with the token styles (typically a foreground color). I did expand the comment to explain what the loop is doing more clearly. |
Address one minor remaining comment
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
---|---|---|
164 | Your observation is correct: the current implementation only highlights inactive lines after the preamble. For now, I added a test case with a FIXME for this. |
clang-tools-extra/clangd/ParsedAST.h | ||
---|---|---|
100 | My apologies, I overlooked this. |
thanks, mostly good, a few more nits.
clang-tools-extra/clangd/ParsedAST.h | ||
---|---|---|
137 | nit: it is not used anymore, remove it. | |
clang-tools-extra/clangd/Protocol.h | ||
1212 | could you document that the line-style highlightings can be combined with any token-style highlightings? | |
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
480 | thanks, I see. I think we can still simplify the code like below, this could improve the code readability, and avoid the comment explaining it. llvm::erase_if(AddedLine.Tokens, [](const Token& T) { return T.Kind == InactiveCode;}); |
Address nits
clang-tools-extra/clangd/SemanticHighlighting.cpp | ||
---|---|---|
480 | Done. Note that we still need to set AddedLine.IsInactive appropriately. I did that in the lambda, which feels like a strange thing for an erase_if predicate to do. The alternative would be doing a count_if first (but then we're looping over the tokens twice). |
Instead of adding new member and methods in ParsedAST, I think we can do it in CollectMainFileMacros (adding a new field SkippRanges in MainFileMacros), then we can get the skipped ranges for preamble region within the main file for free.