Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This is my attempt to avoid triggering the textual fallback for string literals, as per the discussion in the issue.
To classify tokens into the categories discussed in the issue, I resurrected and modified the getTokenFlavor() function that was removed in D75176.
However, it does not seem to be working as I expect -- Lexer::getRawToken() is returning TokenKind::raw_identifier inside the string literal rather than TokenKind::string_literal.
Am I misunderstanding how this API is supposed to work? I thought "raw" meant "no preprocessor", and things like string literals would still be recognized in raw mode.
clang-tools-extra/clangd/XRefs.cpp | ||
---|---|---|
366 | You could instead use AST.getTokens().spelledTokenAt(Loc) to get the preprocessed token, and pass that token into getTokenFlavor rather than a SourceLocation. |
clang-tools-extra/clangd/XRefs.cpp | ||
---|---|---|
360 | I don't think we need this function, which just maps one enum onto another - just use the token kind directly? | |
366 | for a bit more context: running Lexer::getRawToken runs a raw lex that only sees the piece of text you point at. If you're inside a huge comment, it won't know that. Using AST.getTokens() uses the results of an earlier raw lex of the whole file. |
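
To make the difference concrete, here is a minimal toy model. It assumes a hypothetical lexer that only knows identifiers, string literals without escapes, and single-character punctuation. None of these names (`ToyKind`, `lexOneAt`, `lexAll`, `tokenContaining`) are clang APIs; they only mimic the "lex from this point" versus "look up in a whole-file lex" distinction.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model, not the clang API.
enum class ToyKind { Identifier, StringLiteral, Other };

struct ToyToken {
  ToyKind Kind;
  std::size_t Begin; // offset of the first character
  std::size_t End;   // one past the last character
};

// Lexes one token starting at Offset. Like a raw lex at a single location,
// it knows nothing about the text that precedes Offset.
ToyToken lexOneAt(const std::string &Text, std::size_t Offset) {
  assert(Offset < Text.size() && "offset must point at a character");
  std::size_t I = Offset;
  unsigned char C = static_cast<unsigned char>(Text[I]);
  if (C == '"') {
    ++I;
    while (I < Text.size() && Text[I] != '"') // escapes are not modelled
      ++I;
    if (I < Text.size())
      ++I; // consume the closing quote
    return {ToyKind::StringLiteral, Offset, I};
  }
  if (std::isalpha(C) || C == '_') {
    while (I < Text.size() &&
           (std::isalnum(static_cast<unsigned char>(Text[I])) || Text[I] == '_'))
      ++I;
    return {ToyKind::Identifier, Offset, I};
  }
  return {ToyKind::Other, Offset, Offset + 1};
}

// Lexes the whole buffer from the start, skipping whitespace.
std::vector<ToyToken> lexAll(const std::string &Text) {
  std::vector<ToyToken> Tokens;
  std::size_t I = 0;
  while (I < Text.size()) {
    if (std::isspace(static_cast<unsigned char>(Text[I]))) {
      ++I;
      continue;
    }
    ToyToken T = lexOneAt(Text, I);
    Tokens.push_back(T);
    I = T.End;
  }
  return Tokens;
}

// Looks up the token covering Offset in the result of a whole-buffer lex.
const ToyToken *tokenContaining(const std::vector<ToyToken> &Tokens,
                                std::size_t Offset) {
  for (const ToyToken &T : Tokens)
    if (T.Begin <= Offset && Offset < T.End)
      return &T;
  return nullptr;
}
```

For the text `f("hello world");`, lexing at the `h` yields an identifier covering `hello`, while looking up the same offset in the whole-buffer result yields the enclosing string literal.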
clang-tools-extra/clangd/XRefs.cpp | ||
---|---|---|
366 | Hmm, I've tried this and spelledTokenAt() seems to return null for comment tokens. |
clang-tools-extra/clangd/XRefs.cpp | ||
---|---|---|
366 | It looks like there are two reasons for this:
|
Use TokenBuffer instead of a raw lexer.
Note that getting this to work required enabling comment-retention mode for
the lexer which produces the TokenBuffer. I'm not sure if this is desirable
in general.
clang-tools-extra/clangd/XRefs.cpp | ||
---|---|---|
367 | this means you're not going to resolve foo in a.^foo (you're touching two tokens). The cleanest thing seems to be to use the word you've identified: iterate over the spelledTokensTouching(WordStart) and accept the one where tok.range(SM).touches(WordOffset + Word.size()) | |
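
A rough sketch of that selection rule, written against toy types rather than the real TokenBuffer. `ToyRange`, `ToySpelledToken` and `tokenForWord` are hypothetical names, and `touches` is assumed to mean "begin <= offset <= end".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for a token's file range and a spelled token.
struct ToyRange {
  std::size_t Begin; // offset of the first character
  std::size_t End;   // one past the last character
  // Assumed semantics: the offset is inside the range or at either edge.
  bool touches(std::size_t Offset) const {
    return Begin <= Offset && Offset <= End;
  }
};

struct ToySpelledToken {
  std::string Text;
  ToyRange Range;
};

// Among the tokens touching WordStart, accept the one that also touches the
// end of the word. In "a.foo" with the word "foo" starting at offset 2, both
// "." and "foo" touch offset 2, but only "foo" touches offset 5.
const ToySpelledToken *tokenForWord(const std::vector<ToySpelledToken> &Tokens,
                                    std::size_t WordStart,
                                    std::size_t WordSize) {
  assert(WordSize > 0 && "an empty word cannot pick a token");
  for (const ToySpelledToken &T : Tokens)
    if (T.Range.touches(WordStart) && T.Range.touches(WordStart + WordSize))
      return &T;
  return nullptr;
}
```

The returned pointer refers into the caller's vector, so it is only valid while that vector is alive and unmodified.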
clang/lib/Tooling/Syntax/Tokens.cpp | ||
---|---|---|
343 | Yikes, I didn't remember TokenBuffer doesn't currently record comment tokens. Given that, the shorter route for this patch would be to blacklist string literals rather than whitelisting comments + identifiers. |
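
As a sketch of why the two policies differ when comments are not recorded: the toy enum below uses a hypothetical `NotRecorded` value for "the token buffer has nothing at this location", which is what a position inside a comment would look like under that assumption. The names are illustrative and are not the ones used in the patch.

```cpp
// Hypothetical token categories; NotRecorded models a lookup that finds no
// token, as would happen inside a comment if comments are not stored.
enum class ToyKind { Identifier, Comment, StringLiteral, NotRecorded };

// Allow-list policy: only comments and identifiers may use the fallback.
// A comment that shows up as NotRecorded is rejected by this check.
constexpr bool allowedByAllowList(ToyKind K) {
  return K == ToyKind::Identifier || K == ToyKind::Comment;
}

// Deny-list policy: everything except string literals may use the fallback,
// so a NotRecorded location (for example a comment) is still accepted.
constexpr bool allowedByDenyList(ToyKind K) {
  return K != ToyKind::StringLiteral;
}
```

Under these assumptions the deny-list keeps working for words in comments without changing what the token buffer stores.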
clang-tools-extra/clangd/XRefs.cpp | ||
---|---|---|
391 | Whoops, meant to use WordStart rather than Loc here. |