This new method repeatedly calls Lex() until end of file is reached and optionally fills a std::vector of Tokens. Use it in Clang's unit tests to avoid quite some code duplication.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/Lex/Preprocessor.cpp | ||
---|---|---|
1001 | Shouldn't we take the results as a LexAll(std::vector<Token> &result)? Perhaps we should rename this interface to LexTokensUntilEOF? |
clang/lib/Lex/Preprocessor.cpp | ||
---|---|---|
1001 | I think both suggestions make sense, especially given this doesn't return the EOF token (so it doesn't really lex *all*). Though, given how many callers do not want to store the tokens, perhaps it should take a std::optional or a pointer to the vector so we can skip the storage work if it's unnecessary? | |
1003 | I'd prefer not to assume the token stream has an EOF token (perhaps the stream is one only being used to parse until the eod token instead), so if we can turn this into a non-infinite loop, that would make me more comfortable. |
clang/lib/Lex/Preprocessor.cpp | ||
---|---|---|
1001 | The new LexTokensUntilEOF now takes an optional std::vector<Token> *Tokens = nullptr. | |
1003 | I'm not sure I understand entirely. Do you want something like tok.isOneOf(tok::unknown, tok::eof, tok::eod) instead of tok.is(tok::eof)? Can this happen at the level of the Preprocessor? |
clang/lib/Lex/Preprocessor.cpp | ||
---|---|---|
1003 | I was thinking something more along the lines of: if (Tokens) { for (Token Tok; !Tok.isOneOf(tok::eof, tok::eod); Lex(Tok)) Tokens->push_back(Tok); } but I hadn't thought about tok::unknown; that might be a good one to also include given that clangd operates on partial sources. |
clang/lib/Lex/Preprocessor.cpp | ||
---|---|---|
1003 | I was wondering if we could somehow merge this routine with Parser::SkipUntil since they seem to be doing a very similar tasks. |
clang/lib/Lex/Preprocessor.cpp | ||
---|---|---|
1003 | That could perhaps end up looking reasonable (they do similar tasks aside from collecting the tokens that are being skipped). Do you need the interface to be on Preprocessor or Parser though (or does it not really matter for you)? |
Handle all of tok::unknown, tok::eof, tok::eod.
clang/lib/Lex/Preprocessor.cpp | ||
---|---|---|
1003 |
I implemented this check, let me know if this looks reasonable. The code you posted doesn't do what we need because we also want to lex if Tokens is nullptr, so the hierarchy must be an if inside the loop.
I'm not sure this makes sense, given that Parser::SkipUntil requires some knowledge about the structural input. At the very least, I'd prefer not to go into that direction for this change. |
This looks reasonable to me, but I'd wait for @aaron.ballman greenlight as he seems to have had more comments/ideas.
clang/lib/Lex/Preprocessor.cpp | ||
---|---|---|
1003 | I am not sure I understand the reasoning but I somewhat see that having Parser's SkipUntil be implemented with our new Preprocessor::LexUntil... would require a lot more work. How about adding a fixme note capturing this as a future possible refactoring? |
LGTM aside from a minor naming nit. I think finding a way to unify with SkipUntil would be good follow-up work, though.
clang/lib/Lex/Preprocessor.cpp | ||
---|---|---|
1003 | Minor naming convention nit. |
Address minor naming convention nit.
clang/lib/Lex/Preprocessor.cpp | ||
---|---|---|
1003 | @v.g.vassilev Parser::SkipUntil has a long switch statement looking at the token and taking special actions depending on their kind and the context it appears in. I don't see how to generalize this in the way LexTokensUntilEOF works. |
Shouldn't we take the results as a LexAll(std::vector<Token> &result)?
Perhaps we should rename this interface to LexTokensUntilEOF?