make header guard identification stricter with Lexer.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Format/Format.cpp | ||
---|---|---|
1441 ↗ | (On Diff #59539) | I'd modify this to: while (Tok.is(comment)) if (Lex.LexFromRawLexer(Tok)) return; So it skips tokens until it finds a non-comment token. |
1452 ↗ | (On Diff #59539) | Store this value in a bool and call another Lex.LexFromRawLexer afterwards. Both calls do this next. |
1501 ↗ | (On Diff #59539) | Move the Tok.is(hash) check into isDirectiveWithName and replace the assert there. |
- Addressed reviewer's comments. Added a test case for empty code.
lib/Format/Format.cpp | ||
---|---|---|
1500 ↗ | (On Diff #59549) | Added a test case for empty file. If the code is empty, Tok will be tok::eof, which would leave MinInsertionOffset as Code.size(). But the program does fail when we insert at the end of code that does not end with '\n'. We always +1 when we calculate the offset for each line (in the for-loop on Lines); however, the last line does not necessarily end with '\n', which makes offset exceed code size. I will fix this in a follow-up patch. |
lib/Format/Format.cpp | ||
---|---|---|
1466 ↗ | (On Diff #59698) | Don't store this. Just inline Code.size() below. |
1471 ↗ | (On Diff #59698) | I'd probably just store the offset, not the entire token. |
1474 ↗ | (On Diff #59698) | I wonder whether we should also skip comments after the header guard .. If so, we could probably just move the skipComments() call into checkAndConsumeDirectiveWithName. |
1476 ↗ | (On Diff #59698) | Maybe just return here and remove the HeaderGuardFound variable? |
1480 ↗ | (On Diff #59698) | Does getFileOffset not work on the eof token? |
1484 ↗ | (On Diff #59698) | Seems like adding an std::min call would be shorter than this comment. Any reason not to just fix it now? |
- Addressed reviewer's comments.
lib/Format/Format.cpp | ||
---|---|---|
1474 ↗ | (On Diff #59698) | Not sure if we want to skip those too...seems to me that comments following header guard do not usually belong to header guard...they would most often belong to the code after it I guess? |
1480 ↗ | (On Diff #59698) | Yes, it does. |
1484 ↗ | (On Diff #59698) | We actually need more than that. For example, adding '\n' after the last line etc. This is not quite trivial to me since '\n' could be inserted after header insertions with the same offset. I thought it would be better to do the fix in a separate patch. But I'll fix the crash here and handle '\n' insertion in the future. |