Fix LexFloatLiteral Lexing to enforce the correct format before returning AsmToken::Real. It now reports an error on a wider range of invalid inputs. See test update for details.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm/lib/MC/MCParser/AsmLexer.cpp | ||
---|---|---|
67–68 | Spelling | |
77–78 | The comment here seems to indicate the current behavior of LexFloatLiteral is intentional, and the caller (e.g. AsmParser::parseRealValue or AsmParser::parsePrimaryExpr) should handle ill-formed floats with a more specific error message. Do you think that design is wrong? If you do think that design is wrong, please update the comments to describe what you think should happen instead. |
I have fixed a spelling error and updated a comment to better reflect the change made.
llvm/lib/MC/MCParser/AsmLexer.cpp | ||
---|---|---|
63–64 | Probably should also fix this comment. | |
153 | 'e' and 'E' are identifier characters, so some of the checks here are redundant. | |
154 | *CurPtr == 'e' && *CurPtr == 'E' is impossible. We clearly need more test coverage given this issue wasn't caught by tests. | |
337 | It's hard to follow this logic; when it's tangled together like this; does this accept 1.+1? Need more test coverage to catch cases like this. | |
339 | If we conclude the suffix doesn't qualify as a float, we apparently treat it the suffix as an identifier; is that right? Are the resulting diagnostics really going to be understandable? (I guess "unexpected token in '.double' directive" is okay, although not great.) Should we worry about binutils compatibility at all? It apparently treats 1.e as equivalent to 1.e0. |
llvm/lib/MC/MCParser/AsmLexer.cpp | ||
---|---|---|
339 | I think the diagnostics should be okay. For binutils compat, does it treat no exponent as "0" always, or only in the case of <digits>.e. |
For binutils compat, does it treat no exponent as "0" always, or only in the case of <digits>.e.
There seem to be tests in place that expect the program to die in response to these cases instead of handling them.
llvm/include/llvm/MC/MCParser/AsmLexer.h | ||
---|---|---|
65 | Spelling ("separated"). | |
llvm/lib/MC/MCParser/AsmLexer.cpp | ||
73 | Is this early return necessary, or just to try to improve the error messages? | |
337 | Instead of adding a boolean parameter to LexFloatLiteral, can we make the "++CurPtr" conditional? It's easier to follow the logic if CurPtr is always before the "E" when LexFloatLiteral is called. |
llvm/lib/MC/MCParser/AsmLexer.cpp | ||
---|---|---|
73 | Improve the error message, it feels ambiguous to allow the error message to be from the parser not expecting 2 floats in a row. |
So I guess overall, there are three fixes here:
- Make AsmLexer::LexDigit handle floats without a decimal point more consistently.
- Make AsmLexer::LexFloatLiteral print an error for floats which are apparently missing an "e".
- Make APFloat::convertFromString use binutils-compatible exponent parsing.
Is that right?
llvm/lib/MC/MCParser/AsmLexer.cpp | ||
---|---|---|
83 | Maybe update this comment? | |
152 | This change doesn't do anything? | |
llvm/test/MC/AsmParser/floating-literals.s | ||
60 | We should probably have testcases for 1E1, 1e1e1, and 1e-1, since those don't work correctly without this patch. | |
llvm/unittests/ADT/APFloatTest.cpp | ||
1196 | We should probably keep these testcases, just change them to check for the new behavior (using ASSERT_EQ). |
llvm/lib/MC/MCParser/AsmLexer.cpp | ||
---|---|---|
152 | This change arised from a merge conflict in my local repo, the new diff is more in keeping with the order from before the patch originally which is why I have kept it |
llvm/test/MC/AsmParser/floating-literals.s | ||
---|---|---|
60 | In this context, 1E1 is different from 1e1... Probably best to check all of these with lowercase and uppercase "E". |
llvm/test/MC/AsmParser/floating-literals.s | ||
---|---|---|
60 | Err, just realized my comment "1E1 is different from 1e1" might be unclear. They *should* be treated the same way, but LLVM without this patch treats them differently, so we should have test coverage. |
Just to be clear, should I add a capital 'E' version of all the new tests I have added with this patch, or just the one mentioned?
It would be better to add a capital E version of all the relevant tests in the file, I think.
Not a problem at all. Feel free to commit this on my behalf as I do not have permissions. Thanks!
Spelling ("separated").