- Previously, https://reviews.llvm.org/D99889 changed the framework in the AsmLexer to treat special tokens, if they occur at the start of the string, as Identifiers.
- These are used by the MASM Parser implementation in LLVM, and we can extend some of the changes made in the previous patch to SystemZ.
- In SystemZ, the special "tokens" referred to here are "_", "$", "@", "#". [_|$|@|#] are already supported as "part" of an Identifier.
- The changes in this patch ensure that these special tokens, when they occur at the start of the Identifier, are treated as Identifiers.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Just making sure I understand, this is so an identifier can start with # without being a comment?
No.
Currently, if the AllowAdditionalComments is set to true, any string starting with # will be lexed as a comment. This is to ensure that if it isn't lexed as a comment, we lex it as an identifier and not as an AsmToken::Hash. Like the comment in MCAsmInfo.h for the added attribute says:
/// If the CommentString is also set to "#", setting this option will have /// no effect, and the string will be lexed as a comment.
llvm/include/llvm/MC/MCAsmInfo.h | ||
---|---|---|
207–208 | Is there a place where we could add a test for this behavior, if it's how we want this to behave? Is there some natural environment/configuration where we expect the CommentString to be set to "#" while AllowHashAtStartOfIdentifier is also true? |
llvm/include/llvm/MC/MCAsmInfo.h | ||
---|---|---|
207–208 | Sure, I'll update the unit test to test the interaction between the permutations of the two attributes. By default, CommentString is set to #. But there's another point here. Whenever a comment string starts with # it will be attempted to be lexed as a comment irrespective of what CommentString is set to because the Lexer currently lexes strings that start with # as a possible comment. This is now guarded by the AllowAdditionalComments attribute in MCAsmInfo.h. So, if AllowHashAtStartOfIdentifier is set to true, then it will be only lexed as an Identifier, if it is not already lexed as a possible comment. If, under the current Lexer's rules, it is possible to be lexed as a comment, it will be done so (and not as an Identifier) |
- Added additional tests to test behaviour between setting AllowHashAtStartOfIdentifier and when AsmLexer "decides" to lex a string as a possible comment.
- Updated comments for a few attributes in MCAsmInfo.h to make it more clear.
- Rebase on lastest master
llvm/include/llvm/MC/MCAsmInfo.h | ||
---|---|---|
207–208 | I have added a few more tests. |
Is there a place where we could add a test for this behavior, if it's how we want this to behave? Is there some natural environment/configuration where we expect the CommentString to be set to "#" while AllowHashAtStartOfIdentifier is also true?