This is an archive of the discontinued LLVM Phabricator instance.

MC: AsmLexer: handle multi-character CommentStrings correctly
ClosedPublic

Authored by jannau on Jul 19 2014, 2:19 PM.

Details

Summary

As X86MCAsmInfoDarwin uses '##' as CommentString although a single '#' starts a comment a workaround for this special case is added.

Fixes divisions in constant expressions for the AArch64 assembler and other targets which use '//' as CommentString.

Diff Detail

Event Timeline

jannau updated this revision to Diff 11685.Jul 19 2014, 2:19 PM
jannau retitled this revision from to MC: handle '//' correctly in AsmLexer::isAtStartOfComment().
jannau updated this object.
jannau edited the test plan for this revision. (Show Details)
jannau added a subscriber: Unknown Object (MLST).
compnerd added inline comments.Jul 19 2014, 2:29 PM
lib/MC/MCParser/AsmLexer.cpp
465

Im not sure I like this approach. Having the special comment string being taken care of by looking at both the user supplied input and the internal pointer makes me uneasy. If we want to use the internal cursor to check the current position, then drop the argument and make check purely internal.

jannau added inline comments.Jul 19 2014, 3:19 PM
lib/MC/MCParser/AsmLexer.cpp
465

yes, I missed that it's a public function. It shouldn't use internal state.

Changing it to or adding a

bool AsmLexer::isAtStartOfComment(const char *Ptr)

seems the best option although we still have to handle the X86MCAsmInfo.cpp special case.

jannau updated this revision to Diff 11692.Jul 20 2014, 1:35 AM
jannau retitled this revision from MC: handle '//' correctly in AsmLexer::isAtStartOfComment() to MC: AsmLexer: handle multi-character CommentStrings correctly.
jannau updated this object.

adds isAtStartOfComment(const char *Ptr) and uses it to handle multi-character CommentStrings.

jannau updated this object.Jul 20 2014, 1:39 AM
compnerd edited edge metadata.Jul 25 2014, 7:59 PM

The tests feel a bit insufficient for such a change. You have a special case for Darwin yet no test for it? Is there a set of existing test cases that cover the various other cases of string comment indicators?

lib/MC/MCParser/AsmLexer.cpp
460–461

Isn't this not entirely true? If you have '//' as a comment character, and you are at '/', its unclear if this is a comment starter or not. I think that we should just drop this interface and take a const char * instead.

476

I think that special casing the 1 character into the fast path makes sense here:

if (CommentString[1] == '\0')
  return *Ptr == CommentString[0];
477

Why not fix this properly rather than work around it?

jannau updated this revision to Diff 11946.Jul 28 2014, 9:26 AM
jannau updated this object.
jannau edited edge metadata.

I've removed the old isAtStartOfComment(const char) and optimized the replacement for the single char comment string case.

The workaround for "##" in X86MCAsmInfoDarwin is still there since I'm unsure how I should solve that properly. The least bad solution I found is using different CommentStrings for output and parsing. I don't feel comfortable enough to decide how to solve this.

The MCParser portion of the change looks much better now. I am still a bit concerned about the testing. Can you at least add a test case for the special case of Darwin?

lib/MC/MCParser/AsmLexer.cpp
465

You have an extra space here.

468

This comment feels like it may get out of sync. There is nothing preventing a backend from claiming that @# is a comment identifier (other than sanity and the fear of a serious beating from a mob of angry developers).

jannau updated this revision to Diff 12175.Aug 4 2014, 12:57 PM

added explicit tests with single '#' comments for x86 darwin. A couple of existing tests will catch incorrect parsing of single '#' comments too though.

compnerd accepted this revision.Aug 4 2014, 7:21 PM
compnerd edited edge metadata.
compnerd added inline comments.
lib/MC/MCParser/AsmLexer.cpp
465

There is an extra space here.

This revision is now accepted and ready to land.Aug 4 2014, 7:21 PM
jannau updated this revision to Diff 12195.Aug 5 2014, 8:21 AM
jannau edited edge metadata.

whitespace fixes

jannau closed this revision.Aug 14 2014, 1:50 AM

Committed in r215615