[MCParser] Correctly handle Windows line-endings when consuming lexed line comments
Needs Review · Public

Authored by StephenTozer on Oct 27 2020, 8:09 AM.



Fixes issue:

The AsmLexer has a function, LexLineComment, that, as part of lexing, passes the contents of the comment to a CommentConsumer if one exists. The passed comment is meant to exclude newline characters, but the function achieves this by taking the range from the start of the comment (inclusive) to the last newline character (exclusive). This works with Unix line endings, which are a single character, but fails with Windows line endings (\r\n): the carriage return is included in the passed comment. In llvm-mca, this causes directives that have no label to be read as directives with the label \r, and it may cause inconsistent behaviour for any consumer when switching between line-ending styles.
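
To illustrate the intended behaviour, here is a minimal standalone sketch. It is not the AsmLexer code or the change in this diff: `commentText` is a hypothetical helper, and it uses `std::string_view` (C++17) rather than LLVM's own types. It only shows the idea of excluding a `\r` that immediately precedes the `\n` from the range handed to a consumer.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of LLVM): returns the text of a line comment
// that begins at `start` in `buffer`, excluding the line ending.
std::string_view commentText(std::string_view buffer, std::size_t start) {
  assert(start <= buffer.size() && "comment start must lie inside the buffer");

  // The comment runs to the next newline, or to the end of the buffer.
  std::size_t end = buffer.find('\n', start);
  if (end == std::string_view::npos)
    end = buffer.size();

  // With a Windows line ending the character before '\n' is '\r';
  // exclude it as well so both line-ending styles yield the same text.
  if (end > start && buffer[end - 1] == '\r')
    --end;

  return buffer.substr(start, end - start);
}
```

Under these assumptions, `commentText("# LLVM-MCA-BEGIN\r\n", 0)` and `commentText("# LLVM-MCA-BEGIN\n", 0)` return the same text, so a consumer would not see a stray `\r` where it expects an optional label.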

Event Timeline

StephenTozer created this revision. Oct 27 2020, 8:09 AM
Herald added a project: Restricted Project. Oct 27 2020, 8:09 AM
StephenTozer requested review of this revision. Oct 27 2020, 8:09 AM

I've added a test for the symptom that revealed this bug in llvm-mca. I'm also writing a unit test for AsmLexer that tests the underlying behaviour by verifying that CommentConsumers are not sent characters that are not part of the line comment. The problem is not specific to llvm-mca, although as far as I can tell it is the only place where an error has been seen so far.