This is an archive of the discontinued LLVM Phabricator instance.

[ms] [llvm-ml] Accept whitespace around the dot operator
ClosedPublic

Authored by epastor on Sep 28 2020, 2:20 PM.

Details

Summary

MASM allows arbitrary whitespace around the Intel dot operator, especially when it is used for struct field lookup.

Event Timeline

epastor created this revision. Sep 28 2020, 2:20 PM
Herald added a project: Restricted Project. Sep 28 2020, 2:20 PM
epastor requested review of this revision. Sep 28 2020, 2:20 PM
rnk added a comment. Sep 28 2020, 5:14 PM

This code is already so convoluted that I don't feel like I can provide very good code review for it. More tech debt can't hurt here; this codepath is already overleveraged with tech debt and needs to be rewritten. But if you want to push on through and add the functionality, it has good tests, so go for it.

llvm/lib/MC/MCParser/MasmParser.cpp
1383

Please remove the stray whitespace change.

llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1706

My suggestion to try to simplify this code would be to use .slice to build two StringRefs, then check if they are non-empty, and unlex them if needed:

StringRef LHS = Identifier.slice(0, dotOffset);
StringRef RHS = Identifier.slice(dotOffset + 1, Identifier.size());
if (!LHS.empty())
  .. UnLex LHS
.. UnLex dot
if (!RHS.empty())
  .. UnLex RHS

That keeps all the index math closer together, less spread out.
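
A minimal standalone sketch of the slicing step suggested here, using std::string_view in place of llvm::StringRef so that it compiles without LLVM. The helper name splitAtDot is hypothetical and is not taken from the patch; dotOffset is assumed to be the index of a '.' inside the identifier.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>

// Hypothetical helper (not part of LLVM): returns the text to the left and to
// the right of the dot located at dotOffset. Either piece may be empty, e.g.
// for ".field" or "base.".
// Assumption: identifier[dotOffset] == '.', so dotOffset + 1 <= size().
std::pair<std::string_view, std::string_view>
splitAtDot(std::string_view identifier, std::size_t dotOffset) {
  std::string_view lhs = identifier.substr(0, dotOffset);
  std::string_view rhs = identifier.substr(dotOffset + 1);
  return {lhs, rhs};
}
```

With both pieces computed in one place, a caller only has to test each for emptiness before deciding whether to push it back to the lexer.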

1707

Truly, breaking up identifiers into small chunks is the job of the lexer, but here we are anyway, breaking things up and "unlexing" them.

epastor updated this revision to Diff 295070. Sep 29 2020, 11:48 AM

Clean up identifier slicing (which seems sadly necessary)

epastor marked 3 inline comments as done. Sep 29 2020, 11:49 AM
epastor added inline comments.
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1706

Done, thanks! (It's RHS first, since UnLex treats the lexer as a stack.)
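
The ordering point can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyLexer and unLexPieces are hypothetical names, not LLVM's lexer or the code in the patch. The only property modeled is the one described in the reply, namely that the most recently unlexed token is the next one returned.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a lexer whose unlex buffer behaves like a stack.
struct ToyLexer {
  std::vector<std::string> pending; // back() is the next token to be lexed

  void unLex(std::string_view tok) { pending.emplace_back(tok); }

  // Returns the most recently unlexed token, or "" if nothing is pending.
  std::string lex() {
    if (pending.empty())
      return {};
    std::string tok = pending.back();
    pending.pop_back();
    return tok;
  }
};

// Pushes the pieces back in reverse order (RHS, dot, LHS) so that they are
// later lexed in source order: LHS, dot, RHS. Empty pieces are skipped.
void unLexPieces(ToyLexer &lexer, std::string_view lhs, std::string_view rhs) {
  if (!rhs.empty())
    lexer.unLex(rhs);
  lexer.unLex(".");
  if (!lhs.empty())
    lexer.unLex(lhs);
}
```

Pushing LHS first would make the toy lexer return the right-hand piece before the dot, which is why the order is reversed relative to the original suggestion.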

1707

Yep. *sigh*

rnk accepted this revision. Sep 29 2020, 12:03 PM

lgtm, thanks!

This revision is now accepted and ready to land. Sep 29 2020, 12:03 PM
epastor updated this revision to Diff 295100. Sep 29 2020, 1:15 PM
epastor marked 2 inline comments as done.

Rebase on parent

This revision was automatically updated to reflect the committed changes.