Page MenuHomePhabricator

[M68k] Implement AsmParser
ClosedPublic

Authored by ricky26 on Mar 12 2021, 10:56 AM.

Details

Summary

This is a work-in-progress implementation of an assembler for M68k.

Outstanding work:

  • Updating existing tests assembly syntax
  • Writing new tests for the assembler (and disassembler)

I've left those until there's consensus that this approach is okay (I hope that's okay!).

Questions I'm aware of:

  • Should this use Motorola or gas syntax? (At the moment it uses Motorola syntax.)

Depends on D98519

Depends on D98532

Depends on D98534

Depends on D98535

Depends on D98536

Diff Detail

Event Timeline

ricky26 created this revision.Mar 12 2021, 10:56 AM
ricky26 edited the summary of this revision. (Show Details)Mar 12 2021, 11:01 AM
ricky26 updated this revision to Diff 330395.Mar 12 2021, 4:05 PM

Move operand parser definitions into earlier patch

myhsu added a comment.EditedMar 14 2021, 11:58 PM

Should this use Motorola or gas syntax? (At the moment it uses Motorola syntax.)

I spent some time dig into GCC and binutils's source. By default GCC generate assembly that conforms to Motorola's syntax (thus GAS should consume that syntax by default). Jessica has pointed out in the review for the 4-th patch that for some reason, (GNU) objdump doesn't conform to Motorola's syntax. Though I didn't know the full story but apparently objdump and GCC are using separate code base to print out assembly and for unknown reasons, the former just didn't want to adopt Motorola's syntax :-P (it doesn't have the option to switch between different syntaxes either)
The bottom line is that I would recommend to implement Motorola's syntax first. We can always add "GNU flavor" assembly printer if needed in the future, just like ATT/Intel syntax in X86 land.

P.S. Alternatively GCC can generate they called "MIT syntax" which -- I have no idea what that is tbh --is probably closer to conventional GNU assembly syntax. But that syntax is only enabled if you're building cross-compile m68k gcc for openbsd

myhsu added a subscriber: myhsu.Mar 16 2021, 2:29 PM
ricky26 published this revision for review.EditedMar 17 2021, 4:47 PM

I've marked this 'Ready for Review' since otherwise notifications don't happen. To be clear though, this is not ready to merge.

Should this use Motorola or gas syntax? (At the moment it uses Motorola syntax.)

I spent some time dig into GCC and binutils's source. By default GCC generate assembly that conforms to Motorola's syntax (thus GAS should consume that syntax by default). Jessica has pointed out in the review for the 4-th patch that for some reason, (GNU) objdump doesn't conform to Motorola's syntax. Though I didn't know the full story but apparently objdump and GCC are using separate code base to print out assembly and for unknown reasons, the former just didn't want to adopt Motorola's syntax :-P (it doesn't have the option to switch between different syntaxes either)
The bottom line is that I would recommend to implement Motorola's syntax first. We can always add "GNU flavor" assembly printer if needed in the future, just like ATT/Intel syntax in X86 land.

P.S. Alternatively GCC can generate they called "MIT syntax" which -- I have no idea what that is tbh --is probably closer to conventional GNU assembly syntax. But that syntax is only enabled if you're building cross-compile m68k gcc for openbsd

The MIT syntax appears to be the default documented syntax in the as manual (http://web.mit.edu/gnu/doc/html/as_16.html#SEC189, a mirror but it seems up to date). Their logic seems sensible to me: we should support both addressing syntaxes and it should be pretty sane to support both.

However, as' manual on Motorola Syntax still does something which doesn't match Motorola's manual: prefixing the registers with %. Does GCC do this? It looks like as supports bare registers, but it's behind a flag. (Motorola's assembly syntax uses % for binary numbers but we can support both with a bit of peeking.)

So, perhaps the strategy should be this:

  • Change the assembler/disassembler to use %-prefixed registers in this patch
  • Add support for parsing MIT operand syntax in a later patch
  • Add support for emitting MIT operand syntax on a flag in a later patch
  • Add support for the alternate assembly directive names in a later patch
  • Add support for parsing register names without a prefix on a flag in a later patch

The last point worries me a little since it seems a very commonly referenced format but we'd be doing what as does, if I understand correctly.

Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2021, 4:47 PM

Hi!

I came across the series of patches that are adding Motorola asm support. I have a quick question as I'm not too familiar with Motorola asm.

Are there any differences in how the input/output/clobber operands in a possible inline asm statement that is parsed, or do they follow the standard GNU asm format with just platform specific syntactical differences in the assembler text?

Are there any differences in how the input/output/clobber operands in a possible inline asm statement that is parsed, or do they follow the standard GNU asm format with just platform specific syntactical differences in the assembler text?

The only difference should be in the actual assembly syntax itself, however, you raise an interesting point because in Motorola-flavour assembly %0/%1 would be parsed as a single-bit constant... Which I hadn't thought of.

I'll be honest: at this stage I've not tested inline assembly at all.

The only difference should be in the actual assembly syntax itself, however, you raise an interesting point because in Motorola-flavour assembly %0/%1 would be parsed as a single-bit constant... Which I hadn't thought of.

I'll be honest: at this stage I've not tested inline assembly at all.

Thank you for clarifying!

ricky26 updated this revision to Diff 334938.Apr 2 2021, 5:04 AM

Reworked parsing and added the first test

ricky26 updated this revision to Diff 334940.Apr 2 2021, 5:23 AM

Rebase onto latest main

ricky26 retitled this revision from WIP: [M68k] Implement AsmParser to [M68k] Implement AsmParser.Apr 2 2021, 5:27 AM

I've added a test (it uses many of the instructions which are defined with both Motorola & MIT operand syntax) and tidied everything up.

I think this is now ready for a real review.

ricky26 updated this revision to Diff 334958.Apr 2 2021, 7:55 AM

Remove header to appease clang-tidy

glaubitz added inline comments.Apr 4 2021, 1:49 PM
llvm/lib/Target/M68k/MCTargetDesc/M68kMCAsmInfo.cpp
34 ↗(On Diff #334958)

This change is already part of D98519. Could you remove it from here or from D98519?

ricky26 updated this revision to Diff 335186.Apr 4 2021, 4:06 PM

Rebase onto changed AsmLexer patch

ricky26 marked an inline comment as done.Apr 4 2021, 4:07 PM
ricky26 added inline comments.
llvm/lib/Target/M68k/MCTargetDesc/M68kMCAsmInfo.cpp
34 ↗(On Diff #334958)

Yeah, I just needed to rebase. :)

myhsu added a comment.Apr 5 2021, 2:28 PM

Thanks for this amazing efforts :-) I only have minor formatting comments inlined in the code.
I do have a more high-level question: It seems that you only support Motorola syntax right now, is that true? I'm totally fine if you don't add the MIT syntax in this patch, just want to make sure.
Now a bigger issue is the testing: Presumably we should rewrite all of our existing tests under test/CodeGen/M68k/Encoding into assembly and put it under test/MC/M68k. I believe this can give us a better testing coverage on all of our MC components including AsmParser.
On one hand I don't want to put those migrating changes into this patch because it will easily bloat the size. Not to mention those tests contains binary encoding tests, so if we need migrate both kinds of tests at once. On the other hand, I still need to make sure this patch reaches certain test coverage. Fortunately, the test you provided seemed to cover cases in each instruction classes. Can you add a comment in that test file to note our tests migration plan?

llvm/lib/Target/M68k/AsmParser/M68kAsmParser.cpp
52

Should this be exposed as public method?

56

ditto for the above three. Ideally we should minimize the number of public methods

llvm/test/MC/M68k/instructions.s
2

-check-prefixes=CHECK is redundant

ricky26 marked an inline comment as done.Apr 5 2021, 2:53 PM

Thanks for this amazing efforts :-) I only have minor formatting comments inlined in the code.
I do have a more high-level question: It seems that you only support Motorola syntax right now, is that true? I'm totally fine if you don't add the MIT syntax in this patch, just want to make sure.
Now a bigger issue is the testing: Presumably we should rewrite all of our existing tests under test/CodeGen/M68k/Encoding into assembly and put it under test/MC/M68k. I believe this can give us a better testing coverage on all of our MC components including AsmParser.
On one hand I don't want to put those migrating changes into this patch because it will easily bloat the size. Not to mention those tests contains binary encoding tests, so if we need migrate both kinds of tests at once. On the other hand, I still need to make sure this patch reaches certain test coverage. Fortunately, the test you provided seemed to cover cases in each instruction classes. Can you add a comment in that test file to note our tests migration plan?

Thanks! I'm having fun and it's been nice to dip my toes into LLVM contribution. :)

For now, this only supports Motorola syntax (actually, MIT-style (offset, %reg) should parse correctly too). It also only supports _some_ operand formats (immediate, address, direct register, indirect and indirect with displacement).

I agree about the tests. I think the existing tests should be converted to Motorola syntax too (as that seems to be the canonical format), however, I'd argue that we should do those after the assembler & disassembler have landed (in my opinion, fleshing out that functionality is more pressing as it helps us write better tests more easily).

The included test covers the broad strokes of the supported syntax so far and should be enough to prevent code rot. As per the older comment it'll still need a fair few (smaller) revisions. In my opinion, it makes the most sense to add more tests as we go through and ensure the AsmParser actually supports everything it should. Perhaps it makes sense to add relevant Bugzilla bugs for the missing functionality as a prerequisite to landing this?

llvm/lib/Target/M68k/AsmParser/M68kAsmParser.cpp
52

Hadn't thought much about this as the class is in an anonymous namespace but yes, makes sense, I'll sort them out properly.

myhsu added a comment.Apr 6 2021, 10:03 AM

I agree about the tests. I think the existing tests should be converted to Motorola syntax too (as that seems to be the canonical format), however, I'd argue that we should do those after the assembler & disassembler have landed (in my opinion, fleshing out that functionality is more pressing as it helps us write better tests more easily).

Perfect, I'll start to rewrite the tests in parallel because we also need to solve several issues (e.g. try to get rid of extract-section command in the tests) alone the way so I can imagine it takes time.

The included test covers the broad strokes of the supported syntax so far and should be enough to prevent code rot. As per the older comment it'll still need a fair few (smaller) revisions. In my opinion, it makes the most sense to add more tests as we go through and ensure the AsmParser actually supports everything it should. Perhaps it makes sense to add relevant Bugzilla bugs for the missing functionality as a prerequisite to landing this?

For sure, please do it.

ricky26 updated this revision to Diff 335607.Apr 6 2021, 11:29 AM
  • Hide private implementation details of M68kAsmParser
  • Remove superfluous FileCheck parameter from test
  • Add migration notice to test
ricky26 updated this revision to Diff 335614.Apr 6 2021, 11:43 AM
ricky26 marked an inline comment as done.

Use assembly comments in the assembly test rather than C++ comments

ricky26 marked 2 inline comments as done.Apr 6 2021, 11:47 AM

@myhsu when you get a moment, can you check that this is all OK now? :)

myhsu accepted this revision.Apr 12 2021, 10:57 PM

@myhsu when you get a moment, can you check that this is all OK now? :)

sorry I missed your last update. LGTM now

This revision is now accepted and ready to land.Apr 12 2021, 10:57 PM

@myhsu when you get a moment, can you check that this is all OK now? :)

sorry I missed your last update. LGTM now

Great to see such a major step forward already! Great work!

This revision was landed with ongoing or failed builds.Apr 13 2021, 1:30 AM
This revision was automatically updated to reflect the committed changes.