- Add the M68k-specific MC layer implementation
- Add ELF support for M68k
- Add M68k-specific CC and reloc
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Part of the restructuring of this patch series. This patch now contains the MC layer and ELF support.
llvm/include/llvm/module.modulemap | ||
---|---|---|
85 | Alphabetize with other files | |
llvm/lib/MC/MCExpr.cpp | ||
228 ↗ | (On Diff #296012) | Put return on same line for consistency |
378 ↗ | (On Diff #296012) | tabs? |
llvm/lib/Target/M680x0/MCTargetDesc/M680x0AsmBackend.cpp | ||
177 ↗ | (On Diff #296012) | Can this be !isInt<16>(Value)? |
187 ↗ | (On Diff #296012) | isInt<8>? |
llvm/lib/Target/M680x0/MCTargetDesc/M680x0BaseInfo.h | ||
251 ↗ | (On Diff #296012) | Can we remove commented out code or put in the patch that needs it? |
llvm/lib/Target/M680x0/MCTargetDesc/M680x0ELFObjectWriter.cpp | ||
84 ↗ | (On Diff #296012) | Commented out code |
llvm/lib/Target/M680x0/MCTargetDesc/M680x0InstPrinter.cpp | ||
35 ↗ | (On Diff #296012) | Is the StringRef conversion needed? |
llvm/lib/Target/M680x0/MCTargetDesc/M680x0MCAsmInfo.cpp | ||
29 ↗ | (On Diff #296012) | Is this number correct for 68K? That looks like X86's NOP encoding. |
llvm/lib/Target/M680x0/MCTargetDesc/M680x0MCCodeEmitter.cpp | ||
178 ↗ | (On Diff #296012) | Can this use isUIntN from MathExtras.h? |
182 ↗ | (On Diff #296012) | isIntN from MathExtras? |
189 ↗ | (On Diff #296012) | Commented out code? |
344 ↗ | (On Diff #296012) | Commented out code |
407 ↗ | (On Diff #296012) | Commented out code |
llvm/lib/Target/M680x0/MCTargetDesc/M680x0MCTargetDesc.cpp | ||
65 ↗ | (On Diff #296012) | I'm not sure the result of a concatenation can be a "SingleStringRef" |
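The range-check suggestions above (isInt<16>, isInt<8>, isUIntN, isIntN) refer to helpers in llvm/Support/MathExtras.h. As a rough illustration of what such checks compute, here is a minimal sketch in plain C++ that does not depend on LLVM. The names fitsSigned and fitsUnsigned are hypothetical stand-ins written for this example, not the LLVM functions themselves.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for llvm::isInt<N>, written out so the example builds without LLVM.
// True when X fits in an N-bit two's-complement signed field.
template <unsigned N> constexpr bool fitsSigned(int64_t X) {
  static_assert(N > 0 && N < 64, "width must be in [1, 63]");
  return X >= -(INT64_C(1) << (N - 1)) && X < (INT64_C(1) << (N - 1));
}

// Stand-in for llvm::isUIntN(N, X): the same idea for an unsigned field whose
// width is only known at run time.
inline bool fitsUnsigned(unsigned N, uint64_t X) {
  assert(N > 0 && N <= 64 && "width must be in [1, 64]");
  return N == 64 || X < (UINT64_C(1) << N);
}
```

With helpers like these, a hand-written comparison against the 16-bit limits collapses to `!fitsSigned<16>(Value)`, which is presumably the readability gain the reviewer was after.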
Addressed feedback
llvm/lib/Target/M680x0/MCTargetDesc/M680x0MCAsmInfo.cpp | ||
---|---|---|
29 ↗ | (On Diff #296012) | good catch, 68K's NOP should be different |
llvm/include/llvm/BinaryFormat/ELFRelocs/m680x0.def | ||
---|---|---|
5 ↗ | (On Diff #297733) | As I said on D88389: They're all R_68K_FOO in system headers, please just use that name otherwise it gets confusing. |
llvm/include/llvm/Object/ELFObjectFile.h | ||
1145 | As I said on D88389: This gets reported in the file format line of llvm-objdump so should match what binutils has, which is elf32-m68k, though even if that weren't the case it should at least be in keeping with the style of all the others here. |
llvm/include/llvm/Object/ELFObjectFile.h | ||
---|---|---|
1145 | Yeah, I agree this should definitely match what GNU is using there. I would still prefer the backend being called "M680x0", and therefore the patches should be prefixed with "[M680x0]", similar to "SystemZ" and "s390x". Naming it "M680x0" instead of "M68K" improves readability, in my personal opinion, as it's easier to tell when you are talking about the backend and when you're talking about the architecture and GNU triplet. |
llvm/include/llvm/Object/ELFObjectFile.h | ||
---|---|---|
1145 | The difference there is "IBM System z9" etc are the actual product names, and it's not a really clumsy name to type like M680x0. NXP's own site categorises its M68K-derived processors as "68K Processors (Legacy)", and the manuals say things like:
GCC's own manpage uses the M680x0 term in the following ways:
So the name M680x0 would actually be *more* narrow than what M68K means in practice, with the latter being the general term for any M68000-derived processor and the former being only for the 68000 through 68060 processors and *not* including the ColdFire extensions, yet there's no reason why our backend can't support that just like GCC does. |
llvm/include/llvm/Object/ELFObjectFile.h | ||
---|---|---|
1145 | Meh, I think this is really a bike-shedding contest. As I said, I like the name "M680x0" because it clearly tells me we're talking about the backend and not the architecture. It makes reading the code easier in my opinion.
Which is titled with "3.19.25 M680x0 Options" ;-) |
llvm/include/llvm/Object/ELFObjectFile.h | ||
---|---|---|
1145 |
Which I think is a bug (perhaps historical, if that was the name chosen before ColdFire was added) given the language used throughout. I've been considering sending a patch to change that title though. |
llvm/include/llvm/Object/ELFObjectFile.h | ||
---|---|---|
1145 | Well, ok. If LLVM upstream insists on the name "M68k" for the backend, I'm not going to fight it. In the end, I want the backend to succeed and I don't want something like a naming dispute to block it. But please call it "M68k" (with a lowercase "k"), not "M68K", which would be incorrect as "kilo" is spelled all lowercase. |
- [NFC] Rename M680x0 to M68k
- Change ELF reloc's prefix from "R_M680x0_" to "R_68K_" to be consistent with GCC
A few nits, but this is looking good.
llvm/include/llvm/MC/MCExpr.h | ||
---|---|---|
202 ↗ | (On Diff #302184) | Does this get exposed in any way? Would it break a previous enum order? |
llvm/lib/Target/M68k/MCTargetDesc/M68kAsmBackend.cpp | ||
217 | Isn't there a better way to emit this? | |
llvm/lib/Target/M68k/MCTargetDesc/M68kBaseInfo.h | ||
2 | Other header comments mention "m68k", this should too. | |
217 | Is this comment really meaningful? Or is it commented by accident? Or is "SP" == "A7"? If so, I wouldn't comment like code because it looks like an accident or bad form. | |
llvm/lib/Target/M68k/MCTargetDesc/M68kELFObjectWriter.cpp | ||
13 | This file is quite light on comments. | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp | ||
12 | This file is also a bit light on comments. Some assumptions may not be obvious to non-m68k developers. |
llvm/include/llvm/MC/MCExpr.h | ||
---|---|---|
202 ↗ | (On Diff #302184) | You're right, I don't think any of the M68k code uses it at this point. Will remove it for now |
llvm/lib/Target/M68k/MCTargetDesc/M68kInstPrinter.cpp | ||
---|---|---|
85 | for (int s = 0; s < 8; s += 8) ? | |
87 | Applies to many braces in this file. | |
182 | The PCREL_OPERAND style print* functions are usually used this way: D77853 | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp | ||
59 | encodeBits Newer backends should stick with the function naming coding standard | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp | ||
70 | clang-format recognized spelling is /*TuneCPU=*/CPU | |
83 | There should be some .cfi_* tests with this change. |
- Addressed some of the feedback
- [NFC] Removed '#<number>' in the comments
- [NFC] Fixed minor formatting issues
llvm/lib/Target/M68k/MCTargetDesc/M68kInstPrinter.cpp | ||
---|---|---|
85 | good catch, thanks | |
182 | Well... actually Motorola uses its own assembly language (which is where this (N,%pc) syntax comes from). We're trying to conform to that in order to coordinate better with other existing toolchains | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp | ||
83 | Correct, it's on the TODO list now |
a few minors
llvm/include/llvm/IR/CallingConv.h | ||
---|---|---|
248 | Why is this 1000 instead of 101? | |
llvm/include/llvm/module.modulemap | ||
73 | M68k.def ? | |
llvm/lib/Target/M68k/MCTargetDesc/M68kFixupKinds.h | ||
25 | This looks dodgy - please can you double check it? | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp | ||
43 | todo what? |
llvm/include/llvm/BinaryFormat/ELF.h | ||
---|---|---|
761 | File should be named the same as the directory in Target, ie with a capital M. | |
llvm/include/llvm/BinaryFormat/ELFRelocs/m68k.def | ||
5 ↗ | (On Diff #312017) | No space before ( |
llvm/include/llvm/Object/ELFObjectFile.h | ||
1145 | This specific string still needs fixing to match binutils's name for the file format as all lowercase. | |
llvm/lib/Target/M68k/MCTargetDesc/M68kBaseInfo.h | ||
196 | The existence of this is surprising; one would expect to be able to pass any valid register number to a function called isAddressRegister and just get back false for special registers. Do you need this assert? If so, please change the function name to reflect that it's only valid for the "general-purpose" (if they're called that on m68k given it has split A and D) registers. | |
llvm/lib/Target/M68k/MCTargetDesc/M68kInstPrinter.cpp | ||
15 | Would be helpful to state (either here at the top or in the places where the code gets it wrong) the ways in which it doesn't conform. Also, is the Motorola ASM syntax what GAS uses (which is the most important thing to implement), or do they differ? | |
182 | The objdump output is not the same as the asm input/codegen output in this specific case. What does binutils's objdump do for m68k? | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCAsmInfo.cpp | ||
23 | What does this mean? | |
30 | Which is what? A specific instruction? A special bit pattern that's never a valid instruction? | |
40 | Clang doesn't do and has never done anything for m68k. This line can go. | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp | ||
43 | I assume "implement the ParseM68kTriple function" so it's not just a stub... but yes. | |
88–92 | If your hardware-provided call instruction saves the return address in memory like x86, then yes. If your hardware-provided call instruction saves the return address in a register like most other architectures, then no, and the prologue code will ensure the return address gets proper DWARF info if spilled, just like any other register. Given m68k is like x86 and jsr/rts/etc save/restore PC to/from the stack, yes, you do need this I believe. |
llvm/lib/Target/M68k/MCTargetDesc/M68kBaseInfo.h | ||
---|---|---|
18 | Old name in the header guard. Repeated several times in this revision and later revisions. |
- Addressed some of the feedback
- I need a little more time to double-check the (Motorola) assembly syntax
llvm/lib/Target/M68k/MCTargetDesc/CMakeLists.txt | ||
---|---|---|
11 | Fix sorting | |
llvm/lib/Target/M68k/MCTargetDesc/M68kInstPrinter.cpp | ||
82 | (style) Move comment into assert message assert((Mask & 0xFFFF) == Mask && "Mask should be 16 bits"); | |
llvm/lib/Target/M68k/MCTargetDesc/M68kInstPrinter.h | ||
45 | why is this unsigned when all the other methods that have 'opNum' use int ? | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp | ||
179 | Merge asserts: assert((Size + Offset <= 64) && isUIntN(Size, Val) && "Value does not fit"); | |
336 | Should we test for null pointer as well here? if (!Beads || !*Beads) { | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp | ||
1–12 | We should probably have the boilerplate comments in the initial version of the file? | |
llvm/lib/Target/M68k/TargetInfo/M68kTargetInfo.cpp | ||
1–12 | We should probably have the boilerplate comments in the initial version of the file? |
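The two assert suggestions above (moving the comment into the assert message, and merging the size and range checks into one assert) can be sketched as follows. This is a standalone illustration, not code from the patch: insertBits and fitsUnsignedBits are hypothetical helpers, and the real emitter would use isUIntN from MathExtras.h rather than a local stand-in.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for llvm::isUIntN so the sketch builds without LLVM.
inline bool fitsUnsignedBits(unsigned Size, uint64_t Val) {
  return Size >= 64 || Val < (UINT64_C(1) << Size);
}

// Hypothetical helper: ORs a Size-bit value into a 64-bit buffer at bit Offset.
inline void insertBits(uint64_t &Buffer, unsigned Offset, unsigned Size,
                       uint64_t Val) {
  // One merged assert; the string literal documents the failure in the
  // assertion output instead of in a separate comment.
  assert((Size + Offset <= 64) && fitsUnsignedBits(Size, Val) &&
         "Value does not fit");
  if (Size == 0)
    return; // Nothing to insert; also avoids shifting by 64 when Offset == 64.
  Buffer |= Val << Offset;
}
```

The `&& "message"` idiom works because a string literal converts to a non-null pointer, so it never changes the truth value of the condition.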
- Addressed feedback
llvm/lib/Target/M68k/MCTargetDesc/M68kInstPrinter.cpp | ||
---|---|---|
182 | I think I can answer this question now: GCC, GNU AS and this M68k backend all use Motorola's own assembly syntax. | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCAsmInfo.cpp | ||
23 | seems to be a stale comment, removing it now | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp | ||
1–12 | sorry, I don't quite get what you said: what are the boilerplate comments here? Is it the license + "This file provides M68k target specific descriptions." on the right? |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp | ||
---|---|---|
1–12 | Yes - the comment block containing the license etc. |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp | ||
---|---|---|
1–12 | right, almost forgot, will do |
llvm/lib/Target/M68k/MCTargetDesc/M68kELFObjectWriter.cpp | ||
---|---|---|
52 | (style) Drop the default case and move llvm_unreachable after the switch statement | |
79 | (style) Drop the default case and move llvm_unreachable after the switch statement | |
90 | (style) Drop the default case and move llvm_unreachable after the switch statement | |
102 | (style) Drop the default case and move llvm_unreachable after the switch statement | |
113 | (style) Drop the default case and move llvm_unreachable after the switch statement | |
llvm/lib/Target/M68k/MCTargetDesc/M68kFixupKinds.h | ||
24 | (style) Drop the default case and move llvm_unreachable after the switch statement | |
43 | (style) Drop the default case and move llvm_unreachable after the switch statement | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp | ||
118 | Can you get uninitialized variables further down? The switch statement isn't exhaustive so static analyzers will complain, add a default llvm_unreachable? | |
213 | drop braces |
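The repeated style note above ("drop the default case and move llvm_unreachable after the switch statement") can be illustrated with a small standalone sketch. The enum and function names here are hypothetical, and unreachable() is a local stand-in for llvm_unreachable so the example builds without LLVM.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Stand-in for llvm_unreachable: report and abort.
[[noreturn]] inline void unreachable(const char *Msg) {
  std::fprintf(stderr, "UNREACHABLE: %s\n", Msg);
  std::abort();
}

// Hypothetical enum; the enumerators are illustrative, not from the patch.
enum class FixupSize { Byte, Word, Long };

inline unsigned getFixupSizeInBits(FixupSize S) {
  // No default label: every enumerator is listed, so a compiler with
  // -Wswitch can warn if a new enumerator is added but not handled here.
  switch (S) {
  case FixupSize::Byte:
    return 8;
  case FixupSize::Word:
    return 16;
  case FixupSize::Long:
    return 32;
  }
  // Only reached if S holds a value outside the enumerators.
  unreachable("unknown fixup size");
}
```

The later comment about a non-exhaustive switch is the opposite situation: when not every enumerator has a case, a default that calls the unreachable helper keeps variables from being left uninitialized on the unhandled paths.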
llvm/lib/Target/M68k/MCTargetDesc/M68kBaseInfo.h | ||
---|---|---|
117 | Commented-out code. | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCAsmInfo.cpp | ||
30 | This doesn't work; it's assumed to be a single byte (MCAsmStreamer passes ValueSize as 1 when using it), but if it did it would presumably need to be dealt with very carefully wrt endianness. | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp | ||
206 | Capitalise | |
282 | ? | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp | ||
80 | Capitalise |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCAsmInfo.cpp | ||
---|---|---|
30 | good catch, and you're right non-single byte fill value will never be used. I'll just remove this line to let MC use the default value. |
So, uh, correct me if I'm wrong but none of these patches add assembler or disassembler support? Should that not be present before a target is accepted? Otherwise there's no way to test any of the MC layer without including CodeGen.
We don't have an AsmParser or disassembler right now. We're using MIR as input to test our integrated assembler, so you're right, we can't test the MC layer without CodeGen. But why would that be a hard blocker?
The real requirement is being able to produce code at the end. If that relies on (free, accessible) third-party tools, then it should be fine for the first import. Some production targets have (free) proprietary tools that are required to continue the code generation (e.g. NVPTX). Other targets have assemblers but can't be tested (no available hardware). Having specific hard requirements would make it really hard for LLVM to have new backends.
What we need for the first batch is to know that the target can be used to generate correct code. It's less relevant if that can be shown with an integrated assembler or other tools at the end.
It would be important, however, to know if the m68k community intends to upstream the assembler before moving to production or if the target requires use of third-party closed source tools to work. This will help guide this and other reviews after the first merge, into production.
Agree.
What we need for the first batch is to know that the target can be used to generate correct code. It's less relevant if that can be shown with an integrated assembler or other tools at the end.
That is also what I'm thinking: it's true that the current work needs to use an external assembler (GNU AS) to deal with cases like inline assembly, but at the same time it can handle common cases with the integrated assembler just fine. We also have test cases covering that part.
It would be important, however, to know if the m68k community intends to upstream the assembler before moving to production or if the target requires use of third-party closed source tools to work. This will help guide this and other reviews after the first merge, into production.
Speaking of that: a few days ago a (draft) PR was sent to our GitHub repo: https://github.com/M680x0/M680x0-mono-repo/pull/20
The author of that PR is working on the AsmParser and disassembler (that PR is just a placeholder to inform us in order to prevent overlapping work). I'm pretty excited and looking forward to seeing it appear upstream before this becomes an official backend.
Ok, MIR is fine and makes sense, if rather clunky compared to assembly. Are those tests cleanly separated from actual CodeGen tests and labeled as such (so once there is an AsmParser and associated tests they can be removed)?
I just learned about that (because I didn't have any GitHub notifications enabled for the repo yet) and I'm super excited to see that PR.
What I was hoping for and also expecting is happening: external people just coming by and sending such improvements because they love the retro architecture and enjoy working on LLVM.
So, I think that should make @jrtc27 happy regarding this question :-).
Ideally, we should get them contributing to the upstream LLVM backend directly, but for that, we need the target in tree.
I think this is a good example of how active the community is and what we should expect to get before moving the target to production.
Yep, and that's why I think it would be great to get the backend merged soonish and then let the community do its work :).
I think this is a good example of how active the community is and what we should expect to get before moving the target to production.
Great to hear that and I fully agree. And all that despite the age of the architecture :D.
llvm/lib/Target/M68k/MCTargetDesc/M68kELFObjectWriter.cpp | ||
---|---|---|
64 | This is odd; processors shouldn't care about fixups. But regardless of the actual problem, more info would be helpful as this isn't particularly useful for anyone who doesn't already know the problem. | |
llvm/lib/Target/M68k/MCTargetDesc/M68kInstPrinter.cpp | ||
87 | Still many instances of this | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp | ||
343–345 |
A few minor remaining issues, but once those are fixed I believe this is fine to land.
llvm/lib/Target/M68k/MCTargetDesc/M68kAsmBackend.cpp | ||
---|---|---|
217 | This is how RISC-V does it (well, with OS.write("\x13\0\0\0", 4); and a similar one for 2-byte compressed instructions), so this is fine? | |
240 | Commented-out code | |
llvm/lib/Target/M68k/MCTargetDesc/M68kELFObjectWriter.cpp | ||
64 | Please clarify this comment or delete it if it's wrong | |
llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp | ||
379–382 | should work as a nicer way to write that (using llvm/Support/EndianStream.h) |
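The NOP-padding and endianness points raised in this thread (a fill value is treated as a single byte, m68k is big-endian, and RISC-V writes its NOP bytes out explicitly) can be sketched without LLVM as follows. This is an assumption-laden illustration rather than the patch's code: writeBE16 and writeNopPadding are hypothetical names, and std::vector stands in for the output stream. The only architecture fact relied on is that the m68k NOP instruction encodes as the 16-bit word 0x4E71.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Appends Value most-significant byte first (big-endian), the byte order
// m68k uses for instruction words.
inline void writeBE16(std::vector<uint8_t> &Out, uint16_t Value) {
  Out.push_back(static_cast<uint8_t>(Value >> 8));
  Out.push_back(static_cast<uint8_t>(Value & 0xFF));
}

// Hypothetical padding helper: fills Count bytes with m68k NOPs (0x4E71).
// Returns false, writing nothing, when Count is not a multiple of the
// 2-byte instruction size.
inline bool writeNopPadding(std::vector<uint8_t> &Out, uint64_t Count) {
  constexpr uint16_t NopEncoding = 0x4E71;
  if (Count % 2 != 0)
    return false;
  for (uint64_t I = 0; I != Count; I += 2)
    writeBE16(Out, NopEncoding);
  return true;
}
```

Writing the word through an explicit big-endian helper keeps the byte order independent of the host, which is the concern behind the suggestion to use llvm/Support/EndianStream.h in the real code.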