Add instruction subset for the ARC backend
ClosedPublic

Authored by tatyana-krasnukha on Sep 18 2017, 10:42 AM.

Details

Summary

Add instruction subset required for disassembling:

  • All compact (16-bit) instruction formats;
  • A couple of 32-bit instructions;
  • Ability to print symbolic information for branch and call instructions.

clang-format is applied.

Diff Detail

Repository
rL LLVM

Hi Tatyana,

Thanks for working on this!
Can you add disassembly tests for the new 16-bit short instructions that you added here?
There are a few places in the ARCDisassembler code where you use 'auto' instead of standard scalar types (unsigned/uint64_t/etc.). Can you use these types instead of auto?
You create a new namespace (arc::inst_decoder) in the Disassembler. This looks different than the rest of the Disassemblers, can you use static routines instead?
Thanks again!

ARC/ARCInstrInfo.td
217 ↗(On Diff #115671)

These aren't right, please remove these MultiPat<sub, SUB[123]...> lines. You can keep the defm SUB[123]: ... lines.

ARC/Disassembler/ARCDisassembler.cpp
128 ↗(On Diff #115671)

Would DecodeGBR32ShortRegister be a better name?

172 ↗(On Diff #115671)

I think static_cast is better than reinterpret_cast here?

Hi Pete, thank you for review!

I'll do suggested corrections as soon as I'll have tests written.
Speaking of static and namespaces, I don't like this C-style code, but if you insist...)

ARC/ARCInstrInfo.td
217 ↗(On Diff #115671)

Ok.

ARC/Disassembler/ARCDisassembler.cpp
128 ↗(On Diff #115671)

Yes, probably it is...

172 ↗(On Diff #115671)

static_cast could lead to the mistaken belief that this type cast is safe, whereas reinterpret_cast is more conspicuous and reminding that the code requires changes)

Hello Tatyana,

I am mostly going with the style/conventions observed in the other backends/disassemblers. Can we use these current conventions for this patch? I feel like ARCDisassembler doesn't need many changes to enable the new instructions.
That will let us focus on the main new material here (new .td definitions/16-bit disassembly support). Perhaps suggestions on when do use C++ namespaces and MCDisassembler* instead of void* that would apply for all disassemblers could be proposed in a separate patch?

So, I view the goal of this patch is to provide enough 16-bit instruction format information so we can disassemble all 16-bit instructions with these formats.
Thanks again!

ARC/ARCInstrFormats.td
882 ↗(On Diff #115671)

For consistency, can we change asmstr#"stuff" -> !strconcat(asmstr, "stuff")?

ARC/ARCInstrInfo.td
421 ↗(On Diff #115671)

If we map bits 10-8,4-3 to the g register, do we need the custom decoder method anymore?

423 ↗(On Diff #115671)

This doesn't need to be strconcat.

ARC/Disassembler/ARCDisassembler.cpp
286 ↗(On Diff #115671)

See my earlier comment about mapping the correct bits for the g register. I don't think this custom decoder routine is needed.

368 ↗(On Diff #115671)

Please remove the braces from this conditional.

ARC/InstPrinter/ARCInstPrinter.cpp
105 ↗(On Diff #115671)

This reads a bit easier as 2 lines, and is how other printers do this:
OS << "0x";
OS.write_hex(CE->getValue());

Added tests, fixed ARCDisassembler.cpp according suggestions.

ARC/ARCInstrInfo.td
421 ↗(On Diff #115671)

Yes, it is required to handle such cases as limm data operand or no destination (MOV_S 0,h and MOV_S 0,limm). I haven't found the way to implement it in .td

Applied fixes from 2nd comment.

Forgot to change decoder method name.

Hello Tatyana,

Thanks again for the work here.
I applied this patch, and am getting build errors (DecodeFromCyclicRange, ReadField used before declared, others).
Assuming these declarations get moved to the right places, my only remaining question is how others feel about the ReadField/ReadFields variadic templates.

lib/Target/ARC/Disassembler/ARCDisassembler.cpp
265 ↗(On Diff #118015)

This makes the users more compact, but it seems less readable.

Thanks for comment, Pete!

Fixed the order of declarations and using of functions.
Removed variadic template for better readability.

tatyana-krasnukha marked an inline comment as done.Oct 12 2017, 9:22 AM

I'm getting a few errors when running these tests now?

misc.txt:25:10: error: expected string not found in input
# CHECK: bl -2028
         ^
<stdin>:9:2: note: scanning from here
 bl -1996

br.txt:6:10: error: expected string not found in input
# CHECK: brlo %r10, %r4, -112
         ^
<stdin>:3:2: note: scanning from here
 brlo %r10, %r4, -108
 ^

compact.txt:81:10: error: expected string not found in input
# CHECK: b_s 256
         ^
<stdin>:28:2: note: scanning from here
 b_s 316
 ^

Corrupted it by merging...

petecoup accepted this revision.Nov 8 2017, 8:35 AM

Hi Tatyana,

Thanks for the delay, I only have a couple other minor things.

lib/Target/ARC/Disassembler/ARCDisassembler.cpp
275 ↗(On Diff #118804)

Can we use the appropriate unsigned type, rather than auto?

280 ↗(On Diff #118804)

Can we pull this out into another static routine?

This revision is now accepted and ready to land.Nov 8 2017, 8:35 AM

Hi Pete,

lib/Target/ARC/Disassembler/ARCDisassembler.cpp
275 ↗(On Diff #118804)

Ok. I would like to use decltype(Insn) instead of uint64_t to draw the attention that size of return value is always the same as Insn size. Do you mind?

280 ↗(On Diff #118804)

This routine has no sense outside the scope of this function.

May be if the routine will take a parameter determine whether this is 32 or 16 bit instruction and check register number (30/62), depending on it, there will be good reason to make it a separate static function and use in many other places... But I guess to do it in another patch, do you think?

petecoup requested changes to this revision.Nov 8 2017, 10:51 AM
petecoup added a reviewer: kparzysz.

I don't personally feel strongly about either of these.
I'm generally trying to match the style in other backends, which on observation I thinke would either just implement the logic inline, or create a new static routine...and omit auto.
I was wrong on other details myself originally.
But, I'm also a newbie at this reviewing bit (like I somehow mistakenly marked this as accepted?), so I'm happy to be told otherwise.

This revision now requires changes to proceed.Nov 8 2017, 10:51 AM

I agree that matching the style in other backends is very important, but it is hard to be consistent with last, because it was written long before even c++11 was released...

Use type of instruction for its fields instead of 'auto'.

Removed accidentally added file.

petecoup accepted this revision.Mon, Nov 13, 6:52 PM

Hi Tatyana,

OK, what you have here is fine with me. Thanks for all of the work here!

This revision is now accepted and ready to land.Mon, Nov 13, 6:52 PM
tatyana-krasnukha marked an inline comment as done.Tue, Nov 14, 4:29 AM

Hi Pete,

Could you commit it for me, please? I have not commit access.

@kparzysz, procedural question...am I OK to commit this for Tatyana, or do we need someone else to also sign off on this as well?

tatyana-krasnukha removed a reviewer: modocache.

When I added instructions, I didn't care about properties like isBranch, isBarrier, etc., because didn't know its purpose. But it was found that debugger cannot step over a range of instructions correctly without this knowing, thus, I've added appropriate fields to instructions.

Also, replaced empty { } with ;

petecoup accepted this revision.Mon, Nov 27, 8:49 PM

OK, I was aware of these...but didn't know that you'd need them for the debugger. There are a couple of others (mayLoad, mayStore, Defs STATUS32), but I was going to add them when the code generation uses them. Thanks!

Hi Pete,
Now I have commit after approval access and would land this revision. May I do it now?

Hello Tatyana,

This looks fine to me. If you can commit, please do!

This revision was automatically updated to reflect the committed changes.