This is an archive of the discontinued LLVM Phabricator instance.

[tblgen][disasm] Allow multiple encodings to disassemble to the same instruction
Closed · Public

Authored by dsanders on Sep 21 2018, 10:39 AM.

Details

Summary

Add an AdditionalEncoding class which can be used to define additional encodings
for a given instruction. This causes the disassembler to add an additional
encoding to its matching tables that maps to the specified instruction.

Usage:

def ADD1 : Instruction {
  bits<8> Reg;
  bits<32> Inst;

  let Size = 4;
  let Inst{0-7} = Reg;
  let Inst{8-14} = 0;
  let Inst{15} = 1; // Continuation bit
  let Inst{16-31} = 0;
  ...
}
def : AdditionalEncoding<ADD1> {
  bits<8> Reg;
  bits<16> Inst; // You can also have bits<32> and it will still be a 16-bit encoding
  let Size = 2;
  let Inst{0-3} = 0;
  let Inst{4-7} = Reg;
  let Inst{8-15} = 0;
  ...
}

With those definitions, llvm-mc will successfully disassemble both of these:

0x01 0x00
0x10 0x80 0x00 0x00

to:

ADD1 r1

Depends on D52366
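
As a rough illustration of the matching behaviour described in the summary, the Python sketch below models a decoder table in which two entries map to the same instruction name. It is not the code tblgen emits: the `Encoding` tuple, the `ENCODINGS` table, the `disassemble` helper, and the little-endian byte order are all assumptions made for the example, and the masks are derived by hand from the `let Inst{...}` lines in the usage snippet.

```python
from typing import NamedTuple, Optional


class Encoding(NamedTuple):
    name: str       # instruction this encoding maps to
    size: int       # encoding size in bytes
    mask: int       # bits that are fixed by the encoding
    value: int      # required value of the fixed bits
    reg_lo: int     # lowest bit of the Reg field
    reg_width: int  # width of the Reg field in bits


# Hypothetical table mirroring the two encodings from the usage snippet.
ENCODINGS = [
    # 32-bit form: Inst{0-7}=Reg, Inst{8-14}=0, Inst{15}=1, Inst{16-31}=0
    Encoding("ADD1", 4, 0xFFFFFF00, 0x00008000, 0, 8),
    # 16-bit form: Inst{0-3}=0, Inst{4-7}=Reg, Inst{8-15}=0
    Encoding("ADD1", 2, 0xFF0F, 0x0000, 4, 4),
]


def disassemble(data: bytes) -> Optional[str]:
    """Return the first table entry whose fixed bits match, or None."""
    for enc in ENCODINGS:
        if len(data) < enc.size:
            continue
        # Assumption: bytes are assembled into Inst in little-endian order.
        inst = int.from_bytes(data[:enc.size], "little")
        if (inst & enc.mask) == enc.value:
            reg = (inst >> enc.reg_lo) & ((1 << enc.reg_width) - 1)
            return f"{enc.name} r{reg}"
    return None
```

Under these assumptions, `disassemble(bytes([0x10, 0x00]))` and `disassemble(bytes([0x01, 0x80, 0x00, 0x00]))` both return `"ADD1 r1"`. The byte values are chosen to fit the assumed byte order and field layout rather than copied from the summary; a target that assembles `Inst` from bytes differently would see different values.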

Diff Detail

Repository
rL LLVM

Event Timeline

dsanders created this revision. Sep 21 2018, 10:39 AM
nhaehnle added inline comments.
include/llvm/Target/Target.td
406–407 (On Diff #166508)

Is Predicates actually used? If no, better to leave it out for now. If yes, remove its definition from Instruction.

dsanders added inline comments. Oct 22 2018, 9:34 AM
include/llvm/Target/Target.td
406–407 (On Diff #166508)

> Is Predicates actually used? If no, better to leave it out for now.

Yes, it's used in my out-of-tree target to control availability of an instruction encoding between different revisions of the ISA.

> If yes, remove its definition from Instruction.

Well spotted. I'll remove that when I commit. Aside from that, does the patch look good to you?
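
The use of Predicates mentioned in this reply (making an encoding available only on some ISA revisions) can be pictured with the following Python sketch. It is only a model of the idea, not the generated decoder: the feature names `isa-v1` and `isa-v2`, the `Encoding` tuple, and the `decode` helper are hypothetical, and the mask/value pair simply reuses the 16-bit encoding from the summary.

```python
from typing import FrozenSet, NamedTuple, Optional


class Encoding(NamedTuple):
    name: str                 # instruction this encoding maps to
    mask: int                 # bits fixed by the encoding
    value: int                # required value of the fixed bits
    required: FrozenSet[str]  # features that must all be enabled


# Hypothetical table: the 16-bit ADD1 encoding exists only on "isa-v2".
TABLE = [
    Encoding("ADD1", 0xFF0F, 0x0000, frozenset({"isa-v2"})),
]


def decode(inst: int, features: FrozenSet[str]) -> Optional[str]:
    """Match inst against entries whose predicates hold for features."""
    for enc in TABLE:
        if not enc.required <= features:
            continue  # predicate failed: encoding unavailable on this ISA
        if (inst & enc.mask) == enc.value:
            return enc.name
    return None
```

With this table, `decode(0x0010, frozenset({"isa-v2"}))` returns `"ADD1"`, while the same bits with only `isa-v1` enabled return `None`.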

Herald added a project: Restricted Project. Jun 11 2019, 11:14 AM
bogner accepted this revision. Jun 12 2019, 12:28 PM

Sorry for the delay! This LGTM

This revision is now accepted and ready to land. Jun 12 2019, 12:28 PM
This revision was automatically updated to reflect the committed changes.

Is this feature (AdditionalEncoding) used by any downstream target? There is no in-tree user.

(@jyknight)

Herald added a project: Restricted Project. Nov 12 2022, 11:04 PM

> Is this feature (AdditionalEncoding) used by any downstream target? There is no in-tree user.
>
> (@jyknight)

We use it extensively downstream to deal with variable length instructions.

> Is this feature (AdditionalEncoding) used by any downstream target? There is no in-tree user.
>
> (@jyknight)

Huh, despite doing a bunch of work recently in the encoder/decoder tablegen code, I never even noticed this feature existed before now. :)

> We use it extensively downstream to deal with variable length instructions.

I'd note that we have many targets upstream which handle the "compressed instruction" idea just fine, without this feature. E.g. RISCV's "C" (compressed) instructions, ARM Thumb/Thumb2, CSKY 16-bit instrs, MicroMIPS, etc. In all cases, they define separate instructions for the "compressed" variant. AFAICT, none of them could actually use this feature even if it wasn't only restricted to disassembly -- they need to be able to express a different register class (since the compressed instruction allows fewer registers), or be able to tie input==output, because the compressed instruction has fewer register fields, or reduce the allowed range of the immediate, or ....

Probably it'd be better to have the downstream target also write multiple instruction definitions, using proper register classes/etc, instead of this hack.
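
One concrete instance of the register-class point: in the RISC-V C extension, several compressed formats carry 3-bit register fields that can only name x8 through x15. The Python sketch below models just that mapping (the helper names are hypothetical, and nothing here is LLVM code) to show why a compressed form cannot reuse a full-width register operand unchanged.

```python
def expand_compressed_reg(field: int) -> int:
    """Map a 3-bit compressed register field to the full register number."""
    if not 0 <= field <= 7:
        raise ValueError("compressed register field is 3 bits wide")
    return 8 + field  # field values 0..7 select x8..x15


def fits_compressed(reg: int) -> bool:
    """True if a full register number is reachable from a 3-bit field."""
    return 8 <= reg <= 15
```

Here `fits_compressed(1)` is false, so x1 cannot be named by a 3-bit field. Expressing that restriction is what a separate instruction definition with a narrower register class provides.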