This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add MC encodings and tests of the Bit Manipulation extension
ClosedPublic

Authored by PaoloS on Aug 2 2019, 6:13 AM.

Diff Detail

Event Timeline

PaoloS created this revision.Aug 2 2019, 6:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 2 2019, 6:13 AM
simoncook added inline comments.Aug 5 2019, 3:12 AM
llvm/lib/Target/RISCV/RISCV.td
53

nitpick: the capitalisation here looks weird.

99

This predicate probably needs a comment

llvm/lib/Target/RISCV/RISCVInstrInfoB.td
18

Probably worth a comment noting that the intent is to move these to the InstrFormats file once B has been ratified.

lewis-revill added inline comments.
llvm/lib/Target/RISCV/RISCVInstrInfoB.td
18

I'm concerned that with this definition you won't be able to enforce the immediate fitting in the field. A better approach would probably be to add an instruction format from scratch with a uimm6 shamt operand. Likewise for RVBInstFunct5, RVBInstFunct6, RVBInstFunct7, with the appropriate operand types for however many bits the shamt operand requires. Invalid immediate tests should also then be added.

68

Is there an explanation for why this instruction is not re-using RVInstR4 from RISCVInstrFormats.td?

89

Slight typo here, this instruction format is for more than one instruction so should use opcodestr rather than the hard coded "fsriw".

108

Vice-versa here, there is still an opcodestr passed to this instruction format. I prefer the approach of keeping the instruction format generic even if there will only ever be one opcodestr, and having the instruction specify it.

PaoloS updated this revision to Diff 213913.Aug 7 2019, 8:47 AM

Fixed typos, bad indentations and capitalization.
Added uimm6 and uimm7 riscv asm operand types in order to have more precise classes to represent the templates of the instructions with 5-bit, 6-bit and 7-bit immediate fields. The riscv asm operand uimm5 already existed.
Added missing rev pseudo instructions for riscv64.
Fixed the tests accordingly.
Added comments.

PaoloS updated this revision to Diff 213940.Aug 7 2019, 10:18 AM
PaoloS marked 6 inline comments as done.

Fixed the indentation of the tests.

PaoloS updated this revision to Diff 218104.Aug 30 2019, 8:38 AM

Added the latest updates to the encodings of the instructions, including the gorc, gorci, bfp instructions, the 'orc' pseudo instructions based on gorci and the zip/unzip pseudo instructions based on shfli/unshfli. Changed the encoding of grev and bmatxor to make space for gorc. Added the subtarget feature Zbf to support bfp. Added the necessary MC tests.

PaoloS updated this revision to Diff 218330.Sep 2 2019, 4:40 AM
PaoloS edited the summary of this revision. (Show Details)
PaoloS added a comment.Sep 2 2019, 4:40 AM

Update the encodings to the latest version of the spec. Fix the tests and encoding conflicts.

PaoloS updated this revision to Diff 218749.Sep 4 2019, 10:54 AM

Fixed immediate values of shfli/unshfli for zip/unzip pseudo instructions.

PaoloS marked 2 inline comments as done.Sep 13 2019, 3:23 AM
PaoloS added inline comments.
llvm/lib/Target/RISCV/RISCVInstrInfoB.td
68

Yes, that was because with RVInstR4 I couldn't use funct3 for encoding. As it is declared in RISCVInstrFormats.td:

class RVInstR4<bits<2> funct2, RISCVOpcode opcode, dag outs, dag ins,
               string opcodestr, string argstr>
    : RVInst<outs, ins, opcodestr, argstr, [], InstFormatR4> {
  bits<5> rs3;
  bits<5> rs2;
  bits<5> rs1;
  bits<3> funct3;
  bits<5> rd;

  let Inst{31-27} = rs3;
  let Inst{26-25} = funct2;
  let Inst{24-20} = rs2;
  let Inst{19-15} = rs1;
  let Inst{14-12} = funct3;
  let Inst{11-7} = rd;
  let Opcode = opcode.Value;
}

any parameter passed as funct3 would be ignored.
I opted for a new class that allows to use funct3 as an extra opcode field too. It would be good to consider that for the original RVInstR4 too though.

PaoloS marked an inline comment as done.Sep 13 2019, 3:24 AM
PaoloS updated this revision to Diff 220538.Sep 17 2019, 10:48 AM

Removed new line.

PaoloS updated this revision to Diff 229480.Nov 15 2019, 2:14 AM

Update encodings to RISCV BitManip v0.92:

add the encodings of intstructions packu[w], packh, sext.b, sext.h,
update the encoding of instructions bfp[w] and bdep[w],
update tests accordingly.

PaoloS edited the summary of this revision. (Show Details)Dec 3 2019, 8:45 AM
PaoloS updated this revision to Diff 242046.Feb 3 2020, 6:02 AM

Added empty scheduling descriptions, just to let it build with the upstream LLVM.

Added zext.* and sext.w pseudo instructions.

simoncook updated this revision to Diff 243327.Feb 7 2020, 5:11 PM

Update Target Features to be of form zb<x> rather than just b<x>.

simoncook updated this revision to Diff 243329.Feb 7 2020, 5:18 PM

(Actually update patch rather than rebase of previous version)

simoncook added inline comments.Feb 7 2020, 5:20 PM
llvm/lib/Target/RISCV/RISCV.td
107

This isn't doing what you want it to. This patch currently causes MC/RISCV/invalid-instruction-spellcheck.s to fail because of the lack of an AssemblyPredicate so the instruction is always enabled.

Unfortunately it looks like AsmPredicate doesn't support saying if Feature X or Y is enabled, and we can only compose them as needing X and Y on the Instruction definitions themselves. I couldn't find any other backend needing something like this, so it might be that we need a dummy SubtargetFeature, or teach TableGen that AsmPredicates can be composed in an or fashion instead.

evandro added inline comments.Feb 13 2020, 3:10 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
155 ↗(On Diff #243329)

The more granular check (hasStdExtZbb()?) should be used here instead.

simoncook updated this revision to Diff 250768.Mar 17 2020, 7:37 AM
  • Update patch now AssemblerPredicate update has landed
  • Add "experimental-" to all bitmanip SubtargetFeatures
simoncook marked an inline comment as done.

Address review comments and clang-format changes

Address own feedback on RISCV.td and cpp files

This looks good to me, as long as it covers the spec. Can you add more reviewers though?

This is starting to look good. I've checked all the encodings, and other than c.zext.w having the wrong value all the encodings are right.

One thing I would consider is to make reviewing/future changes easier, I would define all the instructions in the order they appear in the Chapter 2.11 Opcode Encodings tables. This way, it will be easier to see for future revisions what has changed and make sure the encodings are up to date. This might end up with a few more let Predicates = directives, but we can rearrange and tidy this up when it is all ratified.

llvm/lib/Target/RISCV/RISCVInstrInfoB.td
87

It seems like this is just RVInstR but with a opcode flag, but its only used with OPC_OP_IMM_32, can this just assume that opcode type and have a better name (assuming the opcode type makes sense)?

117

This looks like it mostly used for just the 'W' variants of instructions, in that case is this the most appropriate name if the opcode type is OPC_OP_32 by default? It seems for others you're doing the right thing and using ALU_rr as the class you derive from

119

remove {}

185

I don't know if splitting up the 12-bit immediate into a 7 and a 5 might make future maintenance of this easier, given that that's how its expressed int he manual?

185

I'm also unsure on the use of the opcode enum when it represents the wrong thing, I don't know if this causes some confusion, but we can see if anyone else has any comments here?

389

This block is missing C.NEG with funct2 01?

393

c.zext.w should have 10 as its funct2. Also zest -> zext

llvm/test/MC/RISCV/rv64bc-valid.s
13 ↗(On Diff #251005)

zest -> zext

asb added a comment.Mar 19 2020, 6:31 AM

This is starting to look good. I've checked all the encodings, and other than c.zext.w having the wrong value all the encodings are right.

One thing I would consider is to make reviewing/future changes easier, I would define all the instructions in the order they appear in the Chapter 2.11 Opcode Encodings tables. This way, it will be easier to see for future revisions what has changed and make sure the encodings are up to date. This might end up with a few more let Predicates = directives, but we can rearrange and tidy this up when it is all ratified.

Strong +1 on this. Throughout the RISC-V backend we order the instruction definitions to match the order in the ISA manual, with codegen patterns listed separately in a more meaningful order. As Simon say, it can lead to a few more Predicates directives needed, but makes cross-referencing vs the spec _much_ easier.

Looks tidy and a really decent set of tests. I've added a few comments, though they are mainly me agreeing with Simon's points.

llvm/lib/Target/RISCV/RISCVInstrInfoB.td
183

Do these properties need to be specified, or do the values match the values which would be inferred by TableGen anyway (since all the DAG patterns are empty)?

185

I would also recommend avoiding the opcode enum values. It implies a connection to the base instruction encodings which is entirely incidental, and the name is also misleading.

185

I also think it is worth matching the manual and splitting up the 12 bit encoding into separate 7 and 5 bit fields.

389

c.neg will need to be added to the tests too.

PaoloS marked 5 inline comments as done.Mar 19 2020, 8:42 AM
PaoloS added inline comments.
llvm/lib/Target/RISCV/RISCVInstrInfoB.td
87

Well, it is used by immediate instructions, so I think and indication of that should appear in the name. I wouldn't base it on RVInstR if that's what you meant. Anyway, in line with what you suggested about RVInstR I could call it ALU_ri_32, but ALU_ri uses a 12 bits immediate field, unlike the instructions using RVBInstImm5 that only use 5 bits immediates. So it's not just a matter of opcode but also of different fields. I could call it RVBInstShift32, as some of the 32 bit counterparts of such instructions use RVBInstShift. The problem is that instructions like GREVIW and GORCIW, or SBSETIW, SBCLRIW, SBINVIW have little to do with shifts. While SLOIW and SROIW have. At the time RVBInstImm5 seemed a good way to express the main difference of such template: the immediate operand bit width. I'll think about better names.

117

I could relate to ALU_rr with the name ALU_rr_32 or ALU_rr_w, just to specify the difference.

185

I agree. The explicit encoding would avoid some confusion here, even though the fact that this instruction uses the number OPC_OP_IMM for its opcode raises the clash between its immediate format and its "not immediate" usage. This might be subject to some adjustments once we move closer to have the encodings ratified.

185

Splitting the immediate field into fields of 7 and 5 fields seems interesting. I mean, I guess it could prove useful in the future. Never the less it can't be applied anyway to all those templates like RVBInstImm5, RVBInstImm6 and RVBInstImm7, as they expect one of the two fields to be used as an immediate operand. So yes, I could do it, but it applies anyway only to the instructions that use those 12 bits only for flags. Do you agree?

389

Correct. The compressed instructions seem outdated. I'll fix them.

PaoloS marked 2 inline comments as done.Mar 19 2020, 1:59 PM
PaoloS added inline comments.
llvm/lib/Target/RISCV/RISCVInstrInfoB.td
117

Never mind, I'll use ALUW_rr that was already available. Must have missed it.

389

Correction. Adding any of c.not, c.neg, c.zext.w or c.zext.d creates a decoding conflict with the current encoding of c.addi16sp, c.lui and c.lui (hint) already implemented in the C extension. In the current proposed encoding of the compressed instructions a decoding conflict is due. Since these instructions though are meant to be part of the C extensions, not the B extension I think it's better to leave them out of this patch. Once a definitive encoding will be defined for them they should be added with no problems to the C extension.

PaoloS updated this revision to Diff 252200.Mar 24 2020, 4:52 AM
PaoloS marked an inline comment as done and an inline comment as not done.
  • Sorted the definitions of the instructions and the MC tests according to the order of the encodings in the manual.
  • Improved the names of the instruction templates.
  • Removed unused operand types and instruction templates.
  • The features hasSideEffects, mayLoad and mayStore are now set only on the templates that need it.
  • Renamed all the immediate parameters in the templates as 'shamt', for uniformity.
  • Split the funct12 field of the unary instruction template into two fields of 7 and 5 bits. For conformity to the manual and for maintainability.
  • Added a 4-bit immediate operand for the 32 bit shuffle.
  • Added constraints on the size of the immediate operands.
  • Added tests on the constraints of the immediate operands.
  • Renamed most registers in the tests for uniformity.
  • Removed the compressed instructions as they are still a proposal for the C extension and still cause encoding conflicts with the ones aready implemented there.
  • Removed tests of compressed instructions.
PaoloS marked 13 inline comments as done.Mar 24 2020, 4:55 AM
PaoloS added inline comments.
llvm/lib/Target/RISCV/RISCVInstrInfoB.td
183

Yes they are needed. TableGen gives error if they are not specified.

PaoloS marked an inline comment as done.Mar 24 2020, 4:55 AM
PaoloS updated this revision to Diff 252558.Mar 25 2020, 6:44 AM
  • Added bitmanip compressed instructions.
  • Added compressed instruction emission.
  • Added mc tests.
  • Added assembler predicate strings to B extension and subextensions.
  • Fixed typos.
  • Fixed the order of the single bit instructions.
PaoloS marked an inline comment as done.Mar 25 2020, 6:45 AM
PaoloS updated this revision to Diff 252567.Mar 25 2020, 7:00 AM

Fixed obsolete comments.

asb added a comment.Mar 26 2020, 7:33 AM

I'm reviewing this with an eye to merging it, but one big thing that comes to mind is the compressed instructions. The draft bitmanip spec describes these under "Future compressed instructions" and says "It presumably would make sense for a future revision of the “C” extension to include compressed opcodes for those instructions." My reading is that this is more of a sketch of potential encodings and a less firm proposal than the 32-bit encodings described elsewhere in the spec. Do you disagree with that assessment?

edward-jones accepted this revision.Mar 26 2020, 7:55 AM

This looks good to me now.

This revision is now accepted and ready to land.Mar 26 2020, 7:55 AM
In D65649#1943742, @asb wrote:

I'm reviewing this with an eye to merging it, but one big thing that comes to mind is the compressed instructions. The draft bitmanip spec describes these under "Future compressed instructions" and says "It presumably would make sense for a future revision of the “C” extension to include compressed opcodes for those instructions." My reading is that this is more of a sketch of potential encodings and a less firm proposal than the 32-bit encodings described elsewhere in the spec. Do you disagree with that assessment?

This is one thing I've been discussing with Paolo a bit and I'm not 100% sure what the correct approach is here.

I agree, it seems more like a sketch of what they should be, but the mentioning of the size improvements suggested to me perhaps they should be, since evaluating the potential "what the final B extension should look like" it probably should be included. There is the question of where this should be enabled, my first thought (which this patch implements) is Zbb because it seemed the most logical, but maybe given the instructions stated state does another subextension make sense for its evaluation? These are all experimental so I think there's some leeway to some option, but I think it would be good to land it if we think people may be using it/want to evaluate it. I suspect there's many valid routes with this being experimental, we just need to choose one we have concensus around making sense. Perhaps a topic for today's call?

asb added a comment.Mar 26 2020, 8:29 AM

Submitting the comments I have so far - need to continue going through in more detail.

llvm/lib/Target/RISCV/RISCV.td
123

Nit: indentation of this list seems arbitrary. I'd have that [ should be aligned with the first " on the line above?

llvm/lib/Target/RISCV/RISCVInstrInfoB.td
2

Doesn't match the filename

10

This should probably say something like "This file describes the RISC-V instructions from the standard 'B' Bitmanip extension, version 0.92. This extension has not yet been ratified (meaning LLVM support is experimental)."

49

You're using a different style here than elsewhere in the backend, where we try to use the basic RISC-V formats as the superclass wherever possible (and where necessary, defining a subclass in RISCVInstrInfo*.td). RVBUnary for instance just seems to be an R-type instruction (albeit, one where the rs2 field is hard coded). Also, for the base instr format class we don't set hasSideEffects/mayLoad/mayStore (these are semantic details that aren't necessarily implied by the isntr format).

52

Should be InstFormatR?

asb added a comment.Mar 26 2020, 8:31 AM

I agree, it seems more like a sketch of what they should be, but the mentioning of the size improvements suggested to me perhaps they should be, since evaluating the potential "what the final B extension should look like" it probably should be included. There is the question of where this should be enabled, my first thought (which this patch implements) is Zbb because it seemed the most logical, but maybe given the instructions stated state does another subextension make sense for its evaluation? These are all experimental so I think there's some leeway to some option, but I think it would be good to land it if we think people may be using it/want to evaluate it. I suspect there's many valid routes with this being experimental, we just need to choose one we have concensus around making sense. Perhaps a topic for today's call?

Yes, let's discuss today. I agree there's value in making it easy to evaluate the compressed instructions too. My concern would be that people implementing experimental support in simulators or CPUs might skip over the section as more speculative, meaning that if we emitted those encodings those users might have a bad experience.

PaoloS updated this revision to Diff 253455.Mar 29 2020, 1:25 PM
  • Changed some instruction template classes so that they are not based directly on the basic RVInst.
  • Changed the indentation where possible so that it fits in 80 columns.
  • Fixed some obsolete comments.
  • Added some runs to the Zbp tests.
  • Fixed other indentation problems.
PaoloS marked 5 inline comments as done.Mar 29 2020, 1:27 PM
asb accepted this revision.EditedApr 9 2020, 7:34 AM

I think we should land this and continue further development in-tree. There might be a few minor refactorings I want to propose after the fact, but these shouldn't block this going forwards.

LGTM. Thank you for all your work in putting this together and the care you've taken to do so in a way that matches the rest of the backend.

llvm/test/MC/RISCV/rvZbproposedc-invalid.s
1 ↗(On Diff #253455)

Nit: should be "-mattr=+experimental-zbproposedc,+c" given that +experimental-b won't turn on the proposed C additions by itself.

This revision was automatically updated to reflect the committed changes.