[RISCV 4/10] Add basic RISCV{InstrFormats,InstrInfo,RegisterInfo,}.td
ClosedPublic

Authored by asb on Aug 16 2016, 8:25 AM.

Details

Summary

For now, only add instruction definitions for basic ALU operations. Our initial target is a working MC layer rather than codegen, so appropriate SelectionDAG patterns will come later.

Diff Detail

Repository
rL LLVM
asb retitled this revision from to [RISCV 4/10] Add basic RISCV{InstrFormats,InstrInfo,RegisterInfo,}.td.Aug 16 2016, 8:25 AM
asb updated this object.
asb added reviewers: theraven, jyknight.
asb added a subscriber: llvm-commits.
emaste added a subscriber: emaste.Aug 16 2016, 2:42 PM
t.p.northover added inline comments.
lib/Target/RISCV/RISCVInstrFormats.td
33 ↗(On Diff #68187)

Nit: since the spec seems to list instructions from bit 31 down to 0 (as do TableGen 0bxxxx literals), definitions might read more naturally if you make the order "funct7, funct3, opcode".

97 ↗(On Diff #68187)

This immediate doesn't appear to have a bit 0 in the spec (it fits into exactly the same fields as imm12 does).

So what we're really deciding here is what the immediate in the MCInst representation of "BEQ #x" should be. Valid answers are "x" and "the bits representing how x is encoded", with newer targets tending to favour the latter.

This is mostly because it makes it harder by design to construct an invalid MCInst when you're doing it manually. Other than that, it's just punting complexity around the various MC layer components:

MC componentwith "x"with bits
AsmParsernopencodes
Disassemblerdecodesnop
AsmPrinternopdecodes
Fixup handlingencodesnop

Either way "imm13" is probably misleading as a name and it's probably a good idea to be consistent between formats if possible (I note FU has taken the second approach).

asb added inline comments.Aug 18 2016, 8:09 AM
lib/Target/RISCV/RISCVInstrFormats.td
33 ↗(On Diff #68187)

That's a good point. Obviously the advantage of listing the opcode first is it reads more logically when writing something in RISCVInstrInfo that inherits from it. i.e. you first specify the high-level opcode then might select a more specific options by varying the funct bits. More closely matching the presentation in the spec is attractive though.

97 ↗(On Diff #68187)

I am incredibly glad you raised this issue, as it's something I've agonised back and forth over. I considered writing something up about this in the patch description, but thought it may be of limited interest. The same issue you raise applies to FUJ vs FU.

I started going in the direction of imm13 because:

  1. this better matches the spec which describes immediates in terms of slicing the integer logically represented
  2. the idea that the MCInst representation of 'BEQ #x` should be 'x' appealed to me.

The logical value is a 13-bit immediate with the restriction that the LSB must be 0, but of course this is encoded in 12 bits. Arguably imm13m0 would be more descriptive (imm13, masked bit0) but this description is perhaps a little opaque to most. As you've spotted, I ended up going a different direction with FU. After deciding imm32 was misleading, I tried out with imm20shl12 as a description of the value that's logically a 20-bit immediate shifted left 12 places. However I think I found that for whatever reason the code manipulating this didn't read cleanly.

The argument that now FU has its arguments defined in terms of the bit encoding, FSB and FUJ are inconsistent with it is persuasive. I'll modify so FSB takes an imm12 and FUJ takes an imm20.

reames added a subscriber: reames.Aug 21 2016, 12:08 PM
reames added inline comments.
lib/Target/RISCV/RISCVRegisterInfo.td
18 ↗(On Diff #68187)

Minor: Ri might be overly terse. RegID? Or something which gives some context when seen elsewhere?

asb added inline comments.Aug 22 2016, 2:57 AM
lib/Target/RISCV/RISCVInstrFormats.td
97 ↗(On Diff #68187)

Do you have any thoughts on a clearer naming convention? Suppose I change FUJ so it takes an imm20 (i.e. the bits to be encoded), there are a few choices for naming the operand (shown below in context:

  1. def JAL : FUJ<0b1101111, (outs GPR:$rd), (ins simm21_mask1:$imm20). The argument for this would be that simm21_mask1 describes the logical value rather than the physical encoding and we don't have to describe the value in terms of either encoding or decoding. As part of def simm21_mask1 : Operand<i32> we would add a DecoderMethod and EncoderMethod that converts to/from the imm20 bit encoding
  2. def JAL : FUJ<0b1101111, (outs GPR:$rd), (ins simm20_lsl1:$imm20). This matches what the Mips backend has done. The name reflects how to go from the encoded 20-bit representation to the logical value that represents.
  3. def JAL : FUJ<0b1101111, (outs GPR:$rd), (ins simm21_asr1:$imm20). The name reflects how to go from the logical value (the signed 21-bit offset) to the 20-bit encoding.

It's not clear to me why it would be more logical to use option 2) vs option 3), which is what tempts me to stick with something along the lines of the current name. simm21_mask1 describes the logical value (a signed 21-bit value with the LSB guaranteed to be zero, i.e. has been masked & 1.

asb updated this revision to Diff 69349.Aug 26 2016, 5:20 AM
asb marked 5 inline comments as done.

Changed the order of parameters in RISCVInstrFormats.td so it more closely matches the RISC-V spec. As usggested by @t.p.northover, changed the way of naming the immediate in FUJ and FSB (refer to them as imm20 and imm12 which matches the number of bits stored in the instruction). Address comment from @reames about Ri being rather terse.

lib/Target/RISCV/RISCVRegisterInfo.td
18 ↗(On Diff #68187)

It's essentially never referred to again in user-written code - on the one hand this means it only needs to make sense in the context of this file, on the other it means there's little advantage in being terse. Other archs seem to call it ArchReg so I've renamed to RISCVReg.

Comments inline.

lib/Target/RISCV/RISCV.td
23 ↗(On Diff #69349)

Should these not be Processor, rather than ProcessorModel. I was under the impression that Processor is intended for generic families, whereas ProcessorModel was intended for specific microarchitectures.

It would be nice to have Rocket as a specific ProcessorModel here. It may be in a later review, but we should probably also have SubtargetFeature flags for floating point, atomics, and so on here too.

lib/Target/RISCV/RISCVInstrFormats.td
143 ↗(On Diff #69349)

Shouldn't this now be

bits<20> imm20;

?

asb updated this revision to Diff 69423.Aug 26 2016, 12:49 PM
asb marked an inline comment as done.

Rename ProcessorModels to generic-rv32 and generic-rv64 and fix incorrect immediate size spotted by @jordy.potman.lists

asb added a subscriber: hfinkel.Aug 26 2016, 12:49 PM
asb added inline comments.
lib/Target/RISCV/RISCV.td
23 ↗(On Diff #69349)

I think actually ProcessorModel is preferred as it exposes the more modern SchedMachineModel rather than the legacy ProcessorItineraries. At least X86, AArch64, and Lanai describe generic ProcessorModels. See this email from @hfinkel, who can perhaps confirm my interpretation http://lists.llvm.org/pipermail/llvm-dev/2015-November/092214.html. It looks like an uppercase ProcessorModel name is uncommon though, so I will change these to generic-rv32 and generic-rv64.

Having a Rocket ProcessorModel and scheduling model is definitely part of the plan, and later patches will add appropriate SubtargetFeature flags.

lib/Target/RISCV/RISCVInstrFormats.td
143 ↗(On Diff #69349)

It should be - thanks!

t.p.northover accepted this revision.Aug 26 2016, 2:33 PM
t.p.northover added a reviewer: t.p.northover.

I think this looks reasonable now.

This revision is now accepted and ready to land.Aug 26 2016, 2:33 PM
jyknight added inline comments.Aug 26 2016, 9:39 PM
lib/Target/RISCV/RISCVRegisterInfo.td
27 ↗(On Diff #68187)

Are the risc-v dwarf register numbers documented anywhere? (I assume not, just like the relocation types, but thought I'd ask anyways. Maybe you can start a list of Things That Ought To Be Documented...)

63 ↗(On Diff #68187)

May want to use a different register allocation sequence. Typically backends put caller-save registers first, then callee save. In RISCV, we probably also want to prefer registers in the range X8-X15, since they're usable by the compressed instructions.

How about putting the registers in the order:
X10-X17 [a0-a7] , X5-X7 [t0-t2], X28-X31 [t3-t6], X8-X9 [s0-s1], X18-X27 [s2-s11], X0-X4 [zero, ra, sp, gp, tp]

That is: caller-save between x10-15, then the remaining caller-save, then callee-save, then specials (which will be reserved, anyways)

66 ↗(On Diff #68187)

I think using one set of registers for both the 32-bit arch and the 64-bit arch doesn't actually work right -- you rather want separate 32bit and 64bit versions of the registers (perhaps name them X{0..31}_32 and X{0..31}_64 for clarity), to make up the GPR32 and GPR64 register classes.

The SPARC backend actually does something like you have here, but I'm pretty sure its 64-bit support is fairly busted and is the wrong way to do things -- better to look at, perhaps, the PPC backend. Could name the registers X{0..31}_32 and X{0..31}_64.

asb marked an inline comment as done.Aug 31 2016, 5:25 AM
asb added inline comments.
lib/Target/RISCV/RISCVRegisterInfo.td
28 ↗(On Diff #69423)

They're not. I have added that to the list.

64 ↗(On Diff #69423)

There have been some responses to this that haven't been picked up by phabricator: 1, 2.

To restate the discussion there - I would like to initially focus on MC concerns in this patchset and only introduce codegen concerns when they can be tested. Would you be happy with a comment here explaining that the allocation order should be changed when codegen is implemented?

66 ↗(On Diff #69423)

Could you elaborate on where things go wrong for SPARC? If it's conceptual failing of having a single set of register IDs, what precisely is the issue? There doesn't for instance seem to be an assumption that any given Register is a member of only a single RegisterClass. Or is it likely due to bugs in tablegen or elsewhere?

In PPC and AArch64, the 32-bit registers are defined as subregisters of the 64-bit GPRs but I'm not sure if this is as correct for RISC-V. The compiler is either targeting RV32 or RV64, and when targeting RV64 there is no instruction that will modify only 32-bits of the register. Values are always held in sign-extended format so bit 31 will be copied in bits 32-63 if you for instance execute an ADDW or one of the other RV64 instructions defined to work on 32-bit quantities. Load instructions defined to either zero bits 32-63 or to sign-extend the loaded value.

asb marked an inline comment as done.Aug 31 2016, 7:47 AM
asb added inline comments.
lib/Target/RISCV/RISCVRegisterInfo.td
66 ↗(On Diff #69423)

Ok, having had a look at this some more I think I mostly answered my own question. When writing to a register in RV64 (e.g. with ADDW) we do indeed need to model in-register sign extension, which I think is similar to MIPS. But the operands to ADDW or SUBW come from truncating the 64-bit register. Arguably the most sensible way of modelling this is by having a 32-bit register which is a subregister of the matching 64-bit GPR.

What I really want is that almost all instructions are defined as taking register operands from the 'GPR' set, which depending on if the target is RV32 or RV64 may be GPR32 or GPR64 (or in the future with RV128, GPR128). A small number of instructions as SLLW are defined to take a 32-bit subregisters, and in the future with RV128 we would have ADDD, SLLD and friends which take the 64-bit subregister of one of the GPR128 register set.

I'll have more of a think about what I want to do here.

jyknight added inline comments.Aug 31 2016, 3:39 PM
lib/Target/RISCV/RISCVRegisterInfo.td
64 ↗(On Diff #69423)

Yes, a TODO note sounds fine. I just didn't want it to get forgotten just because a future change might not touch this line of code.

66 ↗(On Diff #69423)

Originally, the SPARC 64bit support was written by reusing the 32bit instruction patterns in r178527. This definitely didn't work, because the instructions specified the 32-bit register class. It apparently "almost" worked though -- until the return value needed to be spilled, and got stored in 4-bytes of memory instead of 8.

That particular thing been fixed since -- by adding separate 64-bit instructions -- but the legacy of specifying a single register set still remains. I'm not really sure what exact problems it causes, I've not really pushed on sparc64, since I only really care about sparc32.

But I do think the right thing really is to specify separate 32bit and 64bit registers (with a subreg relationship), and separately specify the integer instructions for each mode.

As far as modeling ADDW, I believe it's fine to model it as having the 32-bit subregs as input/outputs; that it mangles the upper bits while doing so should be fine.

asb added a comment.Sep 14 2016, 3:54 AM

Apologies for the slight delay, organising last week's LLVM Cauldron and other activities in my life had limited my time.

As discussed here, we really want to have two separate register classes for RV32 and RV64 and separate instruction definitions to match this. In RISC-V, an RV32 add and an RV64 add (and indeed, a future RV128 add) all use the same encoding. There seem to be a number of possible approaches to handling this:

  • As done by other backends such as Sparc would, ust define every instruction twice.
  • Make classes in RISCVInstrInfo like ALU_rr multiclasses that define both 32-bit and 64-bit versions of instructions.
  • You could also imagine extending TableGen with some sort of AST macro support or a new tablegen option that would handle substituting the register class for each defined instruction
  • A foreach at the top of RISCVInstrInfo that iterates over each register width (for now, 32 and 64-bit). Each def would be to be written like def ADD#x and some !cast or !if magic would be needed to get the appropriate RegisterClass based on the register width.
  • Move all class definitions to the top of RISCVInstrInfo and make each ALU_rr etch take a RegisterClass parameter. Then define a multiclass used to parametrise all instruction definitions.

I have trialled the last option from the list above. You can see a version of RISCVInstrInfo.td (based on the complete patchset) that implements this option at paste P7637. For those subscribing to this review - I'd really appreciate your comment on if you feel this is the best way to go about it (particularly TableGen wizards like @t.p.northover). Given the current patchset aims to focus on MC, I propose actually making this change in a later patch when RV32/RV64 codegen is introduced.

  • As done by other backends such as Sparc would, ust define every instruction twice.

Let's use PPC rather than Sparc as our baseline.

IMO, this simple thing of just defining the instructions twice probably makes the most sense to start out with. Don't forget you also need different types in the patterns for the different modes, not just different register classes -- afaik, you can't do that with a multiclass.

If, after everything's more finished, it does turn out to be easy to merge them with a multiclass macro, then OK. I'm skeptical that it will be easier that way, but that's fine. :)

On the other hand, it's also easy to split them apart later, so if you want to start with what you have in the pastebin and split it later on when/if becomes cumbersome that way, that's fine too. :)

In D23561#542222, @asb wrote:

As discussed here, we really want to have two separate register classes for RV32 and RV64 and separate instruction definitions to match this. In RISC-V, an RV32 add and an RV64 add (and indeed, a future RV128 add) all use the same encoding.

On a somewhat longer-term note:

We have the exact same situation with HVX: the vector registers can be 64 or 128-byte long, depending on the processor mode. The actual encodings are identical between the two modes, so it's possible to have a single binary that would work in both modes. We have every HVX instruction defined twice, and separate register classes for both types of registers. This is a pain and a mess, and I have been planning to get rid of that for quite some time now.

Since this type of situation now appears in several targets, I hope that this will be enough of a rationale to develop a proper support for this issue, namely register class with a non-constant register size/alignment. This should be fairly simple, actually, and I can develop a prototype for review, hopefully in a few days.

This shouldn't stop you from proceeding with a currently available solution, since I'm not sure how long it would take to develop a working approach that avoids the duplication.

In D23561#542222, @asb wrote:

On a somewhat longer-term note:

We have the exact same situation with HVX: the vector registers can be 64 or 128-byte long, depending on the processor mode. The actual encodings are identical between the two modes, so it's possible to have a single binary that would work in both modes. We have every HVX instruction defined twice, and separate register classes for both types of registers. This is a pain and a mess, and I have been planning to get rid of that for quite some time now.

Since this type of situation now appears in several targets, I hope that this will be enough of a rationale to develop a proper support for this issue, namely register class with a non-constant register size/alignment. This should be fairly simple, actually, and I can develop a prototype for review, hopefully in a few days.

Please add me to the review thread for this. We have the same thing in MIPS and it's even worse in CHERI (where we have 128- and 256-bit variants of the ISA and currently conditionally compile for only one of them).

All of that sounds great! I think it also points strongly towards simply duplicating the instructions in the RISCV target for now, as other targets are doing today, rather than working on half-measures towards improving things here.

Once/if better infrastructure becomes available, it can be changed over then.

asb added a comment.Sep 14 2016, 8:41 AM

@jyknight: Yes, that's a good point, once codegen is added I'll need to thread through a ValueType in a similar way to how I currently pass RegisterClass. I can definitely see the argument that describing instructions twice and abstracting later might lead to the best solution. I think it's definitely worth considering the alternatives early on though, even if I do go ahead with the duplication approach.

@kparzysz: I'm glad to hear you're interested in working to solve this problem and I'm looking forward to studying your proposal. Given that the current patches in review for RISCV focus just on the MC layer without codegen, it may be that there is time for your proposal to be merged before deciding on an approach for the RV32/RV64/RV128 issue. If not, we always have the option of starting duplication or utilising multiclasses with the intent to move to a better approach further down the road.

I will add a comment to explain instruction definitions will need modifying for RV64, and modify the definitions to use GPR32.

asb updated this revision to Diff 74023.Oct 8 2016, 6:02 AM
asb marked 5 inline comments as done.

I have updated RISCVRegisterInfo.td so it defines both the 32-bit and 64-bit GPRs, marking the 32-bit GPRs as subregs. As explained in comments, the hope is that we can later move to using something like D24631.

asb added a comment.Oct 8 2016, 7:41 AM

It only shows up upon a clean compile (CMake issue?), but I've just found the introduction of registers with the same assembly name now leads to an assertion during compilation within llvm-tblgen. I'll investigate a fix.

asb updated this revision to Diff 74069.Oct 9 2016, 5:10 AM

Fix the issue described in my previous comment by not defining an AsmName and AltName for the RISCV64 registers. With this approach, the generated MatchRegisterName and MatchAltRegisterName functions can be used. In the future, a little bit of extra logic in RISCVAsmParser.cpp can be added to coerce a parsed register from 32-bit to 64-bit when desired.

jyknight accepted this revision.Oct 11 2016, 6:01 AM

Looks good

This revision was automatically updated to reflect the committed changes.