For now, only add instruction definitions for basic ALU operations. Our initial target is a working MC layer rather than codegen, so appropriate SelectionDAG patterns will come later.
Details
Diff Detail
Event Timeline
lib/Target/RISCV/RISCVInstrFormats.td | |||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
34 | Nit: since the spec seems to list instructions from bit 31 down to 0 (as do TableGen 0bxxxx literals), definitions might read more naturally if you make the order "funct7, funct3, opcode". | ||||||||||||||||
98 | This immediate doesn't appear to have a bit 0 in the spec (it fits into exactly the same fields as imm12 does). So what we're really deciding here is what the immediate in the MCInst representation of "BEQ #x" should be. Valid answers are "x" and "the bits representing how x is encoded", with newer targets tending to favour the latter. This is mostly because it makes it harder by design to construct an invalid MCInst when you're doing it manually. Other than that, it's just punting complexity around the various MC layer components:
Either way "imm13" is probably misleading as a name and it's probably a good idea to be consistent between formats if possible (I note FU has taken the second approach). |
lib/Target/RISCV/RISCVInstrFormats.td | ||
---|---|---|
34 | That's a good point. Obviously the advantage of listing the opcode first is it reads more logically when writing something in RISCVInstrInfo that inherits from it. i.e. you first specify the high-level opcode then might select a more specific options by varying the funct bits. More closely matching the presentation in the spec is attractive though. | |
98 | I am incredibly glad you raised this issue, as it's something I've agonised back and forth over. I considered writing something up about this in the patch description, but thought it may be of limited interest. The same issue you raise applies to FUJ vs FU. I started going in the direction of imm13 because:
The logical value is a 13-bit immediate with the restriction that the LSB must be 0, but of course this is encoded in 12 bits. Arguably imm13m0 would be more descriptive (imm13, masked bit0) but this description is perhaps a little opaque to most. As you've spotted, I ended up going a different direction with FU. After deciding imm32 was misleading, I tried out with imm20shl12 as a description of the value that's logically a 20-bit immediate shifted left 12 places. However I think I found that for whatever reason the code manipulating this didn't read cleanly. The argument that now FU has its arguments defined in terms of the bit encoding, FSB and FUJ are inconsistent with it is persuasive. I'll modify so FSB takes an imm12 and FUJ takes an imm20. |
lib/Target/RISCV/RISCVRegisterInfo.td | ||
---|---|---|
19 | Minor: Ri might be overly terse. RegID? Or something which gives some context when seen elsewhere? |
lib/Target/RISCV/RISCVInstrFormats.td | ||
---|---|---|
98 | Do you have any thoughts on a clearer naming convention? Suppose I change FUJ so it takes an imm20 (i.e. the bits to be encoded), there are a few choices for naming the operand (shown below in context:
It's not clear to me why it would be more logical to use option 2) vs option 3), which is what tempts me to stick with something along the lines of the current name. simm21_mask1 describes the logical value (a signed 21-bit value with the LSB guaranteed to be zero, i.e. has been masked & 1. |
Changed the order of parameters in RISCVInstrFormats.td so it more closely matches the RISC-V spec. As usggested by @t.p.northover, changed the way of naming the immediate in FUJ and FSB (refer to them as imm20 and imm12 which matches the number of bits stored in the instruction). Address comment from @reames about Ri being rather terse.
lib/Target/RISCV/RISCVRegisterInfo.td | ||
---|---|---|
19 | It's essentially never referred to again in user-written code - on the one hand this means it only needs to make sense in the context of this file, on the other it means there's little advantage in being terse. Other archs seem to call it ArchReg so I've renamed to RISCVReg. |
Comments inline.
lib/Target/RISCV/RISCV.td | ||
---|---|---|
23 | Should these not be Processor, rather than ProcessorModel. I was under the impression that Processor is intended for generic families, whereas ProcessorModel was intended for specific microarchitectures. It would be nice to have Rocket as a specific ProcessorModel here. It may be in a later review, but we should probably also have SubtargetFeature flags for floating point, atomics, and so on here too. |
lib/Target/RISCV/RISCVInstrFormats.td | ||
---|---|---|
143 | Shouldn't this now be bits<20> imm20; ? |
Rename ProcessorModels to generic-rv32 and generic-rv64 and fix incorrect immediate size spotted by @jordy.potman.lists
lib/Target/RISCV/RISCV.td | ||
---|---|---|
23 | I think actually ProcessorModel is preferred as it exposes the more modern SchedMachineModel rather than the legacy ProcessorItineraries. At least X86, AArch64, and Lanai describe generic ProcessorModels. See this email from @hfinkel, who can perhaps confirm my interpretation http://lists.llvm.org/pipermail/llvm-dev/2015-November/092214.html. It looks like an uppercase ProcessorModel name is uncommon though, so I will change these to generic-rv32 and generic-rv64. Having a Rocket ProcessorModel and scheduling model is definitely part of the plan, and later patches will add appropriate SubtargetFeature flags. | |
lib/Target/RISCV/RISCVInstrFormats.td | ||
143 | It should be - thanks! |
lib/Target/RISCV/RISCVRegisterInfo.td | ||
---|---|---|
28 | Are the risc-v dwarf register numbers documented anywhere? (I assume not, just like the relocation types, but thought I'd ask anyways. Maybe you can start a list of Things That Ought To Be Documented...) | |
64 | May want to use a different register allocation sequence. Typically backends put caller-save registers first, then callee save. In RISCV, we probably also want to prefer registers in the range X8-X15, since they're usable by the compressed instructions. How about putting the registers in the order: That is: caller-save between x10-15, then the remaining caller-save, then callee-save, then specials (which will be reserved, anyways) | |
67 | I think using one set of registers for both the 32-bit arch and the 64-bit arch doesn't actually work right -- you rather want separate 32bit and 64bit versions of the registers (perhaps name them X{0..31}_32 and X{0..31}_64 for clarity), to make up the GPR32 and GPR64 register classes. The SPARC backend actually does something like you have here, but I'm pretty sure its 64-bit support is fairly busted and is the wrong way to do things -- better to look at, perhaps, the PPC backend. Could name the registers X{0..31}_32 and X{0..31}_64. |
lib/Target/RISCV/RISCVRegisterInfo.td | ||
---|---|---|
29 | They're not. I have added that to the list. | |
65 | There have been some responses to this that haven't been picked up by phabricator: 1, 2. To restate the discussion there - I would like to initially focus on MC concerns in this patchset and only introduce codegen concerns when they can be tested. Would you be happy with a comment here explaining that the allocation order should be changed when codegen is implemented? | |
67 | Could you elaborate on where things go wrong for SPARC? If it's conceptual failing of having a single set of register IDs, what precisely is the issue? There doesn't for instance seem to be an assumption that any given Register is a member of only a single RegisterClass. Or is it likely due to bugs in tablegen or elsewhere? In PPC and AArch64, the 32-bit registers are defined as subregisters of the 64-bit GPRs but I'm not sure if this is as correct for RISC-V. The compiler is either targeting RV32 or RV64, and when targeting RV64 there is no instruction that will modify only 32-bits of the register. Values are always held in sign-extended format so bit 31 will be copied in bits 32-63 if you for instance execute an ADDW or one of the other RV64 instructions defined to work on 32-bit quantities. Load instructions defined to either zero bits 32-63 or to sign-extend the loaded value. |
lib/Target/RISCV/RISCVRegisterInfo.td | ||
---|---|---|
67 | Ok, having had a look at this some more I think I mostly answered my own question. When writing to a register in RV64 (e.g. with ADDW) we do indeed need to model in-register sign extension, which I think is similar to MIPS. But the operands to ADDW or SUBW come from truncating the 64-bit register. Arguably the most sensible way of modelling this is by having a 32-bit register which is a subregister of the matching 64-bit GPR. What I really want is that almost all instructions are defined as taking register operands from the 'GPR' set, which depending on if the target is RV32 or RV64 may be GPR32 or GPR64 (or in the future with RV128, GPR128). A small number of instructions as SLLW are defined to take a 32-bit subregisters, and in the future with RV128 we would have ADDD, SLLD and friends which take the 64-bit subregister of one of the GPR128 register set. I'll have more of a think about what I want to do here. |
lib/Target/RISCV/RISCVRegisterInfo.td | ||
---|---|---|
65 | Yes, a TODO note sounds fine. I just didn't want it to get forgotten just because a future change might not touch this line of code. | |
67 | Originally, the SPARC 64bit support was written by reusing the 32bit instruction patterns in r178527. This definitely didn't work, because the instructions specified the 32-bit register class. It apparently "almost" worked though -- until the return value needed to be spilled, and got stored in 4-bytes of memory instead of 8. That particular thing been fixed since -- by adding separate 64-bit instructions -- but the legacy of specifying a single register set still remains. I'm not really sure what exact problems it causes, I've not really pushed on sparc64, since I only really care about sparc32. But I do think the right thing really is to specify separate 32bit and 64bit registers (with a subreg relationship), and separately specify the integer instructions for each mode. As far as modeling ADDW, I believe it's fine to model it as having the 32-bit subregs as input/outputs; that it mangles the upper bits while doing so should be fine. |
Apologies for the slight delay, organising last week's LLVM Cauldron and other activities in my life had limited my time.
As discussed here, we really want to have two separate register classes for RV32 and RV64 and separate instruction definitions to match this. In RISC-V, an RV32 add and an RV64 add (and indeed, a future RV128 add) all use the same encoding. There seem to be a number of possible approaches to handling this:
- As done by other backends such as Sparc would, ust define every instruction twice.
- Make classes in RISCVInstrInfo like ALU_rr multiclasses that define both 32-bit and 64-bit versions of instructions.
- You could also imagine extending TableGen with some sort of AST macro support or a new tablegen option that would handle substituting the register class for each defined instruction
- A foreach at the top of RISCVInstrInfo that iterates over each register width (for now, 32 and 64-bit). Each def would be to be written like def ADD#x and some !cast or !if magic would be needed to get the appropriate RegisterClass based on the register width.
- Move all class definitions to the top of RISCVInstrInfo and make each ALU_rr etch take a RegisterClass parameter. Then define a multiclass used to parametrise all instruction definitions.
I have trialled the last option from the list above. You can see a version of RISCVInstrInfo.td (based on the complete patchset) that implements this option at paste P7637. For those subscribing to this review - I'd really appreciate your comment on if you feel this is the best way to go about it (particularly TableGen wizards like @t.p.northover). Given the current patchset aims to focus on MC, I propose actually making this change in a later patch when RV32/RV64 codegen is introduced.
- As done by other backends such as Sparc would, ust define every instruction twice.
Let's use PPC rather than Sparc as our baseline.
IMO, this simple thing of just defining the instructions twice probably makes the most sense to start out with. Don't forget you also need different types in the patterns for the different modes, not just different register classes -- afaik, you can't do that with a multiclass.
If, after everything's more finished, it does turn out to be easy to merge them with a multiclass macro, then OK. I'm skeptical that it will be easier that way, but that's fine. :)
On the other hand, it's also easy to split them apart later, so if you want to start with what you have in the pastebin and split it later on when/if becomes cumbersome that way, that's fine too. :)
On a somewhat longer-term note:
We have the exact same situation with HVX: the vector registers can be 64 or 128-byte long, depending on the processor mode. The actual encodings are identical between the two modes, so it's possible to have a single binary that would work in both modes. We have every HVX instruction defined twice, and separate register classes for both types of registers. This is a pain and a mess, and I have been planning to get rid of that for quite some time now.
Since this type of situation now appears in several targets, I hope that this will be enough of a rationale to develop a proper support for this issue, namely register class with a non-constant register size/alignment. This should be fairly simple, actually, and I can develop a prototype for review, hopefully in a few days.
This shouldn't stop you from proceeding with a currently available solution, since I'm not sure how long it would take to develop a working approach that avoids the duplication.
Please add me to the review thread for this. We have the same thing in MIPS and it's even worse in CHERI (where we have 128- and 256-bit variants of the ISA and currently conditionally compile for only one of them).
All of that sounds great! I think it also points strongly towards simply duplicating the instructions in the RISCV target for now, as other targets are doing today, rather than working on half-measures towards improving things here.
Once/if better infrastructure becomes available, it can be changed over then.
@jyknight: Yes, that's a good point, once codegen is added I'll need to thread through a ValueType in a similar way to how I currently pass RegisterClass. I can definitely see the argument that describing instructions twice and abstracting later might lead to the best solution. I think it's definitely worth considering the alternatives early on though, even if I do go ahead with the duplication approach.
@kparzysz: I'm glad to hear you're interested in working to solve this problem and I'm looking forward to studying your proposal. Given that the current patches in review for RISCV focus just on the MC layer without codegen, it may be that there is time for your proposal to be merged before deciding on an approach for the RV32/RV64/RV128 issue. If not, we always have the option of starting duplication or utilising multiclasses with the intent to move to a better approach further down the road.
I will add a comment to explain instruction definitions will need modifying for RV64, and modify the definitions to use GPR32.
I have updated RISCVRegisterInfo.td so it defines both the 32-bit and 64-bit GPRs, marking the 32-bit GPRs as subregs. As explained in comments, the hope is that we can later move to using something like D24631.
It only shows up upon a clean compile (CMake issue?), but I've just found the introduction of registers with the same assembly name now leads to an assertion during compilation within llvm-tblgen. I'll investigate a fix.
Fix the issue described in my previous comment by not defining an AsmName and AltName for the RISCV64 registers. With this approach, the generated MatchRegisterName and MatchAltRegisterName functions can be used. In the future, a little bit of extra logic in RISCVAsmParser.cpp can be added to coerce a parsed register from 32-bit to 64-bit when desired.
Should these not be Processor, rather than ProcessorModel. I was under the impression that Processor is intended for generic families, whereas ProcessorModel was intended for specific microarchitectures.
It would be nice to have Rocket as a specific ProcessorModel here. It may be in a later review, but we should probably also have SubtargetFeature flags for floating point, atomics, and so on here too.