Page MenuHomePhabricator

[RISCV] Rename FPRs and use Register arithmetic
ClosedPublic

Authored by luismarques on Sep 10 2019, 5:11 PM.

Details

Summary

Registers, as currently employed in LLVM, have the following properties:

  • You can't rely on a Register object having a specific numerical value;
  • You can rely on two consecutively enumerated Registers having consecutive numerical values (or having operator overloads that provide equivalent semantics);
  • Register are enumerated using a well-defined algorithm that provides some guarantees about enumeration order, given some reasonable assumptions.

This last point may seem surprising but it is a de facto guarantee that many targets rely on, frequently making use of Register arithmetic based on that assumption. In fact, you can see that multiple targets guard against that assumption being violated:

ARMBaseRegisterInfo.cpp:203
static_assert(ARM::D31 == ARM::D16 + 15, "Register list not consecutive!");
X86FloatingPoint.cpp:130
static_assert(X86::FP6 - X86::FP0 == 6, "sequential regnums");
SparcISelLowering.cpp:189
static_assert(SP::I0 + 7 == SP::I7 && SP::O0 + 7 == SP::O7,
AArch64CollectLOH.cpp:259
static_assert(AArch64::X28 - AArch64::X0 + 3 == N_GPR_REGS, "Number of GPRs");
static_assert(AArch64::W30 - AArch64::W0 + 1 == N_GPR_REGS, "Number of GPRs");

The registers are enumerated by RegisterInfoEmitter.cpp, but the order is set by CodeGenRegBank, which sorts all of the Register records using a LessRecordRegister function object. That comparator uses the following sorting logic:

  • The names are split into consecutive alphabetical and numerical parts (e.g. Foo42Bar7 -> [Foo, 42, Bar, 7]);
  • The alphabetical parts take precedence over the numerical parts in setting the order;
  • The number of parts takes precedence over the content of the parts in setting the order.

That last rule explains some less-than-intuitive orderings, such as X86's R8B appearing after ZMM31.

The RISC-V target also makes use of Register arithmetic when it comes to GPR registers. Unfortunately the sorting algorithm doesn't play particularly well with the names of our FPR Registers, resulting in a less-than-ideal ordering: F0_32, F0_64, F1_32, ..., F31_64. Probably because of this, the RISC-V target had so far relied on explicit tables of FPR Registers, rather than also using arithmetic for them, which is inconsistent, verbose and slightly more costly.

There are several ways to solve this problem:

  1. Tweak or customize the sorting algorithm. Adding the customizability infrastructure would probably be non-trivial, but it might be a good approach long-term, since the current approach of having a common but hard-coded algorithm is fairly limiting;
  1. Use Register arithmetic consistently, even with the less-than-ideal ordering. The ordering is still fairly reasonable, and the sorting algorithm makes it extremely unlikely that any new registers would disrupt it;
  1. Change the Register names. For instance, you can rename F0_32 -> F0_F, F0_64 -> F0_D, etc. This ensures that the registers within the same class are enumerated sequentially, and is consistent with the F/D distinction used in other parts of the target.

EDIT UPDATE: The patch now implements option #3.

This patch implements the option #2, but could be easily changed to adopt option #3.

Code review concerns:

  • It might be a good idea to have a single place (e.g. a couple of conversion functions) encapsulating the concept of the F/D registers being enumerated in an interleaved fashion. Yet, none of the existing files seem like an ideal place to put them -- RISCVBaseInfo.h and RISCV.h are okayish options but still not a great match. In any case, I would make that transition in two commits anyway, so this patch does not introduce such utility functions.

Diff Detail

Event Timeline

luismarques created this revision.Sep 10 2019, 5:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 10 2019, 5:11 PM

Hm, I don't particularly like this (or at least the F parts of it); the tables are clear, whereas this seems very brittle.. e.g. when you add Q you have to also modify the places doing arithmetic on F and D registers.

It's worth noting that other targets (Mips and Sparc) call their floating point registers F0-F31 and D0-D31 (and there's also Q0-Q31 for Sparc) even though they're always referred to as %fX in the assembly. That would be an alternative resolution to the sorting issue, that also feels less hacky/surprising compared with the numerical-vs-alphabetical ordering subtleties.

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
744–746

We've lost the assertion here; probably want something like assert(Reg >= RISCV::F0_32 && Reg <= RISCV::F31_32 && Reg % 2 == RISCV::F0_32 % 2);

jrtc27 added inline comments.Sep 10 2019, 5:31 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2110

For completeness you probably want to prepend RegLo >= RISCV::X0 (or > I guess... though at that point you could then argue for >= RISCV::X10 so I'd just say go with >= RISCV::X0 as the simplest thing that's clearly just an "is this a GPR" check).

It's worth noting that other targets (Mips and Sparc) call their floating point registers F0-F31 and D0-D31 (and there's also Q0-Q31 for Sparc) even though they're always referred to as %fX in the assembly. That would be an alternative resolution to the sorting issue, that also feels less hacky/surprising compared with the numerical-vs-alphabetical ordering subtleties.

Yeah, that's option #3 (with a different naming suggestion). I'm all for that, especially if there's a good precedent -- an existing naming convention for us to be consistent with.

luismarques marked an inline comment as done.Sep 10 2019, 5:49 PM
luismarques added inline comments.
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2110

That wasn't a check for GPRness, but instead that the + 1 didn't overflow the register class (if you were implementing some weird ABI, I guess).

luismarques retitled this revision from [RISCV] Consistently use Register arithmetic to [RISCV] Rename FPRs and use Register arithmetic.
  • Changes the FPRs Register identifiers, to ensure sequential enumeration:
    • F0_32-F31_32 -> F0_F-F31_F
    • F0_64-F31_64 -> F0_D-F31_D
    • These were chosen instead of Fxx/Dxx to better match the ISA spec names, although both options were reasonable.
  • Now makes straighforward use of FPR Register arithmetic.
  • Adds static asserts regarding the register ordering.
  • Updates the FPR register matching, since the tablegen matches now default to the 64-bit regs.
  • Tightens some RegNo bounds checks that were too loose on master.
  • Updates the title to reflect the changed scope of the patch.
luismarques marked 2 inline comments as done.Sep 12 2019, 11:49 AM

This is looking good. I have just one question this time.

llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
31

Why does this static assert not cover all 31 64-bit FP regs?

luismarques marked 2 inline comments as done.Sep 13 2019, 4:09 AM
luismarques added inline comments.
llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
31

You'd need a static_foreach to really check that all 32 of them are consecutive. Checking R31 == R0 + 31 strictly speaking doesn't check that R1-30 aren't scrambled. Choosing different indices for that last assert was a compromise attempt to both impart that concept and to take the opportunity to check some different indices. There's a precedent for checking other regs than the first and last.

luismarques edited the summary of this revision. (Show Details)Sep 13 2019, 4:10 AM

One suggestion, then LGTM

llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
31

It just looks super weird. How about one static assert to ensure that F31_D == F0_D + 31, and then another three asserts to ensure F1_D = F0_D + 1 for each of the register types. I think worrying that the order is scrambled beyond that is way too conservative.

I'm still personally not fond of the Fx_F/Fx_D convention. The fact that F0_F < F1_F < F0_D < F1_D is highly surprising, especially when F0_32 < F0_64 < F1_32 < F1_64, and not what one would naively assume. I also don't know whether this peculiarity of TableGen is meant to be something that's relied upon. I know it doesn't match the ISA convention of always calling them Fx, but Fx and Dx have precedence in other backends and do not risk this same confusion. Moreover, the spec itself is highly inconsistent in how to refer to single and double precision floating point:

  • The load/store instructions are suffixed with W/D/Q
  • Other floating point instructions are suffixed with .S/.D/.Q
  • The extensions are F/D/Q

So, personally, I would still vote for Fx/Dx/Qx given it avoids confusion, matches a bunch of other backends, and matches the ISA extension letters.

asb added a comment.Sep 17 2019, 5:35 AM

I'm still personally not fond of the Fx_F/Fx_D convention. The fact that F0_F < F1_F < F0_D < F1_D is highly surprising, especially when F0_32 < F0_64 < F1_32 < F1_64, and not what one would naively assume. I also don't know whether this peculiarity of TableGen is meant to be something that's relied upon. I know it doesn't match the ISA convention of always calling them Fx, but Fx and Dx have precedence in other backends and do not risk this same confusion. Moreover, the spec itself is highly inconsistent in how to refer to single and double precision floating point:

I think with this patch, it's actually the case that F0_D is less than F0_F. I don't think it makes sense to worry about the relative ordering of register classes, just the ordering within a class. It's a shame that RISCVAsmParser has to care, but I think the static_assert is a reasonable solution there. If TableGen were to start doing something less predictable in terms of relative ordering of the register classes, we could always make a more generic function that can handle more conversions than is strictly necessary.

  • The load/store instructions are suffixed with W/D/Q
  • Other floating point instructions are suffixed with .S/.D/.Q
  • The extensions are F/D/Q

    So, personally, I would still vote for Fx/Dx/Qx given it avoids confusion, matches a bunch of other backends, and matches the ISA extension letters.

That's a good point, but it seems a shame to lose out on the one thing that the spec is consistent about - the architectural register name. I suppose F0 and F0_D is another alternative.

luismarques marked an inline comment as done.
  • Rebased on master;
  • Addresses @lenary's static_assert concerns.
luismarques marked an inline comment as done.Sep 22 2019, 8:58 AM
asb accepted this revision.Sep 26 2019, 2:43 AM

This LGTM, thanks. Added a few nits. I think Sam's suggestion of adding a helper function or two to handle the register arithmetic is worth evaluating, but doesn't need to be done in this patch.

llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
32

clang-format wants to wrap this

34

clang-format wants to wrap this

This revision is now accepted and ready to land.Sep 26 2019, 2:43 AM
This revision was automatically updated to reflect the committed changes.