[TableGen] Give the option of tolerating duplicate register names
ClosedPublic

Authored by asb on Nov 9 2017, 7:22 AM.

Details

Summary

A number of architectures re-use the same register names (e.g. for both 32-bit FPRs and 64-bit FPRs). They are currently unable to use the tablegen'erated MatchRegisterName and MatchRegisterAltName, as tablegen (when built with asserts enabled) will fail.

When the AllowDuplicateRegisterNames in AsmParser is set, duplicated register names will be tolerated. A backend can then coerce registers to the desired register class by (for instance) implementing validateTargetOperandClass.

At least the in-tree Sparc backend could benefit from this, as does RISC-V (single and double precision floating point registers).

Diff Detail

Repository
rL LLVM
asb created this revision.Nov 9 2017, 7:22 AM
mgrang added a subscriber: mgrang.Wed, Nov 15, 3:07 PM

Could you please add some tests?

lib/TableGen/StringMatcher.cpp
53 ↗(On Diff #122246)

While you are at it could you also please fix the spacing here?

std::string Indent(IndentCount * 2 + 4, ' ');

This looks fine. Do you plan to add a test case?

asb updated this revision to Diff 123739.Tue, Nov 21, 1:55 AM
asb marked an inline comment as done.

Updated to add a TableGen test for the generation of MatchRegisterName and MatchRegisterAltName when AllowDuplicateRegisterNames is set.

This looks reasonable to me. The only comment I have is that the patch does not make it clear what numeric register id exactly will be returned if there are several of them matching a given string. Currently, the numeric match is unique, but once we make it non-unique, the code in a target with "reused" names will need to handle getting a "wrong" register id. Let's take the testcase as an example---the user would need to write something like this:

if (Want32Bit) {
  // If we got a 64-bit register, map it to the 32-bit counterpart.
  switch (Reg) {
    case R0_64:
      Reg = R0_32;
      break;
    case R1_64:
      ...
  }
} else if (Want64Bit) {
  // If we got a 32-bit register, map it to the 64-bit counterpart.
  switch (Reg) {
    case R0_32:
      Reg = R0_64;
      break;
    case R1_32:
      ...
}

This may be somewhat inconvenient, so I'm thinking if some extra support for this could be implemented as well. What I have in mind is something like:

if (Want64Bit)
 Reg = givenAnyRepresentativeGiveMeTheCorrespondingRegisterThatIWant(Reg, IWant64BitRegister);

The only issue is how to specify the "IWant64BitRegister" part. This will be used in the asm parser, so it has to work with the MC layer alone (so, no TargetRegisterInfo, etc.), so register classes probably wouldn't work. (At first I thought about something like getCounterpartFromRegClass(Reg, MyTarget::R64RegClass)). Maybe adding some kind of an extra flag to "Register" would work?

let Namespace = "Arch" in {
class ArchReg<string n, list <string> alt, list <RegAltNameIndex> altidx, string disambiguator>
    : Register<n, disambiguator> {
  let AltNames = alt;
  let RegAltNameIndices = altidx;
}

foreach i = 0-3 in {
  def R#i#_32 : ArchReg<"r"#i, ["x"#i], [ABIRegAltName], "32 bit">;
  def R#i#_64 : ArchReg<"r"#i, ["x"#i], [ABIRegAltName], "64 bit">;
}

Then have TableGen generate a function like

unsigned disambiguate(unsigned Reg, StringRef Str) {
  for (R in possible candidates)
    if (R.Disambiguator == Str)   // This is some fake syntax, but you get the idea.
      return R.Id;
}

And in the asm parser, use it as

if (Want64Bit)
  Reg = disambiguate(Reg, "64 bit");

It wouldn't have to be string-based, this is just an illustration.

All of this would only apply to targets that set the flag "AllowDuplicateRegisterNames", for all other targets, the disambiguating flag could be unset and ignored, and such a function wouldn't be generated.

Just to add to my last comment---this patch is fine as is, but it would be nice to make the selection of the correct register easier.

asb added a comment.Tue, Dec 5, 1:30 AM

Hi Krzysztof, I agree that further tablegen support for selecting aliased registers could be helpful. For what it's worth, you can see how this sort of case is handled for RV32D parsing in D39895. My main concern is creating false generality: adding a new tablegen feature that seems general-purpose but actually has a very very narrow set of sensible uses.

In my case, a tablegen'erated function conversion function based on the FPR32 reg being a subregister of the FPR64 reg would be sufficient, but this may not work for aliased register names in other contexts.

kparzysz accepted this revision.Tue, Dec 5, 7:56 AM

Makes sense. LGTM.

When you commit, could you add an explicit clarification to the comment in Target.td that MatchRegisterName and MatchRegisterAltName can return any id associated with the name (i.e. that there are no guarantees as to which specific numeric id will be returned in case of multiple matches)?

This revision is now accepted and ready to land.Tue, Dec 5, 7:56 AM
asb added a comment.Tue, Dec 5, 1:34 PM

@kparzysz: that's a good point. In my use of this, I actually have been assuming that the returned id is predictable. Currently, register ids seems to be assigned based on sorting register def names, so R0_32 is always preferred to R0_64 if they alias. It could be worth documenting and testing this? Or do you think it's better to leave undefined and update my RV32D patch so it's coded more defensively? (it will at least currently assert if an unexpected register id is returned).

I think that leaving the returned value unspecified is better, at least for now. Otherwise, we'd need to invent a way to identify the "preferred" register, and then it would need to be preserved by any future modifications to the MatchRegisterName et.al. functions.

asb added a comment.Wed, Dec 6, 3:31 AM

I think that leaving the returned value unspecified is better, at least for now. Otherwise, we'd need to invent a way to identify the "preferred" register, and then it would need to be preserved by any future modifications to the MatchRegisterName et.al. functions.

Passing a "preferred" register class (or similar) to MatchRegisterName is never really going to work I don't think, at least not without some fairly invasive changes to the asm parser to allow that information to be threaded through. Coercing a register later on is much easier. D39895 currently relies on the fact that MatchRegisterName and MathRegisterAltName prefer to return the F*_32 registers due to them sorting before F*_64. It would be pretty easy to check the current behaviour in a test, thus ensuring it doesn't change. But for now, let's leave it undefined as you suggest.

Sorry, I wasn't clear in my last comment. This comment doesn't add any new information, it's just a clarification.

By "preferred" I meant having some kind of an implicit ordering of the registers, and out of several candidates, the "preferred" one would, for example, be the first register in that order. The order could be based on the order in which they were defined in the .td file, or lexicographic sorting of their symbolic names (not assembly names), etc. My argument was that there is no intuitively clear way to establish such an ordering, and if we came up with something, then we'd need to make sure that MatchRegisterName returned a value in accordance with that order (which seems like an unnecessary constraint).

This revision was automatically updated to reflect the committed changes.