This is an archive of the discontinued LLVM Phabricator instance.

[TableGen][AsmMatcherEmitter] Fix tied-constraint checking for InstAliases
ClosedPublic

Authored by sdesmalen on Jan 19 2018, 5:40 AM.

Details

Summary

This is a bit of a reimplementation the work done in
https://reviews.llvm.org/D41446, since that patch only really works for
tied operands of instructions, not aliases.

Instead of checking the constraints based on the matched instruction's opcode,
this patch uses the match-info's convert function to check the operand
constraints for that specific instruction/alias.
This is based on the matched operands for the instruction, not the
resulting opcode of the MCInst.

This patch adds the following enum/table to the *GenAsmMatcher.inc file:

enum {
  Tie0_1_1,
  Tie0_1_2,
  Tie0_1_5,
  ...
};

const char TiedAsmOperandTable[][3] = {
  /* Tie0_1_1 */ { 0, 1, 1 },
  /* Tie0_1_2 */ { 0, 1, 2 },
  /* Tie0_1_5 */ { 0, 1, 5 },
  ...
};

And it is referenced directly in the ConversionTable, like this:
static const uint8_t ConversionTable[CVT_NUM_SIGNATURES][13] = {

...
{ CVT_95_addRegOperands, 1,
  CVT_95_addRegOperands, 2,
  CVT_Tied, Tie0_1_5,
  CVT_95_addRegOperands, 6, CVT_Done },
...

The Tie0_1_5 (and corresponding table) encodes that:

  • Result operand 0 is the operand to copy (which is e.g. done when building up the operands to the MCInst in convertToMCInst())
  • Asm operands 1 and 5 should be the same operands (which is checked in checkAsmTiedOperandConstraints()).

Diff Detail

Repository
rL LLVM

Event Timeline

sdesmalen created this revision.Jan 19 2018, 5:40 AM

I was looking at your original version of this the other day to work out if it could be used to do 3-operand aliases of 2-operand Thumb1 instructions, and this (adding support for aliases) looks like it's what we would need for that.

Why do you need to modify the conversion to MCInst? There is still a parsed operand for every MCInst operand, so can the existing conversion function not handle this already?

Thinking about how to integrate this with the new diagnostics used in the ARM assembler, I think it might be better to check tied operands at the same time as we check the operand classes. In fact, the tied-operand check might be able to replace the operand class check.

utils/TableGen/AsmMatcherEmitter.cpp
3669 ↗(On Diff #130591)

If we are in a ReportMultipleNearMisses backend then this check will be skipped, so should we assert here?

I was looking at your original version of this the other day to work out if it could be used to do 3-operand aliases of 2-operand Thumb1 instructions, and this (adding support for aliases) looks like it's what we would need for that.

That's great to hear!

Why do you need to modify the conversion to MCInst?

Well, this patch doesn't change the actual conversion to MCInst, it just encodes more information into the conversion-function by using an indirection through the 'TiedAsmOperandTable'. It will still create the MCInst in the same way as it did before.

There is still a parsed operand for every MCInst operand, so can the existing conversion function not handle this already?

Yes, however the existing conversion function does not encode which parsed operands correspond to each other and rather just tells 'Copy operand X' from the list of result operands (which is not the same as the list of parsed operands). This patch adds that extra bit of information to the conversion function, so now it can just say that "parsed operand Y" and "parsed operand Z" should match (that both result in "result operand X"). I hope my explanation/terminology makes some sense :)

Thinking about how to integrate this with the new diagnostics used in the ARM assembler, I think it might be better to check tied operands at the same time as we check the operand classes. In fact, the tied-operand check might be able to replace the operand class check.

That might be possible, but I worry that it might lead to unmatched instructions and/or vague error messages if the register-operands don't match up. At the moment it just tries to match the whole instruction based on the operand classes, and then checks that the operands should be the same, already knowing they are of the right operand class, so we already know the instruction should otherwise match. What would be the benefit of doing it during matching of the operand classes?

Well, this patch doesn't change the actual conversion to MCInst, it just encodes more information into the conversion-function by using an indirection through the 'TiedAsmOperandTable'. It will still create the MCInst in the same way as it did before.

Ah, yep, I misread the patch. It confused me a bit that CVT_Tied is being used for these two different cases (if i've understood it correctly this time):

  • One parsed operand which generates two identical MCInst operands
  • Two parsed operands which must bet the same, and generate two identical MCInst operand

That might be possible, but I worry that it might lead to unmatched instructions and/or vague error messages if the register-operands don't match up. At the moment it just tries to match the whole instruction based on the operand classes, and then checks that the operands should be the same, already knowing they are of the right operand class, so we already know the instruction should otherwise match. What would be the benefit of doing it during matching of the operand classes?

I think the way it is being done now will result in the situation where a diagnostic says "operand must be a register in some range", the user fixes that, then the diagnostic changes to "operand must be the same as this other operand". That's something I've been trying to avoid, always emitting diagnostics that will result in a valid instruction if followed. My thought here was that it would be better to do all of the operand checks at once, so that we can favour the more precise diagnostics (the tied-operand ones) over the looser ones (operand class).

However, now that I've thought about it a bit more, I'm not sure that would work any better. In particular, in the case where the left-most of a pair of tied operands isn't in the correct register class, we can't emit the tied-operand diagnostic because we haven't matched the right-most operand yet. I also can't think of a good way to word diagnostics that say "_this_ operand is valid, but _that one_ needs to match it". Maybe it would be better to relax that rule, giving slightly less precise, but easier to understand diagnostics.

The approach you're describing sounds good to me. I've got one question about the example from the other review:

def : InstAlias<"mov $dst, $src, $src, $src", (ORRWrs GPR32:$dst, WZR, GPR32:$src, 0), 2>;

It looks like this patch will handle the tie between the first two $src's but ignore the third. If that's a limitation of the implementation then I think that buildAliasResultOperands() should call findAsmOperandNamed() once more and report an error if another instance of $src is found.

@olista01 Thanks, I see your point about wanting to avoid taking the user through multiple steps to get the operand right, I think the work you did on reporting 'multiple near misses' is a step in that direction by giving all the information to the user at once. Having the tied-constraint diagnostic being mentioned in that list would be useful.

When giving only a single diagnostic however, the difference with matching different operand classes and matching the tied operand's case is that there will be no other possible match that is more likely to match than the current one, so in a way the tied-operand diagnostic takes precedence over all other diagnostics in that case.

At the moment the checking of tied constraints is still disabled when 'ReportMultipleNearMisses' is used, maybe it is worth integrating the constraint checking with that work in a separate patch?

The approach you're describing sounds good to me. I've got one question about the example from the other review:

def : InstAlias<"mov $dst, $src, $src, $src", (ORRWrs GPR32:$dst, WZR, GPR32:$src, 0), 2>;

It looks like this patch will handle the tie between the first two $src's but ignore the third. If that's a limitation of the implementation then I think that buildAliasResultOperands() should call findAsmOperandNamed() once more and report an error if another instance of $src is found.

Only multiple $src operands that correspond to a tied operand in the MCInst will be handled, subsequent mentions of $src in the asm-string are error'd by TableGen. This is done in AsmMatcherEmitter.cpp line 1891 where it goes through the map of all named operands and their 'last' mention in the asm string. If it is mentioned more than once without corresponding to a variable in the MCInst (at line 1891, all result operands are handled so any further mention of a variable in the asm string cannot correspond to a result operand) , an error is given that the operand can never be matched.

mcrosier resigned from this revision.Jan 23 2018, 6:39 AM
sdesmalen updated this revision to Diff 132160.Jan 31 2018, 6:11 AM
sdesmalen removed a reviewer: mcrosier.

Retain old behaviour of disallowing repeated operands in InstAliases when tied-operand checking is not enabled (which is currently the case with ReportMultipleNearMisses)

sdesmalen added inline comments.Jan 31 2018, 6:13 AM
utils/TableGen/AsmMatcherEmitter.cpp
3669 ↗(On Diff #130591)

Hi @olista01, my apologies, I didn't notice your comment earlier. I have addressed your concern in my latest patch.

olista01 accepted this revision.Jan 31 2018, 8:49 AM

LGTM.

This revision is now accepted and ready to land.Jan 31 2018, 8:49 AM
This revision was automatically updated to reflect the committed changes.