This is an archive of the discontinued LLVM Phabricator instance.

[TableGen] Added a function to identify unsupported opcodes
ClosedPublic

Authored by dp on Sep 24 2020, 4:05 AM.

Details

Summary

Summary.

This change implements generation of a function which may be used by a backend to check if a given instruction is supported for a specific subtarget.

For an example of how this function may be used see D88211.

Rationale.

AsmMatcher selects an appropriate instruction based on parsed operands and available features. However feature mismatch is used only as the last resort to improve error messages. As a result assembler may misleadingly report an invalid operand when the real issue is that the specified instruction is not supported for the current subtarget.

It should be noted that there is a difference between invalid and unsupported instructions. Invalid instructions are not supported by any subtarget; when such instruction is encountered, AsmMatcher returns Match_MnemonicFail. In contrast, if an instruction is unsupported, it cannot be used with the current subtarget but it is supported for some other subtarget(s). AsmMatcher cannot reliably identify unsupported instructions -
Match_MissingFeature may be returned for other reasons.

A possibility to identify unsupported instructions would be especially useful for backends which have many subtargets. AMDGPU is an example. An application developed for one GPU may later be ported to another GPU. During porting it would be helpful to get a meaningful error message when assembler encounters an instruction which is valid but not supported by the current subtarget.

Appendix.

Below is the code of a function which is generated to check if an opcode is supported:

#ifdef GET_MNEMONIC_CHECKER
#undef GET_MNEMONIC_CHECKER

static bool AMDGPUCheckMnemonic(StringRef Mnemonic,
                                const FeatureBitset &AvailableFeatures,
                                unsigned VariantID) {
  // Process all MnemonicAliases to remap the mnemonic.
  applyMnemonicAliases(Mnemonic, AvailableFeatures, VariantID);

  // Find the appropriate table for this asm variant.
  const MatchEntry *Start, *End;
  switch (VariantID) {
  default: llvm_unreachable("invalid variant!");
  case 0: Start = std::begin(MatchTable0); End = std::end(MatchTable0); break;
  <...>
  }

  // Search the table.
  auto MnemonicRange = std::equal_range(Start, End, Mnemonic, LessOpcode());

  if (MnemonicRange.first == MnemonicRange.second)
    return false;

  for (const MatchEntry *it = MnemonicRange.first, *ie = MnemonicRange.second;
       it != ie; ++it) {
    const FeatureBitset &RequiredFeatures =
      FeatureBitsets[it->RequiredFeaturesIdx];
    if ((AvailableFeatures & RequiredFeatures) == RequiredFeatures)
      return true;
  }
  return false;
}

#endif // GET_MNEMONIC_CHECKER

Diff Detail

Event Timeline

dp created this revision.Sep 24 2020, 4:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 24 2020, 4:05 AM
dp requested review of this revision.Sep 24 2020, 4:05 AM

Hi @dp, I believe that I encountered the same issue when adding the assembler support for AArch64 SVE.

In D40363 I added a bool ParseForAllFeatures to MatchOperandParserImpl. By passing true for this parameter, the assembler will continue trying to match the operands regardless of which feature bits are set. MatchInstructionImpl (generated by TableGen) will finally check if the parsed mnemonic + operands combination is valid using the AsmOperand's predicate methods and the selected FeatureSet. It will emit a suitable diagnostic if the operand is not valid. If the operands are all valid but the instruction is not in the feature set, it will return Match_MissingFeature.

A limitation of emitMnemonicChecker you added here is that it doesn't work for different instructions with the same mnemonic. For example some arch-features might enable additional addressing modes, or enable instructions with the same mnemonic, but with different registers (e.g. Neon vs SVE vector add instructions).

dp added a comment.Sep 30 2020, 6:09 AM

Hello @sdesmalen. Thanks for your review.

If the operands are all valid but the instruction is not in the feature set, it will return Match_MissingFeature.

Please note that MatchInstructionImpl may also return Match_MissingFeature if an operand does not match selected feature set. So we cannot use MatchInstructionImpl to find out if an instruction mnemonic is supported or not - Match_MissingFeature may be returned for other reasons.

The limitation you pointed out was actually an intention. I wanted to completely ignore operands when searching for instruction mnemonic.
Why? Because if a mnemonic is not supported for a given set of features, it does not matter which operands were specified and error message should be "instruction not supported for this feature set". I believe that reporting an invalid operand would be misleading in this case.

Below are a few examples to illustrate my point. The examples show typical errors and how they can be handled.

invalid_opcode           operand, ...

If opcode mnemonic is invalid, MatchInstructionImpl returns Match_MnemonicFail. This allows reliable identification of invalid instruction mnemonics. This case is usualluy reported as "invalid instruction, did you mean ...?"

supported_opcode         invalid_operand, ...

MatchInstructionImpl may return any match error status, including Match_MissingFeature. In the latter case it is impossible to know what the reason is - unsupported opcode or unsupported operand.

unsupported_opcode       valid_operand, ...

This case should result in "unsupported opcode" message. However even if MatchInstructionImpl returns Match_MissingFeature it is not possible to know what the reason is - unsupported opcode or unsupported operand.

unsupported_opcode       invalid_operand, ...

Should result in "unsupported opcode" message. However MatchInstructionImpl may return any match error status, not only Match_MissingFeature. Consequently assembler may report something like "invalid operand".

The proposed function is needed to improve error messages in cases 2, 3 and 4. The function should be used to report unsupported opcodes before handling error codes returned by MatchInstructionImpl.

I could not figure out how your change in MatchOperandParserImpl could help with issues described above.

Please note that MatchInstructionImpl may also return Match_MissingFeature if an operand does not match selected feature set. So we cannot use MatchInstructionImpl to find out if an instruction mnemonic is supported or not - Match_MissingFeature may be returned for other reasons

Did you try out my suggestion though? Because if I make this local change in AMDGPUAsmParser:

-  OperandMatchResultTy ResTy = MatchOperandParserImpl(Operands, Mnemonic);
+  OperandMatchResultTy ResTy = MatchOperandParserImpl(Operands, Mnemonic, /*ParseForAllFeatures=*/true);

then most of the cases you fixed in D88211 are fixed for free and I don't see the case happening that you mention here. If the above change doesn't fix that, can you please point me to the test for which that happens?
(If the above change does fix it, we should probably consider making 'true' the default for MatchOperandParserImpl)

The only case that would still give a diagnostic for one of the operands is if both the instruction+operands are not valid for _any_ feature-set. The mnemonic checker could give a slightly improved diagnostic here, but I would think this scenario is quite limited. If you prefer the mnemonic checker to solve that specific case, that's fine, I mostly wanted to make you aware that a large portion of this problem can be solved with a one-line change.

llvm/utils/TableGen/AsmMatcherEmitter.cpp
3977

Should this be called by MatchInstructionImpl so that this feature comes for free for all AsmParsers?

dp marked an inline comment as done.Oct 1 2020, 11:57 AM

Hello @sdesmalen,

I'm sorry for not trying your patch earlier. I thought that it only fixed problems with operand matcher and I had a feeling that most problems were located in instruction matcher.
Well, I finally tried your patch. It did not fix all issues, but for on-liner it is really good!
There are several lit tests created specifically to test unsupported instructions. Below is a summary of results with and without your patch.

	Test name              number      failed       failed      improvement
                              of tests      base    (sdesmalen fix)   vs base
	gfx7_unsupported.s      1044        502           316           37%
	gfx8_unsupported.s       599        129            35           73%
	gfx9_unsupported.s       342         67            24           64%
	gfx10_unsupported.s      720        345           224           35%

Enclosed is the smallest of these files. To simplify analysis, tests have been corrected as follows:

  • passed tests removed;
  • failed tests updated so that the tests pass with your patch;
  • a header in each section describes an expected error message.

In other words the enclosed tests show actual error messages displayed by assembler after your patch has been applied.

I should also note that AMDGPU has several instruction variants identified by mnemonic suffices (_e32, _e64, _dpp, _sdwa). There are cases when the base instruction is supported but a variant is not. The proposed extension allows easy identification of such cases to provide appropriate diagnostic messages. The enclosed lit test has several examples at the end of the file.

llvm/utils/TableGen/AsmMatcherEmitter.cpp
3977

I'm not sure.

First, your fix may be quite sufficient for many backends.

Second, I considered this (proposed) function as a helper for constructing higher-level utilities for each backend. For example, AMDGPU uses this function to identify both unsupported instructions and unsupported instruction variants. This may or may not be needed for other backends.

Third, a single call to MatchInstructionImpl is not sufficient to trigger an error. MatchInstructionImpl only handles a specific set of features and one instruction variant. A backend may repeatedly call this function to search the mnemonic in different variant tables. But identification of unsupported opcodes may require more than that. For AMDGPU we have to search all variants and ignore feature bits to make sure the mnemonic in question is supported in some variant or feature set. But variant/feature search rules may be different for other backends.

So I think the new proposed functionality should be available as a separate function called either before first call of MatchInstructionImpl or after all calls to MatchInstructionImpl failed to locate a match.

sdesmalen accepted this revision.Oct 2 2020, 8:58 AM

Hi @dp, thanks for your detailed reply!

In D88214#2306928, @dp wrote:

In other words the enclosed tests show actual error messages displayed by assembler after your patch has been applied.

Just to make sure I understand this correctly though. These instructions are valid for at least some invocation of feature flags, but depending on which table MatchInstructionImpl is looking in (VariantID), it may not be legal for any feature set and thus lead to a 'illegal operand' diagnostic. But for AMDGPU that could have also been deduced from the mnemonic alone, hence the preference to check that before calling MatchInstructionImpl. Is that a correct?

In any case there would be no downside to generating this checker-function, even if it isn't used by other targets, so I'm happy to LGTM based on your analysis!

llvm/utils/TableGen/AsmMatcherEmitter.cpp
3977

MatchInstructionImpl will never return with Match_Success if we can prove that none of the feature bits enables the mnemonic, and so this could be a cheap exit from that function. But it would also slow down the assembler for all valid instructions as it would always need to check that table first. So I agree, happy to leave it as-is, the existing mechanism is probably sufficient for most use-cases.

This revision is now accepted and ready to land.Oct 2 2020, 8:58 AM
dp marked an inline comment as done.Oct 3 2020, 11:32 AM

Hello @sdesmalen,

Just to make sure I understand this correctly though. These instructions are valid for at least some invocation of feature flags, <...>

The tests we are discussing include:

  • Pure-negative cases which are not valid for any combinations of instruction variants and feature flags;
  • Tests which are valid for at least one AMDGPU subrarget.

In case of gfx9_unsupported.s assembler with your fix did not produce expected results for 24 tests. 12 tests out of these 24 tests are valid for at least one AMDGPU subtarget. Remaining 12 tests are pure-negative. So your fix is even more efficient if we exclude pure-negative tests from consideration.

Enclosed is an updated version of tests with pure-negative cases removed:

... but depending on which table MatchInstructionImpl is looking in (VariantID), it may not be legal for any feature set and thus lead to a 'illegal operand' diagnostic.

I'm not sure about "not be legal for any feature set".
I further analyzed 12 failures where your fix did not succeed. These 12 cases may be reduced to the following 2 cases:

  1. A mismatch in MatchOperandParserImpl. One of operands has a custom parser and the parser explicitly checks a feature and returns MatchOperand_NoMatch because the feature is not supported.
  2. A mismatch in MatchInstructionImpl. This case is more interesting because both gfx9 and gfx10 use the same variantId but due to different set of features the instruction is valid for gfx10 bit illegal (unsupported) for gfx9. MatchInstructionImpl is unable to find a suitable match for gfx9. Despite feature mismatch with all found mnemonics, it does not return Match_MissingFeature, for some reason Match_InvalidOperand is preferred.

But for AMDGPU that could have also been deduced from the mnemonic alone, hence the preference to check that before calling MatchInstructionImpl. Is that a correct?

Correct.

Thank you!