This is an archive of the discontinued LLVM Phabricator instance.

[Assembler] Report multiple near misses for invalid instructions

Authored by olista01 on Dec 9 2016, 9:02 AM.



The current table-generated assembly instruction matcher returns a
64-bit error code when matching fails. Since multiple instruction
encodings with the same mnemonic can fail for different reasons, it uses
some heuristics to decide which message is important.

This heuristic does not work well for targets that have many encodings
with the same mnemonic but different operands, or which have different
versions of instructions controlled by subtarget features, as it is hard
to know which encoding the user was intending to use.

Instead of trying to improve the heuristic in the table-generated
matcher, this patch changes it to report a list of near-miss encodings.
This list contains an entry for each encoding with the correct mnemonic,
but with exactly one thing preventing it from being valid. This thing
could be a single invalid operand, a missing target feature or a failed
target-specific validation function.

The target-specific assembly parser can then report an error message
giving multiple options for instruction variants that the user may have
been trying to use. For example, I am working on a patch to use this for
ARM, which can give this error for an invalid instruction for ARMv6-M:

<stdin>:8:3: error: invalid instruction, multiple near-miss encodings found
  adds r0, r1, #0x8
<stdin>:8:3: note: for one encoding: instruction requires: thumb2
  adds r0, r1, #0x8
<stdin>:8:16: note: for one encoding: expected an integer in range [0, 7]
  adds r0, r1, #0x8
<stdin>:8:16: note: for one encoding: expected a register in range [r0, r7]
  adds r0, r1, #0x8

This also allows the target-specific assembly parser to apply its own
heuristics to suppress some errors. For example, the error "instruction
requires: arm-mode" is never going to be useful when targeting an
M-profile architecture (which does not have ARM mode).

This patch just adds the target-independent mechanism for doing this,
all targets still use the old mechanism. I've added a bit in the
AsmParser tablegen class to allow targets to switch to this new
mechanism. To use this, the target-specific assembly parser will have to
be modified for the change in signature of MatchInstructionImpl, and to
report errors based on the list of near-misses.

Diff Detail


Event Timeline

olista01 updated this revision to Diff 80905.Dec 9 2016, 9:02 AM
olista01 retitled this revision from to [Assembler] Report multiple near misses for invalid instructions.
olista01 updated this object.
olista01 set the repository for this revision to rL LLVM.
olista01 added a subscriber: llvm-commits.
echristo edited edge metadata.Dec 9 2016, 5:42 PM

In general looks ok, can you share the next step patch to make sure that no only is this fine, but is also the direction we want to go?



ps. silly nit inline.


Random nit: no comma necessary.

Sorry it's taken me so long to get back to this, I've now posted the patch the ARM-specific usage of this at D31530.

rengolin edited edge metadata.May 5 2017, 3:24 AM

@echristo, any comments on this?

Catching up on code review. This is one of the next things on my list.

Hi Oliver!

I'm sorry for the delay, that said I'd also like to see a lot more commenting through here along with how we expect this API/data structures to work. It's hard to review this because I'm not sure how it's going to be used ultimately.

I don't see any noticeable problems, but I'd really like to see how this is going to be used (and have that commented in the source) before putting this in.



olista01 updated this revision to Diff 110578.Aug 10 2017, 7:18 AM

Expanded comments on the NearMissInfo struct.

If you want an example of how this will be used in practice, D31530 uses it in the ARM assembly parser.

Hi Oliver,

D31530 is looking great (which depends on this one), but I'd like to get Eric's final approval for this one.

@echristo, what kind of additional comments are you expecting to see in the code? Did you have a look at D31530 (a concrete implementation of this logic)?



I have a number of patches already approved which depend on this (using it in the ARM asm parser), and a few more that I'm working on, so it would be good to get this committed soon.

echristo accepted this revision.Oct 2 2017, 7:41 PM

Sorry for the delay, go ahead.

This revision is now accepted and ready to land.Oct 2 2017, 7:41 PM
This revision was automatically updated to reflect the committed changes.