This is an archive of the discontinued LLVM Phabricator instance.

Add a TargetMachineVerifier that runs along with the existing MachineVerifier
Needs ReviewPublic

Authored by dsanders on Jul 12 2019, 10:56 AM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

Targets often have constraints on their MIR that need to hold true throughout
certain regions of the backend but cannot be verified in the MachineVerifier.
Some examples include:

  • Immediates must be in a certain range on a given instruction
  • Certain instructions must not appear for certain subtargets
  • Certain instructions must not appear in delay slots
  • Fences need to occur before a some types of load but fence/load/load is preferable to fence/load/fence/load

and so on.

Sometimes, it's enough to roll verification into the pass that does the
relevant work as there's no chance of something else interfering but sometimes
there's also a risk of subsequent passes violating constraints on the MIR due
to assumptions that turn out to be untrue or just simply bugs.

This patch expands on the existing MachineVerifier mechanism by introducing
a TargetMachineVerifier that installs itself into -verify-machineinstrs and
MachineFunction::verify() and therefore runs after every pass or more
frequently if requested. The generic MachineVerifier runs first so it should be
safe for the TargetMachineVerifier to assume that the MIR is generally ok and
that defs dominate uses, etc, etc.

There's no specific mechanism for having certain verifications apply beyond (or
until) a certain point in the backend but I'd recommend using function
attributes set by a given pass (like GlobalISel) or the equivalent in
MachineFunctionInfo to achieve this.

There's a simple sample verification in AArch64 but my main focus is an out of
tree target so I haven't gone very far with verifying AArch64. I'm posting this
to see if there's general interest in having this mechanism upstream.

Long term, I foresee a VerifyMethod (to go with PrintMethod, DecodeMethod, etc.)
on Instructions and Operands in tablegen that populate this pass with sanity
checks to detect when C++ code violates constraints defined in tablegen. I'm
aware that some targets are exploiting the lack of checks to do things that
conflict with tablegen definitions so such a thing will have to be opt-in.

Diff Detail

Event Timeline

dsanders created this revision.Jul 12 2019, 10:56 AM
dsanders updated this revision to Diff 209536.Jul 12 2019, 11:06 AM

Remove a static cast that's not necessary on trunk

dsanders updated this revision to Diff 209538.Jul 12 2019, 11:08 AM

Missed the *& left over in the last update

arsenm added a subscriber: arsenm.Jul 12 2019, 11:17 AM
arsenm added inline comments.
lib/Target/AArch64/AArch64MachineVerifier.cpp
80–93

While I do think there is a need for a target verifier pass, you don't need it to verify instruction level things like this. You can already do that in TargetInstrInfo::verifyInstruction

unittests/CodeGen/GlobalISel/PatternMatchTest.cpp
531

EXPECT_FALSE

The current target verification problems I have are:

  • Verifying legal register bank mappings, so I don't have to defend against them in InstructionSelect
  • Verifying G_INTRINSIC operand types and numbers. There's missing API to do this.
  • Verifying some target IR constraints, in an IR verifier
  • Verifying some special block level properties for exec masking, which is the only category I would probably want to put in this pass
dsanders marked an inline comment as done.Jul 12 2019, 1:40 PM

The current target verification problems I have are:

  • Verifying legal register bank mappings, so I don't have to defend against them in InstructionSelect
  • Verifying G_INTRINSIC operand types and numbers. There's missing API to do this.
  • Verifying some target IR constraints, in an IR verifier
  • Verifying some special block level properties for exec masking, which is the only category I would probably want to put in this pass

Yep, the first two sound like a job for verifyInstruction() and the third sounds like a one-shot pass before IRTranslator/SelectionDAG. The fourth does sound like it makes sense for this pass.

What's missing for the G_INTRINSIC one?

lib/Target/AArch64/AArch64MachineVerifier.cpp
80–93

I didn't know about that function. That will work nicely for the trivial checks on my list. Thanks.

dsanders updated this revision to Diff 209582.Jul 12 2019, 1:46 PM

ASSERT_FALSE -> EXPECT_FALSE
fixed an compile error in the unit test caused by an API change

dsanders marked an inline comment as done.Jul 12 2019, 1:46 PM

The current target verification problems I have are:

  • Verifying legal register bank mappings, so I don't have to defend against them in InstructionSelect
  • Verifying G_INTRINSIC operand types and numbers. There's missing API to do this.

The only Intrinsic:: functions only deal with IR-level information. I didn't see a way to figure out the argument list without getting a concrete Function declaration. That would require converting from the argument LLTs to Type*, but you also don't know the context of which operands are needed for mangling without hardcoding it for specific intrinsics.

Having it as an independent pass may be kind of annoying. Having verifier runs in -print-before-all/-print-after-all is annoying enough already, and this would double them

jfb added a comment.Jul 12 2019, 2:07 PM
This comment was removed by jfb.

Having it as an independent pass may be kind of annoying. Having verifier runs in -print-before-all/-print-after-all is annoying enough already, and this would double them

The MachineVerifier itself is supposed to prevent it by calling PM->add() directly to add itself instead of going through addPass() but PM is private so I wasn't able to use the same method. Along the same lines the false, false in:

addPass(createAArch64MachineVerifier(Banner), false, false);

is supposed to prevent that. One of them prevents verifiers being added after the pass while the other prevents printers being added after the pass. Interestingly, TargetMachineVerifier successfully elides the printer on an older branch but not on trunk, while MachineVerifier fails on both. I'll take a look.

The current target verification problems I have are:

  • Verifying legal register bank mappings, so I don't have to defend against them in InstructionSelect
  • Verifying G_INTRINSIC operand types and numbers. There's missing API to do this.

The only Intrinsic:: functions only deal with IR-level information. I didn't see a way to figure out the argument list without getting a concrete Function declaration. That would require converting from the argument LLTs to Type*, but you also don't know the context of which operands are needed for mangling without hardcoding it for specific intrinsics.

Ah ok. I haven't encountered a case where we need to reach back to the original argument list so I don't have any ideas for this to hand.

Interestingly, TargetMachineVerifier successfully elides the printer on an older branch but not on trunk, while MachineVerifier fails on both.

Sorry, the older branch actually fails to elide it too. I was checking the wrong working copy.

Having it as an independent pass may be kind of annoying. Having verifier runs in -print-before-all/-print-after-all is annoying enough already, and this would double them

The MachineVerifier itself is supposed to prevent it by calling PM->add() directly to add itself instead of going through addPass() but PM is private so I wasn't able to use the same method. Along the same lines the false, false
in:

addPass(createAArch64MachineVerifier(Banner), false, false);

is supposed to prevent that. One of them prevents verifiers being added after the pass while the other prevents printers being added after the pass. Interestingly, TargetMachineVerifier successfully elides the printer on an older > branch but not on trunk, while MachineVerifier fails on both. I'll take a look.

It turns out the printing passes are added in two places and only one of them honours the printAfter argument to addPass(). https://reviews.llvm.org/D64681 is a quick hack that prevents the verifier triggering IR dumps

asb added a subscriber: asb.Sep 2 2019, 10:16 PM

@dsanders I agree that this patch addresses a real problem, and that there is currently a generalized lack of target-specific checks. AMDGPU seems to be the target with the most comprehensive set of checks (although I think it's mostly in the S.I. subtarget). Inspired by this patch we also submitted a patch to verify the immediates of MachineInstrs for the RISC-V target (D67397). The fact that invalid immediate operands just get silently processed (emitted in asm form, truncated in obj form) in nearly all targets is a good indicator that there is space for improvements, and that unrelated changes at any point might reveal latent bugs. Hopefully having more of this infrastructure in place will further encourage targets to add more checks. Still, I'm not sure adding a pass is the best way to accomplish this.

Why have the distinction between MachineVerifier and TargetMachineVerifier? Presumably you are going to update this patch to remove the TargetMachineVerifier::verifyMI method, since that functionality is already covered the by TargetInstrInfo::verifyInstruction hook. Is there any fundamental reason why the MF/MBB checks should be part of this new pass while the MI check is a hook? I think not. I think all three verifications should be hooks, appropriately called by the MachineVerifier. Having the MachineVerifier in charge of the high-level logic and delegating to the target to verify target-specific information is consistent with LLVM practice. Now, I'm assuming we can do the appropriate target-independent verifications in MachineVerifier before calling the hooks, which I agree nicely simplifies the target-specific checks, but I may be wrong about that. Assuming we can, there's IMO not really a good motivation to split the verifications in two passes. I imagine a single pass will be more cache friendly too. One possible issue is that it might not be clear where to put the MF/MBB hooks, since it seems none of the existing target interfaces currently are perfect matches. It might be worth considering adding a new target interface (or refactoring existing ones?). BTW, if you do that, consider if the TargetMachineVerifier::verifyMI hook should be moved there, or if it makes sense to remain part of the TargetMachineVerifier. Answering this question of where to put the hooks might also clarify what is the best way to verify machine operands. For my D67397 patch I did the MachineInstr verification based on the MCOperandPredicates, and I had to duplicate some of the logic for that, since the MC layer and the MachineInstrs have a different notion of immediates. Having the somewhat higher-level MachineInstr be checked based on the lower-level MC info is also not great, although not unprecedented. It might make sense to have a more neutral place as a ground truth that could be used to check properties common to MachineInstrs and MCInstrs, and that place might be where you would put those two additional hooks!

D81821 has a use case where a target specific block/function level verifier makes sense