This is an archive of the discontinued LLVM Phabricator instance.

[MLIR] Add a generic VerifyFullyConverted pass
Needs ReviewPublic

Authored by GMNGeoffrey on Feb 5 2021, 5:33 PM.

Details

Reviewers
rriddle
Summary

This patch provides a utility that can be used to verify IR against a
ConversionTarget. It does not actually attempt to do any conversion.
This is useful (when compared to applyFullConversion) because it
allows multiple passes to complete the lowering without having to
know their ordering in the lowering pipeline. It also provides
information on all illegal ops rather than failing at the first.

It is ported from similar passes in IREE, which can be reimplemented
using this utility. See, for example:
https://github.com/google/iree/blob/0281bb72f6f4/integrations/tensorflow/iree_tf_compiler/TF/VerifyFullyConverted.cpp

Diff Detail

Event Timeline

GMNGeoffrey created this revision.Feb 5 2021, 5:33 PM
GMNGeoffrey requested review of this revision.Feb 5 2021, 5:33 PM
rriddle requested changes to this revision.Feb 5 2021, 5:42 PM
rriddle added inline comments.
mlir/include/mlir/Transforms/VerifyFullyConverted.h
23 ↗(On Diff #321903)

Header files should use proper namespaces.

26 ↗(On Diff #321903)

Given that we have to define a new pass instance for each of these, it seems better to just add a utility function to DialectConversion.h that passes can call:

LogicalResult verifyAllOperationsAreLegal(Operation *op, ConversionTarget &target);
30 ↗(On Diff #321903)

I'm pretty sure all of the internals here can be pulled out-of-line, we should remove as much code from templates as possible.

This revision now requires changes to proceed.Feb 5 2021, 5:42 PM
GMNGeoffrey marked an inline comment as done.

Switch to a utility instead of a pass base class

mlir/include/mlir/Transforms/VerifyFullyConverted.h
23 ↗(On Diff #321903)

Oh whew :-D we use proper namespaces everywhere, so I was just trying to follow MLIR convention :-P

26 ↗(On Diff #321903)

Yeah you're right. This evolved as I was writing it. Was originally planning to offer a "create" method that took a ConversionTarget, but then realized that doesn't work because ConversionTarget requires a context. Done. Do you think an ArrayRef variant is useful/necessary as with the apply methods? I think one of the major advantages of this approach is having the single overall summary and not failing till the end, which is a bit trickier with multiple ops because it's not totally clear what the right way to do that is (single fused loc? one summary per top-level op?)

GMNGeoffrey edited the summary of this revision. (Show Details)Feb 5 2021, 6:35 PM
bondhugula requested changes to this revision.Feb 7 2021, 12:19 AM
bondhugula added a subscriber: bondhugula.
bondhugula added inline comments.
mlir/include/mlir/Transforms/DialectConversion.h
868

give -> given

mlir/lib/Transforms/Utils/DialectConversion.cpp
2790

Missing doc comment.

2793

int -> unsigned.

2811

const for target?

mlir/test/Transforms/test-verify-fully-converted.mlir
3

You don't need this separator here I think.

15

Drop blank lines.

mlir/test/lib/Transforms/TestVerifyFullyConverted.cpp
1

Nit: Please fill 80 columns with in between hyphens.

This revision now requires changes to proceed.Feb 7 2021, 12:19 AM
jpienaar added inline comments.
mlir/lib/Transforms/Utils/DialectConversion.cpp
2797

I prefer the phrasing from https://github.com/tensorflow/tensorflow/blob/b8d2a994e4fda783f27894689b85c73581998407/tensorflow/compiler/mlir/xla/transforms/legalize_tf.cc#L6075 - unable to legalize op for conversion target vs saying op is illegal.

I also prefer the approach there where there is consideration for the caller wrt flooding with messages. Now there it is intended for a specific caller/call stack so it is further different in that verbosity can be specified via that project's standard means.

I'm actually not sure if stylistic choices wrt how the textual format & reporting style (ordering of summary wrt more detailed errors, reporting threshold, notes on a error vs multiple errors) should be made here. meaning, as is, I couldn't reuse this function, but adding the parametrization I'd need for my context would make this function too cumbersome, but it is in a privileged position here vs being in utils or some such, so presents it as the approach for dialect conversion to report these.

2804

I prefer one error below with notes, rather than new lines and tabbing.

2806
2810

If you do a partial legalization, you get the ops that weren't legalized. So this function is targeting where neither full nor partial legalization with full conversion target is used?

GMNGeoffrey added inline comments.Feb 19 2021, 10:14 AM
mlir/lib/Transforms/Utils/DialectConversion.cpp
2797

Let's figure out what if anything of this should go in core and then I'll address the more details comments elsewhere :-)

The impetus for this comes from an old request I had to have an easy way to validate that IR was in a certain state using the tools of the dialect conversion framework. I think partial/full conversion etc remains a source of confusion for people and this sort of thing is dead simple. To address your other comment

If you do a partial legalization, you get the ops that weren't legalized. So this function is targeting where neither full nor partial legalization with full conversion target is used?

We use this currently in standalone passes that just do this verification. This removes the requirement for knowing where in a pipeline a pass might be running when you are converting via several passes. IMO this is the right separation of concerns. A pass shouldn't be unnecessarily constrained to convert + verify if it's intended to only convert some ops. The pass immediately before this need not even be something that uses dialect conversion. It could be something using the greedy driver or any custom pass. I think when I originally requested something like this in core, dialect conversion was much more limited, so I ended up in a situation of trying to force something into the conversion framework that didn't work just because I wanted a full conversion that would validate my IR was in a certain state at the end. This allows validating the state at the pipeline level.

I also prefer the approach there where there is consideration for the caller wrt flooding with messages. Now there it is intended for a specific caller/call stack so it is further different in that verbosity can be specified via that project's standard means.

LLVM/MLIR also has standard means for controlling debug information, so it seems we could do that here if we want to have more control over the verbosity

I'm actually not sure if stylistic choices wrt how the textual format & reporting style (ordering of summary wrt more detailed errors, reporting threshold, notes on a error vs multiple errors) should be made here. meaning, as is, I couldn't reuse this function, but adding the parametrization I'd need for my context would make this function too cumbersome, but it is in a privileged position here vs being in utils or some such, so presents it as the approach for dialect conversion to report these.

Such decisions are made in lots of places in MLIR, so I think having this here isn't bad. That doesn't mean what I currently have (which is just ported from what our passes in IREE have are the best decisions). Of course if someone wants something different they are of course welcome to implement their own thing. I also didn't originally put this in dialect conversion directly, but River suggested moving it :-P

mehdi_amini added inline comments.Feb 22 2021, 5:13 PM
mlir/lib/Transforms/Utils/DialectConversion.cpp
2797

A pass shouldn't be unnecessarily constrained to convert + verify if it's intended to only convert some ops.

I don't quite get what you mean here?

Such decisions are made in lots of places in MLIR, so I think having this here isn't ba

I believe we rely on injection and callbacks for customization in most places and push the choice on the clients instead. This was the motivation for collecting the non converted operations for example.

Right now it isn't clear to me how this would be used and how a client like here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/xla/transforms/legalize_tf.cc#L6116-L6126 would use this?

The use-case may be different, but I don't quite get it yet :)

GMNGeoffrey added inline comments.Feb 22 2021, 6:48 PM
mlir/lib/Transforms/Utils/DialectConversion.cpp
2797

I don't quite get what you mean here?

Our passes in IREE that would use this utility (1, 2) don't do any conversion at all. They *just* verify that the IR is in a certain state. I think that's a pretty useful thing to be able to do because it separates concerns. We could have (we don't currently, but I would probably add it if this general utility existed) a conversion target that defines the expected input to the IREE backend. This would be at the start of a pipeline that has no idea what the previous passes might have been to get from some frontend (e.g. TF, TFLite, Pytorch, npcomp).

I think this actually highlights the mismatch in expectation

I prefer the phrasing from https://github.com/tensorflow/tensorflow/blob/b8d2a994e4fda783f27894689b85c73581998407/tensorflow/compiler/mlir/xla/transforms/legalize_tf.cc#L6075 - unable to legalize op for conversion target vs saying op is illegal.

I'm interested in this being used in situations in which the pass is not necessarily making any attempt to legalize anything.

To give another example, I could write passes that legalizes MHLO to linalg, MHLO to std, Tosa to linalg, and Tosa to std. These could be run in any order. Why should one of the passes be responsible for ensuring that things are "fully legalized" (everything is std + linalg) after it's run? That unnecessarily encodes the ordering into the passes themselves, whereas they should be reusable in various pipelines. Instead, I would write a pass that uses this utility to verify linalg + std and add that to my pipeline.

I believe we rely on injection and callbacks for customization in most places and push the choice on the clients instead.

I'm referring to things like -debug-only= or how the dialect conversion framework outputs legalization failures at all. I believe that's just baked into the framework and I think that's totally reasonable. This is a more complicated use case, so maybe we want a bit more customization.

mehdi_amini added inline comments.Feb 22 2021, 7:12 PM
mlir/lib/Transforms/Utils/DialectConversion.cpp
2797

OK I see what you mean now, makes sense, thanks!

I'd still prefer to decouple the verifyAllOperationsAreLegal from emitLegalizationErrors if possible.

I.e. the client could setup their target and still have two calls. The pass that "verify the input" in IREE are very reasonable, but could they be implemented this way:

DenseSet<Operation *> nonlegalized_ops;
LogicalResult result = applyPartialConversion(
    op, target, /*patterns=*/EmptyPatternList, &nonlegalized_ops);
if (!nonlegalized_ops.empty())
  emitLegalizationErrors(op, nonlegalized_ops);

?

GMNGeoffrey added inline comments.Oct 8 2021, 10:38 AM
mlir/lib/Transforms/Utils/DialectConversion.cpp
2797

I think the weird thing about that is that a) the name is confusing, and b) it actually could do some conversions based on canonicalization patterns and folds IIUC. That isn't the end of the world, but it is a bit weird if your verification pass is actually transforming the IR. That's why I liked the alternative of just walking the ops. It would be pretty easy to provide 3 separate functions:

  1. void emitLegalizationErrors(Location loc, ArrayRef<Operation*> &illegalOps)
  2. DenseSet<Operation*> findIllegalOps(Operation* root, ConversionTarget &target)
  3. LogicalResult verifyAllOperationsAreLegal(Operation* root, ConversionTarget &target) (just calls the other two. I guess you're saying you don't think this should be in core?)

I still don't quite follow the logic that MLIR shouldn't be opinionated about the format of diagnostics emitted though. The dialect conversion framework itself already does both conversion and diagnostic output, has more output guarded by LLVM_DEBUG, etc. This seems very similar.

I also had another though. My original goal with this was to provide a single pass that was parameterized based on a conversion target, so it could be added into pipelines without needing to create a whole new pass class. Something like createVerifyFullyConverted(ConversionTarget &target), but I realized this wouldn't work because you need an MLIRContext to create a ConversionTarget and when constructing a pipeline you don't have one. AFAICT ConversionTarget only uses its context to create OperationNames, which are rooted in an identifier uniqued and owned by the context. I don't know how awful it would be to make it just use strings. It seems like it would be not so bad. Then a ConversionTarget could be constructed while making a pipeline and passed into a create call for a common pass. That of course, leans more in the direction of opinionated diagnostic output.

For now, I may pull the current implementation back into IREE and continue to iterate on it there, so we can make progress while we discuss what's best for upstream.

This revision now requires review to proceed.Oct 15 2021, 9:12 PM