This is an archive of the discontinued LLVM Phabricator instance.

[ODS] Implement a new 'hasCanonicalizeMethod' bit for cann patterns.
ClosedPublic

Authored by lattner on Mar 22 2021, 10:15 PM.

Details

Summary

This provides a simplified way to implement 'matchAndRewrite' style
canonicalization patterns for ops that don't need the full power of
RewritePatterns. Using this style, you can implement a static method
with a signature like:

LogicalResult AssertOp::canonicalize(AssertOp op, PatternRewriter &rewriter) {
  return success();
}

instead of dealing with defining RewritePattern subclasses.

This also adopts this for a few patterns in the standardops dialect
to show how it works.

Diff Detail

Event Timeline

lattner created this revision.Mar 22 2021, 10:15 PM
lattner requested review of this revision.Mar 22 2021, 10:15 PM
lattner updated this revision to Diff 332526.Mar 22 2021, 10:19 PM

Change commit message.

lattner updated this revision to Diff 332528.Mar 22 2021, 10:25 PM
lattner edited the summary of this revision. (Show Details)

update commit message in phab.

I'd prefer something like

let canonicalizer = [Pat<(...)>, ...];

Then you can use DRR (currently tablegen dag format, potentially other frontend post) to write these. The simple one could be expressed as

Pat<(Op:$op _,_,_), (NativeCodeCall<"canonicalizer"> $_builder, $_self)>

Or we could introduce a shorthand for it (potentially start with not needing to have match side of pattern for this pattern and then one can get this as tablegen def without modifying driver post - that same first step also allows generating the function pointer variant ones if we prefered for these). So you'd get

let canonicalizer = [CanonicalizeOnOp]

but could expand to writing other patterns adjacent to this & one could then also shared common patterns between multiple of these ops more declaratively. And it allows a single specification, so you can add all your patterns together and the C++ add method could be generated for you (so that hasCanonicalizer need not be treated as orthogonal but as something that can be removed/subsumed)

I'm not sure I understand the ask here - that sounds like a separate feature. We already have a way to specify canonicalization patterns with ODS, this is intended to help the C++ case.

A change to: "Pat<(Op:$op _,_,_), (NativeCodeCall<"canonicalizer"> $_builder, $_self)>" moves the boilerplate around and makes it more obscure, it doesn't eliminate it.

I love the reduced boilerplate! :)
But I'd like to understand more what Jacques has in mind, I didn't quite get the comment either.

mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
1711

What happens if both hasCanonicalizer and hasCanonicalizeMethod are set?
I have the impression that the code isn't defensive here, or that hasCanonicalizer would be ignored in this case? If this is the case I rather error out telling the user that they are exclusive.

lattner added inline comments.Mar 23 2021, 10:43 AM
mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
1711

If both are set then you get a prototype for getCanonicalizationPatterns and for canonicalize in the header, but they you implement both of them however you'd like.

mehdi_amini added inline comments.Mar 23 2021, 10:51 AM
mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
1678

Nit, it'd read better to me as:

bool hasCanonicalizeMethod = def.getValueAsBit("hasCanonicalizeMethod");
if (hasCanonicalizeMethod) {
   ...

(making it more clear that hasCanonicalizeMethod value is directly the Tablegen value for the flag) instead of assigning in the body of the conditional)

1711

Mmmmm, reading the code I see getCanonicalizationPatterns getting a generated body just based on hasCanonicalizeMethod regardless of hasCanonicalizer.
From what you're answering me here I'd have expected to read:

if (hasCanonicalizeMethod && !def.getValueAsBit("hasCanonicalizer"))
  method->body() << "  results.add(canonicalize);\n";

Looks good!

Crazy half thought out idea:

Thinking about the feature that Jacques is talking about and the one being added here, I wonder if we should adopt a different approach to the current normal case for canonicalization patterns. By that I mean, should we rename the current getCanonicalizationPatterns that users implement to something else? That would allow for ODS to generate the external facing getCanonicalizationPatterns and automatically add all of the different ways to add patterns, e.g. something like:

class MyOp {
  /// The external facing call.
  void getCanonicalizationPatterns(RewritePatternSet &patterns) {
    // Add the native patterns.
    getCanonicalizationPatternsImpl(patterns);
    getCanonicalizationPatterns.add(canonicalize);
    
    // Add any DRR patterns defined on the op in ODS.
    getDRRCanonicalizationPatterns(patterns);
  }

private:
  /// The things that the author of `MyOp` implements.
  void getCanonicalizationPatternsImpl(RewritePatternSet &patterns);
  LogicalResult canonicalize(MyOp op, PatternRewriter &rewriter);

  /// Code generate by ODS to add canonicalization patterns defined in DRR.
  void getDRRCanonicalizationPatterns(RewritePatternSet &patterns);
};
mlir/lib/Dialect/StandardOps/IR/Ops.cpp
311–315

This is really nice!

528–532

Why merge these? From the discussion on the forums, it seems like there is still some contention on merging multiple patterns into one (unless this was resolved?). Wouldn't it be possible to still have this implemented as(with the same amount of code):

void BranchOp::getCanonicalizationPatterns(RewritePatternSet &results,
                                           MLIRContext *context) {
  results.add(simplifyBrToBlockWithSinglePred, simplifyPassThroughBr);
}

(Though that wouldn't exercise the feature you are added in this patch)

lattner marked 2 inline comments as done.Mar 23 2021, 11:52 AM
lattner added inline comments.
mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
1678

Good point!

1711

You're right, I simplified the code to make it more obvious what is going on here, thank you for pointing this out!

lattner updated this revision to Diff 332750.Mar 23 2021, 11:52 AM
lattner marked 2 inline comments as done.

Clarify the code in tblgen, thanks to Mehdi's feedback

Crazy half thought out idea:

That is not a crazy idea. I think it is beautiful.

This step also makes be grumpy about the existing functionality like this:

let printer = [{ return ::print(p, *this); }];
let verifier = [{ return ::verify(*this); }];
let parser = [{ return ::parse$cppClass(parser, result); }];

These have annoyed me for some time for a couple of reasons:

  1. it leads to unnecessary "innovation" in implementation approaches for these hooks.
  2. It leads to unnecessary c++ code in tblgen, and encourages people to put stuff inline.
  3. One of the standard approaches (shown above) depends on overloading of "::verify" and other methods, which leads to terrible error messages from clang when you mess something up, because the compiler spews the full overload set.

I think that going to a model of:

let hasCustomPrinter = 1;
let hasCustomVerifier = 1;
let hasCustomParser = 1;

would make sense, and then you implement a "well known method", like folder and now canonicalizer do.

With such an approach, your suggestion would fit naturally in line:

let hasCustomCanonicalizerPatterns = 1;

or something.

That said, I think that is clearly out of bounds for this patch ;-). Does anyone object to landing this? It seems consistent with the logical path forward here.

rriddle accepted this revision.Mar 23 2021, 12:23 PM

LGTM as an initial patch (assuming all comments have been resolved)

mlir/lib/Dialect/StandardOps/IR/Ops.cpp
528–532

(Missed?)

mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
1684
This revision is now accepted and ready to land.Mar 23 2021, 12:23 PM

This step also makes be grumpy about the existing functionality like this:

let printer = [{ return ::print(p, *this); }];
let verifier = [{ return ::verify(*this); }];
let parser = [{ return ::parse$cppClass(parser, result); }];

These have annoyed me for some time for a couple of reasons:

  1. it leads to unnecessary "innovation" in implementation approaches for these hooks.
  2. It leads to unnecessary c++ code in tblgen, and encourages people to put stuff inline.
  3. One of the standard approaches (shown above) depends on overloading of "::verify" and other methods, which leads to terrible error messages from clang when you mess something up, because the compiler spews the full overload set.

I think that going to a model of:

let hasCustomPrinter = 1;
let hasCustomVerifier = 1;
let hasCustomParser = 1;

Oh yes please!!! :)

lattner marked an inline comment as done.Mar 23 2021, 12:49 PM
lattner added inline comments.
mlir/lib/Dialect/StandardOps/IR/Ops.cpp
528–532

Yep I missed this, thanks for the ping.

Sure, I could do this, but how in any way is this _better_? This would add more codesize bloat, and more patterns for the pattern rewriter to juggle. Merging into a single function is functionally identical and more efficient.

mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
1684

Good catch, thanks!

lattner updated this revision to Diff 332763.Mar 23 2021, 12:50 PM
lattner marked an inline comment as done.

namespace qualify LogicalResult as River pointed out.

rriddle added inline comments.Mar 23 2021, 1:29 PM
mlir/lib/Dialect/StandardOps/IR/Ops.cpp
528–532

For very simple transformations like this specific instance, it doesn't really matter (at least IMO). In generality though, having directed patterns makes it much easier to debug and reason about what is going on. We have facilities for logging which patterns are executing, controlling their execution(ala debug counters, with ideas for more sophisticated systems being devised; such as enabling/disabling specific patterns), and more. Creating mega-patterns makes the system that much more difficult to interact with (from my experience).

As to the performance related concerns, having looked deeply into where the code size in MLIR is(I heavily optimized quite a few components a while ago to better enable running MLIR on-device), the impact from the pattern wrapper classes was negligible in the grand scheme of things. For the second point, the pattern rewriter already performs op-specific bucketing, and splitting into two patterns isn't that dramatic of a change in execution time (it drops an iteration from the matcher loop for this specific operation type, and removes an indirect call).

Anyways, none of that blocks this specific patch (IMO) but it does bring about a need to document general guidelines on when should a pattern be split into multiple vs mega pattern.

This revision was landed with ongoing or failed builds.Mar 23 2021, 1:46 PM
This revision was automatically updated to reflect the committed changes.

Got it, thanks. I committed this patch and will follow up with an enhancement to allow multiple function pointers to be added as individual patterns all at one time.

Actually, I don't think we need to do anything here. We already support patterns.add(foo).add(bar).add(baz); we could support patterns.add(foo, bar, baz); through more application of variadic templates, but I don't see a super important reason to do so. We can add it if there is a reason to in the future.

Actually, I don't think we need to do anything here. We already support patterns.add(foo).add(bar).add(baz); we could support patterns.add(foo, bar, baz); through more application of variadic templates, but I don't see a super important reason to do so. We can add it if there is a reason to in the future.

SGTM, completely forgot that we have chaining. Thanks!