
[mlir] Add a conversion pass between PDL and the PDL Interpreter Dialect

Authored by rriddle on Jul 24 2020, 9:54 PM.



The conversion between PDL and the interpreter is split into several different parts.

  • The Matcher:

The matching section of all incoming pdl.pattern operations is converted into a predicate tree and merged. Each pattern is first converted into an ordered list of predicates starting from the root operation. A predicate is composed of three distinct parts:

  • Position
    • A position refers to a specific location on the input DAG, i.e. an existing MLIR entity being matched. These can be attributes, operands, operations, results, and types. Each position also defines a relation to its parent. For example, the operand [0] -> 1 has a parent operation position [0] (the root).
  • Question
    • A question refers to a query on a specific positional value. For example, an operation name question checks the name of an operation position.
  • Answer
    • An answer is the expected result of a question. For example, when matching an operation with the name "foo.op", the question would be an operation name question with an expected answer of "foo.op".
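
As a rough illustration of the three parts listed above, a predicate can be modeled as a small record. The names below (`Position`, `Question`, `Predicate`, `predicatesForNamedOp`) are hypothetical and not taken from the patch; this is a minimal sketch of the idea, not the pass's actual data structures.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the concepts described above.
enum class PositionKind { Operation, Operand, Result, Attribute, Type };

// A location on the input DAG, expressed relative to a parent position.
struct Position {
  PositionKind kind;
  const Position *parent; // nullptr for the root operation
  unsigned index;         // e.g. the operand number within the parent
};

enum class Question { OperationName, OperandCount };

// A predicate asks `question` about `position` and expects `answer`.
struct Predicate {
  const Position *position;
  Question question;
  std::string answer;
};

// Builds the ordered predicates that check the name and operand count of
// the root operation.
std::vector<Predicate> predicatesForNamedOp(const Position &root,
                                            const std::string &name,
                                            unsigned numOperands) {
  assert(root.parent == nullptr && "expected the root position");
  return {
      {&root, Question::OperationName, name},
      {&root, Question::OperandCount, std::to_string(numOperands)},
  };
}
```

With a root position and the name "foo.op", the first predicate in the returned list is the operation name question with the expected answer "foo.op", mirroring the example in the text.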

After the predicate lists have been created and ordered (based on the occurrence of common predicates and other factors), they are formed into a tree of nodes that represents the branching flow of a pattern match. This structure allows for efficient construction and merging of the input patterns. There are currently only four simple node types in the tree:

  • ExitNode: Represents the termination of a match
  • SuccessNode: Represents a successful match of a specific pattern
  • BoolNode/SwitchNode: Branch to a specific child node based on the expected answer to a predicate question.
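
The merging step can be pictured as inserting each ordered predicate list into a shared prefix tree. The sketch below is an assumption-laden simplification: `Step`, `Node`, `insertPattern`, and `match` are hypothetical names, positions are folded into the question string, and every pattern is assumed to ask the same question at a shared node (the sketch asserts this rather than handling it).

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// One (question, answer) step of an ordered predicate list.
using Step = std::pair<std::string, std::string>;

struct Node {
  std::string question; // empty when nothing further is asked here
  std::map<std::string, std::unique_ptr<Node>> children; // keyed by answer
  std::vector<std::string> matchedPatterns; // non-empty marks a success
};

// Inserts one pattern, reusing branches shared with earlier patterns.
void insertPattern(Node &root, const std::vector<Step> &steps,
                   const std::string &patternName) {
  Node *node = &root;
  for (const Step &step : steps) {
    if (node->question.empty())
      node->question = step.first;
    // Simplifying assumption of this sketch: one question per node.
    assert(node->question == step.first && "conflicting question order");
    std::unique_ptr<Node> &child = node->children[step.second];
    if (!child)
      child = std::make_unique<Node>();
    node = child.get();
  }
  node->matchedPatterns.push_back(patternName);
}

// Walks the tree using the answers observed on a concrete operation and
// returns the matched patterns; an empty result is the "exit" case.
std::vector<std::string> match(const Node &root,
                               const std::map<std::string, std::string> &facts) {
  std::vector<std::string> matched;
  const Node *node = &root;
  while (true) {
    matched.insert(matched.end(), node->matchedPatterns.begin(),
                   node->matchedPatterns.end());
    if (node->question.empty())
      break;
    auto fact = facts.find(node->question);
    if (fact == facts.end())
      break;
    auto child = node->children.find(fact->second);
    if (child == node->children.end())
      break;
    node = child->second.get();
  }
  return matched;
}
```

Two patterns that both start with an operation name check for "foo.op" share that branch, so the name is only tested once when walking the tree.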

Once the matcher tree has been generated, this tree is walked to generate the corresponding interpreter operations.

  • The Rewriter:

The rewriter portion of a pattern is generated in a very straightforward manner, similar to lowerings in other dialects. Each PDL operation that may exist within a rewrite has a mapping into the interpreter dialect. The code for the rewriter is generated within a FuncOp that is invoked by the interpreter on a successful pattern match. Referenced values defined in the matcher become inputs to the generated rewriter function.
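
One way to picture how matcher values become rewriter inputs is to collect every value that the rewrite uses and that the matcher defines. The function below is a hypothetical sketch with values modeled as SSA names; `collectRewriterInputs` is not a function from the patch, and the first-use argument order is an arbitrary choice for illustration that may differ from what the pass produces.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the matcher-defined values used by the rewrite, deduplicated and
// in order of first use. Values are modeled as SSA names for brevity.
std::vector<std::string>
collectRewriterInputs(const std::set<std::string> &matcherValues,
                      const std::vector<std::string> &rewriteUses) {
  std::vector<std::string> inputs;
  std::set<std::string> seen;
  for (const std::string &use : rewriteUses) {
    // Only values defined in the matcher need to be passed in; values
    // created inside the rewrite are local to the rewriter function.
    if (matcherValues.count(use) && seen.insert(use).second)
      inputs.push_back(use);
  }
  return inputs;
}
```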

An example lowering is shown below:

// The following high level PDL pattern:
pdl.pattern : benefit(1) {
  %resultType = pdl.type
  %inputOperand = pdl.input
  %root, %results = pdl.operation "foo.op"(%inputOperand) -> %resultType
  pdl.rewrite(%root) {
    pdl.replace %root with (%inputOperand)
  }
}

// is lowered to the following:
module {
  // The matcher function takes the root operation as an input.
  func @matcher(%arg0: !pdl.operation) {
    pdl_interp.check_operation_name of %arg0 is "foo.op" -> ^bb2, ^bb1
    pdl_interp.check_operand_count of %arg0 is 1 -> ^bb3, ^bb1
    pdl_interp.check_result_count of %arg0 is 1 -> ^bb4, ^bb1
    %0 = pdl_interp.get_operand 0 of %arg0
    pdl_interp.is_not_null %0 : !pdl.value -> ^bb5, ^bb1
    %1 = pdl_interp.get_result 0 of %arg0
    pdl_interp.is_not_null %1 : !pdl.value -> ^bb6, ^bb1
    // This operation corresponds to a successful pattern match.
    pdl_interp.record_match @rewriters::@rewriter(%0, %arg0 : !pdl.value, !pdl.operation) : benefit(1), loc([%arg0]), root("foo.op") -> ^bb1
  }
  module @rewriters {
    // The inputs to the rewriter from the matcher are passed as arguments.
    func @rewriter(%arg0: !pdl.value, %arg1: !pdl.operation) {
      pdl_interp.replace %arg1 with(%arg0)
    }
  }
}

Depends On D84579

Event Timeline

rriddle created this revision. Jul 24 2020, 9:54 PM
Herald added a project: Restricted Project. Jul 24 2020, 9:54 PM
lcnzg added a subscriber: lcnzg. Sep 30 2020, 6:15 PM

First partial scan (sorry forgot to hit send on this earlier)


What is the ordering based on here? (predicate -> switch -> rewriter -> dag rewriter ?)


Not sure I completely follow what this means.

jpienaar added inline comments. Oct 7 2020, 11:02 AM

Are these rewriteValues passed rather than stored as part of class for a reason? (e.g., if we have a ScopedHashTable<Value, Value> in main would that cover these)


Wouldn't the verification of the input module also ensure this? E.g., how can you make an invalid PDL program that would invalidate this?


So currently the value is just the parent one? And all locations in this file refer purely to the generated PDL Interp and not the ops in the output?


Do we ever have non identity cases?


What is the ordering of these cases?


Nit: PdlGeneratedRewriter ? Just to make it clear in dumps the origin of the function


So this is creating a new operation that is creating new operations?


Why is try_emplace needed here?


I struggle to associate this with the code below. How does type come in to play here?


Is there a reason we couldn't reserve for types outside the loop?


Could we perhaps expand the documentation for this? Make this a full sentence and separate the parametric vs. singleton one.

rriddle updated this revision to Diff 297151. Oct 9 2020, 1:50 AM
rriddle marked 12 inline comments as done.

Update based on feedback

rriddle added inline comments. Oct 9 2020, 1:53 AM

It's a logical grouping of what code uses the functions: the matcher (generateMatcher) uses predicate + switch, and the rewriter (generateRewriter) uses generateDagRewriter.


ScopedHashTable isn't useful given that these are only for the rewriter, which has no scoping ATM. I used params given that the values are specific to a given rewriter function, of which there will be 1 per pattern.


Each matcher node has a set of success destinations and a failure destination, given that they represent conditional branch/switch nodes. When creating the nodes we don't always assign a failure destination to matcher nodes. If a failure destination isn't explicitly assigned, it inherits the closest ancestor failure destination. This stack is the set of ancestor failure destinations.
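
The inheritance rule described above can be sketched as a recursive walk that carries a stack of ancestor failure destinations. `MatcherNode` and `resolveFailures` are hypothetical names used only for illustration, destinations are modeled as strings, and the caller is assumed to seed the stack with a default destination.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct MatcherNode {
  std::string name;
  std::string failureDest; // empty when not explicitly assigned
  std::vector<std::unique_ptr<MatcherNode>> successChildren;
};

// Gives every node without an explicit failure destination the closest
// ancestor's destination. `failureStack` holds the explicitly assigned
// ancestor destinations, innermost last.
void resolveFailures(MatcherNode &node, std::vector<std::string> &failureStack) {
  assert(!failureStack.empty() && "caller must seed a default destination");
  const bool hasExplicitDest = !node.failureDest.empty();
  if (hasExplicitDest)
    failureStack.push_back(node.failureDest);
  else
    node.failureDest = failureStack.back();

  for (std::unique_ptr<MatcherNode> &child : node.successChildren)
    resolveFailures(*child, failureStack);

  // Restore the stack so siblings see only their own ancestors.
  if (hasExplicitDest)
    failureStack.pop_back();
}
```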


The comment refers to the fact that we currently aren't properly propagating the locations from PDL to the interp. Yes to the second question: this is purely about locations within the generated PDL and not the things generated by PDL during rewriting (if I understood the question correctly).


What do you mean by identity here?


Re-arranged to ensure it matches the ordering in the Kind enum.


Went with pdl_generated_rewriter if that's alright.




The !pdl.operation result of an OperationOp may be used further within the rewrite block, so we need to provide a mapping for it. I may have missed something else that you are alluding to.

Thanks for the review Jacques!

jpienaar accepted this revision. Oct 19 2020, 12:34 PM

Looks good - this is a little bit of a big review, but I also don't have an idea of how to make it smaller easily. Given the work following on this & verification wrt existing, I feel a little bit more comfortable.


Nit: locOps ? (isn't the result of fusing these the fused loc op?)


Could this be something like generatedRecordMatch? (matches the structure above)


Meaning, could we avoid using getName and string here?


Could you look at the clang-tidy warnings?


I was wondering if

rewriteValues[operationOp.op()] = createdOp;

would not suffice (I normally associate try_emplace with actually verifying if it happened or not)
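
For reference, the difference between the two forms can be sketched with `std::unordered_map`, used here only as a stand-in for the map type in the patch. `mapOnce` and `mapAlways` are hypothetical helper names; `try_emplace` requires C++17.

```cpp
#include <string>
#include <unordered_map>

// Inserts the mapping only if `key` is not already present. The returned
// flag reports whether an insertion actually happened.
bool mapOnce(std::unordered_map<std::string, int> &values,
             const std::string &key, int value) {
  return values.try_emplace(key, value).second;
}

// Binds `key` unconditionally, overwriting any existing mapping.
void mapAlways(std::unordered_map<std::string, int> &values,
               const std::string &key, int value) {
  values[key] = value;
}
```

If the key is guaranteed to be unmapped at the call site, both forms leave the map in the same state; they only differ when the key already exists.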


c++ not needed in cpp file


Mmm, does the header guard change in the lib dir or is the clang-tidy warning off?



This revision is now accepted and ready to land. Oct 19 2020, 12:34 PM
rriddle updated this revision to Diff 300836. Oct 26 2020, 5:23 PM
rriddle marked 9 inline comments as done.


This revision was landed with ongoing or failed builds. Oct 26 2020, 6:05 PM
This revision was automatically updated to reflect the committed changes.