This is an archive of the discontinued LLVM Phabricator instance.

[mlir][PDL] Add support for native constraints with results
Needs ReviewPublic

Authored by martin-luecke on Jun 18 2023, 11:52 PM.

Details

Summary

This adds support for native PDL (and PDLL) C++ constraints to return results.

This is useful for situations where a pattern checks for certain constraints of multiple interdependent attributes and computes a new attribute value based on them. Currently, for such an example it is required to escape to C++ during matching to perform the check and after a successful match again escape to native C++ to perform the computation during the rewriting part of the pattern.
With this work we can do the computation in C++ during matching and use the result in the rewriting part of the pattern. Effectively this enables a choice in the trade-off of memory consumption during matching vs recomputation of values.

This is an example of a situation where this is useful: We have two operations with certain attributes that have interdependent constraints. For instance attr_foo: one_of [0, 2, 4, 8], attr_bar: one_of [0, 2, 4, 8] and attr_foo == attr_bar. The pattern should only match if all conditions are true. The new operation should be created with a new attribute which is computed from the two matched attributes e.g. attr_baz = attr_foo * attr_bar. For the check we already escape to native C++ and have all values at hand so it makes sense to directly compute the new attribute value as well:

PDLL
Constraint checkAndCompute(attr0: Attr, attr1: Attr) -> Attr;

Pattern example with benefit(1) { 
    let foo = op<test.foo>() {attr = attr_foo : Attr}; 
    let bar = op<test.bar>(foo) {attr = attr_bar : Attr}; 
    let attr_baz = checkAndCompute(attr_foo, attr_bar); 
    rewrite bar with { 
        let baz = op<test.baz> {attr=attr_baz}; 
        replace bar with baz; 
    }; 
}

To achieve this the following notable changes were necessary:
PDLL

  • Remove check in PDLL parser that prevented native constraints from returning results

PDL

  • Change PDL definition of pdl.apply_native_constraint to allow variadic results

PDL_interp

  • Change PDL_interp definition of pdl_interp.apply_constraint to allow variadic results

PDLToPDLInterp Pass:
The input to the pass is an arbitrary number of PDL patterns. The pass collects the predicates
that are required to match all of the pdl patterns and establishes an ordering that allows
creation of a single efficient matcher function to match all of them. Values that are matched and
possibly used in the rewriting part of a pattern are represented as positions. This allows fusion
and thus reusing a single position for multiple matching patterns.
Accordingly, we introduce ConstraintPosition, which records the type and index of the result of the constraint.
The problem is for the corresponding value to be used in the rewriting part of a pattern it has
to be an input to the pdl_interp.record_match operation, which is generated early during the pass
such that its surrounding block can be referred to by branching operations. In consequence the value has
to be materialized after the original pdl.apply_native_constraint has been deleted but before
we get the chance to generate the corresponding pdl_interp.apply_constraint operation.
We solve this by emitting a placeholder value when a ConstraintPosition is evaluated.
These placeholder values (due to fusion there may be multiple for one constraint result) are
replaced later when the actual pdl_interp.apply_constraint operation is created.

Bytecode generator and interpreter:
Constraint functions which return results have a different type compared to existing constraint functions.
They have the same type as native rewrite functions and hence are registered as rewrite functions.
Other options:

  • Have a specific registration bucket for constraints with results
  • Change the type for all native constraints to include (possibly empty) results and store all natvie constraints in the same registration bucket. (It is open how this interacts with the convenience PDL Constraint Builders, possibly something for a follow-up PR)

Diff Detail

Event Timeline

martin-luecke created this revision.Jun 18 2023, 11:52 PM
Herald added a project: Restricted Project. · View Herald Transcript
martin-luecke requested review of this revision.Jun 18 2023, 11:52 PM

Awesome work. Thanks for the patch! The high-level concerns are:

  • whether a use-before-def is possible due to the PredicateTree attempting to access a ConstraintPosition before the relevant ConstraintQuestion is emitted
  • changing native constraint functions to return results instead of clobbering rewrite functions to achieve the same effect

In the future, it would be great if the patch could be split up into smaller pieces to make it easier to review. For example, this patch could have been

  1. Add results to pdl.apply_native_constraint and pdl_interp.apply_constraint and implement the lowering in PDLToPDLInterp
  2. Add registration for native constraints with results and support in bytecode lowering and the interpreter.
  3. Add PDLL support.
mlir/include/mlir/Dialect/PDL/IR/PDLOps.td
44

Please add an example of apply_native_constraint that returns results

50
mlir/include/mlir/Dialect/PDLInterp/IR/PDLInterpOps.td
108
mlir/include/mlir/IR/PatternMatch.h
1585

It does make more sense to me to give constraint functions results by default, and current constraints will have "empty results". This means there will be a single consistent representation of constraint functions.

mlir/lib/Conversion/PDLToPDLInterp/PDLToPDLInterp.cpp
381

There is an awkward problem here that the results of a native constraint could be accessed before the constraint is invoked, resulting in an SSA value use before definition error in this pass. Essentially, the assumption the pattern fuser makes is that all Positions can be accessed in any order, and PDLInterp and the interpreter rely on null propagation to make this work.

This may be working now because the results of the constraints are only accessed in the rewriter. On the other hand, it's possible the predicate tree will never access a ConstraintPosition value before the ConstraintQuestion itself is encoded, because structurally it is impossible for a pattern to use a native constraint result without posing the ConstraintQuestion first, meaning across patterns it's impossible for the number of constraints on the ConstraintPosition to be greater than the number of ConstraintQuestion of the same kind. If the number is equal, the PredicateTree *may* be stable, so they won't get swapped. I'm not 100% sure, however.

In any case, I think it makes more sense to map the ConstraintQuestion to the pdl_interp::ApplyConstraint operation and then retrieve the relevant results here, instead of relying on creating "placeholder" values.

mlir/lib/Conversion/PDLToPDLInterp/Predicate.h
478

Do the types need to be encoded?

I guess there is nothing stopping a user from writing a native constraint that dynamically returns different results, but this would be impossible to construct from PDLL.

mlir/lib/Conversion/PDLToPDLInterp/PredicateTree.cpp
279
280
mlir/lib/Dialect/PDL/IR/PDL.cpp
98

Please also document this restriction in the op definition

mlir/lib/Rewrite/ByteCode.cpp
784

please use llvm::report_fatal_error

818

isa<T>(...) is the preferred form

1458

This clobbering of using rewrite functions to contain constraints with results seems problematic. At this point I don't remember why rewrite functions are allowed to fail in the first place. In any case, please add results to regular constraint functions instead.

mlir/test/Conversion/PDLToPDLInterp/pdl-to-pdl-interp-matcher.mlir
125

please add a test where the result of a pdl.apply_native_constraint is used inside the matcher