This is an archive of the discontinued LLVM Phabricator instance.

[DRR] Explicit Types and Types Builders in Rewrites
ClosedPublic

Authored by Mogball on Sep 8 2021, 4:36 PM.

Details

Summary

Adds a rewrite directive OpWithResultTypeBuilders that allows
operations to be created in rewrites that are not able to infer or
deduce a return type. Types can be specified with a string:

(OpWithResultType<AnOp, "IntegerType::get(getContext(), 32)"> $val)

Or can be built using NativeCodeCall:

(OpWithResultTypeBuilder<AnOp, (NativeCodeCall<"$0.getType()"> $arg)>
$val)

Diff Detail

Event Timeline

Mogball created this revision.Sep 8 2021, 4:36 PM
Mogball requested review of this revision.Sep 8 2021, 4:36 PM
Mogball updated this revision to Diff 371609.Sep 9 2021, 8:50 AM

clang-format files

Mogball updated this revision to Diff 371611.Sep 9 2021, 8:54 AM

fixing commit?

Mogball updated this revision to Diff 371658.Sep 9 2021, 10:52 AM

rebase against HEAD

jpienaar accepted this revision.Sep 9 2021, 3:14 PM

Mostly looks good, thanks! One part to handle to avoid changing return types.

mlir/include/mlir/IR/OpBase.td
2603

s/with/without/ ?

2618

Why can't one use bound values in string one too?

2624

Consider just indenting here and above with 2 spaces rather than lining up ( to avoid such long lines. You could also use op X in the pattern and just in a comment say X is an op without type deduction

mlir/test/lib/Dialect/Test/TestOps.td
1341

OOC is there any way to have a name independent one too? E.g., SameTypeAsFirstOperandOfOperatorMatched? :)

mlir/tools/mlir-tblgen/RewriterGen.cpp
1120

In DRR we require that the root node's type during replacement remains as is. So we don't allow the type to change in DRR rules (we wouldn't generally know what casts to assert or type mismatches are allowed)

This revision is now accepted and ready to land.Sep 9 2021, 3:14 PM

Thanks for the review!

mlir/include/mlir/IR/OpBase.td
2618

E.g. you can't do

("doSomething($0)" $arg0)

But you can do

(NativeCodeCall<"doSomething($0)"> $arg0)

Oh, I should improve the wording.

You can use $_builder, for example, in strings since they're wrapped in NativeCodeCall anyways, but of course you'll need to specify NativeCodeCall in order to pass arguments.

mlir/test/lib/Dialect/Test/TestOps.td
1341

Yeah. The code snippet is pasted into matchAndRewrite so you could have "castedOp0.getOperand(0).getType()", as long as your okay with hardcoding the table-generated root op name,

mlir/tools/mlir-tblgen/RewriterGen.cpp
1120

True. Let me fix this.

Mogball marked 5 inline comments as done.Sep 9 2021, 4:01 PM
Mogball updated this revision to Diff 371742.Sep 9 2021, 4:02 PM
  • prevent explicit types in values replacing root op's return values

About the syntax, I'm wondering if we can make it similar to what we have so far in DRR. For example,

(OpWithResultTypeBuilder<AnOp, (NativeCodeCall<"$0.getType()"> $arg)>
$val)

It nests DAG into template arg. While reading a DRR pattern, we may expect the DAG is a matcher/rewriter for a operation/operand/attribute and now it has different semantics.

I'm thinking a new type of DAG node which indicates the matching/rewriting of the return type

Pat<(FooOp ..., (ReturnType<I32>:$ret)),
        (BarOp arg1, arg2, (ReturnType $ret))>

or

Pat<(FooOp ...),
        (BarOp arg1, arg2, (ReturnType (NativeCodeCall<"IntegerType::get(getContext(), 32)">)))>

What do you think?

Mogball added a comment.EditedSep 9 2021, 9:44 PM

About the syntax, I'm wondering if we can make it similar to what we have so far in DRR. For example,

(OpWithResultTypeBuilder<AnOp, (NativeCodeCall<"$0.getType()"> $arg)>
$val)

It nests DAG into template arg. While reading a DRR pattern, we may expect the DAG is a matcher/rewriter for a operation/operand/attribute and now it has different semantics.

I do agree that the syntax isn't perfect, especially since with template arguments we need 4 different classes: OpWithReturnType, OpWithReturnTypes, OpWithReturnTypeBuilder, and OpWithReturnTypeBuilders. But the DAG still describes a rewriter for a type, it's just placed inside a template arg. Also, the class' names are a bit verbose.

I'm thinking a new type of DAG node which indicates the matching/rewriting of the return type

I like the idea of introducing a matcher for return types, because currently (I think) the only way to match return types is with constraints. The current syntax could also be used for that; it's just not implemented in the rewriter gen (e.g. OpWithResultType<AnOp, I32>:$res). Also, I think I can find a way to deduplicate the classes.

Pat<(FooOp ..., (ReturnType<I32>:$ret)),
    (BarOp arg1, arg2, (ReturnType $ret))>

Return values are typically bound at the DAG node: FooOp:$ret and I assume ReturnType $ret actually means to copy the return type, so it should be (ReturnType (NativeCodeCall<"$0.getType()> $ret)). Alternatively, an explicit type could be like (ReturnType "$_builder.getI32Type()"); I would first propose the following change to your syntax:

Pat<(FooOp:$res $arg1, $arg2, (ReturnType<I32>)),
    (BarOp $arg1, $arg2, (ReturnType "$_builder.getI32Type()"))>;

Assuming I reduce 4 classes down to 1 and introduce matchers for return types as well with the same OpWithResultType syntax, the difference boils down to whether you'd prefer the return type be specified as a trailing DAG node or next to the operation inside a template argument. I personally think it's a little bit weird to have a return type specifier after the operands and attributes, because the expectation of the DAG arguments is that they match the arguments = (ins ...) in the op declaration.

Also, having the return type specifier as a DAG argument would be more complex to implement. You'd have to ensure that there's only one return type specifier, that it's at the end of the argument list (or skip over it if would be allowed to be specified anywhere else), and it would have to be associated back to the operation.

What's your opinion?

Mogball updated this revision to Diff 371967.Sep 10 2021, 11:04 AM
  • add test that fails when trying to replace root types

About the syntax, I'm wondering if we can make it similar to what we have so far in DRR. For example,

(OpWithResultTypeBuilder<AnOp, (NativeCodeCall<"$0.getType()"> $arg)>
$val)

It nests DAG into template arg. While reading a DRR pattern, we may expect the DAG is a matcher/rewriter for a operation/operand/attribute and now it has different semantics.

I do agree that the syntax isn't perfect, especially since with template arguments we need 4 different classes: OpWithReturnType, OpWithReturnTypes, OpWithReturnTypeBuilder, and OpWithReturnTypeBuilders. But the DAG still describes a rewriter for a type, it's just placed inside a template arg. Also, the class' names are a bit verbose.

I'm thinking a new type of DAG node which indicates the matching/rewriting of the return type

I like the idea of introducing a matcher for return types, because currently (I think) the only way to match return types is with constraints. The current syntax could also be used for that; it's just not implemented in the rewriter gen (e.g. OpWithResultType<AnOp, I32>:$res). Also, I think I can find a way to deduplicate the classes.

Pat<(FooOp ..., (ReturnType<I32>:$ret)),
    (BarOp arg1, arg2, (ReturnType $ret))>

Return values are typically bound at the DAG node: FooOp:$ret and I assume ReturnType $ret actually means to copy the return type, so it should be (ReturnType (NativeCodeCall<"$0.getType()> $ret)). Alternatively, an explicit type could be like (ReturnType "$_builder.getI32Type()"); I would first propose the following change to your syntax:

Pat<(FooOp:$res $arg1, $arg2, (ReturnType<I32>)),
    (BarOp $arg1, $arg2, (ReturnType "$_builder.getI32Type()"))>;

I'm assuming if we want to take the return type matched from FooOp, we can use (ReturnType $res). If so, yes, I think it's better.
I'm thinking to keep the meaning of Dag arg, i.e., it'll be binding the operand, specifying the type/attr constraint or another DagNode. An alternative way will be (ReturnType<"$_builder.getI32Type()">), so it aligns the shape of NativeCodeCall.

One question here, I'm not sure the complexity of supporting both ReturnType and ReturnType<...>. I barely remembered that tablegen parser may report an error but I'm not sure.

Assuming I reduce 4 classes down to 1 and introduce matchers for return types as well with the same OpWithResultType syntax, the difference boils down to whether you'd prefer the return type be specified as a trailing DAG node or next to the operation inside a template argument. I personally think it's a little bit weird to have a return type specifier after the operands and attributes, because the expectation of the DAG arguments is that they match the arguments = (ins ...) in the op declaration.

Also, having the return type specifier as a DAG argument would be more complex to implement. You'd have to ensure that there's only one return type specifier, that it's at the end of the argument list (or skip over it if would be allowed to be specified anywhere else), and it would have $_builder.getI32Type()to be associated back to the operation.

I prefer to keep the the DAG operator has the same meaning. It can be pure operation or some utility operations like NativeCodeCall. When we want to add other constraints to the operation, we usually do it in the constraints list (separate from DAG operator).

Yes, so far we have the agreement that a DAG has the form (Op arg0, arg1, ...). I think we could extend it now. When we create a new operation, we needs to specify the type (an exception is the interface inferReturnType). So I think it's ok to extend the same idea that adds an arg position for return type.

I remember that checking the position of ReturnType DAG is not too complicated. We have location which is supposed to be the last DAG arg if presents. So I think it can be verified in similar approach. For the associating the ReturnType with the operation, either in matcher or rewriter, we will go over the argument of the DagNode, so I think it may not be hard to bind the specified return type with an operation. Maybe I missed some invisible complexity :)

What's your opinion?

mehdi_amini added inline comments.Sep 10 2021, 5:45 PM
mlir/include/mlir/IR/OpBase.td
2641

Slightly off-topic for this revision, but I'm surprised we bundle all of this in OpBase.td, which I'd think would focus on ODS and not DRR: should we split all the DRR stuff in its own header?

Mogball added inline comments.Sep 10 2021, 6:18 PM
mlir/include/mlir/IR/OpBase.td
2641

+1 I am in favour of this.

Mogball updated this revision to Diff 372044.Sep 10 2021, 6:34 PM
  • rename OpWithResultTypeBuilders to ReturnTypeFuncs
Mogball updated this revision to Diff 372565.Sep 14 2021, 2:26 PM
  • change syntax to trailing directive

Thanks for this update!

Can you also add a test case which has both location and returnType?

mlir/tools/mlir-tblgen/RewriterGen.cpp
130–144

Do you think if we put this in DagNode class will be better?

1013–1016

How about checking this first so that we can remove one layer of nesting? I.e.,
if (!tree.isNestedDagArg())

break;

// Do everything else

1027

Can we lift this out of conditional branch?

1028

This is guarded above, maybe we can omit this check?

1083–1084

Can we have a variable(maybe tblgen_types?) which stores the return type and we just pass the type to rewriter.create?

Mogball marked an inline comment as done.Sep 14 2021, 4:16 PM
Mogball added inline comments.
mlir/tools/mlir-tblgen/RewriterGen.cpp
130–144

This I feel belongs in RewriterGen.cpp. It's part of the details of DRR codegen, whereas DagNode and the other classes under mlir/TableGen are purely wrappers for llvm::Init.

One of the functions that break this rule is Pattern::collectBoundSymbols, and it consequently needs to duplicate logic found in RewriterGen.cpp into Pattern.cpp. Moving TrailingDirectives into Pattern.h would prevent this duplication, but that's only because the logic inside collectBoundSymbols is not where it belongs in the first place (IMO).

1027

true

1028

I guess you can, yeah

1083–1084

Yeah, that's what's being done below.

Mogball marked 2 inline comments as done.Sep 14 2021, 4:18 PM
Mogball added inline comments.
mlir/tools/mlir-tblgen/RewriterGen.cpp
1013–1016

Yeah sure. (the codebase is not 100% consistent with this though, especially in the messier files like this one XD)

Mogball updated this revision to Diff 372588.Sep 14 2021, 4:45 PM
  • minor fixes and adding test case for returnType and location together
Mogball marked 3 inline comments as done.Sep 14 2021, 4:46 PM
Chia-hungDuan added inline comments.Sep 14 2021, 4:48 PM
mlir/tools/mlir-tblgen/RewriterGen.cpp
130–144

Agree. Thanks!

1083–1084

Sorry, I didn't state this completely. I'm thinking if we have a variable always saves the return type, then do we still need this check !tail.returnType here and down below?

Mogball added inline comments.Sep 14 2021, 4:57 PM
mlir/tools/mlir-tblgen/RewriterGen.cpp
1083–1084

Do you mean always passing tblgen_types in each of these cases?

The first two cases rely on 1) auto-generated builder that uses OpTraits to infer return types and 2) a user-defined builder that does not require return types to be passed.

Only in the cases of 1) explicitly defined return types and 2) replacing the root op (so we can just copy the return types) will the fully generic builder TypeRange, ValueRange, AttrDict be called.

Chia-hungDuan added inline comments.Sep 14 2021, 5:47 PM
mlir/tools/mlir-tblgen/RewriterGen.cpp
1083–1084

Sorry, I got the wrong memory for the logic here. Just read the code again,

I think it should be a check here, if the op has at least one of those traits, then it shouldn't be able to assign an incompatible return type(Or we can disallow the use returnType here)

1101–1102

I think in this case, we can invoke the builder with return type, right?

1136

Will this introduce a redundant indent?

jpienaar added inline comments.Sep 15 2021, 6:04 AM
mlir/tools/mlir-tblgen/RewriterGen.cpp
1083–1084

One can provide a more refined type (but compatible) than a local inference function could produce, so we should not disallow it, and checking here for compatible would be tricky and at best only handle a subset of cases (and verification here could require duplicating logic).

We should definitely disallow changing root's return types, but for that I'd just say let's uniformly flag it.

Mogball added inline comments.Sep 15 2021, 8:52 AM
mlir/tools/mlir-tblgen/RewriterGen.cpp
1101–1102

In this case, if we don't have explicitly specified return types, there's nothing ODS can do to infer return types (no OpTraits, no attributes, etc.). DRR will generate a call to a builder that ODS does has not automatically generated, so the burden is on the developer to provide a builder that fits that signature.

This is the main reason why explicit return types were added, because certain ops (like casts) need explicit return types. The solution before was to use a NativeCodeCall to directly call the correct builder.

1136

yessir let me fix this sir

Mogball updated this revision to Diff 372717.Sep 15 2021, 9:02 AM
  • fixing indentation
Mogball marked 4 inline comments as done.Sep 15 2021, 9:19 AM
Chia-hungDuan added inline comments.Sep 15 2021, 9:39 AM
mlir/tools/mlir-tblgen/RewriterGen.cpp
1101–1102

I mean, in this conditional branch, if the user doesn't provide the explicit return type, then its their responsibility to handle the builder call. If the user provides the explicit return type, then we can call the builder with return type argument.

So whether or not user provides the explicit return type, we can always handle the case (usePartialResults || depth > 0 || resultIndex < 0) and we may not need the check !tail.returnType, right?

Mogball added inline comments.Sep 15 2021, 9:48 AM
mlir/tools/mlir-tblgen/RewriterGen.cpp
1101–1102

Well, not if no such builder was defined. "Responsibility" might be the wrong word; certain ops need to have an explicit return type and cannot define a type deduction builder.

If you defined a cast operation e.g. my_cast_op and you were trying to create it nested in a result pattern, without any OpTraits or attributes for type inference, and without explicitly specified return types, the generated pattern will not compile because no type deduction builder was defined. So you simply always have to provide an explicit return type.

And anyways, even if a type deduction builder is defined, explicitly specifying the return types should always take precedence.

Mogball added inline comments.Sep 15 2021, 9:50 AM
mlir/tools/mlir-tblgen/RewriterGen.cpp
1101–1102

It's structured like this:

if has explicit return types
    if is a root op - throw an error
    else use the return types
else
    if has attributes/optraits - use those
    else if is a root op - copy root return types
    else try to call a type deduction builder
Chia-hungDuan added inline comments.Sep 15 2021, 10:17 AM
mlir/tools/mlir-tblgen/RewriterGen.cpp
1101–1102

Ok, I thought there were codes for special handling usePartialResults and just noticed that it's nothing special but calling the builder with different signature. Sorry for the misleading.

And I saw you already have test case for this, thanks!

Mogball marked 4 inline comments as done.Sep 15 2021, 12:09 PM
Chia-hungDuan added inline comments.Sep 15 2021, 2:07 PM
mlir/tools/mlir-tblgen/RewriterGen.cpp
900–902

How about adding one comment for this case?

This revision was automatically updated to reflect the committed changes.