This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Linalg] Uniformize linalg.generic with named ops.
ClosedPublic

Authored by nicolasvasilache on Sep 18 2020, 1:22 PM.

Details

Summary

This revision allows representing a reduction at the level of linalg on tensors for generic ops by uniformizing with the named ops approach.

Diff Detail

Event Timeline

nicolasvasilache requested review of this revision.Sep 18 2020, 1:22 PM
burmako accepted this revision.Sep 19 2020, 3:31 PM
burmako added inline comments.
mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h
48

The comment looks outdated after the changes to the function signature.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
524

{other-attributes}? (Generally, I see the notation in this file is inconsistent between [other-attributes] and {other-attributes}, so maybe this change is intentional?).

614

Is this paragraph removed because this is no longer the case?

621

"update" instead of "updates"?

639

What do you think about the syntax which keeps type annotations at the end (while still introducing ins and friends)?

718

Missing curly brace?

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
46

In the function above, this parameter is called outputBufferTypes without the "s" in "Buffers". Also see initTensorTypes below.

mlir/test/Dialect/Linalg/invalid.mlir
394

Why is this no longer a test?

527

Likewise.

This revision is now accepted and ready to land.Sep 19 2020, 3:31 PM
nicolasvasilache marked 9 inline comments as done.Sep 21 2020, 12:09 AM
nicolasvasilache added inline comments.
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
614

correct, the SSA values don't escape control flow is not true anymore (e.g. scf.for with yield that I am going to take advantage of in the future).

639

It has 2 drawbacks that I can see right now:

  • still need to jump through hoops to figure out which arg is of what type in the multi-list. More local type information is more desirable esp. when we mix input + output buffers + result tensors.
  • the goal is to have the parser and printer use declarative assembly format with optional groups (blocked on some missing declarative assembly feature). Optional groups need operands and types to be within the same parsing unit (can't have the type parsing relegated at the end).
mlir/test/Dialect/Linalg/invalid.mlir
394

it was duplicated, see the next test.

527

this is no longer valid, there is no such thing as #outputs anymore (previously independently specified by the args_out attribute).

burmako added inline comments.Sep 21 2020, 6:55 AM
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
639
  1. "still need to jump through hoops to figure out which arg is of what type in the multi-list". If I understand correctly, you're talking about the more general problem of visually matching things with their types because there's a bunch of syntax between one and the other.

    This also exists in other operations like call/llvm.invoke (match arguments with their types) and, more generally, in default syntax for ops. One could even say it's also relevant to bigger ops in general, e.g. for subview (match the source with its type). Why does this need to be treated in a special way in linalg.generic / Linalg named ops syntax?
  1. The problem is still not solved with the current notation because you still need to jump through hoops within argument lists unlike e.g. in cond_br and func.
  1. Depending on the reader / formatting, this syntax may not necessarily be a readability improvement upon the status quo. Previously one could figure out the types of linalg.generic views by looking at the end of the syntax, and now the types live in the middle which is not necessarily easy to visually parse.

    This also affects the use case of mixing tensors and buffers because for that use case it seems to be beneficial to quickly skim the types involved in an op.
  1. Custom syntax that's stylistically different from the status quo may lead to other issues that I haven't thought about. For example, here's one that came to mind as I was ready to press the Submit button.

    https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices recommends that "Tests should be minimal, and only check what is absolutely necessary". As an example of minimalism, it shows an example that omits types from ops in expected syntax, e.g. // CHECK-NEXT: return %[[RESULT]], %[[RESULT]].

    This is something that becomes harder with the proposed syntax since one cannot just drop stuff after the conventional colon (colon included), and instead we need to say things like ins(%foo, %bar: {{.*}}), outs....
nicolasvasilache marked 5 inline comments as done.Sep 21 2020, 8:36 AM
nicolasvasilache added inline comments.
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
639

and, more generally, in default syntax for ops.

Yes that is the unsugared form, it quickly becomes hard to read when you have enough operands.

Why does this need to be treated in a special way in linalg.generic / Linalg named ops syntax?

Because parameter packs based on operand_segment_sizes need to be sugared otherwise they are hard to segment out.

The problem is still not solved with the current notation because you still need to jump through hoops within argument lists unlike e.g. in cond_br and func.

This relates to @mravishankar's comment on https://reviews.llvm.org/D87767.

TL;DR I agree that we want to go towards a func-like syntax where each arg is followed by its type.
I would strongly prefer this to be done automatically, once and for all, by the declarative format to avoid developing more parser/printer debt.
When the custom parser / printer accepts and interleaved mode it will be easy to have an NFC CL to update the syntax.

The situation is much better than it was before though: if we have 3 parameter packs with 2, 3 and 4 arguments there is strictly less jumping through hoops to determine the type of second argument of the third pack (you just need to look for the 2nd local type instead of looking for the 7^th global type).

and now the types live in the middle which is not necessarily easy to visually parse

We could also put the region before the type arguments, would that alleviate part of the problem ?

readability improvement upon the status quo

Consistency wins here, first I need to fix the semantics gap and uniformize with named ops (see https://reviews.llvm.org/D87767).
Also note that some of these arguments may become regions in the future.
Once uniformity is achieved we can continue improving the parsing / pretty-printing in an NFC fashion.
In particular the regions should also ideally be simplified.

stylistically different from the status quo

The status quo is now https://reviews.llvm.org/D87767, the generic ops have weaker semantics and need to be updated to allow tensors + reductions.

Once the harder functional changes are landed, we can iterate on a better syntax.
Note however that any proposed syntax will need to also work for named ops.

nicolasvasilache marked an inline comment as done.

Additional changes and builders.

I like the new syntax and encoding! Great cleanup. Some more verification of valid forms would be nice.

mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h
51

I assume you mean resultTensorTypes here?

53

Is this an alternative way to encode tensor results?

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
524

This is not the actual syntax, right? There is an attrs there.

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
360

I would have expected init_tensors not to count here. They only exist for reductions on tensors, so they are implied. But this verification is not very precise anyway.

1222

Maybe assign to a local?

1450

nit: parseColonTypeList

1507

printArrowTypeList or printOptionalArrowTypeList?

mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp
446

Left over?

512–521

I know this is a carry over but why UnknownLoc?

517

Reformat.

mlir/test/lib/Transforms/TestBufferPlacement.cpp
42

Not sure why this exists here. Can this not use the pattern from TensorToBuffers.h or is that not exposed there?

nicolasvasilache marked 10 inline comments as done.Sep 21 2020, 12:27 PM
nicolasvasilache added inline comments.
mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h
53

Just stale stuff, cleaned up, thanks!

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
524

Indeed thanks! Ideally this would go away but this CL has grown fat and I am running out of patience writing throwaway parser/printer code.
Will figure it out once we can use the declarative format.

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
360

Ack, more work is needed to make reductions + tensors really work with Linalg.
In particular for buffer allocation.

1222

Not getting this, could you please elaborate?

mlir/test/lib/Transforms/TestBufferPlacement.cpp
42

It's not exposed and it is not exposable without deeper refactorings because we don't want TensorToBuffer.h to depend on Linalg.

nicolasvasilache marked 4 inline comments as done.

Address review comments.

herhut added a subscriber: tpopp.Sep 22 2020, 1:39 AM
herhut added inline comments.
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
1222

It was just a nit to do

auto iteratorTypes = cast<LinalgOp>(op).iterator_types()

and then use that here and below.

mlir/test/lib/Transforms/TestBufferPlacement.cpp
42

There already is a populateConvertLinalgOnTensorsToBuffersPattern function, it is just not exposed. It should be enough to just expose that function and call it here. No need to do this in this change, I can clean that up, too, once this landed.

We need to figure out where to put all the tensor to buffers pattern anyway, as different dialects will need them and having a populate function in passes/transforms/rewrites seems the right approach to me.

@tpopp FYI as you looked into patterns for shape.assuming.