This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Linalg][doc] Add Design Document for the Linalg Dialect
ClosedPublic

Authored by nicolasvasilache on Jan 28 2020, 5:03 PM.

Details

Summary

This revision adds a Design Document for the Linalg Dialect

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJan 28 2020, 5:03 PM

For real time rendering as the review proceeds, please see:
https://github.com/nicolasvasilache/mlir/wiki/Linalg-Dialect:-The-Case-For-Compiler-Friendly-Custom-Operations

I will update it as I address review comments.

Thanks!

Unit tests: pass. 62278 tests passed, 0 failed and 827 were skipped.

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

Started making my way through it, very interesting doc!

mlir/docs/Dialects/Linalg.md
3

It'd be really nice if we can use a TOC equivalent.

12

This seems like it's going to be out-of-date soon and feels weird to include for the documentation of a dialect.

16

Double space?

24

This start of this paragraph seems like it's describing why you are writing this documentation, which seems weird for a doc about the dialect.

34

Is there a 7?

39

Extra space here.

57

Extra space before "The"

83

Prefer relative paths on all of the things that can use it

105

I'd remove "currently still"

I don't feel it's appropriate for me to review this since I wrote some parts of the text.

mlir/docs/Dialects/Linalg.md
34

Can we just use "1." everywhere and let markdown render do the numbering?

@ftynse it seems approppriate for you to review parts you haven't touched yet though.
I think your review would still be very valuable for sections 3 - 8 (Core Guiding Principles - end).

Thanks!

@ftynse it seems approppriate for you to review parts you haven't touched yet though.
I think your review would still be very valuable for sections 3 - 8 (Core Guiding Principles - end).

Thanks!

Reviewed until 6 inclusive. FWIW, I think it's a waste of time for me to write comments and for you to try and implement them instead of me just fixing the text directly.

mlir/docs/Dialects/Linalg.md
12

I would consider putting everything up to and including section 5 into a separate "Rationale" doc that explains _why_ Linalg. And keep the sections 6-7 here to explain _how_ Linalg, referring the rationale doc the same way LangRef does.

371

Make an actual list

372

"(1) A and B (2) (3) C, D and E" breaks the logical list structure here

372

"correct by construction by carefully curating the set of generic operation properties that drive applicability"

380

Provide examples of this needs, to be concrete :)

406

grammo "will later needs to be reconstructed"

452

I'd drop the inner list here

464

It isn't just a similarity, heuristics _are_ human-engineered features :)

468

We don't seem to have Tensor Comprehensions in the "lessons learned from". IMO, we should and part of this text can go there, together with the limitations.

480

We are also not saying we don't want cost models at all. There are ways of having ML build you a cost model, or using a cost model to train ML. Cite Hugh Leather and team.

483

MLIR merely enables one to do so, we chose to define it in the structured way for the reasons above.

492

Explain what inspector-executor style means.

497

I'm not sure it makes sense to put observations after principles. IMO, principles can be built on the observations (and prior art).

508

I like the reference, but it would be better to briefly explain what the implications are. Something about choosing where to implement a transformation: on the algorithm or on the data?

510

This paragraph sounds overlapping with "preservation of information" from the section above.

530

Maybe argue that closedness was necessary because one couldn't easily mix different types of IRs? Other reasons?

532

Progressive lowering is also one of the guiding principles above. There's _some_ overlap between this section and the previous section. I would consider either merging the two, or explicitly deriving one from the other, point by point.

534

Care to elaborate? I don't grasp this analogy.

542

Monotonicity of what? Do you mean monotonously non-increasing level of abstraction, i.e. after transformation you are either at the same level of abstraction, or at a mix of same and lower level of abstraction?

544

You haven't introduced GenericOp yet.

558

typo "to at list"

583

typo: can expressing

597

Isn't it exactly the inverse in schedule trees? The op (leaf stmt) has almost no information and you need to trace to the tree root to collect it?

@ftynse I fully agree hence my pushing back on the github comments.

@rriddle can we please move forward?
I'll integrate the current review and then we can iterate like Alex suggests?

@ftynse I fully agree hence my pushing back on the github comments.

If you had started submitting this incrementally, you wouldn't have this problem ;)

@rriddle can we please move forward?
I'll integrate the current review and then we can iterate like Alex suggests?

Is there a rush here? I have some strong concerns with the way this document is structured as-is, which are largely enumerated by Uday on the github commit:

This file doesn't appear to be just the documentation or design doc of Linalg but also a rationale doc + past survey/experience + related work discussion + opinion + future thoughts > / side comments and some amount of vision. Writing these up is useful, but I think it's both misnamed and misplaced in the directory tree. For eg., other docs/Dialects/*.md file have > vastly different content.

What is the plan to resolve these?

Reviewed until 6 inclusive. FWIW, I think it's a waste of time for me to write comments and for you to try and implement them instead of me just fixing the text directly.

Given that this was already submitted, and never reverted, do whatever you feel comfortable doing.

@rriddle can we please move forward?
I'll integrate the current review and then we can iterate like Alex suggests?

Given that this is still submitted, and never reverted, that is fine. I didn't intend for this to be a "blocking review" given that it is already in-tree.

albertcohen accepted this revision.Jan 29 2020, 8:36 AM

LGTM. Thank you for taking the time to dump this background document. This is a huge piece of work as a rationale and vision for linalg, and code generation for tensor libraries in MLIR.

There are a few aspects I'd like to organize differently, like splitting out a top level documentation paragraph and moving it to the tablegen description where the rest of the documentation lives, renaming the rest as a rationale document (pointing to it in the doc), and modularizing the rationale a bit more. I will suggest something in a separate commit and also propose minor revisions to the technical presentation.

This revision is now accepted and ready to land.Jan 29 2020, 8:36 AM

@ftynse I fully agree hence my pushing back on the github comments.

If you had started submitting this incrementally, you wouldn't have this problem ;)

From my experience, you cannot write this kind of document (essentially a position paper more than documentation) incrementally in the same sense you would write code. There are no really separable components (functions, classes, features) that can be implemented independently. Worse, having only some parts available might make it convey a different message than what is actually intended. I would love to hear suggestions on how to iterate on such things.

@rriddle can we please move forward?
I'll integrate the current review and then we can iterate like Alex suggests?

Is there a rush here? I have some strong concerns with the way this document is structured as-is, which are largely enumerated by Uday on the github commit:

This file doesn't appear to be just the documentation or design doc of Linalg but also a rationale doc + past survey/experience + related work discussion + opinion + future thoughts > / side comments and some amount of vision. Writing these up is useful, but I think it's both misnamed and misplaced in the directory tree. For eg., other docs/Dialects/*.md file have > vastly different content.

What is the plan to resolve these?

I think most of this doc is the rationale for building Linalg, _not_ dialect documentation. It lists specific design decisions Linalg takes and needs the review of state-of-the-art to justify those decisions. How about renaming it to "Linalg Rationale" and linking to design decisions (aka core guiding principles) with an explanatory phrase that, unless you know the field well, you should read the survey first?

@ftynse I fully agree hence my pushing back on the github comments.

If you had started submitting this incrementally, you wouldn't have this problem ;)

From my experience, you cannot write this kind of document (essentially a position paper more than documentation) incrementally in the same sense you would write code. There are no really separable components (functions, classes, features) that can be implemented independently. Worse, having only some parts available might make it convey a different message than what is actually intended. I would love to hear suggestions on how to iterate on such things.

It depends really, but I had some things in mind when making that comment. For essay-like/position pieces it is really difficult to split up I definitely agree. Though the more infra-documentation bits could be split up however, if they are similar to the other documentation on dialects: types, operations, etc. This document seems to lack much of dry-langref style documentation so I 100% agree with you.

Now as for where to have this review, it is difficult.

  • One big Phab review:

Pro: It can be treated the same as the review for everything else
Pro: The comment and revision history is easy to see
Con: It is difficult when there are multiple collaborators as ownership/updates are difficult to make

  • Multiple Phab reviews:

Pro: It is much easier to review
Con: The entire picture may not be clear if each of the reviews rely on each other for context and deeply interconnect

  • Google Docs:

Pro: Much easier to collaboratively modify
Con: Comment/Revision history is difficult to view/recover

The above isn't exhaustive by any means and I don't necessarily have a preference in any of those cases. I'd just like to encourage a culture where we the community can more easily collaborate and review.

@rriddle can we please move forward?
I'll integrate the current review and then we can iterate like Alex suggests?

Is there a rush here? I have some strong concerns with the way this document is structured as-is, which are largely enumerated by Uday on the github commit:

This file doesn't appear to be just the documentation or design doc of Linalg but also a rationale doc + past survey/experience + related work discussion + opinion + future thoughts > / side comments and some amount of vision. Writing these up is useful, but I think it's both misnamed and misplaced in the directory tree. For eg., other docs/Dialects/*.md file have > vastly different content.

What is the plan to resolve these?

I think most of this doc is the rationale for building Linalg, _not_ dialect documentation. It lists specific design decisions Linalg takes and needs the review of state-of-the-art to justify those decisions. How about renaming it to "Linalg Rationale" and linking to design decisions (aka core guiding principles) with an explanatory phrase that, unless you know the field well, you should read the survey first?

I don't think we necessarily need to have a disclaimer, but I'm +1 to this being named "Linalg Rationale" as it fits much better.

Thanks everyone for your comments, I will integrate, update, rename to "Linalg Rationale" (there is already a separate Tablgen'd op doc here: https://mlir.llvm.org/docs/Dialects/LinalgDoc/).
Once this is done please feel free to send patches however you see fit and reorganize in a way that makes the most sense.

I am looking forward to not being the SPOF on this :)

nicolasvasilache marked an inline comment as done.Feb 2 2020, 7:27 AM
nicolasvasilache added inline comments.
mlir/docs/Dialects/Linalg.md
3

Do you have a suggestion here?
If there is a canned solution for this I'd happily adopt it.
I wouldn't want to condition progress on missing infrastructure and I am not looking for new pet projects at this time :)

nicolasvasilache marked 42 inline comments as done.Feb 2 2020, 11:31 AM
nicolasvasilache added inline comments.
mlir/docs/Dialects/Linalg.md
3

Trying with just TOC, we'll see how this goes.

12

Will move the whole doc to a Rationale and then move things around a bit in a followup.

24

Moving to a Rationale as suggested by others too.

83

I'll probably mess this up if I just go for it without a way to test.
Atm I am iterating with testing this with Markdown on github.
I prefer to leave the paths for a NFC followup.

380

These are Core Design Principles that will want to evolve into generic design principles that should also inform TCP (T Comp. Primitives) and other things. I don't want to oversubscribe this to Linalg (I may already do that too much in other places, I'll have to rework to make it into a design principles proposal).

497

Historically I started from the principles and after some iterations trying to resolve apparent contradictions, these observations allowed me to define the IR the way it is.
As you and others have advocated, this doc will be split up into more pieces but I want the connections between pieces to be visible and clear at all times.

I'm not worrying about presentation ordering for now since things will move.

510

I reflowed the text into more appropriate places, thanks!

534

Reshuffled things a bit around and explained a little more.

597

Yes you're right, dropping the reference to ISL here.

nicolasvasilache marked 8 inline comments as done.Feb 2 2020, 11:36 AM

Address comments.

Split the docs.

Drop references that are now crossing documents.

This revision was automatically updated to reflect the committed changes.

Thanks all for your comments and suggestions!
This has now landed with a little extra content and a few more figures.
Please send revisions if you wish to change things and in particular if you think I went overboard with figures.

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

Unit tests: pass. 62395 tests passed, 0 failed and 839 were skipped.

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

Harbormaster completed remote builds in B45536: Diff 241937.
rriddle marked an inline comment as done.Feb 2 2020, 1:35 PM
rriddle added inline comments.
mlir/docs/Dialects/Linalg.md
3

I worked out a solution with Jacques today, [TOC] should be supported on the website now.

Thanks for all this documentation! Very elucidating! Just a couple of minor comments on the code. Probably missing something on my side.

mlir/docs/Dialects/Linalg.md
68

Are %2 and %3 actually %A and %B?

150

Same re %2 and %3.

I'm probably missing something but shouldn't %J2 = "dim" %3, 0: index be %J2 = "dim" %3, 1: index