This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Introduce data layout modeling subsystem
ClosedPublic

Authored by ftynse on Feb 19 2021, 10:55 AM.

Details

Summary

Data layout information allows to answer questions about the size and alignment
properties of a type. It enables, among others, the generation of various
linear memory addressing schemes for containers of abstract types and deeper
reasoning about vectors. This introduces the subsystem for modeling data
layouts in MLIR.

The data layout subsystem is designed to scale to MLIR's open type and
operation system. At the top level, it consists of attribute interfaces that
can be implemented by concrete data layout specifications; type interfaces that
should be implemented by types subject to data layout; operation interfaces
that must be implemented by operations that can serve as data layout scopes
(e.g., modules); and dialect interfaces for data layout properties unrelated to
specific types. Built-in types are handled specially to decrease the overall
query cost.

A concrete default implementation of these interfaces is provided in the new
Target dialect. Defaults for built-in types that match the current behavior are
also provided.

Diff Detail

Event Timeline

ftynse created this revision.Feb 19 2021, 10:55 AM
ftynse requested review of this revision.Feb 19 2021, 10:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 19 2021, 10:55 AM

Started making my way through it.

Can you start a top-level doc documenting the design here, how users will interact with these interfaces, etc?

mlir/include/mlir/Dialect/Target/Target.h
75 ↗(On Diff #325038)

nit: Can you use StringLiteral here instead?

mlir/include/mlir/Dialect/Target/TargetBase.td
20 ↗(On Diff #325038)

Can you expound more upon what you mean here by "target"? In MLIR that is a very open concept and it isn't clear how narrow this dialects intention is.

25 ↗(On Diff #325038)

Same comment here, these look like the should be StringLiteral.

rriddle added inline comments.Feb 19 2021, 5:02 PM
mlir/include/mlir/Interfaces/DataLayoutInterfaces.h
2

This header is incorrect.

111

typo: identifies

mlir/include/mlir/Interfaces/DataLayoutInterfaces.td
282

Using byte as a measurement for size can be problematic if it bakes in the assumption 1 byte== 8-bits, given that not all targets do that. Also, what is the intended result of this method for something like i1?

289

What unit of measurement is the alignment? byte?

303

Is there a definition of what "compatible" means in this context?

(Note: The description for interface methods doesn't have to be a single line, it can be a full paragraph)

313

Missing a period here.

mlir/lib/Dialect/Target/Traits.cpp
16 ↗(On Diff #325038)

Why purpose is the trait serving that the interface can't provide?

17 ↗(On Diff #325038)

Typo: enforece

llvm went from unsigned to llvm::Align. Maybe introduce a richer type to represent the size of something. The conversion from bits to bytes could be target specific?

ftynse updated this revision to Diff 325430.Feb 22 2021, 6:13 AM
ftynse marked 10 inline comments as done.

Review

mlir/include/mlir/Dialect/Target/TargetBase.td
20 ↗(On Diff #325038)

I didn't want to be restrictive here...

mlir/include/mlir/Interfaces/DataLayoutInterfaces.td
282

So far, it does not bake in anything, it's up to the specific type. For i1, it will be 1 byte. As of now, we only have allocation as a user of this, and we can't allocate less. It should be possible to add bit/byte variants when we need them, this patch is large enough without that.

mlir/lib/Dialect/Target/Traits.cpp
16 ↗(On Diff #325038)

The trait provides the interface implementation that is based on the Target dialect. Without it, the interface would have to depend on the dialect, which sounds like layering inversion.

ftynse updated this revision to Diff 325431.Feb 22 2021, 6:16 AM

Forgot to git-add the doc file

Harbormaster completed remote builds in B90210: Diff 325430.

Took another pass. Code looks good, mostly still digesting the design.

mlir/docs/DataLayout.md
4
24
29
35
45
190

Can you also include rationale on these decisions?

mlir/include/mlir/Dialect/Target/Target.h
92 ↗(On Diff #325431)
mlir/include/mlir/Dialect/Target/TargetBase.td
20 ↗(On Diff #325038)

What other kinds of things would you see going in this dialect?

mlir/include/mlir/Interfaces/DataLayoutInterfaces.td
261
297

typo: alignemnt

mlir/lib/Dialect/Target/Target.cpp
242 ↗(On Diff #325431)

Is it valid for specs to have null entries?

261 ↗(On Diff #325431)

nit: I'd use const auto &v here personally.

mlir/lib/Dialect/Target/Traits.cpp
16 ↗(On Diff #325038)

Ah, that makes much more sense. I think this further points to Target being a problematic name. I read HasTargetDataLayout and glossed over this being for the implementation of the Target dialect.

mlir/lib/Interfaces/DataLayoutInterfaces.cpp
35

nit: Use a merged isa here.

91–93
95

nit: Use a ternary here?

132

nit: Spell out auto here

170

nit: Add braces for this if, the body spans multiple lines.

278

Use isa<> instead. isa<BuiltinDialect>(&sampleType.getDialect())

mlir/test/lib/Dialect/Test/TestTypes.cpp
135

This should be returning on an error. Missing test?

186

The above functions already emit an error, so we shouldn't emit a second one.

mlir/test/lib/Dialect/Test/TestTypes.h
126

Why not define this type using ODS?

Also, shouldn't this be LaidOut not LayedOut. (Though TypeWithLayout is probably cleaner)

mlir/test/lib/Transforms/TestDataLayoutQuery.cpp
19

nit: ///

ftynse updated this revision to Diff 328161.Mar 4 2021, 7:07 AM
ftynse marked 20 inline comments as done.

Address review

mlir/include/mlir/Dialect/Target/TargetBase.td
20 ↗(On Diff #325038)

When targeting LLVM IR, things like triple, CPU kind (e.g. skylake) and features (e.g. avx512) that are currently attached as function attributes; properties of built-in types; memory space descriptions. More generally, any assumption that transformations can make when transforming the code as long as it needs to be visible in many dialects.

mlir/lib/Dialect/Target/Target.cpp
242 ↗(On Diff #325431)

No, leftover from a previous version.

mlir/lib/Dialect/Target/Traits.cpp
16 ↗(On Diff #325038)

Any suggestions for better names?

mlir/lib/Interfaces/DataLayoutInterfaces.cpp
91–93
error: no matching constructor for initialization of 'mlir::Identifier'
    return T();
 note: in instantiation of function template specialization 'llvm::PointerUnion<mlir::Type, mlir::Identifier>::dyn_cast<mlir::Identifier>' requested here
    return entry.getKey().dyn_cast<Identifier>() == id;
 note: in instantiation of function template specialization 'llvm::PointerUnion<mlir::Type, mlir::Identifier>::dyn_cast<mlir::Identifier>' requested here
    return entry.getKey().dyn_cast<Identifier>() == id;
note: candidate constructor not viable: requires 1 argument, but 0 were provided
  Identifier(const Identifier &) = default;
mlir/test/lib/Dialect/Test/TestTypes.cpp
135

This should never happen, changed to assert.

mlir/test/lib/Dialect/Test/TestTypes.h
126

Why not define this type using ODS?

ODS doesn't support defining types with interfaces / traits.

Some Qs related to splitting this massive commit up.

mlir/docs/DataLayout.md
6

or its address alignment requirements ?

10

can you split in 4 clear bulletpoints with a very simple embedded string example of what the usage for each looks like?

43

Is this notion of scoping and compatibility well established somewhere else you can cite?
If not I am wondering if this is something that is immediately useful / needed?

I am thinking in particular of how we would want to represent host / device code and layouts.
My gut feeling right now is that the notion of nesting and compatibility is premature complexity to introduce from the very beginning and would be better split in a subsequent commit?

78

Can caching be a followup concern/commit so that complexity is introduced more incrementally?

82

Same Q, isn't this premature consideration for a first impl and can these considerations be split into a subsequent commit?

95

A simple early inline string example of what this looks like e.g. for vector types would help anchor this reader's imagination.

194

Is it default or required ?

We already know of HW that has xKB load alignment requirements: if this is a requirement, isn't this already limiting?

210

This is also related to the modeling of n-D vectors and the rationale described here: https://mlir.llvm.org/docs/Dialects/Vector/#deeperdive.
Potential future changes will also need to take this into consideration.

mlir/include/mlir/Interfaces/DataLayoutInterfaces.h
151

?

ftynse updated this revision to Diff 328484.Mar 5 2021, 5:32 AM
ftynse marked 3 inline comments as done.

Review

ftynse added inline comments.Mar 5 2021, 5:33 AM
mlir/docs/DataLayout.md
10

How do you see an embedded string example of an interface implementation? class MyOp : public Op<MyOp, ..., DataLayoutOpInterface::Trait> { DataLayoutSpecInterface getDataLayoutSpec() { return ...; } ... }; is the shortest form I can come up with, and it looks to verbose to be actually helpful.

43

No, it's not. But this is something that was explicitly requested in the RFC discussion so I prefer not to leave it out.

I am thinking in particular of how we would want to represent host / device code and layouts.

"Compatible" may mean that device modules must override all layout properties, there's no prescription on what that should mean. I can also imagine some things like index bitwidth or memref lowering scheme to be shared between host and device.

78

Caching has been explicitly requested in RFC review. If you strongly insist, I can try to split it out _the implementation_ in a separate commit.

82

The issue I see with splitting is that landing a modification to this subsystem will likely take time, so I'll have to come up with the temporary yet reasonable behavior and then modify every place that has come to rely on the temporary behavior. Again, if you really insist, I could try splitting the implementation into a separate commit, but I will likely have to land all of such commits simultaneously anyway.

95

I will add an example below, but the interface does not prescribe what this looks like, only how it can be interpreted. The risk is over-constraining the readers imagination to think a specific implementation of the interface is expected.

rriddle accepted this revision.Mar 8 2021, 10:58 AM

LGTM from me on this being the initial commit. I have still have some reservations on several aspects(name of the "Target" dialect, 8-bit byte assumptions) so it would be good to get another LGTM as well.

mlir/docs/DataLayout.md
193
210

I am still quite apprehensive about baking in 8-bit byte in the design here, as it is hard to undo these changes as they permeate through the system.

230

https?

mlir/include/mlir/Dialect/Target/Target.h
33 ↗(On Diff #328484)

You should be able to define these in ODS now.

mlir/lib/Dialect/Target/Target.cpp
145–147 ↗(On Diff #328484)
149–152 ↗(On Diff #328484)
170 ↗(On Diff #328484)

Can you cache .size() outside of the top loop? Given that your comment says you don't need to check for duplicates in newEntries.

259–261 ↗(On Diff #328484)
This revision is now accepted and ready to land.Mar 8 2021, 10:58 AM

Thanks a lot for pushing on this! This is a significant amount of work, and it'll be very important moving forward!

There are a few details I'm not sure about, but it looks fine as a first step. The only thing that I'm still uneasy about is to use the name target for the dialect, do we have alternatives?

ftynse updated this revision to Diff 329631.Mar 10 2021, 5:42 AM
ftynse marked 13 inline comments as done.

Address more review

The only thing that I'm still uneasy about is to use the name target for the dialect, do we have alternatives?

I don't really. DataLayout sounds overly restrictive, TargetProperties and the likes sound too long. I am happy to take suggestions here.

mlir/docs/DataLayout.md
210

This just reflects the current behavior without data layouts. But I agree that we can clean it up immediately after this lands.

mlir/include/mlir/Dialect/Target/Target.h
33 ↗(On Diff #328484)

Same problem as for test attrs, ODS is missing support for attribute traits/interfaces.

ftynse updated this revision to Diff 329943.Mar 11 2021, 5:49 AM

Renamed Target dialect to DLTI (Data Layout and Target Info)

Acronyms can be annoying but this one has the merit of being sufficiently
unique to afford grep-based renaming in the future if somebody comes up with a
better name.

This revision was landed with ongoing or failed builds.Mar 11 2021, 7:54 AM
This revision was automatically updated to reflect the committed changes.