Page MenuHomePhabricator

[mlir][wip] Start Shape dialect
ClosedPublic

Authored by jpienaar on Feb 3 2020, 10:21 PM.

Details

Summary
  • Add basic skeleton for Shape dialect;
  • Add description of types and ops to be used;

Naming still open and pretty form would enhance this. As mentioned on
discourse, will do this piece wise for discussion/easier feedback cycles. Tried
to reduce parts while allowing enough to get idea/write tests with.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
rriddle added inline comments.Feb 3 2020, 10:40 PM
mlir/include/mlir/Dialect/Shape/IR/Shape.h
52

///?

56

///

69

Can we get comments on each of these types?

97

///

116

weird spacing here.

118

This header comment looks off.

mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
4

MLIR -> LLVM

72

Are these wrapped at 80 characters? They look like they are overflowing.

mlir/include/mlir/IR/DialectSymbolRegistry.def
27

Not this revision, but we should sort these.

mlir/lib/Dialect/Shape/DialectRegistration.cpp
4

MLIR -> LLVM

Unit tests: fail. 62150 tests passed, 5 failed and 811 were skipped.

failed: libc++.std/language_support/cmp/cmp_partialord/partialord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongeq/cmp.strongeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongord/strongord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakeq/cmp.weakeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakord/weakord.pass.cpp

clang-tidy: fail. clang-tidy found 2 errors and 1 warnings. 0 of them are added as review comments below (why?).

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

stellaraccident requested changes to this revision.Feb 3 2020, 11:10 PM
stellaraccident added inline comments.
mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
89

Just "Value" now, right?

This revision now requires changes to proceed.Feb 3 2020, 11:10 PM
mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
182

I might just be mis-understanding, but to my reading, I'm not sure this formulation is right. I believe this would yield "left aligned" broadcasting, whereas the traditional rules for a ranked broadcast are "right aligned":

OUTPUT DIM: 0 1 2 
       LHS: Z Y X
       RHS:     X

I think there is a step before where you want to align the lower ranked input dimensions to fill towards the right, with respect to the output dims. For full generality, you may also need a dimension map (similar to xhl_hlo broadcast_dimensions attributes).

jpienaar updated this revision to Diff 242458.Feb 4 2020, 4:05 PM
jpienaar marked 7 inline comments as done.

Addresses review comments.

jpienaar marked 12 inline comments as done.Feb 4 2020, 4:07 PM

Updated, thanks

mlir/include/mlir/Dialect/Shape/IR/Shape.h
69

This made me think about having ODS generate these from the description :)

mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
89

Ha, good point I had written this before that change :)

182

Good catch, I forgot to add that step. In the code I had used zip_longest and reverse iterator, but that doesn't lend itself to words that well :). Here I'm pursuing broadcasting as defined in numpy and used by ML frameworks (https://docs.scipy.org/doc/numpy/reference/ufuncs.html#broadcasting) but I could be wrong so please check.

Re: broadcast_dimensions attribute, while that is more general that is not the same semantics as what folks think of when broadcasting is mentioned. That is a combination of reshape & broadcast IMHO.

mlir/include/mlir/IR/DialectSymbolRegistry.def
27

Yeah I was very tempted ... I'll do in NFC follow up.

jpienaar updated this revision to Diff 242459.Feb 4 2020, 4:10 PM
jpienaar marked 2 inline comments as done.

Ran clang-format

Unit tests: fail. 62150 tests passed, 5 failed and 811 were skipped.

failed: libc++.std/language_support/cmp/cmp_partialord/partialord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongeq/cmp.strongeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongord/strongord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakeq/cmp.weakeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakord/weakord.pass.cpp

clang-tidy: fail. clang-tidy found 2 errors and 1 warnings. 0 of them are added as review comments below (why?).

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

Unit tests: fail. 62150 tests passed, 5 failed and 811 were skipped.

failed: libc++.std/language_support/cmp/cmp_partialord/partialord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongeq/cmp.strongeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongord/strongord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakeq/cmp.weakeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakord/weakord.pass.cpp

clang-tidy: fail. clang-tidy found 2 errors and 1 warnings. 0 of them are added as review comments below (why?).

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

rriddle marked an inline comment as done.Feb 4 2020, 6:10 PM
rriddle added inline comments.
mlir/include/mlir/Dialect/CMakeLists.txt
11

Seems like this should come before SPIRV?

mlir/include/mlir/Dialect/Shape/IR/Shape.h
19

Why do you need Function.h here?

37

Any reason not to sort these?

69

Realistically, we should be able to auto-generate simple types pretty trivially.

mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
152

Can we sort these operations alphabetically?

antiagainst added inline comments.Feb 5 2020, 6:50 AM
mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
182

I'd typically think an op means some action so the name is typically a verb but here we have broadcastable which sounds like a trait. What about just name it as broadcast?

208

My 2c: join sounds better for me as it indicates some operation behind. any_of gives me the feeling that the result will be the same as one of the input, which is not always true. :)

281

Would it make sense to name it as debug_print to be more clear?

herhut added a comment.Feb 5 2020, 9:11 AM

Great to see progress on this! Some comments.

mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
47

Nit: It is not used for dimensions, only, but generally as a positive integer value with support for being unknown and invalid. In that sense it is more like a size_t than a dim. Maybe shape.size?

56

I find the name type somewhat misleading. I think Shape_Shape would describe better what it is (a shape). But I get the desire to name it type. Should they all be Shape_DimType and Shape_ShapeType?

101

DimOrShape?

111

+1 to the constant op. Also, why expose a -1 here? Why not use

shape.constant unknown : !shape.size
shape.constant invalid : !shape.size
shape.constant <positive integer> : !shape.size

and then parse that internally into some special integer values? This would leave more flexibility to change the backing type to something other than int later.

135

Is there a reason for this? Also, thinking of library calls that return ranked tensors of statically unknown rank, this function would not allow me to create a shape for them. Should we also have a version that takes just the rank?

182

I agree. Broadcast seems a better name, as it broadcasts two shapes into a resulting broadcasted shape.

208

any_of is less constrained in that it does not have to pick the join of the two values, whereas join has rigid semantics. On the other hand, the description below describes the join of the two values.

222

I would expect

join ([], []) -> []
join([], [*]) -> []
join([], [?,...,?]) -> [invalid]

223

join([1,?], [2, ?, ?]) -> [invalid] ?

243

"for each every" is one too many.

264

Should this be shape.mul if it existed?

272

Does this mean the rank is unknown? Or also if parts of the shape are unknown?

If known rank was allowed, one could still implement computing the rank by counting the elements of the vector (as a somewhat useless example).

nicolasvasilache requested changes to this revision.Feb 7 2020, 8:26 AM

I question the relevance of carrying implicit broadcasting concepts from HLO and other frontends into MLIR core.
I see this as a red flag that the required separation of concerns has not happened.
Anything related to broadcast should be on the XLA / frontend side IMO.

I do not see a mention of broadcast in the discourse discussion: https://llvm.discourse.group/t/shape-type-dialect/423/4.

This revision now requires changes to proceed.Feb 7 2020, 8:26 AM
mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
191

This is frontend logic that I do not think should leak into MLIR.

191

I meant into MLIR *core*, sorry.

stellaraccident added inline comments.Feb 8 2020, 3:50 PM
mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
191

I tend to agree. We do need this to lower frontend dialects that choose to implement numpy-style broadcasting, but that does not mean it needs to be in core (or at least in this level of core -- I could see having a numpy_broadcast_shape op somewhere).

I suspect that there may have been some thought of needing this in some follow-on analysis algorithms. It would probably be better to discuss more in that context.

jpienaar updated this revision to Diff 243460.Feb 9 2020, 12:49 PM
jpienaar marked 30 inline comments as done.

Updating to address review comments (sorting ops, types, making type names uniform, renaming broadcastable op, renaming dim type, fixing example).

Updated, thanks

I question the relevance of carrying implicit broadcasting concepts from HLO and other frontends into MLIR core.

I disagree. This is about enabling frontend dialects to express their shape constraints. And need not even be implicit (e.g., explicit broadcast operator has broadcast shape op in its shape function). And I see this similar to signed integer discussion. While we have preferences and would prefer to never use implicit broadcast, many frontend dialects do and this is about enabling their modelling. Enabling frontend dialects to express their constraints is very useful. This is similar to broadcastable trait that we have at the moment.

I see this as a red flag that the required separation of concerns has not happened.
Anything related to broadcast should be on the XLA / frontend side IMO.

Are you saying we should not have any frontend dialects? This moves responsibility into op construction of the frontend time, which is often not where we wanted & optimizing frontend dialect is definitely something that we want, which might require legalizing from one frontend dialect to another.

mlir/include/mlir/Dialect/CMakeLists.txt
11

Depends on if you follow sort order case sensitive or insensitive :) I normally default to case sensitive, but can do insensitive.

mlir/include/mlir/Dialect/Shape/IR/Shape.h
19

Ah this was left over from previous testing, thanks

37

It was a semi hierarchy in my mind, but that doesn't really matter, so sorted.

69

Exactly. Should be easy to address.

mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
47

Good point and I like that, renaming.

56

Yeah, Shape_Shape was just weird, changing to uniformly suffixing with Type.

111

Mostly as it made writing tests easier while allowing reusing the integer attribute parser/dense storage and so switch later. But I can start with an ArrayAttr and then do the above.

For invalid case I feel constant doesn't highlight the error state well enough and would rather that be a separate op that takes a string attribute. WDYT?

135

The one below does. The main reason was that then you need to special case value interpretation (where I was using only integers, e.g., require rank followed by dims so that you could do [?] for unranked [1, ?] for 1D ranked but unknown etc.) but with expansion to using strings, one can use * now and so rolled it into constant.

191

Correct, we need this for frontend dialects (TF, TFLite, XLA [client] HLO, etc) to be able to express their shape constraints. it need not be used by standard etc. and we could add specialization that only works for ranked types. But I see unranked inputs and broadcast behavior as common for many frontend dialects and this similar to broadcastable trait or signed integers. We can move the op into a different file.

Now, of course nobody is forced to use this and standard ops will never generate nor need to solve shape equations using this.

223

I changed notations and didn't do it correctly.

243

It was just so good it happened every every time 😁

264

D'oh yes. This was computing sum of dimensions :)

272

Unraked, updated to clarify. Correct. well and I was using this for op where one computed the product of known indices, so inside the expression it would conditionally reduce πŸ™‚

281

Sounds good. I quite like it, made writing tests easier (although I should refactor those into non-executable tests ...).

jpienaar updated this revision to Diff 243461.Feb 9 2020, 1:06 PM

Reduce with mul should be init with 1 instead of 0

mehdi_amini added inline comments.Feb 9 2020, 2:43 PM
mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
245

I think si.dim should be shape.size?

herhut accepted this revision.Feb 10 2020, 1:21 AM

Thanks for starting this. I'd be happy to see this land.

mlir/include/mlir/Dialect/Shape/IR/ShapeOps.td
111

I like the idea. Invalid shapes should always have some reason attached to them to ease debugging whereas normal constants probably don't. Something like shape.error.

191

Ultimately, we could think of making the shape dialect and the machinery built on top of it extensible so that other front-ends can add their own high-level shape operations to model their constraints. In such a world, I would agree with @nicolasvasilache that the numpy-style broadcast operation should live with the corresponding front-end dialect.
However, as a first step, we should build a closed system that is powerful enough to model constraints for the input languages we care about (which includes HLO). This requires computing shapes for broadcast, so we should have the operation.

jpienaar updated this revision to Diff 244010.Feb 11 2020, 2:39 PM
jpienaar marked 7 inline comments as done.

Fix incorrectly named si.dim to shape.size

Updated typo, thanks

This revision was not accepted when it landed; it landed in state Needs Review.Feb 11 2020, 2:49 PM
This revision was automatically updated to reflect the committed changes.

It seems like this landed without enough code to compile? There's no .cpp file which includes the implementation .cpp.inc generated from tablegen?
As a result BUILD_SHARED_LIBS=on fails.

Steve