Page MenuHomePhabricator

[mlir] Add a signedness semantics bit to IntegerType
ClosedPublic

Authored by antiagainst on Jan 10 2020, 11:50 AM.

Details

Summary

Thus far IntegerType has been signless: a value of IntegerType does
not have a sign intrinsically and it's up to the specific operation
to decide how to interpret those bits. For example, std.addi does
two's complement arithmetic, and std.divis/std.diviu treats the first
bit as a sign.

This design choice was made some time ago when we did't have lots
of dialects and dialects were more rigid. Today we have much more
extensible infrastructure and different dialect may want different
modelling over integer signedness. So while we can say we want
signless integers in the standard dialect, we cannot dictate for
others. Requiring each dialect to model the signedness semantics
with another set of custom types is duplicating the functionality
everywhere, considering the fundamental role integer types play.

This CL extends the IntegerType with a signedness semantics bit.
This gives each dialect an option to opt in signedness semantics
if that's what they want and helps code sharing. The parser is
modified to recognize si[1-9][0-9]* and ui[1-9][0-9]* as
signed and unsigned integer types, respectively, leaving the
original i[1-9][0-9]* to continue to mean no indication over
signedness semantics. All existing dialects are not affected (yet)
as this is a feature to be opt in.

More discussions can be found at:

https://groups.google.com/a/tensorflow.org/d/msg/mlir/XmkV8HOPWpo/7O4X0Nb_AQAJ

Diff Detail

Event Timeline

antiagainst created this revision.Jan 10 2020, 11:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 10 2020, 11:50 AM

Unit tests: pass. 61765 tests passed, 0 failed and 780 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

flaub added a subscriber: flaub.Jan 16 2020, 1:44 AM
rriddle requested changes to this revision.Jan 16 2020, 5:49 PM
rriddle added inline comments.
mlir/include/mlir/IR/Builders.h
115

// -> ///

mlir/include/mlir/IR/StandardTypes.h
87

Use triple tick comments.

mlir/lib/IR/Attributes.cpp
282

This is a slightly concerning. Is your plan to have getSInt/getUInt? Or remove this all together?

738

Can you tighten this up a bit?

if (intType.isSignless())

return true;

return intType.isSigned() ? isSigned : !isSigned;

mlir/lib/IR/MLIRContext.cpp
519

Can you just switch this to early exit and leave the rest as is?

mlir/lib/IR/TypeDetail.h
76–97

Can you bit pack the signedness with the width?

356

You don't need all of this, you shouldn't need any of this if you follow the above. If it's still necessary, just implement a custom hashKey function for the integer type storage.

This revision now requires changes to proceed.Jan 16 2020, 5:49 PM
antiagainst marked 8 inline comments as done.

Address comments

antiagainst added inline comments.Jan 21 2020, 9:08 AM
mlir/lib/IR/Attributes.cpp
282

I don't plan to add getSInt/getUInt at the moment because there is a TODO for removing this API entirely. I'm not sure whether that's still planned? @jpienaar

738

I think that's effectively the same as the switch? Anyway, changed. :)

Unit tests: pass. 62059 tests passed, 0 failed and 784 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

jpienaar added inline comments.Jan 25 2020, 9:04 AM
mlir/lib/IR/Attributes.cpp
282

Which TODO are you referring to?

rriddle added inline comments.Jan 29 2020, 11:37 PM
mlir/include/mlir/IR/OpBase.td
313–314

Can you go through and make sure all of the current usages of IntegerType use the signless variant? Or is this something planned for a followup?

dcaballe added inline comments.Feb 3 2020, 11:37 AM
mlir/include/mlir/IR/StandardTypes.h
101

sigedness -> signedness

115

sigedness -> signedness
indicatd -> indicated

antiagainst marked 4 inline comments as done.

Address comments

  • Went through isa<IntergerType> and isInterger(...) use sites to make sure they now use signless integer type where possible.
  • Updated several APIs to include 'Signless' in the name to be explicit.
  • Fixed typos
antiagainst marked 2 inline comments as done.Feb 8 2020, 3:05 PM
antiagainst added inline comments.
mlir/include/mlir/IR/OpBase.td
313–314

Sure! Since you are asking here and I changed some places, I went another pass over isa<IntergerType> and isIntegerType use sites and converted as many as possible to use signless ones. Remaining ones basically are for i1, parsing and int elements attr storage. So it's should be quite thorough now; but if you can go another pass that would be awesome. Some APIs are also adjusted to include 'Signless' in the name to be explicit so downstream users can see an explicit compilation break.

mlir/lib/IR/Attributes.cpp
282
bondhugula requested changes to this revision.Feb 8 2020, 5:48 PM
bondhugula added a subscriber: bondhugula.
bondhugula added inline comments.
mlir/docs/Rationale.md
252

different levels of abstractions

253

more near -> closer to

254

likewise

This revision now requires changes to proceed.Feb 8 2020, 5:48 PM

For the naming part, could we consider using IntegerType for signless integers the way it is now and add SignedIntegerType? i.e., see "Signed" as an addition as opposed to an adjective/specialization, and continuing to maintain that integer types in MLIR are signless.

A couple of things.

  1. The commit message has no mention of 'SignedIntegerType' or 'SignlessIntegerType'. Please add something to the effect "add 'SignedIntegerType' and IntegerType -> SignlessIntegerType".
  1. The last message on https://groups.google.com/a/tensorflow.org/d/msg/mlir/XmkV8HOPWpo/7O4X0Nb_AQAJ was in Jul 2019. Could you mention the naming change on the discussion forum?
antiagainst marked 4 inline comments as done.
  • Fix Rationale wording

For the naming part, could we consider using IntegerType for signless integers the way it is now and add SignedIntegerType? i.e., see "Signed" as an addition as opposed to an adjective/specialization, and continuing to maintain that integer types in MLIR are signless.

A couple of things.

  1. The commit message has no mention of 'SignedIntegerType' or 'SignlessIntegerType'. Please add something to the effect "add 'SignedIntegerType' and IntegerType -> SignlessIntegerType".
  2. The last message on https://groups.google.com/a/tensorflow.org/d/msg/mlir/XmkV8HOPWpo/7O4X0Nb_AQAJ was in Jul 2019. Could you mention the naming change on the discussion forum?

Functionality-wise it might not matter that much but engineering-wise, IMO, the current way is better because:

  • Signedness is just one of the semantics bit belonging to integers, like bit width. So I think the current way it's conceptually clearer.
  • Having different subclasses, say, SignlessIntegerType, SignedIntegerType, and UnSignedIntegerType (IMO we cannot just have SignlessIntegerType and SignedIntegerType because using SignedIntegerType to mean both signed and unsigned integers can be quite confusing) we are basically duplicating the same set of utilities for integers three times. And one still would like to have an intermediate base integer class for cases where one don't care about signedness so ends up we are having the same set of utilities four times. We may be able to deduplicate and simplify with templating and other ways, but still it's complexity coming with the class hierarchy.

:)

rriddle added inline comments.Feb 14 2020, 9:56 PM
mlir/include/mlir/Dialect/VectorOps/VectorOps.td
490–491

Seems like some of these need to be formatted now.

mlir/lib/IR/Attributes.cpp
282

I'd say that we could add them and just let the user decide which one to call. I'd be fine with doing what llvm::ConstantInt does, i.e. have a getSExtValue and getZExtValue.

mlir/lib/IR/TypeDetail.h
60–68

I think we can keep the KeyTy the same(i.e. unsigned) and just construct it from unsigned+signedness when necessary. That would simplify a lot of this.

80

Can we just use a packed struct instead and convert between that and unsigned when necessary? That would remove the bit magic and make it easier to manipulate. e.g.

struct KeyBits {
  unsigned width : 30;
  unsigned signedness : 2;
};

KeyBits getKeyBits(unsigned value) {
  KeyBits asBits;
  memcpy(&asBits, &valueData, sizeof(asBits));
  return asBits;
}
unsigned getFromKeyBits(KeyBits) {
  // Reverse of above
}
rriddle accepted this revision.Feb 19 2020, 1:15 PM

Approval after resolving comments. Thank you for all of your hard work and patience making sure this work! Awesome work.

antiagainst marked 6 inline comments as done.Feb 21 2020, 6:19 AM
antiagainst added inline comments.
mlir/lib/IR/TypeDetail.h
60–68

I tried but wasn't able to get this working properly. Anway this is an internal thing that we can improve later. So leaving it as-is for now.

80

Thanks! I used KeyBits to simplify a few things.

This revision was not accepted when it landed; it landed in state Needs Review.Feb 21 2020, 6:24 AM
This revision was automatically updated to reflect the committed changes.
antiagainst marked an inline comment as done.