This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Add initial support for a binary serialization format
ClosedPublic

Authored by rriddle on Aug 11 2022, 8:17 PM.

Details

Summary

This commit adds a new bytecode serialization format for MLIR.
The actual serialization of MLIR to binary is relatively straightforward,
given the very very general structure of MLIR. The underlying basis for
this format is a variable-length encoding for integers, which gets heavily
used for nearly all aspects of the encoding (given that most of the encoding
is just indexing into lists).

The format currently does not provide support for custom attribute/type
serialization, and thus always uses an assembly format fallback. It also
doesn't provide support for resources. These will be added in followups,
the intention for this patch is to provide something that supports the
basic cases, and can be built on top of.

https://discourse.llvm.org/t/rfc-a-binary-serialization-format-for-mlir/63518

Diff Detail

Event Timeline

rriddle created this revision.Aug 11 2022, 8:17 PM
rriddle requested review of this revision.Aug 11 2022, 8:17 PM

I don't have any failure tests right now because those are annoying to make/update, and I want to make sure we agree on various aspects before doing that.

Nice, just started with doc for now.

I was wondering if we should we have mlir-translate do the text to binary and vice versa, but can see the convenience when stitching together passes and it's not an external format.

mlir/docs/BytecodeFormat.md
36

OOC why not use it?

69

What is used instead?

121

Nit: [] instead of *? (I see the relational database or pointer notations of the latter but I find the former more intuitive)

128

Link to encoding section?

209

And 7 of 8 optional elements used?

214

So location is either previous or new? Unspecified doesn't mean unknown but previous. That's nice. The only other common case I could think of is successive lines in a file but that would take a bit.

247

So we could have a region with 0 blocks here that is considered not empty?

rriddle added inline comments.Aug 12 2022, 12:40 AM
mlir/docs/BytecodeFormat.md
36

The encoding described here, or more commonly referred to as PrefixVarInt, is essentially a variant of LEB. Encoding and decoding are generally faster using the prefix strategy, though. The decode for the prefix variant, for example, is effectively branchless given that you just need to count the trailing zeros, which is an intrinsic on most modern hardware. I've benchmarked a bunch of different strategies and implementations over the past week, using both a corpus of .mlir taken from various projects and just using random distributions of integers. This is what emerged from that testing.

247

Zero block regions should be marked as empty. I used codes for various "bool" like things just to make it easier to change things later on. E.g. if we wanted to change the way regions are encoded we could just have a new "kRegion2" code. Though that only matters after we start versioning things.

Just commenting on the doc right now.

mlir/docs/BytecodeFormat.md
26

No fixed64 ? Is the bet that we'd always use VBR for this?

36

Are you using PrefixVarInt or how does it differ? Is your variant format documented somewhere in the literature?

I'd rather have us stick to something existing, I doubt that we'll invent a revolutionary trick here somehow.

Can you position what you're proposing against varint-G8IU, PFOR, SIMD-BP12, ... ?

120

I'm not sure here why these varint are useful?

121

+1 for [] :)

142

What is kAsmForm ? Seems to refer to an enum not documented here.

148

I'm not sure yet how we dispatch to a dialect for loading a type/attribute

154

Basically this is differential encoding of the offset table, but that means you need to decompress the table here IIUC. Accessing the last attribute requires to read the entire table.
(I assume we would do it once and for-all in the "BitcodeReaderContext" or whatever you're naming it)

168

That seems overly conservative to me: what about just storing the part after the dot and add a varint for the dialect ID?

191

I assume this field is needed to allow some level of lazy loading / random access?
I'm not sure yet...

209

Actually 6? firstResultIndex and numResults seem coupled right?

248

Do we need this indirection here? You didn't provide one for the operation content.

265

block_element ?

mehdi_amini added inline comments.Aug 12 2022, 2:39 PM
mlir/docs/BytecodeFormat.md
203

Having the regions in-line makes it hard/impossible to lazily load IR I think? (at least not without decoding the entire IR section).

rriddle updated this revision to Diff 452346.Aug 12 2022, 6:30 PM
rriddle marked 12 inline comments as done.
rriddle added inline comments.Aug 12 2022, 6:31 PM
mlir/docs/BytecodeFormat.md
26

I just didn't add it in. Technically right now we only use byte, so I dropped the rest and added a TODO to add larger widths as necessary.

36

Yeah, it's just PrefixVarInt. I completely missed explicitly saying that when rushing out the doc.

120

These are used to know how many attributes/types/operation names need to be parsed.

121

SG, I also like [].

148

We chatted a little about this offline. Attributes and Types are grouped by dialect, with each grouping emitted in the same order as the dialects in the dialect section. This allows for us to know which dialect an attribute/type belongs to based on its index (i.e. we could know attributes 0-5 are the builtin dialect, 6-9 are the func dialect and so on).

154

Yeah. We do a single pass to initialize the bytecode structure of all attributes/types, which indicates where the data is stored in the bytecode. Reading lazily after that is trivial, because we just read the previously computed data directly.

168

Agreed, I need to look into this. I went with the current thing because it was easier to bootstrap (e.g. I don't have to worry about the difference between builtin and non-builtin).

191

This field is the value index of the first result. Every value gets a number, which is what gets referred to in the operands list. For multiple results, we know the value number are consecutive, so we just need to know the first one.

203

Yeah. The idea I have right now is that the op encoding mask will indicate if the regions are inline or out-of-line. That way we just dispatch to two different code paths depending on the encoding.

209

Yeah, right now the mask uses 6 out of 8 possible bits. Whenever we do lazy loading that may bump up to 7 bits (we could get around that by encoding the "are the regions lazy" bit a different way). We could of course change from a mask to a set of values for each possible encoding type.

214

Yeah, the file locations encoding is gonna be interesting. I'm hoping that when builtin attributes have custom encodings it ends up being mostly okay (should just be a string index+two small varints for the line/col). This is something we will likely want to play around with, but thankfully it's easy to test (I have a huge IR file that inherits locations from the file).

jpienaar added inline comments.Aug 12 2022, 7:13 PM
mlir/include/mlir/Bytecode/BytecodeWriter.h
36

This feels a little bit weird ... I'd almost expect like OpPrintingFlags having the config be separate from Operation* being printed. But perhaps it makes more sense below.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
21

mlir-bytecode-reader ?

120

Nit: I'm more used to null-terminated, with NUL being the character name.

mlir/lib/Tools/mlir-opt/MlirOptMain.cpp
156

I think yes & no. piping these will be common and other uses seem like mistake, but I don't know how foolproof this check is on all platforms and opt tool is not a user tool.

mlir/test/Bytecode/general.mlir
2

We probably end up running all non-split or -error cases through a round trip tests to check, followed by fuzzing. It would almost seem possible to enumerate all of these kind of the constructs above.

mehdi_amini added inline comments.Aug 13 2022, 2:59 AM
mlir/docs/BytecodeFormat.md
102

Producer string?

120

Actually, following our offline discussion, they are needed to be able match to type/attr back to a dialect right?

133

Worth mentioning: the first section can't be decoded without the second one as the elements in the array of attrs/types don't include a size.

159

Maybe make it explicit: , this allows to associate an attribute back to a dialect without including a dialect reference in each type/attr entry.

168

Can't you capture this as a TODO at the end of this paragraph?

191

Right, but I'm still not sure why this needs to be encoded: if you load operations in order, you could just use an ever incrementing number for each Value.

203

I think we should think about isolated region from the get go: you don't document the value numbering in the doc but I think we can use the same "local scope" as the textual parser to ensure that the value IDs stays small (and so use less varint space).

221

Is this really the common case that (other than "unknown") the locations are repeating in sequence?

I am not convinced by this choice right now because it will make future lazy loading harder (you need to stream back to find the previously defined location).

252

When is "region empty" useful? Is it legal for an operation to have an empty region?

269

I don't get this, why multiple block_arguments blocks?

What about something like:

block {
  encoding: varint, // (numOps << 1) | (hasBlockArgs)
  arguments: block_arguments?, // Optional based on encoding
  ops : op[]
}
block_arguments {
  firstArgIndex: varint,
  numArgs: varint?,
  args: block_argument[]
}
block_argument {
  typeIndexAndHasLoc: varint, // (typeIndex << 1) | (hasLoc)
  location: varint?
}
mehdi_amini added inline comments.Aug 13 2022, 8:55 AM
mlir/docs/BytecodeFormat.md
120

Also right now it isn't used at all as far as I can tell.

262

Please document numValues :)

mlir/lib/Bytecode/Encoding.h
30

I'm not clear on how you manage "codes", or what are a "builtin section codes"

58

"a" top level operation...
We parse in a block so we should have the ability to have multiple I think.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
136

You should sanity check sectionID here I think (and make the argument type the right enum in the API)

149

Document noinline please.

157

assert(numBytes > 0 && numBytes <= 7); ?

173

Does this work on a big endian machine?

217

We should have a pointer to the BytecodeDialect here I think, should be able to set it up in initializeOffsets

249

We should be able to have an enum for code here right?

282

offsetReader? (I was confusing to me reading the code where it is used)

306

I think you should check that currentOffset does not exceed the sectionData.size(), a malformed byte code coud have offsets going beyond.

381

I remember that Attr/Type were made "mutable" to support LLVM named struct (IIRC?), but isn't this encoding and loading scheme assuming there are no cycles? How are we gonna handle this?

566

This should move to parseSection I think. (sectionID` is used in test/set already, seems unsafe)

593

Why aren't dialects lazy loaded?

616

I was thinking: could we have a stringpool top-level section and everywhere refer to strings with an id there? Mnemonic shared between op/attributes/types and across dialects would be stored once and for-all.

749

Please break the recursion :)

755

Not sure where to attached this comment, but there is something missing somewhere (unless I missed it?) to ensure that use-lists ordering is preserved.

mlir/lib/Bytecode/Writer/IRNumbering.cpp
61

We could record the number of times an attribute is used in ordre to sort them so that the most used one have a lower IDs (and have more chances to fit in one bytes) :)

mlir/lib/Tools/mlir-opt/MlirOptMain.cpp
156

I think the difference with LLVM opt is that we're not having byte code as the default, hence we may not need to warn since the user has to opt-in to get there.

mlir/test/Bytecode/general.mlir
2

Reminds me an old proposal of mine to add some flag to mlir-opt to automatically round-trip and diff, and enable this flag optionally to process the entire test-suite :)

Seems like it would be useful here as well!

jpienaar added inline comments.Aug 13 2022, 9:18 AM
mlir/test/Bytecode/general.mlir
2

Yes indeed, I was actually wondering if we had that already :)

mehdi_amini added inline comments.Aug 13 2022, 9:24 AM
mlir/test/Bytecode/general.mlir
2
rriddle updated this revision to Diff 452942.Aug 16 2022, 4:04 AM
rriddle marked 33 inline comments as done.
rriddle added inline comments.Aug 16 2022, 4:04 AM
mlir/docs/BytecodeFormat.md
191

Originally I was thinking that you can't do that if you load lazily, but if I use your other suggestion of doing per-isolated region numbering it should be possible. This will be much nicer, thanks!

203

Great suggestion!

221

I don't think it would make lazy-loading bad, given that we could encode the last location at the start of the region. I'm going to drop this behavior though, given that I'm not sure how many cases in practice it would help. This would also free up this optional behavior for something else (if that something else was more efficient for real-world use cases).

252

Yeah? e.g. External functions have empty regions. Almost every operation that has a region that can optionally be filled use an empty region, given that regions can't be dynamically added after op construction. Cleaned up this section though, we don't need the leading code, we can just use the block count.

269

Nice! I figured you would come up with something better ;)

mlir/include/mlir/Bytecode/BytecodeWriter.h
36

There was some reason why I did this before, but I forget now. I just dropped it.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
149

I thought I did:

This method is marked noinline to avoid pessimizing the common case of single byte encoding.

Tried to make it more obvious.

173

Right now, no. I need to setup the proper little -> native conversions. I'm deferring that to a follow up because I have to setup a virtual machine for big endian (which is annoying/time consuming). I also need to fix some things in the textual format related to big endian as well.

381

We will likely need some form of special API that can parse just the "immutable" part (i.e. the "name" in the LLVM struct case). For example, if an attribute/type *is* recursive, we could encode both its immutable and mutable encodings in one entry (with some header that has the size of the immutable part or something). Something like:

RecursiveEntry {
  immutableEncodingSize: varint,
  immutableEncoding: ...,
  mutableEncoding: ...
}

During processing we could first process the immutable entry, and then immediately process the mutable one. That way any recursive references would resolve properly, and then we'd fix the final reference afterwards. Something like:

// Parse the immutable first, so that we have something to give recursive references.
if (!(result = parseImmutable()))
  return failure();
// Parse the mutable afterwards. Pass in `result` so that it can populate the mutable bits?
if (failed(parseMutable(result))
  return failure();

Until we figure any of this out though, I'm just going to have them always use the string fallback for those attributes and types.

755

Deferring this to a follow up to help simplify this patch, added a TODO for now.

mlir/lib/Bytecode/Writer/IRNumbering.cpp
61

We would need to encode things differently in that case, i.e. if the attributes are not in order of dialect, they would each need to have an associated dialect id encoded with them. In the case of lots of attributes/types, that would be significant. Maybe we could come up a hybrid model? i.e. encode the most common 128 attributes/types, so that they fit in one byte (or two), and then encode the rest using dialect grouping.

mlir/lib/Tools/mlir-opt/MlirOptMain.cpp
156

Makes sense to me, just dropped it. We can add a warning back in if enough people trip up on this (given bytecode generation is an explicit decision).

jpienaar added inline comments.Aug 16 2022, 7:55 AM
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
616

So encoding would be start and end offsets into a string table?

mehdi_amini added inline comments.Aug 16 2022, 3:49 PM
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
616

String being null terminated, you don't necessarily need the end offsets. But if we have an offset section separate from the string table: we just need to point to an entry number, same mechanism as attr/type reference.

(partial scan before meeting)

mlir/docs/BytecodeFormat.md
12

I just noticed ï and not i , I mean I guess writing this by hand and the other was that we don't actually take text file as bytecode (MĻîŘ just to bikeshed :))

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
10

Nit: I'd move this lower, I expect documentation here not todos :)

227

Dialect of this Attribute or Type ? (its mostly parent that makes me think OOP more than I normally think here, up to you)

309

attribute ?

mehdi_amini added inline comments.Aug 16 2022, 4:06 PM
mlir/docs/BytecodeFormat.md
12

I don't think your bike shed is ASCII though?

rriddle updated this revision to Diff 453166.Aug 16 2022, 4:41 PM
rriddle marked 5 inline comments as done.Aug 16 2022, 4:44 PM
rriddle updated this revision to Diff 453172.Aug 16 2022, 5:09 PM
rriddle marked an inline comment as done.Aug 16 2022, 5:10 PM
jpienaar accepted this revision.Aug 16 2022, 9:08 PM

Missing negative tests are a bit unfortunate, but good to hear they are coming soon and this seems like good starting point.

mlir/docs/BytecodeFormat.md
12

Yeah I marked the comment as done before sending as I wasn't serious (and I did also use an extended ASCII encoding without realizing).

But in seriousness: MLÏR wouldn't be a valid dialect name, so this looks good.

mlir/lib/Bytecode/Encoding.h
26

uint8_t here too? (I mean with 0 it doesn't matter)

80

We can now use binary literals (not sure it makes if it is more readable, but was reminded of it)

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
243

Unsigned needed?

414

Comment?

502

So this parses the string starting at front of reader? And null-terminated?

616

Indeed, null-termination means we can't have substrings referenced (not sure if that is common here, could think for error strings, but unsure about decoding cost).

mlir/lib/Bytecode/Writer/IRNumbering.cpp
61

Would sorting attributes per frequency per dialect? (Keep dialect attributes still together but just sort dialects per frequency). We could measure all three of course, doesn't require version bump ;-)

This revision is now accepted and ready to land.Aug 16 2022, 9:08 PM
rriddle updated this revision to Diff 453383.Aug 17 2022, 11:46 AM
rriddle marked 12 inline comments as done.
rriddle added inline comments.Aug 17 2022, 11:47 AM
mlir/lib/Bytecode/Encoding.h
26

kVersion is encoded as a varint now, so that we don't have to change it if we have some burst of changing versions (makes it easier to change version if we don't have a cap looming overhead). I suppose I could switch the general constants to use inline constexpr variables now (given we are on C++17), let me know your preference.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
502

Yeah, the front of the reader has an index to a string defined in the string section. Updated the comment.

mlir/test/Bytecode/general.mlir
2

Do you plan on reviving that @mehdi_amini ?

rriddle updated this revision to Diff 453397.Aug 17 2022, 12:32 PM
mehdi_amini added inline comments.Aug 17 2022, 2:53 PM
mlir/test/Bytecode/general.mlir
2

Yeah I should. I had memory that we couldn't reach consensus on it but I may be wrong.

mehdi_amini accepted this revision.Aug 17 2022, 3:04 PM

LG, great start :)

jpienaar added inline comments.Aug 17 2022, 3:05 PM
mlir/test/Bytecode/general.mlir
2

I think only question was on on by default or not (e.g., how much of testing tool mlir-opt really is, should it be used in directed testing only etc)

rriddle updated this revision to Diff 453794.Aug 18 2022, 2:40 PM
rriddle marked 2 inline comments as done.
jpienaar accepted this revision.Aug 18 2022, 2:51 PM

Nice, like the encoding change.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
57

Why this change?

mlir/lib/Bytecode/Writer/IRNumbering.cpp
80

Could this just be a static function here?

rriddle marked 2 inline comments as done.Aug 18 2022, 7:23 PM
rriddle added inline comments.
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
57

So that we can load in enums.

rriddle updated this revision to Diff 453864.Aug 18 2022, 7:23 PM
rriddle marked an inline comment as done.
vitalybuka added inline comments.Aug 22 2022, 9:53 AM
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
902

also a problem, it emplace_back may relocate container, but the for loop above uses readState which is the ref to the element of container.

926

This pop_back and then readState.isIsolatedFromAbove which from the regionStack?

rriddle marked 2 inline comments as done.Aug 22 2022, 9:58 AM
rriddle added inline comments.
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
902

This should be fine, given that we always return in this case (i.e. never touch to invalid reference again).

926

Thanks for catching this. I'm not sure why my local asan build didn't catch this (I'll try nuking and resetting it).

vitalybuka added inline comments.Aug 22 2022, 10:03 AM
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
902

Thanks, I see.

926

I'm not sure why my local asan build didn't catch this

Probably you don't use libc++ or instrumented libc++?

I'm not sure why my local asan build didn't catch this I'm not sure why my local asan build didn't catch this

If you can fix it quickly go for it.
If not, please let me know, I have a patch to revert it with related fixes.

If you need to unbreak a sanitizer bot, we can XFAIL: asan the two tests, this is pretty cheap.

If you need to unbreak a sanitizer bot, we can XFAIL: asan the two tests, this is pretty cheap.

You are not scare of out of bound mem access in alive code?

If you need to unbreak a sanitizer bot, we can XFAIL: asan the two tests, this is pretty cheap.

You are not scare of out of bound mem access in alive code?

Define "alive"? This is a new feature that has zero users and we're actively bootstrapping. So no I'm not scared by an out-of-bound here for a couple of days at most.

vitalybuka added a comment.EditedAug 22 2022, 11:15 AM

Looks like it's already fixed with 96fd3f2

rriddle marked 2 inline comments as done.Aug 22 2022, 11:19 AM

If you need to unbreak a sanitizer bot, we can XFAIL: asan the two tests, this is pretty cheap.

You are not scare of out of bound mem access in alive code?

Define "alive"? This is a new feature that has zero users and we're actively bootstrapping. So no I'm not scared by an out-of-bound here for a couple of days at most.

Sure, I don't know that code. So XFAIL is OK to me if you accept implications. (having that @rriddle failed to reproduce locally, maybe UNSUPPORTED instead, in case if some asan setups will miss the issues)
But seems revert/reland safe and easy to do as well.

Also if this is the only issue

Trivial fix may work?

bool isIsolatedFromAbove = readState.isIsolatedFromAbove;
 regionStack.pop_back();
 if (isIsolatedFromAbove)
   valueScopes.pop_back();

Yeah, sorry I've been in meetings. I pushed https://github.com/llvm/llvm-project/commit/96fd3f2d5be21ded6ffed0ac75195df04ec679df an hour ago and have been watching the bot to see if that is the only issue.

Hi @rriddle , as of commit https://github.com/llvm/llvm-project/commit/93cf0e8a28e8c682f65d3e5c394d1eb169ca09ce the s390x build bot is still red due to "unexpected success":

XPASS: MLIR::invalid-string_section.mlir
XPASS: MLIR::invalid_attr_type_offset_section.mlir
XPASS: MLIR::invalid_attr_type_section.mlir
XPASS: MLIR::invalid-structure.mlir
XPASS: MLIR::invalid-ir_section.mlir
XPASS: MLIR::invalid-dialect_section.mlir

(see https://lab.llvm.org/buildbot/#/builders/199/builds/8674)

Can these "invalid" tests still legitimately pass even on a big-endian platform? It seems these XFAILs should either be removed or changed into UNSUPPORTED.

Hi @rriddle , as of commit https://github.com/llvm/llvm-project/commit/93cf0e8a28e8c682f65d3e5c394d1eb169ca09ce the s390x build bot is still red due to "unexpected success":

XPASS: MLIR::invalid-string_section.mlir
XPASS: MLIR::invalid_attr_type_offset_section.mlir
XPASS: MLIR::invalid_attr_type_section.mlir
XPASS: MLIR::invalid-structure.mlir
XPASS: MLIR::invalid-ir_section.mlir
XPASS: MLIR::invalid-dialect_section.mlir

(see https://lab.llvm.org/buildbot/#/builders/199/builds/8674)

Can these "invalid" tests still legitimately pass even on a big-endian platform? It seems these XFAILs should either be removed or changed into UNSUPPORTED.

@uweigand Thanks for the ping, it's possible big-endian is fine up to the point at which some of the tests fail (I haven't had time to setup a venv to test everything out). UNSUPPORTED is likely a better check than XFAIL there (I just copied from our other s390x broken test)

@uweigand Thanks for the ping, it's possible big-endian is fine up to the point at which some of the tests fail (I haven't had time to setup a venv to test everything out). UNSUPPORTED is likely a better check than XFAIL there (I just copied from our other s390x broken test)

As of commit df4e637ca7ef4ef17b662845120864921e65bb67 the build bot is green again on s390x. Thanks!

RVP added a subscriber: RVP.Sep 29 2022, 8:29 AM
RVP added inline comments.
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
186

Is this parenthesized correctly?

jpienaar added inline comments.Sep 29 2022, 8:34 AM
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
186

This is checking if the value post shift is 0 (and relies on this function being called only when multi byte), what issue did you run into with this?

RVP added inline comments.Sep 29 2022, 8:38 AM
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
186

Shouldn't LLVM_LIKELY be around the whole condition instead of the shift expression? Isn't == 0 the likely case and not the shift result being non-zero?

RVP added inline comments.Sep 29 2022, 8:53 AM
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
186

I didn't see any issues. Was looking at the code and this question popped. I now saw that emitVarInt specially handles the common case (... >> 7) == 0. Maybe a comment here as well would have avoided the question. Thanks.