This is an archive of the discontinued LLVM Phabricator instance.

Differential D143647

Extension of "Implement IR versioning through post-parsing upgrade through OpAsmDialectInterface"
ClosedPublic

Authored by mfrancio on Feb 9 2023, 6:58 AM.


Details

Reviewers

rriddle
mehdi_amini
rengolin
nicolasvasilache
myhsu
jpienaar

Commits

rG0e0b6070fd2a: Implements MLIR Bytecode versioning capability

Summary

[mlir] Implements IR versioning capability

A dialect can opt in to versioning through the BytecodeDialectInterface. A few hooks are exposed to the dialect to manage a version encoded into the bytecode file. The version is loaded lazily, so the version information can be retrieved while parsing the input IR, and each dialect for which a version is present gets an opportunity to perform IR upgrades post-parsing through the upgradeFromVersion method. Custom Attribute and Type encodings can also be upgraded according to the dialect version using the readAttribute and readType methods.

There is no restriction on what kind of information a dialect is allowed to encode to model its versioning. Currently, versioning is supported only for bytecode formats.
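As a rough illustration of the post-parsing upgrade idea in the summary, here is a minimal standalone sketch. It does not use the MLIR API: `DialectVersion`, `ParsedOp`, `kCurrentVersion`, the op names, and the signature of this `upgradeFromVersion` are all hypothetical stand-ins chosen for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical version payload; the patch lets each dialect pick its own.
struct DialectVersion {
  uint32_t majorVersion;
  uint32_t minorVersion;
};

// True if `a` is strictly older than `b`.
inline bool isOlder(const DialectVersion &a, const DialectVersion &b) {
  return a.majorVersion != b.majorVersion ? a.majorVersion < b.majorVersion
                                          : a.minorVersion < b.minorVersion;
}

// Hypothetical stand-in for an operation held in memory after parsing.
struct ParsedOp {
  std::string name;
};

// Version implemented by the reading dialect (assumed value).
const DialectVersion kCurrentVersion = {2, 0};

// Post-parsing upgrade hook. `producer` is null when the file carried no
// version for this dialect. Returns false when the file is too new to read.
bool upgradeFromVersion(ParsedOp &op, const DialectVersion *producer) {
  if (!producer)
    return true; // Unversioned input: nothing to upgrade.
  if (isOlder(kCurrentVersion, *producer))
    return false; // Produced by a newer dialect than this reader.
  // Hypothetical rename that took effect in version 2.0.
  const DialectVersion renameVersion = {2, 0};
  if (isOlder(*producer, renameVersion) && op.name == "demo.old_op")
    op.name = "demo.new_op";
  return true;
}
```

The sketch keeps the version check and the rewrite in one hook so that, after loading, only the current form of the IR is visible to the rest of the program.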

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

mfrancio created this revision.Feb 9 2023, 6:58 AM

Herald added a project: Restricted Project. · View Herald TranscriptFeb 9 2023, 6:58 AM

Herald added subscribers: Moerafaat, zero9178, bzcheeseman and 19 others. · View Herald Transcript

mfrancio requested review of this revision.Feb 9 2023, 6:58 AM

Herald added a reviewer: nicolasvasilache. · View Herald TranscriptFeb 9 2023, 6:58 AM

Herald added a project: Restricted Project. · View Herald Transcript

Herald added subscribers: stephenneuendorffer, nicolasvasilache. · View Herald Transcript

Harbormaster completed remote builds in B212799: Diff 496112.Feb 9 2023, 7:55 AM

My Phab handle is different from my Discourse account

jpienaar added a subscriber: jpienaar.Feb 9 2023, 9:30 AM

jpienaar added inline comments.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1466	I'd prefer the upgrade of the in-memory structure not to be inside the reader. We already have a way to parse without verification; this upgrade is of the in-memory structure, which can be done separately. In here I'd prefer only upgrades related to parsing/before it gets to memory. This could be done at the top-level entry point though, but outside of the parsing guts feels better.
mlir/lib/IR/AsmPrinter.cpp
3091 ↗	(On Diff #496112)	When this was discussed we talked about needing to have builtin attr version be treated specially (else one can't parse its version to know how to parse integerattr even).
mlir/test/lib/Dialect/Test/TestDialect.cpp
176	Note: error messages should follow LLVM convention and be a sentence fragment (start lower case, no trailing punctuation)
1675	I may have missed where this is used.

I believe some tests are missing like those related to bytecode. Also could you attach diff with more context as instructed here.

mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
504	format: add braces

Revised diff according to the comments received:

Adds a new built-in "VersionAttr";
Decouples VersionAttr from other built-in Attributes, so that they can grow independently;
Leaves complete freedom to each dialect on how to manage versioning and how to encode its version into the VersionAttr;
Exposes a couple of hooks to optionally print/parse the dialect version as custom string.

Herald added a subscriber: jdoerfert. · View Herald TranscriptFeb 17 2023, 9:53 AM

mfrancio marked 4 inline comments as done.Feb 17 2023, 9:59 AM

mfrancio added inline comments.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1466	I considered this, but I found the need to carry the version at which the IR was parsed over into the top-level entry point a little confusing. I actually like that the version stays within the parsing, so that only the current version of the dialect exists at the entry-point level.
mlir/lib/IR/AsmPrinter.cpp
3091 ↗	(On Diff #496112)	I added a new built in attribute called VersionAttr.
mlir/test/lib/Dialect/Test/TestDialect.cpp
1675	Yes, indeed I forgot to upload the corresponding mlir file.

mfrancio edited the summary of this revision. (Show Details)Feb 17 2023, 10:08 AM

mfrancio marked 2 inline comments as done.Feb 17 2023, 10:41 AM

I don't understand why we need a builtin VersionAttr at all?

In D143647#4135740, @mehdi_amini wrote:

I don't understand why we need a builtin VersionAttr at all?

Nevermind, this is just lifetime management, doesn't seem unreasonable.

Harbormaster completed remote builds in B214446: Diff 498417.Feb 17 2023, 10:54 AM

Haven't had time to dig in here, but adding an attribute doesn't feel right for version. Why is this necessary? As opposed to being something specific to the assembly format?

This revision now requires changes to proceed.Feb 17 2023, 11:04 AM

In D143647#4135778, @rriddle wrote:

Haven't had time to dig in here, but adding an attribute doesn't feel right for version. Why is this necessary? As opposed to being something specific to the assembly format?

Quoting @jpienaar: "When this was discussed we talked about needing to have builtin attr version be treated specially (else one can't parse its version to know how to parse integerattr even)."

One of the comments I received in the first version was indeed to use a new builtin attr that could be treated specially and allow decoupling with the existing attributes. Also, we wish to encode anything on the version and leave freedom to each dialect to do whatever (essentially writing on/retrieving the buffer). It indeed felt natural to use a new attribute that handles that bag of bytes. I would be happy to revise as necessary if there is a better solution to do this.

In D143647#4135778, @rriddle wrote:

Haven't had time to dig in here, but adding an attribute doesn't feel right for version. Why is this necessary? As opposed to being something specific to the assembly format?

I was thinking about how to manage the lifetime, but I think the attribute does not change much actually: a handle still needs to be made available whether it is an attribute or not.
So we could design it with a data structure stored on the parser itself and so made available to the dialects during the parsing.
It wouldn't survive the parsing phase though: post-parsing you lose the information about the version producer.

In D143647#4135978, @mehdi_amini wrote:

In D143647#4135778, @rriddle wrote:

Haven't had time to dig in here, but adding an attribute doesn't feel right for version. Why is this necessary? As opposed to being something specific to the assembly format?

I was thinking about how to manage the lifetime, but I think the attribute does not change much actually: a handle still needs to be made available whether it is an attribute or not.
So we could design it with a data structure stored on the parser itself and so made available to the dialects during the parsing.
It wouldn't survive the parsing phase though: post-parsing you lose the information about the version producer.

In the current implementation you don't really need a lifetime that exceeds the parsing. However, this may change in the future, and having a reserved attribute for doing this may come in handy. Are there any other drawbacks I do not see to adding a VersionAttr? If there are, I would be more than happy to revise further with the idea of adding a parser data structure. Any additional feedback would be greatly appreciated.

In the hope of reaching consensus, I am uploading a revised diff that removes the use of attributes entirely for Version and introduces a new dedicated AsmDialectVersionHandle to manage the lifetime of the buffer.

To summarize, the proposed approach:

Introduces a new AsmDialectVersionHandle to manage the lifetime of buffer representing dialect attribute info;
Decouples versioning of a dialect from the rest of the MLIR infrastructure, so each can be used and grow independently;
Leaves complete freedom to each dialect on how to manage versioning and how to encode its version into the Version Handle;
Exposes a couple of hooks to optionally print/parse the dialect version as custom string.

I hope you will find this interesting and compelling for the project.

Harbormaster completed remote builds in B215056: Diff 499217.Feb 21 2023, 11:47 AM

In D143647#4135778, @rriddle wrote:

Haven't had time to dig in here, but adding an attribute doesn't feel right for version. Why is this necessary? As opposed to being something specific to the assembly format?

Have we successfully addressed the concern here and is this ready to land?

I just skimmed through, I'd need some time to review this but I'm travelling right now and not sure if I'll get to it before the week end.

My intuition right now would be to implement it only for the Bytecode for now. The story there is more comprehensive than for the textual format, where we only offer the post-parsing upgrade and no control during parsing (and I'm not convinced we should encourage this).

Something that would be nice also is an example of how to use the version while parsing a type or an attribute to support an upgraded format (for example a new field that was added post version 1.40 or something like that).

mlir/include/mlir/IR/OpImplementation.h
470 ↗	(On Diff #499217)	Can you increase the amount of doc here: add context about what purpose it serves and some info on the context where it is used.
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1520	Can you spell the type here?
1525	Should `byteCodedialect.dialect` be available here?
1529	Isn't dyn_cast working for dialect interfaces?
1786	Is going through the string name for the dialect the best way to resolve this? (I would think we have a dialect ID directly available? And using an integer makes everything else more straightforward)
mlir/test/lib/Dialect/Test/TestDialect.cpp
62	We should use the bytecode encoding here, for portability purposes at a minimum. That is, encode the two ints as varints (and decode them when loading).
mlir/test/lib/Dialect/Test/TestOps.td
3171	Leftover?

In addition to the stuff mentioned, I'd also love to see top-level docs detailing versioning, how it's structured, and how to hook in.

mlir/test/lib/Dialect/Test/TestOps.td
3170–3171	Dead code?

In D143647#4144867, @rriddle wrote:

In addition to the stuff mentioned, I'd also love to see top-level docs detailing versioning, how it's structured, and how to hook in.

Thanks for your feedback. I will also add some tests related to the bytecode encoding itself, similar to what is already done for the other sections -- I've been holding off on doing it while trying to get the bulk of the code reviewed.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1520	definitely.
1525	Yes, it should, but it looks like you would have to handle a bunch of cases in the general case. From BytecodeDialect.h:
    /// The loaded dialect entry. This field is std::nullopt if we haven't
    /// attempted to load, nullptr if we failed to load, otherwise the loaded
    /// dialect.
    std::optional<Dialect *> dialect;
I find getting the dialect from the context directly to be generally safer here.
1529	Yes, it does work - are there any issues in using this API though? I'll change it anyway, since we could dyn_cast_or_null and remove the check for nullptr on the dialect.
1786	It is true that the bytecode holds an integer which references the string section, but I don't see an existing API to reference the string by idx. I don't really see the "non-straightforward" part anyway - we parse a string with a clean API, and we use it as a hash to map to the version handle. Am I missing something?
mlir/test/lib/Dialect/Test/TestOps.td
3170–3171	yep. thanks for pointing it out!

mfrancio added inline comments.Feb 23 2023, 6:05 PM

mlir/include/mlir/IR/OpImplementation.h
470 ↗	(On Diff #499217)	Definitely.
mlir/test/lib/Dialect/Test/TestDialect.cpp
62	This is a very good point that I overlooked. It looks like we have two separate problems here: one is portability, and the other is applying some sort of compression (not really critical in my view, but nice to have). For the purpose of the example, the first problem could be solved simply by using the helpers exposed by LLVM under llvm/Support/Endian.h. For example, we could write/read the integers representing the version using `inline void write16le(void *P, uint16_t V) { write16<little>(P, V); }` and `inline uint16_t read16le(const void *P) { return read16<little>(P); }`. For the second, using varInt is definitely a great idea. It would be great to reuse the same bytecode emitters and readers, but it looks like they are not really exposed outside the bytecode cpp files. What we could do is expose the varInt portion as helpers under mlir/Support. I am open to doing it, but since this is just an example, is it really worth it? Looking forward to hearing your thoughts.
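The portability concern can be made concrete with a small standalone sketch. It deliberately avoids the LLVM helpers quoted above and reimplements fixed little-endian byte order by hand; `write16le`, `read16le`, and `encodeVersion` here are local, hypothetical functions that operate on a `std::vector<uint8_t>`, not the `llvm/Support/Endian.h` API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append `v` in little-endian order, independent of host endianness.
void write16le(std::vector<uint8_t> &buf, uint16_t v) {
  buf.push_back(static_cast<uint8_t>(v & 0xFF)); // low byte first
  buf.push_back(static_cast<uint8_t>(v >> 8));   // then high byte
}

// Read a little-endian 16-bit value from two bytes at `p`.
uint16_t read16le(const uint8_t *p) {
  return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

// Hypothetical version blob: major followed by minor, 4 bytes total.
std::vector<uint8_t> encodeVersion(uint16_t majorV, uint16_t minorV) {
  std::vector<uint8_t> blob;
  write16le(blob, majorV);
  write16le(blob, minorV);
  return blob;
}
```

Because the byte order is fixed by the shifts rather than by a memory copy, a blob written on one host decodes to the same values on any other.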

Updates with respect to the previous diff:

Removes versioning capabilities from the textual format
Adds new tests specific to the bytecode format
Adds some documentation and addresses some comments

To summarize, the proposed approach:

Introduces a new AsmDialectVersionHandle to manage the lifetime of buffer representing dialect attribute info;
Decouples versioning of a dialect from the rest of the MLIR infrastructure, so each can be used and grow independently;
Leaves complete freedom to each dialect on how to manage versioning and how to encode its version into the Version Handle.

Looking forward to your feedback.

Herald added a subscriber: dmgreen. · View Herald TranscriptFeb 23 2023, 9:36 PM

Going in the right direction, thanks for the update!

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	Efficiency: string manipulation isn't free. That said, it is pretty bounded here: we should have at most one version per dialect... But stepping back: why aren't we emitting the version in the dialect section? We could emit a varint for the version blob size; if it is zero, that means there is no version attached to the dialect. That seems like it could fit right before the op names.
mlir/test/lib/Dialect/Test/TestDialect.cpp
62	The bytecode primitives are exposed in the public header `mlir/include/mlir/Bytecode/BytecodeImplementation.h`. Have a look at the dialect interface for manipulating types and attributes: `virtual Attribute readAttribute(DialectBytecodeReader &reader) const {`. We should model the API here similarly: for a dialect, writing a custom version blob should be no different than writing an attribute.
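The "size prefix, zero means no version" idea from this comment can be sketched as follows. This is a standalone illustration only: it uses a generic LEB128-style varint rather than MLIR's own varint encoding, and `writeVarInt`, `readVarInt`, `writeVersionBlob`, and `readVersionBlob` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// LEB128-style varint: 7 payload bits per byte, high bit set on all but the
// last byte. (A stand-in, not the encoding used by MLIR bytecode.)
void writeVarInt(std::vector<uint8_t> &out, uint64_t v) {
  while (v >= 0x80) {
    out.push_back(static_cast<uint8_t>((v & 0x7F) | 0x80));
    v >>= 7;
  }
  out.push_back(static_cast<uint8_t>(v));
}

bool readVarInt(const std::vector<uint8_t> &in, size_t &pos, uint64_t &v) {
  v = 0;
  for (unsigned shift = 0; shift < 64; shift += 7) {
    if (pos >= in.size())
      return false; // Truncated input.
    uint8_t b = in[pos++];
    v |= static_cast<uint64_t>(b & 0x7F) << shift;
    if (!(b & 0x80))
      return true;
  }
  return false; // Too many continuation bytes.
}

// Emit the blob size, then the blob. An empty blob emits just a zero size.
void writeVersionBlob(std::vector<uint8_t> &out,
                      const std::vector<uint8_t> &blob) {
  writeVarInt(out, blob.size());
  out.insert(out.end(), blob.begin(), blob.end());
}

// Read a size-prefixed blob; `blob` is left empty when no version is present.
bool readVersionBlob(const std::vector<uint8_t> &in, size_t &pos,
                     std::vector<uint8_t> &blob) {
  uint64_t size = 0;
  if (!readVarInt(in, pos, size) || size > in.size() - pos)
    return false;
  const std::ptrdiff_t first = static_cast<std::ptrdiff_t>(pos);
  const std::ptrdiff_t last = first + static_cast<std::ptrdiff_t>(size);
  blob.assign(in.begin() + first, in.begin() + last);
  pos += static_cast<size_t>(size);
  return true;
}
```

With this layout an unversioned dialect costs a single zero byte, and the reader can hold the raw bytes without needing the dialect loaded.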

Harbormaster completed remote builds in B215674: Diff 500076.Feb 23 2023, 10:51 PM

mfrancio added inline comments.Feb 24 2023, 8:15 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	The reason why I didn't do it is that it would break existing bytecodes and would require increasing the bytecode version (I am talking about mlir::bytecode::kVersion). I am open to this, but I don't really see the immediate need. It could always be done as part of a major update of the bytecode version itself.
mlir/test/lib/Dialect/Test/TestDialect.cpp
62	Sounds good, I'll take a look.

mehdi_amini added inline comments.Feb 24 2023, 8:30 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	In general I'm not in favor of taking a detour when we know where we want to land (I don't see a problem with upgrading the bytecode as a breaking change at this point). There are a couple of things I intend to break there as well soon-ish.

mfrancio added inline comments.Feb 24 2023, 8:42 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	Maybe this was already discussed in the past and I missed it, but isn't the bytecode version itself going to be backward compatible? Is there any interest in achieving this?

saksenadhruv added inline comments.Feb 24 2023, 9:41 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	Yes, we actually are hoping to ship a serialization format with versioning soon, and would like bytecode to have some compatibility, or at least a way to upgrade/downgrade when we break it in the next couple of months. What is the guidance on using bytecode for serialization and compatibility? We are using versioning on our dialect but we need some underlying guarantees on the bytecode itself as well.

mehdi_amini added inline comments.Feb 24 2023, 9:45 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	Yes, we want it to be stable. From my point of view I am aware of 3 features I want to get before I'm comfortable with trying to claim that we reached the "stability" point:
Dialect Versioning (thank you for driving this!);
Use-list order;
Lazy-loading ability.
(Some people may have other ideas; I'm not aware of any.) Then there is my work on "properties", but I suspect we can preserve backward compatibility on this (assuming the dialects themselves don't change, of course).

mfrancio planned changes to this revision.Mar 2 2023, 5:48 PM

mfrancio added inline comments.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	Back to: "We could emit a varint for the version blob size, if it is zero that means there is no version attached to the dialect. That seems like it could fit right before the op names." If we are in agreement on the current proposal (the dialect provides a version handle which holds a buffer to be written to file), we can definitely emit this blob of data into the dialect section as a breaking change. Can you kindly confirm before I move forward with the change?
mlir/test/lib/Dialect/Test/TestDialect.cpp
62	I considered this, but I don't really see a way to model the API for reading and writing a version into the dialect section through what is exposed in `BytecodeImplementation.h`. That dialect interface seem to have very specific objectives that are tied to writing and reading custom attributes and types into their respective sections - what we need is an interface that allows the dialect to write into a custom buffer. We could model the API through this interface, but it would become something pretty close to EncodingEmitter implemented in `mlir/lib/Bytecode/Writer/BytecodeWriter.cpp`, line 64. Wouldn't it be just more convenient to expose something like this under Support? It is true that writing a blob of data is no different than writing an attribute, but what changes here is the way this blob of data is created. For the attribute, its encoding is defined. But since we want to be independent from any existing attribute, and also completely defined by the user, I don't really see another convenient way of doing this other than exposing low level API to the user to write whatever encoding they need into their data blob that they wish to use to represent the version.

mehdi_amini added inline comments.Mar 5 2023, 12:30 PM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	Yes I think we should just do that now if we need to, that said in https://reviews.llvm.org/D145328 I did the change in a backward compatible way.
mlir/test/lib/Dialect/Test/TestDialect.cpp
62	Quoting: "I considered this, but I don't really see a way to model the API for reading and writing a version into the dialect section through what is exposed in BytecodeImplementation.h. That dialect interface seem to have very specific objectives that are tied to writing and reading custom attributes and types into their respective sections"
Right, sorry if I gave the impression that this interface was "ready to be used" as-is here; I meant to point to it as an example of an API that allows dialect authors to access bytecode manipulation primitives.
Quoting: "what we need is an interface that allows the dialect to write into a custom buffer. We could model the API through this interface, but it would become something pretty close to EncodingEmitter implemented in mlir/lib/Bytecode/Writer/BytecodeWriter.cpp, line 64. Wouldn't it be just more convenient to expose something like this under Support? It is true that writing a blob of data is no different than writing an attribute, but what changes here is the way this blob of data is created. For the attribute, its encoding is defined. But since we want to be independent from any existing attribute, and also completely defined by the user, I don't really see another convenient way of doing this other than exposing low level API to the user to write whatever encoding they need into their data blob that they wish to use to represent the version."
I started typing a long answer here, but felt like I was missing something, so I sketched something here instead: https://reviews.llvm.org/D145328 (there is still a bug, and I haven't regenerated the bytecode test file, but the interface is there!)

jpienaar added inline comments.Mar 5 2023, 9:02 PM

mlir/test/lib/Dialect/Test/TestDialect.cpp
62	I like the sketch.

mfrancio added inline comments.Mar 5 2023, 9:16 PM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	Yes! This is exactly what I envisioned when I started implementing the first draft, but I didn't post it as I didn't want to rely on a bytecode version change. Nice to see this. We could also opt to remove the version section explicitly and inline the read/write of size/bytes reusing the alignment of the parent dialect section (probably a bit more memory efficient), but this works.
mlir/test/lib/Dialect/Test/TestDialect.cpp
62	Thanks for the suggestion. This is very neat, I'll try to finalize it and regenerate the bytecode test files.

mehdi_amini added inline comments.Mar 6 2023, 1:24 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	The reason I used a section is that when we load the version section we haven't loaded the dialect yet so we don't have the interface.

mehdi_amini added inline comments.Mar 6 2023, 1:26 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	Something to add still is an attribute in the test dialect that is serialized at v0.1 and read / upgraded during parsing of v0.2. I suspect we're missing making the version available on the readAttribute API.

mehdi_amini mentioned this in D145328: [mlir] Implements IR versioning capability (WIP).Mar 6 2023, 1:30 AM

Updates diff incorporating changes from https://reviews.llvm.org/D145328

Includes an example of upgrading an attribute that was written at v1 and is read at v2 with a different encoding.
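A parse-time upgrade of this kind might look like the standalone sketch below. It is not the test-dialect code from the patch: `VersionedAttr`, its two fields, the word-based input, and the assumption that the second field was added in major version 2 are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical attribute. Assumed layouts: before major version 2 only
// `value` was stored; from version 2 on, `value` is followed by `scale`.
struct VersionedAttr {
  uint32_t value;
  uint32_t scale;
};

// Decode one attribute from `words` starting at `pos`, choosing the layout
// from the version of the producer. Returns false on truncated input.
bool readVersionedAttr(const std::vector<uint32_t> &words, size_t &pos,
                       uint32_t producerMajor, VersionedAttr &out) {
  const bool oldLayout = producerMajor < 2;
  const size_t needed = oldLayout ? 1 : 2;
  if (pos > words.size() || words.size() - pos < needed)
    return false;
  out.value = words[pos++];
  // Old files have no `scale`; fill in the default the new format implies.
  out.scale = oldLayout ? 1 : words[pos++];
  return true;
}
```

The caller always receives the current in-memory form, so code after the reader never has to know which encoding the file used.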

Harbormaster completed remote builds in B217768: Diff 502876.Mar 6 2023, 5:54 PM

mfrancio added inline comments.Mar 6 2023, 6:00 PM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	I did the change! It is tested only for attributes, but I can easily extend it to types as well!
1786	Quoting: "The reason I used a section is that when we load the version section we haven't loaded the dialect yet so we don't have the interface."
I still don't see the reason. I think the section could just be inlined. You don't need the interface to read it (we would just hold the buffer). The interface is needed later to resolve the buffer and decode it... Unless I am missing something subtle :)

mfrancio updated this revision to Diff 502915.Mar 6 2023, 9:33 PM

LGTM, but please wait for @rriddle to stamp it as well!

mlir/include/mlir/Bytecode/BytecodeImplementation.h
330	Can you add a simple doc?
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	Right, I guess I didn't find the method to do it! Do you see how to emit the content of the `versionEmitter` differently than using emitSection? We need to emit the size and then the content. `emitSection()` has this logic:
    // Push our current buffer and then merge the provided section body into
    // ours.
    appendResult(std::move(currentResult));
    for (std::vector<uint8_t> &result : emitter.prevResultStorage)
      prevResultStorage.push_back(std::move(result));
    llvm::append_range(prevResultList, emitter.prevResultList);
    prevResultSize += emitter.prevResultSize;
    appendResult(std::move(emitter.currentResult));
(knowing that the writeVersion interface can't do it because it needs to compute the size first before emitting the content)
mlir/test/lib/Dialect/Test/TestDialect.cpp
108	Would `else` be enough here?

Harbormaster completed remote builds in B217792: Diff 502915.Mar 6 2023, 11:16 PM

Looking very close, thanks!

mlir/docs/BytecodeFormat.md
167–170	Why use a separate section? I would have expected to have this just be part of the `op_name_group` (which should be renamed at this point to `dialect`). We can store a bit with `numOpNames` or `dialect` to indicate if a version is present, and then optionally read it.
mlir/include/mlir/Bytecode/BytecodeImplementation.h
326–328	Why is it necessary for dialect authors to write the size? I would expect this could be automatically handled (e.g. via back-patching)?
mlir/include/mlir/IR/OpImplementation.h
1466 ↗	(On Diff #502915)	This change feels unrelated, can you revert?

Nice! I'm fine with delaying textual form.

mlir/docs/BytecodeFormat.md
167–170	I don't think it's documented here, can we have multiple op_name_groups for same dialect? With the naming change it feels like it's saying there will be only one per dialect. +1 to bit and making this optionally specified if bit set. (I think section may be overloaded, I see this as proposed as just convenient way of grouping these two optional items together).
mlir/include/mlir/Bytecode/BytecodeImplementation.h
275	If we have a versioned dialect, it would seem we'd always have to use this method just in case (and if version is unspecified then there is just no upgrade).
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
421	Why does DialectReader need to be threaded through? I thought it was a rather cheap, stateless structure to create.

mfrancio planned changes to this revision.Mar 7 2023, 6:52 AM

mfrancio added inline comments.

mlir/docs/BytecodeFormat.md
167–170	Agreed, I'll plan for this change. You shouldn't really need a bit - just printing the buffer is enough. A size zero means that no version is available. From a quick look I don't think you can have multiple op_name_groups per dialect - those are indeed already grouped by dialect:
    // Parse the operation names, which are grouped by dialect.
    auto parseOpName = [&](BytecodeDialect *dialect) {
      StringRef opName;
      if (failed(stringReader.parseString(sectionReader, opName)))
        return failure();
      opNames.emplace_back(dialect, opName);
      return success();
    };
    while (!sectionReader.empty())
      if (failed(parseDialectGrouping(sectionReader, dialects, parseOpName)))
        return failure();
so changing the op_name_group to be dialect itself should be fine.
mlir/include/mlir/Bytecode/BytecodeImplementation.h
275	I feel it's anyway up to the dialect to decide what to do here. Having a fall-back seems convenient to me, but if it looks confusing or there is desire to push for the versioned implementation anyway we can emit an error similarly to the other reader.
326–328	This slipped - it is indeed not necessary. I'll update the comment.
mlir/include/mlir/IR/OpImplementation.h
1466 ↗	(On Diff #502915)	Sure!
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	Exactly, I was thinking of consolidating this into a new method of reader and avoid the existence of a new dialect version section. I'll try!
mlir/test/lib/Dialect/Test/TestDialect.cpp
108	I think the comment below is misleading - the intent was to forbid reading a newer than current version. I'll revise this.

mehdi_amini added inline comments.Mar 7 2023, 7:21 AM

mlir/docs/BytecodeFormat.md
167–170	We could use a bit on the numOpNames to gate the existence of the version:
    op_name_group {
      dialect: varint,
      numOpNamesAndIsVersionAvailable: varint, // (numOpNames << 1 | versionAvailable)
      version: dialect_version_section
      opNames: varint[]
    }
That way we don't write a section when there is no version there.
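The `(numOpNames << 1 | versionAvailable)` packing suggested here can be written out as a short standalone sketch. The helper names `packCountAndFlag` and `unpackCountAndFlag` are hypothetical and are not taken from the bytecode reader or writer; the sketch only shows the bit arithmetic, and assumes the count fits in 63 bits.

```cpp
#include <cassert>
#include <cstdint>

// Store a boolean in the lowest bit and the count in the remaining bits.
uint64_t packCountAndFlag(uint64_t numOpNames, bool versionAvailable) {
  return (numOpNames << 1) | (versionAvailable ? 1u : 0u);
}

// Recover both fields from the packed value.
void unpackCountAndFlag(uint64_t packed, uint64_t &numOpNames,
                        bool &versionAvailable) {
  versionAvailable = (packed & 1) != 0;
  numOpNames = packed >> 1;
}
```

Since the packed value is then written as a single varint, the flag costs no extra byte for small counts, and a reader only looks for the version payload when the bit is set.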

mehdi_amini added inline comments.Mar 7 2023, 7:30 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
421	It is cheap, but needs to be created from things unavailable in this class, so you'd need to thread through more of other things here!

I was wondering (post review) if we should split the reader and writer commit parts, so that we could give a bit of time for bytecode consumers to get updated first (thinking of projects that span multiple repos). I mean, it is unstable at the moment, but wouldn't cause any additional churn.

mlir/docs/BytecodeFormat.md
167–170	They are grouped by dialect, but it just emplaces inside a vector. The emplacing results in the "flat" ID, so one can have multiple instances of this where the flat id can be small for the most common operations independent of dialect. So we'd probably need to just verify that a version is only specified once for a dialect (we could allow multiple as long as they are the same, but that seems undesirable from a size point of view). And yes, what Mehdi suggested is what River also mentioned.
mlir/include/mlir/Bytecode/BytecodeImplementation.h
275	Yes, I'm less worried about the dialect than about checking expectations for the bytecode format parser (where it defers to the dialect inside the attribute/type parser, it's under full control of the dialect).
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
421	What else do you need to thread through? Dialect version?

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
421	Look at the call sites: DialectReader dialectReader(*this, stringReader, resourceReader, reader); if (failed(entry.dialect->load(dialectReader, fileLoc.getContext()))) return failure(); So `stringReader, resourceReader` are the extra I think? (also some call sites already have the dialectReader available)

Added a bit flag to detect if a dialect is versioned and trigger the read of the section.

mfrancio marked an inline comment as done.Mar 10 2023, 10:00 AM

mfrancio added inline comments.

mlir/docs/BytecodeFormat.md
167–170	Added the bit flag. It felt more natural doing it on the dialect name itself instead of going inside the dialect version grouping. Let me know if there are any concerns.
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1786	I considered this again, but the only thing that would eventually "save" is really to print the var int of the section, so it felt not strictly necessary now that we have the bit flag.

rriddle accepted this revision.Mar 10 2023, 10:03 AM

rriddle added inline comments.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1075	Can you drop the trivial braces here?
1413	I really thought we had a helper that read a varint and extracted a flag.

This revision is now accepted and ready to land.Mar 10 2023, 10:03 AM

mfrancio added inline comments.Mar 10 2023, 10:07 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1413	Oh indeed, I didn't see it. Thanks!

mfrancio updated this revision to Diff 504212.Mar 10 2023, 10:27 AM

mfrancio marked 2 inline comments as done.

mfrancio added inline comments.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1075	yep, thanks!

LGTM, thanks for being so patient through the reviews!

Let's wait for @rriddle to give a final approval.

In D143647#4185686, @mehdi_amini wrote:

LGTM, thanks for being so patient through the reviews!

Let's wait for @rriddle to give a final approval.

Actually River reviewed already (I started reviewing earlier, went out, and figured I hadn't hit "submit" before).

Do you have commit access or do you need help to land this?

In D143647#4185691, @mehdi_amini wrote:

Do you have commit access or do you need help to land this?

I don't have commit access, it's the first time I commit here! I was just waiting for the builds to complete before asking for help.

Thank you all for the feedback - it's been nice to collaborate!

Harbormaster completed remote builds in B218723: Diff 504212. Mar 10 2023, 1:20 PM

Right now you have test failures. On my Mac locally I see:

Failed Tests (2):
  MLIR :: Bytecode/versioning/versioned_attr.mlir
  MLIR :: Bytecode/versioning/versioned_op.mlir

This revision was landed with ongoing or failed builds. Mar 10 2023, 2:29 PM

Closed by commit rG0e0b6070fd2a: Implements MLIR Bytecode versioning capability (authored by mfrancio, committed by mehdi_amini).

This revision was automatically updated to reflect the committed changes.

mehdi_amini added a commit: rG0e0b6070fd2a: Implements MLIR Bytecode versioning capability.

eric-k256 added a subscriber: eric-k256. Apr 23 2023, 9:56 PM

Herald added a subscriber: bviyer. Apr 23 2023, 9:56 PM

Revision Contents

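Path	Size
mlir/docs/BytecodeFormat.md	22 lines
mlir/docs/LangRef.md	15 lines
mlir/include/mlir/Bytecode/BytecodeImplementation.h	58 lines
mlir/lib/Bytecode/Encoding.h	10 lines
mlir/lib/Bytecode/Reader/BytecodeReader.cpp	160 lines
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp	249 lines
mlir/test/Bytecode/invalid/invalid-structure.mlir	2 lines
mlir/test/Bytecode/versioning/versioned-attr-1.12.mlirbc
mlir/test/Bytecode/versioning/versioned-attr-2.0.mlirbc
mlir/test/Bytecode/versioning/versioned-op-1.12.mlirbc
mlir/test/Bytecode/versioning/versioned-op-2.0.mlirbc
mlir/test/Bytecode/versioning/versioned-op-2.2.mlirbc
mlir/test/Bytecode/versioning/versioned_attr.mlir	29 lines
mlir/test/Bytecode/versioning/versioned_op.mlir	41 lines
mlir/test/lib/Dialect/Test/TestDialect.cpp	121 lines
mlir/test/lib/Dialect/Test/TestOps.td	39 lines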

Diff 504284

mlir/docs/BytecodeFormat.md

	# MLIR Bytecode Format			# MLIR Bytecode Format

	This documents describes the MLIR bytecode format and its encoding.			This documents describes the MLIR bytecode format and its encoding.

	[TOC]			[TOC]

	## Magic Number			## Magic Number

	MLIR uses the following four-byte magic number to indicate bytecode files:			MLIR uses the following four-byte magic number to
				indicate bytecode files:

	'\[‘M’<sub>8</sub>, ‘L’<sub>8</sub>, ‘ï’<sub>8</sub>, ‘R’<sub>8</sub>\]'			'\[‘M’<sub>8</sub>, ‘L’<sub>8</sub>, ‘ï’<sub>8</sub>, ‘R’<sub>8</sub>\]'

	In hex:			In hex:

	'\[‘4D’<sub>8</sub>, ‘4C’<sub>8</sub>, ‘EF’<sub>8</sub>, ‘52’<sub>8</sub>\]'			'\[‘4D’<sub>8</sub>, ‘4C’<sub>8</sub>, ‘EF’<sub>8</sub>, ‘52’<sub>8</sub>\]'

	## Format Overview			## Format Overview
	[... 134 lines hidden ...]
	```			```
	dialect_section {			dialect_section {
	numDialects: varint,			numDialects: varint,
	dialectNames: varint[],			dialectNames: varint[],
	opNames: op_name_group[]			opNames: op_name_group[]
	}			}

	op_name_group {			op_name_group {
	dialect: varint,			dialect: varint // (dialectID << 1) | (hasVersion),
				version : dialect_version_section
	numOpNames: varint,			numOpNames: varint,
	opNames: varint[]			opNames: varint[]
	}			}

				dialect_version_section {
				size: varint,
				version: byte[]
				}
				rriddle: Why use a separate section? I would have expected to have this just be part of the `op_name_group` (which should be renamed at this point to `dialect`). We can store a bit with `numOpNames` or `dialect` to indicate if a version is present, and then optionally read it.
				jpienaar: I don't think it's documented here, can we have multiple op_name_groups for same dialect? With the naming change it feels like it's saying there will be only one per dialect. +1 to bit and making this optionally specified if bit set. (I think section may be overloaded, I see this as proposed as just convenient way of grouping these two optional items together).
				mfrancio (author): Agreed, I'll plan for this change. You shouldn't really need a bit - just printing the buffer is enough. A size zero means that no version is available. From a quick look I don't think you can have multiple op_name_groups per dialect - those are indeed already grouped by dialect:
				  // Parse the operation names, which are grouped by dialect.
				  auto parseOpName = [&](BytecodeDialect dialect) {
				    StringRef opName;
				    if (failed(stringReader.parseString(sectionReader, opName)))
				      return failure();
				    opNames.emplace_back(dialect, opName);
				    return success();
				  };
				  while (!sectionReader.empty())
				    if (failed(parseDialectGrouping(sectionReader, dialects, parseOpName)))
				      return failure();
				so changing the op_name_group to be dialect itself should be fine.
				mehdi_amini: We could use a bit on the numOpNames to gate the existence of the version:
				  op_name_group {
				    dialect: varint,
				    numOpNamesAndIsVersionAvailable: varint, // (numOpNames << 1 | versionAvailable)
				    version : dialect_version_section
				    opNames: varint[]
				  }
				That way we don't write a section when there is no version there.
				jpienaar: They are grouped by dialect, but it just emplaces inside a vector. The emplacing results in the "flat" ID, so one can have multiple instances of this where the flat id can be small for the most common operations independent of dialect. So we'd probably need to just verify that a version is only specified once for a dialect (we could allow multiple as long as the same but that seems undesirable from size point of view). And yes what Mehdi suggested is what River also mentioned.
				mfrancio (author): Added the bit flag. It felt more natural doing it on the dialect name itself instead of going inside the dialect version grouping. Let me know if there are any concerns.

	```			```

	Dialects are encoded as indexes to the name string within the string section.			Dialects are encoded as a `varint` containing the index to the name string
	Operation names are encoded in groups by dialect, with each group containing the			within the string section, plus a flag indicating whether the dialect is
	dialect, the number of operation names, and the array of indexes to each name			versioned. Operation names are encoded in groups by dialect, with each group
	within the string section.			containing the dialect, the number of operation names, and the array of indexes
				to each name within the string section. The version is encoded as a nested
				section.

	### Attribute/Type Sections			### Attribute/Type Sections

	Attributes and types are encoded using two [sections](#sections), one section			Attributes and types are encoded using two [sections](#sections), one section
	(`attr_type_section`) containing the actual encoded representation, and another			(`attr_type_section`) containing the actual encoded representation, and another
	section (`attr_type_offset_section`) containing the offsets of each encoded			section (`attr_type_offset_section`) containing the offsets of each encoded
	attribute/type into the previous section. This structure allows for attributes			attribute/type into the previous section. This structure allows for attributes
	and types to always be lazily loaded on demand.			and types to always be lazily loaded on demand.
	[... 228 lines hidden ...]

mlir/docs/LangRef.md

	[... 839 lines hidden ...]
	See [here](DefiningDialects/AttributesAndTypes.md) on how to define dialect attribute values.			See [here](DefiningDialects/AttributesAndTypes.md) on how to define dialect attribute values.

	### Builtin Attribute Values			### Builtin Attribute Values

	The [builtin dialect](Dialects/Builtin.md) defines a set of attribute values			The [builtin dialect](Dialects/Builtin.md) defines a set of attribute values
	that are directly usable by any other dialect in MLIR. These types cover a range			that are directly usable by any other dialect in MLIR. These types cover a range
	from primitive integer and floating-point values, attribute dictionaries, dense			from primitive integer and floating-point values, attribute dictionaries, dense
	multi-dimensional arrays, and more.			multi-dimensional arrays, and more.

				### IR Versionning

				A dialect can opt-in to handle versioning through the
				`BytecodeDialectInterface`. Few hooks are exposed to the dialect to allow
				managing a version encoded into the bytecode file. The version is loaded lazily
				and allows to retrieve the version information while parsing the input IR, and
				gives an opportunity to each dialect for which a version is present to perform
				IR upgrades post-parsing through the `upgradeFromVersion` method. Custom
				Attribute and Type encodings can also be upgraded according to the dialect
				version using readAttribute and readType methods.

				There is no restriction on what kind of information a dialect is allowed to
				encode to model its versioning. Currently, versioning is supported only for
				bytecode formats.
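
As a rough illustration of the flow this LangRef section describes, the sketch below models a dialect version that is decoded from an opaque byte buffer and then consulted to decide whether an upgrade is needed. It is a standalone mock and does not use the MLIR API: `DialectVersion` here is a local stand-in for the class added in BytecodeImplementation.h, while `ExampleVersion`, `readExampleVersion` and `needsUpgrade` are hypothetical names, and the two-byte major/minor encoding is an assumption made only for the example.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Local stand-in for the DialectVersion base class: it exists only so that
// dialect-specific versions can be destroyed polymorphically.
struct DialectVersion {
  virtual ~DialectVersion() = default;
};

// Hypothetical dialect-specific version made of a major/minor pair.
struct ExampleVersion : DialectVersion {
  ExampleVersion(uint32_t majorVersion, uint32_t minorVersion)
      : majorVersion(majorVersion), minorVersion(minorVersion) {}
  uint32_t majorVersion;
  uint32_t minorVersion;
};

// Hypothetical counterpart of a readVersion hook: decode the version from the
// raw bytes stored for the dialect. Returns nullptr on malformed input.
std::unique_ptr<DialectVersion>
readExampleVersion(const std::vector<uint8_t> &buffer) {
  if (buffer.size() != 2)
    return nullptr;
  return std::make_unique<ExampleVersion>(buffer[0], buffer[1]);
}

// Hypothetical counterpart of an upgradeFromVersion hook: report whether IR
// written by `version` predates the (assumed) current major version 2.
// The caller must pass a version produced by readExampleVersion.
bool needsUpgrade(const DialectVersion &version) {
  const auto &exampleVersion = static_cast<const ExampleVersion &>(version);
  return exampleVersion.majorVersion < 2;
}
```

Because the version payload is opaque to the bytecode format, the choice of two bytes here is arbitrary; a dialect is free to encode whatever it needs to model its versioning.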

mlir/include/mlir/Bytecode/BytecodeImplementation.h

[... 229 lines hidden ...]	public:
virtual void writeOwnedString(StringRef str) = 0;		virtual void writeOwnedString(StringRef str) = 0;

/// Write a blob to the bytecode, which is owned by the caller and is		/// Write a blob to the bytecode, which is owned by the caller and is
/// guaranteed to not die before the end of the bytecode process. The blob is		/// guaranteed to not die before the end of the bytecode process. The blob is
/// written as-is, with no additional compression or compaction.		/// written as-is, with no additional compression or compaction.
virtual void writeOwnedBlob(ArrayRef<char> blob) = 0;		virtual void writeOwnedBlob(ArrayRef<char> blob) = 0;
};		};

		//===--------------------------------------------------------------------===//
		// Dialect Version Interface.
		//===--------------------------------------------------------------------===//

		/// This class is used to represent the version of a dialect, for the purpose
		/// of polymorphic destruction.
		class DialectVersion {
		public:
		virtual ~DialectVersion() = default;
		};

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// BytecodeDialectInterface		// BytecodeDialectInterface
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class BytecodeDialectInterface		class BytecodeDialectInterface
: public DialectInterface::Base<BytecodeDialectInterface> {		: public DialectInterface::Base<BytecodeDialectInterface> {
public:		public:
using Base::Base;		using Base::Base;

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Reading		// Reading
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Read an attribute belonging to this dialect from the given reader. This		/// Read an attribute belonging to this dialect from the given reader. This
/// method should return null in the case of failure.		/// method should return null in the case of failure.
virtual Attribute readAttribute(DialectBytecodeReader &reader) const {		virtual Attribute readAttribute(DialectBytecodeReader &reader) const {
reader.emitError() << "dialect " << getDialect()->getNamespace()		reader.emitError() << "dialect " << getDialect()->getNamespace()
<< " does not support reading attributes from bytecode";		<< " does not support reading attributes from bytecode";
return Attribute();		return Attribute();
}		}

		/// Read a versioned attribute encoding belonging to this dialect from the
		/// given reader. This method should return null in the case of failure, and
		/// falls back to the non-versioned reader in case the dialect implements
		/// versioning but it does not support versioned custom encodings for the
		/// attributes.
		virtual Attribute readAttribute(DialectBytecodeReader &reader,
		jpienaar: If we have a versioned dialect, it would seem we'd always have to use this method just in case (and if version is unspecified then there is just no upgrade).
		mfrancio (author): I feel it's anyway up to the dialect to decide what to do here. Having a fall-back seems convenient to me, but if it looks confusing or there is desire to push for the versioned implementation anyway we can emit an error similarly to the other reader.
		jpienaar: Yes I'm less worried about dialect than checking expectations for bytecode format parser (where it defers to dialect inside attribute/type parser its under full control of dialect).
		const DialectVersion &version) const {
		reader.emitError()
		<< "dialect " << getDialect()->getNamespace()
		<< " does not support reading versioned attributes from bytecode";
		return Attribute();
		}

/// Read a type belonging to this dialect from the given reader. This method		/// Read a type belonging to this dialect from the given reader. This method
/// should return null in the case of failure.		/// should return null in the case of failure.
virtual Type readType(DialectBytecodeReader &reader) const {		virtual Type readType(DialectBytecodeReader &reader) const {
reader.emitError() << "dialect " << getDialect()->getNamespace()		reader.emitError() << "dialect " << getDialect()->getNamespace()
<< " does not support reading types from bytecode";		<< " does not support reading types from bytecode";
return Type();		return Type();
}		}

		/// Read a versioned type encoding belonging to this dialect from the given
		/// reader. This method should return null in the case of failure, and
		/// falls back to the non-versioned reader in case the dialect implements
		/// versioning but it does not support versioned custom encodings for the
		/// types.
		virtual Type readType(DialectBytecodeReader &reader,
		const DialectVersion &version) const {
		reader.emitError()
		<< "dialect " << getDialect()->getNamespace()
		<< " does not support reading versioned types from bytecode";
		return Type();
		}

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Writing		// Writing
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Write the given attribute, which belongs to this dialect, to the given		/// Write the given attribute, which belongs to this dialect, to the given
/// writer. This method may return failure to indicate that the given		/// writer. This method may return failure to indicate that the given
/// attribute could not be encoded, in which case the textual format will be		/// attribute could not be encoded, in which case the textual format will be
/// used to encode this attribute instead.		/// used to encode this attribute instead.
virtual LogicalResult writeAttribute(Attribute attr,		virtual LogicalResult writeAttribute(Attribute attr,
DialectBytecodeWriter &writer) const {		DialectBytecodeWriter &writer) const {
return failure();		return failure();
}		}

/// Write the given type, which belongs to this dialect, to the given writer.		/// Write the given type, which belongs to this dialect, to the given writer.
/// This method may return failure to indicate that the given type could not		/// This method may return failure to indicate that the given type could not
/// be encoded, in which case the textual format will be used to encode this		/// be encoded, in which case the textual format will be used to encode this
/// type instead.		/// type instead.
virtual LogicalResult writeType(Type type,		virtual LogicalResult writeType(Type type,
DialectBytecodeWriter &writer) const {		DialectBytecodeWriter &writer) const {
return failure();		return failure();
}		}

		/// Write the version of this dialect to the given writer.
		virtual void writeVersion(DialectBytecodeWriter &writer) const {}

		rriddle: Why is it necessary for dialect authors to write the size? I would expect this could be automatically handled (e.g. via back-patching)?
		mfrancio (author): This slipped - it is indeed not necessary. I'll update the comment.
		// Read the version of this dialect from the provided reader and return it as
		// a `unique_ptr` to a dialect version object.
		mehdi_amini: Can you add a simple doc?
		virtual std::unique_ptr<DialectVersion>
		readVersion(DialectBytecodeReader &reader) const {
		reader.emitError("Dialect does not support versioning");
		return nullptr;
		}

		/// Hook invoked after parsing completed, if a version directive was present
		/// and included an entry for the current dialect. This hook offers the
		/// opportunity to the dialect to visit the IR and upgrades constructs emitted
		/// by the version of the dialect corresponding to the provided version.
		virtual LogicalResult
		upgradeFromVersion(Operation *topLevelOp,
		const DialectVersion &version) const {
		return success();
		}
};		};

} // namespace mlir		} // namespace mlir

#endif // MLIR_BYTECODE_BYTECODEIMPLEMENTATION_H		#endif // MLIR_BYTECODE_BYTECODEIMPLEMENTATION_H

mlir/lib/Bytecode/Encoding.h

[... 17 lines hidden ...]

namespace mlir {		namespace mlir {
namespace bytecode {		namespace bytecode {
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// General constants		// General constants
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

enum {		enum {
		/// The minimum supported version of the bytecode.
		kMinSupportedVersion = 0,

/// The current bytecode version.		/// The current bytecode version.
kVersion = 0,		kVersion = 1,

/// An arbitrary value used to fill alignment padding.		/// An arbitrary value used to fill alignment padding.
kAlignmentByte = 0xCB,		kAlignmentByte = 0xCB,
};		};

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Sections		// Sections
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
[... 20 lines hidden ...]	enum ID : uint8_t {

/// This section contains the resources of the bytecode.		/// This section contains the resources of the bytecode.
kResource = 5,		kResource = 5,

/// This section contains the offsets of resources within the Resource		/// This section contains the offsets of resources within the Resource
/// section.		/// section.
kResourceOffset = 6,		kResourceOffset = 6,

		/// This section contains the versions of each dialect.
		kDialectVersions = 7,

/// The total number of section types.		/// The total number of section types.
kNumSections = 7,		kNumSections = 8,
};		};
} // namespace Section		} // namespace Section

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// IR Section		// IR Section
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// This enum represents a mask of all of the potential components of an		/// This enum represents a mask of all of the potential components of an
[... 18 lines hidden ...]

mlir/lib/Bytecode/Reader/BytecodeReader.cpp

[... 41 lines hidden ...] static std::string toString(bytecode::Section::ID sectionID) {

case bytecode::Section::kAttrTypeOffset: case bytecode::Section::kAttrTypeOffset:

return "AttrTypeOffset (3)"; return "AttrTypeOffset (3)";

case bytecode::Section::kIR: case bytecode::Section::kIR:

return "IR (4)"; return "IR (4)";

case bytecode::Section::kResource: case bytecode::Section::kResource:

return "Resource (5)"; return "Resource (5)";

case bytecode::Section::kResourceOffset: case bytecode::Section::kResourceOffset:

return "ResourceOffset (6)"; return "ResourceOffset (6)";

case bytecode::Section::kDialectVersions:

return "DialectVersions (7)";

default: default:

return ("Unknown (" + Twine(static_cast<unsigned>(sectionID)) + ")").str(); return ("Unknown (" + Twine(static_cast<unsigned>(sectionID)) + ")").str();

} }

/// Returns true if the given top-level section ID is optional. /// Returns true if the given top-level section ID is optional.

static bool isSectionOptional(bytecode::Section::ID sectionID) { static bool isSectionOptional(bytecode::Section::ID sectionID) {

switch (sectionID) { switch (sectionID) {

case bytecode::Section::kString: case bytecode::Section::kString:

case bytecode::Section::kDialect: case bytecode::Section::kDialect:

case bytecode::Section::kAttrType: case bytecode::Section::kAttrType:

case bytecode::Section::kAttrTypeOffset: case bytecode::Section::kAttrTypeOffset:

case bytecode::Section::kIR: case bytecode::Section::kIR:

return false; return false;

case bytecode::Section::kResource: case bytecode::Section::kResource:

case bytecode::Section::kResourceOffset: case bytecode::Section::kResourceOffset:

case bytecode::Section::kDialectVersions:

return true; return true;

default: default:

llvm_unreachable("unknown section ID"); llvm_unreachable("unknown section ID");

} }

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// EncodingReader // EncodingReader

[... 271 lines hidden ...] public:

LogicalResult initialize(Location fileLoc, ArrayRef<uint8_t> sectionData); LogicalResult initialize(Location fileLoc, ArrayRef<uint8_t> sectionData);

/// Parse a shared string from the string section. The shared string is /// Parse a shared string from the string section. The shared string is

/// encoded using an index to a corresponding string in the string section. /// encoded using an index to a corresponding string in the string section.

LogicalResult parseString(EncodingReader &reader, StringRef &result) { LogicalResult parseString(EncodingReader &reader, StringRef &result) {

return parseEntry(reader, strings, result, "string"); return parseEntry(reader, strings, result, "string");

} }

/// Parse a shared string from the string section. The shared string is

/// encoded using an index to a corresponding string in the string section.

LogicalResult parseStringAtIndex(EncodingReader &reader, uint64_t index,

StringRef &result) {

return resolveEntry(reader, strings, index, result, "string");

}

private: private:

/// The table of strings referenced within the bytecode file. /// The table of strings referenced within the bytecode file.

SmallVector<StringRef> strings; SmallVector<StringRef> strings;

}; };

} // namespace } // namespace

LogicalResult StringSectionReader::initialize(Location fileLoc, LogicalResult StringSectionReader::initialize(Location fileLoc,

ArrayRef<uint8_t> sectionData) { ArrayRef<uint8_t> sectionData) {

[... 34 lines hidden ...] LogicalResult StringSectionReader::initialize(Location fileLoc,

return success(); return success();

} }

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// BytecodeDialect // BytecodeDialect

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

namespace { namespace {

class DialectReader;

/// This struct represents a dialect entry within the bytecode. /// This struct represents a dialect entry within the bytecode.

struct BytecodeDialect { struct BytecodeDialect {

/// Load the dialect into the provided context if it hasn't been loaded yet. /// Load the dialect into the provided context if it hasn't been loaded yet.

/// Returns failure if the dialect couldn't be loaded *and* the provided /// Returns failure if the dialect couldn't be loaded *and* the provided

/// context does not allow unregistered dialects. The provided reader is used /// context does not allow unregistered dialects. The provided reader is used

/// for error emission if necessary. /// for error emission if necessary.

LogicalResult load(EncodingReader &reader, MLIRContext *ctx) { LogicalResult load(DialectReader &reader, MLIRContext *ctx);

jpienaar: Why does DialectReader need to be thread through? I thought it was a rather cheap, stateless structure to create.

mehdi_amini: It is cheap, but needs to be created from things unavailable in this class, so you'd need to thread through more of other things here!

jpienaar: What else do you need to thread through? Dialect version?

mehdi_amini: Look at the call sites:

DialectReader dialectReader(*this, stringReader, resourceReader, reader);
if (failed(entry.dialect->load(dialectReader, fileLoc.getContext())))
  return failure();

So stringReader, resourceReader are the extra I think?
(also some call sites already have the dialectReader available)

if (dialect)

return success();

Dialect *loadedDialect = ctx->getOrLoadDialect(name);

if (!loadedDialect && !ctx->allowsUnregisteredDialects()) {

return reader.emitError(

"dialect '", name,

"' is unknown. If this is intended, please call "

"allowUnregisteredDialects() on the MLIRContext, or use "

"-allow-unregistered-dialect with the MLIR tool used.");

}

dialect = loadedDialect;

// If the dialect was actually loaded, check to see if it has a bytecode

// interface.

if (loadedDialect)

interface = dyn_cast<BytecodeDialectInterface>(loadedDialect);

return success();

}

/// Return the loaded dialect, or nullptr if the dialect is unknown. This can /// Return the loaded dialect, or nullptr if the dialect is unknown. This can

/// only be called after `load`. /// only be called after `load`.

Dialect *getLoadedDialect() const { Dialect *getLoadedDialect() const {

assert(dialect && assert(dialect &&

"expected `load` to be invoked before `getLoadedDialect`"); "expected `load` to be invoked before `getLoadedDialect`");

return *dialect; return *dialect;

} }

/// The loaded dialect entry. This field is std::nullopt if we haven't /// The loaded dialect entry. This field is std::nullopt if we haven't

/// attempted to load, nullptr if we failed to load, otherwise the loaded /// attempted to load, nullptr if we failed to load, otherwise the loaded

/// dialect. /// dialect.

std::optional<Dialect *> dialect; std::optional<Dialect *> dialect;

/// The bytecode interface of the dialect, or nullptr if the dialect does not /// The bytecode interface of the dialect, or nullptr if the dialect does not

/// implement the bytecode interface. This field should only be checked if the /// implement the bytecode interface. This field should only be checked if the

/// `dialect` field is not std::nullopt. /// `dialect` field is not std::nullopt.

const BytecodeDialectInterface *interface = nullptr; const BytecodeDialectInterface *interface = nullptr;

/// The name of the dialect. /// The name of the dialect.

StringRef name; StringRef name;

/// A buffer containing the encoding of the dialect version parsed.

ArrayRef<uint8_t> versionBuffer;

/// Lazy loaded dialect version from the handle above.

std::unique_ptr<DialectVersion> loadedVersion;

}; };

/// This struct represents an operation name entry within the bytecode. /// This struct represents an operation name entry within the bytecode.

struct BytecodeOperationName { struct BytecodeOperationName {

BytecodeOperationName(BytecodeDialect *dialect, StringRef name) BytecodeOperationName(BytecodeDialect *dialect, StringRef name)

: dialect(dialect), name(name) {} : dialect(dialect), name(name) {}

/// The loaded operation name, or std::nullopt if it hasn't been processed /// The loaded operation name, or std::nullopt if it hasn't been processed

[... 34 lines hidden ...]

/// This class is used to read the resource section from the bytecode. /// This class is used to read the resource section from the bytecode.

class ResourceSectionReader { class ResourceSectionReader {

public: public:

/// Initialize the resource section reader with the given section data. /// Initialize the resource section reader with the given section data.

LogicalResult LogicalResult

initialize(Location fileLoc, const ParserConfig &config, initialize(Location fileLoc, const ParserConfig &config,

MutableArrayRef<BytecodeDialect> dialects, MutableArrayRef<BytecodeDialect> dialects,

StringSectionReader &stringReader, ArrayRef<uint8_t> sectionData, StringSectionReader &stringReader, ArrayRef<uint8_t> sectionData,

ArrayRef<uint8_t> offsetSectionData, ArrayRef<uint8_t> offsetSectionData, DialectReader &dialectReader,

const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef); const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef);

/// Parse a dialect resource handle from the resource section. /// Parse a dialect resource handle from the resource section.

LogicalResult parseResourceHandle(EncodingReader &reader, LogicalResult parseResourceHandle(EncodingReader &reader,

AsmDialectResourceHandle &result) { AsmDialectResourceHandle &result) {

return parseEntry(reader, dialectResources, result, "resource handle"); return parseEntry(reader, dialectResources, result, "resource handle");

} }

[... 130 lines hidden ...] parseResourceGroup(Location fileLoc, bool allowEmpty,

} }

return success(); return success();

} }

LogicalResult ResourceSectionReader::initialize( LogicalResult ResourceSectionReader::initialize(

Location fileLoc, const ParserConfig &config, Location fileLoc, const ParserConfig &config,

MutableArrayRef<BytecodeDialect> dialects, MutableArrayRef<BytecodeDialect> dialects,

StringSectionReader &stringReader, ArrayRef<uint8_t> sectionData, StringSectionReader &stringReader, ArrayRef<uint8_t> sectionData,

ArrayRef<uint8_t> offsetSectionData, ArrayRef<uint8_t> offsetSectionData, DialectReader &dialectReader,

const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef) { const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef) {

EncodingReader resourceReader(sectionData, fileLoc); EncodingReader resourceReader(sectionData, fileLoc);

EncodingReader offsetReader(offsetSectionData, fileLoc); EncodingReader offsetReader(offsetSectionData, fileLoc);

// Read the number of external resource providers. // Read the number of external resource providers.

uint64_t numExternalResourceGroups; uint64_t numExternalResourceGroups;

if (failed(offsetReader.parseVarInt(numExternalResourceGroups))) if (failed(offsetReader.parseVarInt(numExternalResourceGroups)))

return failure(); return failure();

[24 lines hidden] if (failed(parseGroup(handler)))

return failure(); return failure();

} }

// Read the dialect resources from the bytecode. // Read the dialect resources from the bytecode.

MLIRContext *ctx = fileLoc->getContext(); MLIRContext *ctx = fileLoc->getContext();

while (!offsetReader.empty()) { while (!offsetReader.empty()) {

BytecodeDialect *dialect; BytecodeDialect *dialect;

if (failed(parseEntry(offsetReader, dialects, dialect, "dialect")) || if (failed(parseEntry(offsetReader, dialects, dialect, "dialect")) ||

failed(dialect->load(resourceReader, ctx))) failed(dialect->load(dialectReader, ctx)))

return failure(); return failure();

Dialect *loadedDialect = dialect->getLoadedDialect(); Dialect *loadedDialect = dialect->getLoadedDialect();

if (!loadedDialect) { if (!loadedDialect) {

return resourceReader.emitError() return resourceReader.emitError()

<< "dialect '" << dialect->name << "' is unknown"; << "dialect '" << dialect->name << "' is unknown";

} }

const auto *handler = dyn_cast<OpAsmDialectInterface>(loadedDialect); const auto *handler = dyn_cast<OpAsmDialectInterface>(loadedDialect);

if (!handler) { if (!handler) {

[350 lines hidden] LogicalResult AttrTypeReader::parseAsmEntry(T &result, EncodingReader &reader,

} }

return success(); return success();

} }

template <typename T> template <typename T>

LogicalResult AttrTypeReader::parseCustomEntry(Entry<T> &entry, LogicalResult AttrTypeReader::parseCustomEntry(Entry<T> &entry,

EncodingReader &reader, EncodingReader &reader,

StringRef entryType) { StringRef entryType) {

if (failed(entry.dialect->load(reader, fileLoc.getContext()))) DialectReader dialectReader(*this, stringReader, resourceReader, reader);

if (failed(entry.dialect->load(dialectReader, fileLoc.getContext())))

return failure(); return failure();

// Ensure that the dialect implements the bytecode interface. // Ensure that the dialect implements the bytecode interface.

if (!entry.dialect->interface) { if (!entry.dialect->interface) {

return reader.emitError("dialect '", entry.dialect->name, return reader.emitError("dialect '", entry.dialect->name,

"' does not implement the bytecode interface"); "' does not implement the bytecode interface");

} }

// Ask the dialect to parse the entry. // Ask the dialect to parse the entry. If the dialect is versioned, parse

DialectReader dialectReader(*this, stringReader, resourceReader, reader); // using the versioned encoding readers.

if (entry.dialect->loadedVersion.get()) {

if constexpr (std::is_same_v<T, Type>)

entry.entry = entry.dialect->interface->readType(

dialectReader, *entry.dialect->loadedVersion);

else

entry.entry = entry.dialect->interface->readAttribute(

dialectReader, *entry.dialect->loadedVersion);

} else {

if constexpr (std::is_same_v<T, Type>) if constexpr (std::is_same_v<T, Type>)

rriddle: Can you drop the trivial braces here?

mfrancio: Yep, thanks!

entry.entry = entry.dialect->interface->readType(dialectReader); entry.entry = entry.dialect->interface->readType(dialectReader);

else else

entry.entry = entry.dialect->interface->readAttribute(dialectReader); entry.entry = entry.dialect->interface->readAttribute(dialectReader);

}

return success(!!entry.entry); return success(!!entry.entry);

} }

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// Bytecode Reader // Bytecode Reader

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

namespace { namespace {

[40 lines hidden] private:

LogicalResult parseType(EncodingReader &reader, Type &result) { LogicalResult parseType(EncodingReader &reader, Type &result) {

return attrTypeReader.parseType(reader, result); return attrTypeReader.parseType(reader, result);

} }

//===--------------------------------------------------------------------===// //===--------------------------------------------------------------------===//

// Resource Section // Resource Section

LogicalResult LogicalResult

parseResourceSection(std::optional<ArrayRef<uint8_t>> resourceData, parseResourceSection(EncodingReader &reader,

std::optional<ArrayRef<uint8_t>> resourceData,

std::optional<ArrayRef<uint8_t>> resourceOffsetData); std::optional<ArrayRef<uint8_t>> resourceOffsetData);

//===--------------------------------------------------------------------===// //===--------------------------------------------------------------------===//

// IR Section // IR Section

/// This struct represents the current read state of a range of regions. This /// This struct represents the current read state of a range of regions. This

/// struct is used to enable iterative parsing of regions. /// struct is used to enable iterative parsing of regions.

struct RegionReadState { struct RegionReadState {

[167 lines hidden] if (failed(stringReader.initialize(

return failure(); return failure();

// Process the dialect section. // Process the dialect section.

if (failed(parseDialectSection(*sectionDatas[bytecode::Section::kDialect]))) if (failed(parseDialectSection(*sectionDatas[bytecode::Section::kDialect])))

return failure(); return failure();

// Process the resource section if present. // Process the resource section if present.

if (failed(parseResourceSection( if (failed(parseResourceSection(

sectionDatas[bytecode::Section::kResource], reader, sectionDatas[bytecode::Section::kResource],

sectionDatas[bytecode::Section::kResourceOffset]))) sectionDatas[bytecode::Section::kResourceOffset])))

return failure(); return failure();

// Process the attribute and type section. // Process the attribute and type section.

if (failed(attrTypeReader.initialize( if (failed(attrTypeReader.initialize(

dialects, *sectionDatas[bytecode::Section::kAttrType], dialects, *sectionDatas[bytecode::Section::kAttrType],

*sectionDatas[bytecode::Section::kAttrTypeOffset]))) *sectionDatas[bytecode::Section::kAttrTypeOffset])))

return failure(); return failure();

// Finally, process the IR section. // Finally, process the IR section.

return parseIRSection(*sectionDatas[bytecode::Section::kIR], block); return parseIRSection(*sectionDatas[bytecode::Section::kIR], block);

} }

LogicalResult BytecodeReader::parseVersion(EncodingReader &reader) { LogicalResult BytecodeReader::parseVersion(EncodingReader &reader) {

if (failed(reader.parseVarInt(version))) if (failed(reader.parseVarInt(version)))

return failure(); return failure();

// Validate the bytecode version. // Validate the bytecode version.

uint64_t currentVersion = bytecode::kVersion; uint64_t currentVersion = bytecode::kVersion;

if (version < currentVersion) { uint64_t minSupportedVersion = bytecode::kMinSupportedVersion;

if (version < minSupportedVersion) {

return reader.emitError("bytecode version ", version, return reader.emitError("bytecode version ", version,

" is older than the current version of ", " is older than the current version of ",

currentVersion, ", and upgrade is not supported"); currentVersion, ", and upgrade is not supported");

} }

if (version > currentVersion) { if (version > currentVersion) {

return reader.emitError("bytecode version ", version, return reader.emitError("bytecode version ", version,

" is newer than the current version ", " is newer than the current version ",

currentVersion); currentVersion);

} }

return success(); return success();

} }

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// Dialect Section // Dialect Section

LogicalResult BytecodeDialect::load(DialectReader &reader, MLIRContext *ctx) {

if (dialect)

return success();

Dialect *loadedDialect = ctx->getOrLoadDialect(name);

if (!loadedDialect && !ctx->allowsUnregisteredDialects()) {

return reader.emitError("dialect '")

<< name

<< "' is unknown. If this is intended, please call "

"allowUnregisteredDialects() on the MLIRContext, or use "

"-allow-unregistered-dialect with the MLIR tool used.";

}

dialect = loadedDialect;

// If the dialect was actually loaded, check to see if it has a bytecode

// interface.

if (loadedDialect)

interface = dyn_cast<BytecodeDialectInterface>(loadedDialect);

if (!versionBuffer.empty()) {

if (!interface)

return reader.emitError("dialect '")

<< name

<< "' does not implement the bytecode interface, "

"but found a version entry";

loadedVersion = interface->readVersion(reader);

if (!loadedVersion)

return failure();

}

return success();

}

LogicalResult LogicalResult

BytecodeReader::parseDialectSection(ArrayRef<uint8_t> sectionData) { BytecodeReader::parseDialectSection(ArrayRef<uint8_t> sectionData) {

EncodingReader sectionReader(sectionData, fileLoc); EncodingReader sectionReader(sectionData, fileLoc);

// Parse the number of dialects in the section. // Parse the number of dialects in the section.

uint64_t numDialects; uint64_t numDialects;

if (failed(sectionReader.parseVarInt(numDialects))) if (failed(sectionReader.parseVarInt(numDialects)))

return failure(); return failure();

dialects.resize(numDialects); dialects.resize(numDialects);

// Parse each of the dialects. // Parse each of the dialects.

for (uint64_t i = 0; i < numDialects; ++i) for (uint64_t i = 0; i < numDialects; ++i) {

/// Before version 1, there wasn't any versioning available for dialects,

/// and the entryIdx represent the string itself.

if (version == 0) {

if (failed(stringReader.parseString(sectionReader, dialects[i].name))) if (failed(stringReader.parseString(sectionReader, dialects[i].name)))

return failure(); return failure();

continue;

}

// Parse ID representing dialect and version.

uint64_t dialectNameIdx;

bool versionAvailable;

if (failed(sectionReader.parseVarIntWithFlag(dialectNameIdx,

versionAvailable)))

return failure();

if (failed(stringReader.parseStringAtIndex(sectionReader, dialectNameIdx,

rriddle:

continue;

}

- // modify entryIdx to decode entry index and version available.

+ // Modify entryIdx to decode entry index and version available.

uint64_t versionIdx = entryIdx >> 1;

I really thought we had a helper that read a varint and extracted a flag.

mfrancio: Oh indeed, I didn't see it. Thanks!
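The helper discussed in this thread relies on packing a boolean into the low bit of an integer before that integer is written as a varint. A minimal sketch of just the packing step, assuming hypothetical free functions `packWithFlag` and `unpackWithFlag` (illustrative names only, not the MLIR `EncodingReader` or `EncodingEmitter` API):

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helpers (not MLIR API): store a flag in the low bit of a value.
// Assumes `value` fits in 63 bits so the left shift does not drop the top bit.
inline uint64_t packWithFlag(uint64_t value, bool flag) {
  assert((value >> 63) == 0 && "value must fit in 63 bits");
  return (value << 1) | (flag ? 1u : 0u);
}

// Recover the original value and the flag from a packed integer.
inline std::pair<uint64_t, bool> unpackWithFlag(uint64_t packed) {
  return {packed >> 1, (packed & 1) != 0};
}
```

In the updated hunk, `dialectNameIdx` and `versionAvailable` play the roles of the unpacked value and flag.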

dialects[i].name)))

return failure();

if (versionAvailable) {

bytecode::Section::ID sectionID;

if (failed(

sectionReader.parseSection(sectionID, dialects[i].versionBuffer)))

return failure();

if (sectionID != bytecode::Section::kDialectVersions) {

emitError(fileLoc, "expected dialect version section");

return failure();

}

// Parse the operation names, which are grouped by dialect. // Parse the operation names, which are grouped by dialect.

auto parseOpName = [&](BytecodeDialect *dialect) { auto parseOpName = [&](BytecodeDialect *dialect) {

StringRef opName; StringRef opName;

if (failed(stringReader.parseString(sectionReader, opName))) if (failed(stringReader.parseString(sectionReader, opName)))

return failure(); return failure();

opNames.emplace_back(dialect, opName); opNames.emplace_back(dialect, opName);

return success(); return success();

}; };

while (!sectionReader.empty()) while (!sectionReader.empty())

if (failed(parseDialectGrouping(sectionReader, dialects, parseOpName))) if (failed(parseDialectGrouping(sectionReader, dialects, parseOpName)))

return failure(); return failure();

return success(); return success();

} }

FailureOr<OperationName> BytecodeReader::parseOpName(EncodingReader &reader) { FailureOr<OperationName> BytecodeReader::parseOpName(EncodingReader &reader) {

BytecodeOperationName *opName = nullptr; BytecodeOperationName *opName = nullptr;

if (failed(parseEntry(reader, opNames, opName, "operation name"))) if (failed(parseEntry(reader, opNames, opName, "operation name")))

return failure(); return failure();

// Check to see if this operation name has already been resolved. If we // Check to see if this operation name has already been resolved. If we

// haven't, load the dialect and build the operation name. // haven't, load the dialect and build the operation name.

if (!opName->opName) { if (!opName->opName) {

if (failed(opName->dialect->load(reader, getContext()))) // Load the dialect and its version.

EncodingReader versionReader(opName->dialect->versionBuffer, fileLoc);

DialectReader dialectReader(attrTypeReader, stringReader, resourceReader,

versionReader);

if (failed(opName->dialect->load(dialectReader, getContext())))

return failure(); return failure();

opName->opName.emplace((opName->dialect->name + "." + opName->name).str(), opName->opName.emplace((opName->dialect->name + "." + opName->name).str(),

getContext()); getContext());

} }

return *opName->opName; return *opName->opName;

} }

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// Resource Section // Resource Section

LogicalResult BytecodeReader::parseResourceSection( LogicalResult BytecodeReader::parseResourceSection(

std::optional<ArrayRef<uint8_t>> resourceData, EncodingReader &reader, std::optional<ArrayRef<uint8_t>> resourceData,

jpienaar: I'd prefer the upgrade of the in-memory structure not to be inside the reader. We already have a way to parse without verification; this upgrade is of the in-memory structure, which can be done separately. In here I'd prefer only upgrades related to parsing, before the IR gets to memory. This could be done at the top-level entry point, though; it feels like it belongs outside of the parsing guts.

mfrancio: I considered this, but I found the need to carry the version at which the IR was parsed over to the top-level entry point a little confusing. I actually like that the version stays within the parsing, so that only the current version of the dialect exists at the entry-point level.

std::optional<ArrayRef<uint8_t>> resourceOffsetData) { std::optional<ArrayRef<uint8_t>> resourceOffsetData) {

// Ensure both sections are either present or not. // Ensure both sections are either present or not.

if (resourceData.has_value() != resourceOffsetData.has_value()) { if (resourceData.has_value() != resourceOffsetData.has_value()) {

if (resourceOffsetData) if (resourceOffsetData)

return emitError(fileLoc, "unexpected resource offset section when " return emitError(fileLoc, "unexpected resource offset section when "

"resource section is not present"); "resource section is not present");

return emitError( return emitError(

fileLoc, fileLoc,

"expected resource offset section when resource section is present"); "expected resource offset section when resource section is present");

} }

// If the resource sections are absent, there is nothing to do. // If the resource sections are absent, there is nothing to do.

if (!resourceData) if (!resourceData)

return success(); return success();

// Initialize the resource reader with the resource sections. // Initialize the resource reader with the resource sections.

DialectReader dialectReader(attrTypeReader, stringReader, resourceReader,

reader);

return resourceReader.initialize(fileLoc, config, dialects, stringReader, return resourceReader.initialize(fileLoc, config, dialects, stringReader,

*resourceData, *resourceOffsetData, *resourceData, *resourceOffsetData,

bufferOwnerRef); dialectReader, bufferOwnerRef);

} }

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// IR Section // IR Section

LogicalResult BytecodeReader::parseIRSection(ArrayRef<uint8_t> sectionData, LogicalResult BytecodeReader::parseIRSection(ArrayRef<uint8_t> sectionData,

Block *block) { Block *block) {

EncodingReader reader(sectionData, fileLoc); EncodingReader reader(sectionData, fileLoc);

[15 lines hidden] LogicalResult BytecodeReader::parseIRSection(ArrayRef<uint8_t> sectionData,

while (!regionStack.empty()) while (!regionStack.empty())

if (failed(parseRegions(reader, regionStack, regionStack.back()))) if (failed(parseRegions(reader, regionStack, regionStack.back())))

return failure(); return failure();

if (!forwardRefOps.empty()) { if (!forwardRefOps.empty()) {

return reader.emitError( return reader.emitError(

"not all forward unresolved forward operand references"); "not all forward unresolved forward operand references");

} }

// Resolve dialect version.

for (const BytecodeDialect &byteCodeDialect : dialects) {

mehdi_amini: Can you spell the type here?

mfrancio: Definitely.

// Parsing is complete, give an opportunity to each dialect to visit the

// IR and perform upgrades.

if (!byteCodeDialect.loadedVersion)

continue;

if (byteCodeDialect.interface &&

mehdi_amini: Should `byteCodeDialect.dialect` be available here?

mfrancio: Yes, it should, but it looks like you would have to handle a bunch of cases in general.

From BytecodeDialect.h:

/// The loaded dialect entry. This field is std::nullopt if we haven't
/// attempted to load, nullptr if we failed to load, otherwise the loaded
/// dialect.
std::optional<Dialect *> dialect;

I find getting the dialect from the context directly to be generally safer here.
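The "bunch of cases" mentioned above are the three states encoded by `std::optional<Dialect *>` in the quoted comment. A minimal sketch of telling them apart, assuming a stand-in `Dialect` struct and hypothetical names `LoadState`, `classify`, and `expectLoaded` (none of these are MLIR API):

```cpp
#include <cassert>
#include <optional>

struct Dialect {}; // Stand-in for mlir::Dialect, for illustration only.

// Hypothetical names for the three states described in the quoted comment.
enum class LoadState { NotAttempted, FailedToLoad, Loaded };

inline LoadState classify(const std::optional<Dialect *> &dialect) {
  if (!dialect.has_value())
    return LoadState::NotAttempted; // std::nullopt: no load was attempted.
  if (*dialect == nullptr)
    return LoadState::FailedToLoad; // Load was attempted but found nothing.
  return LoadState::Loaded;         // Holds a usable Dialect pointer.
}

// Dereference only when the dialect is known to be loaded.
inline Dialect &expectLoaded(const std::optional<Dialect *> &dialect) {
  assert(classify(dialect) == LoadState::Loaded && "dialect is not loaded");
  return **dialect;
}
```

A caller that only checks the optional for a value would treat the failed-to-load state as success, which is one reason a caller might prefer to query the context instead.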

failed(byteCodeDialect.interface->upgradeFromVersion(

*moduleOp, *byteCodeDialect.loadedVersion)))

return failure();

}

mehdi_amini: Isn't dyn_cast working for dialect interfaces?

mfrancio: Yes, it does work. Are there any issues in using this API, though?

I'll change it anyway, since we could use dyn_cast_or_null and remove the nullptr check on the dialect.

// Verify that the parsed operations are valid. // Verify that the parsed operations are valid.

if (config.shouldVerifyAfterParse() && failed(verify(*moduleOp))) if (config.shouldVerifyAfterParse() && failed(verify(*moduleOp)))

return failure(); return failure();

// Splice the parsed operations over to the provided top-level block. // Splice the parsed operations over to the provided top-level block.

auto &parsedOps = moduleOp->getBody()->getOperations(); auto &parsedOps = moduleOp->getBody()->getOperations();

auto &destOps = block->getOperations(); auto &destOps = block->getOperations();

destOps.splice(destOps.end(), parsedOps, parsedOps.begin(), parsedOps.end()); destOps.splice(destOps.end(), parsedOps, parsedOps.begin(), parsedOps.end());

[239 lines hidden]

LogicalResult BytecodeReader::defineValues(EncodingReader &reader, LogicalResult BytecodeReader::defineValues(EncodingReader &reader,

ValueRange newValues) { ValueRange newValues) {

ValueScope &valueScope = valueScopes.back(); ValueScope &valueScope = valueScopes.back();

std::vector<Value> &values = valueScope.values; std::vector<Value> &values = valueScope.values;

unsigned &valueID = valueScope.nextValueIDs.back(); unsigned &valueID = valueScope.nextValueIDs.back();

unsigned valueIDEnd = valueID + newValues.size(); unsigned valueIDEnd = valueID + newValues.size();

if (valueIDEnd > values.size()) { if (valueIDEnd > values.size()) {

mehdi_amini: Is going through the string name for the dialect the best way to resolve this? (I would think we have a dialect ID directly available, and using an integer makes everything else more straightforward.)

mfrancio: It is true that the bytecode holds an integer which references the string section, but I don't see an existing API to reference the string by index.

I don't really see the "non-straightforward" part anyway: we parse a string with a clean API, and we use it as a hash to map to the version handle. Am I missing something?

mehdi_amini: Efficiency: string manipulation isn't free.

That said, it is pretty bounded here; we should have at most one version per dialect...

But stepping back: why aren't we emitting the version in the dialect section?
We could emit a varint for the version blob size; if it is zero, that means there is no version attached to the dialect.
That seems like it could fit right before the op names.
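The layout proposed here (a varint size followed by that many bytes, with size zero meaning "no version") can be sketched as below. This is only an illustration under stated assumptions: `writeVarInt`, `readVarInt`, `writeVersionBlob`, and `readVersionBlob` are hypothetical helpers, and the LEB128-style varint is a stand-in, not the varint encoding MLIR bytecode actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// LEB128-style varint, for illustration only (not MLIR's varint encoding).
inline void writeVarInt(std::vector<uint8_t> &out, uint64_t value) {
  do {
    uint8_t byte = static_cast<uint8_t>(value & 0x7f);
    value >>= 7;
    if (value != 0)
      byte |= 0x80; // High bit set: more bytes follow.
    out.push_back(byte);
  } while (value != 0);
}

// Returns false if the input ends early or the varint is too long.
inline bool readVarInt(const std::vector<uint8_t> &in, size_t &pos,
                       uint64_t &value) {
  value = 0;
  for (unsigned shift = 0; shift < 64; shift += 7) {
    if (pos >= in.size())
      return false;
    uint8_t byte = in[pos++];
    value |= static_cast<uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0)
      return true;
  }
  return false;
}

// Hypothetical layout: varint size, then `size` raw bytes. Size 0 = no version.
inline void writeVersionBlob(std::vector<uint8_t> &out,
                             const std::vector<uint8_t> &version) {
  writeVarInt(out, version.size());
  out.insert(out.end(), version.begin(), version.end());
}

// Returns false on malformed input; `version` is std::nullopt when size is 0.
inline bool readVersionBlob(const std::vector<uint8_t> &in, size_t &pos,
                            std::optional<std::vector<uint8_t>> &version) {
  uint64_t size;
  if (!readVarInt(in, pos, size) || size > in.size() - pos)
    return false;
  if (size == 0) {
    version = std::nullopt;
    return true;
  }
  version.emplace(in.begin() + pos, in.begin() + pos + size);
  pos += size;
  return true;
}
```

Note that the reader only keeps the raw bytes; decoding them is left to whoever owns the version format, which matches the later point in the thread that the dialect interface is not needed just to hold the buffer.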

mfrancio: The reason why I didn't do it is that it would break existing bytecode files and would require increasing the bytecode version (I am talking about mlir::bytecode::kVersion). I am open to this, but I don't really see the immediate need. It could always be done as part of a major update of the bytecode version itself.

mehdi_amini: In general I'm not in favor of taking a detour when we know where we want to land (I don't see a problem with upgrading the bytecode as a breaking change at this point). There are a couple of things I intend to break there as well soon-ish.

mfrancio: Maybe this was already discussed in the past and I missed it, but isn't the bytecode version itself going to be backward compatible? Is there any interest in achieving this?

saksenadhruv: Yes, we are actually hoping to ship a serialization format with versioning soon, and would like bytecode to have some compatibility, or at least a way to upgrade/downgrade when we break it in the next couple of months.

What is the guidance on using bytecode for serialization and compatibility?

We are using versioning on our dialect, but we need some underlying guarantees on the bytecode itself as well.

mehdi_amini: Yes, we want it to be stable. From my point of view, I am aware of 3 features I want to get before I'm comfortable with trying to claim that we reached the "stability" point:

- Dialect versioning (thank you for driving this!)
- Use-list order.
- Lazy-loading ability.

(Some people may have other ideas; I'm not aware of any.)

Then there is my work on "properties", but I suspect we can preserve backward compatibility on this (assuming the dialects themselves don't change, of course).

mfrancio: Back to:

> We could emit a varint for the version blob size; if it is zero, that means there is no version attached to the dialect.
> That seems like it could fit right before the op names.

If we are in agreement on the current proposal (the dialect provides a version handle which holds a buffer to be written to file), we can definitely emit this blob of data into the dialect section as a breaking change. Can you kindly confirm before I move forward with the change?

mehdi_amini: Yes, I think we should just do that now if we need to. That said, in https://reviews.llvm.org/D145328 I did the change in a backward-compatible way.

mfrancio: Yes! This is exactly what I envisioned when I started implementing the first draft, but I didn't post it as I didn't want to rely on a bytecode version change. Nice to see this.

We could also opt to remove the version section explicitly and inline the read/write of size/bytes, reusing the alignment of the parent dialect section (probably a bit more memory efficient), but this works.

mehdi_amini: The reason I used a section is that when we load the version section we haven't loaded the dialect yet, so we don't have the interface.

mehdi_amini: Something to add still is an attribute in the test dialect that is serialized at v0.1 and read/upgraded during parsing of v0.2.
I suspect we're missing making the version available on the readAttribute API.

mfrancio: I did the change! It is tested only for attributes, but I can easily extend it to types as well!

mfrancio:

> The reason I used a section is that when we load the version section we haven't loaded the dialect yet so we don't have the interface.

I still don't see the reason. I think the section could just be inlined. You don't need the interface to read it (we would just hold the buffer). The interface is needed later to resolve the buffer and decode it... Unless I am missing something subtle :)

mehdi_amini: Right, I guess I didn't find the method to do it!

Do you see how to emit the content of the versionEmitter differently than using emitSection? We need to emit the size and then the content. emitSection() has this logic:

// Push our current buffer and then merge the provided section body into
// ours.
appendResult(std::move(currentResult));
for (std::vector<uint8_t> &result : emitter.prevResultStorage)
  prevResultStorage.push_back(std::move(result));
llvm::append_range(prevResultList, emitter.prevResultList);
prevResultSize += emitter.prevResultSize;
appendResult(std::move(emitter.currentResult));

(knowing that the writeVersion interface can't do it because it needs to compute the size first before emitting the content)
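The constraint raised here is that the size must be written before the content, while the content is produced by a callback whose output length is not known in advance. One common way around this is to run the callback against a scratch buffer first. A minimal sketch, assuming a hypothetical `ByteEmitter` type and `emitSizePrefixed` function (these are not MLIR's `EncodingEmitter` or `emitSection`, and the LEB128-style varint is a stand-in for MLIR's own encoding):

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical byte sink standing in for an emitter; not MLIR API.
struct ByteEmitter {
  std::vector<uint8_t> bytes;

  void emitByte(uint8_t b) { bytes.push_back(b); }

  // LEB128-style varint, for illustration only.
  void emitVarInt(uint64_t value) {
    do {
      uint8_t byte = static_cast<uint8_t>(value & 0x7f);
      value >>= 7;
      if (value != 0)
        byte |= 0x80; // High bit set: more bytes follow.
      emitByte(byte);
    } while (value != 0);
  }
};

// Run `writeBody` against a scratch emitter so its size is known before
// anything reaches `parent`, then emit the size followed by the bytes.
inline void
emitSizePrefixed(ByteEmitter &parent,
                 const std::function<void(ByteEmitter &)> &writeBody) {
  ByteEmitter scratch;
  writeBody(scratch);
  parent.emitVarInt(scratch.bytes.size());
  parent.bytes.insert(parent.bytes.end(), scratch.bytes.begin(),
                      scratch.bytes.end());
}
```

The cost of this approach is one extra copy of the body, which is small when the body is a per-dialect version record.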

mfrancio: Exactly, I was thinking of consolidating this into a new method of the reader and avoiding the need for a new dialect version section. I'll try!

mfrancio: I considered this again, but the only thing it would eventually "save" is printing the varint of the section, so it felt not strictly necessary now that we have the bit flag.

return reader.emitError( return reader.emitError(

"value index range was outside of the expected range for " "value index range was outside of the expected range for "

"the parent region, got [", "the parent region, got [",

valueID, ", ", valueIDEnd, "), but the maximum index was ", valueID, ", ", valueIDEnd, "), but the maximum index was ",

values.size() - 1); values.size() - 1);

} }

// Assign the values and update any forward references. // Assign the values and update any forward references.

[70 lines hidden]

mlir/lib/Bytecode/Writer/BytecodeWriter.cpp

//===- BytecodeWriter.cpp - MLIR Bytecode Writer --------------------------===//		//===- BytecodeWriter.cpp - MLIR Bytecode Writer --------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "mlir/Bytecode/BytecodeWriter.h"		#include "mlir/Bytecode/BytecodeWriter.h"
#include "../Encoding.h"		#include "../Encoding.h"
#include "IRNumbering.h"		#include "IRNumbering.h"
#include "mlir/Bytecode/BytecodeImplementation.h"		#include "mlir/Bytecode/BytecodeImplementation.h"
#include "mlir/IR/BuiltinDialect.h"
#include "mlir/IR/OpImplementation.h"		#include "mlir/IR/OpImplementation.h"
#include "llvm/ADT/CachedHashString.h"		#include "llvm/ADT/CachedHashString.h"
#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/SmallString.h"		#include "llvm/ADT/SmallString.h"
#include "llvm/Support/Debug.h"
#include <random>

#define DEBUG_TYPE "mlir-bytecode-writer"		#define DEBUG_TYPE "mlir-bytecode-writer"

using namespace mlir;		using namespace mlir;
using namespace mlir::bytecode::detail;		using namespace mlir::bytecode::detail;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// BytecodeWriterConfig		// BytecodeWriterConfig
[228 lines hidden]	private:
/// An up-to-date total size of all of the buffers within `prevResultList`.		/// An up-to-date total size of all of the buffers within `prevResultList`.
/// This enables O(1) size checks of the current encoding.		/// This enables O(1) size checks of the current encoding.
size_t prevResultSize = 0;		size_t prevResultSize = 0;

/// The highest required alignment for the start of this section.		/// The highest required alignment for the start of this section.
unsigned requiredAlignment = 1;		unsigned requiredAlignment = 1;
};		};

		//===----------------------------------------------------------------------===//
		// StringSectionBuilder
		//===----------------------------------------------------------------------===//

		namespace {
		/// This class is used to simplify the process of emitting the string section.
		class StringSectionBuilder {
		public:
		/// Add the given string to the string section, and return the index of the
		/// string within the section.
		size_t insert(StringRef str) {
		auto it = strings.insert({llvm::CachedHashStringRef(str), strings.size()});
		return it.first->second;
		}

		/// Write the current set of strings to the given emitter.
		void write(EncodingEmitter &emitter) {
		emitter.emitVarInt(strings.size());

		// Emit the sizes in reverse order, so that we don't need to backpatch an
		// offset to the string data or have a separate section.
		for (const auto &it : llvm::reverse(strings))
		emitter.emitVarInt(it.first.size() + 1);
		// Emit the string data itself.
		for (const auto &it : strings)
		emitter.emitNulTerminatedString(it.first.val());
		}

		private:
		/// A set of strings referenced within the bytecode. The value of the map is
		/// unused.
		llvm::MapVector<llvm::CachedHashStringRef, size_t> strings;
		};
		} // namespace
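
The string section builder above relies on an insertion-ordered map so that each distinct string gets a stable index. A minimal sketch of the same idea using only the standard library follows. `StringTable` and `reversedSizes` are hypothetical names, and the `std::unordered_map` plus `std::vector` pair is a stand-in for `llvm::MapVector`, not what the patch uses.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for StringSectionBuilder built on the standard library.
class StringTable {
public:
  // Return the index of `str`, adding it to the table on first use.
  std::size_t insert(const std::string &str) {
    auto [it, inserted] = indices.try_emplace(str, strings.size());
    if (inserted)
      strings.push_back(str);
    return it->second;
  }

  // Sizes (including the NUL terminator) in reverse insertion order, which is
  // the order in which the write() method above emits them.
  std::vector<std::size_t> reversedSizes() const {
    std::vector<std::size_t> sizes;
    for (auto it = strings.rbegin(); it != strings.rend(); ++it)
      sizes.push_back(it->size() + 1);
    return sizes;
  }

private:
  std::unordered_map<std::string, std::size_t> indices; // string -> index
  std::vector<std::string> strings;                     // insertion order
};
```

Inserting the same string twice returns the index assigned on first insertion, which is what lets the dialect name and the version payload share one string section.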

		class DialectWriter : public DialectBytecodeWriter {
		public:
		DialectWriter(EncodingEmitter &emitter, IRNumberingState &numberingState,
		StringSectionBuilder &stringSection)
		: emitter(emitter), numberingState(numberingState),
		stringSection(stringSection) {}

		//===--------------------------------------------------------------------===//
		// IR
		//===--------------------------------------------------------------------===//

		void writeAttribute(Attribute attr) override {
		emitter.emitVarInt(numberingState.getNumber(attr));
		}
		void writeType(Type type) override {
		emitter.emitVarInt(numberingState.getNumber(type));
		}

		void writeResourceHandle(const AsmDialectResourceHandle &resource) override {
		emitter.emitVarInt(numberingState.getNumber(resource));
		}

		//===--------------------------------------------------------------------===//
		// Primitives
		//===--------------------------------------------------------------------===//

		void writeVarInt(uint64_t value) override { emitter.emitVarInt(value); }

		void writeSignedVarInt(int64_t value) override {
		emitter.emitSignedVarInt(value);
		}

		void writeAPIntWithKnownWidth(const APInt &value) override {
		size_t bitWidth = value.getBitWidth();

		// If the value is a single byte, just emit it directly without going
		// through a varint.
		if (bitWidth <= 8)
		return emitter.emitByte(value.getLimitedValue());

		// If the value fits within a single varint, emit it directly.
		if (bitWidth <= 64)
		return emitter.emitSignedVarInt(value.getLimitedValue());

		// Otherwise, we need to encode a variable number of active words. We use
		// active words instead of the number of total words under the observation
		// that smaller values will be more common.
		unsigned numActiveWords = value.getActiveWords();
		emitter.emitVarInt(numActiveWords);

		const uint64_t *rawValueData = value.getRawData();
		for (unsigned i = 0; i < numActiveWords; ++i)
		emitter.emitSignedVarInt(rawValueData[i]);
		}

		void writeAPFloatWithKnownSemantics(const APFloat &value) override {
		writeAPIntWithKnownWidth(value.bitcastToAPInt());
		}

		void writeOwnedString(StringRef str) override {
		emitter.emitVarInt(stringSection.insert(str));
		}

		void writeOwnedBlob(ArrayRef<char> blob) override {
		emitter.emitVarInt(blob.size());
		emitter.emitOwnedBlob(ArrayRef<uint8_t>(
		reinterpret_cast<const uint8_t *>(blob.data()), blob.size()));
		}

		private:
		EncodingEmitter &emitter;
		IRNumberingState &numberingState;
		StringSectionBuilder &stringSection;
		};

/// A simple raw_ostream wrapper around a EncodingEmitter. This removes the need		/// A simple raw_ostream wrapper around a EncodingEmitter. This removes the need
/// to go through an intermediate buffer when interacting with code that wants a		/// to go through an intermediate buffer when interacting with code that wants a
/// raw_ostream.		/// raw_ostream.
class RawEmitterOstream : public raw_ostream {		class RawEmitterOstream : public raw_ostream {
public:		public:
explicit RawEmitterOstream(EncodingEmitter &emitter) : emitter(emitter) {		explicit RawEmitterOstream(EncodingEmitter &emitter) : emitter(emitter) {
SetUnbuffered();		SetUnbuffered();
}		}
(31 lines not shown)	void EncodingEmitter::emitMultiByteVarInt(uint64_t value) {

// If the value is too large to encode in a single byte, emit a special all		// If the value is too large to encode in a single byte, emit a special all
// zero marker byte and splat the value directly.		// zero marker byte and splat the value directly.
emitByte(0);		emitByte(0);
emitBytes({reinterpret_cast<uint8_t *>(&value), sizeof(value)});		emitBytes({reinterpret_cast<uint8_t *>(&value), sizeof(value)});
}		}
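
Only the fallback path of `emitMultiByteVarInt` is visible in this hunk. The following is a sketch of a prefix varint scheme with the same fallback (an all-zero marker byte followed by the raw 64-bit value). The handling of the 1- to 8-byte forms is an assumption, since those lines are hidden here, and `encodeVarInt` / `decodeVarInt` are hypothetical helper names. Unlike the emitter above, which copies the host representation, the sketch writes bytes in explicit little-endian order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append `value` to `out` as a prefix varint.
void encodeVarInt(uint64_t value, std::vector<uint8_t> &out) {
  // Try the 1- to 8-byte forms; each byte carries 7 bits of payload.
  for (unsigned numBytes = 1; numBytes <= 8; ++numBytes) {
    if ((value >> (7 * numBytes)) != 0)
      continue;
    // Low bits of the first byte: (numBytes - 1) zeros followed by a one.
    uint64_t encoded = ((value << 1) | 1) << (numBytes - 1);
    for (unsigned i = 0; i < numBytes; ++i)
      out.push_back(static_cast<uint8_t>(encoded >> (8 * i)));
    return;
  }
  // Fallback: an all-zero marker byte, then the raw 8 bytes (little-endian).
  out.push_back(0);
  for (unsigned i = 0; i < 8; ++i)
    out.push_back(static_cast<uint8_t>(value >> (8 * i)));
}

// Hypothetical helper: decode one varint starting at `pos`, advancing `pos`.
uint64_t decodeVarInt(const std::vector<uint8_t> &in, std::size_t &pos) {
  assert(pos < in.size() && "truncated input");
  uint8_t first = in[pos];
  if (first == 0) {
    // Marker byte: the next 8 bytes hold the value directly.
    assert(pos + 9 <= in.size() && "truncated input");
    uint64_t value = 0;
    for (unsigned i = 0; i < 8; ++i)
      value |= static_cast<uint64_t>(in[pos + 1 + i]) << (8 * i);
    pos += 9;
    return value;
  }
  // The position of the lowest set bit gives the total byte count.
  unsigned numBytes = 1;
  while (((first >> (numBytes - 1)) & 1) == 0)
    ++numBytes;
  assert(pos + numBytes <= in.size() && "truncated input");
  uint64_t encoded = 0;
  for (unsigned i = 0; i < numBytes; ++i)
    encoded |= static_cast<uint64_t>(in[pos + i]) << (8 * i);
  pos += numBytes;
  return encoded >> numBytes;
}
```

Under this scheme small values cost one byte, and only values needing more than 56 bits pay for the nine-byte form.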

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// StringSectionBuilder
//===----------------------------------------------------------------------===//

namespace {
/// This class is used to simplify the process of emitting the string section.
class StringSectionBuilder {
public:
/// Add the given string to the string section, and return the index of the
/// string within the section.
size_t insert(StringRef str) {
auto it = strings.insert({llvm::CachedHashStringRef(str), strings.size()});
return it.first->second;
}

/// Write the current set of strings to the given emitter.
void write(EncodingEmitter &emitter) {
emitter.emitVarInt(strings.size());

// Emit the sizes in reverse order, so that we don't need to backpatch an
// offset to the string data or have a separate section.
for (const auto &it : llvm::reverse(strings))
emitter.emitVarInt(it.first.size() + 1);
// Emit the string data itself.
for (const auto &it : strings)
emitter.emitNulTerminatedString(it.first.val());
}

private:
/// A set of strings referenced within the bytecode. The value of the map is
/// unused.
llvm::MapVector<llvm::CachedHashStringRef, size_t> strings;
};
} // namespace

//===----------------------------------------------------------------------===//
// Bytecode Writer		// Bytecode Writer
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

namespace {		namespace {
class BytecodeWriter {		class BytecodeWriter {
public:		public:
BytecodeWriter(Operation *op) : numberingState(op) {}		BytecodeWriter(Operation *op) : numberingState(op) {}

(70 lines not shown)	void BytecodeWriter::write(Operation *rootOp, raw_ostream &os,

// Emit the string section.		// Emit the string section.
writeStringSection(emitter);		writeStringSection(emitter);

// Write the generated bytecode to the provided output stream.		// Write the generated bytecode to the provided output stream.
emitter.writeTo(os);		emitter.writeTo(os);
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
		myhsu: format: add braces
// Dialects		// Dialects

/// Write the given entries in contiguous groups with the same parent dialect.		/// Write the given entries in contiguous groups with the same parent dialect.
/// Each dialect sub-group is encoded with the parent dialect and number of		/// Each dialect sub-group is encoded with the parent dialect and number of
/// elements, followed by the encoding for the entries. The given callback is		/// elements, followed by the encoding for the entries. The given callback is
/// invoked to encode each individual entry.		/// invoked to encode each individual entry.
template <typename EntriesT, typename EntryCallbackT>		template <typename EntriesT, typename EntryCallbackT>
static void writeDialectGrouping(EncodingEmitter &emitter, EntriesT &&entries,		static void writeDialectGrouping(EncodingEmitter &emitter, EntriesT &&entries,
(18 lines not shown)
}		}

void BytecodeWriter::writeDialectSection(EncodingEmitter &emitter) {		void BytecodeWriter::writeDialectSection(EncodingEmitter &emitter) {
EncodingEmitter dialectEmitter;		EncodingEmitter dialectEmitter;

// Emit the referenced dialects.		// Emit the referenced dialects.
auto dialects = numberingState.getDialects();		auto dialects = numberingState.getDialects();
dialectEmitter.emitVarInt(llvm::size(dialects));		dialectEmitter.emitVarInt(llvm::size(dialects));
for (DialectNumbering &dialect : dialects)		for (DialectNumbering &dialect : dialects) {
dialectEmitter.emitVarInt(stringSection.insert(dialect.name));		// Write the string section and get the ID.
		size_t nameID = stringSection.insert(dialect.name);

		// Try writing the version to the versionEmitter.
		EncodingEmitter versionEmitter;
		if (dialect.interface) {
		// The writer used when emitting using a custom bytecode encoding.
		DialectWriter versionWriter(versionEmitter, numberingState,
		stringSection);
		dialect.interface->writeVersion(versionWriter);
		}

		// If the version emitter is empty, version is not available. We can encode
		// this in the dialect ID, so if there is no version, we don't write the
		// section.
		size_t versionAvailable = versionEmitter.size() > 0;
		dialectEmitter.emitVarIntWithFlag(nameID, versionAvailable);
		if (versionAvailable)
		dialectEmitter.emitSection(bytecode::Section::kDialectVersions,
		std::move(versionEmitter));
		}

// Emit the referenced operation names grouped by dialect.		// Emit the referenced operation names grouped by dialect.
auto emitOpName = [&](OpNameNumbering &name) {		auto emitOpName = [&](OpNameNumbering &name) {
dialectEmitter.emitVarInt(stringSection.insert(name.name.stripDialect()));		dialectEmitter.emitVarInt(stringSection.insert(name.name.stripDialect()));
};		};
writeDialectGrouping(dialectEmitter, numberingState.getOpNames(), emitOpName);		writeDialectGrouping(dialectEmitter, numberingState.getOpNames(), emitOpName);

emitter.emitSection(bytecode::Section::kDialect, std::move(dialectEmitter));		emitter.emitSection(bytecode::Section::kDialect, std::move(dialectEmitter));
}		}
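
The new dialect section stores "is a version present" alongside the dialect name ID through `emitVarIntWithFlag`. A minimal sketch of that kind of packing follows, assuming the flag occupies the least-significant bit of the integer before it is varint-encoded. `packWithFlag` and `unpackWithFlag` are hypothetical names, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper; assumes the flag occupies the least-significant bit.
uint64_t packWithFlag(uint64_t value, bool flag) {
  // Shifting left by one drops the top bit, so the value must fit in 63 bits.
  assert((value >> 63) == 0 && "top bit would be lost");
  return (value << 1) | (flag ? 1 : 0);
}

// Hypothetical inverse of packWithFlag: recover the value and the flag.
std::pair<uint64_t, bool> unpackWithFlag(uint64_t packed) {
  return {packed >> 1, (packed & 1) != 0};
}
```

With this layout a dialect without a version costs no extra bytes beyond the single flag bit, which matches the comment in the loop above about not writing the version section when it is empty.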

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Attributes and Types		// Attributes and Types

namespace {
class DialectWriter : public DialectBytecodeWriter {
public:
DialectWriter(EncodingEmitter &emitter, IRNumberingState &numberingState,
StringSectionBuilder &stringSection)
: emitter(emitter), numberingState(numberingState),
stringSection(stringSection) {}

//===--------------------------------------------------------------------===//
// IR
//===--------------------------------------------------------------------===//

void writeAttribute(Attribute attr) override {
emitter.emitVarInt(numberingState.getNumber(attr));
}
void writeType(Type type) override {
emitter.emitVarInt(numberingState.getNumber(type));
}

void writeResourceHandle(const AsmDialectResourceHandle &resource) override {
emitter.emitVarInt(numberingState.getNumber(resource));
}

//===--------------------------------------------------------------------===//
// Primitives
//===--------------------------------------------------------------------===//

void writeVarInt(uint64_t value) override { emitter.emitVarInt(value); }

void writeSignedVarInt(int64_t value) override {
emitter.emitSignedVarInt(value);
}

void writeAPIntWithKnownWidth(const APInt &value) override {
size_t bitWidth = value.getBitWidth();

// If the value is a single byte, just emit it directly without going
// through a varint.
if (bitWidth <= 8)
return emitter.emitByte(value.getLimitedValue());

// If the value fits within a single varint, emit it directly.
if (bitWidth <= 64)
return emitter.emitSignedVarInt(value.getLimitedValue());

// Otherwise, we need to encode a variable number of active words. We use
// active words instead of the number of total words under the observation
// that smaller values will be more common.
unsigned numActiveWords = value.getActiveWords();
emitter.emitVarInt(numActiveWords);

const uint64_t *rawValueData = value.getRawData();
for (unsigned i = 0; i < numActiveWords; ++i)
emitter.emitSignedVarInt(rawValueData[i]);
}

void writeAPFloatWithKnownSemantics(const APFloat &value) override {
writeAPIntWithKnownWidth(value.bitcastToAPInt());
}

void writeOwnedString(StringRef str) override {
emitter.emitVarInt(stringSection.insert(str));
}

void writeOwnedBlob(ArrayRef<char> blob) override {
emitter.emitVarInt(blob.size());
emitter.emitOwnedBlob(ArrayRef<uint8_t>(
reinterpret_cast<const uint8_t *>(blob.data()), blob.size()));
}

private:
EncodingEmitter &emitter;
IRNumberingState &numberingState;
StringSectionBuilder &stringSection;
};
} // namespace

void BytecodeWriter::writeAttrTypeSection(EncodingEmitter &emitter) {		void BytecodeWriter::writeAttrTypeSection(EncodingEmitter &emitter) {
EncodingEmitter attrTypeEmitter;		EncodingEmitter attrTypeEmitter;
EncodingEmitter offsetEmitter;		EncodingEmitter offsetEmitter;
offsetEmitter.emitVarInt(llvm::size(numberingState.getAttributes()));		offsetEmitter.emitVarInt(llvm::size(numberingState.getAttributes()));
offsetEmitter.emitVarInt(llvm::size(numberingState.getTypes()));		offsetEmitter.emitVarInt(llvm::size(numberingState.getTypes()));

// A functor used to emit an attribute or type entry.		// A functor used to emit an attribute or type entry.
uint64_t prevOffset = 0;		uint64_t prevOffset = 0;
(294 lines not shown)

mlir/test/Bytecode/invalid/invalid-structure.mlir

	// This file contains various failure test cases related to the structure of			// This file contains various failure test cases related to the structure of
	// a bytecode file.			// a bytecode file.

	// Bytecode currently does not support big-endian platforms			// Bytecode currently does not support big-endian platforms
	// UNSUPPORTED: target=s390x-{{.*}}			// UNSUPPORTED: target=s390x-{{.*}}

	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//
	// Version			// Version
	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//

	// RUN: not mlir-opt %S/invalid-structure-version.mlirbc 2>&1 | FileCheck %s --check-prefix=VERSION			// RUN: not mlir-opt %S/invalid-structure-version.mlirbc 2>&1 | FileCheck %s --check-prefix=VERSION
	// VERSION: bytecode version 127 is newer than the current version 0			// VERSION: bytecode version 127 is newer than the current version 1

	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//
	// Producer			// Producer
	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//

	// RUN: not mlir-opt %S/invalid-structure-producer.mlirbc 2>&1 | FileCheck %s --check-prefix=PRODUCER			// RUN: not mlir-opt %S/invalid-structure-producer.mlirbc 2>&1 | FileCheck %s --check-prefix=PRODUCER
	// PRODUCER: malformed null-terminated string, no null character found			// PRODUCER: malformed null-terminated string, no null character found

	(27 lines not shown)

mlir/test/Bytecode/versioning/versioned-attr-1.12.mlirbc

This binary file was added.

mlir/test/Bytecode/versioning/versioned-attr-2.0.mlirbc

This binary file was added.

mlir/test/Bytecode/versioning/versioned-op-1.12.mlirbc

This binary file was added.

mlir/test/Bytecode/versioning/versioned-op-2.0.mlirbc

This binary file was added.

mlir/test/Bytecode/versioning/versioned-op-2.2.mlirbc

This binary file was added.

mlir/test/Bytecode/versioning/versioned_attr.mlir

This file was added.

				// This file contains a test case representative of a dialect parsing an
				// attribute with versioned custom encoding.

				// Bytecode currently does not support big-endian platforms
				// UNSUPPORTED: target=s390x-{{.*}}

				//===--------------------------------------------------------------------===//
				// Test attribute upgrade
				//===--------------------------------------------------------------------===//

				// COM: bytecode contains
				// COM: module {
				// COM: version: 1.12
				// COM: "test.versionedB"() {attribute = #test.attr_params<24, 42>} : () -> ()
				// COM: }
				// RUN: mlir-opt %S/versioned-attr-1.12.mlirbc 2>&1 | FileCheck %s --check-prefix=CHECK1
				// CHECK1: "test.versionedB"() {attribute = #test.attr_params<42, 24>} : () -> ()

				//===--------------------------------------------------------------------===//
				// Test attribute upgrade
				//===--------------------------------------------------------------------===//

				// COM: bytecode contains
				// COM: module {
				// COM: version: 2.0
				// COM: "test.versionedB"() {attribute = #test.attr_params<42, 24>} : () -> ()
				// COM: }
				// RUN: mlir-opt %S/versioned-attr-2.0.mlirbc 2>&1 | FileCheck %s --check-prefix=CHECK2
				// CHECK2: "test.versionedB"() {attribute = #test.attr_params<42, 24>} : () -> ()
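
The two cases above follow from the reader choosing a decoding order based on the dialect version stored in the file. The sketch below models that choice outside MLIR. `AttrVersion`, `Params` and `decodeParams` are hypothetical names, and the payload is modelled as an already-decoded pair of integers rather than a bytecode stream.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of a (major, minor) dialect version.
struct AttrVersion {
  uint32_t majorNum;
  uint32_t minorNum;
};

// Hypothetical model of the two parameters of #test.attr_params.
struct Params {
  uint64_t v0;
  uint64_t v1;
};

// Decode a two-integer payload according to the version that produced it.
std::optional<Params> decodeParams(const std::vector<uint64_t> &payload,
                                   AttrVersion version) {
  if (payload.size() != 2)
    return std::nullopt;
  // Before 2.0 the payload stored v1 first, then v0.
  if (version.majorNum < 2)
    return Params{payload[1], payload[0]};
  // Version 2.0 stores v0 first, then v1.
  if (version.majorNum == 2 && version.minorNum == 0)
    return Params{payload[0], payload[1]};
  // Anything newer than the current version is rejected.
  return std::nullopt;
}
```

Under these assumptions a 1.12 payload of (24, 42) and a 2.0 payload of (42, 24) decode to the same parameters, mirroring the CHECK1 and CHECK2 expectations.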

mlir/test/Bytecode/versioning/versioned_op.mlir

This file was added.

				// This file contains test cases related to the dialect post-parsing upgrade
				// mechanism.

				// Bytecode currently does not support big-endian platforms
				// UNSUPPORTED: target=s390x-{{.*}}

				//===--------------------------------------------------------------------===//
				// Test generic
				//===--------------------------------------------------------------------===//

				// COM: bytecode contains
				// COM: module {
				// COM: version: 2.0
				// COM: "test.versionedA"() {dims = 123 : i64, modifier = false} : () -> ()
				// COM: }
				// RUN: mlir-opt %S/versioned-op-2.0.mlirbc 2>&1 | FileCheck %s --check-prefix=CHECK1
				// CHECK1: "test.versionedA"() {dims = 123 : i64, modifier = false} : () -> ()

				//===--------------------------------------------------------------------===//
				// Test upgrade
				//===--------------------------------------------------------------------===//

				// COM: bytecode contains
				// COM: module {
				// COM: version: 1.12
				// COM: "test.versionedA"() {dimensions = 123 : i64} : () -> ()
				// COM: }
				// RUN: mlir-opt %S/versioned-op-1.12.mlirbc 2>&1 | FileCheck %s --check-prefix=CHECK2
				// CHECK2: "test.versionedA"() {dims = 123 : i64, modifier = false} : () -> ()

				//===--------------------------------------------------------------------===//
				// Test forbidden downgrade
				//===--------------------------------------------------------------------===//

				// COM: bytecode contains
				// COM: module {
				// COM: version: 2.2
				// COM: "test.versionedA"() {dims = 123 : i64, modifier = false} : () -> ()
				// COM: }
				// RUN: not mlir-opt %S/versioned-op-2.2.mlirbc 2>&1 | FileCheck %s --check-prefix=ERR_NEW_VERSION
				// ERR_NEW_VERSION: current test dialect version is 2.0, can't parse version: 2.2
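
The three test cases above correspond to three outcomes of comparing the file's dialect version against the current one (2.0 in the test dialect). A minimal standalone sketch of that comparison follows. `DialectVer`, `LoadAction` and `classify` are hypothetical names, and the lexicographic (major, minor) ordering is an assumption consistent with the checks in these tests.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical model of a (major, minor) dialect version.
struct DialectVer {
  uint32_t majorNum;
  uint32_t minorNum;
};

enum class LoadAction { UpToDate, Upgrade, Reject };

// Compare the version found in the file against the current version.
LoadAction classify(DialectVer file, DialectVer current = {2, 0}) {
  auto lhs = std::tie(file.majorNum, file.minorNum);
  auto rhs = std::tie(current.majorNum, current.minorNum);
  if (lhs == rhs)
    return LoadAction::UpToDate; // nothing to do
  if (lhs > rhs)
    return LoadAction::Reject;   // file is newer than this build understands
  return LoadAction::Upgrade;    // older file: run the post-parsing upgrade
}
```

Note that 1.12 sorts before 2.0 because the major number is compared first, so the 1.12 file takes the upgrade path while 2.2 is rejected.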

mlir/test/lib/Dialect/Test/TestDialect.cpp

//===- TestDialect.cpp - MLIR Dialect for Testing -------------------------===//		//===- TestDialect.cpp - MLIR Dialect for Testing -------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "TestDialect.h"		#include "TestDialect.h"
#include "TestAttributes.h"		#include "TestAttributes.h"
#include "TestInterfaces.h"		#include "TestInterfaces.h"
#include "TestTypes.h"		#include "TestTypes.h"
		#include "mlir/Bytecode/BytecodeImplementation.h"
#include "mlir/Dialect/Arith/IR/Arith.h"		#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/DLTI/DLTI.h"
#include "mlir/Dialect/Func/IR/FuncOps.h"		#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/Dialect/Tensor/IR/Tensor.h"		#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "mlir/IR/AsmState.h"		#include "mlir/IR/AsmState.h"
#include "mlir/IR/BuiltinAttributes.h"		#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/BuiltinOps.h"		#include "mlir/IR/BuiltinOps.h"
#include "mlir/IR/Diagnostics.h"		#include "mlir/IR/Diagnostics.h"
#include "mlir/IR/DialectImplementation.h"
#include "mlir/IR/ExtensibleDialect.h"		#include "mlir/IR/ExtensibleDialect.h"
#include "mlir/IR/MLIRContext.h"		#include "mlir/IR/MLIRContext.h"
#include "mlir/IR/OperationSupport.h"		#include "mlir/IR/OperationSupport.h"
#include "mlir/IR/PatternMatch.h"		#include "mlir/IR/PatternMatch.h"
#include "mlir/IR/TypeUtilities.h"		#include "mlir/IR/TypeUtilities.h"
#include "mlir/IR/Verifier.h"		#include "mlir/IR/Verifier.h"
#include "mlir/Interfaces/InferIntRangeInterface.h"		#include "mlir/Interfaces/InferIntRangeInterface.h"
#include "mlir/Reducer/ReductionPatternInterface.h"		#include "mlir/Reducer/ReductionPatternInterface.h"
#include "mlir/Transforms/FoldUtils.h"		#include "mlir/Transforms/FoldUtils.h"
#include "mlir/Transforms/InliningUtils.h"		#include "mlir/Transforms/InliningUtils.h"
#include "llvm/ADT/SmallString.h"		#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include <optional>

#include <numeric>		#include <numeric>
		#include <optional>

// Include this before the using namespace lines below to		// Include this before the using namespace lines below to
// test that we don't have namespace dependencies.		// test that we don't have namespace dependencies.
#include "TestOpsDialect.cpp.inc"		#include "TestOpsDialect.cpp.inc"

using namespace mlir;		using namespace mlir;
using namespace test;		using namespace test;

void test::registerTestDialect(DialectRegistry &registry) {		void test::registerTestDialect(DialectRegistry &registry) {
registry.insert<TestDialect>();		registry.insert<TestDialect>();
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
		// TestDialect version utilities
		//===----------------------------------------------------------------------===//

		struct TestDialectVersion : public DialectVersion {
		uint32_t major = 2;
		uint32_t minor = 0;
		};

		//===----------------------------------------------------------------------===//
// TestDialect Interfaces		// TestDialect Interfaces
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

namespace {		namespace {
		mehdi_amini: We should use the bytecode encoding here, at a minimum for portability. That is, encode the two ints as varints (and decode them when loading).
		mfrancio (author): This is a very good point that I overlooked. It looks like we have two separate problems here: one is portability, and the other is applying some sort of compression (not really critical in my view, but nice to have). For the purpose of the example, the first problem could be solved simply by using the helpers exposed by LLVM in llvm/Support/Endian.h. For example, we could write/read the integers representing the version using `inline void write16le(void *P, uint16_t V) { write16<little>(P, V); }` and `inline uint16_t read16le(const void *P) { return read16<little>(P); }`. For the second, using varInt is definitely a great idea. It would be great to reuse the same bytecode emitters and readers, but it looks like they are not really exposed outside the bytecode cpp files. What we could do is expose the varInt portion as helpers under mlir/Support. I am open to doing it, but since this is just an example, is it really worth it? Looking forward to hearing your thoughts.
		mehdi_amini: The bytecode primitives are exposed in the public header `mlir/include/mlir/Bytecode/BytecodeImplementation.h`. Have a look at the dialect interface for manipulating types and attributes: `virtual Attribute readAttribute(DialectBytecodeReader &reader) const {`. We should model the API here similarly: for a dialect, writing a custom version blob should be no different from writing an attribute.
		mfrancio (author): Sounds good, I'll take a look.
		mfrancio (author): I considered this, but I don't really see a way to model the API for reading and writing a version into the dialect section through what is exposed in `BytecodeImplementation.h`. That dialect interface seems to have very specific objectives that are tied to writing and reading custom attributes and types into their respective sections. What we need is an interface that allows the dialect to write into a custom buffer. We could model the API through this interface, but it would become something pretty close to the EncodingEmitter implemented in `mlir/lib/Bytecode/Writer/BytecodeWriter.cpp`, line 64. Wouldn't it be more convenient to expose something like this under Support? It is true that writing a blob of data is no different from writing an attribute, but what changes here is the way this blob of data is created. For the attribute, its encoding is defined. But since we want to be independent of any existing attribute, and the encoding to be completely defined by the user, I don't really see another convenient way of doing this other than exposing a low-level API that lets users write whatever encoding they need into the data blob they wish to use to represent the version.
		mehdi_amini:
		> I considered this, but I don't really see a way to model the API for reading and writing a version into the dialect section through what is exposed in BytecodeImplementation.h. That dialect interface seems to have very specific objectives that are tied to writing and reading custom attributes and types into their respective sections.
		Right, sorry if I gave the impression that this interface was "ready to be used" as-is here. I meant to point to it as an example of an API that allows dialect authors to access bytecode manipulation primitives.
		> What we need is an interface that allows the dialect to write into a custom buffer. We could model the API through this interface, but it would become something pretty close to the EncodingEmitter implemented in mlir/lib/Bytecode/Writer/BytecodeWriter.cpp, line 64. Wouldn't it be more convenient to expose something like this under Support? It is true that writing a blob of data is no different from writing an attribute, but what changes here is the way this blob of data is created. For the attribute, its encoding is defined. But since we want to be independent of any existing attribute, and the encoding to be completely defined by the user, I don't really see another convenient way of doing this other than exposing a low-level API that lets users write whatever encoding they need into the data blob they wish to use to represent the version.
		I started typing a long answer here, but felt like I was missing something, so I sketched something here instead: https://reviews.llvm.org/D145328 (there is still a bug, and I haven't regenerated the bytecode test file, but the interface is there!)
		mfrancio (author): Thanks for the suggestion. This is very neat, I'll try to finalize it and regenerate the bytecode test files.
		jpienaar: I like the sketch.

/// Testing the correctness of some traits.		/// Testing the correctness of some traits.
static_assert(		static_assert(
llvm::is_detected<OpTrait::has_implicit_terminator_t,		llvm::is_detected<OpTrait::has_implicit_terminator_t,
SingleBlockImplicitTerminatorOp>::value,		SingleBlockImplicitTerminatorOp>::value,
"has_implicit_terminator_t does not match SingleBlockImplicitTerminatorOp");		"has_implicit_terminator_t does not match SingleBlockImplicitTerminatorOp");
static_assert(OpTrait::hasSingleBlockImplicitTerminator<		static_assert(OpTrait::hasSingleBlockImplicitTerminator<
SingleBlockImplicitTerminatorOp>::value,		SingleBlockImplicitTerminatorOp>::value,
"hasSingleBlockImplicitTerminator does not match "		"hasSingleBlockImplicitTerminator does not match "
"SingleBlockImplicitTerminatorOp");		"SingleBlockImplicitTerminatorOp");

struct TestResourceBlobManagerInterface		struct TestResourceBlobManagerInterface
: public ResourceBlobManagerDialectInterfaceBase<		: public ResourceBlobManagerDialectInterfaceBase<
TestDialectResourceBlobHandle> {		TestDialectResourceBlobHandle> {
using ResourceBlobManagerDialectInterfaceBase<		using ResourceBlobManagerDialectInterfaceBase<
TestDialectResourceBlobHandle>::ResourceBlobManagerDialectInterfaceBase;		TestDialectResourceBlobHandle>::ResourceBlobManagerDialectInterfaceBase;
};		};

		namespace {
		enum test_encoding { k_attr_params = 0 };
		}

		// Test support for interacting with the Bytecode reader/writer.
		struct TestBytecodeDialectInterface : public BytecodeDialectInterface {
		using BytecodeDialectInterface::BytecodeDialectInterface;
		TestBytecodeDialectInterface(Dialect *dialect)
		: BytecodeDialectInterface(dialect) {}

		LogicalResult writeAttribute(Attribute attr,
		DialectBytecodeWriter &writer) const final {
		if (auto concreteAttr = llvm::dyn_cast<TestAttrParamsAttr>(attr)) {
		writer.writeVarInt(test_encoding::k_attr_params);
		writer.writeVarInt(concreteAttr.getV0());
		writer.writeVarInt(concreteAttr.getV1());
		return success();
		}
		writer.writeAttribute(attr);
		return success();
		}

		Attribute readAttribute(DialectBytecodeReader &reader,
		const DialectVersion &version_) const final {
		const auto &version = static_cast<const TestDialectVersion &>(version_);
		if (version.major < 2)
		return readAttrOldEncoding(reader);
		if (version.major == 2 && version.minor == 0)
		mehdi_amini: Would `else` be enough here?
		mfrancio (author): I think the comment below is misleading: the intent was to forbid reading a version newer than the current one. I'll revise this.
		return readAttrNewEncoding(reader);
		// Forbid reading future versions by returning nullptr.
		return Attribute();
		}

		// Emit a specific version of the dialect.
		void writeVersion(DialectBytecodeWriter &writer) const final {
		auto version = TestDialectVersion();
		writer.writeVarInt(version.major); // major
		writer.writeVarInt(version.minor); // minor
		}

		std::unique_ptr<DialectVersion>
		readVersion(DialectBytecodeReader &reader) const final {
		uint64_t major, minor;
		if (failed(reader.readVarInt(major)) || failed(reader.readVarInt(minor)))
		return nullptr;
		auto version = std::make_unique<TestDialectVersion>();
		version->major = major;
		version->minor = minor;
		return version;
		}

		LogicalResult upgradeFromVersion(Operation *topLevelOp,
		const DialectVersion &version_) const final {
		const auto &version = static_cast<const TestDialectVersion &>(version_);
		if ((version.major == 2) && (version.minor == 0))
		return success();
		if (version.major > 2 || (version.major == 2 && version.minor > 0)) {
		return topLevelOp->emitError()
		<< "current test dialect version is 2.0, can't parse version: "
		<< version.major << "." << version.minor;
		}
		// Prior to version 2.0, the op supported only a single attribute called
		// "dimensions". We can perform the upgrade.
		topLevelOp->walk([](TestVersionedOpA op) {
		if (auto dims = op->getAttr("dimensions")) {
		op->removeAttr("dimensions");
		op->setAttr("dims", dims);
		}
		op->setAttr("modifier", BoolAttr::get(op->getContext(), false));
		});
		return success();
		}

		private:
		Attribute readAttrNewEncoding(DialectBytecodeReader &reader) const {
		uint64_t encoding;
		if (failed(reader.readVarInt(encoding)) ||
		encoding != test_encoding::k_attr_params)
		return Attribute();
		// The new encoding has v0 first, v1 second.
		uint64_t v0, v1;
		if (failed(reader.readVarInt(v0)) || failed(reader.readVarInt(v1)))
		return Attribute();
		return TestAttrParamsAttr::get(getContext(), static_cast<int>(v0),
		static_cast<int>(v1));
		}

		Attribute readAttrOldEncoding(DialectBytecodeReader &reader) const {
		uint64_t encoding;
		if (failed(reader.readVarInt(encoding)) ||
		encoding != test_encoding::k_attr_params)
		return Attribute();
		// The old encoding has v1 first, v0 second.
		uint64_t v0, v1;
		if (failed(reader.readVarInt(v1)) || failed(reader.readVarInt(v0)))
		return Attribute();
		jpienaar: Note: error messages should follow LLVM convention and be a sentence fragment (start lower case, no trailing punctuation)
		return TestAttrParamsAttr::get(getContext(), static_cast<int>(v0),
		static_cast<int>(v1));
		}
		};

// Test support for interacting with the AsmPrinter.		// Test support for interacting with the AsmPrinter.
struct TestOpAsmInterface : public OpAsmDialectInterface {		struct TestOpAsmInterface : public OpAsmDialectInterface {
using OpAsmDialectInterface::OpAsmDialectInterface;		using OpAsmDialectInterface::OpAsmDialectInterface;
TestOpAsmInterface(Dialect *dialect, TestResourceBlobManagerInterface &mgr)		TestOpAsmInterface(Dialect *dialect, TestResourceBlobManagerInterface &mgr)
: OpAsmDialectInterface(dialect), blobManager(mgr) {}		: OpAsmDialectInterface(dialect), blobManager(mgr) {}

//===------------------------------------------------------------------===//		//===------------------------------------------------------------------===//
// Aliases		// Aliases
[281 lines hidden]	#include "TestOps.cpp.inc"
registerDynamicOp(getDynamicGenericOp(this));		registerDynamicOp(getDynamicGenericOp(this));
registerDynamicOp(getDynamicOneOperandTwoResultsOp(this));		registerDynamicOp(getDynamicOneOperandTwoResultsOp(this));
registerDynamicOp(getDynamicCustomParserPrinterOp(this));		registerDynamicOp(getDynamicCustomParserPrinterOp(this));

auto &blobInterface = addInterface<TestResourceBlobManagerInterface>();		auto &blobInterface = addInterface<TestResourceBlobManagerInterface>();
addInterface<TestOpAsmInterface>(blobInterface);		addInterface<TestOpAsmInterface>(blobInterface);

addInterfaces<TestDialectFoldInterface, TestInlinerInterface,		addInterfaces<TestDialectFoldInterface, TestInlinerInterface,
TestReductionPatternInterface>();		TestReductionPatternInterface, TestBytecodeDialectInterface>();
allowUnknownOperations();		allowUnknownOperations();

// Instantiate our fallback op interface that we'll use on specific		// Instantiate our fallback op interface that we'll use on specific
// unregistered op.		// unregistered op.
fallbackEffectOpInterfaces = new TestOpEffectInterfaceFallback;		fallbackEffectOpInterfaces = new TestOpEffectInterfaceFallback;
}		}
TestDialect::~TestDialect() {		TestDialect::~TestDialect() {
delete static_cast<TestOpEffectInterfaceFallback *>(		delete static_cast<TestOpEffectInterfaceFallback *>(
[719 lines hidden]	void TestOpWithRegionPattern::getCanonicalizationPatterns(
RewritePatternSet &results, MLIRContext *context) {		RewritePatternSet &results, MLIRContext *context) {
results.add<TestRemoveOpWithInnerOps>(context);		results.add<TestRemoveOpWithInnerOps>(context);
}		}

OpFoldResult TestOpWithRegionFold::fold(FoldAdaptor adaptor) {		OpFoldResult TestOpWithRegionFold::fold(FoldAdaptor adaptor) {
return getOperand();		return getOperand();
}		}

OpFoldResult TestOpConstant::fold(FoldAdaptor adaptor) {		OpFoldResult TestOpConstant::fold(FoldAdaptor adaptor) { return getValue(); }
return getValue();
}

LogicalResult TestOpWithVariadicResultsAndFolder::fold(		LogicalResult TestOpWithVariadicResultsAndFolder::fold(
FoldAdaptor adaptor, SmallVectorImpl<OpFoldResult> &results) {		FoldAdaptor adaptor, SmallVectorImpl<OpFoldResult> &results) {
for (Value input : this->getOperands()) {		for (Value input : this->getOperands()) {
results.push_back(input);		results.push_back(input);
}		}
return success();		return success();
}		}
[443 lines hidden]	LogicalResult TestVerifiersOp::verify() {
Operation *definingOp = getInput().getDefiningOp();		Operation *definingOp = getInput().getDefiningOp();
if (definingOp && failed(mlir::verify(definingOp)))		if (definingOp && failed(mlir::verify(definingOp)))
return emitOpError("operand hasn't been verified");		return emitOpError("operand hasn't been verified");

emitRemark("success run of verifier");		emitRemark("success run of verifier");

return success();		return success();
}		}

		jpienaar: I may have missed where this is used.
		mfrancio (Author): Yes, indeed I forgot to upload the corresponding mlir file.
LogicalResult TestVerifiersOp::verifyRegions() {		LogicalResult TestVerifiersOp::verifyRegions() {
if (!getRegion().hasOneBlock())		if (!getRegion().hasOneBlock())
return emitOpError("`hasOneBlock` trait hasn't been verified");		return emitOpError("`hasOneBlock` trait hasn't been verified");

for (Block &block : getRegion())		for (Block &block : getRegion())
for (Operation &op : block)		for (Operation &op : block)
if (failed(mlir::verify(&op)))		if (failed(mlir::verify(&op)))
return emitOpError("nested op hasn't been verified");		return emitOpError("nested op hasn't been verified");
[81 lines hidden]
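
The version-gated `readAttribute` in `TestBytecodeDialectInterface` above can be illustrated without any MLIR dependency. The following is a minimal sketch under stated assumptions, not the code from this revision: `SketchVersion`, `SketchReader`, `Params`, and `readParams` are hypothetical names, and the reader decodes LEB128 varints, which is a simplification and not MLIR's actual varint encoding. It mirrors the dispatch shown in the diff: versions before 2.0 store v1 then v0, version 2.0 stores v0 then v1, and anything newer is rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for TestDialectVersion.
struct SketchVersion {
  uint64_t majorVersion = 2;
  uint64_t minorVersion = 0;
};

// Hypothetical stand-in for DialectBytecodeReader. It decodes LEB128
// varints, which is NOT the encoding MLIR bytecode actually uses.
class SketchReader {
public:
  explicit SketchReader(std::vector<uint8_t> bytes) : data(std::move(bytes)) {}

  // Returns false when the input ends before the value is complete.
  bool readVarInt(uint64_t &result) {
    result = 0;
    for (unsigned shift = 0; shift < 64; shift += 7) {
      if (pos >= data.size())
        return false;
      uint8_t byte = data[pos++];
      result |= static_cast<uint64_t>(byte & 0x7f) << shift;
      if ((byte & 0x80) == 0)
        return true;
    }
    return false; // Too many continuation bytes.
  }

private:
  std::vector<uint8_t> data;
  size_t pos = 0;
};

// Decoded (v0, v1) pair, standing in for TestAttrParamsAttr.
using Params = std::pair<int, int>;

// Kind tag written before the payload (k_attr_params in the diff).
constexpr uint64_t kAttrParamsKind = 0;

std::optional<Params> readParams(SketchReader &reader,
                                 const SketchVersion &version) {
  // Forbid reading anything newer than the current version, 2.0.
  if (version.majorVersion > 2 ||
      (version.majorVersion == 2 && version.minorVersion > 0))
    return std::nullopt;

  uint64_t kind, first, second;
  if (!reader.readVarInt(kind) || kind != kAttrParamsKind)
    return std::nullopt;
  if (!reader.readVarInt(first) || !reader.readVarInt(second))
    return std::nullopt;

  // Old encoding (1.x): v1 first, v0 second. New encoding (2.0): v0, v1.
  if (version.majorVersion < 2)
    return Params(static_cast<int>(second), static_cast<int>(first));
  return Params(static_cast<int>(first), static_cast<int>(second));
}
```

Putting the "too new" check first makes the rejection of future versions explicit, which is the intent the author describes in the inline discussion.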

mlir/test/lib/Dialect/Test/TestOps.td

[3,143 lines hidden]	def TestCSEOfSingleBlockOp : TEST_Op<"cse_of_single_block_op",
let results = (outs Variadic<AnyType>:$outputs);		let results = (outs Variadic<AnyType>:$outputs);
let regions = (region SizedRegion<1>:$region);		let regions = (region SizedRegion<1>:$region);
let assemblyFormat = [{		let assemblyFormat = [{
attr-dict `inputs` `(` $inputs `)`		attr-dict `inputs` `(` $inputs `)`
$region `:` type($inputs) `->` type($outputs)		$region `:` type($inputs) `->` type($outputs)
}];		}];
}		}

		//===----------------------------------------------------------------------===//
		// Test Ops to upgrade based on the dialect version
		//===----------------------------------------------------------------------===//

		def TestVersionedOpA : TEST_Op<"versionedA"> {
		// A previous version of the dialect (let's say 1.*) supported an attribute
		// named "dimensions":
		// let arguments = (ins
		// AnyI64Attr:$dimensions
		// );

		// In the current version (2.0) "dimensions" was renamed to "dims", and a new
		// boolean attribute "modifier" was added. The previous version of the op
		// corresponds to "modifier=false". We support loading old IR through
		// upgrading, see `upgradeFromVersion()` in `TestBytecodeDialectInterface`.
		let arguments = (ins
		AnyI64Attr:$dims,
		BoolAttr:$modifier
		);
		}
		rriddle: Dead code?
		mfrancio (Author): yep. thanks for pointing it out!
		mehdi_amini: Leftover?

		def TestVersionedOpB : TEST_Op<"versionedB"> {
		// In a previous version of the dialect (let's say 1.*) we encoded
		// TestAttrParams with a custom encoding:
		//
		// #test.attr_params<X, Y> -> { varInt: Y, varInt: X }
		//
		// In the current version (2.0) the encoding changed and the two parameters of
		// the attribute are swapped:
		//
		// #test.attr_params<X, Y> -> { varInt: X, varInt: Y }
		//
		// We support loading old IR through a custom readAttribute method, see
		// `readAttribute()` in `TestBytecodeDialectInterface`.
		let arguments = (ins
		TestAttrParams:$attribute
		);
		}

#endif // TEST_OPS		#endif // TEST_OPS
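
The `TestVersionedOpA` comments above describe the post-parsing upgrade: "dimensions" is renamed to "dims" and a "modifier" attribute defaulting to false is added. A minimal standalone sketch of that logic follows, assuming a plain attribute map in place of an MLIR operation. `AttrValue`, `AttrDict`, `SketchDialectVersion`, and `upgradeToCurrent` are hypothetical names introduced for illustration, not part of this revision or of MLIR.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <variant>

// Hypothetical attribute storage standing in for an op's attribute dictionary.
using AttrValue = std::variant<int64_t, bool>;
using AttrDict = std::map<std::string, AttrValue>;

// Hypothetical stand-in for TestDialectVersion.
struct SketchDialectVersion {
  uint64_t majorVersion;
  uint64_t minorVersion;
};

// Upgrades attributes written by a pre-2.0 producer to the 2.0 layout.
// Returns false, leaving the attributes untouched, when the version is
// newer than 2.0 and therefore cannot be handled.
bool upgradeToCurrent(AttrDict &attrs, const SketchDialectVersion &version) {
  // Already at the current version: nothing to do.
  if (version.majorVersion == 2 && version.minorVersion == 0)
    return true;
  // Newer than current: refuse rather than guess at an unknown layout.
  if (version.majorVersion > 2 ||
      (version.majorVersion == 2 && version.minorVersion > 0))
    return false;

  // Before 2.0 the op carried a single attribute named "dimensions".
  auto it = attrs.find("dimensions");
  if (it != attrs.end()) {
    AttrValue dims = it->second;
    attrs.erase(it);
    attrs["dims"] = dims;
  }
  // The old op corresponds to modifier=false.
  attrs["modifier"] = false;
  return true;
}
```

As in `upgradeFromVersion`, the sketch returns early for the current version and rejects newer ones before rewriting anything, so a failed upgrade does not leave the attributes partially modified.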

Diff 504284