This is an archive of the discontinued LLVM Phabricator instance.

Paths

mlir/include/mlir/IR/OpImplementation.h
mlir/lib/AsmParser/AsmParserImpl.h
mlir/lib/AsmParser/Parser.h
mlir/lib/AsmParser/Parser.cpp
mlir/lib/AsmParser/ParserState.h
mlir/lib/AsmParser/TokenKinds.def
mlir/lib/Bytecode/Encoding.h
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
mlir/lib/IR/AsmPrinter.cpp
mlir/test/Bytecode/general.mlir
mlir/test/IR/ir_upgrade.mlir
mlir/test/lib/Dialect/Test/TestDialect.cpp
mlir/test/lib/Dialect/Test/TestOps.td

Differential D143647

Extension of "Implement IR versioning through post-parsing upgrade through OpAsmDialectInterface"
ClosedPublic

Authored by mfrancio on Feb 9 2023, 6:58 AM.

Details

Reviewers

rriddle
mehdi_amini
rengolin
nicolasvasilache
myhsu
jpienaar

Commits

rG0e0b6070fd2a: Implements MLIR Bytecode versioning capability

Summary

[mlir] Implements IR versioning capability

A dialect can opt in to versioning through the BytecodeDialectInterface. A few hooks are exposed to the dialect so that it can manage a version encoded in the bytecode file. The version is loaded lazily, which makes the version information available while the input IR is being parsed and gives each dialect for which a version is present an opportunity to perform IR upgrades post-parsing through the upgradeFromVersion method. Custom Attribute and Type encodings can also be upgraded according to the dialect version using the readAttribute and readType methods.

There is no restriction on what kind of information a dialect is allowed to encode to model its versioning. Currently, versioning is supported only for bytecode formats.
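The flow described in the summary can be pictured with a small standalone model. This is only a sketch: `ToyVersionedDialect`, `DialectVersion`, `ToyOp`, and the two-byte blob layout are hypothetical stand-ins rather than the MLIR interface, and the method names merely mirror the hooks mentioned in this review.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the MLIR classes.
struct DialectVersion {
  uint8_t majorVersion;
  uint8_t minorVersion;
};

struct ToyOp {
  int payload = 0;
  bool hasDefaultedField = false; // Assumed to exist only from v2 onwards.
};

struct ToyVersionedDialect {
  DialectVersion current{2, 0};

  // Encode the producer version as an opaque blob; the layout is entirely
  // the dialect's own choice (two raw bytes in this sketch).
  std::vector<uint8_t> writeVersion() const {
    return {current.majorVersion, current.minorVersion};
  }

  // Decode a blob found in the input. Anything that is not exactly two
  // bytes is treated as "no usable version".
  std::optional<DialectVersion>
  readVersion(const std::vector<uint8_t> &blob) const {
    if (blob.size() != 2)
      return std::nullopt;
    return DialectVersion{blob[0], blob[1]};
  }

  // Post-parsing upgrade: bring ops produced by an older major version up
  // to the current form.
  void upgradeFromVersion(std::vector<ToyOp> &ops,
                          const DialectVersion &from) const {
    assert(from.majorVersion <= current.majorVersion &&
           "cannot upgrade from a newer version");
    if (from.majorVersion == current.majorVersion)
      return;
    for (ToyOp &op : ops)
      op.hasDefaultedField = true;
  }
};
```

A reader would call `readVersion` on the blob it finds, parse the IR, and then hand the parsed ops to `upgradeFromVersion` only when a version was present.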

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

mfrancio created this revision. Feb 9 2023, 6:58 AM

mfrancio requested review of this revision. Feb 9 2023, 6:58 AM

Harbormaster completed remote builds in B212799: Diff 496112. Feb 9 2023, 7:55 AM

My Phab handle is different from my Discourse account

jpienaar added a subscriber: jpienaar. Feb 9 2023, 9:30 AM

jpienaar added inline comments.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1466	I'd prefer the upgrade of the in-memory structure not to be inside the reader. We already have a way to parse without verification; this upgrade is of the in-memory structure, which can be done separately. In here I'd prefer only upgrades related to parsing/before it gets to memory. This could be done at the top level entry point though, but outside of the parsing guts feels.
mlir/lib/IR/AsmPrinter.cpp
3091	When this was discussed we talked about needing to have builtin attr version be treated specially (else one can't parse its version to know how to parse integerattr even).
mlir/test/lib/Dialect/Test/TestDialect.cpp
176	Note: error messages should follow LLVM convention and be a sentence fragment (start lower case, no trailing punctuation)
1675	I may have missed where this is used.

I believe some tests are missing, such as those related to bytecode. Also, could you attach a diff with more context, as instructed here?

mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
504	format: add braces

Revised diff according to the comments received:

Adds a new built-in "VersionAttr";
Decouples VersionAttr from other built-in Attributes, so that they can grow independently;
Leaves complete freedom to each dialect on how to manage versioning and how to encode its version into the VersionAttr;
Exposes a couple of hooks to optionally print/parse the dialect version as a custom string.

mfrancio marked 4 inline comments as done. Feb 17 2023, 9:59 AM

mfrancio added inline comments.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1466	I considered this, but I found it a little confusing to have to carry the version at which the IR was parsed over into the top level entry point. I actually like a lot the fact that the version stays within the parsing, so that only the current version of the dialect exists at the entry point level.
mlir/lib/IR/AsmPrinter.cpp
3091	I added a new built-in attribute called VersionAttr.
mlir/test/lib/Dialect/Test/TestDialect.cpp
1675	Yes, indeed I forgot to upload the corresponding mlir file.

mfrancio edited the summary of this revision. Feb 17 2023, 10:08 AM

mfrancio marked 2 inline comments as done. Feb 17 2023, 10:41 AM

I don't understand why we need a builtin VersionAttr at all?

In D143647#4135740, @mehdi_amini wrote:

I don't understand why we need a builtin VersionAttr at all?

Never mind, this is just lifetime management; it doesn't seem unreasonable.

Harbormaster completed remote builds in B214446: Diff 498417. Feb 17 2023, 10:54 AM

Haven't had time to dig in here, but adding an attribute doesn't feel right for version. Why is this necessary? As opposed to being something specific to the assembly format?

This revision now requires changes to proceed. Feb 17 2023, 11:04 AM

In D143647#4135778, @rriddle wrote:

Haven't had time to dig in here, but adding an attribute doesn't feel right for version. Why is this necessary? As opposed to being something specific to the assembly format?

Quoting @jpienaar: "When this was discussed we talked about needing to have builtin attr version be treated specially (else one can't parse its version to know how to parse integerattr even)."

One of the comments I received on the first version was indeed to use a new builtin attr that could be treated specially and decoupled from the existing attributes. Also, we want to be able to encode anything in the version and leave each dialect free to do whatever it needs (essentially writing to and reading from the buffer). It felt natural to use a new attribute that handles that bag of bytes. I would be happy to revise as necessary if there is a better solution.

In D143647#4135778, @rriddle wrote:

Haven't had time to dig in here, but adding an attribute doesn't feel right for version. Why is this necessary? As opposed to being something specific to the assembly format?

I was thinking about how to manage the lifetime, but I think the attribute does not change much actually: a handle still needs to be made available whether it is an attribute or not.
So we could design it with a data structure stored on the parser itself and so made available to the dialects during the parsing.
It wouldn't survive the parsing phase though: post-parsing you lose the information about the version producer.

In D143647#4135978, @mehdi_amini wrote:

In D143647#4135778, @rriddle wrote:

Haven't had time to dig in here, but adding an attribute doesn't feel right for version. Why is this necessary? As opposed to being something specific to the assembly format?

I was thinking about how to manage the lifetime, but I think the attribute does not change much actually: a handle still needs to be made available whether it is an attribute or not.
So we could design it with a data structure stored on the parser itself and so made available to the dialects during the parsing.
It wouldn't survive the parsing phase though: post-parsing you lose the information about the version producer.

In the current implementation you don't really need a lifetime that exceeds the parsing. However, this may change in the future, and having a reserved attribute for this may come in handy. Are there any other drawbacks to adding a VersionAttr that I do not see? If there are, I would be more than happy to revise further with the idea of adding a parser data structure. Any additional feedback would be greatly appreciated.

In the hope of reaching consensus, I am uploading a revised diff that removes the use of attributes entirely for Version and introduces a new dedicated AsmDialectVersionHandle to manage the lifetime of the buffer.

To summarize, the proposed approach:

Introduces a new AsmDialectVersionHandle to manage the lifetime of the buffer representing dialect attribute info;
Decouples the versioning of a dialect from the rest of the MLIR infrastructure, so each can be used and grow independently;
Leaves complete freedom to each dialect on how to manage versioning and how to encode its version into the Version Handle;
Exposes a couple of hooks to optionally print/parse the dialect version as a custom string.

I hope you will find this interesting and compelling for the project.

Harbormaster completed remote builds in B215056: Diff 499217. Feb 21 2023, 11:47 AM

In D143647#4135778, @rriddle wrote:

Haven't had time to dig in here, but adding an attribute doesn't feel right for version. Why is this necessary? As opposed to being something specific to the assembly format?

Have we successfully addressed the concern here and is this ready to land?

I just skimmed through; I'd need some time to review this, but I'm travelling right now and not sure I'll get to it before the weekend.

My intuition right now would be to implement it only for the Bytecode for now. The story there is more comprehensive than for the textual format, where we only offer the post-parsing upgrade and no control during parsing (and I'm not convinced we should encourage this).

Something that would be nice also is an example of how to use the version while parsing a type or an attribute to support an upgraded format (for example a new field that was added post version 1.40 or something like that).
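The kind of example requested here might look like the following standalone sketch. Everything in it is hypothetical: `ToyAttr`, `ProducerVersion`, the one-byte field layout, and the choice of 1.40 as the version that introduced the `alignment` field are assumptions made for illustration, not code from this revision.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical producer version; not an MLIR type.
struct ProducerVersion {
  uint32_t majorNum;
  uint32_t minorNum;
};

// True if `v` is at least `majorNum.minorNum`.
inline bool isAtLeast(ProducerVersion v, uint32_t majorNum,
                      uint32_t minorNum) {
  return v.majorNum > majorNum ||
         (v.majorNum == majorNum && v.minorNum >= minorNum);
}

// Toy attribute: `alignment` is assumed to have been added in version 1.40.
struct ToyAttr {
  uint8_t width;
  uint8_t alignment;
};

// Reads a ToyAttr starting at `pos` and advances `pos` past the consumed
// bytes. Returns std::nullopt if the input is truncated.
inline std::optional<ToyAttr> readToyAttr(const std::vector<uint8_t> &bytes,
                                          std::size_t &pos,
                                          ProducerVersion producer) {
  assert(pos <= bytes.size() && "cursor past end of buffer");
  if (pos >= bytes.size())
    return std::nullopt;
  ToyAttr attr{bytes[pos++], /*alignment=*/1};
  // Older producers never wrote the field, so keep the default for them.
  if (!isAtLeast(producer, 1, 40))
    return attr;
  if (pos >= bytes.size())
    return std::nullopt;
  attr.alignment = bytes[pos++];
  return attr;
}
```

The point of the sketch is that the producer version has to be available to the attribute reader itself; a post-parsing upgrade alone cannot recover from an encoding it could not decode.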

mlir/include/mlir/IR/OpImplementation.h
470	Can you increase the amount of doc here: add context about what purpose it serves and some info on the context where it is used.
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1462	Can you spell the type here?
1467	Should `byteCodedialect.dialect` be available here?
1471	Isn't dyn_cast working for dialect interfaces?
1735	Is going through the string name for the dialect the best way to resolve this? (I would think we have a dialect ID directly available? And using an integer makes everything else more straightforward)
mlir/test/lib/Dialect/Test/TestDialect.cpp
63	We should use the bytecode encoding here, for portability purposes at a minimum. That is, encode the two ints as varints (and decode them when loading).
mlir/test/lib/Dialect/Test/TestOps.td
3171	Leftover?
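To make the varint suggestion concrete, here is a standalone sketch of a variable-length integer round trip. It uses the generic LEB128 scheme (7 payload bits per byte, high bit as a continuation flag), which is an assumption chosen for familiarity and is not claimed to match the MLIR bytecode varint layout; `writeVarInt` and `readVarInt` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Appends `value` as an LEB128-style varint: 7 payload bits per byte, with
// the high bit set on every byte except the last.
inline void writeVarInt(std::vector<uint8_t> &out, uint64_t value) {
  do {
    uint8_t byte = static_cast<uint8_t>(value & 0x7f);
    value >>= 7;
    if (value != 0)
      byte |= 0x80;
    out.push_back(byte);
  } while (value != 0);
}

// Reads one varint starting at `pos` and advances `pos`. Returns
// std::nullopt on truncated or over-long input.
inline std::optional<uint64_t> readVarInt(const std::vector<uint8_t> &in,
                                          std::size_t &pos) {
  assert(pos <= in.size() && "cursor past end of buffer");
  uint64_t result = 0;
  for (unsigned shift = 0; shift < 64; shift += 7) {
    if (pos >= in.size())
      return std::nullopt;
    uint8_t byte = in[pos++];
    result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0)
      return result;
  }
  return std::nullopt; // More than 10 bytes: malformed.
}
```

Because the encoding is defined byte by byte, a two-integer version written this way reads back identically regardless of host endianness.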

In addition to the stuff mentioned, I'd also love to see top-level docs detailing versioning, how it's structured, and how to hook in.

mlir/test/lib/Dialect/Test/TestOps.td
3170–3171	Dead code?

In D143647#4144867, @rriddle wrote:

In addition to the stuff mentioned, I'd also love to see top-level docs detailing versioning, how it's structured, and how to hook in.

Thanks for your feedback. I will also add some tests related to the bytecode encoding itself, similar to what is already done for the other sections; I've been holding off on doing it while trying to get the bulk of the code reviewed.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1462	definitely.
1467	Yes, it should, but it looks like you would have to handle a bunch of cases in the general case. From BytecodeDialect.h:
	    /// The loaded dialect entry. This field is std::nullopt if we haven't
	    /// attempted to load, nullptr if we failed to load, otherwise the loaded
	    /// dialect.
	    std::optional<Dialect *> dialect;
	I find getting the dialect from the context directly to be generally safer here.
1471	Yes, it does work - are there any issues in using this API though? I'll change it anyway, since we could dyn_cast_or_null and remove the check for nullptr on the dialect.
1735	It is true that the bytecode holds an integer which references the string section, but I don't see an existing API to reference the string by idx. I don't really see the "non-straightforward" part anyway - we parse a string with a clean API, and we use it as a hash to map to the version handle. Am I missing something?
mlir/test/lib/Dialect/Test/TestOps.td
3170–3171	yep. thanks for pointing it out!

mfrancio added inline comments. Feb 23 2023, 6:05 PM

mlir/include/mlir/IR/OpImplementation.h
470	Definitely.
mlir/test/lib/Dialect/Test/TestDialect.cpp
63	This is a very good point that I overlooked. It looks like we have two separate problems here: one is portability, and the other is applying some sort of compression (not really critical in my view, but nice to have). For the purpose of the example, the first problem could be solved simply by using the helpers exposed by LLVM under llvm/Support/Endian.h. For example, we could write/read the integers representing the version using
	    inline void write16le(void *P, uint16_t V) { write16<little>(P, V); }
	    inline uint16_t read16le(const void *P) { return read16<little>(P); }
	For the second, using varInt is definitely a great idea. It would be great to reuse the same bytecode emitters and readers, but it looks like they are not really exposed outside the bytecode cpp files. What we could do is expose the varInt portion of it as helpers under mlir/Support. I am open to doing it, but since this is just an example, is it really worth it? Looking forward to hearing your thoughts.
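The portability half of this comment can be illustrated without LLVM. The sketch below writes and reads 16-bit values in little-endian order byte by byte, so the result does not depend on the host's endianness; `appendU16LE` and `loadU16LE` are hypothetical names and are not the `llvm/Support/Endian.h` helpers quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends `value` to `out`, least significant byte first.
inline void appendU16LE(std::vector<uint8_t> &out, uint16_t value) {
  out.push_back(static_cast<uint8_t>(value & 0xff));
  out.push_back(static_cast<uint8_t>(value >> 8));
}

// Loads a little-endian 16-bit value stored at `in[pos]` and `in[pos + 1]`.
inline uint16_t loadU16LE(const std::vector<uint8_t> &in, std::size_t pos) {
  assert(pos + 2 <= in.size() && "not enough bytes for a 16-bit value");
  return static_cast<uint16_t>(in[pos] | (in[pos + 1] << 8));
}
```

A fixed-width layout like this solves portability but always spends two bytes per field, which is the compression concern the varint suggestion addresses.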

Updates with respect to the previous diff:

Removes versioning capabilities from the textual format
Adds new tests specific to the bytecode format
Adds some documentation and addresses some comments

To summarize, the proposed approach:

Introduces a new AsmDialectVersionHandle to manage the lifetime of the buffer representing dialect attribute info;
Decouples the versioning of a dialect from the rest of the MLIR infrastructure, so each can be used and grow independently;
Leaves complete freedom to each dialect on how to manage versioning and how to encode its version into the Version Handle.

Looking forward to your feedback.

Going in the right direction, thanks for the update!

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	Efficiency: string manipulation isn't free. That said, it is pretty bounded here; we should have at most one version per dialect... But stepping back: why aren't we emitting the version in the dialect section? We could emit a varint for the version blob size; if it is zero, that means there is no version attached to the dialect. That seems like it could fit right before the op names.
mlir/test/lib/Dialect/Test/TestDialect.cpp
63	The bytecode primitives are exposed in the public header `mlir/include/mlir/Bytecode/BytecodeImplementation.h`. Have a look at the dialect interface for manipulating types and attributes:
	    virtual Attribute readAttribute(DialectBytecodeReader &reader) const {
	We should model the API here similarly: for a dialect, writing a custom version blob should be no different than writing an attribute.

Harbormaster completed remote builds in B215674: Diff 500076. Feb 23 2023, 10:51 PM

mfrancio added inline comments. Feb 24 2023, 8:15 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	The reason why I didn't do it is that it would break existing bytecodes and would require increasing the bytecode version (I am talking about mlir::bytecode::kVersion). I am open to this, but I don't really see the immediate need. It could always be done as part of a major update of the bytecode version itself.
mlir/test/lib/Dialect/Test/TestDialect.cpp
63	Sounds good, I'll take a look.

mehdi_amini added inline comments. Feb 24 2023, 8:30 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	In general I'm not in favor of taking a detour when we know where we want to land (I don't see a problem with upgrading the bytecode as a breaking change at this point). There are a couple of things I intend to break there as well soon-ish.

mfrancio added inline comments. Feb 24 2023, 8:42 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	Maybe this was already discussed in the past and I missed it, but isn't the bytecode version itself going to be backward compatible? Is there any interest in achieving this?

saksenadhruv added inline comments. Feb 24 2023, 9:41 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	Yes, we actually are hoping to ship a serialization format with versioning soon, and would like bytecode to have some compatibility, or at least a way to upgrade/downgrade when we break it in the next couple of months. What is the guidance on using bytecode for serialization and compatibility? We are using versioning on our dialect, but we need some underlying guarantees on the bytecode itself as well.

mehdi_amini added inline comments. Feb 24 2023, 9:45 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	Yes, we want it to be stable. From my point of view, I am aware of 3 features I want to get in before I'm comfortable claiming that we have reached the "stability" point:
	    Dialect versioning (thank you for driving this!)
	    Use-list order.
	    Lazy-loading ability.
	(Some people may have other ideas; I'm not aware of any.) Then there is my work on "properties", but I suspect we can preserve backward compatibility on this (assuming the dialects themselves don't change, of course).

mfrancio planned changes to this revision. Mar 2 2023, 5:48 PM

mfrancio added inline comments.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	Back to: "We could emit a varint for the version blob size; if it is zero, that means there is no version attached to the dialect. That seems like it could fit right before the op names." If we are in agreement on the current proposal (the dialect provides a version handle which holds a buffer to be written to file), we can definitely emit this blob of data into the dialect section as a breaking change. Can you kindly confirm before I move forward with the change?
mlir/test/lib/Dialect/Test/TestDialect.cpp
63	I considered this, but I don't really see a way to model the API for reading and writing a version into the dialect section through what is exposed in `BytecodeImplementation.h`. That dialect interface seems to have very specific objectives that are tied to writing and reading custom attributes and types into their respective sections; what we need is an interface that allows the dialect to write into a custom buffer. We could model the API through this interface, but it would become something pretty close to EncodingEmitter implemented in `mlir/lib/Bytecode/Writer/BytecodeWriter.cpp`, line 64. Wouldn't it be more convenient to expose something like this under Support? It is true that writing a blob of data is no different than writing an attribute, but what changes here is the way this blob of data is created. For the attribute, its encoding is defined. But since we want to be independent from any existing attribute, and also completely defined by the user, I don't really see another convenient way of doing this other than exposing a low-level API to the user to write whatever encoding they need into the data blob that they wish to use to represent the version.

mehdi_amini added inline comments. Mar 5 2023, 12:30 PM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	Yes I think we should just do that now if we need to, that said in https://reviews.llvm.org/D145328 I did the change in a backward compatible way.
mlir/test/lib/Dialect/Test/TestDialect.cpp
63	"I considered this, but I don't really see a way to model the API for reading and writing a version into the dialect section through what is exposed in BytecodeImplementation.h. That dialect interface seem to have very specific objectives that are tied to writing and reading custom attributes and types into their respective sections"
	Right, sorry if I gave the impression that this interface was "ready to be used" as-is here; I meant to point to it as an example of an API that allows dialect authors to access bytecode manipulation primitives.
	"what we need is an interface that allows the dialect to write into a custom buffer. We could model the API through this interface, but it would become something pretty close to EncodingEmitter implemented in mlir/lib/Bytecode/Writer/BytecodeWriter.cpp, line 64. Wouldn't it be just more convenient to expose something like this under Support? It is true that writing a blob of data is no different than writing an attribute, but what changes here is the way this blob of data is created. For the attribute, its encoding is defined. But since we want to be independent from any existing attribute, and also completely defined by the user, I don't really see another convenient way of doing this other than exposing low level API to the user to write whatever encoding they need into their data blob that they wish to use to represent the version."
	I started typing a long answer here, but felt like I was missing something, so I sketched something here instead: https://reviews.llvm.org/D145328 (there is still a bug, and I haven't regenerated the bytecode test file, but the interface is there!)

jpienaar added inline comments. Mar 5 2023, 9:02 PM

mlir/test/lib/Dialect/Test/TestDialect.cpp
63	I like the sketch.

mfrancio added inline comments. Mar 5 2023, 9:16 PM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	Yes! This is exactly what I envisioned when I started implementing the first draft, but I didn't post it as I didn't want to rely on a bytecode version change. Nice to see this. We could also opt to remove the version section explicitly and inline the read/write of size/bytes reusing the alignment of the parent dialect section (probably a bit more memory efficient), but this works.
mlir/test/lib/Dialect/Test/TestDialect.cpp
63	Thanks for the suggestion. This is very neat, I'll try to finalize it and regenerate the bytecode test files.

mehdi_amini added inline comments. Mar 6 2023, 1:24 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	The reason I used a section is that when we load the version section we haven't loaded the dialect yet so we don't have the interface.

mehdi_amini added inline comments. Mar 6 2023, 1:26 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	Something to add still is an attribute in the test dialect that is serialized at v0.1 and read / upgraded during parsing of v0.2. I suspect we're missing making the version available on the readAttribute API.

mehdi_amini mentioned this in D145328: [mlir] Implements IR versioning capability (WIP). Mar 6 2023, 1:30 AM

Updates diff incorporating changes from https://reviews.llvm.org/D145328

Includes an example of upgrading an attribute that was written at v1 and is read at v2 with a different encoding.

Harbormaster completed remote builds in B217768: Diff 502876. Mar 6 2023, 5:54 PM

mfrancio added inline comments. Mar 6 2023, 6:00 PM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	I did the change! It is tested only for attributes, but I can easily extend it to types as well!
1735	"The reason I used a section is that when we load the version section we haven't loaded the dialect yet so we don't have the interface."
	I still don't see the reason. I think the section could just be inlined. You don't need the interface to read it (we would just hold the buffer). The interface is needed later to resolve the buffer and decode it... Unless I am missing something subtle :)

mfrancio updated this revision to Diff 502915. Mar 6 2023, 9:33 PM

LGTM, but please wait for @rriddle to stamp it as well!

mlir/include/mlir/Bytecode/BytecodeImplementation.h
324 ↗	(On Diff #502915)	Can you add a simple doc?
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	Right, I guess I didn't find the method to do it! Do you see how to emit the content of the `versionEmitter` differently than using emitSection? We need to emit the size and then the content. `emitSection()` has this logic:
	    // Push our current buffer and then merge the provided section body into
	    // ours.
	    appendResult(std::move(currentResult));
	    for (std::vector<uint8_t> &result : emitter.prevResultStorage)
	      prevResultStorage.push_back(std::move(result));
	    llvm::append_range(prevResultList, emitter.prevResultList);
	    prevResultSize += emitter.prevResultSize;
	    appendResult(std::move(emitter.currentResult));
	(knowing that the writeVersion interface can't do it because it needs to compute the size first before emitting the content)
mlir/test/lib/Dialect/Test/TestDialect.cpp
124	Would `else` be enough here?
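The "size first, then content" constraint raised for line 1735 can be sketched in isolation. The approach below writes the body into a scratch buffer so that its size is known before anything is emitted; this is a simplification of the buffer merging quoted from `emitSection()`, not a copy of it. `emitSizedBlob`, `Buffer`, and the LEB128-style `writeVarInt` are hypothetical names, and the varint layout is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

using Buffer = std::vector<uint8_t>;

// LEB128-style varint, used here only as a stand-in for the real encoding.
inline void writeVarInt(Buffer &out, uint64_t value) {
  do {
    uint8_t byte = static_cast<uint8_t>(value & 0x7f);
    value >>= 7;
    if (value != 0)
      byte |= 0x80;
    out.push_back(byte);
  } while (value != 0);
}

// Emits `<size><content>`. The body is written to a scratch buffer first so
// that its size is known before any of it reaches `out`.
inline void emitSizedBlob(Buffer &out,
                          const std::function<void(Buffer &)> &writeBody) {
  assert(writeBody && "expected a callable body writer");
  Buffer scratch;
  writeBody(scratch);
  writeVarInt(out, scratch.size());
  out.insert(out.end(), scratch.begin(), scratch.end());
}
```

With this shape the callback never has to compute or write its own size. Back-patching a reserved size slot in place is the alternative, but it is awkward when the size itself is variable-length.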

Harbormaster completed remote builds in B217792: Diff 502915. Mar 6 2023, 11:16 PM

Looking very close, thanks!

mlir/docs/BytecodeFormat.md
166–169 ↗	(On Diff #502915)	Why use a separate section? I would have expected to have this just be part of the `op_name_group` (which should be renamed at this point to `dialect`). We can store a bit with `numOpNames` or `dialect` to indicate if a version is present, and then optionally read it.
mlir/include/mlir/Bytecode/BytecodeImplementation.h
320–322 ↗	(On Diff #502915)	Why is it necessary for dialect authors to write the size? I would expect this could be automatically handled (e.g. via back-patching)?
mlir/include/mlir/IR/OpImplementation.h
1516–1518	This change feels unrelated, can you revert?

Nice! I'm fine with delaying textual form.

mlir/docs/BytecodeFormat.md
166–169 ↗	(On Diff #502915)	I don't think it's documented here, can we have multiple op_name_groups for same dialect? With the naming change it feels like it's saying there will be only one per dialect. +1 to bit and making this optionally specified if bit set. (I think section may be overloaded, I see this as proposed as just convenient way of grouping these two optional items together).
mlir/include/mlir/Bytecode/BytecodeImplementation.h
275 ↗	(On Diff #502915)	If we have a versioned dialect, it would seem we'd always have to use this method just in case (and if version is unspecified then there is just no upgrade).
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
430	Why does DialectReader need to be threaded through? I thought it was a rather cheap, stateless structure to create.

mfrancio planned changes to this revision. Mar 7 2023, 6:52 AM

mfrancio added inline comments.

mlir/docs/BytecodeFormat.md
166–169 ↗	(On Diff #502915)	Agreed, I'll plan for this change. You shouldn't really need a bit; just printing the buffer is enough. A size of zero means that no version is available. From a quick look I don't think you can have multiple op_name_groups per dialect; those are indeed already grouped by dialect:
	    // Parse the operation names, which are grouped by dialect.
	    auto parseOpName = [&](BytecodeDialect *dialect) {
	      StringRef opName;
	      if (failed(stringReader.parseString(sectionReader, opName)))
	        return failure();
	      opNames.emplace_back(dialect, opName);
	      return success();
	    };
	    while (!sectionReader.empty())
	      if (failed(parseDialectGrouping(sectionReader, dialects, parseOpName)))
	        return failure();
	so changing the op_name_group to be dialect itself should be fine.
mlir/include/mlir/Bytecode/BytecodeImplementation.h
275 ↗	(On Diff #502915)	I feel it's anyway up to the dialect to decide what to do here. Having a fall-back seems convenient to me, but if it looks confusing or there is desire to push for the versioned implementation anyway we can emit an error similarly to the other reader.
320–322 ↗	(On Diff #502915)	This slipped - it is indeed not necessary. I'll update the comment.
mlir/include/mlir/IR/OpImplementation.h
1516–1518	Sure!
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	Exactly, I was thinking of consolidating this into a new method of reader and avoid the existence of a new dialect version section. I'll try!
mlir/test/lib/Dialect/Test/TestDialect.cpp
124	I think the comment below is misleading - the intent was to forbid reading a newer than current version. I'll revise this.
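The intent stated for line 124, forbidding a read of a version newer than the current one, can be sketched as a three-way decision. `ToyVersion`, `ReadDecision`, and `decideRead` are hypothetical names for illustration and are not taken from TestDialect.cpp.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical (major, minor) version pair.
struct ToyVersion {
  uint32_t majorNum;
  uint32_t minorNum;
};

enum class ReadDecision { AcceptAsIs, Upgrade, RejectTooNew };

// Compares the producer version found in the file against the version this
// reader implements, ordering by major number first and minor number second.
inline ReadDecision decideRead(ToyVersion producer, ToyVersion current) {
  auto p = std::tie(producer.majorNum, producer.minorNum);
  auto c = std::tie(current.majorNum, current.minorNum);
  if (p > c)
    return ReadDecision::RejectTooNew; // Produced by a newer writer.
  if (p < c)
    return ReadDecision::Upgrade;      // Older input: run the upgrade hook.
  return ReadDecision::AcceptAsIs;
}
```

Spelling out all three outcomes avoids the ambiguity of a bare `else`, which was the question raised in the review.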

mehdi_amini added inline comments. Mar 7 2023, 7:21 AM

mlir/docs/BytecodeFormat.md
166–169 ↗	(On Diff #502915)	We could use a bit on the numOpNames to gate the existence of the version:
	    op_name_group {
	      dialect: varint,
	      numOpNamesAndIsVersionAvailable: varint, // (numOpNames << 1 | versionAvailable)
	      version: dialect_version_section
	      opNames: varint[]
	    }
	That way we don't write a section when there is no version there.
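The `(numOpNames << 1 | versionAvailable)` packing suggested above amounts to the following. The helper names `packCountAndFlag` and `unpackCountAndFlag` are hypothetical; only the bit layout comes from the comment.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Stores a boolean in the lowest bit and the count in the remaining bits.
inline uint64_t packCountAndFlag(uint64_t count, bool flag) {
  assert((count >> 63) == 0 && "count must leave room for the flag bit");
  return (count << 1) | (flag ? 1u : 0u);
}

// Recovers the (count, flag) pair from a packed value.
inline std::pair<uint64_t, bool> unpackCountAndFlag(uint64_t packed) {
  return {packed >> 1, (packed & 1) != 0};
}
```

For example, 5 op names with a version present packs to 11, and the same count without a version packs to 10, so the unversioned case costs no extra bytes beyond the single flag bit.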

mehdi_amini added inline comments. Mar 7 2023, 7:30 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
430	It is cheap, but needs to be created from things unavailable in this class, so you'd need to thread through more of other things here!

I was wondering (post review) if we should split the reader and writer commit parts, so that we could give a bit of time for bytecode consumers to get updated first (thinking of projects that span multiple repos). I mean, it is unstable at the moment, but wouldn't cause any additional churn.

mlir/docs/BytecodeFormat.md
166–169 ↗	(On Diff #502915)	They are grouped by dialect, but it just emplaces inside a vector. The emplacing results in the "flat" ID, so one can have multiple instances of this where the flat ID can be small for the most common operations independent of dialect. So we'd probably need to just verify that a version is only specified once for a dialect (we could allow multiple as long as they are the same, but that seems undesirable from a size point of view). And yes, what Mehdi suggested is what River also mentioned.
mlir/include/mlir/Bytecode/BytecodeImplementation.h
275 ↗	(On Diff #502915)	Yes, I'm less worried about the dialect than about checking expectations for the bytecode format parser (where it defers to the dialect inside the attribute/type parser, it's under full control of the dialect).
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
430	What else do you need to thread through? Dialect version?

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
430	Look at the call sites:
	    DialectReader dialectReader(*this, stringReader, resourceReader, reader);
	    if (failed(entry.dialect->load(dialectReader, fileLoc.getContext())))
	      return failure();
	So `stringReader, resourceReader` are the extra I think? (also some call sites already have the dialectReader available)

Added a bit flag to detect if a dialect is versioned and trigger the read of the section.

mfrancio marked an inline comment as done. Mar 10 2023, 10:00 AM

mfrancio added inline comments.

mlir/docs/BytecodeFormat.md
166–169 ↗	(On Diff #502915)	Added the bit flag. It felt more natural doing it on the dialect name itself instead of going inside the dialect version grouping. Let me know if there are any concerns.
mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1735	I considered this again, but the only thing that would eventually "save" is really to print the var int of the section, so it felt not strictly necessary now that we have the bit flag.

rriddle accepted this revision. Mar 10 2023, 10:03 AM

rriddle added inline comments.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1083	Can you drop the trivial braces here?
1387	I really thought we had a helper that read a varint and extracted a flag.

This revision is now accepted and ready to land. Mar 10 2023, 10:03 AM

mfrancio added inline comments. Mar 10 2023, 10:07 AM

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1387	Oh indeed, I didn't see it. Thanks!

mfrancio updated this revision to Diff 504212. Mar 10 2023, 10:27 AM

mfrancio marked 2 inline comments as done.

mfrancio added inline comments.

mlir/lib/Bytecode/Reader/BytecodeReader.cpp
1083	yep, thanks!

LGTM, thanks for being so patient through the reviews!

Let's wait for @rriddle to give a final approval.

In D143647#4185686, @mehdi_amini wrote:

LGTM, thanks for being so patient through the reviews!

Let's wait for @rriddle to give a final approval.

Actually River reviewed already (I started reviewing earlier and went out, and figured I hadn't hit "submit" before).

Do you have commit access or do you need help to land this?

In D143647#4185691, @mehdi_amini wrote:

Do you have commit access or do you need help to land this?

I don't have commit access, it's the first time I commit here! I was just waiting for the builds to complete before asking for help.

Thank you all for the feedback - it's been nice to collaborate!

Harbormaster completed remote builds in B218723: Diff 504212.Mar 10 2023, 1:20 PM

Right now you have test failures. On my Mac locally I see:

Failed Tests (2):
  MLIR :: Bytecode/versioning/versioned_attr.mlir
  MLIR :: Bytecode/versioning/versioned_op.mlir

This revision was landed with ongoing or failed builds.Mar 10 2023, 2:29 PM

Closed by commit rG0e0b6070fd2a: Implements MLIR Bytecode versioning capability (authored by mfrancio, committed by mehdi_amini). · Explain Why

This revision was automatically updated to reflect the committed changes.

mehdi_amini added a commit: rG0e0b6070fd2a: Implements MLIR Bytecode versioning capability.

eric-k256 added a subscriber: eric-k256.Apr 23 2023, 9:56 PM

Herald added a subscriber: bviyer. · View Herald TranscriptApr 23 2023, 9:56 PM

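Revision Contents

Path	Size
mlir/include/mlir/IR/OpImplementation.h	98 lines
mlir/lib/AsmParser/AsmParserImpl.h	4 lines
mlir/lib/AsmParser/Parser.h	3 lines
mlir/lib/AsmParser/Parser.cpp	103 lines
mlir/lib/AsmParser/ParserState.h	3 lines
mlir/lib/AsmParser/TokenKinds.def	2 lines
mlir/lib/Bytecode/Encoding.h	5 lines
mlir/lib/Bytecode/Reader/BytecodeReader.cpp	76 lines
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp	43 lines
mlir/lib/IR/AsmPrinter.cpp	52 lines
mlir/test/Bytecode/general.mlir	1 line
mlir/test/IR/ir_upgrade.mlir	52 lines
mlir/test/lib/Dialect/Test/TestDialect.cpp	93 lines
mlir/test/lib/Dialect/Test/TestOps.td	22 lines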

Diff 499217

mlir/include/mlir/IR/OpImplementation.h

Show First 20 Lines • Show All 444 Lines • ▼ Show 20 Lines	inline OpAsmPrinter &operator<<(OpAsmPrinter &p, const T &values) {
return p;		return p;
}		}

inline OpAsmPrinter &operator<<(OpAsmPrinter &p, Block *value) {		inline OpAsmPrinter &operator<<(OpAsmPrinter &p, Block *value) {
p.printSuccessor(value);		p.printSuccessor(value);
return p;		return p;
}		}

		//===--------------------------------------------------------------------===//
		// Dialect Asm Version Interface.
		//===--------------------------------------------------------------------===//

		namespace {
		/// Simple wrapper around StorageAllocator to expose a buffer through
		/// `AsmDialectVersionHandle`.
		class AsmDialectVersionStorage : private StorageUniquer::StorageAllocator {
		public:
		AsmDialectVersionStorage(ArrayRef<uint8_t> in) { buffer = copyInto(in); };
		auto getBuffer() { return buffer; }

		private:
		ArrayRef<uint8_t> buffer;
		};
		} // namespace

		/// This class represents a handle to a dialect version storage.
		mehdi_amini: Can you increase the amount of doc here: add context about what purpose it serves and some info on the context where it is used.
		mfrancio (Author): Definitely.
		class AsmDialectVersionHandle {
		public:
		AsmDialectVersionHandle() = default;
		AsmDialectVersionHandle(ArrayRef<uint8_t> buffer, StringRef dialectName)
		: dialectName(dialectName) {
		storage = std::make_shared<AsmDialectVersionStorage>(buffer);
		}
		operator bool() { return storage && !storage->getBuffer().empty(); }

		/// Return a reference to the storage buffer.
		auto getBuffer() const {
		assert(storage && "buffer must exist to be retrieved");
		return storage->getBuffer();
		}

		/// Return an opaque pointer to the raw data.
		const void *getRawData() const {
		if (storage)
		return reinterpret_cast<const void *>(storage->getBuffer().data());
		return nullptr;
		}

		/// Return the size of the storage buffer.
		auto size() const {
		if (!storage)
		return size_t(0);
		return storage->getBuffer().size();
		}

		/// Return the dialect that owns the version.
		StringRef getDialectName() const { return dialectName; }

		private:
		/// The data associated with the version.
		std::shared_ptr<AsmDialectVersionStorage> storage;

		/// The dialect owning the version.
		StringRef dialectName;
		};

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// AsmParser		// AsmParser
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// This base class exposes generic asm parser hooks, usable across the various		/// This base class exposes generic asm parser hooks, usable across the various
/// derived parsers.		/// derived parsers.
class AsmParser {		class AsmParser {
public:		public:
Show All 25 Lines	ParseResult getCurrentLocation(SMLoc *loc) {
return success();		return success();
}		}

/// Re-encode the given source location as an MLIR location and return it.		/// Re-encode the given source location as an MLIR location and return it.
/// Note: This method should only be used when a `Location` is necessary, as		/// Note: This method should only be used when a `Location` is necessary, as
/// the encoding process is not efficient.		/// the encoding process is not efficient.
virtual Location getEncodedSourceLoc(SMLoc loc) = 0;		virtual Location getEncodedSourceLoc(SMLoc loc) = 0;

		/// Return the attribute describing the version of the provided dialect name,
		/// if any. The version is provided by the `dialect_versions` directive at the
		/// very beginning of the parsing.
		virtual AsmDialectVersionHandle getDialectVersion(StringRef dialect) = 0;

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Token Parsing		// Token Parsing
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Parse a '->' token.		/// Parse a '->' token.
virtual ParseResult parseArrow() = 0;		virtual ParseResult parseArrow() = 0;

/// Parse a '->' token if present		/// Parse a '->' token if present
▲ Show 20 Lines • Show All 943 Lines • ▼ Show 20 Lines	parseAffineExprOfSSAIds(SmallVectorImpl<UnresolvedOperand> &dimOperands,
SmallVectorImpl<UnresolvedOperand> &symbOperands,		SmallVectorImpl<UnresolvedOperand> &symbOperands,
AffineExpr &expr) = 0;		AffineExpr &expr) = 0;

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Argument Parsing		// Argument Parsing
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

struct Argument {		struct Argument {
UnresolvedOperand ssaName; // SourceLoc, SSA name, result #.		UnresolvedOperand ssaName; // SourceLoc, SSA name, result #.
Type type; // Type.		Type type; // Type.
DictionaryAttr attrs; // Attributes if present.		DictionaryAttr attrs; // Attributes if present.
		rriddle: This change feels unrelated, can you revert?
		mfrancio (Author): Sure!
std::optional<Location> sourceLoc; // Source location specifier if present.		std::optional<Location> sourceLoc; // Source location specifier if present.
};		};

/// Parse a single argument with the following syntax:		/// Parse a single argument with the following syntax:
///		///
/// `%ssaName : !type { optionalAttrDict} loc(optionalSourceLoc)`		/// `%ssaName : !type { optionalAttrDict} loc(optionalSourceLoc)`
///		///
/// If `allowType` is false or `allowAttrs` are false then the respective		/// If `allowType` is false or `allowAttrs` are false then the respective
▲ Show 20 Lines • Show All 113 Lines • ▼ Show 20 Lines	public:
/// end with a numeric digit([0-9]+).		/// end with a numeric digit([0-9]+).
virtual AliasResult getAlias(Attribute attr, raw_ostream &os) const {		virtual AliasResult getAlias(Attribute attr, raw_ostream &os) const {
return AliasResult::NoAlias;		return AliasResult::NoAlias;
}		}
virtual AliasResult getAlias(Type type, raw_ostream &os) const {		virtual AliasResult getAlias(Type type, raw_ostream &os) const {
return AliasResult::NoAlias;		return AliasResult::NoAlias;
}		}

		/// Hook provided by the dialect to emit a version when printing. The
		/// handle will be available when parsing back, and the dialect implementation
		/// will be able to use it to load previous known version. This management is
		/// entirely under the responsibility of the individual dialects.
		virtual AsmDialectVersionHandle getProducerVersion() const { return {}; }

		/// Hook invoked after parsing completed, if a version directive was present
		/// and included an entry for the current dialect. This hook offers the
		/// opportunity to the dialect to visit the IR and upgrades constructs emitted
		/// by the version of the dialect corresponding to the provided version.
		virtual LogicalResult
		upgradeFromVersion(Operation *topLevelOp,
		AsmDialectVersionHandle versionHandle) const {
		return success();
		}

		/// Hook exposed to the dialect to parse the version from the provided token
		/// and return it as `AsmDialectVersionHandle`.
		virtual FailureOr<AsmDialectVersionHandle>
		parseVersionAsString(StringRef token) const {
		return failure();
		}

		/// Hook exposed to the dialect to print the version as a custom string.
		virtual FailureOr<std::string>
		printVersionAsString(AsmDialectVersionHandle versionHandle) const {
		return failure();
		}

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Resources		// Resources
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Declare a resource with the given key, returning a handle to use for any		/// Declare a resource with the given key, returning a handle to use for any
/// references of this resource key within the IR during parsing. The result		/// references of this resource key within the IR during parsing. The result
/// of `getResourceKey` on the returned handle is permitted to be different		/// of `getResourceKey` on the returned handle is permitted to be different
/// than `key`.		/// than `key`.
▲ Show 20 Lines • Show All 58 Lines • Show Last 20 Lines
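The hooks in this file leave the version payload entirely opaque: a dialect emits some bytes when printing and gets them back after parsing so it can upgrade the IR. A rough, self-contained sketch of that round trip is shown here. Everything in it is hypothetical and independent of MLIR: `VersionBlob` stands in for the version handle, `MyVersion` is an invented dialect-side version, and `ToyModule` is a toy replacement for the parsed IR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for an opaque version payload owned by one dialect.
struct VersionBlob {
  std::string dialectName;
  std::vector<uint8_t> bytes;
};

// Hypothetical dialect-side version. The framework never looks inside it.
struct MyVersion {
  uint32_t majorVer;
  uint32_t minorVer;
};

// "Producer" side: serialize the dialect's version into raw bytes.
// Note: this copies the struct in host byte order, which is enough for a
// sketch but not a portable on-disk format.
inline VersionBlob encodeVersion(const std::string &dialect, MyVersion v) {
  VersionBlob blob{dialect, std::vector<uint8_t>(sizeof(MyVersion))};
  std::memcpy(blob.bytes.data(), &v, sizeof(MyVersion));
  return blob;
}

// "Consumer" side: recover the version, rejecting payloads of the wrong size.
inline bool decodeVersion(const VersionBlob &blob, MyVersion &v) {
  if (blob.bytes.size() != sizeof(MyVersion))
    return false;
  std::memcpy(&v, blob.bytes.data(), sizeof(MyVersion));
  return true;
}

// Toy "IR": just a list of operation names.
using ToyModule = std::vector<std::string>;

// Post-parse upgrade: rename an op that producers older than 2.0 emitted.
// Returns false if the version payload cannot be understood.
inline bool upgradeFromVersion(ToyModule &module, const VersionBlob &blob) {
  MyVersion producer{};
  if (!decodeVersion(blob, producer))
    return false;
  if (producer.majorVer >= 2)
    return true; // Already current, nothing to upgrade.
  for (std::string &op : module)
    if (op == "test.old_op")
      op = "test.new_op";
  return true;
}
```

The design point this illustrates is that the upgrade runs once over the whole parsed module, after parsing completes, rather than being woven into each op's parser.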

mlir/lib/AsmParser/AsmParserImpl.h

Show First 20 Lines • Show All 54 Lines • ▼ Show 20 Lines	public:
/// always succeeds.		/// always succeeds.
SMLoc getCurrentLocation() override { return parser.getToken().getLoc(); }		SMLoc getCurrentLocation() override { return parser.getToken().getLoc(); }

/// Re-encode the given source location as an MLIR location and return it.		/// Re-encode the given source location as an MLIR location and return it.
Location getEncodedSourceLoc(SMLoc loc) override {		Location getEncodedSourceLoc(SMLoc loc) override {
return parser.getEncodedSourceLocation(loc);		return parser.getEncodedSourceLocation(loc);
}		}

		AsmDialectVersionHandle getDialectVersion(StringRef dialect) override {
		return parser.getDialectVersion(dialect);
		}

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Token Parsing		// Token Parsing
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

using Delimiter = AsmParser::Delimiter;		using Delimiter = AsmParser::Delimiter;

/// Parse a `->` token.		/// Parse a `->` token.
ParseResult parseArrow() override {		ParseResult parseArrow() override {
▲ Show 20 Lines • Show All 530 Lines • Show Last 20 Lines

mlir/lib/AsmParser/Parser.h

Show All 30 Lines	public:

Parser(ParserState &state)		Parser(ParserState &state)
: builder(state.config.getContext()), state(state) {}		: builder(state.config.getContext()), state(state) {}

// Helper methods to get stuff from the parser-global state.		// Helper methods to get stuff from the parser-global state.
ParserState &getState() const { return state; }		ParserState &getState() const { return state; }
MLIRContext *getContext() const { return state.config.getContext(); }		MLIRContext *getContext() const { return state.config.getContext(); }
const llvm::SourceMgr &getSourceMgr() { return state.lex.getSourceMgr(); }		const llvm::SourceMgr &getSourceMgr() { return state.lex.getSourceMgr(); }
		AsmDialectVersionHandle getDialectVersion(StringRef dialect) const {
		return state.dialectVersions.lookup(dialect);
		}

/// Parse a comma-separated list of elements up until the specified end token.		/// Parse a comma-separated list of elements up until the specified end token.
ParseResult		ParseResult
parseCommaSeparatedListUntil(Token::Kind rightToken,		parseCommaSeparatedListUntil(Token::Kind rightToken,
function_ref<ParseResult()> parseElement,		function_ref<ParseResult()> parseElement,
bool allowEmptyList = true);		bool allowEmptyList = true);

/// Parse a list of comma-separated items with an optional delimiter. If a		/// Parse a list of comma-separated items with an optional delimiter. If a
▲ Show 20 Lines • Show All 300 Lines • Show Last 20 Lines

mlir/lib/AsmParser/Parser.cpp

Show First 20 Lines • Show All 797 Lines • ▼ Show 20 Lines	ParseResult OperationParser::finalize() {
});		});
if (walkRes.wasInterrupted())		if (walkRes.wasInterrupted())
return failure();		return failure();

// Pop the top level name scope.		// Pop the top level name scope.
if (failed(popSSANameScope()))		if (failed(popSSANameScope()))
return failure();		return failure();

		// Parsing is complete, give an opportunity to each dialect to visit the
		// IR and perform upgrades.
		if (!state.dialectVersions.empty()) {
		for (auto &dialectVersion : state.dialectVersions) {
		auto version = dialectVersion.getValue();
		Dialect *dialect =
		topLevelOp->getContext()->getOrLoadDialect(version.getDialectName());
		if (!dialect)
		continue;
		auto *asmIface = dialect->getRegisteredInterface<OpAsmDialectInterface>();
		if (!asmIface)
		continue;
		if (failed(asmIface->upgradeFromVersion(topLevelOp,
		dialectVersion.getValue())))
		return failure();
		}
		}

// Verify that the parsed operations are valid.		// Verify that the parsed operations are valid.
if (state.config.shouldVerifyAfterParse() && failed(verify(topLevelOp)))		if (state.config.shouldVerifyAfterParse() && failed(verify(topLevelOp)))
return failure();		return failure();

// If we are populating the parser state, finalize the top-level operation.		// If we are populating the parser state, finalize the top-level operation.
if (state.asmState)		if (state.asmState)
state.asmState->finalize(topLevelOp);		state.asmState->finalize(topLevelOp);
return success();		return success();
▲ Show 20 Lines • Show All 1,534 Lines • ▼ Show 20 Lines	private:
///		///
ParseResult parseFileMetadataDictionary();		ParseResult parseFileMetadataDictionary();

/// Parse a resource metadata dictionary.		/// Parse a resource metadata dictionary.
ParseResult parseResourceFileMetadata(		ParseResult parseResourceFileMetadata(
function_ref<ParseResult(StringRef, SMLoc)> parseBody);		function_ref<ParseResult(StringRef, SMLoc)> parseBody);
ParseResult parseDialectResourceFileMetadata();		ParseResult parseDialectResourceFileMetadata();
ParseResult parseExternalResourceFileMetadata();		ParseResult parseExternalResourceFileMetadata();

		/// Parse a top-level file dialect version dictionary.
		///
		/// version-dict ::= 'dialect_versions {' version-entry* `}'
		///
		ParseResult parseDialectVersionDictionary();
		ParseResult parseVersionEntry(AsmDialectVersionHandle &handle,
		StringRef dialectName);
};		};

/// This class represents an implementation of a resource entry for the MLIR		/// This class represents an implementation of a resource entry for the MLIR
/// textual format.		/// textual format.
class ParsedResourceEntry : public AsmParsedResourceEntry {		class ParsedResourceEntry : public AsmParsedResourceEntry {
public:		public:
ParsedResourceEntry(StringRef key, SMLoc keyLoc, Token value, Parser &p)		ParsedResourceEntry(StringRef key, SMLoc keyLoc, Token value, Parser &p)
: key(key), keyLoc(keyLoc), value(value), p(p) {}		: key(key), keyLoc(keyLoc), value(value), p(p) {}
▲ Show 20 Lines • Show All 225 Lines • ▼ Show 20 Lines	return parseCommaSeparatedListUntil(Token::r_brace, [&]() -> ParseResult {
if (!handler)		if (!handler)
return success();		return success();
ParsedResourceEntry entry(key, keyLoc, valueTok, *this);		ParsedResourceEntry entry(key, keyLoc, valueTok, *this);
return handler->parseResource(entry);		return handler->parseResource(entry);
});		});
});		});
}		}

		ParseResult TopLevelOperationParser::parseDialectVersionDictionary() {
		// If the input starts with a `dialect_versions` keyword, we expect a
		// dictionary representing the version of the dialect at the time the IR was
		// produced. This will be used for possibly upgrading the IR when parsing
		// completes.
		if (failed(parseToken(Token::kw_dialect_versions,
		"expected 'dialect_versions'")))
		return failure();

		return parseCommaSeparatedList(Delimiter::Braces, [&]() -> ParseResult {
		// Parse the name of the dialect entry.
		StringRef dialectName = getTokenSpelling();
		consumeToken();

		if (failed(parseToken(Token::equal, "expected '='")))
		return failure();

		AsmDialectVersionHandle handle;
		if (failed(parseVersionEntry(handle, dialectName)))
		return failure();

		state.dialectVersions.insert({dialectName, handle});
		return success();
		});
		}

		ParseResult
		TopLevelOperationParser::parseVersionEntry(AsmDialectVersionHandle &handle,
		StringRef dialectName) {
		if (failed(parseToken(Token::kw_version, "expected 'version'")))
		return failure();
		if (failed(parseToken(Token::less, "expected '<' after 'version'")))
		return failure();

		if (!getToken().is(Token::string)) {
		emitWrongTokenError("expected string after 'version<'");
		return failure();
		}

		// If the token represents a hex string, parse it as hex.
		std::optional<std::string> result = getToken().getHexStringValue();
		auto *iface = llvm::dyn_cast_or_null<OpAsmDialectInterface>(
		getContext()->getOrLoadDialect(dialectName));
		if (!result.has_value() && !iface)
		return failure();

		// If we couldn't parse the token as hex string, then we can try using the
		// custom parser exposed to the dialect.
		if (!result.has_value() || (*result).empty()) {
		FailureOr<AsmDialectVersionHandle> handleOr =
		iface->parseVersionAsString(getToken().getStringValue());
		if (succeeded(handleOr))
		handle = *handleOr;
		else
		return failure();
		}
		consumeToken(Token::string);

		if (failed(parseToken(Token::greater, "expected '>'")))
		return failure();

		if (handle)
		return success();

		llvm::SmallVector<uint8_t> cast((*result).begin(), (*result).end());
		handle = AsmDialectVersionHandle(cast, dialectName);
		return success();
		}

ParseResult TopLevelOperationParser::parse(Block *topLevelBlock,		ParseResult TopLevelOperationParser::parse(Block *topLevelBlock,
Location parserLoc) {		Location parserLoc) {
// Create a top-level operation to contain the parsed state.		// Create a top-level operation to contain the parsed state.
OwningOpRef<ModuleOp> topLevelOp(ModuleOp::create(parserLoc));		OwningOpRef<ModuleOp> topLevelOp(ModuleOp::create(parserLoc));
OperationParser opParser(state, topLevelOp.get());		OperationParser opParser(state, topLevelOp.get());
while (true) {		while (true) {
switch (getToken().getKind()) {		switch (getToken().getKind()) {
default:		default:
Show All 29 Lines	case Token::hash_identifier:
break;		break;

// Parse a type alias.		// Parse a type alias.
case Token::exclamation_identifier:		case Token::exclamation_identifier:
if (parseTypeAliasDef())		if (parseTypeAliasDef())
return failure();		return failure();
break;		break;

// Parse a file-level metadata dictionary.		// Parse a file-level metadata dictionary.
case Token::file_metadata_begin:		case Token::file_metadata_begin:
if (parseFileMetadataDictionary())		if (parseFileMetadataDictionary())
return failure();		return failure();
break;		break;

		// Parse a file-level version dictionary.
		case Token::kw_dialect_versions:
		if (parseDialectVersionDictionary())
		return failure();
		break;
}		}
}		}
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

LogicalResult		LogicalResult
mlir::parseAsmSourceFile(const llvm::SourceMgr &sourceMgr, Block *block,		mlir::parseAsmSourceFile(const llvm::SourceMgr &sourceMgr, Block *block,
Show All 13 Lines
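The textual directive parsed above has the shape `dialect_versions { name = version<"..."> , ... }`. The following standalone sketch parses that shape into a name-to-string map. It is an illustration under simplifying assumptions, not the parser from this diff: it has no lexer, ignores the hex-string path and the dialect's custom `parseVersionAsString` hook, and the `sketch` namespace and its helpers are invented for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <optional>
#include <string>

namespace sketch {

// Minimal character cursor over the input text.
struct Cursor {
  const std::string &text;
  std::size_t pos = 0;

  void skipSpace() {
    while (pos < text.size() &&
           std::isspace(static_cast<unsigned char>(text[pos])))
      ++pos;
  }

  // Consume `lit` (after optional whitespace) if it is next in the input.
  bool consume(const std::string &lit) {
    skipSpace();
    if (text.compare(pos, lit.size(), lit) != 0)
      return false;
    pos += lit.size();
    return true;
  }

  // Parse a bare identifier made of letters, digits, '_' and '.'.
  std::optional<std::string> parseIdentifier() {
    skipSpace();
    std::size_t start = pos;
    while (pos < text.size() &&
           (std::isalnum(static_cast<unsigned char>(text[pos])) ||
            text[pos] == '_' || text[pos] == '.'))
      ++pos;
    if (pos == start)
      return std::nullopt;
    return text.substr(start, pos - start);
  }

  // Parse a double-quoted string without escape handling.
  std::optional<std::string> parseQuoted() {
    if (!consume("\""))
      return std::nullopt;
    std::size_t end = text.find('"', pos);
    if (end == std::string::npos)
      return std::nullopt;
    std::string value = text.substr(pos, end - pos);
    pos = end + 1;
    return value;
  }
};

// Parse `dialect_versions { a = version<"x">, b = version<"y"> }`.
// Returns std::nullopt on any syntax error.
inline std::optional<std::map<std::string, std::string>>
parseDialectVersions(const std::string &input) {
  Cursor c{input};
  if (!c.consume("dialect_versions") || !c.consume("{"))
    return std::nullopt;
  std::map<std::string, std::string> versions;
  if (c.consume("}"))
    return versions; // Empty dictionary.
  do {
    std::optional<std::string> name = c.parseIdentifier();
    if (!name || !c.consume("=") || !c.consume("version") || !c.consume("<"))
      return std::nullopt;
    std::optional<std::string> value = c.parseQuoted();
    if (!value || !c.consume(">"))
      return std::nullopt;
    versions[*name] = *value;
  } while (c.consume(","));
  if (!c.consume("}"))
    return std::nullopt;
  return versions;
}

} // namespace sketch
```

In the diff itself the quoted string is first tried as a hex string and only then handed to the dialect; this sketch keeps the raw string so the control flow of the directive stays visible.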

mlir/lib/AsmParser/ParserState.h

Show First 20 Lines • Show All 74 Lines • ▼ Show 20 Lines	struct ParserState {
AsmParserCodeCompleteContext *codeCompleteContext;		AsmParserCodeCompleteContext *codeCompleteContext;

// Contains the stack of default dialect to use when parsing regions.		// Contains the stack of default dialect to use when parsing regions.
// A new dialect get pushed to the stack before parsing regions nested		// A new dialect get pushed to the stack before parsing regions nested
// under an operation implementing `OpAsmOpInterface`, and		// under an operation implementing `OpAsmOpInterface`, and
// popped when done. At the top-level we start with "builtin" as the		// popped when done. At the top-level we start with "builtin" as the
// default, so that the top-level `module` operation parses as-is.		// default, so that the top-level `module` operation parses as-is.
SmallVector<StringRef> defaultDialectStack{"builtin"};		SmallVector<StringRef> defaultDialectStack{"builtin"};

		/// A map between a dialect name and its version.
		llvm::StringMap<AsmDialectVersionHandle> dialectVersions;
};		};

} // namespace detail		} // namespace detail
} // namespace mlir		} // namespace mlir

#endif // MLIR_LIB_ASMPARSER_PARSERSTATE_H		#endif // MLIR_LIB_ASMPARSER_PARSERSTATE_H

mlir/lib/AsmParser/TokenKinds.def

	Show First 20 Lines • Show All 116 Lines • ▼ Show 20 Lines
	TOK_KEYWORD(symbol)			TOK_KEYWORD(symbol)
	TOK_KEYWORD(tensor)			TOK_KEYWORD(tensor)
	TOK_KEYWORD(to)			TOK_KEYWORD(to)
	TOK_KEYWORD(true)			TOK_KEYWORD(true)
	TOK_KEYWORD(tuple)			TOK_KEYWORD(tuple)
	TOK_KEYWORD(type)			TOK_KEYWORD(type)
	TOK_KEYWORD(unit)			TOK_KEYWORD(unit)
	TOK_KEYWORD(vector)			TOK_KEYWORD(vector)
				TOK_KEYWORD(version)
				TOK_KEYWORD(dialect_versions)

	#undef TOK_MARKER			#undef TOK_MARKER
	#undef TOK_IDENTIFIER			#undef TOK_IDENTIFIER
	#undef TOK_LITERAL			#undef TOK_LITERAL
	#undef TOK_PUNCTUATION			#undef TOK_PUNCTUATION
	#undef TOK_KEYWORD			#undef TOK_KEYWORD

mlir/lib/Bytecode/Encoding.h

Show First 20 Lines • Show All 55 Lines • ▼ Show 20 Lines	enum ID : uint8_t {

/// This section contains the resources of the bytecode.		/// This section contains the resources of the bytecode.
kResource = 5,		kResource = 5,

/// This section contains the offsets of resources within the Resource		/// This section contains the offsets of resources within the Resource
/// section.		/// section.
kResourceOffset = 6,		kResourceOffset = 6,

		/// This section contains the versions of each dialect.
		kDialectVersions = 7,

/// The total number of section types.		/// The total number of section types.
kNumSections = 7,		kNumSections = 8,
};		};
} // namespace Section		} // namespace Section

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// IR Section		// IR Section
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// This enum represents a mask of all of the potential components of an		/// This enum represents a mask of all of the potential components of an
Show All 18 Lines

mlir/lib/Bytecode/Reader/BytecodeReader.cpp

Show First 20 Lines • Show All 41 Lines • ▼ Show 20 Lines static std::string toString(bytecode::Section::ID sectionID) {

case bytecode::Section::kAttrTypeOffset: case bytecode::Section::kAttrTypeOffset:

return "AttrTypeOffset (3)"; return "AttrTypeOffset (3)";

case bytecode::Section::kIR: case bytecode::Section::kIR:

return "IR (4)"; return "IR (4)";

case bytecode::Section::kResource: case bytecode::Section::kResource:

return "Resource (5)"; return "Resource (5)";

case bytecode::Section::kResourceOffset: case bytecode::Section::kResourceOffset:

return "ResourceOffset (6)"; return "ResourceOffset (6)";

case bytecode::Section::kDialectVersions:

return "DialectVersions (7)";

default: default:

return ("Unknown (" + Twine(static_cast<unsigned>(sectionID)) + ")").str(); return ("Unknown (" + Twine(static_cast<unsigned>(sectionID)) + ")").str();

} }

/// Returns true if the given top-level section ID is optional. /// Returns true if the given top-level section ID is optional.

static bool isSectionOptional(bytecode::Section::ID sectionID) { static bool isSectionOptional(bytecode::Section::ID sectionID) {

switch (sectionID) { switch (sectionID) {

case bytecode::Section::kString: case bytecode::Section::kString:

case bytecode::Section::kDialect: case bytecode::Section::kDialect:

case bytecode::Section::kAttrType: case bytecode::Section::kAttrType:

case bytecode::Section::kAttrTypeOffset: case bytecode::Section::kAttrTypeOffset:

case bytecode::Section::kIR: case bytecode::Section::kIR:

return false; return false;

case bytecode::Section::kResource: case bytecode::Section::kResource:

case bytecode::Section::kResourceOffset: case bytecode::Section::kResourceOffset:

case bytecode::Section::kDialectVersions:

return true; return true;

default: default:

llvm_unreachable("unknown section ID"); llvm_unreachable("unknown section ID");

} }

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// EncodingReader // EncodingReader

▲ Show 20 Lines • Show All 345 Lines • ▼ Show 20 Lines LogicalResult load(EncodingReader &reader, MLIRContext *ctx) {

} }

dialect = loadedDialect; dialect = loadedDialect;

// If the dialect was actually loaded, check to see if it has a bytecode // If the dialect was actually loaded, check to see if it has a bytecode

// interface. // interface.

if (loadedDialect) if (loadedDialect)

interface = dyn_cast<BytecodeDialectInterface>(loadedDialect); interface = dyn_cast<BytecodeDialectInterface>(loadedDialect);

return success(); return success();

} }

jpienaar: Why does DialectReader need to be threaded through? I thought it was a rather cheap, stateless structure to create.

mehdi_amini: It is cheap, but it needs to be created from things unavailable in this class, so you'd need to thread more things through here!

jpienaar: What else do you need to thread through? Dialect version?

mehdi_amini: Look at the call sites:

DialectReader dialectReader(*this, stringReader, resourceReader, reader);
if (failed(entry.dialect->load(dialectReader, fileLoc.getContext())))
  return failure();

So stringReader and resourceReader are the extras, I think?
(Also, some call sites already have the dialectReader available.)

/// Return the loaded dialect, or nullptr if the dialect is unknown. This can /// Return the loaded dialect, or nullptr if the dialect is unknown. This can

/// only be called after `load`. /// only be called after `load`.

Dialect *getLoadedDialect() const { Dialect *getLoadedDialect() const {

assert(dialect && assert(dialect &&

"expected `load` to be invoked before `getLoadedDialect`"); "expected `load` to be invoked before `getLoadedDialect`");

return *dialect; return *dialect;

} }

/// The loaded dialect entry. This field is std::nullopt if we haven't /// The loaded dialect entry. This field is std::nullopt if we haven't

/// attempted to load, nullptr if we failed to load, otherwise the loaded /// attempted to load, nullptr if we failed to load, otherwise the loaded

/// dialect. /// dialect.

std::optional<Dialect *> dialect; std::optional<Dialect *> dialect;

/// The bytecode interface of the dialect, or nullptr if the dialect does not /// The bytecode interface of the dialect, or nullptr if the dialect does not

/// implement the bytecode interface. This field should only be checked if the /// implement the bytecode interface. This field should only be checked if the

/// `dialect` field is not std::nullopt. /// `dialect` field is not std::nullopt.

const BytecodeDialectInterface *interface = nullptr; const BytecodeDialectInterface *interface = nullptr;

/// The name of the dialect. /// The name of the dialect.

StringRef name; StringRef name;

/// Handle for the dialect version we are parsing.

AsmDialectVersionHandle version;

}; };

/// This struct represents an operation name entry within the bytecode. /// This struct represents an operation name entry within the bytecode.

struct BytecodeOperationName { struct BytecodeOperationName {

BytecodeOperationName(BytecodeDialect *dialect, StringRef name) BytecodeOperationName(BytecodeDialect *dialect, StringRef name)

: dialect(dialect), name(name) {} : dialect(dialect), name(name) {}

/// The loaded operation name, or std::nullopt if it hasn't been processed /// The loaded operation name, or std::nullopt if it hasn't been processed

▲ Show 20 Lines • Show All 612 Lines • ▼ Show 20 Lines

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// Bytecode Reader // Bytecode Reader

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

namespace { namespace {

/// This class is used to read a bytecode buffer and translate it into MLIR. /// This class is used to read a bytecode buffer and translate it into MLIR.

class BytecodeReader { class BytecodeReader {

public: public:

rriddle: Can you drop the trivial braces here?

mfrancio (Author): yep, thanks!

BytecodeReader(Location fileLoc, const ParserConfig &config, BytecodeReader(Location fileLoc, const ParserConfig &config,

const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef) const std::shared_ptr<llvm::SourceMgr> &bufferOwnerRef)

: config(config), fileLoc(fileLoc), : config(config), fileLoc(fileLoc),

attrTypeReader(stringReader, resourceReader, fileLoc), attrTypeReader(stringReader, resourceReader, fileLoc),

// Use the builtin unrealized conversion cast operation to represent // Use the builtin unrealized conversion cast operation to represent

// forward references to values that aren't yet defined. // forward references to values that aren't yet defined.

forwardRefOpState(UnknownLoc::get(config.getContext()), forwardRefOpState(UnknownLoc::get(config.getContext()),

"builtin.unrealized_conversion_cast", ValueRange(), "builtin.unrealized_conversion_cast", ValueRange(),

[75 lines not shown]
    FailureOr<Operation *> parseOpWithoutRegions(EncodingReader &reader,
                                                 RegionReadState &readState,
                                                 bool &isIsolatedFromAbove);
    LogicalResult parseRegion(EncodingReader &reader, RegionReadState &readState);
    LogicalResult parseBlock(EncodingReader &reader, RegionReadState &readState);
    LogicalResult parseBlockArguments(EncodingReader &reader, Block *block);

    //===--------------------------------------------------------------------===//
+   // Dialect Versions Section
+
+   /// Parse dialect versions.
+   LogicalResult
+   parseDialectVersionsSection(std::optional<ArrayRef<uint8_t>> sectionData);
+
+   //===--------------------------------------------------------------------===//
    // Value Processing

    /// Parse an operand reference using the given reader. Returns nullptr in the
    /// case of failure.
    Value parseOperand(EncodingReader &reader);

    /// Sequentially define the given value range.
    LogicalResult defineValues(EncodingReader &reader, ValueRange values);

[120 lines not shown; enclosing context: LogicalResult BytecodeReader::read(llvm::MemoryBufferRef buffer, Block *block) {]
    if (failed(stringReader.initialize(
            fileLoc, *sectionDatas[bytecode::Section::kString])))
      return failure();

    // Process the dialect section.
    if (failed(parseDialectSection(*sectionDatas[bytecode::Section::kDialect])))
      return failure();

+   // Process the dialect section.
+   if (failed(parseDialectVersionsSection(
+           sectionDatas[bytecode::Section::kDialectVersions])))
+     return failure();

    // Process the resource section if present.
    if (failed(parseResourceSection(
            sectionDatas[bytecode::Section::kResource],
            sectionDatas[bytecode::Section::kResourceOffset])))
      return failure();

    // Process the attribute and type section.
    if (failed(attrTypeReader.initialize(

[48 lines not shown; enclosing context: auto parseOpName = [&](BytecodeDialect *dialect) {]
      if (failed(stringReader.parseString(sectionReader, opName)))
        return failure();
      opNames.emplace_back(dialect, opName);
      return success();
    };
    while (!sectionReader.empty())
      if (failed(parseDialectGrouping(sectionReader, dialects, parseOpName)))
        return failure();
    return success();

continue;
}
- // modify entryIdx to decode entry index and version available.
+ // Modify entryIdx to decode entry index and version available.
uint64_t versionIdx = entryIdx >> 1;

rriddle: I really thought we had a helper that read a varint and extracted a flag.

mfrancio (Author): Oh indeed, I didn't see it. Thanks!
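The snippet above recovers an index with `entryIdx >> 1`, which suggests that the low bit of the decoded integer carries a flag. A minimal sketch of that packing, assuming the flag lives in the least significant bit (the function names are hypothetical and this is not the helper from the MLIR code base):

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Pack an index and a one-bit flag into a single integer: the flag occupies
// the least significant bit and the index uses the remaining bits.
inline uint64_t packIndexWithFlag(uint64_t index, bool flag) {
  return (index << 1) | (flag ? 1u : 0u);
}

// Split a packed value back into its index and flag.
inline std::pair<uint64_t, bool> unpackIndexWithFlag(uint64_t packed) {
  return {packed >> 1, (packed & 1) != 0};
}
```

A reader that decodes the varint first and then calls something like `unpackIndexWithFlag` avoids repeating the shift and mask at every call site.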

  }

  FailureOr<OperationName> BytecodeReader::parseOpName(EncodingReader &reader) {
    BytecodeOperationName *opName = nullptr;
    if (failed(parseEntry(reader, opNames, opName, "operation name")))
      return failure();

    // Check to see if this operation name has already been resolved. If we
[57 lines not shown; enclosing context: LogicalResult BytecodeReader::parseIRSection(ArrayRef<uint8_t> sectionData,]
    while (!regionStack.empty())
      if (failed(parseRegions(reader, regionStack, regionStack.back())))
        return failure();
    if (!forwardRefOps.empty()) {
      return reader.emitError(
          "not all forward unresolved forward operand references");
    }

+   // Resolve dialect version.
+   for (auto byteCodeDialect : dialects) {

mehdi_amini: Can you spell the type here?

mfrancio (Author): definitely.

+     // Parsing is complete, give an opportunity to each dialect to visit the
+     // IR and perform upgrades.
+     if (byteCodeDialect.version) {
+       Dialect *dialect = moduleOp->getContext()->getOrLoadDialect(

jpienaar: I'd prefer the upgrade of the in-memory structure not to be inside the reader. We already have a way to parse without verification; this upgrade is of the in-memory structure, which can be done separately. In here I'd prefer only upgrades related to parsing, before the IR gets to memory. This could be done at the top-level entry point though, just outside of the parsing guts.

mfrancio (Author): I considered this, but I found it a little confusing to have to carry the version at which the IR was parsed over to the top-level entry point. I actually like a lot the fact that the version stays within the parsing, so that only the current version of the dialect exists at the entry point level.
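To make the trade-off in this exchange concrete, here is a small self-contained sketch of an upgrade hook that runs right after parsing, so the caller only ever sees IR at the current version. Everything in it is hypothetical (`ToyOp`, `Version`, the 1.42 constant, and the rule that renames `dimensions` to `dims`); it mirrors the attribute rename exercised by the `test.versionedA` test cases but is not the patch's `upgradeFromVersion` implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical two-component dialect version.
struct Version {
  uint32_t majorVer;
  uint32_t minorVer;
};

inline bool operator<(const Version &a, const Version &b) {
  return std::tie(a.majorVer, a.minorVer) < std::tie(b.majorVer, b.minorVer);
}

// Hypothetical in-memory operation: a name plus integer attributes.
struct ToyOp {
  std::string name;
  std::map<std::string, int64_t> attrs;
};

// Assumed current version of the toy dialect.
constexpr Version kCurrentVersion{1, 42};

// Upgrade ops written by an older producer. Returns false when the producer
// is newer than this reader, which cannot be handled.
inline bool upgradeFromVersion(std::vector<ToyOp> &ops, Version producer) {
  if (kCurrentVersion < producer)
    return false;
  if (!(producer < kCurrentVersion))
    return true; // Already at the current version, nothing to do.
  for (ToyOp &op : ops) {
    auto it = op.attrs.find("dimensions");
    if (op.name == "test.versionedA" && it != op.attrs.end()) {
      // Assumed upgrade rule: the attribute was renamed to `dims`.
      op.attrs["dims"] = it->second;
      op.attrs.erase(it);
    }
  }
  return true;
}
```

Calling such a hook inside the reader keeps the producer version private to parsing; calling it from the top-level entry point would require returning the version alongside the parsed IR.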

+           byteCodeDialect.version.getDialectName());

mehdi_amini: Should `byteCodeDialect.dialect` be available here?

mfrancio (Author): Yes, it should, but it looks like you would have to handle a bunch of cases in the general case.

From BytecodeDialect.h:

/// The loaded dialect entry. This field is std::nullopt if we haven't
/// attempted to load, nullptr if we failed to load, otherwise the loaded
/// dialect.
std::optional<Dialect *> dialect;

I find getting the dialect from the context directly to be generally safer here.
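The "bunch of cases" comes from the three states that `std::optional<Dialect *>` can encode according to the quoted comment. A minimal sketch of telling them apart, using a hypothetical stand-in `Dialect` type and a hypothetical `LoadState` enum:

```cpp
#include <cassert>
#include <optional>

struct Dialect {}; // Hypothetical stand-in for mlir::Dialect.

enum class LoadState { NotAttempted, FailedToLoad, Loaded };

// Map the tri-state field onto an explicit enum:
//   std::nullopt -> loading was never attempted,
//   nullptr      -> loading was attempted and failed,
//   otherwise    -> the dialect is loaded.
inline LoadState classify(const std::optional<Dialect *> &dialect) {
  if (!dialect.has_value())
    return LoadState::NotAttempted;
  if (*dialect == nullptr)
    return LoadState::FailedToLoad;
  return LoadState::Loaded;
}
```

Asking the context for the dialect by name sidesteps this case analysis, at the cost of a lookup.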

+       if (!dialect)
+         continue;
+       auto *asmIface =
+           (dialect)->getRegisteredInterface<OpAsmDialectInterface>();

mehdi_amini: Doesn't dyn_cast work for dialect interfaces?

mfrancio (Author): Yes, it does work - are there any issues in using this API though?

I'll change it anyway, since we could dyn_cast_or_null and remove the check for nullptr on the dialect.

+       if (!asmIface)
+         continue;
+       if (failed(
+               asmIface->upgradeFromVersion(*moduleOp, byteCodeDialect.version)))
+         return failure();
+   }

    // Verify that the parsed operations are valid.
    if (config.shouldVerifyAfterParse() && failed(verify(*moduleOp)))
      return failure();

    // Splice the parsed operations over to the provided top-level block.
    auto &parsedOps = moduleOp->getBody()->getOperations();
    auto &destOps = block->getOperations();
    destOps.splice(destOps.end(), parsedOps, parsedOps.begin(), parsedOps.end());

[217 lines not shown; enclosing context: while (numArgs--) {]
      argTypes.push_back(argType);
      argLocs.push_back(argLoc);
    }
    block->addArguments(argTypes, argLocs);
    return defineValues(reader, block->getArguments());
  }

+ //===--------------------------------------------------------------------===//
+ // Dialect Versions Section
+
+ LogicalResult BytecodeReader::parseDialectVersionsSection(
+     std::optional<ArrayRef<uint8_t>> sectionData) {
+   // If the dialect versions are absent, there is nothing to do.
+   if (!sectionData.has_value())
+     return success();
+   EncodingReader sectionReader(sectionData.value(), fileLoc);
+
+   // Parse the number of dialects in the section.
+   uint64_t numVersionedDialects;
+   if (failed(sectionReader.parseVarInt(numVersionedDialects)))
+     return failure();
+
+   // Parse each of the dialect versions.
+   llvm::StringMap<AsmDialectVersionHandle> versionedDialectMap;
+   for (uint64_t i = 0; i < numVersionedDialects; ++i) {
+     StringRef dialectName;
+     uint64_t dialectVersionSize;
+     if (failed(stringReader.parseString(sectionReader, dialectName)))
+       return failure();

mehdi_amini: Is going through the string name for the dialect the best way to resolve this? (I would think we have a dialect ID directly available? And using an integer makes everything else more straightforward.)

mfrancio (Author): It is true that the bytecode holds an integer which references the string section, but I don't see an existing API to reference the string by idx.

I don't really see the "non-straightforward" part anyway - we parse a string with a clean API, and we use it as a hash to map to the version handle. Am I missing something?

mehdi_amini: Efficiency: string manipulation isn't free.

That said it is pretty bounded here, we should have at most one version per dialect...

But stepping back: why aren't we emitting the version in the dialect section?
We could emit a varint for the version blob size; if it is zero, that means there is no version attached to the dialect.
That seems like it could fit right before the op names.
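The size-prefix idea proposed here can be sketched in a few lines. The example below is only an illustration: it uses a LEB128-style varint, which is an assumption and may differ from the varint encoding the MLIR bytecode actually uses, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using Bytes = std::vector<uint8_t>;

// LEB128-style varint writer, used here for illustration only.
inline void writeVarInt(Bytes &out, uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<uint8_t>(value));
}

// Matching varint reader. Returns false on truncated or oversized input.
inline bool readVarInt(const Bytes &in, size_t &pos, uint64_t &value) {
  value = 0;
  for (unsigned shift = 0; shift < 64; shift += 7) {
    if (pos >= in.size())
      return false;
    uint8_t byte = in[pos++];
    value |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0)
      return true;
  }
  return false;
}

// Emit the blob size followed by the blob. A size of zero encodes
// "no version attached to this dialect".
inline void writeVersion(Bytes &out, const std::optional<Bytes> &version) {
  if (!version || version->empty()) {
    writeVarInt(out, 0);
    return;
  }
  writeVarInt(out, version->size());
  out.insert(out.end(), version->begin(), version->end());
}

// Read a size-prefixed blob. Returns false on malformed input; on success
// `version` is std::nullopt when the encoded size was zero.
inline bool readVersion(const Bytes &in, size_t &pos,
                        std::optional<Bytes> &version) {
  uint64_t size = 0;
  if (!readVarInt(in, pos, size))
    return false;
  version.reset();
  if (size == 0)
    return true;
  if (size > in.size() - pos)
    return false;
  version = Bytes(in.begin() + pos, in.begin() + pos + size);
  pos += size;
  return true;
}
```

With this layout a dialect without a version costs a single zero byte, and the reader does not need the dialect's interface to skip over or retain the blob.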

mfrancio (Author): The reason why I didn't do it is that it would break existing bytecodes and would require increasing the bytecode version (I am talking about mlir::bytecode::kVersion). I am open to this, but I don't really see the immediate need. It could always be done as part of a major update of the bytecode version itself.

mehdi_amini: In general I'm not in favor of taking a detour when we know where we want to land (I don't see a problem with upgrading the bytecode as a breaking change at this point). There are a couple of things I intend to break there as well soon-ish.

mfrancio (Author): Maybe this was already discussed in the past and I missed it, but isn't the bytecode version itself going to be backward compatible? Is there any interest in achieving this?

saksenadhruv: Yes, we actually are hoping to ship a serialization format with versioning soon, and would like bytecode to have some compatibility, or at least a way to upgrade/downgrade when we break it in the next couple of months.

What is the guidance on using bytecode for serialization and compatibility?

We are using versioning on our dialect but we need some underlying guarantees on the bytecode itself as well.

mehdi_amini: Yes we want it to be stable. From my point of view I am aware of 3 features I want to get before I'm comfortable with trying to claim that we reached the "stability" point.

- Dialect Versioning (thank you for driving this!)
- Use-list order.
- Lazy-loading ability.

(Some people may have other ideas, I'm not aware of any)

Then there is my work on "properties", but I suspect we can preserve backward compatibility on this (assuming the dialects themselves don't change of course).

mfrancio (Author): Back to:

> We could emit a varint for the version blob size; if it is zero, that means there is no version attached to the dialect.
> That seems like it could fit right before the op names.

If we are in agreement on the current proposal (the dialect provides a version handle which holds a buffer to be written to file), we can definitely emit this blob of data into the dialect section as a breaking change. Can you kindly confirm before I move forward with the change?

mehdi_amini: Yes I think we should just do that now if we need to, that said in https://reviews.llvm.org/D145328 I did the change in a backward compatible way.

mfrancio (Author): Yes! This is exactly what I envisioned when I started implementing the first draft, but I didn't post it as I didn't want to rely on a bytecode version change. Nice to see this.

We could also opt to remove the version section explicitly and inline the read/write of size/bytes reusing the alignment of the parent dialect section (probably a bit more memory efficient), but this works.

mehdi_amini: The reason I used a section is that when we load the version section we haven't loaded the dialect yet so we don't have the interface.

mehdi_amini: Something to add still is an attribute in the test dialect that is serialized at v0.1 and read / upgraded during parsing of v0.2.
I suspect we're missing making the version available on the readAttribute API.

mfrancio (Author): I did the change! It is tested only for attributes, but I can easily extend it to types as well!

mfrancio (Author):

> The reason I used a section is that when we load the version section we haven't loaded the dialect yet so we don't have the interface.

I still don't see the reason. I think the section could just be inlined. You don't need the interface to read it (we would just hold the buffer). The interface is needed later to resolve the buffer and decode it... Unless I am missing something subtle :)

mehdi_amini: Right, I guess I didn't find the method to do it!

Do you see how to emit the content of the versionEmitter differently than using emitSection? We need to emit the size and then the content. emitSection() has this logic:

// Push our current buffer and then merge the provided section body into
// ours.
appendResult(std::move(currentResult));
for (std::vector<uint8_t> &result : emitter.prevResultStorage)
  prevResultStorage.push_back(std::move(result));
llvm::append_range(prevResultList, emitter.prevResultList);
prevResultSize += emitter.prevResultSize;
appendResult(std::move(emitter.currentResult));

(knowing that the writeVersion interface can't do it because it needs to compute the size first before emitting the content)
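The constraint raised here (the size must be written before content whose size is only known after it has been produced) is usually handled by writing the body into a scratch emitter first. A minimal sketch with a hypothetical `ToyEmitter` class, which is not the `EncodingEmitter` from the patch and uses a LEB128-style varint purely for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical minimal byte emitter.
class ToyEmitter {
public:
  void emitByte(uint8_t byte) { buffer.push_back(byte); }

  // LEB128-style varint, used here for illustration only.
  void emitVarInt(uint64_t value) {
    while (value >= 0x80) {
      buffer.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
      value >>= 7;
    }
    buffer.push_back(static_cast<uint8_t>(value));
  }

  // Run `writeBody` against a scratch emitter so the body size is known,
  // then emit the size followed by the buffered body bytes.
  void emitSizePrefixed(const std::function<void(ToyEmitter &)> &writeBody) {
    ToyEmitter scratch;
    writeBody(scratch);
    emitVarInt(scratch.buffer.size());
    buffer.insert(buffer.end(), scratch.buffer.begin(), scratch.buffer.end());
  }

  const std::vector<uint8_t> &bytes() const { return buffer; }

private:
  std::vector<uint8_t> buffer;
};
```

The extra copy of the body is the price for not having to patch the size in afterwards; a real emitter could instead splice the scratch storage into its own list of buffers, as the quoted `emitSection()` logic does.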

mfrancio (Author): Exactly, I was thinking of consolidating this into a new method of reader and avoiding the existence of a new dialect version section. I'll try!

mfrancio (Author): I considered this again, but the only thing it would eventually "save" is printing the varint of the section, so it did not feel strictly necessary now that we have the bit flag.

+     if (failed(sectionReader.parseVarInt(dialectVersionSize)))
+       return failure();
+     ArrayRef<uint8_t> bytes;
+     if (failed(sectionReader.parseBytes(dialectVersionSize, bytes)))
+       return failure();
+     versionedDialectMap.insert(
+         {dialectName, AsmDialectVersionHandle(bytes, dialectName)});
+   }
+   for (auto dialect : dialects)
+     if (versionedDialectMap.count(dialect.name))
+       dialect.version = versionedDialectMap.lookup(dialect.name);
+   return success();
+ }

  //===----------------------------------------------------------------------===//
  // Value Processing

  Value BytecodeReader::parseOperand(EncodingReader &reader) {
    std::vector<Value> &values = valueScopes.back().values;
    Value *value = nullptr;
    if (failed(parseEntry(reader, values, value, "value")))
      return Value();
[93 lines not shown]

mlir/lib/Bytecode/Writer/BytecodeWriter.cpp

[380 lines not shown]
    void writeResourceSection(Operation *op, EncodingEmitter &emitter,
                              const BytecodeWriterConfig::Impl &config);

    //===--------------------------------------------------------------------===//
    // Strings

    void writeStringSection(EncodingEmitter &emitter);

    //===--------------------------------------------------------------------===//
+   // Dialect versions
+
+   void writeDialectVersionsSection(EncodingEmitter &emitter);
+
+   //===--------------------------------------------------------------------===//
    // Fields

    /// The builder used for the string section.
    StringSectionBuilder stringSection;

    /// The IR numbering state generated for the root operation.
    IRNumberingState numberingState;
  };
[23 lines not shown; enclosing context: void BytecodeWriter::write(Operation *rootOp, raw_ostream &os,]
    writeIRSection(emitter, rootOp);

    // Emit the resources section.
    writeResourceSection(rootOp, emitter, config);

    // Emit the string section.
    writeStringSection(emitter);

+   // Emit the dialect section.
+   writeDialectVersionsSection(emitter);
+
    // Write the generated bytecode to the provided output stream.
    emitter.writeTo(os);
  }

  //===----------------------------------------------------------------------===//
  // Dialects

  /// Write the given entries in contiguous groups with the same parent dialect.
[35 lines not shown; enclosing context: void BytecodeWriter::writeDialectSection(EncodingEmitter &emitter) {]
    auto emitOpName = [&](OpNameNumbering &name) {
      dialectEmitter.emitVarInt(stringSection.insert(name.name.stripDialect()));
    };
    writeDialectGrouping(dialectEmitter, numberingState.getOpNames(), emitOpName);

    emitter.emitSection(bytecode::Section::kDialect, std::move(dialectEmitter));
  }

+ void BytecodeWriter::writeDialectVersionsSection(EncodingEmitter &emitter) {
+   EncodingEmitter dialectEmitter;
+
+   // Get dialect version.
+   auto getVersion =
+       [&](const OpAsmDialectInterface *asmIface) -> AsmDialectVersionHandle {
+     // A dialect can be nullptr if not loaded. In such a case, we can't print
+     // the version properly.
+     if (asmIface)
+       return asmIface->getProducerVersion();
+     return {};
+   };
+
+   // Emit the referenced dialects.
+   auto dialects = numberingState.getDialects();
+   llvm::SmallVector<std::pair<unsigned, AsmDialectVersionHandle>>
+       dialectVersionPair;
+

myhsu: format: add braces

+   for (DialectNumbering &dialect : dialects) {
+     if (auto version = getVersion(dialect.asmInterface)) {
+       dialectVersionPair.push_back(
+           std::make_pair(stringSection.insert(dialect.name), version));
+     }
+   }
+   dialectEmitter.emitVarInt(dialectVersionPair.size());
+   for (auto item : dialectVersionPair) {
+     dialectEmitter.emitVarInt(item.first);
+     dialectEmitter.emitVarInt(item.second.size());
+     dialectEmitter.emitBytes(item.second.getBuffer());
+   }
+
+   emitter.emitSection(bytecode::Section::kDialectVersions,
+                       std::move(dialectEmitter));
+ }

  //===----------------------------------------------------------------------===//
  // Attributes and Types

  namespace {
  class DialectWriter : public DialectBytecodeWriter {
  public:
    DialectWriter(EncodingEmitter &emitter, IRNumberingState &numberingState,
                  StringSectionBuilder &stringSection)
[374 lines not shown]

mlir/lib/IR/AsmPrinter.cpp

[38 lines not shown]
  #include "llvm/ADT/TypeSwitch.h"
  #include "llvm/Support/CommandLine.h"
  #include "llvm/Support/Debug.h"
  #include "llvm/Support/Endian.h"
  #include "llvm/Support/Regex.h"
  #include "llvm/Support/SaveAndRestore.h"
  #include "llvm/Support/Threading.h"

- #include <tuple>
  #include <optional>
+ #include <tuple>

  using namespace mlir;
  using namespace mlir::detail;

  #define DEBUG_TYPE "mlir-asm-printer"

  void OperationName::print(raw_ostream &os) const { os << getStringRef(); }

[2,999 lines not shown; enclosing context: private:]

    /// Print the resource sections for the file metadata dictionary.
    /// `checkAddMetadataDict` is used to indicate that metadata is going to be
    /// added, and the file metadata dictionary should be started if it hasn't
    /// yet.
    void printResourceFileMetadata(function_ref<void()> checkAddMetadataDict,
                                   Operation *op);

+   /// Print a dictionary containing dialect names and corresponding dialect
+   /// versions.
+   void printDialectVersions(Operation *op);
+
    // Contains the stack of default dialects to use when printing regions.
    // A new dialect is pushed to the stack before parsing regions nested under an
    // operation implementing `OpAsmOpInterface`, and popped when done. At the
    // top-level we start with "builtin" as the default, so that the top-level
    // `module` operation prints as-is.
    SmallVector<StringRef> defaultDialectStack{"builtin"};

    /// The number of spaces used for indenting nested operations.
    const static unsigned indentWidth = 2;

    // This is the current indentation level for nested structures.
    unsigned currentIndent = 0;
  };
  } // namespace

  void OperationPrinter::printTopLevelOperation(Operation *op) {
+   // If any dialect has a version to print, print the `dialect_versions`
+   // directive. This directive has to be the very first directive in the file
+   // and introduce a dictionary where the keys are dialect names and the value
+   // are the buffers representing the producer version for the given dialect.
+   printDialectVersions(op);
+
    // Output the aliases at the top level that can't be deferred.
    state.getAliasState().printNonDeferredAliases(*this, newLine);

jpienaar: When this was discussed we talked about needing to have the builtin attr version be treated specially (else one can't even parse its version to know how to parse IntegerAttr).

mfrancio (Author): I added a new built in attribute called VersionAttr.

    // Print the module.
    printFullOpWithIndentAndLoc(op);
    os << newLine;

    // Output the aliases at the top level that can be deferred.
    state.getAliasState().printDeferredAliases(*this, newLine);

[64 lines not shown; enclosing context: void OperationPrinter::printResourceFileMetadata(]
    // resources.
    hadResource = false;
    for (const auto &printer : state.getResourcePrinters())
      processProvider("external", printer.getName(), printer);
    if (hadResource)
      os << newLine << " }";
  }

+ void OperationPrinter::printDialectVersions(Operation *op) {
+   // Retrieve the version to print for this dialect, if any.
+   auto getVersion = [&](Dialect *dialect) -> AsmDialectVersionHandle {
+     const auto interfaces = state.getDialectInterfaces();
+     const OpAsmDialectInterface *asmIface = interfaces.getInterfaceFor(dialect);
+     if (!asmIface)
+       return {};
+     return asmIface->getProducerVersion();
+   };
+
+   llvm::SmallVector<AsmDialectVersionHandle> versionHandles;
+   for (auto dialect : op->getContext()->getLoadedDialects()) {
+     auto version = getVersion(dialect);
+     if (version)
+       versionHandles.push_back(version);
+   }
+
+   if (versionHandles.empty())
+     return;
+
+   os << "dialect_versions { ";
+   interleaveComma(versionHandles, [&](auto item) {
+     auto dialectName = item.getDialectName();
+     const auto interfaces = state.getDialectInterfaces();
+     auto *dialect = op->getContext()->getLoadedDialect(dialectName);
+     const OpAsmDialectInterface *asmIface = interfaces.getInterfaceFor(dialect);
+     ::printKeywordOrString(dialect->getNamespace(), os);
+     os << " = version<";
+     auto stringOr = asmIface->printVersionAsString(item);
+     if (succeeded(stringOr)) {
+       os << "\"" << *stringOr << "\"";
+     } else {
+       os << "\"0x" << llvm::toHex(item.getBuffer()) << "\"";
+     }
+     os << ">";
+   });
+   os << " }" << newLine;
+   return;
+ }

  /// Print a block argument in the usual format of:
  /// %ssaName : type {attr1=42} loc("here")
  /// where location printing is controlled by the standard internal option.
  /// You may pass omitType=true to not print a type, and pass an empty
  /// attribute list if you don't care for attributes.
  void OperationPrinter::printRegionArgument(BlockArgument arg,
                                             ArrayRef<NamedAttribute> argAttrs,
                                             bool omitType) {
[543 lines not shown]

mlir/test/Bytecode/general.mlir

  // RUN: mlir-opt -allow-unregistered-dialect -emit-bytecode %s | mlir-opt -allow-unregistered-dialect | FileCheck %s

  // Bytecode currently does not support big-endian platforms
  // UNSUPPORTED: target=s390x-{{.*}}

+ // CHECK: dialect_versions { test = version<"1.42"> }
  // CHECK-LABEL: "bytecode.test1"
  // CHECK-NEXT: "bytecode.empty"() : () -> ()
  // CHECK-NEXT: "bytecode.attributes"() {attra = 10 : i64, attrb = #bytecode.attr} : () -> ()
  // CHECK-NEXT: test.graph_region {
  // CHECK-NEXT: "bytecode.operands"(%[[RESULTS:.*]]#0, %[[RESULTS]]#1, %[[RESULTS]]#2) : (i32, i64, i32) -> ()
  // CHECK-NEXT: %[[RESULTS]]:3 = "bytecode.results"() : () -> (i32, i64, i32)
  // CHECK-NEXT: }
  // CHECK-NEXT: "bytecode.branch"()[^[[BLOCK:.*]]] : () -> ()
[24 lines not shown]

mlir/test/IR/ir_upgrade.mlir

This file was added.

				// RUN: mlir-opt -split-input-file --verify-diagnostics %s | FileCheck %s

				dialect_versions { test = version<"0x0100000029000000"> }

				// -----

				dialect_versions { test = version<"1.39"> }

				// -----

				// CHECK: dialect_versions { test = version<"1.42"> }
				// CHECK: "test.versionedA"() {dims = 123 : i64} : () -> ()
				"test.versionedA"() {dims = 123 : i64} : () -> ()

				// -----

				// CHECK: dialect_versions { test = version<"1.42"> }
				// CHECK: "test.versionedA"() {dims = 123 : i64} : () -> ()
				dialect_versions { test = version<"0x0100000029000000"> }
				"test.versionedA"() {dimensions = 123 : i64} : () -> ()

				// -----

				// CHECK: dialect_versions { test = version<"1.42"> }
				// CHECK: "test.versionedA"() {dims = 123 : i64} : () -> ()
				dialect_versions { test = version<"0x0100000029000000"> }
				"test.versionedA"() {dims = 123 : i64} : () -> ()

				// -----

				// CHECK: dialect_versions { test = version<"1.42"> }
				// CHECK: test.versionedB current_version
				dialect_versions { test = version<"0x0120000009000000"> }
				test.versionedB deprecated_syntax

				// -----

				// CHECK: dialect_versions { test = version<"1.42"> }
				// CHECK: test.versionedB current_version
				dialect_versions { test = version<"0x010000002A000000"> }
				test.versionedB current_version

				// -----

				dialect_versions { test = version<"0x010000002A000000"> }
				// expected-error@+1{{custom op 'test.versionedB' expected 'current_version'}}
				test.versionedB deprecated_syntax

				// -----
				// expected-error@-2{{current test dialect version is 1.42, can't parse version: 1.43}}
				dialect_versions { test = version<"1.43"> }
				"test.versionedA"() {dims = 123 : i64} : () -> ()
				No newline at end of file
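
The hexadecimal version strings in this test are not explained on the page. One plausible reading, given that `encodeDialectVersion` in `TestDialect.cpp` copies the raw bytes of a struct holding two `int` fields, is that the blob is `major` followed by `minor`, each stored as a little-endian 32-bit integer. The sketch below decodes a string under that assumption. The `Version` struct and the `decodeHexVersion` helper are hypothetical names introduced for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical mirror of the test dialect's version struct.
struct Version {
  uint32_t major;
  uint32_t minor;
};

// Convert one hexadecimal digit to its value, or -1 if it is not a digit.
static int hexDigit(char c) {
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

// Decode "0x" followed by exactly 16 hex digits (8 bytes).
// ASSUMPTION: bytes 0-3 hold `major` and bytes 4-7 hold `minor`,
// both little-endian. The patch does not document this layout.
std::optional<Version> decodeHexVersion(std::string_view text) {
  if (text.size() != 18 || text.substr(0, 2) != "0x")
    return std::nullopt;

  uint8_t bytes[8];
  for (size_t i = 0; i < 8; ++i) {
    int hi = hexDigit(text[2 + 2 * i]);
    int lo = hexDigit(text[3 + 2 * i]);
    if (hi < 0 || lo < 0)
      return std::nullopt;
    bytes[i] = static_cast<uint8_t>(hi * 16 + lo);
  }

  // Reassemble a 32-bit value from four bytes, least significant first.
  auto readLE32 = [&bytes](size_t offset) {
    return static_cast<uint32_t>(bytes[offset]) |
           (static_cast<uint32_t>(bytes[offset + 1]) << 8) |
           (static_cast<uint32_t>(bytes[offset + 2]) << 16) |
           (static_cast<uint32_t>(bytes[offset + 3]) << 24);
  };
  return Version{readLE32(0), readLE32(4)};
}
```

Under this reading, the blobs ending in `29000000` and `2A000000` carry minor versions below and equal to 42 respectively, which would match the upgrade and no-upgrade cases the test checks for.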

mlir/test/lib/Dialect/Test/TestDialect.cpp

Show All 26 Lines
#include "mlir/IR/Verifier.h"		#include "mlir/IR/Verifier.h"
#include "mlir/Interfaces/InferIntRangeInterface.h"		#include "mlir/Interfaces/InferIntRangeInterface.h"
#include "mlir/Reducer/ReductionPatternInterface.h"		#include "mlir/Reducer/ReductionPatternInterface.h"
#include "mlir/Transforms/FoldUtils.h"		#include "mlir/Transforms/FoldUtils.h"
#include "mlir/Transforms/InliningUtils.h"		#include "mlir/Transforms/InliningUtils.h"
#include "llvm/ADT/SmallString.h"		#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include <optional>

#include <numeric>		#include <numeric>
		#include <optional>

// Include this before the using namespace lines below to		// Include this before the using namespace lines below to
// test that we don't have namespace dependencies.		// test that we don't have namespace dependencies.
#include "TestOpsDialect.cpp.inc"		#include "TestOpsDialect.cpp.inc"

using namespace mlir;		using namespace mlir;
using namespace test;		using namespace test;

void test::registerTestDialect(DialectRegistry &registry) {		void test::registerTestDialect(DialectRegistry &registry) {
registry.insert<TestDialect>();		registry.insert<TestDialect>();
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
		// TestDialect version utilities
		//===----------------------------------------------------------------------===//

		struct TestDialectVersion {
		int major = 1;
		int minor = 42;
		};

		// Encode/decode a version attribute
		AsmDialectVersionHandle encodeDialectVersion(MLIRContext *ctx,
		const TestDialectVersion version) {
		ArrayRef<uint8_t> encode(reinterpret_cast<const uint8_t *>(&version),
		sizeof(TestDialectVersion));
		mehdi_amini: We should use the bytecode encoding here, for portability at a minimum. That is, encode the two ints as varints (and decode them when loading).
		mfrancio (Author): This is a very good point that I overlooked. It looks like we have two separate problems here: one is portability, and the other is applying some sort of compression (not really critical in my view, but nice to have). For the purpose of the example, the first problem could be solved simply by using the helpers exposed by LLVM under llvm/Support/Endian.h. For example, we could write/read the integers representing the version using `inline void write16le(void *P, uint16_t V) { write16<little>(P, V); }` and `inline uint16_t read16le(const void *P) { return read16<little>(P); }`. For the second, using varint is definitely a great idea. It would be great to reuse the same bytecode emitters and readers, but it looks like they are not really exposed outside the bytecode cpp files. What we could do is expose the varint portion as helpers under mlir/Support. I am open to doing it, but since this is just an example, is it really worth it? Looking forward to hearing your thoughts.
		mehdi_amini: The bytecode primitives are exposed in the public header `mlir/include/mlir/Bytecode/BytecodeImplementation.h`. Have a look at the dialect interface for manipulating types and attributes: `virtual Attribute readAttribute(DialectBytecodeReader &reader) const {`. We should model the API here similarly: for a dialect, writing a custom version blob should be no different than writing an attribute.
		mfrancio (Author): Sounds good, I'll take a look.
		mfrancio (Author): I considered this, but I don't really see a way to model the API for reading and writing a version into the dialect section through what is exposed in `BytecodeImplementation.h`. That dialect interface seems to have very specific objectives that are tied to writing and reading custom attributes and types into their respective sections. What we need is an interface that allows the dialect to write into a custom buffer. We could model the API through this interface, but it would become something pretty close to EncodingEmitter implemented in `mlir/lib/Bytecode/Writer/BytecodeWriter.cpp`, line 64. Wouldn't it be more convenient to expose something like this under Support? It is true that writing a blob of data is no different than writing an attribute, but what changes here is the way this blob of data is created. For the attribute, its encoding is defined. But since we want to be independent from any existing attribute, and also completely defined by the user, I don't really see another convenient way of doing this other than exposing a low-level API to the user to write whatever encoding they need into the data blob that they wish to use to represent the version.
		mehdi_amini:
		> I considered this, but I don't really see a way to model the API for reading and writing a version into the dialect section through what is exposed in BytecodeImplementation.h. That dialect interface seem to have very specific objectives that are tied to writing and reading custom attributes and types into their respective sections
		Right, sorry if I gave the impression that this interface was "ready to be used" as-is here. I meant to point to it as an example of an API that allows dialect authors to access bytecode manipulation primitives.
		> what we need is an interface that allows the dialect to write into a custom buffer. We could model the API through this interface, but it would become something pretty close to EncodingEmitter implemented in mlir/lib/Bytecode/Writer/BytecodeWriter.cpp, line 64. Wouldn't it be just more convenient to expose something like this under Support? It is true that writing a blob of data is no different than writing an attribute, but what changes here is the way this blob of data is created. For the attribute, its encoding is defined. But since we want to be independent from any existing attribute, and also completely defined by the user, I don't really see another convenient way of doing this other than exposing low level API to the user to write whatever encoding they need into their data blob that they wish to use to represent the version.
		I started typing a long answer here, but felt like I was missing something, so I sketched something here instead: https://reviews.llvm.org/D145328 (there is still a bug, and I haven't regenerated the bytecode test file, but the interface is there!)
		mfrancio (Author): Thanks for the suggestion. This is very neat, I'll try to finalize it and regenerate the bytecode test files.
		jpienaar: I like the sketch.
		return AsmDialectVersionHandle(
		encode, ctx->getOrLoadDialect<TestDialect>()->getNamespace());
		}

		TestDialectVersion decodeDialectVersion(AsmDialectVersionHandle handle) {
		auto version = TestDialectVersion(
		*reinterpret_cast<const TestDialectVersion *>(handle.getRawData()));
		return version;
		}
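
The review thread on `encodeDialectVersion` asks for a portable encoding of the two version integers instead of copying the struct's raw bytes, whose layout depends on host endianness and `int` width. As a minimal sketch of that idea, the code below writes each field as an unsigned LEB128 varint into a byte vector and reads it back. This is a generic varint scheme chosen for illustration; it is not MLIR's bytecode varint format and it does not use the `DialectBytecodeWriter`/`DialectBytecodeReader` interfaces mentioned in the thread. All names (`Version`, `writeVarInt`, `readVarInt`, `encodeVersion`, `decodeVersion`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the test dialect's version struct.
struct Version {
  uint64_t major = 1;
  uint64_t minor = 42;
};

// Append `value` as an unsigned LEB128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last.
void writeVarInt(std::vector<uint8_t> &out, uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<uint8_t>(value));
}

// Read one varint starting at `pos` and advance `pos` past it.
// Returns std::nullopt if the input is truncated or longer than 10 bytes.
std::optional<uint64_t> readVarInt(const std::vector<uint8_t> &in,
                                   size_t &pos) {
  uint64_t result = 0;
  for (unsigned shift = 0; shift < 64; shift += 7) {
    if (pos >= in.size())
      return std::nullopt;
    uint8_t byte = in[pos++];
    result |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0)
      return result;
  }
  return std::nullopt;
}

// Encode the version as two consecutive varints: major, then minor.
std::vector<uint8_t> encodeVersion(const Version &version) {
  std::vector<uint8_t> bytes;
  writeVarInt(bytes, version.major);
  writeVarInt(bytes, version.minor);
  return bytes;
}

// Decode a version; fails on malformed input or trailing bytes.
std::optional<Version> decodeVersion(const std::vector<uint8_t> &bytes) {
  size_t pos = 0;
  std::optional<uint64_t> major = readVarInt(bytes, pos);
  if (!major)
    return std::nullopt;
  std::optional<uint64_t> minor = readVarInt(bytes, pos);
  if (!minor || pos != bytes.size())
    return std::nullopt;
  return Version{*major, *minor};
}
```

Because the byte sequence is defined value by value, the same blob decodes identically regardless of the endianness of the machine that wrote it, which addresses the portability concern; the compact size for small values is the secondary benefit mentioned in the thread.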

		//===----------------------------------------------------------------------===//
// TestDialect Interfaces		// TestDialect Interfaces
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

namespace {		namespace {

/// Testing the correctness of some traits.		/// Testing the correctness of some traits.
static_assert(		static_assert(
llvm::is_detected<OpTrait::has_implicit_terminator_t,		llvm::is_detected<OpTrait::has_implicit_terminator_t,
Show All 33 Lines	std::optional<StringRef> aliasName =
.Case("alias_test:dot_in_name", StringRef("test.alias"))		.Case("alias_test:dot_in_name", StringRef("test.alias"))
.Case("alias_test:trailing_digit", StringRef("test_alias0"))		.Case("alias_test:trailing_digit", StringRef("test_alias0"))
.Case("alias_test:prefixed_digit", StringRef("0_test_alias"))		.Case("alias_test:prefixed_digit", StringRef("0_test_alias"))
.Case("alias_test:sanitize_conflict_a",		.Case("alias_test:sanitize_conflict_a",
StringRef("test_alias_conflict0"))		StringRef("test_alias_conflict0"))
.Case("alias_test:sanitize_conflict_b",		.Case("alias_test:sanitize_conflict_b",
StringRef("test_alias_conflict0_"))		StringRef("test_alias_conflict0_"))
.Case("alias_test:tensor_encoding", StringRef("test_encoding"))		.Case("alias_test:tensor_encoding", StringRef("test_encoding"))
.Default(std::nullopt);		.Default(std::nullopt);
		mehdi_amini: Would `else` be enough here?
		mfrancio (Author): I think the comment below is misleading. The intent was to forbid reading a version newer than the current one. I'll revise this.
if (!aliasName)		if (!aliasName)
return AliasResult::NoAlias;		return AliasResult::NoAlias;

os << *aliasName;		os << *aliasName;
return AliasResult::FinalAlias;		return AliasResult::FinalAlias;
}		}

AliasResult getAlias(Type type, raw_ostream &os) const final {		AliasResult getAlias(Type type, raw_ostream &os) const final {
Show All 35 Lines	struct TestOpAsmInterface : public OpAsmDialectInterface {

FailureOr<AsmDialectResourceHandle>		FailureOr<AsmDialectResourceHandle>
declareResource(StringRef key) const final {		declareResource(StringRef key) const final {
return blobManager.insert(key);		return blobManager.insert(key);
}		}

LogicalResult parseResource(AsmParsedResourceEntry &entry) const final {		LogicalResult parseResource(AsmParsedResourceEntry &entry) const final {
FailureOr<AsmResourceBlob> blob = entry.parseAsBlob();		FailureOr<AsmResourceBlob> blob = entry.parseAsBlob();
if (failed(blob))		if (failed(blob))
		jpienaar: Note: error messages should follow LLVM convention and be a sentence fragment (start lower case, no trailing punctuation).
return failure();		return failure();

// Update the blob for this entry.		// Update the blob for this entry.
blobManager.update(entry.getKey(), std::move(*blob));		blobManager.update(entry.getKey(), std::move(*blob));
return success();		return success();
}		}

void		void
buildResources(Operation *op,		buildResources(Operation *op,
const SetVector<AsmDialectResourceHandle> &referencedResources,		const SetVector<AsmDialectResourceHandle> &referencedResources,
AsmResourceBuilder &provider) const final {		AsmResourceBuilder &provider) const final {
blobManager.buildResources(provider, referencedResources.getArrayRef());		blobManager.buildResources(provider, referencedResources.getArrayRef());
}		}

		// Test IR upgrade with dialect version lookup.
		AsmDialectVersionHandle getProducerVersion() const final {
		return encodeDialectVersion(getContext(), TestDialectVersion());
		}

		FailureOr<AsmDialectVersionHandle>
		parseVersionAsString(StringRef token) const final {
		// We represent our version as a string `"major.minor"`
		auto split = token.split('.');
		TestDialectVersion version;
		version.major = std::stoi(split.first.str());
		version.minor = std::stoi(split.second.str());
		return encodeDialectVersion(getContext(), version);
		}

		// We would like to represent our version as version<"major.minor">
		FailureOr<std::string>
		printVersionAsString(AsmDialectVersionHandle attr) const final {
		auto version = decodeDialectVersion(attr);
		std::string result = std::to_string(version.major);
		result.append(".");
		result.append(std::to_string(version.minor));
		return result;
		}

		LogicalResult
		upgradeFromVersion(Operation *topLevelOp,
		AsmDialectVersionHandle producerVersion) const final {
		auto version = decodeDialectVersion(producerVersion);
		if (version.minor == 42)
		return success();
		if (version.minor > 42) {
		return topLevelOp->emitError()
		<< "current test dialect version is 1.42, can't parse version: "
		<< version.major << "." << version.minor;
		}
		topLevelOp->walk([](TestVersionedOpA op) {
		if (auto dims = op->getAttr("dimensions")) {
		op->removeAttr("dimensions");
		op->setAttr("dims", dims);
		}
		});

		return success();
		}

private:		private:
/// The blob manager for the dialect.		/// The blob manager for the dialect.
TestResourceBlobManagerInterface &blobManager;		TestResourceBlobManagerInterface &blobManager;
};		};
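
`parseVersionAsString` above splits the token on `.` and calls `std::stoi`, which throws on non-numeric input, and `upgradeFromVersion` compares only the minor component. As a hedged sketch of a stricter variant, the code below parses a `"major.minor"` token with `std::from_chars` (C++17), reports malformed input through `std::optional` instead of exceptions, and compares both components. The names `Version`, `parseVersion`, and `isNewerThan` are hypothetical and not part of the patch; comparing the major component as well is an assumption about the intended semantics.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for the test dialect's version struct.
struct Version {
  int major = 1;
  int minor = 42;
};

// Parse a non-negative decimal integer that spans the whole of `text`.
static std::optional<int> parseComponent(std::string_view text) {
  int value = 0;
  const char *begin = text.data();
  const char *end = begin + text.size();
  auto [ptr, ec] = std::from_chars(begin, end, value);
  // Reject empty input, trailing characters, overflow, and negatives.
  if (ec != std::errc() || ptr != end || value < 0)
    return std::nullopt;
  return value;
}

// Parse "major.minor"; returns std::nullopt on any malformed token.
std::optional<Version> parseVersion(std::string_view token) {
  size_t dot = token.find('.');
  if (dot == std::string_view::npos)
    return std::nullopt;
  std::optional<int> major = parseComponent(token.substr(0, dot));
  std::optional<int> minor = parseComponent(token.substr(dot + 1));
  if (!major || !minor)
    return std::nullopt;
  return Version{*major, *minor};
}

// True if `lhs` is strictly newer than `rhs`, comparing major first.
bool isNewerThan(const Version &lhs, const Version &rhs) {
  if (lhs.major != rhs.major)
    return lhs.major > rhs.major;
  return lhs.minor > rhs.minor;
}
```

With helpers like these, a reader could reject a producer version for which `isNewerThan(producer, current)` holds, which is the intent the author describes in the review comments, and fall through to the upgrade path otherwise.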

struct TestDialectFoldInterface : public DialectFoldInterface {		struct TestDialectFoldInterface : public DialectFoldInterface {
using DialectFoldInterface::DialectFoldInterface;		using DialectFoldInterface::DialectFoldInterface;

▲ Show 20 Lines • Show All 922 Lines • ▼ Show 20 Lines	void TestOpWithRegionPattern::getCanonicalizationPatterns(
RewritePatternSet &results, MLIRContext *context) {		RewritePatternSet &results, MLIRContext *context) {
results.add<TestRemoveOpWithInnerOps>(context);		results.add<TestRemoveOpWithInnerOps>(context);
}		}

OpFoldResult TestOpWithRegionFold::fold(FoldAdaptor adaptor) {		OpFoldResult TestOpWithRegionFold::fold(FoldAdaptor adaptor) {
return getOperand();		return getOperand();
}		}

OpFoldResult TestOpConstant::fold(FoldAdaptor adaptor) {		OpFoldResult TestOpConstant::fold(FoldAdaptor adaptor) { return getValue(); }
return getValue();
}

LogicalResult TestOpWithVariadicResultsAndFolder::fold(		LogicalResult TestOpWithVariadicResultsAndFolder::fold(
FoldAdaptor adaptor, SmallVectorImpl<OpFoldResult> &results) {		FoldAdaptor adaptor, SmallVectorImpl<OpFoldResult> &results) {
for (Value input : this->getOperands()) {		for (Value input : this->getOperands()) {
results.push_back(input);		results.push_back(input);
}		}
return success();		return success();
}		}
▲ Show 20 Lines • Show All 483 Lines • ▼ Show 20 Lines
}		}

void TestWithBoundsRegionOp::print(OpAsmPrinter &p) {		void TestWithBoundsRegionOp::print(OpAsmPrinter &p) {
p.printOptionalAttrDict((*this)->getAttrs());		p.printOptionalAttrDict((*this)->getAttrs());
p << ' ';		p << ' ';
p.printRegionArgument(getRegion().getArgument(0), /*argAttrs=*/{},		p.printRegionArgument(getRegion().getArgument(0), /*argAttrs=*/{},
/*omitType=*/true);		/*omitType=*/true);
p << ' ';		p << ' ';
p.printRegion(getRegion(), /*printEntryBlockArgs=*/false);		p.printRegion(getRegion(), /*printEntryBlockArgs=*/false);
		jpienaar: I may have missed where this is used.
		mfrancio (Author): Yes, indeed I forgot to upload the corresponding mlir file.
}		}

void TestWithBoundsRegionOp::inferResultRanges(		void TestWithBoundsRegionOp::inferResultRanges(
ArrayRef<ConstantIntRanges> argRanges, SetIntRangeFn setResultRanges) {		ArrayRef<ConstantIntRanges> argRanges, SetIntRangeFn setResultRanges) {
Value arg = getRegion().getArgument(0);		Value arg = getRegion().getArgument(0);
setResultRanges(arg, {getUmin(), getUmax(), getSmin(), getSmax()});		setResultRanges(arg, {getUmin(), getUmax(), getSmin(), getSmax()});
}		}

Show All 13 Lines	void TestReflectBoundsOp::inferResultRanges(
Builder b(ctx);		Builder b(ctx);
setUminAttr(b.getIndexAttr(range.umin().getZExtValue()));		setUminAttr(b.getIndexAttr(range.umin().getZExtValue()));
setUmaxAttr(b.getIndexAttr(range.umax().getZExtValue()));		setUmaxAttr(b.getIndexAttr(range.umax().getZExtValue()));
setSminAttr(b.getIndexAttr(range.smin().getSExtValue()));		setSminAttr(b.getIndexAttr(range.smin().getSExtValue()));
setSmaxAttr(b.getIndexAttr(range.smax().getSExtValue()));		setSmaxAttr(b.getIndexAttr(range.smax().getSExtValue()));
setResultRanges(getResult(), range);		setResultRanges(getResult(), range);
}		}

		//===----------------------------------------------------------------------===//
		// Test dialect_version upgrade by supporting an old syntax
		//===----------------------------------------------------------------------===//

		ParseResult TestVersionedOpB::parse(mlir::OpAsmParser &parser,
		mlir::OperationState &state) {
		auto handle = parser.getDialectVersion("test");
		auto version = decodeDialectVersion(handle);
		if (version.minor < 42)
		return parser.parseKeyword("deprecated_syntax");
		return parser.parseKeyword("current_version");
		}

		void TestVersionedOpB::print(OpAsmPrinter &printer) {
		printer << " current_version";
		}

#include "TestOpEnums.cpp.inc"		#include "TestOpEnums.cpp.inc"
#include "TestOpInterfaces.cpp.inc"		#include "TestOpInterfaces.cpp.inc"
#include "TestTypeInterfaces.cpp.inc"		#include "TestTypeInterfaces.cpp.inc"

#define GET_OP_CLASSES		#define GET_OP_CLASSES
#include "TestOps.cpp.inc"		#include "TestOps.cpp.inc"

mlir/test/lib/Dialect/Test/TestOps.td

Show First 20 Lines • Show All 3,143 Lines • ▼ Show 20 Lines	def TestCSEOfSingleBlockOp : TEST_Op<"cse_of_single_block_op",
let results = (outs Variadic<AnyType>:$outputs);		let results = (outs Variadic<AnyType>:$outputs);
let regions = (region SizedRegion<1>:$region);		let regions = (region SizedRegion<1>:$region);
let assemblyFormat = [{		let assemblyFormat = [{
attr-dict `inputs` `(` $inputs `)`		attr-dict `inputs` `(` $inputs `)`
$region `:` type($inputs) `->` type($outputs)		$region `:` type($inputs) `->` type($outputs)
}];		}];
}		}

		//===----------------------------------------------------------------------===//
		// Test Ops to upgrade base on the dialect versions
		//===----------------------------------------------------------------------===//

		def TestVersionedOpA : TEST_Op<"versionedA"> {
		let arguments = (ins
		// A previous version of this dialect used the name "dimensions" for this
		// attribute, it got renamed but we support loading old IR through
		// upgrading, see `upgradeFromVersion()` in `TestOpAsmInterface`.
		AnyI64Attr:$dims
		);
		}

		// This op will be able to parse based on an old syntax and "auto-upgrade".
		def TestVersionedOpB : TEST_Op<"versionedB"> {
		let arguments = (ins);

		let hasCustomAssemblyFormat = 1;
		// let parser = [{ return ::parseVersionedOp(parser, result); }];
		// let printer = [{ return ::print(*this, p); }];
		rriddle: Dead code?
		mfrancio (Author): Yep. Thanks for pointing it out!
		mehdi_amini: Leftover?
		}

#endif // TEST_OPS		#endif // TEST_OPS

Revision Contents

Diff 499217
