This is an archive of the discontinued LLVM Phabricator instance.

[mlir][bytecode] Allow client to specify an older version of the bytecode to emit
ClosedPublic

Authored by jpienaar on Mar 21 2023, 11:36 AM.

Details

Summary

Add method to set a desired bytecode file format to generate. Change
write method to be able to return minimum bytecode version needed in
reader. This enables generating an older version of the bytecode (not
dialect ops, attributes or types). But this does not guarantee that an
older version can always be generated. This clamps setting to at most
current version.

Diff Detail

Event Timeline

jpienaar created this revision.Mar 21 2023, 11:36 AM
jpienaar requested review of this revision.Mar 21 2023, 11:36 AM

If we can't guarantee it can be generated, I don't think this should be added.

Aside from that, I'd like for us to finish up things we want to add to the format before we start to consider guaranteeing compatibility in this way.

mehdi_amini added inline comments.Mar 23 2023, 9:33 AM
mlir/test/Bytecode/versioning/versioned_bytecode.mlir
15

(newline)

jpienaar retitled this revision from [mlir][bytecode] Add version in write config. to [mlir][bytecode] Allow client to specify an older version of the bytecode to emit.Mar 23 2023, 10:14 AM

If we can't guarantee it can be generated, I don't think this should be added.

I don't think it can ever be a guarantee, but if the starting point is generic enough, you shouldn't hit compatibility issues anytime soon. The benefits of having this severely outweighs the risks of breaking compatibility.

Aside from that, I'd like for us to finish up things we want to add to the format before we start to consider guaranteeing compatibility in this way.

I agree with this.

If we can't guarantee it can be generated, I don't think this should be added.

I don't think it can ever be a guarantee, but if the starting point is generic enough, you shouldn't hit compatibility issues anytime soon. The benefits of having this severely outweighs the risks of breaking compatibility.

I will go one step further and say, we need to strive for a guarantee, that is what stability should be, which would allow us to write a serialized format which works on an older os, the one we shipped the first version with.

jpienaar updated this revision to Diff 515822.Apr 21 2023, 10:25 AM

Make clamp instead of fail and enable dialect writer to query set version

This was flagged as being needed to handle cases where dialect could use feature not at bytecode to serialize. There is no failure path added there for writes yet.

Adding the API does not commit us to much right now though, even if we add the

mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h
83

Instead of an optional, can we just default initialize it with the current bytecode version? It may simplify the call-sites in the future...

mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
891

This API will need to return a LogicalResult somehow to express failures...

mlir/lib/Tools/mlir-opt/MlirOptMain.cpp
87

The help message needs an update

89

We should likely check somewhere that emit-bytecode-version is not set without emit-bytecode

jpienaar updated this revision to Diff 516617.Apr 24 2023, 9:19 PM
jpienaar marked 4 inline comments as done.

Return error status for write method.

Yes, worst case this enables quantifying impact for formalizing expectations.

mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h
83

The current bytecode version is private in lib/Bytecode/Encoding.h - splitting it out (had chosen mlir/Bytecode/BytecodeVersion.h) resulted in this file depending on file in bytecode which felt a bit weird.

jpienaar added inline comments.Apr 24 2023, 9:25 PM
mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h
83

This is currently private inside lib/Bytecode/Encoding.h - I thought about exposing it in mlir/Bytecode/BytecodeVersion.h but then Tools/mlir-opt depended on Bytecode which felt a bit odd. Let me know and I can change it (I mean given clamp behavior, I could initialize this to maxint here and it would behave the same).

mehdi_amini added inline comments.Apr 24 2023, 11:29 PM
mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h
83

LG!

mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
891

I was wondering about this today and felt that LogicalResult is a bit broad: we make it seems like serialization can fail in general, but we only want to fail if a specific bytecode version was requested and not honored.

So what about returning the actual minimum version of the reader that will be needer to read the bytecode?
If a method from version 4 is used by a dialect and the client requested version 2, we'd return 4, and it's up to the client to decide if they want to fail or not.

jpienaar updated this revision to Diff 516837.Apr 25 2023, 10:17 AM

Return min bytecode reader needs to support instead of failure, but make it failure in mlir-opt if the requested and required do not match.

jpienaar added inline comments.Apr 25 2023, 6:41 PM
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
891

Works for me - made the opt tool behavior an error if not request version.

jpienaar edited the summary of this revision. (Show Details)Apr 25 2023, 6:47 PM
jpienaar updated this revision to Diff 517055.Apr 25 2023, 10:10 PM

Plumb through to Python

jpienaar marked 3 inline comments as done.Apr 25 2023, 10:11 PM
mehdi_amini accepted this revision.Apr 25 2023, 10:13 PM

LGTM, but please wait for River's feedback as well!

This revision is now accepted and ready to land.Apr 25 2023, 10:13 PM
GleasonK added inline comments.
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
891

If a method from version 4 is used by a dialect and the client requested version 2, we'd return 4

If we can make forward compatibility a matter of the dialects used, that would be great! The StableHLO team is very interested in upstream forward compatibility, happy to follow more strict dialect policies to make it work if needed.

jpienaar added inline comments.Apr 26 2023, 7:15 PM
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
891

If we can make forward compatibility a matter of the dialects used

Not sure I follow. I believe Mehdi's proposal here is that if a dialect uses a write method only available at version X then return at least X (now, we have no such case at the moment so nothing modifies the set number yet). But a dialect can query the requested bytecode version and decide which method to use. So its under the dialects control to only use encoding methods available at a specific version: its a function of methods used rather than dialects used, as a given dialect could still use new encoding methods but the module/op serialized just doesn't use any of those and it would still be be able to serialize with the old.

rriddle added inline comments.Apr 27 2023, 12:01 AM
mlir/include/mlir/Bytecode/BytecodeWriter.h
43–45

I would likely name this something more explicit, like: trySetBytecodeVersion, given that it isn't guaranteed to actually set the version provided. We could return the actual version used, and let the user decide what to do about it.

85

Please create a result struct for this. Returning an integer feels a bit too opaque here.

mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
31–32

Note: This is only set if a specific set version can be emitted.

This felt a bit weird to read. It might be more accurate to say it only differs from kVersion if ...

mlir/lib/Tools/mlir-opt/MlirOptMain.cpp
46–59

Why do we need a custom parser for optional? Why not just use the getNumOccurrences method to check if this has been set?

jpienaar marked 4 inline comments as done.Apr 27 2023, 9:53 AM

(updating to struct but wanted to check on other parts in parallel)

mlir/include/mlir/Bytecode/BytecodeWriter.h
43–45

How about setDesiredBytecodeVersion? Here we don't yet know what the dialects involved would be doing, so couldn't return actual version and instead returning it from write entry point.

mlir/lib/Tools/mlir-opt/MlirOptMain.cpp
46–59

Usage of external storage - at least I couldn't find nice way that doesn't use sentinel or other usage like this.

jpienaar updated this revision to Diff 517697.Apr 27 2023, 1:39 PM
jpienaar marked 2 inline comments as done.

Use a more explicit return struct from writer.

jpienaar updated this revision to Diff 517732.Apr 27 2023, 3:27 PM

Fixing formatting

jpienaar updated this revision to Diff 518105.Apr 28 2023, 5:57 PM

Retrying clang-format

rriddle accepted this revision.Apr 29 2023, 2:39 AM
rriddle added inline comments.
mlir/lib/Tools/mlir-lsp-server/MLIRServer.cpp
888–889 ↗(On Diff #518105)

Do we need the comment or the void cast at all here anymore?

This revision was automatically updated to reflect the committed changes.
jpienaar marked an inline comment as done.
mehdi_amini added inline comments.Apr 29 2023, 10:40 PM
mlir/include/mlir-c/IR.h
568 ↗(On Diff #518160)

Can you make this nodiscard?

We don't have a test where the bitcode gets emitted at a higher requirement than requested. I think we would actually emit bitcode that can't be loaded anymore.

That is we can't change the bitcode version in the middle of the emission, it would create inconsistent version within the file.

mlir/test/Bytecode/versioning/versioned_bytecode.mlir
12

I don't think we check here what the actual emitted version is?

mehdi_amini added inline comments.Apr 29 2023, 10:57 PM
mlir/test/Bytecode/versioning/versioned_bytecode.mlir
12

Actually scratch that: mlir-opt would error out if it wasn't the requested version

We don't have a test where the bitcode gets emitted at a higher requirement than requested. I think we would actually emit bitcode that can't be loaded anymore.

That is we can't change the bitcode version in the middle of the emission, it would create inconsistent version within the file.

You are correct, when this was just returning a failure result it was fine but with version returned and "auto relaxing" not so much. I don't know if general case can be supported except by attempting to emit and then redoing it (at which point success or not can be returned and caller could decide to redo instead).

Yeah I don't see another way, we should probably just abort the emission and emit an error when a dialect emits with a newer version.