Add method to set a desired bytecode file format to generate. Change
write method to be able to return minimum bytecode version needed in
reader. This enables generating an older version of the bytecode (not
dialect ops, attributes or types). But this does not guarantee that an
older version can always be generated. This clamps setting to at most
current version.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
If we can't guarantee it can be generated, I don't think this should be added.
Aside from that, I'd like for us to finish up things we want to add to the format before we start to consider guaranteeing compatibility in this way.
mlir/test/Bytecode/versioning/versioned_bytecode.mlir | ||
---|---|---|
15 | (newline) |
I don't think it can ever be a guarantee, but if the starting point is generic enough, you shouldn't hit compatibility issues anytime soon. The benefits of having this severely outweighs the risks of breaking compatibility.
Aside from that, I'd like for us to finish up things we want to add to the format before we start to consider guaranteeing compatibility in this way.
I agree with this.
I will go one step further and say, we need to strive for a guarantee, that is what stability should be, which would allow us to write a serialized format which works on an older os, the one we shipped the first version with.
Make clamp instead of fail and enable dialect writer to query set version
This was flagged as being needed to handle cases where dialect could use feature not at bytecode to serialize. There is no failure path added there for writes yet.
Adding the API does not commit us to much right now though, even if we add the
mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h | ||
---|---|---|
83 | Instead of an optional, can we just default initialize it with the current bytecode version? It may simplify the call-sites in the future... | |
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp | ||
891 | This API will need to return a LogicalResult somehow to express failures... | |
mlir/lib/Tools/mlir-opt/MlirOptMain.cpp | ||
87 | The help message needs an update | |
89 | We should likely check somewhere that emit-bytecode-version is not set without emit-bytecode |
Yes, worst case this enables quantifying impact for formalizing expectations.
mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h | ||
---|---|---|
83 | The current bytecode version is private in lib/Bytecode/Encoding.h - splitting it out (had chosen mlir/Bytecode/BytecodeVersion.h) resulted in this file depending on file in bytecode which felt a bit weird. |
mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h | ||
---|---|---|
83 | This is currently private inside lib/Bytecode/Encoding.h - I thought about exposing it in mlir/Bytecode/BytecodeVersion.h but then Tools/mlir-opt depended on Bytecode which felt a bit odd. Let me know and I can change it (I mean given clamp behavior, I could initialize this to maxint here and it would behave the same). |
mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h | ||
---|---|---|
83 | LG! | |
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp | ||
891 | I was wondering about this today and felt that LogicalResult is a bit broad: we make it seems like serialization can fail in general, but we only want to fail if a specific bytecode version was requested and not honored. So what about returning the actual minimum version of the reader that will be needer to read the bytecode? |
Return min bytecode reader needs to support instead of failure, but make it failure in mlir-opt if the requested and required do not match.
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp | ||
---|---|---|
891 | Works for me - made the opt tool behavior an error if not request version. |
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp | ||
---|---|---|
891 |
If we can make forward compatibility a matter of the dialects used, that would be great! The StableHLO team is very interested in upstream forward compatibility, happy to follow more strict dialect policies to make it work if needed. |
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp | ||
---|---|---|
891 |
Not sure I follow. I believe Mehdi's proposal here is that if a dialect uses a write method only available at version X then return at least X (now, we have no such case at the moment so nothing modifies the set number yet). But a dialect can query the requested bytecode version and decide which method to use. So its under the dialects control to only use encoding methods available at a specific version: its a function of methods used rather than dialects used, as a given dialect could still use new encoding methods but the module/op serialized just doesn't use any of those and it would still be be able to serialize with the old. |
mlir/include/mlir/Bytecode/BytecodeWriter.h | ||
---|---|---|
43–45 | I would likely name this something more explicit, like: trySetBytecodeVersion, given that it isn't guaranteed to actually set the version provided. We could return the actual version used, and let the user decide what to do about it. | |
85 | Please create a result struct for this. Returning an integer feels a bit too opaque here. | |
mlir/lib/Bytecode/Writer/BytecodeWriter.cpp | ||
31–32 | Note: This is only set if a specific set version can be emitted. This felt a bit weird to read. It might be more accurate to say it only differs from kVersion if ... | |
mlir/lib/Tools/mlir-opt/MlirOptMain.cpp | ||
46–59 | Why do we need a custom parser for optional? Why not just use the getNumOccurrences method to check if this has been set? |
(updating to struct but wanted to check on other parts in parallel)
mlir/include/mlir/Bytecode/BytecodeWriter.h | ||
---|---|---|
43–45 | How about setDesiredBytecodeVersion? Here we don't yet know what the dialects involved would be doing, so couldn't return actual version and instead returning it from write entry point. | |
mlir/lib/Tools/mlir-opt/MlirOptMain.cpp | ||
46–59 | Usage of external storage - at least I couldn't find nice way that doesn't use sentinel or other usage like this. |
mlir/lib/Tools/mlir-lsp-server/MLIRServer.cpp | ||
---|---|---|
888–889 ↗ | (On Diff #518105) | Do we need the comment or the void cast at all here anymore? |
mlir/include/mlir-c/IR.h | ||
---|---|---|
568 ↗ | (On Diff #518160) | Can you make this nodiscard? |
We don't have a test where the bitcode gets emitted at a higher requirement than requested. I think we would actually emit bitcode that can't be loaded anymore.
That is we can't change the bitcode version in the middle of the emission, it would create inconsistent version within the file.
mlir/test/Bytecode/versioning/versioned_bytecode.mlir | ||
---|---|---|
12 | I don't think we check here what the actual emitted version is? |
mlir/test/Bytecode/versioning/versioned_bytecode.mlir | ||
---|---|---|
12 | Actually scratch that: mlir-opt would error out if it wasn't the requested version |
You are correct, when this was just returning a failure result it was fine but with version returned and "auto relaxing" not so much. I don't know if general case can be supported except by attempting to emit and then redoing it (at which point success or not can be returned and caller could decide to redo instead).
Yeah I don't see another way, we should probably just abort the emission and emit an error when a dialect emits with a newer version.
I would likely name this something more explicit, like: trySetBytecodeVersion, given that it isn't guaranteed to actually set the version provided. We could return the actual version used, and let the user decide what to do about it.