This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Python] Sync Python bindings with C API MlirStringRef modification.
ClosedPublic

Authored by stellaraccident on Nov 23 2020, 11:26 PM.

Details

Summary

MLIR C API use the MlirStringRef instead of const char * for the string type now. This patch sync the Python bindings with the C API modification.

Diff Detail

Event Timeline

zhanghb97 created this revision.Nov 23 2020, 11:26 PM
zhanghb97 requested review of this revision.Nov 23 2020, 11:26 PM
ftynse requested changes to this revision.Nov 24 2020, 12:26 AM
ftynse added inline comments.
mlir/lib/Bindings/Python/IRModules.cpp
911

Let's use mlirStringRefCreate instead and get the length of the string from std::string. This will avoid the need to list all characters in the string until \0. Bonus points for adding a helper.

This revision now requires changes to proceed.Nov 24 2020, 12:26 AM
zhanghb97 added inline comments.Nov 24 2020, 2:32 AM
mlir/lib/Bindings/Python/IRModules.cpp
911

When I add a helper function like this:

static MlirStringRef createMlirStringRef(std::string str) {
  return mlirStringRefCreate(str.c_str(), str.length());
}

It seems that the lifetime of the std::string isn't extended out of the function. I think I need help here: how can I extend the lifetime of the string, and avoid memory leak at the same time?

Does anyone mind if I commandeer this change (it is broken at head)?

Small process question: I authored the change to us MlirStringRef instead of const char *, what is the policy about breaking changes in the C API? I think it is likely unreasonable to expect any change in the C API to fix all in-tree language bindings, but since the Python bindings are in-tree, these types of changes can break them. I'm might be the only person working on non-Python bindings here, so I may be biased :)

Small process question: I authored the change to us MlirStringRef instead of const char *, what is the policy about breaking changes in the C API? I think it is likely unreasonable to expect any change in the C API to fix all in-tree language bindings, but since the Python bindings are in-tree, these types of changes can break them. I'm might be the only person working on non-Python bindings here, so I may be biased :)

The RFC where we started the Python bindings identified this as an optional feature, so you're good :) We're working on getting buildbot coverage and making the case to enable it by default, but that will be a discussion. Until then, keeping it updated is best effort.

Also, as a non-trivial user of the C-API in tree, it may eventually be a good buffer against unintended API changes (we want the C-API to be fairly stable).

stellaraccident commandeered this revision.Nov 24 2020, 10:47 AM
stellaraccident edited reviewers, added: zhanghb97; removed: stellaraccident.

Commandeer change and resolve comments.

I think it is likely unreasonable to expect any change in the C API to fix all in-tree language bindings,

Why is that? We update all the C++ API users when we update it and the surface is much higher.

The difference is that all the consumers of the C++ API are C++, so it is pretty straightforward to update when making breaking changes even if the surface is greater. Language bindings will live at least partially in their own ecosystem and requiring folks who work on the C API to understand each individual ecosystem may be too onerous, even if the surface area is smaller. To be clear, I’m just saying there is a tradeoff here, not advocating for us to decide on a particular point on that spectrum. I agree with Stella that we can wait until this becomes a bigger issue and then have a more formal discussion (hopefully with more data in the form of diverse language bindings).

The difference is that all the consumers of the C++ API are C++, so it is pretty straightforward to update when making breaking changes even if the surface is greater.

You're clearly not updating all of the Google internal users ;)

Language bindings will live at least partially in their own ecosystem and requiring folks who work on the C API to understand each individual ecosystem may be too onerous, even if the surface area is smaller.

There are a few aspects here that make me doubtful about it:

  • the C API is more limited (by the virtue of using C: no overloading sets, no inheritance/overrides, no templates, etc.)
  • the C API is intended to be stable at some point and see little churn.
  • I does not seem hard to me to see myself changing the C API like here and adjust any in-tree language bindings in a fairly straightforward way, in most cases it'll be fairly mechanical to look at the API call site and update it. I don't quite get why this would be more complex than all of the C++ uses actually.
mehdi_amini added inline comments.Nov 24 2020, 11:16 AM
mlir/lib/Bindings/Python/IRModules.cpp
911

You should take the string as const std::string &std in the helper.

mehdi_amini accepted this revision.Nov 24 2020, 11:24 AM
stellaraccident marked 3 inline comments as done.Nov 24 2020, 11:33 AM

The difference is that all the consumers of the C++ API are C++, so it is pretty straightforward to update when making breaking changes even if the surface is greater.

You're clearly not updating all of the Google internal users ;)

Language bindings will live at least partially in their own ecosystem and requiring folks who work on the C API to understand each individual ecosystem may be too onerous, even if the surface area is smaller.

There are a few aspects here that make me doubtful about it:

  • the C API is more limited (by the virtue of using C: no overloading sets, no inheritance/overrides, no templates, etc.)
  • the C API is intended to be stable at some point and see little churn.
  • I does not seem hard to me to see myself changing the C API like here and adjust any in-tree language bindings in a fairly straightforward way, in most cases it'll be fairly mechanical to look at the API call site and update it. I don't quite get why this would be more complex than all of the C++ uses actually.

FWIW - I think we will eventually want these bindings updated as part of making changes to the C-API that they ride on top of. But with this as an optional feature that we have not yet finished/enabled by default, we're in an intermediate state. Once we have build bots on this and a reasonable path to build locally (still shaving some things there), I'll be sending an RFC to enable by default -- and then updating will be required.

This revision was not accepted when it landed; it landed in state Needs Review.Nov 24 2020, 11:35 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.

I'm so sorry for the late reply because of the time zone difference, and thanks for your help and review!