This is an archive of the discontinued LLVM Phabricator instance.

[mlir][py] Enable building ops with raw inputs
ClosedPublic

Authored by jpienaar on Dec 7 2022, 11:54 AM.

Details

Summary

For cases where we can automatically construct the Attribute allow for more
user-friendly input. This is consistent with C++ builder generation as well
choice of which single builder to generate here (most
specialized/user-friendly).

Registration of attribute builders from more raw/non-Attribute input is all
Python side.
The downside is that

  • same Python code may not work everywhere the same,
  • extra checking to see if user provided a custom builder in op builders,
  • the ODS attribute name is load bearing,

upside is that

  • easily change these/register dialect specific ones in downstream projects,
  • we/users can use, for example, numpy where available rather than lowest common denominator,
  • adding support/changing to different convenience builders are all along with the rest of the convenience functions in Python (no additional changes to tablegen file or recompilation needed);

Allow for both building with Attributes as well as raw inputs. This change
should therefore be backwards compatible as well as allow for avoiding
recreating Attribute where already available.

Diff Detail

Event Timeline

jpienaar created this revision.Dec 7 2022, 11:54 AM
jpienaar requested review of this revision.Dec 7 2022, 11:54 AM
jpienaar updated this revision to Diff 481642.Dec 9 2022, 7:08 AM

Have the builders take context additionally

Avoids always using default context.

jpienaar updated this revision to Diff 481749.Dec 9 2022, 2:34 PM

Remove leftover debugging

jpienaar updated this revision to Diff 482150.Dec 12 2022, 8:43 AM

Update test post making context explicit

jpienaar added a subscriber: phawkins.
jpienaar updated this revision to Diff 482693.Dec 13 2022, 8:08 PM

Try moving helpers to common spot

In general, this feels like a usability improvement. We need user-level documentation on how to use and extend this.

Regarding potential downsides:

  • Could we somehow limit the location of AttrBuilder functions so that a downstream integration of bindings can only add functions for attributes from its own dialects (and not core dialects)? I understand this is Python and one can always find a workaround, but making it intentionally hard would send the message. The downstream may want to construct built-in attributes from its custom types, but that's the point where we start getting divergence between downstreams that we'd better avoid if we care for reuse.
mlir/lib/Bindings/Python/IRCore.cpp
204–205

Could you add a short comment describing what this does? Defining __get__ makes something a property, but this defining __get__ itself as a property, I'm confused.

mlir/python/mlir/ir.py
9–10 ↗(On Diff #482693)

We definitely want type annotations on what x can be.
We may also want to raise a TypeError on wrong types, though attribute getters may already be doing so.

11 ↗(On Diff #482693)

Nit: I'd add vertical whitespace before each new def.

mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
540–550
544
680–690

Nit: is it possible to stick to formatv everywhere for uniformity?

In general, this feels like a usability improvement. We need user-level documentation on how to use and extend this.

Regarding potential downsides:

  • Could we somehow limit the location of AttrBuilder functions so that a downstream integration of bindings can only add functions for attributes from its own dialects (and not core dialects)? I understand this is Python and one can always find a workaround, but making it intentionally hard would send the message. The downstream may want to construct built-in attributes from its custom types, but that's the point where we start getting divergence between downstreams that we'd better avoid if we care for reuse.

Best I can think of is to make it so that one cannot re-assign and then error if one attempts to ... Mmm, given multiple imports one either would then need to check if function being assigned is the same or add an explicit "register once" method. (I'm not python expert to know of alternatives). I think the way I could do that easiest would be to have an explicit set call which then takes a string as input and verified C++ side.

AttrBuilder.SymbolNameAttr = _symbolNameAttr

becomes

AttrBuilder.set_builder("SymbolNameAt", _symbolNameAttr)

(at least I couldn't figure out via pybind11 how to enable AttrBuilder['SymbolNameAttr"] = _symbolNameAttr but I can try a bit more)

One could, but I like this without having any side data structure. I see this as an enhancement we could add if it is cumbersome (e.g.,

mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
680–690

You want the entire thing as one string? It means having two string constants and switching between them with the two string constants sharing values that need to be kept consistent between them.

In general, this feels like a usability improvement. We need user-level documentation on how to use and extend this.

Regarding potential downsides:

  • Could we somehow limit the location of AttrBuilder functions so that a downstream integration of bindings can only add functions for attributes from its own dialects (and not core dialects)? I understand this is Python and one can always find a workaround, but making it intentionally hard would send the message. The downstream may want to construct built-in attributes from its custom types, but that's the point where we start getting divergence between downstreams that we'd better avoid if we care for reuse.

Best I can think of is to make it so that one cannot re-assign and then error if one attempts to ... Mmm, given multiple imports one either would then need to check if function being assigned is the same or add an explicit "register once" method. (I'm not python expert to know of alternatives). I think the way I could do that easiest would be to have an explicit set call which then takes a string as input and verified C++ side.

Works for me.

AttrBuilder.SymbolNameAttr = _symbolNameAttr

becomes

AttrBuilder.set_builder("SymbolNameAt", _symbolNameAttr)

(at least I couldn't figure out via pybind11 how to enable AttrBuilder['SymbolNameAttr"] = _symbolNameAttr but I can try a bit more)

Usually __setitem__ works, but you may need an instance of the builder class, not the class itself.

One could, but I like this without having any side data structure. I see this as an enhancement we could add if it is cumbersome (e.g.,

mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
680–690

We already have similar cases. I find it easier to guess what the generated code should look like if there is less string manipulation.

jpienaar updated this revision to Diff 484098.Dec 19 2022, 3:02 PM

Switch to explicit methods and avoid redefining.

jpienaar updated this revision to Diff 484104.Dec 19 2022, 3:22 PM
jpienaar marked 2 inline comments as done.

Make 2 self-contained strings

jpienaar updated this revision to Diff 484121.Dec 19 2022, 3:42 PM

Update doc

ftynse accepted this revision.Dec 21 2022, 7:23 AM

Nice!

mlir/lib/Bindings/Python/Globals.h
61–63

Something went wrong with the merge/rebase and now the documentation is duplicated,, and wrong for registerAttributeBuilder

mlir/lib/Bindings/Python/IRModule.cpp
70–72

Nit: throw std::runtime_error(message) should do what you expect without calling into low-level API.
Also nit: I'm always suspicious about usage of llvm::Twine in places that expect something else, please double-check that it doesn't get destroyed too early.

97

Nit: is *not* defined? Also, this is a py::function.

This revision is now accepted and ready to land.Dec 21 2022, 7:23 AM
This revision was landed with ongoing or failed builds.Dec 21 2022, 10:10 AM
This revision was automatically updated to reflect the committed changes.
jpienaar marked 5 inline comments as done.
jpienaar added inline comments.Dec 21 2022, 10:11 AM
mlir/lib/Bindings/Python/IRModule.cpp
70–72

Yeah passing Twine unless as && often an issue. Here it converts to std::string before call.

97

The style followed here (taken from below in file) is to say what is being asserted (that the function is defined). Kept it as is to be consistent, I can follow up to flip all.