This is an archive of the discontinued LLVM Phabricator instance.

[mlir][python bindings] generate all the enums
ClosedPublic

Authored by makslevental on Aug 14 2023, 4:35 PM.

Details

Summary

This PR implements python enum bindings for *all* the enums - this includes I*Attrs (including positional/bit) and Dialect/EnumAttr.

There are a few parts to this:

  1. CMake: a small addition to declare_mlir_dialect_python_bindings and declare_mlir_dialect_extension_python_bindings to generate the enum, a boolean arg GEN_ENUM_BINDINGS to make it opt-in (even though it works for basically all of the dialects), and an optional GEN_ENUM_BINDINGS_TD_FILE for handling corner cases.
  2. EnumPythonBindingGen.cpp: there are two weedy aspects here that took investigation:
    1. If an enum attribute is not a Dialect/EnumAttr then the EnumAttrInfo record is canonical, as far as both the cases of the enum and the AttrDefName. On the otherhand, if an enum is a Dialect/EnumAttr then the EnumAttr record has the correct AttrDefName ("load bearing", i.e., populates ods.ir.AttributeBuilder('<NAME>')) but its enum field contains the cases, which is an instance of EnumAttrInfo. The solution is to generate an one enum class for both Dialect/EnumAttr and "independent" EnumAttrInfo but to make that class interopable with two builder registrations that both do the right thing (see next sub-bullet).
    2. Because we don't have a good connection to cpp EnumAttr, i.e., only the enum class getters are exposed (like DimensionAttr::get(Dimension value)), we have to resort to parsing e.g., Attribute.parse(f'#gpu<dim {x}>'). This means that the set of supported assemblyFormats (for the enum) is fixed at compile of MLIR (currently 2, the only 2 I saw). There might be some things that could be done here but they would require quite a bit more C API work to support generically (e.g., casting ints to enum cases and binding all the getters or going generically through the symbolize* methods, like symbolizeDimension(uint32_t) or symbolizeDimension(StringRef)).

A few small changes:

  1. In addition, since this patch registers default builders for attributes where people might've had their own builders already written, I added a replace param to AttributeBuilder.insert (False by default).
  2. makePythonEnumCaseName can't handle all the different ways in which people write their enum cases, e.g., llvm.CConv.Intel_OCL_BI, which gets turned into INTEL_O_C_L_B_I (because llvm::convertToSnakeFromCamelCase doesn't look for runs of caps). So I dropped it. On the otherhand regularization does need to done because some enums have None as a case (and others might have other python keywords).
  3. I turned on llvm dialect generation here in order to test nvvm.WGMMAScaleIn, which is an enum with no explicit discriminator for the neg case.

Note, dialects that didn't get a GEN_ENUM_BINDINGS don't have any enums to generate.

Let me know if I should add more tests (the three trivial ones I added exercise both the supported assemblyFormats and replace=True).

Diff Detail

Event Timeline

makslevental created this revision.Aug 14 2023, 4:35 PM
Herald added a reviewer: ftynse. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript

add top-level imports

guard no enums emitted

remove cruft

treat EnumAttrs correctly

treat EnumAttrs even more correctly

makslevental edited the summary of this revision. (Show Details)Aug 15 2023, 2:02 PM

add emptiness check

makslevental published this revision for review.Aug 15 2023, 2:14 PM
makslevental edited the summary of this revision. (Show Details)
Herald added a project: Restricted Project. · View Herald Transcript
makslevental edited the summary of this revision. (Show Details)Aug 15 2023, 2:38 PM

remove unnecessary includes

The generated bindings: multiple dialects can emit tablegen records for the same enum attributes

I think it would be better to allow specifying a .td file that's different from the one that contains the ops in this case (iirc most dialects have a separate file for attrs/types/etc for this reason) and/or add an extra flag specifying which dialect should be generated (similar to https://github.com/llvm/llvm-project/blob/cb6fe61be3dee67bb8b080c73dd2c48f2d0ce2c7/mlir/tools/mlir-tblgen/AttrOrTypeDefGen.cpp#L920-L922)

makslevental edited the summary of this revision. (Show Details)Aug 15 2023, 8:57 PM

add doc string to cmake

fix doc in LinalgTransformEnums

put back import mlir.dialects.gpu.passes

update doc strings

makslevental edited the summary of this revision. (Show Details)Aug 16 2023, 12:31 PM
makslevental edited the summary of this revision. (Show Details)

restore emitAttributeBuilder

fix register_builder name

rkayaith added inline comments.Aug 16 2023, 8:53 PM
mlir/cmake/modules/AddMLIRPython.cmake
276

Can we drop the "relative to ROOT_DIR" part, and point directly at the "real" td file? (the python bindings having their own wrapper td files should be removed, see TODO below)

mlir/python/CMakeLists.txt
228

To avoid accidentally picking up extra stuff in the future, I think these should all point to the same td file that's used for -gen-enum-decls etc., i.e. this one should point to VectorTransformsBase.td: https://github.com/llvm/llvm-project/blob/463e7cb89278936a86fef401abc20f171fcb6363/mlir/include/mlir/Dialect/Vector/Transforms/CMakeLists.txt#L1-L3

mlir/test/mlir-tblgen/enums-python-bindings.td
7

is the Enum import still needed?

7

can you update one of the tests here to show how auto gets used

mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
38–46

Can this be merged with isPythonReserved https://github.com/llvm/llvm-project/blob/463e7cb89278936a86fef401abc20f171fcb6363/mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp#L281-L291?
A slightly different list is used there, not sure if the differences are intentional.

116

what was the issue with bit enums before?

140

nit: make this a separate loop over getAllDerivedDefinitionsIfDefined("EnumAttr")? (unless there's some subtle difference? i'm not too familiar with these methods)

143

Can you add/use the methods on the llvm::Record wrappers for these? I think most should already exist on mlir::tblgen::AttrDef.

145–162

nit

makslevental marked 6 inline comments as done.

incorporate comments

mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
38–46

I didn't know about that. I'm not sure how it was constructed (it's missing None, True, False, which are keywords) but I did python -c"import keyword; print(set(keyword.kwlist))". I was actually thinking of shelling out (since, presumably if you're generating you have python in your env) but that's probably overkill. I'll just update that list.

116

generating the 1<< case doesn't work because there's no wrapper about them in the python bindings (there's no CRTP for those attributes). So purely IntEnum didn't work but parsing works. Maybe a todo to just make the CRTP for BitEnums.

makslevental marked an inline comment as done.Aug 17 2023, 1:31 PM

point to the correct td enum file

makslevental marked 2 inline comments as done.Aug 17 2023, 2:59 PM
makslevental added inline comments.Aug 17 2023, 5:19 PM
mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
116

Actually this is wrong - it's just that all current eg I32BitEnumAttrs are wrapped in EnumAttr which required parsing - ie if you parse the mnemonic for the case it works.

Not sure if this is the reason EnumAttrs exist but currently you can't actually expose a bare I32BitEnumAttr because there's no way to go back and forth between an enum case and an APInt or int64_t (so the parsers will complain about missing stringify and symbolize methods). I have a draft patch (minus the tablegen parts) https://reviews.llvm.org/D158236 that implements this.

implement bitenums correctly

update tests

mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
116

third time's the charm: implemented here correctly (no need for another patch)

support py3.9

rkayaith added inline comments.Aug 17 2023, 9:51 PM
mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
70–74

doesn't IntFlag already have these defined? https://docs.python.org/3/library/enum.html#enum.Flag

71

nit

mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
18 ↗(On Diff #551376)

include can be removed

support py3.11

makslevental added inline comments.Aug 17 2023, 10:23 PM
mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
69

on >= py3.11 these things are implemented for us

153

shockingly, on py3.7 f'{x}' doesn't call __str__.

makslevental added inline comments.Aug 17 2023, 10:28 PM
mlir/test/mlir-tblgen/enums-python-bindings.td
61–65

for >=py3.11 this is handled by IntFlag

84

shockingly, on py3.7, f'{x}' doesn't call __str__

makslevental added inline comments.Aug 17 2023, 10:29 PM
mlir/test/python/dialects/arith_dialect.py
31–34 ↗(On Diff #551383)

demo of bitenum union (note FastMathFlags uses , as the separator)

makslevental marked 3 inline comments as done.

comments

mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
70–74

for <3.11, len() gives you the number of possible cases, not the current number of flags set

see

https://github.com/python/cpython/blob/3.10/Lib/enum.py#L449

vs

https://github.com/python/cpython/blob/3.11/Lib/enum.py#L1478-L1479

put stringset back

makslevental added inline comments.Aug 18 2023, 11:28 PM
mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
18 ↗(On Diff #551376)

nope, there's another stringset down here isODSReserved

Seems generally fine to me, but I'd wait to see if anyone has opinions on the approach (using Attribute.parse) or enum name changes.

mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
70–74

Ah I see. Importing private functions (_decompose) seems fragile though, can we just generate explicit bit tests? And if the logic works in later versions as well, seems simpler to use it regardless of python version and just leave a TODO to simplify if the min python version is raised to 3.11.

jpienaar added inline comments.
mlir/python/mlir/dialects/gpu/__init__.py
6

OOC is there a required order?

makslevental added inline comments.Aug 22 2023, 10:11 AM
mlir/python/mlir/dialects/gpu/__init__.py
6

no don't think so - neither module imports from the other and ops_gen only depends on enum_gen in so far as eg @_register_attribute_builder("GPU_AddressSpaceAttr") needs to be run at some point.

mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
70–74

i agree it's brittel-ish but i doubt cpython will drop this for a point release? and i tested for 3.11>v>=3.8

generate members for bitenum

makslevental added inline comments.Aug 22 2023, 1:49 PM
mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
70–74

changed to generate explicitly

makslevental marked 2 inline comments as done.Aug 23 2023, 6:50 AM
stellaraccident requested changes to this revision.Aug 23 2023, 11:01 AM

This looks pretty reasonable to me, although I've commented on one simplification that I think we should do before landing this and having people start adapting their code.

I'm not completely comfortable with the fragility noted on enum builder calls, but neither can I see a straight-forward way out of it. Overall, I think this is a good improvement.

mlir/python/CMakeLists.txt
157

I think you can remove this entire GEN_ENUM_BINDINGS_TD_FILE carve-out if you instead just added to TransformOps.td:

include "mlir/Dialect/Transform/IR/TransformAttrs.td"

Basically, that Python level td file was intended to be the "container" for all of the records needed to generate bindings, and extending it in these cases is what it was made for.

Would eliminate a lot of special case stuff here and a chunk of CMake complexity.

This revision now requires changes to proceed.Aug 23 2023, 11:01 AM
stellaraccident accepted this revision.Aug 23 2023, 11:15 AM

Chatted offline with Maks. I'm ok with this as-is with the caveat that we should:

  • Address the TODO about refactoring TD_FILE and the wrappers quickly so as not to have the transient GEN_ENUM_BINDINGS_TD_FILE baked into downstreams (changed CMake arguments like that are hard to track down).
  • Add a comment to GEN_ENUM_BINDINGS_TD_FILE indicating that downstreams should not use this as it will be removed shortly/without warning upon satisfying the TODO.

I think that satisfying the TODO involves:

  • TD_FILES to directly reference all TD files.
  • Add a GEN_DIR argument
  • Keep TD_FILE with the current semantics for downstreams.
This revision is now accepted and ready to land.Aug 23 2023, 11:15 AM

add GEN_ENUM_BINDINGS_TD_FILE warning

update structured_ext test

This revision was automatically updated to reflect the committed changes.
  • Address the TODO about refactoring TD_FILE and the wrappers quickly so as not to have the transient GEN_ENUM_BINDINGS_TD_FILE baked into downstreams (changed CMake arguments like that are hard to track down).
  • Add a comment to GEN_ENUM_BINDINGS_TD_FILE indicating that downstreams should not use this as it will be removed shortly/without warning upon satisfying the TODO.

I didn't quite follow how we'd be able to get rid of GEN_ENUM_BINDINGS_TD_FILE. The issue is that:

  • the "main" td file (with ops etc.) may include EnumAttr and EnumAttrInfo definitions for unrelated dialects, and
  • EnumAttrInfos aren't associated with a specific dialect (they don't have a Dialect argument in tablegen), so I'm not sure how we'd filter out the unrelated ones.

So we need to be able to specify a .td file that only contains the relevant defs. And as far as I can tell, the C++ generation for these has the same restriction.