This is an archive of the discontinued LLVM Phabricator instance.

[mlirTranslateMain] Add a customization callback.
ClosedPublic

Authored by lattner on Mar 3 2022, 9:11 PM.

Details

Summary

mlir-translate and related tools currently have a fixed set
of flags that are built into Translation.cpp. This works for
simple cases, but some clients want to change the default
globally (e.g. default to allowing unregistered dialects
without a command line flag), or support dialect-independent
translations without having those translations register every
conceivable dialect they could be used with (breaking
modularity).

This approach could also be applied to mlirOptMain to reduce
the significant number of flags it has accumulated.

Diff Detail

Event Timeline

lattner created this revision.Mar 3 2022, 9:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 9:11 PM
lattner requested review of this revision.Mar 3 2022, 9:11 PM

This approach could also be applied to mlirOptMain to reduce the significant number of flags it has accumulated.

There is value in explicit options: this provides a restricted interface instead of an open one with unbounded access to the context. We leveraged the fact that we have control over the context over time in mlirOptMain to change the behavior as we saw fit.

mlir/include/mlir/Translation.h
109 ↗(On Diff #412924)
rriddle accepted this revision.Mar 4 2022, 1:14 PM

I think I'm fine with this in general, but we should also be wary about encouraging users to do the wrong thing. I would say that we should make certain context-related things parameters (e.g. the DialectRegistry) when we want to enforce/encourage them to do what is expected.

mlir/include/mlir/Translation.h
108 ↗(On Diff #412924)

I would add a DialectRegistry parameter here. We really want users to be driving adding dialects through that, the customization callback I would view for non-registry related things.

This revision is now accepted and ready to land.Mar 4 2022, 1:14 PM

Ok, I agree the dialect registration is the most critical thing that is missing.

That said, I have a translate tool that always wants to accept unregistered dialects. It is super annoying that every testcase has to specify -allow-unregistered-dialect. We have a way to do this for mlir-opt, because it takes a ton of flags.... but is it really better to feed a ton of arguments into these tools? For reference MlirOptMain has grown to this beast:

LogicalResult MlirOptMain(llvm::raw_ostream &outputStream,
                          std::unique_ptr<llvm::MemoryBuffer> buffer,
                          const PassPipelineCLParser &passPipeline,
                          DialectRegistry &registry, bool splitInputFile,
                          bool verifyDiagnostics, bool verifyPasses,
                          bool allowUnregisteredDialects,
                          bool preloadDialectsInContext = false);

with several overloads, because this is unmanagable for most clients.

Coming back to this patch, if we change the callback to take a registry instead of an MLIRContext, then we'll have to go down this same path for mlirTranslateMain: adding a bool allowUnregisteredDialects parameter... and I assume other clients will want others over time. Is this actually a better design?

In my opinion, the goal of constraining the interface is not actually as important as having a clean interface and allowing people using MLIR to get stuff done. We cannot prevent all "abuse" after all. I don't think we can even define what "abuse" means.

WDYT?

rriddle added a comment.EditedMar 6 2022, 4:22 PM

Ok, I agree the dialect registration is the most critical thing that is missing.

That said, I have a translate tool that always wants to accept unregistered dialects. It is super annoying that every testcase has to specify -allow-unregistered-dialect. We have a way to do this for mlir-opt, because it takes a ton of flags.... but is it really better to feed a ton of arguments into these tools? For reference MlirOptMain has grown to this beast:

LogicalResult MlirOptMain(llvm::raw_ostream &outputStream,
                          std::unique_ptr<llvm::MemoryBuffer> buffer,
                          const PassPipelineCLParser &passPipeline,
                          DialectRegistry &registry, bool splitInputFile,
                          bool verifyDiagnostics, bool verifyPasses,
                          bool allowUnregisteredDialects,
                          bool preloadDialectsInContext = false);

with several overloads, because this is unmanagable for most clients.

Not really sure you would do differently here (other than a config struct?), most of those are specific to mlir-opt; i.e. not configuring the context (as a side note preloadDialectsInContext is also something we want to get rid of).

Coming back to this patch, if we change the callback to take a registry instead of an MLIRContext, then we'll have to go down this same path for mlirTranslateMain: adding a bool allowUnregisteredDialects parameter... and I assume other clients will want others over time. Is this actually a better design?

In my opinion, the goal of constraining the interface is not actually as important as having a clean interface and allowing people using MLIR to get stuff done. We cannot prevent all "abuse" after all. I don't think we can even define what "abuse" means.

WDYT?

I never suggested that the callback would take a DialectRegistry, MlirTranslateMain should always take a registry. The callback could still take an MLIRContext, but clients should be populating the DialectRegistry that gets passed to MlirTranslateMain, not manually adding dialects to the MLIRContext.

jpienaar accepted this revision.Mar 6 2022, 6:06 PM
jpienaar added a subscriber: jpienaar.

I agree with the sentiments here (enable getting things done + pushing folks to using supported APIs), and this seems reasonable. These for me are meant for creating exploration/testing tools, so shouldn't be as locked down as a dedicated tool (and dedicated end-to-end tools should probably not be using this either), so SGTM.

mlir/lib/Translation/Translation.cpp
161 ↗(On Diff #412924)

This is one that hit me recently, having the init in here meant I couldn't reuse another helper (and is different than what we do for mlir-opt's main helper). Not relevant to this change beyond reminding me of it.

That said, I have a translate tool that always wants to accept unregistered dialects. It is super annoying that every testcase has to specify -allow-unregistered-dialect.

If this is the only use case for the callback, then can we just add a flag (or a config struct if you prefer)? Since this API hasn't changed "forever", in absence of other motivation it seems extreme to me to expose the entire context where we need a bool.

But more importantly I'm not convinced that this is the right API to change actually: it is unlikely that "a translate tool that always wants to accept unregistered dialects" and instead more likely that a specific translation wants this.
This is also why the DialectRegistry isn't provided through this API, but is part of the individual translation registrations.
For example see here: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Target/SPIRV/TranslateRegistration.cpp#L114-L116
how the Spirv translation registers the spirv dialect.
I would expect that the property of "allowing allow-unregistered-dialect" would be at the same level, and so conveys from this registration instead.

Below I'm just providing my counter point to your overall argument on a restricted vs open API, less relevant for this current patch:

We have a way to do this for mlir-opt, because it takes a ton of flags.... but is it really better to feed a ton of arguments into these tools?  For reference `MlirOptMain` has grown to this beast:
LogicalResult MlirOptMain(llvm::raw_ostream &outputStream,
                          std::unique_ptr<llvm::MemoryBuffer> buffer,
                          const PassPipelineCLParser &passPipeline,
                          DialectRegistry &registry, bool splitInputFile,
                          bool verifyDiagnostics, bool verifyPasses,
                          bool allowUnregisteredDialects,
                          bool preloadDialectsInContext = false);

Exactly two flags here are about the MLIRContext (the last two), and the last one is even deprecated! It's hardly "a ton of flags" IMO.

The other flags are unfortunate but also quite specific to the behavior of mlir-opt, I would be fine with a config struct here as well (on the model of OpPrintingFlags) to collect splitInputFile, verifyDiagnostics, verifyPasses, ...

In my opinion, the goal of constraining the interface is not actually as important as having a clean interface and allowing people using MLIR to get stuff done.

I disagree here: constraining the interface is important to me and has proven to be very valuable exactly in the case of mlir-opt in the past!
More importantly: I don't see how replacing the 2 flags (one deprecated) with a callback would make the mlirOptMain interface "cleaner".
Finally: it has to be elaborated about how the current mlirOptMain interface does not allow "to get stuff done".

We cannot prevent all "abuse" after all.

We can't prevent it all, but talking in the extreme isn't an argument IMO: we can always make API "easy to use, hard to misuse" without being absolute.
APIs conveys important information to users about the intent of what we support. I argue that this is even more critical in a project like LLVM with unstable APIs: because we want to be able to evolve them, having "closed" or "restricted" API is much more resilient to changes: it both support our job as maintainer (easier to think about invariants when refactoring a "restricted" API, in particular when you don't see all the clients) and as users (less likely to be broken or be told "you're relying on an unsupported implementation detail of the API").

I don't think we can even define what "abuse" means.

We design the system with a specific mental model, in general the way I'm using "abuse" is when people work around the invariants we're using to keep the system consistent in the design, but that we can't easily enforce (through assertions or others).

lattner updated this revision to Diff 414874.Mar 12 2022, 12:43 PM

Update for review comments and merge to mainline.

This adopts function_ref, and adds a dialect registry to make
the interface more explicit.

rriddle accepted this revision.Mar 12 2022, 12:50 PM

This still LGTM. Making it easier for tool authors that have weird configs, while also encouraging the right thing (re using the registry instead of adding things directly to the context), seems fine. (and is also aligned with mlir-opt).

mlir/include/mlir/Tools/mlir-translate/MlirTranslateMain.h
39

We can likely drop the llvm:: here.

lattner updated this revision to Diff 414876.Mar 12 2022, 1:18 PM

remove unneeded llvm:: qualifier.

Thank you for the discussion and review!

This revision was landed with ongoing or failed builds.Mar 12 2022, 1:18 PM
This revision was automatically updated to reflect the committed changes.

That said, I have a translate tool that always wants to accept unregistered dialects. It is super annoying that every testcase has to specify -allow-unregistered-dialect.

If this is the only use case for the callback, then can we just add a flag (or a config struct if you prefer)? Since this API hasn't changed "forever", in absence of other motivation it seems extreme to me to expose the entire context where we need a bool.

But more importantly I'm not convinced that this is the right API to change actually: it is unlikely that "a translate tool that always wants to accept unregistered dialects" and instead more likely that a specific translation wants this.
This is also why the DialectRegistry isn't provided through this API, but is part of the individual translation registrations.
For example see here: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Target/SPIRV/TranslateRegistration.cpp#L114-L116
how the Spirv translation registers the spirv dialect.
I would expect that the property of "allowing allow-unregistered-dialect" would be at the same level, and so conveys from this registration instead.

Surprised to see this landing: you haven't addressed my comment as far as I can tell?

Surprised to see this landing: you haven't addressed my comment as far as I can tell?

I'm sorry, I didn't mean to land while ignoring your comment! I thought I addressed this on the discourse thread but I can see that it isn't completely clear. I'll follow-up over there, I didn't mean to steam roll you Mehdi!