This is an archive of the discontinued LLVM Phabricator instance.

Add a flag on the context to protect against creation of operations in unregistered dialects
ClosedPublic

Authored by mehdi_amini on Mar 26 2020, 9:34 PM.

Details

Summary

By default, the verifier won't allow operations with unregistered
dialect. A flag can be set on the context to allow this and a
command line option is provided for mlir-opt to opt-in.

Diff Detail

Event Timeline

mehdi_amini created this revision.Mar 26 2020, 9:34 PM
Herald added a reviewer: herhut. · View Herald Transcript
Herald added 1 blocking reviewer(s): rriddle. · View Herald Transcript
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript

Fix parser test

rriddle requested changes to this revision.Mar 28 2020, 8:37 PM
rriddle marked an inline comment as done.
rriddle added inline comments.
mlir/include/mlir/IR/MLIRContext.h
53

These sound weird, can we use the same wording as the options for operations/types in dialects? allowsUnregisteredDialects()/allowUnregisteredDialects(bool allow)?

mlir/lib/Analysis/Verifier.cpp
211–212

Seems like you should put the check on this back-edge to avoid redundant lookups.

mlir/lib/IR/MLIRContext.cpp
168

Please use /// for comments.

mlir/lib/Support/MlirOptMain.cpp
78

Using the suggestion above would simplify this to just: context.allowUnregisteredDialects(allowUnregisteredDialect).

mlir/lib/Target/LLVMIR/ConvertFromLLVMIR.cpp
449

I actually would have expected this one to be fine, but it seems that OperationName::getDialect doesn't properly handle unregistered operations for the builtin dialect(no namespace). I think that is fine though.

mlir/tools/mlir-translate/mlir-translate.cpp
49

This looks unused.

mlir/unittests/IR/OperationSupportTest.cpp
16–17

Can you move this out of the namespace and mark as static while you are here?

This revision now requires changes to proceed.Mar 28 2020, 8:37 PM
mehdi_amini marked 7 inline comments as done.

Address comments

mehdi_amini added inline comments.Mar 29 2020, 12:11 PM
mlir/lib/IR/MLIRContext.cpp
168

I had modeled on the data members above :

// Identifier allocator and mutex for thread safety.
llvm::BumpPtrAllocator identifierAllocator;
llvm::sys::SmartRWMutex<true> identifierMutex;
bondhugula requested changes to this revision.Mar 29 2020, 12:57 PM
bondhugula added a subscriber: bondhugula.

The commit summary opening "By default, the verifier won't allow operations with unregistered dialect. " reads as if this was the situation before. Consider rephrasing please. "Change the verifier to disallow ..."? Besides "operations with unregistered dialect", what about unregistered operations, i.e., operations that aren't part of any registered dialect? It looks like you are equating the two, and did you mean to use the latter for this patch?

mlir/tools/mlir-opt/mlir-opt.cpp
90

Nit: "operation for unregistered dialects" isn't really meaningful. Rephrase here and other places.

171

clAllowUnregisteredDialects ?

This revision now requires changes to proceed.Mar 29 2020, 12:57 PM
mehdi_amini marked 2 inline comments as done.Mar 29 2020, 1:46 PM

The commit summary opening "By default, the verifier won't allow operations with unregistered dialect. " reads as if this was the situation before.

I used the future tense here to indicate the change.

Besides "operations with unregistered dialect", what about unregistered operations, i.e., operations that aren't part of any registered dialect?

To me “operations with unregistered dialect” is the same as “ operations that aren't part of any registered dialect”.
“Unregistered operations” on the other hand can refer to an operation with a known dialect but that isn’t registered within the dialect. There is already a per-dialect flag to control this.

mlir/tools/mlir-opt/mlir-opt.cpp
90

I can’t think of anything more descriptive, please suggest an alternative if you have one

171

Is this a documented or widespread convention?

The commit summary opening "By default, the verifier won't allow operations with unregistered dialect. " reads as if this was the situation before.

I used the future tense here to indicate the change.

Besides "operations with unregistered dialect", what about unregistered operations, i.e., operations that aren't part of any registered dialect?

To me “operations with unregistered dialect” is the same as “ operations that aren't part of any registered dialect”.

But the two are clearly different. There could be an operation that is part of a dialect that is not registered, and you then could have operations that aren't part of any dialect like "foo"() : () -> () -- the latter isn't part of say any dialect.

“Unregistered operations” on the other hand can refer to an operation with a known dialect but that isn’t registered within the dialect. There is
already a per-dialect flag to control this.

If the dialect itself hasn't been registered with <whatever we are referring to>, then obviously its ops themselves aren't registered. I think you could consistently use "ops with no / unregistered dialects" everywhere to be accurate.

bondhugula added inline comments.Mar 29 2020, 2:45 PM
mlir/tools/mlir-opt/mlir-opt.cpp
90

Allow operations with no/unregistered dialects ?

171

I thought it was a convention - unless folks started deviating from it. Readability-wise IMO it's a win because it's easily distinguishable as a cmd line flag, and you would in many cases want the same name with cl dropped somewhere in that pass.

The commit summary opening "By default, the verifier won't allow operations with unregistered dialect. " reads as if this was the situation before.

I used the future tense here to indicate the change.

Besides "operations with unregistered dialect", what about unregistered operations, i.e., operations that aren't part of any registered dialect?

To me “operations with unregistered dialect” is the same as “ operations that aren't part of any registered dialect”.

But the two are clearly different. There could be an operation that is part of a dialect that is not registered, and you then could have operations that aren't part of any dialect like "foo"() : () -> () -- the latter isn't part of say any dialect.

OK I see what you mean, I'll write “operations that aren't part of any registered dialect” which covers both f.oo which if part of the unregistered dialect f and foo which is not part of any dialect. The verifier is covering both here.

(I don't even know why we allow such operation name in the first place to be honest, other than saving a few chars when writing tests there isn't much interest)

mehdi_amini marked 2 inline comments as done.Mar 29 2020, 3:33 PM
mehdi_amini added inline comments.
mlir/tools/mlir-opt/mlir-opt.cpp
171

The other variables above: passPipeline, splitInputFile, verifyDiagnostics, and verifyPasses are all cl flags and not using this convention, so I just followed what is there at the moment.

I don't mind the cl prefix, but I'd rather apply it consistently.

Rename the option description to "Allow operation with no registered dialects"

The commit summary opening "By default, the verifier won't allow operations with unregistered dialect. " reads as if this was the situation before.

I used the future tense here to indicate the change.

Besides "operations with unregistered dialect", what about unregistered operations, i.e., operations that aren't part of any registered dialect?

To me “operations with unregistered dialect” is the same as “ operations that aren't part of any registered dialect”.

But the two are clearly different. There could be an operation that is part of a dialect that is not registered, and you then could have operations that aren't part of any dialect like "foo"() : () -> () -- the latter isn't part of say any dialect.

OK I see what you mean, I'll write “operations that aren't part of any registered dialect” which covers both f.oo which if part of the unregistered dialect f and foo which is not part of any dialect. The verifier is covering both here.

(I don't even know why we allow such operation name in the first place to be honest, other than saving a few chars when writing tests there isn't much interest)

This is wrong. Operations are always part of a dialect. "foo" would be considered to be part of the builtin dialect, which currently has no namespace. IMO we should set the namespace of the builtin dialect to "mlir" and the assert that operation names
always have a dialect prefix.

rriddle accepted this revision.Mar 29 2020, 3:43 PM
rriddle added inline comments.
mlir/include/mlir/IR/MLIRContext.h
56

nit: allowing -> allow

mlir/lib/Analysis/Verifier.cpp
209

nit: Should we really mention mlir-opt here? Especially given that many users have their own opt equivalents?

mehdi_amini marked an inline comment as done.Mar 29 2020, 4:16 PM

This is wrong. Operations are always part of a dialect. "foo" would be considered to be part of the builtin dialect, which currently has no namespace.

Do you know why I see this?

echo '"foo"() : () -> ()' | ./bin/mlir-opt
<stdin>:1:1: error: 'foo' op created with unregistered dialect. If this is intended, please call allowUnregisteredDialects() on the MLIRContext, or use -allow-unregistered-dialect with mlir-opt

IMO we should set the namespace of the builtin dialect to "mlir" and the assert that operation names always have a dialect prefix.

I'd prefer this indeed.

mlir/lib/Analysis/Verifier.cpp
209

I usually don't try to optimize for out-of-tree users, they'll interpret however it makes sense for them (or they can patch the message if they want).
This is also a misconfiguration, not really expected to happen other than while developing, so I expect them to hit this once and fix their config.
Tests something with mlir-opt upstream however is likely much more common I believe.

mehdi_amini marked 5 inline comments as done.Mar 29 2020, 4:18 PM

The commit summary opening "By default, the verifier won't allow operations with unregistered dialect. " reads as if this was the situation before.

I used the future tense here to indicate the change.

Besides "operations with unregistered dialect", what about unregistered operations, i.e., operations that aren't part of any registered dialect?

To me “operations with unregistered dialect” is the same as “ operations that aren't part of any registered dialect”.

But the two are clearly different. There could be an operation that is part of a dialect that is not registered, and you then could have operations that aren't part of any dialect like "foo"() : () -> () -- the latter isn't part of say any dialect.

OK I see what you mean, I'll write “operations that aren't part of any registered dialect” which covers both f.oo which if part of the unregistered dialect f and foo which is not part of any dialect. The verifier is covering both here.

(I don't even know why we allow such operation name in the first place to be honest, other than saving a few chars when writing tests there isn't much interest)

This is wrong. Operations are always part of a dialect. "foo" would be considered to be part of the builtin dialect, which currently has no namespace. IMO we should set the namespace of the builtin dialect to "mlir" and the assert that operation names
always have a dialect prefix.

I didn't know the unknown named ops were considered as part of the builtin dialect - they are anything but built-in!

bondhugula added inline comments.Mar 29 2020, 4:27 PM
mlir/tools/mlir-opt/mlir-opt.cpp
171

Sounds good, thanks.

This is wrong. Operations are always part of a dialect. "foo" would be considered to be part of the builtin dialect, which currently has no namespace.

Do you know why I see this?

Because OperationName::getDialect does not account for the fact that a prefix may be missing. It(IMO rightfully) assumes that all operations have a dialect namespace prefix. I wrote it that way because I'd always intended to make the builtin dialect have a proper namespace, but forgot about it until now. No one checks for the dialect prefix of the builtin operations ('func'/'module'/etc.) so it has just "worked" up until now.

Fix variable name

This revision was not accepted when it landed; it landed in state Needs Review.Mar 30 2020, 1:04 PM
This revision was automatically updated to reflect the committed changes.

(I don't even know why we allow such operation name in the first place to be honest, other than saving a few chars when writing tests there isn't much interest)

One advantage has been in writing test cases where you need an op with a side-effect so that it just doesn't become dead code (when canonicalization is being run as part of the pass/utility you are testing). These unknown ops provided the easiest/shortest IR for it - we would have otherwise had to think of another side-effecting op (like creating a memref type and a load/store on it). Any other quick shortcut replacements?

Another use case of allowing such unknown ops is a need to continue / allow transformation in their presence before the output goes to another tool that is able to finally deal with those ops. Why force a dialect to be registered when an mlir-opt transform can meaningfully work in the presence of a mix of registered and unregistered ops?

(I don't even know why we allow such operation name in the first place to be honest, other than saving a few chars when writing tests there isn't much interest)

One advantage has been in writing test cases where you need an op with a side-effect so that it just doesn't become dead code (when canonicalization is being run as part of the pass/utility you are testing). These unknown ops provided the easiest/shortest IR for it - we would have otherwise had to think of another side-effecting op (like creating a memref type and a load/store on it). Any other quick shortcut replacements?

Using a name with a dot?
foo.myop instead of ˋmyop` should give the same effect I expect.

(Could also use the test dialect to avoid using the new flag moving forward)

Another use case of allowing such unknown ops is a need to continue / allow transformation in their presence before the output goes to another tool that is able to finally deal with those ops. Why force a dialect to be registered when an mlir-opt transform can meaningfully work in the presence of a mix of registered and unregistered ops?

mlir-opt is really a testing tool, I expect any « real » compiler built with MLIR to always register their dialects.

(I don't even know why we allow such operation name in the first place to be honest, other than saving a few chars when writing tests there isn't much interest)

One advantage has been in writing test cases where you need an op with a side-effect so that it just doesn't become dead code (when canonicalization is being run as part of the pass/utility you are testing). These unknown ops provided the easiest/shortest IR for it - we would have otherwise had to think of another side-effecting op (like creating a memref type and a load/store on it). Any other quick shortcut replacements?

Using a name with a dot?
foo.myop instead of ˋmyop` should give the same effect I expect.

You mean 'foo' is a registered dialect? Won't you still still get an error if 'myop' isn't registered and we disallow all unregistered ops. Or you were suggesting something else?

(Could also use the test dialect to avoid using the new flag moving forward)

Sure.

Another use case of allowing such unknown ops is a need to continue / allow transformation in their presence before the output goes to another tool that is able to finally deal with those ops. Why force a dialect to be registered when an mlir-opt transform can meaningfully work in the presence of a mix of registered and unregistered ops?

mlir-opt is really a testing tool, I expect any « real » compiler built with MLIR to always register their dialects.

Even for testing, I meant if one wanted to run:
.... | mlir-opt ... | <another-mlir-based-tool> ...

If mlir-opt errors out on unregistered dialect operations, one can't test a mix of mlir registered dialect ops and other ops mid-way in the pipeline there.

(I don't even know why we allow such operation name in the first place to be honest, other than saving a few chars when writing tests there isn't much interest)

One advantage has been in writing test cases where you need an op with a side-effect so that it just doesn't become dead code (when canonicalization is being run as part of the pass/utility you are testing). These unknown ops provided the easiest/shortest IR for it - we would have otherwise had to think of another side-effecting op (like creating a memref type and a load/store on it). Any other quick shortcut replacements?

Using a name with a dot?
foo.myop instead of ˋmyop` should give the same effect I expect.

You mean 'foo' is a registered dialect? Won't you still still get an error if 'myop' isn't registered and we disallow all unregistered ops. Or you were suggesting something else?

Some context got lost I guess, you were answering to "(I don't even know why we allow such operation name in the first place to be honest, other than saving a few chars when writing tests there isn't much interest)" which referred to ops that don't contain any dot in the name.

mlir-opt is really a testing tool, I expect any « real » compiler built with MLIR to always register their dialects.

Even for testing, I meant if one wanted to run:
.... | mlir-opt ... | <another-mlir-based-tool> ...

If mlir-opt errors out on unregistered dialect operations, one can't test a mix of mlir registered dialect ops and other ops mid-way in the pipeline there.

Right, but that's why the command line flag is provided.