This is an archive of the discontinued LLVM Phabricator instance.

Introduce a new DialectIdentifier structure, extending Identifier with a Dialect information
ClosedPublic

Authored by mehdi_amini on Jan 25 2021, 10:08 PM.

Details

Summary

This class is looking up a dialect prefix on the identifier on initialization
and keeping a pointer to the Dialect when found.

The NamedAttribute key is now a DialectIdentifier.

Diff Detail

Event Timeline

mehdi_amini created this revision.Jan 25 2021, 10:08 PM
mehdi_amini requested review of this revision.Jan 25 2021, 10:08 PM

The name DialectIdentifier strikes me as weird on first glance, but I think it's slowly being normalized the more I look at it.

I like the idea of making it more efficient to access a context/dialect from a dialect prefixed identifier. Do you have any performance impact numbers for the change to NamedAttribute?

Isn't it still just an Identifier? The dialect was always there, it was just previously encoded in the string and is now separate

Isn't it still just an Identifier? The dialect was always there, it was just previously encoded in the string and is now separate

Are you suggesting to not have two classes and just merge the two classes here?

rriddle added a comment.EditedJan 26 2021, 5:56 PM

Isn't it still just an Identifier? The dialect was always there, it was just previously encoded in the string and is now separate

Are you suggesting to not have two classes and just merge the two classes here?

I think that route is going to end up being better, given that it reduces the number of lookups to 1 in the common case. If not that, then diverging from Identifier and still uniquing. Before suggesting to just merge them, I was trying to see what else Identifier is used for aside from Attribute names, but I couldn't find anything.

Some Locations may be using Identifier like in FileLineColLocationStorage.
An OperationName would get larger as well I think, which would make Operation larger by one pointer (it'd accelerate Operation::getDialect() for unregistered operations, but I don't think we need to optimize this)

rriddle added a comment.EditedJan 26 2021, 6:13 PM

Some Locations may be using Identifier like in FileLineColLocationStorage.
An OperationName would get larger as well I think, which would make Operation larger by one pointer (it'd accelerate Operation::getDialect() for unregistered operations, but I don't think we need to optimize this)

If we uniqued DialectIdentifier(the Identifier part + the Dialect */MLIRContext *), then OperationName wouldn't grow AFAICS. It would improve memory(no extra pointer per instance of a specific identifier) and reduce lookup time. I don't think we should replace Identifier directly given the other usages, but it still seems interesting to optimize the important cases that do rely on this type of identifier. Having a proper representation for an identifier that can be dialect prefixed seems useful.

Oh I see what you mean, we'd replace the current llvm::StringSet<llvm::BumpPtrAllocator &> identifiers; in the MLIRContext with something like llvm::StringMap<Dialect*, llvm::BumpPtrAllocator &> identifiers;?
(probably a bit more complex, but that's the rough direction?)

Something I need to figure out is what to do on dialect loading in the Context, I propose on such event to loop over the identifiers and if any is prefixed with the loaded dialect then set the Dialect pointer, WDYT?

mehdi_amini retitled this revision from Introduce a new DialectIdentifier structure, extending Identifier with a Dialect information to Extend the Identifier storage with a pointer to the Dialect (or the MLIRContext).Jan 27 2021, 7:35 PM
mehdi_amini edited the summary of this revision. (Show Details)
mehdi_amini retitled this revision from Extend the Identifier storage with a pointer to the Dialect (or the MLIRContext) to Introduce a new DialectIdentifier structure, extending Identifier with a Dialect information.
mehdi_amini edited the summary of this revision. (Show Details)

Update to fold the pointer into the Identifier storage.

jpienaar accepted this revision.Jan 28 2021, 9:40 AM

Thanks!

mlir/lib/IR/MLIRContext.cpp
726

Remove trivial {} ?

This revision is now accepted and ready to land.Jan 28 2021, 9:40 AM
mehdi_amini marked an inline comment as done.

Remove trivial braces

rriddle accepted this revision.Jan 28 2021, 1:52 PM

Nice. Really clean now!

LGTM after adding examples.

mlir/include/mlir/IR/Identifier.h
64

Can you beef up the main description of the Identifier class? Right now there is no mention of what it means to be "prefixed with a dialect", and it would be good to document some of these invariants(with an example or two). This would allow for us to point to a given location when someone asks what a "dialect prefixed identifier" means.

mlir/lib/IR/MLIRContext.cpp
30

I don't think this is necessary.

mehdi_amini marked an inline comment as done.

beef up the main description of the Identifier

mlir/include/mlir/IR/Identifier.h
64

PTAL!

Add an example in the description for getdialect()

rriddle accepted this revision.Jan 28 2021, 3:30 PM

LGTM

mlir/include/mlir/IR/Identifier.h
30–34
63

You use loaded everywhere else in this description, which makes sense given that the dialect has to be loaded for it to exist.

mehdi_amini marked 2 inline comments as done.Jan 28 2021, 4:03 PM

address comment

This revision was landed with ongoing or failed builds.Jan 28 2021, 4:05 PM
This revision was automatically updated to reflect the committed changes.

Seems you retitled the commit, but that got reverted to original description again on push.

mlir/lib/IR/MLIRContext.cpp
492

OOC: if identifierEntry.second is set, can it change here? (E.g., could we rely on/filter based on it, seems like we'd avoid string comparison except where dialect is not set at cost of one extra check to see if it refers to a context or dialect, probably not an important optimization unless we interleaved dialect loading and identified creation much).

rriddle added inline comments.Jan 31 2021, 12:07 PM
mlir/lib/IR/MLIRContext.cpp
723

Forgot to mention in review, but please rework this to avoid looking up the dialect unless the identifier is being created. We should only be doing the string splicing/dialect lookup when a new identifier is created.