I like the idea of making it more efficient to access a context/dialect from a dialect prefixed identifier. Do you have any performance impact numbers for the change to NamedAttribute?

Isn't it still just an Identifier? The dialect was always there, it was just previously encoded in the string and is now separate

In D95418#2524202, @jpienaar wrote:

Isn't it still just an Identifier? The dialect was always there, it was just previously encoded in the string and is now separate

Are you suggesting to not have two classes and just merge the two classes here?

In D95418#2524208, @mehdi_amini wrote:

In D95418#2524202, @jpienaar wrote:

Isn't it still just an Identifier? The dialect was always there, it was just previously encoded in the string and is now separate

Are you suggesting to not have two classes and just merge the two classes here?

I think that route is going to end up being better, given that it reduces the number of lookups to 1 in the common case. If not that, then diverging from Identifier and still uniquing. Before suggesting to just merge them, I was trying to see what else Identifier is used for aside from Attribute names, but I couldn't find anything.

Some Locations may be using Identifier like in FileLineColLocationStorage.
An OperationName would get larger as well I think, which would make Operation larger by one pointer (it'd accelerate Operation::getDialect() for unregistered operations, but I don't think we need to optimize this)

In D95418#2524214, @mehdi_amini wrote:

Some Locations may be using Identifier like in FileLineColLocationStorage.
An OperationName would get larger as well I think, which would make Operation larger by one pointer (it'd accelerate Operation::getDialect() for unregistered operations, but I don't think we need to optimize this)

If we uniqued DialectIdentifier (the Identifier part + the Dialect */MLIRContext *), then OperationName wouldn't grow AFAICS. It would improve memory use (no extra pointer per instance of a specific identifier) and reduce lookup time. I don't think we should replace Identifier directly given the other usages, but it still seems interesting to optimize the important cases that do rely on this type of identifier. Having a proper representation for an identifier that can be dialect prefixed seems useful.

Oh I see what you mean, we'd replace the current llvm::StringSet<llvm::BumpPtrAllocator &> identifiers; in the MLIRContext with something like llvm::StringMap<Dialect*, llvm::BumpPtrAllocator &> identifiers;?
(probably a bit more complex, but that's the rough direction?)
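A rough, standalone sketch of that direction follows. It is not the MLIR implementation: `Context`, `Dialect`, and `DialectOrContext` are hypothetical stand-ins, `std::unordered_map` plays the role of `llvm::StringMap`, and `std::variant` plays the role of `PointerUnion`. The sketch also assumes a prefix only counts when a `.` is present.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>
#include <variant>

struct Context;

// Hypothetical stand-in for mlir::Dialect.
struct Dialect {
  std::string name;
  Context *context;
};

// Stand-in for llvm::PointerUnion<Dialect *, MLIRContext *>.
using DialectOrContext = std::variant<Dialect *, Context *>;

// Hypothetical stand-in for MLIRContext. The `identifiers` map is the uniqued
// identifier table; each entry carries its dialect (or the context).
struct Context {
  std::unordered_map<std::string, Dialect> loadedDialects;
  std::unordered_map<std::string, DialectOrContext> identifiers;
};

class Identifier {
public:
  static Identifier get(std::string_view str, Context &ctx) {
    assert(!str.empty() && "Cannot create an empty identifier");
    DialectOrContext owner = &ctx;
    // Assumption of this sketch: only text before a '.' is a dialect prefix.
    std::size_t dot = str.find('.');
    if (dot != std::string_view::npos) {
      auto it = ctx.loadedDialects.find(std::string(str.substr(0, dot)));
      if (it != ctx.loadedDialects.end())
        owner = &it->second;
    }
    // insert() leaves an existing entry untouched, so equal strings share it.
    return Identifier(
        &*ctx.identifiers.insert({std::string(str), owner}).first);
  }

  // The dialect stored in the entry, or nullptr if the entry holds a context.
  Dialect *getDialect() const {
    Dialect *const *d = std::get_if<Dialect *>(&entry->second);
    return d ? *d : nullptr;
  }

  Context *getContext() const {
    if (Dialect *d = getDialect())
      return d->context;
    return std::get<Context *>(entry->second);
  }

  bool operator==(Identifier other) const { return entry == other.entry; }

private:
  using Entry = std::pair<const std::string, DialectOrContext>;
  explicit Identifier(const Entry *e) : entry(e) {}
  const Entry *entry; // the handle stays pointer-sized
};
```

The handle remains a single pointer; the extra pointer is paid once per unique string rather than once per use, which is the memory argument made above.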

Something I need to figure out is what to do on dialect loading in the Context. I propose that on such an event we loop over the identifiers and, if any is prefixed with the loaded dialect, set its Dialect pointer. WDYT?
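The proposed refresh could look roughly like the sketch below. All names (`IdentifierTable`, `hasDialectPrefix`, `refreshIdentifiers`) are hypothetical, a null pointer stands in for "no dialect resolved yet", and the sketch assumes the prefix must be followed by a `.`.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical stand-in for mlir::Dialect.
struct Dialect {
  std::string name;
};

// Identifier table: string -> owning dialect, or nullptr while no matching
// dialect is loaded.
using IdentifierTable = std::unordered_map<std::string, Dialect *>;

// True if `id` is `ns` followed by a '.', e.g. "llvm.foo" for "llvm".
inline bool hasDialectPrefix(std::string_view id, std::string_view ns) {
  return id.size() > ns.size() && id.substr(0, ns.size()) == ns &&
         id[ns.size()] == '.';
}

// Called once when `dialect` is loaded: patch identifiers that were created
// before the dialect existed. Returns the number of entries updated.
inline unsigned refreshIdentifiers(IdentifierTable &table, Dialect &dialect) {
  unsigned updated = 0;
  for (auto &entry : table) {
    if (hasDialectPrefix(entry.first, dialect.name)) {
      entry.second = &dialect;
      ++updated;
    }
  }
  return updated;
}
```

The cost is one pass over the identifier table per dialect load, which is acceptable if dialect loading is rare compared with identifier lookups.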

mehdi_amini retitled this revision from "Introduce a new DialectIdentifier structure, extending Identifier with a Dialect information" to "Extend the Identifier storage with a pointer to the Dialect (or the MLIRContext)". Jan 27 2021, 7:35 PM

mehdi_amini edited the summary of this revision. (Show Details)

Update to fold the pointer into the Identifier storage.

Harbormaster completed remote builds in B86949: Diff 319741. Jan 27 2021, 8:10 PM

Thanks!

mlir/lib/IR/MLIRContext.cpp
726	Remove trivial {} ?

This revision is now accepted and ready to land. Jan 28 2021, 9:40 AM

Remove trivial braces

Harbormaster completed remote builds in B87060: Diff 319922. Jan 28 2021, 11:46 AM

Nice. Really clean now!

LGTM after adding examples.

mlir/include/mlir/IR/Identifier.h
64	Can you beef up the main description of the Identifier class? Right now there is no mention of what it means to be "prefixed with a dialect", and it would be good to document some of these invariants (with an example or two). This would allow us to point to a given location when someone asks what a "dialect prefixed identifier" means.
mlir/lib/IR/MLIRContext.cpp
30	I don't think this is necessary.

beef up the main description of the Identifier

mlir/include/mlir/IR/Identifier.h
64	PTAL!

Add an example in the description for getDialect()

LGTM

mlir/include/mlir/IR/Identifier.h
30–34
63	You use `loaded` everywhere else in this description, which makes sense given that the dialect has to be loaded for it to exist.

Harbormaster completed remote builds in B87079: Diff 319975. Jan 28 2021, 3:52 PM

mehdi_amini marked 2 inline comments as done. Jan 28 2021, 4:03 PM

Harbormaster completed remote builds in B87080: Diff 319976. Jan 28 2021, 4:03 PM

address comment

This revision was landed with ongoing or failed builds. Jan 28 2021, 4:05 PM

Closed by commit rGe9dc94291e7d: Introduce a new DialectIdentifier structure, extending Identifier with a… (authored by mehdi_amini).

This revision was automatically updated to reflect the committed changes.

mehdi_amini added a commit: rGe9dc94291e7d: Introduce a new DialectIdentifier structure, extending Identifier with a….

Harbormaster completed remote builds in B87086: Diff 319983. Jan 28 2021, 4:54 PM

It seems you retitled the commit, but it was reverted to the original description again on push.

mlir/lib/IR/MLIRContext.cpp
492	OOC: if identifierEntry.second is set, can it change here? (E.g., could we rely on or filter based on it? It seems we'd avoid the string comparison except where the dialect is not set, at the cost of one extra check to see whether the entry refers to a context or a dialect. Probably not an important optimization unless we interleave dialect loading and identifier creation a lot.)
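One possible reading of this suggestion, as a standalone sketch: entries that already point to a dialect are skipped before any string comparison happens. The names are hypothetical, a null pointer stands in for "still refers to the context", and the returned comparison count exists only to make the saving visible. Whether an already-set entry can legitimately change is exactly the open question above, so this sketch simply assumes it cannot.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical stand-in for mlir::Dialect.
struct Dialect {
  std::string name;
};

// string -> owning dialect, or nullptr while unresolved.
using IdentifierTable = std::unordered_map<std::string, Dialect *>;

// Refresh on dialect load, skipping entries that are already resolved.
// Returns how many prefix comparisons were actually performed.
inline unsigned refreshUnresolved(IdentifierTable &table, Dialect &dialect) {
  unsigned comparisons = 0;
  for (auto &[name, owner] : table) {
    if (owner != nullptr) // cheap pointer check instead of a string compare
      continue;
    ++comparisons;
    std::string_view id = name;
    const std::string &ns = dialect.name;
    if (id.size() > ns.size() && id.compare(0, ns.size(), ns) == 0 &&
        id[ns.size()] == '.')
      owner = &dialect;
  }
  return comparisons;
}
```

With this filter, each later dialect load compares only against identifiers that no earlier load has claimed.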

rriddle added inline comments. Jan 31 2021, 12:07 PM

mlir/lib/IR/MLIRContext.cpp
723	Forgot to mention in review, but please rework this to avoid looking up the dialect unless the identifier is being created. We should only be doing the string splicing/dialect lookup when a new identifier is created.
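A minimal sketch of the requested rework, using hypothetical stand-ins (`Context`, `Dialect`, `getIdentifier`) rather than the MLIR classes: the table is probed first, and the string split plus dialect lookup runs only when the probe actually inserted a new entry. The `dialectLookups` counter is instrumentation for the example only.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for mlir::Dialect.
struct Dialect {
  std::string name;
};

// Hypothetical stand-in for MLIRContext.
struct Context {
  std::unordered_map<std::string, Dialect> loadedDialects;
  std::unordered_map<std::string, Dialect *> identifiers;
  unsigned dialectLookups = 0; // instrumentation for the example only
};

using IdentifierEntry = std::pair<const std::string, Dialect *>;

inline const IdentifierEntry *getIdentifier(std::string_view str,
                                            Context &ctx) {
  assert(!str.empty() && "Cannot create an empty identifier");
  // Fast path: an existing entry was resolved when it was first created.
  auto [it, inserted] = ctx.identifiers.try_emplace(std::string(str), nullptr);
  if (!inserted)
    return &*it;
  // Slow path, new entries only: split off the prefix and look it up.
  ++ctx.dialectLookups;
  std::size_t dot = str.find('.');
  if (dot != std::string_view::npos) {
    auto d = ctx.loadedDialects.find(std::string(str.substr(0, dot)));
    if (d != ctx.loadedDialects.end())
      it->second = &d->second;
  }
  return &*it;
}
```

The sketch still builds a `std::string` key on the fast path; a real implementation keyed on string references would avoid even that.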

Diff 319984

mlir/include/mlir/IR/Identifier.h

  //===- Identifier.h - MLIR Identifier Class ---------------------*- C++ -*-===//
  //
  // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
  // See https://llvm.org/LICENSE.txt for license information.
  // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
  //
  //===----------------------------------------------------------------------===//

  #ifndef MLIR_IR_IDENTIFIER_H
  #define MLIR_IR_IDENTIFIER_H

  #include "mlir/Support/LLVM.h"
  #include "llvm/ADT/DenseMapInfo.h"
+ #include "llvm/ADT/PointerUnion.h"
  #include "llvm/ADT/StringMapEntry.h"
  #include "llvm/Support/PointerLikeTypeTraits.h"

  namespace mlir {
+ class Dialect;
  class MLIRContext;

  /// This class represents a uniqued string owned by an MLIRContext. Strings
  /// represented by this type cannot contain nul characters, and may not have a
  /// zero length.
  ///
  /// This is a POD type with pointer size, so it should be passed around by
  /// value. The underlying data is owned by MLIRContext and is thus immortal for
  /// almost all clients.
+ ///
+ /// An Identifier may be prefixed with a dialect namespace followed by a single
+ /// dot `.`. This is particularly useful when used as a key in a NamedAttribute
+ /// to differentiate a dependent attribute (specific to an operation) from a
+ /// generic attribute defined by the dialect (in general applicable to multiple
+ /// operations).

rriddle (Unsubmitted, Done), suggested edit:

  /// almost all clients.
  ///
- /// Identifier can be prefixed with a dialect name follow by a single dot `.`.
- /// This is particulary useful when used a key in a NamedAttribute to
+ /// Identifiers may be prefixed with a dialect namespace followed by a single dot `.`.
+ /// This is particularly useful when used as a key in a NamedAttribute to
  /// differentiate a dependent attribute (specific to an operation) from a
  /// generic attribute defined by the dialect (in general applicable to multiple
  /// operations).
  class Identifier {

  class Identifier {
-   using EntryType = llvm::StringMapEntry<llvm::NoneType>;
+   using EntryType =
+       llvm::StringMapEntry<PointerUnion<Dialect *, MLIRContext *>>;

  public:
    /// Return an identifier for the specified string.
    static Identifier get(StringRef str, MLIRContext *context);
    Identifier(const Identifier &) = default;
    Identifier &operator=(const Identifier &other) = default;

    /// Return a StringRef for the string.
(9 unchanged lines not shown)
    const char *c_str() const { return entry->getKeyData(); }

    /// Return a pointer to the start of the string data.
    const char *data() const { return entry->getKeyData(); }

    /// Return the number of bytes in this string.
    unsigned size() const { return entry->getKeyLength(); }

+   /// Return the dialect loaded in the context for this identifier or nullptr if
+   /// this identifier isn't prefixed with a loaded dialect. For example the
+   /// `llvm.fastmathflags` identifier would return the LLVM dialect here,
+   /// assuming it is loaded in the context.
+   Dialect *getDialect();
+
+   /// Return the current MLIRContext associated with this identifier.
+   MLIRContext *getContext();

    const char *begin() const { return data(); }
    const char *end() const { return entry->getKeyData() + size(); }

    bool operator==(Identifier other) const { return entry == other.entry; }
    bool operator!=(Identifier rhs) const { return !(*this == rhs); }

    void print(raw_ostream &os) const;
    void dump() const;
(78 unchanged lines not shown)

Inline comments on the getDialect() documentation above:

rriddle (Unsubmitted, Done), suggested edit:

    unsigned size() const { return entry->getKeyLength(); }
-   /// Return the dialect registered/loaded in the context for this
+   /// Return the dialect loaded in the context for this
    /// identifier or nullptr if this identifier isn't prefixed with a loaded

You use `loaded` everywhere else in this description, which makes sense given that the dialect has to be loaded for it to exist.

rriddle (Unsubmitted, Not Done): Can you beef up the main description of the Identifier class? Right now there is no mention of what it means to be "prefixed with a dialect", and it would be good to document some of these invariants (with an example or two). This would allow us to point to a given location when someone asks what a "dialect prefixed identifier" means.

mehdi_amini (Author, Unsubmitted, Done): PTAL!

mlir/lib/IR/MLIRContext.cpp

(21 unchanged lines not shown)
  #include "mlir/IR/Identifier.h"
  #include "mlir/IR/IntegerSet.h"
  #include "mlir/IR/Location.h"
  #include "mlir/IR/OpImplementation.h"
  #include "mlir/IR/Types.h"
  #include "mlir/Support/ThreadLocalCache.h"
  #include "llvm/ADT/DenseMap.h"
  #include "llvm/ADT/DenseSet.h"
  #include "llvm/ADT/SetVector.h"
rriddle (Unsubmitted, Done): I don't think this is necessary.
  #include "llvm/ADT/StringSet.h"
  #include "llvm/ADT/Twine.h"
  #include "llvm/Support/Allocator.h"
  #include "llvm/Support/CommandLine.h"
  #include "llvm/Support/Debug.h"
  #include "llvm/Support/RWMutex.h"
  #include "llvm/Support/raw_ostream.h"
  #include <memory>
(220 unchanged lines not shown; context: #endif)
    DialectRegistry dialectsRegistry;

    /// This is a mapping from operation name to AbstractOperation for registered
    /// operations.
    llvm::StringMap<AbstractOperation> registeredOperations;

    /// Identifiers are uniqued by string value and use the internal string set
    /// for storage.
-   llvm::StringSet<llvm::BumpPtrAllocator &> identifiers;
+   llvm::StringMap<PointerUnion<Dialect *, MLIRContext *>,
+                   llvm::BumpPtrAllocator &>
+       identifiers;
    /// A thread local cache of identifiers to reduce lock contention.
-   ThreadLocalCache<llvm::StringMap<llvm::StringMapEntry<llvm::NoneType> *>>
+   ThreadLocalCache<llvm::StringMap<
+       llvm::StringMapEntry<PointerUnion<Dialect *, MLIRContext *>> *>>
        localIdentifierCache;

    /// An allocator used for AbstractAttribute and AbstractType objects.
    llvm::BumpPtrAllocator abstractDialectSymbolAllocator;

    //===--------------------------------------------------------------------===//
    // Affine uniquing
    //===--------------------------------------------------------------------===//
(198 unchanged lines not shown; context: if (impl.multiThreadedExecutionContext != 0))
        llvm::report_fatal_error(
            "Loading a dialect (" + dialectNamespace +
            ") while in a multi-threaded execution context (maybe "
            "the PassManager): this can indicate a "
            "missing `dependentDialects` in a pass for example.");
  #endif
      dialect = ctor();
      assert(dialect && "dialect ctor failed");

+     // Refresh all the identifiers dialect field, this catches cases where a
+     // dialect may be loaded after identifier prefixed with this dialect name
+     // were already created.
+     for (auto &identifierEntry : impl.identifiers)
+       if (identifierEntry.first().startswith(dialectNamespace))
jpienaar (Unsubmitted, Not Done): OOC: if identifierEntry.second is set, can it change here? (E.g., could we rely on or filter based on it? It seems we'd avoid the string comparison except where the dialect is not set, at the cost of one extra check to see whether the entry refers to a context or a dialect. Probably not an important optimization unless we interleave dialect loading and identifier creation a lot.)
+         identifierEntry.second = dialect.get();

      return dialect.get();
    }

    // Abort if dialect with namespace has already been registered.
    if (dialect->getTypeID() != dialectID)
      llvm::report_fatal_error("a dialect with namespace '" + dialectNamespace +
                               "' has already been registered");
(210 unchanged lines not shown)
  Identifier Identifier::get(StringRef str, MLIRContext *context) {
    // Check invariants after seeing if we already have something in the
    // identifier table - if we already had it in the table, then it already
    // passed invariant checks.
    assert(!str.empty() && "Cannot create an empty identifier");
    assert(str.find('\0') == StringRef::npos &&
           "Cannot create an identifier with a nul character");

+   PointerUnion<Dialect *, MLIRContext *> dialectOrContext = context;
+   auto dialectNamePair = str.split('.');
+   if (!dialectNamePair.first.empty())
rriddle (Unsubmitted, Not Done): Forgot to mention in review, but please rework this to avoid looking up the dialect unless the identifier is being created. We should only be doing the string splicing/dialect lookup when a new identifier is created.
+     if (Dialect *dialect = context->getLoadedDialect(dialectNamePair.first))
+       dialectOrContext = dialect;
+
jpienaar (Unsubmitted, Done): Remove trivial {} ?
    auto &impl = context->getImpl();
    if (!context->isMultithreadingEnabled())
-     return Identifier(&*impl.identifiers.insert(str).first);
+     return Identifier(&*impl.identifiers.insert({str, dialectOrContext}).first);

    // Check for an existing instance in the local cache.
    auto &localEntry = (*impl.localIdentifierCache)[str];
    if (localEntry)
      return Identifier(localEntry);

    // Check for an existing identifier in read-only mode.
    {
      llvm::sys::SmartScopedReader<true> contextLock(impl.identifierMutex);
      auto it = impl.identifiers.find(str);
      if (it != impl.identifiers.end()) {
        localEntry = &*it;
        return Identifier(localEntry);
      }
    }

    // Acquire a writer-lock so that we can safely create the new instance.
    llvm::sys::SmartScopedWriter<true> contextLock(impl.identifierMutex);
-   auto it = impl.identifiers.insert(str).first;
+   auto it = impl.identifiers.insert({str, dialectOrContext}).first;
    localEntry = &*it;
    return Identifier(localEntry);
  }

+ Dialect *Identifier::getDialect() {
+   return entry->second.dyn_cast<Dialect *>();
+ }
+
+ MLIRContext *Identifier::getContext() {
+   if (Dialect *dialect = getDialect())
+     return dialect->getContext();
+   return entry->second.get<MLIRContext *>();
+ }

  //===----------------------------------------------------------------------===//
  // Type uniquing
  //===----------------------------------------------------------------------===//

  /// Returns the storage uniquer used for constructing type storage instances.
  /// This should not be used directly.
  StorageUniquer &MLIRContext::getTypeUniquer() { return getImpl().typeUniquer; }

(215 unchanged lines not shown)

This is an archive of the discontinued LLVM Phabricator instance.

Introduce a new DialectIdentifier structure, extending Identifier with a Dialect information
ClosedPublic
