I see a lot of array sorting in the stack traces of our compiler: the canonicalizer traverses the list of registered operations every time it builds a pattern set, and that gets expensive very quickly.
I was looking at the crash, and of the 30 threads running the MLIR compiler, ~10 were in getRegisteredOperations. For the TF/TFRT compiler it takes 300us to get all operations, and it is called ~30 times for a typical module (~10ms in total).
Inside Google it's in the top 7 functions under mlir::MLIRContext:
- mlir::MLIRContext::getOrLoadDialect
- mlir::MLIRContext::MLIRContext
- mlir::MLIRContext::~MLIRContext
- mlir::MLIRContextImpl::~MLIRContextImpl
- mlir::MLIRContext::loadAllAvailableDialects
- mlir::MLIRContext::getOrLoadDialect()::{lambda()#1}::operator()
- mlir::MLIRContext::getRegisteredOperations
- mlir::MLIRContext::getRegisteredOperations()::$_1::__invoke
Assuming that this will be called at least once during the MLIRContext's lifetime, it's cheaper to precompute the sorted list.
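For context, a minimal sketch of what the precomputed variant looks like on the read side (the member name sortedRegisteredOperations is illustrative, not necessarily the actual MLIRContextImpl field):

```cpp
#include "mlir/IR/MLIRContext.h"

// Sketch: the context keeps a vector that is maintained in sorted order as
// operations are registered, so the getter becomes a cheap ArrayRef return
// instead of rebuilding and sorting a fresh list on every call.
ArrayRef<RegisteredOperationName> MLIRContext::getRegisteredOperations() {
  return getImpl().sortedRegisteredOperations;
}
```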
mlir/lib/IR/MLIRContext.cpp:189
Why SmallVector? It's unlikely for the context to have that few operations, but it will bloat the context object size.
mlir/lib/IR/MLIRContext.cpp:189
Just for consistency with every other container, I don't think that this really matters.
Oh wow!
I'm really surprised we're using this API in the canonicalizer and in FrozenRewritePatternSet.
This was really supposed to be more of a debugging API only.
It's a bit annoying that we're keeping both a map and a vector in the Context, but that seems justified if this API is actually useful.
Do you have an idea on the cost of this vs. say computing on demand?
Isn't "computing on demand" what to the existing API does? Do you have something else in mind?
mlir/lib/IR/MLIRContext.cpp:189
You may use llvm::SmallVector<RegisteredOperationName, 0> to avoid the extra bloat (this is even smaller than std::vector, if I remember correctly).
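If it helps, a quick way to sanity-check that size claim (the exact numbers are an assumption; they depend on the target and standard library):

```cpp
#include "llvm/ADT/SmallVector.h"
#include <vector>

// SmallVector<T, 0> carries no inline storage and uses 32-bit size and
// capacity fields, so on a typical 64-bit target it is 16 bytes, while
// std::vector is usually three pointers (24 bytes).
static_assert(sizeof(llvm::SmallVector<int, 0>) <= sizeof(std::vector<int>),
              "expected SmallVector<T, 0> to be no larger than std::vector");
```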
No, by "on demand" I mean lazy sorting. Instead of doing a sorted insert for every operation, only sort if the list is requested and not already sorted.
Gotcha!
So you'd keep a bool sorted = false; next to the vector, and in the getRegisteredOperations() call you would check:
if (!sorted) {
  llvm::sort(ctxImpl.sortedRegisteredOperations,
             [](auto &lhs, auto &rhs) {
               return lhs.getIdentifier().compare(rhs.getIdentifier()) < 0;
             });
  sorted = true;
}
Makes sense!
You'd also need to lock while actually sorting, but effectively yeah.
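A sketch of what that locked, deferred variant could look like, assuming hypothetical contextMutex and operationsSorted members on MLIRContextImpl:

```cpp
#include "llvm/ADT/STLExtras.h" // llvm::sort
#include <mutex>

// Sketch: sort lazily under a lock the first time the list is requested.
// Registration would have to reset operationsSorted to false after each
// insert, and the returned ArrayRef is only safe to use as long as no new
// operations are registered concurrently.
ArrayRef<RegisteredOperationName> MLIRContext::getRegisteredOperations() {
  MLIRContextImpl &impl = getImpl();
  std::lock_guard<std::mutex> lock(impl.contextMutex);
  if (!impl.operationsSorted) {
    llvm::sort(impl.sortedRegisteredOperations,
               [](RegisteredOperationName lhs, RegisteredOperationName rhs) {
                 return lhs.getIdentifier().compare(rhs.getIdentifier()) < 0;
               });
    impl.operationsSorted = true;
  }
  return impl.sortedRegisteredOperations;
}
```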
It will lead to slightly slower startup times, but we can always switch to deferred sorting if that is a problem. (It isn't clear how much it matters in practice anyways).
mlir/lib/IR/MLIRContext.cpp:189
> It will lead to slightly slower startup times, but we can always switch to deferred sorting if that is a problem. (It isn't clear how much it matters in practice anyways.)
I think that inserting into the various maps, plus the synchronization, is much more expensive, and for simplicity I'd prefer to keep all collection mutations close to each other.
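Concretely, keeping the mutations together could look something like this at registration time (a sketch; insertRegisteredOperation and the member names are hypothetical, and the real registration path also updates the name-to-info map under the same writer lock):

```cpp
#include "llvm/ADT/STLExtras.h" // llvm::upper_bound

// Sketch: insert the newly registered operation at its sorted position so
// the vector never needs a separate sort pass. This runs right next to the
// map insert, under the write lock that already guards registration.
static void insertRegisteredOperation(MLIRContextImpl &impl,
                                      RegisteredOperationName name) {
  auto &ops = impl.sortedRegisteredOperations;
  auto insertPt = llvm::upper_bound(
      ops, name,
      [](RegisteredOperationName lhs, RegisteredOperationName rhs) {
        return lhs.getIdentifier().compare(rhs.getIdentifier()) < 0;
      });
  ops.insert(insertPt, name);
}
```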