This is an archive of the discontinued LLVM Phabricator instance.

[mlir][core] Slightly improved attribute lookup
ClosedPublic

Authored by Mogball on Nov 1 2021, 4:18 PM.

Details

Summary
  • String binary search does 1 less string comparison
  • Identifier linear scan on large attribute list is switched to string binary search

Diff Detail

Event Timeline

Mogball created this revision.Nov 1 2021, 4:18 PM
Mogball requested review of this revision.Nov 1 2021, 4:18 PM
Mogball updated this revision to Diff 383918.Nov 1 2021, 4:19 PM

Fix commit message

Mogball edited the summary of this revision. (Show Details)Nov 1 2021, 4:20 PM
rriddle added inline comments.Nov 2 2021, 9:24 PM
mlir/include/mlir/IR/OperationSupport.h
305–311

This class is only ever used by NamedAttrList. Is there a reason for this to be separate?

347
438
Mogball added inline comments.Nov 3 2021, 9:43 AM
mlir/include/mlir/IR/OperationSupport.h
305–311

I was going to add it to DictionaryAttr but since that's tied up in ODS I didn't bother...

Mogball updated this revision to Diff 384546.Nov 3 2021, 11:50 AM

Deleted AttributeDictionaryLike class

Mogball marked 3 inline comments as done.Nov 3 2021, 11:51 AM
rriddle added inline comments.Nov 4 2021, 8:25 AM
mlir/include/mlir/IR/OperationSupport.h
260–264
271–273

Is the bool return important? What is the performance if you just return end() if the name isn't found?

274–275

Just use ptrdiff_t.

280

Spell out auto here.

295

Drop the 16, it could change. Prefer being somewhat ambiguous on exact numbers of heuristics in comments.

Mogball marked an inline comment as done.Nov 4 2021, 11:10 AM
Mogball added inline comments.
mlir/include/mlir/IR/OperationSupport.h
271–273

For a linear search, the only gain is that you skip one extra it == end() check.

For a binary search, it's important to know whether the attribute was actually found, because, if not, then it will be the sorted insertion point. If just end() was returned:

  • Attribute lookups will need to check it != end() && it->first == name to know if the attribute was actually found (which is no different than using lower_bound)
  • Setting an attribute when the attribute doesn't exist will have to re-search the sorted insertion point.
Mogball added inline comments.Nov 4 2021, 11:13 AM
mlir/include/mlir/IR/OperationSupport.h
271–273

And if the attribute was actually found (with a StringRef lookup) then it->first == name will compare the entire string (think operand_segment_sizes...)

rriddle added inline comments.Nov 4 2021, 11:13 AM
mlir/include/mlir/IR/OperationSupport.h
271–273

Why would the binary search be different? Wouldn't we just return end in all of the cases where you currently return false, and the current iterator whenever you return true?

Mogball marked 2 inline comments as done.Nov 4 2021, 11:36 AM
Mogball added inline comments.
mlir/include/mlir/IR/OperationSupport.h
271–273

Case 1: attribute is found. Returning just an iterator:

c++
auto it = findAttrSorted(begin(), end(), name);
return it == end() ? Attribute() : it->second;

// vs
auto it = findAttrSorted(begin(), end(), name);
return it.second ? it.first->second : Attribute();

All that happens is that the == end() check is skipped, which isn't much, but:

Case 2: setting an attribute

c++
auto it = findAttrSorted(begin(), end(), name);
// a: attribute is found
if (it != end()) {
  it->second = value;
  return;
}
// b: attribute is not found, where should the attribute be inserted?
it = llvm::lower_bound(begin(), end(), name); // re-search the list
attrs.insert(it, {name, value});

// vs
auto it = findAttrSorted(begin(), end(), name);
if (it.second) {
  it.first->second = value;
  return;
}
// it.first will be the sorted insertion point
attrs.insert(it.first, {name, value});

// vs lower_bound
auto it = llvm::lower_bound(begin(), end(), name);
if (it != end() && it->first == name) { // re-compare the whole string (if found)
  it->second = value;
  return; 
}
attrs.insert(it, {name, value});

Essentially, the bool contains the result of it->first == name so that it doesn't have to be re-checked.

rriddle added inline comments.Nov 4 2021, 11:39 AM
mlir/include/mlir/IR/OperationSupport.h
271–273

Right, I missed where this was being used for insertion.

Mogball marked 3 inline comments as done.Nov 4 2021, 11:39 AM
Mogball updated this revision to Diff 384833.Nov 4 2021, 11:41 AM

Review comments

rriddle accepted this revision.Nov 4 2021, 12:20 PM
rriddle added inline comments.
mlir/lib/IR/OperationSupport.cpp
110

nit: Add a comment on the use of swap here.

122–128

Why is this the case? Can't we use the insertion point from the findAttr above directly?

This revision is now accepted and ready to land.Nov 4 2021, 12:20 PM
Mogball added inline comments.Nov 4 2021, 12:52 PM
mlir/lib/IR/OperationSupport.cpp
122–128

Since this one is set(Identifier, Attribute), the identifier lookup doesn't return the sorted insertion point (either end() or the found attribute). You can either:

  1. insert wherever and unset the sorted flag dictionarySorted.setInt(false)
  2. lookup the sorted insertion point and insert maintaining sortedness

I chose (2) because that was the previous behaviour of the function.

Mogball updated this revision to Diff 384855.Nov 4 2021, 1:36 PM

Add swap comment

Mogball marked an inline comment as done.Nov 4 2021, 1:36 PM
This revision was automatically updated to reflect the committed changes.