This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Implement iterator cost
ClosedPublic

Authored by kbobyrev on Aug 27 2018, 8:33 AM.

Details

Summary

This patch introduces iterator cost concept to improve the performance of Dex query iterators (mainly, AND iterator). Benchmarks show that the queries become ~10% faster.

Before

-------------------------------------------------------
Benchmark                Time           CPU Iteration
-------------------------------------------------------
DexAdHocQueries    5883074 ns    5883018 ns        117
DexRealQ         959904457 ns  959898507 ns          1

After

-------------------------------------------------------
Benchmark                Time           CPU Iteration
-------------------------------------------------------
DexAdHocQueries    5238403 ns    5238361 ns        130
DexRealQ         873275207 ns  873269453 ns          1

Diff Detail

Event Timeline

kbobyrev created this revision.Aug 27 2018, 8:33 AM
kbobyrev planned changes to this revision.Aug 27 2018, 8:37 AM

It's probably better to roll out proximity path boosting & actual two-stage filtering before rolling this out.

sammccall accepted this revision.Aug 28 2018, 4:36 AM

It's probably better to roll out proximity path boosting & actual two-stage filtering before rolling this out.

Up to you (I don't understand the interaction), but this looks good to go as-is. I'd expect a 10% speedup to be pretty robust to the sorts of changes you're talking about.

clang-tools-extra/clangd/index/dex/Iterator.cpp
90 ↗(On Diff #162686)

this is a performance heuristic and deserves a brief comment (about *why* we're best to have the shortest first). Not too many details, handwavy is fine :-)

93 ↗(On Diff #162686)

(I'd actually suggest declaring operator< in iterator.h and making it compare by cost, it seems natural/important enough, and saves duplicating this lambda)

125 ↗(On Diff #162686)

The use of std::accumulate here seems less clear than the direct:

size_t Smallest = ...;
for (const auto& Child : Children)
  Smallest = std::min(Smallest, Child->cost());
return Smallest;

For better or worse, functional styles don't have the most natural syntax in C++, and (IMO) shouldn't be used just because they fit.

(This is consistent with other places, but I also think that the same applies there)

125 ↗(On Diff #162686)

actually, we've sorted by cost, so don't we just want Children.front().cost() here?

258 ↗(On Diff #162686)

as above, we've sorted, so just Children.back()->cost()?

377 ↗(On Diff #162686)

min() this with Child->cost()?

clang-tools-extra/clangd/index/dex/Iterator.h
99 ↗(On Diff #162686)

const?

99 ↗(On Diff #162686)

I know this terminology comes from elsewhere, but I struggle with "cost" as a metaphor because I expect it to aggregate its children as a sum (at least in cases like "or" when all children will be advanced until end).

Maybe estimateSize()?

kbobyrev updated this revision to Diff 163024.Aug 29 2018, 1:15 AM
kbobyrev marked 6 inline comments as done.
This revision is now accepted and ready to land.Aug 29 2018, 1:15 AM
kbobyrev added inline comments.Aug 29 2018, 1:16 AM
clang-tools-extra/clangd/index/dex/Iterator.cpp
93 ↗(On Diff #162686)

Actually, I think sorting the children of OR iterator doesn't improve performance so I should probably just have a lambda in one place (AND) as it won't be used anywhere else.

125 ↗(On Diff #162686)

It's a sad day for the noble λ knights, but I agree.

I think I should push it like this in this patch (to comply with the common style) and then send out a separate patch with refactorings (I had few concerns about the inconsistency in the Iterator.cpp).

This revision was automatically updated to reflect the committed changes.