Calling sync() *after* we sort the children is faster.
Also bail out early if some child is already at end.
This seems safe and should improve perf though only slightly
Differential D106645
[clangd] Tweak constructor of dex AndIterator for performance. sammccall on Jul 23 2021, 3:42 AM. Authored by
Details
Calling sync() *after* we sort the children is faster. This seems safe and should improve perf though only slightly
Diff Detail
Event TimelineComment Actions I am extremely confused, but this is actually 4.5% slower than the updated (improved performance with D106528) baseline. Now, this actually works (2.2-2.5% faster): ❯ git diff diff --git a/clang-tools-extra/clangd/index/dex/Iterator.cpp b/clang-tools-extra/clangd/index/dex/Iterator.cpp index 8b37403ff406..ea90e35c32a4 100644 --- a/clang-tools-extra/clangd/index/dex/Iterator.cpp +++ b/clang-tools-extra/clangd/index/dex/Iterator.cpp @@ -29,8 +29,8 @@ public: assert(!Children.empty() && "AND iterator should have at least one child."); // Establish invariants. for (const auto &Child : Children) - ReachedEnd |= Child->reachedEnd(); - sync(); + if ((ReachedEnd |= Child->reachedEnd())) + return; // When children are sorted by the estimateSize(), sync() calls are more // effective. Each sync() starts with the first child and makes sure all // children point to the same element. If any child is "above" the previous @@ -42,6 +42,7 @@ public: const std::unique_ptr<Iterator> &RHS) { return LHS->estimateSize() < RHS->estimateSize(); }); + sync(); } bool reachedEnd() const override { return ReachedEnd; } Some other "obvious" optimizatinons, e.g. ❯ git diff diff --git a/clang-tools-extra/clangd/index/dex/Iterator.cpp b/clang-tools-extra/clangd/index/dex/Iterator.cpp index 8b37403ff406..d0f3bd1597ac 100644 --- a/clang-tools-extra/clangd/index/dex/Iterator.cpp +++ b/clang-tools-extra/clangd/index/dex/Iterator.cpp @@ -89,8 +89,7 @@ private: /// Restores class invariants: each child will point to the same element after /// sync. void sync() { - ReachedEnd |= Children.front()->reachedEnd(); - if (ReachedEnd) + if (ReachedEnd || (ReachedEnd |= Children.front()->reachedEnd())) return; auto SyncID = Children.front()->peek(); // Indicates whether any child needs to be advanced to new SyncID. Surprisingly slow down the queries, too, in an insignificant manner. I think, at this point it's really the matter of assembly output in an extremely hot loop. Comment Actions As discussed offline, does not yield performance improvements. Putting on "Request Changes" so that I don't miss ongoing reviews (please let me know if that's OK with you). |