Take full advantage of the AND iterator's children size estimation: reset early
in sync() to avoid large overhead. The idea is that the children at the
beginning of the list are smaller and cheaper to advance. Very large children
negate the effect of this performance optimisation and should therefore be
advanced only when absolutely necessary. By reducing the number of updates to
large iterators, we improve performance by a large margin.
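To make the intent concrete, here is a hypothetical Python sketch (not the actual implementation; `PostingIterator`, `and_sync`, and the early-reset shape are all illustrative assumptions) of an AND sync that keeps the cheap children at the front and resets early whenever a child disagrees, so the large tail iterators are rarely touched:

```python
# Hypothetical sketch only: an AND over sorted posting lists whose children
# are ordered by estimated size, so cheap children are advanced first and
# large iterators are only moved once every smaller child already agrees.

class PostingIterator:
    def __init__(self, docids):
        self.docids = sorted(docids)
        self.pos = 0
        self.advances = 0  # count how often this child is moved

    def estimated_size(self):
        return len(self.docids)

    def doc(self):
        return self.docids[self.pos] if self.pos < len(self.docids) else None

    def advance_to(self, target):
        # Move forward until doc() >= target, or exhaust the list.
        self.advances += 1
        while self.pos < len(self.docids) and self.docids[self.pos] < target:
            self.pos += 1
        return self.doc()


def and_sync(children):
    """Find the next docid present in all children, or None.

    Children are sorted so the smallest (cheapest) come first; whenever a
    child skips past the candidate we reset early and restart from the
    front, which keeps the large tail iterators mostly idle.
    """
    children = sorted(children, key=lambda ch: ch.estimated_size())
    candidate = children[0].doc()
    while candidate is not None:
        for child in children:
            doc = child.advance_to(candidate)
            if doc is None:
                return None  # some child is exhausted: no more matches
            if doc != candidate:
                candidate = doc  # early reset: restart with the new candidate
                break
        else:
            return candidate  # every child landed exactly on candidate
    return None
```

In this sketch the largest child only ever sees `advance_to` calls for candidates that all smaller children have already accepted, which is the whole point of the ordering.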
This change was tested on a comprehensive query dataset. The performance
boost grows with the average length of the query: on short queries it is
close to 45%, and the longer the queries get, the closer it gets to 60% and beyond.
I stared at this for a while, and this change makes me think about the whole algorithm differently.
Effectively, we're just iterating through the items C of the front iterator, trying to find a valid sync point,
with the slight optimization that when some other child skips over C to some X, we catch the front up to X.
This leads to two possible simplifications (strictly unrelated to this patch, and totally up to you):
Consider rather:
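A hypothetical Python sketch of what I mean (`Cursor` and `and_sync_front_driven` are illustrative names, not anything from the codebase):

```python
# Hypothetical sketch of the restructured flow control: the outer loop
# walks only the items of the front iterator, and every other child just
# tries to catch up to the current front item.

class Cursor:
    """Minimal sorted-posting cursor, for illustration only."""
    def __init__(self, docids):
        self.docids = sorted(docids)
        self.pos = 0

    def doc(self):
        return self.docids[self.pos] if self.pos < len(self.docids) else None

    def advance_to(self, target):
        while self.pos < len(self.docids) and self.docids[self.pos] < target:
            self.pos += 1
        return self.doc()


def and_sync_front_driven(front, others):
    """Outer loop is entirely over the items of `front`."""
    doc = front.doc()
    while doc is not None:
        for child in others:
            skipped_to = child.advance_to(doc)
            if skipped_to is None:
                return None  # a child ran out: no sync point exists
            if skipped_to != doc:
                # A child skipped over doc to skipped_to: catch the
                # front up and continue the outer loop from there.
                doc = front.advance_to(skipped_to)
                break
        else:
            return doc  # every child agrees on doc
    return None
```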
This seems basically equivalent, and I find the flow control easier to reason about,
because now the outer loop is *entirely* just a loop over the items of Front.
Intuitively, the average amount advanceTo() advances by here is between 1 and 2: we know it's at least 1, and the remaining items are mostly from the longer iterator, so we probably don't skip over any from the shorter iterator.
On the other hand, advanceTo is more complicated; if advanceTo turns out to be 2x as expensive as advance in practice, this isn't a win. So this is maybe worth checking.
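If it helps, here's a rough micro-benchmark sketch for that question (hypothetical: `Cursor`, `advance`, and `advance_to` are stand-ins, and real posting-list costs will differ, so treat the numbers as directional only):

```python
# Rough sketch: compare stepping a cursor one doc at a time via advance()
# against jumping by small deltas via advance_to(). The restructuring only
# wins if advance_to() is not much more expensive than advance() per call.

import timeit


class Cursor:
    def __init__(self, docids):
        self.docids = docids
        self.pos = 0

    def advance(self):
        # Unconditionally step to the next doc.
        self.pos += 1

    def advance_to(self, target):
        # Step forward until the current doc is >= target.
        d = self.docids
        while self.pos < len(d) and d[self.pos] < target:
            self.pos += 1


def time_advance(n=100_000):
    c = Cursor(list(range(n)))
    return timeit.timeit(c.advance, number=n - 1)


def time_advance_to(n=100_000, step=2):
    # advance_to() jumps by `step` docs per call, mimicking the
    # "advances by between 1 and 2 on average" regime discussed above.
    c = Cursor(list(range(n)))
    targets = iter(range(step, n, step))
    return timeit.timeit(lambda: c.advance_to(next(targets)),
                         number=n // step - 1)
```

Comparing `time_advance()` against `time_advance_to()` gives a crude per-call cost ratio; the real check would of course use the actual iterator implementations.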