This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Early exit in RangeDataVector:FindEntryIndexesThatContain
ClosedPublic

Authored by teemperor on Sep 3 2019, 11:57 AM.

Details

Summary

We currently spend a lot of time in this function (around 27% of the br-by-regex benchmark in lldb-bench)
by just iterating over all the ranges. We already sorted these ranges by their base address, we we can actually
just stop checking ranges as soon as we find one that has a higher base address.

Diff Detail

Repository
rL LLVM

Event Timeline

teemperor created this revision.Sep 3 2019, 11:57 AM

See the br-by-regex flamegraph:

This feels very familiar. I think I've reviewed a similar change back when we first implemented minidumps.

lldb/include/lldb/Utility/RangeMap.h
739 ↗(On Diff #218509)

You're trying to find an upper bound, so std::upper_bound seems more natural than std::lower_bound. Understanding how std::lower_bound works with a comparator that implements <= requires some mental gymnastics. With std::upper_bound, you'd just need < that compares only the base addresses.

You could even avoid the custom comparison function by using the maximum value for the size:

auto end = std::upper_bound(m_entries.begin(), m_entries.end(),
                            Entry(addr, std::numeric_limits<Entry::SizeType>::max()));

I remember seeing this inefficiency when looking at this class some time ago, but it was not clear to me what should be done about that. This is still an improvement, as it will reduce the time by 50% on average, but it is still going to be O(n). If this is really a performance bottleneck, then we will need to come up with some better data structure for storing this, or put some restrictions on the kind of entries that we can have in the map...

lldb/include/lldb/Utility/RangeMap.h
739 ↗(On Diff #218509)

Actually, If I understand this correctly, there is no need for either lower or upper bound here. Since you're going to be iterating through the list anyway. It should be sufficient to add a early exit from the loop once you encounter an element whose base address is >= the address you search for.

teemperor updated this revision to Diff 218602.Sep 4 2019, 12:34 AM
teemperor retitled this revision from [lldb] Use binary search in RangeDataVector:FindEntryIndexesThatContain to [lldb] Early exit in RangeDataVector:FindEntryIndexesThatContain.
teemperor edited the summary of this revision. (Show Details)
  • Refactor to just stop the search when we find a higher base address.
teemperor marked 3 inline comments as done.Sep 4 2019, 12:35 AM
teemperor added inline comments.
lldb/include/lldb/Utility/RangeMap.h
739 ↗(On Diff #218509)

Oh true, I got confused by all the other lower_bounding we do in the surrounding functions :)

labath accepted this revision.Sep 4 2019, 12:59 AM

lgtm

lldb/include/lldb/Utility/RangeMap.h
728–729 ↗(On Diff #218602)

This pattern seems to be common in lldb, but I really don't understand the reason behind it -- it just checks the condition that would be checked by the for loop anyway. I've personally just started removing checks like these whenever I touch some code which uses them...

This revision is now accepted and ready to land.Sep 4 2019, 12:59 AM
teemperor updated this revision to Diff 218644.Sep 4 2019, 4:38 AM
teemperor marked an inline comment as done.
  • remove empty check
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSep 4 2019, 4:39 AM
amccarth added inline comments.Sep 4 2019, 9:27 AM
lldb/include/lldb/Utility/RangeMap.h
739 ↗(On Diff #218509)

I still don't see the early exit from the loop. Have you uploaded the latest diff to Phabricator?

(Thanks Pavel for pointing out the obvious that teemperor and I both missed.)

Disregard my last comment. Phabricator wasn't showing me that latest, nor that the patch had already been submitted, which seems to be happening with increasing frequency.