
Details

Reviewers

labath
teemperor

Commits

rG6304368818a1: [lldb] Treat RangeDataVector as an augmented binary search tree

Summary

Since RangeDataVector is assumed to always be sorted, we can treat it as a flattened BST and augment it with additional information about the ranges belonging to each "subtree". By storing the maximum endpoint in every subtree, we can query for intervals in O(log n) time.
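
To make the idea in the summary concrete, here is a minimal standalone sketch, not the patch itself. It assumes half-open ranges `[base, end)` and uses hypothetical names (`Node`, `Annotate`, `Collect`, `FindAll`) that do not exist in LLDB. The sorted vector is read as an implicit tree whose root is the middle element, and each node records the largest end point found in its subtree.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical entry type: a half-open range [base, end) plus a payload.
struct Node {
  std::uint64_t base;
  std::uint64_t end;
  std::uint32_t data;
  std::uint64_t upper_bound; // largest `end` in the subtree rooted here
};

// Annotates the implicit tree over the non-empty index range [lo, hi).
// The subtree root is the middle element, as in binary search.
inline std::uint64_t Annotate(std::vector<Node> &v, std::size_t lo,
                              std::size_t hi) {
  assert(lo < hi);
  std::size_t mid = lo + (hi - lo) / 2;
  std::uint64_t bound = v[mid].end;
  if (lo < mid)
    bound = std::max(bound, Annotate(v, lo, mid));
  if (mid + 1 < hi)
    bound = std::max(bound, Annotate(v, mid + 1, hi));
  return v[mid].upper_bound = bound;
}

// Appends the payload of every range in [lo, hi) that contains addr.
inline void Collect(const std::vector<Node> &v, std::uint64_t addr,
                    std::size_t lo, std::size_t hi,
                    std::vector<std::uint32_t> &out) {
  if (lo >= hi)
    return;
  std::size_t mid = lo + (hi - lo) / 2;
  const Node &n = v[mid];
  if (addr >= n.upper_bound) // every range in this subtree ends by addr
    return;
  Collect(v, addr, lo, mid, out);
  if (addr < n.base) // this node and its right subtree start after addr
    return;
  if (addr < n.end)
    out.push_back(n.data);
  Collect(v, addr, mid + 1, hi, out);
}

// Sorts by base address, annotates, and answers a single query.
inline std::vector<std::uint32_t> FindAll(std::vector<Node> v,
                                          std::uint64_t addr) {
  std::stable_sort(v.begin(), v.end(), [](const Node &a, const Node &b) {
    return a.base < b.base;
  });
  if (!v.empty())
    Annotate(v, 0, v.size());
  std::vector<std::uint32_t> out;
  Collect(v, addr, 0, v.size(), out);
  return out;
}
```

In this sketch `FindAll` re-sorts on every call only to stay self-contained; a real container would sort and annotate once and then answer many queries.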

Note: this is not a complete patch; I just wanted to put it out there and see how you feel about it. I would also like to know what kind of testing could and should be done for this.

Diff Detail

Event Timeline

unnar created this revision. Feb 18 2020, 3:48 AM

Herald added a project: Restricted Project. Feb 18 2020, 3:48 AM

Herald added subscribers: lldb-commits, JDevlieghere.

Thanks for putting this together, some comments below. Let us see what Pavel thinks.

lldb/include/lldb/Utility/RangeMap.h
644	BST -> binary search tree
652	Here, B() should be the min value of type B, no? Perhaps this should be `std::numeric_limits<B>::min()` instead of `B()`?
734	Hmm, weird, I am surprised this is not `std::vector<T> &indexes` (I realize this was in the code before).
815	I am guessing this should have the `m_` prefix?
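
The question about `B()` versus `std::numeric_limits<B>::min()` can be illustrated with a small sketch. The helper names `MaxWithSeed` and `MaxOfNonEmpty` are hypothetical and are not part of the patch. The sketch shows that a value-initialized seed is only a safe starting point for a running maximum when no endpoint can be smaller than it, and that seeding from the first element of a non-empty range avoids the question altogether.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Running maximum that starts from a caller-supplied seed value.
template <typename B> B MaxWithSeed(const std::vector<B> &ends, B seed) {
  B result = seed;
  for (const B &e : ends)
    result = std::max(result, e);
  return result;
}

// Running maximum that needs no seed because the input must be non-empty.
template <typename B> B MaxOfNonEmpty(const std::vector<B> &ends) {
  assert(!ends.empty());
  B result = ends.front();
  for (const B &e : ends)
    result = std::max(result, e);
  return result;
}
```

With a signed type and endpoints `{-5, -3}`, seeding with `int()` yields 0 instead of -3. For floating-point types `std::numeric_limits<T>::min()` is the smallest positive normalized value, so `lowest()` would be the seed to use there. For an unsigned base type, `B()` and `min()` are both zero, so the two spellings coincide.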

I like this idea a lot, in principle. It is much simpler than a full blown interval tree, and it should give us similar performance characteristics.

Have you done a proper complexity analysis here? I doubt the O(log n) claim is true in general. It would have to be at least O(m + log n) (m - number of elements found), but it's not clear to me whether even this is true in general. (However, I believe this can achieve ~~log(n) for non-degenerate cases.)

The implementation itself needs some work though. My incomplete list of comments is:

replace int with size_t and closed intervals with half-open ones
let's move the computation of the upper bound into the "Sort" function. sorting is O(n log(n)), this is O(n) -- we can just hide it there.
make private functions private
we should avoid the business of figuring out what is the suitable "minimum" value of B by ensuring we call the recursive function on non-empty intervals
clang-format the patch

For testing you should add some c++ unit tests for the relevant interfaces.

In D74759#1880499, @labath wrote:

I like this idea a lot, in principle. It is much simpler than a full blown interval tree, and it should give us similar performance characteristics.

Have you done a proper complexity analysis here? I doubt the O(log n) claim is true in general. It would have to be at least O(m + log n) (m - number of elements found), but it's not clear to me whether even this is true in general. (However, I believe this can achieve ~~log(n) for non-degenerate cases.)

Thanks for the feedback! We were aiming for something simple and efficient enough. Our preliminary results show that the lookup pretty much disappears even from the profiles it was dominating before.

The implementation is pretty much taken from the "Augmented tree" section of https://en.wikipedia.org/wiki/Interval_tree where we just use the tree induced by the pivots of the binary search as the binary search tree that we augment. I believe the complexity is O(m log n), even though the wikipedia article makes a O(m + log n) claim. This should be still much better than the current O(n) and the memory cost seems to be quite palatable (extra word per symbol).
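
The "tree induced by the pivots of the binary search" is balanced by construction, which is where the log n factor comes from. The following sketch (the function name `ImplicitTreeDepth` is made up for illustration) computes the depth of that implicit tree over an index range `[lo, hi)`, using the same midpoint rule as the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Depth of the implicit tree over the index range [lo, hi): the root is the
// midpoint, the left child covers [lo, mid) and the right child [mid + 1, hi).
inline std::size_t ImplicitTreeDepth(std::size_t lo, std::size_t hi) {
  if (lo >= hi)
    return 0;
  std::size_t mid = lo + (hi - lo) / 2;
  return 1 + std::max(ImplicitTreeDepth(lo, mid),
                      ImplicitTreeDepth(mid + 1, hi));
}
```

For n elements the depth works out to floor(log2(n)) + 1, so a vector of 1000 entries gives a tree of depth 10.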

The implementation itself needs some work though. My incomplete list of comments is:

replace int with size_t and closed intervals with half-open ones

let's move the computation of the upper bound into the "Sort" function. sorting is O(n log(n)), this is O(n) -- we can just hide it there.

make private functions private

we should avoid the business of figuring out what is the suitable "minimum" value of B by ensuring we call the recursive function on non-empty intervals

clang-format the patch

For testing you should add some c++ unit tests for the relevant interfaces.

If we can get away with this approach, then I'm completely fine with it. But I would feel better if we first had some more unit tests for RangeDataVector.

In D74759#1880559, @jarin wrote:

In D74759#1880499, @labath wrote:

I like this idea a lot, in principle. It is much simpler than a full blown interval tree, and it should give us similar performance characteristics.

Have you done a proper complexity analysis here? I doubt the O(log n) claim is true in general. It would have to be at least O(m + log n) (m - number of elements found), but it's not clear to me whether even this is true in general. (However, I believe this can achieve ~~log(n) for non-degenerate cases.)

Thanks for the feedback! We were aiming for something simple and efficient enough. Our preliminary results show that the lookup pretty much disappears even from the profiles it was dominating before.

The implementation is pretty much taken from the "Augmented tree" section of https://en.wikipedia.org/wiki/Interval_tree where we just use the tree induced by the pivots of the binary search as the binary search tree that we augment. I believe the complexity is O(m log n), even though the wikipedia article makes a O(m + log n) claim. This should be still much better than the current O(n) and the memory cost seems to be quite palatable (extra word per symbol).

Ok, I see. Thanks for that reference. The wikipedia description is somewhat short, but I am beginning to understand how this could work. Maybe you could include a pointer to the wikipedia page and/or the book it references next to the algorithm.

In D74759#1880499, @labath wrote:

Have you done a proper complexity analysis here? I doubt the O(log n) claim is true in general. It would have to be at least O(m + log n) (m - number of elements found), but it's not clear to me whether even this is true in general. (However, I believe this can achieve ~~log(n) for non-degenerate cases.)

I should have been more specific: searching for any interval that contains a point can be done in O(log n), but in our case we are searching for all intervals that contain the point and, as Jarin said, we believe that takes O(m log n) (I admit I have not done a thorough analysis of the time complexity myself). The theoretical best you can do is indeed O(m + log n).
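
The O(log n) case mentioned here, finding any one interval that contains a point, can be sketched as a single root-to-leaf walk. This is an illustration under assumptions, not code from the patch: ranges are half-open `[base, end)`, the vector is already sorted by `base`, and the names `Node`, `Annotate` and `FindAny` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical entry: half-open range [base, end) with a subtree maximum.
struct Node {
  std::uint64_t base;
  std::uint64_t end;
  std::uint64_t upper_bound; // largest `end` in the subtree rooted here
};

// Fills upper_bound for the implicit tree over the non-empty range [lo, hi).
inline std::uint64_t Annotate(std::vector<Node> &v, std::size_t lo,
                              std::size_t hi) {
  assert(lo < hi);
  std::size_t mid = lo + (hi - lo) / 2;
  std::uint64_t bound = v[mid].end;
  if (lo < mid)
    bound = std::max(bound, Annotate(v, lo, mid));
  if (mid + 1 < hi)
    bound = std::max(bound, Annotate(v, mid + 1, hi));
  return v[mid].upper_bound = bound;
}

// Returns some range containing addr, or nullptr. Only one child is followed
// at each step, so the walk visits at most one node per tree level.
inline const Node *FindAny(const std::vector<Node> &v, std::uint64_t addr) {
  std::size_t lo = 0, hi = v.size();
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo) / 2;
    const Node &n = v[mid];
    if (addr >= n.upper_bound)
      return nullptr; // nothing in this subtree extends past addr
    if (n.base <= addr && addr < n.end)
      return &n;
    if (lo < mid) {
      std::size_t left_root = lo + (mid - lo) / 2;
      // If some range on the left ends after addr, a match can only be there:
      // should that range start after addr, so does everything to its right.
      if (v[left_root].upper_bound > addr) {
        hi = mid;
        continue;
      }
    }
    if (addr < n.base)
      return nullptr; // all ranges to the right start even later
    lo = mid + 1;
  }
  return nullptr;
}
```

Collecting all matches cannot use the same shortcut, because both subtrees may hold results, which is why the bound discussed in this thread gains the factor m.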

The implementation itself needs some work though. My incomplete list of comments is:

replace int with size_t and closed intervals with half-open ones

Done.

let's move the computation of the upper bound into the "Sort" function. sorting is O(n log(n)), this is O(n) -- we can just hide it there.

That was my first thought but I decided to have it generated lazily instead, I will move it back to the sort.

make private functions private

Done.

we should avoid the business of figuring out what is the suitable "minimum" value of B by ensuring we call the recursive function on non-empty intervals

Done.

clang-format the patch

Done.

For testing you should add some c++ unit tests for the relevant interfaces.

Done. Do you think we should also unit test ComputeUpperBounds or just the interface?

lldb/include/lldb/Utility/RangeMap.h
652	Removed and made sure not to recursively call in degenerate cases.
734	I suspect this function used to have a different implementation where it would return the indexes of the entries rather than the data itself similar to FindEntryIndexThatContains and it was not properly changed when it was updated.
815	Removed as it is no longer needed.

Added reference to Wikipedia article.

Thanks for adding the tests. They look good. Could you split them off into a separate patch? I believe @teemperor wanted (and I do agree it's a good idea) to check them in first in order to demonstrate that this patch does not change the behavior (or to make it clear exactly what changes).
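
One way to pin down existing behavior before swapping the lookup is to compare results against a plain linear scan. The sketch below mirrors the loop that this patch removes, but on a simplified, hypothetical `Entry` type rather than the real `RangeData`; `FindAllLinear` is likewise a name invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for a range entry; not the LLDB definition.
struct Entry {
  std::uint64_t base;
  std::uint64_t size;
  std::uint32_t data;
  bool Contains(std::uint64_t addr) const {
    return base <= addr && addr < base + size;
  }
};

// Reference lookup: walks entries sorted by base and stops at the first
// non-matching entry that starts after addr, since later ones start later too.
inline std::vector<std::uint32_t>
FindAllLinear(const std::vector<Entry> &sorted, std::uint64_t addr) {
  std::vector<std::uint32_t> out;
  for (const Entry &e : sorted) {
    if (e.Contains(addr))
      out.push_back(e.data);
    else if (e.base > addr)
      break;
  }
  return out;
}
```

A test could then assert that a faster lookup returns the same payloads, in the same order, as this reference for a handful of overlapping and disjoint ranges.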

lldb/unittests/Utility/RangeMapTest.cpp
22 ↗	(On Diff #246681)	returning a const value from a function is fairly useless. All it does is make the caller jump through some hoops if he really wants to modify it.
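
A short sketch of why a const-qualified return value mostly just gets in the caller's way. The function names here are hypothetical and unrelated to the test file under review.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Returns by const value: the temporary cannot be modified in place.
const std::string MakeConstName() { return "range"; }

// Returns by plain value: the caller may modify or move from the temporary.
std::string MakeName() { return "range"; }

static_assert(std::is_const<decltype(MakeConstName())>::value,
              "the returned temporary is const");
static_assert(!std::is_const<decltype(MakeName())>::value,
              "the returned temporary is not const");

inline std::string ExtendedName() {
  // MakeConstName().append("_map");   // would not compile: append is non-const
  std::string copy = MakeConstName(); // the extra hoop: name a local first
  copy.append("_map");
  return copy;
}
```

The const qualifier gives no protection to the callee's state, since the caller already owns an independent object; it only forces the extra local and prevents moving from the temporary.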

unnar updated this revision to Diff 246716. Feb 26 2020, 7:16 AM

unnar marked 2 inline comments as done.

In D74759#1893186, @unnar wrote:

In D74759#1880499, @labath wrote:

Have you done a proper complexity analysis here? I doubt the O(log n) claim is true in general. It would have to be at least O(m + log n) (m - number of elements found), but it's not clear to me whether even this is true in general. (However, I believe this can achieve ~~log(n) for non-degenerate cases.)

I should have been more specific: searching for any interval that contains a point can be done in O(log n), but in our case we are searching for all intervals that contain the point and, as Jarin said, we believe that takes O(m log n) (I admit I have not done a thorough analysis of the time complexity myself). The theoretical best you can do is indeed O(m + log n).

Yep, I can easily convince myself that it is m*log(n). I can believe it can be O(m+log(n)) too, but proving that would require some pretty careful accounting. In either case, I am sure this is better than what we have now.

For testing you should add some c++ unit tests for the relevant interfaces.

Done. Do you think we should also unit test ComputeUpperBounds or just the interface?

Testing the interface should be enough. In fact, the main thing that is bothering me about this patch at this stage is that the new "upper_bound" member is public and accessible to the user. I haven't decided yet what to do about it. Have you looked at what it would take to make that private somehow? Maybe by storing std::pair<Entry, bound> in the private vector, and only handing out pointers to the Entry component ?
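
The `std::pair<Entry, bound>` suggestion could look roughly like the sketch below. It is only an illustration of the visibility idea: `Entry` is a simplified stand-in, `PrivateBoundVector` is a hypothetical class name, and the bound is left at zero because the code that would fill it in during sorting is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Simplified stand-in for the user-visible entry type.
struct Entry {
  std::uint64_t base;
  std::uint64_t size;
  int data;
};

class PrivateBoundVector {
public:
  // Stores the entry next to a bound that a sort step would later fill in.
  void Append(const Entry &entry) {
    m_entries.emplace_back(entry, std::uint64_t(0));
  }

  std::size_t GetSize() const { return m_entries.size(); }

  // Callers only ever receive the Entry half of the pair.
  const Entry *GetEntryAtIndex(std::size_t i) const {
    return i < m_entries.size() ? &m_entries[i].first : nullptr;
  }

private:
  // The second member is the subtree upper bound, invisible to users.
  std::vector<std::pair<Entry, std::uint64_t>> m_entries;
};
```

Because accessors return `const Entry *`, users can read the range and data but have no way to name or touch the bound.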

Sure, removed tests from this patch. They are now at D75180

lldb/unittests/Utility/RangeMapTest.cpp
22 ↗	(On Diff #246681)	Noted.

In D74759#1893450, @labath wrote:

In D74759#1893186, @unnar wrote:

In D74759#1880499, @labath wrote:

Have you done a proper complexity analysis here? I doubt the O(log n) claim is true in general. It would have to be at least O(m + log n) (m - number of elements found), but it's not clear to me whether even this is true in general. (However, I believe this can achieve ~~log(n) for non-degenerate cases.)

I should have been more specific: searching for any interval that contains a point can be done in O(log n), but in our case we are searching for all intervals that contain the point and, as Jarin said, we believe that takes O(m log n) (I admit I have not done a thorough analysis of the time complexity myself). The theoretical best you can do is indeed O(m + log n).

Yep, I can easily convince myself that it is m*log(n). I can believe it can be O(m+log(n)) too, but proving that would require some pretty careful accounting. In either case, I am sure this is better than what we have now.

For testing you should add some c++ unit tests for the relevant interfaces.

Done. Do you think we should also unit test ComputeUpperBounds or just the interface?

Testing the interface should be enough. In fact, the main thing that is bothering me about this patch at this stage is that the new "upper_bound" member is public and accessible to the user. I haven't decided yet what to do about it. Have you looked at what it would take to make that private somehow? Maybe by storing std::pair<Entry, bound> in the private vector, and only handing out pointers to the Entry component ?

That should work...although I'm not sure how that is any different to the range or data being public. If a user modifies anything then it has essentially invalidated the whole structure anyway.

In D74759#1893485, @unnar wrote:

That should work...although I'm not sure how that is any different to the range or data being public. If a user modifies anything then it has essentially invalidated the whole structure anyway.

That is a fair point. I suppose the reason why I see this as different is because this field is an implementation detail of the RangeDataVector class, and so the user should not even see it -- whereas the user has a legitimate reason to at least access the other fields (and most of the methods only provide read-only access to these fields).

I'm sorry, I haven't gotten around to looking at this patch today, but I thought I'd at least say that...

In D74759#1895748, @labath wrote:

In D74759#1893485, @unnar wrote:

That should work...although I'm not sure how that is any different to the range or data being public. If a user modifies anything then it has essentially invalidated the whole structure anyway.

That is a fair point. I suppose the reason why I see this as different is because this field is an implementation detail of the RangeDataVector class, and so the user should not even see it -- whereas the user has a legitimate reason to at least access the other fields (and most of the methods only provide read-only access to these fields).

I'm sorry, I haven't gotten around to looking at this patch today, but I thought I'd at least say that...

That is true. I am fine with changing it if that's the only thing that you see as blocking this change from passing.

In D74759#1896100, @unnar wrote:

In D74759#1895748, @labath wrote:

That is a fair point. I suppose the reason why I see this as different is because this field is an implementation detail of the RangeDataVector class, and so the user should not even see it -- whereas the user has a legitimate reason to at least access the other fields (and most of the methods only provide read-only access to these fields).

I'm sorry, I haven't gotten around to looking at this patch today, but I thought I'd at least say that...

That is true. I am fine with changing it if that's the only thing that you see as blocking this change from passing.

I finally took a good look at the patch, and I think that is the only remaining question on my mind. Could you try implementing that to see what it looks like?

@teemperor, do you have any more thoughts on this?

lldb/include/lldb/Utility/RangeMap.h
822	The LLVM style guide does not recommend using `auto` in this situation. `Entry &entry` is one character longer, but it makes it clear what is going on.
842	`const Entry &` -- http://llvm.org/docs/CodingStandards.html#beware-unnecessary-copies-with-auto
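
The copy that the second inline comment warns about can be made visible with a counting type. This is a generic sketch with hypothetical names (`Counted`, `CopiesWithAuto`, `CopiesWithConstRef`), not code from the patch.

```cpp
#include <cassert>
#include <vector>

// Element type that counts how often it is copy-constructed.
struct Counted {
  static int copies;
  Counted() = default;
  Counted(const Counted &) { ++copies; }
};
int Counted::copies = 0;

// `auto e` deduces a value type, so each iteration copies the element.
inline int CopiesWithAuto(const std::vector<Counted> &v) {
  Counted::copies = 0;
  for (auto e : v)
    (void)e;
  return Counted::copies;
}

// Spelling out `const Counted &` binds a reference and copies nothing.
inline int CopiesWithConstRef(const std::vector<Counted> &v) {
  Counted::copies = 0;
  for (const Counted &e : v)
    (void)e;
  return Counted::copies;
}
```

Writing the type out also tells the reader at a glance whether the loop variable is a copy or a reference, which is the point of the first comment.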

As discussed with @labath I made a separate struct called AugmentedEntry that is used internally but we only ever expose the Entry part to the user.

I think this is looking very good now. Just a few polishing comments.

lldb/include/lldb/Utility/RangeMap.h
604	Add a short comment about the purpose of this class. Maybe you could just move the comment from the function `ComputeUpperBounds` to here?
608–609	looking at the usage, it would be simpler if this constructor just took a `const RangeData<B, S, T>&` argument, and then initialized the base class using its copy constructor.
626–627	and then here you'd do `m_entries.emplace_back(entry)`
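
The constructor change suggested in these comments might be sketched as follows. The types live in a made-up `sketch` namespace and are heavily simplified stand-ins for `RangeData` and `AugmentedRangeData`; only the shape of the constructor and the `emplace_back(entry)` call are the point.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

namespace sketch {

// Simplified stand-in for the user-visible entry.
template <typename B, typename S, typename T> struct RangeData {
  B base;
  S size;
  T data;
  RangeData(B b, S s, T d) : base(b), size(s), data(d) {}
};

// Internal entry: built from a RangeData via the base class copy constructor.
template <typename B, typename S, typename T>
struct AugmentedRangeData : public RangeData<B, S, T> {
  B upper_bound;

  AugmentedRangeData(const RangeData<B, S, T> &rd)
      : RangeData<B, S, T>(rd), upper_bound() {}
};

} // namespace sketch
```

With that constructor, appending reduces to `m_entries.emplace_back(entry)`, with no need to unpack `base`, `size` and `data` by hand.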

unnar updated this revision to Diff 247597. Mar 2 2020, 4:21 AM

unnar marked 3 inline comments as done.

Thanks for the feedback! I addressed all of your comments in the latest patch.

Awesome. Thanks for the patch. Do you need me to commit this for you?

This revision is now accepted and ready to land. Mar 2 2020, 7:45 AM

Yes please. :)

Closed by commit rG6304368818a1: [lldb] Treat RangeDataVector as an augmented binary search tree (authored by unnar, committed by labath). Mar 3 2020, 2:39 AM

This revision was automatically updated to reflect the committed changes.

labath mentioned this in D77568: Return correct entry in RangeDataVector::FindEntryThatContains. Apr 7 2020, 12:42 AM

Diff 247211

lldb/include/lldb/Utility/RangeMap.h

[595 lines not shown] struct RangeData : public Range<B, S> {

   RangeData() : Range<B, S>(), data() {}

   RangeData(B base, S size) : Range<B, S>(base, size), data() {}

   RangeData(B base, S size, DataType d) : Range<B, S>(base, size), data(d) {}
 };

+template <typename B, typename S, typename T>
+struct AugmentedRangeData : public RangeData<B, S, T> {
+  B upper_bound;
+
+  AugmentedRangeData(B base, S size, T data)
+      : RangeData<B, S, T>(base, size, data), upper_bound() {}
+};
+
 template <typename B, typename S, typename T, unsigned N = 0,
           class Compare = std::less<T>>
 class RangeDataVector {
 public:
   typedef lldb_private::Range<B, S> Range;
   typedef RangeData<B, S, T> Entry;
-  typedef llvm::SmallVector<Entry, N> Collection;
+  typedef AugmentedRangeData<B, S, T> AugmentedEntry;
+  typedef llvm::SmallVector<AugmentedEntry, N> Collection;

   RangeDataVector(Compare compare = Compare()) : m_compare(compare) {}

   ~RangeDataVector() = default;

-  void Append(const Entry &entry) { m_entries.push_back(entry); }
+  void Append(const Entry &entry) {
+    AugmentedEntry augmented_entry(entry.base, entry.size, entry.data);
+    m_entries.push_back(augmented_entry);
+  }

   void Sort() {
     if (m_entries.size() > 1)
       std::stable_sort(m_entries.begin(), m_entries.end(),
                        [&compare = m_compare](const Entry &a, const Entry &b) {
                          if (a.base != b.base)
                            return a.base < b.base;
                          if (a.size != b.size)
                            return a.size < b.size;
                          return compare(a.data, b.data);
                        });
+    if (!m_entries.empty())
+      ComputeUpperBounds(0, m_entries.size());
   }

 #ifdef ASSERT_RANGEMAP_ARE_SORTED
   bool IsSorted() const {
     typename Collection::const_iterator pos, end, prev;
-    // First we determine if we can combine any of the Entry objects so we
-    // don't end up allocating and making a new collection for no reason
     for (pos = m_entries.begin(), end = m_entries.end(), prev = end; pos != end;
          prev = pos++) {
       if (prev != end && pos < prev)
         return false;
     }
     return true;
   }
 #endif

   void CombineConsecutiveEntriesWithEqualData() {
 #ifdef ASSERT_RANGEMAP_ARE_SORTED
     assert(IsSorted());
 #endif
     typename Collection::iterator pos;

[47 lines not shown] #endif

   Entry &GetEntryRef(size_t i) { return m_entries[i]; }
   const Entry &GetEntryRef(size_t i) const { return m_entries[i]; }

   static bool BaseLessThan(const Entry &lhs, const Entry &rhs) {
     return lhs.GetRangeBase() < rhs.GetRangeBase();
   }

   uint32_t FindEntryIndexThatContains(B addr) const {
-    const Entry *entry = FindEntryThatContains(addr);
+    const AugmentedEntry *entry =
+        static_cast<const AugmentedEntry *>(FindEntryThatContains(addr));
     if (entry)
       return std::distance(m_entries.begin(), entry);
     return UINT32_MAX;
   }

-  uint32_t FindEntryIndexesThatContain(B addr,
-                                       std::vector<uint32_t> &indexes) const {
+  uint32_t FindEntryIndexesThatContain(B addr, std::vector<uint32_t> &indexes) {
 #ifdef ASSERT_RANGEMAP_ARE_SORTED
     assert(IsSorted());
 #endif
-    // Search the entries until the first entry that has a larger base address
-    // than `addr`. As m_entries is sorted by their base address, all following
-    // entries can't contain `addr` as their base address is already larger.
-    for (const auto &entry : m_entries) {
-      if (entry.Contains(addr))
-        indexes.push_back(entry.data);
-      else if (entry.GetRangeBase() > addr)
-        break;
-    }
+    if (!m_entries.empty())
+      FindEntryIndexesThatContain(addr, 0, m_entries.size(), indexes);
     return indexes.size();
   }

   Entry *FindEntryThatContains(B addr) {
     return const_cast<Entry *>(
         static_cast<const RangeDataVector *>(this)->FindEntryThatContains(
             addr));
   }

   const Entry *FindEntryThatContains(B addr) const {
     return FindEntryThatContains(Entry(addr, 1));
   }

[64 lines not shown] #endif

   const Entry *Back() const {
     return (m_entries.empty() ? nullptr : &m_entries.back());
   }

 protected:
   Collection m_entries;
   Compare m_compare;
+
+private:
+  // We can treat the vector as a flattened Binary Search Tree, augmenting it
+  // with upper bounds (max of range endpoints) for every index allows us to
+  // query for range containment quicker.
+  B ComputeUpperBounds(size_t lo, size_t hi) {
+    size_t mid = (lo + hi) / 2;
+    AugmentedEntry &entry = m_entries[mid];
+
+    entry.upper_bound = entry.base + entry.size;
+
+    if (lo < mid)
+      entry.upper_bound =
+          std::max(entry.upper_bound, ComputeUpperBounds(lo, mid));
+
+    if (mid + 1 < hi)
+      entry.upper_bound =
+          std::max(entry.upper_bound, ComputeUpperBounds(mid + 1, hi));
+
+    return entry.upper_bound;
+  }
+
+  // This is based on the augmented tree implementation found at
+  // https://en.wikipedia.org/wiki/Interval_tree#Augmented_tree
+  void FindEntryIndexesThatContain(B addr, size_t lo, size_t hi,
+                                   std::vector<uint32_t> &indexes) {
+    size_t mid = (lo + hi) / 2;
+    const AugmentedEntry &entry = m_entries[mid];
+
+    // addr is greater than the rightmost point of any interval below mid
+    // so there are cannot be any matches.
+    if (addr > entry.upper_bound)
+      return;
+
+    // Recursively search left subtree
+    if (lo < mid)
+      FindEntryIndexesThatContain(addr, lo, mid, indexes);
+
+    // If addr is smaller than the start of the current interval it
+    // cannot contain it nor can any of its right subtree.
+    if (addr < entry.base)
+      return;
+
+    if (entry.Contains(addr))
+      indexes.push_back(entry.data);
+
+    // Recursively search right subtree
+    if (mid + 1 < hi)
+      FindEntryIndexesThatContain(addr, mid + 1, hi, indexes);
+  }
 };

 // A simple range with data class where you get to define the type of
 // the range base "B", the type used for the range byte size "S", and the type
 // for the associated data "T".
 template <typename B, typename T> struct AddressData {
   typedef B BaseType;
   typedef T DataType;

[113 lines not shown]

This is an archive of the discontinued LLVM Phabricator instance.

Treat RangeDataVector as an augmented BST
ClosedPublic
