This is an archive of the discontinued LLVM Phabricator instance.

Differential D38433

Introduce a specialized data structure to be used in a subsequent change
Abandoned, Public

Authored by sanjoy on Sep 29 2017, 3:19 PM.

Details

Reviewers

chandlerc
rsmith

Summary

ChunkedList will be used to store use lists in ScalarEvolution.

Diff Detail

Build Status

Buildable 10706
Build 10706: arc lint + arc unit

Event Timeline

sanjoy created this revision. Sep 29 2017, 3:19 PM

Herald added subscribers: mgorny, mcrosier. Sep 29 2017, 3:19 PM

grandinj added a subscriber: grandinj. Oct 1 2017, 12:31 PM

Looks a fair bit like std::deque - what're the tradeoffs between the two?

include/llvm/ADT/ChunkedList.h
100–102	Any operator overload that can be a non-member generally should be, so that conversions on the LHS and RHS are handled the same way. (It can still be a friend, which allows an inline definition here if that's desired/convenient.)
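
To illustrate the point about symmetric conversions, here is a minimal sketch. `ConstIt` and `MutIt` are hypothetical stand-ins, not the iterator classes from the patch. Because `operator==` is a non-member friend, an implicit conversion can apply to either operand; with a member `operator==` on `ConstIt`, the expression `M == C` would not compile before C++20.

```cpp
#include <cassert>

// Hypothetical iterator-like wrappers, used only to illustrate the review
// comment; they are not the classes from the patch.
struct ConstIt {
  const int *Ptr;

  // Non-member friend defined inline: both operands are ordinary parameters,
  // so an implicit conversion may be applied to the left or the right side.
  friend bool operator==(const ConstIt &LHS, const ConstIt &RHS) {
    return LHS.Ptr == RHS.Ptr;
  }
};

struct MutIt {
  int *Ptr;

  // Implicit conversion from the mutable to the const flavour.
  operator ConstIt() const { return ConstIt{Ptr}; }
};

// Compares in both operand orders; each call converts the MutIt operand.
inline bool comparesBothWays(const MutIt &M, const ConstIt &C) {
  return (M == C) && (C == M);
}
```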

In D38433#892625, @dblaikie wrote:

Looks a fair bit like std::deque - what're the tradeoffs between the two?

I don't think this is like a std::deque -- it allows insertion only in one direction. It is probably most similar to std::forward_list<std::array<T, N>>, and frankly I'm not sure if I've gone overboard with implementing a new data structure from scratch. However:

Most of the logic is in iteration and figuring out when to add a new chunk to the linked list. This would still remain with std::forward_list<std::array<T, N>>.
I'd be less comfortable playing the pointer-offsetting tricks I've used to keep the data structure two words long if I were using std::forward_list.
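
The pointer-offsetting trick mentioned above can be sketched in isolation as follows. `ChunkSketch` and `chunkFromFirstValue` are hypothetical names that mirror the `Chunk` layout in the patch; the sketch assumes a standard-layout element type, since `offsetof` is only conditionally supported otherwise.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type mirroring the Chunk layout in the patch: a link to
// the previous chunk followed by N value slots.
template <typename T, std::size_t N> struct ChunkSketch {
  ChunkSketch *Prev;
  T Values[N];
};

// Recover the enclosing chunk from a pointer to its first value slot by
// subtracting the known offset of the Values member.
template <typename T, std::size_t N>
ChunkSketch<T, N> *chunkFromFirstValue(T *First) {
  using ChunkT = ChunkSketch<T, N>;
  char *Addr = reinterpret_cast<char *>(First) - offsetof(ChunkT, Values);
  return reinterpret_cast<ChunkT *>(Addr);
}
```

Storing only element pointers and recovering the chunk this way is what lets the container handle stay two words long, at the cost of code that is harder to audit.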

jbhateja added a subscriber: jbhateja. Oct 9 2017, 7:26 PM

Have you considered building a ChunkedVector instead of a ChunkedList? Specifically, there is a great trick where you use a single index with the low bits being an index into the chunk and the high bits being an index into a vector of pointers. It has many of the benefits you list and is a bit simpler I think. It also supports essentially the entire vector API if desired. Both bi-directional and even random access are reasonably efficient. Good locality, etc.

The only downside I can see is that it is possible to implement insertion into the middle of ChunkedList with bounded complexity, but with ChunkedVector it would still be linear. However, you don't seem to have implemented arbitrary insertion and so I'm hoping that isn't necessary.

If you go the route of ChunkedVector, I'd also suggest thinking about what a useful SSO would look like... At the least a SmallVector for the pointers to chunks, but maybe also the ability to put the first chunk inline? Might make it too big in essentially all cases.

My guess is that SmallVector<std::array<T, 4096/sizeof(T)>, N> (for some small N) is going to be hard to beat.
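
A minimal sketch of the single-index trick described above, under stated assumptions: `ChunkedVectorSketch` and its `ChunkBits` parameter are hypothetical names, `std::vector` stands in for the suggested SmallVector of chunk pointers, and `T` is assumed to be default-constructible and copy-assignable to keep the code short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical "chunked vector": not part of this patch or of LLVM.
// Each chunk holds 2^ChunkBits slots (ChunkBits is an assumed tuning knob).
template <typename T, unsigned ChunkBits = 4> class ChunkedVectorSketch {
  static constexpr std::size_t ChunkSize = std::size_t(1) << ChunkBits;
  static constexpr std::size_t ChunkMask = ChunkSize - 1;

  std::vector<std::unique_ptr<T[]>> Chunks; // table of chunk pointers
  std::size_t Size = 0;

public:
  void push_back(const T &V) {
    // Low bits all zero: the previous chunk is full (or there is none yet).
    if ((Size & ChunkMask) == 0)
      Chunks.push_back(std::make_unique<T[]>(ChunkSize));
    // High bits select the chunk, low bits select the slot inside it.
    Chunks[Size >> ChunkBits][Size & ChunkMask] = V;
    ++Size;
  }

  T &operator[](std::size_t I) {
    assert(I < Size && "Index out of range");
    return Chunks[I >> ChunkBits][I & ChunkMask];
  }

  std::size_t size() const { return Size; }
};
```

Growing the table of chunk pointers never moves the elements themselves, so references to stored values stay valid across `push_back`, and random access costs two loads plus a shift and a mask.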

include/llvm/ADT/ChunkedList.h
9–24	Move to a doxygen comment on the data structure?
62–65	swap?
74–80	The location of this comment doesn't make it easy to find, and it looks like it pertains to a using declaration, which doesn't make much sense. I'd suggest having a (doxygen) API comment and sinking the implementation details so they are attached to the code in question: either to the members or to the increment logic where this is implemented.

(Haven't addressed the code comments yet since the design isn't settled)

In D38433#893426, @chandlerc wrote:

Have you considered building a ChunkedVector instead of a ChunkedList? Specifically, there is a great trick where you use a single index with the low bits being an index into the chunk and the high bits being an index into a vector of pointers. It has many of the benefits you list and is a bit simpler I think. It also supports essentially the entire vector API if desired. Both bi-directional and even random access are reasonably efficient. Good locality, etc.

With a vector-of-buffers implementation, I'm a bit worried about the space overhead on the smaller cases. For instance, this is the histogram of how this data structure is populated from a clang-bootstrap (also in https://reviews.llvm.org/D38434):

     Count: 731310
       Min: 1
      Mean: 8.555150
50th %tile: 4
95th %tile: 25
99th %tile: 53
       Max: 433

If I used a vector-of-buffers, I will either have to recompute the capacity and end (of the last buffer) on every insert (which will require an additional deref and some computation) or have to keep two words in the data structure over the three that smallvector keeps anyway. This adds a lot of relative overhead on the median case (4 elements). In fact, the current situation of two extra words also qualifies as a "lot" of relative overhead IMO, and I want to think of an SSO to improve the situation.
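
As a rough worked example of the relative overhead, the following sketch counts pointer-sized words for the layout in this patch: two handle words (`End`, `Capacity`) plus, per chunk, one `Prev` word and `N` value slots. The function name is hypothetical, the chunk sizes used with it are assumptions (the review does not state which `N` ScalarEvolution would use), and allocator bookkeeping is ignored.

```cpp
#include <cassert>
#include <cstddef>

// Words used by a ChunkedList<T *, N> holding Count pointer-sized elements:
// 2 handle words + ceil(Count / N) chunks of (1 + N) words each.
constexpr std::size_t chunkedListWords(std::size_t Count, std::size_t N) {
  return 2 + ((Count + N - 1) / N) * (1 + N);
}
```

For the median case of 4 elements and an assumed `N` of 4, this gives 7 words for 4 words of payload, so the two handle words alone account for half of the payload size.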

The only downside I can see is that it is possible to implement insertion into the middle of ChunkedList with bounded complexity, but with ChunkedVector it would still be linear. However, you don't seem to have implemented arbitrary insertion and so I'm hoping that isn't necessary.

If you go the route of ChunkedVector, I'd also suggest thinking about what a useful SSO would look like... At the least a SmallVector for the pointers to chunks, but maybe also the ability to put the first chunk inline? Might make it too big in essentially all cases.

My guess is that SmallVector<std::array<T, 4096/sizeof(T)>, N> (for some small N) is going to be hard to beat.

Just to be clear, you mean SmallVector<std::array<T, 4096/sizeof(T)> *, N>, right? Btw, 4096 seems a bit high based on the histogram in https://reviews.llvm.org/D38434, but we can obviously lower it.

In D38433#893658, @sanjoy wrote:
If I used a vector-of-buffers, I will either have to recompute the capacity and end (of the last buffer) on every insert (which will require an additional deref and some computation) or have to keep two words in the data structure over the three that smallvector keeps anyway. This adds a lot of relative overhead on the median case (4 elements). In fact, the current situation of two extra words also qualifies as a "lot" of relative overhead IMO, and I want to think of an SSO to improve the situation.

Hold on, the objects here are just pointers? Then none of this really makes sense to me...

Chunked data structures seem to make the most sense if moving the objects is really expensive and/or the objects are really large.

For pointers, why not just a vector?

In D38433#893745, @chandlerc wrote:

For pointers, why not just a vector?

I wanted to use a ChunkedList to avoid "slack" in the allocated memory, since a vector can waste up to 1/2 the memory it currently has allocated. But I'll keep this complexity out of LLVM for now (as I've said above, I was already somewhat on the fence); we can pull this patch in later if needed.
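
The "up to 1/2" figure can be checked with a small model. The sketch below assumes a growth policy that starts at one slot and doubles whenever the buffer is full; real vector implementations choose their own growth factor, so the function names and the policy are illustrative assumptions only.

```cpp
#include <cassert>
#include <cstddef>

// Capacity reached after Count push_backs under the assumed policy:
// start at 1 slot and double whenever the buffer is full.
constexpr std::size_t doublingCapacity(std::size_t Count) {
  std::size_t Cap = 0;
  while (Cap < Count)
    Cap = Cap == 0 ? 1 : Cap * 2;
  return Cap;
}

// Allocated but unused slots ("slack") after Count push_backs.
constexpr std::size_t slackSlots(std::size_t Count) {
  return doublingCapacity(Count) - Count;
}
```

The worst case is right after a reallocation: with 33 elements the capacity is 64, so 31 of 64 slots sit unused, just under half of the allocation.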

sanjoy abandoned this revision. Jan 29 2022, 5:37 PM

Herald added a subscriber: bixia. Jan 29 2022, 5:37 PM

Revision Contents

Path	Size
include/llvm/ADT/ChunkedList.h	261 lines
unittests/ADT/CMakeLists.txt	1 line
unittests/ADT/ChunkedListTest.cpp	156 lines

Diff 117234

include/llvm/ADT/ChunkedList.h

This file was added.

				//===- llvm/ADT/ChunkedList.h - Chunked linked list -------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// A chunked list is a unidirectional linked list where each node in the list
				// contains an array of \c N values.
				//
				// Pros:
				//
				// - Constant (and low, depending on \c N) memory overhead: the amount of memory
				// consumed is a constant amount over the amount of memory needed.
				// - Fast insertion: O(1) in worst case
				//
				// Cons:
				//
				// - O(n) random access.
				// - Only forward iteration supported in LIFO order.
				// - O(n) deletion.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_ADT_CHUNKED_LIST_H
				#define LLVM_ADT_CHUNKED_LIST_H

				#include "llvm/ADT/iterator.h"
				#include "llvm/Support/Compiler.h"

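				#include <cassert>
				#include <cstddef>
				#include <cstdlib> // malloc, free
				#include <iterator>
				#include <new> // placement new
				#include <type_traits>
				#include <utility> // std::move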

				namespace llvm {

				template <typename T, size_t N> class ChunkedList {
				struct Chunk;

				public:
				using value_type = T;

				ChunkedList() = default;
				ChunkedList(const ChunkedList &Other) { copyFrom(Other); }
				ChunkedList(ChunkedList &&Other) : End(Other.End), Capacity(Other.Capacity) {
				Other.End = nullptr;
				Other.Capacity = nullptr;
				}

				~ChunkedList() { clear(); }

				ChunkedList &operator=(const ChunkedList &Other) {
				clear();
				copyFrom(Other);
				return *this;
				}

				ChunkedList &operator=(ChunkedList &&Other) {
				clear();
				End = Other.End;
				Other.End = nullptr;
				Capacity = Other.Capacity;
				Other.Capacity = nullptr;
				return *this;
				}

				void push_back(const T &V) { new (getLastLocation()) T(V); }
				void push_back(T &&V) { new (getLastLocation()) T(std::move(V)); }

				template <bool IsConst>
				class Iterator : public iterator_facade_base<Iterator<IsConst>,
				std::forward_iterator_tag, T> {
				// Implementation Note:
				//
				// Iterators traverse the ChunkedList in LIFO order. They maintain a
				// pointer to the current element, \c Current; and to the first element in
				// the chunk the current element belongs to, \c First. Once we've hit \c
				// First, we use the known offset of \c First within \c Chunk to get to the
				// containing \c Chunk instance, and retrieve the previous chunk.
				using ConstIterator = Iterator<true>;
				using MutableIterator = Iterator<false>;

				friend ConstIterator;
				friend MutableIterator;

				public:
				using value_type = typename std::conditional<IsConst, const T, T>::type;
				using pointer = value_type *;
				using reference = value_type &;
				using iterator_category = std::forward_iterator_tag;

				template <bool IsConstSrc,
				typename = typename std::enable_if<!IsConstSrc || IsConst>::type>
				Iterator(const Iterator<IsConstSrc> &Other)
				: Current(Other.Current), First(Other.First) {}

				reference operator*() const { return *Current; }

				bool operator==(const ConstIterator &RHS) const {
				assert((Current != RHS.First || First == RHS.First) && "Broken invariant");
				return Current == RHS.Current;
				}

				Iterator &operator++() { // Preincrement
				if (LLVM_UNLIKELY(Current == First)) {
				Chunk *PrevChunk = getPrevChunk();
				if (!PrevChunk) {
				Current = First = nullptr;
				return *this;
				}
				Current = &PrevChunk->Values[N];
				First = &PrevChunk->Values[0];
				}
				--Current;
				return *this;
				}

				private:
				T *Current;
				T *First;

				Iterator(T *Current, T *First) : Current(Current), First(First) {}

				Chunk *getPrevChunk() const {
				assert(First && "getPrevChunk() on empty list!");
				char *CurChunkAddress =
				reinterpret_cast<char *>(First) - offsetof(Chunk, Values);
				Chunk *CurChunk = reinterpret_cast<Chunk *>(CurChunkAddress);
				return CurChunk->Prev;
				}

				friend class ChunkedList<T, N>;
				};

				using iterator = Iterator<false>;
				using const_iterator = Iterator<true>;

				iterator begin() {
				if (End == nullptr)
				return iterator(nullptr, nullptr);

				// We never have this condition. End either points to one past the last
				// element in the full chunk or one past the first element in a chunk with
				// only one element.
				assert(End != &getTailChunk()->Values[0] && "Broken invariant!");
				return iterator(End - 1, &getTailChunk()->Values[0]);
				}
				iterator end() { return iterator(nullptr, nullptr); }

				const_iterator begin() const {
				return const_cast<ChunkedList<T, N> *>(this)->begin();
				}
				const_iterator end() const {
				return const_cast<ChunkedList<T, N> *>(this)->end();
				}

				bool empty() const { return End == nullptr; }

				private:
				// Implementation Note:
				//
				// \c End points to one past the last inserted element and \c Capacity points
				// to one past the last slot in the current chunk. When \c End meets
				// \c Capacity and we need to create a new Chunk, we use \c Capacity to find
				// the pointer to the current \c Chunk instance (to store as \c Prev in the
				// new chunk).
				struct Chunk {
				Chunk *Prev;
				T Values[N];
				};

				T *End = nullptr;
				T *Capacity = nullptr;

				Chunk *getTailChunk() const {
				if (Capacity)
				return ChunkedList<T, N>::getTailChunkFromCapacityPtr(Capacity);
				return nullptr;
				}

				Chunk *mallocChunk() const {
				return static_cast<Chunk *>(malloc(sizeof(Chunk)));
				}

				T *getLastLocation() {
				if (LLVM_UNLIKELY(End == Capacity)) {
				auto *NewChunk = mallocChunk();
				NewChunk->Prev = getTailChunk();
				End = &NewChunk->Values[0];
				Capacity = &NewChunk->Values[N];
				}
				return End++;
				}

				void clear() {
				if (End == nullptr) {
				assert(Capacity == nullptr && "Broken invariant!");
				return;
				}

				Chunk *TailChunk = getTailChunk();
				Chunk *CurrentChunk = TailChunk->Prev;

				for (T *I = &TailChunk->Values[0], *E = End; I != E; ++I)
				I->~T();
				free(TailChunk);

				while (CurrentChunk) {
				Chunk *PrevChunk = CurrentChunk->Prev;
				for (size_t i = 0; i < N; i++)
				CurrentChunk->Values[i].~T();
				free(CurrentChunk);
				CurrentChunk = PrevChunk;
				}
				End = Capacity = nullptr;
				}

				void copyFrom(const ChunkedList &Other) {
				if (Other.End == nullptr) {
				assert(Other.Capacity == nullptr && "Broken invariant!");
				return;
				}

				assert(End == nullptr && Capacity == nullptr &&
				"Call clear() before calling copyFrom!");

				Chunk *OtherChunk = Other.getTailChunk();
				Chunk *MyChunk = mallocChunk();

				size_t TailCount = Other.End - &OtherChunk->Values[0];
				for (size_t i = 0; i < TailCount; i++) {
				new (&MyChunk->Values[i]) T(OtherChunk->Values[i]);
				}

				End = &MyChunk->Values[TailCount];
				Capacity = &MyChunk->Values[N];

				while (OtherChunk->Prev) {
				OtherChunk = OtherChunk->Prev;
				MyChunk->Prev = mallocChunk();
				MyChunk = MyChunk->Prev;

				for (size_t i = 0; i < N; i++)
				new (&MyChunk->Values[i]) T(OtherChunk->Values[i]);
				}

				MyChunk->Prev = nullptr;
				}

				static Chunk *getTailChunkFromCapacityPtr(T *CapacityPtr) {
				assert(CapacityPtr);
				size_t TotalOffset = offsetof(Chunk, Values) + N * sizeof(T);
				char *TailChunkAddr = reinterpret_cast<char *>(CapacityPtr) - TotalOffset;
				return reinterpret_cast<Chunk *>(TailChunkAddr);
				}

				static_assert(N > 0, "");
				};
				} // namespace llvm

				#endif // LLVM_ADT_CHUNKED_LIST_H

unittests/ADT/CMakeLists.txt

	set(LLVM_LINK_COMPONENTS
	Support
	)

	set(ADTSources
	APFloatTest.cpp
	APIntTest.cpp
	APSIntTest.cpp
	ArrayRefTest.cpp
	BitmaskEnumTest.cpp
	BitVectorTest.cpp
	BreadthFirstIteratorTest.cpp
	BumpPtrListTest.cpp
+	ChunkedListTest.cpp
	DAGDeltaAlgorithmTest.cpp
	DeltaAlgorithmTest.cpp
	DenseMapTest.cpp
	DenseSetTest.cpp
	DepthFirstIteratorTest.cpp
	FoldingSet.cpp
	FunctionRefTest.cpp
	HashingTest.cpp
	...

unittests/ADT/ChunkedListTest.cpp

This file was added.

				//===- llvm/unittest/ADT/ChunkedListTest.cpp - ChunkedList unit tests -----===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#include <initializer_list>
				#include <string>

				#include "llvm/ADT/ChunkedList.h"
				#include "llvm/ADT/SmallVector.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/raw_ostream.h"
				#include "gtest/gtest.h"

				using namespace llvm;

				TEST(ChunkedListTest, withinFirstChunk) {
				ChunkedList<std::string, 3> CL;
				for (std::string S : {"1", "2", "3"})
				CL.push_back(S);

				int Counter = 3;
				for (std::string C : CL) {
				EXPECT_EQ(std::to_string(Counter), C);
				Counter--;
				}
				}

				TEST(ChunkedListTest, inThreeChunks) {
				ChunkedList<std::string, 3> CL;
				for (std::string S : {"1", "2", "3", "4", "5", "6", "7"})
				CL.push_back(S);

				int Counter = 7;
				for (std::string C : CL) {
				EXPECT_EQ(std::to_string(Counter), C);
				Counter--;
				}
				}

				TEST(ChunkedListTest, empty) {
				ChunkedList<std::string, 3> CL;

				for (std::string C : CL) {
				(void)C;
				ADD_FAILURE() << "List should have been empty!";
				}

				EXPECT_TRUE(CL.empty());
				}

				TEST(ChunkedListTest, moveSemantics) {
				ChunkedList<std::string, 3> SourceCL;

				for (std::string i : {"1", "2", "3", "4"})
				SourceCL.push_back(i);

				ChunkedList<std::string, 3> DestCL(std::move(SourceCL));

				int Counter = 4;
				for (std::string C : DestCL) {
				EXPECT_EQ(std::to_string(Counter), C);
				Counter--;
				}

				for (std::string C : SourceCL) {
				(void)C;
				ADD_FAILURE() << "List should have been empty!";
				}

				EXPECT_TRUE(SourceCL.empty());
				}

				static void checkValues(const ChunkedList<std::string, 3> &SourceCL,
				const ChunkedList<std::string, 3> &DestCL,
				const SmallVector<std::string, 3> &Values) {
				int Idx = Values.size() - 1;
				for (std::string V : DestCL) {
				EXPECT_EQ(V, Values[Idx]);
				Idx--;
				}

				Idx = Values.size() - 1;
				for (std::string V : SourceCL) {
				EXPECT_EQ(V, Values[Idx]);
				Idx--;
				}

				EXPECT_EQ(Values.empty(), SourceCL.empty());
				EXPECT_EQ(Values.empty(), DestCL.empty());
				}

				static void copySemanticsTest(const SmallVector<std::string, 3> &Values) {
				ChunkedList<std::string, 3> SourceCL;

				for (std::string i : Values)
				SourceCL.push_back(i);

				{
				ChunkedList<std::string, 3> DestCLViaCopyConstructor(SourceCL);
				checkValues(SourceCL, DestCLViaCopyConstructor, Values);
				}
				{
				ChunkedList<std::string, 3> DestCLViaOperatorEquals;
				DestCLViaOperatorEquals = SourceCL;
				checkValues(SourceCL, DestCLViaOperatorEquals, Values);
				}
				}

				TEST(ChunkedListTest, copySemanticsTwoChunks) {
				copySemanticsTest({"1", "2", "3", "4"});
				}

				TEST(ChunkedListTest, copySemanticsOneChunkA) {
				copySemanticsTest({"1", "2", "3"});
				}

				TEST(ChunkedListTest, copySemanticsOneChunkB) {
				copySemanticsTest({"1", "2", "3"});
				}

				TEST(ChunkedListTest, copySemanticsEmpty) { copySemanticsTest({}); }

				struct StructWithDestructor {
				int *Counter;

				~StructWithDestructor() {
				EXPECT_NE(Counter, nullptr);
				(*Counter)++;
				Counter = nullptr;
				}
				};

				static void destructorTest(int Count) {
				int DestructorCounter = 0;
				{
				ChunkedList<StructWithDestructor, 3> CL;
				for (int i = 0; i < Count; i++)
				CL.push_back(StructWithDestructor({&DestructorCounter}));
				DestructorCounter = 0;
				}

				EXPECT_EQ(Count, DestructorCounter);
				}

				TEST(ChunkedListTest, destructorTestOneChunkA) { destructorTest(2); }

				TEST(ChunkedListTest, destructorTestOneChunkB) { destructorTest(3); }

				TEST(ChunkedListTest, destructorTestTwoChunks) { destructorTest(4); }

				TEST(ChunkedListTest, destructorTestEmpty) { destructorTest(0); }