
Remove most of lldb's TaskPool in favor of llvm's parallel functions
Needs Review · Public

Authored by scott.smith on May 16 2017, 10:35 AM.



Remove the thread pool and for_each-like iteration functions.
Keep RunTasks, which has no analog in llvm::parallel, but implement it using llvm::parallel.


Event Timeline

scott.smith created this revision.May 16 2017, 10:35 AM
zturner added inline comments.May 16 2017, 7:31 PM

I'm not sure this is the most efficient implementation. std::function has pretty poor performance, and there might be no need to even convert everything to std::function to begin with. You could make this a bit better by using llvm::function_ref<void()> instead.

That said, I wonder if it's worth adding a function like this to llvm::TaskGroup? Then you could just enqueue all the tasks, rather than go through for_each_n. Not sure if there would be a difference in practice; what do you think?


What did you decide about the recursive parallelism? I don't know if that works yet using LLVM's default executor.

scott.smith added inline comments.May 16 2017, 7:45 PM

I'm not too worried about std::function vs llvm::function_ref; it isn't called often, and we still need allocations for the tasks that get enqueued. That said, there's no reason *to* use std::function, so I'll change it.

I like using for_each_n mostly to regularize the interface. For example, for_each_n/for_each can then optimize the type of TaskGroup it creates to ensure it gets the right number of threads right away, rather than spinning up enough threads for full hardware concurrency. Or, if there are a lot of tasks (unlikely, but possible), for_each can switch to a model of enqueueing one task per thread and having each thread loop, using a std::atomic to increment the iterator; that reduces allocations in TaskGroup and reduces lock contention (assuming TaskGroup doesn't use a lock-free queue).

i.e. the more things funnel through a single interface, the more we benefit from optimizing that one implementation.

Also it means we can have for_each_n manage TaskGroups itself (maybe keeping one around for repeated use, then creating more as needed to support recursion, etc (more on that later)).

  1. This code doesn't care.
  2. It looks like it works, since (I think) for_each creates a separate TaskGroup for each call.
  3. However, I got a deadlock when using this to parallelize the dynamic library loading itself, which used to work. That could be due to other code changes, an oversight on my part, or it could be that for_each_n doesn't actually support recursion, which would mean I misunderstood it. So I have more work to do...
scott.smith added inline comments.May 16 2017, 7:53 PM

Oh, and I wouldn't be surprised if there's a better home for this in llvm; I'm fine with moving it. I doubt it should go in llvm::parallel, since that namespace seems to be trying to mirror std::parallel. Note that for_each_n is incompatible anyway: it should take (Iter, Size) instead of (Type, Type), and it tries to dereference Iter, which means you can't pass in a plain number the way all the call sites do. I tried fixing that but failed because of the dereference assumption and the LLD dependency.

It's small enough that it does seem like it should fit under something else rather than be standalone; I'm open to suggestions.

scott.smith added inline comments.May 16 2017, 8:06 PM

On further inspection, llvm::parallel does not support recursion: TaskGroup uses a single static Executor and provides no way to override it (and besides, there's no way to pass parameters from for_each_n to the TaskGroup). That's fixable, though, by making the default executor a thread-local variable, so that worker threads can enqueue work to a different executor.