This is an archive of the discontinued LLVM Phabricator instance.

Remove most of lldb's TaskPool in favor of llvm's parallel functions
Needs ReviewPublic

Authored by scott.smith on May 16 2017, 10:35 AM.

Details

Reviewers
zturner
Summary

Remove the thread pool and for_each-like iteration functions.
Keep RunTasks, which has no analog in llvm::parallel, but implement it using llvm::parallel.

Diff Detail

Repository
rL LLVM

Event Timeline

scott.smith created this revision.May 16 2017, 10:35 AM
zturner added inline comments.May 16 2017, 7:31 PM
include/lldb/Utility/TaskPool.h
18–20

I'm not sure this is the most efficient implementation. std::function has pretty poor performance, and there might be no need to even convert everything to std::function to begin with. You could make this a bit better by using llvm::function_ref<void()> instead.

That said, I wonder if it's worth adding a function like this to llvm::TaskGroup? And you could just enqueue all the tasks, rather than for_each_n. Not sure if there would be a different in practice, what do you think?

source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
1995–1996

What did you decide about the recursive parallelism? I don't know if that works yet using LLVM's default executor.

scott.smith added inline comments.May 16 2017, 7:45 PM
include/lldb/Utility/TaskPool.h
18–20

I'm not too worried about std::function vs llvm::function_ref; it isn't called often, and we still need allocations for the tasks that get enqueued. That said, there's no reason *to* use std::function, so I'll cahnge it.

I like using for_each_n mostly to regularize the interface. For example, for_each_n/for_each can then optimize the type of TaskGroup it creates to ensure that it gets the right # of threads right away, rather than spawning up enough for full hardware concurrency. Or, if there are a lot of tasks (unlikely, but possible), then for_each can change to a model of enqueueing one task per thread, and having that thread loop using std::atomic to increment the iterator, which reduces allocations in TaskGroup and reduces lock contention (assuming TaskGroup doesn't use a lock free queue).

i.e. the more things funnel through a single interface, the more we benefit from optimizing that one implementation.

Also it means we can have for_each_n manage TaskGroups itself (maybe keeping one around for repeated use, then creating more as needed to support recursion, etc (more on that later)).

source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
1995–1996
  1. This code doesn't care.
  2. It looks like it works, since (I think) for_each creates a separate TaskGroup for each call.
  3. However I got a deadlock when using this for parallelizing the dynamic library loading itself, which used to work. That could either be due to other code changes, some oversight on my part, or it could be that for_each_n doesn't actually support recursion - which means that I misunderstood for_each_n. So I have more work to do...
scott.smith added inline comments.May 16 2017, 7:53 PM
include/lldb/Utility/TaskPool.h
18–20

oh, and I wouldn't be surprised if there's a better home in llvm. I'm fine with moving it. I doubt it should go in llvm::parallel, since that seems like it's trying to be similar to std::parallel (though note for_each_n is incompatible, since it should take Iter,Size instead of Type,Type, and it tries to dereference Iter, which means you can't pass in a number like all the callsites do. I tried fixing that but failed due to the deref assumption, and the LLD dependency).

It's small enough that it does seem like it should fit under something else rather than be standalone; I'm open to suggestions.

scott.smith added inline comments.May 16 2017, 8:06 PM
source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
1995–1996

On further inspection, llvm::parallel does not support recursion, since TaskGroup uses a single static Executor, and provides no way to override that (and besides, there's no way to pass parameters from for_each_n to the TaskGroup). That's fixable though, by making the default executor a thread local variable, so that worker threads can enqueue work to a different executor.

source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp