This is an archive of the discontinued LLVM Phabricator instance.

Try again to implement a FIFO task queue
ClosedPublic

Authored by zturner on Jun 15 2018, 2:52 PM.

Details

Summary

In D48115 I attempted to modify ThreadPool so that it could support tasks that returned values. I got this working, but in doing so I added a test that relied on sequential execution of tasks that run on the same thread, and this uncovered some problems in my approach.

What I *really* wanted was to create a ThreadPool with a single thread, queue up asynchronous work to it, and then assume that the work would be completed sequentially. After having some difficulty making these assumptions consistent with those of the ThreadPool class, we decided that ThreadPool ultimately isn't the right place for that, and that we should have a separate class explicitly for this purpose. So that patch was reverted.

Enter TaskQueue. A TaskQueue consists of exactly one thread that always executes its work first-in, first-out. When you queue asynchronous work to it, it returns a future that can be waited on. When the wait returns, you are guaranteed that the work you waited on, as well as all previous work, has been completed.

Getting the LLVM_ENABLE_THREADS=0 case working is quite a tricky job. There may be a better solution than what I've come up with here, but this is the only way I could get it to work. When LLVM_ENABLE_THREADS=1, the problem is easy: just add the work to a queue, and process the queue on a single blessed background thread. When LLVM_ENABLE_THREADS=0, however, you have to consider the case where a user queues up multiple asynchronous tasks and then waits on them out of order. So if you queue F1, F2, and F3, and wait on F3, F3 has to manually invoke F2's work, which has to manually invoke F1's work. The implementations of LLVM_ENABLE_THREADS and !LLVM_ENABLE_THREADS therefore diverge somewhat, but I couldn't find any other way to make this work.
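To make the out-of-order wait concrete, here is a minimal, self-contained sketch of the deferred-future idea (hypothetical names and code, not the actual patch): each future, when waited on, first drains every earlier task that hasn't run yet.

#include <deque>
#include <functional>
#include <future>
#include <iostream>

// Hypothetical single-threaded queue: async() returns a deferred future that,
// when waited on, runs every not-yet-run task up to and including its own.
class SingleThreadedQueueSketch {
  std::deque<std::function<void()>> Tasks;
  size_t NextToRun = 0;

public:
  std::future<void> async(std::function<void()> Work) {
    Tasks.push_back(std::move(Work));
    size_t MyIndex = Tasks.size() - 1;
    // std::launch::deferred: nothing runs until someone waits on the future.
    return std::async(std::launch::deferred, [this, MyIndex] {
      while (NextToRun <= MyIndex)
        Tasks[NextToRun++](); // FIFO: earlier tasks always run first
    });
  }
};

int main() {
  SingleThreadedQueueSketch Q;
  auto F1 = Q.async([] { std::cout << "task 1\n"; });
  auto F2 = Q.async([] { std::cout << "task 2\n"; });
  auto F3 = Q.async([] { std::cout << "task 3\n"; });
  F3.wait(); // runs tasks 1, 2, and 3, in order, on this thread
  F1.wait(); // already run; returns immediately
  F2.wait(); // likewise
}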

Note that my final goal (supporting tasks that return a value) is still unimplemented in this patch. I plan to do that in a followup, as there are some additional complexities that need to be taken into consideration, mostly in the LLVM_ENABLE_THREADS=0 branch.

Diff Detail

Event Timeline

zturner created this revision. Jun 15 2018, 2:52 PM
zturner updated this revision to Diff 151590. Jun 15 2018, 5:09 PM

Added two more tests, the first of which exposed a bug in the wait() function. All the wait() function was doing was joining the work thread, but this is incorrect. What it's supposed to do is simply drain the queue, i.e. keep processing work until the queue is empty; it's not supposed to join the thread. Joining the thread didn't work anyway, since we weren't setting the EnableFlag to false, so this method was deadlocking.
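For illustration, here is a minimal sketch of the corrected semantics (hypothetical names, not the patch's code): wait() blocks until the queue is drained and nothing is executing, and never joins the worker thread.

#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>

struct DrainableQueueSketch {
  std::deque<std::function<void()>> Tasks;
  bool ActiveTask = false;
  std::mutex QueueLock;
  std::condition_variable CompletionCondition;

  // Block until no tasks are queued and none is executing; the worker thread
  // (not shown) stays alive afterward so more work can be queued.
  void wait() {
    std::unique_lock<std::mutex> Lock(QueueLock);
    CompletionCondition.wait(
        Lock, [this] { return Tasks.empty() && !ActiveTask; });
  }
};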

To fix this I had to borrow some more code from ThreadPool, but all tests now pass both with LLVM_ENABLE_THREADS=0 and LLVM_ENABLE_THREADS=1.

zturner updated this revision to Diff 151592. Jun 15 2018, 5:15 PM

Made the test cases a little stronger. Before, I was simply assigning to the local variables in the test case; now I increment them. This way the test will fail if a task ever gets executed more than once, whereas before it would pass, since the task would just re-assign the same value.

Before I dig too deeply into the code, I want to better understand the underlying goal...

The core primitive here is a *serialized* task queue from what I understand...

Is it important that it starts a separate thread and eagerly (as opposed to lazily) processes the enqueued tasks?

If not, what about making this a lazy system and a more literal work queue?

If so, would it be a problem for the thread(s) used to be shared with other work at times (provided the synchronized aspect is preserved)?

Before I dig too deeply into the code, I want to better understand the underlying goal...

The core primitive here is a *serialized* task queue from what I understand...

Is it important that it starts a separate thread and eagerly (as opposed to lazily) processes the enqueued tasks?

Yes

If so, would it be a problem for the thread(s) used to be shared with other work at times (provided the synchronized aspect is preserved)?

Depends on what you mean by shared. Certainly you can queue up any work you want here, so if your jobs are A, C, and E, and someone else's jobs are B, D, and F, then queueing up A, B, C, D, E, F should work just fine.

The motivation for this: imagine you're writing an application that segments its work onto different threads. For example, you might have a UI thread, a file I/O thread, and a network thread. Work that changes the state of a UI element has to happen on the UI thread, work that physically waits on a socket has to happen on the network thread, and so on. This is a pretty common architecture for certain types of applications.

Another example might be a debugger which is debugging N processes simultaneously, where, due to various OS restrictions, each of these N processes is being debugged by a separate thread. The application controlling all this would like to not have to deal with synchronization, but if each of these threads just said "wait for something to happen, then invoke a callback", the callback would happen on an arbitrary thread context and the user would be responsible for synchronizing all of it. It's nice to be able to guarantee something like "the callback always gets invoked on the same thread" to simplify the user's life.

LMK if this makes sense.

Another example might be a debugger which is debugging N processes simultaneously, where, due to various OS restrictions, each of these N processes is being debugged by a separate thread.

We've had something similar to this in lldb (it was called ThreadStateCoordinator), but it was removed when we made the whole lldb-server process single-threaded.

llvm/include/llvm/Support/TaskQueue.h
42–44

Are all of these really needed? The only use of ParentFuture I see is in the lambda in ThisFuture, and that one already has it captured. ThisTask is not captured by that lambda, but it could easily be, at which point the whole Task struct could just become std::shared_future<void>.

llvm/lib/Support/TaskQueue.cpp
80

The assert message could be updated to reflect the new home.

103–108

This could be implemented as Tasks.back().wait(), avoiding the direct access into the ThisTask function and making sure the tasks are always invoked the same way.

llvm/unittests/Support/TaskQueueTest.cpp
65–71

I'm not sure if it's necessary, but if you wanted to have the test be uniform for both kinds of builds, you could replace MainThreadReady with an integer variable and then have the tasks wait until that variable is >= some_value.

zturner added inline comments. Jun 18 2018, 4:38 PM
llvm/include/llvm/Support/TaskQueue.h
42–44

I was able to remove them as you suggested in this patch, but things get trickier in the follow-up patch where I extend this to allow the queued-up tasks to return values. Regardless, since these are independent patches, I'm uploading a new version that has the suggested changes, and I can try to make the other one use a similar model separately.

zturner updated this revision to Diff 151817. Jun 18 2018, 4:39 PM

Some suggestions about the way you package things and manage std::future stuff below.

Some higher level comments here...

The term I think you're looking for is "synchronized" or "serial". So I would call this a SynchronizedTaskQueue or SerializedTaskQueue. Given the other use of the term "serialized", I'd lean towards Synchronized or Sync if that's too long to type. Ultimately, the name needs to make it clear that it enforces that the tasks scheduled here execute without overlap.

I also wonder whether you want to decouple this serialized execution from the thread creation and management. Could you accept a thread pool instead, and enqueue work on the thread pool? You can do this by having each task on the thread pool, on completion, check the queue for more work, and if there is some, schedule a new task on the thread pool.

The reason why this is interesting is that if you have, say, a system thread pool that gets a lot of other work, you can intersperse these tasks on it without overloading the system. It also allows the client of this class to dictate which threads are allowable threads for these tasks to run on, by providing a suitable thread pool. My hope is that, worst case, creating a one-thread thread pool and handing it to this class would be equivalent to what you have here, but with a cleaner separation of roles. I understand it would add some complexity; for example, you'd have to handle the case where no work is currently in flight when asyncImpl finishes adding the task to the queue, and start a new batch. But this seems reasonably tolerable.
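A rough sketch of that chaining scheme (illustrative names, and it assumes the pool's submit hook enqueues asynchronously rather than running work inline): the queue owns no thread, hands at most one task at a time to the pool, and each finished task schedules the next.

#include <deque>
#include <functional>
#include <mutex>

class SerialQueueSketch {
  std::function<void(std::function<void()>)> Submit; // e.g. wraps a pool
  std::deque<std::function<void()>> Pending;
  bool InFlight = false;
  std::mutex M;

  void scheduleNextLocked() {
    if (Pending.empty()) {
      InFlight = false; // batch is finished
      return;
    }
    InFlight = true;
    std::function<void()> Work = std::move(Pending.front());
    Pending.pop_front();
    Submit([this, Work] {
      Work(); // runs on some pool thread, never overlapping another task
      std::lock_guard<std::mutex> L(M);
      scheduleNextLocked(); // chain the next task, if any
    });
  }

public:
  explicit SerialQueueSketch(std::function<void(std::function<void()>)> S)
      : Submit(std::move(S)) {}

  void async(std::function<void()> Work) {
    std::lock_guard<std::mutex> L(M);
    Pending.push_back(std::move(Work));
    if (!InFlight)
      scheduleNextLocked(); // no batch in flight: start one
  }
};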

llvm/include/llvm/Support/TaskQueue.h
36–42

Rather than these types, I would suggest you build your task more like the following:

template <typename Callable>
std::future<void> async(Callable &&C) {
  // Because we don't have init capture to use move-only local variables that
  // are captured into a lambda, we create the promise inside an explicit callable
  // struct. We want to do as much of the wrapping in the type-specialized domain
  // (before type erasure) and then erase this into a std::function.
  struct Task {
    Callable C;
    std::promise<void> P;

    void operator()() noexcept {
      // Note, we don't catch exceptions and store them in the promise here because
      // we don't support exception handling in LLVM.
      C();
      P.set_value();
    }
  };

  Task T = {std::forward<Callable>(C)};
  std::future<void> F = T.P.get_future();
  asyncImpl(std::move(T));
  return F;
}

This is somewhat similar to what std::packaged_task does, but due to how std::packaged_task works, this will probably end up a bit lighter weight (specifically, std::packaged_task type erases the callable separately from the promise, while this packages the callable and the promise together ahead of time and then type erases the combination afterward).

The end result is that your task queue operates in a super simple closure type: std::function<void()>. This will be true even when you produce a value from the callable and store it into the promise. And this should also work even when you have no threading support at all. std::shared_future is a lot more expensive ultimately than std::future.

Does all of this make sense?

And FWIW, this is where C++14 is actually helpful, as there the above goop looks a lot cleaner:

template <typename Callable>
std::future<void> async(Callable &&C) {
  std::promise<void> P;
  std::future<void> F = P.get_future();
  asyncImpl([C = std::forward<Callable>(C), P = std::move(P)]() mutable {
      // Note, we don't catch exceptions and store them in the promise here because
      // we don't support exception handling in LLVM.
      C();
      P.set_value();
  });

  return F;
}
50–58

I regularly find std::bind to be problematic and hard to understand. Rather than expose this at all, I would force clients to handle it by passing you a lambda (or a std::bind if they so choose) that handles all of the argument binding. Then you just have a callable (the overload below).

63–64

I would term this a Callable as it may not be a function at all. The most common case I suspect will be a lambda.

I also wonder whether you want to decouple this serialized execution from the thread creation and management. Could you accept a thread pool instead, and enqueue work on the thread pool? [...]

Isn't using multiple threads strictly incompatible with the idea of it being serialized or synchronized? With multiple threads, you will have work overlapping, which confuses the purpose of what this is for. If someone comes up with a use case for it, then maybe, but otherwise, YAGNI?

I like your other suggestions (which I snipped from the quote), I'll try to make it work and report back if I can't.

One thing that's annoying is that you can't call promise.set_value(V) when V is void. Bleh.

I also wonder whether you want to decouple this serialized execution from the thread creation and management. [...]

Isn't using multiple threads strictly incompatible with the idea of it being serialized or synchronized? [...]

It isn't about *using* multiple threads, it's about selecting which threads run things.

Whether tasks are serialized in their execution is independent of how many threads they run on. It's just that there doesn't *need* to be more than one if they are serialized.

But a system thread pool might expose the equivalent of <1 thread.

Think of this example. You have three different systems, each of which wants to use this infrastructure to run a series of asynchronous tasks, but for each system the series of tasks must run in order (FIFO) and without overlap. You can re-use a single thread pool with potentially fewer than three threads for all three of these systems. This allows the client to control the rate of work -- perhaps only allowing at most two operations to happen concurrently.

While with three systems and two threads this seems somewhat odd, if you have *many* systems (maybe 100s) all using this to coordinate a series of tasks, you might not want hundreds of threads, all spinning on their own queues. Instead, it seems nicer to have a single thread pool, sized appropriately for the amount of resources on the system, and have all of the users of this class enqueue their serial sequences of tasks into this shared thread pool.

I like your other suggestions (which I snipped from the quote), I'll try to make it work and report back if I can't.

Yeah, the intent was that they're totally orthogonal.

One thing that's annoying is that you can't call promise.set_value(V) when V is void. Bleh.

OMG YES! =[ I wish I knew a better way than horrible specialization on void that has different code.
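For what it's worth, the usual shape of that workaround looks something like this (an illustrative sketch with made-up names, assuming the callable's result type is known as T): route set_value through a small helper so that only the helper is specialized on void.

#include <future>

template <typename T> struct PromiseSetter {
  template <typename Callable>
  static void apply(std::promise<T> &P, Callable &C) {
    P.set_value(C()); // non-void: store the result directly
  }
};

template <> struct PromiseSetter<void> {
  template <typename Callable>
  static void apply(std::promise<void> &P, Callable &C) {
    C(); // void: run the callable, then signal completion with no value
    P.set_value();
  }
};

Callers then write PromiseSetter<ResultT>::apply(P, C) in one place instead of branching on void everywhere.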

I also wonder whether you want to decouple this serialized execution from the thread creation and management. [...]

While with three systems and two threads this seems somewhat odd, if you have *many* systems (maybe 100s) all using this to coordinate a series of tasks, you might not want hundreds of threads, all spinning on their own queues. [...]

Would you be willing to allow that as a followup patch? I want to get the single-threaded case working with this patch, plus the extension to allow tasks that return values, and then maybe add that on in a followup. (I'm not punting on it either; I actually will do it, but as you said it seems orthogonal, so it maybe makes sense to do in a followup.)

zturner added a comment (edited). Jun 18 2018, 8:42 PM

I can't get your suggestion to work; it's very bizarre. It's claiming that I can't even *move*-construct a std::function because Task has a deleted copy constructor.

See for yourself here:

https://godbolt.org/g/x1eCnX

Any ideas?

Hmm... http://en.cppreference.com/w/cpp/utility/functional/function/function

Type requirements:
F must meet the requirements of Callable and CopyConstructible.

Yuck. Am I out of luck with this method? I could make it a shared_ptr<promise<void>>, but...
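For completeness, a sketch of that shared_ptr workaround (hypothetical helper, not proposed for the patch): holding the promise behind a shared_ptr makes the wrapper copyable, so std::function will accept it, at the cost of an allocation.

#include <functional>
#include <future>
#include <memory>
#include <utility>

template <typename Callable>
std::pair<std::function<void()>, std::future<void>>
makeCopyableTask(Callable C) {
  auto P = std::make_shared<std::promise<void>>();
  std::future<void> F = P->get_future();
  // The lambda copies the shared_ptr, not the promise itself, so the
  // resulting closure satisfies std::function's CopyConstructible requirement.
  std::function<void()> Fn = [P, C]() mutable {
    C();
    P->set_value();
  };
  return std::make_pair(std::move(Fn), std::move(F));
}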

Yeah, I'm not really able to get anything working using the suggested method of the local class. The Task class here doesn't have any logic to wait on the parent task to complete, and furthermore we just return a future from a promise without using async with deferred launch, so waiting on the future doesn't run the function. All of this is fixable, but ultimately you just end up getting pretty close to what I already have, I think.

zturner updated this revision to Diff 152013. Jun 19 2018, 7:03 PM

Updated based on suggestions from Chandler. I don't have the LLVM_ENABLE_THREADS=0 case working yet, but wanted to make sure this is generally what you had in mind.

I also removed the wait() function entirely, as it's a little more difficult to implement now. I can't just wait on the ThreadPool, because there could in theory be other work on the thread pool, so waiting on the task queue is not the same as waiting on the thread pool. I don't have an immediate need for this functionality, so unless someone else does, YAGNI.

zturner updated this revision to Diff 152018. Jun 19 2018, 9:02 PM

This gets the LLVM_ENABLE_THREADS=0 case working. It does so by taking advantage of the pointer stability of std::deque (pointers to existing elements are not invalidated by insertions at the ends), so I can save off a pointer to the newly inserted element and just run tasks until the pointers match.

Really nice approach for THREADS=0 w/ the deferred future that checks that "enough" of the queue has been processed. I like it.

If you want to avoid the address stability stuff, you could always keep an integer next to the things in the queue and track the integers...

I want to think a bit harder to understand if using the address creates an A-B-A problem. I feel like it does, but my brain is exhausted and not able to pin it down yet.

Unrelated comments inline....

llvm/include/llvm/Support/TaskQueue.h
32–33

Documentation needs updating based on the new design.

47

Assert that the queue is empty at this point?

Also, rather than waiting on the entire thread pool, maybe enqueue a no-op task into the queue and wait on its future? We don't need the underlying executor to finish, just the queue to empty.

65–68

This doesn't seem right...

Only if there are *no* tasks in flight do you want to move this into the scheduler's async. And when you do, you want to set tasks in flight to true.

89

Add a FIXME about the std::function issue? D48349 has unique_function to eventually fix this.

96

FWIW, the copy of std::function (and the eventual move of llvm::unique_function) won't be free. It may be worthwhile to keep this in a non-type-erased state by making CreateTask here a lambda within the async function above. =/ Just something to think about.

Really nice approach for THREADS=0 w/ the deferred future that checks that "enough" of the queue has been processed. I like it.

If you want to avoid the address stability stuff, you could always keep an integer next to the things in the queue and track the integers...

I want to think a bit harder to understand if using the address creates an A-B-A problem. I feel like it does, but my brain is exhausted and not able to pin it down yet.

I don't think it does, because in the LLVM_ENABLE_THREADS=0 case, there's literally only one thread in the entire application.

zturner updated this revision to Diff 152208. Jun 20 2018, 5:48 PM

Updated with suggestions from Chandler. I'm not sure why all my tests passed even with the bug that he noticed. Would be nice if there were a way to write a test for that case, but I'm not sure how. I also made the suggested cleanup to eliminate the extra cost of copying the task. This was a little annoying, because I had to go back to using the callable object instead of the lambda so that I could get move semantics for the callable. But this means I can't easily capture this, so I had to move the callable up to the class level and make it a friend, giving it access to TaskQueue's internal state. Anyway, it all works now.

Updated with suggestions from Chandler. I'm not sure why all my tests passed even with the bug that he noticed. Would be nice if there were a way to write a test for that case, but I'm not sure how.

I think what happened was that all tasks were inserted into the scheduler unconditionally. But, since that was a one-thread scheduler which happens to process tasks in sequence, everything worked as it should.

It would be interesting to have a test with a scheduler with at least two threads which verifies that the tasks are still processed serially. You could have the first task block, and then validate that the second task hasn't started.
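The shape of such a test might look like this (illustrative only; the ThreadPool and TaskQueue constructors here are assumptions, not the real API):

TEST(TaskQueueTest, SerialEvenWithTwoWorkerThreads) {
  ThreadPool TP(2);  // assumed: a scheduler with two worker threads
  TaskQueue TQ(TP);  // assumed: a queue that schedules onto that pool
  std::promise<void> Gate;
  std::atomic<bool> SecondRan(false);
  std::future<void> F1 = TQ.async([&Gate] { Gate.get_future().wait(); });
  std::future<void> F2 = TQ.async([&SecondRan] { SecondRan = true; });
  // Task 1 is blocked on Gate; if tasks were allowed to overlap, task 2
  // could be running on the second pool thread by now.
  EXPECT_FALSE(SecondRan);
  Gate.set_value();
  F2.wait();
  EXPECT_TRUE(SecondRan);
}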

llvm/include/llvm/Support/TaskQueue.h
38

Task uses different tags here (class) and in the definition (struct). Also, I think this friend declaration isn't really needed as Task should have access to TaskQueue internals anyway.

53–67

Maybe this part could be moved to a function in the TaskQueue class? It means less code would need to be duplicated between different Task instantiations, and the management of TaskQueue internal data structures would stay within that class.

132–133

Here you're using NextPtr after you have popped the element it points to off the queue. I think this is technically UB. Instead of that, you could just have a bool Done variable which you set at the beginning of the loop.

After that ABA should not be a problem since you're only ever using the ElementPtr pointer while it is live.
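A sketch of that fix (illustrative names): compute Done while the front element is still alive, instead of comparing against a pointer after the pop.

#include <deque>
#include <functional>

static void runUntil(std::deque<std::function<void()>> &Tasks,
                     const std::function<void()> *ElementPtr) {
  while (!Tasks.empty()) {
    // Inspect the element while it is still live; no dangling-pointer use.
    bool Done = &Tasks.front() == ElementPtr;
    std::function<void()> Work = std::move(Tasks.front());
    Tasks.pop_front();
    Work();
    if (Done)
      return; // the task we were waiting on has now run
  }
}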

zturner updated this revision to Diff 152427. Jun 21 2018, 8:27 PM

Updated based on more suggestions. I switched from using deque pointer stability to using explicit ids. I think this makes the code easier to reason about, especially in the followup patch which allows the tasks to return values. I also split out the LLVM_ENABLE_THREADS=1 case from the alternative. They were diverging enough that all the pre-processor branches made it difficult to follow. I think this is cleaner and more maintainable this way.

I think I noticed one more bug in the !threads implementation. Other than that, this looks fine to me, but I'll leave the final OK to Chandler.

llvm/include/llvm/Support/TaskQueue.h
145–151

I just realized this could be wrong when the futures are waited on out of order. E.g. if you have F1 and F2, F2.wait() will increment NextRunId to 2. Then if you do F1.wait(), it will loop until NextRunId wraps back to 1 and crash in the process.

Fortunately, with RunIds it should be easy to fix this by changing != to <. If we're worried about uint64_t wrapping we can do something like Id-NextRunId < UINT64_MAX/2

(Also, this code could be moved into a separate function to decrease the number of instantiations of the lambda.)
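A sketch of that wrap-safe check (hypothetical helper name):

#include <cstdint>

// True while task Id has not yet run, even if the 64-bit counter has wrapped,
// provided the two ids are less than 2^63 apart.
static inline bool isPending(uint64_t Id, uint64_t NextRunId) {
  return Id - NextRunId < UINT64_MAX / 2;
}

The wait loop would then keep running tasks while isPending(Id, NextRunId) holds, instead of comparing with !=.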

After some discussion with Chandler, we actually decided to just delete the nothreads codepath and generate a compile error if anyone tries to use this class with nothreads. I'll upload a fix that does that later today.

zturner updated this revision to Diff 152550. Jun 22 2018, 3:49 PM

This patch deletes the LLVM_ENABLE_THREADS=0 codepath and just errors out if you try to use this class in those circumstances. Since the regular codepath is actually pretty straightforward and doesn't have much complexity, I went ahead and added support for tasks that return values directly to this patch. It was only a couple of lines of change and saves me from having to make another patch for it.

This revision was not accepted when it landed; it landed in state Needs Review. Jun 24 2018, 8:17 PM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis. Jun 26 2018, 6:38 AM

This patch deletes the LLVM_ENABLE_THREADS=0 codepath and just errors out if you try to use this class in those circumstances. Since the regular codepath is actually pretty straightforward and doesn't have much complexity, I went ahead and added support for tasks that return values directly to this patch. It was only a couple of lines of change and saves me from having to make another patch for it.

Unsurprisingly this breaks the LLVM_ENABLE_THREADS=0 build:

FAILED: unittests/Support/CMakeFiles/SupportTests.dir/TaskQueueTest.cpp.o 
/b/build/slave/linux_upload_clang/build/src/third_party/llvm-bootstrap-install/bin/clang++   -DGTEST_HAS_PTHREAD=0 -DGTEST_HAS_RTTI=0 -DGTEST_HAS_TR1_TUPLE=0 -DGTEST_LANG_CXX11=1 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iunittests/Support -I/b/build/slave/linux_upload_clang/build/src/third_party/llvm/unittests/Support -I/usr/include/libxml2 -Iinclude -I/b/build/slave/linux_upload_clang/build/src/third_party/llvm/include -I/b/build/slave/linux_upload_clang/build/src/third_party/llvm/utils/unittest/googletest/include -I/b/build/slave/linux_upload_clang/build/src/third_party/llvm/utils/unittest/googlemock/include --gcc-toolchain=/b/build/slave/linux_upload_clang/build/src/third_party/llvm-build-tools/gcc485precise -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -O3    -UNDEBUG  -Wno-variadic-macros -Wno-gnu-zero-variadic-macro-arguments -fno-exceptions -fno-rtti -MMD -MT unittests/Support/CMakeFiles/SupportTests.dir/TaskQueueTest.cpp.o -MF unittests/Support/CMakeFiles/SupportTests.dir/TaskQueueTest.cpp.o.d -o unittests/Support/CMakeFiles/SupportTests.dir/TaskQueueTest.cpp.o -c /b/build/slave/linux_upload_clang/build/src/third_party/llvm/unittests/Support/TaskQueueTest.cpp
In file included from /b/build/slave/linux_upload_clang/build/src/third_party/llvm/unittests/Support/TaskQueueTest.cpp:10:
/b/build/slave/linux_upload_clang/build/src/third_party/llvm/include/llvm/Support/TaskQueue.h:84:5: error: static_assert failed "TaskQueue requires building with LLVM_ENABLE_THREADS!"
    static_assert(false,
    ^             ~~~~~

What's the thinking here? Maybe TaskQueueTest.cpp shouldn't be part of the build if LLVM_ENABLE_THREADS=0? As-is, this just gives you a compile error if you use that option and build all targets (like our bots do).

zturner added a subscriber: labath. Jun 26 2018, 7:24 AM

This whole file is supposed to be ifdefed out in this case. Is it not? I definitely tested this case.

This whole file is supposed to be ifdefed out in this case. Is it not? I definitely tested this case.

Looking at it, it seems that the code is ifdefed out, but the #include isn't, and that's what errors. Maybe you tested it by reconfiguring an existing cmake build dir instead of using a fresh one, and were led astray by cmake's overaggressive caching of build settings?

The static_assert is in a template function though, which shouldn’t get instantiated unless you create an instance of the class. And creating that instance is ifdef’ed out. If it fixes it, just move the ifdef up, but I’m a bit confused.

Also, I tested with MSVC; I guess that could be it.

Tried moving it up in r335608.