This is an archive of the discontinued LLVM Phabricator instance.

[llvm] [Support] [Debuginfod] waitQueueSize for ThreadPool
AbandonedPublic

Authored by noajshu on Jun 1 2022, 1:18 PM.

Details

Summary

The Debuginfod server (D114845 + D114846) scans the filesystem for valid binaries, using many threads via ThreadPool. Since the filesystem can hypothetically have very many files, simply submitting all jobs to the ThreadPool without waiting for any to finish processing can cause the queue to use unbounded memory.

This diff adds a small waitQueueSize(size_t Size=0) to the ThreadPool, which blocks until the queue size is at most Size. This allows to keep the total queue size below a constant by waiting whenever it has grown too big.

Diff Detail

Event Timeline

noajshu created this revision.Jun 1 2022, 1:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2022, 1:18 PM
noajshu requested review of this revision.Jun 1 2022, 1:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2022, 1:18 PM

That seems like a very ad-hoc API to me, can you handle this in your client code? Some atomic counter that you manage in your tasks should do the trick just as well?

mysterymath added a comment.EditedJun 2 2022, 2:28 PM

That seems like a very ad-hoc API to me, can you handle this in your client code? Some atomic counter that you manage in your tasks should do the trick just as well?

Originally the client code did have such counters; I suggested the change to ThreadPool in code review.

If the producer of ThreadPool.async calls runs too much faster than the threads can consume the requests, then the size of the queue is unbounded. With a regular producer/consumer queue, you could check the size before producing, but there wasn't any way to do this in ThreadPool. Maintaining counters in the client is akin to keeping track of the queue size outside of the queue, which seems kludgey.

The suggestion was just for some kind of feedback mechanism from ThreadPool to callers; I don't have any strong opinions about the specifics of the API. Does a cleaner API come to mind?

Thinking about this a bit more, one alternative would be to have a bounded version of async that only submit to queue no larger than k, blocking otherwise. Another would be an optional MaxQueueSize that would cause async to block.

I agree with @mysterymath there is more than one way to support this usage pattern and I also agree with @mehdi_amini that the user code could keep track on its own.
It seems like this ought to be a common scenario, that there should be an unbounded number of files ( / modules / sections / etc.) to be processed, but bounded memory. User code will be slightly simpler if ThreadPool provides a standard API to handle such cases.

I suppose another direction would be to abandon the producer-consumer model altogether. In D114845, this could be done (in part) by letting the threads share access to the recursive_directory_iterator, acquiring a lock then advancing it on their own to harvest jobs from the filesystem.

mysterymath added a comment.EditedJun 8 2022, 4:55 PM

I suppose another direction would be to abandon the producer-consumer model altogether. In D114845, this could be done (in part) by letting the threads share access to the recursive_directory_iterator, acquiring a lock then advancing it on their own to harvest jobs from the filesystem.

I like that; it should decouple those commits from this one, and it should be quite a bit cleaner than maintaining size counters.

I suppose another direction would be to abandon the producer-consumer model altogether. In D114845, this could be done (in part) by letting the threads share access to the recursive_directory_iterator, acquiring a lock then advancing it on their own to harvest jobs from the filesystem.

I like that; it should decouple those commits from this one, and it should be quite a bit cleaner than maintaining size counters.

Sounds good, thanks! I have updated D114845 with this change. I will wait until at least next week before merging to leave time for comments.

noajshu abandoned this revision.Jun 21 2022, 1:42 PM

I have just noticed the new Task Group feature of ThreadPool. Waiting on the queue size to decrease (as this diff allows) would appear an ad-hoc practice in light of the clean alternative of using a task group. I will therefore abandon this diff D126815 altogether. Thank you for the helpful comments!