Move the code that is executed in the worker process to a separate file.
This makes the use of the pickled arguments stored in global variables in
the worker a bit clearer. (Still not pretty, though.)
Extract handling of parallelism groups to its own function.
Use BoundedSemaphore instead of Semaphore. BoundedSemaphore raises for
unmatched release() calls.
Clean up imports.
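
For reference, a minimal sketch of the difference that motivates the switch to BoundedSemaphore. It uses the threading module as an assumption (the description above does not say which module the semaphores come from), and the helper name over_release is hypothetical.

```python
import threading


def over_release(sem):
    """Acquire once, release twice; return True if the extra release raised."""
    sem.acquire()
    sem.release()
    try:
        sem.release()  # unmatched release: no corresponding acquire()
    except ValueError:
        return True
    return False


# A plain Semaphore silently increments its counter past the initial value.
plain_raised = over_release(threading.Semaphore(1))

# A BoundedSemaphore refuses to go above its initial value and raises ValueError.
bounded_raised = over_release(threading.BoundedSemaphore(1))
```

With a plain Semaphore the extra release would quietly allow more concurrent holders than intended, so the bounded variant turns that bookkeeping bug into an immediate error.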
Won't we get an UnboundLocalError now if parallelism_semaphores[pg] raises KeyError? I would expect the lookup to raise, the except block to handle it, and then the finally block to raise again because the local variable was never assigned.
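
To make the concern concrete, here is a minimal sketch of the pattern I have in mind. It is not the code from this change: the function names, the return values, and the contents of parallelism_semaphores are hypothetical, and threading.BoundedSemaphore is assumed only for illustration.

```python
import threading

# Hypothetical registry with a single known parallelism group.
parallelism_semaphores = {"known": threading.BoundedSemaphore(1)}


def run_in_group(pg):
    """Problematic shape: the lookup happens inside the try block."""
    try:
        sem = parallelism_semaphores[pg]  # KeyError here leaves `sem` unassigned
        sem.acquire()
        return "ran"
    except KeyError:
        return "no such group"
    finally:
        sem.release()  # raises UnboundLocalError when the lookup failed


def run_in_group_fixed(pg):
    """One possible fix: resolve the semaphore before entering try/finally."""
    sem = parallelism_semaphores.get(pg)
    if sem is None:
        return "no such group"
    sem.acquire()
    try:
        return "ran"
    finally:
        sem.release()  # only reached after a successful acquire()


try:
    run_in_group("missing")
    outcome = "returned normally"
except UnboundLocalError:
    outcome = "UnboundLocalError"
```

In the first version the exception raised in the finally block replaces the value returned from the except block, so the caller sees the UnboundLocalError rather than the handled KeyError. Moving the lookup and the acquire() outside the try avoids both that and an unmatched release().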