By default, when a task queue is full, a new task is executed immediately (serialized) instead of being deferred. The one exception is a task whose dependencies are not yet met: in that case the task queue is extended.
This behavior can severely hurt application performance in some specific task graph scenarios, such as the ones detailed in section 4.2 of this paper published at IWOMP 2018 (the full text can be found here).
In such cases, not having the full task graph available removes some opportunities for cache reuse between successive tasks of the stencil algorithm.
After the paper was published, Thierry Gautier (whom I work with) was contacted by Jim Cownie, who seemed to think some of the improvements described in the paper might be interesting enough to be merged upstream.
While the changes in the paper were made against an earlier version of the runtime, they are actually easier to integrate now, as the mechanism to resize task queues already exists.
So my suggestion would be to add an option that makes sure a task queue is always resized when it is full and task throttling is disabled.
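To make the intended policy concrete, here is a minimal toy model of the decision taken when a task is pushed onto a full queue. It is only a sketch and not the runtime's code: `ToyTaskQueue`, `PushResult`, `always_resize` and the doubling growth strategy are all hypothetical names and choices made for illustration.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Toy model only: every name here is hypothetical, not a runtime identifier.
enum class PushResult { Deferred, RanInline };

struct ToyTaskQueue {
  std::size_t capacity;   // current bound on the number of queued tasks
  bool always_resize;     // stands in for the proposed option
  std::deque<std::function<void()>> tasks;

  ToyTaskQueue(std::size_t cap, bool resize)
      : capacity(cap), always_resize(resize) {}

  // deps_met: whether the task's dependencies are already satisfied.
  PushResult push(std::function<void()> task, bool deps_met) {
    if (tasks.size() >= capacity) {
      if (!deps_met || always_resize) {
        // The task cannot (or should not) run now: grow the queue.
        // Doubling is an assumption of this sketch.
        capacity = capacity ? capacity * 2 : 1;
      } else {
        // Default behavior: the queue is full, so run the task
        // immediately in the encountering thread.
        task();
        return PushResult::RanInline;
      }
    }
    tasks.push_back(std::move(task));
    return PushResult::Deferred;
  }
};
```

With `always_resize` set to false, a ready task pushed onto a full queue runs inline. With it set to true, the same task is deferred and the queue grows, so the whole task graph stays visible to the scheduler.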
Please let me know what you think.