From "5. Discussions - Task Throttling" of this paper
Task Throttling is a runtime mechanism to reduce tasking operational and memory overheads.
Once a threshold is reached, producer threads stop producing tasks and start consuming them instead.
The LLVM runtime currently implements a threshold on (1) the number of ready tasks per thread.
It was implemented in the context of independent tasking (OpenMP-3.0, 2008), with one motivation being to bound memory consumption.
In the context of dependent tasking (OpenMP-4.0, 2013), this threshold is not sufficient to bound memory consumption, because many successor tasks that are not yet ready may still be created.
This patch introduces a new threshold on (2) the total number of tasks, together with an environment variable KMP_TASK_MAXIMUM to configure its value (set to 65,536 by default), which provides a strong guarantee on memory consumption.
The patch also adds the environment variable KMP_TASK_MAXIMUM_READY to configure threshold (1) (set to 256 by default).
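To make the interaction of the two thresholds concrete, here is a minimal C++ sketch of the throttling decision. It is not the runtime's code: `ThrottleConfig`, `env_or`, `load_config` and `should_throttle` are hypothetical names, and only the two environment variable names and the values 256 and 65,536 come from the description above.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>

// Hypothetical configuration holder; not a structure from the LLVM runtime.
struct ThrottleConfig {
  std::size_t max_ready_per_thread = 256; // threshold (1)
  std::size_t max_total_tasks = 65536;    // threshold (2)
};

// Read a positive decimal integer from the environment.
// Returns `fallback` when the variable is unset, empty, or malformed.
inline std::size_t env_or(const char *name, std::size_t fallback) {
  const char *s = std::getenv(name);
  if (!s || !std::isdigit(static_cast<unsigned char>(*s)))
    return fallback;
  char *end = nullptr;
  unsigned long long v = std::strtoull(s, &end, 10);
  if (*end != '\0' || v == 0)
    return fallback;
  return static_cast<std::size_t>(v);
}

// Build the configuration from the two environment variables named above.
inline ThrottleConfig load_config() {
  ThrottleConfig c;
  c.max_ready_per_thread =
      env_or("KMP_TASK_MAXIMUM_READY", c.max_ready_per_thread);
  c.max_total_tasks = env_or("KMP_TASK_MAXIMUM", c.max_total_tasks);
  return c;
}

// True when a producer should stop creating tasks and consume instead:
// either its own ready queue is full (1) or too many tasks exist overall (2).
inline bool should_throttle(const ThrottleConfig &c,
                            std::size_t ready_in_my_queue,
                            std::size_t live_tasks_total) {
  return ready_in_my_queue >= c.max_ready_per_thread ||
         live_tasks_total >= c.max_total_tasks;
}
```

With dependent tasks, a producer can have an almost empty ready queue while tens of thousands of not-yet-ready successors exist. In this sketch only the second condition catches that case.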
A few points that came to mind and may need discussion:
- a) thresholds default values
- b) induced overheads: an atomic variable is written by every thread on each task allocation/deallocation. A compile-time option could be added to disable (2)-throttling entirely, preserving the current behavior
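As a rough illustration of point b), the sketch below shows one way a compile-time switch could remove the shared atomic counter. It is an assumption for discussion, not the patch's implementation: the macro `SKETCH_TOTAL_TASK_THROTTLING` and the class `LiveTaskCounter` are hypothetical, and the relaxed memory ordering is a choice made here for the sketch.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical switch: build with -DSKETCH_TOTAL_TASK_THROTTLING=0 to
// compile out threshold (2) and its atomic traffic.
#ifndef SKETCH_TOTAL_TASK_THROTTLING
#define SKETCH_TOTAL_TASK_THROTTLING 1
#endif

// Counts tasks that are allocated and not yet freed, across all threads.
class LiveTaskCounter {
public:
  // Call on every task allocation.
  void on_alloc() {
#if SKETCH_TOTAL_TASK_THROTTLING
    count_.fetch_add(1, std::memory_order_relaxed);
#endif
  }

  // Call on every task deallocation.
  void on_free() {
#if SKETCH_TOTAL_TASK_THROTTLING
    count_.fetch_sub(1, std::memory_order_relaxed);
#endif
  }

  // True when the number of live tasks has reached `max_total`.
  // Always false when the feature is compiled out.
  bool at_limit(std::size_t max_total) const {
#if SKETCH_TOTAL_TASK_THROTTLING
    return count_.load(std::memory_order_relaxed) >= max_total;
#else
    (void)max_total;
    return false;
#endif
  }

private:
#if SKETCH_TOTAL_TASK_THROTTLING
  std::atomic<std::size_t> count_{0};
#endif
};
```

When the macro is 0, the three member functions reduce to no-ops, so the allocation path has no shared write. The cost is that the memory bound from threshold (2) is lost.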
If other points come to mind, please let me know.