This is an archive of the discontinued LLVM Phabricator instance.

Honor cgroup limitation when running in parallel
Needs Review · Public

Authored by dsilakov on Sep 13 2022, 2:26 AM.

Details

Reviewers
steakhal
NoQ
Summary

cpu.cfs_quota_us and cpu.cfs_period_us limit CPU resources and are used
by container management systems such as Kubernetes to limit the CPU
time consumed by containers.

In particular, Docker's '--cpus' option mimics a limit on the number of
CPU cores. In practice, '--cpus' is equivalent to setting the corresponding
--cpu-period and --cpu-quota.
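
The relationship between '--cpus' and the period/quota pair can be illustrated with a little arithmetic. This is only a sketch: the helper names cpus_to_cfs and cfs_to_cpus are hypothetical, and the 100000 microsecond period is assumed to be Docker's default.

```python
# Assumption: Docker's default CFS period is 100 ms (100000 microseconds).
DEFAULT_PERIOD_US = 100_000


def cpus_to_cfs(cpus, period_us=DEFAULT_PERIOD_US):
    """Return the (period, quota) pair in microseconds for a '--cpus' value."""
    return period_us, int(round(cpus * period_us))


def cfs_to_cpus(quota_us, period_us):
    """Return the CPU share implied by quota/period, or None if unlimited."""
    # A non-positive quota (the kernel reports -1) means "no limit".
    if quota_us <= 0 or period_us <= 0:
        return None
    return quota_us / period_us


print(cpus_to_cfs(1.5))
print(cfs_to_cpus(150_000, 100_000))
```

Under these assumptions, '--cpus=1.5' corresponds to a quota of 150000 microseconds per 100000 microsecond period.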

Python doesn't take such limits into account when running inside a
container. See https://bugs.python.org/issue36054 - os.cpu_count() still
returns the number of host CPU cores. As a result, all routines that use this
function to get the CPU count ignore container limits.

For us this means that the number of processes created by multiprocessing.Pool()
inside run_analyzer_parallel() is equal to the number of host CPU cores.
This can become a problem when running on a powerful server that hosts
many containers, since each container then starts far more worker processes
than its CPU quota allows.

Since a fix in Python doesn't seem to be coming in the near future, we
calculate the limits ourselves in this function and pass the
necessary argument to multiprocessing.Pool().
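
A minimal sketch of such a calculation follows. It is not the code from this revision: the function names, the cgroup v1 directory /sys/fs/cgroup/cpu, and the choice to round the quota up to a whole process count are all assumptions made for illustration.

```python
import math
import multiprocessing
import os


def cgroup_cpu_limit(cgroup_dir="/sys/fs/cgroup/cpu"):
    """Return the CPU limit implied by the CFS quota, or None if unlimited.

    Assumption: cgroup v1 layout with both files under `cgroup_dir`.
    """
    try:
        with open(os.path.join(cgroup_dir, "cpu.cfs_quota_us")) as f:
            quota = int(f.read().strip())
        with open(os.path.join(cgroup_dir, "cpu.cfs_period_us")) as f:
            period = int(f.read().strip())
    except (OSError, ValueError):
        # Files missing or unreadable: treat as "no limit known".
        return None
    if quota <= 0 or period <= 0:
        # A quota of -1 means the cgroup is not limited.
        return None
    # Assumption: round up, and never go below one process.
    return max(1, math.ceil(quota / period))


def pool_size(cgroup_dir="/sys/fs/cgroup/cpu"):
    """Pick the worker count: host CPU count capped by the cgroup limit."""
    host = os.cpu_count() or 1
    limit = cgroup_cpu_limit(cgroup_dir)
    return min(host, limit) if limit else host


def make_pool():
    """Create a pool whose size honors the cgroup limit."""
    return multiprocessing.Pool(processes=pool_size())


if __name__ == "__main__":
    print(pool_size())
```

The point of the sketch is only that the computed value is passed explicitly as the processes argument, instead of letting multiprocessing.Pool() fall back to os.cpu_count().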

Besides cfs_quota_us, the process can be limited by sched_affinity.
In this patch, we take this into account as well, and choose the smaller
of the two limits if both are set.
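
Combining the two limits could look roughly like the sketch below. The helper names are hypothetical, and also capping the result by the host CPU count is an assumption of this example, not something stated in the summary. os.sched_getaffinity() exists only on some platforms (notably Linux), hence the hasattr check.

```python
import os


def affinity_cpu_count():
    """Number of CPUs this process may run on, or None where unsupported."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return None


def effective_cpu_count(host_count, quota_limit=None, affinity_limit=None):
    """Return the smallest applicable limit, falling back to host_count."""
    # Keep only the limits that are actually set.
    limits = [n for n in (quota_limit, affinity_limit) if n]
    return max(1, min([host_count] + limits))


print(effective_cpu_count(os.cpu_count() or 1,
                          affinity_limit=affinity_cpu_count()))
```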

Event Timeline

dsilakov created this revision. Sep 13 2022, 2:26 AM
Herald added a project: Restricted Project.
Herald added a subscriber: whisperity.
dsilakov requested review of this revision. Sep 13 2022, 2:26 AM