[Support] heavyweight_hardware_concurrency uses affinity when counting cores fails, and never returns 0
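Previously, when counting physical cores failed,
heavyweight_hardware_concurrency() fell back to
std::thread::hardware_concurrency(), which ignores CPU affinity.
llvm::hardware_concurrency() respects affinity and is the better
fallback, but it was not available when this code was written.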
Also, the case where std::thread::hardware_concurrency() returns 0 was
never handled. llvm::hardware_concurrency() never returns 0, so that
case is now covered as well.
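
For illustration only, the fallback order described above can be sketched
as standalone C++. This is not the code in this patch: the names
heavyweightConcurrencySketch and affinityAwareConcurrency are hypothetical,
the physical-core count is passed in as a parameter instead of being
queried, and the affinity query assumes Linux sched_getaffinity().

```cpp
#include <thread>
#if defined(__linux__)
#include <sched.h>
#endif

// Hypothetical helper: number of CPUs this process may run on.
// Assumption: on Linux, the affinity mask is the right source; elsewhere
// this sketch simply uses the standard library value.
static unsigned affinityAwareConcurrency() {
#if defined(__linux__)
  cpu_set_t Set;
  CPU_ZERO(&Set);
  if (sched_getaffinity(0, sizeof(Set), &Set) == 0) {
    int N = CPU_COUNT(&Set);
    if (N > 0)
      return static_cast<unsigned>(N);
  }
#endif
  // std::thread::hardware_concurrency() may return 0 when the value is
  // not computable, so clamp the result to at least 1.
  unsigned N = std::thread::hardware_concurrency();
  return N ? N : 1;
}

// Hypothetical sketch of the fallback order. PhysicalCores stands in for
// the result of a core-counting query; a non-positive value means the
// query failed.
static unsigned heavyweightConcurrencySketch(int PhysicalCores) {
  if (PhysicalCores > 0)
    return static_cast<unsigned>(PhysicalCores);
  return affinityAwareConcurrency();
}
```

With this shape, a failed core count (for example -1 or 0) still yields a
usable thread count of at least 1.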