Before this patch, ThinLTO could not use all SMT/CMT threads in the system. It was limited to one thread per core, because the ThinLTO code used llvm::heavyweight_hardware_concurrency(). Any number passed to the LLD flag /opt:lldltojobs=..., or to any other ThinLTO-specific flag, was interpreted in the context of llvm::heavyweight_hardware_concurrency(), that is, with SMT disabled.
After this patch, one can say in LLD:
/opt:lldltojobs=0 -- Use one std::thread per hardware core in the system (no SMT). This is the default when the flag is not specified.
/opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency().
/opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows.
When N > number-of-hardware-threads-in-the-system, the std::threads are distributed evenly across all CPU sockets (tested only on Windows).
When N <= number-of-hardware-threads-on-a-CPU-socket, the std::threads stay on the CPU socket where the process started (only on Windows).
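The three forms of the flag value described above can be illustrated with a minimal sketch. It assumes the caller already knows the number of physical cores and hardware threads. parseJobs and its parameters are hypothetical names chosen for illustration, not the functions added by this patch, and the per-socket placement of threads is not modeled.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not part of LLVM): maps a jobs value such as
// "0", "8" or "all" to a thread count.
//   physicalCores   - assumed count of hardware cores (no SMT siblings)
//   hardwareThreads - assumed count of all hardware threads (with SMT)
// Returns std::nullopt when the value is neither "all" nor a plain
// non-negative decimal number.
std::optional<unsigned> parseJobs(std::string_view arg,
                                  unsigned physicalCores,
                                  unsigned hardwareThreads) {
  assert(physicalCores >= 1 && hardwareThreads >= physicalCores);

  // "all": use every hardware thread, SMT siblings included.
  if (arg == "all")
    return hardwareThreads;

  // Otherwise the whole string must be an unsigned decimal number.
  unsigned n = 0;
  const char *end = arg.data() + arg.size();
  auto [ptr, ec] = std::from_chars(arg.data(), end, n);
  if (ec != std::errc() || ptr != end)
    return std::nullopt;

  // 0: one thread per hardware core (the default, no SMT).
  if (n == 0)
    return physicalCores;

  // N: exactly N threads, independent of the core count.
  return n;
}
```

On a machine assumed to have 8 cores and 16 hardware threads, parseJobs("0", 8, 16) gives 8, parseJobs("all", 8, 16) gives 16, and parseJobs("32", 8, 16) gives 32. A value larger than the number of hardware threads is accepted here, matching the description of N above.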
All cmd-line flags and code paths that lead to ThinLTO have been modified by this patch:
-flto-jobs=...
--thinlto-jobs=...
-thinlto-threads=...
--plugin-opt=jobs=...
This is a follow-up for: https://reviews.llvm.org/D71775#1891709 and: https://reviews.llvm.org/D71775#1892013
This change is not needed. lto/thinlto.ll already tests the functionality.
basic.s should also be split. I did this in 34bdddf9a13cfdbbb5506dc89cf8e781be53105f