Details
Diff Detail
Event Timeline
ELF/Driver.cpp | ||
---|---|---|
486 | It is better to include the option name here. "--lto-partitions: number of thread must be >0" |
ELF/Driver.cpp | ||
---|---|---|
489 | I think this would result in a default parallelism level of 1. Can we avoid passing a ThinBackend if the user did not specify a parallelism level? |
ELF/Driver.cpp | ||
---|---|---|
489 | Doesn't it mean that if the user does not specify --thinlto-jobs on the command line, ThinLTO won't work? I think it would even lead to the LTO library crashing if the bitcode has been generated with ThinLTO? Is there a test that shows what happen in this case? (I don't see any, but I didn't look closely) |
ELF/Driver.cpp | ||
---|---|---|
489 | No, if the ThinBackend is not supplied, LTO will create one for you. See: http://llvm-cs.pcc.me.uk/lib/LTO/LTO.cpp#229 I think we will eventually want to replace the thread::hardware_concurrency() default with something else that will lead to better utilisation of the hardware. In Chromium at least we've observed that hardware_concurrency leads to poor utilisation on machines with a large number of cores (https://bugs.chromium.org/p/chromium/issues/detail?id=645295#c20). So I'd prefer to keep any default out of the clients. You're right, we should have a test that does not pass a --thinlto-jobs flag. @davide can you please add one? | |
test/ELF/lto/thinlto.ll | ||
2 | I think you need to use opt -module-summary here. |
ELF/Driver.cpp | ||
---|---|---|
489 |
OK! Great :)
Yes, Apple's clang is using the number of physical cores because of that, but my rational was to limit the memory consumption at a time where debug info where even more costly than now for ThinLTO. On the other hand, David and Teresa benchmarks building Chromium on a 16-cores Xeon machine (32 hyper threaded cores) and they reported 2m18s for the ThinLTO plugin with 32 threads vs 3m38s with 32 threads. So your link is puzzling... |
It is better to include the option name here.