Add a "-j" option to llvm-profdata to control the number of threads
used. When "-j" isn't specified, auto-detect the thread count
(NumThreads), and avoid spawning threads when they wouldn't be beneficial.
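
To illustrate the auto-detection described above, here is a minimal sketch
of one way a thread count could be chosen. It is not taken from the patch:
the helper names chooseNumThreads and shouldMergeSequentially, and the
"at least two inputs per worker" rule, are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Hypothetical helper (not from the patch): pick a worker count for merging
// NumInputs profiles. Requested == 0 means "-j" was not given.
unsigned chooseNumThreads(unsigned Requested, std::size_t NumInputs,
                          unsigned HardwareThreads) {
  // An explicit "-j N" always wins.
  if (Requested != 0)
    return Requested;
  // std::thread::hardware_concurrency() may return 0 when the value is
  // unknown, so treat that case as a single hardware thread.
  unsigned HW = std::max(1u, HardwareThreads);
  // Assumed heuristic: a worker is only useful if it has at least two
  // inputs to merge together.
  unsigned Useful = static_cast<unsigned>(NumInputs / 2);
  return std::max(1u, std::min(HW, Useful));
}

// Hypothetical helper: with a single worker there is nothing to gain from a
// thread pool, so the merge can run inline on the calling thread.
bool shouldMergeSequentially(unsigned NumThreads) { return NumThreads <= 1; }
```

A caller would pass std::thread::hardware_concurrency() as the last
argument. Under this assumed rule, 4 inputs on an 8-thread machine yield
2 workers, and a single input yields 1, in which case no threads are spawned.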

I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:

configuration  | user    | system | cpu  | total   |
No thread pool | 112.87s | 5.92s  | 97%  | 2:01.08 |
With 2 threads | 134.99s | 26.54s | 164% | 1:33.31 |
--> Merge Src context into Dst.