There is still more parallelism to be had here, because we synchronize on
the actual uniquing, but just doing the YAML parsing in parallel already
gives a significant speedup.
Merging all symbols in LLVM+clang+compiler-rt+lld+libc++ on 48 cores:
before: 201.55s user 1.47s system 99% cpu 3:23.04 total
after: 279.13s user 7.53s system 929% cpu 30.838 total
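
The approach described above (parse in parallel, synchronize only on the uniquing step) can be sketched roughly as follows. This is a minimal illustration using only the C++ standard library, not the code from the patch: `parseSymbols` and `mergeSymbols` are hypothetical names, a newline split stands in for YAML parsing, and one `std::thread` per input stands in for a bounded thread pool.

```cpp
#include <cstddef>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for YAML parsing: treats each non-empty line of the
// input as one symbol name. The real parsing step is far more expensive.
std::vector<std::string> parseSymbols(const std::string &Input) {
  std::vector<std::string> Symbols;
  std::size_t Start = 0;
  while (Start < Input.size()) {
    std::size_t End = Input.find('\n', Start);
    if (End == std::string::npos)
      End = Input.size();
    if (End > Start)
      Symbols.push_back(Input.substr(Start, End - Start));
    Start = End + 1;
  }
  return Symbols;
}

// Hypothetical merge driver: parsing runs concurrently, while insertion into
// the shared set (the uniquing step) is serialized by a mutex.
std::set<std::string> mergeSymbols(const std::vector<std::string> &Inputs) {
  std::set<std::string> Unique;
  std::mutex UniqueMutex;
  std::vector<std::thread> Workers;
  Workers.reserve(Inputs.size());
  for (const std::string &Input : Inputs) {
    // Assumption: one thread per input keeps the sketch short; a real tool
    // would bound the number of workers.
    Workers.emplace_back([&Unique, &UniqueMutex, &Input] {
      // No lock is held while parsing, so this part runs in parallel.
      std::vector<std::string> Parsed = parseSymbols(Input);
      // Only the uniquing step takes the lock.
      std::lock_guard<std::mutex> Lock(UniqueMutex);
      Unique.insert(Parsed.begin(), Parsed.end());
    });
  }
  // Every worker must finish before the merged result is read.
  for (std::thread &Worker : Workers)
    Worker.join();
  return Unique;
}
```

Calling `mergeSymbols({"a\nb\n", "b\nc"})` in this sketch yields a set containing `a`, `b`, and `c`. The lock around the insertion is the remaining serial portion that the description refers to, and the explicit `join` loop is where this sketch waits for all workers before the result is used.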
Don't we need to call 'Pool.wait()' to wait for all worker threads to complete?