There is still more parallelism to get here because we synchonize on the
actual uniquing but just doing YAML parsing in parallel already gives a
significant speedup.
Merging all symbols in LLVM+clang+compiler-rt+lld+libc++, 48 cores.
before: 201.55s user 1.47s system 99% cpu 3:23.04 total
after:  279.13s user 7.53s system 929% cpu 30.838 total