Before parsing the input files, collect all bitcode files whose symtab needs an upgrade and perform the upgrades in parallel, on the assumption that the time taken to upgrade a symtab is proportional to the file size. Some of the upgrades are probably unnecessary, for example when the bitcode file is an archive member and no symbol is defined there. Experiments suggest that the extra cost is low because of the parallelization.
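The collect-then-upgrade step can be sketched roughly as follows. This is only an illustration using the C++ standard library, not the code in this patch: `BitcodeInput`, `upgradeSymtab`, and `upgradeAllInParallel` are hypothetical names, and the upgrade itself is a placeholder. Sorting the largest files first follows from the assumption that upgrade time is proportional to file size.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a bitcode input file; not an lld type.
struct BitcodeInput {
  std::string name;
  std::size_t sizeInBytes = 0;
  bool needsSymtabUpgrade = false;
  bool upgraded = false;
};

// Placeholder for the real symtab upgrade; here it only marks the file.
void upgradeSymtab(BitcodeInput &file) {
  assert(file.needsSymtabUpgrade && "only files needing an upgrade are queued");
  file.upgraded = true;
}

// Collects the files that need an upgrade and upgrades them on numThreads
// worker threads. Returns the number of files that were upgraded.
std::size_t upgradeAllInParallel(std::vector<BitcodeInput> &inputs,
                                 unsigned numThreads) {
  // Step 1: collect the candidates before any parsing happens.
  std::vector<BitcodeInput *> pending;
  for (BitcodeInput &file : inputs)
    if (file.needsSymtabUpgrade)
      pending.push_back(&file);

  // Step 2: schedule the largest files first, assuming cost ~ file size.
  std::sort(pending.begin(), pending.end(),
            [](const BitcodeInput *a, const BitcodeInput *b) {
              return a->sizeInBytes > b->sizeInBytes;
            });

  // Step 3: workers pull the next index from a shared atomic counter, so
  // each file is upgraded by exactly one thread.
  std::atomic<std::size_t> next{0};
  auto worker = [&] {
    for (std::size_t i = next.fetch_add(1); i < pending.size();
         i = next.fetch_add(1))
      upgradeSymtab(*pending[i]);
  };

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < std::max(1u, numThreads); ++t)
    threads.emplace_back(worker);
  for (std::thread &t : threads)
    t.join();
  return pending.size();
}
```

Inside lld, a parallel-for helper from LLVM's support library would presumably be a more natural fit than raw threads; the sketch avoids it only to stay self-contained.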
Add an lld option (--cache-ir-symtab) to enable this behavior (default off). In the long run, I think it would be beneficial to turn it on by default.
On the symtab upgrade path, this reduces the link time of the clang binary (all files hit the LTO cache) from ~13s to ~5.5s (8 cores, 16 threads). The non-upgrade path takes about 3.8s.