If the LLDB index cache is enabled and everything is already cached, loading of debug info is essentially single-threaded: it happens in PreloadSymbols(), which is called from GetOrCreateModule(), which in turn is called from a loop calling LoadModuleAtAddress() in DynamicLoaderPOSIXDYLD. Parallelizing the entire loop could be unsafe because GetOrCreateModule() operates on a module list, so instead move only the PreloadSymbols() call to Target::ModulesDidLoad() and parallelize it there, which should be safe.
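The split described above can be illustrated with a minimal standalone sketch. This is not LLDB code: the Module struct, LoadModules(), and the use of std::async are assumptions made for illustration, and only the names PreloadSymbols() and ModulesDidLoad() are borrowed from the description. The idea shown is that the list is built on one thread, and only the per-module preload step runs concurrently.

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a loaded module; not LLDB's Module class.
struct Module {
  std::string name;
  std::atomic<bool> symbols_preloaded{false};

  explicit Module(std::string n) : name(std::move(n)) {}

  // Stand-in for the expensive per-module work. It touches only this module.
  void PreloadSymbols() {
    bool was_preloaded = symbols_preloaded.exchange(true);
    assert(!was_preloaded && "each module should be preloaded only once");
    (void)was_preloaded;
  }
};

using ModuleList = std::vector<std::shared_ptr<Module>>;

// Phase 1 (sequential): mutates the shared list, so it stays on one thread.
// Hypothetical helper standing in for the module-loading loop.
ModuleList LoadModules(const std::vector<std::string>& names) {
  ModuleList modules;
  for (const auto& name : names)
    modules.push_back(std::make_shared<Module>(name));
  return modules;
}

// Phase 2 (parallel): the list is only read here, and each task works on
// its own module, so the tasks do not share mutable state.
void ModulesDidLoad(const ModuleList& modules) {
  std::vector<std::future<void>> tasks;
  tasks.reserve(modules.size());
  for (const auto& module : modules)
    tasks.push_back(std::async(std::launch::async,
                               [module] { module->PreloadSymbols(); }));
  for (auto& task : tasks)
    task.get();  // Wait for every preload to finish before returning.
}
```

The sketch starts one task per module for brevity; a real implementation would more likely use a bounded thread pool.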
This may greatly reduce load time if the debugged program uses a large number of binaries (as opposed to monolithic programs, where it presumably makes no difference). In my specific case of LibreOffice Calc, this reduces startup time from 6s to 2s.
Presumably a similar change could be made for other platforms (I have no plans to do so).