The current importing scheme processes one function at a time:
it loads the source Module, links the function into the destination
module, and destroys the source Module before repeating with the
next function to import (potentially from the same Module).
Ideally we would keep the source Module alive and import the next
Function needed from this Module. Unfortunately this is not possible
because the linker does not leave it in a usable state.
However, we can do better by first computing the list of all candidates
per Module, and only then loading the source Module and importing all the
functions we need from it.
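As a rough illustration of the per-Module batching described above, here is
a minimal standalone sketch. It does not use the LLVM API: ImportCandidate,
groupCandidatesByModule, loadsOneAtATime and loadsBatched are hypothetical
names invented for this example, and "loading" a Module is only counted, not
performed.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an import candidate: the function name and the
// path of the source module that defines it. This is not an LLVM type.
struct ImportCandidate {
  std::string FunctionName;
  std::string ModulePath;
};

// Group candidates by source module so that each module only needs to be
// loaded once, however many functions are imported from it.
std::map<std::string, std::vector<std::string>>
groupCandidatesByModule(const std::vector<ImportCandidate> &Candidates) {
  std::map<std::string, std::vector<std::string>> PerModule;
  for (const ImportCandidate &C : Candidates)
    PerModule[C.ModulePath].push_back(C.FunctionName);
  return PerModule;
}

// Module loads under the one-function-at-a-time scheme: one per candidate.
std::size_t loadsOneAtATime(const std::vector<ImportCandidate> &Candidates) {
  return Candidates.size();
}

// Module loads once candidates are batched: one per distinct source module.
std::size_t loadsBatched(const std::vector<ImportCandidate> &Candidates) {
  return groupCandidatesByModule(Candidates).size();
}
```

With three candidates spread over two source modules, the first scheme would
load a module three times and the batched one twice; the gap grows with the
number of functions imported from the same Module.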
We still need to repeat the process for callees of the imported
Function. This could be avoided with an alternative scheme where
we would load the source Module, materialize the Function, and
add the callees to the Worklist without actually importing the
Function. The import would take place at the end, once we are done
computing the import set.
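The alternative scheme could look roughly like the sketch below. It is not
what this patch implements and it does not use the LLVM API: CallGraph and
computeImportSet are hypothetical names, the call graph is a plain map from
function name to callee names, and import profitability is ignored, so every
reachable callee ends up in the set.

```cpp
#include <deque>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: function name -> names of its callees.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Walk callees through a worklist to compute the whole import set up front.
// Nothing is imported here; the actual import would run once afterwards.
std::set<std::string> computeImportSet(const CallGraph &CG,
                                       const std::vector<std::string> &Roots) {
  std::set<std::string> ImportSet;
  std::deque<std::string> Worklist(Roots.begin(), Roots.end());
  while (!Worklist.empty()) {
    std::string F = Worklist.front();
    Worklist.pop_front();
    // Skip functions already scheduled for import (also handles recursion).
    if (!ImportSet.insert(F).second)
      continue;
    auto It = CG.find(F);
    if (It == CG.end())
      continue; // No known callees for this function.
    for (const std::string &Callee : It->second)
      Worklist.push_back(Callee);
  }
  return ImportSet;
}
```

Once the set is complete, it could be grouped per source Module so that each
Module is loaded and linked a single time.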
This patch already improves link time considerably.
A multithreaded link of llvm-dis on my laptop was:
real 1m12.175s user 6m32.430s sys 0m10.529s
and is now:
real 0m47.400s user 3m1.551s sys 0m5.825s
Note: this is the full link time (linker+Import+Optimizer+CodeGen)