Currently the link phase has an object file cache, whereas the compile phase always
performs IR optimizations (which dominate compile time for large source files at -O2 and above).
This can waste time optimizing a file whose output ultimately hits the object file cache.
For example, on an Intel W-2133 with 64GB of memory, compiling X86ISelLowering.cpp with -flto=thin -O3
takes about 40s; with the caching implemented by this patch it takes about 10s.
The patch makes sure bitcode files that hit the LTO cache also skip IR optimizations.
It adds a driver/cc1 flag (-fthinlto-cache-dir, off by default) to cache the minimized or regular ThinLTO bitcode file.
Caching is only triggered if the input is larger than -fthinlto-cache-min-filesize=; the default minimum is 1024 IR instructions.
Cache pruning (-fthinlto-cache-policy=) shares its implementation with lld's --thinlto-cache-policy.
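A sketch of how the new flags might be used together; the cache directory path and the policy string are illustrative, and the policy syntax is assumed to follow LLVM's cache-pruning policy format as used by lld's --thinlto-cache-policy:

```shell
# Sketch (not a definitive invocation): compile with ThinLTO and enable the
# compile-phase bitcode cache added by this patch. The directory and policy
# values below are placeholders, not defaults.
clang -c -O3 -flto=thin \
    -fthinlto-cache-dir=/path/to/thinlto.cache \
    -fthinlto-cache-min-filesize=1024 \
    -fthinlto-cache-policy=prune_after=24h:cache_size=10% \
    X86ISelLowering.cpp -o X86ISelLowering.o
```

On a cache hit for a sufficiently large input, the compile phase would skip IR optimizations and reuse the cached bitcode.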